Dear members,

My unit is market-year and I want to cluster at the state-year level. I have an unbalanced panel data set with more than 400,000 observations over 20 years. Stata tells me that panels are not nested within clusters, which is indeed the case, since I identified about 90 households that moved between the two periods.

So panel data itself has a multilevel structure; it is wrong to think of panel data as 'single level'. You could cluster by state, because all observations in one panel (a state) are also in the same cluster (the very same state). This is what is meant by "panels are nested within clusters". Moreover, there's no reason to use that option, as good alternatives are available: vce(jackknife), vce(bootstrap), and vce(robust). Conversely, when the observations of one panel fall into more than one cluster, this is what is meant by "panels are not nested within clusters". In this case, xtreg, fe cluster(localcode) will produce the error message: panels are not nested within clusters. Put another way, cases within a cluster are generally not independent of each other. There also may be higher levels of clustering. You could cluster by regions (groups of states), because all observations in one panel (a state) are also all in the same cluster (a region).

reghdfe adjusts the degrees of freedom when the fixed effects are nested within clusters. In the second study, longitudinal clustered data (e.g., repeated measures nested within units and units nested within clusters) are analyzed correctly and with a misspecification ignoring the highest level of the nesting structure. However, in some cases, firms do transfer from one location to another, which I call "cluster transfer" (maybe a little bit ambiguous), perhaps because of typos or measurement errors.
And fifth, the method allows for statistical tests of cluster confounding, i.e., whether differences between within- and between-cluster effects are statistically significant. A second leading example is panel data. The key feature of clustered data is that observations within a cluster are "more alike" than observations from different clusters. Repeated measures can occur on subjects that are nested within clusters. The aforementioned model can be used to analyse data from one site or data from all sites.

To account for possible correlations between persons within the same region, I would like to use clustered standard errors in my fixed effects regression. Of course, I got the error "panels are not nested within clusters" because I forgot that people move across regions, so that they appear in more than a single region. To fix this problem, I wanted to use a region-year variable, which is a numeric variable indicating the year as well as the region.
Best, Anne

If clusters are defined by region-year, the panel for id = 1 will span two years, meaning two different "clusters".

Haizhen, see also the contributed command ivreg2 for Stata, which is available on SSC. reghdfe is a generalization of areg (and xtreg,fe, xtivreg,fe) for multiple levels of fixed effects (including heterogeneous slopes), alternative estimators (2sls, gmm2s, liml), and additional robust standard errors (multi-way clustering, HAC standard errors, etc).
The clustering should not be on region-year pairs since, for example, the error for Bavaria in 2014 is presumably correlated with the error for Bavaria in 2013. You can't escape the error.

At the high level, the whole data set is partitioned into two nested clusterings. If nested (e.g., classroom and school district), you should cluster at the highest level of aggregation. If not nested (e.g., time and space), you can: 1) Include fixed-effects in one dimension and cluster in the other one. In addition, there may be other fixed effect factors. The values of "region-year" are chosen arbitrarily in this example. If you use year-fixed effects but cluster by company, your panels are not nested within clusters because one company can have (or rather, by definition should have, as you're looking at panel data) observations from several time periods.

A novel and robust algorithm to efficiently absorb the fixed effects (extending the work of Guimaraes and Portugal, 2010). The conventional study design among medical and biological experimentalists involves collecting multiple measurements from a study subject. An example of clustered data is patients clustered within family physicians. The regression model errors are independent across clusters but correlated within clusters.

Is there something fundamentally wrong with our model, or is it just a model that xtreg is not set up to estimate? Could lfe do something similar, and would you be interested in a PR?
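To see why a region-year identifier does not restore nesting, here is a minimal sketch in plain Python (not Stata) of the condition behind the "panels are not nested within clusters" message: every panel must map to exactly one cluster value. The function name `panels_nested_in_clusters` and the four-row extract are hypothetical, made up for illustration only.

```python
from collections import defaultdict

def panels_nested_in_clusters(rows, panel_of, cluster_of):
    """Return True if each panel's observations all share one cluster value."""
    clusters_by_panel = defaultdict(set)
    for row in rows:
        clusters_by_panel[panel_of(row)].add(cluster_of(row))
    return all(len(seen) == 1 for seen in clusters_by_panel.values())

# Hypothetical extract: nobody moves, each person is observed in two years.
rows = [
    {"id": 1, "year": 2013, "region": "Bavaria"},
    {"id": 1, "year": 2014, "region": "Bavaria"},
    {"id": 2, "year": 2013, "region": "Hesse"},
    {"id": 2, "year": 2014, "region": "Hesse"},
]

# Clustering on region: each person sits in exactly one cluster.
nested_by_region = panels_nested_in_clusters(
    rows, lambda r: r["id"], lambda r: r["region"]
)

# Clustering on region-year: each person spans two clusters, one per year.
nested_by_region_year = panels_nested_in_clusters(
    rows, lambda r: r["id"], lambda r: (r["region"], r["year"])
)

print(nested_by_region, nested_by_region_year)
```

Even without any movers, the region-year identifier splits each panel across one cluster per year, so the nesting condition fails by construction.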
XTREG's approach of not adjusting the degrees of freedom is appropriate when the fixed effects swept away by the within-group transformation are nested within clusters (meaning all the observations for any given group are in the same cluster), as is commonly the case (e.g., firm fixed effects are nested within firm, industry, or state clusters). This is what is meant by "panels are nested within clusters". The regression model errors are independent across clusters but correlated within clusters. Each panel is considered a cluster, thus not limiting the types of hypotheses one can test. See also White (1987) on the use of vce(cluster).

I tried to use xtreg with clustering at the state-year level, but it returns an error message. What does the error mean? Is there any other way to cluster errors at the village level without excluding those 90 households?

The data are from all over Germany, which means that they are from different regions. Consider the following data extract for one individual (panel): Here id = 1 is observed in more than one year but stays in the same region. We see that the first clustering S1 contains three groups of objects.

Why should that be a problem?
Mark (-xtivreg2- author).
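Before deciding what to do about the households that moved, it can help to list exactly which panels break the nesting. The following is a minimal sketch in plain Python (not Stata), assuming the data can be reduced to (household, village) pairs; the helper name `find_movers` and the sample ids are hypothetical.

```python
from collections import defaultdict

def find_movers(observations):
    """Return the sorted panel ids that are observed in more than one cluster."""
    clusters_by_panel = defaultdict(set)
    for panel_id, cluster_id in observations:
        clusters_by_panel[panel_id].add(cluster_id)
    return sorted(pid for pid, seen in clusters_by_panel.items() if len(seen) > 1)

# Hypothetical (household, village) pairs, one per survey wave.
observations = [
    (101, "A"), (101, "A"),
    (102, "A"), (102, "B"),  # household 102 moved from village A to village B
    (103, "B"), (103, "B"),
]

movers = find_movers(observations)
print(movers)
```

The list only identifies the offending panels; how to treat them (and whether village is the right clustering level at all) remains a modelling decision.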
Process evaluation nested within a cluster randomised controlled trial (RCT).