Given the replicate weights, the computation of variance for any ACS estimate is straightforward. Suppose that $\hat{\theta}$ is an ACS estimate of any type of statistic, such as a mean, total, or proportion. Let $\hat{\theta}_0$ denote the estimate computed based on the full sample weight, and $\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_{80}$ denote the estimates computed based on the replicate weights. The variance of $\hat{\theta}_0$, $v(\hat{\theta}_0)$, is estimated as the sum of squared differences between each replicate estimate $\hat{\theta}_r$ ($r = 1, \ldots, 80$) and the full sample estimate $\hat{\theta}_0$, scaled by $4/80$. The formula is as follows:¹

$$v(\hat{\theta}_0) = \frac{4}{80} \sum_{r=1}^{80} \left(\hat{\theta}_r - \hat{\theta}_0\right)^2$$
This equation holds for count estimates as well as any other types of estimates, including percents, ratios, and medians.
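As an illustration of the replicate variance computation described above, here is a minimal Python sketch. The function names and the sample numbers are hypothetical and chosen only for this example; the default multiplier of 4 / R follows the SDR convention given in the footnote.

```python
import math


def replicate_variance(full_estimate, replicate_estimates, multiplier=None):
    """Return c * sum((theta_r - theta_0)^2) over all replicate estimates.

    When no multiplier is supplied, the SDR value 4 / R is used,
    where R is the number of replicate estimates passed in.
    """
    r = len(replicate_estimates)
    if r == 0:
        raise ValueError("at least one replicate estimate is required")
    c = 4.0 / r if multiplier is None else multiplier
    return c * sum((est - full_estimate) ** 2 for est in replicate_estimates)


def replicate_standard_error(full_estimate, replicate_estimates):
    """Standard error is the square root of the replicate variance."""
    return math.sqrt(replicate_variance(full_estimate, replicate_estimates))


# Made-up data: a full-sample estimate and 80 replicate estimates that
# alternate 10 above and 10 below it.
theta_0 = 1000.0
replicates = [theta_0 + (10.0 if i % 2 == 0 else -10.0) for i in range(80)]

variance = replicate_variance(theta_0, replicates)
standard_error = replicate_standard_error(theta_0, replicates)
```

The same function applies whether the estimate is a count, percent, ratio, or median, because it needs only the full-sample estimate and the replicate estimates, not the underlying records.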
There are certain cases, however, where this formula does not apply. The first and most important cases are estimates that are "controlled" to population totals and have their standard errors set to zero. These are estimates that are forced to equal intercensal estimates during the raking step of the weighting process, for example, total population and collapsed age, sex, and Hispanic origin estimates for weighting areas. Although race is included in the raking procedure, race group estimates are not controlled; the categories used in the weighting process (see Chapter 11) do not match the published tabulation groups because of multiple race responses and the "Some Other Race" category. Information on the final collapsing of the person post-stratification cells is passed from the weighting process to the variance estimation process in order to identify estimates that are controlled. This identification is done independently for each weighting area and is then applied to the geographic areas used for tabulation. Standard errors for those estimates are set to zero, and published margins of error are set to "*****" (with an appropriate accompanying footnote).
Another special case deals with zero-estimated counts of people, households, or HUs. A direct application of the replicate variance formula leads to a zero standard error for a zero-estimated count. However, people, households, or HUs with that characteristic may exist in the area without having been selected for the ACS sample, and a different sample might have included them, so a zero standard error is not appropriate. For these cases, the following model-based estimation of standard error was implemented.
For ACS data in a census year, the ACS zero-estimated counts (for characteristics included in the 100 percent census ("short form") count) can be checked against the corresponding census estimates. At least 90 percent of the census counts for the ACS zero-estimated counts should be within a 90 percent confidence interval based on our modeled standard error.²
Let the variance of the estimate be modeled as some multiple ($K$) of the average final weight (for a state or the nation). That is:

$$v(0) = K \times (\text{average weight})$$
Then, set the 90 percent upper bound for the zero estimate equal to the census count:

$$0 + 1.645 \times \sqrt{K \times (\text{average weight})} = \text{Census count}$$

Solving for $K$ yields:

$$K = \frac{\left(\text{Census count} / 1.645\right)^2}{\text{average weight}}$$

$K$ was computed for all ACS zero-estimated counts from 2000, which matched Census 2000 100 percent counts, and then the 90th percentile of those $K$ values was determined. Based on the Census 2000 data, we use a value for $K$ of 400 (Navarro, 2001b). As this modeling method requires census counts, the value of 400 can next be updated using the 2010 Census and 2010 ACS data.
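The derivation of K can be sketched in Python as follows. This is only an illustration under stated assumptions: the critical value 1.645 is the standard normal value conventionally used for a 90 percent bound, the nearest-rank percentile rule is one of several possible definitions and is not taken from the source, and the function names and input numbers are hypothetical.

```python
Z_90 = 1.645  # assumed critical value for a 90 percent bound


def k_for_zero_count(census_count, average_weight):
    """Solve 0 + Z_90 * sqrt(K * average_weight) = census_count for K."""
    if average_weight <= 0:
        raise ValueError("average_weight must be positive")
    return (census_count / Z_90) ** 2 / average_weight


def percentile_90_nearest_rank(values):
    """90th percentile by the nearest-rank rule (an assumed definition)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (9 * len(ordered) + 9) // 10  # integer ceiling of 0.9 * n
    return ordered[rank - 1]


# Hypothetical (census count, average weight) pairs for ACS zero estimates.
cases = [(10, 40.0), (25, 40.0), (3, 55.0), (60, 55.0), (0, 40.0)]
k_values = [k_for_zero_count(count, weight) for count, weight in cases]
k_at_90th_percentile = percentile_90_nearest_rank(k_values)
```

Taking a high percentile of the individual K values, rather than their mean, is what makes the resulting bound cover the census count for roughly 90 percent of zero-estimated cells.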
For publication, the standard error ($SE$) of the zero count estimate is computed as:

$$SE(0) = \sqrt{400 \times (\text{average weight})}$$
The average weights (the maximum of the average housing unit and average person final weights) are calculated at the state and national level for each ACS single-year or multiyear data release. Estimates for geographic areas within a state use that state's average weight, and estimates for geographic areas that cross state boundaries use the national average weight.
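The published zero-count standard error can be sketched in Python as below, taking the square root of K times the average weight with K = 400. The function names, the placeholder state codes "AA" and "BB", and all weight values are hypothetical and serve only to show the selection rule between state and national average weights.

```python
import math

K = 400  # multiplier stated in the text


def average_weight(avg_housing_unit_weight, avg_person_weight):
    """Average weight is the larger of the two average final weights."""
    return max(avg_housing_unit_weight, avg_person_weight)


def pick_average_weight(area_state_codes, state_average_weights, national_average_weight):
    """Use the state's average weight for an area inside a single state,
    and the national average weight for an area that crosses state lines."""
    if len(area_state_codes) == 1:
        (state,) = area_state_codes
        return state_average_weights[state]
    return national_average_weight


def zero_count_standard_error(avg_weight, k=K):
    """Modeled standard error of a zero-estimated count."""
    return math.sqrt(k * avg_weight)


# Made-up average weights for two placeholder states and the nation.
state_weights = {
    "AA": average_weight(25.0, 36.0),
    "BB": average_weight(49.0, 40.0),
}
national_weight = 64.0

se_single_state = zero_count_standard_error(
    pick_average_weight({"AA"}, state_weights, national_weight)
)
se_cross_state = zero_count_standard_error(
    pick_average_weight({"AA", "BB"}, state_weights, national_weight)
)
```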
Finally, a similar method is used to produce an approximate standard error for both ACS zero and 100 percent estimates. We do not produce approximate standard errors for other zero estimates, such as ratios or medians.
¹ A general replication-based variance formula can be expressed as

$$v(\hat{\theta}_0) = \sum_{r=1}^{R} c_r \left(\hat{\theta}_r - \hat{\theta}_0\right)^2$$

where $c_r$ is the multiplier related to the $r$-th replicate determined by the replication method. For the SDR method, the value of $c_r$ is $4/R$, where $R$ is the number of replicates (see Fay and Train, 1995).
² This modeling was done only once, in 2001, prior to the publication of the 2000 ACS data.