Sample Design and Sampling Variability - Nativity and Parentage (Volume II, Part I - Subject Reports) - Survey Census 1960 (US, County & State)


Publisher: U.S. Census Bureau

Survey: Census 1960 (US, County & State)

Document:	Nativity and Parentage (Volume II, Part I - Subject Reports)
citation:	U.S. Bureau of the Census. U.S. Census of Population: 1960. Subject Reports, Nativity and Parentage. Final Report PC(2)-1A. U.S. Government Printing Office, Washington, D.C. 1965.

Sample Design and Sampling Variability

Sample Design

For persons in housing units at the time of the 1960 Census, the sampling unit was the housing unit and all its occupants; for persons in group quarters, it was the person. On the first visit to an address, the enumerator assigned a sample key letter (A, B, C, or D) to each housing unit sequentially in the order in which he first visited the units, whether or not he completed an interview. Each enumerator was given a random key letter to start his assignment, and the order of canvassing was indicated in advance, although these instructions allowed some latitude in the order of visiting addresses. Each housing unit which was assigned the key letter "A" was designated as a sample unit and all persons enumerated in the unit were included in the sample. In every group quarters, the sample consisted of every fourth person in the order listed.
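The key-letter rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the starting letter is passed in explicitly (in the field it was assigned at random), and the `start` offset for group quarters is an assumption, since the text says only that every fourth person was taken in the order listed.

```python
import random

KEY_LETTERS = "ABCD"

def assign_key_letters(n_units, start_letter):
    """Assign key letters A-D cyclically, in order of first visit."""
    start = KEY_LETTERS.index(start_letter)
    return [KEY_LETTERS[(start + i) % 4] for i in range(n_units)]

def sample_housing_units(n_units, start_letter):
    """Return 0-based visit positions of units that received key letter 'A'."""
    letters = assign_key_letters(n_units, start_letter)
    return [i for i, letter in enumerate(letters) if letter == "A"]

def sample_group_quarters(persons, start=0):
    """Take every fourth person in listed order (start offset is an assumption)."""
    return persons[start::4]

if __name__ == "__main__":
    # Hypothetical enumerator assignment of 12 housing units.
    start_letter = random.choice(KEY_LETTERS)
    print(start_letter, sample_housing_units(12, start_letter))
```

Whatever the starting letter, one unit in each consecutive run of four is selected, which is why the sample is close to, but not exactly, 25 percent in any one locality.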

Although the sampling procedure did not automatically insure an exact 25-percent sample of persons or housing units in each locality, the sample design was unbiased if carried through according to instructions; and, generally, for large areas the deviation from 25 percent was found to be quite small. Biases may have arisen, however, when the enumerator failed to follow his listing and sampling instructions exactly.

Ratio Estimation

The statistics based on the sample of the 1960 Census returns are estimates that have been developed through the use of a ratio estimation procedure. This procedure was carried out for each of 44 groups of persons in each of the smallest areas for which sample data are published.¹ (For a more complete discussion of the ratio estimation procedure, see 1960 Census of Population, Volume I, Characteristics of the Population, Part 1, United States Summary.)

These ratio estimates reduce the component of sampling error arising from the variation in the size of household and achieve some of the gains of stratification in the selection of the sample, with the strata being the groups for which separate ratio estimates are computed. The net effect is a reduction in the sampling error and bias of most statistics below what would be obtained by weighting the results of the 25-percent sample by a uniform factor of four. The reduction in sampling error is trivial for some items and substantial for others. A by-product of this estimation procedure, in general, is that estimates for this sample are consistent with the complete count with respect to the total population and for the subdivisions used as groups in the estimation procedure.
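A minimal Python sketch of a ratio estimate of this kind follows. It assumes the estimator is the sum over groups of (sample count with the characteristic / sample count) × complete count, which is the standard form implied by the definitions in footnote 1. The function names and all counts are hypothetical, and the sketch does not model how groups with very small sample counts were handled.

```python
def ratio_estimate(groups):
    """Sum of (X_i / y_i) * Y_i over the estimation groups.

    Each group is a tuple (X_i, y_i, Y_i):
      X_i: sample persons with the characteristic
      y_i: all sample persons in the group
      Y_i: complete-count persons in the group
    """
    total = 0.0
    for X_i, y_i, Y_i in groups:
        if y_i <= 0:
            raise ValueError("each group needs at least one sample person")
        total += (X_i / y_i) * Y_i
    return total

def uniform_estimate(groups):
    """Alternative mentioned in the text: weight the sample by a factor of four."""
    return 4 * sum(X_i for X_i, _, _ in groups)

if __name__ == "__main__":
    # Two hypothetical groups for one small area.
    groups = [(30, 100, 410), (12, 50, 190)]
    print(ratio_estimate(groups), uniform_estimate(groups))
```

Setting X_i equal to y_i in every group makes the estimate equal the sum of the Y_i, which illustrates why the estimates agree with the complete count for the total population and for the groups themselves.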

Sampling Variability

The figures from the 25-percent sample tabulations are subject to sampling variability, which can be estimated roughly from the standard errors shown in tables E and F. Somewhat more precise estimates of sampling error may be obtained by using the factors shown in table G in conjunction with table F for percentages and table E for absolute numbers. These tables² do not reflect the effect of response variance, processing variance, or bias arising in the collection, processing, and estimation steps. Estimates of the magnitude of some of these factors in the total error are being evaluated and will be published at a later date. The chances are about 2 out of 3 that the difference due to sampling variability between an estimate and the figure that would have been obtained from a complete count of the population is less than the standard error. The chances are about 19 out of 20 that the difference is less than twice the standard error and about 99 out of 100 that it is less than 2½ times the standard error. The amount by which the estimated standard error must be multiplied to obtain other odds deemed more appropriate can be found in most statistical textbooks.
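The odds quoted above match what a normal distribution gives for 1, 2, and 2½ standard errors. The Python sketch below assumes the sampling error is approximately normally distributed, an assumption the report does not state explicitly, and uses only the standard library. The function names are illustrative.

```python
import math
from statistics import NormalDist

def coverage(k):
    """Probability that a normal variable falls within k standard errors of its mean."""
    return math.erf(k / math.sqrt(2))

def multiplier(odds):
    """Number of standard errors giving the requested two-sided coverage (0 < odds < 1)."""
    return NormalDist().inv_cdf((1 + odds) / 2)

if __name__ == "__main__":
    for k in (1.0, 2.0, 2.5):
        print(f"{k} standard errors: coverage {coverage(k):.3f}")
    # Multiplier for odds of 9 out of 10, as an example of "other odds".
    print(f"multiplier for 0.90: {multiplier(0.90):.3f}")
```

Under this assumption the three coverages come out near 0.68, 0.95, and 0.99, in line with "2 out of 3", "19 out of 20", and "99 out of 100".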

Table E. Rough Approximation to Standard Error of Estimated Number
(Range of 2 chances out of 3)

Table E shows rough approximations to standard errors of estimated numbers up to 50,000. The relative sampling errors of larger estimated numbers are somewhat smaller than for 50,000. For estimated numbers above 50,000, however, the nonsampling errors, e.g., response errors and processing errors, may have an increasingly important effect on the total error.

Table F. Rough Approximation to Standard Error of Estimated Percentage
(Range of 2 chances out of 3)

Estimated percentage	Base of percentage
Estimated percentage	500	1,000	2,500	10,000	25,000	100,000
2 or 98	1.3	0.9	0.5	0.3	0.1	0.1
5 or 95	2.0	1.4	0.9	0.4	0.2	0.1
10 or 90	2.8	2.0	1.2	0.6	0.3	0.2
25 or 75	3.8	2.7	1.5	0.7	0.4	0.2
50	4.4	3.1	1.6	0.8	0.5	0.3

Table F shows rough standard errors of data in the form of percentages. Linear interpolation in tables E and F will provide approximate results that are satisfactory for most purposes.
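Linear interpolation between tabulated bases can be sketched as follows. The example uses only the "10 or 90" row of table F. The function name is hypothetical, and interpolating on the base itself, rather than on some transformed scale, is an assumption about what "linear interpolation" means here.

```python
from bisect import bisect_right

# "10 or 90" row of table F: base of percentage -> standard error (percentage points)
BASES = [500, 1_000, 2_500, 10_000, 25_000, 100_000]
SE_10_OR_90 = [2.8, 2.0, 1.2, 0.6, 0.3, 0.2]

def interpolate(x, xs, ys):
    """Linearly interpolate y at x between the two neighbouring tabulated points."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the tabulated range")
    hi = min(bisect_right(xs, x), len(xs) - 1)
    lo = hi - 1
    t = (x - xs[lo]) / (xs[hi] - xs[lo])
    return ys[lo] + t * (ys[hi] - ys[lo])

if __name__ == "__main__":
    # Standard error of an estimated 10 percent on a base of 5,000.
    print(round(interpolate(5_000, BASES, SE_10_OR_90), 2))
```

For a base of 5,000 the result lies one third of the way from the 2,500 entry (1.2) to the 10,000 entry (0.6), that is, about 1.0 percentage point.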

For a discussion of the sampling variability of medians and means and of the method for obtaining standard errors of differences between two estimates, see 1960 Census of Population, Volume I, Characteristics of the Population, Part 1, United States Summary. For a discussion of the sampling variability of characteristics from the 1950 Census, see 1950 Census of Population, Volume IV, Special Reports, Part 3A, Nativity and Parentage.

Table G provides a factor by which the standard errors shown in table E or F should be multiplied to adjust for the combined effect of the sample design and the estimation procedure. To estimate a somewhat more precise standard error for a given characteristic, locate in table G the factor applying to the characteristic. Where data are shown as cross-classifications of two characteristics, locate each characteristic in table G. The factor to be used for any cross-classification will usually lie between the values of the factors. When a given characteristic is cross-classified in extensive detail (e.g., by single years of age), the factor to be used is the smaller one shown in table G. Where a characteristic is cross-classified in broad groups (or used in broad groups), the factor to be used in table G should be closer to the larger one. Multiply the standard error given for the size of the estimate as shown in table E by this factor from table G. The result of this multiplication is the approximate standard error. Similarly, to obtain a somewhat more precise estimate of the standard error of a percentage, multiply the standard error as shown in table F by the factor from table G.

Illustration: Table 9 shows that in 1960 there were an estimated 36,242 Austrian-born males 55 to 64 years of age in the United States. Table G shows that, for data on country of origin, the appropriate standard error in table E should be multiplied by a factor of 1.4. Table E shows that a rough approximation to the standard error for an estimate of 36,242 is 295. The factor of 1.4 times 295 is 413, which means that the chances are approximately 2 out of 3 that the results of a complete census would not differ by more than 413 from this estimated 36,242. It also follows that there is only about 1 chance in 100 that a complete census result would differ by as much as 1,033, that is, by about 2 ½ times the number estimated from tables E and G.
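The arithmetic of this illustration can be checked with a short Python sketch. The values 36,242, 295, and 1.4 are taken from the text. The variable names are illustrative, and `Decimal` with half-up rounding is used so that 1,032.5 rounds to 1,033 as in the report.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(value):
    """Round a Decimal to the nearest whole number, halves rounding up."""
    return value.quantize(Decimal("1"), rounding=ROUND_HALF_UP)

estimate = Decimal("36242")   # Austrian-born males 55 to 64 (table 9)
rough_se = Decimal("295")     # rough standard error read from table E
factor = Decimal("1.4")       # table G factor for country of origin

adjusted_se = round_half_up(rough_se * factor)            # about 2 chances in 3
wide_bound = round_half_up(adjusted_se * Decimal("2.5"))  # about 99 chances in 100

if __name__ == "__main__":
    print(adjusted_se, wide_bound)
    print(estimate - adjusted_se, estimate + adjusted_se)
```

With these inputs the adjusted standard error is 413 and the 2½-standard-error bound is 1,033, so the 2-in-3 range around the estimate runs from 35,829 to 36,655.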

Table G. Factor to Be Applied To Standard Errors

Characteristics	Factor
Nativity, parentage, country of origin	1.4
All other characteristics	1.0

Footnotes:

¹ Estimates of characteristics from the sample for a given area are produced using the formula

x = \sum_{i=1}^{44} \frac{X_i}{y_i} Y_i

where x is the estimate of the characteristic for the area obtained through the use of the ratio estimation procedure,
X_i is the count of all sample persons with the characteristic for the area in one (i) of the 44 groups,
y_i is the count of all sample persons for the area in the same one of the 44 groups, and
Y_i is the count of persons in the complete count for the area in the same one of the 44 groups.
² These estimates of sampling variability are based on partial information on variances calculated from a sample of the 1960 Census results. Further estimates are being calculated and will be made available at a later date.
