Sample Design and Sampling Variability - Mother Tongue of the Foreign Born (Volume II, Part I - Subject Reports) - Survey Census 1960 (US, County & State)

Publisher: U.S. Census Bureau

Survey: Census 1960 (US, County & State)

Document:	Mother Tongue of the Foreign Born (Volume II, Part I - Subject Reports)
citation:	U.S. Bureau of the Census. U.S. Census of Population: 1960. Subject Reports, Mother Tongue of the Foreign Born. Final Report PC(2)-1E. U.S. Government Printing Office, Washington, D.C. 1966.

Sample Design and Sampling Variability

Sample Design

For persons in housing units at the time of the 1960 Census, the sampling unit was the housing unit and all its occupants; for persons in group quarters, it was the person. On the first visit to an address, the enumerator assigned a sample key letter (A, B, C, or D) to each housing unit sequentially in the order in which he first visited the units, whether or not he completed an interview. Each enumerator was given a random key letter to start his assignment, and the order of canvassing was indicated in advance, although these instructions allowed some latitude in the order of visiting addresses. Each housing unit which was assigned the key letter "A" was designated as a sample unit and all persons enumerated in the unit were included in the sample. In every group quarters, the sample consisted of every fourth person in the order listed.
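The key-letter rotation described above can be sketched in a few lines of Python. This is only an illustration: the function names, the choice of 10 units, and the use of `random.choice` for the starting letter are assumptions, not part of the Census procedure as documented.

```python
import random
from itertools import cycle, islice

KEY_LETTERS = "ABCD"

def assign_key_letters(n_units, start_letter):
    """Assign key letters to housing units in order of first visit,
    cycling A, B, C, D from the enumerator's starting letter."""
    start = KEY_LETTERS.index(start_letter)
    rotation = cycle(KEY_LETTERS[start:] + KEY_LETTERS[:start])
    return list(islice(rotation, n_units))

def sample_housing_units(n_units, start_letter):
    """Return the visit-order indexes of the units keyed "A" (the sample units)."""
    letters = assign_key_letters(n_units, start_letter)
    return [i for i, letter in enumerate(letters) if letter == "A"]

# Hypothetical assignment of 10 housing units with a randomly drawn starting letter.
start = random.choice(KEY_LETTERS)
print(start, sample_housing_units(10, start))
```

Because one unit in every run of four receives the letter "A", the sample approaches 25 percent of units as the assignment grows, but it is not exactly 25 percent in a small assignment.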

Although the sampling procedure did not automatically insure an exact 25-percent sample of persons or housing units in each locality, the sample design was unbiased if carried through according to instructions; and, generally, for large areas the deviation from 25 percent was found to be quite small. Biases may have arisen, however, when the enumerator failed to follow his listing and sampling instructions exactly.

Ratio Estimation

The statistics based on the sample of the 1960 Census returns are estimates that have been developed through the use of a ratio estimation procedure. This procedure was carried out for each of 44 groups of persons in each of the smallest areas for which sample data are published.¹ (For a more complete discussion of the ratio estimation procedure, see 1960 Census of Population, Volume I, Characteristics of the Population, Part 1, United States Summary.)

These ratio estimates reduce the component of sampling error arising from the variation in the size of household and achieve some of the gains of stratification in the selection of the sample, with the strata being the groups for which separate ratio estimates are computed. The net effect is a reduction in the sampling error and bias of most statistics below what would be obtained by weighting the results of the 25-percent sample by a uniform factor of four. The reduction in sampling error is trivial for some items and substantial for others. A by-product of this estimation procedure, in general, is that estimates for this sample are consistent with the complete count with respect to the total population and for the subdivisions used as groups in the estimation procedure.
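As a rough illustration of the difference between the ratio estimate and uniform weighting by four, the following Python sketch assumes the estimate is the sum over groups of (sample count with the characteristic / sample count in the group) × complete count in the group, which is the form suggested by the symbol definitions in footnote 1. The three groups and their counts are hypothetical; the actual procedure used 44 groups.

```python
def ratio_estimate(groups):
    """Sum over groups of (x_i / y_i) * Y_i, where each group is a tuple
    (x_i, y_i, Y_i): sample persons with the characteristic, all sample
    persons, and complete-count persons in the group."""
    total = 0.0
    for x_i, y_i, Y_i in groups:
        if y_i == 0:
            continue  # assumption of this sketch: an empty sample group contributes nothing
        total += x_i / y_i * Y_i
    return total

def uniform_estimate(groups, factor=4):
    """Weight the sample count by a uniform factor (four for a 25-percent sample)."""
    return factor * sum(x_i for x_i, _, _ in groups)

# Hypothetical counts (x_i, y_i, Y_i) for three groups in one area.
groups = [(10, 250, 1000), (5, 100, 420), (30, 240, 960)]
print(ratio_estimate(groups), uniform_estimate(groups))
```

When the characteristic being estimated is membership in one of the groups itself, every sample person in the group has it, so the ratio is 1 and the estimate equals the complete count. This is the consistency by-product noted above.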

Sampling Variability

The figures from the 25-percent sample tabulations are subject to sampling variability, which can be estimated roughly from the standard errors shown in tables C and D. Somewhat more precise estimates of sampling error may be obtained by using the factors shown in table E in conjunction with table D for percentages and table C for absolute numbers. These tables² do not reflect the effect of response variance, processing variance, or bias arising in the collection, processing, and estimation steps. Estimates of the magnitude of some of these factors in the total error are being evaluated and will be published at a later date. The chances are about 2 out of 3 that the difference due to sampling variability between an estimate and the figure that would have been obtained from a complete count of the population is less than the standard error. The chances are about 19 out of 20 that the difference is less than twice the standard error and about 99 out of 100 that it is less than 2½ times the standard error. The amount by which the estimated standard error must be multiplied to obtain other odds deemed more appropriate can be found in most statistical textbooks.
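The multipliers for other odds can also be computed directly rather than looked up. The Python sketch below assumes the sampling error is approximately normally distributed, which is the usual textbook basis for such multipliers; the function names are illustrative.

```python
from statistics import NormalDist

def se_multiplier(confidence):
    """Multiplier k such that the chance of the sampling error being less
    than k standard errors equals `confidence` (normal approximation)."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    return NormalDist().inv_cdf((1 + confidence) / 2)

def chance_within(multiplier):
    """Chance that the sampling error is less than `multiplier` standard errors."""
    z = NormalDist()
    return z.cdf(multiplier) - z.cdf(-multiplier)

for k in (1, 2, 2.5):
    print(k, round(chance_within(k), 3))
print(round(se_multiplier(0.90), 2))
```

Under this approximation, multipliers of 1, 2, and 2½ correspond to chances of roughly 0.68, 0.95, and 0.99, in line with the "2 out of 3", "19 out of 20", and "99 out of 100" statements above.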

Footnotes:

¹ Estimates of characteristics from the sample for a given area are produced using the formula:

$$x' = \sum_{i=1}^{44} \frac{x_i}{y_i} \, Y_i$$

where $x'$ is the estimate of the characteristic for the area obtained through the use of the ratio estimation procedure,
$x_i$ is the count of sample persons with the characteristic for the area in one ($i$) of the 44 groups,
$y_i$ is the count of all sample persons for the area in the same one of the 44 groups, and
$Y_i$ is the count of persons in the complete count for the area in the same one of the 44 groups.
² These estimates of sampling variability are based on partial information on variances calculated from a sample of the 1960 Census results.

Table C. Rough Approximation to Standard Error of Estimated Number
(Range of 2 chances out of 3)

Estimated number	Standard error
50	15
100	20
250	30
500	40
1,000	50
2,500	80
5,000	110
10,000	160
15,000	190
25,000	250
50,000	350

Table D. Rough Approximation to Standard Error of Estimated Percentage
(Range of 2 chances out of 3)

	Base of percentage
Estimated percentage	500	1,000	2,500	10,000	25,000	100,000
2 or 98	1.3	0.9	0.5	0.3	0.1	0.1
5 or 95	2.0	1.4	0.9	0.4	0.2	0.1
10 or 90	2.8	2.0	1.2	0.6	0.3	0.2
25 or 75	3.8	2.7	1.5	0.7	0.4	0.2
50	4.4	3.1	1.6	0.8	0.5	0.3

Table C shows rough approximations to standard errors of estimated numbers up to 50,000. The relative sampling errors of larger estimated numbers are somewhat smaller than for 50,000. For estimated numbers above 50,000, however, the nonsampling errors, e.g., response errors and processing errors, may have an increasingly important effect on the total error. Table D shows rough standard errors of data in the form of percentages. Linear interpolation in tables C and D will provide approximate results that are satisfactory for most purposes.
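For percentages, interpolation may be needed in both directions of table D: between tabulated percentages and between tabulated bases. The Python sketch below applies linear interpolation along each axis in turn. The function names are illustrative, and treating a percentage above 50 as its complement (for example, 90 as 10) follows the paired row labels of table D.

```python
from bisect import bisect_right

BASES = [500, 1000, 2500, 10000, 25000, 100000]
PERCENTS = [2, 5, 10, 25, 50]  # rows also serve 98, 95, 90, and 75
TABLE_D = [
    [1.3, 0.9, 0.5, 0.3, 0.1, 0.1],
    [2.0, 1.4, 0.9, 0.4, 0.2, 0.1],
    [2.8, 2.0, 1.2, 0.6, 0.3, 0.2],
    [3.8, 2.7, 1.5, 0.7, 0.4, 0.2],
    [4.4, 3.1, 1.6, 0.8, 0.5, 0.3],
]

def _bracket(grid, value):
    """Return (index of lower grid point, fractional position toward the next point)."""
    if not grid[0] <= value <= grid[-1]:
        raise ValueError(f"{value} is outside the tabulated range")
    lo = min(bisect_right(grid, value) - 1, len(grid) - 2)
    frac = (value - grid[lo]) / (grid[lo + 1] - grid[lo])
    return lo, frac

def table_d_standard_error(percent, base):
    """Rough standard error of an estimated percentage, interpolated from table D."""
    p = min(percent, 100 - percent)  # 90 percent uses the "10 or 90" row
    r, fr = _bracket(PERCENTS, p)
    c, fc = _bracket(BASES, base)
    # Interpolate along the base in the two bracketing rows, then between the rows.
    upper = TABLE_D[r][c] + fc * (TABLE_D[r][c + 1] - TABLE_D[r][c])
    lower = TABLE_D[r + 1][c] + fc * (TABLE_D[r + 1][c + 1] - TABLE_D[r + 1][c])
    return upper + fr * (lower - upper)

# An estimated 10 percent on a base of 1,750 (halfway between 1,000 and 2,500).
print(round(table_d_standard_error(10, 1750), 2))
```

The sketch refuses values outside the tabulated range rather than extrapolating, since the tables give no guidance beyond their limits.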

For a discussion of the sampling variability of medians and means and of the method for obtaining standard errors of differences between two estimates, see 1960 Census of Population, Volume I, Characteristics of the Population, Part 1, United States Summary. For a discussion of the sampling variability of characteristics from the 1940 Census, see 1940 Census. Nativity and Parentage of the White Population. Mother Tongue.

Table E provides a factor by which the standard errors shown in table C or D should be multiplied to adjust for the combined effect of the sample design and the estimation procedure. To estimate a somewhat more precise standard error for a given characteristic, locate in table E the factor applying to the characteristic. Where data are shown as cross-classifications of two characteristics, locate each characteristic in table E. The factor to be used for any cross-classification will usually lie between the values of the factors. When a given characteristic is cross-classified in extensive detail (e.g., by single years of age), the factor to be used is the smaller one shown in table E. Where a characteristic is cross-classified in broad groups (or used in broad groups), the factor to be used in table E should be closer to the larger one. Multiply the standard error given for the size of the estimate as shown in table C by this factor from table E. The result of this multiplication is the approximate standard error. Similarly, to obtain a somewhat more precise estimate of the standard error of a percentage, multiply the standard error as shown in table D by the factor from table E.

Illustration: Table 2 shows there were an estimated 34,556 foreign born with Chinese mother tongue in California in 1960. Table E shows that, for data on mother tongue, the appropriate standard error in table C should be multiplied by a factor of 1.4. Linear interpolation in table C shows a rough approximation to the standard error for an estimate of 34,556 is 288. The factor of 1.4 times the standard error is 403. This means the chances are 2 out of 3 that the results of a complete count of all people in California would not differ by more than 403 from an estimate in this sample. Furthermore, the chances are about 99 in 100 that the difference is less than 1,008, which is 2½ times the standard error determined from tables C and E.
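The arithmetic of this illustration can be reproduced with a short Python sketch. The function names are illustrative, and rounding to whole numbers after each step is an assumption chosen to match the figures quoted in the illustration.

```python
import math

# Table C: (estimated number, rough standard error)
TABLE_C = [
    (50, 15), (100, 20), (250, 30), (500, 40), (1000, 50), (2500, 80),
    (5000, 110), (10000, 160), (15000, 190), (25000, 250), (50000, 350),
]

def table_c_standard_error(estimate):
    """Linearly interpolate the rough standard error of an estimated number."""
    for (x0, se0), (x1, se1) in zip(TABLE_C, TABLE_C[1:]):
        if x0 <= estimate <= x1:
            return se0 + (estimate - x0) * (se1 - se0) / (x1 - x0)
    raise ValueError(f"{estimate} is outside the range of table C")

def round_half_up(value):
    """Round to the nearest whole number, with halves rounded upward."""
    return math.floor(value + 0.5)

estimate = 34556          # foreign born with Chinese mother tongue, California
factor = 1.4              # table E factor for mother tongue

rough_se = round_half_up(table_c_standard_error(estimate))  # 288
adjusted_se = round_half_up(factor * rough_se)               # 403
bound_99 = round_half_up(2.5 * adjusted_se)                  # 1008

print(rough_se, adjusted_se, bound_99)
```

The interpolation falls between the table C entries for 25,000 and 50,000: 250 + (9,556 / 25,000) × 100 = 288.2, which rounds to 288.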

Table E. Factor to Be Applied To Standard Errors

Characteristics	Factor
Foreign born, country of birth, mother tongue	1.4
All other characteristics	1.0
