Sample Design and Sampling Variability - Occupation by Earnings and Education (Volume II, Part VII - Subject Reports) - Survey Census 1960 (US, County & State)

Publisher: U.S. Census Bureau

Survey: Census 1960 (US, County & State)

Document:	Occupation by Earnings and Education (Volume II, Part VII - Subject Reports)
Citation:	U.S. Bureau of the Census. U.S. Census of Population: 1960. Subject Reports, Occupation by Earnings and Education. Final Report PC(2)-7B. U.S. Government Printing Office, Washington, D.C. 1963.

Sample Design and Sampling Variability

Sample Design

For persons in housing units at the time of the 1960 Census, the sampling unit was the housing unit and all its occupants; for persons in group quarters, it was the person. On the first visit to an address, the enumerator assigned a sample key letter (A, B, C, or D) to each housing unit sequentially in the order in which he first visited the units, whether or not he completed an interview. Each enumerator was given a random key letter to start his assignment, and the order of canvassing was indicated in advance, although these instructions allowed some latitude in the order of visiting addresses. Each housing unit to which the key letter "A" was assigned was designated as a sample unit, and all persons enumerated in the unit were included in the sample. In every group quarters, the sample consisted of every fourth person in the order listed.

The 1960 statistics in this report are based on a subsample of one-fifth of the original 25-percent sample schedules. The subsample was selected on the computer, using a stratified systematic sample design. The strata were made up as follows: For persons in regular housing units there were 36 strata, i.e., 9 household size groups by 2 tenure groups by 2 color groups; for persons in group quarters, there were 2 strata, i.e., the 2 color groups.

Although the sampling procedure did not automatically insure an exact 5-percent sample of persons, the sample design was unbiased if carried through according to instructions. Generally, for large areas, the deviation from the estimated sample size was found to be quite small. Biases may have arisen, however, when the enumerator failed to follow his listing and sampling instructions exactly.
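
The key-letter procedure described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: units are visited in a fixed order, and the function names (`assign_key_letters`, `sample_units`) are hypothetical, not taken from the report.

```python
from itertools import cycle, islice

KEY_LETTERS = "ABCD"


def assign_key_letters(n_units, start_letter):
    """Assign key letters A-D to housing units in order of first visit.

    start_letter stands in for the random key letter given to the
    enumerator at the start of the assignment.
    """
    start = KEY_LETTERS.index(start_letter)
    rotation = KEY_LETTERS[start:] + KEY_LETTERS[:start]
    return list(islice(cycle(rotation), n_units))


def sample_units(letters):
    """Return positions of the units designated as sample units (letter "A")."""
    return [i for i, letter in enumerate(letters) if letter == "A"]


# Ten units visited in order, with an assumed starting key letter of "C".
letters = assign_key_letters(10, start_letter="C")
print("".join(letters))       # CDABCDABCD
print(sample_units(letters))  # [2, 6]
```

Because every fourth unit receives the letter "A", the sketch yields roughly a 25-percent sample of units whatever the starting letter; the count is exact only when the number of units is a multiple of four.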

Ratio Estimation

The statistics based on the 5-percent sample of the 1960 Census returns are estimates that have been developed through the use of a ratio estimation procedure. This procedure was carried out for each of the following 44 groups of persons in each of the sample weighting areas: ²

Group	Sex, color, and age	Relationship and tenure
	Male white:
1	Under 5
2	5 to 13
3	14 to 24	Head of owner household
4	14 to 24	Head of renter household
5	14 to 24	Not head of household
6-8	25 to 44	Same groups as age group 14 to 24
9-11	45 and over	Same groups as age group 14 to 24
	Male nonwhite:
12-22	Same groups as male white
	Female white:
23-33	Same groups as male white
	Female nonwhite:
34-44	Same groups as male white

The sample weighting areas were defined as those areas within a State consisting of central cities of urbanized areas, the remaining portion of urbanized areas not in central cities, urban places not in urbanized areas, or rural areas. ³

For each of the 44 groups, the ratio of the complete count to the sample count of the population in the group was determined. Each specific sample person in the group was assigned an integral weight so that the sum of the weights would equal the complete count for the group. For example, if the ratio for a group was 20.1, one-tenth of the persons (selected at random) within the group were assigned a weight of 21, and the remaining nine-tenths a weight of 20. The use of such a combination of integral weights rather than a single fractional weight was adopted to avoid the complications involved in rounding in the final tables. In order to increase the reliability, where there were fewer than 275 persons in the complete count in a group, or where the resulting weight was over 80, groups were combined in a specific order to satisfy both conditions.
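
The integral-weight assignment can be illustrated with a short Python sketch. It assumes the weights are obtained by integer division of the complete count by the sample count, with the remainder spread at random; the function name `integral_weights` and the example counts are hypothetical, and the combining of small groups mentioned above is not modelled.

```python
import random


def integral_weights(complete_count, sample_count, rng=None):
    """Give each sample person an integer weight so the weights sum to complete_count."""
    rng = rng or random.Random()
    base, extra = divmod(complete_count, sample_count)
    # 'extra' persons get base + 1; everyone else gets base.
    weights = [base + 1] * extra + [base] * (sample_count - extra)
    rng.shuffle(weights)  # which persons get the larger weight is random
    return weights


# Illustrative group: 100 sample persons, complete count 2,010 (ratio 20.1).
weights = integral_weights(2010, 100, rng=random.Random(0))
print(weights.count(21), weights.count(20))  # 10 90
print(sum(weights))                          # 2010
```

With a ratio of 20.1, one-tenth of the sample persons receive a weight of 21 and nine-tenths a weight of 20, and the weights add up to the complete count, matching the example in the text.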

These ratio estimates reduce the component of sampling error arising from the variation in the size of household and achieve some of the gains of stratification in the selection of the sample, with the strata being the groups for which separate ratio estimates are computed. The net effect is a reduction in the sampling error and bias of most statistics below what would be obtained by weighting the results of the 5-percent sample by a uniform factor of 20. The reduction in sampling error will be trivial for some items and substantial for others. A by-product of this estimation procedure is that estimates for this sample are generally consistent with the complete count with respect to the total population and for the subdivisions used as groups in the estimation procedure. A more complete discussion of the technical aspects of these ratio estimates will be presented in another report.

Footnote:

² Estimates of characteristics from the sample for a given area are produced using the formula

$$x' = \sum_{i=1}^{44} \frac{x_i}{y_i} Y_i$$

where x' is the estimate of the characteristic for the area obtained through the use of the ratio estimation procedure,
x_i is the count of sample persons with the characteristic for the area in one (i) of the 44 groups,
y_i is the count of all sample persons for the area in the same one of the 44 groups, and
Y_i is the count of persons in the complete count for the area in the same one of the 44 groups.
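
Going by the symbol definitions in this footnote, the estimate appears to be the sum over the 44 groups of (x_i / y_i) · Y_i. The Python sketch below computes that sum. The function name `ratio_estimate`, the decision to skip groups with no sample persons, and the two example groups are illustrative assumptions, not figures from the report.

```python
def ratio_estimate(groups):
    """Sum (x_i / y_i) * Y_i over groups given as (x_i, y_i, Y_i) tuples.

    x_i: sample persons in the group with the characteristic
    y_i: all sample persons in the group
    Y_i: persons in the complete count for the group
    """
    total = 0.0
    for x_i, y_i, Y_i in groups:
        if y_i == 0:
            continue  # assumption: a group with no sample persons adds nothing
        total += x_i / y_i * Y_i
    return total


# Two made-up groups, only to show the arithmetic.
groups = [(12, 100, 2010), (5, 50, 1000)]
estimate = ratio_estimate(groups)
uniform = 20 * sum(x for x, _, _ in groups)  # uniform weighting by 20, for comparison
print(round(estimate, 1), uniform)
```

The comparison line shows how the ratio estimate can differ from simple weighting by a uniform factor of 20 when a group's complete count is not exactly 20 times its sample count.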

Sampling Variability

The figures from the 5-percent sample tabulations are subject to sampling variability, which can be estimated roughly from the standard errors shown in tables A and B. These tables⁴ do not reflect the effect of response variance, processing variance, or bias arising in the collection, processing, and estimation steps. Estimates of the magnitude of some of these factors in the total error are being evaluated and will be published at a later date. The chances are about 2 out of 3 that the difference due to sampling variability between an estimate and the figure that would have been obtained from a complete count of the population is less than the standard error. The chances are about 19 out of 20 that the difference is less than twice the standard error and about 99 out of 100 that it is less than 2½ times the standard error. The amount by which the estimated standard error must be multiplied to obtain other odds deemed more appropriate can be found in most statistical textbooks.
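
The odds quoted above are consistent with treating the sampling error as approximately normally distributed. Under that assumption (the report itself only points to statistical textbooks), the following Python sketch recovers the quoted odds and finds the multiplier for other odds. The helper names `coverage` and `multiplier` are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def coverage(multiple):
    """Chance that the sampling error is within +/- `multiple` standard errors."""
    return STANDARD_NORMAL.cdf(multiple) - STANDARD_NORMAL.cdf(-multiple)


def multiplier(probability):
    """Number of standard errors needed for a two-sided chance of `probability`."""
    return STANDARD_NORMAL.inv_cdf(0.5 + probability / 2)


for k in (1, 2, 2.5):
    print(k, round(coverage(k), 3))

print(round(multiplier(0.90), 2))  # multiplier for odds of 9 out of 10
```

Under the normal assumption, one standard error corresponds to roughly 68 percent (about 2 out of 3), two to roughly 95 percent (about 19 out of 20), and 2½ to just under 99 percent.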

Table A. Rough Approximation to Standard Error of Estimated Number
(Range of 2 chances out of 3)

Estimated number	Standard error
50	30
100	40
250	60
500	90
1,000	120
2,500	200
5,000	280
10,000	390
15,000	480
25,000	620
50,000	880

Table B. Rough Approximation to Standard Error of Estimated Percentage
(Range of 2 chances out of 3)

Estimated percentage	Base of percentage
	500	1,000	2,500	10,000	25,000	100,000
2 or 98	3.3	2.3	1.3	0.8	0.3	0.3
5 or 95	5.0	4.0	2.3	1.0	0.5	0.3
10 or 90	7.0	5.0	3.0	1.5	0.8	0.
25 or 75	10.0	6.8	3.8	1.8	1.0	0
50	11.0	7.8	4.0	2.0	1.3	0.8

Table A shows rough standard errors of estimated numbers up to 50,000. The relative sampling errors of larger estimated numbers are somewhat smaller than for 50,000. For estimated numbers above 50,000, however, the nonsampling errors, e.g., response errors and processing errors, may have an increasingly important effect on the total error. Table B shows rough standard errors of data in the form of percentages. Linear interpolation in tables A and B will provide approximate results that are satisfactory for most purposes.

For a discussion of the sampling variability of medians and means and of the method for obtaining standard errors of differences between two estimates, see 1960 Census of Population, Volume I, Characteristics of the Population, Part 1, United States Summary.

Illustration: Table 3 for the South shows that there are 17,000 craftsmen, foremen, and kindred workers, age 35 to 44 years, who completed 4 years of high school and who are in the income class $3,000 to $3,999. Linear interpolation in table A between the entries for 15,000 and 25,000 gives an approximate standard error of 508 for an estimate of 17,000, which means that the chances are approximately 2 out of 3 that the results of a complete census would not differ by more than 508 from this estimated 17,000. It also follows that there is only 1 chance in 100 that a complete census result would differ by as much as 1,270, that is, by about 2½ times the number estimated from table A.
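
The arithmetic behind this illustration can be reproduced with a small Python sketch that interpolates linearly between the entries of table A. The function name `standard_error` is hypothetical, and refusing estimates outside the table's range is a choice made for this sketch.

```python
from bisect import bisect_left

# (estimated number, standard error) pairs copied from table A.
TABLE_A = [
    (50, 30), (100, 40), (250, 60), (500, 90), (1000, 120), (2500, 200),
    (5000, 280), (10000, 390), (15000, 480), (25000, 620), (50000, 880),
]


def standard_error(estimate):
    """Approximate standard error by linear interpolation in table A."""
    sizes = [size for size, _ in TABLE_A]
    if not sizes[0] <= estimate <= sizes[-1]:
        raise ValueError("estimate is outside the range covered by table A")
    i = bisect_left(sizes, estimate)
    if sizes[i] == estimate:
        return float(TABLE_A[i][1])  # exact table entry, no interpolation needed
    (x0, y0), (x1, y1) = TABLE_A[i - 1], TABLE_A[i]
    return y0 + (estimate - x0) * (y1 - y0) / (x1 - x0)


se = standard_error(17000)
print(se)        # 508.0
print(2.5 * se)  # 1270.0
```

The estimate of 17,000 lies one-fifth of the way from 15,000 to 25,000, so the standard error is 480 plus one-fifth of the 140-point gap, or 508, and 2½ times that is 1,270.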

Footnote:

⁴ The estimates of sampling variability are based on calculations from a preliminary sample of the 1960 Census results. Further estimates are being calculated and will be available at a later date.