P131. Imputation Of Age [3] - Summary Tape File 3 (STF3) - Census 1990

Data Dictionary: Census 1990

Survey: Census 1990

Data Source: U.S. Census Bureau

Data set: Summary Tape File 3 (STF3)

Table: P131. Imputation Of Age [3]

Universe: Persons

Table Details

Variable	Label
P131_001	Persons
P131_002	Allocated
P131_003	Not allocated
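
The three variables above can be combined into an allocation rate for an area. The following is a minimal sketch: the function name and the sample counts are hypothetical, and it assumes that P131_001 equals the sum of P131_002 and P131_003, which the table layout suggests but does not state.

```python
def allocation_rate(persons: int, allocated: int, not_allocated: int) -> float:
    """Return the percentage of persons whose age was allocated (imputed)."""
    # Assumption: P131_001 (Persons) = P131_002 (Allocated) + P131_003 (Not allocated).
    if persons != allocated + not_allocated:
        raise ValueError("P131_002 + P131_003 should equal P131_001")
    if persons == 0:
        return 0.0
    return 100.0 * allocated / persons


# Hypothetical counts for one geographic area (not real census values).
row = {"P131_001": 1200, "P131_002": 30, "P131_003": 1170}
rate = allocation_rate(row["P131_001"], row["P131_002"], row["P131_003"])
print(f"{rate:.1f}% of age values were allocated")
```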

Relevant Documentation:

Excerpt from:	Social Explorer, U.S. Census Bureau; Census of Population and Housing, 1990: Summary Tape File 3 on CD-ROM [machine-readable data files] / prepared by the Bureau of the Census. Washington: The Bureau [producer and distributor], 1991.
	Summary Tape File 3 -> Appendix C. Accuracy of the Data -> Confidentiality of the Data

Confidentiality of the Data

To maintain the confidentiality required by law (Title 13, United States Code), the Bureau of the Census applies a confidentiality edit to the 1990 census data to assure that published data do not disclose information about specific individuals, households, or housing units. As a result, a small amount of uncertainty is introduced into the estimates of census characteristics. The sample itself provides adequate protection for most areas for which sample data are published since the resulting data are estimates of the actual counts; however, small areas require more protection. The edit is controlled so that the basic structure of the data is preserved.

The confidentiality edit is implemented by selecting a small subset of individual households from the internal sample data files and blanking a subset of the data items on these household records. Responses to those data items are then imputed using the same imputation procedures that are used for nonresponse. A larger subset of households is selected for the confidentiality edit in small areas to provide greater protection for these areas. The editing process is implemented in such a way that the quality and usefulness of the data are preserved.

Excerpt from:	Social Explorer, U.S. Census Bureau; Census of Population and Housing, 1990: Summary Tape File 3 on CD-ROM [machine-readable data files] / prepared by the Bureau of the Census. Washington: The Bureau [producer and distributor], 1991.
	Summary Tape File 3 -> Appendix C. Accuracy of the Data -> Editing of Unacceptable Data

Editing of Unacceptable Data

The objective of the processing operation is to produce a set of data that describes the population as accurately and clearly as possible. To meet this objective, questionnaires were edited during field data collection operations for consistency, completeness, and acceptability. Questionnaires also were reviewed by census clerks for omissions, certain specific inconsistencies, and population coverage. For example, write-in entries such as "Don't know" or "NA" were considered unacceptable. For some district offices, the initial edit was automated; however, for the majority of the district offices, it was performed by clerks. As a result of this operation, a telephone or personal visit followup was made to obtain missing information. Potential coverage errors were included in the followup, as well as a sample of questionnaires with omissions and/or inconsistencies. Subsequent to field operations, remaining incomplete or inconsistent information on the questionnaires was assigned using imputation procedures during the final automated edit of the collected data. Imputations, or computer assignments of acceptable codes in place of unacceptable entries or blanks, are needed most often when an entry for a given item is lacking or when the information reported for a person or housing unit on that item is inconsistent with other information for that same person or housing unit. As in previous censuses, the general procedure for changing unacceptable entries was to assign an entry for a person or housing unit that was consistent with entries for persons or housing units with similar characteristics. The assignment of acceptable codes in place of blanks or unacceptable entries enhances the usefulness of the data.
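
The general idea of assigning a value from a record with similar characteristics can be illustrated with a small sketch. This is not the Bureau's actual procedure, which the excerpt does not spell out. It assumes a simple sequential donor scheme in which a missing value is filled from the most recently processed record that matches on a few fields. The function name, field names, and sample records are all hypothetical.

```python
def hot_deck_allocate(records, match_fields, target):
    """Fill missing `target` values from the latest donor with the same match fields.

    Assumption: a sequential donor scheme, used here only for illustration.
    Each returned record carries an "allocated" flag, mirroring the
    Allocated / Not allocated split reported in table P131.
    """
    last_donor = {}  # match key -> most recently seen acceptable value
    output = []
    for record in records:
        key = tuple(record[field] for field in match_fields)
        filled = dict(record)
        if record.get(target) is None:
            # Missing entry: borrow the value from the latest matching donor, if any.
            filled["allocated"] = key in last_donor
            if key in last_donor:
                filled[target] = last_donor[key]
        else:
            # Acceptable entry: keep it and remember it as a potential donor.
            filled["allocated"] = False
            last_donor[key] = record[target]
        output.append(filled)
    return output


# Hypothetical person records; None marks a blank or unacceptable age entry.
people = [
    {"relationship": "householder", "marital": "married", "age": 42},
    {"relationship": "child", "marital": "never married", "age": 9},
    {"relationship": "householder", "marital": "married", "age": None},
    {"relationship": "child", "marital": "never married", "age": None},
]
filled = hot_deck_allocate(people, ["relationship", "marital"], "age")
allocated_count = sum(r["allocated"] for r in filled)
print(allocated_count, "of", len(filled), "ages were allocated")
```

A record with no earlier matching donor stays blank in this sketch; a production procedure would need a fallback for that case.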

Another way in which corrections were made during the computer editing process was through substitution; that is, the assignment of a full set of characteristics for a person or housing unit. When there was an indication that a housing unit was occupied but the questionnaire contained no information for the people within the household or the occupants were not listed on the questionnaire, a previously accepted household was selected as a substitute, and the full set of characteristics for the substitute was duplicated. The assignment of the full set of housing characteristics occurred when there was no housing information available. If the housing unit was determined to be occupied, the housing characteristics were assigned from a previously processed occupied unit. If the housing unit was vacant, the housing characteristics were assigned from a previously processed vacant unit.

Table A. Unadjusted Standard Error for Estimated Totals [Based on a 1-in-6 simple random sample]
	Size of publication area²
Estimated Total¹	500	1,000	2,500	5,000	10,000	25,000	50,000	100,000	250,000	500,000	1,000,000	5,000,000	10,000,000	25,000,000
50	16	16	16	16	16	16	16	16	16	16	16	16	16	16
100	20	21	22	22	22	22	22	22	22	22	22	22	22	22
250	25	30	35	35	35	35	35	35	35	35	35	35	35	35
500		35	45	45	50	50	50	50	50	50	50	50	50	50
1,000			55	65	65	70	70	70	70	70	70	70	70	70
2,500				80	95	110	110	110	110	110	110	110	110	110
5,000					110	140	150	150	160	160	160	160	160	160
10,000						170	200	210	220	220	220	220	220	220
15,000						170	230	250	270	270	270	270	270	270
25,000							250	310	340	350	350	350	350	350
75,000								310	510	570	590	610	610	610
100,000									550	630	670	700	700	710
250,000										790	970	1,090	1,100	1,100
500,000											1,120	1,500	1,540	1,570
1,000,000												2,000	2,120	2,190
5,000,000													3,540	4,470
10,000,000														5,480

Footnote:

¹For estimated totals larger than 10,000,000, the standard error is somewhat larger than the table values. The formula given below should be used to calculate the standard error.

²The total count of persons in the area if the estimated total is a person characteristic, or the total count of housing units in the area if the estimated total is a housing unit characteristic.
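
The formula the footnote refers to is not reproduced in this excerpt. As a hedged sketch, the function below assumes the standard simple-random-sample form for a 1-in-6 sample, SE = sqrt(5 * Y * (1 - Y / N)), where Y is the estimated total and N is the size of the publication area. The factor 5 is (1 - f) / f with f = 1/6. The function name is hypothetical. This form agrees closely with the Table A entries, which appear to be rounded.

```python
import math


def unadjusted_se_total(estimated_total: float, area_size: float) -> float:
    """Unadjusted standard error of an estimated total from a 1-in-6 sample.

    Assumption: SE = sqrt(5 * Y * (1 - Y / N)), the simple-random-sample form.
    """
    if area_size <= 0 or not 0 <= estimated_total <= area_size:
        raise ValueError("estimated_total must lie between 0 and area_size")
    return math.sqrt(5.0 * estimated_total * (1.0 - estimated_total / area_size))


# Estimated total of 250 in a publication area of 500 (Table A lists 25).
se_small = unadjusted_se_total(250, 500)
# Estimated total of 10,000,000 in an area of 25,000,000 (Table A lists 5,480).
se_large = unadjusted_se_total(10_000_000, 25_000_000)
print(round(se_small), round(se_large, -1))
```

The result is unadjusted, as in Table A itself.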

Table B. Unadjusted Standard Error in Percentage Points for Estimated Percentage [Based on a 1-in-6 simple random sample]
	Base of percentage¹
Estimated percentage	500	750	1,000	1,500	2,500	5,000	7,500	10,000	25,000	50,000	100,000	250,000	500,000
2 or 98	1.4	1.1	1.0	0.8	0.6	0.4	0.4	0.3	0.2	0.1	0.1	0.1	0.1
5 or 95	2.2	1.8	1.5	1.3	1.0	0.7	0.6	0.5	0.3	0.2	0.2	0.1	0.1
10 or 90	3.0	2.4	2.1	1.7	1.3	0.9	0.8	0.7	0.4	0.3	0.2	0.1	0.1
15 or 85	3.6	2.9	2.5	2.1	1.6	1.1	0.9	0.8	0.5	0.4	0.3	0.2	0.1
20 or 80	4.0	3.3	2.8	2.3	1.8	1.3	1.0	0.9	0.6	0.4	0.3	0.2	0.1
25 or 75	4.3	3.5	3.1	2.5	1.9	1.4	1.1	1.0	0.6	0.4	0.3	0.2	0.1
30 or 70	4.6	3.7	3.2	2.6	2.0	1.4	1.2	1.0	0.6	0.5	0.3	0.2	0.1
35 or 65	4.8	3.9	3.4	2.8	2.1	1.5	1.2	1.1	0.7	0.5	0.3	0.2	0.2
50	5.0	4.1	3.5	2.9	2.2	1.6	1.3	1.1	0.7	0.5	0.4	0.2	0.2

Footnote:

¹For a percentage and/or base of percentage not shown in the table, the formula given below may be used to calculate the standard error. This table should only be used for proportions, that is, where the numerator is a subset of the denominator.
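
For a percentage or base that is not listed in Table B, the calculation can be sketched as follows. The excerpt does not reproduce the formula, so this assumes the simple-random-sample form for a 1-in-6 sample, SE = sqrt((5 / B) * p * (100 - p)), with p in percent and B the base of the percentage. The function name is hypothetical. As the footnote says, this applies only when the numerator is a subset of the denominator.

```python
import math


def unadjusted_se_percentage(percentage: float, base: float) -> float:
    """Unadjusted standard error, in percentage points, for a 1-in-6 sample.

    Assumption: SE = sqrt((5 / B) * p * (100 - p)), the simple-random-sample form.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if base <= 0:
        raise ValueError("base must be positive")
    return math.sqrt((5.0 / base) * percentage * (100.0 - percentage))


# Values that can be checked against Table B.
se_50_of_500 = unadjusted_se_percentage(50, 500)      # Table B lists 5.0
se_10_of_1000 = unadjusted_se_percentage(10, 1_000)   # Table B lists 2.1
# A combination not shown in the table: 12 percent on a base of 3,000.
se_12_of_3000 = unadjusted_se_percentage(12, 3_000)
print(round(se_50_of_500, 1), round(se_10_of_1000, 1), round(se_12_of_3000, 1))
```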

Excerpt from:	Social Explorer, U.S. Census Bureau; Census of Population and Housing, 1990: Summary Tape File 3 on CD-ROM [machine-readable data files] / prepared by the Bureau of the Census. Washington: The Bureau [producer and distributor], 1991.
	Summary Tape File 3 -> Appendix B. Definitions of Subject Characteristics -> Population Characteristics -> Age

Age

The data on age were derived from answers to questionnaire item 5, which was asked of all persons. The age classification is based on the age of the person in complete years as of April 1, 1990. The age response in question 5a was used normally to represent a person's age. However, when the age response was unacceptable or unavailable, a person's age was derived from an acceptable year-of-birth response in question 5b.

Data on age are used to determine the applicability of other questions for a person and to classify other characteristics in census tabulations. Age data are needed to interpret most social and economic characteristics used to plan and examine many programs and policies. Therefore, age is tabulated by single years of age and by many different groupings, such as 5-year age groups.

Some tabulations are shown by the age of the householder. These data were derived from the age responses for each householder. (For more information on householder, see the discussion under "Household Type and Relationship.")

Median Age

This measure divides the age distribution into two equal parts: one-half of the cases falling below the median value and one-half above the value. Generally, median age is computed on the basis of more detailed age intervals than are shown in some census publications; thus, a median based on a less detailed distribution may differ slightly from a corresponding median for the same population based on a more detailed distribution. (For more information on medians, see the discussion under "Derived Measures.")
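
The effect of interval detail on the median can be shown with a short sketch. It assumes the median is estimated from grouped counts by linear interpolation within the interval that contains the midpoint of the distribution. The excerpt does not state the method, so this is an illustrative assumption. The function name and the counts are hypothetical.

```python
def grouped_median(intervals):
    """Estimate a median from (lower, upper, count) intervals.

    Assumptions: intervals are sorted and contiguous, `upper` is exclusive,
    and cases are spread evenly within each interval (linear interpolation).
    """
    total = sum(count for _, _, count in intervals)
    if total == 0:
        raise ValueError("distribution has no cases")
    half = total / 2.0
    cumulative = 0
    for lower, upper, count in intervals:
        if count > 0 and cumulative + count >= half:
            # Interpolate to the point where half of the cases fall below.
            return lower + (half - cumulative) / count * (upper - lower)
        cumulative += count
    raise ValueError("intervals do not cover the distribution")


# The same hypothetical population, tabulated at two levels of detail.
detailed = [(0, 5, 10), (5, 10, 30), (10, 15, 40), (15, 20, 20)]
coarse = [(0, 10, 40), (10, 20, 60)]
median_detailed = grouped_median(detailed)
median_coarse = grouped_median(coarse)
print(median_detailed, round(median_coarse, 2))
```

The two results differ slightly even though both tabulations describe the same 100 cases.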

Limitation of the Data

Counts in 1970 and 1980 for persons 100 years old and over were substantially overstated. Improvements were made in the questionnaire design, in the allocation procedures, and to the respondent instruction guide to attempt to minimize this problem for the 1990 census.

Review of detailed 1990 census information indicated that respondents tended to provide their age as of the date of completion of the questionnaire, not their age as of April 1, 1990. In addition, there may have been a tendency for respondents to round their age up if they were close to having a birthday. It is likely that approximately 10 percent of persons in most age groups are actually 1 year younger. For most single years of age, the misstatements are largely offsetting. The problem is most pronounced at age 0 because persons lost to age 1 may not have been fully offset by the inclusion of babies born after April 1, 1990, and because there may have been more rounding up to age 1 to avoid reporting age as 0 years. (Age in complete months was not collected for infants under age 1.)

The reporting of age 1 year older than age on April 1, 1990, is likely to have been greater in areas where the census data were collected later in 1990. The magnitude of this problem was much less in the three previous censuses where age was typically derived from respondent data on year of birth and quarter of birth. (For more information on the design of the age question, see the section below that discusses "Comparability.")

Comparability

Age data have been collected in every census. For the first time since 1950, the 1990 data are not available by quarter year of age. This change was made so that coded information could be obtained for both age and year of birth. In each census since 1940, the age of a person was assigned when it was not reported. In censuses before 1940, with the exception of 1880, persons of unknown age were shown as a separate category. Since 1960, assignment of unknown age has been performed by a general procedure described as "imputation." The specific procedures for imputing age have been different in each census. (For more information on imputation, see Appendix C, Accuracy of the Data.)