Data Dictionary: Census 2000
Survey: Census 2000
Data Source: U.S. Census Bureau
Table: P40. Imputation Of Population Items [3]
Universe: Population not substituted
Table Details
Variable Label
P040001
P040002
P040003
Relevant Documentation:
Excerpt from: Social Explorer, U.S. Census Bureau; 2000 Census of Population and Housing, Summary File 1: Technical Documentation, 2001.
 
Imputation
When information is missing or inconsistent, the Census Bureau uses a method called imputation to assign values. Imputation relies on the statistical principle of "homogeneity," or the tendency of households within a small geographic area to be similar in most characteristics. For example, the value of "rented" is likely to be imputed for a housing unit not reporting on owner/renter status in a neighborhood with multiunits or apartments where other respondents reported "rented" on the census questionnaire. In past censuses, when the occupancy status or the number of residents was not known for a housing unit, this information was imputed.
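The homogeneity idea can be illustrated with a sequential "hot deck", in which a missing answer is filled from the most recently reported answer among nearby units. This is only a minimal sketch under stated assumptions: the function name, the ordering of units by geographic proximity, and the default donor value are hypothetical, and the Census Bureau's production procedures are considerably more detailed.

```python
from typing import List, Optional, Tuple


def hot_deck_impute(values: List[Optional[str]], default: str) -> Tuple[List[str], List[bool]]:
    """Fill missing entries (None) with the most recently reported value.

    `values` is assumed to be ordered so that neighbors in the list are
    neighbors on the ground. `default` is a hypothetical starting donor
    used only if the first entries are missing.
    """
    filled: List[str] = []
    imputed: List[bool] = []
    last_reported = default
    for value in values:
        if value is None:
            # Missing answer: borrow from the nearest preceding respondent.
            filled.append(last_reported)
            imputed.append(True)
        else:
            filled.append(value)
            imputed.append(False)
            last_reported = value
    return filled, imputed


# Owner/renter status for five neighboring housing units; None = not reported.
tenure = ["rented", "rented", None, "owned", None]
filled, flags = hot_deck_impute(tenure, default="rented")
print(filled)  # ['rented', 'rented', 'rented', 'owned', 'owned']
print(flags)   # [False, False, True, False, True]
```

Keeping the flags alongside the filled values is what makes it possible to report later how many answers were imputed.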

Internet Questionnaire Assistance (IQA)
An operation that allows respondents to use the Census Bureau's Internet site to (1) ask questions and receive answers about the census form, job opportunities, or the purpose of the census and (2) provide responses to the short form.

Interpolation
Interpolation frequently is used in calculating medians or quartiles based on interval data and in approximating standard errors from tables. Linear interpolation is used to estimate values of a function between two known values. Pareto interpolation is an alternative to linear interpolation. In Pareto interpolation, the median is derived by interpolating between the logarithms of the upper and lower income limits of the median category. It is used by the Census Bureau in calculating median income within intervals wider than $2,500.
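The two approaches can be compared on a single median category. The sketch below is an illustration, not the Census Bureau's published routine: the function names and the example counts are hypothetical, and the Pareto version uses a common textbook form that interpolates between the logarithms of the category limits, weighted by the logarithms of the shares of the population above each limit.

```python
import math


def linear_median(lower, upper, cum_below, in_interval, total):
    """Median by linear interpolation inside the category [lower, upper)."""
    # Fraction of the category that must be crossed to reach the midpoint
    # of the distribution.
    fraction = (total / 2 - cum_below) / in_interval
    return lower + fraction * (upper - lower)


def pareto_median(lower, upper, cum_below, in_interval, total):
    """Median by Pareto (log-log) interpolation inside [lower, upper).

    Assumes lower > 0 and that some population lies above `upper`,
    so that every logarithm below is defined.
    """
    share_above_lower = (total - cum_below) / total
    share_above_upper = (total - cum_below - in_interval) / total
    weight = (math.log(share_above_lower) - math.log(0.5)) / (
        math.log(share_above_lower) - math.log(share_above_upper)
    )
    log_median = math.log(lower) + weight * (math.log(upper) - math.log(lower))
    return math.exp(log_median)


# Hypothetical data: 1,000 households, 400 below $40,000,
# 200 in the $40,000 to $50,000 category that contains the median.
print(linear_median(40_000, 50_000, 400, 200, 1_000))  # 45000.0
print(pareto_median(40_000, 50_000, 400, 200, 1_000))
```

With these example counts the Pareto estimate falls below the linear one, because the Pareto form assumes that households thin out toward the upper end of the category rather than being spread evenly across it.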

Excerpt from: Social Explorer, U.S. Census Bureau; 2000 Census of Population and Housing, Summary File 1: Technical Documentation, 2001.
 
Editing of Unacceptable Data
The objective of the processing operation was to produce a set of data that describes the population as accurately and clearly as possible. In a major change from past practice, the information on Census 2000 questionnaires generally was not edited during field data collection nor during data capture operations for consistency, completeness, and acceptability. Enumerator-filled questionnaires were reviewed by census crew leaders and local office clerks for adherence to specified procedures. No clerical review of mail return questionnaires was done to ensure that the information on the form could be data captured, nor were households contacted as in previous censuses to collect data that were missing from census returns.

Most census questionnaires received by mail from respondents as well as those filled by enumerators were processed through a new contractor-built image scanning system that used optical mark and character recognition to convert the responses into computer files. The optical character recognition, or OCR, process used several pattern and context checks to estimate accuracy thresholds for each write-in field. The system also used "soft edits" on most interpreted numeric write-in responses to decide whether the field values read by the machine interpretation were acceptable. If the value read had a lower than acceptable accuracy threshold or was outside of the soft edit range, the image of the item was displayed to a keyer, who then entered the response.
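The routing rule in the last sentence can be sketched as a simple check. Everything named here is hypothetical (the `FieldRead` record, the confidence scale, and the example threshold and range for an age field); the source does not give the actual thresholds or soft edit ranges.

```python
from dataclasses import dataclass


@dataclass
class FieldRead:
    """One machine-interpreted numeric write-in field (hypothetical shape)."""
    value: int
    confidence: float  # assumed to be a score between 0.0 and 1.0


def needs_keying(read: FieldRead, min_confidence: float,
                 soft_min: int, soft_max: int) -> bool:
    """Return True when the field image should be shown to a keyer."""
    if read.confidence < min_confidence:
        # Recognition accuracy is below the acceptable threshold.
        return True
    # Otherwise send it to a keyer only if it fails the soft edit range.
    return not (soft_min <= read.value <= soft_max)


# Hypothetical settings for an age field: threshold 0.90, soft range 0 to 115.
print(needs_keying(FieldRead(value=34, confidence=0.98), 0.90, 0, 115))   # False
print(needs_keying(FieldRead(value=34, confidence=0.50), 0.90, 0, 115))   # True
print(needs_keying(FieldRead(value=340, confidence=0.99), 0.90, 0, 115))  # True
```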

To control the creation of possibly erroneous people from questionnaires completed incorrectly or containing stray marks, an edit on the number of people indicated on each mail return and enumerator-filled questionnaire was implemented as part of the data capture system. Failure of this edit resulted in the review of the questionnaire image at a workstation by an operator, who identified erroneous person records and corrected OCR interpretation errors in the population count field.

At Census Bureau headquarters, the mail response data records were subjected to a computer edit that identified households exhibiting a possible coverage problem and those with more than six household members, the maximum number of persons who could be enumerated on a mail questionnaire. Attempts were made to contact these households on the telephone to correct the count inconsistency and to collect the census data for those people for whom there was no room on the questionnaire.
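A check of this kind might look like the sketch below. The six-person limit comes from the text above; treating a mismatch between the reported household count and the number of person records as the "possible coverage problem" is an assumption made for illustration, as are the function and constant names.

```python
MAX_PERSONS_ON_MAIL_FORM = 6  # limit stated in the documentation above


def needs_telephone_followup(reported_count: int, persons_recorded: int) -> bool:
    """Flag a mail return for follow-up (illustrative rule only)."""
    if reported_count > MAX_PERSONS_ON_MAIL_FORM:
        # More people than the form has room to enumerate.
        return True
    # Assumed definition of a coverage problem: the household count
    # disagrees with the number of person records on the form.
    return reported_count != persons_recorded


print(needs_telephone_followup(reported_count=4, persons_recorded=4))  # False
print(needs_telephone_followup(reported_count=8, persons_recorded=6))  # True
print(needs_telephone_followup(reported_count=3, persons_recorded=2))  # True
```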

Incomplete or inconsistent information on the questionnaire data records was assigned acceptable values using imputation procedures during the final automated edit of the collected data. Imputations, or computer assignments of acceptable codes in place of unacceptable entries or blanks, are needed most often when an entry for a given item is lacking or when the information reported for a person on that item is inconsistent with other information for that person. This process is known as allocation. As in previous censuses, the general procedure for changing unacceptable entries was to assign an entry for a person that was consistent with entries for persons with similar characteristics. The assignment of acceptable codes in place of blanks or unacceptable entries enhances the usefulness of the data. Allocation rates for census items are made available with the published census data.
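Given per-item allocation flags on each person record, an allocation rate and a person-level summary can be tallied as in the sketch below. The record layout, item names, and output keys are hypothetical; the split into people with no items allocated and people with one or more items allocated is an assumption about how a summary such as table P40 might be organized, since the variable labels are not shown on this page.

```python
from typing import Dict, List

# Each person record maps an item name to True if that item was allocated.
PersonFlags = Dict[str, bool]


def allocation_rate(records: List[PersonFlags], item: str) -> float:
    """Share of person records whose value for `item` was allocated."""
    if not records:
        return 0.0
    return sum(1 for flags in records if flags[item]) / len(records)


def summarize_allocation(records: List[PersonFlags]) -> Dict[str, int]:
    """Count people with no allocated items versus one or more."""
    total = len(records)
    none_allocated = sum(1 for flags in records if not any(flags.values()))
    return {
        "total": total,
        "no_items_allocated": none_allocated,
        "one_or_more_items_allocated": total - none_allocated,
    }


# Hypothetical records for four people (substituted people excluded).
people = [
    {"sex": False, "age": False},
    {"sex": False, "age": True},
    {"sex": True, "age": True},
    {"sex": False, "age": False},
]
print(allocation_rate(people, "age"))  # 0.5
print(summarize_allocation(people))
```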

Another way corrections were made during the computer editing process was through substitution; that is, the assignment of a full set of characteristics for people in a household. When there was an indication that a household was occupied by a specified number of people, but the questionnaire contained no information for the people within the household or the occupants were not listed on the questionnaire, a previously accepted household of the same size was selected as a substitute, and the full set of characteristics for the substitute was duplicated. Housing characteristics are not substituted. Matrix H18, Occupied Housing Units Substituted, represents a count of occupied housing units into which all persons have been substituted.
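Substitution can be sketched as copying the person records of the most recently accepted household of the same size. This is an illustration under stated assumptions: the dictionary layout, the `substituted` flag, and the rule for choosing the donor (the latest accepted household of that size in processing order) are hypothetical, and housing characteristics are left untouched, as the text specifies.

```python
import copy
from typing import Any, Dict, List


def substitute_households(households: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Fill empty occupied households from a same-size donor household."""
    last_accepted: Dict[int, List[Dict[str, Any]]] = {}
    result = []
    for household in households:
        size, people = household["size"], household["people"]
        if people:
            # Accepted household: remember it as a potential donor.
            last_accepted[size] = people
            result.append({**household, "substituted": False})
        elif size in last_accepted:
            # Known size but no person data: duplicate the donor's people.
            donor_people = copy.deepcopy(last_accepted[size])
            result.append({**household, "people": donor_people, "substituted": True})
        else:
            # No donor of this size has been seen yet; leave unchanged.
            result.append({**household, "substituted": False})
    return result


households = [
    {"id": "A", "size": 2, "people": [{"age": 30}, {"age": 32}]},
    {"id": "B", "size": 2, "people": []},
    {"id": "C", "size": 3, "people": []},
]
result = substitute_households(households)
substituted_units = sum(1 for h in result if h["substituted"])
print(substituted_units)  # 1
```

A count like `substituted_units` is the kind of figure a matrix such as H18 reports for occupied housing units.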

Excerpt from: Social Explorer, U.S. Census Bureau; 2000 Census of Population and Housing, Summary File 1: Technical Documentation, 2001.
 
Subject Content
Summary File 1 (SF 1) contains the 100-percent data, which is the information compiled from the questions asked of all people and about every housing unit. Population items include sex, age, race, Hispanic or Latino, household relationship, and group quarters. Housing items include occupancy status, vacancy status, and tenure (owner occupied or renter occupied).

There are 171 population tables (identified with a "P") and 56 housing tables (identified with an "H") shown down to the block level, and 59 population tables shown down to the census tract level (identified with a "PCT"), for a total of 286 tables. There are 14 population tables and 4 housing tables shown down to the block level, and 4 population tables shown down to the census tract level, that are repeated by major race and Hispanic or Latino groups (see footnote 2).

SF 1 includes population and housing characteristics for the total population, population totals for an extensive list of race (American Indian and Alaska Native tribes, Asian, and Native Hawaiian and Other Pacific Islander) and Hispanic or Latino groups, and population and housing characteristics for a limited list of race and Hispanic or Latino groups. Population and housing items may be cross tabulated. Selected aggregates and medians also are provided. A complete listing of subjects in this file is found in the section, "Subject Locator."



Footnotes:

2. These selected tables are repeated by the following: White alone; Black or African American alone; American Indian and Alaska Native alone; Asian alone; Native Hawaiian and Other Pacific Islander alone; Some other race alone; Two or more races; Hispanic or Latino; and White alone, not Hispanic or Latino. One matrix, PCT12, is also repeated by Black or African American alone, not Hispanic or Latino; American Indian and Alaska Native alone, not Hispanic or Latino; Asian alone, not Hispanic or Latino; Native Hawaiian and Other Pacific Islander alone, not Hispanic or Latino; Some other race alone, not Hispanic or Latino; and Two or more races, not Hispanic or Latino.