Data Dictionary: Census 2000
Survey: Census 2000
Data Source: U.S. Census Bureau
Universe: Total population
Variable Details
PCT15. Ancestry
Universe: Total population
PCT015005 Ancestry not specified
Aggregation method:
Relevant Documentation:
Excerpt from: Social Explorer, U.S. Census Bureau; 2000 Census of Population and Housing, Summary File 3: Technical Documentation, 2002.
The data on ancestry were derived from answers to long-form questionnaire Item 10, which was asked of a sample of the population. The data represent self-classification by people according to the ancestry group or groups with which they most closely identify. Ancestry refers to a person's ethnic origin or descent, "roots," heritage, or the place of birth of the person, the person's parents, or their ancestors before their arrival in the United States. Some ethnic identities, such as Egyptian or Polish, can be traced to geographic areas outside the United States, while other ethnicities, such as Pennsylvania German or Cajun, evolved in the United States.

The intent of the ancestry question was not to measure the degree of attachment the respondent had to a particular ethnicity. For example, a response of "Irish" might reflect total involvement in an Irish community or only a memory of ancestors several generations removed from the individual. Also, the question was intended to provide data for groups that were not included in the Hispanic origin and race questions. Official Hispanic origin data come from long-form questionnaire Item 5, and official race data come from long-form questionnaire Item 6. Therefore, although data on all groups are collected, the ancestry data shown in these tabulations are for non-Hispanic and non-race groups. Hispanic and race groups are included in the "Other groups" category for the ancestry tables in these tabulations.

The ancestry question allowed respondents to report one or more ancestry groups, although only the first two were coded. If a response was in terms of a dual ancestry, for example, "Irish English," the person was assigned two codes, in this case one for Irish and another for English. However, in certain cases, multiple responses such as "French Canadian," "Greek Cypriote," and "Scotch Irish" were assigned a single code reflecting their status as unique groups. If a person reported one of these unique groups in addition to another group, for example, "Scotch Irish English," resulting in three terms, that person received one code for the unique group (Scotch-Irish) and another one for the remaining group (English). If a person reported "English Irish French," only English and Irish were coded. Certain combinations of ancestries where the ancestry group is a part of another, such as "German-Bavarian," were coded as a single ancestry using the more specific group (Bavarian). Also, responses such as "Polish-American" or "Italian-American" were coded and tabulated as a single entry (Polish or Italian).

The Census Bureau accepted "American" as a unique ethnicity if it was given alone, with an ambiguous response, or with state names. If the respondent listed any other ethnic identity such as "Italian-American," generally the "American" portion of the response was not coded. However, distinct groups such as "American Indian," "Mexican American," and "African American" were coded and identified separately because they represented groups who considered themselves different from those who reported as "Indian," "Mexican," or "African," respectively.
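
The coding rules described in the two paragraphs above can be illustrated with a minimal Python sketch. It is not the Census Bureau's automated coding system: the function name code_ancestry and the small MULTI_TERM lookup table are hypothetical, and the table holds only the examples quoted in the text rather than the full code list. "American" reported with an ambiguous response or a state name is not modeled.

```python
# Hypothetical lookup of multi-term responses that receive a single code.
# Only the examples quoted in the text are listed; the real code list is far larger.
MULTI_TERM = {
    ("french", "canadian"): "French Canadian",
    ("greek", "cypriote"): "Greek Cypriote",
    ("scotch", "irish"): "Scotch-Irish",
    ("german", "bavarian"): "Bavarian",  # more specific group is used
    ("american", "indian"): "American Indian",
    ("mexican", "american"): "Mexican American",
    ("african", "american"): "African American",
}


def code_ancestry(response: str, max_codes: int = 2) -> list[str]:
    """Return at most two ancestry codes for a free-text response (illustrative only)."""
    terms = response.replace("-", " ").lower().split()
    codes: list[str] = []
    i = 0
    while i < len(terms) and len(codes) < max_codes:
        pair = tuple(terms[i:i + 2])
        if pair in MULTI_TERM:
            # Two terms that form one unique group get a single code.
            codes.append(MULTI_TERM[pair])
            i += 2
        elif terms[i] == "american" and (codes or i + 1 < len(terms)):
            # "American" given alongside another identity is generally not coded.
            i += 1
        else:
            codes.append(terms[i].capitalize())
            i += 1
    return codes


print(code_ancestry("Scotch Irish English"))
print(code_ancestry("English Irish French"))
print(code_ancestry("Polish-American"))
```

In this sketch, a third reported group is simply dropped once two codes have been assigned, which mirrors the "English Irish French" example in the text.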

In all tabulations, when respondents provided an unclassifiable ethnic identity (for example, "multinational," "adopted," or "I have no idea"), the answer was included in tabulation category "Unclassified or not reported."

The tabulations on ancestry are presented using two types of data presentations: one using total people as the base, and the other using total responses as the base. The following are categories shown in the two data presentations.

Presentation Based on People
Single ancestries reported
Includes all people who reported only one ancestry group. Included in this category are people with multiple-term responses such as "Greek Cypriote" who are assigned a single code.

Multiple ancestries reported
Includes all people who reported more than one group and were assigned two ancestry codes.

Ancestry unclassified
Includes all people who provided a response that could not be assigned an ancestry code because they provided unclear entries or entries that represent religious groups.
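
As a rough illustration of the people-based categories above, the following Python sketch assigns each person to one category from the number of ancestry codes they received. The function name people_category and the gave_response flag are assumptions made for this example; returning None for people who gave no response is also an assumption, since the list above does not define a category for them.

```python
from typing import Optional


def people_category(codes: list[str], gave_response: bool = True) -> Optional[str]:
    """Map a person's assigned ancestry codes to a people-based category (illustrative)."""
    if len(codes) == 1:
        # Includes multiple-term responses such as "Greek Cypriote" with a single code.
        return "Single ancestries reported"
    if len(codes) >= 2:
        return "Multiple ancestries reported"
    if gave_response:
        # A response was given but no ancestry code could be assigned.
        return "Ancestry unclassified"
    # Assumption: nonrespondents fall outside the three categories listed above.
    return None


print(people_category(["Greek Cypriote"]))
print(people_category(["Irish", "English"]))
print(people_category([]))
```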

Presentation Based on Responses
First ancestry reported
Includes the first response of all people who reported at least one codeable entry. For example, in this category, the count for Danish would include all those who reported only Danish and those who reported Danish first and then some other group.

Second ancestry reported
Includes the second response of all people who reported a multiple ancestry. Thus, the count for Danish in this category includes all people who reported Danish as the second response, regardless of the first response provided.

Total ancestries reported or total ancestries tallied
Includes the total number of ancestries reported and coded. If a person reported a multiple ancestry such as "French Danish," that response was counted twice in the tabulations, once in the French category and again in the Danish category. Thus, the sum of the counts in this type of presentation is not the total population but the total of all responses.
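
The difference between the three response-based counts can be shown with a small Python sketch. The function name tally_responses and the four-person sample are hypothetical and chosen only to show why total ancestries tallied can exceed the number of people.

```python
from collections import Counter


def tally_responses(people: list[list[str]]):
    """Tally first, second, and total ancestries from per-person code lists (illustrative)."""
    # First ancestry: everyone with at least one codeable entry.
    first = Counter(codes[0] for codes in people if codes)
    # Second ancestry: only people who were assigned two codes.
    second = Counter(codes[1] for codes in people if len(codes) > 1)
    # Total ancestries tallied: every coded response counts once.
    total = first + second
    return first, second, total


# Hypothetical sample: four people, one of whom has no codeable ancestry.
people = [["Danish"], ["Danish", "French"], ["French", "Danish"], []]
first, second, total = tally_responses(people)

print(first["Danish"], second["Danish"], total["Danish"])
print(sum(total.values()), "responses from", len(people), "people")
```

Here Danish appears twice as a first ancestry and once as a second ancestry, so it is tallied three times, and the five tallied responses outnumber the four people in the sample.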

An automated coding system was used for coding ancestry in Census 2000. This greatly reduced the potential for error associated with a clerical review. Specialists with knowledge of the subject matter reviewed, edited, coded, and resolved inconsistent or incomplete responses. The code list used in Census 2000, containing over 1,000 categories, reflects the results of the Census Bureau's experience with the 1990 ancestry question, research, and consultation with many ethnic experts. Many decisions were made to determine the classification of responses. These decisions affected the grouping of the tabulated data. For example, the Italian category includes the responses of Sicilian and Tuscan, as well as a number of other responses.

Limitation of the data
Although some people consider religious affiliation a component of ethnic identity, the ancestry question was not designed to collect any information concerning religion. Thus, if a religion was given as an answer to the ancestry question, it was listed in the "Other groups" category.

Ancestry should not be confused with a person's place of birth, although a person's place of birth and ancestry may be the same (see "Place of Birth").

The ancestry data in these tabulations are limited to groups that were not shown in the Hispanic origin and race tabulations. For example, since Mexican is shown in the Hispanic origin tables, it is not shown in the ancestry tables. Likewise, since Korean is shown in the race tables, it is not shown in the ancestry tables. Hispanic and race groups are included in the "Other groups" category for the ancestry tables in these tabulations.

Unlike other census questions, there was no imputation for nonresponse to the ancestry question.

The ancestry question was first introduced in 1980 as "What is this person's ancestry?" In 1990, the question was changed to "What is this person's ancestry or ethnic origin?" to improve understanding and response. This question was used again in Census 2000. The ancestry groups used as examples have changed over time. The changes were introduced to avoid or to minimize example-induced responses, and to ensure broad geographic and group coverage.