P145. Imputation Of Language Status [3] - Summary Tape File 3 (STF3) - Census 1990

Data Dictionary:

Census 1990

Survey: Census 1990

Data Source:

U.S. Census Bureau

Data set: Summary Tape File 3 (STF3)

Table:

P145. Imputation Of Language Status [3]

Universe: Persons 5 years and over

Table Details

Variable	Label
P145_001	Persons 5 years and over
P145_002	Allocated
P145_003	Not allocated
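
As a rough illustration of how the three cells relate, the sketch below computes the share of the universe whose language status was allocated (imputed). It assumes that P145_001 equals P145_002 plus P145_003, which the table layout suggests but this page does not state outright. The function name and the counts in the usage note are hypothetical.

```python
def allocation_rate(allocated: int, not_allocated: int) -> float:
    """Percent of persons 5 years and over whose language status was allocated.

    `allocated` stands for P145_002 and `not_allocated` for P145_003.
    Their sum is assumed (not confirmed by the source) to equal P145_001.
    """
    total = allocated + not_allocated
    if total == 0:
        return 0.0  # empty universe: report 0 rather than divide by zero
    return 100.0 * allocated / total
```

With made-up counts, `allocation_rate(120, 2880)` gives 4.0, meaning 4 percent of the universe had an allocated value.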

Relevant Documentation:

Excerpt from:	Social Explorer, U.S. Census Bureau; Census of Population and Housing, 1990: Summary Tape File 3 on CD-ROM [machine-readable data files] / prepared by the Bureau of the Census. Washington: The Bureau [producer and distributor], 1991.
	Summary Tape File 3 -> Appendix C. Accuracy of the Data -> Confidentiality of the Data

Confidentiality of the Data

To maintain the confidentiality required by law (Title 13, United States Code), the Bureau of the Census applies a confidentiality edit to the 1990 census data to assure that published data do not disclose information about specific individuals, households, or housing units. As a result, a small amount of uncertainty is introduced into the estimates of census characteristics. The sample itself provides adequate protection for most areas for which sample data are published since the resulting data are estimates of the actual counts; however, small areas require more protection. The edit is controlled so that the basic structure of the data is preserved.

The confidentiality edit is implemented by selecting a small subset of individual households from the internal sample data files and blanking a subset of the data items on these household records. Responses to those data items are then imputed using the same imputation procedures that were used for nonresponse. A larger subset of households is selected for the confidentiality edit for small areas to provide greater protection for these areas. The editing process is implemented in such a way that the quality and usefulness of the data are preserved.

Excerpt from:	Social Explorer, U.S. Census Bureau; Census of Population and Housing, 1990: Summary Tape File 3 on CD-ROM [machine-readable data files] / prepared by the Bureau of the Census. Washington: The Bureau [producer and distributor], 1991.
	Summary Tape File 3 -> Appendix C. Accuracy of the Data -> Editing of Unacceptable Data

Editing of Unacceptable Data

The objective of the processing operation is to produce a set of data that describes the population as accurately and clearly as possible. To meet this objective, questionnaires were edited during field data collection operations for consistency, completeness, and acceptability. Questionnaires also were reviewed by census clerks for omissions, certain specific inconsistencies, and population coverage. For example, write-in entries such as "Don't know" or "NA" were considered unacceptable. For some district offices, the initial edit was automated; however, for the majority of the district offices, it was performed by clerks. As a result of this operation, a telephone or personal visit followup was made to obtain missing information. Potential coverage errors were included in the followup, as well as a sample of questionnaires with omissions and/or inconsistencies. Subsequent to field operations, remaining incomplete or inconsistent information on the questionnaires was assigned using imputation procedures during the final automated edit of the collected data. Imputations, or computer assignments of acceptable codes in place of unacceptable entries or blanks, are needed most often when an entry for a given item is lacking or when the information reported for a person or housing unit on that item is inconsistent with other information for that same person or housing unit. As in previous censuses, the general procedure for changing unacceptable entries was to assign an entry for a person or housing unit that was consistent with entries for persons or housing units with similar characteristics. The assignment of acceptable codes in place of blanks or unacceptable entries enhances the usefulness of the data.
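
The general idea of assigning an entry taken from a record with similar characteristics can be sketched as a sequential "hot deck". This excerpt does not name the technique or describe the Bureau's actual matching rules, so the sketch is only an illustration under stated assumptions: the function name, the `class_key` grouping, and the `allocated` flag are all hypothetical.

```python
from typing import Callable, Dict, Hashable, List

# Entries treated as unacceptable in this sketch (blanks plus the two
# write-ins the text gives as examples).
UNACCEPTABLE = (None, "", "Don't know", "NA")

def hot_deck(records: List[dict], class_key: Callable[[dict], Hashable],
             item: str) -> List[dict]:
    """Fill unacceptable values of `item` from the most recent acceptable
    record in the same class (hypothetical illustration only)."""
    last_donor: Dict[Hashable, object] = {}
    edited = []
    for rec in records:
        out = dict(rec)          # work on a copy; input is left untouched
        out["allocated"] = False
        cls = class_key(rec)
        if rec.get(item) in UNACCEPTABLE:
            if cls in last_donor:
                out[item] = last_donor[cls]
                out["allocated"] = True
            # with no donor seen yet, the entry stays unresolved
        else:
            last_donor[cls] = rec[item]  # this record becomes the donor
        edited.append(out)
    return edited
```

The `allocated` flag mirrors the Allocated / Not allocated split that imputation tables such as P145 report.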

Another way in which corrections were made during the computer editing process was through substitution; that is, the assignment of a full set of characteristics for a person or housing unit. When there was an indication that a housing unit was occupied but the questionnaire contained no information for the people within the household or the occupants were not listed on the questionnaire, a previously accepted household was selected as a substitute, and the full set of characteristics for the substitute was duplicated. The assignment of the full set of housing characteristics occurred when there was no housing information available. If the housing unit was determined to be occupied, the housing characteristics were assigned from a previously processed occupied unit. If the housing unit was vacant, the housing characteristics were assigned from a previously processed vacant unit.

Table A. Unadjusted Standard Error for Estimated Totals [Based on a 1-in-6 simple random sample]
Estimated Total¹	Size of publication area²
Estimated Total	500	1,000	2,500	5,000	10,000	25,000	50,000	100,000	250,000	500,000	1,000,000	5,000,000	10,000,000	25,000,000
50	16	16	16	16	16	16	16	16	16	16	16	16	16	16
100	20	21	22	22	22	22	22	22	22	22	22	22	22	22
250	25	30	35	35	35	35	35	35	35	35	35	35	35	35
500		35	45	45	50	50	50	50	50	50	50	50	50	50
1,000			55	65	65	70	70	70	70	70	70	70	70	70
2,500				80	95	110	110	110	110	110	110	110	110	110
5,000					110	140	150	150	160	160	160	160	160	160
10,000						170	200	210	220	220	220	220	220	220
15,000						170	230	250	270	270	270	270	270	270
25,000							250	310	340	350	350	350	350	350
75,000								310	510	570	590	610	610	610
100,000									550	630	670	700	700	710
250,000										790	970	1,090	1,100	1,100
500,000											1,120	1,500	1,540	1,570
1,000,000												2,000	2,120	2,190
5,000,000													3,540	4,470
10,000,000														5,480

Footnote:

¹For estimated totals larger than 10,000,000, the standard error is somewhat larger than the table values. The formula given below should be used to calculate the standard error.

²The total count of persons in the area if the estimated total is a person characteristic, or the total count of housing units in the area if the estimated total is a housing unit characteristic.
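
The formula that footnote 1 refers to is not reproduced in this excerpt. The sketch below assumes the usual form for a 1-in-6 simple random sample, SE(Y) = sqrt(5 * Y * (1 - Y/N)), where Y is the estimated total, N is the size of the publication area, and the factor 5 is 6 - 1. After rounding it reproduces most entries of Table A, but it should be checked against the full Appendix C before use. The function name is hypothetical.

```python
import math

def se_total(estimated_total: float, area_size: float) -> float:
    """Unadjusted standard error of an estimated total.

    Assumed formula (not shown in this excerpt): sqrt(5 * Y * (1 - Y / N)).
    """
    if area_size <= 0 or not 0 <= estimated_total <= area_size:
        raise ValueError("need 0 <= estimated_total <= area_size, area_size > 0")
    return math.sqrt(5 * estimated_total * (1 - estimated_total / area_size))
```

For example, an estimated total of 250 in an area of 500 gives 25, and 10,000,000 in an area of 25,000,000 gives about 5,477, which the table shows rounded to 5,480.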

Table B. Unadjusted Standard Error in Percentage Points for Estimated Percentage [Based on a 1-in-6 simple random sample]
Estimated percentage	Base of percentage¹
Estimated percentage	500	750	1,000	1,500	2,500	5,000	7,500	10,000	25,000	50,000	100,000	250,000	500,000
2 or 98	1.4	1.1	1.0	0.8	0.6	0.4	0.4	0.3	0.2	0.1	0.1	0.1	0.1
5 or 95	2.2	1.8	1.5	1.3	1.0	0.7	0.6	0.5	0.3	0.2	0.2	0.1	0.1
10 or 90	3.0	2.4	2.1	1.7	1.3	0.9	0.8	0.7	0.4	0.3	0.2	0.1	0.1
15 or 85	3.6	2.9	2.5	2.1	1.6	1.1	0.9	0.8	0.5	0.4	0.3	0.2	0.1
20 or 80	4.0	3.3	2.8	2.3	1.8	1.3	1.0	0.9	0.6	0.4	0.3	0.2	0.1
25 or 75	4.3	3.5	3.1	2.5	1.9	1.4	1.1	1.0	0.6	0.4	0.3	0.2	0.1
30 or 70	4.6	3.7	3.2	2.6	2.0	1.4	1.2	1.0	0.6	0.5	0.3	0.2	0.1
35 or 65	4.8	3.9	3.4	2.8	2.1	1.5	1.2	1.1	0.7	0.5	0.3	0.2	0.2
50	5.0	4.1	3.5	2.9	2.2	1.6	1.3	1.1	0.7	0.5	0.4	0.2	0.2

Footnote:

¹For a percentage and/or base of percentage not shown in the table, the formula given below may be used to calculate the standard error. This table should only be used for proportions, that is, where the numerator is a subset of the denominator.
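
The formula mentioned in the footnote is not included in this excerpt. A minimal sketch, assuming the usual 1-in-6 form SE(p) = sqrt((5 / B) * p * (100 - p)), with p the estimated percentage and B the base of the percentage, is given below; it matches the corner entries of Table B. The function name is hypothetical.

```python
import math

def se_percentage(percentage: float, base: float) -> float:
    """Unadjusted standard error, in percentage points, of an estimated
    percentage. Assumed formula: sqrt((5 / B) * p * (100 - p))."""
    if base <= 0 or not 0 <= percentage <= 100:
        raise ValueError("need base > 0 and 0 <= percentage <= 100")
    return math.sqrt(5 / base * percentage * (100 - percentage))
```

As the footnote says, this applies only to proportions, where the numerator is a subset of the denominator. For instance, 50 percent on a base of 500 gives 5.0, and 2 percent on a base of 500 gives 1.4.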

Excerpt from:	Social Explorer, U.S. Census Bureau; Census of Population and Housing, 1990: Summary Tape File 3 on CD-ROM [machine-readable data files] / prepared by the Bureau of the Census. Washington: The Bureau [producer and distributor], 1991.
	Summary Tape File 3 -> Appendix B. Definitions of Subject Characteristics -> Population Characteristics -> Language Spoken At Home and Ability to Speak English

Language Spoken At Home and Ability to Speak English

Language Spoken at Home--Data on language spoken at home were derived from the answers to questionnaire items 15a and 15b, which were asked of a sample of persons born before April 1, 1985. Instructions mailed with the 1990 census questionnaire stated that a respondent should mark "Yes" in question 15a if the person sometimes or always spoke a language other than English at home and should not mark "Yes" if a language was spoken only at school or if speaking was limited to a few expressions or slang. For question 15b, respondents were instructed to print the name of the non-English language spoken at home. If the person spoke more than one language other than English, the person was to report the language spoken more often or the language learned first.

The cover of the census questionnaire included information in Spanish which provided a telephone number for respondents to call to request a census questionnaire and instructions in Spanish. Instruction guides were also available in 32 other languages to assist enumerators who encountered households or respondents who spoke no English.

Questions 15a and 15b referred to languages spoken at home in an effort to measure the current use of languages other than English. Persons who knew languages other than English but did not use them at home or who only used them elsewhere were excluded. Persons who reported speaking a language other than English at home may also speak English; however, the questions did not permit determination of the main or dominant language of persons who spoke both English and another language. (For more information, see discussion below on "Ability to Speak English.")

For persons who indicated that they spoke a language other than English at home in question 15a, but failed to specify the name of the language in question 15b, the language was assigned based on the language of other speakers in the household; on the language of a person of the same Spanish origin or detailed race group living in the same or a nearby area; or on a person of the same ancestry or place of birth. In all cases where a person was assigned a non-English language, it was assumed that the language was spoken at home. Persons for whom the name of a language other than English was entered in question 15b, and for whom question 15a was blank were assumed to speak that language at home.
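
The two edit rules in the paragraph above can be sketched as follows. This is an illustration only: the function name and arguments are hypothetical, and `donor_languages` stands in for the candidate languages in the order the text lists them (other speakers in the household, then a person of the same Spanish origin or race group nearby, then a person of the same ancestry or place of birth). How those donors are actually found is not described in the excerpt.

```python
from typing import Optional, Sequence, Tuple

def edit_language_items(
    q15a: Optional[bool],
    q15b: Optional[str],
    donor_languages: Sequence[Optional[str]],
) -> Tuple[Optional[bool], Optional[str], bool]:
    """Return (speaks_other_language, language, language_was_assigned)."""
    # Language written in but 15a blank: assume it is spoken at home.
    if q15b and q15a is None:
        return True, q15b, False
    # "Yes" in 15a but no language named: take the first available donor.
    if q15a is True and not q15b:
        for language in donor_languages:
            if language:
                return True, language, True
    # Every other combination is outside the rules quoted here; unchanged.
    return q15a, q15b, False
```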

The write-in responses listed in question 15b (specific language spoken) were transcribed onto computer files and coded into more than 380 detailed language categories using an automated coding system. The automated procedure compared write-in responses reported by respondents with entries in a computer dictionary, which initially contained approximately 2,000 language names. The dictionary was updated with a large number of new names, variations in spelling, and a small number of residual categories. Each write-in response was given a numeric code that was associated with one of the detailed categories in the dictionary. If the respondent listed more than one non-English language, only the first was coded.

The write-in responses represented the names people used for languages they speak. They may not match the names or categories used by linguists. The sets of categories used are sometimes geographic and sometimes linguistic. Figure 1 provides an illustration of the content of the classification schemes used to present language data. For more information, write to the Chief, Population Division, U.S. Bureau of the Census, Washington, DC 20233.

Household Language

In households where one or more persons (age 5 years old or over) speak a language other than English, the household language assigned to all household members is the non-English language spoken by the first person with a non-English language in the following order:

householder, spouse, parent, sibling, child, grandchild, other relative, stepchild, unmarried partner, housemate or roommate, roomer, boarder, or foster child, or other nonrelative. Thus, persons who speak only English may have a non-English household language assigned to them in tabulations of persons by household language.
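
The ordering rule above can be sketched as a lookup over household members. The record layout (`relationship`, `age`, `language`, with `language` set to `None` for persons who speak only English) is a hypothetical one chosen for the example, and ties within one relationship category are broken by input order, which the source does not specify.

```python
from typing import List, Optional

# Relationship categories in the order given in the text.
RELATIONSHIP_ORDER = [
    "householder", "spouse", "parent", "sibling", "child", "grandchild",
    "other relative", "stepchild", "unmarried partner",
    "housemate or roommate", "roomer, boarder, or foster child",
    "other nonrelative",
]

def household_language(members: List[dict]) -> Optional[str]:
    """Non-English language assigned to every member of the household,
    or None when no member aged 5 or over speaks one."""
    speakers = [m for m in members if m["age"] >= 5 and m.get("language")]
    if not speakers:
        return None
    # min() keeps the first of equally ranked members (input order).
    first = min(speakers, key=lambda m: RELATIONSHIP_ORDER.index(m["relationship"]))
    return first["language"]
```

In a household where the householder speaks only English, the spouse speaks Korean, and a child speaks Spanish, the function returns "Korean" for the whole household, including the English-only householder.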

Figure 1. Four- and Twenty-Five-Group Classifications of 1990 Census Languages Spoken at Home with Illustrative Examples
Four-Group Classification	Twenty-Five-Group Classification	Examples
Spanish	Spanish	Spanish, Ladino
Other Indo-European	French	French, Cajun, French Creole
	Italian
	Portuguese
	German
	Yiddish
	Other West Germanic	Afrikaans, Dutch, Pennsylvania Dutch
	Scandinavian	Danish, Norwegian, Swedish
	Polish
	Russian
	South Slavic	Serbocroatian, Bulgarian, Macedonian, Slovene
	Other Slavic	Czech, Slovak, Ukrainian
	Greek
	Indic	Hindi, Bengali, Gujarathi, Punjabi, Romany, Sinhalese
	Other Indo-European, not elsewhere classified	Armenian, Gaelic, Lithuanian, Persian
Languages of Asia and the Pacific	Chinese
	Japanese
	Mon-Khmer	Cambodian
	Tagalog
	Korean
	Vietnamese
	Other languages (part)	Chamorro, Dravidian Languages, Hawaiian, Ilocano, Thai, Turkish
All other languages	Arabic
	Hungarian
	Native North American languages
	Other languages (part)	Amharic, Syriac, Finnish, Hebrew, Languages of Central and South America, Other Languages of Africa

Ability to Speak English

Persons 5 years old and over who reported that they spoke a language other than English in question 15a were also asked in question 15c to indicate their ability to speak English based on one of the following categories: "Very well," "Well," "Not well," or "Not at all."

The data on ability to speak English represent the person's own perception about his or her own ability or, because census questionnaires are usually completed by one household member, the responses may represent the perception of another household member. The instruction guides and questionnaires that were mailed to households did not include any information on how to interpret the response categories in question 15c.

Persons who reported that they spoke a language other than English at home but whose ability to speak English was not reported were assigned the English-language ability of a randomly selected person of the same age, Spanish origin, nativity and year of entry, and language group.
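
A minimal sketch of this random-donor assignment is shown below. The field names, the `allocated` flag, and the fixed seed are assumptions made for the example; the excerpt does not say how donors are pooled or how ages are grouped, so exact matching on all four characteristics is used here.

```python
import random
from typing import Dict, List, Tuple

def impute_english_ability(persons: List[dict], seed: int = 0) -> List[dict]:
    """Fill missing `english_ability` from a randomly chosen person in the
    same cell (age, Spanish origin, nativity/year of entry, language group)."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable

    def cell(p: dict) -> Tuple:
        return (p["age"], p["spanish_origin"], p["nativity_entry"], p["language_group"])

    # Pool the reported abilities by cell.
    donors: Dict[Tuple, List[str]] = {}
    for p in persons:
        if p.get("english_ability"):
            donors.setdefault(cell(p), []).append(p["english_ability"])

    result = []
    for p in persons:
        out = dict(p)
        out["allocated"] = False
        pool = donors.get(cell(p))
        if not p.get("english_ability") and pool:
            out["english_ability"] = rng.choice(pool)
            out["allocated"] = True
        result.append(out)
    return result
```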

Linguistic Isolation

A household in which no person age 14 years or over speaks only English and no person age 14 years or over who speaks a language other than English speaks English "Very well" is classified as "linguistically isolated." All the members of a linguistically isolated household are tabulated as linguistically isolated, including members under age 14 years who may speak only English.
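
The definition can be read as a check over household members age 14 years and over, as in the sketch below. The record fields are hypothetical. Read literally, a household with no member age 14 or over would satisfy both conditions vacuously; the sketch follows that literal reading, which the source does not address.

```python
from typing import List

def is_linguistically_isolated(members: List[dict]) -> bool:
    """True when no member aged 14+ speaks only English and no member
    aged 14+ who speaks another language speaks English "Very well"."""
    for m in members:
        if m["age"] < 14:
            continue  # members under 14 do not affect the classification
        if not m["speaks_other_language"]:
            return False  # someone 14+ speaks only English
        if m.get("english_ability") == "Very well":
            return False  # someone 14+ speaks English "Very well"
    return True
```

If the function returns True, every member is tabulated as linguistically isolated, including any child under 14 who speaks only English.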

Limitation of the Data

Persons who speak a language other than English at home may have first learned that language at school. However, these persons would be expected to indicate that they spoke English "Very well." Persons who speak a language other than English, but do not do so at home, should have been reported as not speaking a language other than English at home.

The extreme detail in which language names were coded may give a false impression of the linguistic precision of these data. The names used by speakers of a language to identify it may reflect ethnic, geographic, or political affiliations and do not necessarily respect linguistic distinctions. The categories shown in the tabulations were chosen on a number of criteria, such as information about the number of speakers of each language that might be expected in a sample of the United States population.

Comparability

Information on language has been collected in every census since 1890. The comparability of data among censuses is limited by changes in question wording, by the subpopulations to whom the question was addressed, and by the detail that was published.

The same question on language was asked in the 1980 and 1990 censuses. This question on the current language spoken at home replaced the questions asked in prior censuses on mother tongue; that is, the language other than English spoken in the person's home when he or she was a child; one's first language; or the language spoken before immigrating to the United States. The censuses of 1910-1940, 1960 and 1970 included questions on mother tongue. A change in coding procedure from 1980 to 1990 should have improved accuracy of coding and may affect the number of persons reported in some of the 380-plus categories. It should not greatly affect the 4-group or 25-group lists. In 1980, coding clerks supplied numeric codes for the written entries on each questionnaire using a 2,000-name reference list. In 1990, written entries were transcribed to a computer file and matched to a computer dictionary that began with the 2,000-name list but expanded as unmatched names were referred to headquarters specialists for resolution.

The question on ability to speak English was asked for the first time in 1980. In tabulations from 1980, the categories "Very well" and "Well" were combined. Data from other surveys suggested a major difference between the category "Very well" and the remaining categories. In tabulations showing ability to speak English, persons who reported that they spoke English "Very well" are presented separately from persons who reported their ability to speak English as less than "Very well."
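
The two-way presentation described above amounts to a simple recode of the four response categories from question 15c. The sketch below is only illustrative; the function names and the group label strings are hypothetical.

```python
from collections import Counter
from typing import Iterable

CATEGORIES = ("Very well", "Well", "Not well", "Not at all")

def ability_group(ability: str) -> str:
    """Collapse a question 15c response into the two groups used in
    tabulations of ability to speak English."""
    if ability not in CATEGORIES:
        raise ValueError(f"unknown response: {ability!r}")
    return "Very well" if ability == "Very well" else 'Less than "Very well"'

def tabulate_ability(responses: Iterable[str]) -> Counter:
    """Count persons in each of the two groups."""
    return Counter(ability_group(r) for r in responses)
```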