Documentation: | ACS 2016 (5-Year Estimates) |
Publisher: U.S. Census Bureau
Survey: ACS 2016 (5-Year Estimates)
Document: | 2016 ACS 1-year and 2012-2016 ACS 5-year Data Releases: Technical Documentation |
citation: | Social Explorer; U.S. Census Bureau; 2016 ACS 1-year and 2012-2016 ACS 5-year Data Releases: Technical Documentation. |
2016 ACS 1-year and 2012-2016 ACS 5-year Data Releases: Technical Documentation
The ACS Summary File is accessible from the American Community Survey main page. From the ACS main page, http://www.census.gov/acs/, mouse over the Data tab and select the Summary File Data option, as shown below:
That will take you to the ACS Summary File page. Click on 1-year Summary File to go to the ACS Summary File FTP site.
This is the ACS Summary File. For each data release, it consists of three folders plus a set of templates.
The Summary File is organized in three folders per data release as shown in the above screenshot. Each data release also includes a corresponding zip file for templates. These three directories contain the same combination of files; they are simply arranged differently to accommodate various user needs:
An illustration of how ACS 1-year files are arranged in the three folders is included below.
- All-in-one (1_year_entire_sf, 5_year_entire_sf)
- State tables (1_year_by_state, 5_year_by_state)
The following table gives examples of the summary levels found in the state and state-level equivalent folders and files, and of those found in the United States folders and files.
Each State, DC, and Puerto Rico | United States |
State | United States |
County | Region |
County subdivision | Division |
Place | Metropolitan or urban statistical areas |
Congressional districts (110th Congress) | New England City and Town Area (NECTA) |
Public Use Microdata Area (PUMA) | American Indian/Alaska Native/Hawaiian Home Land areas |
School Districts | Urban areas |
Alaska Native Regional Corporation | Zip Code Tabulation Areas (ZCTAs) |
- Topic tables (1_year_seq_by_state, 5_year_seq_by_state)
File Name: 2014 1 ak 0001 000.zip |
Example | Name | Range or Type |
2014 | Reference Year | ACS data year (last year of the period for multiyear periods) |
1 | Period Covered | 1=1-year, 5=5-year |
ak | State Level | US or abbreviations for state, District of Columbia, and Puerto Rico |
0001 | Sequence Number | 0001 to 9999 |
000 | IterationID | Iteration ID for Selected Population Tables and American Indian & Alaska Native Tables. Note: Iteration ID is always "000" for the standard 1-Year and 5-Year products. |
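The naming convention in the table above can be sketched in code. This is a minimal illustration, not an official tool: it assumes the components are concatenated without separators in the actual file name (for example `20141ak0001000.zip`), and the names `SummaryFileName` and `parse_zip_name` are hypothetical.

```python
import re
from typing import NamedTuple


class SummaryFileName(NamedTuple):
    year: int        # Reference Year
    period: int      # Period Covered: 1 = 1-year, 5 = 5-year
    state: str       # State Level: "us" or a two-letter abbreviation
    sequence: str    # Sequence Number, 0001 to 9999
    iteration: str   # Iteration ID, "000" for the standard products


# Assumption: the components run together with no separators.
_NAME_RE = re.compile(r"^(\d{4})([15])([a-z]{2})(\d{4})(\d{3})\.zip$")


def parse_zip_name(name: str) -> SummaryFileName:
    """Split a Summary File zip name into the components listed in the table."""
    match = _NAME_RE.match(name.lower())
    if match is None:
        raise ValueError(f"unrecognized Summary File name: {name!r}")
    year, period, state, sequence, iteration = match.groups()
    return SummaryFileName(int(year), int(period), state, sequence, iteration)


print(parse_zip_name("20141ak0001000.zip"))
```

The sequence number and iteration ID are kept as strings so that their leading zeros survive.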
- Templates
This zip file contains an Excel file for each sequence (e.g., Seq1.xls, Seq2.xls), as well as one for the geography file (e.g., 2014_SFGeoFileTemplate.xls). These files provide users with two rows of metadata containing the variable names and their descriptions for every column. The templates are meant to be used with the comma-delimited version of the files.
Detailed Tables for similar subject areas are grouped together in "sequences." A sequence number is a number assigned to a group of ACS tables. Table sequencing now follows these rules:
1) Tables are sorted numerically by the "root" of their Table ID, where the "root" is defined as the numeric section after the first letter and before any additional letters, so for example the root of B06004APR is "06004". For tables with the same root, additionally sort them in the following order:
Non-iterated, non-collapsed, non-PR version (e.g., Table B06003)
Iterated, non-collapsed, non-PR versions (e.g., Tables B06004A, B06004B ... B06004I)
Non-iterated, collapsed, non-PR version (e.g., Table C06001)
Iterated, collapsed, non-PR version (e.g., Tables C08505A, C08505B ... C08505I)
Non-iterated, non-collapsed, PR version (e.g., Table B06003PR)
Iterated, non-collapsed, PR versions (e.g., Tables B06004APR, B06004BPR ... B06004IPR)
Non-iterated, collapsed, PR version (e.g., Table C06001PR)
Iterated, collapsed, PR version (e.g., Table C06001APR)
2) With tables sorted in this order, start with the first table and assign it to the first sequence. For each subsequent table, if the table has either a new "subject," a new "geography type," or would cause the number of cells in the sequence to exceed 245, then start a new sequence. "Subject" is described using the second and third characters in the Table ID, so for example the subject of B06004APR is "06" for place of birth. You can view a complete list of subjects at https://ask.census.gov/faq.php?id=5000&faqId=1687. "Geography type" can be one of three things:
Place of Residence geography type, Place of Work geography type, or Residence 1 Year Ago geography type.
3) If a table does not fit in one sequence, then put the first 245 cells of it in one sequence, and the rest in the next. If a table does not fit in two sequences, then put the first 245 cells of it in one sequence, the next 245 cells of it in the next sequence, and the rest in a third sequence.
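The three rules above can be sketched as a sort key plus a single pass over the sorted tables. This is an illustrative reading of the rules, not the Census Bureau's own procedure. The function names are hypothetical, the geography type of each table is supplied by the caller under made-up labels (it cannot be derived from the Table ID), and the cell counts in the example are invented, except for the 51 cells of Table B08406.

```python
import re

MAX_CELLS = 245  # data cells allowed in one sequence (rule 2)

# Prefix letter, numeric root, optional iteration letter A-I, optional PR suffix.
_ID_RE = re.compile(r"^([BC])(\d+)([A-I]?)(PR)?$")


def sort_key(table_id):
    """Rule 1: numeric root first; within a root, non-PR before PR,
    non-collapsed (B) before collapsed (C), non-iterated before iterated."""
    match = _ID_RE.match(table_id)
    if match is None:
        raise ValueError(f"unrecognized Table ID: {table_id!r}")
    prefix, root, iteration, pr = match.groups()
    return (int(root), pr is not None, prefix == "C", iteration != "", iteration)


def assign_sequences(tables):
    """tables: iterable of (table_id, geography_type, n_cells).
    Returns a list of (table_id, sequence_number, cells_in_that_sequence)."""
    ordered = sorted(tables, key=lambda t: sort_key(t[0]))
    result = []
    seq, used, previous_group = 1, 0, None
    for table_id, geo_type, n_cells in ordered:
        group = (table_id[1:3], geo_type)  # (subject, geography type)
        # Rule 2: a new subject, a new geography type, or overflow starts a new sequence.
        if previous_group is not None and (
            group != previous_group or used + n_cells > MAX_CELLS
        ):
            seq, used = seq + 1, 0
        previous_group = group
        remaining = n_cells
        # Rule 3: an oversized table spills into the following sequences.
        while remaining > MAX_CELLS:
            result.append((table_id, seq, MAX_CELLS))
            seq, used, remaining = seq + 1, 0, remaining - MAX_CELLS
        result.append((table_id, seq, remaining))
        used += remaining
    return result


example = [
    ("B06004APR", "residence", 10),
    ("C06001", "residence", 20),
    ("B06003", "residence", 15),
    ("B06004A", "residence", 10),
    ("B07001", "residence", 300),
    ("B08406", "workplace", 51),
]
for row in assign_sequences(example):
    print(row)
```

In this sketch a table larger than 245 cells always begins a fresh sequence, because adding it to any non-empty sequence would exceed the limit.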
The rules governing how many tables can be assigned the same sequence number depend on the following:
- There are no more than 256 cells per sequence, so the data can be read into a spreadsheet. There are 245 data cells and 11 other cells reserved for identifying information.
- There are more than 170 sequences for the 2014 ACS 1-year Summary File, and more than 120 sequences for the 2010-2014 ACS 5-year Summary File.
- Tables are grouped numerically by the "root" of their Table ID (e.g., Table B00001 is in sequence file 0001).
- Tables with race iterations are grouped in the same sequence.
It is critical to know the sequence number associated with a Detailed Table (Table ID) for two reasons. First, one needs it in order to access the correct estimates and margins of error files for the desired table. Second, the field start position for the estimates or margins of error of a certain Detailed Table depends on its sequence number.
The Sequence Number and Table Number Lookup file, available in Excel and as a SAS dataset, lists Table IDs associated with each sequence number. This spreadsheet is available on the ACS Summary File Documentation page at http://www.census.gov/programs-surveys/acs/technical-documentation/summary-file-documentation.html.
For example, to find the sequence number associated with Table B08406, a user must open the Sequence Number and Table Number Lookup file and look for that Table ID. Shown below is a screenshot of this file opened to where the "tblid" is B08406. The next column in the file, "seq," shows that this Table ID is associated with the sequence number "0029." To access the estimates and margins of error for Table B08406, a user will need to download the estimate and margin of error files labeled with the sequence number "0029."
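The same lookup can be done programmatically. The sketch below assumes the lookup file has been exported to CSV and keeps the "tblid" and "seq" column names mentioned above; the two sample rows are the only pairings stated in this chapter. The helper names and the concatenated file-name form are assumptions for illustration.

```python
import csv
import io

# Stand-in for a CSV export of the lookup file (assumed format).
SAMPLE_LOOKUP = """tblid,seq
B00001,0001
B08406,0029
"""


def sequence_for(table_id, lookup_file):
    """Return the four-character sequence number for a Table ID."""
    for row in csv.DictReader(lookup_file):
        if row["tblid"].strip().upper() == table_id.upper():
            return row["seq"].strip().zfill(4)
    raise KeyError(table_id)


def data_file_names(year, period, state, seq, iteration="000"):
    """Build the estimate and margin of error file names for one sequence.
    Assumes the name components are concatenated without separators."""
    stem = f"{year}{period}{state.lower()}{seq}{iteration}"
    return f"e{stem}.txt", f"m{stem}.txt"


seq = sequence_for("B08406", io.StringIO(SAMPLE_LOOKUP))
print(seq, data_file_names(2014, 1, "md", seq))
```

Padding with `zfill(4)` guards against spreadsheet exports that drop leading zeros from the sequence number.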
A geography file accompanies the estimate and margin of error files. Its file name begins with "g", and it is an ASCII file in either a position-based or a comma-delimited format. A geography file exists for each state or state-level equivalent.
Geography files are named using the following convention (using the 1-year data release as an example):
g 2014 1 ak.txt |
Example | Name | Range or Type |
g | File Type | g=geography |
2014 | Reference Year | ACS data year (last year of the period for multiyear periods) |
1 | Period Covered | 1=1-year, 5=5-year |
ak | State Level | US or abbreviations for state, District of Columbia, and Puerto Rico |
The geography files contain geographic information for an ACS tabulated area, including the name of the area. One variable on the file, called LOGRECNO, is the logical record number and is used to link the level of geography to the estimate and margin of error files. An example of how to use LOGRECNO is discussed in Chapter 2.5. The fields in the layout below are blank if the geography is not available for a release.
The following table provides the generic layout of the geography file (1-year and 5-year):
Variable Name | Description | Field Size | Starting Position |
RECORD CODES | |||
FILEID | Always equal to ACS Summary File identification | 6 | 1 |
STUSAB | State Postal Abbreviation | 2 | 7 |
SUMLEVEL | Summary Level | 3 | 9 |
COMPONENT | Geographic Component | 2 | 12 |
LOGRECNO | Logical Record Number | 7 | 14 |
GEOGRAPHIC AREA CODES | |||
US | US | 1 | 21 |
REGION | Census Region | 1 | 22 |
DIVISION | Census Division | 1 | 23 |
STATECE | State (Census Code) | 2 | 24 |
STATE | State (FIPS Code) | 2 | 26 |
COUNTY | County of current residence | 3 | 28 |
COUSUB | County Subdivision (FIPS) | 5 | 31 |
PLACE | Place (FIPS Code) | 5 | 36 |
TRACT | Census Tract | 6 | 41 |
BLKGRP | Block Group | 1 | 47 |
CONCIT | Consolidated City | 5 | 48 |
AIANHH | American Indian Area/Alaska Native Area/ Hawaiian Home Land (Census) | 4 | 53 |
AIANHHFP | American Indian Area/Alaska Native Area/ Hawaiian Home Land (FIPS) | 5 | 57 |
AIHHTLI | American Indian Trust Land/ Hawaiian Home Land Indicator | 1 | 62 |
AITSCE | American Indian Tribal Subdivision (Census) | 3 | 63 |
AITS | American Indian Tribal Subdivision (FIPS) | 5 | 66 |
ANRC | Alaska Native Regional Corporation (FIPS) | 5 | 71 |
CBSA | Metropolitan and Micropolitan Statistical Area | 5 | 76 |
CSA | Combined Statistical Area | 3 | 81 |
METDIV | Metropolitan Statistical Area-Metropolitan Division | 5 | 84 |
MACC | Metropolitan Area Central City | 1 | 89 |
MEMI | Metropolitan/Micropolitan Indicator Flag | 1 | 90 |
NECTA | New England City and Town Area | 5 | 91 |
CNECTA | New England City and Town Combined Statistical Area | 3 | 96 |
NECTADIV | New England City and Town Area Division | 5 | 99 |
UA | Urban Area | 5 | 104 |
BLANK | | 5 | 109 |
CDCURR | Current Congressional District *** | 2 | 114 |
SLDU | State Legislative District Upper | 3 | 116 |
SLDL | State Legislative District Lower | 3 | 119 |
BLANK | | 6 | 122 |
BLANK | | 3 | 128 |
ZCTA5 | 5-digit ZIP Code Tabulation Area | 5 | 131 |
SUBMCD | Subminor Civil Division (FIPS) | 5 | 136 |
SDELM | State-School District (Elementary) | 5 | 141 |
SDSEC | State-School District (Secondary) | 5 | 146 |
SDUNI | State-School District (Unified) | 5 | 151 |
UR | Urban/Rural | 1 | 156 |
PCI | Principal City Indicator | 1 | 157 |
BLANK | | 6 | 158 |
BLANK | | 5 | 164 |
PUMA5 | Public Use Microdata Area - 5% File | 5 | 169 |
BLANK | | 5 | 174 |
GEOID | Geographic Identifier | 40 | 179 |
NAME | Area Name | 1000 | 219 |
BTTR | Tribal Tract | 6 | 1219 |
BTBG | Tribal Block Group | 1 | 1225 |
BLANK | | 43 | 1226 |
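For the position-based version of the geography file, the layout above translates directly into string slices. The sketch below copies a handful of fields from the table (starting positions are 1-based); `parse_geo_record` and `make_record` are hypothetical helpers, and the sample record is synthetic, built only to exercise the slicing.

```python
# (starting position, field size) for a subset of the layout table; positions are 1-based.
GEO_LAYOUT = {
    "FILEID": (1, 6),
    "STUSAB": (7, 2),
    "SUMLEVEL": (9, 3),
    "COMPONENT": (12, 2),
    "LOGRECNO": (14, 7),
    "STATE": (26, 2),
    "COUNTY": (28, 3),
    "GEOID": (179, 40),
    "NAME": (219, 1000),
}

RECORD_LENGTH = 1268  # the last field starts at 1226 and is 43 characters wide


def parse_geo_record(line):
    """Slice one fixed-width geography record into a dict of stripped strings."""
    return {
        name: line[start - 1:start - 1 + size].strip()
        for name, (start, size) in GEO_LAYOUT.items()
    }


def make_record(values):
    """Build a synthetic fixed-width record (for demonstration only)."""
    buffer = [" "] * RECORD_LENGTH
    for name, text in values.items():
        start, size = GEO_LAYOUT[name]
        text = text[:size]
        buffer[start - 1:start - 1 + len(text)] = text
    return "".join(buffer)


sample = make_record({
    "FILEID": "ACSSF",
    "STUSAB": "MD",
    "SUMLEVEL": "050",
    "LOGRECNO": "0000012",
    "NAME": "Allegany County, Maryland",
})
record = parse_geo_record(sample)
print(record["LOGRECNO"], record["NAME"])
```

Fields that are blank for a release come back as empty strings, which matches the note that unavailable geographies are left blank.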
We also provide an Excel template for the geography file named "SFGeoFile Template.xls." The template provides users with two rows containing the variable names and their descriptions (as displayed in the above table) for each column in the geography file. It is meant to be used with the comma-delimited version of the geography file. The template is available in the Data folder for your dataset (e.g., http://www2.census.gov/programs-surveys/acs/summary_file/2014/data/, in the zipped "2014_SummaryFileTemplates" folder).
Here is a screenshot of the Excel file:
Each state, the District of Columbia, Puerto Rico, and the set of cross-state geographies has one geography file associated with it, regardless of how the Summary File is accessed. For example, the following screenshot shows the beginning of the state geography file for Maryland. In the screenshot, the logical record numbers corresponding to the state of Maryland, Allegany County, and Anne Arundel County are circled. The logical record number for the state of Maryland is "0000001", for Allegany County it is "0000012", and for Anne Arundel County it is "0000013".
Excess spaces in the pictured geography file have been removed for illustrative purposes.
Each of the three Summary File directories includes zipped files containing estimate files (file names beginning with an "e") and margin of error files (file names beginning with an "m"). The estimate files contain published ACS estimates, and the margin of error files contain the published ACS margins of error for their respective estimates.
The estimates and margins of error for Detailed Tables are grouped together by sequence number, as discussed in Chapter 2.3. There is one estimate file and one margin of error file for each sequence number.
Here is the naming convention used for those files (using the 1-year data release as an example):
e 2014 1 ak 0001 000.txt |
Example | Name | Range or Type |
e | File Type | e=estimate, m=margin of error |
2014 | Reference Year | ACS data year (last year of the period for multiyear periods) |
1 | Period Covered | 1=1-year, 5=5-year |
ak | State Level | US or abbreviations for state, District of Columbia and Puerto Rico |
0001 | Sequence Number | 0001 to 9999 |
000 | Reserved for future use | Iteration value for future use |
The estimate and margin of error files have identical formats; each is comma-delimited ASCII text. Each row represents a different geographic area, and the first six fields contain metadata such as the geographic area and the sequence number. Following those fields are the estimates or margins of error for the Detailed Tables. Starting and ending positions of the fields associated with each Detailed Table can be found using the Sequence Number and Table Number Lookup file, which is discussed in Chapter 2.3. The estimates or margins of error for one Detailed Table span several fields within a row.
Here is the record layout of the estimates and the margin of error files:
Field Name | Description | Field Size |
FILEID | File Identification | 6 Characters |
FILETYPE | File Type | 6 Characters |
STUSAB | State/U.S.-Abbreviation (USPS) | 2 Characters |
CHARITER | Character Iteration | 3 Characters |
SEQUENCE | Sequence Number | 4 Characters |
LOGRECNO | Logical Record Number | 7 Characters |
Field # 7 and up | Estimates (or Margins of Error) | Various |
Going back to the example from Chapter 2.3, we know that Table B08406 corresponds to sequence "0029." Additionally, the Sequence Number and Table Number Lookup file (as shown earlier) tells us that Table B08406 begins at position seven and contains 51 cells.
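A start position of seven with 51 cells maps onto a simple slice of each comma-delimited row. The following sketch uses the six metadata field names from the record layout; the helper names are hypothetical, and the sample row is synthetic, with placeholder strings standing in for real estimates.

```python
import csv
import io

META_FIELDS = ["FILEID", "FILETYPE", "STUSAB", "CHARITER", "SEQUENCE", "LOGRECNO"]


def split_row(row):
    """Separate the six metadata fields from the data fields of one row."""
    return dict(zip(META_FIELDS, row[:6])), row[6:]


def table_cells(row, start_position, n_cells):
    """Return the fields of one table; start_position is 1-based, as in the lookup file."""
    begin = start_position - 1
    return row[begin:begin + n_cells]


# Synthetic row: six metadata fields followed by 51 placeholder cells.
placeholders = [f"v{i}" for i in range(1, 52)]
line = ",".join(["ACSSF", "2014e1", "md", "000", "0029", "0000001"] + placeholders)

row = next(csv.reader(io.StringIO(line)))
meta, data = split_row(row)
cells = table_cells(row, start_position=7, n_cells=51)
print(meta["LOGRECNO"], len(cells), cells[0], cells[-1])
```

Because Table B08406 starts at field seven, its cells begin immediately after the metadata; a table with a later start position would skip the cells of the tables that precede it in the sequence.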
To get estimates for Maryland; Allegany County, MD; and Anne Arundel County, MD, one must recall the logical record numbers associated with each geography. In Chapter 2.4, we identified these as "0000001," "0000012," and "0000013," respectively. The logical record number, LOGRECNO, must be used to merge the geography information with the estimate and margin of error files.
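The merge on LOGRECNO amounts to a keyed join. Here is a minimal sketch in plain Python: the three logical record numbers are the ones identified for Maryland in this chapter, while the area-name strings and the estimate values are placeholders, and `merge_on_logrecno` is a hypothetical helper.

```python
LOGRECNO_INDEX = 5  # LOGRECNO is the sixth field of an estimate or margin of error row

# LOGRECNO -> area name, as it would be read from the geography file.
geo_names = {
    "0000001": "Maryland",
    "0000012": "Allegany County, Maryland",
    "0000013": "Anne Arundel County, Maryland",
}

# Synthetic estimate rows: six metadata fields plus two placeholder cells each.
estimate_rows = [
    ["ACSSF", "2014e1", "md", "000", "0029", "0000001", "a1", "a2"],
    ["ACSSF", "2014e1", "md", "000", "0029", "0000012", "b1", "b2"],
    ["ACSSF", "2014e1", "md", "000", "0029", "0000013", "c1", "c2"],
    ["ACSSF", "2014e1", "md", "000", "0029", "0000099", "d1", "d2"],
]


def merge_on_logrecno(names, rows):
    """Attach the area name to every data row whose LOGRECNO is in the geography lookup."""
    merged = []
    for row in rows:
        logrecno = row[LOGRECNO_INDEX]
        if logrecno in names:
            merged.append({
                "LOGRECNO": logrecno,
                "NAME": names[logrecno],
                "values": row[LOGRECNO_INDEX + 1:],
            })
    return merged


for item in merge_on_logrecno(geo_names, estimate_rows):
    print(item["LOGRECNO"], item["NAME"], item["values"])
```

Logical record numbers are compared as strings so the leading zeros are preserved. They are assigned within each geography file, so the geography and data rows being joined should come from the same state-level files.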
The example below shows the estimate file for sequence "0003" and all geographies except census tracts and block groups for the state of Maryland, using the 2010 ACS 1-year Summary File. For the 2008-2012 ACS 5-year Summary File, the dots (".") in the screenshot below are replaced by empty cells, as documented in Chapter 4.2. Note that each row has a uniquely assigned logical record number, called LOGRECNO, which links the estimate to a specific geographic area. The pictured example has the logical record numbers corresponding to Maryland, Allegany County, and Anne Arundel County circled. Estimates for Table B08406 at these geographic levels can be found within their respective rows, starting at field seven and continuing for 50 additional fields.
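Since a missing value can appear either as a dot or as an empty cell depending on the release, code that reads these files may want to treat both the same way. A small sketch, assuming every other cell is a plain number; `to_number` is a hypothetical helper and the sample cells are invented.

```python
def to_number(cell):
    """Convert one data cell to a float, mapping "." and empty cells to None."""
    text = cell.strip()
    if text in ("", "."):
        return None
    return float(text)


cells = ["1520", ".", "", " 37 "]
print([to_number(cell) for cell in cells])
```

Returning `None` rather than zero keeps a missing estimate distinguishable from a published estimate of zero.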