Documentation: | ACS 2010 (5-Year Estimates) Comparability Data |
Publisher: U.S. Census Bureau
Document: | ACS 2010 5-Year Summary File: Technical Documentation |
citation: | Social Explorer; U.S. Census Bureau; American Community Survey 2006-2010 Summary File: Technical Documentation. |
The 2006-2010 ACS 5-year Summary File is accessible from the American Community Survey home page. From the ACS home page, www.census.gov/acs, click on the Data and Documentation tab and select the option for Summary File, as shown below:
That will take you to the ACS Summary File page. From there, click on 2006-2010 ACS 5-Year Summary File to go to the page for that release.
This is a screenshot of the 2008-2010 ACS 3-year Summary File page; the one for the 2006-2010 ACS 5-year Summary File looks very similar. It consists of three folders, which are explained in the next chapter.
The Summary File is organized in three folders as shown in the above screenshot. These three directories contain the same combination of files; they are simply arranged differently to accommodate various user needs:
- 2006-2010_ACSSF_All_In_2_Giant_Files(Experienced-Users Only)
- 2006-2010_ACSSF_By_State_All_Tables
- 2006-2010_ACSSF_By_State_By_Sequence_Table_Subset
The naming convention used for the zipped files in this directory is the following:
As Appendix B shows, the "All in 2 Giant Files" and the "By State All Tables" folders contain the same tables as the "By State By Sequence Table Subset" folder. The difference is in the organization. The "By State All Tables" zipped files contain all of the sequence files for the given state, so each zipped file contains 354 files. The "All in 2 Giant Files" zipped file contains all sequence files for all states, which is thousands of files.
As mentioned earlier, the zipped files are divided by state or state-level equivalent. The state-level equivalents include the District of Columbia and Puerto Rico. There is also a level called "United States," which is for summary levels that can cross state boundaries, such as the Nation and all Regions, Divisions, Metropolitan Statistical Areas, and Tribal Reservations. The United States level does not contain tables for geographies that are always entirely within a state, such as counties and places; for those tables, go to the folder or files for that state.
The following is a table that gives examples of the types of summary levels that are in the state folders and files and those that are in the United States folders and files. A complete list of all the ACS 5-year summary levels and which folder they are in can be found in Appendix F.
Detailed Tables for similar subject areas are grouped together in "sequences". A sequence number is a number assigned to a grouping of ACS tables, and it may change from year to year or from product to product (1-, 3-, or 5-year data sets). The rules governing how many tables can be assigned the same sequence number depend on the following:
- There are no more than 256 columns per sequence, so the data can be read into a spreadsheet.
- Tables are grouped into sequences according to subject area, but they are not in numerical order (i.e., Table B00001 is not in sequence file 0001).
- Tables with race iterations are grouped in the same sequence.
The Sequence Number and Table Number Lookup file, available as an Excel spreadsheet, text file, and SAS dataset, lists the Table IDs associated with each sequence number. This file, formerly known as "merge_5_6", is available at www2.census.gov/acs2010_5yr/summaryfile/. (The file is named "Sequence_Number_and_Table_Number_Lookup" with the ".xls", ".txt", or ".sas7bdat" extension.)
To find the sequence number associated with the table B08406, for example, a user must open and look for that Table ID in the Sequence Number and Table Number Lookup file. Shown below is a screenshot of this file, for the 2010 ACS 1-year, opened to where the "tblid" is B08406. The next column in the file, "seq", shows that this Table ID is associated with the sequence number "0003". In order to access the estimate and margin of error file for Table B08406, a user will need to download the estimate and margin of error files labeled with the sequence number "0003".
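The lookup described above can also be scripted. The following is a minimal Python sketch, not an official tool. Only the "tblid" and "seq" column names come from this documentation; the "position" and "cells" column names, the in-memory sample, and its second row are assumptions made for illustration, and the real lookup file may be laid out differently.

```python
import csv
import io

# Hypothetical excerpt of the lookup file in comma-delimited text form.
# "tblid" and "seq" are named in the documentation; the "position" and
# "cells" column names and the B99999 row are made up for this sketch.
SAMPLE_LOOKUP = """tblid,seq,position,cells
B08406,0003,7,51
B99999,0004,7,3
"""


def find_sequence(lookup_text, table_id):
    """Return (seq, start position, cell count) for a Table ID, or None."""
    for row in csv.DictReader(io.StringIO(lookup_text)):
        if row["tblid"] == table_id:
            # Keep "seq" as text so the leading zeros are preserved.
            return row["seq"], int(row["position"]), int(row["cells"])
    return None


result = find_sequence(SAMPLE_LOOKUP, "B08406")
print(result)
```

Keeping the sequence number as a string matters because the estimate and margin of error files are labeled with the zero-padded value ("0003"), not the integer 3.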
A geography file comes with the estimate and margin of error files. Its file name begins with a "g", and it is an ASCII file in either a position-based format or a comma-delimited format. A geography file exists for each state or state-level equivalent.
Geography files are named using the following convention:
The geography files contain geographic information for each ACS tabulated area, including the name of the area. One variable on the file, called LOGRECNO, is the logical record number and is used to link the level of geography to the estimate and margin of error files. An example of how to use LOGRECNO is discussed in Chapter 2.5.
The following table provides the layout of the geography file:
Variable Name | Description | Field Size | Starting Position | Geographic Summary Levels For 3-Year Tables |
RECORD CODES | ||||
FILEID | Always equal to ACS Summary File identification | 6 | 1 | All Summary Levels |
STUSAB | State Postal Abbreviation | 2 | 7 | All Summary Levels |
SUMLEVEL | Summary Level | 3 | 9 | All Summary Levels |
COMPONENT | Geographic Component | 2 | 12 | All Summary Levels |
LOGRECNO | Logical Record Number | 7 | 14 | All Summary Levels |
GEOGRAPHIC AREA CODES | ||||
US | US | 1 | 21 | 010 |
REGION | Census Region | 1 | 22 | 020 |
DIVISION | Census Division | 1 | 23 | 030 |
STATECE | State (Census Code) | 2 | 24 | Reserved for future use |
STATE | State (FIPS Code) | 2 | 26 | 040, 050, 060, 160, 230, 312, 352, 500, 795, 950, 960, 970 |
COUNTY | County of current residence | 3 | 28 | 050, 060 |
COUSUB | County Subdivision (FIPS) | 5 | 31 | 060 |
PLACE | Place (FIPS Code) | 5 | 36 | 160, 312, 352 |
TRACT | Census Tract | 6 | 41 | Reserved for future use |
BLKGRP | Block Group | 1 | 47 | Reserved for future use |
CONCIT | Consolidated City | 5 | 48 | Reserved for future use |
AIANHH | American Indian Area/Alaska Native Area/ Hawaiian Home Land (Census) | 4 | 53 | 250 |
AIANHHFP | American Indian Area/Alaska Native Area/ Hawaiian Home Land (FIPS) | 5 | 57 | Reserved for future use |
AIHHTLI | American Indian Trust Land/ Hawaiian Home Land Indicator | 1 | 62 | Reserved for future use |
AITSCE | American Indian Tribal Subdivision (Census) | 3 | 63 | Reserved for future use |
AITS | American Indian Tribal Subdivision (FIPS) | 5 | 66 | Reserved for future use |
ANRC | Alaska Native Regional Corporation (FIPS) | 5 | 71 | 230 |
CBSA | Metropolitan and Micropolitan Statistical Area | 5 | 76 | 310, 312, 314 |
CSA | Combined Statistical Area | 3 | 81 | 330 |
METDIV | Metropolitan Statistical Area- Metropolitan Division | 5 | 84 | 314 |
MACC | Metropolitan Area Central City | 1 | 89 | Reserved for future use |
MEMI | Metropolitan/Micropolitan Indicator Flag | 1 | 90 | 010, 020, 030, 040, 314 |
NECTA | New England City and Town Area | 5 | 91 | 335, 350, 352 |
CNECTA | New England City and Town Combined Statistical Area | 3 | 96 | 335 |
NECTADIV | New England City and Town Area Division | 5 | 99 | 355 |
UA | Urban Area | 5 | 104 | 400 |
BLANK | | 5 | 109 | Reserved for future use |
CDCURR | Current Congressional District *** | 2 | 114 | 500 |
SLDU | State Legislative District Upper | 3 | 116 | Reserved for future use |
SLDL | State Legislative District Lower | 3 | 119 | Reserved for future use |
BLANK | | 6 | 122 | Reserved for future use |
BLANK | | 3 | 128 | Reserved for future use |
BLANK | | 5 | 131 | Reserved for future use |
SUBMCD | Subminor Civil Division (FIPS) | 5 | 136 | Reserved for future use |
SDELM | State-School District (Elementary) | 5 | 141 | 950 |
SDSEC | State-School District (Secondary) | 5 | 146 | 960 |
SDUNI | State-School District (Unified) | 5 | 151 | 970 |
UR | Urban/Rural | 1 | 156 | 010, 020, 030, 040 |
PCI | Principal City Indicator | 1 | 157 | 010, 020, 030, 040, 312, 352 |
BLANK | | 6 | 158 | Reserved for future use |
BLANK | | 5 | 164 | Reserved for future use |
PUMA5 | Public Use Microdata Area - 5% File | 5 | 169 | 795 |
BLANK | | 5 | 174 | Reserved for future use |
GEOID | Geographic Identifier | 40 | 179 | All Summary Levels |
NAME | Area Name | 200 | 219 | All Summary Levels |
BTTR | Tribal Tract | 6 | 419 | 256, 258, 291, 292, 293, 294 |
BTBG | Tribal Block Group | 1 | 425 | 258, 293, 294 |
BLANK | | 50 | 426 | Reserved for future use |
This year, we created an Excel template for the geography file named "2010_SFGeoFileTemplate.xls". The template provides users with two rows containing the variable names and their descriptions (as displayed in the above table) for each column in the geography file. It is meant to be used with the comma-delimited version of the geography file. The template is available at www2.census.gov/acs2010_5yr/summaryfile/UserTools/ in the "2010_SummaryFileTemplates" zip file. Here is a screenshot of the Excel file:
Each state, the District of Columbia, Puerto Rico, and the set of cross-state geographies each have one geography file associated with them, regardless of how the Summary File is accessed. For example, the following screenshot shows the beginning of the state geography file for Maryland in the 2009 1-year ACS. In the screenshot, the logical record numbers corresponding to the state of Maryland, Allegany County, and Anne Arundel County are circled. The logical record number for the state of Maryland is "0000001", for Allegany County it is "0000012", and for Anne Arundel County it is "0000013".
Excess spaces in the pictured geography file have been removed for illustrative purposes.
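For the position-based version of the geography file, the starting positions and field sizes in the layout table translate directly into string slices. The Python sketch below is illustrative only. It covers a handful of fields from the table, and because no real file excerpt is reproduced here, it builds a synthetic record (the `build_record` helper, the FILEID value, and the sample values are assumptions) rather than reading an actual geography file.

```python
# (name, starting position (1-based), field size), copied from the layout table.
GEO_FIELDS = [
    ("FILEID", 1, 6),
    ("STUSAB", 7, 2),
    ("SUMLEVEL", 9, 3),
    ("COMPONENT", 12, 2),
    ("LOGRECNO", 14, 7),
    ("GEOID", 179, 40),
    ("NAME", 219, 200),
]

RECORD_LENGTH = 475  # the last field starts at position 426 and is 50 wide


def parse_geo_record(line):
    """Slice one position-based geography record into a dict of stripped strings."""
    return {
        name: line[start - 1:start - 1 + size].strip()
        for name, start, size in GEO_FIELDS
    }


def build_record(values):
    """Build a synthetic fixed-width record (demonstration helper only)."""
    chars = [" "] * RECORD_LENGTH
    for name, start, size in GEO_FIELDS:
        text = values.get(name, "").ljust(size)[:size]
        chars[start - 1:start - 1 + size] = text
    return "".join(chars)


# Made-up sample values; the LOGRECNO mirrors the Allegany County example.
sample = {
    "FILEID": "ACSSF",
    "STUSAB": "MD",
    "SUMLEVEL": "050",
    "COMPONENT": "00",
    "LOGRECNO": "0000012",
    "GEOID": "05000US24001",
    "NAME": "Allegany County, Maryland",
}
record = parse_geo_record(build_record(sample))
print(record["LOGRECNO"], record["NAME"])
```

All values are kept as strings so that zero-padded codes such as LOGRECNO and SUMLEVEL survive unchanged.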
Each of the three Summary File directories includes zipped files containing estimate files (file names beginning with an "e") and margin of error files (file names beginning with an "m"). The estimate files contain published ACS estimates, and the margin of error files contain the published ACS margins of error for their respective estimates. Here is the naming convention used for those files:
The estimates and margins of error for Detailed Tables are grouped together by sequence number, as discussed in Chapter 2.3. There is an estimate file and a margin of error file for each sequence number.
The formats of the estimate and margin of error files are identical; they are strings of comma-delimited ASCII text. Each row represents a different geographic area, and the first six fields contain metadata such as the geographic area and the sequence number. Following those fields are the estimates or margins of error for the Detailed Tables. Starting and ending positions of the fields associated with each Detailed Table can be found using the Sequence Number and Table Number Lookup file, which is discussed in Chapter 2.3. The estimates or margins of error for one Detailed Table span several fields within a row.
Here is the record layout of the estimates and the margin of error files:
Field Name | Description | Field Size |
FILEID | File Identification | 6 Characters |
FILETYPE | File Type | 6 Characters |
STUSAB | State/U. S. - Abbreviation (USPS) | 2 Characters |
CHARITER | Character Iteration | 3 Characters |
SEQUENCE | Sequence Number | 4 Characters |
LOGRECNO | Logical Record Number | 7 Characters |
Field # 7 and up | Estimates | Various |
Going back to the example from Chapter 2.3, we know that table B08406 corresponds to sequence "0003". Additionally, the Sequence Number and Table Number Lookup file (as shown earlier) tells us that table B08406 begins at position seven and contains 51 cells.
To get estimates for Maryland; Allegany County, MD; and Anne Arundel County, MD, one must recall the logical record numbers associated with each geography. In Chapter 2.4, we identified these to be "0000001", "0000012", and "0000013", respectively. The logical record number, LOGRECNO, must be used to merge the geography information with the estimate and margin of error files.
The example below shows the estimate file for sequence "0003" and all geographies except census tracts and block groups for the state of Maryland using the 2010 ACS 1-year Summary File. Note that each row has a uniquely assigned logical record number, called LOGRECNO, which links the estimate to a specific geographic area. The pictured example has the logical record numbers corresponding to Maryland, Allegany County, and Anne Arundel County circled. Estimates for table B08406 at these geographic levels can be found within their respective rows at field seven and continuing for 50 additional fields.
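The merge on LOGRECNO can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Census Bureau's procedure: the estimate rows and their values are synthetic, the FILEID and FILETYPE values are made up, and only three estimate cells per row are used for brevity, whereas table B08406 spans 51 cells starting at field seven.

```python
import csv
import io

# Synthetic estimate rows: six metadata fields (FILEID, FILETYPE, STUSAB,
# CHARITER, SEQUENCE, LOGRECNO) followed by made-up estimate values.
SAMPLE_ESTIMATES = """ACSSF,2010e1,md,000,0003,0000001,100,60,40
ACSSF,2010e1,md,000,0003,0000012,10,6,4
ACSSF,2010e1,md,000,0003,0000013,20,12,8
"""

# LOGRECNO-to-name mapping, as it would be read from the geography file.
GEO_NAMES = {
    "0000001": "Maryland",
    "0000012": "Allegany County, Maryland",
    "0000013": "Anne Arundel County, Maryland",
}


def table_cells(row, start_position, cell_count):
    """Return one table's cells; start_position is 1-based, as in the lookup file."""
    first = start_position - 1
    return row[first:first + cell_count]


def merge_estimates(estimate_text, geo_names, start_position, cell_count):
    """Attach area names to estimate rows by matching on LOGRECNO."""
    merged = {}
    for row in csv.reader(io.StringIO(estimate_text)):
        logrecno = row[5]  # LOGRECNO is the sixth field of every row
        if logrecno in geo_names:
            merged[geo_names[logrecno]] = table_cells(row, start_position, cell_count)
    return merged


merged = merge_estimates(SAMPLE_ESTIMATES, GEO_NAMES, start_position=7, cell_count=3)
for name, cells in merged.items():
    print(name, cells)
```

With a real file, `cell_count` would be 51 for table B08406, and the same function could be applied to the margin of error file because the two layouts are identical.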
TIGER/Line Shapefiles allow data users to directly link geographic areas to data from the American Community Survey and other surveys. The TIGER/Line Shapefiles are designed for use with geographic information system (GIS) software. Learn more about TIGER/Line Shapefiles at www.census.gov/geo/www/tiger/.
The variable GEOID joins the ACS Summary File to the TIGER/Line Shapefiles. For the ACS Summary File, GEOID is located in column AW of the geography file. It is not found in the estimates or margins of error files. (As discussed in previous chapters, the variable LOGRECNO is needed to join together the parts that make up the Summary File: the geography, estimates, and margins of error files). GEOID's corresponding variable in the 2010 TIGER/Line Shapefiles is GEOID10.
We will walk through an example of joining these files using Kent County, Delaware. In the ACS Summary File, the GEOID is 05000US10001. In the TIGER/Line Shapefiles, the GEOID10 is 10001. (GEOID is a concatenation of all the codes associated with a given geographic area, such as the state FIPS code, county FIPS code, etc. The exact concatenation varies by geographic area. In this example, 10=state FIPS code and 001=county FIPS code.)
The ACS Summary File GEOID contains the necessary information to connect to the TIGER/Line Shapefiles, but it needs to be modified in order to exactly match up. Notice that GEOID, 05000US10001, contains the GEOID10 string, 10001. In order to create an exact match of GEOID and GEOID10, it is necessary to remove all of the characters before and including the letter "S" in the ACS Summary File. By removing these characters, the new GEOID in the ACS Summary File exactly matches the field GEOID10 in the TIGER/Line Shapefiles.
The following is an example of how to modify the ACS Summary File's GEOID in Excel 2007 so it can be joined with TIGER/Line Shapefiles:
- Open the ACS Summary File comma-delimited geography file in Excel. This example uses Delaware's geography file (g20101de.csv), available at www2.census.gov/acs2010_3yr/summaryfile/, with the column headers from the geography file template copied into Delaware's geography file. Learn more about the geography file template in Chapter 2.4.
- Insert 2 blank columns to the right of the column "GEOID." Your modified GEOID will eventually go into the second column. (Note: Columns F through AV in the diagrams following are hidden for illustrative purposes.)
- Next, select the column "GEOID."
- Select the "Data" tab from the top menu, then select "Text to Columns." The "Convert Text to Columns Wizard" box should pop up.
- In the "Convert Text to Columns Wizard," select "Delimited" under "Choose the file type that best describes your data:" then click "Next."
- Check "Other" as the delimiter and type the letter "S" into the box. Click "Next."
- In the "Data preview" window, click on the top of both columns and select "Text" under "Column data format." In "Destination," select the two blank columns that you inserted earlier. Click "Finish."
- Column AY should now contain the modified GEOID that corresponds to GEOID10 in the TIGER/Line Shapefiles. The second screenshot shows the TIGER/Line Shapefile for Kent County, Delaware.
- The ACS Summary File and the TIGER/Line Shapefile should now be ready to be joined using GIS software. Visit "Working with TIGER/Line Shapefiles" at http://www.census.gov/geo/www/tiger/wwtl/wwtl.html to learn more about how to access and use the TIGER/Line Shapefiles.
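The same GEOID modification can be done outside Excel. The short Python sketch below applies the rule stated above (remove every character up to and including the letter "S") to the Kent County, Delaware example; the function name `to_geoid10` is hypothetical.

```python
def to_geoid10(acs_geoid):
    """Drop everything up to and including the first "S" in an ACS GEOID."""
    prefix, sep, remainder = acs_geoid.partition("S")
    if not sep:
        raise ValueError(f"no 'S' found in GEOID: {acs_geoid!r}")
    # The result stays a string so that leading zeros are not lost.
    return remainder


print(to_geoid10("05000US10001"))  # prints 10001
```

Treating the result as text mirrors the "Text" column format chosen in the Excel steps; converting it to a number would strip leading zeros from codes for states such as Alabama (FIPS 01).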