Calculating medians

User question: "I was curious how Social Explorer derives its totals when the user is looking at a custom selection of census tracts, particularly in the case of medians, e.g., median household income."

Before we delve into the details of calculating medians, let’s define what a median actually is.

What is a median?

In statistics, there are many types of averages – mean, median, and mode being the most common. Mean is the average everyone’s used to, where you add up all the numbers and then divide the result by the count of numbers. Mode is the value that occurs most often.

Median is the middle value. To find the median, you would list all the values in numerical order from smallest to largest. The median would be located in the center of that list — the middle value. If the list has an even count of values then an average of the two middle values is taken as the median.
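As a minimal sketch of the definition above, the following Python function sorts a list and picks the middle value, averaging the two middle values when the count is even. The function name `median_of_list` is only illustrative and is not part of Social Explorer.

```python
def median_of_list(values):
    """Return the median of a list of numbers (illustrative helper)."""
    ordered = sorted(values)           # smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("cannot take the median of an empty list")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]            # odd count: the single middle value
    # even count: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2
```

This only works when the individual values are available, which is not the case for tabulated census data.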

Medians in the census data

The kind of Census data that Social Explorer provides is tabulated data. This means that there are no lists of individual records to sort in order to look up a median. So how are medians calculated when you don't have access to the underlying data? The Census Bureau's policy is to compute medians using the bracket tables, which is exactly how Social Explorer computes medians for totals. Even though the Census Bureau has access to the underlying data, it still uses the brackets for these computations.

How are medians calculated in aggregate data by Social Explorer?

Social Explorer computes medians from bracket tables. A bracket table categorizes values into ranges. For example, the Household Income table shows all households in brackets like this: <$10,000 (1st bracket), $10K to $15K (2nd bracket), $15K to $20K (3rd bracket), etc. When computing a median, Social Explorer works from the TOTALS column, where the counts for all the selected geographies are added together bracket by bracket. We use an algorithm to find the middle bracket, that is, the bracket in which the cumulative household count first reaches half of the total, so that roughly half the households fall below the median and half above it. We then use the Pareto distribution to estimate the value within the bracket. Take a look at the following for an illustration.

[Figure: illustration of how a median is computed from a bracket table.]

This is the same procedure that the Census Bureau uses to compute medians.
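The steps described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not Social Explorer's or the Census Bureau's actual code: the function names, the `(lower, upper, count)` bracket layout, and the fallbacks (linear interpolation when the bracket starts at zero or is the last one, and returning the lower bound for an open-ended top bracket) are all hypothetical choices made here. The Pareto step assumes that the share of households above an income level follows a Pareto curve inside the median bracket, so the logarithm of that share is linear in the logarithm of income.

```python
import math

def aggregate_brackets(tables):
    """Sum bracket counts across geographies (the TOTALS column).

    tables: list of per-geography lists of (lower, upper, count),
    all using the same bracket boundaries (assumed layout).
    """
    first = tables[0]
    totals = [0] * len(first)
    for table in tables:
        for i, (_, _, count) in enumerate(table):
            totals[i] += count
    return [(lo, hi, totals[i]) for i, (lo, hi, _) in enumerate(first)]

def bracket_median(brackets):
    """Estimate a median from (lower, upper, count) brackets sorted by lower bound."""
    total = sum(count for _, _, count in brackets)
    if total <= 0:
        raise ValueError("no households in the selection")
    half = total / 2.0

    # Find the middle bracket: cumulative count first reaches half the total.
    below = 0.0
    for lower, upper, count in brackets:
        if count > 0 and below + count >= half:
            break
        below += count

    if math.isinf(upper):
        # Assumption: an open-ended top bracket cannot be interpolated.
        return lower

    share_above_lower = (total - below) / total
    share_above_upper = (total - below - count) / total

    if lower <= 0 or share_above_upper <= 0:
        # Assumption: fall back to linear interpolation where logs are undefined.
        return lower + (half - below) / count * (upper - lower)

    # Pareto interpolation: log(share above) is linear in log(income).
    frac = (math.log(share_above_lower) - math.log(0.5)) / (
        math.log(share_above_lower) - math.log(share_above_upper)
    )
    return math.exp(math.log(lower) + frac * (math.log(upper) - math.log(lower)))

# Usage with made-up counts for two tracts (not real census figures):
tract_a = [(0, 10000, 40), (10000, 15000, 90), (15000, 20000, 150), (20000, 25000, 220)]
tract_b = [(0, 10000, 60), (10000, 15000, 110), (15000, 20000, 150), (20000, 25000, 180)]
combined = aggregate_brackets([tract_a, tract_b])
estimate = bracket_median(combined)  # falls inside the $15K to $20K bracket
```

The sketch aggregates the bracket counts first and only then interpolates, because the median of a combined selection cannot be recovered from the medians of the individual tracts.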

Are standard errors computed for medians?

Standard errors are not computed for medians, since medians are more subject to sampling fluctuations. More generally, standard errors are more reliable with larger samples and can be difficult to compute for non-normal distributions.