import pandas as pd
import pandas_datareader as pdr
import altair as alt
Jobs Friday, October 2024
The Bureau of Labor Statistics just reported employment figures for September 2024. The report came in above expectations, with total job growth of 254 thousand, well above the expected 150 thousand.
Here we show how to code a couple of easy charts to visualize job growth by sector: one for September growth and another for year-to-date figures. Keep in mind that the BLS revises these figures; most revisions are small, but some are not.
All right, let’s get to it. The first step is to import packages. We use pandas-datareader to pull our data from the very helpful FRED.
Next, we construct a dictionary with the FRED id for each sector and for total payroll. The dictionary pairs each id with a more meaningful name, which is the one we will use in our charts.
payroll_dict = {
    "PAYEMS": 'TOTAL',
    "USCONS": 'Construction',
    "USTRADE": 'Retail',
    "USPBS": 'Professional and Business',
    "MANEMP": 'Manufacturing',
    "USFIRE": 'Finance',
    "USMINE": 'Mining',
    "USEHS": 'Education and Health',
    "USWTRADE": 'Wholesale Trade',
    "USTPU": 'Transport and Utilities',
    "USINFO": 'Information',
    "USLAH": 'Leisure and Hospitality',
    "USGOVT": 'Government',
    "USSERV": 'Other'
}
Now we pull the data. Note how useful the above dictionary is: we use its keys to grab the data from FRED and then we map the id into Name, which holds the dictionary values.
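To make the two roles of the dictionary concrete, here is a minimal sketch on a two-entry subset. The names demo_dict, fred_ids, and demo are hypothetical and used only for illustration; the rows are made up.

```python
import pandas as pd

# Hypothetical two-entry subset of the dictionary, for illustration only
demo_dict = {"PAYEMS": "TOTAL", "USCONS": "Construction"}

# Role 1: the keys are the series ids we would request from FRED
fred_ids = list(demo_dict.keys())

# Role 2: the values are the readable names used in the charts.
# Made-up long dataframe whose Sector column holds the ids.
demo = pd.DataFrame({"Sector": ["PAYEMS", "USCONS", "PAYEMS"]})

# map() looks up each id in the dictionary and returns its name
demo["Name"] = demo["Sector"].map(demo_dict)
print(demo)
```

Any id that is missing from the dictionary would map to NaN, so it pays to keep the dictionary and the list of requested series in sync, which using the keys directly does for free.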
We pull the data starting in December 2023 so we can compute changes; FRED gives us levels of employment, but what we want are changes (or growth). Note also that we melt our dataframe into long format, so each row is a unique date-sector combination. Having our data long will make it easier to plot.
Employment = (
    # pull data from FRED
    pdr.get_data_fred(list(payroll_dict.keys()),
                      start='2023-12-01')
    # extract DATE index as column
    .reset_index()
    # data to long or tidy
    .melt(id_vars='DATE',
          var_name='Sector',
          value_name='Payroll')
    # map names and compute monthly growth
    .assign(Name = lambda df: df['Sector'].map(payroll_dict),
            Growth = lambda df: df.groupby('Sector')['Payroll'].diff() )
    # drop December observation
    .dropna()
)
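To see what the melt, diff, and dropna steps do without calling FRED, here is a small sketch on made-up levels. The series names A and B and all the numbers are invented for illustration; only the sequence of operations mirrors the pipeline above.

```python
import pandas as pd

# Made-up employment levels (thousands) in wide format, one column per series
wide = pd.DataFrame(
    {
        "DATE": pd.to_datetime(["2023-12-01", "2024-01-01", "2024-02-01"]),
        "A": [100.0, 103.0, 101.0],
        "B": [50.0, 50.0, 54.0],
    }
)

tidy = (
    wide
    # one row per date-sector combination
    .melt(id_vars="DATE", var_name="Sector", value_name="Payroll")
    # month-over-month change, computed separately within each sector
    .assign(Growth=lambda df: df.groupby("Sector")["Payroll"].diff())
    # the first month of each sector has no previous value, so it drops out
    .dropna()
)
print(tidy)
```

The December rows exist only to give January something to subtract from, which is why the pull starts one month before the period we care about.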
September Growth
The first chart shows payroll growth in September. We build the chart in multiple layers, which is usually the best way to do it in altair.
We show a neat trick to create a bar chart with bars in different colors and labels in yet another color. We first create a base layer that has the data and encodes the y- and x-axis but not the color. The plot layer adds the color encoding to base, while the labels layer adds the label text to base. If we added the labels to the plot layer, the labels would follow the same color scale, steelblue and maroon, instead of black. Then we add a layer for the dashed red line at 0 growth and one more for data attribution.
# base information
base = (
    alt.Chart(Employment.query('DATE==DATE.max()'),
              # outer double quotes keep the f-string valid before Python 3.12
              title=alt.Title(f"Payroll Growth for {Employment['DATE'].max().strftime('%B %Y')}",
                              subtitle='Thousands of Jobs') )
    .mark_bar(opacity=0.8)
    .encode(
        alt.X('Growth:Q').title(None)
            .axis(labelExpr="datum.value + 'K'"),
        alt.Y('Name:N').title(None).sort('-x')
    )
)
# now encode different color for pos/neg growth
plot = (
    base
    .encode(
        color = alt.condition(
            alt.datum.Growth > 0,
            alt.value('steelblue'),
            alt.value('maroon')
        ) )
)
# label to show number
labels = (
    base
    .mark_text(fontSize=16,
               fontWeight='bold',
               color='black',
               opacity=0.75)
    .encode( alt.Text('Growth:Q').format('.0f') )
)
# mark 0 line
rule = (
    alt.Chart()
    .mark_rule(color='red', strokeDash=[6,3])
    .encode(x=alt.datum(0))
)
# source attribution
footer = (
    alt.Chart()
    .mark_text(text="Source: FRED",
               y='height', x=0,
               dy=40,
               align='left',
               color='gray')
)
alt.layer(plot, labels, rule, footer)
When we look at the sector breakdown of the jobs report, Education and Health and Leisure and Hospitality account for the bulk of the growth, while Manufacturing is the only sector that lost jobs. One more reminder that services power our economy, a fact we sometimes seem to forget when going crazy about industrial policy.
Year to Date
Now let’s create a chart for year-to-date growth. It is pretty similar to the one above once we construct the YTD growth in the code snippet below. YTD growth is just the cumulative sum of monthly growth within each sector.
Employment['YTD_growth'] = Employment.groupby('Sector')['Growth'].cumsum()
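As a quick check of what the grouped cumulative sum produces, here is a sketch on made-up monthly changes for two hypothetical sectors, A and B. The dataframe demo and its numbers are invented for illustration.

```python
import pandas as pd

# Made-up monthly changes, three months for each of two hypothetical sectors
demo = pd.DataFrame(
    {
        "Sector": ["A", "A", "A", "B", "B", "B"],
        "Growth": [3.0, -2.0, 5.0, 0.0, 4.0, -1.0],
    }
)

# Running total of monthly growth, restarted for each sector
demo["YTD_growth"] = demo.groupby("Sector")["Growth"].cumsum()
print(demo)
```

The last row of each sector holds its total change since the starting month, which is why filtering on the latest date is enough to build the year-to-date chart.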
Now we make the same chart except for the x-axis variable.
base = (
    alt.Chart(Employment.query('DATE==DATE.max()'),
              title=alt.Title('Year-to-Date Payroll Growth',
                              # outer double quotes keep the f-string valid before Python 3.12
                              subtitle=f"{Employment['DATE'].max().strftime('%B %Y')}, Thousands of Jobs") )
    .mark_bar(opacity=0.8)
    .encode(
        alt.X('YTD_growth:Q').title(None)
            .axis(labelExpr="datum.value + 'K'"),
        alt.Y('Name:N').title(None).sort('-x')
    )
)
plot = (
    base
    .encode(
        color = alt.condition(
            alt.datum.YTD_growth > 0,
            alt.value('cornflowerblue'),
            alt.value('indianred')
        ) )
)
labels = (
    base
    .mark_text(fontSize=16,
               fontWeight='bold',
               color='black',
               opacity=0.8)
    .encode( alt.Text('YTD_growth:Q').format(',.0f') )
)
alt.layer(plot, labels, rule, footer)
Manufacturing is again the laggard, joined by Information and Mining in the red. The losses in the Information sector highlight how tough the last couple of years have been on IT jobs. Again, Education and Health and Leisure and Hospitality are the top sectors, together with Government.