Jobs Friday, October 2024

Python

Employment

Two charts on job growth.

Published

October 6, 2024

The Bureau of Labor Statistics just reported employment figures for September 2024. The report came in well above expectations: total job growth was 254 thousand, against an expected 150 thousand.

Here we show how to code a couple of easy charts to visualize job growth by sector: one for September growth and another for year-to-date figures. Keep in mind that the BLS revises these figures; most of the time revisions are small, but sometimes they are not.

All right, let’s get to it. First step is to import packages. We use pandas-datareader to pull our data from the very helpful FRED.

import pandas as pd
import pandas_datareader as pdr
import altair as alt

Next, construct a dictionary with the FRED id for all sectors and total payroll. The dictionary pairs each id with a more meaningful name, which is the one we will use in our charts.

payroll_dict = {
    "PAYEMS": 'TOTAL',
    "USCONS": 'Construction',
    "USTRADE": 'Retail',
    "USPBS": 'Professional and Business',
    "MANEMP": 'Manufacturing',
    "USFIRE": 'Finance',
    "USMINE": 'Mining',
    "USEHS": 'Education and Health',
    "USWTRADE": 'Wholesale Trade',
    # note: FRED's USTPU series also covers wholesale and retail trade
    "USTPU": 'Trade, Transport and Utilities',
    "USINFO": 'Information',
    "USLAH": 'Leisure and Hospitality',
    "USGOVT": 'Government',
    "USSERV": 'Other'
}

Now we pull the data. Note how useful the above dictionary is: we use its keys to grab the data from FRED and then map each id to a Name, taken from the dictionary values.

We pull the data starting in December 2023 so we can compute changes; FRED gives us levels of employment, but what we want are changes (or growth). Note also that we melt our dataframe to make it long, so each row is a unique date-sector combination. Having our data in long format makes it easier to plot.

Employment = (
    # pull data from FRED
    pdr.get_data_fred(list(payroll_dict.keys()),
        start='2023-12-01')
    # extract DATE index as column
    .reset_index()
    # data to long or tidy
    .melt(id_vars='DATE',
        var_name='Sector',
        value_name = 'Payroll')
    # map names and compute monthly growth
    .assign(Name = lambda df: df['Sector'].map(payroll_dict),
        Growth = lambda df: df.groupby('Sector')['Payroll'].diff() )
    # drop December observation
    .dropna()
)
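To see what the reshaping steps do without calling FRED, here is a minimal sketch on a tiny hand-made dataframe. The payroll levels are made up for illustration, and the two-entry dictionary toy_dict is a hypothetical stand-in for payroll_dict; only the sequence of pandas calls mirrors the code above.

```python
import pandas as pd

# hypothetical stand-in for payroll_dict (two sectors only)
toy_dict = {"USCONS": "Construction", "MANEMP": "Manufacturing"}

# made-up payroll levels in thousands, shaped like the wide frame FRED returns
wide = pd.DataFrame(
    {"USCONS": [8000.0, 8010.0, 8025.0],
     "MANEMP": [12900.0, 12895.0, 12890.0]},
    index=pd.DatetimeIndex(["2023-12-01", "2024-01-01", "2024-02-01"], name="DATE"),
)

toy = (
    wide
    # extract DATE index as column
    .reset_index()
    # one row per date-sector pair
    .melt(id_vars="DATE", var_name="Sector", value_name="Payroll")
    # readable names and month-over-month change within each sector
    .assign(
        Name=lambda df: df["Sector"].map(toy_dict),
        Growth=lambda df: df.groupby("Sector")["Payroll"].diff(),
    )
    # the first month of each sector has no previous value to difference against
    .dropna()
)

print(toy)
```

Each sector loses its December row because there is nothing earlier to difference against, which leaves one row per sector for each month of 2024.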

September Growth

The first chart shows payroll growth in September. We build the chart in multiple layers, which is usually the best way to do it in altair.

We show a neat trick to create a bar chart with different colors and labels with another color. We first create a base layer that has the data and encodes the y- and x-axis but not the color. The plot layer adds the color encoding to base while the labels layer adds the label text to the base. If we added the labels to the plot layer then the labels would follow the same color scale, steelblue and maroon instead of black. Then we add a layer for the dashed red line at 0 growth and one more for data attribution.

# base information
base = (
    # double quotes outside, single quotes inside: works on Python versions before 3.12 too
    alt.Chart(Employment.query('DATE==DATE.max()'),
        title=alt.Title(f"Payroll Growth for {Employment['DATE'].max().strftime('%B %Y')}",
            subtitle='Thousands of Jobs') )
    .mark_bar(opacity=0.8)
    .encode(
        alt.X('Growth:Q').title(None)
            .axis(labelExpr="datum.value + 'K'"),
        alt.Y('Name:N').title(None).sort('-x')
    )
)

# now encode different color for pos/neg growth
plot = (
    base
    .encode(
        color = alt.condition(
            alt.datum.Growth > 0,
            alt.value('steelblue'),
            alt.value('maroon')
    ) )
)

# label to show number
labels = (
    base
    .mark_text(fontSize=16, 
            fontWeight='bold', 
            color='black',
            opacity=0.75)
    .encode( alt.Text('Growth:Q').format('.0f') )
)

# mark 0 line
rule = (
    alt.Chart()
    .mark_rule(color='red', strokeDash=[6,3])
    .encode(x=alt.datum(0))
)

# source attribution
footer = (
    alt.Chart()
    .mark_text(text="Source: FRED", 
        y='height', x=0, 
        dy=40, 
        align='left',
        color='gray')
) 

alt.layer(plot, labels, rule, footer)

When we look at the sector breakdown of the jobs report, Education and Health and Leisure and Hospitality account for the bulk of the growth, while Manufacturing is the only sector that decreased. One more reminder that services power our economy, a fact we sometimes seem to forget when going crazy about industrial policy.

Year to Date

Now let’s create a chart for year-to-date growth. This is pretty similar to the one above after we construct the YTD growth in the code snippet below. YTD is just the cumulative sum of growth.

Employment['YTD_growth'] = Employment.groupby('Sector')['Growth'].cumsum()
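Because Growth is a month-over-month difference, its cumulative sum telescopes: YTD growth through any month equals that month's payroll level minus the December level. Here is a quick sanity check on made-up numbers (the values and the name levels are illustrative assumptions, not BLS data):

```python
import pandas as pd

# made-up monthly payroll levels (thousands) for one sector, December through March
levels = pd.Series(
    [100.0, 103.0, 101.0, 106.0],
    index=pd.to_datetime(["2023-12-01", "2024-01-01", "2024-02-01", "2024-03-01"]),
)

growth = levels.diff().dropna()   # monthly changes: 3, -2, 5
ytd = growth.cumsum()             # running total: 3, 1, 6

# the running total matches each month's level minus the December level
check = levels.iloc[1:] - levels.iloc[0]
print(ytd.equals(check))
```

This is why the data pull starts in December: that month only serves as the baseline for the January change.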

Now we make the same chart except for the x-axis variable.

base = (
    # double quotes outside, single quotes inside: works on Python versions before 3.12 too
    alt.Chart(Employment.query('DATE==DATE.max()'),
        title=alt.Title('Year-to-Date Payroll Growth',
            subtitle=f"{Employment['DATE'].max().strftime('%B %Y')}, Thousands of Jobs") )
    .mark_bar(opacity=0.8)
    .encode(
        alt.X('YTD_growth:Q').title(None)
            .axis(labelExpr="datum.value + 'K'"),
        alt.Y('Name:N').title(None).sort('-x')
    )
)

plot = (
    base
    .encode(
        color = alt.condition(
            alt.datum.YTD_growth > 0,
            alt.value('cornflowerblue'),
            alt.value('indianred')
    ) )
)

labels = (
    base
    .mark_text(fontSize=16, 
            fontWeight='bold', 
            color='black',
            opacity=0.8)
    .encode( alt.Text('YTD_growth:Q').format(',.0f') )
)

alt.layer(plot, labels, rule, footer)

Manufacturing is again the laggard, joined by Information and Mining in the red. The losses in the Information sector highlight how tough the last couple of years have been on IT jobs. Education and Health and Leisure and Hospitality are again the top sectors, together with Government.