How Do The Holidays Impact Engineering Productivity? A Statistical Analysis.
An analysis for all the procrastinators out there.
Intro: The Working World, The Planning Fallacy, And The Holidays
Not much work happens in the working world - or at least significantly less than I imagined. Upon entering the workforce, I discovered a universal and unspoken agreement of times when minimal productivity was permitted, such as:
Fridays Past 3 pm: To stay at work past this time is purely optics.
Mondays Before 11 am: The first few hours of every Monday are dedicated to staring at your monitor in bewilderment and contemplating how you will read all your emails or produce anything of value over the next five days.
The Holidays and The Summer: Taking three-day weekends over four consecutive weeks is a bold yet widely accepted strategy. And working between December 22nd and New Year's? A light suggestion, like a tip jar or a speed limit.
To be clear, I'm not complaining. As a new grad, I assumed everyone worked at 110% capacity year-round, so leisure time was a pleasant surprise.
Another fun find of my early work days was that humans constantly fall short in setting and hitting deadlines. Workplace initiatives are effectively dressed-up high school group projects: people of all ages and professions struggle to manage due dates. World-renowned psychologists Daniel Kahneman and Amos Tversky studied systematic failures in project management and termed this phenomenon the planning fallacy. From Wikipedia:
"The planning fallacy is a phenomenon in which predictions about how much time will be needed to complete a future task display an optimism bias and underestimate the time needed."
In his book Thinking, Fast and Slow, Kahneman cites a handful of notable studies that detail the planning fallacy in action, including home renovations costing an average of 2x the original budget and 90 percent of high-speed railroad projects missing on spending and passenger predictions. Unfortunately, infrastructure megaprojects and high school group assignments have much in common.
Human beings are weird about time management, yet things get done at some point. Amazon delivers boxes, Apple launches new products, and Netflix produces a never-ending parade of reality TV. The world turns despite humanity's aggressive desire to avoid doing much of anything.
So when do tasks get done? And how do people manage legitimate deadlines amidst holidays, leisure, and procrastination? If the year is one big sprint - when are we at our most and least productive?
Methodology: When Does the Work Get Done?
Our goal is to identify seasonal trends in human productivity, focusing on work-rate fluctuations within a calendar year. We'll use GitHub pull requests as our indicator of task throughput. A pull request is opened when a software contributor is ready to merge new code into the project's central codebase. Pull requests are the performance review and ribbon-cutting ceremony for a new feature or bug fix. Think of the most impactful product features of the last twenty years - the like button, the retweet, the upvote - they all started with pull requests.
Because we are reviewing pull requests (known as PRs), our findings apply most directly to domains reliant on software engineering. As such, our results may not generalize to non-tech-focused fields (brick-and-mortar retail, manufacturing, agriculture, etc.).
Dataset:
Out of sheer benevolence, Google BigQuery offers a complete snapshot of content from over 2.8 million open-source GitHub repositories. As such, we can quickly analyze the source code of almost 2 billion files via simple SQL queries. We will examine all pull requests from this dataset between 2014 and 2021.
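The monthly roll-up behind the first part of the analysis can be sketched in a few lines of Python. This is a minimal sketch rather than the query used for this article: it assumes the pull request creation timestamps have already been exported as ISO 8601 strings, and the function name count_prs_by_month and the sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime


def count_prs_by_month(created_at_values):
    """Count pull requests per calendar month (1-12), pooling all years."""
    counts = Counter()
    for raw in created_at_values:
        opened = datetime.fromisoformat(raw)
        counts[opened.month] += 1
    # Return every month, including those with zero pull requests.
    return {month: counts.get(month, 0) for month in range(1, 13)}


# Hypothetical sample timestamps, not real GitHub data.
sample_timestamps = [
    "2019-01-15T09:30:00",
    "2019-10-02T14:05:00",
    "2020-10-20T11:45:00",
    "2020-12-11T16:20:00",
]
monthly = count_prs_by_month(sample_timestamps)
print(monthly)
```

Pooling all years into twelve buckets matches the question being asked (which months are busy or quiet), at the cost of hiding year-over-year growth in the dataset.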
Analysis Overview:
Our analysis will address the following questions:
What months are the most and least productive? Are there any macro-level trends throughout the year?
How does developer output change when examined at a weekly level? And what weeks feature the highest and lowest levels of output?
Connecting the dots (and some speculation): why do humans work this way?
Analysis Pt. 1: What's Up With January?
Before beginning this analysis, I assumed December was the obvious candidate for lowest-productivity month, as this period encourages extended vacation time and leisure maximization. And yet December is a reasonably productive month. In fact, according to the GitHub data, January has the lowest level of software development activity.
I was legitimately shocked by this result. I always assumed the beginning of the year featured a re-energized workforce ready to hit the ground running. Apparently, I assumed wrong. So I did some Google research and found several studies corroborating January as a month of below-average output, including this fascinating Priceonomics analysis of task management tools.
So why does work-rate fall at the beginning of the year? Some hypotheses include:
Holidays: January has New Year's Day and Martin Luther King Jr. Day.
Planning Cycles: Priceonomics cites corporate planning cycles as a likely contributor to diminished productivity. The article reasons, "[reduced task velocity] may be because the early year is typically for setting goals, not completing them — and as we near year's end, we're struggling to get everything done."
Employee Turnover: People often leave jobs at the beginning of the year following annual bonuses and stock vesting timelines. Companies may spend January and February recruiting and onboarding new employees.
So the beginning of the year is for slacking, and the year's end is for hustling. This outcome inspires more questions than answers, particularly surrounding developer activity in December. How can a month featuring multiple cultural holidays and a weeklong period of minimal work maintain decent productivity?
Analysis Pt. 2: Q4 Is a Procrastinator's Paradise.
Humans are bad at managing deadlines, yet there are times when targets must be hit. Corporate America lives quarter to quarter, and the year's end signifies an inflexible deadline for Q4 goals and annual goals. In other words, people must get their work done by December 31st.
To understand December's surprising developer activity, I aggregated the same GitHub data at a weekly level. As you can see, the latter half of the year features a surge in pull requests:
Q4 is quirky and short. Thanksgiving, Christmas, and New Year's reduce the number of available workdays, but company goals are often ambitious to close out the year. So people have to achieve more with less time at their disposal. As a result, we see a massive ramp in PR activity starting at the end of September, with a crescendo in the middle of December.
When we sort calendar weeks by PR activity, our most productive periods come between October and mid-December.
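For readers who want to reproduce the weekly view, here is one way the bucketing could look. It is a sketch under assumptions: it uses ISO calendar weeks (Monday through Sunday), which may not match how the weeks in this analysis were defined, and the function name count_prs_by_week and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date


def count_prs_by_week(opened_dates):
    """Count pull requests per ISO calendar week (1-53), pooling all years."""
    counts = Counter()
    for opened in opened_dates:
        iso_week = opened.isocalendar()[1]  # second field is the ISO week number
        counts[iso_week] += 1
    return dict(counts)


# Hypothetical sample dates, not real GitHub data.
sample_dates = [
    date(2021, 1, 6),    # first full week of January
    date(2021, 12, 13),  # Monday in mid-December
    date(2021, 12, 15),  # Wednesday of the same week
]
weekly = count_prs_by_week(sample_dates)
print(weekly)
```

Because ISO week boundaries shift by a few days from year to year, pooling several years into the same week number blurs the edges slightly, which is acceptable for spotting a multi-week ramp.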
And our stretches of lowest productivity occur between Christmas and the end of January.
Perhaps Christmas is actually a five-week holiday that also consumes January.
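The ranking step above amounts to sorting weeks by their pull request counts and reading off both ends of the list. A minimal sketch, assuming the weekly counts are already in a dictionary keyed by week number; the function name rank_weeks and the counts below are made up for illustration and are not taken from the dataset.

```python
def rank_weeks(weekly_counts, top_n=3):
    """Return (busiest, quietest) lists of (week, count) pairs."""
    ordered = sorted(weekly_counts.items(), key=lambda item: item[1], reverse=True)
    busiest = ordered[:top_n]
    # Take the tail and flip it so the quietest week comes first.
    quietest = ordered[-top_n:][::-1]
    return busiest, quietest


# Hypothetical weekly counts, not real GitHub data.
example_counts = {1: 40, 2: 55, 30: 90, 42: 130, 46: 140, 50: 150, 52: 35}
busiest, quietest = rank_weeks(example_counts)
print("busiest:", busiest)
print("quietest:", quietest)
```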
Results and Takeaways: People Plan Gradually and Execute Later
The calendar year ends on December 31st, but Christmas is the finish line for work schedules. In reviewing fluctuations in developer activity, we can observe a few clear-cut trends:
Developer activity is low from the beginning of the year through early February.
There is a minor dip in engineering output throughout the summer months.
Work-rate ramps significantly at the end of September and escalates through mid-December.
Humans are slow planners and skilled procrastinators but don't like missing annual deadlines.
Final Thoughts: Thinking, Slow then Fast
In 2014, researchers from the University of Chicago and the University of Toronto conducted a series of studies surrounding time categorization and management. In one such experiment, farmers in rural India were offered a financial incentive to open bank accounts and make an initial deposit. One set of farmers was given a deadline of December 31st that year, while a second group was incentivized to deposit funds before January 31st the following year. The results were striking. Farmers with a December 31st deadline deposited funds at roughly six times the rate of the cohort with a January 31st deadline (28% to 4.5%). In assessing their findings, the researchers proposed dueling modes of time categorization. A deadline is perceived as "unlike-the-present" or "like-the-present," with the latter categorization triggering an execution-focused mindset and a higher likelihood of task completion. It turns out everyone procrastinates - farmers and software engineers alike.
There are times when minimal work output and procrastination are permitted, but the year's end is not one of those times. Once October begins, there is no longer an "unlike-the-present" timeline. As such, the last three months of the year are a sprint to atone for the prior nine months, with cultural celebrations interspersed throughout this gauntlet of catch-up work. Moreover, Q4's holiday extravaganza exacts karmic retribution on stragglers, as Christmas, New Year's, and Thanksgiving collectively reduce available work days by around 10%. Unfortunately, Thanksgiving turkey, Christmas cheer, and New Year's parties come at a price.
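The share of Q4 workdays lost to holidays depends on which days you count as off, so it helps to make that choice explicit. The sketch below counts weekdays from October 1 through December 31 and divides the days off by that total. Everything specific here is an assumption: the function name q4_workday_share_lost is hypothetical, and the six days off describe an imagined 2021 company calendar (two days around Thanksgiving, two around Christmas, two around New Year's), not a figure from the analysis.

```python
from datetime import date, timedelta


def q4_workday_share_lost(year, days_off):
    """Return (weekday_count, share_lost) for October 1 through December 31."""
    day = date(year, 10, 1)
    end = date(year, 12, 31)
    weekdays = set()
    while day <= end:
        if day.weekday() < 5:  # Monday=0 ... Friday=4
            weekdays.add(day)
        day += timedelta(days=1)
    # Only days off that land on a weekday reduce available workdays.
    lost = len(set(days_off) & weekdays)
    return len(weekdays), lost / len(weekdays)


# Hypothetical company calendar for 2021 (an assumption, not sourced data).
assumed_days_off = [
    date(2021, 11, 25), date(2021, 11, 26),  # Thanksgiving and the day after
    date(2021, 12, 23), date(2021, 12, 24),  # around Christmas
    date(2021, 12, 30), date(2021, 12, 31),  # around New Year's
]
total_weekdays, share_lost = q4_workday_share_lost(2021, assumed_days_off)
print(total_weekdays, f"{share_lost:.1%}")
```

A stricter calendar with fewer days off yields a smaller share, and counting the quiet week between Christmas and New Year's yields a larger one, so the result is best read as a range rather than a single number.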
And while our beloved winter celebrations amplify work stress, they also enrich our well-being and provide immediate relief from our self-inflicted suffering. Workers must deliver something at some point, so why not hustle hardest in advance of the happiest time of year? I always thought of the Christmas holidays as a drain on work productivity, but maybe they are a great motivator. Salvaging deadlines is never fun, but perhaps our end-of-year slog is a little less bad with the promise of gingerbread houses, Christmas movies, and family time.
Want to chat about data and statistics? Have an interesting data project? Just want to say hi? Email daniel@statsignificant.com
Speculation: the lack of development productivity in January is due to spending the month desperately running around fixing problems in production caused by rushed implementations in December.
Every year my New Year's resolution is to try to strike a better work/life balance?