Scaling your Team from 5 to 250 Engineers 🌱

Patterns and recommendations about velocity, quality, and outcome.

Jun 16, 2022

This is a guest post written by my friend Eiso Kant.

As Founder and CEO of Athenian, Eiso spent the last five years speaking to over 400 engineering orgs, giving him in-depth insights into the challenges that they face.

Eiso also co-hosts Developing Leadership with Jason Warner (former CTO of GitHub) — the podcast guiding listeners toward more successful engineering leadership.

Scaling your engineering organization is like zooming out on Google Maps. You start with a lot of fine-grained visibility on streets, traffic, restaurants, and shops. But the more you zoom out, the more challenging it becomes to find helpful information.

As an engineering leader, you experience the same. In the beginning, you can quickly identify patterns and bottlenecks. But as your team grows, your visibility of the underlying issues will decrease.

Today we will uncover the challenges you are likely to face as you go through the rocky road of scaling from 5 to 250 engineers. We will do so by looking at three major angles:

🏃‍♂️ Velocity
🔍 Quality
🎯 Outcome

We will also dig into the metrics you should look at as you scale, and how you can use tools like Athenian to create a culture of continuous improvement within your engineering team.

Let’s dive in 👇

🏃‍♂️ What Happens To Velocity?

Many engineering leaders believe the myth that when you double your team, you will double the output. But my experience working with hundreds of orgs shows that the speed of shipping code doesn’t increase linearly as you scale.

If you’re growing fast, you’ll have a constant onboarding challenge, which takes a toll on velocity.

At the same time, most of your leaders will probably be newly promoted and might lack the experience to support the organization's growth.

So, let’s take a look at each scaling stage:

🌱 From 5 to 20 Engineers

In startups, we often scale teams fast and fix things when they break.

Early-stage companies trying to get to product-market fit (PMF) put a lot of pressure on delivery, which leads to keeping a steady speed but accumulating a lot of tech debt.

💡 Recommendations:

Create core processes to improve the developer experience continuously, while focusing on market fit.
Avoid tech debt in anything that would prevent you from scaling in the future.
Appoint a Head of Engineering to talk with team leaders, understand blockers, and design the product and org architecture to support the next growth level.

🪴 From 20 to 100 Engineers

You start zooming out on the map and lose visibility of the details. Managers begin having managers, and the new people you bring in will slow you down because it takes time to grasp the full context of the product and organization.

If you don't set the right foundations in the previous stage, this is where you can lose 25-50% of effectiveness.

Your code reviews will be consistently blocked because of team dependencies, and you'll start having misalignments, especially as your lines of communication increase.
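To make the growth in lines of communication concrete, here is a minimal sketch. It assumes that "lines of communication" means the number of possible one-to-one channels between people, n(n-1)/2, which is a common reading and not a definition given in this post. The function name is hypothetical.

```python
def communication_lines(people: int) -> int:
    """Number of distinct pairs among `people` individuals: n * (n - 1) / 2."""
    if people < 0:
        raise ValueError("people must be non-negative")
    return people * (people - 1) // 2


# Team sizes taken from the scaling stages discussed in this post.
for size in (5, 20, 100, 250):
    print(f"{size} engineers -> {communication_lines(size)} possible channels")
```

Going from 5 to 20 engineers multiplies headcount by 4 but the possible channels by 19 (10 to 190), which is one way to see why alignment gets harder faster than the team grows.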

💡 Recommendations:

Bring expertise from the outside and promote internal people simultaneously so you can pair them.
Create a team dedicated to the developer experience, but keep teams on their toes.
Double down on processes and goal definition. This is where you want to build the foundations for a 1,000-developer team.
Thoughtfully plan the structure of product teams (value delivery) and platform teams (product foundations and enablement) for what's coming next.

🌳 From 100 to 250 Engineers

While growing fast, you face the problem of onboarding many new people and having lots of internal people move to manager or tech lead positions.

You will continuously reshape the organization but lack visibility on where to act. You are looking at the city from 10,000 km in the air, which means you can’t see where the bottlenecks are. This will slow you down.

💡 Recommendations:

Train managers and tech leads on how to delegate so they can focus on team performance and people management.
Systematically monitor organizational bottlenecks, gather with your leaders to discuss potential solutions, and define explicit goals for improving velocity.
Push responsibility to the teams. They are the ones that can solve problems while you focus on minimizing dependencies between groups.

🔍 What Happens To Quality?

Most of us fall into the build trap.

As we grow, we primarily focus on delivering new features, eventually reaching a point where tech debt is too big to tame, leading to a full product rewrite.

If you are growing fast, you’ll have to constantly deal with tech debt. Most of your engineers will complain about it, but they will rarely bring real data to the table, which makes it hard to know where you are.

🌱 From 5 to 20 Engineers

When you have fewer than 5 engineers, you don’t yet have enough customers for a bug or outage to have much impact. So you should mostly focus on speed rather than quality.

However, when you scale from 5 to 20 and find some PMF, your customer base increases. This is when you'll start to dig into the quality of the product.

💡 Recommendations:

Ensure you define a process to capture the issues you’re facing and keep critical problems under control.
Prioritize bugs that block any potential growth of your product.
Promote good traceability and logging standards from the get-go.

🪴 From 20 to 100 Engineers

PMF is clear at this stage, and the size and value of your customers increase.

The number of bugs and critical incidents will also increase, making teams feel underwater. As a result, the product you built might no longer work for this market size.

💡 Recommendations:

Maintain strong bug backlog management, ensuring you fix all critical and high-priority issues quickly.
Remember that issue priority is based on criticality and frequency, so you need to fix issues you know will hit you multiple times early on.
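One way to picture "criticality and frequency" together is a simple score. The sketch below is purely illustrative: the severity weights, the `priority_score` function, and the sample issues are all hypothetical, and the post does not prescribe any particular formula.

```python
# Hypothetical weights: how much one occurrence of each severity "costs".
SEVERITY_WEIGHT = {"critical": 8, "high": 4, "medium": 2, "low": 1}


def priority_score(severity: str, occurrences: int) -> int:
    """Combine criticality and frequency into a single sortable number."""
    return SEVERITY_WEIGHT[severity] * occurrences


# (title, severity, times the issue was hit) -- made-up sample data.
issues = [
    ("login fails", "critical", 1),
    ("slow export", "medium", 10),
    ("typo in footer", "low", 3),
]

ranked = sorted(issues, key=lambda i: priority_score(i[1], i[2]), reverse=True)
for title, severity, occurrences in ranked:
    print(title, priority_score(severity, occurrences))
```

With these made-up weights, a medium issue that hits ten times outranks a critical issue that hit once, which is the point of the recommendation: recurring issues deserve early attention.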

🌳 From 100 to 250 Engineers

The challenges of the previous stage will escalate when you grow from 100 to 250 and beyond.

Some of your bugs will now come from the accumulation of tech debt, negatively impacting your MTTR (Mean Time to Restore). In addition, teams will start pushing bugs to one another if you don't put good team structures in place (Conway's law).

💡 Recommendations:

Platform teams should be taming MTTR. A low MTTR is what allows you to keep moving fast.
Another critical metric you need to pay attention to is change failure rate.
As with velocity, make sure you have regular reviews to create the habit of looking at these metrics and acting on them from your management team.
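Change failure rate is commonly defined as the share of deployments that lead to a failure in production. A minimal sketch of that calculation follows; the `Deployment` record and its `caused_failure` flag are hypothetical, and how you decide that a deployment "caused a failure" is left to your own incident process.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Deployment:
    id: str
    caused_failure: bool  # hypothetical flag, e.g. set when an incident is linked


def change_failure_rate(deployments: Sequence[Deployment]) -> Optional[float]:
    """Fraction of deployments that caused a failure, or None if there were none."""
    if not deployments:
        return None
    failed = sum(1 for d in deployments if d.caused_failure)
    return failed / len(deployments)


sample = [
    Deployment("d1", False),
    Deployment("d2", True),
    Deployment("d3", False),
    Deployment("d4", False),
]
print(change_failure_rate(sample))
```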

🎯 What Happens to Impact / Outcome?

As we keep growing, Engineering Leaders need to become more opinionated, because the tradeoffs between velocity, quality, and customer impact become more evident.

If you’re growing fast, you’ll have to constantly deal with the features vs. quality tradeoff, which takes a toll on the capacity to deliver on the product. Being intentional in making these decisions will play a critical role in your success.

🌱 From 5 to 20 Engineers

Shipping is king as you aim for product-market fit.

Your roadmap is still directly impacted by conversations with customers, so you will invest most of your efforts in new features.

However, the technical decisions you made when you were <5 engineers will start to slow you down.

💡 Recommendations:

Focus on new features until you achieve PMF, but make sure to invest in the bug fixes, developer efficiency, and architecture decisions that would otherwise block the growth of your product.
As for everything else, let it burn, but keep an eye on it.

🪴 From 20 to 100 Engineers

You now have a live product with a mature client base.

You need to shift your efforts from primarily creating new features to refining internal processes.

Remember that business value is everything that creates value for the business, so you will focus more and more on fixing bugs and cleaning up tech debt, and less on new features.

💡 Recommendations:

Ensure teams define and report their level of investment.
Standardize processes across teams for you to understand effort investment at a high level.
Don’t be punitive if teams don’t meet the investment level you were expecting. Understand why and help course-correct.

🌳 From 100 to 250 Engineers

Your visibility will continue to decrease at this stage, so trust is critical.

You might realize one quarter later that the team wasn’t focused on delivering the expected features.

Investment is the mastermind behind outcomes. You need to be decisive on time allocation and push a lot of these decisions to the teams.

💡 Recommendations:

Think of investment as a strategic component, define quarterly/yearly goals, and adapt based on insights from regular meetings.
Support and educate the teams on investment decisions. You also need to coordinate investment happening across the org and promote transparency with the other directors.

Now that we’ve seen the challenges of scaling your engineering org, let’s dive into what engineering leaders can do to prevent these scenarios.

It starts with data! 👇

📈 What Metrics Should You Look At?

What can you do to keep things in check as your team grows?

The key is to understand how the different parts of your delivery pipeline work, to identify the levers you can pull for smooth sailing as you scale.

Modern engineering teams focus on delivering value through continuous delivery, integration, and improvement.

To do so, here are the key metrics to keep an eye on:

🏃‍♂️ Velocity Metrics

Velocity metrics help you assess the agility of your team so you can understand how to improve the speed and direction of software development across all the stages of your pipeline.

Here are the main KPIs you should track:

Lead Time
Release Frequency
Released PRs
PR Cycle Time

Let’s see them all 👇

Lead Time

Lead Time tells you how long it takes for your team to go from a ticket being in progress to a released PR, allowing you to detect early bottlenecks and resolve them on the spot.
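As a rough illustration, Lead Time can be computed from two timestamps per ticket. This is a sketch under assumptions: the tuple layout and the sample timestamps are made up, and using the median (rather than the mean) to summarize is a choice made here, not something the post specifies.

```python
from datetime import datetime
from statistics import median


def lead_time_hours(in_progress_at: datetime, released_at: datetime) -> float:
    """Hours from a ticket entering 'in progress' to its PR being released."""
    return (released_at - in_progress_at).total_seconds() / 3600


# (in_progress_at, released_at) -- made-up sample data.
tickets = [
    (datetime(2022, 6, 1, 9), datetime(2022, 6, 3, 9)),  # 48 hours
    (datetime(2022, 6, 2, 9), datetime(2022, 6, 6, 9)),  # 96 hours
    (datetime(2022, 6, 6, 9), datetime(2022, 6, 7, 9)),  # 24 hours
]

lead_times = [lead_time_hours(start, end) for start, end in tickets]
median_lead_time = median(lead_times)
print(median_lead_time)
```

The median is less sensitive than the mean to a single ticket that sat in progress for weeks, which tends to make trends easier to read.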

Release Frequency

Release Frequency tells you how often you release value to your customers. Together with Lead Time, Release Frequency measures software delivery performance tempo.
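A minimal sketch of Release Frequency, expressed as releases per week over a chosen period. The function name, the weekly unit, and the sample dates are assumptions made for illustration.

```python
from datetime import date
from typing import Iterable


def releases_per_week(release_dates: Iterable[date], start: date, end: date) -> float:
    """Average number of releases per week within the inclusive period [start, end]."""
    days = (end - start).days + 1
    if days <= 0:
        raise ValueError("end must not be before start")
    in_period = [d for d in release_dates if start <= d <= end]
    return len(in_period) / (days / 7)


# Made-up release dates; the last one falls outside the period below.
releases = [
    date(2022, 6, 1), date(2022, 6, 3), date(2022, 6, 8),
    date(2022, 6, 15), date(2022, 6, 22), date(2022, 6, 27),
    date(2022, 7, 4),
]

frequency = releases_per_week(releases, date(2022, 6, 1), date(2022, 6, 28))
print(frequency)
```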

Released PRs

The number of pull requests released during a period is an important metric to consider in combination with your PR Cycle Time.

A decreasing Lead Time might simply be linked to a decrease in the number of released PRs, so check both before concluding that delivery got faster.
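Counting released PRs per period is straightforward. The sketch below groups release dates by ISO week; the function name and sample dates are hypothetical, and weekly buckets are just one reasonable choice of period.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable, Tuple


def released_prs_per_week(release_dates: Iterable[date]) -> Dict[Tuple[int, int], int]:
    """Map (ISO year, ISO week) to the number of PRs released in that week."""
    counts: Counter = Counter()
    for released_on in release_dates:
        iso = released_on.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return dict(counts)


# Made-up release dates: two PRs in one week, one in the next.
weekly = released_prs_per_week([date(2022, 6, 1), date(2022, 6, 2), date(2022, 6, 7)])
print(weekly)
```

Plotting these counts next to your time-based metrics for the same weeks makes it easier to tell a genuinely faster pipeline from a quieter one.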

PR Cycle Time

PR Cycle Time is the elapsed time between the first commit of a PR and the code being used in production. Compared to Lead Time, it focuses on just the code pipeline section.
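As a sketch, PR Cycle Time is a difference between two timestamps. The function and parameter names are hypothetical, and where you source the "used in production" timestamp (for example, a deployment event) is an assumption that depends on your tooling.

```python
from datetime import datetime


def pr_cycle_time_hours(first_commit_at: datetime, in_production_at: datetime) -> float:
    """Hours between the first commit of a PR and its code running in production."""
    if in_production_at < first_commit_at:
        raise ValueError("in_production_at must not be before first_commit_at")
    return (in_production_at - first_commit_at).total_seconds() / 3600


# Made-up example: first commit at 09:00, in production at 15:00 the next day.
cycle = pr_cycle_time_hours(datetime(2022, 6, 1, 9), datetime(2022, 6, 2, 15))
print(cycle)
```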

🔍 Quality Metrics

Quality metrics should be as close as possible to your end-user experience.

Without quality metrics, it’s easy to find yourself going at a faster speed at the expense of your end-user experience.

Here are the main KPIs you should track:

Bugs Fixing Ratio
Bugs Raised by Priority
Mean Time to Restore by Priority

Let’s see them all.

Bugs Fixing Ratio

The Bugs Fixing Ratio is the ratio of bugs fixed to bugs raised during the selected time period.

*Comparison of bugs raised and fixed over a 1yr period. Source: Athenian*
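A minimal sketch of the ratio, assuming it is computed as bugs fixed divided by bugs raised in the same period (an interpretation based on the chart caption above). The function name is hypothetical.

```python
from typing import Optional


def bug_fixing_ratio(fixed: int, raised: int) -> Optional[float]:
    """Bugs fixed divided by bugs raised in the same period; None if none were raised."""
    if raised == 0:
        return None
    return fixed / raised


print(bug_fixing_ratio(45, 60))
```

Under this reading, a ratio below 1 means the bug backlog grew during the period, and a ratio above 1 means it shrank.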

Bugs Raised by Priority

Bugs Raised by Priority helps you get more granular information into the bugs raised by severity level.
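The breakdown itself is a simple count per priority label. In this sketch, the dictionary shape, the ticket IDs, and the priority names are all made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def bugs_raised_by_priority(bugs: Iterable[dict]) -> Dict[str, int]:
    """Count raised bugs per priority label."""
    return dict(Counter(bug["priority"] for bug in bugs))


# Made-up sample data.
raised = [
    {"id": "BUG-1", "priority": "critical"},
    {"id": "BUG-2", "priority": "low"},
    {"id": "BUG-3", "priority": "critical"},
    {"id": "BUG-4", "priority": "high"},
]

by_priority = bugs_raised_by_priority(raised)
print(by_priority)
```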

Mean Time to Restore by Priority

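Mean Time to Restore by Priority tells you how long it takes to restore service after an issue, broken down by priority. As an engineering leader, you want to make sure that your team is fixing customer-impacting issues the fastest.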
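Here is a sketch of Mean Time to Restore grouped by priority, assuming each incident has a priority label, an opened timestamp, and a restored timestamp. The tuple layout and sample incidents are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean
from typing import Dict, Iterable, Tuple

Incident = Tuple[str, datetime, datetime]  # (priority, opened_at, restored_at)


def mttr_hours_by_priority(incidents: Iterable[Incident]) -> Dict[str, float]:
    """Average hours from opened to restored, per priority."""
    durations = defaultdict(list)
    for priority, opened_at, restored_at in incidents:
        durations[priority].append((restored_at - opened_at).total_seconds() / 3600)
    return {priority: mean(hours) for priority, hours in durations.items()}


# Made-up sample data.
incidents = [
    ("critical", datetime(2022, 6, 1, 10), datetime(2022, 6, 1, 12)),  # 2 hours
    ("critical", datetime(2022, 6, 5, 8), datetime(2022, 6, 5, 12)),   # 4 hours
    ("low", datetime(2022, 6, 2, 9), datetime(2022, 6, 4, 9)),         # 48 hours
]

mttr = mttr_hours_by_priority(incidents)
print(mttr)
```

If the critical bucket ever shows a longer restore time than the low bucket, that is a signal that prioritization is not working as intended.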

🎯 Impact / Outcome Metrics

When building software, it's easy to lose sight of the business objectives we impact.

It may be hard to tie the refactoring of a major component to a concrete impact on the end user. Conversely, it may be easier to map new features (e.g., the ability to complete a purchase with one click instead of four) to measurable business goals.

One of the best ways to keep a pulse on this is to track engineering allocation by work type.

Throughput by work type can be looked at from the number of tickets fixed or the number of PRs released. It helps you understand if you’re truly aligning your engineering org with business goals.

*Allocation per work type. Source: Athenian*
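Allocation by work type can be sketched as the share of released items per category. The work type labels and counts below are made up, and counting items (tickets or PRs) rather than time spent is the assumption taken from the paragraph above.

```python
from collections import Counter
from typing import Dict, Iterable


def allocation_by_work_type(work_types: Iterable[str]) -> Dict[str, float]:
    """Share of items per work type, as fractions that sum to 1."""
    counts = Counter(work_types)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {work_type: n / total for work_type, n in counts.items()}


# One label per released PR or resolved ticket -- made-up sample data.
released = ["feature"] * 5 + ["bug"] * 3 + ["tech_debt"] * 2

allocation = allocation_by_work_type(released)
print(allocation)
```

Comparing these shares with the investment levels each team committed to is one way to spot drift before a full quarter has passed.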

📌 Final thoughts

Engineering Leaders have two responsibilities:

🔨 Improve the developer experience
🎯 Deliver impact to the end-user

These two are intimately linked and, when done successfully, will help you create a culture of continuous improvement.

If you’re a new engineering leader, ask for help by reaching out to more experienced leaders. You’ll find that the great ones are those who are constantly questioning their own methods and looking for ways to improve.

Finally, don’t be afraid of the challenges ahead. With the right people, the right mindset, and the right tools, you can take your product and company to new heights, and have a happy team while doing so.

Athenian is a Data-Enabled Engineering platform that helps engineering leaders build a continuous improvement culture by leveraging insights and aligning teams with company goals.

If this sounds like something your engineering org needs, you can find out more below.

Learn more about Athenian

