Scaling is not just a technical problem

The phrase "scaling issues" gets used to describe everything from slow database queries to customer success bottlenecks to engineering team coordination problems. Real scaling challenges are usually a combination of all three — and a purely technical response to a scaling problem often misses the most important levers.

This guide addresses SaaS scaling holistically: the technical bottlenecks that need to be addressed, the organisational changes that often need to accompany technical scaling, and the prioritisation framework for deciding what to tackle first.

Identify the actual bottleneck before you add capacity

The instinctive response to performance problems is to add more capacity — more servers, more database replicas, more application instances. This sometimes helps. But if the bottleneck is not resource capacity but rather an inefficient query, a missing index, or an architectural decision that creates a hot path, adding capacity just makes the problem slightly less visible for a while.

Before scaling anything, instrument your system properly. You need to know:

Which requests are slowest?
Where is time being spent (database queries, external API calls, computation)?
Which database queries run most often, and how long do they take?
Where are the error rates highest?

Application performance monitoring tools (Datadog, New Relic, or open-source alternatives) give you this visibility. Without this data, scaling decisions are guesswork.
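If no APM tool is in place yet, the questions above can still be answered approximately with a few lines of timing code. The following is a minimal sketch, not a substitute for a real monitoring agent. The names `timed`, `slowest` and `handle_dashboard` are hypothetical, and the `time.sleep` calls stand in for real database and API work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical in-process recorder: (endpoint, category) -> durations in seconds.
timings = defaultdict(list)

@contextmanager
def timed(endpoint, category):
    """Record how long the wrapped block takes under (endpoint, category)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[(endpoint, category)].append(time.perf_counter() - start)

def slowest(n=5):
    """Return the n (endpoint, category) pairs with the highest total time."""
    totals = {key: sum(values) for key, values in timings.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]

def handle_dashboard():
    """Example request handler that attributes its time to two categories."""
    with timed("/dashboard", "database"):
        time.sleep(0.05)  # stand-in for a slow query
    with timed("/dashboard", "external_api"):
        time.sleep(0.01)  # stand-in for a third-party call

handle_dashboard()
for (endpoint, category), total in slowest():
    print(f"{endpoint} {category}: {total * 1000:.1f} ms")
```

Even a crude breakdown like this shows whether a slow endpoint is waiting on the database, on an external service, or on its own computation, which is the decision that matters before adding capacity.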

Common technical scaling bottlenecks and how to address them

Database query performance

The database is the bottleneck in the majority of SaaS scaling problems. The usual culprits are missing indexes, N+1 queries, and queries that work fine with 10,000 rows but degrade badly with 10,000,000.

The most impactful quick fixes are usually: adding indexes for frequently queried columns, identifying and eliminating N+1 query patterns (where a single page request triggers hundreds of small database queries), and reviewing the queries behind your slowest endpoints.
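To make the N+1 pattern and the effect of an index concrete, here is a small sketch using Python's built-in `sqlite3` module. The `customers` and `invoices` tables, the index name and the helper functions are all made up for illustration; the same idea applies to any relational database and to ORM-generated queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
""")
conn.executemany("INSERT INTO customers (id, name) VALUES (?, ?)",
                 [(i, f"customer-{i}") for i in range(1, 101)])
conn.executemany("INSERT INTO invoices (customer_id, total) VALUES (?, ?)",
                 [(i, 10.0 * i) for i in range(1, 101)])

counter = {"queries": 0}  # counts round trips to the database

def run(sql, params=()):
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()

def totals_n_plus_one():
    """One query for the customer list, then one more query per customer."""
    result = {}
    for customer_id, name in run("SELECT id, name FROM customers"):
        rows = run("SELECT SUM(total) FROM invoices WHERE customer_id = ?",
                   (customer_id,))
        result[name] = rows[0][0]
    return result

def totals_single_query():
    """The same data fetched with a single JOIN and GROUP BY."""
    rows = run("""
        SELECT c.name, SUM(i.total)
        FROM customers c JOIN invoices i ON i.customer_id = c.id
        GROUP BY c.id, c.name
    """)
    return dict(rows)

def plan_for_lookup():
    """Return SQLite's query plan text for the per-customer lookup."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT SUM(total) FROM invoices WHERE customer_id = ?",
        (1,)).fetchall()
    return " | ".join(row[3] for row in rows)

plan_before = plan_for_lookup()  # full table scan: no index on customer_id yet
conn.execute("CREATE INDEX idx_invoices_customer_id ON invoices (customer_id)")
plan_after = plan_for_lookup()   # now a search using the new index

counter["queries"] = 0
slow = totals_n_plus_one()
n_plus_one_queries = counter["queries"]

counter["queries"] = 0
fast = totals_single_query()
single_queries = counter["queries"]

print(n_plus_one_queries, single_queries)
print(plan_before)
print(plan_after)
```

Both functions return the same totals, but the first issues one query per customer on top of the initial list query. Against a networked database each of those round trips adds latency, which is why the pattern degrades as the customer list grows.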

Longer-term solutions include read replicas for read-heavy workloads, database connection pooling to reduce connection overhead, and caching for frequently-read data that does not change often.

Synchronous work that should be asynchronous

Any work that does not need to complete before returning a response to the user should be done asynchronously. Sending emails, generating reports, processing file imports, syncing with external APIs — if these are happening synchronously in web requests, they are slowing every request down and making your application fragile when those external services are slow or unavailable.

Moving this work to background job queues is one of the highest-impact architectural changes you can make to a SaaS product that has outgrown its original design.
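The shape of that change can be sketched with Python's standard `queue` and `threading` modules. This is an illustration of the pattern only: the queue here lives in process memory and loses its jobs on restart, whereas a production system would normally use a durable queue backed by a broker or database. `handle_signup` and `send_welcome_email` are hypothetical names.

```python
import queue
import threading

jobs = queue.Queue()
sent_emails = []  # stand-in for a real email provider

def send_welcome_email(address):
    """Slow or unreliable work that the user does not need to wait for."""
    sent_emails.append(address)

def worker():
    """Pull jobs off the queue and run them until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        func, args = job
        try:
            func(*args)
        except Exception as exc:
            # A real worker would retry or move the job to a failure queue.
            print(f"job failed: {exc!r}")
        finally:
            jobs.task_done()

def handle_signup(address):
    """Request handler: enqueue the email and return immediately."""
    jobs.put((send_welcome_email, (address,)))
    return {"status": "created"}

thread = threading.Thread(target=worker, daemon=True)
thread.start()

response = handle_signup("new.user@example.com")
jobs.join()      # only for the demo: wait until the worker has drained the queue
jobs.put(None)   # tell the worker to stop
thread.join()
print(response, sent_emails)
```

The request handler no longer depends on the email provider being fast or even available. If sending fails, the failure is handled in the worker instead of surfacing as an error on the signup page.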

Caching

Caching is not a silver bullet, but it is very effective for specific use cases: data that changes infrequently but is read very frequently, expensive computations that produce the same result for the same inputs, and third-party API responses that do not need to be real-time.

Redis is the standard tool for application-level caching in most SaaS stacks. The key challenge is cache invalidation — making sure cached data is updated when the underlying data changes. Get this wrong and you have a cache that serves stale data, which can produce very confusing user experiences.
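One common way to handle invalidation is to read through the cache and delete the cached entry whenever the underlying record is written. The sketch below shows that pattern with a small in-memory class standing in for Redis, so that it runs without any external service. `TTLCache`, `get_account` and `update_plan` are illustrative names, and the 300-second expiry is an arbitrary choice.

```python
import time

class TTLCache:
    """In-memory stand-in for a cache such as Redis, with per-key expiry."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)

    def delete(self, key):
        self._store.pop(key, None)

database = {42: {"plan": "starter"}}  # stand-in for the real database
cache = TTLCache()
db_reads = {"count": 0}

def get_account(account_id):
    """Read through the cache; fall back to the database on a miss."""
    key = f"account:{account_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    db_reads["count"] += 1
    account = dict(database[account_id])
    cache.set(key, account, ttl_seconds=300)
    return account

def update_plan(account_id, plan):
    """Write to the database, then invalidate the cached copy."""
    database[account_id]["plan"] = plan
    cache.delete(f"account:{account_id}")

get_account(42)
get_account(42)            # served from the cache, no second database read
update_plan(42, "pro")     # invalidates the cached entry
print(get_account(42), db_reads["count"])
```

The expiry acts as a safety net: if a write path forgets to invalidate, stale data lives for at most the TTL instead of indefinitely.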

Stateless application servers

If your application servers store state locally — session data, uploaded files, in-memory caches — you cannot scale horizontally by adding more application server instances. Making your application servers stateless (storing sessions in Redis, files in object storage like S3) is a prerequisite for horizontal scaling.
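The difference is easiest to see with two server instances side by side. In this sketch, `SharedSessionStore` is a hypothetical stand-in for Redis or any store that lives outside the application servers, and `AppServer` represents one instance behind a load balancer.

```python
import uuid

class SharedSessionStore:
    """Stand-in for a session store that lives outside the app servers."""

    def __init__(self):
        self._sessions = {}

    def create(self, data):
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = dict(data)
        return session_id

    def load(self, session_id):
        return self._sessions.get(session_id)

class AppServer:
    """One application instance; it keeps no session state of its own."""

    def __init__(self, name, sessions):
        self.name = name
        self.sessions = sessions

    def login(self, user_id):
        return self.sessions.create({"user_id": user_id})

    def whoami(self, session_id):
        session = self.sessions.load(session_id)
        return None if session is None else session["user_id"]

# Stateless setup: both instances talk to the same external store.
shared = SharedSessionStore()
server_a = AppServer("a", shared)
server_b = AppServer("b", shared)
session_id = server_a.login("user-123")
print(server_b.whoami(session_id))  # the second instance recognises the session

# Stateful setup: an instance with its own private store cannot see the session.
server_c = AppServer("c", SharedSessionStore())
print(server_c.whoami(session_id))
```

With the shared store, a load balancer can send each request to any instance. With per-instance state, a user who logs in on one server appears logged out on the next.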

Scaling your engineering team alongside the product

A SaaS product that has grown to handle 10,000 customers needs a different engineering organisation than one that launched to its first 100. The practices that work at small scale — everyone knows every part of the system, deployment is done by one senior engineer on Friday afternoons, testing is largely manual — become liabilities at scale.

The most important organisational scaling changes are usually:

Proper CI/CD: Automated testing and deployment processes that do not depend on any one person
On-call rotation: Defined responsibility for production incidents distributed across the team
Service ownership: Clear ownership of different parts of the system so that responsibility is unambiguous
Runbooks: Written procedures for common operational tasks and incident response

When to consider a microservices architecture

Microservices (decomposing a monolithic application into smaller, independently deployable services) are often presented as a solution to scaling problems. Sometimes they are. More often they are a source of significant operational complexity that the actual scaling requirements do not justify.

Consider microservices when: a specific part of your system has scaling requirements dramatically different from the rest, different parts of your system need to be deployed independently, or your engineering team is large enough that a monolith creates coordination bottlenecks between teams.

Do not adopt microservices because they sound modern, because a consultant recommended them, or because they worked for a much larger company. For most SaaS products under £20M ARR, a well-structured monolith is the right architecture.

Practical scaling priorities

When you are facing scaling pressure and need to decide what to work on first, this is a reasonable prioritisation:

Fix the database queries that are causing the most pain
Move synchronous work that should be asynchronous to background jobs
Add caching for the most-read data
Make application servers stateless so you can scale horizontally
Add read replicas if database read load is the remaining bottleneck
Review architecture for parts of the system that have dramatically different scaling needs

Most SaaS products that are struggling with scale can get through at least the first three or four of these without major architectural changes. The key is doing them in order, based on measured bottlenecks, rather than jumping to complex infrastructure solutions before the simpler fixes have been exhausted.

How to Scale Your SaaS Product Without Breaking Everything