How do you design systems that scale to a million users

How do you design systems that scale to a million users

If you are new to this blog, then read How to ace the system design interview blog first. I have described the guidelines involved in solving a system interview problem. We will be applying those steps here.

Interview Scenario:

Interviewer: Walks in the room. After the initial introductions, “ I would like to test you on your system design skills”

You: Sure! (Internally you are just hoping that the problem is a easy one :))

Interviewer: Let’s start by assuming that you are a program manager working at a startup. Now imagine that your startup hit it big time and the number of users scaled exponentially. You are tasked with scaling your system to accommodate the increase in data, users and computing needs. How would you go scaling your system?

You: Ok. Here is what I would do

This design question can be asked in multiple ways. For example,

  • How do you design a scale a system to millions of users
  • How do you scale your design hosted on AWS to reach more users
  • How do you scale your design hosted on Google Cloud to reach more users

We will use the following high level steps to solve this problem.

High Level Steps:

  1. Scope the problem and clarify the requirements
  2. Do estimations
  3. High-Level Design
  4. Design core components
    1. Define API
  5. Detailed design (Depends on the time allocated for the interview)
  6. Resolve any bottlenecks in the design
  7. Summarize the solution

Step 1: Scope the problem and clarify the requirements

Define the product and service first. What is your startup/ service?

Assume that your startup is scaling from 100 users to 1 Million users. Users post TEXT only content to our servers and we process the data and store the content and metadata. We should be able to serve the contents back reliably. The system should be able to handle the increase in the number of requests (read and write).

The scope is limited to:

  • User registration
  • Text only data
  • Data stored for a specific time (10 years)
  • The system is always available
  • System is reliable

Out of scope:

  • No media content
  • Search capabilities

You can proceed to the next step only after confirming these assumptions with the interviewer

If the interviewer challenges you on out of scope requirements, then you can still stick to your script by letting them know that you will revisit the requirements at the end

Step 2: Do estimations

Before starting estimations, you would need to state some base assumptions to kickstart the calculations.

In this case, we are looking at the following assumptions and estimations…………

  • Avg write size is 1KB
  • Read to write ratio is 100:1
  • 1 million users
  • 1 million writes a day 365M paste a year
  • 100 million reads per million writes (100:1)
  • Writes per sec = 1M/24*3600 = 12 writes/sec
  • Read per sec = 100M/24*3600 = 1160 reads/sec
  • Data storage – Links are stored for 10 years
  • Data stored in 1 day = 1KB*1M = 1 GB/day
  • Data stored in 10 year = 1GB * 10* 365 = ~3.65TB data to be stored

You can use the above calculations to create a high level design.

Step 3: High Level Design

What are the components involved in this design? You have a web server that talks to the end client. The server takes the data from the user, does some processing, and stores the data in the database.

We have seen a similar service earlier by designing a system for Pastebin.

Scaling System Overview

Step 4: Design core components

Database Design:

We would store two separate data tables. One for the data and other for User. Data table would include information such as the actual data, timestamp, IP address, and expiry timeline. The User data would use the user ID as the key and the table would include data such as user name, email, timestamp, and other details collected during sign up.

API for read and write:

There are two API needed, one to write and one to read.

Step 5: Detailed Design

The goal of the design is to scale the design for a million users. We have a couple of options to scale the design

We will add memory caching design to speed up the data read and reduce latency in the design. We will cache 20% of the daily requests for faster response time (400 MB). We will use the least recently used (LRU) scheme for replacing the cache data. The application servers, before hitting backend storage, can quickly check if the cache has the desired URL. Whenever there is a cache miss, our servers would be hitting the database. Then we will update the cache and pass the new entry to all the cache replicas.

We will add a load balancer to the design to improve the responsiveness. Load balancers can be added between the client and the webserver. The load balancers can also be added between the server to the data storage.

Data Storage:

As we start scaling to more users (100 Million +), we would start generating a lot of data. The data size also increases if we start adding media such as photos and videos. We need to store these efficiently and scalability is a big issue. We should follow data sharding.

Sharding at the core is splitting your data up to where it resides in smaller chunks, spread across distinct separate buckets. A bucket could be a table, a Postgres schema, or a different physical database. Then as you need to continue scaling you’re able to move your shards to new physical nodes thus improving performance.

While data sharding, we need a key to use for the data. We can use userID or some other parameter to create data sharding.

Detailed scaling systemd design

Step 6: Resolve Bottlenecks

Add redundancy to the design by adding backup servers to the design. We would also have multiple distributed databases, due to data sharding. Hence, we need servers that would aggregate the data back from different shards. These aggregator servers will be connected to the application servers. We need to add load balancers to the design for traffic distribution.

We will add a content distribution network (CDN) to the design for scaling purposes. A content delivery network (CDN) refers to a geographically distributed group of servers that work together to provide fast delivery of Internet content.

A CDN allows for the quick transfer of assets needed for loading Internet content including HTML pages, JavaScript files, stylesheets, images, and videos. The popularity of CDN services continues to grow, and today the majority of web traffic is served through CDNs, including traffic from major sites like Facebook, Netflix, and Amazon.

Step 7: Summary

Finally, summarize the detailed design to the interviewer by going through the flow and confirming that the design meets the initial assumptions and constraints. Acknowledge that the next steps would be to work on excluded scope such as the custom URL option.

Hopefully, this example helps you understand solving system design questions. If you would like me to attempt other questions, then please leave a comment or reach out at

0 0 votes
Article Rating
Notify of
Inline Feedbacks
View all comments