If you are new to this blog, read the post "How to ace the system design interview" first. It describes the guidelines for solving a system design interview problem, and we will apply those steps here.
Interviewer: Walks in the room. After the initial introductions, “I would like to test you on your system design skills.”
You: Sure! (Internally you are just hoping that the problem is an easy one :))
Interviewer: Let’s start by assuming that you are a program manager working at a new startup. You are tasked with creating a new budgeting app similar to Mint.com, or PersonalCapital.com. I would like to see how you would go about designing such a system. I am also interested in your thoughts about scaling this system.
You: Ok. Here is what I would do
This design question can be asked in multiple ways. For example,
- How do you design a system for a budgeting service such as Mint or Personal Capital?
- How do you design a large-scale system for a financial management app?
We will use the following high level steps to solve this problem.
High Level Steps:
- Scope the problem and clarify the requirements
- Do estimations
- High-Level Design
- Design core components
- Define API
- Detailed design (Depends on the time allocated for the interview)
- Resolve any bottlenecks in the design
- Summarize the solution
Step 1: Scope the problem and clarify the requirements
Define the product and service first. What is your startup/ service?
We will be designing a budgeting app that syncs users’ financial accounts, such as bank accounts, credit cards, and PayPal, along with bills and investments. We will display a one-stop dashboard for all of the user’s financial accounts. The app will create budgets with specific spending categories for the user.
The scope is limited to:
- The users will be registered
- Users will connect to multiple financial accounts through the app
- Each transaction will be tagged with the business it came from
- The system will include transaction categories
- The service will sort out the transactions based on the categories
- The app will create an automatic budget with fixed categories
- App will send notifications to the user based on budget limits
- System is highly scalable
- Service is highly available
Out of scope:
- Stocks, retirements accounts and funds management
- Additional services such as analytics
You can proceed to the next step only after confirming these assumptions with the interviewer.
If the interviewer challenges you on the out-of-scope requirements, you can still stick to your script by letting them know that you will revisit those requirements at the end.
Step 2: Do estimations
Before starting estimations, you would need to state some base assumptions to kickstart the calculations.
In this case, we are looking at the following assumptions and estimations:
- Assume there are 1 million users for this app
- We will fix the number of transaction categories to 25
- Examples of categories include Rent/Mortgage, Grocery, Gas, Utilities, Coffee, Cellphone, Internet, Shopping, and Travel
- Assume each user has up to 10 accounts to add, including bank accounts and credit cards
- Total financial accounts: 1 Million * 10 = 10 Million
- Each transaction will have a business associated. For example Starbucks for Coffee
- Assume there are 10K businesses listed across all the transactions
- Assume each user makes 200 transactions per month
- Total number of transactions per month is 200 * 1 Million users = 200 Million/month
- Storage calculation: Assume each transaction has user id, transaction id, amount, seller, timestamp ~100 Bytes
- Total storage per month is 200 Million * 100 Bytes = 20 GB
- Total storage per year = 20 GB * 12 = 240 GB
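The arithmetic can be checked with a few lines of Python. This is only a sketch of the estimate: it treats the 200 monthly transactions as a per-user figure and multiplies by the 1 million users (multiplying by the 10 million accounts instead would give ten times these numbers). The constant names are made up for illustration.

```python
# Back-of-the-envelope estimates from the assumptions stated in this step.
USERS = 1_000_000
ACCOUNTS_PER_USER = 10
TRANSACTIONS_PER_USER_PER_MONTH = 200  # assumption: a per-user figure
BYTES_PER_TRANSACTION = 100  # user id, transaction id, amount, seller, timestamp

total_accounts = USERS * ACCOUNTS_PER_USER
transactions_per_month = USERS * TRANSACTIONS_PER_USER_PER_MONTH
storage_per_month_gb = transactions_per_month * BYTES_PER_TRANSACTION // 10**9
storage_per_year_gb = storage_per_month_gb * 12

print(f"Financial accounts:     {total_accounts:,}")
print(f"Transactions per month: {transactions_per_month:,}")
print(f"Storage per month:      {storage_per_month_gb} GB")
print(f"Storage per year:       {storage_per_year_gb} GB")
```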
You can use the above calculations to create a high level design.
Step 3: High Level Design
What are the components involved in this design?
The app is a one-stop shop for aggregating all the financial transactions of the user. After registration is complete and the user logs in, the app prompts the user to connect all of their financial accounts.
We have three separate services to manage the app.
- Account Consolidation Service: The user enters all the information needed to connect the financial accounts. The account consolidation service will pull the financial account data (via API) into the account database. Each financial data API request includes the API dev key, the user login data, and a timestamp.
- Transactional Data Service: This service extracts transactions for the given account from the financial institution, storing the results as raw log files in the financial transactions store. The service will add the categories to each transaction from the 25 fixed categories we started with.
- Budgeting Service: The budgeting service pulls in the financial transaction data from the store and aggregates monthly spending by category. The budgeting service will include a template for budgeting based on the monthly spending. The service does not account for the user overriding the financial data categories.
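To make the categorization and aggregation steps concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `BUSINESS_CATEGORY` lookup, the `Transaction` fields, the "Uncategorized" fallback, and the `over_budget` helper (which hints at how budget-limit notifications could be triggered). Amounts are kept in cents to avoid floating point rounding.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date

# Hypothetical business-to-category lookup; the real one would cover ~10K businesses.
BUSINESS_CATEGORY = {"Starbucks": "Coffee", "Shell": "Gas", "Safeway": "Grocery"}


@dataclass
class Transaction:
    user_id: int
    business: str
    amount_cents: int
    day: date


def categorize(tx):
    """Map a transaction to one of the fixed categories via its business."""
    return BUSINESS_CATEGORY.get(tx.business, "Uncategorized")


def monthly_spending(transactions):
    """Sum spending per (user_id, 'YYYY-MM', category)."""
    totals = defaultdict(int)
    for tx in transactions:
        key = (tx.user_id, tx.day.strftime("%Y-%m"), categorize(tx))
        totals[key] += tx.amount_cents
    return dict(totals)


def over_budget(totals, limits_cents):
    """Return the keys whose spending exceeds the limit set for their category."""
    return [key for key, spent in totals.items()
            if key[2] in limits_cents and spent > limits_cents[key[2]]]


sample = [
    Transaction(1, "Starbucks", 450, date(2024, 1, 5)),
    Transaction(1, "Starbucks", 500, date(2024, 1, 19)),
    Transaction(1, "Shell", 3000, date(2024, 1, 20)),
    Transaction(1, "Unknown Shop", 1200, date(2024, 2, 2)),
]
totals = monthly_spending(sample)
alerts = over_budget(totals, {"Coffee": 900})
print(totals)
print(alerts)
```

Keeping categorization as a pure lookup keyed on the business makes it easy to swap in a richer rule set later without touching the aggregation.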
Step 4: Design core components
We need multiple tables of data to manage this application. We will have the following
- User Data Table: This will manage the user ID data including the user login info, time stamp, GPS info, contact information used when creating the login
- Account Data Table: This will include all the financial account info pulled by the account consolidation service
- Transactions Data Table: This will include all the transactions data with categorized transactions
- Budget Data Table: This will include all the monthly spending aggregates by category. The budget service will pull this table to create the budget template for the user
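One possible shape for these four tables is sketched below using Python's built-in `sqlite3` module with an in-memory database. The column names and types are assumptions chosen for illustration, not a schema the design prescribes, and a production system would use a server database rather than SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id       INTEGER PRIMARY KEY,
    login         TEXT NOT NULL UNIQUE,
    contact_email TEXT,
    created_at    TEXT NOT NULL
);
CREATE TABLE accounts (
    account_id   INTEGER PRIMARY KEY,
    user_id      INTEGER NOT NULL REFERENCES users(user_id),
    institution  TEXT NOT NULL,
    account_type TEXT NOT NULL
);
CREATE TABLE transactions (
    transaction_id INTEGER PRIMARY KEY,
    account_id     INTEGER NOT NULL REFERENCES accounts(account_id),
    business       TEXT NOT NULL,
    category       TEXT NOT NULL,
    amount_cents   INTEGER NOT NULL,
    occurred_at    TEXT NOT NULL
);
CREATE TABLE budgets (
    user_id     INTEGER NOT NULL REFERENCES users(user_id),
    month       TEXT NOT NULL,
    category    TEXT NOT NULL,
    spent_cents INTEGER NOT NULL,
    PRIMARY KEY (user_id, month, category)
);
""")

# Sample rows (made up) to show how the budget table is derived from transactions.
conn.execute("INSERT INTO users VALUES (1, 'alice', 'alice@example.com', '2024-01-01')")
conn.execute("INSERT INTO accounts VALUES (1, 1, 'Example Bank', 'checking')")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 1, "Starbucks", "Coffee", 450, "2024-01-05"),
     (2, 1, "Starbucks", "Coffee", 500, "2024-01-19")],
)
conn.execute("""
INSERT INTO budgets (user_id, month, category, spent_cents)
SELECT a.user_id, substr(t.occurred_at, 1, 7), t.category, SUM(t.amount_cents)
FROM transactions t JOIN accounts a ON a.account_id = t.account_id
GROUP BY a.user_id, substr(t.occurred_at, 1, 7), t.category
""")
rows = conn.execute(
    "SELECT user_id, month, category, spent_cents FROM budgets ORDER BY category"
).fetchall()
print(rows)
```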
Step 5: Detailed Design
The account consolidation service will poll the financial institutions' servers and wait for the account data or the financial transaction data. The issue with polling is that the client has to keep asking the server for new data. As a result, many responses are empty, which creates HTTP overhead. We can reduce this overhead with techniques such as long polling or a message queue service.
We will use Amazon Simple Queue Service (SQS) as our messaging service. SQS is a fully managed message queuing service that lets you decouple and scale microservices, distributed systems, and serverless applications. It removes the overhead of operating message-oriented middleware yourself. With SQS, software components can send, store, and receive messages at any volume without losing messages or requiring the other components to be available at the same time. It can be used from the AWS console, the command line interface, or an SDK.
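The decoupling idea can be sketched without any AWS dependency. The example below uses Python's standard library `queue.Queue` as an in-process stand-in for SQS, so it shows the producer/consumer pattern only, not the SQS API. The function names and the message shape are hypothetical, and `fetch_transactions` fakes the call to a financial institution.

```python
import queue

# In-process stand-in for the message queue (not the SQS API).
sync_queue = queue.Queue()


def request_sync(account_id):
    """Producer: the account consolidation service enqueues a sync request."""
    sync_queue.put({"account_id": account_id})


def fetch_transactions(account_id):
    """Hypothetical stand-in for calling the financial institution's API."""
    return [{"account_id": account_id, "business": "Starbucks", "amount_cents": 450}]


def drain(q):
    """Consumer: process every pending request and collect the fetched transactions."""
    fetched = []
    while True:
        try:
            message = q.get_nowait()
        except queue.Empty:
            break
        fetched.extend(fetch_transactions(message["account_id"]))
        q.task_done()
    return fetched


request_sync(1)
request_sync(2)
stored = drain(sync_queue)
print(stored)
```

The producer never waits on the institution's servers; it only enqueues work, and consumers can be scaled up or down independently.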
As we scale to accommodate more users and transactions, we need to plan for storing a large amount of data. This data must be stored efficiently, and scalability is a big concern, so we should shard the data.
Sharding, at its core, means splitting your data into smaller chunks that are spread across distinct, separate buckets. A bucket could be a table, a Postgres schema, or a different physical database. As you continue scaling, you can move shards to new physical nodes to improve performance.
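As a rough illustration, the sketch below routes each user to a logical shard by hashing the user id, and keeps a separate shard-to-node mapping so that a shard can be moved to a new physical node without rehashing any data. Sharding by user id, the shard count, and the node names are all assumptions made for this example.

```python
import hashlib

NUM_SHARDS = 4  # assumed number of logical shards

# Hypothetical mapping of logical shards to physical database nodes.
SHARD_TO_NODE = {0: "db-node-a", 1: "db-node-a", 2: "db-node-b", 3: "db-node-b"}


def shard_for(user_id):
    """Pick a logical shard from a stable hash of the user id."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def node_for(user_id):
    """Resolve the physical node that currently hosts the user's shard."""
    return SHARD_TO_NODE[shard_for(user_id)]


# Scaling out: moving shard 3 to a new node changes only the mapping,
# not which shard a user belongs to.
SHARD_TO_NODE[3] = "db-node-c"

print(shard_for(42), node_for(42))
```

A stable hash such as SHA-256 is used instead of Python's built-in `hash()` because the routing must give the same answer in every process and on every restart.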
Step 6: Resolve Bottlenecks
Add redundancy by adding backup servers. Because of data sharding, we would also have multiple distributed databases, so we need servers that aggregate the data back from the different shards. These aggregator servers will be connected to the application servers. We also need load balancers to distribute traffic.
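What an aggregator server does can be sketched as a scatter-gather query: send the same query to every shard in parallel, then merge the partial results. In this sketch the shards are plain in-memory lists with made-up rows, and the function names are hypothetical; a real aggregator would query the shard databases over the network.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shard contents: (user_id, category, amount_cents) rows.
SHARDS = {
    "shard-0": [(1, "Coffee", 450), (2, "Gas", 3000)],
    "shard-1": [(3, "Coffee", 500), (3, "Grocery", 8200)],
}


def query_shard(shard_name):
    """Return per-category spending totals for a single shard."""
    totals = Counter()
    for _user_id, category, cents in SHARDS[shard_name]:
        totals[category] += cents
    return totals


def aggregate_spending():
    """Fan the query out to every shard in parallel, then merge the partial totals."""
    merged = Counter()
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        for partial in pool.map(query_shard, SHARDS):
            merged.update(partial)
    return dict(merged)


category_totals = aggregate_spending()
print(category_totals)
```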
We will add a content delivery network (CDN) to the design for scaling purposes. A CDN is a geographically distributed group of servers that work together to deliver Internet content quickly.
Step 7: Summary
Finally, summarize the detailed design to the interviewer by going through the flow and confirming that the design meets the initial assumptions and constraints. Acknowledge that the next steps would be to work on the excluded scope, such as stocks, retirement accounts, and analytics.
Hopefully, this example helps you understand solving system design questions. If you would like me to attempt other questions, then please leave a comment or reach out at [email protected]