Design: Spotify


Any good System Design question should reveal a lot about a candidate’s understanding of architecture and inner workings of large-scale systems. The question should be complex enough to engage the candidate and allow them to demonstrate their skill for building web-scale services, but not so complicated that the interviewee feels overwhelmed.

When I interviewed SWE candidates at Facebook and Microsoft, I wasn’t just looking for strong coding skills and soft skills that demonstrated alignment with company values. I was also looking for a candidate who could design a software system that met requirements, explain the system’s architecture and tradeoffs, and dive deep into an area of specialization like API design or a recommendation engine.

This post aims to address two audiences.

  1. Interviewers: Senior software engineers and engineering managers should get an idea of how to conduct a System Design interview and evaluate candidates.

  2. Candidates: Engineers should learn what is expected for a System Design Interview, how to prepare, and how to approach the conversation.

My favorite SD question to ask is something along the lines of “design a music streaming service like Spotify.” Streaming services for music and video are great systems for understanding a candidate’s thought process and technical acumen. They’re appropriately flashy and interesting but still demand a sufficient level of forethought and scalability.

Familiarity: The majority of streaming services should be familiar to candidates, at least on the user side. It is important for a System Design question to be a technology or product that is well-known and widely used.

To start, I’ll briefly explain the general hierarchy of SWE seniority. Then we will discuss the requirements of the system and how candidates should approach this kind of question in their interview.

System Design Interview determines SWE seniority

System Design is one of the key determining factors in filtering software engineers to different levels of seniority. I’ve seen candidates that applied to a senior engineering position be swiftly down-leveled because of a few technical oversights in a System Design interview.

The majority of engineers at large companies fall somewhere between E4 and E6. I’ll briefly cover what is expected out of a candidate at each of these levels.

Entry-level: At this level, engineers have a narrow focus on a few different software components and how they interact with each other.

Senior: They have a more holistic view of the software system they’re working on, and can describe various scenarios from end-to-end. They can explain how each scenario is executed, give concrete examples, and offer ways to improve the resiliency of a system.

Staff: Engineers at this level are capable of everything mentioned above, but they also monitor the software system over the course of its entire lifetime. By considering the architecture’s ability to sustain and support growth, they plan how a system evolves and scales.

It’s not all about down-leveling though. If you’re prepared, you may find yourself being offered a position more senior than the role you applied for.

Design Spotify

For this question, I ask the candidate to design a music streaming platform like Spotify (or Apple Music). I’ll briefly explain the requirements of the system below, but we will outline how to approach this conversation in the next section.

Functional requirements

  • Users should be able to stream music.

  • The system should store an archive of music, sorted by artist and album.

  • The database of songs should be searchable.

Non-functional requirements

  • Streaming should be very low-latency. Music should begin playing within 200ms of a user pressing play.

  • The system should support a repository of 100 million songs.

How to answer this question well

An interviewer is not expecting exact answers that match a rubric. There is, in fact, no “right” answer. Instead, they want to see that the candidate understands the problem at hand. A good interviewee will lead a fluent, comfortable walkthrough of their assumptions, calculations, tradeoffs, and design choices.

Some of the best advice I can give to both interviewers and interviewees pertains to asking questions. It’s great if a candidate asks all the clarifying questions they need when posed with a problem, but ultimately, the interviewer should provide a guiding hand. If a candidate fails to ask crucial questions, the interviewer shouldn’t let the conversation go astray. If you’re an interviewer, be sure to reveal the key expectations and assumptions of the problem even if an interviewee doesn’t know to ask for them.

Clarifying questions

When designing any large-scale distributed system, there is a range of clarifying questions that a candidate should ask. These questions about designing Spotify are arranged from basic to advanced. Interviewers should get an idea of a candidate’s SWE level just from the clarifying questions they know to ask.

  • How big is the music repository?

The standard for most music streaming platforms is 100 million songs. We answered this already when talking about the non-functional requirements, but in some cases this information won’t be offered immediately.

  • How frequently is the repo updated?

Every week.

  • How many users does the service have?

There are hundreds of millions of users, but there is a more pertinent follow-up question that only skilled candidates will really know to ask.

  • How many concurrent users are there?

This is the question that really matters when it comes to System Design. Even if a candidate doesn’t ask it, the interviewer should give them the hint that they should expect an average of 5 million active users, with peak traffic being around 10 million active users.

  • Of the concurrent users, how many are streaming music?

A follow-up question that is, again, okay to divulge unprompted. On average, about 80% of the active users will be streaming music, with the remaining 20% sticking to low-load activities like browsing and managing their playlists.
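The answers above translate directly into the load the streaming path has to carry. Here is a minimal sketch of that arithmetic, using only the numbers from the answers; the function and constant names are illustrative.

```python
# Figures taken from the interviewer's answers above.
AVG_CONCURRENT_USERS = 5_000_000
PEAK_CONCURRENT_USERS = 10_000_000
STREAMING_SHARE = 0.80  # the remaining 20% browse or manage playlists


def concurrent_streams(concurrent_users: int,
                       streaming_share: float = STREAMING_SHARE) -> int:
    """Number of users actively streaming out of those online at once."""
    return round(concurrent_users * streaming_share)


avg_streams = concurrent_streams(AVG_CONCURRENT_USERS)
peak_streams = concurrent_streams(PEAK_CONCURRENT_USERS)

print(f"average concurrent streams: {avg_streams:,}")
print(f"peak concurrent streams:    {peak_streams:,}")
```

The design has to be sized for the peak figure rather than the average.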

These are not the only questions that are relevant to designing Spotify, but they provide a great foundation for the conversation to come. After the round of clarifying questions, hopefully the candidate is a little more comfortable and the interviewer has made some initial notes as to how they expect the candidate to proceed.

The high-level design conversation

At this point, it should be fairly simple to sketch the high-level workflow of the system.

  • A user makes a search.

  • A search indexer parses data.

  • The system returns a page of search results.

  • The user clicks on a file.

  • Music starts streaming.
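To make the search step of this workflow concrete, here is a toy inverted index in Python. It is only a sketch under simplifying assumptions: everything lives in memory, tokens are lowercase whitespace-separated words, and a query matches songs that contain all of its tokens. The `Song` and `SearchIndex` names and the sample songs are hypothetical; a production system would use a dedicated search service.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Song:
    song_id: int
    title: str
    artist: str
    album: str


class SearchIndex:
    """Toy inverted index: token -> set of song ids."""

    def __init__(self) -> None:
        self._songs: dict[int, Song] = {}
        self._postings: dict[str, set[int]] = defaultdict(set)

    @staticmethod
    def _tokenize(text: str) -> list[str]:
        return text.lower().split()

    def add(self, song: Song) -> None:
        """Index a song by the words in its title, artist, and album."""
        self._songs[song.song_id] = song
        for text in (song.title, song.artist, song.album):
            for token in self._tokenize(text):
                self._postings[token].add(song.song_id)

    def search(self, query: str, page_size: int = 10) -> list[Song]:
        """Return one page of songs that contain every token in the query."""
        tokens = self._tokenize(query)
        if not tokens:
            return []
        # .get avoids creating empty entries for unknown tokens.
        matches = set.intersection(
            *(self._postings.get(token, set()) for token in tokens)
        )
        return [self._songs[i] for i in sorted(matches)][:page_size]


index = SearchIndex()
index.add(Song(1, "Blue Horizon", "The Examples", "First Light"))
index.add(Song(2, "Blue Monday Morning", "Sample Band", "Weekdays"))
print([s.title for s in index.search("blue horizon")])
```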

The real meat of the problem comes from designing the system for low latency.

Nice numbers: When picking numbers for estimations, it is best to stick to 5s and 10s. Otherwise, your back-of-the-napkin calculations quickly become more about doing grade school math and less about designing the system at hand.

Storage considerations

Assuming that the average song is 5 minutes and takes up 5MB of storage, we can calculate how much storage it will take to store 100 million songs. Given just these numbers, we can begin by saying that it will take 500TB to store this data.

The candidate should build upon this assumption. It is important to store multiple copies of the data so that songs will always be available even in the event of a partial failure of the system. The industry standard is to replicate data three times, so with replication, the total storage is now up to 1500TB.

A really strong candidate – E6 level or equivalent – may recognize that the system not only needs to replicate data, but create and keep files of different qualities. Much like a video streaming service, music streaming services also allow users to stream different song qualities based on their network connection and individual preferences. If a user is driving through a place with a spotty network, they should still be able to seamlessly stream music, just at a lower quality.

Apple Music's own calculations for the size of different quality files.

For simplicity’s sake, we can say that our low quality files are 1MB per song and the high quality files are 10MB per song. Keeping these alongside the original 5MB file means 16MB per song, or 1600TB for 100 million songs. With three-way replication, the required storage is roughly 5000TB.
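The storage estimates in this section can be checked with a few lines of Python. This is a sketch under stated assumptions: the tier name "standard" for the 5MB file is an illustrative label, and decimal units (1TB = 1,000,000MB) are used to match the round numbers.

```python
SONG_COUNT = 100_000_000
REPLICATION_FACTOR = 3
MB_PER_TB = 1_000_000  # decimal units, to keep the numbers round

# Size per song at each quality tier, in MB (figures from the text;
# the tier names are illustrative labels).
SIZE_MB_BY_QUALITY = {"low": 1, "standard": 5, "high": 10}


def storage_tb(mb_per_song: int, copies: int = 1) -> float:
    """Total storage in TB for the whole catalog."""
    return SONG_COUNT * mb_per_song * copies / MB_PER_TB


single_copy = storage_tb(SIZE_MB_BY_QUALITY["standard"])
replicated = storage_tb(SIZE_MB_BY_QUALITY["standard"], REPLICATION_FACTOR)
all_tiers = storage_tb(sum(SIZE_MB_BY_QUALITY.values()), REPLICATION_FACTOR)

print(f"one copy, one quality:       {single_copy:,.0f} TB")
print(f"three copies, one quality:   {replicated:,.0f} TB")
print(f"three copies, all qualities: {all_tiers:,.0f} TB")
```

The last figure comes out at 4800TB, which rounds to the 5000TB quoted above.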

Multimedia is not the only data consideration, however. A candidate should be sure to include metadata. On the music side, there are artist names and bios, album covers, album names, song titles, and potentially lyrics — but the system should store user metadata as well. Given the number of users, metadata adds up, and will ultimately take up a significant amount of space. Metadata also demands a different storage location than multimedia data and will affect the high-level design components that are deemed necessary.
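One way to picture the split between metadata and multimedia is a metadata record that only points at the audio files rather than containing them. The sketch below is hypothetical: the field names, the object-key format, and the fallback rule are all assumptions made for illustration, not a description of any real service.

```python
from dataclasses import dataclass, field


@dataclass
class SongMetadata:
    """Small, queryable record kept in the metadata store."""
    song_id: str
    title: str
    artist_id: str
    album_id: str
    duration_seconds: int
    # Quality tier -> key of the audio object in blob storage.
    # The audio bytes themselves are stored elsewhere.
    audio_object_keys: dict[str, str] = field(default_factory=dict)


def audio_key_for(song: SongMetadata, quality: str, fallback: str = "low") -> str:
    """Pick the blob to fetch, falling back to another tier if needed."""
    keys = song.audio_object_keys
    if quality in keys:
        return keys[quality]
    return keys[fallback]


song = SongMetadata(
    song_id="s1",
    title="Example Song",
    artist_id="a1",
    album_id="al1",
    duration_seconds=300,
    audio_object_keys={"low": "audio/s1/low", "high": "audio/s1/high"},
)
print(audio_key_for(song, "high"))
```

Keeping the record small means searches and playlist views never touch the large audio objects.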

Design for low latency

Given the assumption that the average song is around 5MB, and that the average 3G connection reaches speeds of 3-5 megabits/second, it would take at least ~8 seconds to download a 5MB song (5MB is 40 megabits, which takes 8 seconds at 5 megabits/second). This is significantly longer than 200ms. How candidates tackle this problem will likely reveal the most about their individual skill sets or problem solving tendencies.
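The download-time estimate is a unit conversion from megabytes to megabits, sketched below. The function name is illustrative; the sizes and speeds are the ones from the paragraph above.

```python
def download_seconds(size_megabytes: float, link_megabits_per_second: float) -> float:
    """Time to transfer a file, ignoring latency and protocol overhead."""
    size_megabits = size_megabytes * 8  # 1 byte = 8 bits
    return size_megabits / link_megabits_per_second


fast_link = download_seconds(5, 5)  # upper end of the 3G range
slow_link = download_seconds(5, 3)  # lower end of the 3G range

print(f"5MB at 5 megabits/s: {fast_link:.1f} s")
print(f"5MB at 3 megabits/s: {slow_link:.1f} s")
```

The ~8 second figure corresponds to the fast end of the range. At 3 megabits/second the same download takes about 13 seconds.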

The key idea that a candidate should get is that the system will have to chunk song files and buffer their download. The system should be able to rapidly download the first couple seconds of a song and then use the playback of those seconds to download more and more of the song.

If the device is able to download 0.1MB, it can begin playing the song almost instantly. Then, while the first few seconds are playing, the system can download the next chunks of the song. After about 10 seconds, the system will have the complete song downloaded. Really talented candidates will even highlight the possibility of using the time spent streaming to cache the next couple of songs in the queue. In doing so, we can create a better user experience if they decide to skip a song or two.
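The chunk-and-buffer idea can be simulated in a few lines. This is a rough sketch under simplifying assumptions: a constant-bitrate 5MB, 5-minute song, 0.1MB chunks downloaded one after another, a steady 5 megabits/second link, and no network latency. The `plan_playback` function and `PlaybackPlan` type are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PlaybackPlan:
    startup_delay_s: float   # time until the first chunk is playable
    full_download_s: float   # time until the whole song is on the device
    stalls: int              # chunks that arrive after playback needs them


def plan_playback(song_kb: int = 5_000, song_seconds: float = 300.0,
                  chunk_kb: int = 100, link_mbps: float = 5.0) -> PlaybackPlan:
    link_kb_per_s = link_mbps * 1000 / 8     # megabits/s -> kilobytes/s
    startup_delay_s = min(chunk_kb, song_kb) / link_kb_per_s
    seconds_per_kb = song_seconds / song_kb  # constant bitrate assumed

    downloaded_kb = 0
    stalls = 0
    while downloaded_kb < song_kb:
        # Playback reaches the start of this chunk at this moment.
        needed_at = startup_delay_s + downloaded_kb * seconds_per_kb
        downloaded_kb = min(downloaded_kb + chunk_kb, song_kb)
        ready_at = downloaded_kb / link_kb_per_s
        if ready_at > needed_at + 1e-9:
            stalls += 1
    return PlaybackPlan(startup_delay_s, song_kb / link_kb_per_s, stalls)


plan = plan_playback()
print(f"startup delay: {plan.startup_delay_s * 1000:.0f} ms")
print(f"full download: {plan.full_download_s:.1f} s, stalls: {plan.stalls}")
```

Under these assumptions each 0.1MB chunk holds about 6 seconds of audio but takes only a fraction of a second to fetch, which is what leaves spare time to prefetch the next songs in the queue.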

Don’t rush through your analysis of how to design the system. That said, having extra time to expand upon your assessment can be extremely helpful to both you and your interviewer. If you wind up with extra time, take initiative and discuss relevant design specifics that align with your interest and area of specialization. For example, in the “design Spotify” problem, areas to home in on are:

  • How to build a search index

  • Adaptive streaming

  • API design/API calls

  • Recommendation engine (for machine learning candidates)
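As an example of one of these deep dives, adaptive streaming can be reduced to a small decision rule: pick the highest quality whose bitrate fits comfortably within the measured bandwidth. The sketch below assumes the 1MB, 5MB, and 10MB file sizes for a 5-minute song used earlier. The tier names, the `headroom` safety factor, and the function names are illustrative assumptions, and real players use more elaborate heuristics.

```python
SONG_SECONDS = 300  # 5-minute song, as assumed earlier

# File size per song at each quality tier, in MB (tier names are illustrative).
SIZE_MB_BY_QUALITY = {"low": 1, "standard": 5, "high": 10}


def bitrate_mbps(size_mb: float, seconds: int = SONG_SECONDS) -> float:
    """Average playback bitrate in megabits per second."""
    return size_mb * 8 / seconds


def pick_quality(measured_mbps: float, headroom: float = 0.5) -> str:
    """Highest tier that fits within a fraction of the measured bandwidth.

    Falls back to the lowest tier when nothing fits.
    """
    budget = measured_mbps * headroom
    best = "low"
    for name, size_mb in sorted(SIZE_MB_BY_QUALITY.items(), key=lambda kv: kv[1]):
        if bitrate_mbps(size_mb) <= budget:
            best = name
    return best


for bandwidth in (5.0, 0.3, 0.1):
    print(f"{bandwidth} megabits/s -> {pick_quality(bandwidth)}")
```

Re-running the choice for every chunk lets the player step down in quality when the connection degrades instead of pausing playback.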

Content delivery network (CDN)

A content delivery network is crucial to ensure low latency for a global system, especially one that is data intensive. It is important to have nodes that are physically close to geographically significant areas. For example, the two-way latency from a node in Virginia (U.S. East) to one in California (U.S. West) and back is around 63 ms. From that same U.S. East location to a node in Cape Town, South Africa and back, it is around 225 ms.

Our system may allow music to start playing within the 200 ms window, but only if the user is close to the main node of the system. Accounting for travel time latency adds an additional layer of complexity to our non-functional requirements. To ensure a positive user experience, we need a CDN to minimize response times by optimizing the delivery of data based on location.

To set up a CDN, we need to have a routing service that directs data to the correct proxy services based on the location of the request. A CDN will also need to be considered in the API design of the system. Web servers and load balancers will need to go through the CDN’s routing service before a response can be delivered.
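The routing decision can be illustrated with a deliberately simple rule: send each request to the geographically closest node. This is a sketch only. The node list reuses the three locations mentioned above with approximate coordinates, the names are hypothetical, and great-circle distance stands in for what a production CDN would more likely base on measured latency or network-level routing.

```python
import math

# Approximate (latitude, longitude) of each node; illustrative values only.
EDGE_NODES = {
    "virginia": (38.9, -77.4),
    "california": (37.4, -122.1),
    "cape_town": (-33.9, 18.4),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance between two (lat, lon) points in kilometers."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_edge(user_location: tuple[float, float]) -> str:
    """Name of the node closest to the user's location."""
    return min(EDGE_NODES, key=lambda name: haversine_km(user_location, EDGE_NODES[name]))


print(nearest_edge((-26.2, 28.0)))  # a user near Johannesburg
print(nearest_edge((40.7, -74.0)))  # a user near New York
```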

A content delivery network is a complex system in its own right, and it needs a more in-depth explanation than can be given here. If you’re interested in delving more into the infrastructure of a CDN, this lesson on Designing Content Delivery Network from our course Grokking Modern System Design Interview for Engineers & Managers outlines the complete system architecture.

How to succeed in your next System Design Interview

Whether you’re an engineering manager conducting a round of interviews or a fresh developer gearing up to interview at your first big company, you should understand exactly what your role is in a System Design Interview.

For candidates: There is a lot to say about how to do well in your next SDI. The best tip I have is to spend an adequate amount of time preparing. System Design is no simple task, and not something you can just improvise on the spot.

During your preparation and your next interview, keep these pieces of advice in mind:

For interviewers: Be engaged in the conversation and try to help a candidate along. Even if they aren’t asking the right questions, don’t let them flounder. As an interviewer, you can help guide them. It’s entirely possible that a great candidate will get flustered and need some time to warm up and get in the groove.

Here are a couple more quick takeaways:

  • Evaluate a candidate on their interview performance without letting their System Design experience (or lack thereof) get in the way.

  • Let them take the conversation where they feel most comfortable. A front end developer will probably want to talk about APIs, while a machine learning engineer may be eager to show off their recommendation engine skills. Both are highly relevant to the system at hand and ultimately help you determine the best fit in the long run.

If you’re looking to prepare for your next interview, there is no better resource than the Educative course: Grokking Modern System Design Interview for Software Engineers and Managers. This course describes in detail all major System Design building blocks and then walks through over a dozen real-world System Design problems in an interview-style format.

Happy learning!


2022