One of the most interesting parts of preparing for system design interviews is that you get to learn a lot about how existing systems are built.
To make the weekly post more helpful, I’d like to cover a wide range of topics. We’ve talked a lot about things like recommendation and ranking in the past few weeks, so this time I want to cover something different.
It starts with a very simple question: how would you design Facebook’s chat function?
With news like Facebook buying WhatsApp for $19B and Facebook Messenger becoming really popular recently, chat is definitely a hot topic. So in this post, I’m quite happy to talk about messages.
A few things to mention here. First and foremost, as I mentioned in previous posts, system design interviews can be extremely diverse. It’s mostly up to the interviewer to decide which direction the discussion takes. As a result, different interviewers can have completely different discussions even with the same question, and you should never treat this article as a standard answer.
Also, I’ve never worked on Facebook Messenger or WhatsApp. All the discussion here is based on the Gainlo team’s analysis.
As I said earlier, it’s better to start with a high-level solution and talk about the overall infrastructure. If you have no prior experience with messaging apps, you might find it hard to come up with even a basic solution. But that’s totally fine. Let’s start with a very naive solution and optimize it later.
Basically, one of the most common ways to build a messaging app is to have a chat server that acts as the core of the whole system. When a message is sent, it doesn’t go to the receiver directly. Instead, it goes to the chat server and is stored there first. Then, based on the receiver’s status, the server either delivers the message to the receiver immediately or sends a push notification.
A more detailed flow works like this:
- User A wants to send the message “Hello Gainlo” to user B. A first sends the message to the chat server.
- The chat server receives the message and sends an acknowledgement back to A, meaning the message has been received. Depending on the product, the front end may display a single check mark in A’s UI.
- Case 1: If B is online and connected to the chat server, that’s great. The chat server just sends the message to B.
- Case 2: If B is not online, the chat server sends a push notification to B.
- B receives the message and sends back an acknowledgement to the chat server.
- The chat server notifies A that B received the message, and A’s UI updates to show a double check mark.
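The flow above can be sketched in a few lines of Python. This is only a minimal in-memory illustration, not how Facebook or WhatsApp actually implement it. The names `ChatServer`, `Message`, `send` and `ack_delivery` are hypothetical, and a dictionary and lists stand in for real storage, open connections and a push notification service.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Message:
    msg_id: int
    sender: str
    receiver: str
    text: str
    status: str = "stored"  # becomes "delivered" once the receiver acknowledges


@dataclass
class ChatServer:
    # Hypothetical in-memory stand-ins for a database, open connections and a push service.
    store: Dict[int, Message] = field(default_factory=dict)
    online: Dict[str, List[Message]] = field(default_factory=dict)  # user -> inbox of connected user
    pushes: List[str] = field(default_factory=list)  # users who were sent a push notification
    next_id: int = 1

    def connect(self, user: str) -> None:
        self.online[user] = []

    def disconnect(self, user: str) -> None:
        self.online.pop(user, None)

    def send(self, sender: str, receiver: str, text: str) -> int:
        """Store the message first, then deliver or push. The returned id acts as the ack to the sender."""
        msg = Message(self.next_id, sender, receiver, text)
        self.next_id += 1
        self.store[msg.msg_id] = msg
        if receiver in self.online:
            self.online[receiver].append(msg)  # Case 1: receiver is connected
        else:
            self.pushes.append(receiver)       # Case 2: receiver is offline
        return msg.msg_id                      # sender's UI can show a single check mark

    def ack_delivery(self, msg_id: int) -> str:
        """Called when the receiver acknowledges; returns the sender who should be notified."""
        msg = self.store[msg_id]
        msg.status = "delivered"
        return msg.sender                      # sender's UI can show a double check mark


server = ChatServer()
server.connect("B")
msg_id = server.send("A", "B", "Hello Gainlo")
notify = server.ack_delivery(msg_id)  # B acknowledges; "A" should now be notified
```

Storing the message before attempting delivery is the key design choice here: the acknowledgement to A only means the server has the message, not that B has seen it.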
The whole system can become costly and inefficient once it scales beyond a certain level. So is there any way to optimize it to support a huge number of concurrent requests?
There are many approaches. One obvious cost is that, to deliver a message to the receiver, the chat server might need to spawn an OS process or thread, initiate an HTTP (or other protocol) request, and close the connection at the end. This happens for every single message. Even if we do it the other way around and have the receiver keep polling the server to check for new messages, it’s still costly.
One solution is to use an HTTP persistent connection. In a nutshell, the receiver makes an HTTP GET request over a persistent connection, and the request doesn’t return until the chat server has data to send back, a technique commonly known as long polling. The request is re-established whenever it times out or is interrupted. Compared with opening a connection per message or polling repeatedly, this approach offers a lot of advantages in terms of response time, throughput and cost.
If you want to know more about HTTP persistent connections, you can check out things like BOSH (Bidirectional-streams Over Synchronous HTTP).
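The blocking request described above (often called long polling) can be simulated without any real HTTP. The sketch below is only an illustration under that simplification: `LongPollChannel`, `publish` and `poll` are hypothetical names, and a thread-safe queue stands in for the server holding a GET request open until data arrives or a timeout fires.

```python
import queue
import threading
from typing import List


class LongPollChannel:
    """Simulates one receiver's held-open request on the chat server."""

    def __init__(self) -> None:
        self._pending: "queue.Queue[str]" = queue.Queue()

    def publish(self, message: str) -> None:
        # Called by the server when a new message for this receiver arrives.
        self._pending.put(message)

    def poll(self, timeout_s: float) -> List[str]:
        """Block until at least one message is available or the timeout expires."""
        try:
            first = self._pending.get(timeout=timeout_s)
        except queue.Empty:
            return []  # timed out: the client would simply issue a new request
        messages = [first]
        # Drain anything else that is already waiting so one response carries it all.
        while True:
            try:
                messages.append(self._pending.get_nowait())
            except queue.Empty:
                return messages


channel = LongPollChannel()
# A message arrives 50 ms after the receiver started waiting.
threading.Timer(0.05, channel.publish, args=["Hello Gainlo"]).start()
received = channel.poll(timeout_s=2.0)  # returns as soon as the message is published
```

The receiver does not pay for a new connection or a wasted round trip per check; it only re-issues the request after a response or a timeout.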
Another cool feature of Facebook chat is showing online friends. Although the feature seems simple at first glance, it improves user experience tremendously and it’s definitely worth discussing. If you were asked to design this feature, how would you do it?
Obviously, the most straightforward approach is that once a user comes online, he sends a notification to all of his friends. But how would you evaluate the cost of this?
At peak time, we need roughly O(average number of friends * peak users) requests, which can be a lot when there are millions of users. This cost can even exceed the cost of the messages themselves. One way to improve this is to cut unnecessary requests. For instance, we can issue a notification only when the user reloads a page or sends a message. In other words, we can limit the scope to “very active users” only. Alternatively, we can hold back the notification until a user has been online for 5 minutes. This handles the case where a user comes online and immediately goes offline.
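To make the cost estimate and the 5-minute rule concrete, here is a small sketch. The figures in the example (200 friends, 1,000,000 peak users) are made-up assumptions for illustration, not real Facebook numbers, and `fanout_cost` and `PresenceTracker` are hypothetical names.

```python
from typing import Dict, List, Set


def fanout_cost(avg_friends: int, peak_users: int) -> int:
    """Notifications needed if every peak user notifies every friend once."""
    return avg_friends * peak_users


class PresenceTracker:
    """Announces a user as online only after they have stayed online for min_online_s seconds."""

    def __init__(self, min_online_s: float = 300.0) -> None:
        self.min_online_s = min_online_s
        self._online_since: Dict[str, float] = {}
        self._announced: Set[str] = set()

    def on_connect(self, user: str, now: float) -> None:
        # Keep the original timestamp if the user is already online.
        self._online_since.setdefault(user, now)

    def on_disconnect(self, user: str, now: float) -> bool:
        """Returns True only if friends were told the user was online and now need an update."""
        self._online_since.pop(user, None)
        was_announced = user in self._announced
        self._announced.discard(user)
        return was_announced

    def due_announcements(self, now: float) -> List[str]:
        """Users who have been online long enough and have not been announced yet."""
        due = [
            user
            for user, since in self._online_since.items()
            if user not in self._announced and now - since >= self.min_online_s
        ]
        self._announced.update(due)
        return due


total = fanout_cost(avg_friends=200, peak_users=1_000_000)  # 200,000,000 notifications

tracker = PresenceTracker()
tracker.on_connect("alice", now=0)
tracker.on_connect("bob", now=0)
tracker.on_disconnect("bob", now=10)        # bob left quickly: no notification was ever sent
ready = tracker.due_announcements(now=300)  # only alice is announced
```

A user who connects and leaves within the threshold costs zero notifications, since neither the online nor the offline event needs to reach any friend.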
There are many other topics I haven’t covered in this post. For example, if you dig deeper into the networking side, we can talk about which network protocol to use for the connection. How to deal with system errors and how to replicate the data are also interesting questions, since a chat app’s requirements are quite different from those of other systems.
Feel free to leave a comment if you want to have further discussion with me.