Microservices Used for Twitter System Design

9.1 Data Partitioning

To scale out our databases, we will need to partition our data. Horizontal partitioning (also known as sharding) can be a good first step. We can use partitioning schemes such as:

Hash-Based Partitioning
List-Based Partitioning
Range-Based Partitioning
Composite Partitioning

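The above approaches can still cause uneven data and load distribution. Consistent hashing, typically combined with virtual nodes, can mitigate this, and it also limits how much data has to move when shards are added or removed.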
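As a minimal sketch of consistent hashing, the ring below places each shard at many points (virtual nodes) and routes a key to the first point clockwise from the key's hash. The class name ConsistentHashRing, the shard names, the choice of MD5, and the default of 100 virtual nodes are all assumptions made for illustration, not part of the original design.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (first 8 bytes of its MD5)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    """Hypothetical helper: routes keys to shards using a hash ring."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes      # virtual nodes per shard (assumed default)
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> shard name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if point in self._owner:
                continue          # skip the (very unlikely) position collision
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # First ring position strictly after the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
shard_for_user = ring.get_node("user:42")   # one of the three shard names
```

When a fourth shard is added to this ring, a key either stays on its old shard or moves to the new one; no key moves between the existing shards, which is the property that makes resharding cheaper than with plain modulo hashing.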

9.2 Mutual friends

For mutual friends, we can build a social graph for every user. Each node in the graph represents a user, and each directed edge represents a follow relationship from a follower to a followee.
After that, we can traverse the followers of a user to find and suggest mutual friends. A graph database such as Neo4j or ArangoDB is well suited to this kind of traversal.
This is a pretty simple algorithm. To improve our suggestion accuracy, we will need to incorporate a machine-learning recommendation model into it.
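The simple traversal described above can be sketched in memory as follows. This is an illustration only: the FollowGraph class, its method names, and the ranking rule (count how many of the user's followees follow a candidate) are assumptions, and a production system would more likely run an equivalent query inside the graph database.

```python
from collections import Counter, defaultdict


class FollowGraph:
    """Hypothetical in-memory directed follow graph."""

    def __init__(self):
        self.following = defaultdict(set)   # user -> accounts they follow
        self.followers = defaultdict(set)   # user -> accounts following them

    def follow(self, follower, followee):
        if follower == followee:
            return                          # ignore self-follows
        self.following[follower].add(followee)
        self.followers[followee].add(follower)

    def mutual(self, a, b):
        """Accounts that both a and b follow."""
        return self.following.get(a, set()) & self.following.get(b, set())

    def suggest(self, user, limit=3):
        """Rank accounts two hops away by the number of shared connections."""
        already = self.following.get(user, set())
        counts = Counter()
        for friend in already:
            for candidate in self.following.get(friend, ()):
                if candidate != user and candidate not in already:
                    counts[candidate] += 1
        # Highest count first; ties broken by name so results are stable.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return [name for name, _ in ranked[:limit]]


graph = FollowGraph()
graph.follow("alice", "bob")
graph.follow("alice", "carol")
graph.follow("bob", "dave")
graph.follow("bob", "erin")
graph.follow("carol", "dave")
suggestions = graph.suggest("alice")   # dave ranks first: two followees follow him
```

Counting shared connections is only a baseline signal; the machine-learning model mentioned above would replace or re-rank this list.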

9.3 Metrics and Analytics

Recording analytics and metrics is one of our extended requirements.
As we will be using Apache Kafka to publish all sorts of events, we can process these events and run analytics on the data using Apache Spark, an open-source unified analytics engine for large-scale data processing.
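To make the kind of aggregation concrete, here is a plain-Python stand-in for a job that counts events per minute and per event type. It does not use Kafka or Spark at all; the event shape (a dict with "type" and "timestamp" in Unix seconds) and the function name are assumptions for illustration, and a real pipeline would express the same grouping in Spark over the Kafka stream.

```python
from collections import Counter


def count_events_per_minute(events):
    """Group events into one-minute buckets and count them by type.

    Each event is assumed to be a dict with a "type" string and a
    "timestamp" in Unix seconds. Returns a Counter keyed by
    (bucket_start_seconds, event_type).
    """
    counts = Counter()
    for event in events:
        # Round the timestamp down to the start of its minute.
        bucket_start = int(event["timestamp"]) // 60 * 60
        counts[(bucket_start, event["type"])] += 1
    return counts


sample_events = [
    {"type": "tweet_created", "timestamp": 0},
    {"type": "tweet_created", "timestamp": 30},
    {"type": "tweet_liked", "timestamp": 59},
    {"type": "tweet_created", "timestamp": 61},
]
per_minute = count_events_per_minute(sample_events)
```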

9.4 Caching

In a social media application, we have to be careful about caching because our users expect the latest data. Still, to protect our resources from usage spikes, we can cache the top 20% of the tweets.
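One way to balance freshness against load is a small cache that evicts the least recently used entries and also expires entries after a short time. The sketch below is an assumption, not the article's prescribed policy: the TweetCache name, the LRU eviction rule, and the time-to-live parameter are chosen for illustration.

```python
import time
from collections import OrderedDict


class TweetCache:
    """Hypothetical LRU cache with a time-to-live per entry."""

    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.clock = clock                  # injectable so tests can fake time
        self._items = OrderedDict()         # tweet_id -> (expires_at, tweet)

    def get(self, tweet_id):
        entry = self._items.get(tweet_id)
        if entry is None:
            return None                     # miss: caller reads the database
        expires_at, tweet = entry
        if self.clock() >= expires_at:
            del self._items[tweet_id]       # stale: force a fresh read
            return None
        self._items.move_to_end(tweet_id)   # mark as most recently used
        return tweet

    def put(self, tweet_id, tweet):
        self._items[tweet_id] = (self.clock() + self.ttl, tweet)
        self._items.move_to_end(tweet_id)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used


cache = TweetCache(capacity=1000, ttl_seconds=30)
cache.put(1, {"id": 1, "text": "hello"})
tweet = cache.get(1)
```

A short time-to-live bounds how stale a cached tweet can be, at the cost of more reads reaching the database.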
To further improve efficiency we can add pagination to our system APIs. This decision will be helpful for users with limited network bandwidth as they won’t have to retrieve old messages unless requested.
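Pagination can be sketched with a cursor: the client sends the ID of the oldest tweet it has seen and receives the next batch of older ones. The function name, the response shape, and the assumption that tweet IDs increase over time are all illustrative choices, not part of the original API design.

```python
def get_timeline_page(tweets, limit=20, before_id=None):
    """Return one page of a timeline.

    tweets is assumed to be a list of dicts with an "id" key, sorted
    newest first, where a larger id means a newer tweet.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    if before_id is not None:
        # Keep only tweets older than the cursor.
        tweets = [t for t in tweets if t["id"] < before_id]
    page = tweets[:limit]
    has_more = len(tweets) > limit
    # The cursor for the next request is the oldest id on this page.
    next_cursor = page[-1]["id"] if has_more else None
    return {"items": page, "next_cursor": next_cursor}


timeline = [{"id": i, "text": f"tweet {i}"} for i in range(5, 0, -1)]
first_page = get_timeline_page(timeline, limit=2)
second_page = get_timeline_page(timeline, limit=2, before_id=first_page["next_cursor"])
```

Compared with offset-based paging, a cursor stays stable when new tweets arrive at the top of the timeline between requests.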

9.5 Media access and storage

As we know, most of our storage space will be used for storing media files such as images, videos, or other files. Our media service will be handling both access and storage of the user media files.
But where can we store files at scale? Well, object storage is what we’re looking for. Object stores break data files up into pieces called objects.
They then store those objects in a single repository, which can be spread out across multiple networked systems. We can also use distributed file storage such as HDFS or GlusterFS.
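The access pattern of an object store can be illustrated with a toy in-memory version: a flat namespace of keys, each mapping to raw bytes plus metadata. Everything here is hypothetical (the InMemoryObjectStore class, the media_key naming scheme, and the metadata fields); a real media service would call the API of an actual object storage service instead.

```python
import hashlib


def media_key(user_id, media_id, extension):
    """Build a flat object key for a media file (assumed naming scheme)."""
    return f"media/{user_id}/{media_id}.{extension}"


class InMemoryObjectStore:
    """Toy stand-in for an object store: key -> (bytes, metadata)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, content_type):
        checksum = hashlib.sha256(data).hexdigest()
        metadata = {
            "content_type": content_type,
            "size": len(data),
            "checksum": checksum,       # lets readers verify integrity
        }
        self._objects[key] = (bytes(data), metadata)
        return checksum

    def get(self, key):
        """Return (data, metadata); raises KeyError if the key is unknown."""
        return self._objects[key]


store = InMemoryObjectStore()
key = media_key(user_id=42, media_id="abc123", extension="png")
store.put(key, b"\x89PNG...", content_type="image/png")
data, metadata = store.get(key)
```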

9.6 Content Delivery Network (CDN)

A Content Delivery Network (CDN) increases content availability and redundancy while reducing bandwidth costs.
Generally, static files such as images and videos are served from a CDN. We can use services like Amazon CloudFront or Cloudflare CDN for this use case.

Designing Twitter – A System Design Interview Question

Designing Twitter (or the Facebook feed, or Facebook search) is a very common question that interviewers ask candidates. Many candidates fear this round more than the coding round because they are unsure which topics and tradeoffs to cover within the limited time.

Important Topics for Designing Twitter

How Would You Design Twitter?
Requirements for Twitter System Design
Capacity Estimation for Twitter System Design
Use Case Design for Twitter System Design
Low Level Design for Twitter System Design
High Level Design for Twitter System Design
Data Model Design for Twitter System Design
API Design for Twitter System Design
Microservices Used for Twitter System Design
Scalability for Twitter System Design
