Question 1

What is System Design?

Accepted Answer

System design is the process of designing and defining the architecture, components, modules, interfaces, and data for a system to satisfy specified requirements. It involves decomposing the system into smaller subsystems, determining how the subsystems will work together, and defining the relationships between them.

System design is an iterative process that involves understanding the problem to be solved, identifying the requirements of the system, and designing a solution that meets those requirements. It is a critical step in the development of any system, as it lays the foundation for the subsequent implementation and testing phases.

Question 2

What are Microservices?

Accepted Answer

Microservices is a software architecture pattern in which a system is divided into small, manageable, independent services that can be developed, deployed, and scaled independently. This makes the system easier to update and modify, and it can improve scalability and reliability because individual services can be scaled or replaced without touching the rest.

These services communicate and coordinate with one another through Application Programming Interfaces (APIs). An API defines the rules and protocols that govern how one service can access the functionality of another. Services can also communicate asynchronously through a message queue: the sending service places a message on the queue, and the receiving service later retrieves the message and processes it.

Overall, each microservice focuses on a specific task or capability and communicates with other microservices through well-defined interfaces, with message queues often acting as the backbone of asynchronous communication.
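
The queue-based exchange described above can be sketched in a few lines. This is only an illustration: the in-process `queue.Queue` stands in for a real message broker, and the service names and event fields are hypothetical.

```python
import queue

# Hypothetical in-process stand-in for a message broker shared by two services.
order_events = queue.Queue()


def order_service_place_order(order_id: int) -> None:
    """Producer: publish an event instead of calling the other service directly."""
    order_events.put({"type": "order_placed", "order_id": order_id})


def email_service_drain() -> list[str]:
    """Consumer: retrieve every pending message and process it."""
    sent = []
    while True:
        try:
            event = order_events.get_nowait()
        except queue.Empty:
            break  # nothing left to process
        sent.append(f"confirmation for order {event['order_id']}")
        order_events.task_done()
    return sent


order_service_place_order(1)
order_service_place_order(2)
confirmations = email_service_drain()
```

The producer never waits for the consumer, which is the decoupling the queue is meant to provide.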

Question 3

What are some common challenges you encounter when designing systems?

Accepted Answer

Complexity: Large systems can be complex, with many interdependent components and interactions. It can be challenging to understand and design such systems in a way that is efficient and easy to maintain.
Scalability: As the number of users or the amount of data increases, a system may need to scale to handle the increased load. This can be challenging, as it requires designing the system to be flexible and efficient enough to handle the increased demand without becoming overwhelmed.
Performance: Ensuring that a system performs well and meets the required performance targets can be challenging, particularly when dealing with real-time data or processing, or when handling large amounts of data or traffic.
Fault Tolerance: Designing a system to be fault-tolerant and highly available can be challenging, as it requires designing for potential failures and ensuring that the system can continue to operate even in the face of such failures.
Security and Privacy: Ensuring that a system is secure and protects the privacy of its users can be a complex and ongoing challenge, particularly in the face of constantly evolving threats and regulations.
Integration: Integrating a new system with existing systems can be challenging, as it requires understanding the interfaces and protocols used by the existing systems and designing the new system to be compatible with them.
User Experience: Designing a system that is easy to use and intuitive can be challenging, as it requires understanding the needs and expectations of the users and designing the system to meet those needs.
Maintenance and Updates: Ensuring that a system is easy to maintain and update over time can be challenging, as it requires designing the system to be flexible and modular, and providing the necessary tools and documentation for ongoing maintenance.

Question 4

What kinds of documentation are used in system design?

Accepted Answer

Documentation helps to communicate the design of the system to stakeholders and developers. Some common types of documentation that are used in system design include:

Requirements Documents: These documents outline the requirements and constraints that the system must meet, and provide a clear understanding of the needs and goals of the system.
Functional Specification: This document provides a detailed description of the functionality of the system.
Design Specification: This document describes the overall design of the system, including the architecture, components, and interfaces.
Implementation Plan: This document outlines the steps that will be taken to develop and implement the system, including the tools, technologies, and resources.
Test Plan: This document outlines the approach that will be taken to test the system, including the types of tests that will be performed, the criteria that must be met, and the resources required.
User Manual: This document provides instructions for users on how to use the system, including any necessary setup and configuration steps.
Technical Manual: This document provides technical information about the system, including details on its architecture, components, and interfaces.
Maintenance Manual: This document outlines the steps that will be taken to maintain and update the system over time, including any necessary procedures for troubleshooting and repair.

Question 5

Can you give some metric pointers used to gauge system performance?

Accepted Answer

Among the most important application metrics for gauging system performance are:

User Satisfaction and Apdex Score: The Application Performance Index (Apdex) is an open standard for measuring user satisfaction with application response times. Given a target response-time threshold T, each response is classed as satisfied (at most T), tolerating (between T and 4T), or frustrated (above 4T). Apdex is the number of satisfied responses plus half the number of tolerating responses, divided by the total number of responses, which gives a score between 0 and 1.
Response Time: This is the amount of time it takes for the system to respond to a request or input.
Resource Utilization: This is the amount of each resource (e.g., CPU, memory, network bandwidth) that the system is using.
Error Rate: This is the percentage of requests or transactions that result in an error. A high error rate can indicate that the system is not functioning correctly or is not meeting the requirements of the users.
Throughput and Latency Under Load: It is important to evaluate the performance of a system under different levels of load, as this can help identify potential bottlenecks or issues that may arise when the system is being used heavily.
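
As a rough sketch of the Apdex metric, the standard formula counts responses at or below a threshold T as satisfied, responses between T and 4T as tolerating (weighted by one half), and divides by the total. The function name, sample data, and the 0.5-second threshold below are assumptions chosen for illustration.

```python
def apdex(response_times_s, threshold_s):
    """Apdex = (satisfied + tolerating / 2) / total, a score between 0 and 1."""
    total = len(response_times_s)
    if total == 0:
        raise ValueError("at least one sample is required")
    satisfied = sum(1 for t in response_times_s if t <= threshold_s)
    tolerating = sum(
        1 for t in response_times_s if threshold_s < t <= 4 * threshold_s
    )
    return (satisfied + tolerating / 2) / total


# Hypothetical response times in seconds: 3 satisfied, 2 tolerating, 1 frustrated.
samples = [0.2, 0.4, 0.5, 1.2, 1.9, 2.5]
score = apdex(samples, threshold_s=0.5)
```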

Question 6

What are CDN edge servers?

Accepted Answer

CDN edge servers cache content that has been fetched from your origin server or storage cluster. A related term is point of presence (POP), which refers to the physical location where edge servers are hosted.

A single POP may host several edge servers that are used for content caching.

Serving content from a location close to the visitor reduces the distance between the visitor and the server that responds. CDN edge servers store a copy of the content being delivered, so they can serve it directly to users without retrieving it from the origin server each time. Lowering latency in this way is the main purpose of CDN edge servers.

Question 7

What do you mean by "load balancing"? Why is it crucial for system design?

Accepted Answer

Load balancing is the practice of efficiently distributing incoming traffic across a group of backend servers, known as a server pool. Today's websites are expected to respond quickly and correctly to millions of client requests while handling a high volume of traffic, and meeting that demand usually means adding more servers.

In this case, it is crucial to distribute request traffic appropriately among the servers so that none of them is overloaded. A load balancer acts as a traffic cop: it receives the requests and distributes them among the available servers so that no single server is overloaded, which would degrade the service.

When a server goes offline, the load balancer redirects traffic to the remaining available servers. When a new server is added to the pool, the load balancer automatically starts sending requests to it. Below are some advantages of load balancers:

They avoid routing requests to unreliable or unhealthy servers and help prevent any one server's resources from being overloaded.
They reduce the likelihood of a single point of failure, since requests are diverted to other servers when one goes down.
They can perform SSL termination: encrypted traffic is decrypted at the load balancer, which removes the need to install X.509 certificates on every backend server.
They can improve system security, and they make it possible to roll out software updates continuously by taking servers out of rotation one at a time.
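
A minimal sketch of the distribution and failover behaviour described above, assuming a simple round-robin policy (only one of several possible policies). The class, its methods, and the server names are hypothetical; a real load balancer would also run health checks itself rather than being told which servers are down.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in rotation, skipping any that are marked unhealthy."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._cycle = itertools.cycle(self._servers)

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def next_server(self):
        if not self._healthy:
            raise RuntimeError("no healthy servers available")
        # At most one full rotation is needed to find a healthy server.
        for _ in range(len(self._servers)):
            server = next(self._cycle)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
before_failure = [balancer.next_server() for _ in range(3)]
balancer.mark_down("app-2")  # simulate a server going offline
after_failure = [balancer.next_server() for _ in range(4)]
```

After `app-2` is marked down, requests alternate between the two remaining servers, so no request is sent to the failed one.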

Question 8

What do you understand by Database Indexing?

Accepted Answer

Database indexing makes it quicker and simpler to search a table and find the desired rows. Indexes are built on one or more columns of a table and serve as the foundation for fast random lookups and efficient access to sorted data.

There are several types of indexes that can be created, including:

Clustered Indexes: These indexes physically reorganize the data in a table to match the order of the index. There can only be one clustered index per table.
Non-clustered Indexes: These indexes store the data in the original order but create a separate data structure that stores the indexed columns in a sorted order. Multiple non-clustered indexes can be created on a single table.
Unique Indexes: These indexes enforce the uniqueness of the indexed columns, ensuring that no two rows in the table have the same values for those columns.
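
To see an index change how a lookup is executed, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `users` table, its columns, and the index name `idx_users_email` are assumptions made for the example; the exact wording of the query plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def query_plan(sql, params=()):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


lookup = "SELECT name FROM users WHERE email = ?"

# Without an index, the lookup has to scan the whole table.
plan_before = query_plan(lookup, ("user42@example.com",))

# A unique index on email lets SQLite search the index instead.
conn.execute("CREATE UNIQUE INDEX idx_users_email ON users (email)")
plan_after = query_plan(lookup, ("user42@example.com",))
```

Comparing `plan_before` and `plan_after` shows the planner switching from a full table scan to a search that uses the index.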

Question 9

Can you describe Availability in terms of System Design?

Accepted Answer

Availability is one way to ensure system reliability. It means that the system should always be online and respond to customer requests. A highly available system functions reliably and consistently, with minimal downtime or interruptions. In other words, the system must be accessible and responsive whenever a user wants to use the service.

Availability is calculated as the proportion of time the system is operational within a specified time window.

Availability = Uptime / (Uptime + Downtime)

The "availability percentages" of a system are typically expressed in terms of the number of 9s (as shown in the table below).

It is referred to as having "2 nines" of availability if availability is 99.00%, "3 nines" if availability is 99.9%, and so forth.
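
As a worked example of the availability ratio, the sketch below computes availability from uptime and downtime and converts an availability target into the downtime it allows per year. The function names are hypothetical, and a year is assumed to have 365 days.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def availability(uptime_hours, downtime_hours):
    """Availability = uptime / (uptime + downtime)."""
    return uptime_hours / (uptime_hours + downtime_hours)


def allowed_downtime_hours_per_year(availability_fraction):
    """Downtime budget implied by an availability target."""
    return (1 - availability_fraction) * HOURS_PER_YEAR


two_nines = allowed_downtime_hours_per_year(0.99)     # about 87.6 hours
three_nines = allowed_downtime_hours_per_year(0.999)  # about 8.76 hours
measured = availability(uptime_hours=8751.24, downtime_hours=8.76)  # about 0.999
```

Each additional nine cuts the permitted downtime by a factor of ten.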

Question 10

What exactly do you mean by "leader election"?

Accepted Answer

In a distributed environment where numerous servers contribute to the application's availability, a single server may need to take charge of a task such as updating third-party APIs, because several servers doing so at once could interfere with one another.

This server is known as the leader (or primary) server, and the procedure for selecting it is known as leader election. When the leader fails, the other servers in the distributed environment must detect the failure and elect a new leader.

There are several strategies that can be used to implement leader election in a distributed system, including:

Voting: One approach is to have each node in the system vote for a leader. The node with the most votes becomes the leader.
Token Passing: In this approach, a token is passed from node to node, and the node that currently holds the token is the leader.
Priority-based: In this approach, each node is assigned a priority, and the node with the highest priority becomes the leader.
Time-based: In this approach, the leader is determined based on the time that each node has been running. The node that has been running the longest becomes the leader.
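
The priority-based strategy can be sketched as follows. This is a simplified, single-process illustration: the node names and priorities are hypothetical, and it assumes every node already agrees on which nodes are alive, which a real system would have to establish through failure detection and agreement among the nodes.

```python
def elect_leader(priorities, alive):
    """Return the live node with the highest priority, or None if none are alive.

    priorities: dict mapping node name to priority (higher wins).
    alive: set of node names currently considered reachable.
    """
    candidates = {node: p for node, p in priorities.items() if node in alive}
    if not candidates:
        return None
    # Ties are broken by node name so the result is deterministic.
    return max(candidates, key=lambda node: (candidates[node], node))


priorities = {"node-a": 1, "node-b": 3, "node-c": 2}
alive = {"node-a", "node-b", "node-c"}

leader = elect_leader(priorities, alive)

# The current leader fails, so the remaining nodes elect a replacement.
alive.discard(leader)
new_leader = elect_leader(priorities, alive)
```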

Question 11

What role do Network protocols play in system design?

Accepted Answer

A network protocol is a set of rules and conventions that govern the communication between devices on a network. In system design, the choice of network protocol can have a significant impact on the performance and scalability of the system, as well as on its security and reliability.

Many different types of network protocols can be used in system design, including:

TCP (Transmission Control Protocol): This is a connection-oriented protocol that is used to establish a reliable, end-to-end communication channel between devices on a network. It is commonly used for applications that require reliable delivery of data, such as email and file transfer.
IP (Internet Protocol): This protocol is responsible for routing packets of data across a network. It is the backbone of the internet and is used by most other network protocols to transport data. IP is a connectionless protocol, which means that it does not establish a dedicated connection between the sender and receiver before transmitting data. IP has two main versions: IPv4 and IPv6. IPv4 is the most widely used version.
UDP (User Datagram Protocol): This is a connectionless protocol that is used to send data packets between devices on a network without establishing a dedicated end-to-end connection. It is often used for applications that require fast, real-time communication, such as online gaming and Voice over IP (VoIP).
HTTP (Hypertext Transfer Protocol): This is a stateless protocol that is used to transfer web content between a web server and a web browser. It is the foundation of the World Wide Web and is used by almost all web applications.
HTTPS (Hypertext Transfer Protocol Secure): This is an extension of HTTP that uses encryption to secure the communication between a web server and a web browser. It is commonly used for sensitive information, such as online banking and e-commerce transactions.

Question 12

What connections exist between scalability and performance?

Accepted Answer

Scalability refers to a system's ability to handle growing amounts of traffic, whereas performance refers to how quickly the system responds under a given load. A system is considered scalable when its performance improves in proportion to the resources added to it. Scalability is closely tied to the performance of any design because it determines how well the system copes with larger data sets and growing activity.

There are several ways in which scalability and performance are related in system design:

Scalability can Affect Performance: A system that is not designed to be scalable may experience a decrease in performance as the workload increases.
Performance can Affect Scalability: A system that has poor performance may struggle to scale up, as it may be unable to handle the increased workload without experiencing further performance degradation.
Scalability and Performance can be Balanced: A well-designed system should be able to scale up or down as needed while maintaining good performance. This can be achieved by using techniques such as load balancing, caching, and database optimization.

Question 13

What do you mean by Polling?

Accepted Answer

Polling means the client periodically checks on the server by sending a network request and asking for the most recent data. These requests are typically sent at regular intervals such as 5 seconds, 15 seconds, 1 minute, or whatever the use case requires.

Polling every few seconds still falls short of real-time and has the following drawbacks, especially if there are more than a million concurrent users:

a constant stream of network requests from each client (not great for the client)
a very large number of requests arriving at regular intervals (potentially 1 million+ requests per second, not ideal for server load!)

Polling is therefore best used when short delays in data updates are not a problem for your application. Polling at very short intervals is neither efficient nor performant.
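
A minimal polling loop might look like the sketch below. The `poll` function and its parameters are hypothetical; `fetch` stands in for the network request, and the `sleep` parameter is injectable so the example can run without actually waiting.

```python
import time


def poll(fetch, interval_s, max_attempts, sleep=time.sleep):
    """Call fetch() at fixed intervals until it returns something other than None.

    Returns (result, attempts_used); result is None if every attempt came back empty.
    """
    for attempt in range(1, max_attempts + 1):
        result = fetch()
        if result is not None:
            return result, attempt
        if attempt < max_attempts:
            sleep(interval_s)  # wait before asking the server again
    return None, max_attempts


# Simulated server: no new data on the first two requests, then a result.
responses = iter([None, None, {"status": "done"}])
waits = []  # records the requested delays instead of really sleeping

result, attempts = poll(
    fetch=lambda: next(responses),
    interval_s=5,
    max_attempts=10,
    sleep=waits.append,
)
```

Two of the three requests returned nothing new, which is the wasted traffic that makes frequent polling expensive at scale.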

Question 14

Can you tell me the types of distributed systems?

Accepted Answer

The four architecture types listed below are often used by distributed systems and processes:

Client Server

In the client-server architecture, the distributed system is built around a server that acts as a shared resource, such as a printer, database, or web server. Multiple clients, such as users at their computers, decide how to use, display, and modify the shared resource and submit modified data back to the server.

Three-tier

In this common architecture, client-related data is stored in a middle tier rather than directly on the client, which makes application deployment simpler. The middle tier is sometimes called an agent: it accepts requests from clients (which may be stateless), processes the data, and forwards it to the servers.

Multi-tier

This architecture was first developed for enterprise web services, where servers that hold the business logic communicate with both the data tier and the presentation tier.

Peer-to-peer

In this design, no dedicated servers are needed to do the intelligent work. Each participating machine can act as either a client or a server, and decision-making and responsibilities are distributed among all of them.

Question 15

What is the use of Database query in system design?

Accepted Answer

A database query is a request for data or information from a database. In the context of system design, queries, typically written in Structured Query Language (SQL), are used to retrieve, add, update, or delete data in a database.

Queries are an essential part of any system that needs to store and retrieve data. For example, a retail website may use database queries to retrieve customer information, process orders, and track inventory. A social media platform may use database queries to store and retrieve user profiles, posts, and messages.

Well-written queries can greatly improve the efficiency and performance of a system by letting it access and manipulate stored data quickly. Queries can be optimized to retrieve only the data that is needed and to do so in the most efficient way possible. Overall, database queries are crucial for storing, organizing, and accessing data in a scalable and efficient manner.
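
The four basic operations can be illustrated with Python's built-in `sqlite3` module and an in-memory database. The `inventory` table and its rows are hypothetical, loosely following the retail example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (sku TEXT PRIMARY KEY, name TEXT, quantity INTEGER)"
)

# Add rows.
conn.executemany(
    "INSERT INTO inventory (sku, name, quantity) VALUES (?, ?, ?)",
    [("A1", "keyboard", 10), ("B2", "mouse", 0), ("C3", "monitor", 4)],
)

# Update: one keyboard was sold.
conn.execute("UPDATE inventory SET quantity = quantity - 1 WHERE sku = ?", ("A1",))

# Delete: drop items that are out of stock.
conn.execute("DELETE FROM inventory WHERE quantity = 0")

# Retrieve only the columns and rows that are needed.
in_stock = conn.execute(
    "SELECT sku, quantity FROM inventory WHERE quantity > 0 ORDER BY sku"
).fetchall()
```

Selecting just `sku` and `quantity` rather than every column is a small instance of retrieving only the data that is needed.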

Question 16

Can you describe proxies?

Accepted Answer

A proxy server is a piece of software or hardware that sits between a client and another server. It may be located anywhere between the clients and the destination servers, including on the user's own computer. A proxy server receives requests from clients, forwards them to the origin servers, and then returns each server's response to the client that made the request.

Proxy servers are widely used to process, filter, and log requests, and occasionally to alter them (by adding or removing headers, encrypting or decrypting, or compressing). A proxy can also coordinate requests coming from several sources and can be used to optimize request traffic at a system-wide level.

Question 17

What are the types of proxies?

Accepted Answer

There are two types of proxies in system design:

Forward Proxy

In interactions between clients and servers, a forward proxy acts on behalf of the client: it stands in for the user and relays the user's requests. When a forward proxy is in use, the server is not aware that the request and response are passing through a proxy.

Reverse Proxy

A reverse proxy is most helpful in complex systems. Reverse proxies assist servers by acting on their behalf. When a reverse proxy is in use, the client is not aware that the request and response are passing through a proxy.

The main server can delegate several functions to a "reverse proxy," which can also serve as a gatekeeper, screener, load-balancer, and general helper.

Question 18

What Consistency design patterns are there for different systems?

Accepted Answer

According to the definition of consistency in the CAP theorem, every read request should receive the most recently written data. When there are several copies of the same data, it becomes difficult to synchronize them so that clients always receive up-to-date information. The available consistency patterns are as follows:

Weak Consistency

After a write operation, a read request may or may not see the new data.

Real-time use cases like VoIP, video chat, and online gaming work well with this kind of consistency.

Eventual Consistency

After a data write, reads will eventually see the most recent data, typically within milliseconds. Here, the data is replicated asynchronously. DNS and email systems both use this pattern. It works well in highly available systems.

Strong Consistency

After a data write, subsequent reads will see the most recent data. Here, the data is replicated synchronously. This pattern is seen in RDBMSs and file systems, and it is appropriate for systems that need transactions.

Question 19

What is block storage?

Accepted Answer

Block storage is a method of storing data that divides the data into equal-sized blocks and assigns a unique identifier to each block so that it can be located. Blocks can be stored anywhere in the system instead of following a predetermined path, which makes better use of the system's resources.

A few examples of block storage tools are LVM (Logical Volume Manager), SAN (Storage Area Network), and ZFS (Zettabyte File System).

Choosing the right tool(s) for a system depends on the specific requirements of the system and the underlying storage infrastructure. Block storage is typically used to store data that needs to be accessed quickly and frequently, such as operating system files, application data, and database files. It is also used to store data that needs to be accessed randomly, as it allows individual blocks of data to be accessed directly, rather than having to read through a large amount of data sequentially.

Block storage is a powerful tool for storing and managing data in a system, and it is often used in conjunction with other types of storage to create a well-rounded storage strategy.
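
The idea of fixed-size blocks with unique identifiers can be sketched with a toy in-memory store. The `BlockStore` class, the 4-byte block size, and the use of random UUIDs as identifiers are all assumptions for illustration; real block devices use far larger blocks and address them differently.

```python
import uuid


class BlockStore:
    """Toy block store: data is split into equal-sized blocks, each with its own ID."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self._blocks = {}  # block ID -> bytes; placement order does not matter

    def write(self, data: bytes) -> list[str]:
        """Split data into blocks and return the block IDs in order."""
        block_ids = []
        for offset in range(0, len(data), self.block_size):
            chunk = data[offset:offset + self.block_size]
            chunk = chunk.ljust(self.block_size, b"\x00")  # pad the final block
            block_id = uuid.uuid4().hex
            self._blocks[block_id] = chunk
            block_ids.append(block_id)
        return block_ids

    def read_block(self, block_id: str) -> bytes:
        """Random access: fetch one block directly by its ID."""
        return self._blocks[block_id]

    def read(self, block_ids: list[str], length: int) -> bytes:
        """Reassemble the original data from its blocks."""
        return b"".join(self._blocks[b] for b in block_ids)[:length]


store = BlockStore(block_size=4)
payload = b"hello world"
ids = store.write(payload)          # 11 bytes -> 3 blocks of 4 bytes
second_block = store.read_block(ids[1])
restored = store.read(ids, len(payload))
```

Reading `ids[1]` directly, without touching the other blocks, is the random access that makes block storage suitable for databases and operating system files.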

Question 20

Can you brief me about File Storage?

Accepted Answer

File storage is a hierarchical storage methodology. Data is saved in files, files are grouped into folders, and folders are nested within directories. It is often used in systems to store data that is accessed less frequently than data stored in block storage, such as large documents, media files, and backups.

File storage works best for relatively small amounts of mostly structured data, and it can become troublesome to manage once the data grows beyond a certain size.

Several factors, such as performance, capacity, data organization, and data protection, affect how well file storage fits a system.

Question 21

Can you explain Object Storage?

Accepted Answer

Large amounts of unstructured data can be handled by object storage. Each object is typically a large file, such as a video or image, and is stored with a unique identifier and metadata that describes the object. Due to the importance of backups, unstructured data, and log files for all systems, this type of storage offers systems a tremendous amount of flexibility and value.

Object storage is a good fit when you are designing a system with large datasets. It is designed to scale horizontally, allowing it to store vast amounts of data without experiencing a decrease in performance.

A few object storage tool examples are Amazon S3, Google Cloud Storage, Ceph, OpenStack Swift, and many more. Some factors to consider while deciding on a tool would be scalability, durability, cost and the features that are required for the system. An operating system cannot directly access object storage. RESTful APIs are used for communication at the application level. Due to its dynamic scalability, object storage is the preferred method of data storage for data backups and archiving.

Question 22

Can you tell me the difference between web servers and application servers?

Accepted Answer

Web servers and application servers are both used in computer networks, but they serve different purposes. A web server is a server dedicated to handling web requests. It is designed to host and serve web content, such as HTML, CSS, and JavaScript files, to clients over the internet. Web servers are typically optimized for serving static content, such as images, videos, and documents, and usually offer little processing capability beyond basic HTTP handling.

Application servers, on the other hand, are servers that are designed to host and run applications. These applications may be web-based or standalone, and they may include dynamic content, such as databases, user accounts, and business logic. Application servers often include features such as database connectivity, security, and scaling capabilities, and they may be built using frameworks such as Java EE or .NET.

Question 23

What is the major tool used in the structured design?

Accepted Answer

In structured design, the major tool used is a flowchart, which is a graphical representation of a process or system that shows the steps or activities involved and the relationships between them. Flowcharts are used to visualize and document the design of a system, and they can help to identify and resolve problems or issues during the design process.

Flowcharts are widely used in structured design because they provide a clear and concise way to represent the design of a system, and they are easy to understand and communicate to others. They are also flexible and can be used for a wide range of systems, from simple to complex.

In addition to flowcharts, other tools that are commonly used in structured design include data flow diagrams, entity-relationship diagrams, and state diagrams.

Question 24

Can you tell me about Latency?

Accepted Answer

Latency is the amount of time it takes for a request to be processed and a response to be returned in a system. In system design, latency is an important consideration because it can impact the performance and user experience of the system. Several factors can contribute to latency in a system, including the speed of the network connection, the processing power of the servers or devices involved, and the complexity of the algorithms and processes being used.

To reduce latency in a system, designers can consider several strategies, such as optimizing the algorithms and processes being used, using faster hardware and networking equipment, and implementing caching and other performance-enhancing techniques.

Question 25

What is Throughput?

Accepted Answer

Throughput refers to the amount of data or the number of transactions that a system can handle in a given period of time. It is often used to evaluate the capacity or scalability of a system, as well as to identify potential bottlenecks or performance issues. One proven way to increase a system's throughput is to split the requests and spread them across additional resources.

Numerous factors can impact the throughput of a system:

Hardware: The hardware components of the system, such as the processors, memory, and storage, can have a significant impact on the throughput of the system.
Network: The network infrastructure of the system, including the bandwidth and latency of the network connection, can also impact the throughput of the system.
Software: The software and algorithms used in the system can also impact the throughput of the system.
Workload: The workload of the system can also impact the throughput of the system. A system that is handling a high volume of requests or transactions may experience a decrease in throughput compared to a system with a lower workload.

Question 26

Describe the CAP theorem.

Accepted Answer

According to the CAP (Consistency, Availability, Partition Tolerance) theorem, a distributed system cannot simultaneously guarantee C, A, and P. It can offer at most two of the three guarantees. Let us use a distributed database system to understand this.

Consistency: The data must remain consistent after a database operation completes. For instance, all queries should return the same information after a database update.
Availability: The database must always be accessible and responsive; it cannot suffer downtime.
Partition Tolerance: The database system must keep functioning even when communication between its nodes breaks down.

The following diagram illustrates which CAP theorem guarantees each database provides simultaneously. RDBMS databases such as SQL Server and MariaDB offer consistency and availability. Redis, MongoDB, and HBase ensure consistency and partition tolerance. Cassandra and CouchDB achieve availability and partition tolerance.

Question 27

What distinguishes horizontal scaling from vertical scaling?

Accepted Answer

Horizontal scaling means increasing the number of machines on a network so that the processing and memory workload is distributed across them.

Vertical scaling means increasing the resource capacity of a single machine, for example by adding RAM or faster processors. It can increase the server's capacity without any code changes.

Some other factors to consider when deciding between horizontal scaling and vertical scaling in system design:

Cost: Horizontal scaling typically requires more hardware and infrastructure, which can be more expensive than vertical scaling.
Complexity: Horizontal scaling may require more complex configuration and management, as it involves adding and coordinating multiple nodes or servers.
Performance: Vertical scaling can often provide a faster performance improvement than horizontal scaling, as it increases the resources of a single node rather than distributing the workload across multiple nodes.

Question 28

Caching: What is it? What different cache update methods are there for caching?

Accepted Answer

Caching is the practice of keeping copies of data in fast temporary storage, called a cache, so that the data can be accessed more quickly and latency is lower. A cache can hold only a limited amount of data, so it is crucial to choose a cache update strategy that suits the needs of the business. The different caching strategies are as follows:

Cache-aside: In this approach, the application is responsible for reading from and writing to the storage; the storage and the cache do not interact directly. The application looks for an entry in the cache first. On a miss, it retrieves the entry from the database and adds it to the cache for later use. The cache-aside strategy, also known as lazy loading, caches only the entries that are requested, which prevents unnecessary data from being cached.

Write-through: In this strategy, the application reads from and writes to the cache, treating it as its primary data store. The cache then updates the database to reflect these changes. Entries are written synchronously from the cache to the database.

Write-behind (write-back): In this strategy, the application performs the following actions:

Change or update a cache entry
Asynchronously insert the entry into the data store to increase write performance.

Refresh-ahead: By employing this technique, we can set the cache to automatically refresh the cache entry before it expires.
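
The first three strategies above can be sketched in a few lines of Python. This is a minimal illustration, not a production cache: a plain dict stands in for the database, the class and method names are made up for the example, and the write-behind flush is triggered by hand where a real system would run it asynchronously in the background.

```python
from collections import deque


class CacheAside:
    """The application checks the cache first and loads from the database on a miss."""

    def __init__(self, database):
        self.database = database        # dict standing in for the real data store
        self.cache = {}

    def get(self, key):
        if key in self.cache:           # cache hit
            return self.cache[key]
        value = self.database[key]      # cache miss: read from the data store
        self.cache[key] = value         # populate the cache for later reads
        return value


class WriteThrough:
    """Every write goes to the cache and, synchronously, to the database."""

    def __init__(self, database):
        self.database = database
        self.cache = {}

    def put(self, key, value):
        self.cache[key] = value
        self.database[key] = value      # synchronous write to the data store


class WriteBehind:
    """Writes land in the cache at once and reach the database later."""

    def __init__(self, database):
        self.database = database
        self.cache = {}
        self.pending = deque()          # keys waiting to be flushed

    def put(self, key, value):
        self.cache[key] = value
        self.pending.append(key)        # defer the database write

    def flush(self):
        # Called by hand here; a real system would do this in the background.
        while self.pending:
            key = self.pending.popleft()
            self.database[key] = self.cache[key]
```

Until flush() runs, a write-behind entry exists only in the cache, which is why that strategy trades durability for write speed.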

Expect to come across this, one of the top System design interview questions.

Question 29

Can you tell me the disadvantages of Cache-aside, write-through, and write-behind cache update strategies?

Accepted Answer

Below are the drawbacks of each:

Cache-aside

When a cache miss occurs, there is a noticeable delay because the data must first be fetched from the database and then cached. Data in the cache can become stale if it is updated in the database. Setting a time-to-live (TTL) that forces the cache entry to be refreshed reduces this risk. When a cache node fails, it is replaced by a new, empty node, which increases latency until the cache is repopulated.

Write-through

Because every write is synchronous, writes are slow with this strategy. Data written to the cache may never be read. Setting an appropriate TTL minimizes this risk.

Write-behind

The main drawback of this approach is the potential for data loss if the cache is destroyed before the contents are written into the database.

Question 30

What are Content Delivery Networks? What are the various types?

Accepted Answer

A content delivery network, or CDN for short, is a network of globally dispersed proxy servers that serves content to users from locations close to them. Static files such as HTML, CSS, and JS files, images, and videos are typically served from CDNs.

Users don't have to wait long because data is delivered from nearby centers. Because CDNs take on part of the traffic, the load on the origin servers is significantly reduced.

There are two types of CDNs:

Push CDNs: In this case, the CDN receives new content every time the server makes changes. We are responsible for uploading the content to the CDN. The CDN is updated only when content is changed or added, which minimizes traffic while maximizing storage use.

Push CDNs generally work well for websites with little traffic or with content that is not updated often.

Pull CDNs: In this case, fresh content is fetched from the server when the first user requests it. As a result, initial requests take longer to complete until the content is cached on the CDN. Although these CDNs use the least amount of storage possible, they can create redundant traffic when cached files expire and are pulled again before they have actually changed. Pull CDNs are effective for busy websites.

Question 31

Can you explain the concept of Sharding?

Accepted Answer

Sharding is the process of dividing a large logical database into many smaller datasets called shards. Some data may be present on every shard, while other data appears on only one shard. Sharding can be horizontal, where each shard holds a different subset of the rows of the same tables, or vertical, where different tables or columns are placed on different shards. Together, the shards can serve more requests than a single large machine can. Sharding scales a database by increasing throughput, storage capacity, and availability, which helps it handle increased load.

Before data can be sharded, we must choose a sharding key to partition it by. An indexed field, or an indexed compound field, that appears in each document in the collection can serve as the sharding key. Because the application knows how to route each request by its sharding key, a query has to search only the relevant shard instead of the entire database. This improves the overall performance and scalability of the application.
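
As a rough illustration of routing by sharding key, the sketch below hashes the key and takes it modulo the number of shards. The names shard_for and ShardedStore are hypothetical, and each shard is just an in-memory dict; real databases use their own routing and rebalancing mechanisms.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a sharding key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    def __init__(self, num_shards: int):
        # One dict per shard, standing in for separate database servers.
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key, record):
        self.shards[shard_for(key, len(self.shards))][key] = record

    def get(self, key):
        # Only the shard that owns the key is consulted.
        return self.shards[shard_for(key, len(self.shards))].get(key)
```

A simple modulo scheme like this remaps most keys whenever the number of shards changes, which is one reason production systems often prefer consistent hashing or range-based routing.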

Question 32

What do you understand by Data Partitioning?

Accepted Answer

Data partitioning is the process of splitting a large dataset into smaller, more manageable pieces called partitions. Partitioning a large database can improve its performance, manageability, and availability, and it is also commonly used for load balancing.

There are three types of partitioning: horizontal partitioning, vertical partitioning, and functional partitioning. In horizontal partitioning, for example, the rows of a table are split across partitions by the value of a partitioning column, such as a date range.

The partition key effectively acts as the leading column of an index, so partitioning can reduce index size and improve the chances that the most frequently used indexes fit in memory. Scanning a single partition when a significant portion of it ends up in the result set is much quicker than using an index to access data scattered throughout the entire table. Adding and dropping whole partitions makes bulk loading and deletion of data fast. Rarely used data can be moved to less expensive storage.

A must-know for anyone heading into the technical rounds, this is one of the most frequently asked System design interview questions.

Question 33

Can you explain Database Replication in system design?

Accepted Answer

Database replication is a process of copying data from a database on one server to a database on another server. This is typically used to improve data availability, scalability, and performance by allowing multiple servers to handle the load of a database system. There are several different types of replication, including master-slave replication, where one server acts as the primary source of data and the other servers act as replicas, and peer-to-peer replication, where all servers are equal and can both read and write to the database.

Replication can be useful in a variety of situations, including when a database needs to be accessed by users in different geographic locations or when a database needs to be backed up for disaster recovery purposes.

Question 34

Define RAID.

Accepted Answer

RAID (Redundant Array of Independent Disks) is a technology used to improve the performance, reliability, and fault tolerance of a data storage system. It works by combining multiple physical disks into a logical unit, allowing the system to read and write data across the disks in a way that increases performance and reliability.

There are several different RAID configurations, each with its own unique set of characteristics and benefits. Some common RAID configurations include:

RAID 0 is a striping technique that spreads data across multiple disks. It does not provide any redundancy, so if one of the disks fails, all data on the array is lost. However, it does improve performance by allowing multiple disks to be accessed concurrently.

RAID 1 is a mirroring technique that stores copies of data on multiple disks. It provides redundancy by maintaining multiple copies of data, so if one of the disks fails, the data is still available on the other disk. However, it does not improve write performance, because all data must be written to every disk in the mirror; reads can be served from any copy.

RAID 5 is a striping technique that distributes parity information across the disks along with the data. It provides redundancy by using the parity information to reconstruct data in the event of a disk failure. It also improves performance by allowing multiple disks to be accessed concurrently. However, it requires at least three disks, and computing parity adds overhead to every write.

RAID 6 is similar to RAID 5, but it uses double parity to provide even greater redundancy. It can survive the failure of two disks, but it has a higher overhead and may not improve performance as much as RAID 5.

RAID 10 is a combination of RAID 1 (mirroring) and RAID 0 (striping). It provides redundancy by maintaining multiple copies of data and improves performance by allowing multiple disks to be accessed concurrently. However, it requires at least four disks, and because every block is mirrored, only half of the raw capacity is usable.

These are just a few examples of the different RAID levels that are available. Many other RAID levels have been developed, each with its unique combination of features and trade-offs. It is important to carefully consider the requirements of your system and choose a RAID level that meets your needs.
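
The parity idea behind RAID 5 can be illustrated with XOR. The sketch below is a simplification: it treats each disk as one short byte block and ignores how real controllers rotate parity across disks. The function names are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing data block from the survivors and the parity block."""
    return xor_blocks(surviving_blocks + [parity])


# Three data blocks and the parity block computed from them.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second disk and rebuilding its block.
recovered = rebuild_missing([data[0], data[2]], parity)
```

Because XOR is its own inverse, combining the parity block with the surviving blocks yields the lost block. This only works when a single block is missing, which is why RAID 6 adds a second, independent parity.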

Question 35

Do you know about file systems in system design?

Accepted Answer

A file system is a way of organizing and storing data on a storage disk. It determines how files are named, stored, and retrieved. A file system also provides a way of organizing and grouping files, as well as setting permissions on files and directories to control who can access them.

A distributed file system is a file system that allows multiple computers to access and store data on a network of computers.

In system design, the choice of a distributed file system is an important decision that can have significant implications for the performance, reliability, and scalability of the system. There are several different types of distributed file systems, each with its own set of features and characteristics.

Question 36

Can you give me some examples of a distributed file system in system design?

Accepted Answer

Some common types of distributed file systems include:

Network File System (NFS): This is a widely used protocol that allows a computer to access files over a network as if they were stored on its local hard drive. It is simple to use and allows users to access files from any computer on the network.
Server Message Block (SMB): This is a protocol that allows a computer to access files over a network and is commonly used on Windows systems. It supports features such as file locking, so that multiple users can safely access the same file simultaneously.
Andrew File System (AFS): This is a distributed file system designed to provide fast, reliable access to files over a wide area network, such as the Internet.
GlusterFS: This distributed file system is designed to scale out to large numbers of servers and handle large amounts of data. It is often used for storing large volumes of unstructured data, such as photos, videos, and other types of media.
HDFS (Hadoop Distributed File System): This is the distributed file system of the Hadoop framework, a popular tool for storing and processing large amounts of data. It is often used to store data for big data analysis.
Google File System (GFS): GFS is based on a distributed architecture in which files are divided into chunks that are stored on multiple servers. It uses a master-slave architecture, with a single master server responsible for managing the file system metadata and a large number of chunk servers responsible for storing and serving the data.

Question 37

What do you understand by System Design Patterns?

Accepted Answer

System design patterns capture the best practices and lessons learned from building systems, and they provide proven, reusable solutions to common problems.

Some common system design patterns include:

Client-server: This pattern involves dividing a system into two components: a client and a server. The client is responsible for making requests to the server, and the server is responsible for processing the requests and returning a response. This pattern is commonly used in distributed systems to allow clients to access resources or services over a network.
Model-view-controller: This pattern is commonly used in user interface design and involves dividing a system into three components: a model, a view, and a controller. The model represents the data and logic of the system, the view presents the data to the user, and the controller mediates between the model and the view. This pattern allows for a separation of concerns and can make it easier to maintain and update the system.
Pipe and Filter: This pattern involves dividing a system into a series of independent, reusable components that are connected in a pipeline. Each component performs a specific task and passes the data on to the next component in the pipeline. This pattern allows for flexibility and reuse and can make it easier to maintain and update the system.
Publish-subscribe: This pattern involves two kinds of components: publishers and subscribers. Publishers send messages to a topic or channel without knowing who will receive them, and subscribers receive and process the messages from the topics they have subscribed to. This pattern is commonly used in distributed systems to allow for decoupled communication between components.

These are just a few examples of the many system design patterns that are available. It is important to carefully consider the requirements of the system and choose the design patterns that are most appropriate for the system.
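
As one example, the publish-subscribe pattern can be sketched with a small in-memory broker. The Broker class and topic names are hypothetical, and delivery here is a direct function call; real brokers add persistence, networking, and delivery guarantees.

```python
from collections import defaultdict


class Broker:
    """In-memory broker: publishers and subscribers only share topic names."""

    def __init__(self):
        self._subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver the message to every callback registered for this topic.
        for callback in list(self._subscribers[topic]):
            callback(message)


broker = Broker()
received = []
broker.subscribe("orders", received.append)
broker.subscribe("orders", lambda m: received.append(m.upper()))
broker.publish("orders", "created")     # reaches both subscribers
broker.publish("billing", "ignored")    # no subscribers, so nothing happens
```

The publisher never references a subscriber directly, so either side can be added or removed without changing the other.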

Question 38

What role does Database play in System Design?

Accepted Answer

A database is a structured collection of data that is stored and accessed electronically. There are many different types of databases available, and it is important to carefully consider the requirements of the system when choosing a database.

Some types of databases commonly used in system design include:

Relational/SQL Databases: These are databases that store data in a structured, tabular format. They are based on the relational model, which organizes data into tables with rows and columns and defines relationships between the tables. Relational databases are widely used because of their reliability and support for complex queries.
NoSQL Databases: These are databases that do not use the traditional relational model and are designed to handle large amounts of unstructured data. NoSQL databases are often used for storing data that does not fit well into a tabular structure, such as documents, images, or log data.
Key-value Stores: These are databases that store data as a collection of key-value pairs. They are designed to be simple and fast and are often used for storing data that does not require complex queries or relationships.
Object-oriented Databases: These are databases that store data in an object-oriented format. They are designed to support the storage of complex data structures and are often used in applications that require complex data modeling.

Factors to consider when choosing a database include the amount of data that needs to be stored, the complexity of the data, the performance and scalability requirements, and the availability and reliability requirements.

Question 39

Can you give some examples for each of Relational and Non-Relational databases?

Accepted Answer

Some examples of popular relational databases include:

MySQL: An open-source relational database management system that is widely used for web applications and data storage.
Oracle: A powerful and feature-rich commercial relational database management system that is used by many large organizations.
Microsoft SQL Server: A popular commercial relational database management system developed by Microsoft.
PostgreSQL: An open-source, object-relational database management system that is known for its reliability and robustness.
SQLite: A lightweight, self-contained, serverless relational database management system that is commonly used in mobile and embedded applications.

Some examples of popular non-relational databases include:

MongoDB: A widely used document database that stores data as JSON-like documents and supports a variety of data types.
Redis: A key-value store that is known for its high performance and ability to store large amounts of data in memory.
Cassandra: A distributed database that is designed for high availability and scalability, and is often used for storing large amounts of data.
Neo4j: A graph database that is used for storing and querying complex relationships between data.

Question 40

Can you explain ACID properties?

Accepted Answer

ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that are used to guarantee the integrity and reliability of data in a database.

The ACID properties are:

Atomicity: This property guarantees that a database transaction is either completely successful or completely unsuccessful. If a transaction fails, all of its changes are rolled back, so the data returns to its previous, consistent state.
Consistency: This property guarantees that a database will only allow transactions that maintain the integrity of the data. For example, if a database has a constraint that prevents the deletion of a row if it is referenced by other rows, the consistency property ensures that this constraint is enforced.
Isolation: This property guarantees that concurrent transactions are isolated from each other, meaning that they do not interfere with each other. This ensures that the data remains in a consistent state even when multiple transactions are being processed concurrently.
Durability: This property guarantees that once a transaction is committed, it will be persisted in the database and will not be lost in the event of a failure.
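
Atomicity and consistency can be observed with Python's built-in sqlite3 module. In this sketch the accounts table, the names in it, and the transfer helper are invented for illustration; the CHECK constraint plays the role of the integrity rule, and a failed transfer leaves both balances unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit was undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

The credit is deliberately applied before the debit so that an overdraft fails halfway through the transaction, which makes the rollback visible.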

Question 41

What are Message queues?

Accepted Answer

A message queue routes messages from a sender to a recipient. Its main purpose is to decouple the sender from the recipient so that they can operate independently and asynchronously. A queue typically delivers messages in FIFO (first in, first out) order.

In a message queue system, there are three main components:

Producers: These are the components that send messages to the queue.
Queue: This is the message queue itself, which stores the messages until they are retrieved by the consumers.
Consumers: These are the components that retrieve messages from the queue and process them.

Message queues can also be used to provide reliability and fault tolerance to a system. If a consumer fails to process a message, the message can be returned to the queue for processing by another consumer. This allows the system to continue operating even if one or more components fail.
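
The requeue-on-failure behavior described above can be sketched with Python's standard queue module. This is a single-process illustration, not a real message broker: each queue item is assumed to be a (message, attempts) pair, and MAX_ATTEMPTS and the consume function are hypothetical names.

```python
import queue

MAX_ATTEMPTS = 3  # hypothetical retry limit before a message is set aside


def consume(q, handler):
    """Drain the queue in FIFO order, requeueing messages whose handler fails."""
    processed, dead = [], []
    while True:
        try:
            message, attempts = q.get_nowait()
        except queue.Empty:
            break
        try:
            handler(message)
            processed.append(message)
        except Exception:
            if attempts + 1 < MAX_ATTEMPTS:
                q.put((message, attempts + 1))   # return it to the queue
            else:
                dead.append(message)             # give up after repeated failures
    return processed, dead
```

A producer would enqueue work with q.put((message, 0)). A requeued message goes to the back of the queue, so strict FIFO order is not preserved for retried messages.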

Question 42

What are the types of message queues available and explain one of them?

Accepted Answer

There are many different types of message queue systems available, including Apache Kafka, RabbitMQ, and ActiveMQ. Each has its own set of features and capabilities and can be used in different types of systems depending on the specific requirements.

Apache Kafka: Apache Kafka is a distributed, real-time message queue system that is widely used for building scalable, high-throughput, and fault-tolerant architectures for data processing, streaming and messaging. Kafka is a publish-subscribe messaging system that allows producers to send messages to topics and consumers to read from those topics. Topics are partitioned and replicated across the cluster, allowing for high scalability and availability. One of the key features of Kafka is its ability to handle large amounts of data with low latency.

Question 43

Can you explain Database schema?

Accepted Answer

A database schema is a structure that defines the organization and relationships of data in a database. It specifies the data types, the attributes of each data element, and the relationships between different data elements.

In system design, the database schema is an important consideration because it determines how the data will be stored and accessed. It needs to be carefully planned and designed to ensure that the database is efficient, scalable, and easy to use.

There are two main types of database schema:

Physical Schema: This defines how the data is stored on disk, including the data types, sizes, and locations of the data elements.
Logical Schema: This defines the logical structure of the data, including the data types, attributes, and relationships between different data elements.

The physical schema is usually implemented by the database management system (DBMS), while the logical schema is defined by the database designer. The logical schema is used to create the physical schema, and it is also used by users and applications to interact with the database.

Question 44

Can you help me understand CQRS and 2PC?

Accepted Answer

CQRS (Command Query Responsibility Segregation) is a design pattern that is used to separate the responsibilities of reading and writing data in a system. In a CQRS system, the write side of the system is responsible for storing data, while the read side is responsible for retrieving and displaying data. These two sides are usually implemented as separate components, and they can use different data models and storage mechanisms.

2PC (Two-Phase Commit) is a protocol that is used to coordinate transactions across multiple systems in a distributed environment. It has two phases: the prepare phase (also called the voting or commit-request phase), in which the coordinator asks every participant whether it can commit, and the commit phase, in which the coordinator decides, based on the votes, to commit or abort the transaction.

It is designed to ensure that all systems participating in a transaction either commit or roll back their changes as a group, so that the overall transaction is either completed or canceled.

In a 2PC system, there is a central coordinator that coordinates the transaction, and multiple participants that participate in the transaction.
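
The coordinator and participant roles can be sketched as follows. This is a toy, in-process model under simplifying assumptions: the Participant class and its can_commit flag are invented for the example, and timeouts, crashes, and durable logging, which real 2PC implementations must handle, are left out.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit    # whether this participant will vote yes
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator: commit everywhere only if every participant votes yes."""
    votes = [p.prepare() for p in participants]     # phase 1: voting
    if all(votes):
        for p in participants:                      # phase 2: commit
            p.commit()
        return "committed"
    for p in participants:                          # phase 2: abort
        p.rollback()
    return "aborted"
```

A single "no" vote is enough to abort the transaction on every participant, including those that voted yes.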

Question 45

What are Consistent Hashing and Bloom Filters?

Accepted Answer

Consistent hashing is a technique that is used to distribute keys or data elements across a distributed system in a way that minimizes the number of keys that need to be moved when the number of nodes in the system changes. It is commonly used in distributed systems to distribute data and workloads evenly across multiple servers or components and to ensure that the system remains balanced and efficient even when nodes are added or removed.
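
A minimal hash ring shows why few keys move when a node joins. The sketch assumes SHA-256 for placing points on the ring and 100 virtual nodes per server; both are illustrative choices, and the class name and node names are hypothetical.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes          # virtual points per node, to even out load
        self._points = []             # sorted hash positions on the ring
        self._owner = {}              # hash position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key):
        # Walk clockwise to the first point at or after the key's hash.
        index = bisect.bisect_left(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[index]]
```

When a node is added, the only keys that change owner are those that now map to the new node; every other key stays where it was.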

A Bloom filter is a probabilistic data structure that is used to test whether an element is a member of a set. It is space-efficient and supports fast membership tests, but it has a certain probability of false positives: it may occasionally indicate that an element is a member of the set when it is not. It consists of a bit array and several hash functions. Adding an element sets the bits at the positions given by each hash function to 1. To test an element, the same positions are checked. If all of the positions are set to 1, the element is considered to be a member of the set (possibly a false positive). However, if any of the positions is set to 0, the element is definitely not a member of the set.
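
A small Bloom filter can be written in a few lines. The sketch below assumes a bit array of 1024 positions and three hash functions derived by salting SHA-256; these sizes are arbitrary, and real deployments choose them from the expected number of elements and the acceptable false-positive rate.

```python
import hashlib


class BloomFilter:
    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [0] * size

    def _positions(self, item: str):
        # Derive num_hashes positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(item))
```

An element that was added always tests positive, so the filter never produces false negatives; items cannot be removed without a counting variant.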

Question 46

How would you design a system for a robot to learn the layout of a room and traverse it?

Accepted Answer

Sensor Selection: The first step would be to select the appropriate sensors for the task. Depending on the requirements of the system, the robot could use a combination of sensors such as LIDAR, ultrasound, stereo vision, or infrared sensors to gather data about its surroundings.
Data Collection and Processing: The robot would then use its sensors to collect data about the layout of the room and its surroundings. This data would be processed by the robot's onboard computer to create a map of the room and identify the location of obstacles and landmarks.
Path Planning: The robot would then use algorithms to plan a path through the room, avoiding obstacles and navigating to its desired destination. This could involve the use of techniques such as A* search or gradient descent.
Motion Control: Once a path has been planned, the robot would need to use its motors and control systems to execute the planned path and navigate through the room. This could involve the use of feedback control loops and PID controllers to ensure that the robot follows the planned path accurately.
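
The path-planning step can be illustrated with A* search on an occupancy grid. The sketch assumes the map has already been reduced to a grid of 0 (free) and 1 (obstacle) cells and that the robot moves in four directions with unit cost; both are simplifications of what a real robot would need.

```python
import heapq


def a_star(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) and 1 (obstacle) cells."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(heuristic(start), 0, start)]   # (estimated total, cost so far, cell)
    came_from = {start: None}
    cost = {start: 0}
    while frontier:
        _, g, current = heapq.heappop(frontier)
        if current == goal:
            path = []
            while current is not None:          # walk back to the start
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if g > cost[current]:
            continue                            # stale queue entry
        r, c = current
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = g + 1
                if new_cost < cost.get((nr, nc), float("inf")):
                    cost[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = current
                    heapq.heappush(
                        frontier, (new_cost + heuristic((nr, nc)), new_cost, (nr, nc))
                    )
    return None  # goal unreachable
```

The function returns the list of cells from start to goal, or None when obstacles block every route. The motion-control layer would then turn consecutive cells into wheel commands.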

Question 47

What will be the steps to design a URL shortening service like bit.ly or TinyURL?

Accepted Answer

The steps to design any URL shortening service are as follows:

Requirements: Requirements fall into three categories:
- Functional Requirements: The shortened URL must be unique. Users must be redirected to the original URL once they hit the short URL. The URL link should expire after a certain time period.
- Non-Functional Requirements: The system ought to have high availability with minimum latency. This is necessary because all URL redirections will start to fail if our service is unavailable.
- Additional Requirements: The misuse of services must be avoided, and other APIs must be able to use the shortened URL.
Constraints and Estimations: To keep the service scalable, the following must be considered:
- Writes per second: Assuming 100 million new URLs per day, 100 million / 24 / 3600 ≈ 1160.
- Reads per second: Assuming a read-to-write ratio of 10:1, 1160 * 10 = 11600.
- Storage: Assuming we store each link for 10 years and expect 100 million new URLs per day, then
  100 million * 365 * 10 years = 365 billion records
  Further, assuming each record stored is approx. 300 bytes, then
  365 billion * 300 bytes = 109.5 TB
Database: A key-value store, such as Redis or a document database like MongoDB, is used to store the mapping between short URLs and long URLs. The database should be optimized for high read and write performance, as it will be the primary source of truth for the service.
API: Two primary APIs are required: a POST endpoint that generates a new short URL and a GET endpoint that redirects a short URL to the original URL.
The parameters of the POST request are the API key, the original URL, and an optional expiration time.
URL Shortening logic: Encode a unique identifier for the original URL (for example, a numeric ID) using Base62, whose alphabet consists of upper case A-Z, lower case a-z, and digits 0-9. A code of N characters can represent 62^N URLs.
Another way is to hash the original URL with the MD5 message-digest algorithm, which generates a 128-bit hash value, and to use part of that hash as the short code.
Caching: To improve performance, the service can use a caching layer, such as Memcached, to store frequently accessed URL mappings in memory. This can help reduce the load on the database and speed up the retrieval of long URLs.
Load Balancing: The service should be designed to handle a high volume of traffic and be scalable to accommodate future growth. This can be achieved by using a load balancer, such as HAProxy or NGINX, to distribute incoming requests across multiple instances of the API.
Monitoring and Logging: The service should be monitored for performance and availability, and logs should be generated and stored for debugging purposes. This can be achieved using tools such as New Relic, Datadog, or ELK (Elasticsearch, Logstash, and Kibana).
Security: Security should be a top priority when designing the URL shortening service. This might involve implementing measures such as SSL/TLS encryption, rate limiting, and input validation to prevent attacks such as SQL injection, XSS, and DDoS.
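
The Base62 step in the shortening logic can be sketched as below. It assumes each long URL has already been assigned a unique non-negative integer ID, for example by a database sequence; the function names and the ordering of the alphabet are illustrative choices.

```python
import string

# 62 symbols: digits, then lower case, then upper case.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_encode(number: int) -> str:
    """Encode a non-negative integer ID as a Base62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def base62_decode(code: str) -> int:
    """Invert base62_encode to recover the numeric ID."""
    number = 0
    for ch in code:
        number = number * 62 + ALPHABET.index(ch)
    return number
```

With this scheme a 7-character code covers 62^7, or roughly 3.5 trillion, distinct IDs. Sequential IDs produce guessable codes, so a service that cares about that might shuffle or offset the IDs before encoding.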

A common System design interview question, don't miss this one.

Question 48

Can you elaborate on your design approach for a chat service like WhatsApp?

Accepted Answer

One of the most frequently posed System design interview questions, be ready for it.

Start by establishing a design scope by determining the major features to be included as below:

One-on-one and Group chat
Offline/Online/Typing Status of Users
Sharing of media files

Next, break down the application into components as below:

Client: Here client refers to the mobile app or a web application. Clients do not directly interact with each other. Instead, they connect to a chat server.
Server: It houses all the programs, databases, and frameworks required for the chat application to function. It will receive the message, identify the right recipient, queue the message, and direct the message to the identified recipient.
WebSocket Server: A WebSocket is a persistent connection between the client and the server that offers a two-way communication channel. This means that the server can send data to the client without the client making a request first. WebSockets are well suited to real-time communication.
Storage of Messages/Media: Use a NoSQL database for the messages and a dependable relational database for generic data like profile settings. NoSQL databases like Cassandra are well suited to storing messages because they enable easier horizontal scaling and low-latency data access.
Polling: A technique in which the client periodically asks the server whether there are any new messages. Polling can be expensive, depending on how frequently it is done.
To combat this, we can use long polling. With long polling, the server keeps the connection open until either new messages are available or a timeout threshold is reached. As soon as the client receives a response, it sends a new request to the server and the process restarts.
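
The server side of long polling can be approximated in a single process with a blocking queue. This is only a sketch of the waiting behavior: the Mailbox class is a hypothetical per-user inbox, and the HTTP layer that would hold the client's connection open is omitted.

```python
import queue


class Mailbox:
    """Per-user inbox that a long-polling request can block on."""

    def __init__(self):
        self._messages = queue.Queue()

    def deliver(self, message):
        self._messages.put(message)

    def long_poll(self, timeout_seconds):
        """Wait until a message arrives or the timeout elapses."""
        try:
            first = self._messages.get(timeout=timeout_seconds)
        except queue.Empty:
            return []                       # timed out: the client polls again
        batch = [first]
        while True:                         # drain anything else already queued
            try:
                batch.append(self._messages.get_nowait())
            except queue.Empty:
                return batch
```

An empty list signals a timeout, after which the client would immediately issue the next long_poll call.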

Common frontend (client) languages are JavaScript, Java, and Swift, and common backend (server) languages and runtimes are Erlang, Node.js, PHP, Scala, Java, and many more.

Choosing the messaging protocol is equally vital. Options include the Extensible Messaging and Presence Protocol (XMPP), which is used by WhatsApp, and Message Queue Telemetry Transport (MQTT), a lightweight publish-subscribe protocol.

Question 49

How will you approach designing a Web Crawler?

Accepted Answer

A web crawler is a tool used by search engines such as Google and DuckDuckGo to discover and index web content so that it can be returned in search results.

What are some of the requirements?

Create and implement a scalable service that retrieves millions of online documents and gathers data from the whole web.
Every search query requires fresh data to be retrieved.

What are some typical issues that arise?

How should updates be handled when people type quickly?
How should dynamically changing web pages be prioritized?

Some pointers:

Consider using URL Frontier Architecture to put this system into place.
Learn the differences between scraping and crawling.
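
The core of a URL frontier is a queue of URLs to visit plus a record of URLs already seen. The sketch below assumes a caller-supplied fetch_links function that returns the outgoing links of a page; that function, and the tiny in-memory "web" used to exercise it, are hypothetical stand-ins for real HTTP fetching and HTML parsing.

```python
from collections import deque


def crawl(seed_urls, fetch_links, max_pages=100):
    """Breadth-first crawl using a FIFO frontier and a seen set."""
    frontier = deque(seed_urls)
    seen = set(seed_urls)
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in fetch_links(url):       # supplied by the caller
            if link not in seen:            # never enqueue the same URL twice
                seen.add(link)
                frontier.append(link)
    return visited


# A toy link graph standing in for the web.
web = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": []}
order = crawl(["a"], lambda url: web.get(url, []))
```

A production frontier would replace the single FIFO queue with prioritized, per-host queues so that important pages are crawled first and no site is overloaded.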

Question 50

How will you design a message board platform like Quora?

Accepted Answer

A staple in System design interview questions, be prepared to answer this one.

To design a message board platform like Quora, you can follow these steps:

Define requirements: Start by defining the key features of your platform. Some of the core features of a message board platform like Quora include the ability to ask and answer questions, the ability to vote on questions and answers, the ability to comment on questions and answers, the ability to follow users, and the ability to search for questions and answers.
Architecture: Architecture refers to the high-level structure and organization of a software system. For a message board platform like Quora, the architecture would involve the following components and services:
- Front-end: The front-end would be responsible for presenting the platform to users and handling user interactions. It would likely be built using a JavaScript framework such as React or Angular.
- API: The API would serve as the interface between the front-end and back-end, allowing the front-end to request data from the back-end and the back-end to receive and process requests from the front-end.
- Back-end: The back-end would be responsible for storing and managing data, processing user requests, and returning the results to the front-end. It would likely consist of a set of microservices, each responsible for a specific task, such as storing questions and answers, managing votes, or handling comments.
- Database: The database would store all the data for the platform, such as questions, answers, votes, comments, and user information.
Data Modeling: you would need to model the following data entities:
- Questions: The questions asked by users on the platform.
- Answers: The answers provided by users to questions.
- Users: The users who ask questions and provide answers.
- Votes: The votes cast by users on questions and answers.
- Comments: The comments made by users on questions and answers.
- Followers: The relationships between users, allowing users to follow each other.
API Design: The API design should consider the following:
The API should provide a set of endpoints for the front-end to access, such as endpoints for creating questions, answering questions, and voting on questions and answers.

The API should define the format of the requests and responses between the front-end and back-end. This may involve using a standardized format such as JSON.

The API must include mechanisms for authenticating users, such as OAuth, to ensure that only authorized users can access the platform's data.

The API should include error handling to ensure that the front-end can handle error responses from the back-end in a predictable and user-friendly manner.

Front-end Development: Develop the front-end of your platform using a JavaScript framework such as React or Angular. The front-end should provide a user-friendly interface for users to ask questions, answer questions, vote on questions and answers, comment on questions and answers, follow users, and search for questions and answers.
Security: Some of the security measures that should be considered for a message board platform include:
Authentication: Implementing robust user authentication mechanisms, such as OAuth, to ensure that only authorized users can access the platform's data.
Authorization: Implementing role-based access control mechanisms to ensure that users can only perform actions that they are authorized to perform.
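Encryption: Encrypting sensitive data in transit and at rest to protect it from unauthorized access. Passwords should be stored as salted hashes (for example, using bcrypt or Argon2) rather than in a reversible encrypted form.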
Input validation: Validating user input to prevent malicious attacks, such as SQL injection or cross-site scripting (XSS) attacks.
Scalability: Make sure your platform is scalable to accommodate a growing user base. Consider using cloud-based infrastructure, such as AWS or Google Cloud, to easily scale your platform as needed.
Deployment: Deploy your platform to a production environment, making sure to monitor the platform for performance and stability.

This is a high-level overview of the steps involved in designing a message board platform like Quora. In practice, the design process can be much more complex and will likely involve many more details and considerations.
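
The data entities listed above could be sketched as plain Python dataclasses. This is a minimal illustration rather than Quora's actual schema: every class name, field name, and the `score` helper are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _now() -> datetime:
    return datetime.now(timezone.utc)


# All entity and field names below are hypothetical.
@dataclass
class User:
    user_id: int
    name: str


@dataclass
class Question:
    question_id: int
    author_id: int
    title: str
    created_at: datetime = field(default_factory=_now)


@dataclass
class Answer:
    answer_id: int
    question_id: int
    author_id: int
    body: str
    created_at: datetime = field(default_factory=_now)


@dataclass
class Vote:
    voter_id: int
    target_type: str  # "question" or "answer"
    target_id: int
    value: int        # +1 for an upvote, -1 for a downvote


def score(votes: list[Vote], target_type: str, target_id: int) -> int:
    """Net score of one question or answer: upvotes minus downvotes."""
    return sum(
        v.value
        for v in votes
        if v.target_type == target_type and v.target_id == target_id
    )
```

Keeping votes as separate records, instead of a counter on the question or answer, makes it possible to enforce one vote per user and to recompute scores later.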

Question 51

How will you design social media platforms like Twitter?

Accepted Answer

To design a social media platform like Twitter, you would need to consider the following components:

Front-end: The front-end would be responsible for presenting the platform to users and handling user interactions. It would likely be built using a JavaScript framework such as React or Angular.
API: The API would serve as the interface between the front-end and back-end, allowing the front-end to request data from the back-end and the back-end to receive and process requests from the front-end.
Back-end: The back-end would be responsible for storing and managing data, processing user requests, and returning the results to the front-end. It would likely consist of a set of microservices, each responsible for a specific task, such as storing tweets, managing user accounts, and handling real-time updates.
Database: The database would store all the data for the platform, such as tweets, user information, and relationships between users.
Real-time Updates: The platform would need to support real-time updates to provide a responsive and dynamic experience for users. This could be achieved using a technology such as WebSockets or Server-Sent Events.
Search: The platform would need to include a search functionality to allow users to search for tweets and users.
Notification: The platform would need to include a notification system to alert users of new mentions, replies, and other events.
Analytics: The platform would need to include analytics functionality to provide insights into user behavior and engagement.

In terms of data modeling, you would need to model the following data entities:

Tweets: The tweets posted by users.
Users: The users who post tweets and follow other users.
Relationships: The relationships between users, allowing users to follow each other.
Mentions: The mentions of other users in tweets.
Hashtags: The hashtags used in tweets.

Real-time updates are an important feature of many social media platforms, including Twitter, as they let users receive updates in near real-time as events occur. To implement real-time updates, you would need to consider the following components:

WebSockets or Server-Sent Events: To provide real-time updates, you would need a technology that lets the server push data to the client without waiting for the client to poll. Two popular options are WebSockets, which provide bi-directional communication, and Server-Sent Events, which stream data one way from the server to the client.
Notification Service: The notification service would handle the real-time updates and notify the front-end whenever new events occur, such as a new tweet or a new mention. This service would likely be built as a microservice and would use WebSockets or Server-Sent Events to communicate with the front-end.
Event Storage: The event storage would store events such as tweets and mentions, allowing the notification service to retrieve the latest events and notify the front-end. This could be implemented using a database or a distributed event store.
Event Processing: The event processing component would handle the creation of new events, such as when a user posts a tweet or receives a mention. It would store the new events in the event storage and notify the notification service to update the front-end.

The search functionality is a key component of many social media platforms, including Twitter, as it allows users to find and discover content. To implement search in your platform, you would need to consider the following components:

Indexing Service: The indexing service would be responsible for indexing the data in your platform, such as tweets and user information so that it can be searched efficiently. The indexing service could use technology such as Elasticsearch or Apache Solr, which are popular open-source search engines.
Query Service: The query service would handle user search requests and return the results to the front-end. It would use the indexed data to perform the search and return the results in a format that can be easily consumed by the front-end.
Data Storage: The data storage component would store all the data in your platform, such as tweets and user information. The indexing service would use this data to create the indexed data that is used by the query service.

Some popular algorithms for search include:

TF-IDF (Term Frequency-Inverse Document Frequency)
PageRank
BM25 (Best Matching 25)
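
To make the first of these concrete, here is a minimal TF-IDF scoring sketch. It assumes whitespace tokenization and the plain `log(N / df)` form of inverse document frequency; production engines such as Elasticsearch use more refined variants. The function name `tf_idf_scores` is made up for this example.

```python
import math
from collections import Counter


def tf_idf_scores(query: str, docs: list[str]) -> list[float]:
    """Score each document against the query with a basic TF-IDF sum."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = 0.0
        for term in query.lower().split():
            if counts[term] == 0:
                continue  # term absent from this document
            tf = counts[term] / len(tokens)
            idf = math.log(n_docs / df[term])
            total += tf * idf
        scores.append(total)
    return scores
```

A term that appears in every document gets an IDF of zero, so common words contribute nothing to the ranking, while rarer terms dominate it.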

Question 52

What steps will you follow to design a riding app like Ola or Uber?

Accepted Answer

The biggest challenge in this design is balancing demand and supply, so we would need two services: one for supply (of cabs) and one for demand (of riders). We will use Uber as the example to understand the design and architecture.

The supply service tracks the taxis using their latitude and longitude (geolocation). Every active taxi sends a location update, say, once every five seconds, which passes through a web application firewall and a load balancer before reaching the service.
The demand service receives ride requests over WebSockets. It begins tracking the rider's GPS location as soon as the request is made. In addition to the rider's location, the request carries other details, such as the kind of car and the number of seats.
Uber's architecture includes a dispatch system (Dispatch optimization/DISCO) to balance supply and demand. Riders and drivers are located by DISCO using a more precise Google S2 Library that divides the location's map into tiny cells rather than just using latitude and longitude. You can have 1km square cells across the map, based on your needs. It is easier to store cell data in the distributed system and access it using the ID because each of these cells is given a distinct ID. Consistent hashing can be used to store cell data.
Uber uses its four sets of map regions to determine the resources and map quality of a region. The grades for these sub-regions are A(Urban), B(Rural and SubUrban), AB(Union of Urban and Rural), and C(Highways).
To calculate ETA, the shortest route on a map with road networks can be discovered using Dijkstra's algorithm, which is the most straightforward method. More sophisticated AI algorithms may also be used to produce the most accurate time estimates because the shortest path (in terms of distance) isn't always the quickest path (heavy traffic may affect the arrival time).
Initially, Uber stored profile-related data and other items in an RDBMS PostgreSQL database. They had to change to a more scalable option though as more drivers and riders joined the app and the service spread to more cities. Uber currently uses a MySQL-based, schema-less NoSQL database.
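
The ETA step can be illustrated with a small Dijkstra sketch. To reflect the point that the shortest route is not always the quickest, the edge weights here are travel times in minutes instead of distances. The graph format and the name `fastest_eta` are assumptions for this example, not Uber's implementation.

```python
import heapq


def fastest_eta(graph: dict[str, list[tuple[str, float]]],
                source: str, target: str) -> float:
    """Return the minimum total travel time from source to target.

    graph maps a node to (neighbor, travel_time_minutes) pairs.
    Returns float("inf") if the target is unreachable.
    """
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        elapsed, node = heapq.heappop(queue)
        if node == target:
            return elapsed
        if elapsed > best.get(node, float("inf")):
            continue  # stale queue entry, a faster path was already found
        for neighbor, minutes in graph.get(node, []):
            candidate = elapsed + minutes
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return float("inf")
```

For example, with `roads = {"A": [("B", 4.0), ("C", 10.0)], "B": [("C", 3.0)], "C": []}`, the direct road from A to C takes 10 minutes, but going through B takes 7, so `fastest_eta(roads, "A", "C")` picks the two-hop route.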

Question 53

Can you explain how to design an API Rate Limiter for GitHub?

Accepted Answer

An API rate limiter is a system that is designed to limit the rate at which an API can be accessed by clients. This can be useful in situations where it is necessary to prevent excessive use of the API, such as to protect against denial of service attacks or to ensure fair usage by all clients.

There are several factors to consider when designing an API rate limiter:

Limits: The first step in designing an API rate limiter is to determine the appropriate limits for the API. This may involve considering factors such as the expected volume of traffic, the needs of the clients, and the resources of the server. The limits may be set on a per-user or per-application basis, or they may be based on other criteria such as IP address or location.
Algorithm: Multiple algorithms can be used to implement an API rate limiter, such as a fixed window algorithm or a sliding window algorithm. It is important to choose an algorithm that is appropriate for the needs of the API and that can scale to handle a large number of requests.
Storage: The API rate limiter will need to store information about the usage of the API, such as the number of requests made and the time of the last request. It is important to choose a storage solution that is efficient and scalable.
Implementation: The API rate limiter will need to be implemented as part of the API server, and it will need to be integrated with the authentication and authorization mechanisms of the API. It is important to ensure that the rate limiter is reliable and performs well under different workloads.
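
As an illustration of the sliding window approach, the sketch below keeps a log of recent request timestamps per client in memory. This is a simplification: a limiter shared across several API servers would usually keep this state in a shared store. The class name, the per-client key, and passing `now` explicitly (which keeps the example deterministic) are all assumptions.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client in any `window_seconds` span."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self._hits = defaultdict(deque)  # client id -> request timestamps

    def allow(self, client_id: str, now: float) -> bool:
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit, reject the request
        hits.append(now)
        return True
```

Compared with a fixed window, this avoids the burst at window boundaries, at the cost of storing one timestamp per accepted request.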

Question 54

What is Search Typeahead? How will you design it?

Accepted Answer

Search typeahead is a feature, commonly found in the search boxes of search engines and other search interfaces, that suggests search terms or results as the user types a query. Its goal is to provide relevant and accurate suggestions in real time so that users find what they are looking for more quickly and easily.

To design a search typeahead, we would typically follow these steps:

Determine the Data Source: The first step in designing a search typeahead is to determine where the suggestions will come from. This could be a database of keywords, a list of products, or a combination of both.
Implement the Suggestion Engine: The suggestion engine is responsible for returning the relevant suggestions based on the user's input. This could be done using a trie data structure, a simple search algorithm, or a more complex machine learning model.
Implement User Interface: The user interface should display the suggestions to the user in a clear and concise manner. The suggestions should be easily distinguishable and the selected suggestion should be highlighted.
Optimize for Performance: Performance is critical for a search typeahead, as it needs to provide suggestions in real-time as the user types. We should consider using caching, pre-computing results, and minimizing the number of database queries to optimize performance.
Consider Personalization: To provide a better user experience, we may want to consider personalizing the suggestions based on the user's previous search history or other relevant data.
Implement Analytics: To improve search typeahead over time, it is important to gather data on how it is being used and make changes as necessary. We can implement analytics to track usage patterns, user behavior, and the success rate of the suggestions.

There are several algorithms and data structures that can be used to implement the suggestion engine, including:

Trie Data Structure: A trie (also known as a prefix tree) is a tree-like data structure that is optimized for searching prefixes.
Simple Search Algorithms: Simple search algorithms such as linear search or binary search can be used to search through a list of words to find relevant suggestions.
Machine Learning Models: Machine learning models, such as a neural network or a decision tree, can be used to predict the most relevant suggestions based on the user's input.
Hybrid Approach: A hybrid approach can be used to combine the strengths of multiple algorithms. For example, you could use a trie to quickly find suggestions that start with the user's input, and then use a machine learning model to rank the suggestions based on relevance.
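
The trie option can be sketched as follows. This is a minimal version that returns completions in alphabetical order; a real typeahead would typically rank them by popularity or relevance instead. The class and method names are assumptions for the example.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.is_word = False  # True if a stored term ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def suggest(self, prefix: str, limit: int = 5) -> list[str]:
        """Return up to `limit` stored terms that start with `prefix`."""
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        results = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            # Push children in reverse order so they pop alphabetically.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], text + ch))
        return results
```

Walking to the prefix node costs time proportional to the prefix length, regardless of how many terms are stored, which is what makes the structure attractive for per-keystroke lookups.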

The choice of algorithm for the suggestion engine will depend on the specific requirements of the platform, including the size and structure of the data, the desired performance, and the complexity of the search problem. We may need to evaluate multiple algorithms and run performance tests to determine the best choice for a specific use case.

A well-designed search typeahead can greatly improve the user experience and make it easier for users to find what they are looking for. It is important to consider the specific requirements of the platform and evaluate different options to determine the best design for our specific use case.

Question 55

Design Tic-Tac-Toe game.

Accepted Answer

Create a game server that listens for incoming connections from clients.
When a client connects to the server, the server creates a new game instance and adds the client to the game as a player.
The server maintains a list of active games and routes messages from clients to the appropriate game instance.
The game instance handles the game logic, including updating the game board, checking for wins or draws, and sending updates to the clients.
The client UI displays the game board and allows the player to make their moves. The client sends move requests to the server, which are then passed on to the game instance. The client also receives updates from the server and updates the UI accordingly.
The server and clients communicate using a network protocol, such as HTTP or WebSockets.
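
The game instance described above could look like the following sketch, which covers only the game logic (board updates, turn order, win and draw checks) and leaves networking out. The class name, the 0-8 cell numbering, and the status strings are assumptions for the example.

```python
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


class TicTacToe:
    def __init__(self):
        self.board = [" "] * 9  # cells numbered 0-8, left to right, top to bottom
        self.current = "X"      # X moves first

    def winner(self):
        for a, b, c in WIN_LINES:
            if self.board[a] != " " and self.board[a] == self.board[b] == self.board[c]:
                return self.board[a]
        return None

    def status(self) -> str:
        won = self.winner()
        if won:
            return f"{won} wins"
        return "draw" if " " not in self.board else "in progress"

    def move(self, player: str, cell: int) -> str:
        """Apply a move and return the new game status."""
        if self.status() != "in progress":
            raise ValueError("game is over")
        if player != self.current:
            raise ValueError("not this player's turn")
        if not 0 <= cell < 9 or self.board[cell] != " ":
            raise ValueError("illegal cell")
        self.board[cell] = player
        self.current = "O" if player == "X" else "X"
        return self.status()
```

Validating every move on the server side, as `move` does here, means a misbehaving client cannot play out of turn or overwrite a cell.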

This is just one possible design for a tic-tac-toe game, and there are many other ways to implement it. For example, instead of the client-server architecture described above, the game could be implemented as a standalone application. Additional features could also be added, such as support for multiple players, leaderboards, or different game modes.

Question 56

Explain your approach to designing a parking lot system.

Accepted Answer

The parking lot system has a server that manages the parking spaces and handles requests from clients.
Clients can be drivers looking for a parking space, or staff managing the parking lot. Clients connect to the server through a user interface, such as a website or a mobile app.
The server maintains a database of parking spaces, vehicles, and parking tickets. It also stores information about parking rates and policies.
When a driver arrives at the parking lot and wants to park their vehicle, they use the client UI to request a parking space. The server responds by reserving an available space and issuing a parking ticket to the driver.
When the driver is ready to leave, they use the client UI to check out. The server looks up the parking ticket and calculates the parking fee based on the ticket's timestamp and the current rate. The driver can then pay the fee through the client UI, after which the server marks the space as available again.
Staff can use the client UI to view the status of the parking lot (such as the number of available spaces), and to perform tasks such as issuing tickets for parked vehicles that violate the parking rules.
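
The reserve-and-pay flow can be sketched in a few lines. The pricing rule used here (charge per started hour, with a one-hour minimum) is an assumption chosen for illustration, as are the names `parking_fee` and `ParkingLot`.

```python
import math
from datetime import datetime


def parking_fee(entry: datetime, exit_time: datetime, hourly_rate: float) -> float:
    """Charge per started hour, with a one-hour minimum (assumed policy)."""
    if exit_time < entry:
        raise ValueError("exit time is before entry time")
    seconds = (exit_time - entry).total_seconds()
    hours = max(1, math.ceil(seconds / 3600))
    return hours * hourly_rate


class ParkingLot:
    def __init__(self, capacity: int):
        self.free = list(range(1, capacity + 1))  # available space numbers
        self.tickets = {}                         # plate -> (space, entry time)

    def park(self, plate: str, now: datetime) -> int:
        """Reserve the first free space and issue a ticket."""
        if not self.free:
            raise RuntimeError("lot is full")
        space = self.free.pop(0)
        self.tickets[plate] = (space, now)
        return space

    def leave(self, plate: str, now: datetime, hourly_rate: float) -> float:
        """Close the ticket, free the space, and return the fee due."""
        space, entry = self.tickets.pop(plate)
        self.free.append(space)
        return parking_fee(entry, now, hourly_rate)
```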

Question 57

Can you tell me a possible system design for ATM?

Accepted Answer

The ATM is connected to a bank's server through a network connection. The server stores information about customer accounts, as well as the transactions that have been made using the ATM.
The ATM has a processor that runs the software and performs the operations required to fulfill requests from the user. It also communicates with the server to perform operations on the user's account.
The ATM has a user interface that consists of a display screen, a keypad for entering commands and data, and a card reader for reading bank cards.
When a user inserts their bank card into the ATM, the card reader reads the card and sends the card's information to the processor. The processor uses this information to identify the user's account and retrieves the account information from the server.
The user can then enter their PIN (personal identification number) using the keypad to authenticate themselves. If the PIN is correct, the processor allows the user to access their account.
The user can then use the ATM to perform various operations, such as checking their account balance, withdrawing cash, or transferring money to another account. The processor communicates with the server to perform these operations and updates the user's account information as necessary.
The ATM has a printer for printing receipts for transactions, and a cash dispenser for dispensing cash to the user.

Question 58

How would you design a system for managing and scheduling flights for an airline?

Accepted Answer

Start by gathering data on the available aircraft and their capabilities (such as capacity, range, and fuel efficiency), as well as data on the routes and destinations that the airline serves. Gather data on the passengers, including their personal and contact information, ticket information, and any special requests or needs (such as wheelchair assistance or special meals).
Next, design a system for storing and managing the aircraft, route, and passenger data. This could involve using a database or a distributed storage system to store the data, as well as implementing processes for adding, updating, and deleting data as needed.
Design a system for scheduling flights that takes into account the availability of aircraft, the demand for different routes, and any constraints or rules (such as the need for certain types of aircraft or the requirement for a certain number of flights per week). This could involve using a scheduling algorithm that optimizes the use of aircraft and minimizes costs, or it could involve implementing a calendar-based system where flights can be manually scheduled by the staff.
To enable passengers to book flights online, design a system for allowing passengers to browse the available flights and select a flight that works for them. This could involve integrating the scheduling system with a website or a mobile app that allows passengers to view the available flights and select one to book.
Furthermore, design a system for managing and tracking the flights, including the ability to communicate with passengers and provide updates on the status of their flight, as well as the ability to track and report on the performance of the airline. Also design a system for handling canceled or delayed flights, including the ability to rebook passengers on alternative flights or provide compensation as needed.

Question 59

How will you design a recommendation system?

Accepted Answer

There are many different approaches to designing a recommendation system, and the specific design will depend on the goals of the system and the type of data it has available. Below are some general steps that I will follow to design a recommendation system:

Define the Problem: Identify the goals of the recommendation system clearly and the type of recommendations it should make (such as products, movies, or articles).
Collect Data: Gather data on the items that the recommendation system will recommend, as well as data on the users who will receive the recommendations. This data may include explicit ratings or preferences (such as "likes" or "favorites"), or it may be derived from user behavior (such as clickstream data or purchase history).
Preprocess the Data: Clean and transform the data to make it suitable for use by the recommendation system. This may involve removing duplicates or outliers, handling missing values, or aggregating data from multiple sources.
Choose a Recommendation Algorithm: Select a recommendation algorithm that is appropriate for the type of data and the goals of the recommendation system. Common algorithms include collaborative filtering, content-based filtering, and matrix factorization.
Train and Evaluate the Recommendation System: Use the data and the chosen algorithm to train the recommendation system. Evaluate the performance of the system using metrics such as accuracy, precision, or recall.
Deploy the Recommendation System: Integrate the recommendation system into the application or platform where it will be used, and test it to ensure it is working as expected.
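
As one concrete instance of collaborative filtering, the sketch below implements a simple user-based variant: it finds users with similar ratings (by cosine similarity) and recommends items they rated that the target user has not seen. The data layout and function names are assumptions, and real systems work with far larger, sparser data.

```python
import math


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two users' rating vectors."""
    shared = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(ratings: dict[str, dict[str, float]],
              user: str, top_n: int = 3) -> list[str]:
    """Rank unseen items by the similarity-weighted ratings of other users."""
    seen = ratings[user]
    scores: dict[str, float] = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(seen, other_ratings)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, rating in other_ratings.items():
            if item not in seen:
                scores[item] = scores.get(item, 0.0) + sim * rating
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]
```

A user with no overlap with anyone else gets an empty list, which is the cold-start problem; a content-based fallback is a common way to handle that case.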

Question 60

Explain the system design for a video streaming service like YouTube/Netflix.

Accepted Answer

The service has a server that stores video content and handles requests from clients.
Clients can be users accessing the service through a web browser or a mobile app, or devices such as smart TVs or streaming media players.
The server maintains a database of users, their subscriptions and payment information, and their watch history. It also stores information about the video content, including titles, descriptions, tags, and ratings.
When a user logs in to the service, the server retrieves their account information and sends it to the client. The client displays the user's personalized home screen, which shows recommendations based on the user's watch history and preferences.
The user can search for and browse the available content, or they can subscribe to channels to receive updates when new videos are posted. The client sends requests to the server to retrieve the requested content.
The server streams the selected video to the client, and the client plays the video on the user's device. The client records the user's progress through the video and sends this information back to the server. The server updates the user's watch history and uses it to make recommendations to the user in the future.
The service has a payment system for handling subscriptions and any additional charges (such as rental fees for individual movies). The server communicates with the payment system to process transactions and update the user's account balance.

Question 61

Can you tell me the steps to design a traffic control system?

Accepted Answer

The traffic control system has a server that manages the traffic signals and handles requests from clients.
Clients can be traffic control operators or traffic management systems that need to communicate with traffic signals. Clients connect to the server through a user interface, such as a website or a mobile app.
The server maintains a database of traffic signals and their current status (such as green, yellow, or red). It also stores information about the roads and intersections that the traffic signals control, as well as any special rules or conditions (such as construction or events).
The server receives input from sensors and cameras at the intersections, which provide real-time data on the traffic flow and conditions. The server uses this data to adjust the timing and patterns of the traffic signals as needed to optimize the flow of traffic.
Traffic control operators can use the client UI to manually override the traffic signals or set special rules (such as turning all the signals red during an emergency). The server updates the traffic signals as requested and sends a confirmation to the operator.
Traffic management systems (such as those used by public transportation) can use the client UI to request priority for their vehicles at intersections. The server adjusts the traffic signals to give priority to the requested vehicles.

Question 62

How would you design a real-time bidding system for online advertising?

Accepted Answer

Start by defining the goals of the system and the types of ads it should support (such as display ads, video ads, or native ads). Gather data on the available ad inventory, including information on the ad formats, sizes, and targeting options. Also gather data on the advertisers who will be bidding for ad space, including their budgets, targeting preferences, and bid strategies.
Design a system for storing and managing the ad inventory and advertiser data. This could involve using a database or a distributed storage system to store the data, as well as implementing processes for adding, updating, and deleting data as needed.
Design a system for matching ads to available inventory. This could involve using algorithms to match ads to inventory based on factors such as targeting, ad format, and budget, or it could involve implementing a marketplace where advertisers can bid on specific ad slots.
To enable real-time bidding, design a system for processing bids and selecting the winning bid in real time. This typically involves an auction. In a first-price auction, the highest bidder wins and pays the amount they bid; in a second-price auction, the highest bidder wins but pays the amount of the second-highest bid, which encourages advertisers to bid close to what the ad slot is really worth to them.
Lastly, design a system for tracking and reporting on the performance of the ads and the effectiveness of the bidding system.
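
Winner selection can be sketched as below, assuming a sealed-bid auction over a single ad slot. In the second-price variant the winner pays the runner-up's bid (or the floor price if there is no runner-up); in the first-price variant the winner pays their own bid. The function name, the floor price parameter, and the alphabetical tie-break are assumptions for the example.

```python
def run_auction(bids: dict[str, float], floor_price: float = 0.0,
                second_price: bool = True):
    """Return (winner, clearing_price), or None if no bid meets the floor."""
    eligible = sorted(
        ((amount, bidder) for bidder, amount in bids.items()
         if amount >= floor_price),
        key=lambda pair: (-pair[0], pair[1]),  # highest bid first, then name
    )
    if not eligible:
        return None
    top_amount, winner = eligible[0]
    if not second_price:
        return winner, top_amount
    # Second-price: pay the next-highest eligible bid, or the floor.
    runner_up = eligible[1][0] if len(eligible) > 1 else floor_price
    return winner, runner_up
```

Because the whole decision is a sort over a handful of bids, it fits comfortably inside the tight latency budget that real-time bidding requires.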

Question 63

How would you design a system for detecting and preventing fraud in online transactions?

Accepted Answer

Start by gathering data on past fraud cases and identifying common patterns or characteristics of fraudulent transactions. This could involve analyzing data such as the types of products or services being purchased, the locations of the transactions, the payment methods being used, and the behavior of the users involved.
Design a system for monitoring and analyzing ongoing transactions in real time. This could involve implementing a machine learning model that is trained on past fraud data and can detect patterns or anomalies that may indicate fraudulent activity. The system could also incorporate additional data sources, such as data from third-party fraud detection services or information on the user's device or IP address.
To prevent fraudulent transactions, design a system for blocking or flagging transactions that the system has identified as potentially fraudulent. This could involve automatically rejecting the transaction or requiring additional authentication or verification from the user before proceeding with the transaction.
Design a system for reviewing and investigating flagged transactions to determine if they are fraudulent. This could involve assigning the flagged transactions to a team of fraud analysts who can review the transactions and make a decision on whether to approve or reject them.
In the end, design a system for tracking and reporting on the performance of the fraud detection system, including metrics such as the number of fraudulent transactions detected, the number of false positives, and the overall effectiveness of the system. Also, regularly update and improve the system based on new data.
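
The detect-then-flag flow can be illustrated with a deliberately simple stand-in for the machine learning model: a z-score check of a new transaction amount against the user's past amounts. The thresholds, the three outcomes, and the function name `assess` are assumptions; a real system would use many more signals than the amount alone.

```python
from statistics import mean, pstdev


def assess(amount: float, history: list[float],
           review_z: float = 3.0, block_z: float = 6.0) -> str:
    """Return "approve", "review", or "block" for a new transaction amount."""
    if len(history) < 2:
        return "review"  # too little history to judge (assumed policy)
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # All past amounts identical: anything different is unusual.
        return "approve" if amount == mu else "review"
    z = abs(amount - mu) / sigma
    if z >= block_z:
        return "block"   # far outside normal behavior, reject automatically
    if z >= review_z:
        return "review"  # suspicious, route to a fraud analyst
    return "approve"
```

The two thresholds correspond to the two responses described above: automatic rejection for extreme cases and manual review for borderline ones, which helps keep false positives from blocking legitimate customers outright.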

Question 64

How would you design a system for scheduling and managing appointments (e.g., at a doctor's office)?

Accepted Answer

Start by gathering data on the available resources (such as doctors, exam rooms, and equipment) and the types of appointments that can be scheduled (such as regular check-ups, specialist appointments, or surgeries). Also gather data on the patients, including their personal and contact information, insurance details, and medical history.
Design a system for storing and managing the resource and patient data. This could involve using a database or a distributed storage system to store the data, as well as implementing processes for adding, updating, and deleting data as needed.
Design a system for scheduling appointments that takes into account the availability of resources, the preferences of the patients and doctors, and any constraints or rules (such as the need for certain types of equipment or the requirement for advance notice). This could involve using a scheduling algorithm that optimizes the use of resources and minimizes conflicts, or it could involve implementing a calendar-based system where appointments can be manually scheduled by the staff.
To enable patients to schedule appointments online, design a system for allowing patients to browse the available appointments and select a time that works for them. This could involve integrating the scheduling system with a website or a mobile app that allows patients to view the available appointments and select one to book.
Finally, design a system for managing and tracking appointments, including reminders for patients, notifications for staff, and the ability to reschedule or cancel appointments as needed.
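
The core constraint of the scheduling step, that a doctor cannot have two overlapping appointments, can be sketched as follows. The in-memory storage, the class name `Schedule`, and treating appointments as half-open intervals are assumptions for the example.

```python
from datetime import datetime


class Schedule:
    def __init__(self):
        self._booked = {}  # doctor -> list of (start, end, patient)

    def book(self, doctor: str, patient: str,
             start: datetime, end: datetime) -> bool:
        """Book the slot if it does not overlap an existing appointment."""
        if end <= start:
            raise ValueError("end must be after start")
        slots = self._booked.setdefault(doctor, [])
        # Half-open intervals [start, end) overlap when each one
        # starts before the other ends.
        if any(start < b_end and b_start < end for b_start, b_end, _ in slots):
            return False
        slots.append((start, end, patient))
        return True

    def cancel(self, doctor: str, patient: str, start: datetime) -> bool:
        """Remove a matching appointment; return False if none exists."""
        slots = self._booked.get(doctor, [])
        for slot in slots:
            if slot[0] == start and slot[2] == patient:
                slots.remove(slot)
                return True
        return False
```

Using half-open intervals means back-to-back appointments (one ending at 9:30 and the next starting at 9:30) do not count as a conflict.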
