System Design interview questions are an integral part of assessing a candidate's ability to develop complex software systems. System design involves defining the architecture, components, interfaces, and data for a system to meet specific requirements, and successful candidates must be able to translate business requirements into technical solutions and communicate them effectively. This comprehensive blog on System Design interview questions and answers is designed to help candidates prepare for these challenging interviews, spanning fundamental questions for beginners through advanced questions for experienced professionals and covering essential topics like system architecture, design patterns, scalability, fault tolerance, and more. We have compiled a list of expert-selected System Design interview questions and answers to help you succeed in your interviews for various System Design positions. These questions with solutions are divided into 5 categories, viz. General, Freshers, Intermediate, Advanced/Expert, and Company-Based.
System design is the process of designing and defining the architecture, components, modules, interfaces, and data for a system to satisfy specified requirements. It involves decomposing the system into smaller subsystems, determining how the subsystems will work together, and defining the relationships between them.
System design is an iterative process that involves understanding the problem to be solved, identifying the requirements of the system, and designing a solution that meets those requirements. It is a critical step in the development of any system, as it lays the foundation for the subsequent implementation and testing phases.
Microservices is a software architecture pattern in which a system is divided into manageable, small, independent components that can be developed, deployed, and scaled independently. This can make it easier to update and modify the system, and it can also improve the scalability and reliability of the system by allowing components to be scaled or replaced independently. These modules can be created, used, and maintained separately.
Application Programming Interfaces (APIs) allow these services to communicate and coordinate with one another. An API defines a set of rules and protocols that govern how one service can access the functionality of another service. When a service needs to send a message to another service, it sends the message to a message queue. The receiving service then retrieves the message from the queue and processes it.
Overall, message queues act as the backbone of communication in a microservices architecture: each microservice is focused on a specific task or capability and communicates with other microservices through well-defined interfaces.
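As a minimal illustration, the sketch below uses Python's in-process queue.Queue to stand in for a real broker such as RabbitMQ or Kafka; the service names and message shape are assumptions for the example.

```python
import queue
import threading

# A minimal sketch of queue-based communication between two microservices.
# An in-process queue.Queue stands in for a real broker such as RabbitMQ;
# the service names and message shape are illustrative assumptions.

order_queue = queue.Queue()

def order_service():
    """Producer: publishes an event instead of calling the other service directly."""
    order_queue.put({"event": "order_created", "order_id": 42})

def billing_service():
    """Consumer: retrieves and processes messages at its own pace."""
    message = order_queue.get()
    print(f"billing: charging for order {message['order_id']}")
    order_queue.task_done()

threading.Thread(target=billing_service).start()
order_service()
order_queue.join()  # the producer never needed to know who consumed the event
```

The point of the pattern is visible in the last line: the producer only knows the queue, never the consumer, so either side can be redeployed or scaled independently.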
A must-know for anyone heading into the technical round, this is one of the most frequently asked Software design interview questions.
Documentation helps to communicate the design of the system to stakeholders and developers. Some common types of documentation that are used in system design include:
Among the most important application metrics for gauging system performance are:
This is one of the most frequently asked Software design questions.
The CDN edge servers are used to cache content that has been fetched from your origin server or storage cluster. Point of presence is another expression that is frequently connected to edge servers (POP). The physical location of the edge servers is referred to as a POP.
There may be several edge servers at that POP that are used for content caching.
The distance between a visitor and a web server can be reduced by delivering different portions of a website from different locations. CDN edge servers store a copy of the content being delivered, allowing them to serve it directly to users without retrieving it from the origin server each time; lowering latency this way is precisely the purpose of CDN edge servers.
The idea of efficiently spreading incoming traffic among a collection of diverse backend servers is known as load balancing. Server pools are groups of these servers. Today's websites are made to quickly and accurately respond to millions of customer requests while handling a high volume of traffic. More servers must be added to fulfill these requests.
In this case, it is crucial to appropriately disperse request traffic among the servers to prevent excessive load on any of them. A load balancer functions as a traffic cop, addressing the requests and distributing them among the available servers so that no one server is overloaded, which can impair the operation of the service.
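To make the traffic-cop idea concrete, here is a minimal round-robin sketch in Python; the server addresses are illustrative, and real load balancers add health checks and other routing policies.

```python
import itertools

# A minimal round-robin "traffic cop" sketch. The server list is illustrative;
# real load balancers (NGINX, HAProxy, ELB) also do health checks and failover.

class RoundRobinBalancer:
    def __init__(self, servers):
        self._pool = itertools.cycle(servers)

    def route(self, request):
        server = next(self._pool)
        return f"{request} -> {server}"

lb = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
for i in range(6):
    print(lb.route(f"request-{i}"))  # requests spread evenly across the pool
```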
The load balancer switches traffic to the remaining available servers when a server goes offline. Requests are automatically forwarded to a new server when one is added to the setup. Below are some advantages of load balancers:
By using database indexing, you may make it quicker and simpler to search through your tables and discover the desired rows or columns. A database table's columns can be used to generate indexes, which serve as the foundation for quick random lookups and effective access to sorted information.
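A minimal sketch of the effect, using SQLite (bundled with Python); the table, column, and index names are assumptions for the example.

```python
import sqlite3

# A minimal sketch of how an index changes lookups, using SQLite.
# The table and column names are illustrative assumptions.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [(f"user{i}@example.com",) for i in range(10_000)])

# Without an index, this query scans the whole table.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?",
    ("user123@example.com",)).fetchall()
print(plan)  # ... SCAN users

# With an index, the same query becomes a quick random lookup.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?",
    ("user123@example.com",)).fetchall()
print(plan)  # ... SEARCH users USING INDEX idx_users_email
```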
There are several types of indexes that can be created, including:
It's no surprise that this one pops up often in Software design interviews.
Availability is one means of guaranteeing system reliability. It translates to: the system must always be online and respond to customer requests. A system with high availability functions reliably and consistently, with minimal downtime or interruptions. In other words, the system must be accessible and responsive whenever a user wants to use the service.
By calculating the proportion of time the system is operational within a specified time window, availability may be calculated.
Availability = Uptime / (Uptime + Downtime)
The "availability percentages" of a system are typically expressed in terms of the number of 9s (as shown in the table below).
It is referred to as having "2 nines" of availability if availability is 99.00%, "3 nines" if availability is 99.9%, and so forth.
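A small sketch of the arithmetic behind the nines; the figures below are derived from the formula above rather than quoted from any table.

```python
# A small sketch of the availability formula and the "number of nines".

def availability(uptime_hours: float, downtime_hours: float) -> float:
    return uptime_hours / (uptime_hours + downtime_hours)

# e.g. 8.76 hours of downtime in a 365-day year => "3 nines"
year_hours = 365 * 24
print(f"{availability(year_hours - 8.76, 8.76):.3%}")  # 99.900%

# Allowed downtime per year for n nines:
for nines in range(1, 6):
    pct = 1 - 10 ** -nines
    print(f"{nines} nines = {pct:.5%} -> {year_hours * (1 - pct):.2f} h/yr down")
```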
In a distributed environment where numerous servers contribute to the application's availability, one server may need to take charge of updating third-party APIs, since multiple servers calling the same third-party APIs could otherwise interfere with one another.
This server is known as the primary server, and the selection procedure is known as the leader election. When the leader server fails, the servers in the distributed environment must recognize it and choose a new leader.
There are several strategies that can be used to implement leader election in a distributed system, including:
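One common strategy is to elect the live node with the highest ID, which is the core rule of the bully algorithm. A minimal sketch, assuming every node can observe which peers are alive (real systems typically lean on coordination services such as ZooKeeper or etcd):

```python
# A minimal leader-election sketch: the live node with the highest ID wins,
# which is the core rule of the bully algorithm. Node IDs and the liveness
# set are illustrative assumptions.

def elect_leader(alive_nodes: set[int]) -> int:
    return max(alive_nodes)

nodes = {1, 2, 3, 4, 5}
print(elect_leader(nodes))   # 5 becomes the primary server

nodes.discard(5)             # the leader fails...
print(elect_leader(nodes))   # ...and the survivors elect 4
```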
A common question in System Design interviews, don't miss this one.
A network protocol is a set of rules and conventions that govern the communication between devices on a network. In system design, the choice of network protocol can have a significant impact on the performance and scalability of the system, as well as on its security and reliability.
Many different types of network protocols can be used in system design, including:
Scalability refers to a program's ability to manage a lot of traffic, whereas performance refers to measuring how quickly the application is operating. The system's performance improves in direct proportion to the resources provided to it. Scalability is directly related to the performance of any design because it allows the handling of larger data sets in the case of expanding activity.
There are several ways in which scalability and performance are related in system design:
Polling is all about our client checking on the server: it sends a network request to the server asking for the most recent data. These requests are typically made at regular intervals, such as 5 seconds, 15 seconds, 1 minute, or whatever the use case requires.
Polling every few seconds still falls short of real-time and has the following drawbacks, especially if there are more than a million concurrent users:
Polling is therefore best employed in situations where short pauses in data updates are not problematic for your application; polling rapidly is neither particularly efficient nor performant.
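A minimal polling-loop sketch; the endpoint URL and the 5-second interval are assumptions for the example.

```python
import time
import urllib.request

# A minimal polling loop: ask the server for the latest data every few seconds.
# The endpoint URL and 5-second interval are illustrative assumptions.

POLL_INTERVAL_SECONDS = 5
ENDPOINT = "https://example.com/api/latest"  # hypothetical endpoint

def poll_forever():
    while True:
        try:
            with urllib.request.urlopen(ENDPOINT, timeout=3) as resp:
                print("latest data:", resp.read()[:80])
        except OSError as exc:
            print("poll failed:", exc)
        time.sleep(POLL_INTERVAL_SECONDS)  # every cycle costs a round trip,
                                           # even when nothing has changed

# poll_forever()  # runs indefinitely
```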
One of the most frequently posed System design questions, be ready for it.
The four architecture types listed below are often used by distributed systems and processes:
Client-Server Architecture
A server acting as a shared resource, such as a printer, database, or web server, was the foundation of this distributed system design. It had numerous clients, i.e. users operating their own computers, who decided how to use, display, and modify the shared resources and submitted modified data back to the server.
Three-Tier Architecture
To make application deployment simpler, this most common style of architecture stores client-related data in a middle layer rather than directly on the client. This middle layer is sometimes referred to as an agent, since it takes requests from clients, some of which may be stateless, processes the information, and then sends it to the servers.
N-Tier (Multi-Tier) Architecture
Enterprise web services were the first to develop this style, for servers that house business logic and communicate with both the data layers and the presentation layers.
Peer-to-Peer Architecture
In this design, no specialized servers are required for intelligent work. Each of the machines involved can play either a client or a server role, and decision-making and duties are distributed among all of them.
A staple in system design questions, be prepared to answer this one.
A database query is a request for data or information from a database. In the context of system design, a database query or Structured Query Language (SQL) can be used to retrieve, add, update, or delete data from a database.
Queries are an essential part of any system that needs to store and retrieve data. For example, a retail website may use database queries to retrieve customer information, process orders, and track inventory. A social media platform may use database queries to store and retrieve user profiles, posts, and messages.
Using SQL can greatly improve the efficiency and performance of a system by allowing it to access and manipulate data stored in a database quickly. Queries can be optimized to retrieve only the data that is needed and to do so in the most efficient way possible. Overall, the use of database queries or SQL in system design is crucial for storing, organizing, and accessing data in a scalable and efficient manner.
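A minimal sketch of the four basic operations with parameterized queries, again using SQLite; the table and column names are assumptions for the example.

```python
import sqlite3

# A minimal sketch of the four basic query types, using parameterized SQL
# (placeholders) so values are never concatenated into the query string.
# Table and column names are illustrative assumptions.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER)")

conn.execute("INSERT INTO orders (item, qty) VALUES (?, ?)", ("widget", 3))         # add
rows = conn.execute("SELECT item, qty FROM orders WHERE qty > ?", (1,)).fetchall()  # retrieve
conn.execute("UPDATE orders SET qty = ? WHERE item = ?", (5, "widget"))             # update
conn.execute("DELETE FROM orders WHERE item = ?", ("widget",))                      # delete
print(rows)  # [('widget', 3)]
```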
Proxy servers are typically some type of software or hardware that resides between a client and another server. It may be located anywhere between the clients and the destination servers, including the user's computer. A proxy server receives requests from clients, transmits them to the origin servers, and then returns the server's response to the client that requested them in the first place.
Proxy servers are widely used to process, filter, and log requests, and occasionally to alter them (by adding/removing headers, encrypting/decrypting, or compressing). They facilitate the coordination of requests coming from several servers and can be used to optimize request traffic on a system-wide level.
There are two types of proxies in system design:
Forward Proxy
In interactions between clients and servers, a "forward proxy" operates on behalf of (replaces) the client to assist users. It represents the user personally and relays the user's requests. The server won't be aware that the request and response are being sent through the proxy while using the "forward proxy."
Reverse Proxy
A reverse proxy is most helpful in complicated systems. Reverse proxies are intended to assist servers by acting on their behalf. The client won't be aware that the request and response are passing through a proxy when using a reverse proxy.
The main server can delegate several functions to a "reverse proxy," which can also serve as a gatekeeper, screener, load-balancer, and general helper.
According to the consistency guarantee of the CAP theorem, every read request ought to receive the most recently written data. When there are several versions of the same data, it becomes difficult to synchronize them so that clients always receive up-to-date information. The available consistency patterns are as follows:
Weak Consistency
The read request may or may not be able to obtain the new data following a write operation.
Real-time use cases like VoIP, video chat, and online gaming all benefit from this kind of consistency.
Eventual Consistency
Reads will eventually see the most recent data, typically within milliseconds of a write. Here, asynchronous replication of the data is used. DNS and email systems both use this pattern. It works well in highly available systems.
Strong Consistency
Subsequent reads will see the most recent data following a write. Here, synchronous replication of the data is used. This pattern is seen in RDBMS and file systems, and it is appropriate for systems that need transactional guarantees.
Block storage is a method of storing data that divides the data into equal-sized blocks and assigns a unique identifier to each block for convenience. Blocks can be stored anywhere in the system instead of following a predetermined path, which makes better use of the system's resources.
A few examples of block storage tools are LVM (Logical Volume Manager), SAN (Storage Area Network), ZFS (Zettabyte File System), and many more.
Choosing the right tool(s) for a system depends on the specific requirements of the system and the underlying storage infrastructure. Block storage is typically used to store data that needs to be accessed quickly and frequently, such as operating system files, application data, and database files. It is also used to store data that needs to be accessed randomly, as it allows individual blocks of data to be accessed directly, rather than having to read through a large amount of data sequentially.
Block storage is a powerful tool for storing and managing data in a system, and it is often used in conjunction with other types of storage to create a well-rounded storage strategy.
File storage is a hierarchical storage methodology: information is saved in files, files are kept in folders, and folders are housed in directories. It is often used to store data that is accessed less frequently than data in block storage, such as large documents, media files, and backups.
Only a limited amount of data, primarily structured data, can be stored this way, and the technique becomes troublesome as the size of the data exceeds a certain threshold.
Several factors, such as performance, capacity, data organization, and data protection, should be weighed when choosing file storage.
Large amounts of unstructured data can be handled by object storage. Each object is typically a large file, such as a video or image, and is stored with a unique identifier and metadata that describes the object. Due to the importance of backups, unstructured data, and log files for all systems, this type of storage offers systems a tremendous amount of flexibility and value.
Object storage would be beneficial for your business if you were designing a system with large datasets. It is designed to scale horizontally, allowing it to store vast amounts of data without experiencing a decrease in performance.
A few object storage tool examples are Amazon S3, Google Cloud Storage, Ceph, OpenStack Swift, and many more. Some factors to consider while deciding on a tool would be scalability, durability, cost and the features that are required for the system. An operating system cannot directly access object storage. RESTful APIs are used for communication at the application level. Due to its dynamic scalability, object storage is the preferred method of data storage for data backups and archiving.
Web servers and application servers are both types of servers used in computer networks, but they serve different purposes. A web server is a dedicated server whose sole purpose is handling web requests. Web servers are designed to host and serve web content, such as HTML, CSS, and JavaScript files, to clients over the internet. They are typically optimized for serving static content, such as images, videos, and documents, and do not usually include processing capabilities beyond basic HTTP handling.
Application servers, on the other hand, are servers that are designed to host and run applications. These applications may be web-based or standalone, and they may include dynamic content, such as databases, user accounts, and business logic. Application servers often include features such as database connectivity, security, and scaling capabilities, and they may be built using frameworks such as Java EE or .NET.
In structured design, the major tool used is a flowchart, which is a graphical representation of a process or system that shows the steps or activities involved and the relationships between them. Flowcharts are used to visualize and document the design of a system, and they can help to identify and resolve problems or issues during the design process.
Flowcharts are widely used in structured design because they provide a clear and concise way to represent the design of a system, and they are easy to understand and communicate to others. They are also flexible and can be used for a wide range of systems, from simple to complex.
In addition to flowcharts, other tools that are commonly used in the structured design include data flow diagrams, entity-relationship diagrams, and state diagrams.
Latency is the amount of time it takes for a request to be processed and a response to be returned in a system. In system design, latency is an important consideration because it can impact the performance and user experience of the system. Several factors can contribute to latency in a system, including the speed of the network connection, the processing power of the servers or devices involved, and the complexity of the algorithms and processes being used.
To reduce latency in a system, designers can consider several strategies, such as optimizing the algorithms and processes being used, using faster hardware and networking equipment, and implementing caching and other performance-enhancing techniques.
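As a small illustration, latency can be measured by timing each request and reporting percentiles; handle_request below is a hypothetical stand-in for real work.

```python
import time

# A minimal sketch for measuring request latency around any operation.
# handle_request is a hypothetical stand-in; the percentile math is standard.

def handle_request():
    time.sleep(0.01)  # simulated processing

samples = []
for _ in range(100):
    start = time.perf_counter()
    handle_request()
    samples.append((time.perf_counter() - start) * 1000)  # milliseconds

samples.sort()
print(f"p50={samples[49]:.1f} ms  p99={samples[98]:.1f} ms")
```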
Throughput refers to the amount of data or transactions that the system can handle in a given period of time. It is often used to evaluate the capacity or scalability of a system, as well as to identify potential bottlenecks or performance issues. Dividing requests and spreading them across additional resources is one proven method of boosting a system's throughput.
Numerous factors as below can impact the throughput of a system:
According to the CAP (Consistency-Availability-Partition Tolerance) theorem, a distributed system cannot concurrently guarantee C, A, and P; it can offer at most two of the three assurances. Let us use a distributed database system to understand this.
Different databases guarantee different pairs of CAP properties. RDBMS databases such as SQL Server and MariaDB offer Consistency and Availability at the same time. Redis, MongoDB, and HBase ensure Consistency and Partition tolerance, while Cassandra and CouchDB achieve Availability and Partition tolerance.
This is a regular feature in the list of top System design questions, be ready to tackle it in your next interview.
Horizontal scaling is increasing the number of computers on a network to distribute the processing and memory workload among a distributed network of devices.
Vertical scaling refers to enhancing the resource capacity of a single machine by adding RAM, more powerful processors, and so on. It can boost the server's capabilities without requiring any code changes.
Some other factors to consider when deciding between horizontal scaling and vertical scaling in system design:
This is a frequently asked question in Software design interviews.
Caching is the practice of keeping copies of files in a temporary storage space referred to as a cache, which facilitates faster data access and lowers site latency. Only a certain amount of data can be kept in the cache. Because of this, choosing cache update strategies that are best suited to the needs of the business is crucial. The different caching techniques are as follows:
Cache-aside: In this approach, the application is responsible for writing to and reading from the storage; the storage and the cache do not interact directly. The application first searches the cache for an entry and, on a miss, retrieves it from the database and adds it to the cache for later use (see the sketch after this list). The cache-aside strategy, also known as lazy loading, caches only the requested entry, preventing unnecessary data caching.
Write-through: According to this strategy, the system will read from and write data into the cache, treating it as its primary data store. The database is then updated to reflect these changes by the cache. Entries are synchronously written from the cache to the database.
Write-behind (write-back): In this strategy, the application writes the entry to the cache, the cache acknowledges the write immediately, and the entry is written to the database asynchronously later.
Refresh-ahead: By employing this technique, we can set the cache to automatically refresh the cache entry before it expires.
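Here is the cache-aside flow from above as a minimal sketch; a plain dict stands in for a real cache such as Redis or Memcached, and the table and names are assumptions for the example.

```python
import sqlite3

# A minimal cache-aside (lazy loading) sketch: the application checks the
# cache first and only falls back to the database on a miss. A dict stands
# in for a real cache such as Redis; names are illustrative assumptions.

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users VALUES (1, 'Ada')")
cache: dict[int, str] = {}

def get_user(user_id: int) -> str:
    if user_id in cache:                      # 1. look in the cache
        return cache[user_id]
    row = db.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()
    cache[user_id] = row[0]                   # 2. on a miss, load and cache it
    return row[0]

print(get_user(1))  # miss: reads the database, then caches
print(get_user(1))  # hit: served from the cache
```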
Expect to come across this, one of the top System design interview questions.
Below are the drawbacks of each:
Cache-aside
When a cache miss occurs, there will be a noticeable delay because data must first be fetched from the database before being cached. If data is updated in the database, there is a greater chance that it will become stale. By forcing an update of the cache entry with the time-to-live parameter, this can be minimized. Increased latency occurs when a cache node fails and is replaced by a new, empty node.
Write-through
Because of the synchronous write operation, this strategy operates slowly overall. The data stored in the cache has a chance of never being read. By offering the right TTL, this risk can be minimized.
Write-behind
The main drawback of this approach is the potential for data loss if the cache is destroyed before the contents are written into the database.
A network of globally dispersed proxy servers known as a "content delivery network," or CDN for short, serves content to users from locations close to them. Static files like HTML, CSS, JS files, images, and videos are typically served from CDNs on websites.
Users don't have to wait long because data is delivered from nearby centers. As some of the burdens are shared by CDNs, the load on the servers is significantly reduced.
We have 2 types of CDNs as below:
Push CDNs: Here, the CDN receives the content every time the server makes changes; we are accountable for uploading the content to the CDN. The CDN is updated only when content is changed or added, maximizing storage while minimizing traffic.
Push CDNs generally work well for websites with less traffic or content.
Pull CDNs: When the first user requests the content from the site, fresh content is fetched from the server in this case. As a result, initial requests take longer to complete until the content is stored or cached on the CDN. Although these CDNs use the least amount of space possible, when outdated files are pulled before being changed, it can result in redundant traffic. Pull CDNs are effective when used with busy websites.
The process of sharding involves dividing a sizable logical database into numerous small datasets called shards. Some database data persists across all shards, whereas other data only appears in one shard. The terms "vertical sharding" and "horizontal sharding" can be used to describe these two situations. A sharded database can now accommodate more requests than a single large machine can. Databases can be scaled by using sharding, which increases throughput, storage capacity, and availability while assisting in handling the increased load.
We must choose a sharding key to partition the data before we can shard it. An indexed field, or an indexed compound field that appears in every document in the collection, can serve as the sharding key. Sharding lets the application run fewer queries: when a request arrives, the application knows where to route it, so it searches through less data rather than the entire database. Sharding boosts the overall performance and scalability of the application.
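A minimal sketch of routing by sharding key; the four shards and user-ID keys are assumptions, and note that plain modulo arithmetic reshuffles most keys when the shard count changes (consistent hashing, discussed later, addresses this).

```python
import hashlib

# A minimal horizontal-sharding sketch: a hash of the sharding key picks the
# shard. Four shards and user-ID keys are illustrative assumptions.

NUM_SHARDS = 4

def shard_for(key: str) -> int:
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

for user_id in ["user:1", "user:2", "user:3"]:
    print(user_id, "->", "shard", shard_for(user_id))
```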
Data partitioning is the process of splitting a large dataset into smaller, more manageable pieces called partitions. Partitioning a large database can enhance its performance, controllability, and availability. In some situations, partitioning can improve performance when accessing a partitioned table. It is common to practice partitioning databases for load balancing, performance, manageability, and availability.
There are 3 types of partitioning, viz. Horizontal Partitioning, Vertical Partitioning and Functional Partitioning. An example of Horizontal Partitioning is as below:
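A minimal illustration, assuming a customers table split by ID range:

```python
# Horizontal partitioning: the same schema, with rows split across partitions
# by a range of the partition key. The customers table and ID ranges are
# illustrative assumptions.

partition_1 = [  # customer_id 1-1000
    {"customer_id": 12,  "name": "Ada"},
    {"customer_id": 980, "name": "Lin"},
]
partition_2 = [  # customer_id 1001-2000
    {"customer_id": 1042, "name": "Sam"},
]

def find_customer(customer_id: int):
    partition = partition_1 if customer_id <= 1000 else partition_2
    return next(r for r in partition if r["customer_id"] == customer_id)

print(find_customer(1042))  # only partition_2 is scanned
```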
By acting as a leading column in indexes, partitioning can reduce index size and improve the chances of keeping the most heavily used indexes in memory. Scanning one partition when a significant portion of it appears in the result set is much quicker than using an index to access data dispersed throughout the entire table. Adding and removing partitions enables bulk loading and deletion of data, which improves performance, and rarely used data can be moved to cheaper storage systems.
A must-know for anyone heading into the technical rounds, this is one of the most frequently asked System design interview questions.
Database replication is a process of copying data from a database on one server to a database on another server. This is typically used to improve data availability, scalability, and performance by allowing multiple servers to handle the load of a database system. There are several different types of replication, including master-slave replication, where one server acts as the primary source of data and the other servers act as replicas, and peer-to-peer replication, where all servers are equal and can both read and write to the database.
Replication can be useful in a variety of situations, including when a database needs to be accessed by users in different geographic locations or when a database needs to be backed up for disaster recovery purposes.
RAID (Redundant Array of Independent Disks) is a technology used to improve the performance, reliability, and fault tolerance of a data storage system. It works by combining multiple physical disks into a logical unit, allowing the system to read and write data across the disks in a way that increases performance and reliability.
There are several different RAID configurations, each with its own unique set of characteristics and benefits. Some common RAID configurations include:
These are just a few examples of the different RAID levels that are available. Many other RAID levels have been developed, each with its unique combination of features and trade-offs. It is important to carefully consider the requirements of your system and choose a RAID level that meets your needs.
A file system is a way of organizing and storing data on a storage disk. It determines how files are named, stored, and retrieved. A file system also provides a way of organizing and grouping files, as well as setting permissions on files and directories to control who can access them.
A distributed file system is a file system that allows multiple computers to access and store data on a network of computers.
In system design, the choice of a distributed file system is an important decision that can have significant implications for the performance, reliability, and scalability of the system. There are several different types of distributed file systems, each with its own set of features and characteristics.
Some common types of distributed file systems include:
System design patterns are a way of capturing the best practices and lessons learned from building systems, providing proven, reusable solutions to common problems.
Some common system design patterns include:
These are just a few examples of the many system design patterns that are available. It is important to carefully consider the requirements of the system and choose the design patterns that are most appropriate for the system.
A database is a structured collection of data that is stored and accessed electronically. There are many different types of databases available, and it is important to carefully consider the requirements of the system when choosing a database.
Some common types of databases that are commonly used in system design include:
Factors to consider may include the amount of data that needs to be stored, the complexity of the data, the performance and scalability requirements, and the availability and reliability requirements.
Some examples of popular relational databases include:
Some examples of popular non-relational databases include:
ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that are used to guarantee the integrity and reliability of data in a database.
The ACID properties are:
Atomicity: A transaction is all-or-nothing; either every operation in it completes, or none of them take effect.
Consistency: A transaction moves the database from one valid state to another, preserving all defined rules and constraints.
Isolation: Concurrent transactions do not interfere with one another; each behaves as if it were running alone.
Durability: Once a transaction is committed, its changes survive crashes and power failures.
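As a small illustration of atomicity, the sketch below uses an SQLite transaction in which a simulated failure rolls back a half-finished transfer; the accounts table and amounts are assumptions for the example.

```python
import sqlite3

# A minimal sketch of atomicity: both writes inside the transaction succeed
# together or not at all. The accounts table and amounts are illustrative.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])

try:
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.execute("UPDATE accounts SET balance = balance - 50 WHERE name = 'a'")
        raise RuntimeError("crash mid-transfer")  # the credit below never runs
        conn.execute("UPDATE accounts SET balance = balance + 50 WHERE name = 'b'")
except RuntimeError:
    pass

print(conn.execute("SELECT * FROM accounts").fetchall())
# [('a', 100), ('b', 0)] -- the partial debit was rolled back
```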
Messages are routed from a sender to the recipient, through a message queue. The main purpose of a message queue is to decouple the communication between these parts so that they can operate independently and asynchronously. It operates under the FIFO (first in, first out) principle.
In a message queue system, there are three main components:
Message queues can also be used to provide reliability and fault tolerance to a system. If a consumer fails to process a message, the message can be returned to the queue for processing by another consumer. This allows the system to continue operating even if one or more components fail.
There are many different types of message queue systems available, including Apache Kafka, RabbitMQ, and ActiveMQ. Each has its own set of features and capabilities and can be used in different types of systems depending on the specific requirements.
Apache Kafka: Apache Kafka is a distributed, real-time message queue system that is widely used for building scalable, high-throughput, and fault-tolerant architectures for data processing, streaming and messaging. Kafka is a publish-subscribe messaging system that allows producers to send messages to topics and consumers to read from those topics. Topics are partitioned and replicated across the cluster, allowing for high scalability and availability. One of the key features of Kafka is its ability to handle large amounts of data with low latency.
A database schema is a structure that defines the organization and relationships of data in a database. It specifies the data types, the attributes of each data element, and the relationships between different data elements.
In system design, the database schema is an important consideration because it determines how the data will be stored and accessed. It needs to be carefully planned and designed to ensure that the database is efficient, scalable, and easy to use.
There are two main types of database schema:
The physical schema is usually implemented by the database management system (DBMS), while the logical schema is defined by the database designer. The logical schema is used to create the physical schema, and it is also used by users and applications to interact with the database.
CQRS (Command Query Responsibility Segregation) is a design pattern that is used to separate the responsibilities of reading and writing data in a system. In a CQRS system, the write side of the system is responsible for storing data, while the read side is responsible for retrieving and displaying data. These two sides are usually implemented as separate components, and they can use different data models and storage mechanisms.
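A minimal sketch of the separation, with in-memory stores standing in for the two sides; real CQRS systems often synchronize the read model asynchronously through events.

```python
# A minimal CQRS sketch: commands go through a write model, queries through a
# separate read model. The in-memory stores and synchronous projection step
# are illustrative assumptions.

write_store: list[dict] = []      # append-oriented store for commands/events
read_store: dict[int, str] = {}   # denormalized view optimized for reads

def handle_create_user(user_id: int, name: str) -> None:
    """Command side: validates and records the change."""
    write_store.append({"type": "user_created", "id": user_id, "name": name})
    read_store[user_id] = name    # project the event into the read model

def get_user_name(user_id: int) -> str:
    """Query side: never touches the write model."""
    return read_store[user_id]

handle_create_user(1, "Ada")
print(get_user_name(1))  # 'Ada'
```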
2PC (Two-Phase Commit) is a protocol used to coordinate transactions across multiple systems in a distributed environment. It has two phases: the voting (prepare) phase and the commit phase, in which, based on the votes, the system decides to commit or abort the transaction.
It is designed to ensure that all systems participating in a transaction either commit or roll back the changes they have made as a group, to ensure that the overall transaction is either completed or canceled.
In a 2PC system, there is a central coordinator that coordinates the transaction, and multiple participants that participate in the transaction.
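A minimal sketch of the protocol flow; the participants are simple in-memory objects, and real 2PC implementations add write-ahead logging and timeout handling.

```python
# A minimal two-phase-commit sketch: the coordinator commits only if every
# participant votes yes in the prepare phase. Participants here are simple
# objects; real 2PC also needs durable logs and timeouts.

class Participant:
    def __init__(self, name, can_commit=True):
        self.name, self.can_commit = name, can_commit

    def prepare(self) -> bool:          # phase 1: vote
        return self.can_commit

    def commit(self):
        print(f"{self.name}: committed")

    def abort(self):
        print(f"{self.name}: aborted")

def two_phase_commit(participants):
    if all(p.prepare() for p in participants):   # phase 1: collect votes
        for p in participants:
            p.commit()                            # phase 2: commit everywhere
    else:
        for p in participants:
            p.abort()                             # phase 2: abort everywhere

two_phase_commit([Participant("db1"), Participant("db2")])         # commits
two_phase_commit([Participant("db1"), Participant("db2", False)])  # aborts
```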
Consistent hashing is a technique that is used to distribute keys or data elements across a distributed system in a way that minimizes the number of keys that need to be moved when the number of nodes in the system changes. It is commonly used in distributed systems to distribute data and workloads evenly across multiple servers or components and to ensure that the system remains balanced and efficient even when nodes are added or removed.
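A minimal hash-ring sketch (without the virtual nodes a production ring would use): each key maps to the first node clockwise from its hash, so adding or removing one node moves only the keys in that node's arc.

```python
import bisect
import hashlib

# A minimal consistent-hashing ring (no virtual nodes, for brevity).
# Node and key names are illustrative assumptions.

def h(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes):
        self.ring = sorted((h(n), n) for n in nodes)

    def node_for(self, key: str) -> str:
        hashes = [hv for hv, _ in self.ring]
        i = bisect.bisect(hashes, h(key)) % len(self.ring)  # first node clockwise
        return self.ring[i][1]

ring = HashRing(["server-a", "server-b", "server-c"])
for key in ["user:1", "user:2", "user:3"]:
    print(key, "->", ring.node_for(key))
```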
A Bloom filter is a probabilistic data structure used to test whether an element is a member of a set. It is space-efficient and allows fast membership tests, but it has a certain probability of false positives, meaning it may occasionally indicate that an element is in the set when it is not. An element is added by hashing it with several hash functions and setting the corresponding bit positions to 1; membership is tested by hashing the element the same way. If all of the positions are set to 1, the element is considered to be a member of the set; if any of the positions are 0, the element is definitely not a member.
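A minimal Bloom filter sketch; the bit-array size and the choice of k = 3 hash functions are assumptions, whereas real deployments size both from the expected item count and the target false-positive rate.

```python
import hashlib

# A minimal Bloom filter: k hash functions set/check k bit positions.
# Size 1024 and k=3 are illustrative assumptions.

class BloomFilter:
    def __init__(self, size=1024, k=3):
        self.size, self.k, self.bits = size, k, [0] * size

    def _positions(self, item: str):
        for i in range(self.k):
            digest = hashlib.md5(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("alice")
print(bf.might_contain("alice"))  # True
print(bf.might_contain("bob"))    # False (or, rarely, a false positive)
```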
The steps to design any URL shortening service are as follows -
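One core step is turning a numeric database ID into a short key. A minimal base62 sketch (the alphabet is conventional; storage and redirects are omitted):

```python
import string

# A minimal base62 encoder for short-URL keys. The alphabet order is a
# conventional choice; the rest of the service is omitted.

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # 62 chars

def encode(n: int) -> str:
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))

print(encode(125))        # '21'
print(encode(123456789))  # '8m0Kx'
```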
A common System design interview question, don't miss this one.
One of the most frequently posed System design interview questions, be ready for it.
Start by establishing a design scope by determining the major features to be included as below:
Next, break down the application into components as below:
Common frontend (client) languages are JavaScript, Java, and Swift, and common backend (server) languages are Erlang, Node.js, PHP, Scala, Java, and many more.
Deciding on the application framework is equally vital. We have some chat protocols like Extensible Messaging and Presence Protocol (XMPP) used by WhatsApp and Message Queue Telemetry Transport (MQTT) which is relatively new.
A web crawler is the tool search engines such as Google and DuckDuckGo use to index website content online so that it can be served in search results.
What features are among the requirements?
What are some typical issues that arise?
The potential pointers:
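One such pointer is the crawl loop itself: a frontier queue, a visited set for de-duplication, and a page limit for politeness. A minimal BFS sketch, with a hypothetical seed URL; production crawlers add robots.txt handling, rate limiting, and distribution:

```python
from collections import deque
from urllib.parse import urljoin
import re
import urllib.request

# A minimal BFS crawler sketch. The seed URL is an illustrative assumption.

def crawl(seed: str, max_pages: int = 10):
    frontier, seen = deque([seed]), {seed}
    while frontier and len(seen) <= max_pages:
        url = frontier.popleft()
        try:
            html = urllib.request.urlopen(url, timeout=3).read().decode("utf-8", "ignore")
        except OSError:
            continue
        print("indexed:", url)                 # hand the page to the indexer here
        for href in re.findall(r'href="([^"]+)"', html):
            link = urljoin(url, href)
            if link.startswith("http") and link not in seen:
                seen.add(link)
                frontier.append(link)

# crawl("https://example.com")  # hypothetical seed
```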
A staple in System design interview questions, be prepared to answer this one.
To design a message board platform like Quora, you can follow these steps:
The API should define the format of the requests and responses between the front-end and back-end. This may involve using a standardized format such as JSON.
The API must include mechanisms for authenticating users, such as OAuth, to ensure that only authorized users can access the platform's data.
The API should include error handling to ensure that the front-end can handle error responses from the back-end in a predictable and user-friendly manner.
This is a high-level overview of the steps involved in designing a message board platform like Quora. In practice, the design process can be much more complex and will likely involve many more details and considerations.
To design a social media platform like Twitter, you would need to consider the following components:
In terms of data modeling, you would need to model the following data entities:
Real-time updates are an important feature of many social media platforms, including Twitter, as they enable users to receive updates in near real-time as events occur. To implement real-time updates, you would need to consider the following components:
The search functionality is a key component of many social media platforms, including Twitter, as it allows users to find and discover content. To implement search in your platform, you would need to consider the following components:
Some popular algorithms for search include:
The biggest challenge to combat in this design is demand vs. supply; hence we would need two services, one for supply (of cabs) and one for demand (of riders). We will go with Uber’s example to understand the design and architecture.
An API rate limiter is a system that is designed to limit the rate at which an API can be accessed by clients. This can be useful in situations where it is necessary to prevent excessive use of the API, such as to protect against denial of service attacks or to ensure fair usage by all clients.
There are several factors to consider when designing an API rate limiter:
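One widely used approach is the token bucket: tokens refill at a fixed rate and each request spends one. A minimal sketch, with illustrative capacity and rate; a distributed deployment would keep the bucket state in a shared store such as Redis:

```python
import time

# A minimal token-bucket rate limiter: tokens refill at a fixed rate and each
# request spends one token. Capacity and rate are illustrative assumptions.

class TokenBucket:
    def __init__(self, rate_per_sec: float, capacity: int):
        self.rate, self.capacity = rate_per_sec, capacity
        self.tokens, self.last = float(capacity), time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False   # the caller would respond with HTTP 429 here

limiter = TokenBucket(rate_per_sec=5, capacity=10)
print([limiter.allow() for _ in range(12)])  # the last requests are rejected
```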
Search typeahead is a feature, commonly found in search boxes, that suggests search terms or results as the user types a query. It is used in search engines and other search interfaces to help users find relevant information more quickly and efficiently, and its goal is to provide relevant, accurate suggestions in real time as the user types.
To design a search typeahead, we would typically follow these steps:
There are several algorithms and data structures that can be used to implement the suggestion engine, including:
The choice of algorithm for your suggestion engine will depend on the specific requirements of your platform, including the size and structure of your data, the desired performance, and the complexity of the search problem. We may need to evaluate multiple algorithms and perform performance testing to determine the best choice for your specific use case.
A well-designed search typeahead can greatly improve the user experience and make it easier for users to find what they are looking for. It is important to consider the specific requirements of the platform and evaluate different options to determine the best design for our specific use case.
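As one concrete design, a trie over known queries supports prefix lookups efficiently. A minimal sketch with illustrative sample queries; real systems also rank suggestions, for example by popularity:

```python
# A minimal trie-based typeahead: insert known queries, walk to the prefix
# node, then collect completions. Sample queries are illustrative assumptions.

class TrieNode:
    def __init__(self):
        self.children: dict[str, "TrieNode"] = {}
        self.is_word = False

def insert(root: TrieNode, word: str):
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.is_word = True

def suggest(root: TrieNode, prefix: str, limit: int = 5):
    node = root
    for ch in prefix:                      # walk down to the prefix node
        if ch not in node.children:
            return []
        node = node.children[ch]
    results = []
    def dfs(n, path):                      # collect words under that node
        if len(results) >= limit:
            return
        if n.is_word:
            results.append(prefix + path)
        for ch, child in sorted(n.children.items()):
            dfs(child, path + ch)
    dfs(node, "")
    return results

root = TrieNode()
for q in ["system design", "system call", "sysadmin"]:
    insert(root, q)
print(suggest(root, "sys"))  # ['sysadmin', 'system call', 'system design']
```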
This is just one possible design for a tic-tac-toe game, and there are many other ways to implement it. For example, we can use a client-server architecture, or we could implement the game as a standalone application. Also, additional features could be added, such as support for multiple players, leaderboards, or different game modes.
There are many different approaches to designing a recommendation system, and the specific design will depend on the goals of the system and the type of data it has available. Below are some general steps that I will follow to design a recommendation system:
We'll need the following to build software that can back an Amazon pickup location -
Design round interview questions on this topic can pop up often, so you should be well-versed in the entire development lifecycle.
To design this specific page, here are the things you need to keep in mind.
One of the most common Software design questions, don't miss this one. You can practice this by recreating pages from the top tech giants.
One of the most frequently posed System design questions, be ready for it.
A staple in System design interview questions, be prepared to answer this one.
A staple in Software design interview questions, be prepared to answer this one.
Answering system design questions like this will require you to have hands-on experience. Here is how to design this wearable.
A distributed system that can index and search a large dataset should be designed as per these considerations -
Expect to come across this popular question in System design interviews.
Here is how we can design the Server Architecture for a platform like Gmail.
A must-know for anyone heading into an interview, this question is frequently asked in Front-end System design interviews.
We can design a system like Google Photos by following the steps below -
System design interview questions can be asked from other similar scenarios like this.
To design services like these, we consider the following points -
This is a common question in Software design interviews, don't miss this one.
One of the most frequently posed Software design questions, be ready for it.
A staple in System design questions and answers, be prepared to answer this one.
The API layers for Facebook chat will include
This is a regular feature in the list of top System design questions, be ready to tackle it.
This can be done by considering the following factors.
We can design a distributed system for storing and processing large amounts of structured and unstructured data with the following considerations.
It's no surprise that this one pops up often as one of the top System design interview questions.
System design is the process of designing and defining the architecture, components, modules, interfaces, and data for a system to satisfy specified requirements. It involves decomposing the system into smaller subsystems, determining how the subsystems will work together, and defining the relationships between them.
System design is an iterative process that involves understanding the problem to be solved, identifying the requirements of the system, and designing a solution that meets those requirements. It is a critical step in the development of any system, as it lays the foundation for the subsequent implementation and testing phases.
Microservices is a software architecture pattern in which a system is divided into manageable, small, independent components that can be developed, deployed, and scaled independently. This can make it easier to update and modify the system, and it can also improve the scalability and reliability of the system by allowing components to be scaled or replaced independently. These modules can be created, used, and maintained separately.
Application Programming Interfaces (APIs) allow these services to communicate and coordinate with one another. An API defines a set of rules and protocols that govern how one service can access the functionality of another service. When a service needs to send a message to another service, it sends the message to a message queue. The receiving service then retrieves the message from the queue and processes it.
Overall, message queues act as a backbone of communication in microservices architecture each microservice is focused on a specific task or capability and communicates with other microservices through well-defined interfaces.
A must-know for anyone heading into the technical round, this is one of the most frequently asked Software design interview questions.
Documentation helps to communicate the design of the system to stakeholders and developers. Some common types of documentation that are used in system design include:
Among the most important application metrics for gauging system performance are:
This is one of the most frequently asked Software design questions.
The CDN edge servers are used to cache content that has been fetched from your origin server or storage cluster. Point of presence is another expression that is frequently connected to edge servers (POP). The physical location of the edge servers is referred to as a POP.
There may be several edge servers at that POP that are used for content caching.
The distance between a visitor and a web server can be reduced by delivering different portions of a website from different locations. CDN edge servers can store a copy of the content that is being delivered, allowing them to serve it directly to users without having to retrieve it from the origin server each time. This lowers latency, the purpose of CDN edge servers is to accomplish this.
The idea of efficiently spreading incoming traffic among a collection of diverse backend servers is known as load balancing. Server pools are groups of these servers. Today's websites are made to quickly and accurately respond to millions of customer requests while handling a high volume of traffic. More servers must be added to fulfill these requests.
In this case, it is crucial to appropriately disperse request traffic among the servers to prevent excessive load on any of them. A load balancer functions as a traffic cop, addressing the requests and distributing them among the available servers so that no one server is overloaded, which can impair the operation of the service.
The load balancer switches traffic to the remaining available servers when a server goes offline. Requests are automatically forwarded to a new server when one is added to the setup. Below are some advantages of load balancers:
By using database indexing, you may make it quicker and simpler to search through your tables and discover the desired rows or columns. A database table's columns can be used to generate indexes, which serve as the foundation for quick random lookups and effective access to sorted information.
There are several types of indexes that can be created, including:
It's no surprise that this one pops up often in Software design interviews.
A means to guarantee system reliability is through availability. It translates to: The system must always be online and respond to customer requests. A system with high availability can function reliably and consistently, with minimal downtime or interruptions. In other words, the system must be accessible and respond to user requests anytime a user wants to utilize the service.
By calculating the proportion of time the system is operational within a specified time window, availability may be calculated.
Availability = Uptime / Uptime + Downtime
The "availability percentages" of a system are typically expressed in terms of the number of 9s (as shown in the table below).
It is referred to as having "2 nines" of availability if availability is 99.00%, "3 nines" if availability is 99.9%, and so forth.
One server may need to take charge of updating third-party APIs in a distributed environment where numerous servers contribute to the application's availability because other servers may interfere with the third-party APIs' use.
This server is known as the primary server, and the selection procedure is known as the leader election. When the leader server fails, the servers in the distributed environment must recognize it and choose a new leader.
There are several strategies that can be used to implement leader election in a distributed system, including:
A common question in System Design interviews, don't miss this one.
A network protocol is a set of rules and conventions that govern the communication between devices on a network. In system design, the choice of network protocol can have a significant impact on the performance and scalability of the system, as well as on its security and reliability.
Many different types of network protocols can be used in system design, including:
Scalability refers to a program's ability to manage a lot of traffic, whereas performance refers to measuring how quickly the application is operating. The system's performance improves in direct proportion to the resources provided to it. Scalability is directly related to the performance of any design because it allows the handling of larger data sets in the case of expanding activity.
There are several ways in which scalability and performance are related in system design:
By sending a network request to the server and requesting the most recent data, polling is all about our client checking on the server. Regular intervals like 5 seconds, 15 seconds, 1 minute, or any other time required by the use case are typical for these requests:
Polling every few seconds still falls short of real-time and has the following drawbacks, especially if there are more than a million concurrent users:
Polling is therefore best employed in situations when short pauses in data updates are not problematic for your application. Polling quickly is not particularly efficient or performant.
One of the most frequently posed System design questions, be ready for it.
The four architecture types listed below are often used by distributed systems and processes:
A server, which served as a shared resource like a printer, database, or web server, was the foundation of the distributed system design. It once had numerous clients, such as users operating the computers who decided how to use, display, and modify shared resources as well as submit modified data back to the server.
To make application deployment simpler, this style of most common architecture stores client-related data in a middle layer rather than directly on the client. This middle layer is sometimes referred to as an agent since it takes requests from clients, some of whom may be stateless, processes the information, and then sends it to the servers.
Enterprise web services were the first to develop this for servers that house business logic and communicate with both the data layers and the display levels.
In this design, there are no specialized servers required for intelligent work. Each of the involved machines can play either a client or a server role, and all of the servers' decision-making and duties are distributed among them.
A staple in system design questions, be prepared to answer this one.
A database query is a request for data or information from a database. In the context of system design, a database query or Structured Query Language (SQL) can be used to retrieve, add, update, or delete data from a database.
Queries are an essential part of any system that needs to store and retrieve data. For example, a retail website may use database queries to retrieve customer information, process orders, and track inventory. A social media platform may use database queries to store and retrieve user profiles, posts, and messages.
Using SQL can greatly improve the efficiency and performance of a system by allowing it to access and manipulate data stored in a database quickly. Queries can be optimized to retrieve only the data that is needed and to do so in the most efficient way possible. Overall, the use of database queries or SQL in system design is crucial for storing, organizing, and accessing data in a scalable and efficient manner.
Proxy servers are typically some type of software or hardware that resides between a client and another server. It may be located anywhere between the clients and the destination servers, including the user's computer. A proxy server receives requests from clients, transmits them to the origin servers, and then returns the server's response to the client that requested them in the first place.
Proxy servers are widely used to process requests, filter requests, log requests, and occasionally alter requests (by adding/removing headers, encrypting/decrypting, or compressing). It facilitates the coordination of requests coming from several servers and can be applied to optimize request traffic on a system-wide level.
There are two types of proxies in system design:
Forward Proxy
In interactions between clients and servers, a "forward proxy" operates on behalf of (replaces) the client to assist users. It represents the user personally and relays the user's requests. The server won't be aware that the request and response are being sent through the proxy while using the "forward proxy."
Reverse Proxy
A reverse Proxy is most helpful in complicated systems. Reverse proxies are intended to assist servers by acting on their behalf. The client won't be aware that the request and response are passing through a proxy when using a reverse proxy.
The main server can delegate several functions to a "reverse proxy," which can also serve as a gatekeeper, screener, load-balancer, and general helper.
Every read request ought to receive the data that was most recently written, according to consistency from the CAP theorem. When there are several versions of the same data, it becomes difficult to synchronize them so that the clients always receive up-to-date information. The available consistency patterns are as follows:
Weak Consistency
The read request may or may not be able to obtain the new data following a write operation.
Real-time use cases like VoIP, video chat, and online gaming all benefit from this kind of stability.
Eventual Consistency
The reads will finally view the most recent data within milliseconds after a data write. Here, asynchronous replication of the data is used. DNS and email systems both use them. In highly accessible systems, this works well.
Strong Consistency
The succeeding reads will view the most recent data following a data write. Here, synchronous replication of the data is used. This is seen in RDBMS and file systems, which are appropriate for systems needing data transfers.
Block storage is a method of storing data that divides the data into equal-sized blocks and assigns a unique identifier to each block for convenience. Blocks can be stored anywhere in the system instead of following a predetermined path, which makes better use of the system's resources.
Few examples for block storage tools are LVM (Logical Volume Manager), SAN (Storage Area Network), ZFS (Zettabyte File System), and many more.
Choosing the right tool(s) for a system depends on the specific requirements of the system and the underlying storage infrastructure. Block storage is typically used to store data that needs to be accessed quickly and frequently, such as operating system files, application data, and database files. It is also used to store data that needs to be accessed randomly, as it allows individual blocks of data to be accessed directly, rather than having to read through a large amount of data sequentially.
Block storage is a powerful tool for storing and managing data in a system, and it is often used in conjunction with other types of storage to create a well-rounded storage strategy.
A hierarchical storage methodology is file storage. The information is saved in files using this technique. Folders contain the files, which are then housed in directories. It is often used in systems to store data that is accessed less frequently than data stored in block storage, such as large documents, media files, and backups.
Only a small amount of data, primarily structured data, can be stored using this method. This data storage technique can be troublesome as the size of the data exceeds a certain threshold.
Several factors like performance, capacity, data organization, and data protection can affect it.
Large amounts of unstructured data can be handled by object storage. Each object is typically a large file, such as a video or image, and is stored with a unique identifier and metadata that describes the object. Due to the importance of backups, unstructured data, and log files for all systems, this type of storage offers systems a tremendous amount of flexibility and value.
Object storage would be beneficial for your business if you were designing a system with large datasets. It is designed to scale horizontally, allowing it to store vast amounts of data without experiencing a decrease in performance.
A few object storage tool examples are Amazon S3, Google Cloud Storage, Ceph, OpenStack Swift, and many more. Some factors to consider while deciding on a tool would be scalability, durability, cost and the features that are required for the system. An operating system cannot directly access object storage. RESTful APIs are used for communication at the application level. Due to its dynamic scalability, object storage is the preferred method of data storage for data backups and archiving.
Web servers and application servers are both types of servers that are used in computer networks to serve different purposes. A web server is a dedicated server with the sole purpose of handling web requests. These are designed to host and serve web content, such as HTML, CSS, and JavaScript files, to clients over the internet web servers are typically optimized for serving static content, such as images, videos, and documents, and do not typically include processing capabilities beyond basic HTTP handling.
Application servers, on the other hand, are servers that are designed to host and run applications. These applications may be web-based or standalone, and they may include dynamic content, such as databases, user accounts, and business logic. Application servers often include features such as database connectivity, security, and scaling capabilities, and they may be built using frameworks such as Java EE or .NET.
In structured design, the major tool used is a flowchart, which is a graphical representation of a process or system that shows the steps or activities involved and the relationships between them. Flowcharts are used to visualize and document the design of a system, and they can help to identify and resolve problems or issues during the design process.
Flowcharts are widely used in structured design because they provide a clear and concise way to represent the design of a system, and they are easy to understand and communicate to others. They are also flexible and can be used for a wide range of systems, from simple to complex.
In addition to flowcharts, other tools commonly used in structured design include data flow diagrams, entity-relationship diagrams, and state diagrams.
Latency is the amount of time it takes for a request to be processed and a response to be returned in a system. In system design, latency is an important consideration because it can impact the performance and user experience of the system. Several factors can contribute to latency in a system, including the speed of the network connection, the processing power of the servers or devices involved, and the complexity of the algorithms and processes being used.
To reduce latency in a system, designers can consider several strategies, such as optimizing the algorithms and processes being used, using faster hardware and networking equipment, and implementing caching and other performance-enhancing techniques.
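As a quick illustration, latency is simply the elapsed time around a request. In this sketch, handle_request is a hypothetical stand-in for real processing and network time:

```python
# Minimal sketch of measuring request latency with a monotonic clock.
# handle_request is a hypothetical placeholder for the real work.
import time

def handle_request() -> None:
    time.sleep(0.05)  # stand-in for processing + network time

start = time.perf_counter()
handle_request()
latency_ms = (time.perf_counter() - start) * 1000
print(f"latency: {latency_ms:.1f} ms")
```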
Throughput refers to the amount of data or transactions a system can handle in a given period of time. It is often used to evaluate the capacity or scalability of a system, and to identify potential bottlenecks or performance issues. One promising way to boost a system's throughput is to divide incoming requests and spread them across additional resources.
Numerous factors can impact the throughput of a system, including network bandwidth, the processing power and memory of the servers involved, and contention for shared resources such as disks and locks.
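Whereas latency is measured per request, throughput counts completed requests over an interval. A minimal sketch, again with a hypothetical handle_request standing in for real work:

```python
# Minimal sketch: throughput = completed requests / elapsed time.
import time

def handle_request() -> None:
    time.sleep(0.01)  # hypothetical per-request work

start = time.perf_counter()
completed = 0
while time.perf_counter() - start < 1.0:  # measure for one second
    handle_request()
    completed += 1
print(f"throughput: {completed} requests/second")
```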
According to the CAP (Consistency, Availability, Partition tolerance) theorem, a distributed system cannot concurrently guarantee C, A, and P; it can offer at most two of the three assurances. A distributed database system helps illustrate this.
Different databases guarantee different pairs of these properties. RDBMS databases such as SQL Server and MariaDB offer consistency and availability at the same time. Consistency and partition tolerance are ensured by Redis, MongoDB, and HBase, while availability and partition tolerance are achieved by Cassandra and CouchDB.
This is a regular feature in the list of top System design questions, be ready to tackle it in your next interview.
Horizontal scaling means increasing the number of machines on a network so that the processing and memory workload is distributed across many devices.
Vertical scaling refers to enhancing the resource capacity of a single machine, for example by adding RAM or more powerful processors. It can boost a server's capabilities without requiring any code changes.
Other factors to consider when deciding between horizontal and vertical scaling include cost, fault tolerance (a single large machine is a single point of failure), the practical upper limit on one machine's resources, and the added complexity of distributing work across many nodes.
This is a frequently asked question in Software design interviews.
Caching is the practice of keeping copies of files in a temporary storage space referred to as a cache, which facilitates faster data access and lowers site latency. Only a certain amount of data can be kept in the cache. Because of this, choosing cache update strategies that are best suited to the needs of the business is crucial. The different caching techniques are as follows:
Cache-aside: In this approach, the application is responsible for writing to and reading from storage; the storage and the cache do not interact directly. The application first searches the cache for an entry and, on a miss, retrieves it from the database and adds it to the cache for later use. The cache-aside strategy, also known as lazy loading, caches only the requested entries, preventing unnecessary data caching (a minimal sketch of this pattern appears after these strategies).
Write-through: According to this strategy, the system will read from and write data into the cache, treating it as its primary data store. The database is then updated to reflect these changes by the cache. Entries are synchronously written from the cache to the database.
Write-behind (write-back): In this strategy, the application adds or updates the entry in the cache and then asynchronously writes the entry back to the data store, which keeps write latency low.
Refresh-ahead: By employing this technique, we can set the cache to automatically refresh the cache entry before it expires.
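Here is the cache-aside sketch referenced above. The plain dict cache and the fetch_user_from_db function are illustrative stand-ins for a real cache (such as Redis) and a real database query:

```python
# Minimal sketch of the cache-aside (lazy loading) strategy described above.
# The dict-based cache and fetch_user_from_db are illustrative stand-ins
# for a real cache (e.g. Redis) and a real database query.
cache: dict[int, dict] = {}

def fetch_user_from_db(user_id: int) -> dict:
    return {"id": user_id, "name": f"user-{user_id}"}  # hypothetical query

def get_user(user_id: int) -> dict:
    if user_id in cache:                # cache hit: serve from the cache
        return cache[user_id]
    user = fetch_user_from_db(user_id)  # cache miss: read from the database
    cache[user_id] = user               # populate the cache for later reads
    return user

print(get_user(42))  # first call misses and fills the cache
print(get_user(42))  # second call is served from the cache
```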
Expect to come across this, one of the top System design interview questions.
Below are the drawbacks of each:
Cache-aside
When a cache miss occurs, there is a noticeable delay because data must first be fetched from the database before being cached. If data is updated in the database, the cached copy can become stale; this can be minimized by forcing an update of the cache entry with a time-to-live (TTL) parameter. Latency also increases when a cache node fails and is replaced by a new, empty node.
Write-through
Because every write is synchronous, this strategy is slow overall. Data written to the cache may also never be read; setting an appropriate TTL minimizes this risk.
Write-behind
The main drawback of this approach is the potential for data loss if the cache is destroyed before the contents are written into the database.
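The data-loss window can be seen in a minimal write-behind sketch: the cache is updated immediately, while a background worker flushes writes to the "database" later. The dict cache, the queue, and write_to_db are all illustrative stand-ins:

```python
# Minimal sketch of write-behind caching: the write lands in the cache
# immediately and is flushed to the "database" asynchronously.
import queue
import threading

cache: dict[str, str] = {}
pending: queue.Queue = queue.Queue()

def write_to_db(key: str, value: str) -> None:
    print(f"persisted {key}={value}")  # hypothetical database write

def flusher() -> None:
    while True:
        key, value = pending.get()
        write_to_db(key, value)  # if the cache dies before this runs, the write is lost
        pending.task_done()

threading.Thread(target=flusher, daemon=True).start()

def put(key: str, value: str) -> None:
    cache[key] = value         # fast, synchronous cache update
    pending.put((key, value))  # database write deferred to the background

put("session:1", "active")
pending.join()  # wait for the background flush in this demo
```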
A network of globally dispersed proxy servers known as a "content delivery network," or CDN for short, serves content to users from locations close to them. Static files like HTML, CSS, JS files, images, and videos are typically served from CDNs on websites.
Users don't have to wait long because data is delivered from centers near them, and because CDNs absorb part of the traffic, the load on the origin servers is significantly reduced.
We have 2 types of CDNs as below:
Push CDNs: Here, the CDN receives new content every time the server changes it; we are responsible for uploading content to the CDN. The CDN is updated only when content is changed or added, which maximizes storage efficiency while minimizing traffic.
Push CDNs generally work well for websites with less traffic or content.
Pull CDNs: Here, fresh content is fetched from the origin server when the first user requests it from the site. As a result, initial requests take longer to complete until the content is cached on the CDN. These CDNs use as little space as possible, but re-pulling files that have expired without actually changing can create redundant traffic. Pull CDNs are effective for busy websites.
The process of sharding involves dividing a sizable logical database into numerous small datasets called shards. Some database data persists across all shards, whereas other data only appears in one shard. The terms "vertical sharding" and "horizontal sharding" can be used to describe these two situations. A sharded database can now accommodate more requests than a single large machine can. Databases can be scaled by using sharding, which increases throughput, storage capacity, and availability while assisting in handling the increased load.
Before data can be sharded, a shard key must be chosen to partition it. An indexed field, or an indexed compound field that appears in every document in the collection, can serve as the shard key. Sharding lets the application run fewer queries: when a request arrives, the application knows where to route it and searches only the relevant shard instead of the entire database. This boosts the overall performance and scalability of the application.
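Routing by shard key is often done by hashing. A minimal sketch, where the shard names and the "user:..." key format are illustrative assumptions:

```python
# Minimal sketch of routing by shard key: hash the key, then map the
# hash onto one of N shards. Shard names and keys are illustrative.
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]

def shard_for(shard_key: str) -> str:
    digest = hashlib.md5(shard_key.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# The application routes each request to a single shard instead of
# searching the entire database.
print(shard_for("user:1001"))
print(shard_for("user:1002"))
```

Note that this simple modulo scheme reshuffles most keys when the shard count changes; production systems often use consistent hashing to avoid that.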
Data partitioning is the process of splitting a large dataset into smaller, more manageable pieces called partitions. Partitioning a large database can enhance its performance, controllability, and availability. In some situations, partitioning can improve performance when accessing a partitioned table. It is common to practice partitioning databases for load balancing, performance, manageability, and availability.
There are three types of partitioning, viz. horizontal partitioning, vertical partitioning, and functional partitioning. As an example of horizontal partitioning, a Users table might be split by row, with customers in one region stored in one partition and customers in another region stored in a second partition, each partition sharing the same schema.
By acting as a leading column in indexes, partitioning can reduce index size and improve the chances of keeping the most heavily used indexes in memory. Scanning a single partition, when a significant portion of it appears in the result set, is much quicker than using an index to reach data dispersed throughout the entire table. Adding and removing partitions enables bulk loading and deletion of data, which improves performance, and rarely used data can be moved to less expensive storage systems.
A must-know for anyone heading into the technical rounds, this is one of the most frequently asked System design interview questions.
Database replication is a process of copying data from a database on one server to a database on another server. This is typically used to improve data availability, scalability, and performance by allowing multiple servers to handle the load of a database system. There are several different types of replication, including master-slave replication, where one server acts as the primary source of data and the other servers act as replicas, and peer-to-peer replication, where all servers are equal and can both read and write to the database.
Replication can be useful in a variety of situations, including when a database needs to be accessed by users in different geographic locations or when a database needs to be backed up for disaster recovery purposes.
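In a master-slave (primary/replica) setup, applications commonly split traffic so writes go to the primary and reads are spread across replicas. A minimal routing sketch, with illustrative hostnames:

```python
# Minimal sketch of read/write splitting in a primary/replica setup.
# Hostnames are illustrative; a real router would use a SQL parser
# rather than this crude prefix check.
import random

PRIMARY = "db-primary.internal"
REPLICAS = ["db-replica-1.internal", "db-replica-2.internal"]

def host_for(query: str) -> str:
    is_write = query.lstrip().upper().startswith(("INSERT", "UPDATE", "DELETE"))
    return PRIMARY if is_write else random.choice(REPLICAS)

print(host_for("SELECT * FROM users"))        # served by a replica
print(host_for("UPDATE users SET name='x'"))  # must go to the primary
```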
RAID (Redundant Array of Independent Disks) is a technology used to improve the performance, reliability, and fault tolerance of a data storage system. It works by combining multiple physical disks into a logical unit, allowing the system to read and write data across the disks in a way that increases performance and reliability.
There are several different RAID configurations, each with its own unique set of characteristics and benefits. Common levels include RAID 0 (striping data across disks for performance, with no redundancy), RAID 1 (mirroring the same data on multiple disks for fault tolerance), RAID 5 (striping with distributed parity, which tolerates a single disk failure), and RAID 10 (a combination of mirroring and striping).
These are just a few examples of the different RAID levels that are available. Many other RAID levels have been developed, each with its unique combination of features and trade-offs. It is important to carefully consider the requirements of your system and choose a RAID level that meets your needs.
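The parity idea behind levels like RAID 5 is simply XOR: the parity block is the XOR of the data blocks, so any single lost block can be rebuilt from the survivors. A minimal sketch with two toy data blocks:

```python
# Minimal sketch of RAID-style parity: the parity block is the XOR of
# the data blocks, so any single lost block can be rebuilt from the rest.
def xor_blocks(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

block_1 = b"\x01\x02\x03\x04"
block_2 = b"\x10\x20\x30\x40"
parity = xor_blocks(block_1, block_2)

# Simulate losing block_1: rebuild it from the surviving block and parity.
rebuilt = xor_blocks(block_2, parity)
assert rebuilt == block_1
print("recovered:", rebuilt)
```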