System Design Concepts for Beginners
Do you want to level up as a junior software engineer so that you can build scalable apps, or get a pay bump by passing your system design interview round?
Well, you're gonna need a toolbox of essential system design concepts, which include:
- Networking
- API patterns
- Databases and more
Before we dive in, here are the core system design concepts I am hoping you will be able to grasp by the time you are done. They include:
- Vertical Scaling
- Horizontal Scaling
- Load Balancer
- CDN (Content Delivery Network)
- Caching
- IP Address
- TCP/IP
- Domain Name System (DNS)
- HTTP
- REST
- GraphQL
- gRPC
- WebSockets
- SQL / NoSQL
- Sharding
- Replication
- CAP Theorem
- Message Queues
Vertical Scaling
It is simply adding more resources like RAM and CPU to your server.
Let’s consider a server accepting and fulfilling requests for your application, but now we're getting more users. So, we need to scale it, and the easiest thing to do is to add more resources like RAM or upgrade your CPU.
This is known as vertical scaling. It's pretty straightforward, but it's constrained by the limits of a single machine.
Horizontal Scaling
Horizontal scaling is all about adding replicas. As a result, we benefit from fault tolerance and additional redundancy, and we minimize single points of failure.
A better alternative to vertical scaling is to add replicas so that each server can handle a subset of requests; this is known as horizontal scaling.
It's more powerful because we can scale almost infinitely and don't need especially powerful machines. It also adds redundancy and fault tolerance, because if one of our servers goes down, the other servers can continue to fulfill requests.
Proxy Server
In this section, I will be seeking to answer the following critical questions about a proxy server:
- What is a proxy server?
- What are the functions of a proxy server?
- What are the benefits of using a proxy server?
What is a proxy server?
A proxy server is an intermediary, either hardware or software, that sits between the client and the server.
What are the functions of a proxy server?
- They filter requests
- They log requests
- They transform requests by adding or removing headers
- They encrypt/decrypt traffic and compress data.
Benefits of using a proxy server
a) Cache:
Proxies can serve a lot of requests from their cache: they coordinate requests from multiple clients and answer repeated requests without hitting the origin server, which optimizes traffic.
User 1 → about ┐
User 2 → about ┼→ [proxy] ──(one "about" request)──→ server
User 3 → about ┘
b) They can collapse requests for data that sits close together in storage, which decreases request latency, e.g., merging requests for the same region and sending them to the server at once.
c) Proxies are useful under high-load situations or when caching capacity is limited, since they can merge and filter requests.
d) They provide anonymity and may be used to bypass IP address blocking. For example, they hide the client's IP address and instead present the proxy server's address to the server.
Load Balancers
When dealing with load balancers, the big question is: how do we ensure that one server won't get overloaded while the others sit idle?
For that, we'll need a load balancer, a server known as a reverse proxy that directs incoming requests to the appropriate server. This eliminates our previous single point of failure, though the downside is that the setup is more complicated.
Broadly speaking, load balancers come in three types:
- Hardware
- Software, e.g., HAProxy
- Hybrid approaches
We can use an algorithm like round robin, which will balance traffic by cycling through our pool of servers, or we can go with another approach, like hashing the incoming request ID.
There are two basic types of round robin algorithms:
Round Robin
req1 → server 1
req2 → server 2
req3 → server 3
If you consider the three requests above, once the last server (server 3) has received a request, the cycle restarts from server 1.
Round Robin with weighted server
I will illustrate this with two servers, where the first can handle two requests while the second can only take one. That basically means server 1 has more capacity.
req1 → server 1
req2 → server 1
req3 → server 2
Other variations of round robin include:
a) Least connections – routes to the node with the fewest active connections
b) Least response time – routes to the node that is responding the fastest
In our case, the goal is to even the amount of traffic each server gets, but if our servers are located worldwide, we could even use a load balancer to route a request to the nearest location.
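The plain and weighted round-robin schemes above can be sketched in a few lines of Python (the server names and weights are made up for illustration):

```python
from itertools import cycle

class RoundRobin:
    """Cycle through the pool of servers, one request at a time."""
    def __init__(self, servers):
        self._cycle = cycle(servers)

    def next_server(self):
        return next(self._cycle)

class WeightedRoundRobin:
    """Expand each server by its weight, then cycle through the expanded pool."""
    def __init__(self, weighted_servers):  # e.g., [("server1", 2), ("server2", 1)]
        expanded = [name for name, weight in weighted_servers for _ in range(weight)]
        self._cycle = cycle(expanded)

    def next_server(self):
        return next(self._cycle)

rr = RoundRobin(["server1", "server2", "server3"])
print([rr.next_server() for _ in range(4)])   # ['server1', 'server2', 'server3', 'server1']

wrr = WeightedRoundRobin([("server1", 2), ("server2", 1)])
print([wrr.next_server() for _ in range(3)])  # ['server1', 'server1', 'server2']
```

Real load balancers track live server health as well, but the selection logic is essentially this.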
Content Delivery Networks (CDN)
You can use a CDN if you serve static files like images, videos, and sometimes HTML, CSS, and JavaScript files.
It's a network of servers located all around the world. CDNs don't run any application logic. Instead, they take the files hosted on your server, the origin server, and copy them onto the CDN's servers.
This can be done on a push or pull basis, but CDNs are just one technique for caching, which is all about creating copies of data to be re-fetched faster.
Caching
Making network requests can be expensive, so our browsers will sometimes cache data onto disk. But reading from disk can also be costly, so our computers copy frequently used data into memory.
Even reading memory has a cost, which is why a subset of it is kept in the CPU's L1, L2, or L3 caches.
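The same idea applies inside an application: keep a copy of expensive results closer to where they're needed. Here's a minimal sketch with Python's `functools.lru_cache`, where the slow lookup is hypothetical:

```python
import time
from functools import lru_cache

@lru_cache(maxsize=128)
def fetch_profile(user_id):
    """Stand-in for an expensive network or disk read (data is made up)."""
    time.sleep(0.05)  # simulate latency
    return {"id": user_id, "name": f"user-{user_id}"}

start = time.perf_counter()
fetch_profile(42)                     # cache miss: pays the full cost
miss_time = time.perf_counter() - start

start = time.perf_counter()
fetch_profile(42)                     # cache hit: served from memory
hit_time = time.perf_counter() - start

print(f"miss: {miss_time:.3f}s, hit: {hit_time:.6f}s")
```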
IP Address
How do computers communicate with each other? Well, every computer is assigned an IP address, which uniquely identifies the device on a network.
TCP / IP
To round it all out, we have the somewhat confusingly named TCP/IP, or the Internet Protocol Suite, since it actually includes UDP as well.
But focusing on TCP for a second, there has to be some set of rules, AKA protocols, that decide how we send data over the Internet.
Like in real life, we have a system that decides how we send mail to each other.
Usually, when we send data like files, it's broken down into individual packets and sent over the Internet. Each packet is numbered so that, when they arrive at the destination, they can be reassembled in the original order.
If some packets are missing, TCP ensures that they'll be resent. This makes it a reliable protocol and that is why many other protocols like HTTP and WebSockets are built on top of TCP.
Domain Name System (DNS)
When you type pythonhaven.com into your search bar, how does your computer know which IP address it belongs to?
For that, we have the Domain Name System (DNS), a largely decentralized service that translates a domain into its IP address.
When you buy a domain from a DNS registrar, you can create a DNS A record, which stands for address, and then you can enter the IP address of your server.
So, when you visit the domain, your computer performs a DNS query to obtain the IP address using the A record mapping, and then your operating system caches the result so that it doesn't need to make a DNS query every single time.
HTTP
But wait a minute … why do we usually use HTTP to view websites? Well, TCP is too low-level; we don't want to have to worry about individual data packets.
We have an application-level protocol like HTTP, which is what developers like you and I use on a day-to-day basis.
It follows the client-server model, where a client will initiate a request that includes two parts.
1. request header
The first is the request header. Think of that as the shipping label you put on a package. It tells you where the package is going, who it's from, and other metadata about it.
2. request body
The second part is the request body, basically the package contents.
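As a sketch, here is how the two parts map onto a request object in Python's standard library (the endpoint and payload are made up; the request is built but never sent):

```python
import json
from urllib.request import Request

# Hypothetical endpoint and payload, just to show the two parts of a request.
req = Request(
    "https://api.example.com/subscribe",
    data=json.dumps({"channel": "example"}).encode(),  # request body: the package contents
    headers={"Content-Type": "application/json"},      # request header: the shipping label
    method="POST",
)

print(req.get_method())   # POST
print(req.full_url)       # where the "package" is going
print(req.data)           # the payload itself
```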
Testing
To test it out, open your DevTools Network tab and click Subscribe on a YouTube channel.
Take a closer look at the request. We can see that the response also includes a header and a body.
But even with HTTP, there are multiple API patterns we could follow. These include:
REST API
The most popular is REST, a standardization around HTTP APIs that makes them stateless and follow consistent guidelines.
For example, a successful request should include a 200 code in its response header, whereas a bad request from the client would return a 400 code, and an issue with the server would result in a 500-level code.
What are the key characteristics of a REST API?
1) Uniform Interface
- The service is resource-based
- Data is manipulated through representations
- Messages are self-descriptive
2) Stateless
- All state is passed as part of the query or path parameters, so the server doesn't need to store client session state.
3) Cacheable
4) Client Server
- The client code is separate from the server code.
Note that REST doesn't need much bandwidth, since requests and responses are ordinary HTTP messages.
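As a rough sketch of these conventions, here is a tiny handler in Python (the in-memory store, route, and data are all made up) showing how the status code signals the outcome:

```python
# Hypothetical in-memory resource store for a GET /users/<id> endpoint.
users = {1: {"id": 1, "name": "Ada"}}

def get_user(user_id):
    """Return (status_code, body), following REST status-code conventions."""
    if not isinstance(user_id, int):
        return 400, {"error": "user id must be an integer"}  # client error
    if user_id not in users:
        return 404, {"error": "user not found"}              # resource missing
    return 200, users[user_id]                               # success

print(get_user(1))    # (200, {'id': 1, 'name': 'Ada'})
print(get_user(99))   # (404, {'error': 'user not found'})
```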
GraphQL
GraphQL is another API pattern introduced by Facebook in 2015. The idea is that instead of making another request for every single resource on your server like you would do with REST API, with GraphQL, you can make a single request, AKA a query, and choose exactly which resources you want to fetch.
This means you can fetch multiple resources with a single request, and you don't end up over-fetching data that isn't actually needed.
Clients can use GraphQL to only request the required data, offering flexible and efficient data retrieval.
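To make the idea concrete, here is what such a query might look like against a hypothetical schema (all field names are illustrative); the client receives exactly these fields and nothing more, where REST might require separate /user and /posts requests:

```python
# A single GraphQL query (hypothetical schema) selecting exactly the
# fields the client needs, so nothing is over-fetched.
query = """
query {
  user(id: 42) {
    name
    posts(limit: 2) {
      title
    }
  }
}
"""
print(query)
```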
gRPC
Another API pattern is gRPC, though it's more accurately a framework, released by Google in 2016.
It was also meant to be an improvement over REST APIs.
It's an RPC framework mainly used for server-to-server communication, but there's also gRPC-Web, which allows using gRPC from a browser and has been gaining adoption over the last few years.
The performance boost comes from protocol buffers. Comparing them to JSON, which is what REST APIs use, protocol buffers are an improvement in that data is serialized into a binary format, which is usually more storage efficient and, of course, sending less data over a network will be faster.
The downside is that JSON is much more human-readable since it's just plain text.
To summarize, gRPC is a high-performance RPC framework from Google, using Protocol Buffers for efficient serialization.
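To get a feel for why binary serialization saves bytes, here is a rough comparison in Python. The `struct` module is only a stand-in for Protocol Buffers (which use a more sophisticated varint encoding), but it illustrates the text vs. binary size difference:

```python
import json
import struct

record = {"id": 7, "score": 3.5}

as_json = json.dumps(record).encode()   # text encoding, as a REST API might send
as_binary = struct.pack("<if", 7, 3.5)  # fixed binary layout: 4-byte int + 4-byte float

print(len(as_json))    # JSON text: field names and punctuation cost bytes
print(len(as_binary))  # binary: 8 bytes total
```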
WebSockets
Another app layer protocol is WebSockets.
Let's take chat apps, for example, to understand the main problem that it solves. Usually, when someone sends you a message, you receive it immediately.
If we were to implement this with HTTP, we would have to use polling, where we would periodically request to check if a new message is available for us.
But unlike HTTP/1, WebSockets support bi-directional communication. So, when you get a new message, it's immediately pushed to your device, and whenever you send a message, it's immediately pushed to the receiver's device.
The takeaway is that WebSockets enable real-time, full-duplex communication over a single connection.
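The contrast between polling and pushing can be sketched with simple in-memory stand-ins (no real HTTP or WebSocket connection is involved):

```python
# Polling (HTTP-style): the client repeatedly asks "anything new?"
pending = []

def poll():
    """One polling round trip; often returns nothing, wasting a request."""
    return pending.pop(0) if pending else None

# Push (WebSocket-style): the client registers a handler once, and the
# server delivers each message the moment it arrives.
handlers = []
received = []

def on_message(handler):
    handlers.append(handler)

def server_push(message):
    for handler in handlers:
        handler(message)

on_message(received.append)
server_push("hello")   # delivered immediately, no polling loop needed
print(received)        # ['hello']
```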
SOAP
SOAP is a protocol-based XML standard with features like security and transactions.
MQTT
MQTT is a simple communications protocol that works well for low-bandwidth networks and is particularly popular in IoT scenarios.
Choosing SQL vs. NoSQL
Let's explore when to choose SQL and when to choose NoSQL!
SQL
To store data, we have SQL, or relational, database management systems like MySQL and Postgres.
But why should we use a database when we can store everything in text files stored on disk?
Well, with databases, we can store data more efficiently using data structures like B-trees, and we get fast retrieval via SQL queries since data is organized into rows and columns.
ACID
Tables in relational database management systems are usually ACID-compliant. Durability means that data is stored on disk, so even if a machine restarts, the data will still be there.
Isolation means that different concurrent transactions won't interfere with each other.
Atomicity means that every transaction is All or Nothing.
Lastly, we have consistency, meaning foreign key and other constraints will consistently be enforced.
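To see atomicity in action, here is a small sketch using Python's built-in sqlite3 module (the table and amounts are made up): the two balance updates inside the transaction either both commit or both roll back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 0)")
conn.commit()

try:
    with conn:  # atomicity: both updates commit together, or neither does
        conn.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 1")
        conn.execute("UPDATE accounts SET balance = balance + 30 WHERE id = 2")
except sqlite3.Error:
    pass  # on failure, the transaction rolls back automatically

print(conn.execute("SELECT balance FROM accounts ORDER BY id").fetchall())
# [(70,), (30,)]
```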
NoSQL
That last property, consistency, is worth dwelling on, because it helped motivate the creation of NoSQL, or non-relational, databases.
Enforcing consistency makes databases harder to scale, so NoSQL databases drop this constraint, along with the whole idea of relations.
You can also consider it sacrificing ACID compliance for performance and scalability.
In NoSQL, schemas are dynamic. You can also add columns on the fly and each row doesn't have to contain each column.
Queries focus on a collection of documents, sometimes referred to as UnQL (Unstructured Query Language).
They are also horizontally scalable, meaning we can add more servers easily.
They are designed for cloud-based computing and storage, scaling across multiple data servers.
There are many NoSQL databases, but the popular categories are key-value stores, document stores, wide-column databases, and graph databases.
Types of No-SQL Databases
a) Key-value stores include Redis, Voldemort, and Dynamo.
b) Document databases include MongoDB and CouchDB.
c) Wide-column databases organize data into column families and are best suited for analyzing large datasets. Examples include HBase and Cassandra.
d) Graph databases save data in graph structures with nodes (entities), properties, and edges (connections). Examples include InfiniteGraph and Neo4j.
When do you choose SQL?
Here are some valuable scenarios that dictate the use of SQL:
- When the schema is not likely to change
- Your data is structured and unchanging
- When you want to ensure ACID compliance.
When to choose No-SQL Database?
Here are good reasons that will encourage you to use NoSQL Database:
- When schemas are dynamic
- When storing large volumes of data with rapid development and no fixed structure.
- When data needs to be stored across servers in different regions.
Sharding
Back to consistency for a second. If we don't have to enforce any foreign key constraints, we can break up our database and scale it horizontally with different machines.
This technique is called sharding.
But how do we decide which portion of the data to put on which machine? Well, we usually have a Shard key.
So, given a table of people, our Shard key could be the person's ID.
However, sharding can get very complicated.
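A minimal sketch of hash-based sharding, assuming the person's ID is the shard key and the shard count is fixed (real systems often prefer consistent hashing, since a plain modulo reshuffles nearly all keys whenever the shard count changes):

```python
def shard_for(person_id, num_shards=4):
    """Pick the machine for a row: the shard key (here, the person's ID)
    modulo the number of shards is the simplest possible scheme."""
    return person_id % num_shards

# Rows spread across the four shards:
for person_id in [1, 2, 5, 8]:
    print(person_id, "-> shard", shard_for(person_id))
```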
A more straightforward approach is replication. If we want to scale our database reads, we can make read-only copies of our database. This is called Leader-Follower replication.
Every write goes to the leader, which forwards it to the followers, while reads can go to either the leader or a follower.
There's also Leader-Leader replication, where every replica can be used to read or write, but this can result in inconsistent data. So, it would be best to use it where you can have a replica for every region in the world.
CAP
CAP stands for consistency, availability, and partition tolerance. Here's everything important you need to know about each.
a) Consistency
- Every read receives the most recent write or an error.
b) Availability
- Every request receives a (non-error) response without guaranteeing that it contains the most recent write.
c) Partition tolerance
- The system continues to operate even though an arbitrary number of messages may be dropped (or delayed) by the network between nodes.
CAP Theorem
It can be complex to keep replicas in sync, and the CAP theorem was created to weigh the trade-offs of replicated designs.
It states that given a network partition in a database, we as database designers can only choose to favor either data consistency or data availability.
If you choose consistency, you will respond with errors or timeouts during the network failure, since you can't propagate updates to all the nodes.
On the other hand, if you choose availability, you will serve the last value you have, even if it may be stale.
Note that you can maintain both consistency and availability as long as there is no partition.
What makes it confusing is that consistency here means something different than it does in ACID. It's a controversial theorem, which is why the more complete PACELC theorem was created.
Message Queues
In this segment, I explore the following concepts:
1) What is a message queue?
2) How it works
3) Available Message Queues
What is a message queue?
Message queues provide asynchronous service-to-service communication, used in serverless and microservices architectures.
Message queues are like databases in that they offer durable storage and can be replicated for redundancy or sharded for scalability. They are especially useful when our system receives more data than it can process.
Let's introduce a message queue so that our data can be persisted before we can process it. In doing so, we also get the added benefit that different parts of our app can become decoupled.
How Message Queue Works
Producers store messages in the queue until consumers process and delete them.
Each message is processed by only one consumer.
Queues are also used for fault tolerance. In case of any service outages, they can retry requests.
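Here is a minimal in-process sketch of that flow, using Python's standard `queue` module as a stand-in for a real message broker: a producer enqueues messages, and a single consumer processes each one exactly once.

```python
import queue
import threading

q = queue.Queue()   # stand-in for a durable message queue

def producer():
    for i in range(3):
        q.put(f"order-{i}")   # producer stores messages in the queue

processed = []

def consumer():
    while True:
        msg = q.get()         # each message is delivered to one consumer
        if msg is None:       # sentinel: shut down
            break
        processed.append(msg)
        q.task_done()         # "delete" the message after processing

t = threading.Thread(target=consumer)
t.start()
producer()
q.put(None)
t.join()
print(processed)   # ['order-0', 'order-1', 'order-2']
```

Because the queue sits between the two sides, the producer and consumer are decoupled and can run at different speeds.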
Examples of Message Queues
- Apache ActiveMQ
- Apache Kafka
- Apache Qpid
- Apache RocketMQ
- JBoss Messaging
- RabbitMQ
- Sun Open Message Queue
- Tarantool
- JORAM
Monolithic vs. Microservices
Monolithic architecture is like a giant container in which all of an application's software components are built and tightly coupled together.
Issues with Monolithic architecture:
- Agility – adding a new service can mean changing the whole architecture or platform.
- Scalability
- Fault tolerance – if one component goes down, the whole system goes down.
- Single Framework
Microservice Architecture
This is where a monolithic application is decomposed into separate components.
- Different services handle different components and interact with each other through REST APIs. For example, you can have one service for search, another for indexing, another for notifications, and so on.
- Each service has its own load balancer, cache, indexes, and REST API.
What are the benefits of a Microservice architecture?:
- Single capabilities
- Independent as a product
- Decoupling
- Continuous Delivery
- Componentization
- Autonomy
- Scalability
What are the best practices when dealing with microservice architecture?
- Separate data stores
- Separate services by feature
- Stateless servers
The Verdict
If you've learned anything by now, it's that software engineering is largely about finding new and complicated ways to store and move data around.