Scalability For Junior Engineers

Assalam u Alaikum & Hello everyone! 👋

I hope you all are good 😁

Today, we are going to learn about scalability 🚀 If you are curious to know what scalability is, how we measure it, and what the approaches to scaling software are, then this post is for you.

I'll explain all the concepts in simple terms. The target audience for this post is junior engineers who are passionate about learning and growing in engineering 🚀

PS: This post does not cover the technical implementations but the knowledge of the core concepts that you should know before implementing them.

Let's get started 🔓

What Is Scalability?

First, we have to understand what scalability is. What does scaling a system mean?

"Scalability is the ability to adjust the capacity of the system to cost-efficiently fulfill the demands."

Core Concepts

Scalability usually means an ability to handle more users, clients, data, transactions, or requests without affecting the user experience.
It is important to remember that scalability should allow us to scale down as much as scale up and that scaling should be relatively cheap and quick to do.
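
To make "scale up and scale down" a bit more concrete, here is a minimal sketch of how you might work out the number of servers needed for the current load. The function name `servers_needed`, the capacity of 500 requests per second per server, and the minimum of one server are all assumptions made up for this illustration, not numbers from a real system.

```python
import math

def servers_needed(requests_per_second: float,
                   capacity_per_server: float = 500.0,
                   minimum: int = 1) -> int:
    """Return how many servers are needed to serve the given load."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    # Round up: 8.4 servers' worth of traffic needs 9 real servers.
    needed = math.ceil(requests_per_second / capacity_per_server)
    # Never go below the minimum, even when traffic is close to zero.
    return max(minimum, needed)

# Peak traffic: scale up.
peak = servers_needed(4200)   # 9
# Quiet hours: scale back down so we do not pay for idle machines.
quiet = servers_needed(300)   # 1
```

The same calculation gives a larger number when traffic grows and a smaller one when it drops, which is exactly the "up and down" behaviour described above.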

How To Measure Scalability

Now that we know what scalability is, let's discuss how to measure the scalability of a system. It depends on many factors, but if we narrow them down, there are three main dimensions we can use to measure a system's capability.

Handling Data
Higher Concurrent Requests
Higher Interaction Rates

Handling Data

As your business grows and becomes more popular, you will be handling more and more data.

If we take the example of Hashnode, the data is increasing every day. Hashnode has to efficiently handle more user accounts, more blogs, and many more pieces of data.

Processing more data puts pressure on your system, as data needs to be sorted, searched through, read from disks, written to disks, and sent over the network.

Facebook generates 4 petabytes of data per day.
Twitter generates 12 Terabytes of data per day.
As of 2021, there are over 1 billion active Instagrammers.

From the above examples, you can get a sense of how big these systems are and how well they scale. They handle huge amounts of data every day.

Higher Concurrent Requests

Concurrency measures how many clients your system can serve at the same time. If you are building a web application, concurrency means how many users can use your application at the same time without affecting their user experience.

Concurrency is difficult, as your servers have a limited number of central processing units (CPUs) and execution threads. It is even more difficult because you may need to synchronize parallel execution of your code to ensure consistency of your data. Higher concurrency means more open connections, more active threads, more messages being processed at the same time, and more CPU context switches.

Twitter has 396.5 Million users globally.
Instagram has 500 Million active users per day.

Can you imagine how concurrent these systems are? 🤯

Higher Interaction Rates

The third measurement of scalability is the rate of interactions between your system and your users. It is related to concurrency but is a slightly different dimension. The rate of interactions measures how often your clients exchange information with your servers.

For example, if you are building a website, your clients can navigate from one page to another page in a few seconds. The rate of interactions can be higher or lower independently of the number of concurrent users, and it depends more on the type of application you are building. The main challenge related to the interaction rate is latency.

Latency: Latency is the time it takes for data to pass from one point on a network to another.
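
If you want to see latency as a number, here is a small sketch that times a single operation from the caller's point of view (so it measures the full round trip, not one direction). The helper `measure_latency_ms` and the `fake_request` function are hypothetical; `fake_request` just sleeps for 50 ms to stand in for a real network call.

```python
import time

def measure_latency_ms(operation) -> float:
    """Run operation() once and return how long it took in milliseconds."""
    start = time.perf_counter()
    operation()
    return (time.perf_counter() - start) * 1000

def fake_request() -> None:
    # Stand-in for a real network call: pretend the round trip takes ~50 ms.
    time.sleep(0.05)

latency = measure_latency_ms(fake_request)
print(f"latency: {latency:.1f} ms")
```

In a real system you would take many measurements and look at averages and percentiles rather than a single value.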

Scaling Software Systems

Now that we have learned how to measure scalability, let's discuss how we can scale a software system to meet demand.

There are two types of scaling:

Vertical Scaling
Horizontal Scaling

Vertical Scaling

Vertical scalability is accomplished by upgrading the hardware and/or network throughput. It is often the simplest solution for short-term scalability, as it does not require architectural changes to your application. If you are running your server with 8GB of memory, it is easy to upgrade to 32GB or even 128GB by just replacing the hardware. You do not have to modify the way your application works or add any abstraction layers to support this way of scaling. If you are hosting your application on virtual servers, scaling vertically may be as easy as a few clicks to order an upgrade of your virtual server instance to a more powerful one.

Advantages Of Vertical Scaling

There are various ways you can vertically scale your system. Some of them are:

Adding more I/O capacity by adding more hard drives to a Redundant Array of Independent Disks (RAID).
Improving I/O access times by switching to solid-state drives (SSDs).
Reducing I/O operations by increasing RAM.
Improving network throughput by upgrading network interfaces or installing additional ones.
Switching to servers with more processors or more virtual cores. Servers with 12 and even 24 threads (virtual cores) are affordable enough to be a reasonable scaling option.

As you can see, vertical scaling is nothing but adding or improving the resources of your server machine. This is a great option, especially for very small applications or if you can afford hardware upgrades.

Disadvantages Of Vertical Scaling

However, vertical scaling has some serious limitations as well. Some of the drawbacks are:

Vertical scalability becomes extremely expensive beyond a certain point.
Vertical Scaling has hard limits. No matter how much money you may be willing to spend, it is not possible to continually add memory. Similar limits apply to CPU speed, number of cores per server, and hard drive speed. Simply put, at a certain point, no hardware is available that could support further growth.
Operating system design or the application itself may prevent you from scaling vertically beyond a certain point.

The number one disadvantage of having an application that resides on a single instance is that it represents a single point of failure.
Because hardware changes are involved, scaling vertically will, more often than not, translate into downtime.

Horizontal Scaling

Horizontal scalability is accomplished by a number of methods that increase capacity by adding more servers. Horizontal scalability is considered the holy grail of scalability, as it overcomes the increasing cost per unit of capacity that comes with buying ever-stronger hardware. In addition, when scaling horizontally you can keep adding servers, so you do not run into the hard limit of a single machine as you do with vertical scalability.

Advantages Of Horizontal Scaling

Here are some of the advantages of Horizontal Scaling:

Capacity is easy to upgrade, because you add machines instead of replacing them.
Fault tolerance is easier to achieve.
Better use of smaller systems.
Cost of implementation is less expensive compared to scaling vertically.
In principle, you can keep adding instances to support continued growth.
Horizontal scaling offers protection against single points of failure.
Highly resilient, low downtime.

Horizontal scaling is much more suitable for large-scale systems. Because the overall resources of your system are distributed across many servers, you can scale without being bound by the hardware limits of a single machine.

Disadvantages Of Horizontal Scaling

Horizontal scaling also has some downsides, some of which are:

Running more than one node increases the complexity of maintenance.
Initial costs are higher when implementing horizontal scaling.
Data consistency can be challenging across multiple machines (joins require cross-server communication).
Servers may still hit hardware limits if the individual machines are too small.

Building Highly Scalable Systems

Let's take a look at what strategies we have to make the software system scalable.

Load Balancing

Load balancing is critical for systems that are distributed and scaled horizontally. Load balancers help to manage the load of the servers as needed. Remember scaling up is important, but scaling down is also a crucial component of scalability. If your system is not scaling down when the load is less, your resources are wasted and the cost is on you.

Load balancers use an algorithm to spread the workload across servers to ensure no single server gets overwhelmed. It’s an absolute necessity to avoid performance issues.
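
One of the simplest algorithms a load balancer can use is round robin, where requests are handed to each server in turn. Below is a minimal sketch of that idea. The class name `RoundRobinBalancer` and the server names `app-1`, `app-2`, `app-3` are made up for illustration; real load balancers also handle health checks, weights, and much more.

```python
from itertools import cycle

class RoundRobinBalancer:
    """Hands out servers one after another, wrapping around at the end."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)

    def next_server(self):
        # Each call returns the next server in the rotation.
        return next(self._servers)

balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assigned = [balancer.next_server() for _ in range(5)]
# assigned == ["app-1", "app-2", "app-3", "app-1", "app-2"]
```

Round robin is easy to reason about, but it assumes every request costs about the same. Other algorithms (for example, picking the server with the fewest open connections) exist for when that assumption does not hold.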

Use Caching, Where Needed

Caching is a good way to return fast responses to your clients and reduce database hits. Caching is a vast topic, and it's really important to understand where and when you need it. There are many caching strategies as well. Overall, it's one of the most widely used components in scaling software systems.

Asynchronous Processing

Asynchronous processing means splitting work into discrete steps that don’t need to wait for the previous step to complete before they proceed. For example, a user can be shown a “sent!” notification while the email is still technically processing.

Asynchronous processing removes some of the bottlenecks that affect the performance for large-scale software.
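
The email example can be sketched with Python's `asyncio`. This is only an illustration under some assumptions: `send_email` and `handle_signup` are hypothetical names, and the 0.1 second sleep stands in for a slow mail server. In production you would more likely hand the job to a message queue and a separate worker.

```python
import asyncio

sent_emails = []

async def send_email(address: str) -> None:
    await asyncio.sleep(0.1)  # stand-in for a slow mail server call
    sent_emails.append(address)

async def handle_signup(address: str, background: set) -> str:
    # Start sending the email, but do not wait for it to finish.
    task = asyncio.create_task(send_email(address))
    background.add(task)  # keep a reference so the task is not lost
    task.add_done_callback(background.discard)
    return "sent!"

async def main():
    background = set()
    response = await handle_signup("user@example.com", background)
    # The user already has a response; the email has not gone out yet.
    emails_sent_at_response_time = len(sent_emails)
    # Before shutting down, let the background work finish.
    await asyncio.gather(*background)
    return response, emails_sent_at_response_time

response, emails_sent_at_response_time = asyncio.run(main())
# response == "sent!", emails_sent_at_response_time == 0
```

The user gets the response immediately, and the slow part completes in the background.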

Limit Concurrent Access To Limited Resources

Don’t duplicate efforts. If more than one request asks for the same calculation from the same resource, let the first finish and just use that result. This adds speed while reducing strain on the system.
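
A minimal sketch of "let the first finish and just use that result" could look like the following. The `SingleFlight` class and the `expensive_report` function are hypothetical names for this illustration. Five requests ask for the same report at the same time, but the calculation runs only once.

```python
import asyncio

class SingleFlight:
    """Shares one in-progress computation among all callers of the same key."""

    def __init__(self):
        self._in_flight = {}  # key -> asyncio.Task

    async def do(self, key, compute):
        task = self._in_flight.get(key)
        if task is None:
            # First caller for this key: start the computation.
            task = asyncio.create_task(compute())
            self._in_flight[key] = task
            # Forget the task once it is done so later calls recompute.
            task.add_done_callback(lambda _: self._in_flight.pop(key, None))
        # Every caller, first or not, waits on the same task.
        return await task

calls = 0

async def expensive_report():
    global calls
    calls += 1
    await asyncio.sleep(0.1)  # stand-in for a heavy calculation
    return "report"

async def main():
    flight = SingleFlight()
    requests = (flight.do("report", expensive_report) for _ in range(5))
    return await asyncio.gather(*requests)

results = asyncio.run(main())
# results == ["report"] * 5, and calls == 1
```

Note that this only removes duplicates while a calculation is in progress. To reuse results after it finishes, you would combine it with caching.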

Use A Scalable Database

It is important to choose a database that suits your application's needs. NoSQL databases tend to scale horizontally more easily than SQL databases. SQL databases scale read operations well enough, but scaling write operations is harder because of the restrictions that enforce ACID guarantees.

Scaling NoSQL usually requires less effort, so if ACID compliance isn’t a concern, a NoSQL database may be the right choice. On the other hand, if you want to use an SQL database, there are plenty of strategies to scale it as well.
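
One strategy that applies to both kinds of database is splitting the data across several database servers (often called sharding), so that each server handles only part of the writes. The sketch below shows just the routing step, assuming a hypothetical `shard_for` helper and four shards. It is not how any particular database does it.

```python
import hashlib

def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard number in the range 0..shard_count-1."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    # Use a stable hash so the same key always lands on the same shard.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

shard = shard_for("user-42", shard_count=4)
print(f"user-42 lives on shard {shard}")
```

A simple modulo like this moves most keys when the number of shards changes, which is one reason real systems use more careful schemes.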

Maintenance

In many software systems, the maintenance cost is much higher than the development cost. So focus on building a product that is maintainable. Set software up for automated testing and maintenance so that when it grows, the work of maintaining it doesn’t get out of hand.

Conclusion

Scalability usually means an ability to handle more users, clients, data, transactions, or requests without affecting the user experience.

It is important to remember that scalability should allow us to scale down as much as scale up and that scaling should be relatively cheap and quick to do.

Scaling vertically versus scaling horizontally is not necessarily an either/or choice. Decide based on your needs. Start by splitting out the parts of your app with the highest load.

Keep an eye on the maintenance costs.

I hope this article was a good read for you. Do share it with your friends and other peers. If you know any resources that can help in understanding scalability and learning it, please share them in the comments with me and everyone. It will be helpful!

Thank you so much! ❤️


Rehan Sattar