Cloud-Native System Performance - Improving Compute ⚙️
Infrastructure is just hardware - capable of doing nothing on its own. Its very purpose is to run and support the applications that matter to the business. Those applications could be anything - homegrown solutions, commercial off-the-shelf (COTS) products, or software developed by a third party - and all of them need infrastructure.
Middleware, application servers, database servers, gateways, reverse proxies - at the core of all of these is the infrastructure that provides the necessary computation power. Hardware has a capacity - let's sum it up as the number of instructions it can process in a given amount of time.
Hardware comes prebuilt. It is then up to the developers who build and deploy applications on top of it to extract optimal performance from these machines. Applications are deployed on infrastructure to address a specific business purpose. Poor development results in overuse of the underlying capacity, while a well-developed application can execute the same task with less resource consumption.
Resource consumption eats directly into profits. Better performance means lower resource consumption, which means the same infrastructure can execute more tasks for the business.
This post is part of the series "Cloud-Native System Performance".
One might think that it all depends on the application code - that it is a core reason that determines the system's performance. Yes, it is true to an extent, but there is more to it in today's cloud-native world. Thinking about the compute aspect of an end-to-end system - mere assumptions and wrong choices of architecture components can also result in bad performance and annoying user experiences.
Resource consumption cannot be absolute zero - I hope we don’t need to explain that. For the application to run, CPU cycles have to be used. It all boils down to the efficiency of the code and of the developer who wrote it. Below are a few challenges related to application development and efficient resource consumption, each with a possible mitigation.
Challenge: Inefficient algorithms cause unnecessary performance overhead for a given computational task.
Mitigation: Choose the right algorithm for the task carefully, without compromising the fulfillment of its intended purpose. Where algorithms execute heavy computational workloads, make explicit efforts to optimize them.
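As a hypothetical illustration of how algorithm choice alone changes resource consumption, here is the same duplicate-detection task solved two ways; the function names and inputs are made up for this sketch:

```python
def has_duplicates_quadratic(items):
    """O(n^2): compares every pair of items - burns CPU cycles on large inputs."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False

def has_duplicates_linear(items):
    """O(n): a set gives amortized O(1) lookups - same result, far less compute."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

Both fulfill the intended purpose, but on a million items the quadratic version performs roughly a trillion comparisons where the linear one performs a million lookups.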
Challenge: Under load, incoming requests can accumulate at various points within the architecture, with a risk of data and connection loss.
Mitigation: Understand the bottlenecks that exist in the system. Once identified, the bottlenecks typically look like -
Internal systems are not “fast” enough to process requests.
Database query performance is too low to serve the ongoing demand.
Some internal components process and pass on requests faster than downstream components are ready to handle.
Address these concerns by enhancing or replacing the current setup with appropriate options.
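One common way to handle the mismatch between a fast upstream component and a slower downstream one is a bounded buffer that applies backpressure instead of letting requests pile up without limit. A minimal sketch, assuming an in-process producer and consumer (the buffer size and the doubling "work" are purely illustrative):

```python
import queue
import threading

# A bounded queue: the producer blocks once the buffer holds 100 items,
# so accumulation is capped instead of growing until connections are lost.
buffer = queue.Queue(maxsize=100)

def producer(n_items):
    for i in range(n_items):
        buffer.put(i)   # blocks when the buffer is full -> backpressure
    buffer.put(None)    # sentinel: signals no more work

def consumer(results):
    while True:
        item = buffer.get()
        if item is None:
            break
        results.append(item * 2)  # stand-in for real request processing

results = []
t1 = threading.Thread(target=producer, args=(500,))
t2 = threading.Thread(target=consumer, args=(results,))
t1.start(); t2.start()
t1.join(); t2.join()
```

In a distributed system the same idea appears as message queues or rate limiters between components.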
Challenge: Application processes lock a memory address to ensure consistency of the data being operated on. Because of this, other processes have to wait until the address is unlocked before they can modify it.
Mitigation: Hold a lock over as few lines of code as possible. Apply locks only when required, and once locked, release them as soon as possible.
Prefer optimistic locking to pessimistic locking.
Pessimistic: Lock -> Read -> Process -> Write -> Unlock
Optimistic: Read -> Process -> Lock -> Validate -> Write -> Unlock. The validate step checks that the current value still matches the value that was read.
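The optimistic flow above can be sketched with a version counter; this is a minimal illustration, not a production pattern, and the class and names are invented for it:

```python
import threading

class VersionedValue:
    """A value guarded by optimistic locking: the lock is held only
    for the Validate -> Write step, never during processing."""
    def __init__(self, value):
        self._lock = threading.Lock()
        self.value = value
        self.version = 0

    def read(self):
        with self._lock:
            return self.value, self.version

    def try_write(self, new_value, read_version):
        # Lock -> Validate -> Write -> Unlock
        with self._lock:
            if self.version != read_version:
                return False  # someone else wrote in between; caller retries
            self.value = new_value
            self.version += 1
            return True

def increment(shared):
    while True:  # retry loop, typical of optimistic concurrency
        value, version = shared.read()  # Read
        result = value + 1              # Process (outside any lock)
        if shared.try_write(result, version):
            return
```

Note the trade-off: under heavy contention the retry loop can spin, which is why optimistic locking pays off mainly when conflicts are rare.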
Challenge: In interdependent multi-threaded applications, the CPU switches context to respect dependencies between threads. Too many context switches increase the overall latency of the system.
Mitigation: Avoid context switches as much as possible: reduce the dependencies between threads, and use asynchronous programming so the CPU does not sit waiting for a response from another thread or process.
Keep the unit of work handled by each thread small.
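As a sketch of the async approach, here is how several waiting "calls" can be interleaved on a single thread, so the OS never context-switches between threads while each waits; the service names and delays are placeholders:

```python
import asyncio

async def call_service(name, delay):
    await asyncio.sleep(delay)  # stands in for a network or IPC call
    return f"{name}: done"

async def main():
    # All three calls wait concurrently on one thread - no thread
    # context switches while each is blocked on I/O.
    return await asyncio.gather(
        call_service("auth", 0.1),
        call_service("db", 0.1),
        call_service("cache", 0.1),
    )

results = asyncio.run(main())
```

The three 0.1-second waits overlap, so the whole batch finishes in roughly 0.1 seconds rather than 0.3.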
Challenge: The system runs out of memory while executing the application logic.
Mitigation: First make sure main-memory usage for storing variables is optimized; after that, switching to a better garbage-collection algorithm is one possible option.
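A small example of optimizing main-memory usage before reaching for GC tuning: streaming results with a generator instead of materializing the whole collection. The computation here is arbitrary; only the memory difference is the point.

```python
import sys

def squares_list(n):
    return [i * i for i in range(n)]   # holds all n results in memory at once

def squares_stream(n):
    return (i * i for i in range(n))   # holds one item at a time

# Both produce the same total...
total_eager = sum(squares_list(100_000))
total_lazy = sum(squares_stream(100_000))

# ...but the generator object stays tiny regardless of n.
gen_size = sys.getsizeof(squares_stream(100_000))
list_size = sys.getsizeof(squares_list(100_000))
```

The same principle applies to reading files line by line or paginating database results instead of loading everything up front.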
There are many more challenges beyond the ones above - I shall cover them all in a consolidated booklet. But this should be enough to get an idea of how compute-related aspects can affect the performance of the entire system.
Read more challenges in the FREE eBook, available for download below: (PDF & ePub formats)
Cloud (IaaS) providers offer plenty of options (per service) to help improve the performance of applications deployed on their infrastructure. So much so that, at times, teams get complacent and rely completely on the cloud provider's options to address their performance requirements. Nevertheless, the performance delta achievable just through better code is more than significant.
Here we discuss some of the features provided by major cloud providers that can be leveraged to improve cloud-native system performance.
One of the first services offered by cloud providers was perhaps the ability to spin up virtual machines as needed. We can choose the flavor of the OS, size, networking, and various other aspects. That’s the flexibility.
Additionally, containerization lets applications share a host while each container gets its own dedicated resource allocation, thus breaking the one-application-one-server rule.
Microservice architecture avoids a lot of inter-process interference and resource contention, thus optimizing resource consumption for performance.
Going beyond containers - if you don’t want to worry about vulnerability scans of the images used, cluster management, and orchestration, and you only want to write the code and let it run, then serverless is a great option.
However, refactoring existing applications for serverless requires huge effort, which is why containerization is the easier option to begin with. Applications - especially web services developed from scratch - should always consider the possibility of serverless.
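To make the "only write the code" point concrete, here is a sketch of a serverless function in the AWS Lambda handler style; the event shape and response fields are illustrative, and provisioning, scaling, and orchestration are all left to the platform:

```python
import json

def handler(event, context=None):
    """A minimal request handler: the platform invokes this function
    per request; no server, container, or cluster to manage."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```

The entire deployable unit is this function; cold starts and per-invocation limits are the trade-offs to weigh against that simplicity.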
Every application has different needs. Standard VMs offer everything that is required in a general-purpose server - CPU, Memory, and Networking capabilities.
However, not every application running on these VMs follows a standard as far as resource consumption is concerned.
Major cloud providers offer features to optimize virtual resources - especially VMs - to align them with the application’s needs. In general, these optimizations fall into three categories:
There are two types of scaling:
Scaling up - where virtual compute resources grow in size.
Scaling out - where they grow in number.
In the cloud-native world, both options are available. Given the templated nature of size selection, scaling up is not always the best choice, since an application may be bottlenecked on just one aspect. Increasing the size of everything in a VM to address a single bottleneck creates unutilized resources and simply adds to the cost.
Cloud providers offer auto-scaling, which ensures the desired number of instances is always up and running even if some of them fail in the meantime - all automatically.
Performance optimization analysis requires us to identify the bottlenecks in the system so that we can make decisions to eradicate them.
Cloud providers offer various monitoring solutions that are built in and well integrated with all the services they provide. These can be configured to record system behavior based on the kind, detail, and frequency of logs the system generates.
Needless to say, there are many more approaches to address the performance issues in a cloud-native way. Check out the booklet for more.
Designing applications so that each thread or process handles only a few tasks results in better-performing applications, as long-running processes tend to block the CPU.
Additionally, single-threaded runtime environments or asynchronous processing logic should be leveraged to avoid blocking.
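A minimal sketch of both ideas combined - breaking a long task into small units and yielding between them on a single-threaded async runtime - with the chunk size chosen purely for illustration:

```python
import asyncio

async def sum_in_chunks(data, chunk_size=1000):
    """Process a long job in small chunks, yielding control between
    chunks so one task does not block everything sharing the loop."""
    total = 0
    for start in range(0, len(data), chunk_size):
        total += sum(data[start:start + chunk_size])
        await asyncio.sleep(0)  # cooperative yield point
    return total

total = asyncio.run(sum_in_chunks(list(range(10_000))))
```

In a real service, each `await` point is where other pending requests get CPU time instead of queueing behind one long computation.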
More cloud-native approaches in FREE eBook! Link below: (PDF & ePub formats)
There is always room for improvement when it comes to tuning system performance, but application code is bound to consume some resources no matter what. While discussing this with the team, give due regard to the aspects that must be addressed in the application code, even based on business priorities. For example, one cannot ignore the resource consumption caused by security features even though they do not directly affect the end user. It always makes sense to draw a line by defining in-scope and out-of-scope requirements.
I have compiled this series into a FREE eBook with more details and deeper insights. Link below! (PDF & ePub formats)