Subscription billing platforms such as LogiSense are inherently exposed to highly variable workloads, which makes them a challenge to run efficiently. Two of the system's primary functions, rating event records and generating invoices, are generally short-lived, compute-intensive, high-throughput processes that occur at unpredictable times and intervals across a customer base.
Given that the infrastructure must be able to handle these load spikes whenever they occur, how do you ensure that your computing resources are used efficiently during normal load periods? Imagine a business processing a billion user-generated events each month across 100,000 accounts: what if an error is discovered during invoicing that must be corrected quickly across that entire base? That is a scenario only an elastic compute environment handles well.
Successfully sizing systems in the days of bare-metal servers often involved an in-depth capacity-planning exercise. Architects, systems engineers, and DBAs would build a model that allowed them to predict how the underlying infrastructure would respond at various load levels. These models ranged from basic to very complex, depending on the accuracy required. With the model complete, it was simply a matter of entering peak-load parameters and then provisioning sufficient infrastructure to satisfy the calculated resource requirements.
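The arithmetic behind such a model can be sketched in a few lines. The per-server throughput, headroom, and spike factor below are illustrative assumptions, not figures from any real deployment:

```python
import math

def servers_required(peak_events_per_sec: float,
                     events_per_sec_per_server: float,
                     headroom: float = 0.25) -> int:
    """Classic capacity-planning arithmetic: size for peak load plus a
    safety margin, then round up to whole servers."""
    effective_capacity = events_per_sec_per_server * (1 - headroom)
    return math.ceil(peak_events_per_sec / effective_capacity)

# Illustrative figures: a billion events per month averages roughly
# 385 events/sec, but a month-end rating run might spike far higher.
average_rate = 1_000_000_000 / (30 * 24 * 3600)  # ~385 events/sec
peak_rate = average_rate * 50                    # assumed 50x spike factor

fleet_size = servers_required(peak_rate, events_per_sec_per_server=2_000)
print(fleet_size)  # 13 servers under these assumed numbers
```

Note that every input here is a prediction; the three issues below all stem from how fragile those predictions are.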
There are three obvious issues with this approach:
1. Assumes peak load levels are predictable
In practice, peak loads often aren't predictable. To compensate, some level of over-provisioning is necessary; how much usually depends on service-level allowances.
2. Does not account for growth
Given that physical hardware acquisition typically involves significant lead time, additional over-provisioning is required to ensure that sufficient capacity is on hand when growth does occur.
3. Highly under-utilized
Because the infrastructure is oversized even for peak-load scenarios, it will likely be highly under-utilized much of the time. The cost of running a large system in this state can be significant, and your level of success always depends on accurate load prediction.
Fortunately, today's cloud computing platforms go a long way toward addressing these issues. Provided a software system is designed using appropriate architectural patterns, elastic services such as Amazon EC2 and S3, together with serverless offerings such as Amazon Aurora Serverless and AWS Lambda, turn the old provisioning model on its head.
Imagine a world where compute, storage, and network capacity can be increased and decreased dynamically, in real time, via scaling policies that respond to system events or resource thresholds such as memory consumption or CPU utilization. As load increases, additional capacity is automatically provisioned and brought online in minutes or even seconds. As load subsides, the elastic resources are automatically de-provisioned, effectively returning the system to its steady-state configuration.
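As a concrete sketch, a CPU-based target-tracking policy for an EC2 Auto Scaling group might look like the following. The group name and the 60% target are hypothetical; the dict is the shape accepted by boto3's `put_scaling_policy`, built here without making the AWS call:

```python
# Hypothetical target-tracking scaling policy: keep average fleet CPU
# near 60%. Applying it would require boto3 and AWS credentials, e.g.:
#   boto3.client("autoscaling").put_scaling_policy(**policy)
policy = {
    "AutoScalingGroupName": "rating-workers",  # assumed group name
    "PolicyName": "cpu-target-60",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        # Scale out when average CPU rises above ~60%, in when it falls below.
        "TargetValue": 60.0,
    },
}
```

With a policy like this in place, the fleet from the earlier capacity-planning exercise no longer needs to be sized for peak: it grows into the month-end spike and shrinks back afterwards.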