The cost of hosting is often overlooked when planning a budget for a new web application. While development costs are a big part of your upfront app development budget, hosting can become a significant expense as your web app’s popularity grows after release. Modeling your hosting costs helps you project cash flow and can even help you devise a better pricing strategy.

However, it can be difficult to accurately estimate hosting costs before releasing your application. You can’t really know how users will use your app until it’s live, and their usage patterns may impact your hosting needs. Prior to launch, you can get a decent idea of what your hosting costs will be by load testing your application.

What is a load test?

A load test is a simulation of user activity on your application. Load testing software mimics user behavior by sending requests to your application and measuring the response time. Essentially, the software interacts with your web app in the same way as regular users. Instead of finding hundreds of people to help you test your app, the software can simulate hundreds of users with a single computer.

These tests determine how your application will perform when serving a large number of users. By simulating so many users at once, load tests help you find performance bottlenecks in your web app.

Load tests can also help you estimate the amount of hosting resources you will need to serve your users. That means we can use load tests to find out how much it will cost to host your app.

A simple approach to load testing

There are several load testing tools available, ranging from free open source projects to paid third-party services. Apache JMeter is a free, robust load testing tool with a comprehensive user manual. If you’re unsure of what to use or have a limited load testing budget, JMeter is a solid choice.

We can use JMeter to write load tests and find out how many users our application can support on a single server. As we will describe below, this information is part of a simple formula for estimating hosting costs. Once we know how many users a single server can support, we can estimate the number of servers we will need to host our application for our anticipated number of users.

To accomplish our goal of estimating hosting costs, we need to perform the following steps:

  1. Set up a test environment for our web application. Load tests are performed against the web application, so you should host your web app on infrastructure that resembles your anticipated production infrastructure.

    For example, if you host your production web app on AWS Elastic Beanstalk, you should host your test environment on AWS Elastic Beanstalk. Using a test environment that is similar to production will ensure load tests uncover realistic performance limitations without impacting your production environment.

  2. Determine what expected usage for the application will look like. Map out the typical user journey for your web app. Outline which pages users visit, what they do on each page, and how much time they spend on each page.

    If your app is already public, user analytics can give you the best picture of how your users behave. If you have not yet released your app, you can make an educated guess based on your knowledge of your app’s user experience.

  3. Write a load test to simulate expected usage patterns. Using the user journey you just outlined, write a load test script that mimics your expected user behavior. The script should simulate tens or hundreds of concurrent users.

    A truly representative load test will simulate different actions for different groups of users. However, we can keep things simple and assume all users will take the same actions. This assumption simplifies the load test script while demonstrating what impact users can have on server resources.

  4. Run the load test and record the results. Run your load test script against your test environment. Increase the number of simulated concurrent users until the application’s response time becomes unacceptable. Also monitor the CPU and memory usage of the server hosting your web app. Knowing which resource constraints limit your application performance will help you plan a cost-effective mix of resources.

  5. Analyze the results to determine what resources we will need. Use the test results to determine how many concurrent users a single web app server can support. Then, use that number to estimate how many servers you will need to host your application.

  6. Estimate the cost of the needed resources. The results from the previous step tell us how much memory and processing power we need to handle the expected load. We can then estimate the cost of these resources based on the number of users we expect the app to serve.

How to create a load test script

JMeter’s user manual describes how to build a simple load test plan. A simple plan built with the JMeter GUI is likely all you need for this exercise. For more complex needs, JMeter has scripting support in Java and other languages.

We’re a bit partial to Ruby at Twin Sun, so I prefer scripting with ruby-jmeter. The following ruby-jmeter example script simulates visiting a home page, increasing the number of threads (simulated users) over several test runs to discover where performance drops off.

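require 'ruby-jmeter'

# Base URL of the test environment (not production).
base_url = 'https://test.example.com'
# Numbers of simulated concurrent users to try, one test run per value.
thread_counts = [1, 5, 10, 20, 50, 100]

thread_counts.each do |thread_count|
  test do
    # Each thread is one simulated user; each user repeats the journey 100 times.
    threads count: thread_count, loops: 100 do
      visit name: 'Home Page', url: "#{base_url}/" do
        # Fail the sample if the expected page content is missing.
        assert contains: '<title>Website Name Goes Here</title>', scope: 'main'
        assert contains: 'Text rendered from database query results goes here.', scope: 'body'
      end
      # additional pages and actions can be performed here.
    end
  end.run(log: "#{thread_count}-threads.jtl") # separate log file per thread count
end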

The script dumps test results to separate log files for each thread count. Reviewing the logs will show you the average response time, throughput, and other metrics for each thread count. The following log entry shows a throughput of 80.4 requests per second, with an average response time of 112 milliseconds.

2022-09-28 21:37:37,840 INFO o.a.j.r.Summariser: summary =   4882 in 00:01:01 =   80.4/s Avg:   112 Min:    66 Max:  1792 Err:     0 (0.00%)
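
If you would rather compare runs programmatically than read each log by eye, a summary line like the one above can be parsed with a regular expression. This is a minimal sketch that assumes the line format shown in this article; the `parse_summary` helper and its hash keys are hypothetical names, not part of JMeter or ruby-jmeter.

```ruby
# Matches the cumulative "summary =" lines written by JMeter's Summariser,
# assuming the same layout as the log entry shown in this article.
SUMMARY_PATTERN = /summary =\s+(?<samples>\d+) in (?<elapsed>[\d:]+) =\s+(?<throughput>[\d.]+)\/s Avg:\s+(?<avg>\d+) Min:\s+(?<min>\d+) Max:\s+(?<max>\d+) Err:\s+(?<errors>\d+)/

# Returns a hash of metrics, or nil when the line is not a summary line.
def parse_summary(line)
  match = SUMMARY_PATTERN.match(line)
  return nil unless match

  {
    samples: match[:samples].to_i,
    throughput_per_sec: match[:throughput].to_f,
    avg_ms: match[:avg].to_i,
    min_ms: match[:min].to_i,
    max_ms: match[:max].to_i,
    errors: match[:errors].to_i
  }
end

line = '2022-09-28 21:37:37,840 INFO o.a.j.r.Summariser: summary =   4882 in 00:01:01 =   80.4/s Avg:   112 Min:    66 Max:  1792 Err:     0 (0.00%)'
metrics = parse_summary(line)
puts "Throughput: #{metrics[:throughput_per_sec]}/s, average: #{metrics[:avg_ms]} ms"
```

Running the helper over the last summary line of each log file gives you one set of metrics per thread count, which is the input you need for the analysis step.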

JMeter logs do not show you the CPU and memory usage of the server hosting your web app. You can track CPU and memory usage through other tools, such as the Linux top command, AWS CloudWatch, or a third-party performance monitoring service like New Relic.

How to interpret the results

The results of your load test will tell you how many concurrent users your web app can support. Total user count is generally less important at this stage. We want to know how many people will be using the app at the same time. Simultaneous activity is what will expose performance constraints in our infrastructure that lead to unacceptable performance.

It is up to you to define what “unacceptable performance” means for your web app. A general rule of thumb is to keep the average response time below 500 milliseconds. At a maximum, a slow page should load in less than one second. If users are waiting longer than that for your server to respond, they may lose patience and abandon your app.

Once you have your ideal performance threshold in mind, review your load test results to find where your threshold is exceeded. For example, you may find that your server can support 50 concurrent users before the average response time exceeds 500 milliseconds. If you need to support 1,000 concurrent users, you will need to allocate 20 times the resources allocated to your test server.
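
As a rough sketch of the arithmetic above, the following Ruby snippet finds the largest tested user count reached before the average response time exceeds the threshold, then divides the target load by it. The response times in `avg_ms_by_threads` are made-up placeholder values rather than real measurements, and the helper names are hypothetical.

```ruby
# Average response time (ms) keyed by simulated concurrent users.
# These numbers are hypothetical placeholders, not real measurements.
avg_ms_by_threads = { 1 => 90, 5 => 110, 10 => 150, 20 => 240, 50 => 480, 100 => 1300 }

# Largest tested user count reached before the average exceeds the threshold.
# Returns nil if even the smallest test run is over the threshold.
def max_supported_users(avg_ms_by_threads, threshold_ms)
  within = avg_ms_by_threads.sort.take_while { |_users, avg_ms| avg_ms <= threshold_ms }
  within.empty? ? nil : within.last.first
end

# Servers needed for a target load, rounded up to whole servers.
def servers_needed(target_users, users_per_server)
  (target_users.to_f / users_per_server).ceil
end

users_per_server = max_supported_users(avg_ms_by_threads, 500)
puts "One server supports about #{users_per_server} concurrent users"
puts "Servers for 1,000 concurrent users: #{servers_needed(1_000, users_per_server)}"
```

The sketch assumes capacity scales linearly with server count, which is a simplification for budgeting rather than a guarantee.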

At this point, you should identify which resources are limiting your application’s performance under heavy load. If your server handles at most 50 concurrent users due to a memory constraint, you may find that allocating more memory to your web app server can increase that number to 100 before you hit a CPU constraint. Allocating more memory to a single server is likely far less costly than running two servers with lower memory allocations.

Estimating hosting costs based on the results

Understanding your resource utilization and performance constraints under heavy load can help you select the most cost-effective resources for hosting your web app. Getting this right is especially important in traditional data centers, where you must purchase expensive hardware outright, and under-provisioning can be as problematic as over-spending. Conversely, cloud hosting environments offer more flexibility in scaling resources as needed.

To simplify our considerations for estimating costs, let’s assume we are using a cloud hosting environment and can scale resources with user demand.

Even with its inherent flexibility, you can make more economical decisions with cloud-based hosting if you understand your resource constraints. Amazon Web Services (AWS), as an example, offers memory-optimized and compute-optimized server instances. A memory-intensive web app would benefit from the memory-optimized instances, whereas a CPU-constrained web app would benefit from the compute-optimized instances.

The only other piece of information you need to estimate your hosting costs is the estimated number of concurrent users you expect to serve. This number will fluctuate over time, and your web app’s traffic patterns will even change intraday. Prior to a public launch, you can model expected traffic patterns to determine how many concurrent users to plan for. The following line graph is an example model of the expected average number of concurrent users per hour for a business productivity application with 10,000 monthly active users.

Figure: Example graph that models the average concurrent users per hour over a week, with a line representing each day of the week. The general trend shows an increase in usage during regular work hours.

Considering the visualized usage patterns, we can plan on scaling resources up during the workday and scaling down at night and on weekends. We can also use our load testing results to tell us how much to scale up or down during those time windows. For instance, on Friday we experience an average of 120 concurrent users throughout the 9:00 AM hour. If we know that a single web app server can support 50 concurrent users, we will scale to 3 servers for that hour to handle the expected load.

Let’s assume we’re using AWS EC2 instances to host the web app in a load balanced environment. There are other factors beyond the EC2 instance costs to consider, but let’s only look at EC2 costs for a moment. Assuming our test web app server was a t3.xlarge EC2 instance, and the instance costs $0.1664 per hour, we can use scheduled scaling events throughout the week to estimate our EC2 costs. Using the above example traffic patterns, we can sum up the number of server instances needed each hour of the week and multiply that number by the hourly cost. In this example, EC2 instances would cost approximately $35.44 per week. That’s not bad for a successful web app!
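
The calculation described above can be sketched in a few lines of Ruby. The hourly rate and the 50-users-per-server capacity come from this article's example; the `sample_hours` values are hypothetical placeholders for a few hours of modeled traffic, and the helper names and the one-instance minimum are assumptions made for illustration.

```ruby
HOURLY_RATE = 0.1664   # t3.xlarge On Demand rate used in this article (2022)
USERS_PER_SERVER = 50  # capacity found by load testing, per the example

# Instances needed for one hour, rounded up, never below the minimum.
def instances_for(concurrent_users, users_per_server, minimum: 1)
  [(concurrent_users.to_f / users_per_server).ceil, minimum].max
end

# Total instance-hours for a list of hourly concurrent-user averages.
def instance_hours(hourly_users, users_per_server)
  hourly_users.sum { |users| instances_for(users, users_per_server) }
end

# EC2-only cost for the given hours, rounded to cents.
def ec2_cost(hourly_users, users_per_server, hourly_rate)
  (instance_hours(hourly_users, users_per_server) * hourly_rate).round(2)
end

# Hypothetical average concurrent users for three consecutive hours.
sample_hours = [10, 120, 60]
puts "Instance-hours: #{instance_hours(sample_hours, USERS_PER_SERVER)}"
puts "EC2 cost: $#{ec2_cost(sample_hours, USERS_PER_SERVER, HOURLY_RATE)}"
```

Passing all 168 hourly values from your traffic model gives the weekly total. For reference, $35.44 at $0.1664 per hour works out to roughly 213 instance-hours for the week.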

AWS-specific considerations

That weekly spend for EC2 instances is a good starting point, but there are other costs to consider. Before getting to that, though, it’s worth noting that there are a lot of AWS-specific opportunities to reduce costs. The previous EC2 cost estimate was based on the current (2022) On Demand pricing for t3.xlarge instances. “On Demand” means you pay for the time your EC2 instance is running, regardless of its workload. AWS offers cheaper pricing for EC2 instances through Spot Instances and Savings Plans, as we have covered in a previous article.

Beyond EC2 instance costs, there are other AWS costs to consider. A web app will likely use other AWS services, such as Elastic Load Balancing, S3 for storing static assets, and an RDS database. All of these services have costs, so remember to include them in your hosting budget. You can use the AWS Pricing Calculator to estimate your overall costs.

General considerations

Regardless of your hosting provider, your hosting budget should include some margin for error. You may encounter surprises, especially if you are estimating costs before launching your app. Perhaps your app is more popular than you expected, or users end up spending more time on your app than you anticipated. Those are good problems to have, but they can also lead to costs that exceed your expected hosting budget.

To account for these unknowns, add a 20% buffer to your hosting budget for the first year. You can adjust your buffer as you gain more data from real-world users.
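
As a quick illustration of the buffer, the sketch below annualizes the EC2-only weekly estimate from the earlier example and adds 20%. It assumes 52 weeks of identical traffic and ignores every non-EC2 cost, so treat the output as a floor rather than a full budget. The `with_buffer` helper is a hypothetical name.

```ruby
# Adds a safety margin to an estimate; 0.20 is the 20% buffer suggested above.
def with_buffer(estimated_cost, buffer_rate = 0.20)
  (estimated_cost * (1 + buffer_rate)).round(2)
end

weekly_ec2_cost = 35.44 # EC2-only weekly estimate from the earlier example
# Assumption: 52 weeks per year with the same weekly traffic pattern.
annual_estimate = (weekly_ec2_cost * 52).round(2)

puts "Annual EC2 estimate: $#{annual_estimate}"
puts "With 20% buffer: $#{with_buffer(annual_estimate)}"
```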

Beyond web app server costs, user behavior may increase demand for other resources within your hosting infrastructure. Fees for data egress (the amount of data your app sends to users) can add up quickly, as can the cost of storing uploaded files and images. If your app performs a lot of background processing that was not captured in your load testing, you may end up needing to allocate more resources for those jobs than originally anticipated.

Important caveats to this approach

Load tests don’t always tell you everything. Like any type of testing, load testing is only as good as your test plan. User behavior is sometimes difficult to predict, so your load test may not accurately reflect how your users will use your application. However, a good load test will give you a decent picture of how your application will perform under heavy load.

Further, the approach to load testing I’ve described is a simple one, intended to be good enough for budgeting purposes for a new web app. Load testing from a single computer will not fully test the limits of your infrastructure. For example, your home Internet connection cannot simulate the network load of thousands of users on their own separate Internet connections. It is good enough, though, for testing the CPU and memory limits of a single server.

Additionally, horizontal scaling will not guarantee a linear increase in performance. That is, ten servers behind a load balancer may not necessarily handle ten times the load of a single server. Other services in your infrastructure may become a bottleneck. A database server may not be able to handle the increased load of a horizontally scaled web app. You may also find the limits of a shared cache under heavy load. Load testing from a single computer will not reveal these more complex issues.

Again, this article’s approach to load testing strikes a balance between simplicity and accuracy. If your goal is to test the limits of your infrastructure, you will need to run load tests from multiple locations with a far more complex test plan. However, this simple approach is a decent basis for estimating your web app’s hosting costs.