Improving performance for handling millions of requests

Thiện Trần
6 min read · Jul 13, 2021


Time to optimize

In the previous topic, I showed how I managed to handle a large volume of update requests. However, that result came from a table of only 100,000 records, and of course, as the table grows, my service slows down too. The two charts below show how data size affects performance (from 1 million to 10 million records); you can see the detailed report here:

pgbench test result
Service performance test result

As the results show, in the pgbench test the throughput drops by roughly half, from 47,000 tps to 20,000 tps. Accordingly, the second chart shows the service slowing down dramatically: although the rpm values do not change much, the average response time at 10 million rows grows to nearly 250 times that of the 1 million row case. That’s huge 😱, and I think the result would be even worse if the data kept growing.

So, in this article, I will explain how I improved my service as the database kept growing!

1. Business Details

First things first, let me explain the business again. I have a service that lets students practice online. For each exam, a student can take an attempt and submit a list of answer choices. Each time a student answers a question, the client sends all of the answer choices to the server, and we must save them to our database. The target here is to serve these requests as fast as possible.

2. How To Measure

Like the previous topic, I will use the same tool, K6, for stress testing against a database of 10 million records. I won’t go into the details here; if you want to know how I use K6, you can go back to my first article.

Besides, this is the hardware of the local machine I run everything on:

  • CPU: Intel(R) Core(TM) i5-10400 CPU @ 2.90GHz, 6 cores, 12 threads
  • RAM: 24GB
  • Disk: GX2 SSD 256GB

I will continue to use my Rust code with the Actix framework, just like in my first topic.

3. Inspection

As the result charts above show, the database’s write speed degrades as the number of records grows. So the main problem here is how to accelerate writes: the faster the updates, the lower the latency of the service, and the more requests it can serve at a time.

Additionally, the business doesn’t require aggregating data, only reading and updating individual records. The reasonable solution here is therefore to read and write through a cache, and flush the data to the database in a background task. This approach is called write-back caching.

Consequently, my service will read from the cache as well, to keep the data consistent. Because a cache is much faster to read and write than a normal database, we can expect the service to handle more requests, faster than before.

4. Implementation

a. Normal way

Yes, the simple way always goes first 😅. In this attempt, the data of a request is written immediately to the cache, and a background job receives the changed data and updates it in the database.

Processing flow of a request

The image above shows the flow I implemented. After the data is updated in the cache (the cache key is based on the id of the record), the updated data is also sent to a data channel. In the background, several workers listen on the data channel and write the changed data to the database. Note that all the database records are already loaded into the cache; to be specific, I use Redis as my cache server.
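To make the flow concrete, here is a minimal sketch of what this could look like in Rust with Actix. All names (`AttemptUpdate`, `save_answers`, the `attempts` table) are hypothetical, and it assumes the `actix-web`, `redis`, `sqlx`, `serde`, and `tokio` crates; it illustrates the pattern, not my exact code.

```rust
use actix_web::{web, HttpResponse, Responder};
use redis::AsyncCommands;
use serde::{Deserialize, Serialize};
use tokio::sync::mpsc;

#[derive(Clone, Serialize, Deserialize)]
struct AttemptUpdate {
    attempt_id: i64,
    answers: String, // serialized answer choices
}

struct AppState {
    redis: redis::aio::MultiplexedConnection,
    tx: mpsc::Sender<AttemptUpdate>,
}

async fn save_answers(
    state: web::Data<AppState>,
    body: web::Json<AttemptUpdate>,
) -> impl Responder {
    let update = body.into_inner();
    let mut redis = state.redis.clone();

    // 1. Write to the cache first; the cache key is based on the record id.
    let key = format!("attempt:{}", update.attempt_id);
    let _: () = redis.set(&key, &update.answers).await.unwrap();

    // 2. Notify the background workers through the data channel.
    state.tx.send(update).await.ok();

    HttpResponse::Ok().finish()
}

// A background worker: listens on the data channel and updates the database
// record by record (this is the "normal way" — no batching yet).
async fn db_worker(mut rx: mpsc::Receiver<AttemptUpdate>, pool: sqlx::PgPool) {
    while let Some(update) = rx.recv().await {
        sqlx::query("UPDATE attempts SET answers = $1 WHERE id = $2")
            .bind(&update.answers)
            .bind(update.attempt_id)
            .execute(&pool)
            .await
            .ok();
    }
}
```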

Let’s see the result (the request rate is fixed at 35,000 per second):

2 million rpm

Well, it’s okay, right? But this speed is not what I’m looking for. I also noticed that although a well-tuned Redis can handle around 400,000 write queries per second, I couldn’t get anywhere near that peak while sending the queries one by one through a TCP client.

That led me to the second implementation.

b. Not simple way 😏

This works like the normal way, but this time the background workers also take care of writing data to the cache. Each worker buffers incoming changes and flushes them every 1 second, or as soon as 10,000 changed records have accumulated, writing the whole batch into the cache server and the database table at once (see the sketch below).

Processing flow of a request
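Here is a rough sketch of such a batching worker, reusing the hypothetical `AttemptUpdate` type from the sketch above. Batching replaces thousands of single-key writes with one MSET round trip to Redis and one bulk UPDATE in Postgres; again, this is an illustration under those assumptions, not my exact code.

```rust
use std::time::Duration;
use redis::AsyncCommands;
use tokio::sync::mpsc;
use tokio::time;

const BATCH_SIZE: usize = 10_000;

async fn batching_worker(
    mut rx: mpsc::Receiver<AttemptUpdate>,
    mut redis: redis::aio::MultiplexedConnection,
    pool: sqlx::PgPool,
) {
    let mut buffer: Vec<AttemptUpdate> = Vec::with_capacity(BATCH_SIZE);
    let mut ticker = time::interval(Duration::from_secs(1));

    loop {
        tokio::select! {
            // Flush on the 1-second timer...
            _ = ticker.tick() => {
                flush(&mut buffer, &mut redis, &pool).await;
            }
            // ...or as soon as the buffer reaches 10,000 records.
            Some(update) = rx.recv() => {
                buffer.push(update);
                if buffer.len() >= BATCH_SIZE {
                    flush(&mut buffer, &mut redis, &pool).await;
                }
            }
        }
    }
}

async fn flush(
    buffer: &mut Vec<AttemptUpdate>,
    redis: &mut redis::aio::MultiplexedConnection,
    pool: &sqlx::PgPool,
) {
    if buffer.is_empty() {
        return;
    }
    // Write the whole batch to Redis in a single MSET round trip.
    let pairs: Vec<(String, String)> = buffer
        .iter()
        .map(|u| (format!("attempt:{}", u.attempt_id), u.answers.clone()))
        .collect();
    let _: () = redis.set_multiple(&pairs).await.unwrap();

    // Persist the batch to Postgres in one statement via UNNEST.
    let (ids, answers): (Vec<i64>, Vec<String>) = buffer
        .drain(..)
        .map(|u| (u.attempt_id, u.answers))
        .unzip();
    sqlx::query(
        "UPDATE attempts SET answers = u.answers
         FROM UNNEST($1::bigint[], $2::text[]) AS u(id, answers)
         WHERE attempts.id = u.id",
    )
    .bind(&ids)
    .bind(&answers)
    .execute(pool)
    .await
    .ok();
}
```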

This is the result of the not-simple way implementation:

Yeah, now my service can even handle 3 million requests per minute, at an average response time of 7ms.

However, you should remember that with this approach I accept that my data is not updated in real time, because of the 1-second / 10,000-records rule. For many businesses this is not acceptable, so please be careful when choosing this approach for your task 👌.

c. Furthermore

In the normal way section, we had a simple solution for handling requests, but it didn’t satisfy our desire. Did you know that we can improve that performance with a tiny bit of effort? 😈

Just change the cache server and we get a huge performance boost 😆. Yes, I’m not kidding: only the cache server changes, without touching a single line of code!

The server I switched to from Redis is KeyDB. It provides the same API as Redis, and even more, so you don’t need to change your code at all. Convenient, huh? 😝

Believe it or not, this is the result:

2 million rpm

4 times faster, so smooth!!!

5. Second Thought

So far I have only talked about the update case, but what about inserts? If there were millions of create requests, just like the updates, could my service handle them?

In the update case we need the id of the record so we can set it in the cache server. When creating, however, we don’t know the id until the record has been saved into the database table. So when there are millions of create requests, we hit the database directly every time, and the bottleneck is still there, as if it had never been solved 😢. But if we move id generation from the database into our service, the problem is solved immediately, and the question becomes how to generate unique ids. Maybe we can use UUIDs, maybe we can use another service to handle this task… But that will be another article, right? 😄
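As a taste of that idea, here is a minimal sketch of service-side id generation with the `uuid` crate; the function name is hypothetical, and this is just one of the options mentioned above.

```rust
// A minimal sketch: generate the id in the service with the `uuid` crate,
// so a create request can be written to the cache immediately instead of
// waiting for the database to assign an id.
use uuid::Uuid;

fn new_attempt_id() -> String {
    // v4 UUIDs are random, so every instance of the service can generate
    // ids independently with a negligible chance of collision.
    Uuid::new_v4().to_string()
}
```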

Last but not least, when all the data lives in Redis and background jobs push it into the database, you will need to enable persistence for Redis. This ensures that if the cache server crashes, we won’t lose too much data.
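For example, these standard redis.conf persistence settings enable both AOF logging and RDB snapshots (the exact values are only a suggestion; tune them to your durability needs):

```
# redis.conf — persistence settings
appendonly yes        # AOF: log every write and replay it on restart
appendfsync everysec  # fsync the AOF once per second (durability vs. speed)
save 60 10000         # also take an RDB snapshot if 10,000 keys change in 60s
```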

In the future, I will research more techniques, and maybe I will write them down if I have time.

If you have any comments or advice, please leave them below. I would appreciate it very much!

Thank you for spending your time with my blog!

Have a nice day!
