Tzahi Yaacobovicz
September 29, 2020

In this article I will show how Go concurrency enabled us to cut through a daunting DB performance barrier. This blog post continues our journey toward big data performance; the first post in the series discussed in-process caching in Go.

The Pain

lakeFS is a versioned directory over object stores like AWS S3 and GCS (resembling GitHub for object stores). It uses Postgres to store metadata describing the billions of objects that may reside in a large data lake. This gives big-data tools like Spark and Presto transactionality and immediate consistency in a transparent manner.

In some use cases, hundreds of thousands of distinct SELECT statements hit the DB every second, each fetching metadata on a single object from a single row.

Postgres has a small fixed cost associated with each statement execution. For example, Postgres takes an ACCESS SHARE lock on the table to block DDL statements from changing the table structure during the read. Multiplying this small cost by hundreds of thousands of requests per second can bring a strong Postgres server to its knees, requiring an expensive DB upgrade.

The solution

The problem stems from executing too many statements, each doing too little work. The obvious solution is to combine the work of many statements into a single statement. It is as simple as replacing 10 “SELECT * FROM table WHERE key=keyN” statements with a single “SELECT * FROM table WHERE key IN (key1,key2,…,key10)”. Why didn’t we see this before?

Well, there are some hurdles to overcome before reaching performance heaven:

  1. Each get-object request arrives separately. It is selfish and wants to return ASAP. We need a traffic cop that puts the requests in line without an interface change.
  2. Response time guarantees (SLA) must be kept.
  3. After squeezing many requests into a single SQL statement, we need to return a distinct result to each waiting request.

We implemented a solution in lakeFS, and the result was a tenfold increase in read requests per second. For this blog post I wrote a simplified version of it and used that program to benchmark batched vs. discrete reading. You can clone or fork it from our GitHub repository.

Before explaining how Go goroutines and channels enabled an elegant solution, I would like to show the benchmark and its results. 

Benchmark setup

Hardware

  • Dell XPS 13 3790 two-in-one laptop
  • 32 GB memory
  • Intel i7-1065G7 4-core processor (8 hyperthreads)
  • 1 TB NVMe disk

Software

  • Ubuntu 20.04
  • Postgres 11.7
  • Go 1.15
  • pgxpool, the Go Postgres driver by the benevolent Jack Christensen. pgxpool’s support for native array parameters improved batching performance by 20%.

Database

The test DB contains a single table “random_read_test” with 350 million rows composed of the following columns:

  • Pk – indexed text field containing a sequence number padded to 20 bytes.
  • Payload – text field containing the sequence number padded to 35 bytes.

Table raw size is 20 GB, and the index adds another 13 GB. You may create the table using just two SQL statements from this SQL file.
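To make the pgxpool array-parameter note above concrete, here is a minimal sketch of what one batched read against this table could look like. The function name, pool handling and error handling are my own simplifications for illustration, not the repository code:

package main

import (
    "context"

    "github.com/jackc/pgx/v4/pgxpool"
)

// readBatch fetches one micro-batch of keys with a single SELECT.
// pgx encodes the []string directly as a Postgres text[] parameter,
// so the whole batch travels in one round trip.
func readBatch(ctx context.Context, pool *pgxpool.Pool, keys []string) (map[string]string, error) {
    rows, err := pool.Query(ctx,
        `SELECT pk, payload FROM random_read_test WHERE pk = ANY($1)`, keys)
    if err != nil {
        return nil, err
    }
    defer rows.Close()

    results := make(map[string]string, len(keys))
    for rows.Next() {
        var pk, payload string
        if err := rows.Scan(&pk, &payload); err != nil {
            return nil, err
        }
        results[pk] = payload // keys absent from the map were not found
    }
    return results, rows.Err()
}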

The test driver

I wrote a test driver that generates 3 million random keys and sends them via a channel to three hundred reading goroutines (the count is tunable by a constant) that perform the reads. It can test different batch sizes as well as non-batched reading. The important factors of each run are controlled by a few constants that appear in the test driver; you may fiddle with them to fit your requirements. On each run, the driver emits a statistics row that is appended to the “statistics.csv” file.
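A condensed sketch of that fan-out appears below. It assumes the ReadEntry function shown later in this post; the constants, the key padding format and the error handling here are illustrative, not the exact values used in the repository:

package main

import (
    "fmt"
    "math/rand"
    "sync"
)

const (
    totalKeys   = 3_000_000   // random keys generated per run
    readerCount = 300         // concurrent reading goroutines (tunable by constant)
    tableRows   = 350_000_000 // rows in random_read_test
)

func runDriver() {
    keys := make(chan string, readerCount)
    var wg sync.WaitGroup

    // Start the reading goroutines; each one pulls keys from the channel.
    for i := 0; i < readerCount; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for pk := range keys {
                if _, err := ReadEntry(pk); err != nil {
                    panic(err) // error handling stays at "panic" level, as in the demo code
                }
            }
        }()
    }

    // Generate random keys and feed the readers.
    for i := 0; i < totalKeys; i++ {
        n := rand.Intn(tableRows)
        keys <- fmt.Sprintf("%020d", n) // pk is a sequence number padded to 20 bytes (zero padding assumed)
    }
    close(keys)
    wg.Wait()
}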

Benchmark results 

Batch size    Run duration (seconds)    Average response time    Worst-case response time    Reads per second
Non-batched   50                        5.2 ms                   770 ms                      60,000
1             87                        7.9 ms                   n/a                         34,500
2             43                        4 ms                     42 ms                       70,000
4             21                        2.2 ms                   31 ms                       143,000
8             9.5                       0.7 ms                   39 ms                       316,000
16            5.5                       0.45 ms                  21 ms                       550,000
24            5.0                       0.4 ms                   22 ms                       600,000
32            4.8                       0.38 ms                  18 ms                       625,000
Conclusions:
  1. Batching gives a tenfold throughput improvement.
  2. Even a batch size of 2 is better than non-batched reading.
  3. Response time is an order of magnitude better than non-batched.
  4. The improvement from bigger batches flattens at a batch size of 16.
  5. I measured CPU consumption using the graphical system monitor. The graph for the batched run was flat at around 85%, while the non-batched version ran at 80%. So almost all CPU resources are dedicated to reading, regardless of batching.

Using Go Concurrency for micro-batching

The software solution is described in the architecture diagram below. The rectangles represent a request for a key, the lines represent directional channels, and the rounded rectangles represent goroutines that are started on server initialization and never terminate.

Using Go concurrency for micro-batching

It works as follows:

  • When a read request arrives, the ReadEntry function is called with the requested key. In the code below you can see that a reply channel is created and sent, together with the key, to the orchestrator. ReadEntry then waits on the channel it created.
func ReadEntry(pk string) (*rowType, error) {
  replyChan := make(chan readResponse, 1) // create reply channel
  request := readRequest{pk: pk, replyChan: replyChan}
  readRequestChan <- request		  // send request
  select {				  // wait for reply or timeout
  case response := <-replyChan:
     return response.testRow, response.err
  case <-time.After(ReadTimeout):
     return nil, ErrReadEntryTimeout
  }
}
  • The read orchestrator accepts the key + reply channel and adds it to the current batch. It sends the batch for execution to the batch readers when one of the following conditions is met:
    1. The batch reached the configured maximum number of keys (e.g. 16 keys).
    2. A configured amount of time has passed since the first key was added (currently 0.5 ms).
  • The batch reader goroutine accepts the key batch from the channel. It puts the received keys in the WHERE clause of the SELECT statement and executes it. It then sends the result for each key back to the requester over the channel the requester provided; if a key was not found, it sends a “not found” message. A condensed sketch of the orchestrator and a batch reader follows this list.
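The repository contains the full implementation; below is a condensed sketch of the shared declarations that ReadEntry relies on, the orchestrator loop, and a batch reader. The buffer sizes, timeouts and batch size are illustrative assumptions, and error handling is reduced to the essentials:

package main

import (
    "context"
    "errors"
    "time"

    "github.com/jackc/pgx/v4/pgxpool"
)

// Shared declarations referenced by ReadEntry above (values are illustrative).
type rowType struct {
    Pk      string
    Payload string
}

type readRequest struct {
    pk        string
    replyChan chan readResponse
}

type readResponse struct {
    testRow *rowType
    err     error
}

var (
    readRequestChan     = make(chan readRequest, 1024)
    ErrReadEntryTimeout = errors.New("read entry: timed out")
    ErrNotFound         = errors.New("read entry: key not found")
)

const (
    ReadTimeout  = 50 * time.Millisecond  // per-request upper bound, guards the SLA
    maxBatchSize = 16                     // flush when this many keys are pending
    maxBatchWait = 500 * time.Microsecond // flush when the first key has waited this long
)

// readOrchestrator collects incoming requests into a batch and hands the batch
// to the batch readers when it is full or the wait time has passed.
func readOrchestrator(batches chan<- []readRequest) {
    for {
        first := <-readRequestChan // block until the next batch has its first key
        batch := []readRequest{first}
        flushAt := time.After(maxBatchWait)
    collect:
        for len(batch) < maxBatchSize {
            select {
            case req := <-readRequestChan:
                batch = append(batch, req)
            case <-flushAt:
                break collect
            }
        }
        batches <- batch
    }
}

// batchReader executes one SELECT per batch and routes each row back to the
// goroutine that asked for it, over the reply channel that goroutine provided.
func batchReader(pool *pgxpool.Pool, batches <-chan []readRequest) {
    for batch := range batches {
        keys := make([]string, len(batch))
        for i, req := range batch {
            keys[i] = req.pk
        }
        found := make(map[string]*rowType, len(keys))
        rows, err := pool.Query(context.Background(),
            `SELECT pk, payload FROM random_read_test WHERE pk = ANY($1)`, keys)
        if err == nil {
            for rows.Next() {
                var r rowType
                if scanErr := rows.Scan(&r.Pk, &r.Payload); scanErr == nil {
                    found[r.Pk] = &r
                }
            }
            rows.Close()
        }
        for _, req := range batch { // demultiplex: exactly one reply per waiting request
            switch {
            case err != nil:
                req.replyChan <- readResponse{err: err}
            case found[req.pk] == nil:
                req.replyChan <- readResponse{err: ErrNotFound}
            default:
                req.replyChan <- readResponse{testRow: found[req.pk]}
            }
        }
    }
}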

Does this solution fit your problem?

Good software solutions fit specific situations and are no good for other challenges. Micro-batching may fit your needs if:

  1. A single-statement transaction is executed at a high rate and creates a bottleneck on the DB.
  2. The time between the arrival of the first request in a batch and of the last is short and tolerable for your SLA.
  3. Transactional consistency is not required.
  4. The DB bottleneck is CPU, not I/O.

Micro-batching works well for write operations too, as long as they are not part of a multi-step transaction.
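As an illustration only (the table here is the benchmark read table, and this function is not part of the repository), a micro-batched insert can pass the collected column values as array parameters and expand them with unnest:

package main

import (
    "context"

    "github.com/jackc/pgx/v4/pgxpool"
)

// writeBatch inserts one micro-batch of rows with a single statement.
// Each column arrives as a Go slice, is encoded as a Postgres array,
// and unnest turns the arrays back into rows on the server side.
func writeBatch(ctx context.Context, pool *pgxpool.Pool, pks, payloads []string) error {
    _, err := pool.Exec(ctx,
        `INSERT INTO random_read_test (pk, payload)
         SELECT * FROM unnest($1::text[], $2::text[])`,
        pks, payloads)
    return err
}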

This approach will not help where the challenge includes:

  1. A load composed of many different transaction types.
  2. A non-trivial transaction that needs transactional consistency.
  3. An I/O-bound bottleneck, where you cannot provide enough memory or IOPS to move the bottleneck to the CPU.

What next

If you got this far, I hope you will take the time to clone the GitHub repository and play with the micro-batching. If anything is not clear in the explanation, please drop us a line and we will clarify.
In a future post, we will discuss more ways in which we made Postgres pull the throughput and response time required in a big data environment. 

Notes

  1. This code is for demonstration purposes. Error handling is at “panic” level to keep the purpose clear.
  2. The idea is inspired by the way Postgres handles commit processing: the configuration parameter “commit_delay” instructs Postgres to batch commits that arrive within the parameterized time period. It is disabled by default.
