
Benchmarks

Methodology

We prepared a set of benchmarks designed to be as reproducible and deterministic as possible.
The dataset used in the benchmarks is deterministic and consists of 40GiB of structured JSON logs (219 million entries).

Example log:

{
  "@timestamp": 897484581,
  "clientip": "162.146.3.0",
  "request": "GET /images/logo_cfo.gif HTTP/1.1",
  "status": 200,
  "size": 1504
}

All benchmarks were run against this dataset; only the computational resources and cluster size varied between runs.

Local Deploy

Tests were run on an AWS c6a.4xlarge host with the following configuration:

CPU                                      | RAM   | Disk
AMD EPYC 7R13 Processor, 3.6GHz, 16 vCPU | 32GiB | GP3

For more details on component setup and how to run the suite, see the link.

Local cluster configuration:

Container              | Replicas | CPU Limit (cores) | RAM Limit
seq-db (--mode single) | 1        | 4                 | 8GiB
elasticsearch          | 1        | 4                 | 8GiB
file.d                 | 1        | 1                 | 2GiB

Results (write‑path)

In the synthetic tests we obtained the following results:

Container     | Avg. Logs/sec | Avg. Throughput | Avg. CPU Usage | Avg. RAM Usage
seq-db        | 370,000       | 48MiB/s         | 3.3 vCPU       | 1.8GiB
elasticsearch | 110,000       | 14MiB/s         | 1.9 vCPU       | 2.4GiB

Thus, with comparable resource usage, seq-db demonstrated on average 3.4× higher throughput than Elasticsearch.

Results (read‑path)

Both stores were pre-loaded with the same dataset. Read-path tests were run without any write load.

Elasticsearch settings:

  • Request cache disabled (request_cache=false)
  • Total hits counting disabled (track_total_hits=false)

Tests were executed using Grafana k6, with query parameters available in the benchmarks/k6 folder.
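
For illustration, a read request with these settings might look like the sketch below (<index> is a placeholder for the index name, and the match_all query is an assumption; the actual queries used by the suite are in benchmarks/k6):

curl -X GET "http://localhost:9200/<index>/_search?request_cache=false" -H 'Content-Type: application/json' -d'
{
  "track_total_hits": false,
  "query": { "match_all": { } }
}'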

Scenario: fetch all logs using offsets

Elasticsearch enforces a default pagination limit: from + size must not exceed 10,000 (index.max_result_window), which caps how deep offset-based paging can go.

Parameters: 20 looping virtual users for 10s, fetching a random page [1–50].
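
A sketch of one such offset request against Elasticsearch (the page size of 100 and the match_all query are assumptions; from is derived as (page - 1) * page_size):

# e.g. random page = 23 with a page size of 100 gives from = 2200
curl -X GET "http://localhost:9200/<index>/_search?request_cache=false" -H 'Content-Type: application/json' -d'
{
  "track_total_hits": false,
  "from": 2200,
  "size": 100,
  "query": { "match_all": { } }
}'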

DB            | Avg    | P50    | P95
seq-db        | 5.56ms | 5.05ms | 9.56ms
elasticsearch | 6.06ms | 5.11ms | 11.8ms

Scenario: filter by status in(500,400,403)

Parameters: 20 VUs for 10s.
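
On the Elasticsearch side this filter can be expressed as a terms query; the sketch below is an assumption about the query shape, not the exact request from the suite:

curl -X GET "http://localhost:9200/<index>/_search?request_cache=false" -H 'Content-Type: application/json' -d'
{
  "track_total_hits": false,
  "query": { "terms": { "status": [500, 400, 403] } }
}'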

DB            | Avg      | P50      | P95
seq-db        | 364.68ms | 356.96ms | 472.26ms
elasticsearch | 21.68ms  | 18.91ms  | 29.84ms

Scenario: search by request "GET /english/images/top_stories.gif HTTP/1.0"

Parameters: 20 looping VUs for 10s.
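
A plausible Elasticsearch form of this query is a phrase match on the request field (whether the suite uses match or match_phrase is an assumption):

curl -X GET "http://localhost:9200/<index>/_search?request_cache=false" -H 'Content-Type: application/json' -d'
{
  "track_total_hits": false,
  "query": { "match_phrase": { "request": "GET /english/images/top_stories.gif HTTP/1.0" } }
}'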

DB            | Avg      | P50      | P95
seq-db        | 269.98ms | 213.43ms | 704.19ms
elasticsearch | 46.65ms  | 43.27ms  | 80.53ms

Scenario: aggregation counting logs by status

SQL analogue: SELECT status, COUNT(*) FROM logs GROUP BY status.
Parameters: 10 parallel queries, 2 VUs.
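
In Elasticsearch this corresponds to a terms aggregation on status; the sketch below is illustrative and may differ from the exact request in benchmarks/k6:

curl -X GET "http://localhost:9200/<index>/_search?request_cache=false" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "track_total_hits": false,
  "aggs": {
    "logs_by_status": { "terms": { "field": "status" } }
  }
}'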

DB            | Avg    | P50    | P95
seq-db        | 16.81s | 16.88s | 16.10s
elasticsearch | 6.46s  | 6.44s  | 6.57s

Scenario: minimum log size for each status

SQL analogue: SELECT status, MIN(size) FROM logs GROUP BY status.
Parameters: 5 iterations with 1 thread.
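
An Elasticsearch analogue is a terms aggregation on status with a min sub-aggregation on size (a sketch; it assumes size is indexed as a numeric field, whereas the k8s-logs-index mapping below maps size as keyword, which a min aggregation would not accept):

curl -X GET "http://localhost:9200/<index>/_search?request_cache=false" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "track_total_hits": false,
  "aggs": {
    "by_status": {
      "terms": { "field": "status" },
      "aggs": { "min_size": { "min": { "field": "size" } } }
    }
  }
}'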

DB            | Avg    | P50    | P95
seq-db        | 33.34s | 33.41s | 33.93s
elasticsearch | 16.88s | 16.82s | 17.5s

Scenario: range queries — fetch 5,000 documents

Parameters: 20 threads, 10s, random page [1–50], 100 documents per page.

DB            | Avg      | P50      | P95
seq-db        | 406.09ms | 385.13ms | 509.05ms
elasticsearch | 22.75ms  | 18.06ms  | 64.61ms

K8S Deploy

Cluster compute resources:

Container                   | CPU                       | RAM          | Disk
seq-db (--mode store)       | Xeon Gold 6240R @ 2.40GHz | DDR4 3200MHz | RAID10, 4×SSD
seq-db (--mode ingestor)    | Xeon Gold 6240R @ 2.40GHz | DDR4 3200MHz |
elasticsearch (master/data) | Xeon Gold 6240R @ 2.40GHz | DDR4 3200MHz | RAID10, 4×SSD
file.d                      | Xeon Gold 6240R @ 2.40GHz | DDR4 3200MHz |

We selected a baseline set of fields to index. Elasticsearch was set up with the k8s-logs-index index so that only those fields are indexed.

Configuration 1x1

Index settings (the same settings, including durability guarantees, were applied to seq-db):

curl -X PUT "http://localhost:9200/k8s-logs-index/" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index": {
      "number_of_shards": "6",
      "refresh_interval": "1s",
      "number_of_replicas": "0",
      "codec": "best_compression",
      "merge.scheduler.max_thread_count": "2",
      "translog": { "durability": "request" }
    }
  },
  "mappings": {
    "dynamic": "false",
    "properties": {
      "request": { "type": "text" },
      "size": { "type": "keyword" },
      "status": { "type": "keyword" },
      "clientip": { "type": "keyword" }
    }
  }
}'

Results

Container             | CPU Limit (cores) | RAM Limit | Avg. CPU (cores) | Avg. RAM
seq-db (--mode store) | 10                | 16GiB     | 8.8              | 13.2GB
seq-db (--mode proxy) | 6                 | 8GiB      | 4.92             | 4.9GiB
elasticsearch (data)  | 16                | 32GiB     | 15.18            | 13GB

Container     | Avg. Throughput | Avg. Logs/sec
seq-db        | 181MiB/s        | 1,403,514
elasticsearch | 61MiB/s         | 442,924

Here, seq-db achieved roughly 2.9× higher throughput while using fewer resources.

Configuration 6x6

Six seq-db instances were run with --mode proxy and six with --mode store. The Elasticsearch index settings stayed the same, except number_of_replicas was set to 1:

curl -X PUT "http://localhost:9200/k8s-logs-index/" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "index": {
      "number_of_shards": "6",
      "refresh_interval": "1s",
      "number_of_replicas": "1",
      "codec": "best_compression",
      "merge.scheduler.max_thread_count": "2",
      "translog": { "durability": "request" }
    }
  },
  "mappings": {
    "dynamic": "false",
    "properties": {
      "request": { "type": "text" },
      "size": { "type": "keyword" },
      "status": { "type": "keyword" },
      "clientip": { "type": "keyword" }
    }
  }
}'

Results

Container             | CPU Limit (cores) | RAM Limit | Replicas | Avg. CPU per instance (cores) | Avg. RAM per instance
seq-db (--mode proxy) | 3                 | 8GiB      | 6        | 1.87                          | 2.2GiB
seq-db (--mode store) | 10                | 16GiB     | 6        | 7.40                          | 2.5GiB
elasticsearch (data)  | 13                | 32GiB     | 6        | 7.34                          | 8.8GiB

Container     | Avg. Throughput | Avg. Logs/sec
seq-db        | 436MiB/s        | 3,383,724
elasticsearch | 62MiB/s         | 482,596

Thus, in the 6x6 configuration seq-db sustained roughly 7× the throughput of Elasticsearch.