ElasticSearch with a SMILE

Dmitry Olshansky
6 min read · Oct 28, 2020


A short introduction to my background and why I know quite a few things about ElasticSearch: I’ve been a lead SRE at Tinkoff.ru for almost two years now, and was a lead software engineer there prior to that. We are (like many others) building many interesting products using ElasticSearch, and OpenDistro specifically. In fact, I’ve recently contributed our patches to the upstream of the latter.

I’ve wanted to write a good ElasticSearch blog post for a long time, but the circumstances and the material were never right: the topic was either too complicated, already well covered, or plainly too specific to a particular environment.

Thankfully, an opportunity came along to write about a small protocol optimization that, on the one hand, is (almost) fully documented in the official sources, and on the other hand requires one to jump through a number of hoops to get working. So, without further introductions, let’s start with the subject matter.

The romance of ElasticSearch and Content-Type(s)

I’ve been following the development of ElasticSearch since the days of 1.1, the first version I used in production. Even accounting for the skipped numbers in the 2.x→5.x version jump, that is a long road, and it’s worth highlighting a bit of history that is directly relevant to the trick I’m about to describe.

While from the beginning ElasticSearch was positioned as an easily (elastically!) scalable RESTful distributed search engine with JSON as the exchange format, the team experimented a lot with different APIs (e.g. a Thrift API was supported for some time) and formats, following the HTTP and REST principles of content-type negotiation.

So with this in mind, it comes as no surprise today that the following command:

curl -H 'Accept: application/yaml' localhost:9200

produces the well-known banner, but neatly arranged in YAML format (below is a representative, slightly trimmed sample; your node name, UUID, and versions will differ):
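
---
name: "node-1"
cluster_name: "elasticsearch"
cluster_uuid: "xxxxxxxxxxxxxxxxxxxxxx"
version:
  number: "7.9.2"
  lucene_version: "8.6.2"
tagline: "You Know, for Search"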

But wait, there is more! ElasticSearch supports not only YAML and JSON but also their binary cousins, CBOR and SMILE. The last sentence of the relevant documentation is particularly interesting: “The bulk and multi-search APIs support NDJSON, JSON, and SMILE; other types will result in an error response”. Literally, this means that the only binary format supported everywhere, on equal footing with JSON, is SMILE. That is the reason behind today’s blog post, even though I discovered the mechanism myself before fact-checking whether it was properly documented.

Also of possible interest: ElasticSearch metadata and inter-node communication over the transport layer are done in SMILE, while the documents themselves are kept in whatever format they were sent in. The idea, I assume, is that a subsequent search will ask for the same format in the response, thus saving on transcoding. What this means is that in order to benefit from a different format we have to index the data in that format, while retaining the ability to get (search) responses in any content type we ask for, via transcoding.

Working with a SMILE

Let’s start with transcoding JSON to SMILE, to produce binary documents for indexing. Thankfully, SMILE was designed by the same folks (the FasterXML group) who gave us the famous Jackson library, which makes it a breeze to produce and re-encode SMILE from any other supported format. I’ll use Kotlin throughout this blog post, with the main goal of keeping the code as simple as possible while still efficient enough to be representative of production code. Lastly, I believe snippets in blog posts should be easily portable to any language, so I minimize the amount of language-specific constructs.
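
In essence, the conversion is just a few lines of Jackson. A minimal sketch (it assumes the jackson-databind and jackson-dataformat-smile modules on the classpath, and takes the input and output paths as arguments):

import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.dataformat.smile.SmileFactory
import java.io.File

fun main(args: Array<String>) {
    // A plain mapper parses the JSON input; a SMILE-backed mapper re-encodes it.
    val jsonMapper = ObjectMapper()
    val smileMapper = ObjectMapper(SmileFactory())

    // Jackson's tree model (JsonNode) is format-agnostic,
    // so transcoding is a simple read-then-write.
    val tree = jsonMapper.readTree(File(args[0]))
    File(args[1]).writeBytes(smileMapper.writeValueAsBytes(tree))
}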

The script above is the essence of JSON → SMILE conversion (full-featured script + POM). Finally, we can upload the resulting SMILE documents with curl like this:

# note that this command posts SMILE content but passes an Accept header asking
# for JSON as the response format, to avoid dumping binary SMILE to the console
curl -XPOST -H 'Content-type: application/smile' localhost:9200/test-index/_doc --data-binary @doc.smile -H 'Accept: application/json'

It’s tempting to go straight to benchmarking: obviously, using a binary format instead of text should give us a non-zero performance benefit. However, there is one last issue to sort out: how to compose Bulk API requests in the SMILE format. The documentation at the time of writing is lacking here, listing \n as the only separator, which doesn’t work (I tried it so you don’t have to). I kept thinking about this missing bit of information for a few days and finally decided to dive into the ElasticSearch source code for answers. Sure enough, the code doesn’t reference an \n constant anywhere; instead it points to a format-specific streaming separator value, which happens to be the byte 0xFF for SMILE. With that knowledge in hand, we are ready to prepare a SMILE-ready bulk insertion script.
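
Here is the core of it: a minimal sketch that glues pre-encoded SMILE documents into a _bulk request body (the index-only action metadata is just the simplest case):

import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.dataformat.smile.SmileFactory
import java.io.ByteArrayOutputStream

// Builds a _bulk body: alternating action-metadata and document objects,
// each terminated by 0xFF, the SMILE stream separator (the role that '\n'
// plays for NDJSON). POST it with Content-type: application/smile.
fun smileBulkBody(docs: List<ByteArray>, index: String): ByteArray {
    val mapper = ObjectMapper(SmileFactory())
    val action = mapper.writeValueAsBytes(mapOf("index" to mapOf("_index" to index)))
    val body = ByteArrayOutputStream()
    for (doc in docs) {
        body.write(action)
        body.write(0xFF)
        body.write(doc) // the document is already encoded as SMILE
        body.write(0xFF)
    }
    return body.toByteArray()
}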

The full script (full version of the script) builds on this: it uses a thread pool to schedule a fixed number of parallel inserter threads reading from a work queue, plus a single main thread that splits the input file on the separator value and dispatches batches of documents to that queue. This proves to be enough to easily saturate the CPUs of a small cluster from a single laptop, and more than enough for our benchmark setup, as we’ll see in the next section.
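
In sketch form, the split-and-dispatch logic looks roughly like this (a loose reconstruction reusing smileBulkBody from above; postBulk stands in for the actual HTTP call, and error handling is omitted):

import java.util.concurrent.ArrayBlockingQueue
import java.util.concurrent.Executors

fun ingest(raw: ByteArray, docsPerBatch: Int, workers: Int, postBulk: (ByteArray) -> Unit) {
    val queue = ArrayBlockingQueue<List<ByteArray>>(workers * 2)
    val pool = Executors.newFixedThreadPool(workers)
    repeat(workers) {
        pool.execute {
            while (true) {
                val batch = queue.take()
                if (batch.isEmpty()) break // an empty list is the stop signal
                postBulk(smileBulkBody(batch, "test-index"))
            }
        }
    }
    // Main thread: slice the input on the 0xFF separator, group into batches.
    var start = 0
    var batch = mutableListOf<ByteArray>()
    for (i in raw.indices) {
        if (raw[i] != 0xFF.toByte()) continue
        batch.add(raw.copyOfRange(start, i))
        start = i + 1
        if (batch.size == docsPerBatch) {
            queue.put(batch)
            batch = mutableListOf()
        }
    }
    if (batch.isNotEmpty()) queue.put(batch)
    repeat(workers) { queue.put(emptyList()) } // one stop signal per worker
    pool.shutdown()
}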

Benchmark

Benchmarking is both an art and a science, so I’ll walk through the key aspects of how I obtained the graph below, using that script as the main tool.

First, I tend to use shell-script runners on top of existing command-line tools, as it keeps the tool itself reusable and lets me play with parameters. The script below produces CSV files with raw metrics from the elapsed wall-clock time reported by the Kotlin driver script.
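
Roughly, the runner looks like this (a sketch: the jar name, flags, and bulk-size values are placeholders for the actual driver’s CLI):

#!/bin/bash
# Sweep both formats over a range of bulk sizes, 5 runs each;
# the best run per combination is picked later, in the spreadsheet.
for fmt in json smile; do
  for bulk in 500 1000 2000 4000 8000 16000; do
    for run in 1 2 3 4 5; do
      elapsed=$(java -jar bulk-insert.jar --format "$fmt" --bulk-size "$bulk" docs."$fmt")
      echo "$fmt,$bulk,$run,$elapsed" >> results.csv
    done
  done
done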

Secondly, to minimize the effect of noise, I plotted the best of 5 runs for each parameter combination. The dataset, 300 MiB of HTTP access logs, was taken from the ES Rally macro-benchmark framework by the elastic.co folks. If you do data science you are probably looking for standard errors and a least-squares fit, which would be nice to have, but there is only so much I can quickly do in my spreadsheet app. To further simplify reproduction, I used the public cloud I happen to know best: Azure. Nothing should prevent you from using any other cloud to obtain similar numbers; in fact, I ran preliminary tests on my laptop and got the same performance ratio of JSON vs SMILE.

The ElasticSearch server was run with the stock configuration from the tarball, except for setting the heap size (8 GiB) and the network configuration, plus fixing the sysctl vm.max_map_count=500000 and the memlock=unlimited ulimit to pass the production bootstrap checks.
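
For reference, those knobs amount to roughly the following (a sketch; exact paths and mechanisms vary by distribution):

# heap size, in config/jvm.options:
#   -Xms8g
#   -Xmx8g
# kernel setting for the bootstrap checks:
sysctl -w vm.max_map_count=500000
# memlock for the ElasticSearch user, e.g. in /etc/security/limits.conf:
#   elasticsearch - memlock unlimited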

Client instance: a Standard D2s v3 (2 vCPUs, 8 GiB). Server instance: a Standard D4s v3 (4 vCPUs, 16 GiB) with a 1 TiB Premium SSD disk (5k IOPS).

Average CPU utilization of the client during the benchmark was 20%. Average CPU utilization of the server was 75–85%, with higher utilization correlating with more optimal bulk sizes and with the SMILE format. So, finally, with the routine and disclaimers safely out of the way, the picture:

Discussion

The interesting part is not the 5+% throughput improvement across all bulk sizes. It’s the shape: SMILE’s gains are larger at both the too-small and the too-large ends of the range. That is significant, and let me briefly explain why. Grossly overshooting the bulk request size, sometimes by an order of magnitude beyond the right value, is done routinely in many ElasticSearch setups I have audited. Picking the right bulk size is an interesting problem and a good topic for its own blog entry; my verdict is that the only way to completely solve it is dynamic optimization in a self-tuning architecture. The fact that the SMILE version is much less sensitive to hitting the bulk-size sweet spot helps a lot: what is a good size today will change tomorrow, when you are not around to fix it.

Some closing thoughts on where the 5% better CPU utilization comes from. The bulk of it is likely reduced GC work and pause times: less text to process and ~20% smaller payloads may translate to less frequent stop-the-world pauses.

Anyhow, this is only guesswork for now; such things are never what they seem, and an in-depth analysis is outside the scope of this post. I’m also not covering the potential savings in network bandwidth: those are obviously nice to have, but typically amount to a 10–20% reduction in size (better for larger documents).

And that’s it, the one (brand new!) weird trick to speed up your bulk indexing by 5+% with modest effort on the side of the ingest application.
