A 7MB native-image Java app that runs in 30ms and uses only 4MB of RAM!

Post date: May 28, 2018 3:06:49 PM

GraalVM, a new polyglot compiler for Java and several other languages, can generate native images that let JVM-based languages run, in theory at least, with far less bloat, in both total executable size and memory usage, than a full-blown JVM.

This is perfect for short-running executables such as CLIs and serverless implementations, but also great for micro-services which restart often as they can be started up in milliseconds.

As I was writing a CLI for RawHTTP, I thought this would be a perfect time to try out GraalVM and see if the improvements it promises are actually delivered.

In this post, I explain how I created a native image for the RawHTTP CLI using GraalVM. It fits in a single executable just over 7MB in size, and can run as either a HTTP client or a HTTP server that serves a local directory.

I also compare the performance of the native image with the plain Java application running on the JVM and with commonly used native tools: curl as a HTTP client and Apache httpd as a HTTP server.

Unfortunately, I did not manage to include Java 9/10 native images (I wrote about those before) in this comparison due to several issues I've had trying to convert my application to use Java 9 modules. It would be interesting to perform that comparison in the future.

RawHTTP CLI client on the JVM

Before we start, let me introduce RawHTTP: it's a zero-dependency, reflection-less Java library that implements the core HTTP/1.1 specification, RFC-7230. I wrote more about it in a previous blog post, but in summary, RawHTTP makes it easy to write both HTTP servers and clients by providing only the basic HTTP protocol primitives, leaving other things like server routing and client cookies to other libraries. It goes against the current trend in the Java world of using full-stack frameworks that make use of annotations and reflection for everything: it's just vanilla Java code. Because of that, it's extremely light and fast: the core jar weighs in at 102KB.

The RawHTTP CLI is another library that uses core RawHTTP to implement a CLI tool that can run as a HTTP server (serving a local directory) or as a HTTP client.

Before we get to the native image and GraalVM, let's see how the CLI normally works on the JVM.

If you want to follow along, download the rawhttp.jar file from JCenter. Running the jar requires that you have Java SE 8+ installed.

To run a HTTP server on port 8082 serving the public/ directory, type this:

java -jar rawhttp.jar serve public/ --port 8082

To test that the server is working, from another shell, use curl to get the contents of a file within the public directory:

$ curl -v localhost:8082/my-file

*   Trying ::1...

* TCP_NODELAY set

* Connected to localhost (::1) port 8082 (#0)

> GET /my-file HTTP/1.1

> Host: localhost:8082

> User-Agent: curl/7.54.0

> Accept: */*

>

< HTTP/1.1 200 OK

< Content-Type: text/plain

< Cache-Control: no-cache

< Pragma: no-cache

< Content-Length: 24

< Date: Mon, 28 May 2018 07:29:20 GMT

< Server: RawHTTP

<

Hello, this is my file.

* Connection #0 to host localhost left intact

We can also use the RawHTTP CLI, of course, to do the same thing:

$ java -jar rawhttp.jar send -t "GET localhost:8082/my-file"

HTTP/1.1 200 OK

Content-Type: text/plain

Cache-Control: no-cache

Pragma: no-cache

Content-Length: 24

Date: Mon, 28 May 2018 07:31:27 GMT

Server: RawHTTP

Hello, this is my file.

If you run both of the commands above, you'll notice how curl runs a lot faster than java -jar.

We can measure exactly how long the commands take with the time built-in. I am using zsh, whose time built-in can be configured (through the TIMEFMT variable) to also show the memory consumption of a process, among other things.
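For readers who want reports shaped like the ones in this post, zsh reads the format of its time built-in from the TIMEFMT variable. The format below is a sketch reconstructed from the labels in the output, not a copy of the author's configuration. The unit label on the max memory line is an assumption chosen to match the figures quoted in this post, so check it against a process of known size on your platform.

```shell
# Sketch of a ~/.zshrc setting (an assumption, reconstructed from the report labels).
# %J = command, %U/%S = user/system CPU time, %P = CPU percentage, %*E = elapsed time,
# %X/%D/%K = average shared/unshared/total memory, %M = maximum memory,
# %F/%R = page faults from disk / other page faults.
TIMEFMT='%J   %U  user %S system %P cpu %*E total
avg shared (code):         %X KB
avg unshared (data/stack): %D KB
total (sum):               %K KB
max memory:                %M KB
page faults from disk:     %F
other page faults:         %R'
```

With this in place, prefixing any command with time in zsh prints the seven-field report after the command exits.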

Here are the results when I run time on the above commands:

curl

curl -v localhost:8082/my-file  

0.01s  user 0.01s system 64% cpu 0.038 total

avg shared (code):         0 KB

avg unshared (data/stack): 0 KB

total (sum):               0 KB

max memory:                4932 MB

page faults from disk:     3

other page faults:         1475

java -jar

java -jar rawhttp.jar send -t   

0.34s  user 0.06s system 126% cpu 0.318 total

avg shared (code):         0 KB

avg unshared (data/stack): 0 KB

total (sum):               0 KB

max memory:                37568 MB

page faults from disk:     210

other page faults:         9549

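In summary, curl runs in 0.038s, java -jar in 0.318s. curl uses around 4.9MB of RAM, java -jar uses 37MB. (The time output labels max memory as MB, but the figures are evidently kilobytes: 4932 is about 4.9MB.)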

That shows why the JVM is not generally great for CLIs.
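To put those figures in proportion, the ratios can be worked out with a one-line helper. This is only a sketch: ratio is a hypothetical function name, and the inputs are the elapsed times and max memory values from the two reports above.

```shell
# ratio A B: prints A divided by B, rounded to one decimal place.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

ratio 0.318 0.038   # elapsed time, java -jar vs curl: prints 8.4
ratio 37568 4932    # max memory, java -jar vs curl:   prints 7.6
```

So for this single request, java -jar takes roughly 8 times as long as curl and needs about 7.6 times the memory.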

RawHTTP CLI client in a GraalVM native-image

Not surprisingly, a Java process takes much more memory to run and needs a longer time to start up than a native tool like curl.

To make things more competitive, however, we can create a native image for the Java program using GraalVM's native-image command.

For a fat jar like the one we've been using, the command is very simple:

native-image -jar rawhttp.jar

This may take a minute or so... it looks like the GraalVM compiler is a little bit slow at the moment!

However, once it's done, you can try running the executable immediately and see what the result is:

$ ./rawhttp send -t "GET localhost:8082/my-file"

HTTP/1.1 200 OK

Content-Type: text/plain

Cache-Control: no-cache

Pragma: no-cache

Content-Length: 24

Date: Mon, 28 May 2018 07:57:38 GMT

Server: RawHTTP

Hello, this is my file.

We can see it works, and feels very responsive! But just how fast is it? Let's time it:

./rawhttp send -t "GET localhost:8082/my-file"  

0.01s  user 0.03s system 34% cpu 0.125 total

avg shared (code):         0 KB

avg unshared (data/stack): 0 KB

total (sum):               0 KB

max memory:                4852 MB

page faults from disk:     597

other page faults:         838

The native process takes only 4.8MB and runs in 0.125s. Not bad!!

But I noticed that there were a lot of page faults from disk. It turns out that when the command is run again, something changes (probably the filesystem cache) and we get no page faults from disk, so the process runs quite a lot faster:

./rawhttp send -t "GET localhost:8082/my-file"  

0.01s  user 0.01s system 56% cpu 0.028 total

avg shared (code):         0 KB

avg unshared (data/stack): 0 KB

total (sum):               0 KB

max memory:                4836 MB

page faults from disk:     0

other page faults:         1431

Now, the process is actually about as fast and as memory-intensive as curl (running curl multiple times does not change its statistics significantly).

From the previous section:

In summary, curl runs in 0.038s, java -jar in 0.318s. curl uses around 4.9MB of RAM, java -jar uses 37MB.

The native Java application process ran in 0.028s and used 4.8MB.

This is quite amazing for a Java application and could revolutionize the applicability of JVM-based applications.

An important caveat is that the GraalVM native-image compiler still has significant limitations. For example, Java reflection is only partially supported and takes some effort to get working. That was not relevant for the RawHTTP CLI, as it does not use any reflection, but just about any Java framework (Netty, for example) will have trouble running natively because of it. Also, the Java security packages are not currently supported, so HTTPS connections will not work with the native image (which is a bummer for the RawHTTP CLI, of course). Both of these limitations seem to be temporary, though, and once they are removed, GraalVM will seriously benefit the JVM world.

Native HTTP Server Performance

In the previous sections, we looked at how long it takes for the HTTP client to start up, send a HTTP request and print the response on the terminal, and how much memory the process needed.

But how about the HTTP server?!

We obviously don't want to know how long the HTTP server runs for (it should run forever!), but it's useful to know how long it takes for it to start up and accept a connection, as well as how much memory it uses.

To find out the time-to-first-connection, I wrote a small bash script that starts the server, then keeps trying to send it a HTTP request, measuring how long it takes for the first request to succeed.

It works on Mac OSX with coreutils installed... on Linux, just replace gdate with date and it should work:

#!/bin/bash

# command that tests that the server is accepting connections
CMD="curl localhost:8082 >& /dev/null"

# the clock starts before the server is launched
START_TIME=$(gdate +%s%3N)

# start the server
../rawhttp serve . -p 8082 &
SERVER_PID=$!

STEP=0.001 # sleep between tries, in seconds
TRIES=500

# keep trying until the first request succeeds
eval ${CMD}
while [[ $? -ne 0 ]]; do
    ((TRIES--))
    echo -ne "Tries left: $TRIES"\\r
    if [[ $TRIES -eq 0 ]]; then
        echo "Server not started within timeout"
        exit 1
    fi
    sleep ${STEP}
    eval ${CMD}
done

END_TIME=$(gdate +%s%3N)
TIME=$(($END_TIME - $START_TIME))
echo "Server connected in $TIME ms"

#./rawhttp send -t "GET localhost:8082/hello"
kill ${SERVER_PID}
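The script needs a date command that understands %3N (milliseconds), which is why it calls gdate on a Mac. As a sketch of how the same script could run unchanged on both systems, the helper below tries gdate first, then date, and falls back to whole seconds if neither prints a plain number. The name now_ms is hypothetical and not part of the original script.

```shell
# now_ms: prints the current time in milliseconds since the epoch.
now_ms() {
  for cmd in gdate date; do
    if command -v "$cmd" >/dev/null 2>&1; then
      ms=$("$cmd" +%s%3N 2>/dev/null)
      case "$ms" in
        ''|*[!0-9]*) ;;               # not a plain number: %3N unsupported, try the next one
        *) echo "$ms"; return 0 ;;
      esac
    fi
  done
  # last resort: second resolution only
  echo "$(( $(date +%s) * 1000 ))"
}
```

In the script, each $(gdate +%s%3N) would then become $(now_ms).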

Running this script a few times, this is the output I get:

$ ./time-rawhttp.sh

Serving directory /Users/renato/programming/projects/rawhttp/rawhttp-cli-tests/. on port 8082

Server connected in 112 ms

$ ./time-rawhttp.sh

Serving directory /Users/renato/programming/projects/rawhttp/rawhttp-cli-tests/. on port 8082

Server connected in 53 ms

$ ./time-rawhttp.sh

Serving directory /Users/renato/programming/projects/rawhttp/rawhttp-cli-tests/. on port 8082

Server connected in 40 ms

Even though there's some variance, the server starts up rather quickly.

We can also measure the memory the server uses when bombarded with HTTP requests by running it with the time command and then using the ab tool to benchmark the server.

Start the server:

$ time ./rawhttp serve public/ --port 8082

From another shell, run ab:

$ ab -n 100 -c 2 localhost:8082/my-file

Here's the report I got from ab:

Server Software:        RawHTTP

Server Hostname:        localhost

Server Port:            8082

Document Path:          /my-file

Document Length:        24 bytes

Concurrency Level:      2

Time taken for tests:   0.058 seconds

Complete requests:      100

Failed requests:        0

Total transferred:      18600 bytes

HTML transferred:       2400 bytes

Requests per second:    1719.19 [#/sec] (mean)

Time per request:       1.163 [ms] (mean)

Time per request:       0.582 [ms] (mean, across all concurrent requests)

Transfer rate:          312.27 [Kbytes/sec] received

Connection Times (ms)

              min  mean[+/-sd] median   max

Connect:        0    0   0.1      0       0

Processing:     1    1   0.2      1       2

Waiting:        1    1   0.2      1       2

Total:          1    1   0.2      1       2

Percentage of the requests served within a certain time (ms)

  50%      1

  66%      1

  75%      1

  80%      1

  90%      1

  95%      2

  98%      2

  99%      2

 100%      2 (longest request)

Now, stop the server by entering Ctrl+C in its shell. Here's the report I got from time:

./rawhttp serve public/ --port 8082  

0.08s  user 0.14s system 0% cpu 4:27.58 total

avg shared (code):         0 KB

avg unshared (data/stack): 0 KB

total (sum):               0 KB

max memory:                16116 MB

page faults from disk:     935

other page faults:         3373
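The report above comes from zsh's time built-in. In bash and other shells, the external time program can report peak memory instead: BSD time (as shipped with macOS) prints the process's resource usage with -l, and GNU time (available on most Linux distributions, though not always installed by default) does so with -v. The sketch below picks the flag from the operating system name; peak_mem_flag is a hypothetical helper name.

```shell
# peak_mem_flag: prints the flag that makes the external time(1) report peak memory.
peak_mem_flag() {
  case "$(uname -s)" in
    Darwin|*BSD) echo "-l" ;;   # BSD time: -l prints resource usage, incl. maximum resident set size
    *)           echo "-v" ;;   # GNU time: -v prints "Maximum resident set size (kbytes)"
  esac
}

# Usage (not executed here):
#   /usr/bin/time "$(peak_mem_flag)" ./rawhttp serve public/ --port 8082
```

As with the zsh built-in, the report is printed once the server process has been stopped.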

Note: When running ab multiple times, or just running 1000 requests instead of 100, the native server process crashed for me. That did not happen with the normal JVM process. I couldn't find out why the process was crashing, as no crash dump seems to be generated when that happens, but I'd love to know how to do that. I will update this blog post if I ever figure it out. I have created an issue with the GraalVM project.

UPDATE (09th June 2018): it turns out the server was not closing a FileInputStream properly, causing the server to crash (it probably only works in the JVM because the finalize method would close the handle). As pointed out in the GitHub issue, just using FileInputStream is not a great idea anyway... after I updated the code to close the InputStream as soon as possible, and started using Files.newInputStream() instead, the server stopped crashing and can handle a very high load without problems.
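A leaked file handle like the one described in the update shows up as a steadily growing number of open files for the server process while it is under load. This is not how the problem was diagnosed in this post; the sketch below is just one way to watch for it, and open_fd_count is a hypothetical helper name. It reads /proc on Linux and falls back to lsof elsewhere.

```shell
# open_fd_count PID: prints roughly how many files the process has open.
open_fd_count() {
  pid=$1
  if [ -d "/proc/$pid/fd" ]; then
    # Linux: one entry per open file descriptor
    ls "/proc/$pid/fd" | wc -l | tr -d ' '
  else
    # macOS/BSD: lsof prints a header line, then one line per open file
    # (this also counts entries such as the working directory and mapped libraries)
    lsof -p "$pid" 2>/dev/null | tail -n +2 | wc -l | tr -d ' '
  fi
}

# Example: sample the server's count once per second while ab is running
# (SERVER_PID is assumed to hold the server's process id):
#   while kill -0 "$SERVER_PID" 2>/dev/null; do open_fd_count "$SERVER_PID"; sleep 1; done
```

A count that keeps climbing with every batch of requests, instead of settling, points to handles that are never closed.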

Baseline: Apache HTTP Server Performance

To have a baseline for server performance, just like we used curl to baseline the HTTP client, let's use an Apache HTTP Server so that we have an idea of how the native HTTP server is performing.

Mac OS already comes with Apache installed, so all I had to do was to configure a Virtual Host and expose the public/ directory before getting the server to serve static files.

To measure the server startup, I've used the same bash script as before, but changing the line that starts the server:

# start the server

apachectl start &

I also changed how the server is killed:

#kill ${SERVER_PID}

apachectl stop

Results:

$ sudo ./time-rawhttp.sh

Password:

Server connected in 771 ms

$ sudo ./time-rawhttp.sh

Server connected in 417 ms

$ sudo ./time-rawhttp.sh

Server connected in 402 ms

ab report:

Server Software:        Apache/2.4.29

Server Hostname:        rawhttp

Server Port:            8081

Document Path:          /my-file.txt

Document Length:        24 bytes

Concurrency Level:      2

Time taken for tests:   0.081 seconds

Complete requests:      100

Failed requests:        0

Total transferred:      26900 bytes

HTML transferred:       2400 bytes

Requests per second:    1241.16 [#/sec] (mean)

Time per request:       1.611 [ms] (mean)

Time per request:       0.806 [ms] (mean, across all concurrent requests)

Transfer rate:          326.05 [Kbytes/sec] received

Connection Times (ms)

              min  mean[+/-sd] median   max

Connect:        0    0   0.0      0       0

Processing:     1    1   0.5      1       4

Waiting:        0    1   0.5      1       3

Total:          1    2   0.5      1       4

WARNING: The median and mean for the total time are not within a normal deviation

        These results are probably not that reliable.

Percentage of the requests served within a certain time (ms)

  50%      1

  66%      2

  75%      2

  80%      2

  90%      2

  95%      2

  98%      4

  99%      4

 100%      4 (longest request)

JVM HTTP Server Performance

Let's compare that with the normal, JVM-based process.

Change the line that starts the server in the bash script:

# start the server

java -jar rawhttp.jar serve public/ --port 8082 &

Run the script a few times:

$ rawhttp-cli-tests/time-rawhttp.sh

Serving directory /Users/renato/programming/projects/rawhttp/public on port 8082

Server connected in 651 ms

$ rawhttp-cli-tests/time-rawhttp.sh

Serving directory /Users/renato/programming/projects/rawhttp/public on port 8082

Server connected in 546 ms

$ rawhttp-cli-tests/time-rawhttp.sh

Serving directory /Users/renato/programming/projects/rawhttp/public on port 8082

Server connected in 507 ms

Now, start the server with the time command, run ab from a separate shell, then kill the server process as we did for the native server.

time output:

java -jar rawhttp.jar serve public/ --port 8082  

1.90s  user 0.45s system 4% cpu 48.256 total

avg shared (code):         0 KB

avg unshared (data/stack): 0 KB

total (sum):               0 KB

max memory:                68068 MB

page faults from disk:     213

other page faults:         17220

ab report:

Server Software:        RawHTTP

Server Hostname:        localhost

Server Port:            8082

Document Path:          /my-file

Document Length:        24 bytes

Concurrency Level:      2

Time taken for tests:   0.126 seconds

Complete requests:      100

Failed requests:        0

Total transferred:      18600 bytes

HTML transferred:       2400 bytes

Requests per second:    794.57 [#/sec] (mean)

Time per request:       2.517 [ms] (mean)

Time per request:       1.259 [ms] (mean, across all concurrent requests)

Transfer rate:          144.33 [Kbytes/sec] received

Connection Times (ms)

              min  mean[+/-sd] median   max

Connect:        0    0   0.2      0       2

Processing:     1    2   1.2      2       8

Waiting:        1    2   1.2      2       8

Total:          1    2   1.2      2       9

Percentage of the requests served within a certain time (ms)

  50%      2

  66%      2

  75%      3

  80%      3

  90%      4

  95%      5

  98%      7

  99%      9

 100%      9 (longest request)
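With three ab reports to compare, it can help to pull out just the throughput line from each. The helper below is a small sketch that assumes the reports were saved to files (the file name in the comment is hypothetical); requests_per_second is likewise a made-up name.

```shell
# requests_per_second: reads an ab report from stdin (or from files given as
# arguments) and prints the number on the "Requests per second" line.
requests_per_second() {
  awk '/^Requests per second:/ { print $4 }' "$@"
}

# Example with the line from the JVM report above:
requests_per_second <<'EOF'
Requests per second:    794.57 [#/sec] (mean)
EOF
# prints 794.57

# With a saved report (hypothetical file name):
#   ab -n 100 -c 2 localhost:8082/my-file > jvm-report.txt
#   requests_per_second jvm-report.txt
```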

Conclusion

GraalVM's native-image produces binaries that look and feel like native applications written in native languages like C.

A summary of the numbers collected in the previous sections demonstrates that:

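HTTP Client GET Request Performance:

curl: 0.038s, 4.9MB of RAM

java -jar: 0.318s, 37MB of RAM

native image: 0.028s (0.125s on the first run), 4.8MB of RAM

HTTP Server Performance (100 requests, concurrency level = 2):

Apache httpd: first connection after 402 to 771ms, 1241.16 requests/sec (memory not measured)

java -jar: first connection after 507 to 651ms, 794.57 requests/sec, 68MB of RAM

native image: first connection after 40 to 112ms, 1719.19 requests/sec, 16MB of RAM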

Really amazing performance from the GraalVM native-image.

The only problem with GraalVM is that it still does not support all JVM features and seems to be a much less stable runtime at this point in time (May 2018), though the crashes I had could have been caused by something I did wrong myself. I will keep this post updated.

However, once the compiler and runtime mature, there's no doubt that GraalVM is going to make it possible for JVM languages to compete with native languages like Go, Haskell and maybe even Rust and C in some domains where, until now, the JVM's memory and startup-time overhead, or even the need to install or ship a full JVM runtime, was considered unacceptable.

An interesting question is how other native projects based on traditional JVM languages, such as Scala Native and Kotlin Native, may be impacted by, or perhaps even benefit from, GraalVM. Time will tell.

Final Note: I am having some problems publishing RawHTTP on JCenter and Maven Central... if you would like to run the benchmarks above, please build RawHTTP from source until that issue is resolved:

git clone git@github.com:renatoathaydes/rawhttp.git

cd rawhttp

git checkout back-to-java8

./gradlew fatJar # the jar is under rawhttp-cli/build/libs/rawhttp.jar