Announcing RawHTTP - a JVM library for handling raw HTTP

Post date: Dec 10, 2017 9:24:9 PM

HTTP is the language of the internet. It is also the most commonly used communication protocol for REST APIs, which have become the backbone of most distributed applications.

That's probably not news to anyone. But something that a lot of people may not realize is that HTTP (up to version 1.1, more on that later) is a textual protocol you can quite easily write by hand!

For example, here is a GET request directed at the jsontest.com REST API that will return the headers we send out as a JSON object:

GET / HTTP/1.1
Host: headers.jsontest.com
Accept: application/json

Running this request results in the following response (shown exactly as it is returned by the server in bytes-form):

HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
Content-Type: application/json; charset=ISO-8859-1
X-Cloud-Trace-Context: e7be085086214dffdeb6350ad37672ed
Date: Sun, 10 Dec 2017 18:33:53 GMT
Server: Google Frontend
Content-Length: 155

{
   "X-Cloud-Trace-Context": "e7be085086214dffdeb6350ad37672ed/9779089525915638501",
   "Host": "headers.jsontest.com",
   "Accept": "application/json"
}

As you can see, both the request and response are easy to read and write... one could argue you could write them easily by hand. And that's exactly why I decided to create RawHTTP.

RawHTTP is a library written in pure Java, with no dependencies, which lets you create requests and responses from their textual representation, letting you then turn them into bytes which can be sent to either a server in the case of a request, or a HTTP client in the case of a response!

As a side note: do you know the difference between a HTTP request and a HTTP response?

Well, only the first line (called start-line by the HTTP RFC). The rest of the HTTP message (either request or response) is exactly the same. Sure, requests and responses normally include different headers, but the message format is exactly the same:

start-line\r\n
header-name: header-value\r\n
...
\r\n
<optional body>

There is some complication in the framing of the optional body, but essentially that's all there is to a HTTP message.

Pretty neat.
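To make the message format concrete, here is a minimal sketch that assembles a HTTP/1.1 message from its parts using only the Kotlin standard library. The buildHttpMessage function is a hypothetical helper written for this illustration; it is not part of RawHTTP.

```kotlin
// Hypothetical helper (not part of RawHTTP): assembles a HTTP/1.1 message
// from a start-line, a list of headers and an optional body.
fun buildHttpMessage(
    startLine: String,
    headers: List<Pair<String, String>>,
    body: String = ""
): String {
    val message = StringBuilder()
    message.append(startLine).append("\r\n")
    for ((name, value) in headers) {
        message.append(name).append(": ").append(value).append("\r\n")
    }
    message.append("\r\n") // the empty line ends the header section
    message.append(body)
    return message.toString()
}

fun main() {
    // Only the start-line differs between a request and a response
    val request = buildHttpMessage(
        "GET / HTTP/1.1",
        listOf("Host" to "headers.jsontest.com", "Accept" to "application/json")
    )
    val response = buildHttpMessage(
        "HTTP/1.1 200 OK",
        listOf("Content-Length" to "12"),
        "Hello World!"
    )
    // Make the \r\n terminators visible when printing
    println(request.replace("\r\n", "\\r\\n\n"))
    println(response.replace("\r\n", "\\r\\n\n"))
}
```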

So, back to RawHTTP! Given that HTTP messages are so easy to write, you might ask yourself why even bother creating a library to parse them.

That's a fair question. And the answer is that even though it is in fact very easy to write, it is also very easy to make mistakes that make your HTTP message completely invalid, which may cause a connection to hang (if you, for example, get the Content-Length header value wrong!) or to just be rejected right away by the recipient (if you forget to use \r\n as line-separators and carelessly use just \n).
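The two mistakes mentioned above can be avoided with a couple of small helpers. The sketch below uses hypothetical function names (toCrlf and contentLength) and assumes the body is encoded as UTF-8; it only illustrates the idea and is not how RawHTTP does it.

```kotlin
import java.nio.charset.Charset

// Hypothetical helper: rewrites every line ending as \r\n.
// Intended for the start-line and headers only; a body should be sent untouched.
fun toCrlf(head: String): String =
    head.replace("\r\n", "\n").replace("\n", "\r\n")

// Hypothetical helper: Content-Length counts bytes, not characters,
// so the body must be encoded before measuring it.
fun contentLength(body: String, charset: Charset = Charsets.UTF_8): Int =
    body.toByteArray(charset).size

fun main() {
    val body = "Hello World!"
    val head = toCrlf("HTTP/1.1 200 OK\nContent-Length: ${contentLength(body)}\n\n")
    print(head.replace("\r\n", "\\r\\n\n"))
    println(body)
}
```

Note that a non-ASCII character such as "é" takes two bytes in UTF-8, so counting characters instead of bytes would produce a Content-Length that is too small.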

If you can make sure you never make these mistakes (and many others, of course), then you can definitely just do the following, in Kotlin, for example:

All examples in this blog post are written in Kotlin, but should look similar in Java or any other JVM language.

import java.net.InetAddress
import java.net.Socket

val request = "GET / HTTP/1.1\r\n" +
        "Host: ip.jsontest.com\r\n" +
        "Accept: application/json\r\n" +
        "\r\n"

val response = Socket(InetAddress.getByName("ip.jsontest.com"), 80).use {
    it.getOutputStream().write(request.toByteArray())
    // TODO read the response
}

This works perfectly, but reading the response might be a little tricky because you need to be able to read the headers first to be able to determine how to read the body (if any), and as mentioned earlier, reading the body can be tricky if the sender decides to send it out in chunks for efficiency.
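To give an idea of what reading a chunked body involves, here is a rough sketch of a decoder for the chunked transfer coding, written against a plain InputStream. The function names are made up for this example, and error handling is kept to a minimum; it is a sketch of the wire format, not RawHTTP's implementation.

```kotlin
import java.io.ByteArrayOutputStream
import java.io.EOFException
import java.io.InputStream

// Reads one line terminated by \n (normally \r\n), without the terminator.
fun readAsciiLine(input: InputStream): String {
    val line = StringBuilder()
    while (true) {
        val b = input.read()
        if (b == -1) throw EOFException("stream ended in the middle of a line")
        if (b == 10) break                    // 10 = '\n'
        if (b != 13) line.append(b.toChar())  // 13 = '\r', skipped
    }
    return line.toString()
}

// Decodes a chunked body: each chunk is "<size in hex>\r\n<data>\r\n",
// and a chunk of size 0 marks the end, optionally followed by trailer headers.
fun readChunkedBody(input: InputStream): ByteArray {
    val body = ByteArrayOutputStream()
    while (true) {
        // Anything after ';' on the size line is a chunk extension, ignored here
        val size = readAsciiLine(input).substringBefore(';').trim().toInt(16)
        if (size == 0) break
        val chunk = ByteArray(size)
        var read = 0
        while (read < size) {
            val n = input.read(chunk, read, size - read)
            if (n == -1) throw EOFException("stream ended inside a chunk")
            read += n
        }
        body.write(chunk)
        readAsciiLine(input) // consume the \r\n that follows the chunk data
    }
    // Skip trailer headers (if any) up to the final empty line
    while (readAsciiLine(input).isNotEmpty()) {
        // trailers are ignored in this sketch
    }
    return body.toByteArray()
}

fun main() {
    val wire = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n".toByteArray(Charsets.US_ASCII)
    println(String(readChunkedBody(wire.inputStream()), Charsets.US_ASCII)) // prints Wikipedia
}
```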

There are more reasons why you probably don't want to do it all by yourself, but suffice it to say that deciding whether you should attempt to read a HTTP message's body at all is not trivial: the decision depends on the headers, the status code and the request method.
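As an illustration of that decision, the following sketch follows my reading of the message body length rules in RFC 7230 (section 3.3.3) for responses. The BodyFraming type and the responseBodyFraming function are hypothetical names invented for this example, and the sketch simplifies several corner cases (for instance, multiple conflicting Content-Length values).

```kotlin
// Hypothetical types for this sketch; they are not RawHTTP classes.
sealed class BodyFraming {
    object NoBody : BodyFraming()
    object Chunked : BodyFraming()
    object UntilConnectionClose : BodyFraming()
    data class FixedLength(val length: Long) : BodyFraming()
}

fun responseBodyFraming(
    requestMethod: String,
    statusCode: Int,
    headers: Map<String, String>
): BodyFraming {
    // Header names are case-insensitive
    fun header(name: String): String? =
        headers.entries.firstOrNull { it.key.equals(name, ignoreCase = true) }?.value

    // Responses to HEAD, and 1xx, 204 and 304 responses, never have a body
    if (requestMethod == "HEAD" || statusCode in 100..199 ||
        statusCode == 204 || statusCode == 304) return BodyFraming.NoBody
    // A successful CONNECT turns the connection into a tunnel
    if (requestMethod == "CONNECT" && statusCode in 200..299) return BodyFraming.NoBody

    // Transfer-Encoding takes precedence over Content-Length
    val transferEncoding = header("Transfer-Encoding")
    if (transferEncoding != null) {
        val finalCoding = transferEncoding.split(',').last().trim()
        return if (finalCoding.equals("chunked", ignoreCase = true)) BodyFraming.Chunked
        else BodyFraming.UntilConnectionClose
    }

    val contentLength = header("Content-Length")
    if (contentLength != null) {
        val length = contentLength.trim().toLongOrNull()
        if (length == null || length < 0) {
            throw IllegalArgumentException("invalid Content-Length: $contentLength")
        }
        return BodyFraming.FixedLength(length)
    }

    // No framing information: the body lasts until the server closes the connection
    return BodyFraming.UntilConnectionClose
}

fun main() {
    println(responseBodyFraming("GET", 200, mapOf("Content-Length" to "155")))
}
```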

RawHTTP helps with all of these problems.

Here's how you would send a request AND read the response correctly, using RawHTTP:

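import com.athaydes.rawhttp.core.RawHttp
import java.net.InetAddress
import java.net.Socket

val http = RawHttp()
val request = http.parseRequest("""
    GET / HTTP/1.1
    Host: headers.jsontest.com
    Accept: application/json
""".trimIndent())

// Connect to the same host that is named in the Host header
val response = Socket(InetAddress.getByName("headers.jsontest.com"), 80).use {
    request.writeTo(it.getOutputStream())
    http.parseResponse(it.getInputStream()).eagerly()
}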

Notice that RawHTTP keeps your HTTP message pretty raw, but it does fix a few things for you to make sure it is a little harder to make invalid requests and responses.

For one thing, you can use simple new-lines, no need to insist on using \r\n (though it works as well).

RawHTTP also inserts a final new-line for you automatically if needed.

You might have noticed the eagerly() method being called before we return the response. That's because, by default, the bodies of HTTP messages are not read unless you call asStream() and read the InputStream yourself, or you just call eagerly(), which does that for you. So, unless you only want to download the body of a message conditionally after processing the message's headers or you need custom behaviour, always call eagerly() after parsing a request/response to avoid surprises.

RawHTTP also makes sure that you have the minimum set of headers necessary for your HTTP message to be generally valid, and sets a HTTP version in the start-line if none is specified.

In the case of requests, you must make sure to include the Host header (at least in HTTP/1.1).

For the sake of convenience, you can simply specify your request as follows:

GET http://example.com/path/to/resource

RawHTTP will insert a Host header with value example.com and set the HTTP version to HTTP/1.1, so the end result should be this:

GET /path/to/resource HTTP/1.1
Host: example.com
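The transformation above can be approximated with java.net.URI. The expandStartLine function below is a hypothetical sketch of the idea and not RawHTTP's actual implementation; in particular, whether RawHTTP includes a non-default port in the Host header is an assumption made here.

```kotlin
import java.net.URI

// Hypothetical sketch: turns "METHOD absolute-URI [version]" into a
// start-line with a relative target, plus the matching Host header.
fun expandStartLine(startLine: String): String {
    val parts = startLine.trim().split(" ")
    require(parts.size == 2 || parts.size == 3) { "expected: METHOD URI [HTTP-version]" }
    val method = parts[0]
    val uri = URI(parts[1])
    val version = parts.getOrElse(2) { "HTTP/1.1" } // default when no version is given

    val hostName: String = uri.host
        ?: throw IllegalArgumentException("no host in ${parts[1]}")
    // Assumption: keep an explicit port in the Host header value
    val host = if (uri.port == -1) hostName else "$hostName:${uri.port}"

    val path = if (uri.rawPath.isNullOrEmpty()) "/" else uri.rawPath
    val target = if (uri.rawQuery != null) "$path?${uri.rawQuery}" else path

    return "$method $target $version\r\nHost: $host\r\n\r\n"
}

fun main() {
    print(expandStartLine("GET http://example.com/path/to/resource"))
}
```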

If you do not want RawHTTP to fix the Host header for you (or fix new-lines), you can actually configure it to not do that! Just give an instance of com.athaydes.rawhttp.core.RawHttpOptions to the RawHttp constructor configured the way you want and you're done.

If your message needs to include a body, you can simply write the body after an empty line. Don't forget to add the Content-Length header so the recipient of the message knows how to read the body (keep reading to see how RawHTTP can set a body from a String or File so you don't need to include it in the raw message):

val response = RawHttp().parseResponse("""
    HTTP/1.1 200 OK
    Server: RawHTTP
    Content-Length: 12

    Hello World!
""".trimIndent())

Notice that just like you can stream a request to a server, you can stream a response to a client, just use its writeTo(OutputStream) method! That means that using RawHTTP, you can quite simply create working HTTP clients and servers!

But please beware that RawHTTP is still VERY low level. It concerns itself only with the format of the HTTP messages as defined by the HTTP/1.1 RFC (RFC 7230). It currently does not validate headers' names and values, for example, or support any kind of common HTTP extension features, such as cookies, caching etc. So it may not be a replacement for HttpClient or Jetty :)

RawHTTP also allows some HTTP/1.0 behaviour, such as not setting the Host header if the message has HTTP/1.0 version, and downloading the body until the connection is closed when both the Content-Length and Transfer-Encoding headers are missing, but it generally adheres to HTTP/1.1. In the future, I plan to add support for setting the message's version to HTTP/2.0 and then processing it as such, but unfortunately HTTP/2.0 messages are binary rather than textual, so they will have to be converted from the HTTP/1.1 format rather than written raw.

If you just need a very lightweight way to send requests/responses, though, especially if you just want to test your client or server, RawHTTP may be ideal for your use-case.

One last thing about sending HTTP messages with bodies: RawHTTP messages have a replaceBody method that lets you, you guessed it, replace a HTTP message's body with the contents of a String (via StringBody) or a file (via FileBody).

When you use replaceBody, RawHTTP changes the Content-Type and Content-Length headers automatically.

All RawHTTP objects are completely immutable, so methods like replaceBody do not modify the instance they are called on, but return a new, modified version of the object instead.

For example, to use a file as the body of a HTTP response:

val response = RawHttp().parseResponse("""
    HTTP/1.1 200 OK
    Server: RawHTTP
""".trimIndent()
).replaceBody(FileBody(File("example.json"), "application/json", true))

println(response.eagerly())

Which, in the case of my example.json file, prints the following:

HTTP/1.1 200 OK
Server: RawHTTP
Content-Type: application/json
Content-Length: 47

{
  "hello": "Hi",
  "location": "Stockholm"
}
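The immutable style of replaceBody described earlier can be pictured with a small stand-in type. SketchMessage below is a hypothetical data class invented for this illustration; it only mimics the idea of returning a modified copy with updated Content-Type and Content-Length headers, and says nothing about how RawHTTP is implemented internally.

```kotlin
// Hypothetical stand-in for an immutable HTTP message (not a RawHTTP class).
data class SketchMessage(
    val startLine: String,
    val headers: Map<String, String>,
    val body: String = ""
) {
    // Returns a new message; the instance this is called on is left untouched.
    fun replaceBody(newBody: String, contentType: String): SketchMessage = copy(
        headers = headers + mapOf(
            "Content-Type" to contentType,
            // Content-Length is the size in bytes (UTF-8 assumed here)
            "Content-Length" to newBody.toByteArray(Charsets.UTF_8).size.toString()
        ),
        body = newBody
    )
}

fun main() {
    val original = SketchMessage("HTTP/1.1 200 OK", mapOf("Server" to "RawHTTP"))
    val updated = original.replaceBody("{}", "application/json")
    println(original.headers) // still only the Server header
    println(updated.headers)  // Server, Content-Type and Content-Length
}
```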

To finish off, notice that RawHTTP ships with a very basic HTTP client that makes it just a little more convenient to send requests and receive responses. Using the included client, called TcpRawHttpClient, your code would look like this:

val request = RawHttp().parseRequest("""
    GET / HTTP/1.1
    Host: headers.jsontest.com
    Accept: application/json
""".trimIndent())

val response = TcpRawHttpClient().send(request).eagerly()

I am planning to write a very simple, asynchronous server implementation as well so that the library ships with both a functional client and server, but I haven't had the time to finish that yet... I also want to make sure the server implementation does a little bit more validation on the messages it accepts to avoid security issues, but in the spirit of the library's goals (namely, to provide the bare minimum HTTP implementation) it will still not take care of things like caching, CORS headers or anything on those lines... I hope this will be useful anyway as a test library or perhaps as the backbone of full-featured HTTP clients and servers in the future.

To use RawHTTP, just include it as a dependency of your project. It's on Maven Central and JCenter; see the RawHTTP Bintray page for details.

All source code shown in this blog post is available on GitHub.

Comments on Reddit.