Serving large files with Tornado safely without blocking

We need to take care of two things while serving large files using Tornado:

  1. It should not eat up the RAM
  2. It should not block the server

To do that, we'll have to read and send the file in chunks: read a few megabytes, send them, read the next few megabytes, send them, and keep doing that until we've read and sent the whole file.

Before moving on, I think it should go without saying that Tornado isn't recommended for serving large files. A specialized server like Nginx should always be preferred for this purpose when possible.

Serving the files safely so they don't eat up the RAM

We'll read the files in chunks, then write the chunk to the response, and flush it to the network socket.

Reading in chunks and flushing the data to the network ensures that we don't run out of RAM.

Here's a code example:

from tornado import web, iostream

class DownloadHandler(web.RequestHandler):
    async def get(self, filename):
        # chunk size to read
        chunk_size = 1024 * 1024 * 1 # 1 MiB

        with open(filename, 'rb') as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                try:
                    self.write(chunk) # write the chunk to response
                    await self.flush() # send the chunk to client
                except iostream.StreamClosedError:
                    # this means the client has closed the connection
                    # so break the loop
                    break
                finally:
                    # deleting the chunk is very important because 
                    # if many clients are downloading files at the 
                    # same time, the chunks in memory will keep 
                    # increasing and will eat up the RAM
                    del chunk
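
To try the handler out, it has to be wired into a Tornado application with a URL pattern that captures the filename. Here's a minimal sketch; the /download/ route and port 8888 are just assumptions for this example, and a real app should sanitize the requested path:

from tornado import ioloop, web

def make_app():
    # everything after /download/ is captured and passed to
    # DownloadHandler.get() as the `filename` argument
    return web.Application([
        (r'/download/(.*)', DownloadHandler),
    ])

if __name__ == '__main__':
    app = make_app()
    app.listen(8888)
    ioloop.IOLoop.current().start()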

Preventing our server from blocking

When we await self.flush(), Tornado writes the current data to the network socket. Theoretically, that means our coroutine should pause at the await self.flush() statement because writing to the socket takes some time. This little pause should allow the ioloop to run other handlers asynchronously, which means our server shouldn't block.

But that is not always the case. self.flush() can complete very quickly if the client's network is fast. In that case the delay is so small that our coroutine keeps running without ever pausing, and so it blocks the server.

A foolproof way to make our code non-blocking is to put the coroutine to sleep for a nanosecond just after flush(). The exact duration doesn't really matter, since it's the act of awaiting the sleep that yields control to the ioloop and lets it run other handlers, but for this example I'll go with a nanosecond's pause.

UPDATE: I asked about this issue on Tornado's mailing list and Ben Darnell (Tornado's maintainer) gave some pretty good tips. You can find the thread here. Do read it; he also posted a code example about "metered usage" which you can use to serve your clients fairly.

Example:

from tornado import web, iostream, gen

class DownloadHandler(web.RequestHandler):
    async def get(self, filename):
        chunk_size = 1024 * 1024 * 1  # 1 MiB

        with open(filename, 'rb') as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                try:
                    self.write(chunk)
                    await self.flush()
                except iostream.StreamClosedError:
                    break
                finally:
                    del chunk
                    # pause the coroutine so other handlers can run
                    await gen.sleep(0.000000001)  # 1 nanosecond

This approach is pretty effective because a client's connection is still far slower than the speed at which Tornado writes data to the socket. So pausing for a nanosecond, and serving other clients during that time, doesn't noticeably slow down any single download.

Now that we've made the DownloadHandler asynchronous, we can serve multiple clients in a non-blocking way. Even if several users download different files at the same time, our server won't block.
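
For completeness, here's a rough sketch of the "metered usage" idea from the mailing-list thread mentioned above. This is my own interpretation, not Ben's code: instead of sleeping for a fixed nanosecond, each download is throttled to a target rate, and the awaited sleep is what yields control to the ioloop. The handler name and the 10 MiB/s limit are just assumptions for this sketch.

import time

from tornado import gen, iostream, web

class MeteredDownloadHandler(web.RequestHandler):
    async def get(self, filename):
        chunk_size = 1024 * 1024  # 1 MiB
        max_bytes_per_sec = 10 * 1024 * 1024  # cap each download at ~10 MiB/s

        start = time.monotonic()
        sent = 0
        with open(filename, 'rb') as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                try:
                    self.write(chunk)
                    await self.flush()
                except iostream.StreamClosedError:
                    break
                sent += len(chunk)
                # if we're ahead of the allowed rate, sleep off the difference;
                # awaiting the sleep also lets the ioloop run other handlers
                expected = sent / max_bytes_per_sec
                elapsed = time.monotonic() - start
                if expected > elapsed:
                    await gen.sleep(expected - elapsed)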

Benchmarks

Tornado isn't meant for serving large files. You should, when you have the option, use Nginx for that. The benchmarks clearly show that.

----------------------------+------------+---------------+---------------
Server                      | 1 request  | 10 concurrent | 100 concurrent 
                            |            | requests      | requests
----------------------------+------------+---------------+---------------
Nginx (w/ sendfile)         | 0.130 sec  | 0.978 sec     | 15.790 sec
Nginx (w/o sendfile)        | 0.155 sec  | 1.472 sec     | 22.424 sec
Tornado 5.0 (w/o sendfile)  | 0.419 sec  | 3.782 sec     | 44.289 sec

It's quite apparent that Tornado can't keep up with Nginx.

What's sendfile?

sendfile is a system call available on Linux (and other Unix systems) which copies a file to a socket at the kernel level. This is far faster than what we're doing, which is reading the file into Python and writing it to the socket. While Python exposes this as os.sendfile, Tornado doesn't use it. But there's an issue on GitHub about adding this support to Tornado.
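
For the curious, here's roughly what using os.sendfile looks like outside of Tornado. This is a minimal sketch with plain sockets, not something you can drop into a RequestHandler; conn is assumed to be an already-accepted, blocking TCP connection:

import os
import socket

def send_file_over_socket(conn: socket.socket, path: str) -> None:
    with open(path, 'rb') as f:
        size = os.fstat(f.fileno()).st_size
        offset = 0
        while offset < size:
            # copy directly from the file descriptor to the socket inside
            # the kernel; returns how many bytes were actually sent
            sent = os.sendfile(conn.fileno(), f.fileno(), offset, size - offset)
            if sent == 0:
                break
            offset += sent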

Some notes on performance

Nginx is so fast because it's optimized for serving files. While I don't know the inner workings of Nginx, I can safely say that part of the reason for its speed is the fact that it's written in C.

There's still room for optimization in Tornado, such as using sendfile to write a file to the socket, which would make it a little faster, but it still won't be as fast as Nginx. So serving large files with Tornado should be reserved for special cases where using Nginx is not possible.