So, the cool thing about Tornado, the Python web framework, is that it's based on a single-threaded IO loop, aka an event loop. This means you can handle high concurrency with excellent performance. However, it also means you can't do things that take a long time, because then you're blocking all other users.
The solution to the blocking problem is to switch to asynchronous callbacks, which means a task can churn away in the background whilst your web server cracks on with other requests. That's great, but it's not that easy either. Writing callback code in Tornado is much more pleasant than in, say, Node, where you have to "fork" off into separate functions with separate scopes. For example, here's what it might look like:
class MyHandler(tornado.web.RequestHandler):

    @asynchronous
    @gen.engine
    def get(self):
        http_client = AsyncHTTPClient()
        response = yield gen.Task(http_client.fetch, "http://example.com")
        stuff = do_something_with_response(response)
        self.render("template.html", **stuff)
It's pretty neat, but it's still work. And sometimes you just don't know whether something is going to be slow or not. If it's not going to be slow (e.g. fetching a simple value from a fast, local database) you don't want to do it async anyway.
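For contrast, a fast path can stay a plain synchronous handler. This is a sketch, not code from the actual app; the handler name and the in-memory score lookup are hypothetical stand-ins for a fast, indexed query:

```python
import tornado.web


class ScoreHandler(tornado.web.RequestHandler):
    def get(self):
        # A single lookup against data that is effectively in RAM returns
        # almost instantly, so there is no gain in making this async.
        score = self.application.scores.get(self.get_argument("user"), 0)
        self.write({"score": score})
```

No decorators, no yields; the handler blocks the IO loop only for the microseconds the lookup takes.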
So on Around The World I have a whole Admin dashboard where I edit questions, upload photos and run various statistical reports, all of which can be very slow. Since the only user of the Admin is me, I don't care if it's not very fast. So, I don't have to worry about wrapping things like thumbnail pre-generation in asynchronous callback code. But I don't want to block the rest of the app either, where every single request has to be fast. Here's how I solve that.
First, I start 4 different processes on 4 different ports:
127.0.0.1:10001
127.0.0.1:10002
127.0.0.1:10003
127.0.0.1:10004
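You need some process manager to keep those four processes running. The post doesn't say which one is used, so here's a sketch using supervisord; the `app.py` name and the `--port` option (à la tornado.options) are assumptions:

```ini
; Spawns 4 processes: --port=10001 through --port=10004
[program:aroundtheworld]
command=python app.py --port=1000%(process_num)d
process_name=%(program_name)s-%(process_num)d
numprocs=4
numprocs_start=1
directory=/var/lib/tornado/aroundtheworld
autorestart=true
```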
Then, I decide that 127.0.0.1:10004 will be dedicated to slow, blocking operations used only by the Admin dashboard.
In Nginx it's easy. Here's the config I used (simplified for clarity):

upstream aroundtheworld_backends {
    server 127.0.0.1:10001;
    server 127.0.0.1:10002;
    server 127.0.0.1:10003;
}

upstream aroundtheworld_admin_backends {
    server 127.0.0.1:10004;
}

server {
    server_name aroundtheworldgame.com;
    root /var/lib/tornado/aroundtheworld;
    ...
    try_files /maintenance.html @proxy;

    location /admin {
        proxy_pass http://aroundtheworld_admin_backends;
    }

    location @proxy {
        proxy_pass http://aroundtheworld_backends;
    }
    ...
}
With this in place, I can write slow, blocking code without worrying about blocking any users other than myself.
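The kind of code this frees me up to write looks something like the sketch below. The function name is hypothetical and `time.sleep` stands in for real image resizing; the point is that it's plain, blocking Python:

```python
import time


def generate_thumbnails(photo_ids):
    """Pre-generate thumbnails for the admin dashboard.

    This is slow, blocking work, but because Nginx routes /admin traffic
    only to the dedicated backend on port 10004, it can never block the
    public-facing processes on ports 10001-10003.
    """
    thumbnails = []
    for photo_id in photo_ids:
        time.sleep(0)  # stand-in for resizing an image on disk
        thumbnails.append("thumbnail-%s.jpg" % photo_id)
    return thumbnails
```

No decorators, no `gen.Task`, no callbacks; it just runs until it's done.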
This might sound lazy, as if I should be using an asynchronous library for all my DB, network and file system access, but mind you, that's not without its own risks and trouble. Most of the DB access consists of very, very simple queries that are always light and almost always hit a database index that sits fully in RAM. Doing it like this, I can code away without much complexity and yet never have to worry about making the site slow.
UPDATE
I changed the Nginx config to use the try_files directive instead. Looks nicer.
Comments
Alternatively, you could host your admin app behind a WSGI server (e.g. gunicorn). You'd lose all of Tornado's async capabilities (e.g. the tornado.httpclient and tornado.auth modules) and you'd have to refactor your app to isolate the admin code into a separate module, but I think it's worth it. The WSGI server will fork workers to handle new requests and you won't have to worry about locking your app at all.
Nevertheless, I think that using Tornado for fully sync apps (even when hosting behind WSGI) is a bit pointless. The entire point of Tornado is to be asynchronous. That's why I tend to use it only when I know the app's gonna be fully async, alongside a "regular" app written in, say, Flask.
"...and you won't have to worry about locking your app at all."
But I don't worry about locking at all. That was the point.
Yeah, building the admin in something else but connecting to the same DB would make sense, but it brings a tonne of other problems. As it is now, I can use the same base class for all views, which means I can share a lot of functionality that I also need in the admin dashboard, e.g. the authentication.
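The shared-base-class idea looks roughly like this. It's a plain-Python sketch with all names hypothetical; in the real app these would subclass tornado.web.RequestHandler:

```python
class BaseHandler:
    """Functionality shared by the public app and the admin dashboard."""

    def __init__(self, cookies=None):
        self.cookies = cookies or {}

    def get_current_user(self):
        # Shared authentication: public and admin handlers both
        # inherit this instead of duplicating it in two codebases.
        return self.cookies.get("user")


class QuestionHandler(BaseHandler):  # public app, ports 10001-10003
    pass


class AdminUploadHandler(BaseHandler):  # admin only, port 10004
    pass
```

Splitting the admin into a separate Flask (or WSGI) app would mean reimplementing or extracting everything in `BaseHandler`.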
Yeah, if the whole site was to sit on top of a slow database I wouldn't choose Tornado. But if that was the case I think I'd just use an async library and continue to use Tornado. :)
Thinking about doing exactly that... how do you communicate Flask with Tornado?
What do you mean, "communicate Flask with Tornado"?
I think this is a very clever approach, i.e. use Nginx to route time-consuming requests to a separate server (or pool of servers) and not slow down the IO loop(s). This approach would also work well for routing time-consuming WebSocket connections. Very nice!