Tornado Gives You Wings
Under the hood, the pages you see at The Muse are served by the Tornado web framework. At talks and meetings when we mention our use of it, we're usually met with quizzical looks. It's not surprising. Tornado, after all, isn't the canonical web framework for companies in a setting similar to ours; that award would go to Ruby on Rails. It isn't even the canonical Python web framework; that would be Django. Both frameworks allow you to easily model most applications, and there's far more knowledge, documentation, and third-party library support behind them to boot.
So why do we use Tornado instead? Part of it admittedly is subjective. We believe that Tornado applications better follow the Zen of Python. When a bug happens, it's easier to delve into the Tornado code itself to see what's going on. And it feels like there's more thought that has gone into Tornado. Part of this is likely just that Tornado is younger and has been able to learn from some of the mistakes earlier frameworks made. But the key is that there's a minimalistic quality to Tornado. It doesn't try to encompass everything you want to do, but enough of it—it tries to follow the 90-10 rule.
These arguments usually fall on deaf ears. They are, after all, amorphous qualities, and not every developer's experience is the same. Plenty of developers, for example, have been burned in the past by minimalistic web frameworks that didn't meet increasingly sophisticated product needs.
So let's focus on one capability in particular that has been of great utility to us: asynchronous I/O. Rails and Django, among many other web frameworks, support only synchronous I/O. You can kinda-sorta address this problem by running the application under a pre-fork server like Gunicorn that fans requests out to multiple processes. This will still hit bottlenecks as the number of long-running I/O operations goes up, and it might not even be an option in low-memory environments like Heroku dynos. You could also monkey patch synchronous I/O operations to actually perform asynchronous I/O using something like gevent. Then again, it's hard enough getting asynchronous I/O right even when the intent is there, so this strategy is just begging for strange bugs and edge cases in production.
Tornado allows you to execute both synchronous and asynchronous I/O. It doesn't monkey patch anything, but rather provides asynchronous I/O if you explicitly use it. It's an escape hatch should the need arise to prevent performance degradation against long-running I/O operations.
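As a rough illustration of the two styles living side by side, here is a minimal sketch, assuming Tornado 5 or later and Python 3. The handler names, routes, and sleep calls are hypothetical stand-ins, not code from The Muse.

```python
import asyncio
import time

import tornado.web


class SyncHandler(tornado.web.RequestHandler):
    def get(self):
        # Ordinary blocking call: fine when the work is short, but nothing
        # else runs in this process until it returns.
        time.sleep(0.01)  # stand-in for a fast database query
        self.write("sync done")


class AsyncHandler(tornado.web.RequestHandler):
    async def get(self):
        # Awaiting hands control back to the event loop, so other requests
        # can be served while this one waits.
        await asyncio.sleep(1)  # stand-in for a slow remote call
        self.write("async done")


def make_app():
    # Hypothetical routes; both handler styles coexist in one application.
    return tornado.web.Application([
        (r"/sync", SyncHandler),
        (r"/async", AsyncHandler),
    ])
```

Nothing is patched behind the scenes: a handler only becomes asynchronous when it is written as a coroutine and explicitly awaits something.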
At The Muse, most of our operations continue to be synchronous. Requests to Postgres and Redis are low-latency enough that asynchronous operations aren't warranted, and this allows us to leverage great libraries like SQLAlchemy that weren't written to be asynchronous.
But as our application became increasingly complex, we saw more and more need for asynchronous I/O. It started innocently enough with integration to third-party APIs. For example, for some time we used Swiftype to handle our search engine. Because API calls to Swiftype are frequent, and sometimes long-running, asynchronous I/O makes perfect sense.
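To see why slow API calls are a good fit, here is a small sketch that uses only the standard library's asyncio, the event loop that Tornado 5 and later run on, rather than Tornado itself. The `fake_search` function and its delay are hypothetical stand-ins for a real third-party call.

```python
import asyncio
import time


async def fake_search(query, delay=0.2):
    # Hypothetical stand-in for a slow third-party API request.
    await asyncio.sleep(delay)
    return f"results for {query}"


async def handle_many(queries):
    # All calls wait concurrently on a single thread, so the total time is
    # roughly that of the slowest call rather than the sum of all of them.
    return await asyncio.gather(*(fake_search(q) for q in queries))


def timed_run(queries):
    # Run the batch and report how long it took in seconds.
    start = time.perf_counter()
    results = asyncio.run(handle_many(queries))
    return results, time.perf_counter() - start
```

In a real Tornado handler, the awaited call would typically be something like `tornado.httpclient.AsyncHTTPClient().fetch(url)` instead of the simulated sleep.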
As we've added more features, the payoff has really shown. For example, most of our endpoints now have the potential to execute long-running I/O operations due to Prerender. While such a feature would have given us great pause in most other web frameworks, it's easy to do in Tornado without negatively affecting performance.
On top of all that, Tornado has built-in support for multiprocessing load-balancing. Rather than having to throw your application behind Gunicorn, you can use Tornado all the way through. This results in significantly lower overhead and infrastructural complexity.
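A sketch of what that multi-process mode can look like, assuming Tornado 5 or later on a Unix-like system. The handler, `make_app`, `main`, and port 8888 are illustrative names, not our production setup.

```python
import tornado.httpserver
import tornado.ioloop
import tornado.netutil
import tornado.process
import tornado.web


class HelloHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("hello")


def make_app():
    return tornado.web.Application([(r"/", HelloHandler)])


def main(port=8888):
    # Bind the listening sockets once, before forking, so that every
    # child process inherits and shares them.
    sockets = tornado.netutil.bind_sockets(port)
    # 0 means one child process per CPU core. Only the children return
    # from this call; the parent stays behind to supervise them.
    tornado.process.fork_processes(0)
    # Each child builds its own server and event loop after the fork.
    server = tornado.httpserver.HTTPServer(make_app())
    server.add_sockets(sockets)
    tornado.ioloop.IOLoop.current().start()
```

Calling `main()` starts the server and blocks. Because the children all accept on the same sockets, the operating system spreads incoming connections across them without a separate process manager in front.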
This isn't just theory, either. Two years ago, when it was far simpler, our application was written in Django. Switching to Tornado gave us the greatest performance boost we've seen in our history: overall response time dropped by two-thirds, and server response time was halved. Our traffic has increased significantly since then, yet we've been able to minimize the number of Heroku dynos we've had to add, in large part because of the Tornado-based optimizations we continue to be able to throw into the application.
What about third-party libraries? Libraries made specifically for Tornado are admittedly much less common than those for Django. But because Tornado is less monolithic, libraries that tightly couple to the framework are less necessary in the first place. Instead, we pick the best-in-breed libraries that are available for Python in general. Instead of Django's ORM, for example, we use SQLAlchemy.
How about documentation? Django is renowned for its thorough documentation, and Tornado's is admittedly much more Spartan. Again, documentation is less necessary, both because the framework is significantly simpler, and because it's easier to step through Tornado's code and figure out what's going on.
While other sites struggle with incompatibilities between infrastructure and synchronous I/O, we've chugged merrily along. Overall, Tornado has been a significant win.
Photo of tornado courtesy of Shutterstock.
Yusuf is a chef, avid ukulele player, and hip-hop artist. Unfortunately he does all of those poorly, so he sticks to his day job writing software. Prior to The Muse, Yusuf has worked as a developer at companies both big (Microsoft, IBM) and small (dotCloud, Transloc), and as an Associate Product Manager at Google. Find him on github, hacker news, or say hi on Twitter.