The web is abuzz over 'real-time' -- the concept of websites and web applications providing instantaneous response to online interaction and access to streams of constantly-updating data and information.
Today's focus on real-time services is a reflection of the evolution of interactivity on the internet, but applications that are heavy on interaction present unique challenges for developers when it comes to performance, scalability and availability. While simple applications that primarily pull content from a database and display it to users can be made highly efficient using techniques such as caching, interactive applications that are designed to be used in real-time can be much more difficult to maintain and scale.
Twitter, arguably the purest example of a popular 'real-time' internet service, is the perfect example of that. It has been plagued by performance and downtime issues for some time now and it's not hard to see why: at any given moment, there are thousands upon thousands of Twitter users posting and pulling content, with constant polling to the web servers for updates from Twitter clients, the website and through the API. This means lots of database reads and writes and a mountain of HTTP traffic. It's a developer's worst nightmare: a steady flow of resource-intensive database writes coupled with an almost never-ending flurry of database reads.
When it comes to dealing with performance, scalability and availability for real-time web applications, developers now need to think about the following key issues (amongst others no doubt, but these are the ones I am focusing on) when designing applications:
- HTTP polling and providing a real-time experience for users will increase the number of requests to the server unless a connection can remain open. Traditional web servers don't provide a solution for this, and opening raw socket connections is not really viable as firewalls will typically block them from within corporate networks.
- Database read and write performance and avoiding the locks that will ensue as a result of the high volume of writes to tables. Typical RDBMS databases are simply not ideal storage solutions when high volumes of read and write requests are required.
- CPU or IO intensive operations that need to be queued and processed separately to ensure the web servers remain responsive to "normal" requests.
- Autoscaling to support unexpected loads in a cost-effective manner.
I have decided to address each of these issues in a separate blog post, so in this post I will talk about concurrency, HTTP polling and providing a real-time experience for web visitors over an HTTP connection.
HTTP Push - Polling, Streaming and Sockets
The reasons web servers struggle with real-time lie primarily with the antiquated HTTP protocol, which is unfortunately a legacy we're going to be stuck with for some time. Fortunately, Google and others are looking at solutions. Google has recently published a proposal for a new protocol called SPDY, which seems to address some of the biggest faults in HTTP's suitability for today's web applications. With HTTP as it stands, connections are not persistent in the sense that matters here (every interaction with the server requires a new request, new request headers, new response headers, authentication, etc.), and most of the communication is typically uncompressed.
SPDY sets about addressing these two key issues, with the persistent connection being the one relevant to HTTP polling.
Currently, when a user visits a web page which provides a real-time experience, what is typically happening is that the web browser is in fact polling the web server every second or so to say "is there an update?". These requests are pretty lightweight, but immediately present a massive problem when thousands of visitors use the website at the same time. Assuming you poll the server every 2 seconds, and you have 1,000 visitors on your site at any one time, the server(s) would receive approximately 500 requests per second asking "is there any update?"
Whilst 500 requests per second is a digestible load, increasing the number of visitors to 50,000 during a peak period means 25,000 requests per second, which would quickly bring down any small web farm. The issue again lies with the fact that HTTP does not allow data to be pushed to the browser, so the browser has no option but to keep polling and overloading the server with unnecessary requests.
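To make the arithmetic above concrete, here is a minimal sketch in Python. The function name and parameters are my own, chosen purely for illustration, and it assumes every visitor polls at a perfectly steady interval, which real traffic will not do.

```python
def polling_requests_per_second(visitors: int, poll_interval_seconds: float) -> float:
    """Average request rate when every visitor polls once per interval.

    Assumption: polls are spread evenly over time, so this is an average,
    not a peak figure.
    """
    if poll_interval_seconds <= 0:
        raise ValueError("poll interval must be positive")
    return visitors / poll_interval_seconds


if __name__ == "__main__":
    # The two scenarios from the text: 1,000 and 50,000 concurrent visitors.
    for visitors in (1_000, 50_000):
        rate = polling_requests_per_second(visitors, poll_interval_seconds=2)
        print(f"{visitors} visitors polling every 2 seconds: {rate:.0f} requests/second")
```

Halving the interval to get a snappier experience doubles the load, which is why shortening the poll interval is not a route to real-time at scale.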
A number of approaches have been taken to solve this problem, with Google's recent suggestion being the most sensible way of fixing it without "hacking" the HTTP protocol. However, the SPDY protocol is a long way off and is not something we can rely on. The common ways to work around the HTTP issues are as follows:
- Long Polling allows the browser to open a connection to a web server and keep the connection open for an extended period of time waiting for data to be sent to the browser. As soon as data is sent, a new Long Polling request is opened to the server, waiting for the next event to be sent from the server.
Tornado is a web server built specifically to provide this type of long polling functionality; it was built by FriendFeed and has since been released as open source. For those of you using Nginx, you can configure Nginx as a Comet server using this beta plugin. The plugin allows your standard web application to pull and push content, and does all the hard work of distributing data to the clients.
- Streaming allows the browser to open the connection to the web server and keep it open for as long as the user is on the website. This solution has numerous problems around browser support and the inability to detect the state of the connection. Whilst this is an option using the iframe or XMLHttpRequest method, we do not recommend this approach.
- Socket Connections are achieved through the use of a plugin such as the common Adobe Flash. Flash has complete support for raw socket connections, giving your application a way to open a bi-directional asynchronous connection to a server. However, this is not done over HTTP. Because it is a raw socket connection, users behind strict corporate firewalls will often not be able to connect, which means a socket connection is probably only viable for consumers using the application at home. Server solutions commonly used by Flash based game developers include ElectroServer and SmartFoxServer (which is based on Red5, the open source alternative to Adobe Flash Media Server).
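The long polling cycle described in the list above can be sketched without any real HTTP at all. The following is a minimal in-process simulation in Python using only asyncio; the Channel class, its methods and the cursor parameter are hypothetical names of mine, not the API of Tornado or the Nginx plugin. Each call to long_poll stands in for one held-open request: it returns as soon as there is data the client has not seen, or returns empty when the timeout expires, and the client then immediately "reconnects".

```python
import asyncio
from typing import List


class Channel:
    """Hypothetical in-memory stand-in for the server side of a long-poll endpoint."""

    def __init__(self) -> None:
        self._messages: List[str] = []
        self._new_message = asyncio.Condition()

    async def publish(self, message: str) -> None:
        # Store the message and wake every request currently being held open.
        async with self._new_message:
            self._messages.append(message)
            self._new_message.notify_all()

    async def long_poll(self, cursor: int, timeout: float) -> List[str]:
        """Hold the 'request' open until there is data past cursor, or time out."""

        async def wait_for_data() -> None:
            async with self._new_message:
                await self._new_message.wait_for(lambda: len(self._messages) > cursor)

        try:
            await asyncio.wait_for(wait_for_data(), timeout)
        except asyncio.TimeoutError:
            return []  # nothing new; the client simply polls again
        return self._messages[cursor:]


async def client(channel: Channel, polls: int, timeout: float) -> List[str]:
    received: List[str] = []
    for _ in range(polls):
        # One iteration is one long-poll request; the cursor says how much we have seen.
        received.extend(await channel.long_poll(cursor=len(received), timeout=timeout))
    return received


async def demo() -> List[str]:
    channel = Channel()
    task = asyncio.create_task(client(channel, polls=3, timeout=0.2))
    await asyncio.sleep(0.05)
    await channel.publish("first update")
    await asyncio.sleep(0.05)
    await channel.publish("second update")
    return await task


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Passing a cursor with every request means an update published between two polls is not lost: the next request finds it already waiting and returns immediately. Compared with polling every two seconds, an idle client here costs one request per timeout rather than one per interval.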
Concurrency
Once you've figured out your solution to support push from the server to the browser, your next challenge may very well be how to give your website visitors a truly real-time experience in which they interact with other visitors. A great example of this is Google Spreadsheets, which allows you to work with your Google Doc in real-time, updating the document and receiving updates as they happen (e.g. if you update a formula, you see other cells update once Google pushes the changes back to you), and also to chat to other users editing your document at the same time. Providing an application which allows this type of message queuing and dispatching between users can be extremely difficult, and is typically addressed with languages that are more adept at handling concurrency.
Erlang, a language developed by Ericsson way back when to help them with virtually unlimited scale and concurrency, is designed from the ground up to deal with concurrency. There are no shared variables, which removes the locking issues that are the typical hell developers need to deal with when trying to write applications that handle concurrency gracefully. Erlang avoids locking by building on messages: lightweight processes share nothing and simply pass messages to one another. With this message passing and queuing system an integral part of the language, applications can scale by adding more servers capable of receiving and dispatching messages, and the usual locking problems around concurrency largely never materialise. Having said all this, Erlang is not ideal as a web server, and Erlang based solutions typically use a proxy server to handle HTTP requests and push HTTP responses back to the browser. See Alexey's post on how he is trying to build a system which can cope with a million long poll requests using Erlang and Nginx. You'll also need to learn a new language and syntax, so have fun ;)
Facebook use Erlang to drive their chat along with a Comet solution. If it works for them, I'm sure it will work for you.
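For readers who have not met the message passing model before, here is a rough sketch of the idea in Python rather than Erlang. It is an imitation, not how Erlang itself works: the names counter_process and run_demo and the message tuples are my own, and Python's queue.Queue is internally synchronised, so the point is only that the application code never takes a lock. One thread owns the counter and nobody else can touch it; everyone else just sends it messages.

```python
import queue
import threading


def counter_process(mailbox: queue.Queue) -> None:
    """Owns its state privately; the only way to affect it is to send a message."""
    count = 0
    while True:
        message = mailbox.get()
        if message[0] == "increment":
            count += message[1]
        elif message[0] == "get":
            # The sender includes its own mailbox so we can reply to it.
            message[1].put(count)
        elif message[0] == "stop":
            return


def run_demo(senders: int = 4, messages_each: int = 1000) -> int:
    mailbox: queue.Queue = queue.Queue()
    owner = threading.Thread(target=counter_process, args=(mailbox,))
    owner.start()

    def sender() -> None:
        # No shared variable is touched here, only messages are sent.
        for _ in range(messages_each):
            mailbox.put(("increment", 1))

    threads = [threading.Thread(target=sender) for _ in range(senders)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # All increments are already queued ahead of this request (FIFO order).
    reply: queue.Queue = queue.Queue()
    mailbox.put(("get", reply))
    total = reply.get()

    mailbox.put(("stop",))
    owner.join()
    return total


if __name__ == "__main__":
    print(run_demo())
```

Because only one thread ever reads or writes the count, the total comes out as senders multiplied by messages_each no matter how the sending threads interleave. Erlang bakes this style into the language and makes the processes cheap enough to run in very large numbers.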
Scala is another functional language that is well suited to concurrent programming, and has been employed by Twitter to help them deal with their concurrency and scaling issues.
Summary
Whilst there are numerous ways to address the inadequacies of the HTTP protocol and the inevitable and complex concurrency issues with most programming languages, building true real-time solutions is not easy, and it's not likely to be easy for some time to come. New methods to address all of these issues are constantly being discovered and suggested, and even Google, who are clearly fed up with the restrictions imposed upon them by the protocol, are trying to find an alternative solution. If anyone is driven to make it work it is them, fortunately, with their entire business relying on an increasingly usable web experience.
Further reading