HTTP and Stream Processing

As we discussed in the previous post, we are thinking about a new design and implementation for the streams library in Open Dylan.

While the examples in this post are in Dylan and use code from our HTTP server, the same issues exist in HTTP frameworks in other languages. The code should be clear enough that little to no Dylan knowledge is required to follow the points being made here.

What does this have to do with HTTP? There are several pain points in our HTTP stack as it is currently written:

  • Requests are read in their entirety into memory, so a large request (such as a file upload) takes a significant amount of memory.
  • Responses often buffer their entire output in memory as well.
  • Because we use the existing streams library, we don't handle non-blocking sockets and need a thread per socket.
  • We don't have a good model for handling long-lived connections, such as those used with Server-Sent Events or WebSockets, without tying up a thread for as long as the socket is open.
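To make the first two pain points concrete, here is a minimal sketch of the difference between buffering a whole body and processing it in chunks. It is written in Python purely for illustration; it is not Dylan and not code from our HTTP server. The names `digest_buffered`, `digest_streamed`, and `CHUNK_SIZE` are hypothetical, and an in-memory `io.BytesIO` stands in for a socket.

```python
import hashlib
import io

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, chosen for illustration


def digest_buffered(body):
    """Read the whole body into memory, then hash it."""
    data = body.read()  # memory use grows with the size of the body
    return hashlib.sha256(data).hexdigest()


def digest_streamed(body, chunk_size=CHUNK_SIZE):
    """Hash the body one chunk at a time."""
    hasher = hashlib.sha256()
    while True:
        chunk = body.read(chunk_size)  # at most chunk_size bytes held at once
        if not chunk:  # empty bytes means end of stream
            break
        hasher.update(chunk)
    return hasher.hexdigest()


upload = b"x" * (1024 * 1024)  # stand-in for a 1 MiB file upload
buffered = digest_buffered(io.BytesIO(upload))
streamed = digest_streamed(io.BytesIO(upload))
```

Both functions produce the same digest, but the streamed version only ever holds one chunk of the body at a time, which is the property a large upload or a large response needs.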

We don't know yet what the new streams API will look like, but we can look at our ...

Beginning to Rethink Streams

Dylan's current streams library has served us moderately well over the years. However, it has some issues which can be addressed by a new design, expanding the range of problems for which it is suited.

How things are now

According to the current streams library's documentation, the design goals were:

  • A generic, easy-to-use interface for streaming over sequences and files. The same high-level interface for consuming or producing is available irrespective of the type of stream, or the types of the elements being streamed over.
  • Efficiency, especially for the common case of file I/O.
  • Access to an underlying buffer management protocol.

According to the same documentation, one thing it was explicitly not designed to provide was:

  • A comprehensive range of I/O facilities for using memory-mapped files, network connections, and so on.

Unfortunately, the primary interface to our current network library is built on these very streams, for which network connections were not a design goal. While this works in practice, it imposes some important limitations on our networking code. The biggest of these is that sockets cannot be non-blocking, as it is expected that reads and writes will complete ...
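As a rough illustration of why an interface that expects reads to complete does not fit non-blocking sockets, here is a small Python sketch. It is not Dylan and says nothing about how our network library is implemented. The helper `try_recv` is hypothetical, and a local socket pair stands in for a real network connection.

```python
import select
import socket


def try_recv(sock, size=1024):
    """Return received bytes, or None if the read would block."""
    try:
        return sock.recv(size)
    except BlockingIOError:
        return None


left, right = socket.socketpair()
right.setblocking(False)  # reads on this end no longer wait for data

first = try_recv(right)  # nothing has been sent yet, so the read cannot complete

left.sendall(b"hello")
# Wait (up to one second) until the socket reports that data is ready.
readable, _, _ = select.select([right], [], [], 1.0)
second = try_recv(right) if readable else None

left.close()
right.close()
```

On a non-blocking socket, a read can report "no data yet" instead of returning data, so the caller has to be prepared to come back later. A stream interface that assumes every read returns data has no way to express that outcome.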
