HTTP and Stream Processing

As we discussed in the previous post, we are thinking about a new design and implementation for the streams library in Open Dylan.

While the examples in this post are in Dylan and are using code from our HTTP server, these issues exist in HTTP frameworks in other languages. The code should be clear enough that little to no Dylan knowledge is required to understand the points being made here.

What does this have to do with HTTP? There are several pain points in our HTTP stack as it is currently written:

Requests are read in their entirety into memory, so a large request (such as a file upload) takes a significant amount of memory.
Responses often buffer their entire output in memory as well.
Because of the use of the existing streams library, we don't handle non-blocking sockets and require a thread per socket.
We don't have a good model for handling long-lasting connections such as might be used with Server Sent Events or WebSockets without tying up a thread for the duration of the socket being open.

We don't know yet what the new streams API will look like, but we can look at our ...

There are comments.

Integrating with LLDB

I spend a lot of time debugging Dylan code. Up until now, this has been a somewhat painful process when not using the IDE on Windows. (And I don't really use the IDE on Windows as it doesn't fit well into my workflow.) I finally reached the point where I decided that I wanted to improve our debugger integration.

Much of what is described below may be applicable to people working on debugging support for other languages.

Current State of Dylan Debugging

We have a debugger on Windows integrated with our IDE. This facility is not available on the other platforms that we support. There are many reasons for this:

The debug info that Open Dylan generates is only done on Windows.
The debugger code for interacting with the OS is only implemented for Windows.
The IDE itself is only available on Windows.

This means that debugging on Linux, FreeBSD and Mac OS X has traditionally been more challenging. We often resort to "printf debugging" and have some basic debug printing functions that can be invoked from the compiler so long as they don't crash. Debugging is really only effective with the C back-end as the HARP ...

There are comments.

Integrating with LLVM

Many compilers are now being outfitted with LLVM backends. There are a variety of ways to do this, and we'll take a look at some of them here.

Calling LLVM APIs Directly

One common and easy way to integrate with LLVM is to directly link against the LLVM libraries and invoke the LLVM API directly.

C++ API

LLVM's native APIs are C++. Examples of languages that talk to LLVM via the C++ APIs are Julia and CLASP.

This is also the approach taken by Clang, however, Clang lives within the same code repository as LLVM and shares many developers. It is part of the LLVM project rather than a separate compiler that is using LLVM as a backend.

While this seems like an attractive option, there are some issues with it:

Your code must either be written in C++ or be able to invoke C++ code via an FFI (foreign function interface).
You are tied to a particular version of the LLVM API at compile time.
It is harder to re-use someone's existing installation of LLVM as you are tied to a particular version.

C API

LLVM also provides a C wrapper around the C++ APIs. This ...

There are comments.

Last Month's Hack-a-thon

Last month, we had our first hack-a-thon in many years. Attendance was spread out over the weekend and some great contributions got completed.

Carl Gay documented our regular expressions library.
Peter Housel made more solid progress on our LLVM compiler back-end.
Dustin Voss improved some aspects of our support for limited collections.
Ingo Promovicz improved support for locking I/O streams.
Bruce Mitchener worked on merging run-time libraries to help provide more consistent support for various features across the various compiler back-ends.
Francesco Ceccon worked on the gobject-introspection bindings and the generator for creating C-FFI bindings for the GTK+ family of libraries.
Hannes Mehnert addressed some issues in the networking library surrounding localhost connections.
Andreas Bogk returned from the dead and observed the goings on after a multi-year absence.
Tom Emerson began investigating providing support for Unicode in Dylan.
Robert Roland enhanced the sqlite bindings.

Thanks to everyone who participated and we're looking forward to our next hack-a-thon in a couple of months.

All of these changes and much much more will be present in our upcoming 2013.2 release!

There are comments.

Open Dylan's Compiler Back-ends

Open Dylan has 3 compiler back-ends currently:

HARP
C
LLVM

Harp

The HARP back-end generates native code for x86 processors and is used on Windows, FreeBSD and Linux. While for many years, this was our best and most mature compiler back-end, it is falling by the wayside and will be replaced by the LLVM back-end.

The issues with the HARP back-end are that it is 32-bit only and is more difficult to add support for things like new instructions, better scheduling, atomic operations and SSE support.

C

The C back-end generates C which is then compiled by either clang or gcc. We use the C back-end on Mac OS X, as well as Linux and FreeBSD on x86_64 processors.

The C back-end is relatively easy to work with and extend and is the easiest option for bringing up Open Dylan on a new platform. We are currently doing this with Linux on the ARM processor.

The C back-end used to generate code that was slower than HARP although in recent months, the C back-end has been optimized and many parts of it produce faster results than HARP.

The C back-end allows debugging the generated C (rather than at the Dylan ...

There are comments.

Reducing Dispatch Overhead

Method dispatch in Dylan is the process by which the compiler and run-time collaborate to choose the right implementation of a function to call. This can get rather complex as a number of factors are involved in choosing the right method to call and whether that can be done at compile-time or deferred to run-time.

Unfortunately, full generic method dispatch at run-time in Open Dylan is currently not as fast as we would like. This means that when performance issues strike, they may well be due to the overhead of method dispatch.

Discussing and improving the overall performance of method dispatch isn't the subject of this post. That's going to require a fair bit of planning and work before that is resolved.

Instead, we're going to look at what to do when you're experiencing performance problems at particular call sites due to the overhead of method dispatch.

Sometimes, this is readily visible in the profiler:

/static/images/method_dispatch_in_profiler.png

Here, we can see that the percent-encode method in the uri library is going through dispatch to call member?. (I've translated from the names of the functions in C to their names in Dylan. The name mangling isn't that ...

There are comments.

Saying Good-bye: HARP

This begins a new series of blog posts that will continue over the next few months as we say "Good-bye!" to parts of Open Dylan.

Freedom comes when you learn to let go
Creation comes when you learn to say no

  -- Madonna, The Power of Good-Bye

We're beginning a process by which we'll start slimming down the compiler and the libraries, letting go of some major chunks of code, with the goal of improving the hackability of the compiler and enabling us to make new leaps in functionality.

What is HARP?

HARP is the Harlequin Abstract RISC Processor and was designed and developed at Harlequin in the late 1980s. It was used in Harlequin's LispWorks and later translated to Dylan for use in Harlequin's DylanWorks (which is now Open Dylan).

Clive Tong, an engineer at Harlequin in 1989, briefly described it as:

The compiler targeted an instruction set known as HARP (Harlequin Abstract RISC Processor), and then HARP instructions were translated into machine instructions using a template matching scheme. HARP had an infinite set of registers, and the register colouring happened as part of this templating processing.

Some additional details about the early design of HARP ...

There are comments.

The ALGOL Roots of Dylan

Dylan is often seen as a descendent of the Lisp family of languages. It was designed (and implemented) by people from the Common Lisp world and borrowed lots from Scheme, EuLisp and other Lisp dialects. On the other hand, it was given a syntax from the ALGOL tradition rather than using s-expressions.

It is interesting to compare some of what is present within the Dylan language design with ALGOL though and see if perhaps some of the ALGOL influence wasn't just in the syntax. It is difficult today to know the extent to which ALGOL 68 was a direct influence versus having been filtered through Common Lisp and other languages. After all, Dylan was designed some 25 years or so after ALGOL 68. (At the time of this writing, Dylan itself is over 20 years old.)

It is clear though that many languages have picked up ideas from ALGOL over the years, including languages that were direct influencers of Dylan such as Scheme.

This post isn't intended to say "this concept came from ALGOL" but just to look at some of the interesting similarities.

Orthogonal Design

ALGOL 68 as defined in the Revised Report on the Algorithmic Language ...

There are comments.

Thoughts on a New Dylan Ecosystem

Dylan is in an unusual position for a programming language.

It has a pretty mature implementation, a specification, and a lot of high quality documentation. It has a long history and many people remember it from the early days of its history when it was being developed at Apple and other institutions.

On the other hand, the current community is fairly small as is the available set of libraries. This is typically a disadvantage, however it can be used to the community's advantage in a couple of ways.

Looking Forward

One way is that we can look forward and take advantage of what we know and want now without having to worry about backward compatibility.

A parallel might be seen with the early days of Node.js. Early on, Node.js was also working with a programming language with a specification and a mature implementation. However, Node.js was using JavaScript in a new way: to create server-side software employing asynchronous and non-blocking techniques. A big advantage that Node.js had was that there wasn't a large collection of libraries for doing server-side programming. The developers could (largely) start from a clean slate, producing libraries that fit in ...

There are comments.

Type System Overview

The type system and how it is used is a commonly misunderstood aspect of the Dylan language. Although it lacks some forms of expressiveness in the current incarnation, it also has some features that aren't found in many languages, such as singleton types. It is also very important in helping the compiler to generate faster yet still safe code.

One interesting feature in Dylan is that it is optionally typed. While this is more common today and sometimes has fancy names applied like 'gradually typed', the overall point is the same: Your code can start out untyped and looking like code does in Ruby or Python. However, when you want or need additional performance or correctness guarantees, you can supply type annotations that the compiler can use. The compiler can also infer some types from the values used or other type annotations.

In this post, we'll explain some of the basic concepts of the Dylan type system and show how it is used by the compiler.

Type and Value Relationships

There are 2 important relationships between values and types in Dylan.

They are instance? and subtype?. Other relationships, such as known-disjoint? are used within the compiler to assist ...

There are comments.