On the Road to Nowhere

The following three books are probably on the desks of every practicing Erlang specialist:

– “Erlang Programming,” by Francesco Cesarini and Simon Thompson, O’Reilly, 2009

– “Erlang and OTP in Action,” by Martin Logan, Eric Merritt and Richard Carlsson, Manning, 2011

– “Building Web Applications with Erlang,” by Zachary Kessin, O’Reilly, 2012

“Erlang and OTP in Action”

For all the goodness in these books (I like the second one, Erlang and OTP in Action, best), they all fail to address Information Security in a meaningful way. They do so by adopting the now “standard” stance of presenting Erlang as a “performance” and “concurrency” oriented language, and taking this as a sufficient explanation.

The Erlang and OTP in Action book gives the reader a very good grasp of all the aspects entailed by an Erlang-based project. Knowledgeable people say that it is impossible to fully understand Erlang without OTP, and this is because OTP effectively builds up Erlang into a complete platform for service design, build and run operations.

Security is addressed (in the customary way) only in Part I of the book, in chapter 8 “Introducing distributed Erlang/OTP,” in keeping with the assumption that Security is somehow related to “distributed” operations only. This chapter gives very useful information about nodes and clustering, for example how to start a node and the node naming options. A key point is the role of epmd (the Erlang port mapper daemon), which is explained in section 8.2.3. For Security purposes it is very important to understand the behaviour of this sub-component and the assumptions under which it operates. The authors write:

“Note that Erlang’s default distribution model is based on the assumption that all the machines in the cluster are running on a trusted network. If that isn’t the case, or if some of the machines need to talk to the outside world, then communication over the unsafe network should be done using direct TCP (or UDP or SCTP) with a suitable protocol for your application, as you did with the RPC server in chapter 3. Alternatively, you can tunnel the traffic via SSL, SSH, or IPsec, or even configure the distribution layer to use SSL or another carrier protocol (see the Erlang/OTP SSL library and ERTS user guides for details).” (Page 197)

As we see, the advice here goes no further than what we have read in other books: in essence the authors suggest a “roll-your-own” approach and focus exclusively on transport layer security (i.e. inter-node data encryption). This is useful but lacking in depth, and from a Security perspective it leaves much to be desired.

It is interesting to see that the authors have an idea of the need to surround Erlang environments with additional infrastructural protection, but sadly they do not elaborate on the architecture of this:

“In a typical production environment, you have a number of machines on a trusted network and one or more Erlang nodes that communicate with the outside world via an Erlang web server like Yaws, MochiWeb or the standard library inets httpd. You may also be running other protocols on certain ports. Apart from that, nothing can access your network from the outside. Still it would be foolish to have no security at all, if only to avoid human error. Erlang’s distribution uses a system of magic cookies for authorisation; and apart from firewalls, the most common reason for failing to connect nodes is an incorrectly set cookie.” (Page 198)

Aside from using the term “authorisation” wrongly (I guess the authors meant identification of the calling node), this paragraph just shows how dependent Erlang-OTP is on the environment it runs in. In some cases this may be “protected” by other Erlang-based components, for example the Yaws web server, and in others (perhaps the majority of cases!) all Security arrangements will be based on other infrastructural components and other parts of the Solution Architecture. The Solution and Security Architects need to be prepared to protect Erlang-OTP by means of authentication and authorisation mechanisms, as well as various types of firewalls, not to speak of the large gap that would still need to be addressed in terms of transaction level security (at application level).
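To make the distinction concrete, the following is a minimal sketch of what a shared cookie can and cannot establish. It is written in Python purely for illustration and does not reproduce Erlang's actual distribution handshake; the function names, the 16-byte challenge and the choice of HMAC-SHA256 are all my own assumptions. The check answers one question only: does the peer know the secret?

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """Return a fresh random challenge for a single handshake."""
    return secrets.token_bytes(16)


def answer_challenge(cookie: str, challenge: bytes) -> bytes:
    """Prove knowledge of the shared cookie without sending the cookie itself."""
    return hmac.new(cookie.encode("utf-8"), challenge, hashlib.sha256).digest()


def peer_knows_cookie(cookie: str, challenge: bytes, response: bytes) -> bool:
    """Verify the peer's response using a constant-time comparison."""
    expected = answer_challenge(cookie, challenge)
    return hmac.compare_digest(expected, response)
```

A peer that passes this check is simply “in”: nothing in the exchange says on whose behalf it is calling or which operations it may perform, which is why I would call this identification of the node rather than authorisation.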

As fine as Erlang may be for large concurrent workloads, this dependence may give the Solution Architect pause and lead him or her to consider other options, given that a complete solution may well need a meaningful complement of non-Erlang technologies around it.

“Erlang Programming”

The book “Erlang Programming” by Cesarini and Thompson does not change my idea that the Erlang community is “set in its ways” with respect to Security. The book is a fine show of software engineering knowledge, but sadly a very one-sided one.

As with every other book I have read about Erlang, Security is relegated to a chapter dedicated to “Distributed Programming in Erlang” (Chapter 11, page 245). What is understood as “distributed” here remains entirely within the original design of Erlang, from well before “distributed” meant global, international communications and data movements over untrusted networks. If current network technologies (more or less attuned to this situation) are under persistent, multifaceted and powerful attacks by public and private actors, and if we know (as we do) what can be done with data transfers even over “encrypted channels,” whatever is said about Erlang Security will seem not only insufficient but outright dangerous.

Let’s remember, though, that the Erlang designers are in fact proposing that the security of Erlang-OTP be provided by other (non-Erlang) technologies and components, and we have the whole picture: the proposed role or niche sought by Erlang is to operate in the back end, as a high performance and high concurrency device, while being completely dependent on the Solution and Infrastructure for anything related to data protection and access control. (I still need to investigate the access control capabilities of Erlang-based applications like Yaws and CouchDB to fully understand what can be obtained from the Erlang platform.)

The following fragments show the over-arching focus of Cesarini and Thompson’s book:

“Take an installation of Ejabberd, an Erlang open source Jabber-based instant messaging (IM) server. Its implementation is distributed across a cluster of two or more Erlang nodes. These nodes, residing on the same or separate machines, help each other by sharing the message and event loads. Should one of the nodes terminate because of a software or hardware error, or simply because of lack of memory, the other nodes take over the traffic, hiding the fault from the end user. In the worst case, end users might believe they experienced a network glitch when the socket reconnects to the new node, but all they would notice are other users signing out and in.” (Page 245)

“The Erlang Web Framework, an open source application for Erlang-based web applications, uses distribution for scalability and reliability. A typical cluster consists of frontend and backend nodes. The frontend nodes contain the web servers (running in the Erlang node), a cache layer, and a layer handling XML parsing for inbound requests. It also contains the functionality for handling the dynamic generation of XHTML. Two or more backend nodes contain the databases and all of the glue and logic needed to generate the dynamic content. The real load will be on the frontend, as it handles the socket connections and most of the parsing. To scale the system, all you need to do is add more hardware and frontend nodes, increasing the backend support only when necessary. Should any of the nodes fail the load balancers will automatically redirect the traffic to the nodes that are still alive.” (Page 246)

The focus is evidently on concurrent workloads and nothing else, a fact that runs against our most basic experience of Web applications and infrastructure. As for “communication and security,” pages 250-253 repeat the customary instructions on how to use “cookies,” but also carry this warning:

“Even the most security-unconscious readers will have realized that basing your security on secret cookies alone is not very reliable. As telecom clusters tend to run behind firewalls, enhancing security in its distribution model has never been an issue. In the early days, cookies were in fact sent across the network unencrypted!

“Considering the low level of security in distributed Erlang, how can you build a secure distributed system in Erlang? There are two answers to this question:

“- If you are building a distributed system for scalability and robustness, it’s likely that you are working in a closed and secure network environment. In this case, the Erlang distribution model directly supports what you require in a transparent and effective way.

“- If you want to build a geographically distributed system, it is best to communicate between nodes using existing secure mechanisms, such as SSL over TCP/IP. The Erlang distribution has library support for many protocols including secure protocols such as SSL. We’ll cover the fundamentals of how to communicate using TCP/IP in Erlang in Chapter 15.

“You can enhance security by writing your own net_kernel process, giving the process whatever behaviour and level of security you might require.” (Page 254)

All of this reveals not only a worrying lack of understanding of Application Security but also an equation of Security with “transport layer” security, as we saw in Erlang and OTP in Action earlier in this post. “Roll your own” and “use SSL” seem to be all the top Erlang specialists have to tell us.

I am somewhat encouraged, though, to see that Cesarini and Thompson at least recognise that there is a problem. As for the “two answers” indicated above, I regret to say that they are misguided, because nobody in the world is building distributed systems only for “scalability and robustness” … unless this is a way to confirm that some Erlang implementations actually are being built without any Security requirements.

“Building Web Applications with Erlang”

The book “Building Web Applications with Erlang,” by Zachary Kessin, is even more disappointing, if only because the subject of web applications should not and cannot (under any conceivable premises) be addressed without a strong Security emphasis. It is a pity that Kessin chose to follow the “standard” approach and assume that everything we are seeking in web application development has to do with performance and resiliency. I am not naive about the limitations of the currently dominant application service platforms, but if Erlang wants to become “the next big thing” it will have to address the Security gaps this platform exhibits.

Kessin’s book is centred on the Yaws web server and is very well informed about the capabilities of this component. The ability to instantiate REST as well as conventional HTTP client server applications is frankly very attractive from the point of view of Solution design. REST also makes a lot of sense for concurrent workloads.

On the other hand, subjects related to authentication, authorisation, user data storage and application security are scantily treated. For example, the term “authentication” is used only three times in the entire book: once (on page 36) and only because of an obvious reference to HTTP cookies; a second time in the context of discussing external authentication (page 87), i.e. authentication by third parties; and a third time (on page 162) when Kessin mentions authentication in passing while remarking that “a more robust example should of course use some authentication to determine the user.”

Indeed. If we look at the code on pages 102 to 104 we see an example of “user status” handling but *no* authentication.

As for the term “authorisation,” we can find the subject used only on page 87 when referring to a third party “authorisation” page, i.e. not an application or transaction level authorisation mechanism.
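By application or transaction level authorisation I mean a check made on every operation, after the user has been authenticated. A minimal sketch in Python, with a hypothetical role policy and hypothetical operation names (none of this comes from Kessin's book), might look like this:

```python
from typing import FrozenSet, NamedTuple


class User(NamedTuple):
    """An already-authenticated user and the roles assigned to them."""
    user_id: str
    roles: FrozenSet[str]


# Hypothetical policy: operation name -> roles allowed to perform it.
POLICY = {
    "view_balance": frozenset({"customer", "teller"}),
    "approve_transfer": frozenset({"teller"}),
}


class Forbidden(Exception):
    """Raised when a user may not perform the requested operation."""


def authorise(user: User, operation: str) -> None:
    """Deny by default: unknown operations are allowed to nobody."""
    allowed = POLICY.get(operation, frozenset())
    if user.roles.isdisjoint(allowed):
        raise Forbidden(f"{user.user_id} may not perform {operation}")


def approve_transfer(user: User, amount: int) -> str:
    """A business operation that checks authorisation before doing any work."""
    authorise(user, "approve_transfer")
    return f"transfer of {amount} approved by {user.user_id}"
```

The point of the sketch is only that the decision is made per operation and per user, inside the application, which is something no third party “authorisation” page provides.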

If we consider how the author addresses user authentication we read things that would be funny if they did not correspond to what the Internet was probably around the year 1990:

“Sometimes you may wish to restrict access to resources, for example to users who have entered a password or can otherwise be authenticated [..]. In many cases you may wish to do something like check a username and password or session token against a Mnesia database or other data store. Ideally you would validate the username and password against some source of data, such as a Mnesia table. In Example 3-6, I use the function validate_username_password/1 that extracts the username and password from the request and checks them against the Mnesia table. This function will return either {true, Uuid} if the user authenticates correctly, or {false, Reason}. In this case, Reason can be no_user in the case where there is no user by that name, or bad_password. Clearly sharing the reason why the login was rejected would be a bad idea. The out/2 function takes the result of validate_username_password/1 and returns either {status, 401} if the user did not authenticate or a HTML page. It also logs the login attempt.” (Page 38)

So we learn that “sometimes you may wish to restrict access to resources,” when nowadays we *always* do so in any serious application environment, and we learn that “in many cases you may wish to do something like check a username and password.” I can picture the smiles of any application developer who really understands the unforgiving task of user authentication. (Note: the term Mnesia in the text above refers to Erlang’s default, embedded database, which would itself need a Security assessment of its ability to serve as a user data store!)
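For contrast, here is a minimal sketch of what even a basic password check involves today. It is written in Python for illustration only and is not Kessin's Example 3-6; the in-memory store, the function names and the iteration count are hypothetical. It assumes salted PBKDF2 hashes, a constant-time comparison, and a single undifferentiated failure, so that the caller never learns whether the username or the password was wrong.

```python
import hashlib
import hmac
import os
import uuid
from typing import Dict, Optional, Tuple

ITERATIONS = 100_000  # assumed work factor; tune it for your own hardware

# Hypothetical in-memory store: username -> (user id, salt, password hash).
UserStore = Dict[str, Tuple[str, bytes, bytes]]


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a slow, salted hash so that stolen records resist guessing."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(store: UserStore, username: str, password: str) -> str:
    """Store a new user with a random salt and return the generated user id."""
    salt = os.urandom(16)
    user_id = str(uuid.uuid4())
    store[username] = (user_id, salt, hash_password(password, salt))
    return user_id


# Dummy record, so that unknown usernames cost the same work as known ones.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = hash_password("not-a-real-password", _DUMMY_SALT)


def authenticate(store: UserStore, username: str, password: str) -> Optional[str]:
    """Return the user id on success, or None without saying what was wrong."""
    record = store.get(username)
    if record is None:
        user_id, salt, stored = None, _DUMMY_SALT, _DUMMY_HASH
    else:
        user_id, salt, stored = record
    matches = hmac.compare_digest(hash_password(password, salt), stored)
    return user_id if (matches and record is not None) else None


def login_status(store: UserStore, username: str, password: str) -> int:
    """Map the outcome to an HTTP status: 200 on success, 401 otherwise."""
    return 200 if authenticate(store, username, password) is not None else 401
```

Even this leaves out rate limiting, account lockout, session handling and protection of the stored records, which is where most of the real work lies.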

By the way, Kessin doesn’t say anything else about user authentication in the book. In summary, while I would keep this book at hand while using Erlang for some projects, I definitely would look somewhere else (and by this I mean: other non-Erlang based technologies) to provide authentication, authorisation, application security, transaction and session control and user management. I am sure though that if my clients required strong authentication and authorisation mechanisms at application level I would have trouble justifying the use of Erlang for web solutions.

Alternatively we could still see Erlang-OTP as an infrastructural component, in a secondary role or a “functional niche.” If this is what the Erlang community is aiming at then probably the goal has been already achieved.