On the Limits of the “Possible”

A look at Erlang-OTP quickly reveals the limits it has in terms of Security principles and concerns. A brilliantly conceived platform is diminished by the exclusive focus on performance and availability goals.

In the same way, important applications like the innovative CouchDB and the standards-based RabbitMQ are boxed-in by the flaws in the platform.

As applications, they cannot provide more Security than Erlang offers; put more plainly, they have to resort to other, common Internet technologies to stand up what is missing. These complements range from infrastructure capabilities to completely different programming languages, which provide what they can and in many cases were tested in different contexts.

This does not mean that an acceptable level of security cannot be achieved by these means; it should serve only to underscore the limitations of the platform itself.

The Solution Architect or the Security Manager in a project may feel inclined to go along with these limitations and just blend the CouchDB database into the technology stack, filling in the gaps and otherwise “couching” CouchDB in other layers of code and infrastructure, but this would leave the doors open to serious consequences. Because these are normally overlooked, I will reiterate here what this entails:

If any IT solution forges ahead without an established data ownership and data classification model the consequences are that both the Security Architecture and the Application Design will merge and become dependent on each other. Even more, there will be Security mechanisms spread across the application, the middleware (database and messaging) and the authentication and authorisation services. In other words, the lack of a security model directly and indirectly determines the application. So here is the problem with Erlang and derivatives: these do not offer and do not enforce any form of meaningful Security, so all the work is on the side of the Solution Architect and the Security Manager (and obviously the delivery lead and the programmers). What security model is recommended? How would a role model affect the database operation or performance? What should be the difference between the roles for developers and administrators?

The temptation to follow suit and implement some “agile” security remediation around Erlang-based software (or any other) by means of some PHP pages and a “slap” of SSL is a real threat to any IT Project and has to be addressed early.

CouchDB

The book “CouchDB and PHP Web Development Beginner’s Guide,” by Tim Juravich (Packt Publishing, 2012) was the first book I read about this Erlang application during my research. I already had a good idea of the functionalities of the product in the context of high-volume web page caching but wanted to know more about more complex forms of implementation. I must say that this book is indeed a very good source to understand how to handle CouchDB, despite its complete focus on PHP for the interactions with the database.

Let’s turn to Security. If you are looking for it, jump directly to Chapter 3 and, after some instructions regarding the use of curl and the web interface, you get to the first information you need to retain for your project (see page 46). We learn, for example, that:

“Having CouchDB unsecure isn’t bad when you are programming locally, but it can be catastrophic if you accidentally have an unsecure database on a publicly accessible server.”

Aside from the fact that it is difficult to understand why having an unsecure database is not bad when you program locally (unless this is the case of somebody playing at home with this software), I would object to the idea that you can “accidentally” have an unsecure database on a public server. These are no accidents; but we have to put up with the usual light tone people adopt when speaking about Security. Because we know, as Juravich himself writes, that “When you don’t have any administrators on your CouchDB instance” there *is* only one top level user called “Admin Party” who can process “any request, for anything.”

In other words, more seriously, this means that the default state of the database is completely unprotected. The web interface, called Futon, kindly advises us at the bottom right corner of the screen: “Everyone is admin. Fix this.” Correct, probably this should be your first task.
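A minimal sketch of that first task, under stated assumptions: a CouchDB 1.x server on the default local port, where server admins are created by writing to the `_config/admins` section over HTTP (later releases moved this endpoint). The admin name `tim` and the password are placeholders of mine, not values from the book. The request is only built here, not sent.

```python
import json
import urllib.request

# Assumptions: a local CouchDB 1.x instance on the default port, and a
# hypothetical admin name with a placeholder password.
BASE_URL = "http://localhost:5984"


def build_create_admin_request(name: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request that adds a server admin."""
    # The config API expects the new value as a JSON string in the body.
    body = json.dumps(password).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/_config/admins/{name}",
        data=body,
        method="PUT",
    )


request = build_create_admin_request("tim", "change-me")
print(request.get_method(), request.full_url)
# To apply it against a running server you control:
# urllib.request.urlopen(request)
```

Once the first admin exists, the “Admin Party” ends and the same request would itself need admin credentials.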

Turning around the examples given in the book, the reality is that the underlying mechanism does not expect or need any specification of the access level, or, better said, the CouchDB application does not know the concept of segregation of access and does not need it for any operation. You *may* define administrators and (as we will see in a minute) users, but that does not change the underlying operation of the software. These administrators and users are applied not at the level of the transaction and the read/write routines, but (much as in operating systems) at the level of “ownership” of the objects and sessions.

Commands sent to CouchDB using the REST API are HTTP requests and carry user authentication and access scoping as parameters in the request string. Needless to say, the JSON responses from CouchDB are in plain text. Password hashes are stored in the database, but you will need something around the CouchDB API calls to secure the data. You depend on the command line environment (telnet or shell session) and the operating system security.
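To make the plain-text point concrete, here is a small sketch using HTTP Basic authentication, one of the schemes CouchDB accepts. The user name and password are placeholders. The header is only Base64-encoded, so anyone who can observe an unencrypted request can recover the credentials.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def recover_credentials(header_value: str) -> tuple:
    """Reverse the encoding, as anyone observing the traffic could."""
    encoded = header_value.split(" ", 1)[1]
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, password = decoded.split(":", 1)
    return username, password


header = basic_auth_header("tim", "change-me")  # placeholder credentials
print(header)
print(recover_credentials(header))
```

Base64 is an encoding, not encryption, which is why transport protection has to come from somewhere outside the API call itself.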

Note that even if you change the “roles” of the users (for example to “readers”)  while keeping only selected users as “admins,” there is still no separation of duties between the “admins.”  While this may be considered “sensible” from the point of view of a developer, it is very revealing as to the complete lack of security of the underlying software.

When looking at chapter 6 (Modeling Users) I expected to find some details with fine-grained control over user access, but instead I saw only detailed instructions to use PHP in order to “organise the user views.” This is essentially a web application development exercise (a small application to access CouchDB which can be used to define “user documents”). These documents are very important because they define the user roles and types. An empty value for “role” means that the user has no “special privileges.”
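For orientation, this is roughly what such a user document looks like, sketched in Python rather than the book’s PHP. The field layout follows the convention of CouchDB’s `_users` database (where the field is named `roles`); the user name and password are placeholders. I assume CouchDB 1.2 or later, where the server replaces a plain `password` field with a salted hash on save; earlier versions expected the client to supply the hash.

```python
import json


def build_user_document(name, password, roles=None):
    """Return a dict shaped like a document in CouchDB's _users database."""
    return {
        "_id": f"org.couchdb.user:{name}",  # required id prefix for user docs
        "name": name,
        "type": "user",
        # An empty list means the user has no special privileges.
        "roles": list(roles or []),
        # Assumption: CouchDB 1.2+ hashes this field server-side on save.
        "password": password,
    }


doc = build_user_document("john", "placeholder-password")
print(json.dumps(doc, indent=2))
```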

A login process now, after the definition  of a user document, will be constrained by the name, password, type and role stored in the database. On pages 123 to 137 Juravich describes the PHP code necessary to create a login page. On pages 145 and following, the author details how users log in and are provided with a session cookie to authenticate the user in the HTTP flow.
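The cookie flow can be sketched without PHP. CouchDB’s session endpoint (`POST /_session`) accepts a form-encoded name and password and answers with a `Set-Cookie` header carrying an `AuthSession` value. The header and credentials below are made-up examples, and nothing is sent over the network.

```python
from http.cookies import SimpleCookie
from urllib.parse import urlencode


def build_login_body(name: str, password: str) -> bytes:
    """Form-encode the credentials as the body of POST /_session."""
    return urlencode({"name": name, "password": password}).encode("ascii")


def extract_auth_session(set_cookie_header: str) -> str:
    """Pull the AuthSession token out of a Set-Cookie header value."""
    cookie = SimpleCookie()
    cookie.load(set_cookie_header)
    return cookie["AuthSession"].value


# Hypothetical response header; a real token is issued by the server.
example_header = "AuthSession=ZXhhbXBsZS10b2tlbg; Path=/; HttpOnly"
print(build_login_body("john", "secret"))
print(extract_auth_session(example_header))
```

Whoever holds that token is the user until the session expires, so the cookie deserves the same transport protection as the password.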

Other parts in the text revolve around the Security capabilities that can be created around the arrangement of User Documents and the level of security that may be provided by means of PHP programming. This includes the development of “content” (user profiles that are intended for “sharing” in social networks for example). (See chapter 7 “User Profiles and Modeling Posts”). The key point here is that only the authenticated user should be able to read and modify his/her profile. This is achieved by means of ever more PHP code with an example for  “admin” and common users.

At this point of the review I had the firm feeling that for any Erlang project I would have to consider (perhaps not exclusively) a meaningful level of PHP development or else the introduction of a PHP (or equivalent) web framework to wrap CouchDB and make it do more than just sit around and “serve content” ultra-efficiently.

[Added on 12/3/2014] I consulted two other books to complement this approach to CouchDB security:

a) “CouchDB: The Definitive Guide,” by J. Chris Anderson, Jan Lehnardt and Noah Slater, O’Reilly, 2010.

b) “Beginning CouchDB,” by Joe Lennon, Apress, 2009

Sadly these two books do not add anything relevant in terms of Security as compared with the material reviewed above.

RabbitMQ

From a completely different angle comes RabbitMQ. The difference lies in the fact that this application represents a serious attempt at implementing the industry standards for messaging (employing the AMQP protocol; see: http://www.amqp.org/). To learn about this technology I read the book “RabbitMQ in Action,” by Alvaro Videla and Jason J.W. Williams (Manning Publications, 2012). Just because of its compact and clear explanation of messaging (including message formats, topics, publishing, queues, channels, exchanges, etc.) this book is good study material. If you are familiar with IBM MQSeries or TIBCO then you will understand RabbitMQ’s messaging functionality in a matter of minutes.

Very soon we notice again the Erlang limitations seeping up to the application and its interfaces. And, as is customary in other Erlang literature, we soon know that non-Erlang technologies will have to take charge of Security while Erlang does its “resiliency” and “clustering” act.

The RabbitMQ book is somewhat different in that it gives a lot of space to the use of Secure Sockets Layer (SSL) to protect data transmissions (see for example page 213 and following covering SSL Setup); something which is also useful even if you don’t do anything else to enhance Security.

Videla and Williams do a good job explaining the concept of “virtual host,” which is a “virtual message broker” peculiar to RabbitMQ. This is also important because in this Erlang application, permissions are associated with virtual hosts. The book indicates:

“When you create a user in Rabbit, it’s usually assigned to at least one vhost and will only be able to access queues, exchanges and bindings on those assigned vhosts.” (Page 24)

If you don’t need multiple vhosts, the text explains, you can use the default one associated with the guest username with password guest. A key point is that the separation between vhosts is “absolute,” which means that the user associated with one vhost cannot access queues on another one (which is not assigned to him/her). While this is something to start with in terms of Security, I find clear indications that the mechanism described is just a higher level replication of operating system capabilities, akin to file or folder read/write permissions. Interestingly, vhosts can be created only using a specific utility called rabbitmqctl. The same command with the list_vhosts option will show all the vhosts on a particular Server (which in fact is an Erlang node where RabbitMQ is running; or a remote node if you use the -n option and the remote node name: rabbitmqctl -n rabbit@nodename).
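The rabbitmqctl invocations mentioned above can be assembled as follows. This is only a sketch that builds the command lines and prints them; `add_vhost` is the rabbitmqctl subcommand for creating a vhost, while the vhost name `orders` is a hypothetical example of mine.

```python
import shlex


def rabbitmqctl(*args, node=None):
    """Return the argv list for a rabbitmqctl call, optionally on a remote node."""
    command = ["rabbitmqctl"]
    if node is not None:
        command += ["-n", node]  # target another Erlang node
    command += list(args)
    return command


commands = [
    rabbitmqctl("add_vhost", "orders"),                  # hypothetical vhost name
    rabbitmqctl("list_vhosts"),                          # vhosts on the local node
    rabbitmqctl("list_vhosts", node="rabbit@nodename"),  # vhosts on a remote node
]
for command in commands:
    print(shlex.join(command))
# To execute one for real: subprocess.run(command, check=True)
```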

The reader should focus on the relationship between vhosts and Security, as this is the most important link in the chain of functions and components to consider when securing RabbitMQ solutions.

Immediately the text moves on to performance and resiliency-related matters, and Security is not addressed again until Chapter 3 (Running and Administering Rabbit). This chapter, especially section 3.2, is key to understanding the underlying security model:

“If you are familiar with access control lists on various operating systems, understanding RabbitMQ’s permission system will come readily to you. Like most permission systems, it starts with users who are then granted rights. The nice thing about the RabbitMQ permission system is that a single user can be granted permissions across multiple vhosts. This can greatly simplify the management of access control for an application that needs to talk across multiple security domains (using virtual hosts for separation).

“Within RabbitMQ users are the basic unit of access control. They can be granted different levels of access to one or more vhosts and use a standard username/password pair to authenticate the user. Adding, deleting, and listing them is simple and is accomplished using rabbitmqctl.”

(Note: Pages 43-44 describe the basics of user management.  Password changes are achieved with the same tool and the change_password option.)

In other words, the security model is extremely simple and provides (as detailed in section 3.2.2) read, write and configure permissions over queues in a vhost. The configure permission allows the user to create and delete queues and exchanges. The authors say on page 45:

“The access control entry consists of four parts:

“-the user being granted access

“-the vhost on which the permissions apply

“-the combination of read/write/configure permissions to grant

“-the permission scope—whether the permissions apply only to client-named queues/exchanges, server-named queues/exchanges, or both. Client-named means your app set the name of the exchange/queue; server-named means your app didn’t supply a name and let the server assign a random one for you.”

“It is important to remember that access control entries can’t span vhosts.”
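The structure of such an entry can be modelled in a few lines. This is a simplified sketch of my own, not RabbitMQ’s implementation: each entry belongs to exactly one user and one vhost, and each of the three permissions is a regular expression over queue and exchange names, which is how `rabbitmqctl set_permissions` expresses them. The scope element from the quote is left out, and the user, vhost and queue names are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessControlEntry:
    user: str
    vhost: str
    configure: str  # regex over resource names the user may create/delete
    write: str      # regex over resource names the user may publish to
    read: str       # regex over resource names the user may consume from


def is_allowed(entries, user, vhost, operation, resource_name):
    """Check one operation against the entry for this (user, vhost) pair."""
    if operation not in ("configure", "write", "read"):
        raise ValueError(f"unknown operation: {operation}")
    for entry in entries:
        if entry.user == user and entry.vhost == vhost:
            pattern = getattr(entry, operation)
            return re.search(pattern, resource_name) is not None
    return False  # no entry for this vhost means no access at all


entries = [
    AccessControlEntry("app-user", "orders", "^$", r"^orders\..*$", r"^orders\..*$"),
]
print(is_allowed(entries, "app-user", "orders", "read", "orders.new"))
print(is_allowed(entries, "app-user", "billing", "read", "orders.new"))
```

Because the lookup is keyed on the user and vhost together, an entry can never grant anything outside its own vhost, which is the property the authors stress.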

And on page 47 they add:

“Permissions in RabbitMQ are simple to create and very flexible. The flexibility allows you to create complex permission structures for your vhosts, which can be a benefit when you need it, but can be difficult to interpret if they become too complex. Where possible, try to use vhost separation as your primary method of securing one app from another, and keep the number of access control entries per vhost to a minimum. This will help you to avoid unexpected permission behaviour that can be difficult to debug.”

All of which is good to know, especially because the Security advice we get from the book just ends there (if you omit fairly obvious tips about web management, SSL configuration and server logging).

Because Erlang delegates Security capabilities to its environment and the infrastructure, Erlang applications have no choice but to “invent” their own “security models” and to use other tools to achieve moderate and fairly conventional protection: for example, basic tools like login pages, PHP database calls, SSL transport encryption and very, very basic ACLs.

After learning these points I knew that my research had to go in a different direction. And that is what I did next, by studying the now remote efforts made to create a Safer Erlang.