Just to clarify, this article is based entirely on my own understanding and is not an official document about CloudFoundry in any way. Feel free to let me know if my understanding is wrong.
Let's first try to answer a question. Since we learned previously that PaaS is a super cool solution, one might wonder:
Why aren't a lot of companies providing PaaS solutions?
Building a Platform as a Service (PaaS) is fairly complicated: you have to build, deploy, and maintain many moving parts, orchestrate all of the internal services, abstract all of that work away from the user, and finally market, sell, and support the result. Because of the heavy investment involved, very few companies have considered building their own PaaS solution. Interestingly, VMware has made Cloud Foundry open source.
What is CloudFoundry written in?
Interestingly, it is written entirely in Ruby! No Erlang, no JVMs, all Ruby under the hood.
For a nice technical overview, check out this webinar.
Who orchestrates all the components in CloudFoundry?
To orchestrate all of these moving components, the "brain" of the platform is a Rails 3 application (the Cloud Controller) whose role is to store information about all users, provisioned apps, and services, and to maintain the state of each component. When you run the CLI (command line client) on your local machine, you are in fact talking to the Cloud Controller. Interestingly, the Rails app itself is designed to run on top of the Thin web server and uses Ruby 1.9 fibers and async DB drivers. In other words, async Rails 3!
Working hand in hand with the Rails application is the Health Manager, a standalone daemon that imports all of the CloudController ActiveRecord models and actively compares what is in the database against the chatter between the remaining daemons. When a discrepancy is detected, it notifies the Cloud Controller: a simple and effective way to keep all the distributed state information up to date.
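To make the Health Manager idea concrete, here is a minimal sketch of that comparison in Python. It is not CloudFoundry's actual code (which is Ruby); the function name, the `Discrepancy` record, and the shape of the heartbeat messages are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Discrepancy:
    """Hypothetical record of one mismatch between desired and observed state."""
    app: str
    expected: int
    running: int


def find_discrepancies(expected_instances, heartbeats):
    """Compare the desired state (what the database says) with the observed
    state (instance counts derived from heartbeat messages)."""
    # Count how many live instances each app reported via heartbeats.
    running = {}
    for beat in heartbeats:
        running[beat["app"]] = running.get(beat["app"], 0) + 1

    # Walk every app known to either side, so apps that are running but
    # missing from the database are reported as well.
    problems = []
    for app in sorted(set(expected_instances) | set(running)):
        want = expected_instances.get(app, 0)
        have = running.get(app, 0)
        if want != have:
            problems.append(Discrepancy(app, want, have))
    return problems
```

In a real deployment, each discrepancy would be reported back to the Cloud Controller, which decides whether to start or stop instances.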
How is orchestration of the CloudFoundry platform done?
The remainder of the CloudFoundry platform follows a consistent pattern: each service is a Ruby daemon which queries the CloudController when it first boots, subscribes to and publishes to a shared message bus, and also exposes several JSON endpoints for providing health and status information. Not surprisingly, all of the daemons are also powered by Ruby EventMachine under the hood, and hence use Thin and simple Rack endpoints.
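As a rough illustration of the "JSON endpoint for health and status" part of this pattern, here is a small Python sketch using only the standard library. The real daemons are Ruby processes built on EventMachine, Thin, and Rack; the `/healthz` path, the payload fields, and the component name `router` used here are assumptions.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

STARTED_AT = time.time()


def health_status(name):
    """Build the status payload a daemon might report (fields are assumed)."""
    return {
        "name": name,
        "healthy": True,
        "uptime_seconds": int(time.time() - STARTED_AT),
    }


class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only one endpoint is served in this sketch; the path is an assumption.
        if self.path != "/healthz":
            self.send_error(404)
            return
        body = json.dumps(health_status("router")).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=0):
    """Bind to localhost; port 0 asks the OS for any free port."""
    return HTTPServer(("127.0.0.1", port), StatusHandler)
```

Calling `make_server().serve_forever()` would start answering requests; a monitoring tool could then poll the endpoint of every daemon in the same way.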
The router is responsible for parsing incoming requests and redirecting the traffic to one of the provisioned applications (droplets). To do so, it maintains an internal map of registered URLs and the provisioned applications responsible for each. When you provision or decommission an app server instance, the router table is updated, and the traffic is redirected accordingly. For small deployments, one router will suffice, and in larger deployments, traffic can be load-balanced between multiple routers.
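The routing table described above can be sketched in a few lines of Python. This is only an illustration: the class and method names are hypothetical, and the round-robin choice between instances is an assumption, since the text does not say how the real router picks one.

```python
class Router:
    """Toy routing table: URL -> list of app server instances."""

    def __init__(self):
        self._routes = {}   # url -> list of "host:port" backends
        self._cursors = {}  # url -> position of the next backend to use

    def register(self, url, backend):
        """Called when a new app server instance is provisioned."""
        backends = self._routes.setdefault(url, [])
        if backend not in backends:
            backends.append(backend)

    def unregister(self, url, backend):
        """Called when an app server instance is decommissioned."""
        backends = self._routes.get(url, [])
        if backend in backends:
            backends.remove(backend)
        if not backends:
            # No instances left: forget the URL entirely.
            self._routes.pop(url, None)
            self._cursors.pop(url, None)

    def lookup(self, url):
        """Pick a backend for the URL, rotating through the instances."""
        backends = self._routes.get(url)
        if not backends:
            return None
        index = self._cursors.get(url, 0) % len(backends)
        self._cursors[url] = index + 1
        return backends[index]
```

With two instances registered for the same URL, successive `lookup` calls alternate between them, and an unknown URL yields `None`.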
The DEA (Droplet Execution Agent) is the supervisor process responsible for provisioning new applications: it receives the request from the CloudController, sets up the appropriate platform, exports the environment variables, and launches the app server.
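The "export the environment variables and launch the app server" step might look roughly like the following Python sketch. The variable names `APP_NAME` and `APP_PORT` are made up for this example and are not the names CloudFoundry uses, and a tiny child Python process stands in for a real app server.

```python
import os
import subprocess
import sys

# Stand-in for a real app server: a child process that prints its settings.
DEMO_COMMAND = [
    sys.executable,
    "-c",
    "import os; print(os.environ['APP_NAME'], os.environ['APP_PORT'])",
]


def build_environment(app_name, port, base_env=None):
    """Copy the base environment and add the per-app settings.
    APP_NAME and APP_PORT are hypothetical variable names."""
    env = dict(os.environ if base_env is None else base_env)
    env["APP_NAME"] = app_name
    env["APP_PORT"] = str(port)
    return env


def launch(command, app_name, port):
    """Start the app server as a child process with its own environment."""
    return subprocess.Popen(
        command,
        env=build_environment(app_name, port),
        stdout=subprocess.PIPE,
        text=True,
    )
```

Calling `launch(DEMO_COMMAND, "blog", 8001)` starts the child, and `communicate()` on the returned process collects what it printed. A real agent would also have to stage the application code and supervise the process, which this sketch leaves out.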
Finally, the services component is responsible for provisioning and managing access to resources such as MySQL, Redis, RabbitMQ, and others. Once again, the architecture is very similar: a gateway Ruby daemon listens for incoming requests and invokes the required start/stop and add/remove user commands. Adding a new or custom service is as simple as implementing a custom Provisioner class.
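Here is one way the gateway and Provisioner relationship could be sketched in Python. The interface below (method names, request format, the in-memory example service) is entirely hypothetical; the real Provisioner class is Ruby and its API may look quite different.

```python
class Provisioner:
    """Hypothetical interface a custom service would implement."""

    def provision(self, name):
        raise NotImplementedError

    def unprovision(self, name):
        raise NotImplementedError

    def add_user(self, name, user):
        raise NotImplementedError

    def remove_user(self, name, user):
        raise NotImplementedError


class InMemoryProvisioner(Provisioner):
    """Example service that only tracks instances and users in a dict."""

    def __init__(self):
        self.instances = {}  # instance name -> set of users

    def provision(self, name):
        self.instances.setdefault(name, set())

    def unprovision(self, name):
        self.instances.pop(name, None)

    def add_user(self, name, user):
        self.instances[name].add(user)

    def remove_user(self, name, user):
        self.instances[name].discard(user)


class Gateway:
    """Dispatches start/stop and add/remove user requests to a provisioner."""

    def __init__(self):
        self._provisioners = {}

    def register(self, service_type, provisioner):
        self._provisioners[service_type] = provisioner

    def handle(self, request):
        provisioner = self._provisioners[request["service"]]
        command = request["command"]
        if command == "start":
            provisioner.provision(request["name"])
        elif command == "stop":
            provisioner.unprovision(request["name"])
        elif command == "add_user":
            provisioner.add_user(request["name"], request["user"])
        elif command == "remove_user":
            provisioner.remove_user(request["name"], request["user"])
        else:
            raise ValueError(f"unknown command: {command}")
```

The point of the design is that the gateway never needs to know how a particular service works: supporting a new one means writing one more `Provisioner` subclass and registering it.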
What glues all these moving pieces together?
Each of the Ruby daemons above follows a similar pattern: on load, it queries the Cloud Controller and exposes local HTTP endpoints that report its own health and status. But how do these services communicate with each other? Well, through another Ruby-powered service, of course! The NATS publish-subscribe message system is a lightweight topic router (powered by EventMachine) which connects all the pieces. When each daemon first boots, it connects to the NATS message bus, subscribes to the topics it cares about (e.g. provision and heartbeat signals), and begins to publish its own heartbeats and notifications.
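The publish-subscribe idea can be shown with a tiny in-process topic router in Python. This is not the NATS API: real NATS is a networked server with its own client libraries, and the `MessageBus` and `Daemon` classes and the `heartbeat` topic name below are assumptions chosen to mirror the description.

```python
from collections import defaultdict


class MessageBus:
    """Toy topic router: delivers each message to every subscriber of a topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver the message and return how many subscribers received it."""
        # Iterate over a copy so callbacks may subscribe during delivery.
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(message)
        return len(callbacks)


class Daemon:
    """A component that listens for heartbeats and publishes its own."""

    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.seen = []  # every heartbeat observed on the bus
        bus.subscribe("heartbeat", self.seen.append)

    def beat(self):
        self.bus.publish("heartbeat", {"from": self.name})
```

Because publishers only know topic names and never each other, a new daemon can join simply by subscribing, which is what makes adding or removing components cheap.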
This architecture allows CloudFoundry to easily add and remove routers, DEAs, service controllers and so on. Nothing stops you from running all of the above on a single machine, or across a large cluster of servers within your own datacenter.
Distributed Systems with Ruby? Yes!
Building a distributed system with as many moving components as CloudFoundry is no small feat, and it is really interesting to see that the team behind it chose Ruby as its platform. If you look under the hood, you will find Rails, Sinatra, Rack, and a lot of EventMachine code. If you ever wondered whether Ruby is a viable platform for building a non-trivial distributed system, then this is a great case study and a vote of confidence by VMware. Definitely worth a read through the source!
Another interesting read would be "How Cloud Foundry works when a new Application is Deployed".
Next time I will try to explain CloudFoundry dynamics with a use case and go deeper into the technical details of each block.