The first phase of Diego’s development has focused on offloading the staging workflow – the task of converting uploaded app bits to compiled droplets ready for running in Cloud Foundry – from the existing DEAs to Diego. From the outset one of Diego’s mandates has been to make it relatively easy to support multiple platforms (e.g. linux + heroku buildpacks/linux + docker/windows) on one Cloud Foundry installation.
This blog post outlines what has emerged out of this first phase of development, and describes Diego’s architecture with an emphasis on how multi-platform support is envisioned to work.

The Pieces
To wrap your head around Diego you need to wrap your head around the components that make Cloud Foundry tick. Here’s a bullet-point list broken out into existing runtime components and new components introduced by Diego.

Runtime Components
- Cloud Controller receives user inputs and sends NATS staging messages to Diego. The Diego code lives in diego_staging_task.rb and differs from the DEA code path; in particular, we’ve moved the responsibility for computing staging environment variables completely into CC.
- NATS is the message bus used by the existing Cloud Foundry runtime
- Loggregator streams application logs back to the user.
- DEAs run staged droplets in warden containers
Diego Components

- ETCD is the high-availability key-value store used to coordinate data across Diego’s components.
- Stager listens for staging messages over NATS and constructs and places a staging-specific RunOnce (more on these below) into ETCD.
- Executor picks up RunOnces from ETCD and executes them in a garden container (garden is a Go rewrite of warden).
- Linux-Smelter transforms a user’s application bits into a compiled droplet. It does this by running Heroku-style buildpacks against the app bits.
- FileServer provides blobs for downloading (the smelters live here) and proxies droplet uploads to the CC (this allows us to have a simpler upload interface downstream of the CC).
In addition to these components (which, basically, map onto separate processes running on separate VMs) there are a few additional puzzle pieces/concepts that must be understood when discussing Diego:
- RunOnces: The executor is built to be a generic platform-agnostic component that runs arbitrary commands within garden containers. The RunOnce is the bag of data that tells the executor what commands to run. When the executor receives a RunOnce it:
- Creates a garden container and applies the memory and disk limits conveyed by the RunOnce
- Runs the actions described in the RunOnce. There are four actions:
- Download: downloads a blob from a URL and places it in the container
- Upload: grabs a file from the container and uploads it to a URL
- Run: runs a provided script in the context of provided environment variables
- FetchResult: fetches a file from the container and sets its content as the result of the RunOnce
- Marks the RunOnce as succeeded/failed and saves it to ETCD
- RuntimeSchema is a central repository of models (including the RunOnces) and a persistence layer that abstracts away ETCD. Diego components use the runtime-schema to communicate with each other.
- Inigo is an integration test suite that exercises Diego’s intercomponent behavior.
Here’s a detailed outline of how information flows during the Diego staging process:
- The user pushes an app to Cloud Controller via the CF cli: cf push my_app
- CC sends a staging NATS message. This message includes:
- The App Guid
- A target stack (used to support multi-platform CF deployments)
- An ordered list of buildpacks (names + download URLs) to run when compiling the application
- The location of the user’s app bits (a url to a blobstore)
- The environment variables to be applied during the staging process (e.g. VCAP_APPLICATION, VCAP_SERVICES)
- An available Diego stager receives the staging NATS message and constructs a RunOnce with the following actions:
- A Download action to download the user’s app bits
- Download actions for each of the requested buildpacks
- A Download action to download the correct smelter (this is selected by stack)
- A Run command to run the smelter (along with the environment variables received from CC)
- An Upload command to upload the droplet
- A FetchResults action to fetch staging metadata from the smelter
- The Diego stager then puts the RunOnce in ETCD
- A compatible Diego executor (one that matches the desired stack) picks up the RunOnce, spawns a container, and executes its actions. When it does this the executor also streams logs generated by commands run in the container back to the user via Loggregator.
- On success, the executor marks the RunOnce as complete and puts it back in ETCD
- A Diego stager then pulls out the completed RunOnce and notifies CC that staging is complete.
It’s important to understand that the Diego stager’s role is quite small: it simply interprets the incoming NATS message, produces a valid staging RunOnce, and then conveys the result of executing said RunOnce back to the CC. The pool of Diego executors is doing all the heavy lifting (namely: actually downloading the app bits and producing a droplet).

Multi-platform Entry Points
Diego is built to make supporting multiple platforms relatively straightforward. The parameter that determines which platform an app is targeted for is the stack:
- The user selects a target platform by specifying a stack when pushing an app.
- The Diego stager selects a smelter based on the stack and creates a RunOnce with the associated stack.
- The Diego executors are configured with a stack parameter. An executor will only pick up a RunOnce if the stack denoted by the RunOnce matches the executor’s stack.
Given this, to support a new platform (e.g. Windows) one needs the following pieces:
- A smelter designed for the target platform. For linux, Diego has a linux-smelter that runs through Heroku-style buildpacks. For windows, for example, one could construct a smelter that simply validates and repackages a user’s app bits in preparation for running against a .NET stack (in such a case the notion of buildpacks is unnecessary and can be ignored). The smelter’s API is simple: it should be an executable that accepts a certain set of command line arguments and produces a droplet.tgz file.
- A platform-specific plugin for the executor. This is compiled into the executor when it is built and solves certain platform-specific issues (e.g. converting environment variables from an array-of-arrays input format to a platform-specific format).
- A platform-specific Garden backend plugin. Garden performs all containerization via a backend plugin with a well-defined interface. Garden ships with a linux-backend that constitutes a reference implementation. To target other platforms one simply needs to write an API-compatible backend plugin.
In terms of deployment we envision that most components (the stager, fileserver, etcd, cloud controller, nats) will be deployed to Linux VMs. Only the Diego Executor and Garden need to live on the non-linux target platform. Since these components are written in Go, recompiling and targeting a supported non-linux platform should be relatively straightforward.
Today’s guest post is from Iwasaki Yudai, research engineer at NTT Laboratory Software Innovation Center and Du Jun and Zhang Lei from the Cloud Team, Software Engineering Laboratory, Zhejiang University, China (ZJU-SST).
Cloud Foundry is the leading open source PaaS offering, with a fast-growing ecosystem and strong enterprise demand. One of the advantages that attracts many developers is its consistent model for deploying and running applications across multiple clouds such as AWS, vSphere and OpenStack. This approach provides cloud adopters with more choice and flexibility. Today, the NTT Software Innovation Center and the School of Software Technology, Zhejiang University, China (ZJU-SST) are glad to announce that we have successfully developed the BOSH CloudStack CPI, which automates the deployment and management of Cloud Foundry V2 on CloudStack.
CloudStack is an open source IaaS used by a number of service providers to offer public cloud services, and by many companies to provide an on-premises (private) cloud, or as part of a hybrid cloud solution. Many enterprises like NTT and BT want to bring Cloud Foundry to CloudStack in an efficient and flexible way in production environments. Therefore, developing a BOSH CloudStack CPI for deploying and operating Cloud Foundry on CloudStack was a logical choice.

Technical Details
Since NTT and ZJU-SST initially developed the BOSH CloudStack CPI independently, there were many differences between the implementations. Hence, the first step was to merge the code repositories of NTT and ZJU-SST into a new repository. We chose to create a new repository in GitHub cloudfoundry-community in order to encourage more developers to join us.
There are some crucial aspects in the process of refactoring the CPI (check out the wiki if you are interested in digging into the differences between the NTT and ZJU-SST implementations):
- Stemcell Builder
- Create Stemcell
- Basic Zone VS Advanced Zone
- Fog Support
Stemcell Builder

ZJU-SST used a standard Ubuntu 10.04 ISO file to build stemcells for both MicroBOSH and BOSH. NTT used Ubuntu 10.04 with a backported kernel due to some compatibility problems in their environment. Unfortunately, aufs, which is essential for warden in Cloud Foundry V2, is missing from the backported kernel. So, after brainstorming together, we decided to try standard Ubuntu 12.04 as the base OS of the stemcells for both MicroBOSH and BOSH. We found that, with a minor patch to cf-release, Ubuntu 12.04 is compatible with BOSH and Cloud Foundry. The patch only modifies the deployment process of Cloud Foundry, so it does not impact Cloud Foundry itself. In fact, this issue has been solved since cf-release v160 by updating nginx to 1.4.5.

Create Stemcell
For the API call create_stemcell in the CPI, ZJU-SST used an extra web server as a staging area to store the stemcell temporarily while uploading, following the OpenStack style, while NTT skipped the extra web server and took the same volume-based route as the AWS pattern.

Approach A (ZJU-SST):
- Similar to the OpenStack CPI
- Requires no inception server
- Requires a web server to save qcow2 files and expose them over HTTP for CloudStack
- [Client] -> (SSH upload) -> [WebServer] -> (HTTP) -> [CloudStack]
- The API call can’t receive image files directly; it downloads image files from given URLs. The web server is necessary because CloudStack cannot receive image data directly from the inception server.

Approach B (NTT):
- Same process as that of the AWS CPI
- Requires an inception server to create a bootstrap Micro BOSH stemcell
- Copies (using dd) a raw stemcell image to a volume attached to the inception server
- Creates a template from a snapshot of the volume
Both implementations have pros and cons.

With approach A, users have to set up a web server somewhere, but they can bootstrap MicroBOSH from outside of CloudStack. With approach B, an inception server is always required, but users don’t have to set up a web server.

After a heated discussion in the open source community, we adopted approach B as the default solution for uploading stemcells to the BOSH director, because approach A is less user-friendly. Meanwhile, we created a new branch for experimenting with approach A.

Basic Zone VS Advanced Zone
Before the collaboration between NTT and ZJU-SST, ZJU-SST worked in the CloudStack advanced zone while NTT developed in the CloudStack basic zone. ZJU-SST preferred the advanced zone because, according to our tests, the basic zone was unable to support Cloud Foundry without applying some “tricks” during the network configuration process. On the other hand, NTT did deploy Cloud Foundry in the basic zone by using some “tricks”, for instance, deploying a separate BOSH DNS node. However, we all agreed that it’s inconvenient to work in the basic zone, especially when we need to redeploy components such as the router. Finally, we reached an agreement to support both network types and add an option to switch between them.

Fog Support
Both NTT and ZJU-SST used the fog gem to send API requests to the CloudStack engine. However, the APIs of the official fog gem are not rich enough to support the BOSH CloudStack CPI. ZJU-SST built a local fog gem that added the missing APIs, while NTT patched fog inside the CPI project as a temporary workaround. We have already sent a PR to fog and are waiting for it to be merged.
When the refactoring work was finished, we started a month of heavy testing. Once a bug was found, the bug finder would open an issue describing it in detail, and all of the developers would receive the message about the bug via the mailing list. Any commit to the code repository was submitted in the form of a pull request, and the repository owners would review the set of changes, discuss potential modifications, and even push follow-up commits if necessary. A PR could be merged only if it passed CI and BAT. Simply put, we followed the workflow of other Cloud Foundry repositories in GitHub such as cf-release and bosh. In this way, we controlled the history of the new repository and prevented potentially dangerous code from being added.

Current Status
- Finished development and test work.
- Support both basic zone and advanced zone for CloudStack.
- Tested on CloudStack 4.0.0 and 4.2.0 with the KVM hypervisor.
- Successfully deployed Cloud Foundry V2 and ran applications on it.
- Support both Ubuntu 10.04 and Ubuntu 12.04 stemcells.
- Ubuntu 14.04 stemcell support is on the TODO list.
- Open sourced and maintained in GitHub.
To deploy Cloud Foundry on CloudStack with the CPI, the overall steps are:

- Create Inception Server
- Bootstrap Micro BOSH
- Deploy Cloud Foundry
- Setup Full BOSH (Optional)
You need a VM in the CloudStack domain where you will install a BOSH instance using this CPI. This VM is the so-called “inception” server. You will install the BOSH CLI and BOSH Deployer gems on this server and run all operations from it.

Why do I need an inception server?
The CloudStack CPI creates stemcells, which are VM templates, by copying pre-composed disk images to data volumes that are automatically attached by the BOSH Deployer. This procedure is the same as that of the AWS CPI and requires that the VM where the BOSH Deployer runs is in the same domain where you want to deploy your BOSH instance.

Create Security Groups or Firewall Rules
You also need to create one or more security groups or firewall rules for the VMs created by your BOSH instance. We recommend that you create a security group or firewall rule which opens all TCP and UDP ports for testing. In a production environment, we strongly suggest setting tighter security groups or firewall rules.

Boot an Ubuntu Server
We recommend Ubuntu 12.04 64-bit or later for your inception server. For those who use Ubuntu 12.10 or later, we recommend selecting Ubuntu 10.04 or later as the OS type when creating an instance from an ISO file or registering VM templates.

More Steps
You can find a more detailed step-by-step guide to deploying Cloud Foundry on CloudStack here. In fact, the remaining steps are very straightforward and similar to other deployment methods such as AWS, vSphere and OpenStack.

Why NTT and ZJU-SST?
This work is powered by NTT and ZJU-SST, who have been working together since last November. Thanks go to NTT team member Iwasaki Yudai and ZJU-SST team members Du Jun and Zhang Lei, the main contributors to this project, who devoted much of their energy to fixing issues and rising to the challenges of the CPI work. In addition, we would like to thank the Pivotal Cloud team and Nic Williams for their selfless help.
ZJU-SST is the biggest software engineering team of Zhejiang University as well as the leading Cloud Computing research institute in China. ZJU-SST started R&D work on Cloud Foundry and CloudStack about 3 years ago, and more recently launched a comprehensive PaaS platform based on Cloud Foundry V1 serving City Management Department of Hangzhou China. ZJU-SST released BOSH CloudStack CPI for Cloud Foundry V1 last May, and introduced the CPI work at PlatformCF 2013 in Santa Clara, California last September.
NTT, the world’s leading telecom company, has been active in fostering the Cloud Foundry developer and user community in Japan. NTT has been contributing to Cloud Foundry for the last two years and sharing its projects, such as the Memcached Service, Nise BOSH (a lightweight BOSH emulator) and the BOSH Autoscaler, with the community. NTT Communications, a subsidiary of the NTT Group, has been running Cloudn, a public commercial PaaS built on Cloud Foundry, since March 2013, and a video about their work building a commercial service with Cloud Foundry is available at the Cloud Foundry Summit 2013 site.
The decision to work together was motivated in part because ZJU-SST intended to upgrade their previously released CPI to support Cloud Foundry V2, and NTT wanted to improve their independently developed BOSH CloudStack CPI project so that it would be compatible with the CloudStack advanced zone.
Xiaohu Yang, Executive Vice Dean of the School of Software Technology, Zhejiang University, thought highly of this international collaboration: “It will be a win-win cooperation. Open source projects such as Cloud Foundry can serve as an international platform for education and research.”

Facts and Lessons Learned
This was a successful international collaboration that benefited both NTT and ZJU-SST. With the help of NTT, ZJU-SST was able to release an upgraded CPI that supports Cloud Foundry V2 and CloudStack 4.0/4.2. NTT appreciates ZJU-SST’s huge effort in building a CPI that runs on various CloudStack environments. The most precious asset we got from this cooperation may be the experience of how to carry out international cooperation effectively and how to reach out to the community when help is needed.

Join Us
Questions about the BOSH CloudStack CPI can be directed to the BOSH Google Group. Of course, please open an issue or send a pull request if you want to improve this project.
The following is a guest post by Kelly Lanspa from the CloudForge product team.
CloudForge from CollabNet is a collaborative software development and application lifecycle management platform. It includes source code management, Git/Subversion hosting and bug tracking on one platform, with backup services, additional storage and secure role-based user access to manage distributed teams. CloudForge is integrated with Pivotal’s hosted Cloud Foundry service, enabling users to easily build-test-deploy and scale apps.
From the marketplace console (or the cf command line utility), select CloudForge then choose one of the packages available. When you select a package by clicking on “buy this plan”, CloudForge will require you to select a unique name for your instance which will be used to create a new organization name in CloudForge. Each “space” in Cloud Foundry will map to a unique CloudForge organization, and users can create different CloudForge accounts for different Cloud Foundry Applications or Spaces, but this isn’t strictly required. You can always deploy your code from a shared CloudForge repository to different Spaces. As CloudForge is a shared service, it does not need to bind to a specific Cloud Foundry application.
Create a new instance called “PivotalDemo” in your default Space (typically “development”) but do not bind it to a specific application. Note: the account name in CloudForge needs to be unique.
Add the CloudForge service in your Space. When you have successfully created a new CloudForge instance it will be listed in your Space Services. Click on “Manage” to finish configuring your CloudForge account.
The instance name you chose for the CloudForge service will be used for your new CloudForge account. By default, any user clicking on “Manage” within Cloud Foundry will be automatically logged into CloudForge via OAuth as the Cloud Foundry Space user. However, you will need to choose a username and a password to access your SVN and Git repositories. These SCM services do not support OAuth and are accessed directly from your SCM client, IDE or your command line. Changing your first and last name is optional.
Regardless of how many developers you have it is always useful to maintain version control, even if you are the only developer. Using a Cloud-based SCM provider like CloudForge means that you always have a full copy of your source code backed up and ready for retrieval from any computer. It also makes it very easy to add collaborators on your projects. In order to add collaborators simply click on Admin->Manage Users.
You can invite as many collaborators as you like (depending upon your plan) to join your CloudForge account regardless of whether they are users on Cloud Foundry (the admin CloudForge account is created when the CloudForge service is selected from the run.pivotal.io marketplace, but subsequent team members do not have to be Cloud Foundry users). Each user will create a new username on CloudForge so you can track their commits, issue updates, and comments separately as you develop your application.
Now that you have invited others to collaborate on your projects it’s time to create your first project. Click on Projects->New Project.
Create a new CloudForge Project to manage your development. After you name your project you can add a SVN or Git repository (or both). If you want to include Bug/Issue Tracking, Agile Planning, Wiki, or Discussions you can also choose to add TeamForge to your project. All of these services will be provisioned within your project and the direct links for each provided on your project landing page.
Once you have coded your application and merged all of your branches into a stable trunk you can easily deploy the app to Cloud Foundry using the cf command line utility or directly from the Spring Tool Suite. You are able to deploy to any Cloud Foundry target, Space, or Org; your CloudForge project is not bound to a specific instance of Cloud Foundry. Feel free to create as many Projects in CloudForge as you need. Your account can be used to manage multiple applications and projects across several Spaces within Cloud Foundry. If you are using the Spring Tool Suite IDE, we recommend installing the CollabNet Desktop as well. This allows you to manage your CloudForge account, including all repositories and all tasks, from within the same environment that manages your code.