NoSQL, Heroku, and You
“NoSQL” is a label which encompasses a wave of innovation now happening in the database space. The NoSQL movement has sparked a whirlwind of discussion, debate, and excitement in the technical community. Why is NoSQL generating so much buzz? What does it mean for you, the application developer? And what place does NoSQL have for apps running on the Heroku platform?
SQL (the language) and SQL RDBMS implementations (MySQL, PostgreSQL, Oracle, etc.) have been the one-size-fits-all solution for data persistence and retrieval for decades. The rise of the web and the LAMP stack cemented the role of the relational database. But in 2010 we see a variety of application needs which are not satisfied by MySQL and friends. New problems demand new tools. High availability, horizontal scaling, replication, schemaless design, and map/reduce capability are some of the areas being explored by a new crop of datastores, all of which are loosely categorized as NoSQL.

To understand why NoSQL is important to you as an app developer, let’s consider the use cases for some of these features:
- Frequently-written, rarely read statistical data (for example, a web hit counter) should use an in-memory key/value store like Redis, or an update-in-place document store like MongoDB.
- Big Data (like weather stats or business analytics) will work best in a freeform, distributed db system like Hadoop.
- Binary assets (such as MP3s and PDFs) find a good home in a datastore that can serve directly to the user’s browser, like Amazon S3.
- Transient data (like web sessions, locks, or short-term stats) should be kept in a transient datastore like Memcached. (Traditionally we haven’t grouped Memcached into the database family, but NoSQL has broadened our thinking on this subject.)
- If you need to be able to replicate your data set to multiple locations (such as syncing a music database between a web app and a mobile device), you’ll want the replication features of CouchDB.
- High availability apps, where minimizing downtime is critical, will find great utility in the automatically clustered, redundant setup of datastores like Cassandra and Riak.
Despite all the use cases described above, there will always be a place for the highly normalized, transactional, ad-hoc-query capabilities of SQL databases. We’re adding new tools to our toolbox, not removing old ones.
Polyglot Persistence – or, How Do You Pick a NoSQL Datastore?
Part of the NoSQL message is: pick the right tool for the job. The use cases outlined above should help guide your choice of datastore, as will many resources around the web like this diagram, these slides, or this video. And, like any technology, you should pick something that feels right for you and your team.
But most apps encompass multiple use cases. How do you pick one database to handle all the types of data your app needs to deal with? NoSQL answers: you don’t have to pick just one. This concept is called polyglot persistence (more details).

For example, we can imagine a web app which uses four different datastores:
- MySQL for low-volume, high-value data like user profiles and billing information
- MongoDB for high-volume, low-value data like hit counts and logs
- Amazon S3 for user-uploaded assets like photos and documents
- Memcached for temporary counters and rendered HTML
Polyglot persistence also makes it easy to dip your toes into NoSQL. Don’t migrate your existing production data – instead, use one of these new datastores as a supplementary tool. (Example: put non-critical session data or stats into Redis or Tokyo Tyrant.) And if you’re starting on a new app, you should give serious consideration to NoSQL options for your primary datastore, in addition to the venerable SQL choices.
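To make the four-datastore example a little more concrete, here is a minimal Ruby sketch of the idea. The `PolyglotRouter` class and its routing table are hypothetical, and plain Hashes stand in for real MySQL, MongoDB, S3, and Memcached clients. Treat it as an illustration of the pattern rather than a recommended design.

```ruby
# Hypothetical sketch: route each kind of data to the datastore suited to it.
# Plain Hashes stand in for real MySQL, MongoDB, S3, and Memcached clients.
class PolyglotRouter
  ROUTES = {
    :user_profile => :mysql,      # low-volume, high-value
    :billing      => :mysql,
    :hit_count    => :mongodb,    # high-volume, low-value
    :log          => :mongodb,
    :upload       => :s3,         # user-uploaded assets
    :html_cache   => :memcached   # temporary, safe to lose
  }

  def initialize
    # One in-memory Hash per datastore name, created on first use.
    @stores = Hash.new { |stores, name| stores[name] = {} }
  end

  # Look up which datastore owns a given kind of data.
  def store_for(kind)
    ROUTES.fetch(kind) { raise ArgumentError, "no datastore mapped for #{kind.inspect}" }
  end

  def write(kind, key, value)
    @stores[store_for(kind)][key] = value
  end

  def read(kind, key)
    @stores[store_for(kind)][key]
  end
end

router = PolyglotRouter.new
router.write(:user_profile, "user:1", "name" => "Ada")
router.write(:html_cache, "page:/home", "<h1>Home</h1>")
```

Keeping the routing table in one place means the rest of the app never needs to know which datastore holds what, which also makes it easier to try a new datastore for one kind of data at a time.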
NoSQL and the Cloud
The SQL databases we’re using today were designed over a decade ago. They were written with the constraints of 1990s hardware in mind: storage is cheap, memory and CPU are expensive. Today’s machines have different parameters. Memory and CPU are cheap, and can easily be scaled up on demand without capital expenditure using a service like Amazon EC2. But EC2, like all cloud technology, is based on virtualization. Virtualization’s weakness is I/O performance. So, the constraints of disk and memory have swapped: disk is the weak link in the chain, memory and CPU (spread out horizontally) are the strong links. It’s not surprising, then, that the datastores built a decade ago aren’t a good fit for the new parameters of cloud computing.
NoSQL databases tend to use memory over disk as the first-class write location: Redis and Memcached are in-memory only, and even systems like Cassandra use memtables for writes with asynchronous flushing to disk, preventing inconsistent I/O performance from creating write speed bottlenecks. And since NoSQL datastores typically emphasize horizontal scalability via partitioning, this puts them in an excellent position to take advantage of the elastic provisioning capability of the cloud. NoSQL and cloud are a natural fit.
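The memtable idea can be sketched in a few lines of Ruby. This is a simplified illustration, not how Cassandra is implemented: the `Memtable` class, the tab-separated file format, and the threshold are all made up, and the flush runs inline once the threshold is reached, where a real datastore would flush in the background.

```ruby
require "tempfile"

# Minimal memtable sketch: writes land in memory and reach disk in batches.
class Memtable
  attr_reader :flush_count

  def initialize(log_path, flush_threshold = 3)
    @log_path = log_path
    @flush_threshold = flush_threshold
    @table = {}
    @flush_count = 0
  end

  # A write only touches the in-memory Hash until the threshold is hit.
  def put(key, value)
    @table[key] = value
    flush if @table.size >= @flush_threshold
  end

  # Reads check memory first, then fall back to the flushed file.
  def get(key)
    return @table[key] if @table.key?(key)
    return nil unless File.exist?(@log_path)
    # The last matching line on disk holds the newest flushed value.
    line = File.readlines(@log_path).reverse.find { |l| l.start_with?("#{key}\t") }
    line && line.chomp.split("\t", 2).last
  end

  # Append every buffered entry to disk in one sequential write, then reset.
  def flush
    File.open(@log_path, "a") do |f|
      @table.each { |k, v| f.puts("#{k}\t#{v}") }
    end
    @table.clear
    @flush_count += 1
  end
end

file = Tempfile.new("memtable")
table = Memtable.new(file.path, 3)
table.put("a", "1")
table.put("b", "2")   # still in memory only
table.put("c", "3")   # third write triggers a flush
```

The point of the pattern is that slow or uneven disk I/O is paid once per batch instead of once per write.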
Database-as-a-Service is the Future
Infrastructure-as-a-service offerings like Amazon EC2 and Rackspace Cloud are what most of us think of as “cloud.” One of the effects of these large public clouds is that apps now have extremely low latency between themselves and other apps or service providers – 1ms or less compared to 50ms+ on the open internet. This latency difference opens up vast new possibilities for what a 3rd party service provider can offer.

Database-as-a-service is one of the coming decade’s most promising business models. Already services like MongoHQ (MongoDB), Cloudant (CouchDB), and Amazon RDS (MySQL) are offering fully hosted and managed databases to apps running in EC2. Like IaaS, DaaS promises to remove up-front capex costs and bring efficiencies of scale and specialization to the administration and operation of databases. Although these services are still very young, the potential benefit of being able to outsource all the headaches of running and scaling your app’s database is enormous.
DaaS also goes hand-in-glove with polyglot persistence. Thanks to database services, you won’t need to learn how to sysadmin/DBA for every datastore you use – you can instead outsource that job to a service provider specializing in each database. One of the reasons databases have historically had a tribal affiliation (someone is a “MySQL guy” or a “Postgres gal” or an “Oracle guy,” but rarely two or all three) is because of the time investment in learning how to admin whatever database you use. DaaS removes that barrier and opens up even greater possibility for polyglot persistence in production use.
Heroku’s Commitment to Database Innovation
Heroku already supports two of the most popular NoSQL databases, MongoDB and CouchDB, as add-ons: MongoHQ and Cloudant. We also support the transient key-value datastore, Memcache, via Northscale’s service.
Looking forward to the future: we have more NoSQL add-ons coming down the pipeline, such as Redis To Go. And we’ll be continuing to work with technology leaders in the NoSQL community to help them bring their database services to market. Our goal is to provide access to the cornucopia of breakthrough new database technologies from the NoSQL world, available from the Heroku add-ons catalog at the click of a button. We hope to make Heroku a great place to play with these new technologies, while still curating a list of options that are fully baked and ready for use in real production applications.
Of course, we can’t forget that Heroku currently runs the largest and most mature SQL-database-as-a-service in the world: our PostgreSQL service, packaged with every Heroku app. We’re continuing to expand and improve this service (including support for great new features in Postgres 9), because we know SQL and the apps that depend on it are here to stay. Reinforcing our commitment to polyglot persistence, we’ll be turning our Postgres service into a separately packaged add-on – still installed by default into each app, but possible to opt out, or combine with other datastore add-ons. We also hope to see other providers in the SQL-as-a-service space besides Heroku’s Postgres service and Amazon RDS.
It’s an exciting time for data, and our team here at Heroku is thrilled to take part in the continuing growth of the NoSQL movement.
Teambox on Heroku
More and more developers are using Heroku as a SaaS deployment platform. By creating their applications on top of Heroku, they can leverage our architecture and security model to provide SaaS to their customers easily. Today we want to highlight a new favorite, Teambox.
Teambox is an open source, Twitter-like collaboration tool for companies, organizations, and teams. Teams around the world use it to collaborate, keep in touch, track tasks, and much more.
The Teambox team has made it easy to install on Heroku as well. This screencast walks you through the instructions from start to finish in just 5 minutes. Give it a try yourself, and try out their collaboration tool.
Default to Bamboo
Deployment stacks have been a huge success. For many developers, `heroku create --stack bamboo` has become the default whenever creating new apps. With the latest versions of Rails 2 and Rails 3 both requiring the Bamboo stack, we’re excited to make Bamboo the new default.
Effective immediately, all newly created apps will default to the bamboo stack with REE 1.8.7. You can still use the old aspen stack if you’d like by simply specifying `heroku create --stack aspen`. Existing apps stay on the stack they are on unless you explicitly migrate them.
A key feature of bamboo is to eliminate pre-installed gems. This provides app developers with considerably more flexibility in managing their apps. You can easily use any version of any gem by simply including it in your .gems file or Bundler Gemfile. You’ll need to remember to include all gems you are using. If you’re using Gemfile, this is automatically done for you. If you’re using .gems, please make sure to include all the gems you use, INCLUDING rails!
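Since nothing is pre-installed on bamboo, a quick sanity check of your manifest can save a failed deploy. The Ruby sketch below assumes a .gems file lists one gem per line, with the gem name first and optional flags such as `--version` after it. The manifest contents and the list of required gems are made-up examples.

```ruby
# Sketch: check that a .gems manifest names every gem the app needs.
# The manifest text and the required list below are made-up examples.
manifest = <<-GEMS
rails --version 2.3.8
pg
haml --version 3.0.12
GEMS

# The gem name is the first word on each non-blank line.
listed = manifest.lines.map { |line| line.split.first }.compact

# Gems the app is known to load (hypothetical list for illustration).
required = %w[rails pg haml will_paginate]
missing = required - listed

puts "Missing from .gems: #{missing.join(', ')}" unless missing.empty?
```

In a real app you would read the manifest with `File.read(".gems")` instead of the inline text.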
Rails 3 Beta 4 on Heroku
Heroku now supports Rails 3 beta 4 with Ruby 1.8.7. Make sure to push up to bamboo, and you should be all set!
As Rails 3 matures and gets closer to production a number of pieces continue to change. The beta 4 update introduced two significant changes to be aware of:
- Require Ruby 1.8.7 > p249 or Ruby 1.9.2.
- Require Bundler 0.9.26.
Heroku has updated to Ruby REE 1.8.7-2010.02, which incorporates the necessary patches for Rails 3. We will add support for 1.9.2 once its official release is out. In the meantime, developers interested in using Rails 3 on Heroku must use Ruby 1.8.7.
We have also updated to the latest stable release of Bundler: 0.9.26. We will continue to track the evolution of Bundler through its 1.0 release. Because thousands of applications on Heroku have already adopted Bundler, each update must be carefully tested to ensure that we don’t disrupt ongoing application operation.
Apigee Add-on for Twitter Public Beta
If you develop apps for Twitter, this is the add-on for you. The Apigee for Twitter Add-on allows developers to easily access Twitter REST APIs. Through a direct relationship with Twitter, Apigee can offer users of the Add-on vastly increased rate limits automatically. The goal is to ensure that no valid application hits rate limits at all.
If you’re developing applications using the Twitter REST API, check out the add-on today. Using it is often as simple as changing your app to use the Apigee-provided config var endpoint instead of “api.twitter.com”. Full docs are available here, and as always please let us know how it works for you.
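As a rough sketch of that endpoint swap in Ruby: read the host from the environment and fall back to api.twitter.com when the config var is absent, such as on a development machine. The variable name `APIGEE_TWITTER_API_ENDPOINT`, the helper methods, and the example host are assumptions for illustration, so check the add-on docs for the actual name.

```ruby
require "uri"

# Assumption: the add-on exposes its endpoint host in a config var.
# The variable name below is illustrative; check the add-on docs for the real one.
DEFAULT_TWITTER_HOST = "api.twitter.com"

# Pick the Apigee host when the config var is set, otherwise talk to Twitter directly.
def twitter_api_host(env = ENV)
  env.fetch("APIGEE_TWITTER_API_ENDPOINT", DEFAULT_TWITTER_HOST)
end

# Build a full URL for an API path (the path must start with "/").
def twitter_url(path, env = ENV)
  URI::HTTP.build(:host => twitter_api_host(env), :path => path).to_s
end

# Example with a made-up host standing in for the add-on's endpoint.
fake_env = { "APIGEE_TWITTER_API_ENDPOINT" => "twitter-api.example.com" }
proxied = twitter_url("/1/statuses/user_timeline.json", fake_env)
direct  = twitter_url("/1/statuses/user_timeline.json", {})
```

Centralizing the host lookup in one method means the rest of your Twitter code does not change when the add-on is added or removed.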
Rails 2.3.6+ Dependency Issues
This past Sunday, Rails 2.3.6 was released, and quickly followed by 2.3.7 and 2.3.8. One of the major changes in these new versions is to require a newer version of Rack, specifically 1.1.0, that is incompatible with Rails 2.3.5 and older. Due to the fairly complex ways in which Rubygems resolves dependencies, this can prevent your app from starting – in your local environment as well as when deployed on Heroku. If you’ve been affected by this issue, you would see this error message:
Missing the Rails gem. Please `gem install -v= x.x.x`,
update your RAILS_GEM_VERSION setting in config/environment.rb
for the Rails version you do have installed, or
comment out RAILS_GEM_VERSION to use the latest
version installed.
We have written up a new docs page with information detailing the issue, how to reproduce it on your local machine, and how to resolve it.
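The underlying conflict can be illustrated with RubyGems’ own version classes. The sketch below assumes Rails 2.3.5 pins Rack to `~> 1.0.0` and Rails 2.3.6+ pins it to `~> 1.1.0`. Those requirement strings are a recollection rather than something stated above, so verify them against the gemspecs you have installed.

```ruby
require "rubygems"

# Assumption: Rails 2.3.5 pins Rack to "~> 1.0.0" and Rails 2.3.6+ to "~> 1.1.0".
old_rails_rack = Gem::Requirement.new("~> 1.0.0")
new_rails_rack = Gem::Requirement.new("~> 1.1.0")

rack_1_0_1 = Gem::Version.new("1.0.1")
rack_1_1_0 = Gem::Version.new("1.1.0")

# Each Rails line accepts exactly one of the two Rack versions.
old_accepts_old = old_rails_rack.satisfied_by?(rack_1_0_1)
old_accepts_new = old_rails_rack.satisfied_by?(rack_1_1_0)
new_accepts_new = new_rails_rack.satisfied_by?(rack_1_1_0)
new_accepts_old = new_rails_rack.satisfied_by?(rack_1_0_1)

puts "Rails 2.3.5 with Rack 1.0.1: #{old_accepts_old}"
puts "Rails 2.3.5 with Rack 1.1.0: #{old_accepts_new}"
puts "Rails 2.3.8 with Rack 1.1.0: #{new_accepts_new}"
puts "Rails 2.3.8 with Rack 1.0.1: #{new_accepts_old}"
```

Under these assumptions no single Rack version satisfies both Rails lines, so whichever Rack gets activated first determines which Rails versions can still load.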
Ignition!
We can’t be happier to announce that we recently closed a $10 million Series B round of investment led by Ignition Partners. We’re planning to use the money to further expand our platform, turbo-charge partner programs for add-on providers and consultancies, and accelerate our go-to-market programs.
The growth and excitement that we’ve seen at Heroku, particularly in 2010, has been incredibly energizing for all of us. We talk a lot about numbers – the figure of 60,000-plus apps running on our platform has been quoted a lot recently – but even more motivating are the creative forces that the platform is unleashing.
Developers and companies are building and running some amazing apps with our platform (check out the United Nations app ProtectedPlanet.net for one of my recent favorites). Ruby on Rails consultants are growing their businesses and creating happier customers. Technology vendors are building some very cool extensions to our platform as part of the Add-on system.
In other words, creating an open, efficient, and reliable platform that upends the status quo is not just about technology: it’s about resources and support for developers, making it easy for partners to use the platform for their own customers, and enabling technology partners to build businesses by extending the platform itself.
It’s also about having a great team. We can’t be prouder of the team we’re building at Heroku (if you might be interested in joining, please check out jobs.heroku.com), and the team just got stronger: John Connors of Ignition has joined our board. John is a super-smart, super-seasoned executive with a wealth of experience (including stints as CIO and CFO at Microsoft) that we’ve already begun to draw on as we plan what’s next for Heroku.
We’d like to take this moment to thank all of you for your support. We’ll take full advantage of the additional resources and expertise joining us today to serve you in the future.
Official news release here.
MongoHQ Add-on Public Beta
Let’s cut straight to the chase: MongoHQ is launching their add-on to all Heroku users as a public beta.
The details
Over the last six months we have seen persistent demand for MongoDB support on Heroku, so we are incredibly excited that MongoHQ is releasing their highly anticipated add-on into public beta today. The add-on interfaces seamlessly with their successful hosted service, and allows developers to use MongoDB as a first-class-citizen data store in any of their Heroku apps. Using it is just as easy as you’ve come to expect from Heroku: simply add the add-on, and you’re good to go!
The first available plan is free and includes one database up to 16MB. Soon, you will have your pick of the full range of MongoHQ plans.
Getting Started
Add the add-on to your app:
$ heroku addons:add mongohq:free
This will create a fresh new MongoHQ database for you and set the proper environment variable in your app. Follow the detailed instructions in our MongoHQ docs page for specifics on using MongoDB in your app.
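Here is a small Ruby sketch of picking that environment variable apart. It assumes the add-on sets a `mongodb://` connection URL in a variable named `MONGOHQ_URL`. The sample URL and the `mongo_settings` helper are made up, and no driver calls are shown, since the docs page covers those.

```ruby
require "uri"

# Assumption: the add-on sets a connection URL in a config var named MONGOHQ_URL.
# The sample URL is made up; no driver calls are shown.
def mongo_settings(url)
  uri = URI.parse(url)
  {
    :host     => uri.host,
    :port     => uri.port || 27017,             # MongoDB's default port
    :database => uri.path.sub(%r{\A/}, ""),     # strip the leading slash
    :user     => uri.user,
    :password => uri.password
  }
end

# Fall back to a local-style sample when the variable is not set.
sample = "mongodb://app123:secret@mongo.example.com:27052/app123"
settings = mongo_settings(ENV["MONGOHQ_URL"] || sample)
```

The resulting Hash has the pieces most MongoDB client libraries ask for when opening a connection.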
Transferring data in/out
Since MongoDB isn’t a SQL database, taps won’t work. Luckily Pedro has created a Heroku client plugin that offers you push/pull functionality to a MongoDB. If you have MongoDB running locally on your development machine, with this plugin you can easily push and pull data just like with taps.
If you have any questions using the MongoHQ add-on, or any Heroku add-on, our support staff is available.
Node.js Feedback
The response to yesterday’s Node.js announcement continues to be absolutely amazing.
First and foremost, we’re thrilled to see the community share our excitement about Node.js and its potential on the Heroku platform.
We do, however, also want to be mindful that we’re still in the experimental phase with this technology here. For this reason, we’re going to proceed carefully and invite testers in small batches.
So, if you don’t hear from us right away, despair not. It’ll likely take us a few weeks to get through the current list, and if you’re reading this for the first time, please don’t hesitate to register your interest at nodejs@heroku.com.
Experimental Node.js Support
Today we’re offering experimental support for node.js to a limited set of users. We know there is a lot of demand, and will work with as many users as we can. See below for details.
A natural complement to Ruby
Yesterday we posted about how we think about the platform and make roadmap decisions. We are always looking for the next set of use cases to support, and lately we’ve been thinking about realtime apps and event-driven architectures.
Today, most Ruby apps are synchronous. By default, all I/O blocks. If you’re uploading a file, polling a service, or waiting on data, your app will be blocked. While it’s possible to fix this by meticulously eliminating all blocking I/O from your code and dependencies, and using a library such as EventMachine, it’s tedious and as Adam points out: “Libraries like eventmachine will never be truly intuitive to use, because event-driven I/O is enough of a fundamental shift that it requires deep language integration. JavaScript, it turns out, is a fundamentally event-driven language because of its origins in the browser”
Node.js is evented I/O for JavaScript, built on top of the blazingly fast V8 engine. It makes handling event-driven I/O incredibly simple, and aligns perfectly with our maniacal focus on simplicity and developer productivity. The Ruby community has quickly adopted node, and with great reason. Complementing existing apps with node.js for components that require real-time event handling or massive concurrency is both easy and elegant – in part thanks to the availability of frameworks such as express.
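To show what “evented I/O” means in practice, here is a tiny event loop in plain Ruby built on `IO.select`. It is a teaching sketch, not how Node.js or EventMachine is implemented. The `TinyReactor` class is hypothetical, and pipes stand in for network sockets.

```ruby
# A tiny event loop: callbacks fire whenever a registered IO has data to read.
class TinyReactor
  def initialize
    @handlers = {}
  end

  # Register a callback instead of blocking on io.read.
  def on_readable(io, &callback)
    @handlers[io] = callback
  end

  # Wait on all registered IOs at once and dispatch whichever are ready.
  def run
    until @handlers.empty?
      ready, = IO.select(@handlers.keys)
      ready.each do |io|
        data = io.read_nonblock(4096, exception: false)
        if data.nil?                    # EOF: the other end closed
          @handlers.delete(io)
          io.close
        elsif data != :wait_readable    # got data: hand it to the callback
          @handlers[io].call(data)
        end
      end
    end
  end
end

received = []
reactor = TinyReactor.new
r1, w1 = IO.pipe
r2, w2 = IO.pipe
reactor.on_readable(r1) { |data| received << [:first, data] }
reactor.on_readable(r2) { |data| received << [:second, data] }

# Data arrives in a different order than the handlers were registered.
w2.write("world"); w2.close
w1.write("hello"); w1.close
reactor.run
```

A single thread serves both pipes without ever blocking on one of them. Every read in the program has to go through the loop for this to work, which is the integration burden the quote above describes.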
Simple to use
Node fits well into our existing architecture. It’s just another runtime, selectable as part of our stacks.

Supporting node opens many questions on the platform: how are we going to charge, what version will we support, how can I integrate it with my current apps? We’re releasing this in experimental state now to collaborate with the Node.js and Ruby communities on this. Over the next months we will work together to provide answers to all these questions and more.
How to Participate
If you want to participate and use node on an experimental basis, drop an email to nodejs@heroku.com with your email address and what you want to use Node for, and we’ll work to include you in the program.
Update & Roadmap
It’s been a great first quarter for us, and it’s time for a brief update on where we are and where we’re headed.
Growth
Heroku’s growth has continued to be huge. 1,500 new apps were deployed to Heroku last week alone, and that number increases every week. Next week we will cross the 60,000 application mark.
As you can imagine, traffic is growing even more quickly, serving billions of requests per month. In fact, traffic has grown by 4x over the last four months.

Many are finding great value in the platform and paying for features and scale. Our customer count and revenue have similar growth curves.
Roadmap
Where is Heroku’s platform going next? How can you plan for our next releases or influence our direction? When we launch new features, what’s the best way to think about how they fit into our overall strategy?
Here’s how we think about our roadmap and decide on big new areas to work on: it’s all about use cases.
We started with the simplest use case: making it drop-dead easy for developers to deploy applications, and have grown into more complex ones (like multi-tier complex composite apps). We continue to try and expand the number of use cases that we provide a complete solution for.
Here’s a brief look at our historical roadmap from the perspective of expanding use cases:
- Provisionless Deployment. Instant deployment with the now famous “git push heroku master” is at the heart of Heroku, enabling the basic use case: a frictionless application platform that just works.
- Caching and Instant Scaling. We introduced high-speed HTTP caching built into every app, and added dynos which can be scaled up and down instantly, enabling large scale and variable scale apps.
- Asynchronous Patterns. We added background processing with Delayed::Job, followed shortly by workers which can be scaled up and down just like dynos, enabling many use cases around modern asynchronous architecture.
- Platform Extensibility. We launched the Add-ons system, a way to extend Heroku apps with core functionality like full-text search or memcache, and to consume external services like New Relic, Zerigo, or Sendgrid. Use cases here are literally endless. Add-ons allow the growing ecosystem of startups and established companies building cloud services to add new features to our platform – many more than we could do on our own.
- Flexible Runtime. We recently introduced deployment stacks, which enable choice between multiple runtime environments.
What’s Next?
Over the next days, weeks, and months, we will release new features that continue to expand the number of use cases supported by Heroku, whether for startups or large enterprises.
You can be sure that each time we build a feature we will be maniacally focused on simplicity and developer productivity, and will always try to maintain the cohesiveness and quality of the platform.
From our core focus on developer productivity and frictionless deployment, we’ll be expanding the footprint to include areas like realtime and event-driven apps, more complex multi-tier applications, and a broader platform for deploying advanced applications. Stay tuned, and let us know where you’d like to see us go.
Supporting Large Data: Part 1
As apps have matured on Heroku, data sets have gotten much larger. Taps is designed to help development by providing a fast and easy way to transfer databases between local environments and Heroku. Today we launched taps 0.3 with a reworked architecture and a new set of features focused on large data sets:
- Push/Pull Specific Tables
You can now choose which tables to push and pull. Specify a regex and taps will only push or pull the tables that match. To only pull specific tables, specify a comma delimited list. For example, to pull the logs and tags tables, run this command:
heroku db:pull --tables logs,tags
- Resume Transfers
Interruptions can happen when moving large datasets. Now when you interrupt (Ctrl-C) or an error occurs, taps will dump a session file that can be used to resume the transfer:
heroku db:pull --resume-filename session_file.dat
- More Robust Schema Translation
DB operations are now 100% powered by Sequel, making taps capable of handling more varied data and schemas.
- Improved Transfer Speed
Taps now uses primary keys to move large data sets with constant transfer speed regardless of table size. Taps will fall back to ‘paging’ the data (OFFSET + LIMIT) if no primary key is available.
Run “sudo gem update heroku taps” to get the new version.
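To see why primary keys keep transfer speed constant, compare the two query shapes. With OFFSET + LIMIT the database has to walk past every skipped row on each page, so later pages get slower. With a key-based query it can jump straight to the next row via the primary-key index. The Ruby sketch below illustrates the key-based approach over an in-memory array standing in for a table. It is not taps’ actual code, and the method names are made up.

```ruby
# Sketch: page through rows by primary key instead of OFFSET + LIMIT.
# ROWS stands in for a table; a database would use the primary-key index here.
ROWS = (1..10).map { |n| { :id => n * 3, :name => "row#{n}" } }

# Equivalent to: SELECT * FROM t WHERE id > last_id ORDER BY id LIMIT size
def fetch_after(last_id, size)
  ROWS.select { |row| row[:id] > last_id }.sort_by { |row| row[:id] }.first(size)
end

# Yield one page at a time, remembering only the last key seen.
def each_page(size)
  last_id = 0   # assumes primary keys are positive integers
  loop do
    page = fetch_after(last_id, size)
    break if page.empty?
    yield page
    last_id = page.last[:id]
  end
end

pages = []
each_page(4) { |page| pages << page.map { |row| row[:id] } }
```

Because the only state carried between pages is the last key, the same value is also a natural checkpoint for resuming an interrupted transfer.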
SSL Hostname Add-on Public Beta
Ever since we launched the current IP-based solution at $100/month in response to customer demand, we have been pursuing a cheaper and more elegant solution for SSL with custom certificates on Heroku.
Today, we’re happy to announce the public beta of a new SSL add-on that accomplishes this goal. It’s called ssl:hostname, and is priced at $20/month. This new add-on will allow you to enable SSL traffic to your application on any subdomain, such as www.mydomain.com or secure.mydomain.com, using your own SSL certificate. Note that this is a paid beta, and you will be charged for using the add-on through the beta period.
Full docs are available here. You can install it via the heroku gem, or directly from the Add-ons Catalog.

