A Rails Cloud Implementation Using MongoDB and Heroku
Once you have mastered using the Heroku platform to deploy and manage Rails web applications, you can choose which database or data store to use on the back end. Using a simple Rails app, Note Taker with Search (see the previous article in this series, "Deploying a Rails Application to Heroku"), I will demonstrate how to use the non-relational MongoDB data store. Why MongoDB? After the PostgreSQL relational database (occasionally with the PostGIS geospatial extensions), MongoDB is the data storage and management tool that I use most often (with the possible exception of the Sesame RDF data store).
The nosql-databases.org web site provides links to most non-relational data store systems (leaving out only RDF data stores). Check it out if you want to explore options other than MongoDB. (My next article in this series covers another good NoSQL alternative: CouchDB.) In a nutshell, NoSQL data stores give up immediate data consistency in exchange for greater data storage capacity and resiliency in the face of network partitioning (the trade-off described by the "CAP Theorem"). The following two papers, both available as PDFs, make a good case for why this trade-off is sometimes a good design decision:
- Amazon Web Services' "Dynamo: Amazon's Highly Available Key-value Store"
- Google's "Bigtable: A Distributed Storage System for Structured Data"
Save these papers for reference and, after working through the rest of this article, read through them for a better understanding of NoSQL data stores.
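To make the consistency trade-off concrete, here is a minimal toy sketch in Ruby. It is not how MongoDB, Dynamo, or Bigtable actually replicate data; the class and method names (ToyReplica, ToyEventuallyConsistentStore, replicate!) are hypothetical and exist only for this illustration. It assumes one replica acknowledges writes immediately while a second replica catches up later, so a read served by the second replica can briefly return stale data.

```ruby
# Toy model of eventual consistency. All names here are hypothetical;
# real data stores use far more sophisticated replication protocols.
class ToyReplica
  attr_reader :data

  def initialize
    @data = {}
  end
end

class ToyEventuallyConsistentStore
  def initialize
    @primary   = ToyReplica.new
    @secondary = ToyReplica.new
    @pending   = [] # writes not yet copied to the secondary
  end

  # A write is acknowledged as soon as the primary has stored it.
  def write(key, value)
    @primary.data[key] = value
    @pending << [key, value]
  end

  # Reads served by the secondary may lag behind the primary.
  def read_from_secondary(key)
    @secondary.data[key]
  end

  # Replication catches up some time after the write was acknowledged.
  def replicate!
    @pending.each { |key, value| @secondary.data[key] = value }
    @pending.clear
  end
end

store = ToyEventuallyConsistentStore.new
store.write("note:1", "buy milk")

stale = store.read_from_secondary("note:1") # secondary has not caught up yet
store.replicate!
fresh = store.read_from_secondary("note:1") # secondary now has the write
```

In this sketch the store stays writable without waiting for the second replica, at the price of a window in which the two replicas disagree. That window is the "immediate consistency" being given up.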
The minimal Heroku deployment option (5 megabytes of data storage and one compute unit) is free. For larger deployments, deploying on Heroku is definitely more expensive than managing your own Amazon EC2 instances. For many projects, however, the higher deployment costs can be more than offset by the reduction in development and administration costs. I do a lot of "bare metal" EC2 deployments, and I can attest to the cost and time savings of Heroku because "bare metal" EC2 deployments usually require me to:
- Manage installation of Ruby, Rails, PostgreSQL, memcached, etc. (easy to do)
- Manage persistent data stores by attaching EC2 volumes before starting any services that require persistent disk storage (not so easy, and time-consuming to set up)
- Manage the automatic binding of EC2 instances to Elastic IP addresses
- Bring more servers online quickly when I get traffic spikes (difficult to do)
- Back up EBS volumes and database dumps to secure S3 storage (easy to do, but does take a little time to set up)
The use case I cover in this article (using a MongoDB data store) requires an external server, though some third-party providers offer managed MongoDB hosting. I prefer to run MongoDB on my own managed EC2 instance. Note that if you decide to mix Rails deployments on Heroku with your own services running on EC2, you will lose some of the hassle-free administration and deployment advantages of using Heroku.
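Because the MongoDB server lives outside of Heroku, the Rails app needs some way to learn where it is. One common approach is to keep the connection details in an environment variable rather than in source code. The following is a minimal sketch that uses only Ruby's standard library; the variable name MONGODB_URI, the helper mongo_settings, the example host name, and the default database name are all assumptions made for illustration, not part of the Note Taker app.

```ruby
require "uri"

# Hypothetical helper: reads a MongoDB connection string from the
# environment and splits it into the pieces a driver typically needs.
# MONGODB_URI is an assumed variable name, not one Heroku defines for you.
def mongo_settings(env = ENV)
  raw = env.fetch("MONGODB_URI", "mongodb://localhost:27017/notes_development")
  uri = URI.parse(raw)
  {
    host:     uri.host,
    port:     uri.port || 27017,            # 27017 is MongoDB's default port
    database: uri.path.sub(%r{\A/}, ""),    # strip the leading slash
    user:     uri.user,
    password: uri.password
  }
end

# Example with a made-up EC2 host name and credentials.
settings = mongo_settings(
  "MONGODB_URI" => "mongodb://appuser:secret@ec2-host.example.com:27017/notes"
)
```

On Heroku, a value like this would normally be stored as a config var (for example with the heroku config:set command), so the same code can run unchanged against a local MongoDB during development and against the EC2-hosted server in production.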