October 25, 2014

Ruby for the REST of Us: Using Ruby and REST to Integrate with Amazon S3

  • August 2, 2006
  • By Dominic Da Silva

Have you heard of Ruby? How about REST? Are you interested in a cheap, unlimited Web-based data storage service that uses the storage and network infrastructure of Amazon.com, the largest online retailer in the world? If so, this article is for you. It covers Ruby, REST, and the Amazon S3 REST library for Ruby, and shows how these technologies were combined to build rSh3ll, an open source command shell for using the Amazon S3 service.

Ruby: The Scripting Language that Is Taking the Computing World by Storm

It is hard to imagine anyone in the programming world these days who has not heard of Ruby. The ever-increasing popularity of the Ruby on Rails Web framework is helping to make Ruby the language of choice for rapid application development and testing. Ruby is an interpreted scripting language that provides quick and easy object-oriented programming and contains some neat features such as closures, blocks, and mixins. Ruby is also highly portable, running on Unix/Linux, Windows, and MacOS. For those wanting a more thorough introduction to Ruby, you can read W. Jason Gilmore's article on Ruby.

REST: A Lightweight Way to do Web Services

In recent years, Web services have become the preferred means of exposing information and services to others over the Web. Many well-known Internet companies such as Google, eBay, PayPal, and Amazon provide Web service APIs to their services and data. REST (Representational State Transfer) is a lightweight, simple approach to Web services that uses XML and HTTP. REST defines a small set of operations, based on the HTTP protocol, for operating on resources. These CRUD (Create, Read, Update, Delete) operations map onto the well-known HTTP methods POST, GET, PUT, and DELETE. By using a REST approach, you get easy programmatic access to a Web service over plain HTTP.
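As a minimal sketch of the CRUD-to-HTTP idea described above, the following Ruby snippet maps each operation to a request class from the standard net/http library. The mapping shown is the conventional one rather than a rule, and the constant CRUD_TO_HTTP, the helper build_request, and the resource path are illustrative names, not part of any particular Web service. The request is only built, never sent.

```ruby
require 'net/http'

# Conventional (not mandatory) mapping of CRUD operations to HTTP request classes.
CRUD_TO_HTTP = {
  create: Net::HTTP::Post,
  read:   Net::HTTP::Get,
  update: Net::HTTP::Put,
  delete: Net::HTTP::Delete
}.freeze

# Hypothetical helper: build (but do not send) the request for a CRUD
# operation on a resource path.
def build_request(operation, path)
  request_class = CRUD_TO_HTTP.fetch(operation)
  request_class.new(path)
end

# The path below is made up for illustration.
request = build_request(:read, '/reports/2006-08.xml')
puts "#{request.method} #{request.path}"
```

Sending such a request would additionally require a Net::HTTP connection to a real host, which is left out here on purpose.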

Amazon S3: Unlimited Data Storage at a Great Price

Amazon S3 (Simple Storage Service) was released by Amazon Web Services earlier this year, with the aim of providing users with unlimited data storage and bandwidth capabilities. S3 uses the highly scalable, reliable, fast, and inexpensive data storage infrastructure that Amazon uses to run its own global network of Web sites. The service costs $0.15 per gigabyte of storage used per month, plus $0.20 per gigabyte of data transferred. S3 users can create up to 100 buckets in which they can store an unlimited number of objects (also referred to as tokens). Each object can range in size from as small as 1 byte to as large as 5 gigabytes. The Amazon S3 feature set is intentionally minimal, consisting of the following:

  • Write, read, and delete objects containing from 1 byte to 5 gigabytes of data each. The number of objects you can store is unlimited.
  • Each object is stored and retrieved via a unique, developer-assigned key.
  • Authentication mechanisms are provided to ensure that data is kept secure from unauthorized access. Objects can be made private or public, and rights can be granted to specific users.
  • Uses standards-based REST and SOAP interfaces designed to work with any Internet-development toolkit.
  • Built to be flexible so that protocol or functional layers can be added easily. Default download protocol is HTTP. A BitTorrent protocol interface is provided to lower costs for high-scale distribution. Additional interfaces will be added in the future.
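To make the pricing quoted above concrete, here is a small worked example in Ruby. It assumes whole gigabytes and only the two rates given in this article ($0.15 per gigabyte stored per month and $0.20 per gigabyte transferred). The method name monthly_cost_cents and the sample usage figures are invented for illustration.

```ruby
# Rates quoted in this article, expressed in cents to avoid float rounding.
STORAGE_CENTS_PER_GB  = 15 # $0.15 per gigabyte stored per month
TRANSFER_CENTS_PER_GB = 20 # $0.20 per gigabyte transferred

# Hypothetical helper: estimated monthly bill in cents for whole gigabytes.
def monthly_cost_cents(stored_gb, transferred_gb)
  stored_gb * STORAGE_CENTS_PER_GB + transferred_gb * TRANSFER_CENTS_PER_GB
end

# Example: 10 GB stored and 5 GB transferred in one month.
cents = monthly_cost_cents(10, 5)
puts format('$%.2f', cents / 100.0)
```

With these sample figures the estimate is 10 x 15 + 5 x 20 = 250 cents, or $2.50 for the month.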

The Amazon S3 Web Service APIs

To use the Amazon S3 service, you first must create a free Amazon Web Services account and then sign up for the Amazon S3 service. You can read this Amazon Web Services blog post outlining the whole process in detail. Once you have done this, you will have an Access Key and a Secret Access Key that give you access to the S3 service. Amazon S3 provides both REST and SOAP libraries for languages such as Ruby, Java, C#, Perl, and PHP. These libraries provide the basic features needed to use the S3 Web service. The library of interest here is the Amazon S3 library for REST in Ruby, which is packaged as a single Ruby module: S3.rb.

rSh3ll: The Amazon S3 Command Shell for Ruby

rSh3ll is a command shell for using the Amazon S3 service. By using rSh3ll, you can manage your Amazon S3 buckets and objects. rSh3ll is open source, available under the MIT License and hosted on RubyForge. The latest version of rSh3ll is version 2.1. The distribution is contained in one zip file, rSh3ll-2.1.zip, which contains the program's source code (rSh3ll.rb and S3.rb) and batch files needed to run the program. The first thing you need to do to run rSh3ll is specify your AWS Access Key and Secret Access Key in the rSh3ll.rb file, as shown below:

# set your AWS access key and secret access key
AWS_ACCESS_KEY_ID     = '<INSERT YOUR AWS ACCESS KEY ID HERE>'
AWS_SECRET_ACCESS_KEY = '<INSERT YOUR AWS SECRET ACCESS KEY HERE>'

Once you have done this, you then can fire up rSh3ll.rb either via the rSh3ll.bat file or by running the command 'ruby rSh3ll.rb'. The command set for rSh3ll is:

bucket [bucketname]
count [prefix]
createbucket
delete <id>
deleteall [prefix] (TODO: support > 1000 item deletes)
deletebucket
exit
get <id>
getacl ['bucket'|'item'] <id>
getfile <id> <file>
gettorrent <id>
head ['bucket'|'item'] <id>
host [hostname]
list [prefix] [max]
listbuckets
pass [password]
put <id> <data>
putfile <id> <file>
putfilewacl <id> <file> ['private'|'public-read'|
                         'public-read-write'|'authenticated-read']
quit
setacl ['bucket'|'item'] <id> ['private'|'public-read'|
                               'public-read-write'|'authenticated-read']
user [username]

An example usage of rSh3ll (performing a bucket listing, setting the current bucket to 'rSh3ll' and getting a token listing for bucket 'rSh3ll') would be the following:

C:\Download\rSh3ll>ruby rSh3ll.rb

Welcome to rSh3ll (Amazon S3 command shell for Ruby) (c) 2006 SilvaSoft, Inc.
Type 'help' for command list.

rSh3ll> listbuckets
--- bucket list ---
["dominicdasilva", "jSh3ll", "rSh3ll", "sharpSh3ll", "silvasoftinc"]
rSh3ll> bucket rSh3ll
--- bucket set to 'rSh3ll' ---
rSh3ll> list
--- token list for bucket 'rSh3ll' ---
["rSh3ll-1.0.zip", "rSh3ll-1.1.zip", "rSh3ll-1.2.zip",
 "rSh3ll-2.0.zip", "rSh3ll-2.1.zip", "rSh3ll-preview1.zip"]
rSh3ll> quit
Goodbye...

Leveraging the Amazon S3 REST Library for Ruby to Create rSh3ll

rSh3ll uses the Amazon S3 REST interface for Ruby, contained in the S3.rb module and has no external dependencies. Although S3.rb requires the HMAC-SHA1 Ruby library, rSh3ll eliminates this requirement by using the built-in OpenSSL Ruby library for message authentication and encoding. The S3.rb module provides a main class AWSAuthConnection that contains methods for opening a connection to S3 and performing operations against it. rSh3ll uses AWSAuthConnection to do its work, first opening a connection to the S3 service:

# create AWS connection
conn = S3::AWSAuthConnection.new(user_, password_, USE_SSL)

USE_SSL is a Boolean constant that enables SSL encryption on the connection. It is set to false by default:

# setting this to false is faster, but your data is not encrypted
# as it is sent.
USE_SSL = false
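As mentioned above, rSh3ll relies on Ruby's built-in OpenSSL library for message authentication and encoding. The following is a minimal sketch of that idea, not the actual code from S3.rb or rSh3ll.rb: it computes an HMAC-SHA1 over a string and Base64-encodes the result. The helper name sign, the placeholder secret, and the sample string are all assumptions made for illustration. The exact string that S3 expects to be signed is defined by Amazon's documentation and is not reproduced here.

```ruby
require 'openssl'

# Hypothetical helper: HMAC-SHA1 of string_to_sign keyed with the secret
# access key, Base64-encoded without line breaks.
def sign(secret_access_key, string_to_sign)
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new('sha1'),
                                secret_access_key, string_to_sign)
  [digest].pack('m0')
end

# Placeholder values only: not real credentials and not a real request.
secret         = 'EXAMPLE-SECRET-KEY'
string_to_sign = "GET\n\n\nWed, 02 Aug 2006 12:00:00 GMT\n/rSh3ll/rSh3ll-2.1.zip"

puts sign(secret, string_to_sign)
```

Using Array#pack with the 'm0' directive keeps the example free of any dependency beyond openssl, which ships with Ruby.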

rSh3ll then loops, reading each command line entered by the user. After checking the command line for correct syntax, rSh3ll.rb invokes the corresponding method on the AWSAuthConnection and prints the results to the console. A code extract from rSh3ll.rb showing the implementation of the 'bucket', 'list', and 'listbuckets' commands is shown below:

 1. # loop getting commands
 2. line = ""
 3. begin
 4.    print "rSh3ll> "
 5.    # read and split line
 6.    line = STDIN.readline.gsub(/\n/, "")
 7.    tokens = line.split
 8.    # get command
 9.    command = tokens[0]
10.    # execute command
11.    case command
12.       # bucket
13.       when 'bucket'
14.          if tokens.size != 2
15.             puts "error: bucket [bucketname]"
16.          else
17.             bucket_ = tokens[1]
18.             puts "--- bucket set to '" + bucket_ + "' ---"
19.          end
20.          ...
21.          # list
22.          when 'list'
23.             if bucket_ == ''
24.                puts "error: bucket is not set"
25.             else
26.                puts "--- token list for bucket '" + bucket_ + "' ---"
27.                p conn.list_bucket(bucket_).entries.map { |entry| entry.key }
28.             end
29.          ...
30.          # listbuckets
31.          when 'listbuckets'
32.             puts "--- bucket list ---"
33.             p conn.list_all_my_buckets.entries.map { |bucket| bucket.name }
34.          ...
35.       # no match
36.       else
37.          puts "no match"
38.       end
39. end until command == 'quit' or command == 'exit'

When the user lists their S3 buckets by typing the 'listbuckets' command, rSh3ll invokes the list_all_my_buckets method on the AWSAuthConnection (lines 30-33 of the code above). This method returns a ListAllMyBucketsResponse instance. ListAllMyBucketsResponse contains an attribute reader for its private @entries attribute, which holds a list of buckets. rSh3ll then prints out the name of each entry. The same logic applies to the 'list' command (lines 21-28 of the code above), except that the list_bucket method is invoked and a ListBucketResponse instance is returned instead (which also contains an attribute reader for its private @entries attribute, holding a list of tokens). The 'list' command, however, relies on the bucket having been set previously by the 'bucket' command (lines 12-19 of the code above). The example session shown earlier uses these three commands and should make more sense now that the internal workings have been explained.
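The flow just described can be tried without an S3 account by swapping the real connection for a stand-in. The sketch below is not the rSh3ll source: FakeConnection, FakeResponse, Bucket, Entry, and handle are hypothetical names, and the fake connection answers from an in-memory hash instead of calling Amazon. Only the method names list_all_my_buckets and list_bucket and the entries reader are taken from the description above.

```ruby
# Stand-ins for the objects returned by S3.rb (hypothetical, for illustration).
Bucket = Struct.new(:name)
Entry  = Struct.new(:key)

class FakeResponse
  attr_reader :entries # mirrors the attribute reader described above

  def initialize(entries)
    @entries = entries
  end
end

# Fake connection that answers from a hash: { 'bucket name' => ['key', ...] }
class FakeConnection
  def initialize(buckets)
    @buckets = buckets
  end

  def list_all_my_buckets
    FakeResponse.new(@buckets.keys.map { |name| Bucket.new(name) })
  end

  def list_bucket(bucket)
    FakeResponse.new(@buckets.fetch(bucket, []).map { |key| Entry.new(key) })
  end
end

# Handle one command line. Returns the (possibly updated) current bucket
# and an array of output lines.
def handle(conn, bucket, line)
  command, *args = line.split
  case command
  when 'bucket'
    return [bucket, ['error: bucket [bucketname]']] unless args.size == 1
    [args[0], ["--- bucket set to '#{args[0]}' ---"]]
  when 'list'
    return [bucket, ['error: bucket is not set']] if bucket.empty?
    keys = conn.list_bucket(bucket).entries.map { |entry| entry.key }
    [bucket, ["--- token list for bucket '#{bucket}' ---", keys.inspect]]
  when 'listbuckets'
    names = conn.list_all_my_buckets.entries.map { |b| b.name }
    [bucket, ['--- bucket list ---', names.inspect]]
  else
    [bucket, ['no match']]
  end
end

conn   = FakeConnection.new('rSh3ll' => ['rSh3ll-2.0.zip', 'rSh3ll-2.1.zip'])
bucket = ''
['listbuckets', 'bucket rSh3ll', 'list'].each do |line|
  bucket, output = handle(conn, bucket, line)
  puts output
end
```

Returning the current bucket from handle, rather than mutating a shared variable, is a design choice of this sketch that makes each command easy to check in isolation.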

Conclusion

I hope this article has sparked your interest in using Ruby and REST within your applications to provide simple and painless integration with Web services. Amazon S3 offers a cheap, unlimited storage service for end users and provides REST and SOAP APIs for numerous programming languages. rSh3ll, an Amazon S3 command shell for Ruby, is a simple command-line tool for accessing Amazon S3.

About the Author

Dominic Da Silva (http://www.dominicdasilva.com/) is the President of SilvaSoft, Inc., a software consulting company specializing in Java, Ruby, and .NET-based Web and Web services development. He has worked with Java since 2000 and has been a Linux user since the 1.0 days. He is also Sun Certified for the Java 2 platform. Born on the beautiful Caribbean island of Trinidad and Tobago, he now makes his home in sunny Orlando, Florida.





