
Mining Amazon.com Catalog Data with Ruby

  • By Jason Gilmore

Amazon.com introduced one of the world's first online affiliate programs in 1996, a mere two years after the company's founding. The enormous popularity of the Amazon Associates Program is widely considered to have played a significant role in the company's early growth. In 2002 the company launched the Amazon E-Commerce Service (later renamed the Product Advertising API), a catalog API intended for use in conjunction with the Associates Program.

Amazon's Product Advertising API provides developers with an interface for creating interesting new services that mine Amazon's enormous product catalog. Access to typical product information such as price and manufacturer is just the tip of the iceberg; it's also possible to retrieve information about sales volume (via the sales rank), product reviewers, product descriptions, related products, and much more.

The popularity of this API has prompted the development of libraries that facilitate application integration using all of the most popular programming languages, among them PHP, Ruby, Perl and C#. Ruby offers a particularly powerful library known as Ruby/AWS. This library (or gem in Ruby parlance) provides an easy way to begin programmatically perusing and mining the Amazon catalog in every conceivable manner, a characteristic I recently came to fully appreciate while integrating Ruby/AWS into a new project.

In this tutorial I'll introduce you to Ruby/AWS, showing you how to use this great library to bend the Product Advertising API to your will.

Installing and Configuring Ruby/AWS

As I mentioned, Ruby/AWS is packaged as a Ruby gem, meaning you can install it via the RubyGems package manager. To install it, just open a terminal and execute the following command:

%>gem install ruby-aaws

Once the gem is installed, you'll need to sign up for an Amazon Web Services account in order to obtain an API key. Creating an account is free and takes only a moment. Within your account profile you'll be able to retrieve your "Access Key ID" and "Secret Access Key", which serve as your account's username and password, respectively. Ruby/AWS will look for this information within a configuration file named .amazonrc in your home directory, so create this file and copy the following contents into it:

locale = 'us'
cache = false
key_id = 'PASTE_YOUR_ACCESS_KEY_HERE'
secret_key_id = 'PASTE_YOUR_SECRET_KEY_HERE'

If you hail from outside of the United States, you can change the locale setting, causing Ruby/AWS to consult the associated country-specific Amazon catalog. For instance, if you live in the UK, use the locale setting uk, which will cause Ruby/AWS to consult the Amazon.co.uk catalog.
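Before running any lookups, it can help to confirm that the configuration file actually contains real credentials. The following is a minimal sketch, not part of Ruby/AWS: parse_amazonrc and missing_credentials are hypothetical helpers, and the parsing assumes the file holds only simple key = value lines like those shown above (treating lines that start with # as comments is also an assumption).

```ruby
# Hypothetical helper, not part of Ruby/AWS: parses simple
# "key = value" lines into a Hash of strings.
def parse_amazonrc(text)
  settings = {}
  text.each_line do |line|
    line = line.strip
    # Skip blank lines and (by assumption) comment lines.
    next if line.empty? || line.start_with?('#')

    key, value = line.split('=', 2).map(&:strip)
    next if value.nil? # no "=" on this line

    # Drop one surrounding single or double quote on each side, if present.
    settings[key] = value.gsub(/\A['"]|['"]\z/, '')
  end
  settings
end

# Returns the credential keys that are absent, empty, or still placeholders.
def missing_credentials(settings)
  %w[key_id secret_key_id].select do |key|
    value = settings[key]
    value.nil? || value.empty? || value.start_with?('PASTE_YOUR_')
  end
end

# Sample content mirroring the configuration shown in this article.
sample = <<~RC
  locale = 'us'
  cache = false
  key_id = 'PASTE_YOUR_ACCESS_KEY_HERE'
  secret_key_id = 'PASTE_YOUR_SECRET_KEY_HERE'
RC

settings = parse_amazonrc(sample)
puts "Locale: #{settings['locale']}"
puts "Still to fill in: #{missing_credentials(settings).join(', ')}"
```

To check your own file rather than the sample string, you could pass File.read(File.expand_path('~/.amazonrc')) to parse_amazonrc instead.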

Performing a Product Lookup

The Product Advertising API exposes a number of methods useful for searching the catalog in a variety of ways. For instance, you can look up a specific product according to its ASIN (Amazon Standard Identification Number); search a particular category of products (Books, Music or Grocery, for instance) by product title, release date, or manufacturer; and even search for a product's related items in order to further entice a prospective customer into purchasing more.

Further, to conserve bandwidth and improve performance, several lookup responses (known as response groups) can be returned with varying degrees of specificity. For instance, the Small response group returns key attributes such as the product title, ASIN, and URL. The Medium response group includes everything found in Small as well as product image URLs and the latest sales rank. Still other response groups can return information specific to bestselling items, solely images, and product reviews. (See the API documentation for a complete summary of available response groups.)

Let's work through a few examples involving Ruby/AWS, beginning with a simple item lookup based on its ASIN (incidentally, you can find a product's ASIN on its Amazon product page). Much of the following script is standard Ruby syntax, so you should pay particular attention to the ItemLookup, ResponseGroup and Request calls. Following the example I'll talk more about the role of these calls.

#!/usr/bin/ruby -w

require 'rubygems'
require 'amazon/aws/search'

include Amazon::AWS
include Amazon::AWS::Search

il = ItemLookup.new( 'ASIN', { 'ItemId' => '1430231149', 'MerchantId' => 'Amazon' } )
rg = ResponseGroup.new( 'Medium' )

req = Request.new
resp = req.search( il, rg )

item = resp.item_lookup_response.items.item
attribs = item.item_attributes

title = attribs.title
asin = item.asin
sales_rank = item.sales_rank
publication_date = attribs.publication_date

puts "#{title} was released on #{publication_date}"

Executing this script produces the following output:

Beginning PHP and MySQL: From Novice to Professional, Fourth Edition was released on 2010-09-30

The ItemLookup constructor determines what precisely we are looking for, in this case a specific ASIN. The ItemId parameter defines that ASIN, and the MerchantId parameter specifies that we're interested only in products sold by Amazon.com itself, rather than by the array of other merchants selling (or reselling) products through the site. The ResponseGroup constructor defines the type of response group used for the lookup results. Finally, calling the search method on the Request object executes the search.

Following a successful search you'll be able to access the product attributes using the typical dot notation used when accessing Ruby objects. As you can see from the example, part of the challenge is figuring out which attributes are stored within item_attributes and which are directly accessible.
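One way to soften that challenge is a small helper that checks the item first and then falls back to its nested attributes. The sketch below is illustrative only: Item and Attributes are stand-in Structs that mimic the nesting seen in the lookup script, not Ruby/AWS classes, and fetch_attribute is a hypothetical helper. It relies on respond_to?, which may not behave the same way on the objects a real lookup returns, so treat it as a pattern rather than a drop-in utility.

```ruby
# Stand-in objects for illustration only. They mimic the nesting used in the
# lookup script (item.asin versus item.item_attributes.title); they are not
# Ruby/AWS classes.
Attributes = Struct.new(:title, :publication_date)
Item = Struct.new(:asin, :sales_rank, :item_attributes)

# Hypothetical helper: look for the attribute on the item itself first,
# then fall back to its nested item_attributes. Returns nil if neither has it.
def fetch_attribute(item, name)
  return item.public_send(name) if item.respond_to?(name)

  attribs = item.item_attributes
  attribs.public_send(name) if attribs.respond_to?(name)
end

item = Item.new(
  '1430231149',
  12_345, # made-up sales rank, for illustration only
  Attributes.new(
    'Beginning PHP and MySQL: From Novice to Professional, Fourth Edition',
    '2010-09-30'
  )
)

puts fetch_attribute(item, :asin)             # found directly on the item
puts fetch_attribute(item, :title)            # found in item_attributes
puts fetch_attribute(item, :publication_date) # found in item_attributes
```

The design choice here is simply to centralize the "where does this field live?" question in one place, so the rest of a script can ask for an attribute by name without caring about the nesting.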


This article was originally published on October 6, 2010
