Creating a Reproducible and Portable Development Environment


By Michael Sverdlik at GigaSpaces

The Pain Point

Developers often need to create easily reproducible development environments, for anything from testing to troubleshooting to continued development across teams. Many technologies have arisen to answer this need, from Vagrant and VirtualBox to, in certain contexts, Docker. However, with the onset of the cloud, where many companies choose to do their dev, test, and QA work on on-demand resources, this type of virtual development environment comes with downsides of its own, most notably nested virtualization.

In the context of our R&D and Ops work on Cloudify, an open source cloud orchestration tool written in Python with a TOSCA-based YAML DSL, we often need to create reproducible and portable development environments on the cloud, and we had to find a way to overcome these issues. We wanted the most seamless process possible, one that would also be easily replicable per environment. So what better place to start than the most popular cloud, AWS? This article dives into one such scenario, porting Vagrant .box files to AWS, and demonstrates how to overcome issues like nested virtualization and the need for bare metal machines, which are costly and time-consuming to provision. You’ll get a step-by-step tutorial on how to easily create v2v (virtual-to-virtual) machines and build a VMDK disk image that can then be uploaded to any AWS environment.

The Demo

For the quick trial of Cloudify, we provide a Vagrantfile and Vagrant box with Cloudify’s Manager pre-installed on a VirtualBox image. By utilizing Vagrant and VirtualBox, we are able to provide our customers with a reproducible demo environment to evaluate Cloudify locally, on their personal computers.

We could have simply provided OVF & VMDK files and had users import them into VirtualBox; however, the point was to make the evaluation as simple as possible, and Vagrant strips away potential issues one might encounter when dealing directly with VirtualBox. So, instead of providing a detailed explanation of how to correctly set up a VirtualBox VM, we can summarize our quick start guide in two bullets:

  1. Download the Vagrantfile.
  2. Run ‘vagrant up’.

Utilizing Packer to Create Vagrant Boxes

Creating a Vagrant box is a very straightforward matter. You can create one by using Vagrant itself or one of the many utilities available for performing this task.
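Under the hood, a Vagrant box is simply a tar archive bundling the VM artifacts (the OVF descriptor and disk image) with a small metadata.json that names the provider. As a rough illustration of what any of these utilities produces, here is a minimal sketch in Python (make_box is a hypothetical helper, not part of our actual script):

```python
import json
import os
import tarfile
import tempfile

def make_box(box_path, artifact_paths):
    """Bundle VM artifacts plus Vagrant's metadata.json into a .box tar."""
    workdir = tempfile.mkdtemp()
    metadata = os.path.join(workdir, 'metadata.json')
    with open(metadata, 'w') as f:
        # The only required key: which provider this box targets
        json.dump({'provider': 'virtualbox'}, f)
    with tarfile.open(box_path, 'w') as tar:
        tar.add(metadata, arcname='metadata.json')
        for path in artifact_paths:
            tar.add(path, arcname=os.path.basename(path))
    return box_path
```

Keeping this structure in mind helps later, when we assemble the same pieces by hand on the worker instance.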


Step 1: Bake the Source AMI with Packer

Packer will be launched from our wrapper script and its output will be parsed until an AMI ID is found.
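That wrapper logic can be sketched roughly as follows (run_packer, the template name, and the AMI regex are illustrative, not the exact script):

```python
import re
import subprocess

# AMI IDs look like 'ami-' followed by hex digits
AMI_RE = re.compile(r'\b(ami-[0-9a-f]+)\b')

def parse_packer_output(lines):
    """Return the last AMI ID seen in Packer's output, or None."""
    ami_id = None
    for line in lines:
        match = AMI_RE.search(line)
        if match:
            ami_id = match.group(1)
    return ami_id

def run_packer(template='packer.json'):
    """Run 'packer build' and extract the resulting AMI ID as it streams."""
    proc = subprocess.Popen(['packer', 'build', template],
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    ami_id = parse_packer_output(proc.stdout)
    proc.wait()
    return ami_id
```

Packer prints the final AMI ID in its "artifacts" summary, so scanning each output line for the pattern and keeping the last match is enough.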

Step 2: Launch Worker Instance

Launching the worker instance is a straightforward task with Python and Boto:

import os
import boto.ec2
from boto.ec2 import blockdevicemapping as bdm

# Open connection
access_key = os.environ.get('AWS_ACCESS_KEY_ID')
secret_key = os.environ.get('AWS_SECRET_ACCESS_KEY')
conn = boto.ec2.connect_to_region(settings['region'],
                                  aws_access_key_id=access_key,
                                  aws_secret_access_key=secret_key)

# Run Packer and get source AMI ID
baked_ami_id = run_packer()
baked_ami = conn.get_image(baked_ami_id)

# Get snapshot id behind the baked AMI's root device
baked_snap = baked_ami.block_device_mapping['/dev/sda1'].snapshot_id

# Create mapping for factory machine: a root volume, plus the
# baked snapshot attached as a second disk
mapping = bdm.BlockDeviceMapping()
mapping['/dev/sda1'] = bdm.BlockDeviceType(size=10,
                                           delete_on_termination=True)
mapping['/dev/sdf'] = bdm.BlockDeviceType(snapshot_id=baked_snap,
                                          delete_on_termination=True)

# Create temp key pair
kp_name = random_generator()
kp = conn.create_key_pair(kp_name)

# Create temp security group
sg_name = random_generator()
sg = conn.create_security_group(sg_name,
   'vagrant nightly')

# Run worker instance (instance type key in 'settings' is illustrative)
reserv = conn.run_instances(image_id=settings['factory_ami'],
                            instance_type=settings['instance_type'],
                            key_name=kp_name,
                            security_groups=[sg_name],
                            block_device_map=mapping)

factory_instance = reserv.instances[0]

Steps 3-6: Creating the VMDK Image

We’ll use Fabric to execute commands over SSH on the worker instance.

First, we’ll set up the environment (private key, timeouts, and connection attempts) and wait for the worker instance to enter a ‘running’ state:

env.key_filename = os.path.join(gettempdir(),
                                '{0}.pem'.format(kp_name))
env.timeout = 10
env.connection_attempts = 12

while factory_instance.state != 'running':
    time.sleep(5)
    factory_instance.update()

Next, we’ll use Fabric’s ‘execute’ to launch remote commands on the worker instance:

execute(do_work, host='{}@{}'.format(settings['username'],
                                     factory_instance.ip_address))

‘do_work’ is the heart of Steps 3 to 7. It’s essentially a shell script that’s being executed with Fabric’s ‘run()’ and ‘sudo()’:

# Install needed utilities
sudo('apt-get update')
sudo('apt-get install -y virtualbox kpartx '
   'extlinux qemu-utils python-pip')
sudo('pip install awscli')

# Create mount point and mount source image
sudo('mkdir -p /mnt/image')
sudo('mount /dev/xvdf1 /mnt/image')

# Create file image, mount it and create FS
run('dd if=/dev/zero of=image.raw bs=1M count=5120')
sudo('losetup --find --show image.raw')
sudo('parted -s -a optimal /dev/loop0 mklabel msdos'
   ' -- mkpart primary ext4 1 -1')
sudo('parted -s /dev/loop0 set 1 boot on')
sudo('kpartx -av /dev/loop0')
sudo('mkfs.ext4 /dev/mapper/loop0p1')
sudo('mkdir -p /mnt/raw')
sudo('mount /dev/mapper/loop0p1 /mnt/raw')

# Copy over data from source image to new volume
sudo('cp -a /mnt/image/* /mnt/raw')

# Install bootloader (extlinux)
sudo('extlinux --install /mnt/raw/boot')
sudo('dd if=/usr/lib/syslinux/mbr.bin of=/dev/loop0 '
   'conv=notrunc bs=440 count=1')
sudo('echo -e "DEFAULT cloudify\\n'
   'LABEL cloudify\\n'
   'LINUX /vmlinuz\\n'
   'APPEND root=/dev/disk/by-uuid/'
   '`sudo blkid -s UUID -o value /dev/mapper/loop0p1` ro\\n'
   'INITRD /initrd.img" | sudo tee /mnt/raw/boot/extlinux.conf')

# Unmount
sudo('umount /mnt/raw')
sudo('kpartx -d /dev/loop0')
sudo('losetup --detach /dev/loop0')

Finally, we want to convert the raw image to VMDK format. This is possible by using the qemu-img utility:

# Convert to VMDK
run('qemu-img convert -f raw -O vmdk '
   'image.raw image.vmdk')

Step 7: Creating an OVF Descriptor and Bundling It into a .box File

Perhaps you noticed that at the beginning of ‘do_work()’ we installed ‘virtualbox’. This was done so we could easily create an OVF descriptor instead of building the XML by hand.

We’re going to create a VM (without starting it), attach our VMDK file to it, set a few machine settings (such as CPU and memory size), and export it:

run('mkdir output')
# Create VM
run('VBoxManage createvm --name cloudify '
   '--ostype Ubuntu_64 --register')
# Create storage controller
run('VBoxManage storagectl cloudify '
   '--name SATA '
   '--add sata '
   '--sataportcount 1 '
   '--hostiocache on '
   '--bootable on')
# Attach volume to storage controller
run('VBoxManage storageattach cloudify '
   '--storagectl SATA '
   '--port 0 '
   '--type hdd '
   '--medium image.vmdk')
# Modify VM parameters
run('VBoxManage modifyvm cloudify '
   '--memory 2048 '
   '--cpus 2 '
   '--vram 12 '
   '--ioapic on '
   '--rtcuseutc on '
   '--pae off '
   '--boot1 disk '
   '--boot2 none '
   '--boot3 none '
   '--boot4 none ')
# Export VM
run('VBoxManage export cloudify '
   '--output output/box.ovf')

Now that we have OVF & VMDK files, all that’s left is to create the initial Vagrantfile and Vagrant metadata file that will be packaged in the archive and tar everything:

run('echo "Vagrant.configure(\\"2\\") do |config|" > output/Vagrantfile')
run('echo "  config.vm.base_mac = `VBoxManage showvminfo cloudify '
   '--machinereadable | grep macaddress1 | cut -d"=" -f2`"'
   ' >> output/Vagrantfile')
run('echo -e "end\\n\\n" >> output/Vagrantfile')
run("echo 'include_vagrantfile = File.expand_path"
   "(\"../include/_Vagrantfile\", __FILE__)' >> output/Vagrantfile")
run('echo "load include_vagrantfile if File.exist?'
   '(include_vagrantfile)" >> output/Vagrantfile')
run("echo '{ \"provider\": \"virtualbox\" }' > output/metadata.json")
run('tar -cvf cloudify.box -C output/ .')  # archive name is illustrative

Step 8: Uploading to S3

Uploading to S3 is done with the AWS CLI tools:

run('aws s3 cp cloudify.box '
   's3://{}/{}.box'.format(settings['aws_s3_bucket'],
                           settings['box_name']))  # box name key assumed

Because we used an IAM role to initiate the worker instance, we do not provide any security credentials here. The IAM role provides these for us.

Step 9: Cleanup

Throughout the process, we create various AWS resources. To clean them up once we’re done (whether the run succeeded or not), we append each resource to the ‘RESOURCES’ array. When we finish, we run the ‘cleanup()’ function, which tries to destroy each and every resource.

def main():
   # ...
   conn = boto.ec2.connect_to_region(settings['region'],
                                     aws_access_key_id=access_key,
                                     aws_secret_access_key=secret_key)
   RESOURCES.append(conn)
   # ...

def cleanup():
   for item in RESOURCES:
      if type(item) == boto.ec2.image.Image:
         item.deregister(delete_snapshot=True)
      elif type(item) == boto.ec2.instance.Instance:
         item.terminate()
         while item.state != 'terminated':
            time.sleep(5)
            item.update()
      elif type(item) == boto.ec2.connection.EC2Connection:
         item.close()
      elif (type(item) == boto.ec2.securitygroup.SecurityGroup or
            type(item) == boto.ec2.keypair.KeyPair):
         try:
            item.delete()
         except boto.exception.EC2ResponseError:
            print('{} not cleared'.format(item))

Note: This is a very naive approach; it relies on the items in the array being in an order that lets each deletion succeed.
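One way to make this less fragile is to release resources in reverse creation order and keep going past individual failures. A minimal sketch (assuming each tracked resource is recorded as a name plus a delete-style callable, which the mixed boto types above are not):

```python
def cleanup(resources):
    """Release resources in reverse creation order, collecting failures
    instead of aborting on the first one."""
    failures = []
    for name, release in reversed(resources):
        try:
            release()
        except Exception as exc:
            failures.append((name, exc))
    return failures
```

With this shape, each step appends a pair such as ('keypair', kp.delete) as it creates the resource, and dependents (the instance) are torn down before their dependencies (the key pair and security group).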

Final Result

We’ll let the script speak for itself. You can find the full script in our GitHub repo and give it a test drive.

About the Author

Michael Sverdlik is a Senior Human Swiss Army Knife at GigaSpaces working on Cloudify. When he is not geeking around on his laptop or PS4, he likes to travel and see the world. More of a cat person. Hi, mom!
