
Migrating MongoDB to DynamoDB, Part 2

The AWS Database Migration Service (DMS) added support for two NoSQL databases in 2017: MongoDB as a source database and Amazon DynamoDB as a target database. In this two-article tutorial, we migrate a MongoDB database to DynamoDB using DMS. In the first article, “Migrating MongoDB to DynamoDB, Part 1,” we created a MongoDB replica set and a DynamoDB table. In this continuation article, we discuss creating and running a DMS migration to migrate the data.

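This article has the following sections:

  • Creating a DMS Migration
  • Running the Migration
  • Resuming a Migration
  • Deleting a Migration
  • Conclusion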

Creating a DMS Migration

Next, we shall create a DMS migration to migrate the MongoDB database to DynamoDB. Log in as the IAM user (dvohra or other) created for DMS and select DMS in the AWS Management Console. Click Create migration in the DMS Dashboard, as shown in Figure 1.

Figure 1: Create migration

Click Next in the DMS Welcome page, as shown in Figure 2.

Figure 2: Welcome>Next

The Create replication instance dialog is displayed, as shown in Figure 3. Here we configure the replication instance, which initiates the connections between the source and target databases, transfers the data, and caches any changes made on the source database during the initial load.

Figure 3: Create replication instance

Specify the replication instance name in the Name field, select the Instance class, select a VPC, and choose whether a Multi-AZ replication instance is to be created. Description, which is usually optional in configurable settings, is a required field here. Default settings are provided for all of these fields except the VPC. The replication instance settings used are shown in Figure 4.

Figure 4: Replication instance Settings

Select the option to make the replication instance Publicly accessible and click Advanced to configure advanced parameters, as shown in Figure 5.

Figure 5: Setting replication instance as publicly accessible

In the Advanced section, default settings are provided for all the fields (see Figure 6).

Figure 6: Advanced Settings

The default settings are suitable for any replication instance, with one exception: the KMS master key must be set to the encryption key (dms) that was created before logging in as the IAM user (dvohra), as shown in Figure 7. Click Next.

Figure 7: Advanced Settings>Next
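For readers who script the setup rather than use the console, the replication instance choices above correspond roughly to a DMS CreateReplicationInstance request. The following is only a sketch of such a request body as a plain JavaScript object; the identifier, instance class, and key ARN are hypothetical placeholders, not the values shown in the figures.

```javascript
// Sketch of a CreateReplicationInstance request body.
// Every value below is an illustrative assumption.
const replicationInstanceParams = {
  ReplicationInstanceIdentifier: "mongodb-dynamodb-instance", // hypothetical name
  ReplicationInstanceClass: "dms.t2.medium",                  // assumed instance class
  MultiAZ: false,                                             // single-AZ instance
  PubliclyAccessible: true,                                   // the Publicly accessible option
  // Placeholder ARN standing in for the "dms" encryption key
  KmsKeyId: "arn:aws:kms:us-east-1:123456789012:key/EXAMPLE-KEY-ID",
};

// Print the request body so it can be reviewed or saved to a file
console.log(JSON.stringify(replicationInstanceParams, null, 2));
```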

Creation of the replication instance begins, as indicated by the message shown in Figure 8. Specify the database endpoints next, while the replication instance is being created. The database endpoints cannot be tested, however, until the replication instance has been created.

Figure 8: Replication Instance being created

For the Source engine, select the mongodb database, as shown in Figure 9.

Figure 9: Selecting Source Engine as mongodb

For the Target engine, select the dynamodb database, as shown in Figure 10.

Figure 10: Selecting Target Engine as dynamodb

The Endpoint Identifier may be kept as the default for both the source and target databases, but the other connection parameters need to be specified. For the Source database connection details, specify the Server name as the Private IP (Figure 21 in the first article, “Migrating MongoDB to DynamoDB, Part 1”) of the CoreOS EC2 instance on which the MongoDB replica set is started using Docker, and specify Port as 27017 (see Figure 11). Select “none” for SSL mode and Authentication mode. Specify Database name as test and select Authentication mechanism as default.

Figure 11: Source Database Connection Details

For the source database engine, mongodb, select Metadata mode as document and select the option _id as a separate column, as shown in Figure 12. The Run test buttons are used to test the source and target database connections and are not enabled until the replication instance has been created.

Figure 12: Other Settings for the Source Engine
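The source endpoint settings described above can also be written down as a DMS CreateEndpoint request. The sketch below assumes the console choices map to the MongoDbSettings fields as commented; the endpoint identifier and the IP address are hypothetical placeholders.

```javascript
// Sketch of a CreateEndpoint request body for the MongoDB source.
// The identifier and server address are placeholders, not real values.
const sourceEndpointParams = {
  EndpointIdentifier: "mongodb-source", // hypothetical identifier
  EndpointType: "source",
  EngineName: "mongodb",
  SslMode: "none",
  MongoDbSettings: {
    ServerName: "10.0.0.10",  // placeholder for the EC2 instance's private IP
    Port: 27017,
    DatabaseName: "test",
    AuthType: "no",           // Authentication mode "none"
    AuthMechanism: "default",
    NestingLevel: "none",     // Metadata mode "document" ("one" would be table mode)
    ExtractDocId: "true",     // "_id as a separate column"
  },
};

console.log(JSON.stringify(sourceEndpointParams, null, 2));
```

In document mode, each MongoDB document is carried as a single JSON value, which matches the _doc attribute that appears in the DynamoDB items later in this article.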

Copy the Role ARN for the dms-vpc-role from the IAM Console, as shown in Figure 13. The Role ARN is used to define the target database connection for the DMS migration.

Figure 13: Copying Role ARN

Copy and paste the Role ARN in the Service Access Role ARN field, as shown in Figure 14.

Figure 14: Service Access Role ARN
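As a rough equivalent of the target endpoint form, the sketch below builds a CreateEndpoint request body for DynamoDB and checks that the pasted value has the general shape of an IAM role ARN. The account ID and endpoint identifier are placeholders, and isRoleArn is a hypothetical helper written for this example only.

```javascript
// Hypothetical helper: checks only the general shape of an IAM role ARN.
function isRoleArn(value) {
  return /^arn:aws:iam::\d{12}:role\/.+$/.test(value);
}

// Sketch of a CreateEndpoint request body for the DynamoDB target.
const targetEndpointParams = {
  EndpointIdentifier: "dynamodb-target", // hypothetical identifier
  EndpointType: "target",
  EngineName: "dynamodb",
  DynamoDbSettings: {
    // Placeholder account ID; paste the ARN copied from the IAM Console
    ServiceAccessRoleArn: "arn:aws:iam::123456789012:role/dms-vpc-role",
  },
};

if (!isRoleArn(targetEndpointParams.DynamoDbSettings.ServiceAccessRoleArn)) {
  throw new Error("Service Access Role ARN does not look like an IAM role ARN");
}
console.log(JSON.stringify(targetEndpointParams, null, 2));
```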

When the replication instance has been created, a message to that effect is displayed, as shown in Figure 15.

Figure 15: Replication instance created

For the target database, click Run test to test the connection. If a connection gets established, the message “Connection tested successfully” should get displayed (see Figure 16).

Figure 16: Target Database Connection tested successfully

Similarly, click Run test for the source database, and the message “Connection tested successfully” should get displayed if a connection gets established, as shown in Figure 17.

Figure 17: Source Database Connection tested successfully

Click Next in Database endpoints, as shown in Figure 18.

Figure 18: Database Endpoints>Next

Next, configure a migration task in the Create task page. A task consists of several settings, including task name, task description, source endpoint, target endpoint, replication instance, migration type, task settings, table mappings, and advanced settings. The default settings for Task name and the non-modifiable settings for the Source endpoint, Target endpoint, Replication instance, and Migration type are shown in Figure 19.

Figure 19: Create task Settings

Add a suitable description and select a Migration type from the drop-down list shown in Figure 20. The options for Migration type are Migrate existing data, Migrate existing data and replicate ongoing changes, and Replicate data changes only. To migrate existing data from MongoDB to DynamoDB with the provision to replicate ongoing changes, select Migrate existing data and replicate ongoing changes. Once created, a migration task can be modified, except for its Migration type, which cannot be changed after the task has been created. Therefore, treat the Migration type as a permanent choice.

Figure 20: Choosing Migration type
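The three console labels correspond to the MigrationType values accepted by the DMS API. The lookup below is a small sketch of that correspondence; migrationTypeFor is a hypothetical helper name used only for illustration.

```javascript
// Console label -> DMS API MigrationType value
const MIGRATION_TYPES = {
  "Migrate existing data": "full-load",
  "Migrate existing data and replicate ongoing changes": "full-load-and-cdc",
  "Replicate data changes only": "cdc",
};

// Hypothetical helper: fail loudly on an unknown label instead of guessing
function migrationTypeFor(label) {
  const value = MIGRATION_TYPES[label];
  if (value === undefined) {
    throw new Error(`Unknown migration type label: ${label}`);
  }
  return value;
}

console.log(migrationTypeFor("Migrate existing data and replicate ongoing changes"));
```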

Under Task Settings, choose values for Target table preparation mode, Stop task after full load completes, Include LOB columns in replication, and Enable logging (see Figure 21).

Figure 21: Task Settings
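These console options are stored with the task as a JSON task settings document. The sketch below shows the fields that most plausibly correspond to the four options; the values are illustrative assumptions and are not necessarily the ones selected in Figure 21.

```javascript
// Sketch of the task settings JSON behind the four console options.
// Values are illustrative assumptions.
const taskSettings = {
  FullLoadSettings: {
    // Target table preparation mode:
    // "DO_NOTHING", "DROP_AND_CREATE", or "TRUNCATE_BEFORE_LOAD"
    TargetTablePrepMode: "DROP_AND_CREATE",
    // Stop task after full load completes (before/after applying cached changes)
    StopTaskCachedChangesNotApplied: false,
    StopTaskCachedChangesApplied: true,
  },
  TargetMetadata: {
    // Include LOB columns in replication (limited LOB mode, size in KB)
    SupportLobs: true,
    FullLobMode: false,
    LimitedSizeLobMode: true,
    LobMaxSize: 32,
  },
  Logging: {
    EnableLogging: true, // Enable logging (to CloudWatch)
  },
};

// The API expects the settings as a JSON string
const replicationTaskSettings = JSON.stringify(taskSettings);
console.log(replicationTaskSettings);
```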

Click Advanced Settings, as shown in Figure 22, to configure advanced settings, which include Control table settings and Tuning settings. The default advanced settings may be kept.

Figure 22: Advanced Settings

In Table mappings, configure selection rules, as shown in Figure 23. At least one selection rule with an Include action is required. Select a Schema name (test) in the DMS source MongoDB. The Schema name is the same as the MongoDB database name, which is test. For “Table name is like”, specify ‘%’, which selects all tables. A table is called a collection in MongoDB. Select Action as Include, which includes the objects selected by the selection rule. Exclude actions are processed after Include actions.

Figure 23: Table Mappings
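The selection rule configured above can be expressed in the JSON form that DMS uses for table mappings. The sketch below builds that document and adds a hypothetical matchesLike helper, written only for this example, to illustrate how the ‘%’ wildcard selects every collection name.

```javascript
// Table mappings with one selection rule: include every collection in "test"
const tableMappings = {
  rules: [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "1",
      "object-locator": { "schema-name": "test", "table-name": "%" },
      "rule-action": "include",
    },
  ],
};

// Hypothetical helper: emulates only the '%' wildcard (any run of characters)
function matchesLike(pattern, name) {
  const escaped = pattern
    .split("%")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`).test(name);
}

const rule = tableMappings.rules[0];
console.log(matchesLike(rule["object-locator"]["table-name"], "wlslog"));
console.log(JSON.stringify(tableMappings, null, 2));
```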

Source filters to limit the number and type of records transferred from source to target may also be configured. Click Add selection rule, as shown in Figure 24.

Figure 24: Add selection rule

Transformation rules may be added to convert names to uppercase or lowercase and to add or remove a prefix or suffix. If Logging has been enabled, DMS creates a role to log to CloudWatch. Creating a task implicitly grants the permissions required to access and log to CloudWatch. Click Create task, as shown in Figure 25.

Figure 25: Create task
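No transformation rule is used in this tutorial, but as a sketch of what one looks like in the table mappings JSON, the rule below adds a prefix to every table name. The prefix value is a hypothetical example, and applyAddPrefix is a helper written here only to show the effect of the rule.

```javascript
// Sketch of a transformation rule that adds a prefix to table names.
// The prefix "migrated_" is a hypothetical example value.
const addPrefixRule = {
  "rule-type": "transformation",
  "rule-id": "2",
  "rule-name": "2",
  "rule-target": "table",
  "object-locator": { "schema-name": "test", "table-name": "%" },
  "rule-action": "add-prefix",
  "value": "migrated_",
};

// Hypothetical helper: the table name such a rule would be expected to produce
function applyAddPrefix(rule, tableName) {
  if (rule["rule-action"] !== "add-prefix") {
    throw new Error("not an add-prefix rule");
  }
  return rule.value + tableName;
}

console.log(applyAddPrefix(addPrefixRule, "wlslog"));
```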

Creation of the migration task begins (see Figure 26). Initially, the Status is “Creating”. The Status is updated automatically, and a refresh button is also provided to refresh it manually.

Figure 26: Migration task starts to get created

When a task gets created, the Status becomes Ready, as shown in Figure 27.

Figure 27: Task Status Ready

An IAM role for CloudWatch access and logging gets created automatically, as shown in Figure 28.

Figure 28: IAM Role for CloudWatch Logs

Running the Migration

To run the migration task, click Start/Resume, as shown in Figure 29.

Figure 29: Start/Resume task

The Task status becomes Starting, as shown in Figure 30.

Figure 30: Task Starting

When the task has completed running, the Tables loaded column lists the number of tables loaded as 1, the Status becomes Stopped, and the Complete % indicates 100, as shown in Figure 31. As the Type column indicates, the migration type is Full Load & Ongoing Replication.

Figure 31: Full Load Completed

In addition to the Tables loaded column, the Tables loading, Tables queued, and Tables errored columns are also listed, as shown in Figure 32.

Figure 32: Tables loading, Tables queued, and Tables errored are all 0

In DynamoDB, the wlslog table lists seven items, as shown in Figure 33. Two other tables, awsdms_apply_exceptions and awsdms_full_load_exceptions, also get created automatically. The awsdms_apply_exceptions table provides exception details, including the error name and description, the statement that was running when the error occurred, the name of the task, the table owner, the table name, and the time of the exception. The awsdms_full_load_exceptions table provides information about the exceptions generated after a full load.

Figure 33: The wlslog table lists seven items

Click an _id to display the document (_doc attribute value), as shown in Figure 34.

Figure 34: Document for an item stored in DynamoDB

The DynamoDB Filter may be used to filter the search. As an example, search for a specific _id by specifying _id as the field, selecting String as the field type, selecting ‘=’ as the filter operator, and specifying the _id to search for, as shown in Figure 35. Click Start Search.

Figure 35: Applying a Filter
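The same filter can be described as the parameters of a DynamoDB Scan request. The sketch below only builds the request object; buildIdFilter is a hypothetical helper, and the _id value is a made-up placeholder. A placeholder name (#id) is used for the attribute because _id begins with an underscore.

```javascript
// Hypothetical helper: Scan parameters equivalent to the console filter
// "_id" (String) = <value>
function buildIdFilter(tableName, id) {
  return {
    TableName: tableName,
    FilterExpression: "#id = :id",
    ExpressionAttributeNames: { "#id": "_id" },      // alias for the _id attribute
    ExpressionAttributeValues: { ":id": { S: id } }, // String-typed value
  };
}

// The _id below is a made-up placeholder, not an item from the wlslog table
const scanParams = buildIdFilter("wlslog", "59401a2b0000000000000000");
console.log(JSON.stringify(scanParams, null, 2));
```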

The DynamoDB table row data for the _id specified gets listed (see Figure 36).

Figure 36: Filtered Data for specific _id

After a migration task has completed migrating a database, the task status becomes Stopped, but the migration Endpoints are still Active, as shown in Figure 37.

Figure 37: Endpoints Active even after migration completed and task stopped

Resuming a Migration

A migration task that has stopped may be restarted or resumed. The following are some of the reasons for resuming or restarting a task:

  • A new document has been added to an existing collection (also called table) in the DMS source MongoDB database
  • A new collection (table) has been added in the MongoDB database
  • The migration needs to be re-run with different rules, for example, if a table prefix needs to be added using a transformation rule

As an example, add three more documents to the wlslog collection in the Mongo CLI.

doc8 = {"timestamp":"Apr 8, 2014 7:06:23 PM PDT",
   "category":"Notice","type":"WebLogicServer",
   "servername":"AdminServer","code":"BEA-000360",
   "msg":"Server in RUNNING mode"}
doc9 = {"timestamp":"Apr 8, 2014 7:06:24 PM PDT",
   "category":"Notice","type":"WebLogicServer",
   "servername":"AdminServer","code":"BEA-000365",
   "msg":"Server Stopping"}
doc10 = {"timestamp":"Apr 8, 2014 7:06:25 PM PDT",
   "category":"Notice","type":"WebLogicServer",
   "servername":"AdminServer","code":"BEA-000361",
   "msg":"Server Resumed"}
db.wlslog.insert([doc8,doc9,doc10])

As the output in Figure 38 indicates, the three documents get added.

Figure 38: Adding three more documents

Click Start/Resume to resume a stopped task, as shown in Figure 39.

Figure 39: Start/Resume for stopped task

In the Start task dialog, two options are provided: Start and Restart. The Start option starts the task and loads any new tables or collections added to the DMS source. It also loads any table that was only partially loaded in a previous run, but it does not load data (new or old) into a table that has already been loaded completely into the target database. The Restart option deletes the existing data in the target database and runs the full load again. In effect, the Restart option loads new data added to existing tables in addition to loading any new tables added to the DMS source. Because we added new data to an existing table, we need to select the Restart option and click Start task, as shown in Figure 40.

Figure 40: Restarting task
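When a task is started through the DMS API instead of the console, the choice is made with the StartReplicationTaskType parameter. The sketch below assumes that the console's Start option corresponds to resume-processing and Restart to reload-target; that correspondence is an assumption made for illustration, and startTypeFor is a hypothetical helper.

```javascript
// Hypothetical helper: pick a StartReplicationTaskType value.
// The mapping from console options to API values is an assumption.
function startTypeFor(option, hasRunBefore) {
  if (!hasRunBefore) {
    return "start-replication"; // first run of a newly created task
  }
  switch (option) {
    case "Start":
      return "resume-processing"; // continue; fully loaded tables are left alone
    case "Restart":
      return "reload-target";     // clear the target and run the full load again
    default:
      throw new Error(`Unknown option: ${option}`);
  }
}

// New documents were added to an already loaded collection, so reload
console.log(startTypeFor("Restart", true));
```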

The task restarts, deletes the existing tables in the target database, and loads all data from the source database to the target database. When the data load completes, the task status becomes Load complete (see Figure 41).

Figure 41: Load complete

Click the refresh button in the DynamoDB console, as shown in Figure 42.

Figure 42: Refreshing Data in wlslog table

The number of items listed is 10, as shown in Figure 43, instead of the seven before restarting the task. The three new items we added make the items total 10.

Figure 43: Listing 10 items after adding new items and refreshing data

The newly migrated items can be distinguished from the items loaded in the first run by the prefix of the _id: the items loaded in the first run have the prefix 59401 in the _id, and the items loaded in the second run have the prefix 59402. The Filter option may be used to list just the new data, as shown in Figure 44.

Figure 44: Filtering Data to list only three new items
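The differing prefixes have a likely explanation, assuming the _id values are ObjectIds generated automatically by MongoDB (the documents above are inserted without an explicit _id): the first four bytes of an ObjectId encode its creation time in seconds since the Unix epoch, so documents inserted later get a larger prefix. The sketch below decodes that timestamp; the two ids are hypothetical values that merely share the 59401 and 59402 prefixes.

```javascript
// Decode the creation time stored in the first 4 bytes (8 hex digits)
// of a MongoDB ObjectId.
function objectIdTimestamp(hexId) {
  if (!/^[0-9a-fA-F]{24}$/.test(hexId)) {
    throw new Error("expected a 24-character hexadecimal ObjectId");
  }
  const seconds = parseInt(hexId.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Hypothetical ids; only the 59401/59402 prefixes come from the article
const firstRunId = "59401a2b0000000000000000";
const secondRunId = "59402c3d0000000000000000";

console.log(objectIdTimestamp(firstRunId).toISOString());
console.log(objectIdTimestamp(secondRunId).toISOString());
```

Under that assumption, the prefix reflects when a document was inserted into MongoDB rather than which DMS run copied it.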

The task status again becomes Stopped after the new data has been migrated, as shown in Figure 45.

Figure 45: Status becomes Stopped after Full Load completed

Deleting a Migration

To delete a migration, select the migration and click Delete (see Figure 46).

Figure 46: Delete

In the Delete task dialog, click Delete, as shown in Figure 47.

Figure 47: Delete task verification

The task status becomes Deleting, as shown in Figure 48, before the task gets deleted.

Figure 48: Task Deleting

Deleting a task does not delete the DMS endpoints it used, so a new task may be created with the same endpoints.

Conclusion

In two articles, we discussed migrating a MongoDB database to DynamoDB tables by using AWS Database Migration Service.
