Database Replication: AWS Database Migration Service

AWS DMS is an easy-to-use, cost-effective database replication tool

Thomas Spicer
Published in Openbridge
Jan 30, 2020

In August 2016, Amazon Web Services released Database Migration Service (DMS). DMS is database replication software focused on making it easier to migrate data from a source database to a target destination like a data warehouse or data lake (within AWS).

What is AWS Database Migration Service?

The service supports migrating data from multiple sources via a few different replication methods. For example, it supports Oracle to Oracle migrations, as well as Oracle or Microsoft SQL Server to Amazon Aurora and Redshift. The product also supports MySQL, Postgres, and other engines.

In addition to replicating data from a database, AWS DMS allows you to continuously replicate your data with high availability and consolidate databases into AWS targets such as Amazon RDS, Amazon Redshift, or Amazon S3 object storage. The S3 destination becomes a landing zone for a data lake.

For in-depth details on the supported source and target database engines, AWS makes these details available on their site.

Types of Database Replication: One-Time, Ongoing

Typically, you will be doing either a one-time migration or continuous data replication. You might undertake a one-time replication to seed a new system for testing or production.

In the case of continuous replication, you may run the process on a schedule, such as a nightly job, or undertake near real-time replication. Ongoing near real-time replication does have specific configuration requirements for read/write access to the source system.
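As a rough sketch, these patterns line up with the migration types DMS exposes through its API. The `MigrationType` values (`full-load`, `cdc`, `full-load-and-cdc`) are the ones DMS uses. The helper function, its pattern names, and the mapping itself are assumptions made for illustration.

```python
def dms_migration_type(pattern: str, seed_existing_data: bool = True) -> str:
    """Map a replication pattern to a DMS MigrationType value.

    The pattern names are hypothetical labels for this sketch:
      "one-time"        a single seed migration
      "scheduled"       a full load that you re-run on a schedule
      "near-real-time"  ongoing change data capture (CDC)
    """
    if pattern in ("one-time", "scheduled"):
        return "full-load"
    if pattern == "near-real-time":
        # Optionally copy existing rows first, then stream ongoing changes.
        return "full-load-and-cdc" if seed_existing_data else "cdc"
    raise ValueError(f"unknown replication pattern: {pattern!r}")


for name in ("one-time", "scheduled", "near-real-time"):
    print(name, "->", dms_migration_type(name))
```

A scheduled job is treated here as a repeated full load, which is a simplification. Your own setup may call for a different choice.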

If you only have read access to the source database, you will need an alternate replication pattern. Why? Most near real-time processes need to write to the source system to keep track of changes. With read-only access, a service cannot do that, so you need a pattern that tracks changes in the source system another way. One alternative is to switch to less frequent replication tasks.

In many cases, near real-time replication may be nice to have, but a scheduled migration task is more than adequate. Scheduled DMS tasks can also be very cost-effective, given that you only need to pay for instances while they are running.

Another benefit of the AWS product is the opportunity to automate processing according to your specific requirements beyond the AWS user interface. For example, you can accomplish automation with AWS CLI, CloudFormation, or a third-party solution like Terraform.
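To make the automation idea concrete, here is a minimal sketch that uses the AWS SDK for Python (boto3) rather than the CLI. It builds the arguments for the DMS `create_replication_task` call and keeps the actual API call in a separate function. The task name and ARNs are placeholders, and the helper function names are hypothetical.

```python
import json


def build_replication_task_args(
    task_id: str,
    source_endpoint_arn: str,
    target_endpoint_arn: str,
    replication_instance_arn: str,
    migration_type: str = "full-load",
    schema: str = "%",
    table: str = "%",
) -> dict:
    """Build keyword arguments for boto3's dms.create_replication_task."""
    # A single selection rule; "%" is the DMS wildcard for all schemas/tables.
    table_mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-tables",
                "object-locator": {"schema-name": schema, "table-name": table},
                "rule-action": "include",
            }
        ]
    }
    return {
        "ReplicationTaskIdentifier": task_id,
        "SourceEndpointArn": source_endpoint_arn,
        "TargetEndpointArn": target_endpoint_arn,
        "ReplicationInstanceArn": replication_instance_arn,
        "MigrationType": migration_type,
        # The API expects the table mappings as a JSON string.
        "TableMappings": json.dumps(table_mappings),
    }


def create_task(args: dict) -> dict:
    """Submit the task to DMS. Needs boto3 installed and AWS credentials."""
    import boto3  # imported here so the builder above runs without AWS access

    return boto3.client("dms").create_replication_task(**args)


# Placeholder identifiers for illustration only.
task_args = build_replication_task_args(
    task_id="nightly-orders-export",
    source_endpoint_arn="arn:aws:dms:us-east-1:123456789012:endpoint:SOURCE",
    target_endpoint_arn="arn:aws:dms:us-east-1:123456789012:endpoint:TARGET",
    replication_instance_arn="arn:aws:dms:us-east-1:123456789012:rep:INSTANCE",
)
print(json.dumps(task_args, indent=2))
```

Separating the argument builder from the API call means you can review or test the configuration without touching an AWS account.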

DMS and Data Lake Landing Zone

One of the intriguing options and a less obvious use case of DMS is using S3 as a target destination. Using the DMS S3 target destination creates a cost-effective and high-quality data lake landing zone for exported tables from a source system.

From your source system landing zone, you can create scalable, zero-administration data pipelines to data lakes or cloud warehouses like Azure Data Lake, AWS Redshift, AWS Redshift Spectrum, AWS Athena, and Google BigQuery.
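For illustration, an S3 target is defined in DMS as an endpoint with the `s3` engine. The sketch below builds the arguments for boto3's `create_endpoint` call. The bucket name, folder, role ARN, and endpoint identifier are placeholders, and the helper function is hypothetical.

```python
def build_s3_target_endpoint_args(
    endpoint_id: str,
    bucket_name: str,
    bucket_folder: str,
    service_access_role_arn: str,
    data_format: str = "csv",
) -> dict:
    """Build keyword arguments for boto3's dms.create_endpoint (S3 target)."""
    if data_format not in ("csv", "parquet"):
        raise ValueError("data_format must be 'csv' or 'parquet'")
    return {
        "EndpointIdentifier": endpoint_id,
        "EndpointType": "target",
        "EngineName": "s3",
        "S3Settings": {
            # IAM role that lets DMS write to the bucket.
            "ServiceAccessRoleArn": service_access_role_arn,
            "BucketName": bucket_name,
            "BucketFolder": bucket_folder,
            "DataFormat": data_format,
        },
    }


# Placeholder values for illustration only.
endpoint_args = build_s3_target_endpoint_args(
    endpoint_id="landing-zone-s3",
    bucket_name="example-landing-zone",
    bucket_folder="dms/source-db",
    service_access_role_arn="arn:aws:iam::123456789012:role/example-dms-s3-role",
)
print(endpoint_args)
```

You would pass the result to `boto3.client("dms").create_endpoint(**endpoint_args)` once the bucket and role exist.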

Pricing

When migrating databases to Amazon Aurora, Amazon Redshift, Amazon DynamoDB, or Amazon DocumentDB (with MongoDB compatibility), you can use DMS free for six months. However, while a free extended trial is helpful, in most cases, you have to budget for ongoing operations.

For less frequent (e.g., daily) batch replication operations, assume a c4.large replication instance at $0.154 per hour, with SSD storage at $0.115 per GB per month. If we run our daily process for 4 hours, the instance costs about $0.62 a day, or about $18.50 for a 30-day month.

Assuming we were consistently persisting about 500 GB of data within the process, storage would be another $57.50 a month. The total service cost would be about $76 a month.

If we were running this 24/7, then the daily instance cost would be about $3.70, or about $111 for the month. The data storage would be roughly the same at 500 GB. The total cost would be about $168 a month.
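The arithmetic behind these estimates is simple enough to sketch in a few lines. The rates are the ones quoted above. The 30-day month and the function name are assumptions for this example, and actual AWS pricing varies by region and over time.

```python
INSTANCE_RATE_PER_HOUR = 0.154     # c4.large replication instance, USD
STORAGE_RATE_PER_GB_MONTH = 0.115  # SSD storage, USD
DAYS_PER_MONTH = 30                # assumption for this estimate


def monthly_cost(hours_per_day: float, storage_gb: float = 500) -> float:
    """Estimate the monthly DMS cost: instance hours plus storage."""
    instance = INSTANCE_RATE_PER_HOUR * hours_per_day * DAYS_PER_MONTH
    storage = STORAGE_RATE_PER_GB_MONTH * storage_gb
    return instance + storage


for label, hours in (("4 hours a day", 4), ("24/7", 24)):
    print(f"{label}: ${monthly_cost(hours):.2f} per month")
```

Running it gives roughly $76 a month for the 4-hour daily job and roughly $168 a month for the 24/7 case.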

Depending on the type of replication, replication schemes, and replication processes, your costs will vary upward or downward. However, regardless of the end configuration, the AWS service for database replication is an affordable option that offers value and powerful capabilities.

Database replication software comparison

Up until a couple of years ago, tools like DMS were hard to come by or difficult to employ. The lack of cost-effective and quick setup solutions led several SaaS vendors like Fivetran, Stitch, Alooma, and Openbridge to roll out solutions.

Why did these companies build out solutions? Customers needed to support data replication from a source database like Postgres, MySQL, and others to a cloud warehouse like Redshift or BigQuery for data analytics. Moving data into a data lake or cloud warehouse opened new opportunities to use tools like Tableau, Looker, PowerBI, and others.

So why would you use a SaaS tool like Fivetran for data replication over DMS? Today, you likely would not unless you have already invested heavily in Fivetran. Given the emergence of DMS and AWS's refinement of the product over the past 12–24 months, it is a go-to offering for replication. The only use case where we still leverage our Openbridge replication tools is for read-only data sources.

So how much does a SaaS tool like Fivetran cost compared to AWS Database Migration Service? Fivetran costs will range from USD 36K to USD 120K per year. Under the Fivetran pricing model, you would likely pay 20x more than for a base AWS configuration. In fairness to Fivetran, you would never select them for database replication services alone. The Fivetran cost would be prohibitive for a data replication use case. If you are thinking of Fivetran as a primary solution for replication, you should explore DMS as an alternative first.

What about other SaaS vendors? Fivetran alternatives like Stitch also offer replication services. While Fivetran competitors like Stitch are less expensive, DMS still affords greater flexibility and cost efficiencies, especially given Stitch's pricing model (based on the number of replicated rows).

For replication use cases, this is less about SaaS comparisons like Stitch vs. Fivetran and more about how these services compare to the latest Amazon DMS offering. If you need to continuously replicate a read-only system, feel free to reach out to the Openbridge team for details on our service.

Getting Started

Getting started with the AWS Database Migration Service requires that you create an AWS account, then set up a migration process and the associated replication instance(s). In more sophisticated use cases, you may need to employ the AWS Schema Conversion Tool.

As with any system migration of data from one location to another, make sure you have a database migration plan in place. This is critical for testing and signoff of the processes. Without this plan, subtle shifts, gaps, or bugs can corrupt the downstream processing.

Openbridge provides a fully managed AWS Database Migration Service (DMS) offering to customers. The typical pattern for our customers is to use the service to deliver data to an AWS S3 landing zone. We then ingest the data into a curated data lake, register everything in a data catalog, and create corresponding tables/views in Athena or Redshift Spectrum.

Want to explore using database replication? Need a platform and team of experts to kickstart your data and analytics efforts? We can help! Getting traction adopting new technologies, especially if it means your team is working in different and unfamiliar ways, can be a roadblock for success. This is especially true in a self-service only world. If you want to discuss a proof-of-concept, pilot, project, or any other effort, the Openbridge platform and team of data experts are ready to help.

Reach out to us at hello@openbridge.com. Prefer to talk to someone? Set up a call with our team of data experts.

Visit us at www.openbridge.com to learn how we are helping other companies break down their data silos.
