How to Migrate to MySQL RDS Without Downtime

Dmitriy Royzenberg
WorkMarket Engineering
7 min read · Jan 2, 2019


As the AWS cloud gains popularity, many companies are looking to migrate their database infrastructure to Amazon RDS, and one of the first questions they ask is: “How do we do it without downtime?” Given the demand, Amazon implemented the Database Migration Service (DMS) and also supports replication from an external MySQL server into RDS to make the move easier for prospective customers. Both solutions establish a replication flow from the non-RDS database to RDS and reduce the downtime during the data transfer to a few minutes.

However, neither solution avoids downtime during the traffic switchover from the non-RDS database to RDS. While a few minutes of downtime may seem insignificant, it is unacceptable for many companies, including WorkMarket. We therefore implemented the following workaround to avoid any downtime or impact on our production database and application.

In my previous post on how we perform zero-downtime maintenance on MySQL RDS, I described why we chose RDS and how we implemented a workaround to avoid RDS outage limitations during scheduled maintenance, re-configurations, and upgrades. Reading that article will give you a better understanding of our decisions.

While RDS doesn’t support Master-to-Master (M/M) replication out of the box (neither RDS to RDS nor EC2 to RDS), nothing prevents us from implementing it ourselves. Our approach was to configure M/M replication between our MySQL instance hosted on EC2 and the new RDS instance, then gracefully switch traffic from our application servers one at a time, without any downtime for our customers.

One of the main advantages of this approach is that if you hit any issues during the migration, you can switch back to the original non-RDS database without any data loss, as the data is in sync on both masters. You can also keep your non-RDS infrastructure on standby for some time while monitoring how your application behaves on RDS; the non-RDS database remains entirely in sync with RDS and ready for rollback at any time. Once you are comfortable with your application’s performance on RDS, you can decommission the obsolete non-RDS database. No Amazon solution offers that!

If this grabs your attention, I’ve outlined the implementation steps of the process below. Otherwise, you can skip to the end.

Assumptions and preparation requirements

  • To clarify, the non-RDS server in this article refers to a MySQL instance running either in your own data center or on an Amazon EC2 instance.
  • The approach described here only works for MySQL as it natively supports M/M.
  • For simplicity, we assume that we need to migrate two non-RDS servers: Master (M1) and its slave (S1) to M2 and S2 RDS instances (see diagram below). Once you understand the approach, it is irrelevant how large your environment is. The process is the same with some minor tweaks.
  • Your MySQL version should be 5.6.13 or later. Ideally, M2 should run the same or a higher minor version than M1; the major versions must not differ.
  • You’ve configured VPN and security groups to allow bidirectional secure communication between RDS and non-RDS instance on the database ports.
  • Convert non-system MyISAM tables to InnoDB before the move.
  • Enable binary logs on RDS and non-RDS instances.
  • Set binary logs retention to a sufficient time duration for your non-RDS database to be transferred to RDS and sync up via binary logs replication.
On RDS:
call mysql.rds_set_configuration('binlog retention hours', 72);
On non-RDS:
configure the expire_logs_days parameter in my.cnf.
  • Provision an empty Multi-AZ RDS master (M2) as your destination production database.
  • Create a replication user on M1 and M2:
GRANT EXECUTE, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'repl'@'%' IDENTIFIED BY '<secret>';
  • Extract all user-defined MySQL user accounts on M1 and create them on M2 as they are excluded from mysqldump. You can use pt-show-grants to help you with extraction.
  • Before you begin, please ensure that you set auto_increment_increment and auto_increment_offset to avoid M/M replication primary-key conflicts. In my case, I set them as follows:
non-RDS: auto_increment_increment = 4 and auto_increment_offset = 1
RDS: auto_increment_increment = 4 and auto_increment_offset = 2
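To see why these settings prevent conflicts, note that each master then hands out IDs only from its own congruence class modulo the increment. A quick sketch of the resulting sequences (values taken from the settings above; an increment of 4 leaves two classes free for future masters):

```shell
# With auto_increment_increment = 4, the non-RDS master (offset 1) and the
# RDS master (offset 2) generate IDs from disjoint sequences, so concurrent
# inserts on the two masters can never collide on an auto-increment key.
increment=4
all=""
for offset in 1 2; do
  ids=""
  for n in 0 1 2 3; do
    ids="$ids$((offset + n * increment)) "
  done
  echo "offset $offset generates: $ids"
  all="$all$ids"
done
```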
  • Replication Filter
Set the replicate-wild-ignore-table=mysql.rds_% parameter on M1 to avoid replicating RDS meta tables from RDS to non-RDS once M/M is set up. The change requires an M1 restart. In MySQL 5.7 you can avoid a restart by using:
CHANGE REPLICATION FILTER REPLICATE_WILD_IGNORE_TABLE = ('mysql.rds_%');
Alternatively, if you wish to avoid setting replication filters, you can extract the RDS table definitions on RDS and create them on non-RDS with the BLACKHOLE engine, so that any changes to these tables coming from RDS are ignored.
  • mysqldump considerations

While AWS supports mysqldump and Percona XtraBackup to copy a database, we chose to use mysqldump as it allowed us to defragment our data during the process.

To avoid any impact on production traffic, we don’t take a mysqldump directly from the primary DB instance (M1), as the export process may take a very long time and could trigger performance issues on M1 (despite the suggestion in the Amazon documentation). Instead, we used the approach described in exporting data from RDS to non-RDS and ran mysqldump on the slave (S1). I’ll describe the steps later in the article.

We need to make sure that S1 is not taking any production traffic, to avoid any performance issues. If yours does, you’ll need to create a separate dedicated slave to perform the export.

The S1 instance is hosted on s1-host, and we place the mysqldump output in the /mnt/mybackup directory, which can be either a local or a mounted drive, as this is the fastest way to export the data. Please ensure that the drive has sufficient disk space for your compressed backup.
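As a quick pre-flight check, you can verify free space on the backup drive before kicking off the dump. A minimal sketch, assuming a POSIX `df`; the directory and the 50 GB threshold are placeholders to size for your own compressed backup:

```shell
# Hypothetical pre-dump guard: report whether the backup directory has
# enough free space for the compressed dump. backup_dir and need_gb are
# placeholders; in this article the dump goes to /mnt/mybackup on s1-host.
backup_dir=${backup_dir:-/tmp}
need_gb=${need_gb:-50}
need_kb=$((need_gb * 1024 * 1024))
# -P forces the portable single-line format; column 4 is available KB
free_kb=$(df -Pk "$backup_dir" | awk 'NR==2 {print $4}')
if [ "$free_kb" -ge "$need_kb" ]; then
  echo "ok: $((free_kb / 1024 / 1024)) GB free in $backup_dir"
else
  echo "warning: only $((free_kb / 1024 / 1024)) GB free in $backup_dir, want $need_gb GB" >&2
fi
```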

  • Do thorough testing

Don’t rush the production move. Make sure you’ve spent enough time testing in non-prod and addressing any bugs specific to your environment. For example, we came across an issue with time-zone support while setting up replication from RDS to the non-RDS master, and after some research we had to run the following command on our non-RDS master:

shell> mysql_tzinfo_to_sql /usr/share/zoneinfo | mysql -u root
(Diagram: Zero-Downtime MySQL Non-RDS to RDS Migration)

Implementation

Do not proceed with the steps unless you’ve addressed all the requirements above. The good news is that once you’ve addressed them, you’re already halfway there!

1. Perform data export on s1-host from S1

a) Connect to S1 and stop replication. Also, ensure that no other activity or changes are taking place on the server; it must be completely frozen, as it has to stay in a consistent state during the export.
mysql> stop slave;
b) Obtain M1’s current binlog file and position coordinates from S1:
mysql> show slave status\G
Master_Log_File: mysql-bin.000113
Exec_Master_Log_Pos: 9135402
c) Execute mysqldump on s1-host. Note that we don’t specify --master-data=2, because we run the export on the slave and that option would capture the coordinates of S1, not M1; we only need M1’s coordinates, which we took in the previous step. We also use nohup to keep the long-running process in the background.
ssh s1-host
$ nohup mysqldump -u root --single-transaction --routines --triggers --databases mydb | gzip -c -1 > /mnt/mybackup/mydb.sql.gz &
d) After mysqldump completes, it is a good idea to verify that the coordinates are still the same as before the export, and then restart replication.
mysql> show slave status\G
Master_Log_File: mysql-bin.000113
Exec_Master_Log_Pos: 9135402
mysql> start slave;
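For step (d), a small parsing helper can make the before/after comparison scriptable. A sketch, assuming standard awk and paste; the sample text below stands in for live `show slave status\G` output (the anchored pattern avoids matching the similarly named Read_Master_Log_Pos and Relay_Master_Log_File fields):

```shell
# Extract the two coordinate fields from `show slave status\G` output so a
# script can compare them before and after the export. In a real run you
# would pipe `mysql -h s1-host -e 'show slave status\G'` through the same
# awk; here a hard-coded sample stands in for the live output.
sample='        Master_Log_File: mysql-bin.000113
    Exec_Master_Log_Pos: 9135402'
coords=$(printf '%s\n' "$sample" \
  | awk -F': ' '/^ *(Master_Log_File|Exec_Master_Log_Pos):/ {print $2}' \
  | paste -sd: -)
echo "coordinates: $coords"
```

Capturing this value before and after the dump, and failing the run if the two differ, turns the manual eyeball check into a guard.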

2. Perform data import to M2.

If your non-RDS servers are already on AWS, make sure that they are located within the same availability zone (AZ) to minimize data-transfer costs. If you are transferring from your own data center, consider copying your compressed backup to a temporary EC2 instance in the same AZ as the destination RDS instance and running the import from there. In our case, we import directly from the s1-host EC2 instance, which is located in the same AZ as M2.

ssh s1-host
$ nohup gunzip < /mnt/mybackup/mydb.sql.gz | mysql -h m2.ccxyxyxyxyxyx24.us-east-1.rds.amazonaws.com 2>&1 &
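After the import, it is worth a quick consistency check, e.g. comparing table lists between S1 and M2. A hedged sketch: the two lists below stand in for the output of `mysql -N -e "show tables from mydb"` on each side, and the table names are made up:

```shell
# Compare source and destination table lists and report anything that
# failed to import. The lists are stand-ins for the output of
# `mysql -N -e "show tables from mydb"` on S1 and M2; `orders` is
# deliberately missing from the destination to show the failure case.
src='accounts
orders
users'
dst='accounts
users'
missing=""
for t in $src; do
  printf '%s\n' "$dst" | grep -qx "$t" || missing="$missing$t "
done
echo "missing tables: ${missing:-none}"
```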

3. Set up M/M replication

After the import completes, we can set up M/M replication. In this example, we assume that the m1-host IP address is 10.199.9.91.

a) Configure M2 as a slave of M1:
mysql> CALL mysql.rds_set_external_master ('10.199.9.91', 3306, 'repl', 'pass', 'mysql-bin.000113', 9135402, 0);
mysql> CALL mysql.rds_start_replication;
mysql> show slave status\G
b) Connect to M2, and obtain its current binlog file and position coordinates:
mysql> show master status\G
***************************
File: mysql-bin.000042
Position: 652
c) Connect to M1 and configure it to be a slave of M2:
mysql> CHANGE MASTER TO
MASTER_HOST='m2.xyxyxyxyxy24.us-east-1.rds.amazonaws.com',
MASTER_USER='repl',
MASTER_PASSWORD='pass',
MASTER_LOG_FILE='mysql-bin.000042',
MASTER_LOG_POS=652;
-- See more discussion on the replication FILTER in the preparation section above
mysql> CHANGE REPLICATION FILTER REPLICATE_WILD_IGNORE_TABLE = ('mysql.rds_%');
mysql> start slave;
mysql> show slave status\G

4. Create S2 as a read replica of M2

Now that we’ve set up M/M replication, we need to provision all the other servers, including cross-region slaves (if you have them). This makes the RDS environment identical to the non-RDS environment before the traffic switchover. In our case, we only need to provision the S2 slave. Use the standard process as described in the documentation.

5. Perform Graceful Switchover

Since we replicated our entire non-RDS infrastructure to RDS and set up M/M replication, we are ready to proceed with the graceful switchover without downtime.

a) Begin the switchover by moving your read-only traffic first, pointing one of the application servers at the slave (S2). Since the slave is read-only, it offers an easy switchback to the source DB without data loss in case of an issue. After careful testing, you can switch all the remaining read-only application servers to the slave.

b) After switching the read traffic to S2, you can begin switching read-write traffic to M2 in the same manner. The beauty of the M/M approach is that, since the data is replicated both ways, you can easily switch back to the source DB without losing any data in case of an issue.
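Before pointing each application server at M2, it is worth gating the cutover on replication health. A sketch of such a guard; the status text is a stand-in for a live `show slave status\G` call, and in practice you would run it against the target right before each switch:

```shell
# Only declare the cutover safe when the target reports both replication
# threads running and zero lag. The hard-coded status block stands in for
# live `show slave status\G` output from M2.
status='             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 0'
# pull one field's value out of the \G-formatted output
field() { printf '%s\n' "$status" | awk -F': ' "/$1:/ {print \$2}"; }
lag=$(field Seconds_Behind_Master)
io=$(field Slave_IO_Running)
sql=$(field Slave_SQL_Running)
if [ "$lag" = "0" ] && [ "$io" = "Yes" ] && [ "$sql" = "Yes" ]; then
  verdict="safe to switch traffic"
else
  verdict="hold the switchover (lag=$lag io=$io sql=$sql)"
fi
echo "$verdict"
```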

Conclusion

The answer to the question “Is it possible to move from non-RDS to RDS without downtime?” is a resounding “Yes!” With a little effort and creativity, we were able to offer our customers a seamless transition to the RDS platform on the AWS Cloud while working around the limitations that it poses.

Our goal is to keep our customers happy, so we are happy to come up with ways to do just that!
I hope you’ll find the article helpful. Please feel free to reach out to me if you have any questions!
