
title: Getting past a mysql replication error
link: https://onemoretech.wordpress.com/2013/10/08/getting-past-a-mysql-replication-error/
author: phil2nc
description:
post_id: 6497
created: 2013/10/08 09:47:52
created_gmt: 2013/10/08 13:47:52
comment_status: closed
post_name: getting-past-a-mysql-replication-error
status: publish
post_type: post

Getting past a mysql replication error

So what happens if "show slave status\G" reveals that your slave is no longer receiving updates? First check the usual suspects: make sure the master database is running without errors, that there's connectivity between the two machines, and that the replication user account used by the slave still works. Then take a careful look at the error messages in the slave status output (the Last_Error field) and in the mysql error log (on RHEL systems this is /var/log/mysql/mysqld.log, unless you've redefined it in /etc/my.cnf).

If the failing event isn't some massive update that the slave can't do without, but instead something like (as it was in my case) the deletion of an old database that never got replicated to the slave because you didn't do a proper initialization, then you can try skipping that one event and see if it gets things going again. Keep in mind that a skipped event is never applied on the slave, so only do this when you're sure the slave doesn't need it. Here's the procedure to follow on the slave as the root db user:

mysql> stop slave;
mysql> set global SQL_SLAVE_SKIP_COUNTER = 1;
mysql> start slave;

Do another "show slave status\G" and see if the problem has cleared: Slave_IO_Running and Slave_SQL_Running should both say Yes, and Last_Error should be empty. Then look at the mysqld log and confirm that replication has restarted.

131008  9:37:10 [Note] 'SQL_SLAVE_SKIP_COUNTER=1' executed at
 relay_log_file='./mysqld-relay-bin.000002',
 relay_log_pos='17134984', master_log_name='mysql-bin.000001',
 master_log_pos='17134839' and new position at
 relay_log_file='./mysqld-relay-bin.000002',
 relay_log_pos='17135081', master_log_name='mysql-bin.000001',
 master_log_pos='17134936'
131008  9:37:10 [Note] Slave I/O thread: connected to master
 'repl@dbhost1.example.com:3306',replication started in log
 'mysql-bin.000001' at position 75509126
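
If you want to see exactly which event was skipped, the pre-skip master position recorded in that note can be looked up in the master's binary log. This is only a sketch: the file name and position below are copied from my log excerpt above, so substitute the values from your own, and the LIMIT of 5 is an arbitrary choice.

```sql
-- Run on the MASTER, not the slave.
-- File name and start position are the pre-skip master_log_name and
-- master_log_pos values from the [Note] line in the slave's log.
SHOW BINLOG EVENTS IN 'mysql-bin.000001' FROM 17134839 LIMIT 5;

-- The first row returned is the event that the slave skipped.
-- In the note above the master position moved from 17134839 to
-- 17134936, so the skipped event should end at Pos 17134936.
```

With statement-based logging the Info column shows the SQL text of the event, which makes it easy to confirm that the skipped statement was the one you expected.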

As an "acid test" (no pun intended), try creating and then dropping a "test" database on the master, running "show databases" on the slave after each step, to confirm the slave is following its leader.
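
The acid test might look something like the following sketch. The database name repl_check is a made-up placeholder (a database literally named "test" may already exist on some installs, so a unique name is safer), and it assumes there are no replication filters in my.cnf that would keep the new database from reaching the slave.

```sql
-- On the MASTER: create a throwaway database.
-- "repl_check" is a hypothetical name; use one that exists nowhere else.
CREATE DATABASE repl_check;

-- On the SLAVE: the new database should show up within a moment.
SHOW DATABASES LIKE 'repl_check';

-- On the MASTER: clean up.
DROP DATABASE repl_check;

-- On the SLAVE: the same query should now return an empty set.
SHOW DATABASES LIKE 'repl_check';
```

If the database appears and then disappears on the slave, both the create and the drop made it across, which is about as direct a confirmation as you can get.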