Loading...
 
[Zobrazit/Skrýt nabídky vlevo]
[Zobrazit/Skrýt nabídky vpravo]

Replication consistency check and repair

How to repair mysql replication, mostly in ROW logging based environment, where is not so easy to repair by hand (see Mysql - how to debug problem with ROW based logging )

There is basic information about percona toolkit, which will be use. On the REALITY DB, pt-table-checksum is allready prepared: Replication consistency check


At first, clear checksum table on master by invoking (on RCZ):

MariaDB [(none)]> use percona;
MariaDB [percona]> truncate table checksum;

Second, we have to sync slave to master, even if there is some errors. We can skip errors by hand
stop slave; SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1; start slave;
or by my.cnf setting like
slave-skip-errors = 1032
. This will skip errors 1032, which will be logged:
Slave SQL: Could not execute Update_rows_v1 event on table REALITY.xpristupu; Can't find record in 'xpristupu', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mariadb-reality-binlog.000109


When we have slave in sync (Seconds_Behind_Master: 0) we have to decide which way we use. We can checksummed whole database, or just surely inconsistent. This information can be obtained from slave log, search for example for error 1032:

[root@mdbbackup.reality.internal /]# cat /var/log/mysqld.log | grep 1032 | awk '{print $14}' | sort -n | uniq 
REALITY.fotky;
REALITY.pocetVypisDen;
REALITY.xpristupu;
UZIV.sessions;


Now, we run checksum against acquired tables:

/usr/bin/pt-table-checksum --tables REALITY.fotky,REALITY.pocetVypisDen,REALITY.xpristupu,UZIV.sessions --replicate=percona.checksum --create-replicate-table --recursion-method dsn=h=127.0.0.1,D=percona,t=dsns  --nocheck-replication-filters h=localhost,u=root


We get some info into checksum table. Now, we have to sync this inconsistency to slave. We use pt-table-sync tool. Because there is some data manipulation, you have to know, what are you doing and using "dry-run" and "print" as much as possible.

We will print "sync" queries by this command executed on master

pt-table-sync --print --replicate percona.checksum 127.0.0.1


If it looks good, we can execute this sync by this command executed on master

pt-table-sync --execute --replicate percona.checksum 127.0.0.1


It will find slaves and make slaves synchronous by some updates to binlog.

Possible Problems:


If any problem comes into play, like:

Change algorithm

Failed to prepare TableSyncChunk plugin: Cannot chunk table `REALITY`.`xpristupu` using the character column z_cislo_zakazky, most likely because all values start with the same character. This table must be synced separately by specifying a list of --algorithms without the Chunk algorithm at /usr/bin/pt-table-sync line 4074.  while doing REALITY.xpristupu on 172.55.0.13


We have to sync with specified algorithm (before we can truncate checksum table and run checksum for specific table only):

pt-table-sync --execute --algorithms=Nibble --replicate percona.checksum 127.0.0.1


No unique index

[root@mariadb.reality.internal scripts]# pt-table-sync --print --replicate percona.checksum 127.0.0.1
Can't make changes on the master because no unique index exists at /usr/bin/pt-table-sync line 10697.  while doing REALITY.pristupy on 172.55.0.13


We have to use diferrent syntax of pt-table-sync resulting in diferrent behaviour. When we use more then 1 DSN in command, it will force sync from first DSN to others, regardless master or slave status. You have to be very very careful and read manual page carefuly!! man pt-table-sync

pt-table-sync --dry-run  --explain-host h=localhost,D=REALITY,t=dotazy h=172.55.0.13
pt-table-sync --print  h=localhost,D=REALITY,t=pristupyVypis h=172.55.0.13
pt-table-sync --execute  h=localhost,D=REALITY,t=pristupyVypis h=172.55.0.13



Created by darek. Last Modification: Pondělí 21 of listopad, 2016 11:58:38 CET by darek.