How do you re-add an ER node that has been down for an extended period, for example after a hard disk crash? -- Posted by inturi on Wednesday, October 8 2008
Step1: After the hard disk crashes at Node1, delete Node1 from Node2.
Command: cdr delete server -c Node2 Node1

Note: This step avoids building up the ER send queue (sendq) at Node2 while Node1 is down for an extended period of time.

Step2: After restoring Node1 from backup, delete the ER definition at Node1 using the command 'cdr delete server -c Node1 Node1'

Step3: Redefine ER at Node1
Command: cdr define server -c Node1 -S Node2 Node1

Step4: Now add the Node1 participant to all replicate definitions using the command
cdr change repl -c Node1 "replicate name" --add "Node1 participant" "select statement"
If you are using templates, you can instead run 'cdr realize template "template name" "database@node1"'

Correct the problem if you see any errors here.
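If you have many replicates, Step4 can be scripted. The following is only a dry-run sketch: it prints one 'cdr change repl' command per replicate, in the form shown above, so you can review the commands before running them. The replicate names, participants, and select statements are hypothetical placeholders, and the function name add_participant_cmds is made up for this example.

```shell
#!/bin/sh
# Dry-run sketch: print one 'cdr change repl' command per replicate.
# All replicate names, participants, and SELECT statements below are
# hypothetical placeholders; replace them with your own definitions.

# One replicate per line: replicate name|Node1 participant|select statement
REPLICATES='repl_customer|stores@Node1:informix.customer|select * from customer
repl_orders|stores@Node1:informix.orders|select * from orders'

add_participant_cmds() {
    # Reads "name|participant|select" lines on stdin and prints the commands.
    while IFS='|' read -r name participant select_stmt; do
        [ -n "$name" ] || continue   # skip empty lines
        printf 'cdr change repl -c Node1 "%s" --add "%s" "%s"\n' \
            "$name" "$participant" "$select_stmt"
    done
}

printf '%s\n' "$REPLICATES" | add_participant_cmds
```

Review the printed commands, then run them one at a time so that any error can be corrected before moving on.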

Step5: Wait a few seconds, then run the 'cdr error' command at Node1 and check for remote server errors. Correct the problem if you see any errors here.

Step6: Start all the replicate definitions for Node1 using the command
cdr start repl -c Node1 "replicate name" Node1

Correct the problem if you see any errors here.
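Step6 can be looped over all replicates in the same way. This is a minimal sketch that stops at the first failing command so the problem can be corrected before continuing. It assumes 'cdr' returns a nonzero exit status on failure; the CDR variable, the function name start_replicates, and the replicate names are hypothetical. By default the script only echoes the commands (dry run).

```shell
#!/bin/sh
# CDR defaults to a dry run that only echoes each command.
# Set CDR=cdr in the environment to execute for real.
CDR=${CDR:-echo cdr}

start_replicates() {
    # Starts each named replicate at Node1 and stops at the first failure.
    # $CDR is left unquoted on purpose so "echo cdr" splits into two words.
    for repl in "$@"; do
        if ! $CDR start repl -c Node1 "$repl" Node1; then
            echo "start failed for replicate: $repl" >&2
            return 1
        fi
    done
}

# Hypothetical replicate names, for illustration only.
start_replicates repl_customer repl_orders
```

Stopping at the first failure keeps the error for one replicate from being buried under output from the others.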

Step7: Wait a few seconds, then run the 'cdr error' command at Node1 and check for remote server errors. Correct the problem if you see any errors here.

Step8: Now resynchronize the Node1 data with the Node2 data using the 'cdr check' command with the repair option (--repair).
Specify Node2 as the sync node (-m option).
'cdr check' can be run on a replicate set or on individual replicates.
If you have a replicate set created for all the replicates, then you can run 'cdr check' on the replicate set.
Otherwise, run 'cdr check' on individual replicates. In that case, when tables have a primary key/foreign key relationship, make sure you run 'cdr check' on the parent tables first and then on the child tables.
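If you check individual replicates, one way to work out a parent-before-child order is the standard Unix 'tsort' utility. This is only a sketch of the ordering idea, not something 'cdr' does for you here. The table names and the function name check_order are hypothetical, and you still have to map each table to its replicate yourself.

```shell
#!/bin/sh
# Hypothetical parent/child relationships, one "parent child" pair per line:
# orders references customer, and items references orders.
FK_PAIRS='customer orders
orders items'

check_order() {
    # tsort reads "before after" pairs and prints an order in which
    # every parent table appears before its child tables.
    printf '%s\n' "$1" | tsort
}

# Prints the tables in the order their replicates should be checked.
check_order "$FK_PAIRS"
```

Run 'cdr check' on the replicate for each table in the printed order.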

'cdr check' on a replicate set takes care of ordering parent and child tables, and it also runs 'cdr check' on multiple tables in parallel wherever possible.

Note: By default, the 'cdr check' command deletes all extra rows found at Node1. The other two options for extra rows found at Node1 are 'keep' and 'merge'. See the ER manual for documentation on these options.
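As a rough sketch of what the Step8 command lines might look like, the helper below only prints them (dry run). The --repair and -m options come from this tip. The 'replicate'/'replicateset' subcommand spellings, the -r and -s options, and --extratargetrows are assumptions taken from my recollection of the ER documentation, so verify them against the manual for your version. The function name check_cmd and the replicate/set names are hypothetical.

```shell
#!/bin/sh
check_cmd() {
    # $1 = "repl" or "replset", $2 = replicate or set name,
    # $3 = optional extra-rows policy (keep or merge; default is delete)
    kind=$1 name=$2 extra=$3
    case $kind in
        repl)    sub='replicate'    opt='-r' ;;   # assumed option spelling
        replset) sub='replicateset' opt='-s' ;;   # assumed option spelling
        *)       echo "unknown kind: $kind" >&2; return 1 ;;
    esac
    # Node2 is the sync node (-m); Node1 is the target to repair.
    cmd="cdr check $sub -m Node2 $opt $name --repair"
    if [ -n "$extra" ]; then
        cmd="$cmd --extratargetrows=$extra"       # assumed option spelling
    fi
    printf '%s Node1\n' "$cmd"
}

# Hypothetical names, for illustration only.
check_cmd replset all_repls
check_cmd repl repl_customer keep
```

Printing the command first lets you confirm the sync node and the extra-rows policy before any rows are deleted at Node1.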

Step9: Once the data is in sync, start the applications at Node1.


Note: If Node1 is down for only a few hours and you have enough queue space, you can just bring up Node1, and it will automatically sync up with the other ER nodes using the data queued in the sendq of the peer ER nodes.

Also note that the 'cdr check' command is available from 10.00.xC4 onwards. You can use 'cdr define repair' or the -S option of the 'cdr start repl' command if you have 10.00.xC1, xC2 or xC3.
