NetApp vFiler DR with Data ONTAP Simulator Part 9: Planned Failover
This article is part of a series.
With the configuration tasks from the previous parts, you now have an environment in which netapp01 acts as master and netapp02 as slave. Clients use the share of the vFiler running on the master (netapp01), and the data is replicated to the slave (netapp02). Because of this replication, you can start the vFiler on the slave without data loss if the master fails.
If the slave (netapp02) is not available (e.g. shut down for maintenance), the shares on the vFiler are not affected. Replication to the slave is interrupted, of course, so changes are written only to the master. As soon as the slave (netapp02) is available again, replication resumes and all changes are written to the slave as well.
But what happens if the master (netapp01) is shut down or fails? To keep providing the shares to the clients, the vFiler has to be transferred to the slave (netapp02). There are two cases: a planned failover (a controlled transfer of the vFiler to the slave) and a disaster failover after a failure. This part describes the steps of a planned failover.
- stop vFiler on master (netapp01) => “stopped”
netapp01> vfiler stop vfiler01
vfiler01 stopped
netapp01> Thu Apr 28 20:20:52 CEST [netapp01:vf.stopped:warning]: vfiler: 'vfiler01'; stopped
netapp01> vfiler status
vfiler0 running
vfiler01 stopped

netapp02> vfiler status
vfiler0 running
vfiler01 stopped, DR backup
- start vFiler DR on slave => “running”
netapp02> vfiler dr activate vfiler01@netapp01
Waiting for "vol_vfiler01" to become stable.
Thu Apr 28 20:33:58 CEST [netapp02:snapmirror.sync.fail:notice]: Synchronous SnapMirror from netapp01_vfiler01_con:vol_vfiler01 to netapp02:vol_vfiler01 failed.
CIFS local server is running.
Thu Apr 28 20:34:04 CEST [vfiler01@netapp02:cifs.startup.local.succeeded:info]: CIFS: CIFS local server is running.
Thu Apr 28 20:34:04 CEST [netapp02:httpd.config.mime.missing:warning]: /etc/httpd.mimetypes.sample file is missing.
Thu Apr 28 20:34:04 CEST [vfiler01@netapp02:httpd.config.mime.missing:warning]: /etc/httpd.mimetypes file is missing.
Thu Apr 28 20:34:04 CEST [vfiler01@netapp02:httpd.config.mime.missing:warning]: /etc/httpd.mimetypes.sample file is missing.
Thu Apr 28 20:34:05 CEST [netapp02:wafl.scan.ownblocks.done:info]: Completed block ownership calculation on volume vol_vfiler01. The scanner took 0 ms.
Vfiler vfiler01 activated.
e0a: flags=0xe48867 mtu 1500
    inet 192.168.2.67 netmask 0xffffff00 broadcast 192.168.2.255
    ether 00:0c:29:61:01:2b (auto-1000t-fd-up) flowcontrol full
netapp02> Thu Apr 28 20:34:05 CEST [netapp02:cmds.vfiler.dr.activated:info]: Disaster recovery backup vFiler unit: 'vfiler01' of the vFiler unit at remote storage system: 'netapp01' was activated.
Thu Apr 28 20:34:29 CEST [vfiler01@netapp02:nbt.nbns.registrationComplete:info]: NBT: All CIFS name registrations have completed for the local server.
netapp02> vfiler status
vfiler0 running
vfiler01 running
- check state of SnapMirror => “Source” on master and “Broken-off” on slave
netapp01> snapmirror status
Snapmirror is on.
Source                              Destination            State         Lag       Status
netapp01:vol_vfiler01               netapp02:vol_vfiler01  Source        00:04:22  Idle

netapp02> snapmirror status
Snapmirror is on.
Source                              Destination            State         Lag       Status
netapp01_vfiler01_con:vol_vfiler01  netapp02:vol_vfiler01  Broken-off    00:05:12  Idle
- resync from slave to master (-s for synchronous replication)
netapp01> vfiler dr resync -s vfiler01@netapp02
One can optionally provide an alternate ip path for sync snapmirroring
Alternate IP address/Hostname for remote filer netapp02 []:
Alternate IP address/Hostname for local filer netapp01 []:
netapp02's Administrative login: root
netapp02's Administrative password:
CIFS local server on vFiler vfiler is shutting down...
CIFS local server on vfiler vfiler has shut down...
Thu Apr 28 20:40:57 CEST [vfiler01@netapp01:telnet_0:notice]: IP address 192.168.2.68 is removed from interface "e0a"
Configuring SnapMirror to mirror vfiler vfiler01's storage units from remote filer netapp02.
Starting snapmirror initialize commands. It could take a very long time when the source or destination filers are involved in many simultaneous transfers. The console will not be available until all initialize commands are started successfully. Please use the "snapmirror status" command on the source filer to monitor the progress.
Thu Apr 28 20:41:00 CEST [netapp01:snapmirror.dst.resync.info:notice]: SnapMirror resync of vol_vfiler01 to netapp02:vol_vfiler01 is using netapp02(4082368507)_vol_vfiler01.4 as the base snapshot.
Thu Apr 28 20:41:00 CEST [netapp01:vFiler.storageUnit.off:warning]: vFiler vfiler01: storage unit /vol/vol_vfiler01 now offline.
Thu Apr 28 20:41:01 CEST [netapp01:wafl.snaprestore.revert:info]: Reverting volume vol_vfiler01 to a previous snapshot.
Thu Apr 28 20:41:02 CEST [netapp01:vFiler.storageUnit.On:notice]: vFiler vfiler01: storage unit /vol/vol_vfiler01 now online.
Revert to resync base snapshot was successful.
Thu Apr 28 20:41:02 CEST [netapp01:replication.dst.resync.success:notice]: SnapMirror resync of vol_vfiler01 to netapp02:vol_vfiler01 was successful.
SnapMirror transfer initiated for vfiler storage units.
- check SnapMirror from netapp02 (Source) to netapp01 (Destination) => additional entries with state “Snapmirrored” on master and “Source” on slave
netapp01> snapmirror status
Snapmirror is on.
Source                              Destination            State         Lag       Status
netapp02_vfiler01_con:vol_vfiler01  netapp01:vol_vfiler01  Snapmirrored  00:00:00  In-sync
netapp01:vol_vfiler01               netapp02:vol_vfiler01  Source        00:10:45  Idle
netapp02> snapmirror status
Snapmirror is on.
Source                              Destination            State         Lag       Status
netapp01_vfiler01_con:vol_vfiler01  netapp02:vol_vfiler01  Broken-off    00:11:31  Idle
netapp02:vol_vfiler01               netapp01:vol_vfiler01  Source        00:00:00  In-sync
With these steps, master and slave have swapped roles. The vFiler now runs on netapp02 and data is replicated from netapp02 to netapp01. You can now shut down netapp01 and do your maintenance.
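The command sequence above can also be written down as data, for example to print a checklist before a maintenance window. The following is a minimal Python sketch: `Step` and `planned_failover_steps` are hypothetical names chosen for illustration, and the sketch only formats the commands used above. It does not connect to a filer or run anything.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    host: str     # console on which the command is typed
    command: str  # command as used in the transcripts above


def planned_failover_steps(vfiler: str, active: str, standby: str) -> List[Step]:
    """List the console commands that move `vfiler` from `active` to `standby`.

    Hypothetical helper: it only builds strings, nothing is executed.
    """
    # SnapMirror state is checked on both systems after each change.
    check = [Step(active, "snapmirror status"), Step(standby, "snapmirror status")]
    return [
        Step(active, f"vfiler stop {vfiler}"),
        Step(standby, f"vfiler dr activate {vfiler}@{active}"),
        *check,
        # -s requests synchronous replication, as in the article.
        Step(active, f"vfiler dr resync -s {vfiler}@{standby}"),
        *check,
    ]


if __name__ == "__main__":
    for step in planned_failover_steps("vfiler01", "netapp01", "netapp02"):
        print(f"{step.host}> {step.command}")
```

Swapping the `active` and `standby` arguments yields the same commands for the opposite direction. Waiting for “In-sync” and any cleanup of old SnapMirror relationships are deliberately left out of the sketch.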
Once the “old” master (netapp01) is back, you can move the vFiler back to it.
- wait until the status of SnapMirror from netapp02 (Source) to netapp01 (Destination) is “In-sync”
netapp01> snapmirror status
Snapmirror is on.
Source                              Destination            State         Lag       Status
netapp02_vfiler01_con:vol_vfiler01  netapp01:vol_vfiler01  Snapmirrored  00:00:00  In-sync
netapp01:vol_vfiler01               netapp02:vol_vfiler01  Source        00:14:45  Idle

netapp02> snapmirror status
Snapmirror is on.
Source                              Destination            State         Lag       Status
netapp01_vfiler01_con:vol_vfiler01  netapp02:vol_vfiler01  Broken-off    00:15:31  Idle
netapp02:vol_vfiler01               netapp01:vol_vfiler01  Source        00:00:00  In-sync
- stop vFiler on slave => “stopped”
netapp02> vfiler stop vfiler01
vfiler01 stopped
Thu Apr 28 20:47:10 CEST [netapp02:vf.stopped:warning]: vfiler: 'vfiler01'; stopped
netapp02> vfiler status
vfiler0 running
vfiler01 stopped
- start vFiler on master => “running”
netapp01> vfiler dr activate vfiler01@netapp02
Waiting for "vol_vfiler01" to become stable.
Thu Apr 28 20:48:23 CEST [netapp01:snapmirror.sync.fail:notice]: Synchronous SnapMirror from netapp02_vfiler01_con:vol_vfiler01 to netapp01:vol_vfiler01 failed.
Thu Apr 28 20:48:30 CEST [netapp01:wafl.scan.ownblocks.done:info]: Completed block ownership calculation on volume vol_vfiler01. The scanner took 0 ms.
CIFS local server is running.
Thu Apr 28 20:48:31 CEST [vfiler01@netapp01:cifs.startup.local.succeeded:info]: CIFS: CIFS local server is running.
Thu Apr 28 20:48:31 CEST [netapp01:httpd.config.mime.missing:warning]: /etc/httpd.mimetypes.sample file is missing.
Thu Apr 28 20:48:31 CEST [vfiler01@netapp01:httpd.config.mime.missing:warning]: /etc/httpd.mimetypes file is missing.
Thu Apr 28 20:48:31 CEST [vfiler01@netapp01:httpd.config.mime.missing:warning]: /etc/httpd.mimetypes.sample file is missing.
Vfiler vfiler01 activated.
e0a: flags=0xe48867 mtu 1500
    inet 192.168.2.66 netmask 0xffffff00 broadcast 192.168.2.255
    inet 192.168.2.69 netmask 0xffffff00 broadcast 192.168.2.255
    ether 00:0c:29:ee:ee:f2 (auto-1000t-fd-up) flowcontrol full
netapp01> Thu Apr 28 20:48:32 CEST [netapp01:cmds.vfiler.dr.activated:info]: Disaster recovery backup vFiler unit: 'vfiler01' of the vFiler unit at remote storage system: 'netapp02' was activated.
Thu Apr 28 20:48:55 CEST [vfiler01@netapp01:nbt.nbns.registrationComplete:info]: NBT: All CIFS name registrations have completed for the local server.
netapp01> vfiler status
vfiler0 running
vfiler01 running
- check state of SnapMirror from netapp02 (Source) to netapp01 (Destination) => “Source” on netapp02 and “Broken-off” on netapp01
netapp01> snapmirror status
Snapmirror is on.
Source                              Destination            State         Lag       Status
netapp02_vfiler01_con:vol_vfiler01  netapp01:vol_vfiler01  Broken-off    00:03:38  Idle
netapp01:vol_vfiler01               netapp02:vol_vfiler01  Source        00:17:04  Idle

netapp02> snapmirror status
Snapmirror is on.
Source                              Destination            State         Lag       Status
netapp01_vfiler01_con:vol_vfiler01  netapp02:vol_vfiler01  Broken-off    00:17:58  Idle
netapp02:vol_vfiler01               netapp01:vol_vfiler01  Source        00:04:32  Idle
- resync of master to slave => status of SnapMirror from netapp01 (Source) to netapp02 (Destination) “In-sync” (can take some time)
netapp02> vfiler dr resync -s vfiler01@netapp01
One can optionally provide an alternate ip path for sync snapmirroring
Alternate IP address/Hostname for remote filer netapp01 []:
Alternate IP address/Hostname for local filer netapp02 []:
netapp01's Administrative login: root
netapp01's Administrative password:
CIFS local server on vFiler vfiler01 is shutting down...
waiting for CIFS shut down (^C aborts)...
CIFS local server on vfiler vfiler01 has shut down...
Thu Apr 28 20:53:02 CEST [vfiler01@netapp02:telnet_0:notice]: IP address 192.168.2.68 is removed from interface "e0a"
Configuring SnapMirror to mirror vfiler vfiler01's storage units from remote filer netapp01.
Starting snapmirror initialize commands. It could take a very long time when the source or destination filers are involved in many simultaneous transfers. The console will not be available until all initialize commands are started successfully. Please use the "snapmirror status" command on the source filer to monitor the progress.
Thu Apr 28 20:53:06 CEST [netapp02:snapmirror.dst.resync.info:notice]: SnapMirror resync of vol_vfiler01 to netapp01:vol_vfiler01 is using netapp01(4082368508)_vol_vfiler01.5 as the base snapshot.
Thu Apr 28 20:53:06 CEST [netapp02:vFiler.storageUnit.off:warning]: vFiler vfiler01: storage unit /vol/vol_vfiler01 now offline.
Thu Apr 28 20:53:08 CEST [netapp02:wafl.snaprestore.revert:info]: Reverting volume vol_vfiler01 to a previous snapshot.
Thu Apr 28 20:53:09 CEST [netapp02:vFiler.storageUnit.On:notice]: vFiler vfiler01: storage unit /vol/vol_vfiler01 now online.
Revert to resync base snapshot was successful.
Thu Apr 28 20:53:10 CEST [netapp02:replication.dst.resync.success:notice]: SnapMirror resync of vol_vfiler01 to netapp01:vol_vfiler01 was successful.
SnapMirror transfer initiated for vfiler storage units.
netapp02> snapmirror status
Snapmirror is on.
Source                              Destination            State         Lag       Status
netapp01_vfiler01_con:vol_vfiler01  netapp02:vol_vfiler01  Snapmirrored  00:00:00  In-sync
netapp02:vol_vfiler01               netapp01:vol_vfiler01  Source        00:08:02  Idle

netapp01> snapmirror status
Snapmirror is on.
Source                              Destination            State         Lag       Status
netapp02_vfiler01_con:vol_vfiler01  netapp01:vol_vfiler01  Broken-off    00:08:37  Idle
netapp01:vol_vfiler01               netapp02:vol_vfiler01  Source        00:00:00  In-sync
- delete the SnapMirror relationships from slave to master
netapp02> snapmirror release vol_vfiler01 netapp01:vol_vfiler01
snapmirror release: vol_vfiler01 netapp01:vol_vfiler01: No release-able destination found that matches those parameters. Use 'snapmirror destinations' to see a list of release-able destinations.

netapp01> snapmirror release vol_vfiler01 netapp01:vol_vfiler01
snapmirror release: vol_vfiler01 netapp01:vol_vfiler01: No release-able destination found that matches those parameters. Use 'snapmirror destinations' to see a list of release-able destinations.
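The first step of the way back, waiting until the SnapMirror status is “In-sync”, lends itself to a small polling loop. The sketch below is only an illustration in Python: `is_in_sync` and `wait_until_in_sync` are hypothetical helpers, and `fetch_status` stands for whatever you use to obtain the text of `snapmirror status` (for example a remote shell wrapper), which is an assumption and not shown here. The parsing assumes the five-column layout seen in the transcripts above.

```python
import time
from typing import Callable


def is_in_sync(status_output: str, source: str, destination: str) -> bool:
    """True if the row source -> destination reports the status 'In-sync'."""
    for line in status_output.splitlines():
        fields = line.split()
        # Data rows: Source, Destination, State, Lag, Status
        if len(fields) >= 5 and fields[0] == source and fields[1] == destination:
            return fields[4] == "In-sync"
    return False


def wait_until_in_sync(
    fetch_status: Callable[[], str],
    source: str,
    destination: str,
    interval: float = 10.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the relationship is in sync; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if is_in_sync(fetch_status(), source, destination):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Rows taken from the netapp02 output shown earlier in this article.
    sample = """Snapmirror is on.
Source Destination State Lag Status
netapp01_vfiler01_con:vol_vfiler01 netapp02:vol_vfiler01 Broken-off 00:15:31 Idle
netapp02:vol_vfiler01 netapp01:vol_vfiler01 Source 00:00:00 In-sync
"""
    print(is_in_sync(sample, "netapp02:vol_vfiler01", "netapp01:vol_vfiler01"))
```

Note that the source name must be given exactly as the queried system prints it: on the destination side the source appears with the `_vfiler01_con` suffix, as the outputs above show.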
As before the failover, the vFiler runs on the master (netapp01) again and the data is replicated from the master (netapp01) to the slave (netapp02).
netapp01> vfiler status
vfiler0 running
vfiler01 running
netapp01> snapmirror status
Snapmirror is on.
Source                              Destination            State         Lag       Status
netapp01:vol_vfiler01               netapp02:vol_vfiler01  Source        00:00:00  In-sync

netapp02> vfiler status
vfiler0 running
vfiler01 stopped, DR backup
netapp02> snapmirror status
Snapmirror is on.
Source                              Destination            State         Lag       Status
netapp01_vfiler01_con:vol_vfiler01  netapp02:vol_vfiler01  Snapmirrored  00:00:00  In-sync
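If you repeat this exercise often, the final check can be scripted as well. A rough Python sketch, assuming you paste or capture the output of `vfiler status` from both systems (without the prompt line); `parse_vfiler_status` and `roles_restored` are hypothetical names, and the expected state strings are simply the ones shown in the output above.

```python
from typing import Dict


def parse_vfiler_status(output: str) -> Dict[str, str]:
    """Map each vFiler name to its state, e.g. 'running' or 'stopped, DR backup'."""
    states: Dict[str, str] = {}
    for line in output.splitlines():
        # First word is the vFiler name, the rest of the line is its state.
        name, _, state = line.strip().partition(" ")
        if name and state.strip():
            states[name] = state.strip()
    return states


def roles_restored(master_output: str, slave_output: str, vfiler: str = "vfiler01") -> bool:
    """True if the vFiler runs on the master and is a stopped DR backup on the slave."""
    master = parse_vfiler_status(master_output)
    slave = parse_vfiler_status(slave_output)
    return master.get(vfiler) == "running" and slave.get(vfiler) == "stopped, DR backup"


if __name__ == "__main__":
    netapp01_output = "vfiler0 running\nvfiler01 running\n"
    netapp02_output = "vfiler0 running\nvfiler01 stopped, DR backup\n"
    print(roles_restored(netapp01_output, netapp02_output))
```

The check covers only the vFiler state; the SnapMirror direction still has to be confirmed with `snapmirror status` as shown above.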
All articles of the series
Part 1: Download of the files needed
Part 2: Configuration of the first simulator
Part 3: Configuration of the second simulator
Part 4: Create an aggregate and volume
Part 5: DNS Configuration
Part 6: Create vFiler and configure vFiler DR
Part 7: Synchronous vFiler DR
Part 8: Create shares on vFiler
Part 9: Planned Failover
Part 10: Disaster Failover