Tuesday, May 12, 2009

Disaster Recovery Planning for Active Directory

Preventing Active Directory failures should be a key component of any disaster recovery plan. There are steps every Windows shop can take to reduce the chances of an AD disaster. The best way to minimize downtime is to have a proactive plan in place.

Need to restore a single domain controller? Want to prevent the accidental bulk deletion of objects? Microsoft MVP Gary Olsen offers his advice on how to plan for the worst and what to do to get your Active Directory up and running again.

Part 1: How creating an Active Directory replication lag site minimizes disasters

It is a good idea to have a disaster recovery plan for major catastrophes, but there are a number of actions you can take to prevent disaster -- or at least minimize the chances of an Active Directory disaster such as the accidental bulk deletion of objects.

One of those actions is to create a replication lag site. Very simply, the lag site is an Active Directory site that is intentionally a few days to a week behind the rest of the domain. Of course, there are some gotchas when doing this, which we'll discuss shortly, but the lag site basically preserves a live backup of the Active Directory.

You create a lag site by moving a domain controller from the hub site into its own site (we'll call it the disaster recovery site) with a site link to the hub site. Configure the hub-disaster recovery site link for a replication frequency of 96 hours. That means that the disaster recovery site domain controller's copy of the Active Directory can be up to 96 hours behind the rest of the forest.
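Site link intervals are normally entered in minutes rather than hours, so the 96-hour figure has to be converted before it is typed in. The sketch below does that conversion. It assumes the interval is stored in minutes with an allowed range of 15 to 10,080 (one week), which is how the site link `replInterval` attribute is usually documented; the function name is hypothetical, and you should confirm the limits for your Windows version.

```python
# Assumed limits for a site link replication interval, in minutes.
MIN_INTERVAL_MINUTES = 15
MAX_INTERVAL_MINUTES = 10080  # one week


def site_link_interval_minutes(hours: float) -> int:
    """Convert a lag expressed in hours into a site link interval in minutes."""
    minutes = int(round(hours * 60))
    if not MIN_INTERVAL_MINUTES <= minutes <= MAX_INTERVAL_MINUTES:
        raise ValueError(
            f"{minutes} minutes is outside the assumed range "
            f"{MIN_INTERVAL_MINUTES}-{MAX_INTERVAL_MINUTES}"
        )
    return minutes


if __name__ == "__main__":
    print(site_link_interval_minutes(96))  # 5760
```

Under these assumptions a lag of more than seven days cannot be expressed as a single interval value.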

Now, remember that administrator who -- mistakenly, of course -- recently deleted an organizational unit (OU) with 10,000 users? Without a lag site, your only option is to do an authoritative restore from backup (and hope your backup media is valid). That means you have to perform the following authoritative restore process:

  1. Unplug the domain controller that has the authoritative copy of the Active Directory from the network.
  2. Get the appropriate system state backup tape that you made before the deletion.
  3. Make sure the tape is valid and that it is no older than the TombstoneLifetime (60 days by default; 180 days in forests first created on Windows Server 2003 SP1 or later).
  4. Boot the restore domain controller into Directory Service Restore Mode (DSRM).
  5. Do a system state restore to this domain controller. Note that you have to do this twice to get the groups and users restored properly. This is not trivial.
  6. Plug the domain controller into the network.
  7. Replication will force the Active Directory objects from the restored domain controller to the other domain controllers in the network.
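Steps 2 and 3 come down to two date checks: the backup must predate the deletion, and it must be younger than the TombstoneLifetime. A minimal sketch of that check follows. The function name and arguments are hypothetical, and the 60-day default simply mirrors the value quoted above.

```python
from datetime import datetime, timedelta


def backup_is_usable(backup_time: datetime,
                     deletion_time: datetime,
                     now: datetime,
                     tombstone_lifetime_days: int = 60) -> bool:
    """Return True if a system state backup can be used for the restore."""
    # The backup must have been taken before the objects were deleted.
    taken_before_deletion = backup_time < deletion_time
    # The backup must be younger than the tombstone lifetime.
    age = now - backup_time
    young_enough = timedelta(0) <= age < timedelta(days=tombstone_lifetime_days)
    return taken_before_deletion and young_enough


if __name__ == "__main__":
    now = datetime(2009, 5, 12)
    deletion = datetime(2009, 5, 11)
    print(backup_is_usable(datetime(2009, 5, 1), deletion, now))
    print(backup_is_usable(datetime(2009, 3, 1), deletion, now))
```

This only checks the dates. It says nothing about whether the media itself is readable, which still has to be tested separately.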

Note: Refer to Microsoft's KB 241594: How to perform an authoritative restore to a domain controller in Windows 2000 and KB 280079: Authoritative restore of groups can result in inconsistent membership information across domain controllers for more details on authoritative restore.

With the lag site, however, you now have a domain controller that holds a copy of the Active Directory from before the deletion took place (assuming you noticed it within four days of the occurrence). Let's say you discovered that an administrator mistakenly deleted 10,000 accounts yesterday. You can go to the domain controller in the lag site, which still has the pre-deletion copy of the Active Directory, perform an authoritative restore using that copy, and push it out. Again, this depends on when the lag site replicates and when the deletion took place. If replication takes place on Monday and Friday, and the deletion happens Thursday night, then you have only a small window of opportunity before the deletion replicates to the lag site as well.
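The size of that window is simple arithmetic on three timestamps. The sketch below works through the Monday/Friday example. It assumes you know when the lag site domain controller last replicated; the function name and the dates are illustrative only.

```python
from datetime import datetime, timedelta


def lag_copy_status(last_sync: datetime,
                    deletion_time: datetime,
                    now: datetime,
                    interval_hours: int = 96):
    """Return (copy_predates_deletion, time_left_before_next_replication)."""
    # The lag copy is only useful if it last replicated before the deletion.
    clean = last_sync < deletion_time
    next_sync = last_sync + timedelta(hours=interval_hours)
    # Clamp at zero: once the next replication is due, the window is closed.
    time_left = max(next_sync - now, timedelta(0))
    return clean, time_left


if __name__ == "__main__":
    last_sync = datetime(2009, 5, 11, 0, 0)   # Monday
    deletion = datetime(2009, 5, 14, 22, 0)   # Thursday night
    now = datetime(2009, 5, 14, 23, 0)        # noticed an hour later
    clean, time_left = lag_copy_status(last_sync, deletion, now)
    print(clean, time_left)
```

In this example the lag copy is still clean, but only one hour remains before Friday's replication brings the deletion into the lag site.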

Get control of the gotchas

It is important that you take steps to prevent clients from authenticating against the lag site domain controllers, since they hold security data (accounts, passwords, locked accounts, group membership, etc.) that may be up to a week old. You can accomplish this by defining a site policy for the lag site and configuring the "DCLocator DNS Records Not Registered by the DCs" setting. The Mnemonics field is described in the Explain tab. You need to include all of the Mnemonics except the CNAME record (needed for replication). The Explain tab is a bit confusing, but the value is a space-delimited list as shown in Figure 1. The Mnemonics themselves are listed in the left column on the Explain tab.
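To make the "space-delimited list" concrete, here is a small sketch that builds the value by taking every mnemonic and leaving out the CNAME one. The mnemonic names are an assumption: they are reproduced from memory of the Explain text and may differ between Windows versions, so copy the authoritative list from the Explain tab on your own system.

```python
# Assumed mnemonic list; verify against the policy's Explain tab.
ALL_MNEMONICS = [
    "LdapIpAddress", "Ldap", "LdapAtSite", "Pdc", "Gc", "GcAtSite",
    "DcByGuid", "GcIpAddress", "DsaCname", "Kdc", "KdcAtSite", "Dc",
    "DcAtSite", "Rfc1510Kdc", "Rfc1510KdcAtSite", "GenericGc",
    "GenericGcAtSite", "Rfc1510UdpKdc", "Rfc1510Kpwd", "Rfc1510UdpKpwd",
]

# Records that must stay registered; the CNAME is needed for replication.
KEEP_REGISTERED = {"DsaCname"}


def policy_value(all_mnemonics, keep_registered):
    """Build the space-delimited value for the 'not registered' setting."""
    return " ".join(m for m in all_mnemonics if m not in keep_registered)


if __name__ == "__main__":
    print(policy_value(ALL_MNEMONICS, KEEP_REGISTERED))
```

The output is a single line that can be pasted into the Mnemonics field.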

The minimum configuration to implement an Active Directory lag site is a single lag site containing at least one domain controller from each domain. The preferred configuration is to have two domain controllers from each domain in the site. Set their replication frequency to 168 hours (seven days) and stagger the schedules so that one of them replicates every 3.5 days. That way you always have two old copies of different ages to choose from, which mitigates the small-window problem described earlier.
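The effect of staggering can be checked with a little modular arithmetic. The sketch below assumes two lag domain controllers, A and B, each replicating every 168 hours, with B offset by 84 hours (3.5 days). The names and the clock model are illustrative only.

```python
INTERVAL_HOURS = 168  # each lag DC replicates once every seven days
OFFSET_HOURS = 84     # DC B is staggered 3.5 days behind DC A


def copy_ages(hours_elapsed: int):
    """Return the age in hours of DC A's and DC B's copy at a given time."""
    age_a = hours_elapsed % INTERVAL_HOURS
    age_b = (hours_elapsed - OFFSET_HOURS) % INTERVAL_HOURS
    return age_a, age_b


if __name__ == "__main__":
    # Sample one full two-week cycle, once a day.
    for hour in range(0, 2 * INTERVAL_HOURS, 24):
        age_a, age_b = copy_ages(hour)
        print(hour, age_a, age_b, max(age_a, age_b))
```

In this model the older of the two copies is never less than 84 hours old, so a deletion noticed within 3.5 days can always be recovered from at least one lag domain controller.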

You can also run the lag site domain controllers on Virtual Server to save hardware costs.

If you have a multiple (parent/child) domain structure, you face some additional, less obvious problems. When you attempt a restore on one domain, it will fail to restore cross-domain group memberships. Hewlett-Packard Co. was the first to discover this problem, and the company developed a tool called Active Directory Link Replication Manager (ADLRM) that stores these links in a SQL database and restores them quite nicely. The tool can also store and restore individual attributes. For instance, if you have an HR application that modifies certain user attributes, and you need to restore an attribute to its pre-modified value, ADLRM can do that without requiring a full-scale authoritative restore.
