AD Forest Recovery Steps

  • Last Post 15 October 2015
nidhin_ck posted this 12 October 2015

Hi Experts,
I'm creating an AD forest recovery process for our infrastructure. Currently, in the root domain, all FSMO roles are placed on one DC, and all DCs in the entire forest are GCs. We have one root domain and 4 child domains, and all DCs run Windows Server 2008 R2.
I have one question. At recovery time we need to select one DC from each domain. Is it advisable to restore the DC that holds all the FSMO roles, or should I select another DC in the forest root domain?
After reading the MS forest recovery doc, I have created the steps below. Did I miss anything, or does anything need correcting?
1. Update the DSRM password on the DCs.
2. Decide which DC to recover.
3. Configure the selected DCs to boot in DSRM mode.
4. Disconnect the network cable from the root domain DC / shut down all DCs except the selected root DC.
5. Reboot the selected forest root DC in DSRM mode.
6. On the root DC: perform a nonauthoritative restore of AD DS and an authoritative restore of SYSVOL.
   a. Log in to the DC using the DSRM password.
   b. Get the version numbers of the backups you have created.
   c. Identify the backup you want to restore.
   d. Restore AD nonauthoritatively and SYSVOL authoritatively.
7. Reboot the DC in normal mode.
8. Remove the GC.
9. Check the DNS service.
10. Create the DWORD "HKLM\System\CurrentControlSet\Services\NTDS\Parameters\Repl Perform Initial Synchronizations" with value 0.
11. Seize the FSMO roles.
12. Perform metadata cleanup for the other DCs in the root domain.
13. Remove the A records of the deleted DCs from the forward lookup zone and from the _msdcs zone.
14. Raise the RID value by 100,000.
15. Invalidate the current RID pool.
16. Reset the computer account password of the DCs twice (current administrator password).
17. Reset the krbtgt account password twice.
18. Configure the time source.
19. Install the OS on the other DCs and run DCPROMO.
20. Enable GC on the root DCs.
21. Force replication from the initially restored forest DC.
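
For step 14, the arithmetic behind "raise the RID value by 100,000" can be sketched as below. This is only an illustration and assumes the usual layout of the domain's rIDAvailablePool attribute: a 64-bit integer whose high 32 bits hold the RID ceiling and whose low 32 bits hold the start of the next pool to be handed out. The function names are hypothetical, the sample value is illustrative, and writing the new value back to the directory is not shown.

```python
from typing import Tuple

RID_INCREMENT = 100_000   # step 14: raise the next available RID by 100,000
LOW_MASK = 0xFFFFFFFF     # low 32 bits of the large integer


def split_rid_available_pool(value: int) -> Tuple[int, int]:
    """Return (high_part, low_part) of a rIDAvailablePool large integer."""
    return value >> 32, value & LOW_MASK


def raise_rid_available_pool(value: int, increment: int = RID_INCREMENT) -> int:
    """Return the large integer with its low part raised by `increment`."""
    high, low = split_rid_available_pool(value)
    new_low = low + increment
    if new_low > high:
        # The low part must never pass the ceiling stored in the high part.
        raise ValueError("raised RID pool start would exceed the RID ceiling")
    return (high << 32) | new_low


if __name__ == "__main__":
    current = 4611686014132422708  # illustrative: high=1073741823, low=2100
    print(split_rid_available_pool(current))
    print(raise_rid_available_pool(current))
```

Because only the low part changes, the same result is obtained by adding 100,000 to the whole large integer; splitting it first just makes the ceiling check explicit.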

Regards,
Nidhin CK

gkirkpatrick posted this 12 October 2015

I don’t understand step 1 – Update DSRM password for the DCs?

 

Which DC you restore is primarily based on which backups are the most suitable. Outside of that, I’ve always used the DC with the PDC Emulator role when I had a choice, but it doesn’t really matter because you’re seizing the roles to the restored DC anyway.

 

How is your DNS set up? You mention “Check DNS service”, but getting DNS working again can be a bit more involved depending on your environment. When promoting subsequent DCs in an AD-integrated DNS environment, I’ve run into situations where the appropriate DNS records didn’t replicate properly. I would check DNS and check AD and SYSVOL replication after promoting each DC. You can write a simple batch file using DCDIAG.
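
As a rough illustration of the per-DC check suggested here, the following Python sketch does the job of such a batch file. It assumes dcdiag and repadmin are on the PATH of the machine running it; the helper names and the choice of tests (DNS, Replications, SysVolCheck) are assumptions for illustration, not a prescribed set. It only collects the output, and judging pass or fail is left to the operator.

```python
import subprocess
from typing import Callable, List, Tuple


def build_health_checks(dc_name: str) -> List[List[str]]:
    """Commands to run against a freshly promoted DC (illustrative selection)."""
    return [
        ["dcdiag", f"/s:{dc_name}", "/test:DNS"],
        ["dcdiag", f"/s:{dc_name}", "/test:Replications"],
        ["dcdiag", f"/s:{dc_name}", "/test:SysVolCheck"],
        ["repadmin", "/showrepl", dc_name],
    ]


def run_health_checks(
    dc_name: str, runner: Callable = subprocess.run
) -> List[Tuple[List[str], str]]:
    """Run each check and return (command, captured output) pairs."""
    results = []
    for cmd in build_health_checks(dc_name):
        # `runner` is injectable so the sketch can be dry-run off a DC.
        completed = runner(cmd, capture_output=True, text=True)
        results.append((cmd, completed.stdout))
    return results
```

Calling `run_health_checks("DC02")` after each promotion (the DC name is a placeholder) gives one block of output per check to review before moving on to the next DC.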

 

You should reset all the trust passwords after step 17, and then reconnect the network.

 

Do you have similar steps for the subordinate DCs?

 

-gil

 

 

 

 


GuyTe posted this 14 October 2015

 

Btw, InitSync is a bit tricky. When you set “Repl Perform Initial Synchronizations” to 0, what happens is that during the boot process, the InitSync is not executed.

The thing is that this does not say anything about FSMO advertisement. In order for the FSMOs to start advertising, the DC still has to complete the InitSync cycle.

 

In order to speed up the process, right after step 12 (metadata cleanup), make sure all the stale connection objects on the newly designated FSMO holder are deleted (manually created connection objects need to be deleted) and run KCC (verify the results with repadmin /showrepl and repadmin /showconn). Only after that will you be able to raise the RID pool.

 

Another thing to consider is that seizing FSMO roles using ntdsutil while the old FSMO holder has not yet been metadata cleaned up will result in annoying timeouts. Reversing the order (metadata cleanup first, FSMO seizure second) might save you a couple of minutes.

 

There are many other optimizations you can make to smooth the recovery. One of the major topics is DNS:

- Root hints (AD + local)
- Forwarders
- Client DNS settings (resolver)
- Delegations
- Conditional forwarding
- Secondary zones
- Stub zones
- NS records on the primary (file-based and AD-integrated) zones
- Etc.

 

Another one that comes to mind is configuring the time service on the forest root PDCe as authoritative (AnnounceFlags) + setting the type to NTP (in most cases).
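
One way to express that time configuration is a single w32tm command. The sketch below only builds the command line for review; it assumes that /reliable:yes and /syncfromflags:manual are the w32tm switches that drive the AnnounceFlags and NTP type settings mentioned above. The function name and the peer host names are placeholders.

```python
from typing import Sequence


def build_w32tm_config(peers: Sequence[str]) -> str:
    """Build a w32tm command marking this DC as a reliable, NTP-synced source."""
    if not peers:
        raise ValueError("at least one NTP peer is required")
    peer_list = " ".join(peers)  # w32tm takes a space-separated, quoted list
    return (
        f'w32tm /config /manualpeerlist:"{peer_list}" '
        "/syncfromflags:manual /reliable:yes /update"
    )


if __name__ == "__main__":
    # Placeholder peers; substitute the time sources used in your environment.
    print(build_w32tm_config(["ntp1.example.org", "ntp2.example.org"]))
```

The command would be run on the forest root PDCe once the roles have been seized to it.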

 

If you are still stuck with FRS, you might need to manually clean up the SYSVOL replica set subscriptions.

 

There are many more, and the more complex the environment, the more knobs there are to tweak to allow smoother recovery.

 

Guy

 


slavickp posted this 14 October 2015

This is great information, Guy. Who can ask MS to update the document?
Cheers
Slav
MCM-DS


GuyTe posted this 14 October 2015

Well, disabling InitSync is still a valid approach, as it speeds up the OS boot process and allows earlier logon, so there is no point altering the docs with regard to InitSync.

 

Cheers,

Guy

 


nidhin_ck posted this 15 October 2015

Hi Guy,
One doubt about the point below:
"In order to speed up the process, right after step 12 (metadata cleanup), make sure all the stale connection objects on the newly designated FSMO holder are deleted (manually created connection objects need to be deleted) and run KCC (verify the results with repadmin /showrepl and repadmin /showconn). Only after that will you be able to raise the RID pool."
Once we recover the first root DC, we will have all the connection links in Sites and Services. So do we need to delete all these links (including the connection links of the child domain DCs) as part of metadata cleanup, and then proceed with the FSMO role seizure and the RID pool raise?
Another doubt: at which stage can we set "Repl Perform Initial Synchronizations" back to 1? When we have recovered at least one DC in every domain, or when we have completed the first DC recovery in the root domain?
Regards,
Nidhin CK


GuyTe posted this 15 October 2015

The idea is quite simple: as long as there is a connection object, KCC will try to leverage it (well, eventually it will mark it as failed but it will take time).

My rule of thumb:

- Automatically generated COs: delete any CO that points to a DC that is not online at the moment.
- Manually created COs: if the DC involved in the CO will be re-promoted or will not be brought back, delete it. Otherwise keep it (this is the case for DCs in other domains of the forest that are restored from backup as part of the forest recovery).
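
The rule of thumb above can be written down as a small decision function. This is a sketch only: it reads the first rule as "delete automatically generated COs whose source DC is currently offline", and the ConnectionObject type, its fields, and the DC names are hypothetical. It classifies COs and does not touch the directory.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class ConnectionObject:
    name: str
    source_dc: str        # the DC this connection replicates from
    auto_generated: bool  # True if the KCC created it


def should_delete(
    co: ConnectionObject, online_dcs: Set[str], restoring_from_backup: Set[str]
) -> bool:
    """Apply the rule of thumb to one connection object."""
    if co.auto_generated:
        # KCC-generated: delete if the source DC is not online right now;
        # the KCC can recreate it once that DC is back.
        return co.source_dc not in online_dcs
    # Manually created: keep only if the source DC is online or will come
    # back from backup; re-promoted or retired DCs lose their COs.
    return co.source_dc not in online_dcs and co.source_dc not in restoring_from_backup


if __name__ == "__main__":
    online = {"ROOT-DC1"}
    restoring = {"CHILD1-DC1"}
    cos = [
        ConnectionObject("auto-1", "ROOT-DC2", True),
        ConnectionObject("manual-1", "CHILD1-DC1", False),
        ConnectionObject("manual-2", "ROOT-DC3", False),
    ]
    for co in cos:
        print(co.name, "delete" if should_delete(co, online, restoring) else "keep")
```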

 

You can re-enable InitSync after you have working replication for all partitions in the forest. In a single-domain forest, do it after you have at least 2 DCs online. In a multi-domain forest, do it after you have reconnected all the domains and have end-to-end replication working.

 

If you ask me, do it at the end. This is also the time to verify strict replication consistency is set to 1 (unless there is a chance for lingering objects).
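
The timing rule for re-enabling InitSync can be sketched as a check plus the registry command it gates. The registry path and value name are the ones from step 10 of the original list; the function names and the boolean inputs are hypothetical, and the sketch only builds a `reg add` command line for the operator to review rather than changing anything.

```python
NTDS_PARAMETERS = r"HKLM\System\CurrentControlSet\Services\NTDS\Parameters"
INITSYNC_VALUE = "Repl Perform Initial Synchronizations"


def can_reenable_initsync(
    domain_count: int,
    dcs_online: int,
    replication_working: bool,
    all_domains_reconnected: bool = False,
) -> bool:
    """Return True once the conditions described above are met."""
    if not replication_working:
        return False
    if domain_count <= 1:
        # Single-domain forest: at least 2 DCs online.
        return dcs_online >= 2
    # Multi-domain forest: every domain reconnected, replication end to end.
    return all_domains_reconnected


def build_initsync_command(enabled: bool) -> str:
    """Build the reg.exe command that sets the InitSync DWORD to 1 or 0."""
    data = 1 if enabled else 0
    return (
        f'reg add "{NTDS_PARAMETERS}" /v "{INITSYNC_VALUE}" '
        f"/t REG_DWORD /d {data} /f"
    )


if __name__ == "__main__":
    if can_reenable_initsync(5, 5, True, all_domains_reconnected=True):
        print(build_initsync_command(True))
```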

 

Guy

 


nidhin_ck posted this 15 October 2015

Thanks a lot, Guy!
Regards,
Nidhin CK

