| Author | Messages | |
activedirsmaporg
Posts:0
 | | 08/22/2005 5:50 AM |
| Steve's, Hunter's, and your original advice are all sound ... I think it is
very likely that if you call PSS, they'll tell you to follow Steve's, your, and
Hunter's advice, in about that order.
My favorite disk subsystem diagnostic is Jetstress, but dedicated disk
subsystem stressers are better, as they try odd bit patterns that they
know buses, electrical systems, and disks get fouled up on. Also, do not
ignore RAM checkers; bad memory is almost as likely a cause, perhaps even
more likely here.
Do you have ECC or parity memory? Any events in system or app event log
related to parity memory issues?
BTW, how big is your ntds.dit file? Is it over 1.5-2.5 GB? If so, that
makes memory issues a more likely explanation.
So you have had several of these events? If so, do they always happen
for the same page numbers ("pgno") and offsets? If they differ, is their
frequency increasing?
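To make the "same pgno and offset?" comparison less error-prone across
several events, something like the following could help. It is only a
rough sketch: parse_event, page_from_offset, and pages_hit are made-up
helper names, the regular expression is written against the wording of
the one event quoted in this thread, and the offset-to-page relation
(offset / page size, minus one) is an assumption inferred from the
numbers in that event rather than taken from any documentation.

```python
import re
from collections import Counter

# Pattern written against the Event ID 475 text quoted in this thread.
# Other builds or locales may word the description differently.
EVENT_RE = re.compile(
    r"at offset (?P<offset>\d+) \(0x[0-9a-fA-F]+\) "
    r"for (?P<size>\d+) \(0x[0-9a-fA-F]+\) bytes "
    r".*?expected page number was (?P<expected>\d+) "
    r".*?actual page number was (?P<actual>\d+)"
)

SAMPLE_EVENT = """
NTDS (528) NTDSA: The database page read from the file
"C:\\WINNT\\NTDS\\ntds.dit" at offset 665067520 (0x0000000027a42000) for
8192 (0x00002000) bytes failed verification due to a page number
mismatch. The expected page number was 81184 (0x00013d20) and the
actual page number was 2349964126 (0x8c119b5e).
"""


def parse_event(description):
    """Extract offset, read size, expected pgno, and actual pgno."""
    # Collapse the line wrapping so the pattern can match across lines.
    flat = " ".join(description.split())
    match = EVENT_RE.search(flat)
    if match is None:
        raise ValueError("text does not look like the quoted 475 event")
    return {name: int(value) for name, value in match.groupdict().items()}


def page_from_offset(offset, page_size):
    """Assumed relation, inferred from this event: pgno = offset / size - 1."""
    return offset // page_size - 1


def pages_hit(descriptions):
    """Count how often each expected page number shows up."""
    return Counter(parse_event(text)["expected"] for text in descriptions)


if __name__ == "__main__":
    event = parse_event(SAMPLE_EVENT)
    computed = page_from_offset(event["offset"], event["size"])
    print("expected pgno:", event["expected"], "computed from offset:", computed)
    print("pages hit:", dict(pages_hit([SAMPLE_EVENT])))
```

Feeding every 475 description from the event log into pages_hit shows
at a glance whether the failures keep landing on one page or are
spreading to others.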
If you haven't restored it already, I'd be curious if you felt like
sharing, what the page looked like from:
esentutl /m ntds.dit /p81184 /v
... then we could see how badly the header was corrupted. This will also
tell you if the page is an "Index page", and thus likely to be fixed by an
offline defrag. If you see a "primary" or "long value" page, an offline
defrag probably won't fix it.
Also get the previous page (change 81184 to 81183 in the above
command). But again, only if you feel like sharing.
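If more than one page turns up in the events, the two dump commands can
be generated rather than retyped. This is a small sketch that only
builds the command strings and runs nothing; page_dump_commands is a
made-up helper name, and it uses only the switches already shown in the
command above.

```python
def page_dump_commands(db_path, pgno, pages_before=1):
    """Build esentutl dump commands for the pages before pgno, then pgno."""
    first = max(1, pgno - pages_before)  # assumption: page numbers start at 1
    return [
        f"esentutl /m {db_path} /p{n} /v"
        for n in range(first, pgno + 1)
    ]


if __name__ == "__main__":
    for command in page_dump_commands("ntds.dit", 81184):
        print(command)
```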
Cheers,
BrettSh
This posting is provided "AS IS" with no warranties, and confers no
rights.
On Sat, 20 Aug 2005, Coleman, Hunter wrote:
> I'd also look at running hardware diagnostics, particularly on the
> disk subsystem and controller. No point in restoring or repromoting if
> there is an unresolved hardware problem.
> > -----Original Message-----
> From: ActiveDir-owner@xxxxxxxxxxxxxxxxxx on behalf of Steve Linehan
> Sent: Fri 8/19/2005 8:18 PM
> To: ActiveDir@xxxxxxxxxxxxxxxxxx
> Cc:
> Subject: RE: [ActiveDir] Database Corruption
> > Well the first thing I always recommend is to try an offline
> defrag as it is possible that the corruption is in an index, i.e.
> metadata, that can be rebuilt. If the offline defrag fails then
> restoring from backup or repromoting will be your next step.
> > Thanks,
> -Steve
> _____
> > From: ActiveDir-owner@xxxxxxxxxxxxxxxxxx [mailto:ActiveDir-owner@xxxxxxxxxxxxxxxxxx] On Behalf Of Ayers, Diane
> Sent: Friday, August 19, 2005 6:43 PM
> To: ActiveDir@xxxxxxxxxxxxxxxxxx
> Subject: RE: [ActiveDir] Database Corruption
> > My preferred approach would be to demote the box to member
> server and re-promote to a domain controller to ensure a good fresh
> copy of the DIT. YMMV as the specific requirements at your location
> may prevent this. We have only run into this once early in our AD
> days and this was the approach we used with good success.
> > Diane
> _____
> > From: ActiveDir-owner@xxxxxxxxxxxxxxxxxx [mailto:ActiveDir-owner@xxxxxxxxxxxxxxxxxx] On Behalf Of Alex Fontana
> Sent: Friday, August 19, 2005 3:29 PM
> To: ActiveDir@xxxxxxxxxxxxxxxxxx
> Subject: [ActiveDir] Database Corruption
> > Started getting the error below a few weeks ago on one of our
> DCs. My first reaction is to run a non-auth restore from a day before
> this started happening and let replication take care of everything
> else. Any reason NOT to do this? I'm concerned that this may
> happen again and wasn't able to find anything specific to the error
> below. Besides calling PSS, is there anything else I should look into before
> restoring? This box holds all FSMO roles, Win2k3, server for NIS.
> > TIA
> -alex
> > > Event Type: Error
> Event Source: NTDS ISAM
> Event Category: Database Page Cache
> Event ID: 475
> Date: 8/19/2005
> Time: 2:00:24 PM
> User: N/A
> Computer: DC
> Description:
> > NTDS (528) NTDSA: The database page read from the file
> "C:\WINNT\NTDS\ntds.dit" at offset 665067520 (0x0000000027a42000) for
> 8192 (0x00002000) bytes failed verification due to a page number
> mismatch. The expected page number was 81184 (0x00013d20) and the
> actual page number was 2349964126 (0x8c119b5e). The read operation
> will fail with error -1018 (0xfffffc06). If this condition persists
> then please restore the database from a previous backup. This problem
> is likely due to faulty hardware. Please contact your hardware vendor
> for further assistance diagnosing the problem.
> > > >
List info : http://www.activedir.org/List.aspx
List FAQ : http://www.activedir.org/ListFAQ.aspx
List archive: http://www.mail-archive.com/activedir%40mail.activedir.org/ | | | |