Data corruption: Detection, prevention, elimination & removal

Robert Ameeti Robert at Ameeti.net
Sat Mar 15 19:32:34 PDT 2008


I'm hoping that a recent data corruption issue has been put to bed, at
least temporarily, but I'd like to see if the problem can be minimized
or more simply handled in the future.

On an Enterprise shared database, with about 5 files loaded as a set
to each of 6 clients, I experienced data corruption in what appears
to be every record that was accessed and modified. The symptom of the
corruption was the notorious "One or more numeric data cells contain
corrupted data."

What was interesting about this recent corruption was that I know 
that only those records that were touched were being corrupted. Thus, 
this wasn't cosmic rays. Deleting the file from the server and
reloading the file to each computer did not stop the corruption
the first time. After a second deletion and reload of the file to the
server, followed by reloading to the clients, voila, all was fine again!

This corruption, whose exact cause I have not narrowed down, gave me
2 new errors: 1) Import aborted before all of the data was loaded.
Both of the files involved were shared and had ~15,000 records.
2) "Network error, data garbled."

What was new to me in this occurrence was that this corruption was 
being passed from the server to clients. Whether it started on the 
server or on the clients, I do not know. But what I do know is that 
when a client synchronized, they received the corrupted record! Ugh. 
I think I really want the server to contain the corruption and not
pass it back to clients.

What I'd also like is some sort of method to remove obviously corrupted
records. I could do a SelectAll and thereby visually see those
records that were corrupted (and that is how I knew that each of the
specific records touched by the clients was corrupt while the others
were fine). I could not successfully move to a corrupt record so as
to then Delete it. I
could not RemoveUnselected because the file was being shared. I could 
see them but not remove them. The only answer that I could determine
was to stop the work day at lunch and reload (twice) a known good file
from one computer that had neither accessed those records nor
synchronized. This resulted in 4 hours of work lost for 6 people. All
of their work would have to be re-created from memory (which of course
will be faulty).

I think I want the server to not update a client with a corrupt
record. If the client or server is able to determine that the record
is corrupt because what should be numeric is not numeric (and I saw
that it distinctly was not numeric data in those numeric cells), can't
those records be frozen at their source instead of being virus-like
and moving from computer to computer?
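
To make the idea concrete, here is a rough sketch of the kind of check
I have in mind. It is written in Python purely for illustration and is
not Panorama code or anything the Enterprise server is known to do. The
field names in NUMERIC_FIELDS, the record layout (a plain dict), and the
function names are all hypothetical, and I am assuming an empty cell
(None or "") is legitimate rather than corrupt.

```python
import math

# Hypothetical schema: the fields that are supposed to hold numbers.
NUMERIC_FIELDS = ("quantity", "price")


def is_numeric(value):
    """Return True if value is a finite number or text that parses as one."""
    if isinstance(value, bool):
        return False
    if isinstance(value, int):
        return True
    if isinstance(value, float):
        return math.isfinite(value)
    if isinstance(value, str):
        try:
            return math.isfinite(float(value))
        except ValueError:
            return False
    return False


def corrupt_fields(record):
    """List the numeric fields of a record that hold non-numeric data."""
    bad = []
    for field in NUMERIC_FIELDS:
        value = record.get(field)
        if value is None or value == "":
            continue  # assumption: an empty cell is allowed
        if not is_numeric(value):
            bad.append(field)
    return bad


def split_for_sync(records):
    """Separate records that may be sent to clients from those held back."""
    clean, held = [], []
    for record in records:
        bad = corrupt_fields(record)
        if bad:
            held.append((record, bad))  # frozen at the source
        else:
            clean.append(record)
    return clean, held
```

Only the records in the first list would be passed along during a
synchronize; the held ones stay where they are, along with the names of
the cells that failed the check.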

Could Panorama perhaps automatically delete corrupted records and then
write whatever information it does have to a text file, so that an
administrator can then decide how to proceed? If it knows that the
record is corrupt, there is little value in having it passed around to
other computers.
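
As a sketch of what I mean by that (again in Python for illustration
only, not Panorama code), something along these lines would pull the
bad records out and leave a text trail for the administrator. The
function name quarantine, the is_corrupt test passed in by the caller,
and the one-JSON-entry-per-line log format are my own assumptions.

```python
import json
from datetime import datetime, timezone


def quarantine(records, is_corrupt, log_path):
    """Drop corrupt records, appending each one to a text log first.

    records    -- list of dicts (hypothetical record layout)
    is_corrupt -- caller-supplied function: record -> True if corrupt
    log_path   -- text file an administrator can review afterwards
    Returns (kept_records, number_removed).
    """
    kept = []
    removed = 0
    with open(log_path, "a", encoding="utf-8") as log:
        for record in records:
            if is_corrupt(record):
                entry = {
                    "quarantined_at": datetime.now(timezone.utc).isoformat(),
                    "record": record,
                }
                # default=repr keeps whatever could be salvaged, even if a
                # cell holds something JSON cannot represent directly.
                log.write(json.dumps(entry, default=repr) + "\n")
                removed += 1
            else:
                kept.append(record)
    return kept, removed
```

The log is opened in append mode so that repeated clean-ups build one
running history instead of overwriting earlier entries.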
-- 

<><><><><><><><><><><><><><><><><><>
Robert Ameeti

Integrity without knowledge is weak and useless, and knowledge without integrity is dangerous and dreadful.
            -- Samuel Johnson
<><><><><><><><><><><><><><><><><><>

