Newbie question re synchronizing address lists
Ken Doucet
doucet at nas.net
Sat Feb 2 14:01:32 PST 2008
I have an address file (200k) consisting of
recid
socin
first
last
address
add2
city
code1 (c1), code2 (n1), code3 (c1), code4 (n1), code5 (c1), code6 (n1)
pcode (c6), a field that is just a consolidation of code1-6
This single file is an aggregation of 7 separate databases and has been sent
to a mailing house which has returned a file with 2 additional fields:
VALID: a code indicating their grouping of the data as V (valid),
C (correction), U (undeliverable) or G (garbage)
CORRECT_TEXT: a text description of the types of changes they have made
The returned file does not show original content but rather a revised
version per their system.
As I am loath to assume that their database revisions are correct, I
initially took a quick look at the group they said was Valid and found 2k
records where they have ignored spelling errors in our files.
Needless to say I am motivated to analyze the suggested revisions before
swallowing them into the databases.
Coming from a DOS database background it would be easy for me to go down the
same road (rut) that I have been in but I suspect that Provue's feature set
may provide some significant leverage. From watching the screencasts I am
wondering if I can simulate a "synchronization" between the file I sent and
the one they returned and do the analysis from that perspective? Or should
I create a consolidated file containing both the original and the revised
addresses and work within that one file by comparing fields, e.g. a soundex
comparison between the original city and the revised city field?
I would appreciate any suggestions on approach, as my only exposure so far
has been going through all the screencasts and a brief foray into the
documentation.
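
To make the consolidated-file idea concrete, here is a minimal sketch in
Python rather than in the database itself. It assumes the returned file
still carries the original recid so the two files can be matched on it, and
that each record is available as a dict (for example from csv.DictReader).
The function names and the sample cities are made up for illustration; the
Soundex routine is the standard American variant, not necessarily what any
particular database's soundex function implements.

```python
# Soundex digit for each consonant; vowels, H, W and Y carry no digit.
CODES = {ch: str(digit)
         for digit, group in enumerate(
             ("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1)
         for ch in group}


def soundex(word):
    """Return the 4-character American Soundex code for word."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (result + "000")[:4]


def compare_field(original_rows, returned_rows, field="city"):
    """List (recid, old, new, verdict) for records whose field was changed."""
    returned_by_id = {row["recid"]: row for row in returned_rows}
    report = []
    for old in original_rows:
        new = returned_by_id.get(old["recid"])
        if new is None or old[field] == new[field]:
            continue  # record missing from the returned file, or unchanged
        alike = soundex(old[field]) == soundex(new[field])
        report.append((old["recid"], old[field], new[field],
                       "sounds alike" if alike else "different"))
    return report


# Hypothetical sample data standing in for the sent and returned files.
original = [
    {"recid": "1", "city": "Toronot"},
    {"recid": "2", "city": "Ottawa"},
    {"recid": "3", "city": "Halifax"},
]
returned = [
    {"recid": "1", "city": "Toronto"},
    {"recid": "2", "city": "Ottawa"},
    {"recid": "3", "city": "Hamilton"},
]

report = compare_field(original, returned)
for line in report:
    print(line)
```

Changes flagged "sounds alike" are probably spelling corrections and can be
skimmed quickly, while "different" ones deserve a closer look before being
accepted. The same comparison could be run on the other address fields by
passing a different field name.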
I might add that I will also be looking to find and mark all duplicate
records (e.g. find all records that have the same lastname and birthdate).
The project will then move on to consolidating those duplicates, which I
feel is best started once I have cleaned up the address problems. Otherwise
I won't be able to find duplicates based on the address fields, because they
contain tons of typos.
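
The duplicate search described above could be sketched along these lines in
Python. This is only an illustration under assumptions: records are dicts
keyed by field name, and the birthdate field is called "birthdate", which is
a guess since it does not appear in the field list earlier in this message.
The function name and sample rows are likewise made up.

```python
from collections import defaultdict


def find_duplicates(rows, key_fields=("last", "birthdate")):
    """Group recids by the given fields; return only groups of 2 or more."""
    groups = defaultdict(list)
    for row in rows:
        # Normalise case and surrounding spaces so "Smith " matches "SMITH".
        key = tuple(str(row.get(f, "")).strip().upper() for f in key_fields)
        if not all(key):
            continue  # skip records with a blank key field
        groups[key].append(row["recid"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical sample rows; "birthdate" is an assumed field name.
rows = [
    {"recid": "1", "last": "Smith", "birthdate": "1950-03-01"},
    {"recid": "2", "last": "SMITH ", "birthdate": "1950-03-01"},
    {"recid": "3", "last": "Jones", "birthdate": "1962-11-15"},
    {"recid": "4", "last": "", "birthdate": "1950-03-01"},
]

duplicates = find_duplicates(rows)
for key, ids in duplicates.items():
    print(key, ids)
```

Exact matching like this only finds duplicates whose key fields agree after
trimming and upper-casing, which is one reason to clean up the typos first.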
Thanks
Ken