Greetings, I thought I'd interject a bit.
It would be good to build a tool to find duplicate control numbers. I did this by exporting all the biblios, using:
marcprint (my Python utility) | grep "=001" | sort | uniq -c | sort -r | less
and looked for counts greater than 1.
I'll use your "marcprint" in my example, though I suspect exporting a MARC file would be useful enough, provided the MARC file is then converted to something "human readable". Under Windows, I would likely use MarcEdit to "break" the .mrc file into a .mrk file:
http://people.oregonstate.edu/~reeset/marcedit/html/
Then, having uploaded my .mrk files into a Linux environment, I would replace "marcprint" with "cat mymarcfile.mrk".

All this uploading got me thinking that a query against the database might do the job directly, perhaps something like:

SELECT ExtractValue(marcxml,'//controlfield[@tag="001"]') AS Control FROM biblioitems

But I didn't take it further than this, since I don't have time.

NOTATION: # is a comment. $ is a command line prompt.

# Get a list of unique "=001" fields, should be one per record, right?
$ marcprint | grep "=001" | sort -u -r > ~/check1.txt
# Get a list of all the "=001" fields.
$ marcprint | grep "=001" | sort -r > ~/check2.txt
# Compare the two. Any differences will be due to duplications.
$ diff ~/check1.txt ~/check2.txt

GPML,
Mark Tompsett
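
P.S. A possible shortcut for the check above, as a minimal sketch only: "uniq -d" prints just the lines that occur more than once in sorted input, so one pipeline can stand in for the two files and the diff. The function name find_duplicate_001 is made up for illustration, and the sketch assumes a MarcEdit-style .mrk file in which each control number sits on its own line starting with "=001" followed by spaces. The output format of marcprint may differ, so adjust the pattern if needed.

```shell
# Hypothetical helper (the name is an assumption, not part of any tool):
# print each control number that appears more than once in a .mrk file.
find_duplicate_001() {
    # Keep only lines that start with "=001", strip the "=001" label and
    # the spaces after it, then let "uniq -d" report repeated values once.
    grep '^=001' "$1" \
        | sed 's/^=001 *//' \
        | sort \
        | uniq -d
}
```

Usage would be along the lines of "find_duplicate_001 mymarcfile.mrk". No output means no duplicated control numbers were found.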