Robin, thanks so much for helping me not waste my time trying to accomplish the impossible. I guess I'll use the spreadsheet as a read-only viewing tool to compare the various MARC records for a given title, but do the actual editing in MarcEdit's mnemonic format (.mrk).

On Mon, Feb 28, 2011 at 12:37 PM, Robin Sheat <robin@catalyst.net.nz> wrote:
hansbkk@gmail.com wrote on Mon 28-02-2011 at 12:15 [+0700]:
I'm trying to export batches of records from MARC to a delimited text format, do a bunch of editing and then convert back to MARC.
The CSV format is not capable of holding the information contained within MARC. MARC has two levels of repeatable headers (field, subfield) where order and grouping matters whereas CSV has one level of (generally) non-repeatable headers with fixed order and no concept of grouping.
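To make the mismatch concrete, here is a rough Python sketch. The nested list is only an illustrative model of a MARC record (not any library's actual data structure), and the sample titles and notes are made up. Flattening it into one value per column header silently drops the repeats:

```python
# Hypothetical model of a MARC record: an ordered list of fields, each
# carrying an ordered list of (subfield code, value) pairs. Sample data
# is invented for illustration.
record = [
    ("245", [("a", "Example title"), ("c", "Example author")]),
    ("500", [("a", "First note")]),
    ("500", [("a", "Second note")]),
    ("650", [("a", "Topic"), ("x", "Subdivision"), ("x", "Another subdivision")]),
]


def flatten_to_row(record):
    """Flatten to a single CSV-style row keyed by tag + subfield code.

    A CSV header can appear only once, so a repeated field or subfield
    overwrites the earlier value under the same key.
    """
    row = {}
    for tag, subfields in record:
        for code, value in subfields:
            row[tag + code] = value
    return row


row = flatten_to_row(record)
total_values = sum(len(subfields) for _, subfields in record)

print(total_values, "values in the record,", len(row), "columns in the row")
print(row["500a"])  # only the last 500$a survives
```

The grouping is lost too: nothing in the flat row says which subfields belonged to the same field occurrence.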
MarcEdit works great, except when going from MARC to CSV, the repeating fields (5XX and up) are being concatenated with semicolon separators into single fields. I'd like to keep these as separate repeating fields,
Splitting with tokens, like semicolons, is pretty much the only way to begin, but unless you're extremely careful you're still going to lose information. The only way to be careful is by encoding a whole lot of extra information into your CSV files, at which point they stop being CSV, for all intents and purposes.
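As a small illustration of how the token approach loses information, the Python sketch below joins two made-up note fields with a semicolon and then splits them again. The function names are hypothetical, not part of MarcEdit or any other tool:

```python
def join_repeats(values, sep=";"):
    """Pack repeated field values into a single CSV cell."""
    return sep.join(values)


def split_repeats(cell, sep=";"):
    """Naively unpack a cell back into repeated field values."""
    return cell.split(sep)


# Two invented 500 notes; the second one legitimately contains a semicolon.
notes = ["Includes index.", "Translated from the French; abridged."]

cell = join_repeats(notes)
restored = split_repeats(cell)

print(len(notes), "notes went in,", len(restored), "came out")
print(restored == notes)
```

Escaping the separator would fix this particular case, but you would still need to record indicators, subfield codes, and field order somewhere, which is the extra information mentioned above.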
so I'm working on kludging up some workarounds, but in the meantime can anyone suggest an alternative tool, ideally one that will do a clean round-trip, i.e. reconstruct the MARC from the CSV identically if the data isn't altered?
That is pretty much an impossible task.
Chris, I'm cc'ing you specifically because I recall your mentioning a script in a past thread that went the other way (CSV to MARC), but unfortunately I couldn't find the message in the archives.
I have a script I call csvtomarc.pl. It's designed for taking the output from something that can only export in a CSV-like form[0], passing it through a host of rules and transformations, linking up items, and outputting MARC. It's what I use to do migrations, generally.
The most up-to-date version of this at the moment can be found here: http://git.catalyst.net.nz/gw?p=koha.git;a=tree;f=import/csv;h=a92d26d020e04... but I warn you, it's not designed as an end-user tool, it's pretty complex.
[0] No product I've encountered yet can export as proper CSV it seems, they all break the spec and produce unparseable results that require hand massaging to fix. I'm especially looking at you, Liberty.