[Koha] Marc-->CSV - alternative to MarcEdit?
hansbkk at gmail.com
Mon Feb 28 19:11:49 NZDT 2011
Robin, thanks so much for helping me not waste my time trying to
accomplish the impossible.
I guess I'll use the spreadsheet as a read-only viewing tool to
compare the various MARC records for a given title, but do the actual
editing in MarcEdit's mnemonic format (.mrk).
On Mon, Feb 28, 2011 at 12:37 PM, Robin Sheat <robin at catalyst.net.nz> wrote:
> hansbkk at gmail.com wrote on Mon 28-02-2011 at 12:15 [+0700]:
>> I'm trying to export batches of records from MARC to a delimited text
>> format, do a bunch of editing and then convert back to MARC.
>
> The CSV format is not capable of holding the information contained
> within MARC. MARC has two levels of repeatable headers (field, subfield)
> where order and grouping matter, whereas CSV has one level of
> (generally) non-repeatable headers with fixed order and no concept of
> grouping.
>
>> MarcEdit works great, except when going from MARC to CSV, the
>> repeating fields (5XX and up) are being concatenated with semicolon
>> separators into single fields. I'd like to keep these as separate
>> repeating fields,
>
> Splitting with tokens, like semicolons, is pretty much the only way to
> begin, but unless you're extremely careful you're still going to lose
> information. The only way to be careful is by encoding a whole lot of
> extra information into your CSV files, at which point they stop being
> CSV, for all intents and purposes.
>
>> so I'm working on kludging up some workarounds, but
>> in the meantime can anyone suggest an alternative tool, ideally one
>> that will do a clean round-trip, i.e. reconstruct the MARC from the
>> CSV identically if the data isn't altered?
>
> That is pretty much an impossible task.
>
>> Chris, I'm cc'ing you specifically because I recall your mentioning a
>> script in a past thread that went the other way (CSV to MARC), but
>> unfortunately couldn't find the message in the archives.
>
> I have a script I call csvtomarc.pl. It's designed for taking the output
> from something that can only export in a CSV-like form[0], passing it
> through a host of rules and transformations, linking up items, and
> outputting MARC. It's what I use to do migrations, generally.
>
> The most up-to-date version of this at the moment can be found here:
> http://git.catalyst.net.nz/gw?p=koha.git;a=tree;f=import/csv;h=a92d26d020e04b6a21de328f307c53a41965ff80;hb=refs/heads/stdc_import
> but I warn you, it's not designed as an end-user tool, it's pretty
> complex.
>
> [0] No product I've encountered yet can export as proper CSV it seems,
> they all break the spec and produce unparseable results that require
> hand massaging to fix. I'm especially looking at you, Liberty.
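
To make the round-trip problem Robin describes concrete, here is a minimal
sketch in Python. It is only an illustration, not MarcEdit's or
csvtomarc.pl's actual behaviour: the record is a hypothetical in-memory
model (an ordered list of fields, each with ordered subfields), the sample
values are made up, and flatten/unflatten are illustrative helper names.
Repeating fields are joined with semicolons on the way out and split on
semicolons on the way back.

```python
import csv
import io

# Hypothetical in-memory model of one MARC record (an assumption for this
# sketch, not a real library's API): an ordered list of
# (tag, [(subfield_code, value), ...]) pairs. Sample values are made up.
record = [
    ("245", [("a", "Koha manual"), ("c", "Koha community")]),
    ("500", [("a", "First note; with a semicolon")]),
    ("500", [("a", "Second note")]),
    ("650", [("a", "Libraries"), ("x", "Automation")]),
    ("650", [("a", "MARC formats")]),
]


def flatten(record, sep=";"):
    """Collapse a record to one CSV row: one column per tag, repeats joined by sep."""
    columns = {}
    for tag, subfields in record:
        # Subfield codes are dropped here; only the values survive.
        text = " ".join(value for _code, value in subfields)
        columns.setdefault(tag, []).append(text)
    return {tag: sep.join(values) for tag, values in columns.items()}


def unflatten(row, sep=";"):
    """Naive reverse: split each cell on sep and put every piece in subfield a."""
    return [
        (tag, [("a", part)])
        for tag, cell in row.items()
        for part in cell.split(sep)
    ]


# Write the flattened row as real CSV, read it back, and rebuild the record.
row = flatten(record)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
buffer.seek(0)
restored = unflatten(next(csv.DictReader(buffer)))

print("identical round trip:", restored == record)
print("500 fields before:", sum(1 for tag, _ in record if tag == "500"))
print("500 fields after: ", sum(1 for tag, _ in restored if tag == "500"))
```

In this sketch the subfield codes ($c, $x) are gone after the round trip,
and the semicolon inside the first note makes it come back as two separate
500 fields. Avoiding that means escaping separators and carrying subfield
codes and field order in the cells, which is the "whole lot of extra
information" mentioned above.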