Consider adding:<br><br><blockquote style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;" class="gmail_quote">use strict;</blockquote><blockquote style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;" class="gmail_quote">
use warnings;<br></blockquote><div> <br></div>And then fixing anything that warnings complains about. One such fix might be changing the order of the tests here:<br><br><blockquote style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;" class="gmail_quote">
if($ARGV[0] eq '-h' || $ARGV[0] eq '--help' || scalar(@ARGV) < 2 )</blockquote><div><br>becoming:<br><br><blockquote style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;" class="gmail_quote">
if ((scalar(@ARGV) < 2) || $ARGV[0] eq '-h' || $ARGV[0] eq '--help') </blockquote><div><br>so that you check whether the array is populated before trying to access element [0]. Because || short-circuits, $ARGV[0] is never read when fewer than two arguments are given.<br><br>Also be sure to die on things that represent fatal errors, like:<br>
<br><blockquote style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;" class="gmail_quote">
open(FF4,">./$output_file");</blockquote><div><br>should become:<br><br><blockquote style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;" class="gmail_quote">
open(FF4,">./$output_file") or die "Cannot write to $output_file: $!"; </blockquote></div></div></div><br>Because if you can't produce the output file, there is no reason to go on! Dying here surfaces problems such as missing write permissions immediately, instead of letting the script run to completion without producing any output.<br>
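<br>Putting both suggestions together, here is a minimal sketch (not tested against your data). The sub names wants_help and open_output are hypothetical, made up just for illustration, and the three-argument open with a lexical filehandle is my own substitution for the bareword FF4 form, not something the script requires. Note that it also drops the "./" prefix, so the path is used exactly as given.<br>

```perl
use strict;
use warnings;

# Hypothetical helper: true when the usage text should be shown.
# The argument count is tested first, so $args[0] is never read
# from a list that is too short.
sub wants_help {
    my @args = @_;
    return 1 if @args < 2;
    return 1 if $args[0] eq '-h' || $args[0] eq '--help';
    return 0;
}

# Hypothetical helper: open a file for writing and die with the
# operating system's error message ($!) if that fails.
# Assumption: three-argument open with a lexical filehandle,
# instead of the two-argument bareword form used in the script.
sub open_output {
    my ($path) = @_;
    open(my $fh, '>', $path) or die "Cannot write to $path: $!";
    return $fh;
}
```

In the script this might be used as "if (wants_help(@ARGV)) { ...print usage...; exit; }" followed by "my $out = open_output($ARGV[1]);", with the records then written via "print {$out} $newrecord->as_usmarc();".<br>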
<br>--Joe<br><br><div class="gmail_quote">On Mon, Dec 22, 2008 at 12:02 PM, Jeffrey LePage <span dir="ltr"><<a href="mailto:jeffrey_lepage@yahoo.com">jeffrey_lepage@yahoo.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
I've been trying to convert a Sagebrush Athena generated MARC export to a format acceptable to Koha.<br>
<br>
I believe I have a working script that does the conversion.<br>
<br>
I did the final conversion this morning, and the import with bulkmarcimport.pl. I haven't had the chance to look at the catalog in depth, but at first glance everything seems OK. I will now unleash the librarians with instructions to examine the new catalog and make sure everything is good. If I find problems I'll post the new information.<br>
<br>
One recent change: Apparently Koha likes UTF-8, so I explicitly encode any strings in the MARC records as UTF-8.<br>
<br>
For now, here's my script. Please note, the script optionally sets the current and permanent branch for the books (Koha 952$a and 952$b). If you don't want the branch set then don't use this option. If you use this option, then the branch_code should correspond to branches.branchcode in the Koha DB. Is setting the branch a bad idea?<br>
<br>
Use with care, no warranty is expressed or implied, blah blah blah.<br>
<br>
For those of you familiar with Koha/Perl, please comment. Especially let me know if I'm doing something horribly horribly wrong.<br>
<br>
<br>
<br>
<br>
<br>
<br>
PERL CODE FOLLOWS:<br>
********************************<br>
use MARC::Batch;<br>
use Encode;<br>
<br>
<br>
my $input_file;<br>
my $output_file;<br>
my $location;<br>
if($ARGV[0] eq '-h' || $ARGV[0] eq '--help' || scalar(@ARGV) < 2 )<br>
{<br>
print "This converts a MARC file generated by Sagebrush Athena to a file appropriate for Koha\n\n";<br>
print "Usage: perl marcconvert.pl originalmarcfile convertedmarcfile\n";<br>
print "\tor \n";<br>
print "Usage: perl marcconvert.pl originalmarcfile convertedmarcfile branch_code \n\n";<br>
exit;<br>
}<br>
else<br>
{<br>
$input_file = $ARGV[0];<br>
if( -f $input_file )<br>
{<br>
# the file exists and it is a file (not a directory or something else)<br>
}<br>
else<br>
{<br>
print "The input file '$input_file' does not exist\n";<br>
exit;<br>
}<br>
$output_file = $ARGV[1];<br>
}<br>
<br>
if($ARGV[2]){ $location = encode("utf8", $ARGV[2]); }<br>
<br>
my $batch = MARC::Batch->new('USMARC',$input_file);<br>
<br>
open(FF4,">./$output_file");<br>
while ( my $record = $batch->next())<br>
{<br>
my @fields = $record->fields();<br>
my $newrecord = MARC::Record->new();<br>
#$newrecord->leader($record->leader()); # i let MARC::Record generate a leader. Is this wrong?<br>
foreach my $field (@fields)<br>
{<br>
my $tag = $field->tag();<br>
my $newfield;<br>
if($tag < 10)<br>
{<br>
# it has data but no indicators or subfields<br>
my $data = encode("utf8", $field->data()); # Koha likes UTF-8, so we have to convert<br>
$newfield = MARC::Field->new($tag,$data);<br>
}<br>
elsif($tag eq '852')<br>
{<br>
# do data conversion to 952<br>
# Sagebrush Athena puts some stuff in tag 852 but Koha likes it in tag 952<br>
my @subfields = ();<br>
my $athena_k = ''; # Koha 952 $o is composed of 852 $k $h $i $m<br>
my $athena_h = ''; #Koha 952 $o is composed of 852 $k $h $i $m<br>
my $athena_i = ''; #Koha 952 $o is composed of 852 $k $h $i $m<br>
my $athena_m = ''; #Koha 952 $o is composed of 852 $k $h $i $m<br>
foreach my $sub ($field->subfields())<br>
{<br>
my $data = encode("utf8", $sub->[1]);<br>
if($sub->[0] eq 't')<br>
{<br>
push(@subfields,'t',$data);<br>
}<br>
elsif($sub->[0] eq '6')<br>
{<br>
push(@subfields,'y',$data);<br>
}<br>
elsif($sub->[0] eq 'b')<br>
{<br>
#push(@subfields,'b','FHM'); # I set the current and permanent branch manually - see below<br>
}<br>
elsif($sub->[0] eq 'a')<br>
{<br>
#push(@subfields,'a','FHM'); # I set the current and permanent branch manually - see below<br>
}<br>
elsif($sub->[0] eq 'z')<br>
{<br>
push(@subfields,'z',$data);<br>
}<br>
elsif($sub->[0] eq 'k')<br>
{<br>
$athena_k = $data;<br>
}<br>
elsif($sub->[0] eq 'h')<br>
{<br>
$athena_h = $data;<br>
}<br>
elsif($sub->[0] eq 'i')<br>
{<br>
$athena_i = $data;<br>
}<br>
elsif($sub->[0] eq 'm')<br>
{<br>
$athena_m = $data;<br>
}<br>
elsif($sub->[0] eq '9')<br>
{<br>
push(@subfields,'g',$data);<br>
}<br>
elsif($sub->[0] eq '5')<br>
{<br>
push(@subfields,'e',$data);<br>
}<br>
elsif($sub->[0] eq '8')<br>
{<br>
push(@subfields,'d',$data);<br>
}<br>
elsif($sub->[0] eq 'p')<br>
{<br>
push(@subfields,'p',$data);<br>
}<br>
else<br>
{<br>
push(@subfields,$sub->[0],$data);<br>
}<br>
}<br>
#Koha 952 $o is composed of 852 $k $h $i $m<br>
my $koha_o= "$athena_k $athena_h $athena_i $athena_m"; #Koha 952 $o is composed of 852 $k $h $i $m<br>
$koha_o =~ s/^\s+//;<br>
$koha_o =~ s/\s+$//;<br>
$koha_o =~ s/\s{2,}/ /g;<br>
$koha_o = encode("utf8", $koha_o);<br>
push(@subfields,'o',$koha_o);<br>
<br>
if($location)<br>
{<br>
push(@subfields,'a',$location); # I set the current and permanent branch manually - is this a bad idea<br>
push(@subfields,'b',$location); # I set the current and permanent branch manually - is this a bad idea<br>
}<br>
$newfield = MARC::Field->new('952', $field->indicator(1), $field->indicator(2), @subfields );<br>
}<br>
else<br>
{<br>
# no data, but has<br>
# 1) indicators (defined, but not necessarily set)<br>
# 2) subfields<br>
#<br>
# This is for all the tags >= 10 and not tag 852<br>
my @subfields = ();<br>
foreach my $sub ($field->subfields())<br>
{<br>
my $data = encode("utf8", $sub->[1]);<br>
push(@subfields,$sub->[0],$data);<br>
}<br>
$newfield = MARC::Field->new($tag, $field->indicator(1), $field->indicator(2), @subfields );<br>
}<br>
$newrecord->append_fields($newfield);<br>
}<br>
print FF4 $newrecord->as_usmarc();<br>
<br>
}<br>
close(FF4);<br>
<br>
<br>
<br>
</blockquote></div><br>