Hello,

While translating and writing the French documentation for Koha 3.0, I configured Koha as an OAI repository using the OAI-PMH system preferences. It seems to work quite well, but I wonder how the sets defined in the OAI-PMH:Set system preference are used:
- I didn't see any correspondence between a set and any biblio field or document type.
- I didn't find any query in oai.pl that involves sets.
- The XML response to the OAI request oai.pl?verb=ListIdentifiers&metadataPrefix=oai_dc&set=name_of_set is the same for every set value.

Does anyone know about this?

Thanks,
Laurence Lefaucheur
----------------------
Laurence Lefaucheur
Biblibre
http://www.biblibre.com
Hi, On Mon, Dec 22, 2008 at 5:26 AM, Laurence Lefaucheur <laurence.lefaucheur@biblibre.com> wrote:
translating/writing Koha 3.0 french documentation, I configured Koha as OAI repository with system preferences OAI-PMH. It seems to work quite fine, however I wonder how the sets defined in sys pref OAI-PMH:Set are used : - I didn't see the correspondence between Set and any field of biblio, or document types. - I didn't find in oai.pl a query where Sets are involved, - and the xml response to oai verb oai.pl?verb=ListIdentifiers&metadataPrefix=oai_dc&set=name_of_set is the same for every set value.
It appears that the ability to define different sets has not been implemented. Furthermore, the OAI-PMH:Subset syspref is marked as experimental and is not used anywhere. Perhaps Paul can advise whether there's any code that can be contributed to finish this part of the OAI-PMH implementation. Regards, Galen -- Galen Charlton VP, Research & Development, LibLime galen.charlton@liblime.com p: 1-888-564-2457 x709 skype: gmcharlt
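One way to repeat Laurence's check is to build the same request for several set names and compare the responses. The sketch below only constructs the request URLs with core Perl. The base URL and the set names are made up for illustration, and the helper names are not part of Koha. In OAI-PMH 2.0, a repository without set support is expected to answer ListSets with a noSetHierarchy error, so that request is included as well.

```perl
use strict;
use warnings;

# Hypothetical base URL; substitute the address of your own oai.pl.
my $base_url = 'http://koha.example.org/cgi-bin/koha/oai.pl';

# Percent-encode everything outside the RFC 3986 unreserved set.
# Assumes plain ASCII or byte-string input.
sub uri_escape_simple {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Build an OAI-PMH request URL from a verb and optional arguments.
# Arguments are sorted by name so the output is predictable.
sub oai_request_url {
    my ($base, $verb, %args) = @_;
    my @pairs = ("verb=$verb");
    for my $name (sort keys %args) {
        push @pairs, $name . '=' . uri_escape_simple($args{$name});
    }
    return $base . '?' . join('&', @pairs);
}

# ListSets asks the repository which sets it claims to support.
print oai_request_url($base_url, 'ListSets'), "\n";

# The same ListIdentifiers request for two made-up set names.
for my $set ('set_one', 'set_two') {
    print oai_request_url($base_url, 'ListIdentifiers',
        metadataPrefix => 'oai_dc', set => $set), "\n";
}
```

If the two ListIdentifiers URLs return identical documents, the set argument is most likely being ignored, which matches what Laurence observed.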
I've been trying to convert a Sagebrush Athena generated MARC export to a format acceptable to Koha. I believe I have a working script that does the conversion.

I did the final conversion this morning, and the import with bulkmarcimport.pl. I haven't had the chance to look at the catalog in depth, but at first glance everything seems OK. I will now unleash the librarians with instructions to examine the new catalog and make sure everything is good. If I find problems I'll post the new information.

One recent change: Apparently Koha likes UTF-8, so I explicitly encode any strings in the MARC records as UTF-8.

For now, here's my script. Please note, the script optionally sets the current and permanent branch for the books (Koha 952$a and 952$b). If you don't want the branch set then don't use this option. If you use this option, then the branch_code should correspond to branches.branchcode in the Koha DB. Is setting the branch a bad idea?

Use with care, no warranty is expressed or implied, blah blah blah.

For those of you familiar with Koha/Perl, please comment. Especially let me know if I'm doing something horribly horribly wrong.
PERL CODE FOLLOWS:
********************************
use MARC::Batch;
use Encode;

my $input_file;
my $output_file;
my $location;

if($ARGV[0] eq '-h' || $ARGV[0] eq '--help' || scalar(@ARGV) < 2 ) {
    print "This converts a MARC file generated by Sagebrush Athena to a file appropriate for Koha\n\n";
    print "Usage: perl marcconvert.pl originalmarcfile convertedmarcfile\n";
    print "\tor \n";
    print "Usage: perl marcconvert.pl originalmarcfile convertedmarcfile branch_code \n\n";
    exit;
} else {
    $input_file = $ARGV[0];
    if( -f $input_file ) {
        # the file exists and it is a file (not a directory or something else)
    } else {
        print "The input file '$input_file' does not exist\n";
        exit;
    }
    $output_file = $ARGV[1];
}

if($ARGV[2]){
    $location = encode("utf8", $ARGV[2]);
}

my $batch = MARC::Batch->new('USMARC',$input_file);
open(FF4,">./$output_file");

while ( my $record = $batch->next()) {
    my @fields = $record->fields();
    my $newrecord = MARC::Record->new();
    #$newrecord->leader($record->leader()); # i let MARC::Record generate a leader. Is this wrong?

    foreach my $field (@fields) {
        my $tag = $field->tag();
        my $newfield;
        if($tag < 10) {
            # it has data but no indicators or subfields
            my $data = encode("utf8", $field->data()); # Koha like UTF-8, so we have to convert
            $newfield = MARC::Field->new($tag,$data);
        } elsif($tag eq '852') {
            # do data conversion to 952
            # Sagebrush Athena puts some stuff in tag 852 but Koha likes it in tag 952
            my @subfields = ();
            my $athena_k = ''; # Koha 952 $o is composed of 852 $k $h $i $m
            my $athena_h = ''; #Koha 952 $o is composed of 852 $k $h $i $m
            my $athena_i = ''; #Koha 952 $o is composed of 852 $k $h $i $m
            my $athena_m = ''; #Koha 952 $o is composed of 852 $k $h $i $m
            foreach my $sub ($field->subfields()) {
                my $data = encode("utf8", $sub->[1]);
                if($sub->[0] eq 't') {
                    push(@subfields,'t',$data);
                } elsif($sub->[0] eq '6') {
                    push(@subfields,'y',$data);
                } elsif($sub->[0] eq 'b') {
                    #push(@subfields,'b','FHM'); # I set the current and permanent branch manually - see below
                } elsif($sub->[0] eq 'a') {
                    #push(@subfields,'a','FHM'); # I set the current and permanent branch manually - see below
                } elsif($sub->[0] eq 'z') {
                    push(@subfields,'z',$data);
                } elsif($sub->[0] eq 'k') {
                    $athena_k = $data;
                } elsif($sub->[0] eq 'h') {
                    $athena_h = $data;
                } elsif($sub->[0] eq 'i') {
                    $athena_i = $data;
                } elsif($sub->[0] eq 'm') {
                    $athena_m = $data;
                } elsif($sub->[0] eq '9') {
                    push(@subfields,'g',$data);
                } elsif($sub->[0] eq '5') {
                    push(@subfields,'e',$data);
                } elsif($sub->[0] eq '8') {
                    push(@subfields,'d',$data);
                } elsif($sub->[0] eq 'p') {
                    push(@subfields,'p',$data);
                } else {
                    push(@subfields,$sub->[0],$data);
                }
            }
            #Koha 952 $o is composed of 852 $k $h $i $m
            my $koha_o= "$athena_k $athena_h $athena_i $athena_m"; #Koha 952 $o is composed of 852 $k $h $i $m
            $koha_o =~ s/^\s+//;
            $koha_o =~ s/\s+$//;
            $koha_o =~ s/\s{2,}/ /g;
            $koha_o = encode("utf8", $koha_o);
            push(@subfields,'o',$koha_o);
            if($location) {
                push(@subfields,'a',$location); # I set the current and permanent branch manually - is this a bad idea
                push(@subfields,'b',$location); # I set the current and permanent branch manually - is this a bad idea
            }
            $newfield = MARC::Field->new('952', $field->indicator(1), $field->indicator(2), @subfields );
        } else {
            # no data, but has
            # 1) indicators (defined, but not necessarily set)
            # 2) subfields
            #
            # This is for all the tags >= 10 and not tag 852
            my @subfields = ();
            foreach my $sub ($field->subfields()) {
                my $data = encode("utf8", $sub->[1]);
                push(@subfields,$sub->[0],$data);
            }
            $newfield = MARC::Field->new($tag, $field->indicator(1), $field->indicator(2), @subfields );
        }
        $newrecord->append_fields($newfield);
    }
    print FF4 $newrecord->as_usmarc();
}
close(FF4);
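The 852 to 952 branch of the script can also be written as a lookup table, which keeps the subfield mapping in one place. The following is only a sketch of that idea, not the script as posted: the function name convert_852_subfields and the sample values are invented, it works on plain [code, value] pairs (the shape that MARC::Field->subfields returns) so it runs without the MARC modules, and it leaves out the UTF-8 encoding step. It assumes Perl 5.10 or later for the // operator. Unlike the original, it does not trim or collapse whitespace inside individual subfield values.

```perl
use strict;
use warnings;

# 852 subfield code => 952 subfield code, following the posted script.
my %copy_as = (
    t => 't', 6 => 'y', z => 'z', 9 => 'g',
    5 => 'e', 8 => 'd', p => 'p',
);

# 852 subfields that are joined into 952 $o, in this order.
my @call_number_parts = qw(k h i m);

# 852 subfields that are dropped because the branch is set separately.
my %skip = (a => 1, b => 1);

# Takes an optional branch code and a list of [code, value] pairs,
# and returns a flat code/value list suitable for a 952 field.
sub convert_852_subfields {
    my ($branch, @pairs) = @_;
    my (@out, %part);
    for my $pair (@pairs) {
        my ($code, $value) = @$pair;
        if ($skip{$code}) {
            next;
        } elsif (grep { $_ eq $code } @call_number_parts) {
            $part{$code} = $value;
        } else {
            # Codes without a mapping are passed through unchanged.
            push @out, ($copy_as{$code} // $code), $value;
        }
    }
    # Join only the parts that are present, so no stray spaces appear.
    my $call_number = join ' ',
        grep { defined($_) && length($_) } map { $part{$_} } @call_number_parts;
    push @out, 'o', $call_number;
    if (defined($branch) && length($branch)) {
        push @out, 'a', $branch, 'b', $branch;
    }
    return @out;
}

my @converted = convert_852_subfields(
    'MAIN',    # made-up branch code
    [ h => 'FIC' ], [ i => 'SMI' ], [ p => '30001' ], [ a => 'Athena' ],
);
print join('|', @converted), "\n";
```

With the table in place, supporting another Athena subfield is a one-line change to %copy_as rather than a new elsif branch.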
Hi Jeffrey, On Mon, Dec 22, 2008 at 12:02 PM, Jeffrey LePage <jeffrey_lepage@yahoo.com> wrote:
For now, here's my script. Please note, the script optionally sets the current and permanent branch for the books (Koha 952$a and 952$b). If you don't want the branch set then don't use this option. If you use this option, then the branch_code should correspond to branches.branchcode in the Koha DB. Is setting the branch a bad idea?
Setting the branch is a good idea - in fact, no Koha item record should leave them unset. Thanks for sending this script. Would you consider uploading it to contribs.koha.org? Regards, Galen
I made some changes to my script for converting a Sagebrush Athena MARC export to a format acceptable to Koha. I uploaded it to contribs.koha.org. It's now called athenamarc2koha.pl

http://contribs.koha.org/extension_view.php?eid=15

It's just an itty-bitty script, but I'm sure some will find it useful. Thanks to Joe Atzberger and others for guidance and suggestions.

--
Please avoid sending me Word or PowerPoint attachments. See http://www.gnu.org/philosophy/no-word-attachments.html
Hi Jeffrey, On Mon, Dec 22, 2008 at 6:54 PM, Jeffrey LePage <jeffrey_lepage@yahoo.com> wrote:
I made some changes to my script for converting a Sagebrush Athena MARC export to a format acceptable to Koha. I uploaded it to contribs.koha.org. It's now called athenamarc2koha.pl
http://contribs.koha.org/extension_view.php?eid=15
It's just an itty-bitty script, but I'm sure some will find it useful.
Thanks for your contribution! Regards, Galen
Consider adding:

use strict;
use warnings;

and then fixing anything that warnings complains about. One fix might be changing the order here:

if($ARGV[0] eq '-h' || $ARGV[0] eq '--help' || scalar(@ARGV) < 2 )

becoming:

if ((scalar(@ARGV) < 2) || $ARGV[0] eq '-h' || $ARGV[0] eq '--help')

so that you check whether the array is populated before trying to access its [0]th element.

Also be sure to die on things that represent fatal errors. For example:

open(FF4,">./$output_file");

should become:

open(FF4,">./$output_file") or die "Cannot write to $output_file: $!";

because if you can't produce the output file, there isn't any reason to go on! This helps catch user permissions errors at runtime.

--Joe
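Joe's suggestions could be combined into a start-up section along the following lines. This is a sketch under a few assumptions rather than the uploaded athenamarc2koha.pl: the parse_args helper is invented for illustration, and the three-argument open with a lexical filehandle is a substitution of mine for the bareword FF4 handle, not something proposed in the thread.

```perl
use strict;
use warnings;

# Returns a hash of settings, or an empty list when usage help is wanted.
sub parse_args {
    my @args = @_;
    # Check the count first so $args[0] is never read from an empty list.
    return if @args < 2 || $args[0] eq '-h' || $args[0] eq '--help';
    my %opt = (input => $args[0], output => $args[1]);
    if (defined($args[2]) && length($args[2])) {
        $opt{branch} = $args[2];
    }
    return %opt;
}

my %opt = parse_args(@ARGV);
if (!%opt) {
    print "Usage: perl marcconvert.pl originalmarcfile convertedmarcfile [branch_code]\n";
} else {
    # Fatal problems stop the run with a message instead of continuing.
    -f $opt{input} or die "The input file '$opt{input}' does not exist\n";
    open(my $out, '>', $opt{output})
        or die "Cannot write to $opt{output}: $!";
    # ... the conversion loop would print each record to $out here ...
    close($out) or die "Cannot close $opt{output}: $!";
}
```

Keeping the argument check in its own function makes it easy to exercise without touching any files.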
_______________________________________________ Koha mailing list Koha@lists.katipo.co.nz http://lists.katipo.co.nz/mailman/listinfo/koha