[Koha] More export/import questions

Scott Owen sowen at edzone.net
Tue Aug 17 02:02:41 NZST 2010


Not exactly a "fix", but this will help you identify records that may have this issue (more than one 952 field contains the same barcode).
In my case it was only 20 records out of 4,000+, all in the very first barcode range.
I suspect that they may have originally been imported twice.

Cut and paste the script below into a blank file and save it as get952.pl.
Read the README at the top of the script.
Not pretty code, but it should work.


##################################StartScript################################
#!/usr/bin/perl
## #### README #####
##
## Copy this script to the same directory as the "koha.mrk" file.
## run as get952.pl > OutputFileName
## example: perl get952.pl > output.txt
## 
## This script reads a Koha MARC file produced from MARCedit (the "koha.mrk" file defined below)
## and writes the 952 subfield "$p" data (barcode), one barcode per line, to a temporary file
## (tempfile.txt).
## The tempfile.txt file is then parsed (a backup file "tempfile.txt.bac" is created during the
## process).
## The output file will contain the barcode and the number of times that barcode was found in a 952 field.
##
## Example output:
## p70240
##  = 2
## p70241
##  = 2
## p70249
##  = 1
## p70299
##  = 1
##
## Both the "tempfile.txt" and "tempfile.txt.bac" are deleted at the end of the script.
## The original "koha.mrk" file is never changed. 
## The MARCedit file MUST BE the *.mrk file and NOT a *.mrc file.
## You can change the input file (coded as "koha.mrk" below).
## The "count" portion of this script was copied from perlmonks.org
## 
###################


######
$MARCfile = "koha.mrk";
###### 


my $tempfile = "tempfile.txt";

# Open the input .mrk file for reading and the temp file for writing.
open(MARCfile, "<", $MARCfile) or die("Could not open $MARCfile: $!");
open FILE, ">", $tempfile or die "unable to open $tempfile: $!";


# Write every 952 $p subfield (the barcode) to the temp file, one per line.
foreach $line (<MARCfile>) {
    chomp($line);
    if ($line =~ m/^=952/i) {
        my @values = split(/\$/, $line);
        foreach my $val (@values) {
            if ($val =~ m/^p/i) {
                print FILE "$val\n";
            }
        }
    }
}
close FILE;
close MARCfile;

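# Re-read the temp file in place (the original is kept as tempfile.txt.bac),
# keeping the first copy of each barcode line and counting every occurrence.
my %seen = ();
{
  local @ARGV = ($tempfile);
  local $^I = '.bac';
  while(<>){
     $seen{$_}++;
     next if $seen{$_} > 1;
     print;
  }
}
# Print the counts, most frequent first. Each key still ends with its newline,
# which is why the count appears on the line below the barcode.
foreach my $keys  ( sort {$seen{$b} <=> $seen{$a}} keys %seen) {
  print "$keys = $seen{$keys}\n";
}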

unlink($tempfile);
unlink("tempfile.txt.bac");

##################################EndScript#############################
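
For anyone who would rather avoid the temporary files, the same count can probably be done in a single pass with a hash. The following is only a sketch under a few assumptions: count_barcodes() is a made-up helper name (not part of Koha or MARCedit), the sample lines are the ones quoted in this thread, and carriage returns are stripped in case the .mrk file was saved with Windows line endings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count how often each 952 $p (barcode) value occurs
# in MARCedit mnemonic (.mrk) text read from the given filehandle.
sub count_barcodes {
    my ($fh) = @_;
    my %count;
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;            # strip LF or CRLF line endings
        next unless $line =~ /^=952/;    # only item (952) fields
        # Subfields are delimited by "$"; a literal dollar sign in the data
        # is written as {dollar}, so splitting on "$" is safe here.
        foreach my $subfield (split /\$/, $line) {
            $count{$1}++ if $subfield =~ /^p(.+)/;
        }
    }
    return %count;
}

# Sample data copied from this thread. For a real export, open the file instead:
#   open(my $fh, '<', 'koha.mrk') or die "Could not open koha.mrk: $!";
my $sample = <<'END';
=942  \\$cBOOK
=952  \\$bPINE$dPINE$kF HAN$p74238$r{dollar}11.96$t3$v2010-05-13$u4247
=952  \\$bPINE$r0.00$w2008-08-17$u266$kF HAN$p70265$v2008-08-17
=952  \\$bPINE$r0.00$dPINE$w2010-05-13$u4247$kF HAN$p74238$v2010-05-13
=952  \\$bPINE$dPINE$kF HAN$p70262$r{dollar}13.99$t2$u263
=952  \\$bPINE$dPINE$kF HAN$p70262$r0.00$v2008-08-17$w2008-08-17$u263
=952  \\$bPINE$dPINE$kF HAN$p70265$r{dollar}10.96$t5$u266
=952  \\$bPINE$dPINE$kF HAN$p74255$r{dollar}10.96$t6$u4264
END
open(my $fh, '<', \$sample) or die "Could not open sample data: $!";
my %count = count_barcodes($fh);
close $fh;

# Report only barcodes seen more than once, most frequent first.
foreach my $barcode (sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count) {
    print "$barcode = $count{$barcode}\n" if $count{$barcode} > 1;
}
```

With the sample lines above this prints "70262 = 2", "70265 = 2" and "74238 = 2", each on its own line. Barcode 74255 occurs only once, so it is not listed.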








>>> Joel Harbottle<joel.harbottle at hotmail.com.au> 8/12/2010 11:15 AM >>>
Hi Scott,

As a Cataloguer, I have just noticed another potential problem in your snip from MarcEdit. I have copied and pasted your MarcEdit snip below, and have also put the problem(s) in bold text.


*************************************
=942  \\$cBOOK
=952  \\$bPINE$dPINE$kF HAN$p74238$r{dollar}11.96$t3$v2010-05-13$u4247
=952  \\$bPINE$r0.00$w2008-08-17$u266$kF HAN$p70265$v2008-08-17
=952  \\$bPINE$r0.00$dPINE$w2010-05-13$u4247$kF HAN$p74238$v2010-05-13
=952  \\$bPINE$dPINE$kF HAN$p70262$r{dollar}13.99$t2$u263
=952  \\$bPINE$dPINE$kF HAN$p70262$r0.00$v2008-08-17$w2008-08-17$u263
=952  \\$bPINE$dPINE$kF HAN$p70265$r{dollar}10.96$t5$u266
=952  \\$bPINE$dPINE$kF HAN$p74255$r{dollar}10.96$t6$u4264
**************************************

So, in your example, you have the same barcode(s) utilised twice in the listing of Items. The only exception to this is the last line of Items, with Barcode '74255'.

If you are able to find a fix to this problem, would you be able to either send the fix to me via email, or send to the entire list?

Best Wishes
Joel



Joel Harbottle
Library Technician
Library Services (Tasmania)
Email: Joel.Harbottle at hotmail.com.au



Date: Thu, 12 Aug 2010 09:03:55 -0400
From: sowen at edzone.net
To: koha at lists.katipo.co.nz
Subject: [Koha] More export/import questions


Hi all,
Thank you all for the responses to my previous questions about exporting/importing biblio data.
I think I have a grasp of what needs to be done.
(seems to be a bit bigger job than I was hoping for...)

With that said.... I noticed another potential issue (maybe?).

*************************************
=942  \\$cBOOK
=952  \\$bPINE$dPINE$kF HAN$p74238$r{dollar}11.96$t3$v2010-05-13$u4247
=952  \\$bPINE$r0.00$w2008-08-17$u266$kF HAN$p70265$v2008-08-17
=952  \\$bPINE$r0.00$dPINE$w2010-05-13$u4247$kF HAN$p74238$v2010-05-13
=952  \\$bPINE$dPINE$kF HAN$p70262$r{dollar}13.99$t2$u263
=952  \\$bPINE$dPINE$kF HAN$p70262$r0.00$v2008-08-17$w2008-08-17$u263
=952  \\$bPINE$dPINE$kF HAN$p70265$r{dollar}10.96$t5$u266
=952  \\$bPINE$dPINE$kF HAN$p74255$r{dollar}10.96$t6$u4264
**************************************

In the above snip from MARCedit, the 3rd line and the 7th line both reference the same item ($p70265 -- barcode # 70265).
Is this normal?
Will the data in line 7 overwrite the data from line 3?
ie: if line 3 contains the $v2008-08-17 field data, and line 7 does not, will the result be an empty $v field?
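
To see exactly how two 952 lines with the same barcode differ, something like the following sketch might help. It does not say anything about what Koha does with such lines on import. It only lists the subfields whose values differ or are missing. parse_952() is a made-up helper name, the two sample lines are the 70265 lines from the snip above, and repeated subfield codes within one line are not handled (the last one wins).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn one =952 line into a hash of subfield code => value.
sub parse_952 {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                           # strip LF or CRLF
    my (undef, @subfields) = split /\$/, $line;     # drop the tag/indicator part
    my %item;
    foreach my $subfield (@subfields) {
        next unless length $subfield;
        $item{ substr($subfield, 0, 1) } = substr($subfield, 1);
    }
    return \%item;
}

# The two lines for barcode 70265, copied from the snip in this thread.
my $sample = <<'END';
=952  \\$bPINE$r0.00$w2008-08-17$u266$kF HAN$p70265$v2008-08-17
=952  \\$bPINE$dPINE$kF HAN$p70265$r{dollar}10.96$t5$u266
END

# Group the parsed 952 lines by their $p (barcode) value.
my %by_barcode;
foreach my $line (split /\n/, $sample) {
    next unless $line =~ /^=952/;
    my $item = parse_952($line);
    push @{ $by_barcode{ $item->{p} } }, $item if defined $item->{p};
}

# For each barcode that appears more than once, show the subfields that differ.
foreach my $barcode (sort keys %by_barcode) {
    my @items = @{ $by_barcode{$barcode} };
    next unless @items > 1;
    my %codes = map { $_ => 1 } map { keys %$_ } @items;
    foreach my $code (sort keys %codes) {
        my @values = map { defined $_->{$code} ? $_->{$code} : '(missing)' } @items;
        my %distinct = map { $_ => 1 } @values;
        next if keys(%distinct) == 1;               # same value in every line
        print "barcode $barcode \$$code: ", join(' | ', @values), "\n";
    }
}
```

For the two sample lines this reports differences in $d, $r, $t, $v and $w. For example, $v is 2008-08-17 in the first line and missing in the second.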

Should I be concerned with this, or is this a normal occurrence?

Thank you again for any insight.

-Scott Owen
Alma Public Schools


 


