I am having difficulty setting up a record matching rule for OCLC# for new records being imported into the system. I am trying to match with tag 001 and subfield a.

Evelyn W. Behar
Metadata Librarian
New York University Health Sciences Libraries
212.263.8615
evelyn.behar@med.nyu.edu

------------------------------------------------------------
This email message, including any attachments, is for the sole use of the intended recipient(s) and may contain information that is proprietary, confidential, and exempt from disclosure under applicable law. Any unauthorized review, use, disclosure, or distribution is prohibited. If you have received this email in error please notify the sender by return email and delete the original message. Please note, the recipient should check this email and any attachments for the presence of viruses. The organization accepts no liability for any damage caused by any virus transmitted by this email.
=================================
Hi Evelyn,

We use the following rule with success:

Search index: utility
Score: 1100
Tag: 001
Subfields: a
Offset: 0
Length: 0
Normalization rule: ISBN

You can assign whatever number you like (over 1,000, I think) in the "Score" box. This is part of a multi-matchpoint rule that we have set up; we assign a different score to each matchpoint, with the idea that the match score will tell us which point(s) the match was made on.

I hope this helps you.

Nora
________________________
Nora Blake
MassCat Manager
Massachusetts Library System
P.O. Box 241
South Deerfield, MA 01373-0241
413-665-9898 x123
866-MASSCAT (627-7228)
Email: nblake@masslibsystem.org
AIM: noraatmls

From: koha-bounces@lists.katipo.co.nz [mailto:koha-bounces@lists.katipo.co.nz] On Behalf Of Behar, Evelyn
Sent: Tuesday, July 27, 2010 10:52 AM
To: 'koha@lists.katipo.co.nz'
Subject: [Koha] OCLC Record Matching Rule

I am having difficulty setting up a record matching rule for OCLC# for new records being imported into the system. I am trying to match with tag 001 and subfield a.
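The idea of giving each matchpoint its own score, so that the total reveals which points matched, can be sketched roughly as follows. This is only an illustration, not Koha code: it assumes the total match score is the plain sum of the scores of the matchpoints that hit, and the matchpoint names and every score other than 1100 are made up.

```python
from itertools import combinations

# Hypothetical matchpoint scores; only the 1100 for tag 001 comes from the rule above.
MATCHPOINT_SCORES = {
    "001 control number": 1100,
    "020 ISBN": 1200,
    "245 title": 1400,
}


def matchpoints_for(total_score, scores=MATCHPOINT_SCORES):
    """Return every set of matchpoints whose scores add up to total_score.

    Assumes the total is the plain sum of the matchpoints that hit.
    """
    names = sorted(scores)
    hits = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            if sum(scores[name] for name in combo) == total_score:
                hits.append(set(combo))
    return hits


print(matchpoints_for(1100))
print(matchpoints_for(2300))
```

If two different combinations can reach the same total, the returned list has more than one entry and the score alone is ambiguous, so under this assumption the scores are best chosen so that every possible sum is unique.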
I worked on tweaking Nora's rule to work on my system (running DB rev 3.01.00.144, just a few behind HEAD), and got the following as a working, stand-alone rule:

Match threshold: 100

Matchpoints (just the one):
  Search index: Control-number
  Score: 101
  Tag: 001
  Subfields: a
  Offset: 0
  Length: 0
  Normalization rule: Control-number

Required match checks: none (remove the blank one)

I'd love to see further documentation on how to choose the search indexes and normalization rules... or perhaps a dropdown of implemented options.

-Ian

(In reply to Nora Blake <nblake@masslibsystem.org>, 2010/7/27.)
_______________________________________________ Koha mailing list Koha@lists.katipo.co.nz http://lists.katipo.co.nz/mailman/listinfo/koha
--
Ian Walls
Lead Development Specialist
ByWater Solutions
Phone # (888) 900-8944
http://bywatersolutions.com
ian.walls@bywatersolutions.com
Twitter: @sekjal
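How the match threshold and the matchpoint score in the stand-alone rule relate can be sketched like this. It is a simplified model rather than Koha's implementation: it assumes a candidate record counts as a match when the summed scores of the matchpoints that hit reach the threshold, and `is_match` is a hypothetical helper.

```python
def is_match(matched_scores, threshold):
    """Return True when the summed matchpoint scores reach the threshold.

    Assumption: the scores of all matchpoints that hit are simply added up.
    """
    return sum(matched_scores) >= threshold


# Stand-alone rule from the message above: one matchpoint worth 101, threshold 100.
print(is_match([101], threshold=100))  # the 001 matchpoint hit
print(is_match([], threshold=100))     # no matchpoint hit
```

Under this model, a single matchpoint only needs a score at or above the threshold, which is consistent with using 101 against a threshold of 100.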
Me too! I've asked on the list for someone to point me in the right direction; I have no clue how to write the manual pages for this feature, though.

Nicole

(In reply to Ian Walls <ian.walls@bywatersolutions.com>, 2010/8/6.)
Folks, search indexes are in Zebra's record.abs file. For instance, for biblionumber (999$c), you should see the label Local-number. This is the search index.

Thanks,
Savitra Sirohi
Nucsoft OSS Labs
http://www.osslabs.biz

(In reply to Nicole Engard <nengard@gmail.com>, Sat, Aug 7, 2010 at 7:02 PM.)
Thanks for this! What about the normalization rule?

Thanks again,
Nicole

(In reply to savitra sirohi <savitra.sirohi@osslabs.biz>, Sat, Aug 7, 2010 at 12:39 PM.)
Hi Nicole, Savitra,

From what I have discovered about search indexing, two configuration files have an impact on this. The actual term that you search for is defined in "ccl.properties", and the file "record.abs" tells the Zebra indexer what data to extract from the MARC data. So conceptually, record.abs tells Zebra how to fill the search index and ccl.properties tells you what you can search for.

Ideally both files should tell a similar story, but it is not always so. If a field is in record.abs but not in ccl.properties, it will be in the search index files but you will not be able to search on it. Conversely, if it is in ccl.properties but not in record.abs, you will be able to search on it but there will be no data indexed.

I have successfully used "Control-number", which maps to the 001 MARC tag, in record matching rules. Some searchable tags can be called a number of different things; for instance, you can search for au, aut or Author, all of which equate to author name. The following is from the delivered ccl.properties:

    #Author-name    1003   A personal or corporate author,   100, 110, 111, 400
    #                      or a conference or meeting        410, 411, 700, 710,
    #                      name. (No subject name            711, 800, 810, 811
    #                      headings are included.)
    Author 1=1003 s=pw
    au Author
    aut 1=1003

So if you are setting up a record matching rule and you want to see whether the index you chose works, you can search in the simple search box for (e.g.):

    Control-number="123456"

or

    Control-number:123456

or similar. If it does not work for data you know to be there, check that record.abs puts the data into the search index and that ccl.properties enables you to search on it.

I have not used a normalization rule, sorry.

I hope this is helpful.

Ian

(In reply to Nicole Engard, 09/08/2010 13:10.)
--
Ian Bays
Director of Projects
PTFS Europe.com
mobile: +44 (0) 7774995297
phone: +44 (0) 800 756 6803
skype: ian.bays
email: ian.bays@ptfs-europe.com
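The check described in this message, that an index should appear in both record.abs and ccl.properties, can be roughed out as a small script. This is a sketch under several assumptions: the two sample strings are simplified stand-ins for the real files, the attribute number 9999 is a placeholder, and the comparison is by name only, so it ignores attribute numbers and will report aliases such as au and aut as unindexed.

```python
# Simplified, made-up stand-ins for the two configuration files.
RECORD_ABS_SAMPLE = """\
melm 001 Control-number
melm 999$c Local-number,Local-number:n
melm 100$a Author,Author:p
"""

CCL_PROPERTIES_SAMPLE = """\
# Attribute numbers other than 1003 are placeholders.
Control-number 1=9999
Author 1=1003 s=pw
au Author
aut 1=1003
"""


def indexes_in_record_abs(text):
    """Collect index names from 'melm <tag> <index>[,<index>...]' lines."""
    names = set()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "melm":
            for entry in parts[2].split(","):
                # Drop any ':x' suffix after the index name.
                names.add(entry.split(":")[0])
    return names


def names_in_ccl_properties(text):
    """Collect the searchable names (first word of each non-comment line)."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            names.add(line.split()[0])
    return names


indexed = indexes_in_record_abs(RECORD_ABS_SAMPLE)
searchable = names_in_ccl_properties(CCL_PROPERTIES_SAMPLE)

print("Indexed but not searchable by that name:", sorted(indexed - searchable))
print("Searchable name with no index of that name:", sorted(searchable - indexed))
```

Anything in the first list is a candidate for the "indexed but not searchable" case, and the second list needs a manual look because aliases legitimately appear there.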
Can you guys (and anyone else) take a look at the new stuff I documented:
http://git.koha-community.org/gitweb/?p=kohadocs.git;a=commit;h=d299fff10f18...

There are still some question marks in my documentation that I'd love some help filling in. I didn't go into all of the stuff Ian wrote yet because it was more advanced than I wanted to get into in this starting documentation.

Nicole
Hi Nicole,

The only bit I can comment on is this bit:

+ <para>'Search index' can be found by looking at the zebra.abs
+ file on your system which tells the zebra indexing what data
+ to extract from the MARC data</para>

It should not be "zebra.abs" or even "record.abs"; it should be ccl.properties.

The text should then say: "'Search index' can be found by looking at the ccl.properties file on your system which tells the zebra indexing what data to search for in the MARC data".

The following link shows the content of ccl.properties (as it was when documented):
http://koha.org/documentation/manual/3.0/searching/guide-to-searching/ccl-in...

I hope that is OK.

Ian

(In reply to Nicole Engard, 09/08/2010 16:07.)
--
Ian Bays
Director of Projects
PTFS Europe.com
mobile: +44 (0) 7774995297
phone: +44 (0) 800 756 6803
skype: ian.bays
email: ian.bays@ptfs-europe.com
Thank you!! Also, I will have to create a new (updated) version of that page for the new manual.

New commit:
http://git.koha-community.org/gitweb/?p=kohadocs.git;a=commit;h=22216d5af2b0...

Nicole

(In reply to Ian Bays <ian.bays@ptfs-europe.com>, Mon, Aug 9, 2010 at 11:46 AM.)
Nicole, on normalization rules: I am looking at the code, and I can see there is a default normalization routine that removes characters like commas and semicolons, but nothing else. So from what I can tell, it should not matter what you put in the normalization rule field; it will always yield the same result.

Thanks,
Savitra Sirohi
Nucsoft OSS Labs
http://www.osslabs.biz

(In reply to Nicole Engard <nengard@gmail.com>, Mon, Aug 9, 2010 at 5:40 PM.)
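The behaviour described here, one default normalization applied whatever rule name is entered, can be modelled in a few lines. This is a hypothetical sketch, not the Koha routine: the thread only names commas and semicolons, so the character set below is an assumption, and `normalize_for_rule` is an illustrative name.

```python
import re

# Assumed set of punctuation to drop; the thread only names commas and semicolons.
PUNCTUATION = re.compile(r"[,;]")


def normalize(value):
    """Strip the assumed punctuation and surrounding whitespace from a match value."""
    return PUNCTUATION.sub("", value).strip()


def normalize_for_rule(value, rule_name):
    """Model of the behaviour described above: the rule name is ignored."""
    return normalize(value)


print(normalize_for_rule("ocm12345678;", "ISBN"))
print(normalize_for_rule("ocm12345678;", "Control-number"))
```

Both calls give the same value, which is the point being made: under this reading, "ISBN" and "Control-number" in the normalization rule field would behave identically.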
Cool! So maybe we don't need that as an option on the page, since no matter what value you assign, the same thing will happen.

(In reply to savitra sirohi <savitra.sirohi@osslabs.biz>, Mon, Aug 9, 2010 at 2:24 PM.)
New commit:
http://git.koha-community.org/gitweb/?p=kohadocs.git;a=commit;h=0a137e13e6d8...

I'm up for any other tips for the question marks.

(In reply to Nicole Engard <nengard@gmail.com>, Mon, Aug 9, 2010 at 2:31 PM.)
Silly question here: isn't 001 the LOC number, and 040 the OCLC number? I'm asking because I'm adding what I learned in this thread to the manual for others to benefit from later.

Nicole

2010/7/27 Behar, Evelyn <Evelyn.Behar@med.nyu.edu>:
I am trying to match with tag 001 and subfield a.
Hi Nicole,

Field 001 contains the control number of the organization that is identified in field 003. Typically field 001 should contain the control number of the local system.

MARC 21 Format for Bibliographic Data, 001 - Control Number
http://www.loc.gov/marc/bibliographic/bd001.html

Contains the control number assigned by the organization creating, using, or distributing the record. For interchange purposes, documentation of the structure of the control number and input conventions should be provided to exchange partners by the organization initiating the interchange.

The MARC code identifying whose system control number is present in field 001 is contained in field 003 (Control Number Identifier).

An organization receiving a record may move the incoming control number from field 001 (and the control number identifier from field 003) to field 035 (System Control Number), 010 (Library of Congress Control Number), or 016 (National Bibliographic Agency Control Number), as appropriate, and place its own system control number in field 001 (and its control number identifier in field 003).

(In reply to Nicole Engard, Mon, 2010-08-09 at 08:09 -0400.)
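The last paragraph of the quoted definition, moving an incoming 001/003 into 035 and putting the local control number in 001, might look like this in outline. It is a sketch only: the record is a plain dict standing in for a MARC record rather than any real MARC library, `adopt_record` and the local values are hypothetical, and writing the 035 value as the identifier in parentheses followed by the number is assumed here as the usual convention.

```python
def adopt_record(record, local_number, local_code):
    """Move the incoming 001/003 into an 035 and set the local 001/003.

    `record` is a plain dict used as a stand-in for a MARC record:
    control fields map to strings, "035" maps to a list of $a values.
    """
    updated = dict(record)
    incoming_number = record.get("001")
    incoming_code = record.get("003")
    system_numbers = list(record.get("035", []))
    if incoming_number:
        # Assumed 035 $a layout: "(identifier)number".
        prefix = f"({incoming_code})" if incoming_code else ""
        system_numbers.append(prefix + incoming_number)
    updated["035"] = system_numbers
    updated["001"] = local_number
    updated["003"] = local_code
    return updated


# Made-up incoming record and local identifiers.
incoming = {"001": "12345678", "003": "OCoLC", "245": "An example title"}
print(adopt_record(incoming, local_number="42", local_code="XX-EXAMPLE"))
```

After such a move, a rule matching on tag 001 would compare local control numbers, so matching on the original supplier number would need to look at 035 instead.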
On Mon, Aug 9, 2010 at 12:24 PM, Ville Huhtala <vthuhtala@netscape.net> wrote:
Field 001 contains the control number of the organization which is identified in the field 003. Typically field 001 should contain the control number of the local system.
Thanks Ville, I was thinking of the OCLC number, not the control number, which is why I was confused. I did update the documentation with the proper definition.

Nicole
participants (7)
- Behar, Evelyn
- Ian Bays
- Ian Walls
- Nicole Engard
- Nora Blake
- savitra sirohi
- Ville Huhtala