008 Fixed length data elements: problems with imported records in 3.12
Something is happening to our 008 fields when we edit the 008 field of an imported record (converted from our old database - this doesn't seem to be a problem with Z39.50 downloads). I don't know if it's always been a problem, or if it's just come up since we moved to 3.12. Our cataloguer only recently reported it, but we don't often need to edit 008 fields.

Here's an example of the imported 008:
130515s2006####stka###gr#####000#0#eng#d

If I edit the record and open the 008 editing window, Koha automatically changes it to:
130515s2006

And this is what it looks like after clearing the field and entering the information manually:
131030s2006 stka gr 000 0 eng d

The problem seems to be with the # characters in the imported 008. Koha slots the information it has into the spaces, working from left to right. When it gets to the first # something happens. It skips over the # character(s) and continues to slot in information, redistributing the other characters in the remaining spaces from left to right. Once the data is redistributed, if the information in a given position doesn't match what is allowed in that position, then Koha uses the default (usually blank, which is what happened in the above example). If a character does match what's allowed, then Koha uses it... which can make for some bizarre entries, because 'a' in one position means something very different in another.

But it also seems to hold on to these incorrect characters, even when you think you have changed them using the editing window. While the field editor displays the corrected information, after you click OK and return to the MARC display, the characters are still in the wrong positions. In order to properly correct the faulty 008 field, I have to manually delete all the incorrect information from the main MARC display (rather than using the field editor) and then enter the correct information either in the MARC display or using the field editor.

I think I need to file a bug report...? Or is it two bugs: 1) the # character and 2) the field editor not correctly updating the information?

Aside from the quirks described above, I have a couple of questions:
How important is the 008 field in the long term? I know it does influence a few display and search functions in Koha.
Is it worth the effort to export the whole catalogue and try to fix this with batch edits?
How do I avoid this problem with the next set of uploads?

--
Elaine Bradtke
Data Wrangler
VWML
English Folk Dance and Song Society | http://www.efdss.org
Cecil Sharp House, 2 Regent's Park Road, London NW1 7AY
Tel +44 (0) 20 7485 2206 (This number is for the English Folk Dance and Song Society in London, England. If you wish to phone me personally, send an e-mail first. I work off site)
--------------------------------------------------------------------------
Registered Company No. 297142
Charity Registered in England and Wales No. 305999
---------------------------------------------------------------------------
"Writing about music is like dancing about architecture" --Elvis Costello (Musician magazine No. 60 (October 1983), p. 52)
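[Editor's illustration] The left-to-right redistribution described in this message can be modelled in a few lines of Python. This is only a sketch of the reported symptom, not Koha's actual code: the `collapse` helper is hypothetical, and it simply assumes that every literal # is dropped and the remaining characters slide left.

```python
# Illustration only: models the behaviour described in the message,
# not Koha's actual implementation.
IMPORTED_008 = "130515s2006####stka###gr#####000#0#eng#d"


def collapse(field: str, bad: str = "#") -> str:
    """Drop every occurrence of `bad`, letting later characters slide left."""
    return "".join(ch for ch in field if ch != bad)


collapsed = collapse(IMPORTED_008)

# The 008 is a fixed-length field of 40 characters; the collapsed value is shorter.
print(len(IMPORTED_008), len(collapsed))

# 008/15-17 (place of publication) before and after the collapse.
print(repr(IMPORTED_008[15:18]), repr(collapsed[15:18]))

# 008/35-37 (language) before and after the collapse.
print(repr(IMPORTED_008[35:38]), repr(collapsed[35:38]))
```

Under that assumption the place-of-publication slot ends up holding characters that belonged to later positions, and the language slot is empty, which is consistent with the editor falling back to defaults.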
Hi Elaine, On Wed, Oct 30, 2013 at 4:31 PM, Elaine Bradtke <eb@efdss.org> wrote:
Here's an example of the imported 008: 130515s2006####stka###gr#####000#0#eng#d
Just to confirm, for the migrated records, "#" in the 008 represents a literal hash character, not a space?
How important is the 008 field in the long term? I know it does influence a few display and search functions in Koha.
In my view, in general it's worth going to some effort to maintain the 008s, as the information encoded in them is either not to be found anywhere else in the record, or if it is, is in the form of free text that can cause indexing indigestion. For example, I'd rather grab the primary language from the 008/35-37 than have to parse language names. Regards, Galen -- Galen Charlton Manager of Implementation Equinox Software, Inc. / The Open Source Experts email: gmc@esilibrary.com direct: +1 770-709-5581 cell: +1 404-984-4366 skype: gmcharlt web: http://www.esilibrary.com/ Supporting Koha and Evergreen: http://koha-community.org & http://evergreen-ils.org
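[Editor's illustration] As a rough sketch of the point about 008/35-37, reading the language code is a fixed-position slice rather than free-text parsing. The function name `primary_language` and its sanity checks are assumptions made for this example; they are not part of Koha or of any MARC library.

```python
from typing import Optional


def primary_language(field_008: str) -> Optional[str]:
    """Return the language code at 008/35-37, or None if the field looks malformed."""
    # A well-formed 008 is exactly 40 characters long.
    if len(field_008) != 40:
        return None
    code = field_008[35:38]
    # Language codes are three lowercase ASCII letters (for example "eng").
    if code.isascii() and code.isalpha() and code.islower():
        return code
    return None


print(primary_language("130515s2006####stka###gr#####000#0#eng#d"))
print(primary_language("130515s2006"))
```

The first call reads the code from the 40-character example in this thread; the second shows a truncated field being rejected instead of yielding a misleading value.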
Hi Galen,

Correct - # is literal. Not a space.

Elaine
On Wed, Oct 30, 2013 at 5:05 PM, Galen Charlton <gmc@esilibrary.com> wrote:
Hi Elaine,
On Wed, Oct 30, 2013 at 4:31 PM, Elaine Bradtke <eb@efdss.org> wrote:
Here's an example of the imported 008: 130515s2006####stka###gr#####000#0#eng#d
Just to confirm, for the migrated records, "#" in the 008 represents a literal hash character, not a space?
We are using a literal '#' (hash). Your question prompted me to check the MARC21 standard, specifically http://www.loc.gov/marc/specifications/specrecstruc.html where we find the following notation in the documentation:

*blank (SP).* ASCII character 20(hex) (represented graphically in MARC 21 documentation as [image: ASCII character 20 (hex)] or #), which is used in *indicators* and data elements containing coded values (and occurs in data content). Generally, blank stands for "undefined," but in some instances it has been assigned a meaning. ASCII name is space.

So our MARC has # where it should have ' '. We'll fix this and re-evaluate this problem. This interpretation is not obvious from the specification.

-Doug-
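[Editor's illustration] The fix described here, replacing the documentation's # placeholder with a real blank, might look like the following before import. This is a minimal sketch under stated assumptions: `normalize_008` is a hypothetical helper that works on the 008 value as a plain string, and wiring it into an actual migration would need a MARC library to read and write the records. It is only meant for the 008, where # is not a defined code value as far as I know; other fields may legitimately contain a hash.

```python
def normalize_008(value: str) -> str:
    """Replace literal '#' placeholders in an 008 value with ASCII blanks."""
    # Refuse to touch values that are not the fixed 40 characters,
    # since their positions cannot be trusted.
    if len(value) != 40:
        raise ValueError(f"008 must be 40 characters, got {len(value)}")
    return value.replace("#", " ")


fixed = normalize_008("130515s2006####stka###gr#####000#0#eng#d")
print(repr(fixed))
print(len(fixed))
```

Because the replacement is one character for one character, every data element stays in its original position.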
Still, there's an underlying quirk - if Koha sees a character it doesn't like in the 008 editing window, it skips over it and collapses the rest of the data. While the non-# characters were originally in the correct place, once we touch that field in the editing window they are moved to a different position, and either ignored because they're not valid or dumped in the wrong place where they are valid, but incorrect.

Of course if it didn't do this, we probably would have never noticed the # issue.

Elaine
Hi, On Wed, Oct 30, 2013 at 7:21 PM, Elaine Bradtke <eb@efdss.org> wrote:
Still, there's an underlying quirk - if Koha sees a character it doesn't like in the 008 editing window, it skips over it and collapses the rest of the data. While the non # characters were originally in the correct place - once we touch that field in the editing window they are moved to a different position, and either ignored because they're not valid or dumped in the wrong place where they are valid, but incorrect.
Of course if it didn't do this, we probably would have never noticed the # issue.
Indeed, this is worth writing up as a bug report; I've certainly run into MARC records in the past that used # where blanks were meant.

Regards,
Galen
participants (3): Doug Kingston, Elaine Bradtke, Galen Charlton