to catalog or not to catalog, that is my question
To catalog or not to catalog, that is my question. :) My name is Eric, and I work in a digital scholarship center in the Libraries at the University of Notre Dame. (When you figure out what digital scholarship is, please tell me.) I've been practicing librarianship for a while now.

For a good time, I spun up a virtual machine and installed Koha on it. (The process was easy, but it did/does require practice.) Then, by hand, I added a tiny number of items to the catalog, complete with a few authority records. It works very well. I'm impressed.

I have also collected about .25 million open access journal articles, using OAI to do this work. At a minimum, each article comes with identifiers, authors, titles, dates, and abstracts. With additional processing, I can extract/compute extents, summaries, keywords, names of people, names of organizations, and names of places. Taken together, all of these attributes form the content of bibliographic records. It would not be too difficult for me to automatically create MARC records for each (or at least most) of the items in my collection. I could then import the whole thing into Koha. Fun!?

What would I gain from such a process? Well, I would have a nice search engine enabling me to find items in the collection. The items could be (re-)syndicated via OAI, and I might be able to implement an SRU interface to the index. If I catalog things, and then supplement the cataloging with Library of Congress authorities, I would end up creating relationships between items, and such relationships form the basis of interesting networks. There has got to be a way to map authorities to Wikidata Q numbers, and if both were implemented and made available on the 'Net, the whole would contribute to the Semantic Web.

On the other hand, the process is not trivial. While I am perfectly able to create an index of the whole without using Koha (Zebra), Koha might do it better than I can. Would the index actually get used? Probably not; I would probably be the only one using the catalog.

All that said, my questions to y'all are, "What sorts of technology gotchas might I encounter if I were to try to import .25 million articles into Koha? And how could I automate assigning authorities to each record?" --

Eric Lease Morgan
Navari Family Center for Digital Scholarship
University of Notre Dame
https://cds.library.nd.edu/
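The metadata-to-MARC transformation described above can be sketched in a few lines. Below is a minimal, hypothetical example using only Python's standard library to emit a MARCXML record from the kinds of fields OAI harvesting typically yields (identifier, author, title, date, abstract). The field and subfield choices follow common MARC21 practice (100 author, 245 title, 260$c date, 520 abstract, 856 link), but a production crosswalk would be considerably richer.

```python
import xml.etree.ElementTree as ET

MARCXML_NS = "http://www.loc.gov/MARC21/slim"


def make_marcxml(identifier, author, title, date, abstract):
    """Build a minimal MARCXML record from OAI-harvested metadata.

    Field choices follow common MARC21 practice: 100 (author),
    245 (title), 260$c (date), 520 (abstract), 856 (link).
    """
    record = ET.Element(f"{{{MARCXML_NS}}}record")

    def datafield(tag, code, value, ind1=" ", ind2=" "):
        # One datafield with a single subfield; real records
        # would carry multiple subfields per field.
        df = ET.SubElement(record, f"{{{MARCXML_NS}}}datafield",
                           {"tag": tag, "ind1": ind1, "ind2": ind2})
        sf = ET.SubElement(df, f"{{{MARCXML_NS}}}subfield", {"code": code})
        sf.text = value

    datafield("100", "a", author, ind1="1")
    datafield("245", "a", title, ind1="1", ind2="0")
    datafield("260", "c", date)
    datafield("520", "a", abstract)
    datafield("856", "u", identifier, ind1="4", ind2="0")
    return ET.tostring(record, encoding="unicode")


# Illustrative values only; a real run would loop over the harvested cache.
xml = make_marcxml("https://example.org/article/1",
                   "Kilgour, Frederick G.",
                   "Design of sequential file systems",
                   "1969",
                   "An early discussion of computerized catalogs.")
```

A batch of such records, wrapped in a MARCXML `collection` element or converted to ISO 2709, would then be ready for loading into Koha.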
On Jul 17, 2023, at 10:43 AM, Eric Lease Morgan <emorgan@nd.edu> wrote:
To catalog or not to catalog, that is my question. ... All that said, my questions to y'all are, "What sorts of technology gotcha's might I encounter if I were to try to import .25 million articles into Koha? And how could I automate assigning authorities to each record?"
Please tell me, what sorts of issues might I encounter if I try to import as many as .25 million MARC records, and what sorts of automated authorities process might you suggest? --Eric Morgan, University of Notre Dame
Hi Eric,

there are a lot of libraries with this many and more records using Koha. It's usually easier and faster to use the command-line script bulkmarcimport.pl for loading records during a migration. The script can load both bibliographic and authority records. For authorities, this might have some helpful information: https://bywatersolutions.com/education/authorities-town-hall-recap

Hope this helps,
Katrin

On 25.07.23 19:00, Eric Lease Morgan wrote:
On Jul 17, 2023, at 10:43 AM, Eric Lease Morgan <emorgan@nd.edu> wrote:
To catalog or not to catalog, that is my question. ... All that said, my questions to y'all are, "What sorts of technology gotcha's might I encounter if I were to try to import .25 million articles into Koha? And how could I automate assigning authorities to each record?"
Please tell me, what sorts of issues might I encounter if I try to import as many as .25 million MARC records, and what sorts of automated authorities process might you suggest? --Eric Morgan, University of Notre Dame
_______________________________________________
Koha mailing list http://koha-community.org Koha@lists.katipo.co.nz Unsubscribe: https://lists.katipo.co.nz/mailman/listinfo/koha
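Automating authority assignment, as asked about above, could start with a string lookup against the Library of Congress linked-data service at id.loc.gov. The sketch below is a hedged illustration, not an official Koha workflow: it builds a query URL for the id.loc.gov "suggest" service and picks the first match, assuming the service returns OpenSearch-style JSON of the form `[query, [labels], [descriptions], [uris]]` — verify the service's current response shape before relying on it. The authority URI in the canned example is a placeholder, not a real record.

```python
import json
import urllib.parse

SUGGEST_BASE = "https://id.loc.gov/authorities/names/suggest/"


def suggest_url(heading):
    """URL for an LC name-authority suggestion query."""
    return SUGGEST_BASE + "?" + urllib.parse.urlencode({"q": heading})


def first_match(response_text):
    """Pick the first (label, URI) pair from an OpenSearch-style
    suggest response: [query, [labels], [descriptions], [uris]].
    Returns None when nothing matched."""
    _query, labels, _descriptions, uris = json.loads(response_text)
    if not labels:
        return None
    return labels[0], uris[0]


# Canned response so the example needs no network; the URI is a placeholder.
canned = json.dumps(["kilgour",
                     ["Kilgour, Frederick G."],
                     [""],
                     ["http://id.loc.gov/authorities/names/n00000000"]])
match = first_match(canned)
url = suggest_url("Kilgour, Frederick G.")
```

Matched URIs could then be written into MARC $0 subfields before import, giving each record a machine-actionable link to its authority — and, via Wikidata's LC-identifier mappings, a possible bridge to Q numbers.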
On Jul 26, 2023, at 1:06 PM, Katrin Fischer <katrin.fischer.83@web.de> wrote:
there are a lot of libraries with this many and more records using Koha. It's usually easier and faster to use the command line script bulkmarcimport.pl for loading records during migration. The script can load bibliographic and authority records.
For authorities this might have some helpful information: https://bywatersolutions.com/education/authorities-town-hall-recap
-- Katrin
For a good time, I have begun to catalog the .25 million journal articles I have cached via OAI. "Thank you, Katrin!" As of right now, I have only done about 8,000 of them, but oh well. While the process is both iterative and requires practice, it is not onerous:

1. install and configure Koha
2. transform rudimentary bibliographic data into MARC
3. use Koha's bulkmarcimport.pl utility to import the MARC
4. evaluate
5. go to Step #2

You can play with the 8,000 records at http://catalog.infomotions.com Here are two example searches: 1) title=marc, and 2) author=kilgour [1, 2]

Currently, my cataloging practice is rudimentary, but I'll get together with some of my colleagues and learn how to create more robust records. There are two things that are different about this implementation. First, I'm cataloging at the journal-article level. Second, I provide links to both the canonical version of each article as well as the cached version. When I'm done, I hope to exploit two cool Koha features: 1) output as OAI-PMH, and 2) query via SRU. Fun with Koha!

[1] marc - https://bit.ly/3OBX3FC
[2] kilgour - https://bit.ly/452wnTJ

--
Eric Lease Morgan
Navari Family Center for Digital Scholarship
University of Notre Dame
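The SRU querying mentioned above is just an HTTP GET carrying a CQL query, so exploiting it takes very little code. Here is a minimal sketch that builds an SRU searchRetrieve URL; the endpoint shown is hypothetical (substitute the catalog's real SRU base URL once the interface is enabled), and the version and record schema are assumptions to be checked against the server's explain response.

```python
import urllib.parse


def sru_search_url(base, query, start=1, maximum=10):
    """Build an SRU searchRetrieve URL for a CQL query.

    Version 1.1 and the marcxml record schema are assumed here;
    confirm both against the server's explain record.
    """
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": query,
        "startRecord": start,
        "maximumRecords": maximum,
        "recordSchema": "marcxml",
    }
    return base + "?" + urllib.parse.urlencode(params)


# Hypothetical endpoint, mirroring the example search title=marc.
url = sru_search_url("http://catalog.example.org/biblios", "title=marc")
```

Fetching that URL (e.g. with urllib.request) would return MARCXML wrapped in an SRU response envelope, ready for the same ElementTree-style parsing used when building the records.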
participants (2)
- Eric Lease Morgan
- Katrin Fischer