to catalog or not to catalog, that is my question
To catalog or not to catalog, that is my question. :) My name is Eric, and I work in a digital scholarship center in the Libraries at the University of Notre Dame. (When you figure out what digital scholarship is, please tell me.) I've been practicing librarianship for a while now.

For a good time, I spun up a virtual machine and installed Koha on it. (The process was easy, but it did/does require practice.) Then, by hand, I added a tiny number of items to the catalog, complete with a few authority records. It works very well. I'm impressed.

I have also collected about .25 million open access journal articles, using OAI to do this work. At a minimum, each article comes with identifiers, authors, titles, dates, and abstracts. With additional processing, I can extract/compute extents, summaries, keywords, names of people, names of organizations, and names of places. Taken together, all of these attributes form the content of bibliographic records. It would not be too difficult for me to automatically create MARC records for each (or at least most) of the items in my collection. I could then import the whole thing into Koha. Fun!?

What would I gain from such a process? Well, I would have a nice search engine enabling me to find items in the collection. The items could be (re-)syndicated via OAI, and I might be able to implement an SRU interface to the index. If I catalog things, and then supplement the cataloging with Library of Congress authorities, I would end up creating relationships between items, and such relationships form the basis of interesting networks. There has got to be a way to map authorities to Wikidata Q numbers, and if both were implemented and made available on the 'Net, the whole would contribute to the Semantic Web.

On the other hand, the process is not trivial. While I am perfectly able to create an index of the whole without using Koha (Zebra), Koha might do it better than I can. Would the index actually get used? Probably not; I would probably be the only one using the catalog.

All that said, my questions to y'all are, "What sorts of technology gotchas might I encounter if I were to try to import .25 million articles into Koha? And how could I automate assigning authorities to each record?" --

Eric Lease Morgan
Navari Family Center for Digital Scholarship
University of Notre Dame
https://cds.library.nd.edu/
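The metadata-to-MARC transformation described above can be sketched in a few lines. Below is a minimal, hypothetical example using only Python's standard library to emit a MARCXML record from the kinds of fields OAI harvesting typically yields (identifier, author, title, date, abstract). The field and subfield choices follow common MARC21 practice (100 author, 245 title, 260$c date, 520 abstract, 856 link), but a production crosswalk would be considerably richer.

```python
import xml.etree.ElementTree as ET

MARCXML_NS = "http://www.loc.gov/MARC21/slim"


def make_marcxml(identifier, author, title, date, abstract):
    """Build a minimal MARCXML record from OAI-harvested metadata.

    Field choices follow common MARC21 practice: 100 (author),
    245 (title), 260$c (date), 520 (abstract), 856 (link).
    """
    record = ET.Element(f"{{{MARCXML_NS}}}record")

    def datafield(tag, code, value, ind1=" ", ind2=" "):
        # One datafield with a single subfield; real records
        # would carry multiple subfields per field.
        df = ET.SubElement(record, f"{{{MARCXML_NS}}}datafield",
                           {"tag": tag, "ind1": ind1, "ind2": ind2})
        sf = ET.SubElement(df, f"{{{MARCXML_NS}}}subfield", {"code": code})
        sf.text = value

    datafield("100", "a", author, ind1="1")
    datafield("245", "a", title, ind1="1", ind2="0")
    datafield("260", "c", date)
    datafield("520", "a", abstract)
    datafield("856", "u", identifier, ind1="4", ind2="0")
    return ET.tostring(record, encoding="unicode")


# Illustrative values only; a real run would loop over the harvested cache.
xml = make_marcxml("https://example.org/article/1",
                   "Kilgour, Frederick G.",
                   "Design of sequential file systems",
                   "1969",
                   "An early discussion of computerized catalogs.")
```

A batch of such records, wrapped in a MARCXML `collection` element or converted to ISO 2709, would then be ready for loading into Koha.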
On Jul 17, 2023, at 10:43 AM, Eric Lease Morgan <emorgan@nd.edu> wrote:
To catalog or not to catalog, that is my question. ... All that said, my questions to y'all are, "What sorts of technology gotcha's might I encounter if I were to try to import .25 million articles into Koha? And how could I automate assigning authorities to each record?"
Please tell me, what sorts of issues might I encounter if I try to import as many as .25 million MARC records, and what sorts of automated authorities process might you suggest? --Eric Morgan, University of Notre Dame
Hi Eric,

there are a lot of libraries with this many and more records using Koha. It's usually easier and faster to use the command-line script bulkmarcimport.pl for loading records during a migration. The script can load both bibliographic and authority records. For authorities, this might have some helpful information: https://bywatersolutions.com/education/authorities-town-hall-recap

Hope this helps,
Katrin

On 25.07.23 19:00, Eric Lease Morgan wrote:
On Jul 17, 2023, at 10:43 AM, Eric Lease Morgan <emorgan@nd.edu> wrote:
To catalog or not to catalog, that is my question. ... All that said, my questions to y'all are, "What sorts of technology gotcha's might I encounter if I were to try to import .25 million articles into Koha? And how could I automate assigning authorities to each record?"
Please tell me, what sorts of issues might I encounter if I try to import as many as .25 million MARC records, and what sorts of automated authorities process might you suggest? --Eric Morgan, University of Notre Dame
_______________________________________________
Koha mailing list http://koha-community.org Koha@lists.katipo.co.nz Unsubscribe: https://lists.katipo.co.nz/mailman/listinfo/koha
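Automating authority assignment, as asked about above, could start with a string lookup against the Library of Congress linked-data service at id.loc.gov. The sketch below is a hedged illustration, not an official Koha workflow: it builds a query URL for the id.loc.gov "suggest" service and picks the first match, assuming the service returns OpenSearch-style JSON of the form `[query, [labels], [descriptions], [uris]]` — verify the service's current response shape before relying on it. The authority URI in the canned example is a placeholder, not a real record.

```python
import json
import urllib.parse

SUGGEST_BASE = "https://id.loc.gov/authorities/names/suggest/"


def suggest_url(heading):
    """URL for an LC name-authority suggestion query."""
    return SUGGEST_BASE + "?" + urllib.parse.urlencode({"q": heading})


def first_match(response_text):
    """Pick the first (label, URI) pair from an OpenSearch-style
    suggest response: [query, [labels], [descriptions], [uris]].
    Returns None when nothing matched."""
    _query, labels, _descriptions, uris = json.loads(response_text)
    if not labels:
        return None
    return labels[0], uris[0]


# Canned response so the example needs no network; the URI is a placeholder.
canned = json.dumps(["kilgour",
                     ["Kilgour, Frederick G."],
                     [""],
                     ["http://id.loc.gov/authorities/names/n00000000"]])
match = first_match(canned)
url = suggest_url("Kilgour, Frederick G.")
```

Matched URIs could then be written into MARC $0 subfields before import, giving each record a machine-actionable link to its authority — and, via Wikidata's LC-identifier mappings, a possible bridge to Q numbers.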
On Jul 26, 2023, at 1:06 PM, Katrin Fischer <katrin.fischer.83@web.de> wrote:
there are a lot of libraries with this many and more records using Koha. It's usually easier and faster to use the command line script bulkmarcimport.pl for loading records during migration. The script can load bibliographic and authority records.
For authorities this might have some helpful information: https://bywatersolutions.com/education/authorities-town-hall-recap
-- Katrin
For a good time, I have begun to catalog the .25 million journal articles I have cached via OAI. "Thank you, Katrin!" As of right now, I have only done about 8,000 of them, but oh well. While the process is both iterative and requires practice, it is not onerous:

1. install and configure Koha
2. transform rudimentary bibliographic data into MARC
3. use Koha's bulkmarcimport.pl utility to import the MARC
4. evaluate
5. go to Step #2

You can play with the 8,000 records at http://catalog.infomotions.com Here are two example searches: 1) title=marc, and 2) author=kilgour [1, 2]

Currently, my cataloging practice is rudimentary, but I'll get together with some of my colleagues and learn how to create more robust records. There are two things that are different about this implementation. First, I'm cataloging at the journal-article level. Second, I provide links to both the canonical version of each article as well as the cached version. When I'm done, I hope to exploit two cool Koha features: 1) output as OAI-PMH, and 2) query via SRU. Fun with Koha!

[1] marc - https://bit.ly/3OBX3FC
[2] kilgour - https://bit.ly/452wnTJ

--
Eric Lease Morgan
Navari Family Center for Digital Scholarship
University of Notre Dame
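The SRU querying mentioned above is just an HTTP GET carrying a CQL query, so exploiting it takes very little code. Here is a minimal sketch that builds an SRU searchRetrieve URL; the endpoint shown is hypothetical (substitute the catalog's real SRU base URL once the interface is enabled), and the version and record schema are assumptions to be checked against the server's explain response.

```python
import urllib.parse


def sru_search_url(base, query, start=1, maximum=10):
    """Build an SRU searchRetrieve URL for a CQL query.

    Version 1.1 and the marcxml record schema are assumed here;
    confirm both against the server's explain record.
    """
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": query,
        "startRecord": start,
        "maximumRecords": maximum,
        "recordSchema": "marcxml",
    }
    return base + "?" + urllib.parse.urlencode(params)


# Hypothetical endpoint, mirroring the example search title=marc.
url = sru_search_url("http://catalog.example.org/biblios", "title=marc")
```

Fetching that URL (e.g. with urllib.request) would return MARCXML wrapped in an SRU response envelope, ready for the same ElementTree-style parsing used when building the records.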
participants (2)
- Eric Lease Morgan
- Katrin Fischer