Hello, and please forgive me if I haven't navigated to precisely the right forum for Koha Newbie questions:

I am the IT Project Manager for a very large public school district currently running the Sagebrush (now Follett) Accent product, which in turn uses the Unicorn engine (2003 version) created by Sirsi.

One of our largest challenges with our existing product is that its architecture depends on two databases: BRS (for full-text search) and OPAC. Nor does it appear that the division of labor between BRS and ORACLE is clean -- we can see that there are patron user indexes and other objects used by BRS that make no sense if the tasks were segregated as simply as they have been portrayed.

Due to the number of libraries we serve (680+) and the number of titles in our OPAC (7,000,000+), a batch job which should run nightly to synchronize BRS temp and permanent indices and Oracle records very often cannot complete after hours. This batch (ADUTEXT) may not run during business hours, as it interferes with ordinary central cataloging tasks, as well as certain tasks (inventory, adding brief MARC records, importing new copies of arriving shipments) in the field.

I am interested in evaluating Koha, but am concerned by the touted "Dual Database" nature of Koha. This sounds very much like a trip down the same dead-end street we've encountered with Accent.

Does anyone have insight & experience to share with me about batch or periodic processes required to maintain Koha (especially "nightly" stuff), or a better understanding of the Koha architecture, and why it would or would not pose similar problems for an operation our size?

Thanks in advance!
-Jon
-- View this message in context: http://www.nabble.com/Koha-Newbie-Questions-tf4429659.html#a12636666 Sent from the Koha - Discuss mailing list archive at Nabble.com.
Hi Jon,

Just a quick note to address your question about the dual database design. In Koha version 2.2.x, we used a design with only one relational database at the back-end. The database design was structured to allow us to divide up a MARC record into its parts (fields, subfields, terms) to allow us to query on any/all of those parts using SQL.

The problem with this design is that it doesn't scale well. Relational databases do a great job of capturing transactions and storing data, but aren't too good at allowing complex query operations on sets and subsets of that data. So for the next version of Koha (3.0), the one that's been making the news lately, we've integrated a textual database engine called Zebra (http://indexdata.dk/zebra) that allows us to index the records (MARC, Holdings, Authorities, etc.) and allow access to them via standard query languages and protocols (such as Z39.50, SRW/U, CQL, CCL, PQF). The data is still stored in a back-end relational database; the index is just one way to get at it ... it's primarily a 'read-only' database and doesn't contain authoritative data.

Generally speaking, there would not be a need to 're-index' Zebra, or to have any periodic batch process to update the index, unless you drastically altered the search index configuration. The index is updated in real time along with transactions that are stored in the relational tables. In the case where you did need to re-index, I've seen Zebra crunch through 7 million records in about 45 minutes.

This is really just a partial answer, because I've got to head out to a conference I'm attending in Champaign IL this week :-). Let us know if it answers some of your questions, and if you have additional ones.
Cheers,
-- Joshua Ferraro SUPPORT FOR OPEN-SOURCE SOFTWARE President, Technology migration, training, maintenance, support LibLime Featuring Koha Open-Source ILS jmf@liblime.com |Full Demos at http://liblime.com/koha |1(888)KohaILS
----- Original Message ----- From: "Jon Bek" <jonbek@yahoo.com> To: koha@lists.katipo.co.nz Sent: Wednesday, September 12, 2007 10:27:23 AM (GMT-0500) America/New_York Subject: [Koha] Koha Newbie Questions
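Joshua's point, that the relational tables stay authoritative while the search index is refreshed as part of each transaction, can be illustrated with a small sketch. This is a toy model only, not Koha's or Zebra's actual code: the `Catalog` class, the `biblio` table layout, and the in-memory dictionary standing in for the text index are all assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict


class Catalog:
    """Toy catalog: SQLite holds the authoritative rows, a dict stands in for the text index."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE biblio (id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
        )
        self.index = defaultdict(set)  # term -> set of biblio ids

    @staticmethod
    def _terms(title):
        # Naive tokenizer: lowercase, split on whitespace.
        return set(title.lower().split())

    def save(self, biblio_id, title):
        """Insert or update a record and refresh its index entries in the same call."""
        row = self.db.execute(
            "SELECT title FROM biblio WHERE id = ?", (biblio_id,)
        ).fetchone()
        if row is not None:
            # Drop index entries that pointed at the old version of the record.
            for term in self._terms(row[0]):
                self.index[term].discard(biblio_id)
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT OR REPLACE INTO biblio (id, title) VALUES (?, ?)",
                (biblio_id, title),
            )
        for term in self._terms(title):
            self.index[term].add(biblio_id)

    def search(self, term):
        """Answer queries from the index only; the tables are not scanned."""
        return sorted(self.index.get(term.lower(), set()))

    def rebuild_index(self):
        """Full re-index from the authoritative tables (the rare, on-demand case)."""
        self.index.clear()
        for biblio_id, title in self.db.execute("SELECT id, title FROM biblio"):
            for term in self._terms(title):
                self.index[term].add(biblio_id)
```

Because the index is derived entirely from the tables, losing it costs only a rebuild, and no scheduled job is needed to keep the two in step. For scale, the rebuild figure quoted above (7 million records in about 45 minutes) works out to roughly 2,600 records per second.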
Joshua,

Thank you for the prompt and concise reply. This greatly alleviates my concerns. Based on your information, I will continue to consider Koha. The next steps will likely include spinning up a test instance, migrating a copy of my current production data, and then beating the heck out of it with Compuware or Mercury LoadRunner to evaluate capacity and performance.

Thanks again!
-Jon
_______________________________________________ Koha mailing list Koha@lists.katipo.co.nz http://lists.katipo.co.nz/mailman/listinfo/koha
Hello everyone,

I have been testing the VMware Koha appliance, thank you Kyle!! Is there a fast way to get my ISBN data into Koha and then have it query a few Z39.50 servers to download the rest of the book data? We have about 2000 books. I can scan the ISBNs into a text file and then import these into the MySQL database. I also know how to manually query Z39.50 within Koha, but that is a very cumbersome procedure. There must be an easier way...

Any help would be greatly appreciated!
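One way to prepare such a scanned list for batch lookup is sketched below. It only covers the preparation step: it cleans the scanned lines, drops duplicates and ISBNs with a bad check digit, and emits one PQF query per ISBN using Bib-1 Use attribute 7 (ISBN). The function names are hypothetical, and sending the queries to a server (and loading the returned MARC records into Koha) is left to whatever Z39.50 client and import tool you use; nothing here is a Koha feature.

```python
import re


def normalize_isbn(raw):
    """Return a bare ISBN-10/13 string, or None if the format or check digit is wrong."""
    isbn = re.sub(r"[\s-]", "", raw).upper()
    if re.fullmatch(r"\d{9}[\dX]", isbn):
        # ISBN-10: weights 10..1, total must be divisible by 11 ('X' counts as 10).
        total = sum(
            (10 - i) * (10 if ch == "X" else int(ch)) for i, ch in enumerate(isbn)
        )
        return isbn if total % 11 == 0 else None
    if re.fullmatch(r"\d{13}", isbn):
        # ISBN-13: alternating weights 1 and 3, total must be divisible by 10.
        total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(isbn))
        return isbn if total % 10 == 0 else None
    return None


def pqf_queries(lines):
    """Yield one PQF ISBN query per distinct valid ISBN, in input order."""
    seen = set()
    for line in lines:
        isbn = normalize_isbn(line)
        if isbn and isbn not in seen:
            seen.add(isbn)
            yield f"@attr 1=7 {isbn}"


if __name__ == "__main__":
    # Hypothetical input: in practice, read the scanner's text file line by line.
    scanned = ["0-306-40615-2", "9780306406157", "0306406152", "not an isbn"]
    for query in pqf_queries(scanned):
        print(query)
```

Rejecting bad check digits up front keeps mis-scans from turning into empty search results later, which matters when nobody is watching each of 2000 lookups.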
participants (3)
- Jon Bek
- Joshua M. Ferraro
- Piet Slaghekke