[Koha] Problems with Zebra index

Oliver Goldschmidt o.goldschmidt at tuhh.de
Wed Jul 24 00:30:47 NZST 2013


Jared,

thank you very much for your reply!

In fact I forgot to run zebraidx as user koha-koha, so you were right:
I had bad permissions on the index files in shadow. I tried to fix that
by changing the ownership and restarting Zebra, but that had no effect.
So I guess you are also right that I messed up my index by doing that
(which is not too bad; I can still remove the index and try again as
koha-koha). I hope nothing else was broken by that mistake and the
database is still fine?!
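
One way to check whether an ownership change reached every file is to
list whatever is still not owned by koha-koha. This is only a minimal
sketch: the path /var/lib/koha/koha and the user koha-koha come from
this thread, and the function name list_foreign_files is made up for
illustration.

```shell
#!/bin/sh
# List everything under a directory that is NOT owned by the expected user.
# Usage: list_foreign_files DIR USER
list_foreign_files() {
    dir=$1
    owner=$2
    # "! -user" selects entries whose owner differs from $owner.
    find "$dir" ! -user "$owner" -print
}

# Example (run by hand; path and user are the ones named in this thread):
#   list_foreign_files /var/lib/koha/koha koha-koha
#   sudo chown -R koha-koha /var/lib/koha/koha
```

If the function prints nothing, every file already belongs to the
expected user and the cause is probably somewhere else.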

To increase the disk space, that's exactly what I did: I changed the
value in zebra-biblios.cfg. But can you explain what the directories
are used for? After indexing has finished, will I have data in both
directories, or could I configure my 100 GB disk so that both
directories can take up to 80 GB each? I will try that and see...
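
The change described here, editing the size at the end of the register
and shadow lines, can be sketched as a small helper. It assumes those
lines have the form "register: <path>:<size>" with the size at the very
end, as the quoted instructions below describe. The function name
set_zebra_space is hypothetical.

```shell
#!/bin/sh
# Rewrite the size suffix on the register: and shadow: lines of a Zebra config.
# Usage: set_zebra_space CONFIG_FILE SIZE   (for example SIZE=80G)
set_zebra_space() {
    cfg=$1
    size=$2
    tmp=$(mktemp) || return 1
    # Only lines starting with "register:" or "shadow:" and ending in
    # ":<digits><unit>" are changed; all other lines pass through untouched.
    if sed -E "s/^((register|shadow):.*:)[0-9]+[A-Za-z]*[[:space:]]*\$/\\1${size}/" "$cfg" > "$tmp"; then
        cat "$tmp" > "$cfg"   # overwrite in place, keeping owner and mode
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Example (run by hand; file names as given in this thread):
#   set_zebra_space /etc/koha/sites/koha/zebra-biblios.cfg 80G
#   set_zebra_space /etc/koha/sites/koha/zebra-biblios-dom.cfg 80G
```

It is worth keeping a backup copy of each file before running it.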

I still have a problem with rebuild_zebra.pl: it ignores the -s
parameter. If I understood correctly, rebuild_zebra.pl should use an
existing exported_records file when I pass both -s and -d. But it
doesn't. Every time I start rebuild_zebra.pl, the script exports my
database again (this takes quite a long time, and I wanted to skip it).
Is this a bug, or am I missing something?
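
For reference, the invocation I have in mind looks roughly like the
sketch below. It only prints the command line instead of running it.
The layout DIR/biblio/exported_records is an assumption about where a
kept export ends up, and the function name rebuild_cmd is made up; the
flags are -b (biblios), -r (reset), -x (MARCXML), -s (skip export),
-k (keep the export directory) and -d (export directory).

```shell
#!/bin/sh
# Print the rebuild_zebra.pl command line to use for a given export directory.
# If a previous export is present, -s is added so the export step is skipped.
# Usage: rebuild_cmd EXPORT_DIR
rebuild_cmd() {
    dir=$1
    # Assumption: a kept export lives at DIR/biblio/exported_records.
    if [ -s "$dir/biblio/exported_records" ]; then
        echo "rebuild_zebra.pl -b -r -x -s -k -d $dir"
    else
        echo "rebuild_zebra.pl -b -r -x -k -d $dir"
    fi
}

# Example (run by hand):
#   rebuild_cmd /tmp/rebuild
```

Whether -s really reuses the file in this Koha version is exactly the
open question above, so the output should be treated as a starting
point for testing, not as a confirmed fix.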

Best
- Oliver

On 23.07.2013 13:38, Jared Camins-Esakov wrote:
> Oliver,
>
> I have good news and bad news. The good news is, the fix to your
> problem is probably easy. The bad news is that running the zebraidx
> command manually more than likely messed up your installation.
>
> It sounds like your first problem can be solved simply by increasing
> the space that Zebra will use (it is not uncommon to need in excess
> of 100GB for indexes in a large installation). I'm not sure how you
> increased the space allotted, so I'm going to provide instructions for
> the correct way to do this that you can check your work against. If
> you open up the zebra-biblios.cfg and zebra-biblios-dom.cfg files that
> Koha installed (in /etc/koha/sites/koha/ ), you'll need to change two
> lines, the lines starting with register and shadow. At the end of the
> line it says 20G or 45G, depending whether you changed that. Change
> those numbers to, say, 80G.
>
> rebuild_zebra_sliced would not help you in this instance, because your
> problem is the amount of disk space required, not a bad record.
>
> Now for the bad news. If you ran zebraidx as any user other than
> koha-koha, your permissions are going to be all wrong. You can try
> changing the owner recursively on /var/lib/koha/koha to koha-koha.
> That might fix it (but I am not sure, since I haven't tried). The
> zebra_bib_index_mode is easy to fix, fortunately. Just change
> zebra_bib_index_mode to grs1, run rebuild_zebra.pl -r -b -x and you
> should be fine. You can worry about switching to DOM indexing once
> you have indexing with GRS-1 working:
> http://wiki.koha-community.org/wiki/Switching_to_dom_indexing
>
> Regards,
> Jared
>
>
>
> On Tue, Jul 23, 2013 at 4:29 AM, Oliver Goldschmidt
> <o.goldschmidt at tuhh.de> wrote:
>
>     Hi Koha community,
>
>     I am new to Koha and have spent the last week trying to feed the
>     Zebra index with our bibliographic records. This turned out to be
>     pretty difficult.
>     I have successfully imported our records (about 600,000) into the
>     Koha database. Then I tried to use rebuild_zebra.pl to put the
>     records into the index. This failed for lack of disk space: I have
>     100 GB of disk space reserved for the Zebra index (mounted on
>     /var/lib/koha) and have split this space in the Zebra config into
>     45 GB for the shadow directory and 45 GB for the register
>     directory. This was not sufficient, which I think is a little bit
>     weird, because 600,000 records should not take so much space...
>     So, my first question: is that normal? Does Zebra need so much
>     disk space for the index? What exactly are the register and
>     shadow directories for?
>
>     Next I tried indexing with rebuild_zebra_sliced.sh. I used the
>     default value of 10000 for the chunks. First I got an error, I
>     guess because a configuration value was not set properly (the
>     script did not find index_mode, so I set it manually to "dom",
>     which I guessed should be the correct value for indexing marcxml).
>     After fixing that manually, I succeeded in splitting my export
>     file into 59 chunks of 10000 records each. I tried to index the
>     first two chunks, and that seemed to work without problems for the
>     first chunk (but it finished very fast, which made me wonder
>     whether Koha really did anything. I just realized that the marcxml
>     file was not valid, but why didn't I get an error?). For the
>     second chunk, there were two messages (unfortunately I cannot
>     recall them). This is the command I used to do that:
>
>     zebraidx -c /etc/koha/sites/koha/zebra-biblios.cfg \
>         -v none,fatal,warn -g marcxml -d biblios \
>         update /tmp/rebuild/export/biblio/exported_records_1000001
>
>     But now, when I search in the Koha OPAC for an "e", for example,
>     I still get no results. So the index seems to be empty, but there
>     actually are files in /var/lib/koha/koha/biblio/shadow. Is there
>     a way to look into the Zebra index directly?
>     I have no idea where to look next.
>
>     Does anybody have any hint about that? Any help would be appreciated.
>
>     Best
>     -Oliver
>
>     --
>     Oliver Goldschmidt
>     TU Hamburg-Harburg / Universitätsbibliothek / Digitale Dienste
>     Denickestr. 22
>     21071 Hamburg - Harburg
>     Tel.    +49 (0)40 / 428 78 - 32 91
>     eMail   o.goldschmidt at tuhh.de
>     --
>     GPG/PGP-Schlüssel:
>     http://www.tub.tu-harburg.de/keys/Oliver_Marahrens_pub.asc
>
>
> -- 
> Jared Camins-Esakov
> Bibliographer, C & P Bibliography Services, LLC
> (phone) +1 (917) 727-3445
> (e-mail) jcamins at cpbibliography.com
> (web) http://www.cpbibliography.com/


-- 
Oliver Goldschmidt
TU Hamburg-Harburg / Universitätsbibliothek / Digitale Dienste
Denickestr. 22
21071 Hamburg - Harburg
Tel. 	+49 (0)40 / 428 78 - 32 91
eMail	o.goldschmidt at tuhh.de
--
GPG/PGP-Schlüssel: 
http://www.tub.tu-harburg.de/keys/Oliver_Marahrens_pub.asc


