[Koha] Questions about make_spellcheck_suggest.pl

Joe Atzberger ohiocore at gmail.com
Thu Sep 4 08:52:19 NZST 2008


Koha has had Lingua::Ispell as a dependency since 2.4, so if all we really
want is to turn misspellings into valid strings in language X, the
easiest implementation might be to use Ispell.

I like the idea of building target strings out of the MARC data, but that
might be better addressed by using a script to extend the Ispell dictionary.
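
A minimal sketch of that idea, assuming the MARC::Record and MARC::File::XML
modules that Koha 3.0 already relies on: it pulls words out of one MARCXML
record (the format stored in biblioitems.marcxml) and prints them one per
line, which could then be merged into an Ispell word list. The %WANTED
tag/subfield selection, the three-letter minimum, the function name, and the
sample record are all illustrative assumptions, not anything taken from
make_spellcheck_suggest.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use MARC::Record;
use MARC::File::XML ( BinaryEncoding => 'utf8', RecordFormat => 'MARC21' );

# Illustrative (assumed) choice of tag => subfield codes to harvest.
my %WANTED = (
    '100' => ['a'],
    '245' => [ 'a', 'b' ],
    '650' => ['a'],
);

# Return a hash ref of lowercased word => count for one MARCXML record.
sub words_from_marcxml {
    my ($xml) = @_;
    my $record = MARC::Record->new_from_xml( $xml, 'UTF-8', 'MARC21' );
    my %count;
    for my $tag ( sort keys %WANTED ) {
        for my $field ( $record->field($tag) ) {
            for my $code ( @{ $WANTED{$tag} } ) {
                for my $value ( grep { defined } $field->subfield($code) ) {
                    # Keep runs of three or more letters; skip digits and punctuation.
                    $count{ lc $_ }++ for $value =~ /(\p{L}{3,})/g;
                }
            }
        }
    }
    return \%count;
}

# Made-up sample record standing in for one biblioitems.marcxml value.
my $SAMPLE_XML = <<'XML';
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Melville, Herman</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Moby Dick, or,</subfield>
    <subfield code="b">The whale</subfield>
  </datafield>
</record>
XML

my $words = words_from_marcxml($SAMPLE_XML);
print "$_\n" for sort keys %$words;
```

In a real run the XML would come from the database (something along the
lines of SELECT marcxml FROM biblioitems) rather than a hard-coded string,
and the counts could be used to drop one-off typos before the words are
added to the dictionary.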

--joe

On Wed, Sep 3, 2008 at 2:47 PM, Brendan Gallagher <gallabr at biblio.org> wrote:

> Thanks for the lead Jesse --
> I'll continue to mess around with these and see what I can develop.
>
> -Brendan
>
> On Sep 3, 2008, at 2:44 PM, Jesse Weaver wrote:
>
> On Wed, Sep 3, 2008 at 12:25 PM, Brendan Gallagher <gallabr at biblio.org> wrote:
>
>> Hi All -
>>
>> I've been snooping around in the make_spellcheck_suggest.pl perl
>> script and I've developed a few questions.
>>
>> Looks like the script was originally written for Koha 2.4 CVS and
>> hasn't quite been updated yet.
>> I've been working my way through updating the script (I'm on Perl 5.10
>> and Koha 3.1).
>>
>> My question is this (and if someone can recommend a better path to
>> chase or develop towards, please say so):
>>
>> Could someone point me to the new MySQL tables that replace marc_word
>> (or an equivalent)? Also, correct me if I am wrong, but I am equating
>> "marc_subfield_table" with the newer MySQL table
>> marc_subfield_structure (plus I need to change subfieldvalue to either
>> tagfield or tagsubfield). Below is the part of the script that I am
>> referencing:
>
>
> marc_subfield_structure is present in both versions, and is different from
> _table. It holds framework information (description of MARC subfields, and
> what they mean, like the title or author), rather than the values of those
> fields for any given record.
>
>
>>
>> "
>> my $query_words = "SELECT DISTINCT word, COUNT(word) FROM marc_word";
>> my $query_marc_subfields = "SELECT DISTINCT subfieldvalue,
>> COUNT(subfieldvalue) FROM marc_subfield_table";
>> my $query_titles = "SELECT DISTINCT title, COUNT(title) FROM biblio
>> GROUP BY title";
>> my $query_authors = "SELECT DISTINCT author, COUNT(author) FROM biblio
>> GROUP BY author";
>> "
>>
>> I do have fuzzy searching enabled for my OPAC, but I am looking for a
>> little bit more. Some of my searches that use these "spellcheck"
>> suggestions are working (you can see that I have gotten the script to
>> populate some data in the MySQL database).
>>
>> Here is a copy of the output I get when executing this script. I
>> want to get rid of the "execute failed" parts.
>>
>> Step 1 of 5: Checking to make sure suggest tables exist
>> Use of uninitialized value $_ in pattern match (m//) at
>> make_spellcheck_suggest.pl line 99.
>> All tables present ... moving along
>> Step 2 of 5: Deleting old data
>> Step 3 of 5: Creating non-distinct table from various Koha tables
>> Finished building marc_word list
>> Adding marc_word entries with the following tagsubfields:020a, 100a,
>> 110a, 130a, 240a, 245a, 245b, 245c, 245p, 246a, 246b, 440a, 440p,
>> 505t, 511a, 534a, 600a, 610a, 611a, 630a, 650a, 651a, 700a, 710a,
>> 730a, 740a, 800a, 830a,
>> DBD::mysql::st execute failed: Unknown column 'tagsubfield' in 'where
>> clause' at make_spellcheck_suggest.pl line 173.
>> DBD::mysql::st fetchrow_array failed: fetch() without execute() at
>> make_spellcheck_suggest.pl line 179.
>> 0 more records added...
>> Finished building marc_subfield_table list
>> Adding marc_subfield_table entries with the following tags and
>> subfields:020, a, 100, a, 110, a, 130, a, 240, a, 245, a, 245, b, 245,
>> c, 245, p, 246, a, 246, b, 440, a, 440, p, 505, t, 511, a, 534, a,
>> 600, a, 610, a, 611, a, 630, a, 650, a, 651, a, 700, a, 710, a, 730,
>> a, 740, a, 800, a, 830, a,
>> DBD::mysql::st execute failed: Table 'koha.marc_subfield_table'
>> doesn't exist at make_spellcheck_suggest.pl line 173.
>> DBD::mysql::st fetchrow_array failed: fetch() without execute() at
>> make_spellcheck_suggest.pl line 179.
>> 0 more records added...
>> 57708 more records added...
>> 83598 more records added...
>> Step 4 of 5: Deleting old distinct entries
>> Step 5 of 5: Creating distinct spellcheck table out of non-distinct
>> table
>> Finished: total distinct items added to spellcheck: 81520
>>
>
> Koha 3.0's new database schema does make this a bit harder; the script
> would have to parse the record from the biblioitems.marcxml or
> biblioitems.marc column, then find the words within that to add to the
> relevant tables.
>
>
>>
>>
>> Thanks,
>>
>> +++++++++++++++++++++++++++++++++++++++++++
>> Brendan A. Gallagher
>> Software Services Coordinator
>> Bibliomation, INC.
>> Middlebury, CT 06516
>> http://www.biblio.org
>> (203)577-4070 x119
>> +++++++++++++++++++++++++++++++++++++++++++
>>
>>
> --
> Jesse Weaver
> Software Developer, LibLime
>