[Koha] Questions about make_spellcheck_suggest.pl
Joshua Ferraro
jmf at liblime.com
Thu Sep 4 22:35:42 NZST 2008
On Wed, Sep 3, 2008 at 4:52 PM, Joe Atzberger <ohiocore at gmail.com> wrote:
> Since 2.4, Koha added Lingua::Ispell as a dependency, so if we're really
> just looking to make misspellings into valid strings in language X, the
> easiest implementation might be to use Ispell.
>
> I like the idea of building target strings out of the MARC data, but that
> might be better addressed by using a script to extend the Ispell dictionary.
The original mechanism for this, which Brendan is looking at, took phrases
into consideration as well, not just single words. Would ispell be able to
handle phrases?
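
To make the distinction concrete, here is a rough sketch, in Python purely
for illustration. It is not the code from make_spellcheck_suggest.pl and it
does not use Ispell. It assumes the suggest data can hold whole phrases
(titles, say) alongside single words, and it uses the standard library's
difflib as a stand-in for whatever matching is really done. The names
suggest, PHRASES and WORDS are hypothetical.

```python
import difflib


def suggest(query, candidates, n=3, cutoff=0.6):
    """Return up to n stored strings that look closest to the query.

    Candidates may be single words or whole phrases; each is compared
    against the query as one string, case-insensitively.
    """
    lowered = {c.lower(): c for c in candidates}
    hits = difflib.get_close_matches(query.lower(), list(lowered),
                                     n=n, cutoff=cutoff)
    return [lowered[h] for h in hits]


# Hypothetical sample data: phrases (e.g. titles) and single words.
PHRASES = ["Moby Dick", "Great Expectations", "War and Peace"]
WORDS = ["moby", "dick", "great", "expectations", "war", "peace"]

if __name__ == "__main__":
    print(suggest("moby dik", PHRASES))    # phrase-level lookup
    print(suggest("expectatons", WORDS))   # word-level lookup
```

A word-only dictionary can fix each word in isolation, but it knows nothing
about which words belong together; a phrase list is what lets "moby dik"
come back as one suggestion.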
Josh
> --joe
>
> On Wed, Sep 3, 2008 at 2:47 PM, Brendan Gallagher <gallabr at biblio.org>
> wrote:
>>
>> Thanks for the lead Jesse --
>> I'll continue to mess around with these and see what I can develop.
>> -Brendan
>> On Sep 3, 2008, at 2:44 PM, Jesse Weaver wrote:
>>
>> On Wed, Sep 3, 2008 at 12:25 PM, Brendan Gallagher <gallabr at biblio.org>
>> wrote:
>>>
>>> Hi All -
>>>
>>> I've been snooping around in the make_spellcheck_suggest.pl perl
>>> script and I've developed a few questions.
>>>
>>> Looks like the script was originally written for Koha 2.4 CVS and
>>> hasn't quite been updated yet.
>>> I've been working my way through updating the script (I'm on perl 5.10
>>> and koha 3.1).
>>>
>>> My question is below (I am also open to recommendations for a
>>> different approach that I should be working towards).
>>>
>>> Could someone point me to the new MySQL table that replaces marc_word
>>> (or an equivalent)? Correct me if I am wrong, but I am equating
>>> "marc_subfield_table" with the newer MySQL table
>>> marc_subfield_structure (plus I need to change subfieldvalue to either
>>> tagfield or tagsubfield). Below is the part of the script that I am
>>> referencing:
>>
>> marc_subfield_structure is present in both versions, and is different from
>> _table. It holds framework information (description of MARC subfields, and
>> what they mean, like the title or author), rather than the values of those
>> fields for any given record.
>>
>>>
>>> "
>>> my $query_words = "SELECT DISTINCT word, COUNT(word) FROM marc_word";
>>> my $query_marc_subfields = "SELECT DISTINCT subfieldvalue,
>>> COUNT(subfieldvalue) FROM marc_subfield_table";
>>> my $query_titles = "SELECT DISTINCT title, COUNT(title) FROM biblio
>>> GROUP BY title";
>>> my $query_authors = "SELECT DISTINCT author, COUNT(author) FROM biblio
>>> GROUP BY author";
>>> "
>>>
>>> I do have fuzzy searching enabled for my OPAC, but I am looking into
>>> this a little further. Some of my searches that use these
>>> "spellcheck" suggestions are working, and you can see that I have got
>>> the script to populate some information in the MySQL database.
>>>
>>> Here is a copy of the output I get when executing this script. I
>>> want to get rid of the "execute failed" parts.
>>>
>>> Step 1 of 5: Checking to make sure suggest tables exist
>>> Use of uninitialized value $_ in pattern match (m//) at
>>> make_spellcheck_suggest.pl line 99.
>>> All tables present ... moving along
>>> Step 2 of 5: Deleting old data
>>> Step 3 of 5: Creating non-distinct table from various Koha tables
>>> Finished building marc_word list
>>> Adding marc_word entries with the following tagsubfields:020a, 100a,
>>> 110a, 130a, 240a, 245a, 245b, 245c, 245p, 246a, 246b, 440a, 440p,
>>> 505t, 511a, 534a, 600a, 610a, 611a, 630a, 650a, 651a, 700a, 710a,
>>> 730a, 740a, 800a, 830a,
>>> DBD::mysql::st execute failed: Unknown column 'tagsubfield' in 'where
>>> clause' at make_spellcheck_suggest.pl line 173.
>>> DBD::mysql::st fetchrow_array failed: fetch() without execute() at
>>> make_spellcheck_suggest.pl line 179.
>>> 0 more records added...
>>> Finished building marc_subfield_table list
>>> Adding marc_subfield_table entries with the following tags and
>>> subfields:020, a, 100, a, 110, a, 130, a, 240, a, 245, a, 245, b, 245,
>>> c, 245, p, 246, a, 246, b, 440, a, 440, p, 505, t, 511, a, 534, a,
>>> 600, a, 610, a, 611, a, 630, a, 650, a, 651, a, 700, a, 710, a, 730,
>>> a, 740, a, 800, a, 830, a,
>>> DBD::mysql::st execute failed: Table 'koha.marc_subfield_table'
>>> doesn't exist at make_spellcheck_suggest.pl line 173.
>>> DBD::mysql::st fetchrow_array failed: fetch() without execute() at
>>> make_spellcheck_suggest.pl line 179.
>>> 0 more records added...
>>> 57708 more records added...
>>> 83598 more records added...
>>> Step 4 of 5: Deleting old distinct entries
>>> Step 5 of 5: Creating distinct spellcheck table out of non-distinct
>>> table
>>> Finished: total distinct items added to spellcheck: 81520
>>
>> Koha 3.0's new database schema does make this a bit harder; the script
>> would have to parse the record from the biblioitems.marcxml or
>> biblioitems.marc column, then find the words within that to add to the
>> relevant tables.
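
A minimal sketch of what Jesse describes, written in Python for brevity
rather than in the script's Perl (where MARC::Record would presumably do
the parsing). It assumes each biblioitems.marcxml value holds one MARCXML
record, and it only looks at a few of the tag/subfield pairs from the
output above. The names subfield_values, count_words, WANTED and SAMPLE are
made up for illustration, and the sample record is invented.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# A few of the tag/subfield pairs listed in the script output; extend as needed.
WANTED = {("100", "a"), ("245", "a"), ("245", "b"), ("650", "a")}


def local_name(element):
    """Strip any XML namespace so namespaced and plain records both work."""
    return element.tag.rsplit("}", 1)[-1]


def subfield_values(marcxml, wanted=WANTED):
    """Yield the text of each wanted subfield in one MARCXML record."""
    root = ET.fromstring(marcxml)
    for field in root.iter():
        if local_name(field) != "datafield":
            continue
        tag = field.get("tag")
        for sub in field:
            if local_name(sub) != "subfield":
                continue
            if (tag, sub.get("code")) in wanted and sub.text:
                yield sub.text.strip()


def count_words(values):
    """Count lowercased words across the given subfield values."""
    words = Counter()
    for value in values:
        words.update(re.findall(r"\w+", value.lower()))
    return words


# Invented sample record, standing in for one biblioitems.marcxml value.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Melville, Herman</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Moby Dick</subfield>
    <subfield code="c">Herman Melville</subfield>
  </datafield>
</record>"""

if __name__ == "__main__":
    values = list(subfield_values(SAMPLE))
    print(values)
    print(count_words(values))
```

The whole subfield values could feed the phrase side of the suggest data
and the word counts the single-word side, replacing the old queries against
marc_subfield_table and marc_word.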
>>
>>>
>>> Thanks,
>>>
>>> +++++++++++++++++++++++++++++++++++++++++++
>>> Brendan A. Gallagher
>>> Software Services Coordinator
>>> Bibliomation, INC.
>>> Middlebury, CT 06516
>>> http://www.biblio.org
>>> (203)577-4070 x119
>>> +++++++++++++++++++++++++++++++++++++++++++
>>>
>>
>> --
>> Jesse Weaver
>> Software Developer, LibLime
>>
>>
>> _______________________________________________
>> Koha mailing list
>> Koha at lists.katipo.co.nz
>> http://lists.katipo.co.nz/mailman/listinfo/koha
>>
--
Joshua Ferraro SUPPORT FOR OPEN-SOURCE SOFTWARE
CEO migration, training, maintenance, support
LibLime Featuring Koha Open-Source ILS
jmf at liblime.com |Full Demos at http://liblime.com/koha |1(888)KohaILS