[Koha] Questions about make_spellcheck_suggest.pl
Brendan Gallagher
gallabr at biblio.org
Fri Sep 5 01:15:34 NZST 2008
Yup. After some quick research into Ispell and Aspell, it doesn't
look like either will handle phrases. Phrases would have to be parsed,
then fed into the spell checker as single words (at least for Ispell
and Aspell).
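
A rough sketch of that word-by-word approach in plain Perl, offered as an
illustration only. The names phrase_to_words and suggest_for_phrase are
made up for this example, and the toy lookup hash stands in for whatever
Ispell or Aspell call would really supply the suggestion; nothing here is
taken from make_spellcheck_suggest.pl.

```perl
use strict;
use warnings;

# Split a phrase into lowercase word tokens, dropping punctuation.
# Apostrophes are kept so that words like "don't" stay in one piece.
sub phrase_to_words {
    my ($phrase) = @_;
    return grep { length } split /[^\p{L}\p{N}']+/, lc $phrase;
}

# Check a phrase one word at a time. $suggest is a hypothetical callback
# that returns a corrected spelling, or undef if the word is fine as is.
sub suggest_for_phrase {
    my ($phrase, $suggest) = @_;
    my @out;
    for my $word (phrase_to_words($phrase)) {
        my $fix = $suggest->($word);
        push @out, defined $fix ? $fix : $word;
    }
    return join ' ', @out;
}

# Toy stand-in for a real spell checker lookup (assumption, not Ispell).
my %toy = (quesitons => 'questions', excute => 'execute');

# prints "questions about execute"
print suggest_for_phrase('Quesitons about excute', sub { $toy{ $_[0] } }), "\n";
```

Passing the checker in as a callback keeps the phrase handling separate
from the choice of Ispell or Aspell, so either could be swapped in later.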
-Brendan
On Sep 4, 2008, at 6:35 AM, Joshua Ferraro wrote:
> On Wed, Sep 3, 2008 at 4:52 PM, Joe Atzberger <ohiocore at gmail.com>
> wrote:
>> Since 2.4, Koha added Lingua::Ispell as a dependency, so if we're
>> really just looking to make misspellings into valid strings in
>> language X, the easiest implementation might be to use Ispell.
>>
>> I like the idea of building target strings out of the MARC data, but
>> that might be better addressed by using a script to extend the Ispell
>> dictionary.
> The original mechanism for this, which Brendan is looking at, took
> phrases into consideration as well, not just single words. Would
> Ispell be able to handle phrases?
>
> Josh
>
>> --joe
>>
>> On Wed, Sep 3, 2008 at 2:47 PM, Brendan Gallagher <gallabr at biblio.org> wrote:
>>>
>>> Thanks for the lead Jesse --
>>> I'll continue to mess around with these and see what I can develop.
>>> -Brendan
>>> On Sep 3, 2008, at 2:44 PM, Jesse Weaver wrote:
>>>
>>> On Wed, Sep 3, 2008 at 12:25 PM, Brendan Gallagher <gallabr at biblio.org> wrote:
>>>>
>>>> Hi All -
>>>>
>>>> I've been snooping around in the make_spellcheck_suggest.pl perl
>>>> script and I've developed a few questions.
>>>>
>>>> It looks like the script was originally written for Koha 2.4 CVS
>>>> and hasn't been updated since.
>>>> I've been working my way through updating the script (I'm on Perl
>>>> 5.10 and Koha 3.1).
>>>>
>>>> My question (and if someone can recommend a better path to
>>>> pursue, please say so):
>>>>
>>>> Could someone point me to the new MySQL table that replaces
>>>> marc_word (or an equivalent)? Also, correct me if I am wrong, but
>>>> I am equating "marc_subfield_table" with the newer MySQL table
>>>> marc_subfield_structure (plus I need to change subfieldvalue to
>>>> either tagfield or tagsubfield). Below is the part of the script
>>>> that I am referencing:
>>>
>>> marc_subfield_structure is present in both versions, and is
>>> different from marc_subfield_table. It holds framework information
>>> (descriptions of MARC subfields and what they mean, like the title
>>> or author), rather than the values of those fields for any given
>>> record.
>>>
>>>>
>>>> "
>>>> my $query_words = "SELECT DISTINCT word, COUNT(word) FROM marc_word";
>>>> my $query_marc_subfields = "SELECT DISTINCT subfieldvalue, COUNT(subfieldvalue) FROM marc_subfield_table";
>>>> my $query_titles = "SELECT DISTINCT title, COUNT(title) FROM biblio GROUP BY title";
>>>> my $query_authors = "SELECT DISTINCT author, COUNT(author) FROM biblio GROUP BY author";
>>>> "
>>>>
>>>> I do have fuzzy searching enabled for my OPAC, but I am looking
>>>> for a little bit more. Some of my searches that use these
>>>> "spellcheck" suggestions are working (you can see that I have got
>>>> the script to populate some information in the MySQL database).
>>>>
>>>> Here is a copy of the output I get when executing this script. I
>>>> want to get rid of the "execute failed" parts.
>>>>
>>>> Step 1 of 5: Checking to make sure suggest tables exist
>>>> Use of uninitialized value $_ in pattern match (m//) at
>>>> make_spellcheck_suggest.pl line 99.
>>>> All tables present ... moving along
>>>> Step 2 of 5: Deleting old data
>>>> Step 3 of 5: Creating non-distinct table from various Koha tables
>>>> Finished building marc_word list
>>>> Adding marc_word entries with the following tagsubfields:020a,
>>>> 100a,
>>>> 110a, 130a, 240a, 245a, 245b, 245c, 245p, 246a, 246b, 440a, 440p,
>>>> 505t, 511a, 534a, 600a, 610a, 611a, 630a, 650a, 651a, 700a, 710a,
>>>> 730a, 740a, 800a, 830a,
>>>> DBD::mysql::st execute failed: Unknown column 'tagsubfield' in
>>>> 'where
>>>> clause' at make_spellcheck_suggest.pl line 173.
>>>> DBD::mysql::st fetchrow_array failed: fetch() without execute() at
>>>> make_spellcheck_suggest.pl line 179.
>>>> 0 more records added...
>>>> Finished building marc_subfield_table list
>>>> Adding marc_subfield_table entries with the following tags and
>>>> subfields:020, a, 100, a, 110, a, 130, a, 240, a, 245, a, 245, b,
>>>> 245,
>>>> c, 245, p, 246, a, 246, b, 440, a, 440, p, 505, t, 511, a, 534, a,
>>>> 600, a, 610, a, 611, a, 630, a, 650, a, 651, a, 700, a, 710, a,
>>>> 730,
>>>> a, 740, a, 800, a, 830, a,
>>>> DBD::mysql::st execute failed: Table 'koha.marc_subfield_table'
>>>> doesn't exist at make_spellcheck_suggest.pl line 173.
>>>> DBD::mysql::st fetchrow_array failed: fetch() without execute() at
>>>> make_spellcheck_suggest.pl line 179.
>>>> 0 more records added...
>>>> 57708 more records added...
>>>> 83598 more records added...
>>>> Step 4 of 5: Deleting old distinct entries
>>>> Step 5 of 5: Creating distinct spellcheck table out of non-distinct
>>>> table
>>>> Finished: total distinct items added to spellcheck: 81520
>>>
>>> Koha 3.0's new database schema does make this a bit harder; the
>>> script would have to parse the record from the biblioitems.marcxml
>>> or biblioitems.marc column, then find the words within that to add
>>> to the relevant tables.
>>>
>>>>
>>>> Thanks,
>>>>
>>>> +++++++++++++++++++++++++++++++++++++++++++
>>>> Brendan A. Gallagher
>>>> Software Services Coordinator
>>>> Bibliomation, INC.
>>>> Middlebury, CT 06516
>>>> http://www.biblio.org
>>>> (203)577-4070 x119
>>>> +++++++++++++++++++++++++++++++++++++++++++
>>>>
>>>
>>> --
>>> Jesse Weaver
>>> Software Developer, LibLime
>>>
>>>
>>> _______________________________________________
>>> Koha mailing list
>>> Koha at lists.katipo.co.nz
>>> http://lists.katipo.co.nz/mailman/listinfo/koha
>>>
>>
>
>
>
> --
> Joshua Ferraro SUPPORT FOR OPEN-SOURCE SOFTWARE
> CEO migration, training, maintenance, support
> LibLime Featuring Koha Open-Source ILS
> jmf at liblime.com |Full Demos at http://liblime.com/koha |1(888)KohaILS