Re: [Koha] Questions about make_spellcheck_suggest.pl

4 Sep 2008


      ...
On Wed, Sep 3, 2008 at 4:52 PM, Joe Atzberger <ohiocore@gmail.com> wrote:
Since 2.4, Koha added Lingua::Ispell as a dependency, so if we're really
just looking to make misspellings into valid strings in language X, the
easiest implementation might be to use Ispell.
I like the idea of building target strings out of the MARC data, but that
might be better addressed by using a script to extend the Ispell dictionary.

The original mechanism for this, which Brendan is looking at, took phrases
into consideration as well, not just single words. Would ispell be
able to handle phrases?
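
To make the word-versus-phrase distinction concrete, here is a minimal sketch in Python (not the Perl of the actual script, and not Ispell). It uses the standard library's difflib as a stand-in matcher, and the sample titles are hypothetical values standing in for strings pulled from MARC data. It only illustrates that a phrase list and a word list are two separate sets of suggestion targets.

```python
import difflib
from collections import Counter

# Hypothetical sample values standing in for title strings from MARC data.
phrases = [
    "The Old Man and the Sea",
    "Of Mice and Men",
    "The Grapes of Wrath",
]

# Two candidate lists: whole phrases, and the single words they contain.
phrase_candidates = [p.lower() for p in phrases]
word_counts = Counter(w for p in phrase_candidates for w in p.split())


def suggest(query, candidates, n=3, cutoff=0.6):
    """Return up to n candidates that closely match the query string."""
    return difflib.get_close_matches(query.lower(), list(candidates), n=n, cutoff=cutoff)


# A misspelled single word is matched against the word list ...
word_hits = suggest("grapse", word_counts)
# ... while a misspelled phrase is matched against the phrase list as a whole.
phrase_hits = suggest("grapse of wrath", phrase_candidates)

print(word_hits)
print(phrase_hits)
```

A word-only checker would correct each token on its own, whereas the phrase list lets a whole title come back as a single suggestion.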

Josh
...
--joe
On Wed, Sep 3, 2008 at 2:47 PM, Brendan Gallagher <gallabr@biblio.org>
wrote:
...
Thanks for the lead Jesse --
I'll continue to mess around with these and see what I can develop.
-Brendan
On Sep 3, 2008, at 2:44 PM, Jesse Weaver wrote:
On Wed, Sep 3, 2008 at 12:25 PM, Brendan Gallagher <gallabr@biblio.org>
wrote:
...
Hi All -
I've been looking through the make_spellcheck_suggest.pl Perl
script and have a few questions.
It looks like the script was originally written for Koha 2.4 (CVS) and
hasn't been updated for newer versions yet.
I've been working my way through updating the script (I'm on Perl 5.10
and Koha 3.1).
My question is this (and if someone can recommend a better path to
pursue, I'd welcome that too):
Could someone point me to the new MySQL database tables that replace
marc_word (or an equivalent)? Also, correct me if I am wrong, but I am
equating "marc_subfield_table" with the newer MySQL table
marc_subfield_structure (plus I need to change subfieldvalue to either
tagfield or tagsubfield). Below is the part of the script
that I am referring to.
marc_subfield_structure is present in both versions, and is different from
marc_subfield_table. It holds framework information (descriptions of MARC
subfields and what they mean, like the title or author), rather than the
values of those fields for any given record.
...
"
my $query_words = "SELECT DISTINCT word, COUNT(word) FROM marc_word";
my $query_marc_subfields = "SELECT DISTINCT subfieldvalue,
COUNT(subfieldvalue) FROM marc_subfield_table";
my $query_titles = "SELECT DISTINCT title, COUNT(title) FROM biblio
GROUP BY title";
my $query_authors = "SELECT DISTINCT author, COUNT(author) FROM biblio
GROUP BY author";
"
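
One thing worth noting about the queries as quoted: the first two have no GROUP BY, while the last two do. Depending on the server's SQL mode, MySQL will either reject an aggregate query like that or collapse it to a single row. The sketch below shows the difference using Python's built-in sqlite3 module rather than MySQL, with a hypothetical in-memory marc_word table and made-up rows, so it is only an illustration of the SQL behaviour and not the script's code.

```python
import sqlite3

# Hypothetical stand-in for the old marc_word table, with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE marc_word (word TEXT)")
conn.executemany(
    "INSERT INTO marc_word (word) VALUES (?)",
    [("koha",), ("koha",), ("marc",), ("zebra",), ("marc",), ("koha",)],
)

# With GROUP BY, COUNT() is computed separately for each distinct word.
per_word = conn.execute(
    "SELECT word, COUNT(word) FROM marc_word GROUP BY word ORDER BY word"
).fetchall()

# Without GROUP BY, COUNT() spans the whole table and yields one total.
total = conn.execute("SELECT COUNT(word) FROM marc_word").fetchone()[0]

print(per_word)
print(total)
```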
I do have fuzzy searching enabled for my OPAC, but I am
looking for a little bit more. Some of my searches that use these
"spellcheck" suggestions are working; as the output below shows, I have
got the script to populate some information in the MySQL database.
Here is a copy of the output I get when executing this script. I
want to get rid of the "execute failed" parts.
Step 1 of 5: Checking to make sure suggest tables exist
Use of uninitialized value $_ in pattern match (m//) at
make_spellcheck_suggest.pl line 99.
All tables present ... moving along
Step 2 of 5: Deleting old data
Step 3 of 5: Creating non-distinct table from various Koha tables
Finished building marc_word list
Adding marc_word entries with the following tagsubfields:020a, 100a,
110a, 130a, 240a, 245a, 245b, 245c, 245p, 246a, 246b, 440a, 440p,
505t, 511a, 534a, 600a, 610a, 611a, 630a, 650a, 651a, 700a, 710a,
730a, 740a, 800a, 830a,
DBD::mysql::st execute failed: Unknown column 'tagsubfield' in 'where
clause' at make_spellcheck_suggest.pl line 173.
DBD::mysql::st fetchrow_array failed: fetch() without execute() at
make_spellcheck_suggest.pl line 179.
0 more records added...
Finished building marc_subfield_table list
Adding marc_subfield_table entries with the following tags and
subfields:020, a, 100, a, 110, a, 130, a, 240, a, 245, a, 245, b, 245,
c, 245, p, 246, a, 246, b, 440, a, 440, p, 505, t, 511, a, 534, a,
600, a, 610, a, 611, a, 630, a, 650, a, 651, a, 700, a, 710, a, 730,
a, 740, a, 800, a, 830, a,
DBD::mysql::st execute failed: Table 'koha.marc_subfield_table'
doesn't exist at make_spellcheck_suggest.pl line 173.
DBD::mysql::st fetchrow_array failed: fetch() without execute() at
make_spellcheck_suggest.pl line 179.
0 more records added...
57708 more records added...
83598 more records added...
Step 4 of 5: Deleting old distinct entries
Step 5 of 5: Creating distinct spellcheck table out of non-distinct
table
Finished: total distinct items added to spellcheck: 81520
Koha 3.0's new database schema does make this a bit harder; the script
would have to parse the record from the biblioitems.marcxml or
biblioitems.marc column, then find the words within that to add to the
relevant tables.
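
As a rough illustration of that approach, here is a minimal sketch in Python (the real script is Perl, and would more likely use Koha's own MARC libraries). It assumes the marcxml column holds a record in the standard MARC21 slim namespace; the sample record, the WANTED set, and the function names are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Assumption: records use the standard MARC21 slim XML namespace.
MARC_NS = "{http://www.loc.gov/MARC21/slim}"

# Hypothetical sample record; in Koha 3.0 this text would come from
# the biblioitems.marcxml column.
SAMPLE_MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Steinbeck, John</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="4">
    <subfield code="a">The grapes of wrath</subfield>
    <subfield code="c">John Steinbeck</subfield>
  </datafield>
</record>"""

# A few of the tag+subfield pairs listed in the script's output.
WANTED = {"100a", "245a", "245b"}


def subfield_values(marcxml, wanted):
    """Yield the text of every subfield whose tag+code is in wanted."""
    root = ET.fromstring(marcxml)
    for field in root.iter(MARC_NS + "datafield"):
        for sub in field.iter(MARC_NS + "subfield"):
            if field.get("tag") + sub.get("code") in wanted and sub.text:
                yield sub.text


def count_words(values):
    """Split each value into lowercase words and count occurrences."""
    counts = Counter()
    for value in values:
        counts.update(re.findall(r"\w+", value.lower()))
    return counts


values = list(subfield_values(SAMPLE_MARCXML, WANTED))
words = count_words(values)
print(values)
print(words)
```

The whole values correspond to what the script took from marc_subfield_table, and the word counts to what it took from marc_word, so both could feed the existing suggest tables.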
...
Thanks,
+++++++++++++++++++++++++++++++++++++++++++
Brendan A. Gallagher
Software Services Coordinator
Bibliomation, INC.
Middlebury, CT 06516
http://www.biblio.org
(203)577-4070 x119
+++++++++++++++++++++++++++++++++++++++++++
--
Jesse Weaver
Software Developer, LibLime
_______________________________________________
Koha mailing list
Koha@lists.katipo.co.nz
http://lists.katipo.co.nz/mailman/listinfo/koha
-- 
Joshua Ferraro                       SUPPORT FOR OPEN-SOURCE SOFTWARE
CEO                         migration, training, maintenance, support
LibLime                           Featuring Koha Open-Source ILS
jmf@liblime.com |Full Demos at http://liblime.com/koha |1(888)KohaILS