[Koha] Proposal to form Koha Technical Committee

Thomas Dukleth kohalist at agogme.com
Thu Dec 2 09:24:38 NZDT 2010


[Correction.]

Discussions of programming development issues are more appropriate for the
koha-devel mailing list.  I encourage people to subscribe to that list.

The programming development issue spilled over onto this list as part of
a discussion of development management, which is more than merely a
programming development issue.

Correction to my earlier reply inline:

On Mon, November 22, 2010 06:56, Thomas Dukleth wrote:

[...]

> On Sat, November 20, 2010 04:49, LAURENT Henri-Damien wrote:
>> Le 19/11/2010 22:41, Thomas Dukleth a écrit :

[...]

> 4.  SIMPLESERVER.
>
>> Same for the Z3950 server which we wrote a wrapper using
>> Net::Z3950::SimpleServer. This would allow ppl to expose their
>> collection as a Z3950 server.
>
> Net::Z3950::SimpleServer has been on my list as one option which would
> need improvement to have features sufficiently comparable to Zebra as a
> Z39.50 Server.  'Simple' may be taken to mean that supporting complexity
> is an exercise left to the programmer using the tool.  Implementations of
> SimpleServer generally only support use attributes because of the
> complexity of managing more.  Below, Henri-Damien attests to the
> difficulties of parsing PQF reliably.  SimpleServer does not support SRU.
> The documentation for SimpleServer is less complete than Zebra
> documentation.  We also have much greater working knowledge of Zebra.
>
> I will enquire with IndexData about what options there might be for adding
> SRU support to SimpleServer.

I try to be especially careful about accuracy.  My claim that SimpleServer
does not support SRU was false, as I found by doing nothing more than
examining the documentation more completely.  Knowing that SimpleServer
had lacked SRU support at an important point in the past, I was less
careful than I almost always am before posting my message.  I did check
before posting, but not well enough, because my attention was on what
SimpleServer provides for managing PQF.

Even with SRU support the basic problem of SimpleServer being simple
remains.  SimpleServer requires the implementing programmer to do much
work in managing the complexity of PQF.  The extent to which SimpleServer
provides management of PQF merely changes one data model into another data
model without necessarily reducing the complexity.  Henri-Damien Laurent
had thought it prudent to avoid such complexity when he considered
undertaking the simpler task of merely writing PQF queries where Koha acts
as a Z39.50 client.  Writing PQF for a Z39.50 client is much easier than
parsing PQF for a Z39.50 server.  Something as simple as distinguishing a
list of words to be matched from a phrase to be matched for SimpleServer
requires the programmer to manage the complexity of PQF.
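To make the asymmetry concrete, here is a minimal sketch in Python (not Koha or SimpleServer code).  It assumes the Bib-1 values use 4 = title, structure 1 = phrase and structure 6 = word list.  The function names are hypothetical, and the backslash escaping of quotes is an assumption about how the receiving PQF parser treats quoted terms.  The parser handles only a single term with '@attr' prefixes; a real server would also need @and, @or, @not, @attrset, nested operands and every other attribute type.

```python
import shlex

# Bib-1 attribute types used here: 1 = use, 4 = structure.
USE_TITLE = 4          # use attribute: title
STRUCT_PHRASE = 1      # structure attribute: phrase
STRUCT_WORD_LIST = 6   # structure attribute: word list


def build_pqf(use, structure, term):
    """Client side: writing PQF is little more than string formatting."""
    # Assumption: the target accepts backslash-escaped quotes in a term.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'@attr 1={use} @attr 4={structure} "{escaped}"'


def parse_simple_pqf(query):
    """Server side: even one term needs tokenising and attribute bookkeeping.

    Only '@attr type=value' prefixes followed by exactly one term are
    understood; anything else raises ValueError.
    """
    tokens = shlex.split(query)
    attrs = {}
    i = 0
    while i < len(tokens) and tokens[i] == "@attr":
        if i + 1 >= len(tokens):
            raise ValueError("@attr without a type=value pair")
        attr_type, sep, value = tokens[i + 1].partition("=")
        if not sep:
            raise ValueError(f"malformed attribute: {tokens[i + 1]}")
        attrs[int(attr_type)] = int(value)
        i += 2
    if len(tokens) - i != 1:
        raise ValueError("expected exactly one term after the attributes")
    return attrs, tokens[i]


def match_mode(attrs):
    """Decide how the term should be matched from the structure attribute."""
    structure = attrs.get(4)
    if structure == STRUCT_PHRASE:
        return "phrase"
    if structure == STRUCT_WORD_LIST:
        return "word list"
    return "unspecified"


if __name__ == "__main__":
    for structure in (STRUCT_PHRASE, STRUCT_WORD_LIST):
        query = build_pqf(USE_TITLE, structure, "open source")
        attrs, term = parse_simple_pqf(query)
        print(query, "->", match_mode(attrs), repr(term))
```

The builder is a single formatting step, while the parser already needs tokenising, validation and a decision table for this narrowest possible case.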

Zebra takes care of the complexity of parsing PQF for us automatically;
we only need to bother about using the configuration files appropriately.
Zebra does have some problems which would cost something to fix.

With an appropriate investment of resources, SimpleServer could be a
comparable alternative to Zebra as a Z39.50/SRU server for Koha when using
Solr/Lucene for local indexing.  As SimpleServer is written in Perl, we
may consider that SimpleServer is a better long-term option for a
Z39.50/SRU server in Perl-based Koha than C-based Zebra when using
Solr/Lucene.

There are no no-cost options for replacing Zebra as a Z39.50/SRU server
for Koha with a sufficiently comparable feature set when using
Solr/Lucene for local indexing.  Zebra has a very large feature set as a
Z39.50/SRU server which we should not give up easily.  The features may be
overlooked by most people because Zebra is not nearly as well configured
by default in Koha as it could be.

Each of the options has its own advantages and disadvantages.  We need to
understand the details well and give appropriate weight to the various
factors.

I agree that it would be much easier to maintain one record indexing
system at a high level than two and that factor should be weighted
appropriately.

>
>
> 5.  PREFIX QUERY FORMAT.
>
>>
>>>
>>> As I have stated previously, C4::Search ought to be rewritten in future
>>> using prefix query format (PQF) as the native language for Z39.50.
>> Well, Rewriting the whole Search With PQF would not be handy in many
>> aspects.  I have thought about that many times. And using PQF still
>> appears to me not to be the solution.
>> a) It is really a pain to maintain and analyse.
>> b) Whenever you need some more feature in your search, you have to add
>> some more qualifiers, and therefore provide a robust parser.
>
> I agree that analysing PQF connectors is tricky in comparison to CCL
> connectors which Yaz converts to PQF.  In my own work independent of Koha,
> I overcame difficulties by having the user interface display the PQF query
> which my code would generate.
>
> I have code for writing PQF query sets which I had started in 2005 using
> PHP Yaz before Net::Z3950::ZOOM was available for Perl.  My code supports
> the complete and I mean complete Bib-1 attribute set.  I could port the
> code to Perl with a little effort, although, it would still need some work
> for more extensibility of the user controlled term sets.
>
> Queries built using PQF can be tested by building the same query in CCL
> and sending it to Yaz for conversion to PQF as a comparison.
>
> Writing a PQF parser to interpret incoming PQF queries for SimpleServer
> would be perhaps an order of magnitude more complex.
>
>>> Continuing to support Common Command Language (CCL) is trivial because
>>> Yaz
>>> translates CCL to PQF as it does now for Koha.

We may need CCL support to work most effectively with CCL-based Pazpar2.
Basic functionality in Pazpar2 converts CCL to whatever query language is
appropriate for each configured target server.  Pazpar2 documentation
contains some incomplete descriptions of PQF options.
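As a rough illustration of what translating CCL to PQF involves, the following toy converter in Python maps a few CCL qualifiers to Bib-1 use attributes and joins clauses with PQF operators.  It is not how Yaz or Pazpar2 implement the conversion.  The qualifier names (ti, au, su, isbn) and the function names are assumptions chosen for the example, and it ignores parentheses, truncation, proximity and terms which themselves contain the words 'and', 'or' or 'not'.

```python
# Hypothetical qualifier table: CCL qualifier -> Bib-1 use attribute.
QUALIFIERS = {"ti": 4, "au": 1003, "su": 21, "isbn": 7}
OPERATORS = {"and": "@and", "or": "@or", "not": "@not"}


def ccl_clause_to_pqf(clause):
    """Convert one 'qualifier=term' clause (or a bare term) to PQF."""
    qualifier, sep, term = clause.partition("=")
    if not sep:
        # No qualifier: leave attribute selection to the target's defaults.
        return f'"{clause.strip()}"'
    key = qualifier.strip().lower()
    if key not in QUALIFIERS:
        raise ValueError(f"unknown CCL qualifier: {key}")
    return f'@attr 1={QUALIFIERS[key]} "{term.strip()}"'


def ccl_to_pqf(ccl):
    """Convert a flat CCL query to PQF, combining clauses left to right."""
    clauses, operators, current = [], [], []
    for word in ccl.split():
        if word.lower() in OPERATORS and current:
            clauses.append(" ".join(current))
            operators.append(OPERATORS[word.lower()])
            current = []
        else:
            current.append(word)
    if not current:
        raise ValueError("query is empty or ends with an operator")
    clauses.append(" ".join(current))

    # PQF is prefix notation: each operator precedes both of its operands.
    pqf = ccl_clause_to_pqf(clauses[0])
    for operator, clause in zip(operators, clauses[1:]):
        pqf = f"{operator} {pqf} {ccl_clause_to_pqf(clause)}"
    return pqf


if __name__ == "__main__":
    print(ccl_to_pqf("ti=dogs and au=smith"))
```

Going from infix CCL to prefix PQF is mostly a matter of reordering; interpreting arbitrary incoming PQF is the harder direction.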

[...]


Thomas Dukleth
Agogme
109 E 9th Street, 3D
New York, NY  10003
USA
http://www.agogme.com
+1 212-674-3783



