[Koha] some thoughts about cataloguing and acquisition (important)

Christopher Hicks chicks at chicks.net
Fri Jan 24 04:48:09 NZDT 2003


On Thu, 23 Jan 2003, Marco Gaiarin wrote:
> > database directly.  But in most practical systems some degree of
> > denormalization is necessary for efficiency.
> 
> There's always time for dirty tricks later, and I'm seriously convinced
> that with a well-written schema in normal (normalized?!) form everything
> can be done with no big penalties (or rather, with small speed penalties
> compared to the overall stability benefit ;).

Then please educate me.  Here's a challenge:

Database: anything freely runnable on Linux

ERD: Two tables in a master-slave arrangement.  The master is a "claim"
table and the slave is a "transact" table.  The transact table often
contains hundreds and occasionally thousands of transactions per claim.  
Transactions include allocating funds and spending them across several
categories per claim.  Allocation and spending totals for each category
and the overall claim are regularly needed in reports.
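
To make the shape concrete, here's a minimal sketch of the two tables,
assuming DBD::SQLite just so the example is self-contained and freely
runnable on Linux; the column names are my own invention for
illustration, not anyone's actual schema:

    use DBI;

    my $dbh = DBI->connect('dbi:SQLite:dbname=claims.db', '', '',
                           { RaiseError => 1 });

    $dbh->do(q{
        CREATE TABLE claim (
            claim_id    INTEGER PRIMARY KEY,
            description TEXT
        )
    });

    # Each row is one allocation or expenditure against a claim,
    # in a particular category.
    $dbh->do(q{
        CREATE TABLE transact (
            transact_id INTEGER PRIMARY KEY,
            claim_id    INTEGER NOT NULL REFERENCES claim(claim_id),
            category    TEXT    NOT NULL,
            txn_type    TEXT    NOT NULL,  -- 'allocate' or 'spend'
            amount      NUMERIC NOT NULL
        )
    });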

Usage: few updates, many reports.  A report may cover a few claims or
thousands of them.

Quandary: either don't maintain totals, in order to stay pure for
normalization purposes, or maintain the necessary totals in the claim
table or a separate claimsumm table.
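
Continuing the sketch above, the two approaches look roughly like this
at report time (the claimsumm layout is again invented for illustration):

    # Normalized: compute the totals from transact on every report.
    my $norm = $dbh->prepare(q{
        SELECT category, txn_type, SUM(amount)
          FROM transact
         WHERE claim_id = ?
         GROUP BY category, txn_type
    });

    # Denormalized: keep running totals in claimsumm, updated on every
    # insert into transact, and just read them back at report time.
    $dbh->do(q{
        CREATE TABLE claimsumm (
            claim_id        INTEGER NOT NULL REFERENCES claim(claim_id),
            category        TEXT    NOT NULL,
            allocated_total NUMERIC NOT NULL DEFAULT 0,
            spent_total     NUMERIC NOT NULL DEFAULT 0,
            PRIMARY KEY (claim_id, category)
        )
    });
    my $denorm = $dbh->prepare(q{
        SELECT category, allocated_total, spent_total
          FROM claimsumm
         WHERE claim_id = ?
    });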

So, can you use a standard benchmarking module (
http://search.cpan.org/author/JHI/perl-5.8.0/lib/Benchmark.pm ) to show
that producing a report from the normalized schema is quicker than
producing it from the denormalized one?  Let's say that if you can
produce the reports from normalized data in less than twice the time
they take with denormalized data, that counts as "no big penalties" and
you win.  For fun, try it with a few test clients running concurrently
and watch what happens to the results.
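
A sketch of how that could be wired up with Benchmark.pm, using the
$norm and $denorm handles prepared above; the report subs and the set
of claim ids are placeholders for whatever your real report code does:

    use Benchmark qw(cmpthese);

    my @claim_ids = (1 .. 500);  # whatever sample of claims you report on

    sub report_normalized {
        my ($ids) = @_;
        for my $id (@$ids) {
            $norm->execute($id);
            $norm->fetchall_arrayref;
        }
    }

    sub report_denormalized {
        my ($ids) = @_;
        for my $id (@$ids) {
            $denorm->execute($id);
            $denorm->fetchall_arrayref;
        }
    }

    # A negative count runs each variant for at least that many CPU
    # seconds and prints a comparison chart.
    cmpthese(-10, {
        normalized   => sub { report_normalized(\@claim_ids) },
        denormalized => sub { report_denormalized(\@claim_ids) },
    });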

I'm sure the librarians are bored to tears by all of this by now, but I
think it would be extremely disappointing to them if the "normalization
to the death" camp held sway over Koha.  Normalization is a very good
idea, and it doesn't cause any problems until you try to scale beyond a
database size of merely academic interest.  Normalization is often
ignored out of premature optimization or plain ignorance, and I spend
more of my DBA consulting time teaching people this most basic of
database concepts than I do encouraging people to break the rules.  But
when there is a clear need to break the rules, refusing to do so costs
an increasingly large penalty as things scale.  And I suspect almost
every library would like to get bigger someday.

-- 
</chris>

"Never offend people with style when you can offend them with substance."
		- Sam Brown 



