[Koha] some thoughts about cataloguing and acquisition (important)

Christopher Hicks chicks at chicks.net
Tue Jan 21 07:21:00 NZDT 2003


On Mon, 20 Jan 2003, Marco Gaiarin wrote:
> Please, no. Referential integrity is the only warrenty of data ever
> correct and coherent. A correct schema, well written, are not slower as
> a full set of dirty tricks. Please, build koha around a well written
> data schema, ad the application will be easer tu build and to manage.

You're mixing up a number of things here.  What you're referring to as
"referential integrity" I'm guessing you mean normalized.  Either way,
neither is a warranty for correct data.  I'm a big believer in
normalization and a fully normalized "ideal" model for every system should
be saved as a reference for all DBA's and programmers dealing with the
database directly.  But in most practical systems some degree of
denormalization is necessary for effeciency.

I have a system which produces reports of insurance claims.  We maintain a
claim transaction table that has a lot of transactions in it, but also
another table summarizing various things about these transactions.  If we
were to scan the full transaction table for each report produced, the
reports would take four to five times as long to produce and the overall
application would be slower because of CPU and I/O utilization on the
server.  The obvious difficulty is that this information must be
maintained which entails programming and system overhead, but given our
usage patterns this trade off is well worth it.  Some people are
overzealous in their optimizations and you can end up with some nasty data
models because of it, but that doesn't mean that denormalization is all
bad.

If you're speaking of referential integrity in the sense of constraints
within the database, there again you're facing a performance penalty which
is usually not worth it.  I encourage software developers to write code on
a database which supports constraints, but the implementation is often on
a database where constraints aren't available (like MySQL) or where the
constraints slow things down more than they're worth (like Oracle) in
production.  Financial environments and other places where there's plenty
of money to throw at hardware for the database often run production
systems with referential integrity cut on, but that's the opposite of most
libraries I'm familiar with.

-- 
</chris>

"Never offend people with style when you can offend them with substance."
		- Sam Brown 




More information about the Koha mailing list