I never realized what a hassle it is to sort lists where numbers come after letters (as is the norm with MARC subfield marks). MySQL will only sort with numbers before the letters. I've managed to get the MARC records sorting in the right order now, and it can optionally display both tag names and subfield names for those of us for whom MARC is somewhat greek. When adding new subfields, there is a drop down box with subfield marks and names, or a simple text entry box where you can just type in the subfield mark. Now I really need to start thinking about how to tie this in to Koha. I don't think it should be too difficult to generate MARC records from the existing biblio, biblioitems, items table data. I suppose the easiest thing to do would be to insert some kind of API everywhere that modifies those tables so that they also modify the MARC records. This would be a first step towards adding MARC support. Second step would be inserting an API for every routine that _reads_ from those tables. At that point, all reads could gather data from the MARC records, and all writes could update both the old tables and the new MARC tables. MARC records would be linked at the biblioitems level. The items table data would, I believe, mostly be recorded in an 852 - Location tag in the Marc data. Links to the biblio table could be stored in a local 9XX tag. For example, if there is a biblio entry for "King Lear" with biblionumber 5214, and there are 4 biblioitems for that biblio (hardcover, paberback, cassette, large print), the MARC records for each of those four would have a tag like: 944 ## a 5214 When doing searches on MARC records, all four results with the same 944 a value could be lumped together under "King Lear" in the results output. I believe this would result in the same behaviour obtained with the three split tables? ------------------------ Technical Stuff Below!!! ------------------------ Rough outline of the Schema I used: I used the schema developed for the Pytheas project. It is fully capable of storing all of the information in any MARC record, and is not overly difficult to search. One feature I'd like to add is a keyword index so that soundex searches would work (ie soundex(elefant) = soundex(elephant)). The Resource_Table contains an entry for every MARC record. Resource_Table Resource_ID int auto_increment Record_ID int The Bib_Table contains an entry for every tag in a record. The Record_ID is used to link all tags in a MARC record together. Each entry in Bib_Table will have one of the Tag_#XX_ID fields filled out with a number that links to the appropriate #XX_Tag_Table. The Control_ID field links to a Control_Table that is used for the fixed length data elements like 008. Bib_Table Bib_ID int auto_increment Record_ID int Tag_0XX_ID int Tag_1XX_ID int Tag_2XX_ID int Tag_3XX_ID int Tag_4XX_ID int Tag_5XX_ID int Tag_6XX_ID int Tag_7XX_ID int Tag_8XX_ID int Tag_9XX_ID int Control_ID int The 2XX_Tag_Table is representative of 10 tables (0XX_Tag_Table through 9XX_Tag_Table). It contains one entry for each tag of a record. The Tag_ID value is linked back to the Bib_Table Tag_2XX_ID field. The Subfield_ID is used to link to the 2XX_Subfield_Table which contains all of the subfields for this tag. 2XX_Tag_Table Tag_Key int auto_increment Tag_ID int Indicator1 int Indicator2 int Tag char(3) Subfield_ID int The 2XX_Subfield_Table is again representative of 10 such tables. The Subfield_ID is used to link all of the subfields of a given tag together and back to the 2XX_Tag_Table entry. Small values can be stored directly in the Subfield_Value field. Larger values are stored in separate blob tables using the Storage_ID field as the link. 2XX_Subfield_Table Subfield_Key int auto_increment Subfield_ID int Subfield_Mark char(1) Subfield_Value char(255) Storage_ID int The Storage_Table and *Blob_Table is where the data elements themselves are stored. Depending on the size of the data, they can be stored in the Blob_Table, Med_Blob_Table, or the Long_Blob_Table, based on the value of Storage_Type. Storage_Table Storage_Key int auto_increment Storage_ID int Blob_ID int Med_Blob_ID int Long_Blob_ID int Storage_Type enum('B', 'MB', 'LB'); The Med_Blob_Table and Long_Blob_Table are identical to Blob_Table except that the Blob_Data field is mediumblob or longblob respectively. The Seq_No field can be used to break a huge record into multiple longblob's if necessary. Blob_Table Blob_Key int auto_increment Blob_ID int Blob_Data blob Seq_No int If I wanted to find the title and author of all records that had "bear" in any subject (6XX) tag, I would use: select B.Record_ID from Bib_Table B, 6XX_Tag_Table T, 6XX_Subfield_Table S where B.Tag_6XX_ID=T.Tag_ID and T.Subfield_ID=S.Subfield_ID and S.Subfield_Mark='a' and S.Subfield_Value like '%bear%'; for each Record_ID returned, run the following to get the author: select S.Subfield_Value from Bib_Table B, 1XX_Tag_Table T, 1XX_Subfield_Table S where B.Record_ID=$Record_ID and B.Tag_1XX_ID=T.Tag_ID and T.Subfield_ID=S.Subfield_ID and T.Tag='100'; and the following to get the title: select S.Subfield_Value from Bib_Table B, 2XX_Tag_Table T, 2XX_Subfield_Table S where B.Record_ID=$Record_ID and B.Tag_2XX_ID=T.Tag_ID and T.Subfield_ID=S.Subfield_ID and T.Tag='245';