[Koha] Character sets in MariaDB 10.5
Coehoorn, Joel
jcoehoorn at york.edu
Thu Aug 26 09:24:52 NZST 2021
The "mb4" stands for "multi-byte 4": utf8mb4 can store characters that take
up to 4 bytes in UTF-8. Plain utf8 in MariaDB (an alias for utf8mb3) only
handles characters up to 3 bytes long. Three bytes is already enough for the
entire Basic Multilingual Plane, which includes everything needed for nearly
all modern languages, as well as a large number of symbols. Characters
outside that plane, such as most emoji, need the fourth byte. There are also
some performance and compatibility benefits to the narrower 3-byte character
set. Whether the change you are seeing is intentional, or what the discussion
of the tradeoffs for Koha looked like, I couldn't say (I wasn't there).
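
As a rough illustration of the 3-byte versus 4-byte distinction, here is a
minimal Python sketch. It is not taken from Koha or MariaDB; the helper names
utf8_length and fits_in_utf8mb3 are made up for this example, and it assumes
that the 3-byte utf8 character set covers exactly the Basic Multilingual
Plane (code points up to U+FFFF).

```python
# Sketch: how many bytes does a character need in UTF-8, and would a
# string fit into a 3-byte-only character set? Helper names are hypothetical.

def utf8_length(ch: str) -> int:
    """Number of bytes the UTF-8 encoding of a single character takes."""
    return len(ch.encode("utf-8"))


def fits_in_utf8mb3(text: str) -> bool:
    """True if every character lies in the Basic Multilingual Plane.

    Assumption: 3-byte utf8 can store code points up to U+FFFF and nothing
    beyond that.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


samples = {
    "A": "Latin capital letter A",
    "\u00e4": "a with diaeresis",
    "\u8a9e": "CJK ideograph",
    "\U0001F600": "emoji outside the Basic Multilingual Plane",
}

for ch, description in samples.items():
    print(f"U+{ord(ch):04X} {description}: {utf8_length(ch)} byte(s), "
          f"fits in 3-byte utf8: {fits_in_utf8mb3(ch)}")
```

Text for which fits_in_utf8mb3 returns False is the kind of data that needs
utf8mb4 on the client, connection and results settings as well as on the
database itself.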
Joel Coehoorn
Director of Information Technology
York College of Nebraska
On Wed, Aug 25, 2021 at 3:53 PM Michael Kuhn <mik at adminkuhn.ch> wrote:
> Hi
>
> 1. In the last few years when installing Koha on Debian GNU/Linux 9 or
> 10 the character sets in MariaDB were as follows:
>
> MariaDB [(none)]> SHOW VARIABLES LIKE '%char%';
> +--------------------------+----------------------------+
> | Variable_name            | Value                      |
> +--------------------------+----------------------------+
> | character_set_client     | utf8mb4                    |
> | character_set_connection | utf8mb4                    |
> | character_set_database   | utf8mb4                    |
> | character_set_filesystem | binary                     |
> | character_set_results    | utf8mb4                    |
> | character_set_server     | utf8mb4                    |
> | character_set_system     | utf8                       |
> | character_sets_dir       | /usr/share/mysql/charsets/ |
> +--------------------------+----------------------------+
>
> Today I installed Koha 21.05.03 on Debian GNU/Linux 11 with MariaDB
> 10.5.11 where the character sets are as follows:
>
> MariaDB [(none)]> SHOW VARIABLES LIKE '%char%';
> +--------------------------+----------------------------+
> | Variable_name            | Value                      |
> +--------------------------+----------------------------+
> | character_set_client     | utf8                       |
> | character_set_connection | utf8                       |
> | character_set_database   | utf8mb4                    |
> | character_set_filesystem | binary                     |
> | character_set_results    | utf8                       |
> | character_set_server     | utf8mb4                    |
> | character_set_system     | utf8                       |
> | character_sets_dir       | /usr/share/mysql/charsets/ |
> +--------------------------+----------------------------+
>
> I'm not sure what is going on here. Does anyone know why the character
> sets for client, connection and results have changed from utf8mb4 to
> utf8? Is this correct with Koha or should these character sets be changed?
>
> 2. Today I came upon an installation of Koha 18.11.05 using MariaDB
> 10.0.32 which has the following character sets:
>
> MariaDB [(none)]> SHOW VARIABLES LIKE '%char%';
> +--------------------------+----------------------------+
> | Variable_name            | Value                      |
> +--------------------------+----------------------------+
> | character_set_client     | utf8                       |
> | character_set_connection | utf8                       |
> | character_set_database   | latin1                     |
> | character_set_filesystem | binary                     |
> | character_set_results    | utf8                       |
> | character_set_server     | latin1                     |
> | character_set_system     | utf8                       |
> | character_sets_dir       | /usr/share/mysql/charsets/ |
> +--------------------------+----------------------------+
>
> This seems quite wrong to me - as far as I know "latin1" was never a
> supported character set in Koha, and the character sets should be set
> as shown in topic 1.
>
> However, is it still possible to update such a database with these
> character sets to Koha 21.05.03 without destroying the data completely?
>
> Best wishes: Michael
> --
> Geschäftsführer · Diplombibliothekar BBS, Informatiker eidg. Fachausweis
> Admin Kuhn GmbH · Pappelstrasse 20 · 4123 Allschwil · Schweiz
> T 0041 (0)61 261 55 61 · E mik at adminkuhn.ch · W www.adminkuhn.ch
> _______________________________________________
>
> Koha mailing list http://koha-community.org
> Koha at lists.katipo.co.nz
> Unsubscribe: https://lists.katipo.co.nz/mailman/listinfo/koha
>