Re: [Koha] Help from people experienced in Arabic, Hebrew, and CJK languages (and Vietnamese)
I would also add that we need feedback from Vietnamese users. I admit that I'm not very familiar with Vietnamese, but I know it's a tonal language (https://en.wikipedia.org/wiki/Vietnamese_alphabet#Tone_marks) which relies on a lot of diacritics with a Latin-based alphabet, so removing those tone marks could be highly problematic for Vietnamese users. According to http://translate.koha-community.org/, the Vietnamese translation is 67% complete, so I suspect there are Vietnamese users of Koha out there. I just noticed Dung Hoang's email address in the listserv archives, so I'm including him here for his input.
I wonder if the Romanization of CJK languages would be a thing we'd need to consider as well... we don't support any CJK libraries, so I'm not sure how those libraries store their data... whether it's as ideograms or in romanized text...
I'm curious about Danish and other Scandinavian languages as well, as å may be equivalent to "aa" but not "a", I think?
More importantly, I agree with Galen. Do patrons expect to enter diacritics when entering their usernames? As Galen pointed out on Bugzilla, the original unaccenting of usernames started with http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=7411. Sophie: Do you recall how a userid with diacritics could cause login failures?
David Cook Systems Librarian Prosentient Systems 72/330 Wattle St, Ultimo, NSW 2007
-----Original Message----- Date: Thu, 10 Dec 2015 10:35:19 -0500 From: Galen Charlton <gmc@esilibrary.com> To: koha <koha@lists.katipo.co.nz> Subject: Re: [Koha] Help from people experienced in Arabic, Hebrew, and CJK languages Message-ID: <CAPLnt67RQ1L1dERD5zO- NMc77yTDfPgEBYqE5tYU8A1LdhjxmA@mail.gmail.com> Content-Type: text/plain; charset=UTF-8
Hi,
On Thu, Dec 10, 2015 at 6:01 AM, Tajoli Zeno <z.tajoli@cineca.it> wrote:
The link: http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=14759
Please join the discussion even if you don't know Perl, as long as you know one of the following well:
- Arabic alphabet and language
- Hebrew alphabet and language
- CJK writing systems and languages
In addition, it would be great to have input from Koha users using *any* language that contains diacritics, as there is a functionality question underlying the discussion.
At the moment, the module in question is used to remove diacritics when automatically generating a public catalog username -- e.g., if a patron whose surname is Müller is registered with a Koha database, would they expect, based on their use of other websites, that their username could include the diacritic (e.g., "müller")? Or would they expect that it would never include the diacritics (e.g., "muller")?
Regards,
Galen -- Galen Charlton Infrastructure and Added Services Manager Equinox Software, Inc. / The Open Source Experts email: gmc@esilibrary.com direct: +1 770-709-5581 cell: +1 404-984-4366 skype: gmcharlt web: http://www.esilibrary.com/ Supporting Koha and Evergreen: http://koha-community.org & http://evergreen-ils.org
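To make the behaviour under discussion concrete, here is a minimal Python sketch of unaccenting a surname into a username. It is not Koha's actual Perl module: the function name unaccent_username and the lowercasing step are assumptions made for illustration. The sketch decomposes the text with Unicode NFD normalization and drops the combining marks that result.

```python
import unicodedata


def unaccent_username(surname: str) -> str:
    """Hypothetical helper: strip combining marks from a surname and lowercase it."""
    # NFD splits a precomposed letter such as "ü" into "u" plus a combining mark.
    decomposed = unicodedata.normalize("NFD", surname)
    # Keep only the characters that are not combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


print(unaccent_username("Müller"))  # muller
print(unaccent_username("Nguyễn"))  # nguyen
```

The open question in the thread is whether a patron named Müller would expect to type "muller" or "müller" at login; the sketch only shows what the unaccented form looks like.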
Chinese, both Simplified and Traditional, can be stored in UTF-8. At present, we store Traditional Chinese in UTF-8 in Taiwan. However, some very old systems around Taiwan, Hong Kong, Singapore, etc. still use Big5. Most integrated library systems in China use GB 2312 (GB stands for GuoBiao, which means "national standard") to store Simplified Chinese.
2015-12-11 8:32 GMT+08:00 David Cook <dcook@prosentient.com.au>:
I wonder if the Romanization of CJK languages would be a thing we'd need to consider as well... we don't support any CJK libraries, so I'm not sure how those libraries store their data... whether it's as ideograms or in romanized text...
-- Wishing you all the best. . . . Anthony Mao 毛慶禎 +886 2 29052334 (voice) + 886 2 29017405 (FAX)
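The encodings mentioned above can be compared with a short Python sketch. It uses the standard codec names utf-8, big5 and gb2312; the sample word ("library") and the helper can_encode are assumptions chosen for illustration, not data from any real catalogue. It shows that UTF-8 can hold both scripts, while each legacy encoding covers only its own.

```python
traditional = "圖書館"  # "library" in Traditional Chinese
simplified = "图书馆"   # the same word in Simplified Chinese


def can_encode(text: str, encoding: str) -> bool:
    """Hypothetical helper: report whether text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for label, text in [("Traditional", traditional), ("Simplified", simplified)]:
    for encoding in ("utf-8", "big5", "gb2312"):
        print(label, encoding, can_encode(text, encoding))

# The legacy encodings use 2 bytes per character here; UTF-8 uses 3.
print(len(traditional.encode("big5")), len(traditional.encode("utf-8")))
```

A system that migrates such records to UTF-8 would need to know which legacy encoding each record was stored in before decoding it.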
On 11 December 2015 at 01:32, David Cook <dcook@prosentient.com.au> wrote:
I'm curious about Danish and other Scandinavian languages as well, as å may be equivalent to "aa" but not "a", I think?
I dunno much about those other languages, but at least for Norwegian you are right. In older writings å might be written as aa. So in the interests of comprehensiveness, it would certainly be interesting to treat aa and å as synonymous.
Treating a and å as synonymous would mostly be a disaster, I think. You want to be able to distinguish between e.g. "make" (spouse) and "måke" (seagull).
Best regards, Magnus Enger Libriotech
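The distinction Magnus draws can be sketched in Python. The two function names below are hypothetical, and the å-to-aa folding is only one possible rule, not something Koha is known to implement. The first function shows how generic mark stripping makes "måke" collide with "make"; the second treats å and aa as synonymous instead.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Generic unaccenting: decompose with NFD and drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold_norwegian(text: str) -> str:
    """Hypothetical folding rule: rewrite å as aa so both spellings match."""
    # Normalize to NFC first so a decomposed "a" + ring is seen as one "å".
    composed = unicodedata.normalize("NFC", text)
    return composed.replace("å", "aa").replace("Å", "Aa")


print(strip_marks("måke") == "make")                        # True: the words collide
print(fold_norwegian("måke") == fold_norwegian("make"))     # False: still distinct
print(fold_norwegian("Håkon") == fold_norwegian("Haakon"))  # True: treated as synonymous
```

A language-specific rule like this would have to be chosen per language, which is part of why the thread asks for input from native speakers.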
Thank you to everyone for your replies! They make me think that we shouldn't be unaccenting any text, as it can dramatically change the meaning of words.
However, we should have clarified that, in this case, the text we're unaccenting consists of people's names. The original problem traces back to issues with diacritics in the userid: http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=7411
Personally, I think we should investigate why diacritics were causing an issue in the userid, rather than unaccenting text.
However, what do people have to say about the removal of diacritics in the names of people? In the case of Frédéric, the diacritics are for the sake of pronunciation and don't change the meaning of the word. That said, I'm sure there are names in French and other languages where the name is the same as a common noun, and where removing the diacritics could in fact change the meaning of the name, to the extent that the two forms are distinctly different words. You wouldn't write your name as "Hotel" when it's actually "Lily" (disclaimer: I didn't have a good example, so this is an exaggeration).
On the other hand, maybe people are used to writing usernames without diacritics? I think that's the real question we want to ask people.
David Cook Systems Librarian Prosentient Systems 72/330 Wattle St, Ultimo, NSW 2007
-----Original Message----- From: enger.magnus@gmail.com [mailto:enger.magnus@gmail.com] On Behalf Of Magnus Enger Sent: Friday, 18 December 2015 12:28 AM To: David Cook <dcook@prosentient.com.au> Cc: Koha list <koha@lists.katipo.co.nz> Subject: Re: [Koha] Help from people experienced in Arabic, Hebrew, and CJK languages (and Vietnamese)
On 11 December 2015 at 01:32, David Cook <dcook@prosentient.com.au> wrote:
I'm curious about Danish and other Scandinavian languages as well, as å may be equivalent to "aa" but not "a", I think?
I dunno much about those other languages, but at least for Norwegian you are right. In older writings å might be written as aa. So in the interests of comprehensiveness, it would certainly be interesting to treat aa and å as synonymous.
Treating a and å as synonymous would mostly be a disaster, I think. You want to be able to distinguish between e.g. "make" (spouse) and "måke" (seagull).
Best regards, Magnus Enger Libriotech
participants (3)
- Anthony Mao
- David Cook
- Magnus Enger