Russian Language characters in public catalog
Hi all,

This is perhaps more of a general web design problem than a Koha-specific one, but I'm wondering if others have had trouble with accented Russian characters displaying properly and what their solution was. Here's a search of all our Russian language records <https://library.cca.edu/cgi-bin/koha/opac-search.pl?advsearch=1&idx=kw&op=and&idx=kw&op=and&idx=kw&limit=ln%2Crtrn%3Arus&sort_by=relevance&do=Search>; there's an accent that's a large, sweeping arch over multiple letters (sorry, after some research I still don't know its name), which fonts that do not support it render as blank Unicode boxes, e.g. "M. Sarʹi︠a︡n" as opposed to how it should look <https://www.google.com/search?&q=M.+Sar%CA%B9i%EF%B8%A0a%EF%B8%A1n>.

The Arial font is so far the only one I've found that renders the accent properly, though I assume there are others. The Koha theme's NotoSans doesn't, and neither do any of our institutional brand fonts. Is there anywhere in the public catalog display where the record language is shown? Preferably as a class or other CSS hook; if I could just do `#catalogue_detail_biblio.russian { font-family: arial }` that'd be the perfect solution.

I've been considering how to handle this, but no option other than simply switching the whole catalog's font to Arial seems great. I can add some JavaScript to change the font on just select pages, but I'd have to tell when affected records are appearing and also apply it to search results. I could edit all the records to romanize them, but that would be laborious since I don't think there's a way to do a bulk edit that catches every instance of the accent. Plus, newly imported records will have the same problem and I'll have to keep revisiting them.

Best,

ERIC PHETTEPLACE Systems Librarian (he/him)
ephetteplace@cca.edu | o 510.594.3660
5212 Broadway | Oakland, CA | 94618
:(){ :|: & };:
On 25.01.20 at 00:03, Eric Phetteplace wrote:
Hi all,
Hi,
This is perhaps more of a general web design problem than a Koha-specific one but I'm wondering if others have had trouble with accented Russian characters displaying properly and what their solution was. Here's a search of all our Russian language records <https://library.cca.edu/cgi-bin/koha/opac-search.pl?advsearch=1&idx=kw&op=and&idx=kw&op=and&idx=kw&limit=ln%2Crtrn%3Arus&sort_by=relevance&do=Search>; there's an accent that's a large, sweeping arch over multiple letters (sorry, after some research I still don't know its name) which renders as blank unicode boxes by fonts that do not support it, e.g. "M. Sarʹi︠a︡n" as opposed to how it should look <https://www.google.com/search?&q=M.+Sar%CA%B9i%EF%B8%A0a%EF%B8%A1n>.
The Arial font is so far the only one I've found that renders the accent properly, though I assume there are others. Koha theme's NotoSans doesn't and neither do any of our institutional brand fonts. Is there anywhere in the public catalog display where the record language is shown? Preferably as a class or other CSS hook, if I could just do `#catalogue_detail_biblio.russian { font-family: arial }` that'd be the perfect solution.
I doubt that this is really a Russian "accent", because that makes no sense in this context. A Russian accent would be, e.g., the one in Каменский: the breve above the и turns the normal и (i) into й (j).

M. Saryan is an Armenian painter, so his name is originally not Russian/Cyrillic but Armenian: Մարտիրոս Սարյան, according to Wikipedia. The Russian transliteration is Мартирос Сарьян, and in Latin script Martiros Saryan. In a Russian dictionary it appears with pronunciation marks: Мартиро́с Сарья́н.

If you look at his surname Сарья́н, you see a я, which is pronounced "ja" and is in fact a ligature, or at least was; today it's a single letter. The accent above the я is only a pronunciation hint, not part of the original name. Сарья́н also includes a ь, a myagkiy znak (soft sign), which is not a letter you pronounce but a sign that changes the "hardness" of the preceding consonant (here р, which is r).

In short: M. Sarʹi︠a︡n seems to be just a strange, broken transliteration of these signs. The ʹ after the r refers to the ь, since you "can't" speak it. And the sweeping arches above the "ia" seem to be the remains of the pronunciation mark above the я́ (ia). But you don't write the pronunciation marks in Russian (unless it's a dictionary or similar).

So his Latin name is M. Saryan, or in Russian, М. Сарьян. I would not expect to see these pronunciation signs in a book search, since they are not part of his name. But I'm not a librarian, so I don't know what the requirements for these things are.
Sincerely, grex
_______________________________________________
Koha mailing list http://koha-community.org
Koha@lists.katipo.co.nz
https://lists.katipo.co.nz/mailman/listinfo/koha
On 26.01.20 at 12:54, le-grex wrote:
*snip*
So his Latin name is M. Saryan, or in Russian, М. Сарьян. I would not expect to see these pronunciation signs in a book search, since they are not part of his name. But I'm not a librarian, so I don't know what the requirements for these things are.
*snip*

Excuse me, I meant "М. Сарьян" without the pronunciation sign ;)
Indeed, after asking on Twitter, I discovered the arch is not an accent but a ligature, meant to indicate that two Latin characters represent one Cyrillic one. It's apparently an idiosyncrasy of library cataloging:

"Yup, left ligature and right ligature. It's because library transliteration values absolute precision over readability. Just writing ia could be, in theory, either иа or я, so the ligatures signify that it's all one letter under there." https://twitter.com/marccold/status/1220858664560529408

Honestly though, it doesn't matter what language or meaning the symbol has; it doesn't render correctly in our catalog, so I'm still stuck. I wonder if anyone else has a solution that doesn't involve simply using Arial. I see a lot of catalog records with this ligature, and yet not every catalog is forced to use one of the small selection of fonts that support it, I hope.

Best,

ERIC PHETTEPLACE Systems Librarian (he/him)
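For anyone who wants to check exactly which characters are in the string, here is a minimal sketch that decodes the query from the Google link in the first message and lists every code point outside printable ASCII. The function name `describeNonAscii` is made up for this illustration; only standard JavaScript built-ins are used. The three code points it reports are U+02B9 (MODIFIER LETTER PRIME), U+FE20 (COMBINING LIGATURE LEFT HALF) and U+FE21 (COMBINING LIGATURE RIGHT HALF), which matches the "left ligature and right ligature" description above.

```javascript
// Percent-encoded query copied from the Google search link in the first message.
const encoded = "M.+Sar%CA%B9i%EF%B8%A0a%EF%B8%A1n";

// In a query string "+" stands for a space; decodeURIComponent decodes the UTF-8 bytes.
const decoded = decodeURIComponent(encoded.replace(/\+/g, " "));

// Hypothetical helper: return "U+XXXX" labels for every code point above printable ASCII.
function describeNonAscii(text) {
  const found = [];
  for (const ch of text) { // for...of iterates by code point, not by UTF-16 unit
    const cp = ch.codePointAt(0);
    if (cp > 0x7e) {
      found.push("U+" + cp.toString(16).toUpperCase().padStart(4, "0"));
    }
  }
  return found;
}

const nonAscii = describeNonAscii(decoded);
console.log(nonAscii.join(" ")); // prints "U+02B9 U+FE20 U+FE21"
```

The same helper could be run over exported titles to see which other special characters turn up in a set of records.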
To see the ligatures in the catalog at the Library of Congress, see https://lccn.loc.gov/84174397
What you're looking for is a class of Unicode characters called combining marks. In this specific case, you're looking for combining half marks, which when put together span multiple characters, such as the ligatures used to transcribe Cyrillic characters in Latin characters. Not all fonts cover the Unicode block that contains the combining half marks. Arial is a common font that happens to have that support, so you'll often find it used in library catalogs.

In my library's discovery system, it seems that whenever characters need the combining marks, they are forced to use the Arial font by placing them within an HTML <span> element styled with Arial as the font, like this:

<span class="combined-half-mark">i︠a︡</span>

You'll find some records at <https://onesearch.library.nd.edu/primo-explore/search?query=any,contains,vospominanii%EF%B8%A0a%EF%B8%A1&tab=nd_campus&search_scope=nd_campus&vid=NDU&lang=en_US&offset=0>.

I don't know how they get enclosed in the span element. It could well be through the use of JavaScript, as you suggested in a previous e-mail.

Wikipedia has some information on combining half marks at <https://en.wikipedia.org/wiki/Combining_Half_Marks>. If you want to try using JavaScript to detect these marks, I found this article that might be of interest: <https://dmitripavlutin.com/what-every-javascript-developer-should-know-about...>.
Andy

-- 
Andy Boze, Associate Librarian
University of Notre Dame
208A Hesburgh Library
(574) 631-8708
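The span approach described above could be sketched roughly as follows. This is an illustration only, not how the Notre Dame discovery system actually does it (that is unknown). The names `HALF_MARK_RUN`, `splitHalfMarkRuns` and `wrapHalfMarks` are invented for the example. It assumes the script runs after the page has loaded, for instance from Koha's OPACUserJS system preference, and that a rule such as `.combined-half-mark { font-family: Arial, sans-serif; }` is added to the catalog's CSS. The class name is borrowed from the span shown above.

```javascript
// Runs of "base character + one or more combining half marks" (block U+FE20 to U+FE2F).
const HALF_MARK_RUN = /(?:[^\uFE20-\uFE2F][\uFE20-\uFE2F]+)+/gu;

// Hypothetical helper: split text into segments, flagging those that carry half marks.
function splitHalfMarkRuns(text) {
  const segments = [];
  let last = 0;
  for (const match of text.matchAll(HALF_MARK_RUN)) {
    if (match.index > last) {
      segments.push({ text: text.slice(last, match.index), marked: false });
    }
    segments.push({ text: match[0], marked: true });
    last = match.index + match[0].length;
  }
  if (last < text.length) {
    segments.push({ text: text.slice(last), marked: false });
  }
  return segments;
}

// Hypothetical helper: wrap every flagged segment under `root` in a styled span.
function wrapHalfMarks(root) {
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  const textNodes = [];
  // Collect first, so the tree is not modified while it is being walked.
  while (walker.nextNode()) textNodes.push(walker.currentNode);

  for (const node of textNodes) {
    const segments = splitHalfMarkRuns(node.nodeValue);
    if (!segments.some((segment) => segment.marked)) continue;

    const fragment = document.createDocumentFragment();
    for (const segment of segments) {
      if (segment.marked) {
        const span = document.createElement("span");
        span.className = "combined-half-mark"; // styled with Arial in the catalog CSS
        span.textContent = segment.text;
        fragment.appendChild(span);
      } else {
        fragment.appendChild(document.createTextNode(segment.text));
      }
    }
    node.parentNode.replaceChild(fragment, node);
  }
}

// Only touch the DOM when running in a browser with a loaded page.
if (typeof document !== "undefined" && document.body) {
  wrapHalfMarks(document.body);
}
```

Because the script looks at the text itself rather than the record language, it would apply to search results and detail pages alike, and newly imported records would be covered without editing them. The rest of the catalog keeps its normal font.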
participants (4)
- Andy Boze
- Eric Phetteplace
- James Weinheimer
- le-grex