Archive

Posts Tagged ‘Breton’

Needle in a haystack

09/02/2013 5 comments

It’s been a strange sort of end to the week. I e-met a new language and came face to face with a linguistic, digital needle in a cyberhaystack. Ok, I’m not making much sense so far, I know… just setting the scene!

We all know Skype, the new version of which (quoting my hilarious brother) “convinces through less functionality and more bugs”. Back when Skype still belonged to itself, I eventually discovered the fact that, at least on Windows, it’s pretty easy to localize. You go to Tools » Change Language » Edit Skype Language file and right down there where everyone can see it, you have the option to save the English.lang file (which contains the English strings) under a new name and add your own translation. So back in 2011 I started working on a Gaidhlig.lang and by early 2012 had finally caught up with all the updates that kept getting in the way.

The Li Niha (Nias) interface

What does one do when one has completed a translation? Sure, you submit it to the project and ask them to bundle it, release it, whatever. Not so fast, buckoes… Due to “size issues” (I’d like to remind everyone at this point that currently, a full language file weighs in at a massive 400KB), Skype only bundles the usual 20 or so suspects, CJK (that’s Chinese, Japanese and Korean) and a bunch of European languages with the install file. Since they never thought of adding an Install new language function that could pull a file from some repository, the short of it was that even having localized the lot, you were on your own. Sure, you could post the file as an attachment on the forum but then who goes trawling through a forum in search of a language file?

Using the usual “Gaelic” channels, I think we’ve reached a reasonable number of people so far but certainly fewer than we would have reached had it been “inside” the program itself.

But before I knock the old forum too much, I should point out that it actually had a dedicated localization section. Why do I mention this? Because, moving to the next episode where we finally meet Mr Big, when Skype was bought by Microsoft, the forums were wiped and *cough* improved. That’s right, the localization section went. Especially the parts where people were trying very hard to figure out how to turn a .lang file into something that Linux and MacOS could digest. Am I glad I took copies of the bits that were useful…

Anyway, even in the new forum, the localization questions never went away. But the stock answer of the one admin who bothers to check that corner is always that “there’s no news”. In fairness, I don’t think he actually has the power to do anything, he’s just the unfortunate person who has to interact with, shock and horror, the users. So even though Skype was first launched in 2003, here we are in 2012 still asking the same questions – why can’t you bundle our language, why can’t we convert/localize the files for MacOS/Linux and how about frickin plural formatting?

Yep, “there’s no news”. The chap working on Welsh then had an interesting suggestion – can’t we host them on SourceForge? You see, the problem with distributing the files via the forum is that once your post moves off the first page, who’s going to see it? So, brilliant idea I thought and we went about setting up a project. Nothing fancy, just the .lang files which don’t come bundled with Skype and a few Wiki pages with guidance.

Seeing I had a quiet day and since my contributions in terms of code are… amusing, I decided to hit the web to locate all the .lang files out there, or as many of them as possible anyway – I may suck at code but I rock at websearches! Half a day later, I had the most amazing collection of languages. Some I had known about – Gaelic, Welsh, Cornish, Irish and Uyghur – as their translators had been active on the forum. Some were part of the usual suspects but some were totally unexpected and one I’d never even heard about which is, as a matter of fact, rather unusual. So in the end, we had:

  1. Adyghe
  2. Afrikaans
  3. Albanian
  4. Armenian
  5. Basque
  6. Breton
  7. Chuvash
  8. Cornish
  9. Erzya
  10. Esperanto
  11. Faroese
  12. Gaelic
  13. Irish
  14. Ligurian
  15. Macedonian
  16. Mirandese
  17. Nias
  18. Tajik
  19. Tamil
  20. Uyghur (Persian and Latin script)
  21. Welsh

Definitely wow. Admittedly, not all are complete but it’s still one of the most diverse lists I’ve ever come across, even if there are no languages from the Americas in the list. Especially Adyghe, Chuvash and Erzya are not languages you normally see on localization projects. And Nias I had never even heard about. Turns out it’s a language of some 700,000 speakers off the coast of Sumatra. That certainly cheered me up. Yeah I know, geek 🙂
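For anyone wondering how you'd check which of these files are complete, here is a minimal sketch. It assumes a .lang file is plain text with one `KEY=value` entry per line, and the key names and sample strings below are made up for illustration, not taken from a real Skype file. It compares a translation against the English source and lists keys that are missing or still identical to the English.

```python
def parse_lang(text):
    """Parse 'KEY=value' lines into a dict, skipping blank lines and lines without '='."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip()
    return entries


def completeness(english, translated):
    """Return (fraction translated, sorted keys that are missing or left as English)."""
    todo = sorted(
        key for key, source in english.items()
        if not translated.get(key) or translated[key] == source
    )
    done = len(english) - len(todo)
    ratio = done / len(english) if english else 0.0
    return ratio, todo


# Hypothetical sample data: the key names are invented for this sketch.
ENGLISH = "sBUTTON_CALL=Call\nsBUTTON_CHAT=Chat\nsMENU_TOOLS=Tools\nsMENU_HELP=Help\n"
GAELIC = "sBUTTON_CALL=Cuir fòn\nsBUTTON_CHAT=Chat\nsMENU_TOOLS=Innealan\n"

ratio, todo = completeness(parse_lang(ENGLISH), parse_lang(GAELIC))
print(f"{ratio:.0%} translated; still to do: {todo}")
```

One caveat on the design: treating "identical to English" as untranslated will flag strings that are legitimately the same in both languages (product names, for instance), so the list is a prompt for a human to look, not a verdict.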

But what made me shake my head all afternoon was something else – the lengths I had to go to in manipulating my websearches and the places I found some of them. Gaelic I had, Welsh, Albanian and Cornish came off Skype’s forum. Basque (normally a rather well organized language) I found embedded as a .obj file on some archived forum post. Adyghe, Chuvash and Erzya came off some websites that looked a bit like a forum where someone had posted, in the case of Erzya without linebreaks, the translations – in two cases, with the Russian strings still embedded so I had to strip those out first before creating the .lang files. Armenian came out of a public DropBox and Breton off the Ofis ar Brezhoneg website. Afrikaans was on some unlinked page on someone’s personal website. Esperanto was on the Wiki of the Universala Esperanto Asocio but it took me some time to figure out that in order to get the strings, I had to trawl through the page history as someone had at some point – accidentally or deliberately – deleted them. Mirandese and Nias were in some silent loop on abandoned university websites – probably student projects from long ago. And one came off a file sharing site, I forget which, making me seriously wonder if I was downloading porn, a virus or actually the .lang file. I actually even found Kurdish but the people who did that seem to have accidentally stripped out the string names so having explained the problem, they’re trying to match them together again as my Kurdish isn’t that baş.

I didn’t quite know whether to congratulate myself or whether to cry. All that effort, all those wonderfully selfless people putting their time and effort into translating something into their language. And then, because the people making money off it couldn’t be bothered, we ended up with these needles in the cyberhaystack. Crying is still an option I feel…

It’s nice to know they’re on SourceForge now (check out SkypeInYourLanguage) and that there’s a few people willing to put some time into making the process a bit better but by gum guys… if people are actually willing to help you make more money by making your product available in more languages, how about giving them a leg up, rather than the finger?

Wishful thinking à la Bretonne

03/02/2013 8 comments

Have you noticed that sometimes developers DO get it right but then are faced with strange user behaviours? No, I’m not talking about developers thinking that something should be the case, which isn’t. I’m talking about a strange chain of events on Facebook which makes me doubt the motivation of some language activists (yes, we’re allowed to self-criticize guys!).

We all know about Facebook. What we don’t all know about Facebook is that they have a pretty bizarre approach to translations (we can hardly call it localization…) and I don’t mean the fact that they, for the most part, rely on community volunteers. No, it’s the process. There’s no clear process of adding or registering a new project and heaven knows how they actually pick the languages. At one point, Rumantsch was in (it now isn’t, no idea how it got in or why it’s now out, it’s a fairly small language with between 35,000 and 60,000 speakers), as are Northern Sami, Irish, Mongolian and the usual big boys, including some questionable choices like Leet Speak and Pirate. So most languages are out. Not surprisingly, this has led to a number of Facebook groups and campaigns by people trying to get their languages into the project. There used to be a project page full of posts along the lines of “please add my language” and “how do we get Facebook to add our language?” – universally met with thundering silence. Admins were rarer than Lord Howe Island stick insects.

Back in whenever, a chap called Neskie Manuel had a crafty idea about getting his language, Secwepemctsín, onto Facebook. Why not, he figured, find a way of overlaying Facebook with a “translation skin” in order to make the process of translation (and in this case even localization) independent of Facebook & Co? It was a neat idea, which was somewhat interrupted by his sad and untimely death.

Now, round about the same time, two things happened. The Bretons set up a “Facebook in Breton” campaign. Fair enough. And a chap called Kevin Scannell took on board Neskie’s Facebook idea. Excellent. Before too long, the Facebook group had over 12,000 members and Kevin had released his script for a slew of amazing languages. It overlays not all of Facebook but just the most visible strings (the ones we see daily, not the boring EULAs and junk). Even more amazingly, it can handle stuff Facebook hasn’t even woken up to yet, such as plurals, case marking and so on. Wow indeed.

The languages hailed from the four corners of the planet, from Aragonese, Manx and Nawat through Hiligaynon, Secwepemctsín, Samoan, K’iche’ and Māori to Kunwinjku and Gundjeihmi (two Australian languages). Wow indeed. And, of course, Breton.
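To give a feel for why plurals are such a big deal for a language like Gaelic, here is a small sketch of plural selection. It is not Kevin's script, just an illustration. The rule is the four-form one commonly given for Scottish Gaelic in gettext-style plural headers (1 and 11; 2 and 12; 3-10 and 13-19; everything else), and the templates are placeholders rather than real Gaelic so the chosen form is easy to see.

```python
def gaelic_plural_form(n):
    """Index (0-3) of the plural form needed for a non-negative count n."""
    if n in (1, 11):
        return 0
    if n in (2, 12):
        return 1
    if 3 <= n <= 19:
        return 2  # covers 3-10 and 13-19; 11 and 12 were handled above
    return 3      # 0 and anything from 20 upwards


def pick(forms, n):
    """Choose the matching template from a list of four and fill in the count."""
    return forms[gaelic_plural_form(n)].format(n=n)


# Placeholder templates (hypothetical); a real file would hold four Gaelic strings.
FORMS = ["{n} <form A>", "{n} <form B>", "{n} <form C>", "{n} <form D>"]

for count in (0, 1, 2, 3, 11, 12, 19, 20):
    print(pick(FORMS, count))
```

An interface that only knows "1 thing / n things" has nowhere to put forms B and C, which is why a single-string translation always ends up wrong for some numbers.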

Now here’s the bizarre thing though. Ok, it’s not the full thing but who’d turn down a sandwich while waiting for a roast chicken that might never appear? No one, you’d think, so based on a combined market share of some 50% between Firefox and Chrome, some 200,000 speakers and 12,000 people in the “Facebook in Breton” group, you’d expect what, anything north of 6,000 enthusiastic users of the Breton script. After all, more than 1,100 people installed it in Scottish Gaelic (fewer than 60,000 speakers) and more than 500 people in Manx (way fewer than 2,000 fluent speakers).

A case of “you’d think” indeed. To date, a mind-boggling 450 people have installed it in Breton. As far as I can tell, the translation is good and was done by a single, highly fluent speaker (Fulup Jakez who works for Ofis ar Brezhoneg). So it’s not a quality issue. The scripts work (I use the Gaelic one) so it’s not that either. The Facebook group was notified several times, so it’s not like they didn’t know. Ok, so maybe not all Likes of the group actually are from speakers, fair enough, but glancing through the active posters, a lot of them seem to be in the right “linguistic area”.
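For the number-minded, the back-of-the-envelope sums can be spelled out like this. All figures are the ones quoted in this post; the speaker counts for Gaelic and Manx are "fewer than" figures, so the per-speaker rates for those two are, if anything, underestimates. The variable and function names are just for this sketch.

```python
# Figures as quoted in the post.
SPEAKERS = {"Breton": 200_000, "Scottish Gaelic": 60_000, "Manx": 2_000}
INSTALLS = {"Breton": 450, "Scottish Gaelic": 1_100, "Manx": 500}

GROUP_MEMBERS = 12_000  # size of the "Facebook in Breton" group
BROWSER_SHARE = 0.5     # combined Firefox + Chrome share assumed in the post


def installs_per_thousand(language):
    """Installs of the overlay script per 1,000 speakers of the language."""
    return INSTALLS[language] * 1000 / SPEAKERS[language]


# What you'd expect if every group member on a supported browser installed it.
expected_breton = GROUP_MEMBERS * BROWSER_SHARE
# Fraction of that expectation that actually materialised.
uptake = INSTALLS["Breton"] / expected_breton

for language in SPEAKERS:
    rate = installs_per_thousand(language)
    print(f"{language}: {rate:.2f} installs per 1,000 speakers")
print(f"Expected from the group alone: {expected_breton:.0f}")
print(f"Actual installs as a share of that: {uptake:.1%}")
```

Put that way, the gap is not subtle: per speaker, Breton uptake is a small fraction of the Gaelic rate and a tiny fraction of the Manx one.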

So while the groupies are still foaming at the mouth about the lack of support from Zuckerberg and Co, there’s a perfectly good interim that would allow you to say Kenavo to French and Degemer mat to Breton on Facebook every day. I really don’t get it. Is it really the case that some activists are more in love with the idea of the thing than they are willing to actually use it once it’s around? Or am I missing something really obvious? I sure hope I am…

On a more positive note, I hope the general idea of this type of “overlay” will eventually take off big time. We will never be able to convince the big boys to support all the languages on the planet, all of which are equally worthy of services in their own languages, whether they’re trying to re-grow lost speakers or whether they’re just a small to medium sized community. So having a tool that puts control over what we see on our screens into our hands would be great. No more running from company to company trying to make the case for adding language X, a little less duplication (I don’t know how many zillion times I’ve translated “Edit picture”), better quality and more focus on the important bits of an interface to translate (not the EULA for example… a document that sadly every software company is keen to have translated as soon as possible without ever asking who’ll read it). Ach well, I can hope…