It’s been a strange sort of end to the week. I e-met a new language and came face to face with a linguistic, digital needle in a cyberhaystack. Ok, I’m not making much sense so far, I know… just setting the scene!
We all know Skype, the new version of which (quoting my hilarious brother) “convinces through less functionality and more bugs”. Back when Skype still belonged to itself, I eventually discovered the fact that, at least on Windows, it’s pretty easy to localize. You go to Tools » Change Language » Edit Skype Language file and right down there where everyone can see it, you have the option to save the English.lang file (which contains the English strings) under a new name and add your own translation. So back in 2011 I started working on a Gaidhlig.lang and by early 2012 had finally caught up with all the updates that kept getting in the way.
What does one do when one has completed a translation? Sure, you submit it to the project and ask them to bundle it, release it, whatever. Not so fast, buckoes… Due to “size issues” (I’d like to remind everyone at this point that currently, a full language file weighs in at a massive 400KB), Skype only bundles the usual 20 or so suspects, CJK (that’s Chinese, Japanese and Korean) and a bunch of European languages with the install file. Since they never thought of adding an Install new language function that could pull a file from some repository, the short of it was that even having localized the lot, you were on your own. Sure, you could post the file as an attachment on the forum but then who goes trawling through a forum in search of a language file?
Using the usual “Gaelic” channels, I think we’ve reached a reasonable number of people so far but certainly fewer than we would have reached had it been “inside” the program itself.
But before I knock the old forum too much, I should point out that it actually had a dedicated localization section. Why do I mention this? Because, moving to the next episode where we finally meet Mr Big, when Skype was bought by Microsoft, the forums were wiped and *cough* improved. That’s right, the localization section went. Especially the parts where people were trying very hard to figure out how to turn a .lang file into something that Linux and MacOS could digest. Am I glad I took copies of the bits that were useful…
Anyway, even in the new forum, the localization questions never went away. But the stock answer of the one admin who bothers to check that corner is always that “there’s no news”. In fairness, I don’t think he actually has the power to do anything, he’s just the unfortunate person who has to interact with, shock and horror, the users. So even though Skype was first launched in 2003, here we are in 2012 still asking the same questions – why can’t you bundle our language, why can’t we convert/localize the files for MacOS/Linux and how about frickin plural formatting?
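For anyone wondering why plural formatting keeps coming up: English gets by with two forms, but plenty of languages need more. Here is a rough Python sketch, assuming the Scottish Gaelic plural categories as they are usually given in CLDR (1 and 11; 2 and 12; 3 to 10 and 13 to 19; everything else). The function name is made up for illustration and none of this reflects how Skype handles strings.

```python
def gd_plural_category(n: int) -> str:
    """Return the assumed CLDR plural category for a whole number in Scottish Gaelic."""
    if n in (1, 11):
        return "one"
    if n in (2, 12):
        return "two"
    if 3 <= n <= 10 or 13 <= n <= 19:
        return "few"
    return "other"


# A localizer would need one translated string per category,
# not just a singular and a plural.
for n in (1, 2, 3, 11, 12, 19, 20):
    print(n, gd_plural_category(n))
```

A format with only one string per message simply has nowhere to put the other three forms.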
Yep, “there’s no news”. The chap working on Welsh then had an interesting suggestion – can’t we host them on SourceForge? You see, the problem with distributing the files via the forum is that once your post moves off the first page, who’s going to see it? So, brilliant idea I thought and we went about setting up a project. Nothing fancy, just the .lang files which don’t come bundled with Skype and a few Wiki pages with guidance.
Seeing I had a quiet day and since my contributions in terms of code are… amusing, I decided to hit the web to locate all the .lang files out there, or as many of them as possible anyway – I may suck at code but I rock at websearches! Half a day later, I had the most amazing collection of languages. Some I had known about – Gaelic, Welsh, Cornish, Irish and Uyghur – as their translators had been active on the forum. Some were part of the usual suspects but some were totally unexpected and one I’d never even heard about which is, as a matter of fact, rather unusual. So in the end, we had:
- Uyghur (Perso-Arabic and Latin script)
Definitely wow. Admittedly, not all are complete but it’s still one of the most diverse lists I’ve ever come across, even if there are no languages from the Americas in the list. Especially Adyghe, Chuvash and Erzya are not languages you normally see on localization projects. And Nias I had never even heard about. Turns out it’s a language of some 700,000 speakers off the coast of Sumatra. That certainly cheered me up. Yeah I know, geek 🙂
But what made me shake my head all afternoon was something else – the lengths I had to go to in manipulating my websearches and the places I found some of them. Gaelic I had, Welsh, Albanian and Cornish came off Skype’s forum. Basque (normally a rather well organized language) I found embedded as a .obj file on some archived forum post. Adyghe, Chuvash and Erzya came off some websites that looked a bit like a forum where someone had posted, in the case of Erzya without linebreaks, the translations – in two cases, with the Russian strings still embedded so I had to strip those out first before creating the .lang files. Armenian came out of a public DropBox and Breton off the Ofis ar Brezhoneg website. Afrikaans was on some unlinked page on someone’s personal website. Esperanto was on the Wiki of the Universala Esperanto Asocio but it took me some time to figure out that in order to get the strings, I had to trawl through the page history as someone had at some point – accidentally or deliberately – deleted them. Mirandese and Nias were in some silent loop on abandoned university websites – probably student projects from long ago. And one came off a file sharing site, I forget which, making me seriously wonder if I was downloading porn, a virus or actually the .lang file. I actually even found Kurdish but the people who did that seem to have accidentally stripped out the string names so having explained the problem, they’re trying to match them together again as my Kurdish isn’t that baş.
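The Kurdish problem (translations that have lost their string names) is the sort of thing a few lines of script can flag early. A minimal Python sketch, assuming a .lang file is plain text with one `NAME=translation` entry per line; the function names and the sample string names below are invented for illustration, not real Skype identifiers.

```python
def parse_lang(text: str) -> dict:
    """Read NAME=translation lines into a dict, skipping lines without '='."""
    entries = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        # Split on the first '=' only, so translations may contain '=' themselves.
        name, value = line.split("=", 1)
        entries[name.strip()] = value.strip()
    return entries


def missing_names(reference: dict, translation: dict) -> list:
    """List the string names present in the reference but absent from the translation."""
    return [name for name in reference if name not in translation]


# Hypothetical sample data standing in for English.lang and a partial translation.
english = parse_lang("sHELLO=Hello\nsBYE=Goodbye\n")
gaelic = parse_lang("sHELLO=Halò\n")
print(missing_names(english, gaelic))
```

Run against the English file, a check like this shows at a glance how complete a translation is, or whether its string names have gone missing altogether.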
I didn’t quite know whether to congratulate myself or whether to cry. All that effort, all those wonderfully selfless people putting their time and effort into translating something into their language. And then, because the people making money off it couldn’t be bothered, we ended up with these needles in the cyberhaystack. Crying is still an option I feel…
It’s nice to know they’re on SourceForge now (check out SkypeInYourLanguage) and that there’s a few people willing to put some time into making the process a bit better but by gum guys… if people are actually willing to help you make more money by making your product available in more languages, how about giving them a leg up, rather than the finger?
The debate about digital technology and localization and internationalization has probably raged in one form or other ever since someone invented the first program. Mind, for me personally it goes back to that ill-fated moment when ASCII was born with some bright spark arguing that no one would ever need more than those few letters that English has. My first computing headaches were around ASCII – how do I do an /ɣ/ and what the heck was %73£ when someone typed it at the other end?
Much has happened since and I’ve moved from phonology to software translation big time but I still can’t quite decide whether we’re in a better place now or not when it comes to small languages. Those technicalities (like ASCII vs Unicode) aside, the field has indeed opened up, in particular when it comes to open source software. There’s nothing but laziness that stops a language from having at least an office suite (LibreOffice), a browser (Firefox or Opera), an email client and calendar (Thunderbird and Lightning), a media player (VLC), a wiki (MediaWiki), a spellchecker, a forum package (phpBB) and blogging software (WordPress.org and .com) – satisfying a fair chunk of your average user. For the really tough there’s Linux in all its scary glory of course. Ignoring the height of the bar when it comes to actually localizing some of them, that’s not the whole story though.
At least in digitized countries, a significant chunk of our work and social lives has shifted onto various digital platforms. Desktops, laptops, smartphones, tablets… you name it. Hardly a year goes by without some innovation hitting the headlines. And the tech-savvy (overwhelmingly the young) have become real digital nomads. Yesterday’s app is so passé today and today’s market leader mobile phone OS may be tomorrow’s digital roadkill (anyone remember Symbian?). It’s a bewildering, fluid place.
It’s a place we can’t ignore. Whether we like it or not, virtually anyone under the age of 25 has a smartphone, from rocky outcrops in the Western Ocean like Barra to the mountains of Gipuzkoa, the deserts of Arizona and the steaming hills of Papua New Guinea. Ok, maybe not Papua New Guinea yet though it wouldn’t surprise me. The more of a space we can carve out for our languages and cultures, the better because sadly the old maxim of “Use it or lose it” – or however your language puts that – is true.
So we must compete somehow, at least at some base level. But I increasingly feel that without a small but dedicated full time team, this will become harder and harder unless there’s some magic on the way that I haven’t heard about. Let me give you an example. Predictive texting goes back to the 1970s, believe it or not, but without wanting to be too depressing about it, it probably did not make huge inroads into our lives before the year 2000 or so when it really took off on phones. Back then, you had those languages which your manufacturer deemed appropriate, maybe a dozen or so if you were lucky. We’re now in 2012 and I’m waiting with bated breath for the first release of Irish, Scottish Gaelic and Manx on Adaptxt which, after much searching, I discovered last year. Finally an open source predictive texting project open to any language. Yay! Ok, so it only works on Android… I can live with that, looking at the Android market share. It would be good if iPhones also supported 3rd party entry methods but they don’t and I’m getting to the cheesed off stage with Apple’s approach to non-billion-speaker-languages anyway.
But I digress. There we are, happily preparing the tool which will finally take Scots Gaelic and Manx out of the letter-by-letter age (Irish has had Téacs since 2008 but I’m not sure how alive the project is) when Apple starts pushing Siri (that voice recognition thing on iPhones which, by the way, only works if your accent resembles that of the Queen and/or Charlton Heston). I bet my bottom dollar that before long, every major mobile phone manufacturer will be running something similar.
Here, I gnash my teeth. Predictive texting is reasonably easy to do as long as you have a framework you can feed your data into. For example a spellchecker. But it’s taken around a decade for such a framework to grow out of the cyber community. Speech recognition is harder. A lot harder. I have no idea how long it will take for languages such as Gaelic to take that hurdle and even less so of how many of this planet’s 6,000 languages will manage to do so. And that makes it all a little frustrating.
I don’t know what the answer is. Right now, I just feel it would be nice if stuff slowed down a bit. Honestly, how much technological innovation do we need in 12 months? Or rather, how many false summits can we and our languages keep pace with?
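To give a feel for the “easy” half: a toy Python sketch of prefix-based prediction from a word list with counts, which is roughly the kind of data a spellchecker project already holds. It makes no claim about how Adaptxt or any phone keyboard works internally; the function name is hypothetical and the word counts are invented.

```python
def suggest(prefix: str, counts: dict, limit: int = 3) -> list:
    """Return up to `limit` words starting with `prefix`, most frequent first."""
    matches = [word for word in counts if word.startswith(prefix)]
    # Sort by descending count, then alphabetically to keep ties stable.
    matches.sort(key=lambda word: (-counts[word], word))
    return matches[:limit]


# Invented frequencies for a handful of Gaelic words.
counts = {"taigh": 50, "tapadh": 30, "tha": 120, "thu": 80, "madainn": 10}

print(suggest("ta", counts))  # ['taigh', 'tapadh']
print(suggest("th", counts))  # ['tha', 'thu']
```

The hard part is not this lookup but getting a decent word list in the first place, and nothing comparably simple exists for speech.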
And in Beurla? That’s English, for the goidelically challenged. Thing is, I already connect with my Goidelic-speaking friends via many a channel but what I may have to say that’s fit for a blog is actually much less aimed at them.
Thing is, I spotted the great opportunities that Open Software had to offer to small languages a long time ago but when I had a look in, I got nowhere. More about that later. It wasn’t until a chance meeting between an American Irish speaker and myself in a pub in Dublin that I finally managed to get something off the ground with the brilliant help of said Gaeilgeoir, Kevin Scannell, who encouraged me to go back and localize Mozilla Firefox.
That was back in 2009. I’ve since morphed into the Scottish Gaelic localization team for anything from Mozilla to LibreOffice. Surprisingly common scenario, but again, more on that later. 2011 in particular has been a busy year and I now feel that I’ve moved beyond the noob stage to where I’m allowed to have a view or two on some things.
So, Dear Developer, thanks for tuning in and I hope this will provide an insight as to what localization looks like from the other end of the fibreoptic cable!