2015 is bound to be a great year for LOV. In March the project will be four years old, and by then we should have delivered a brand new version and interface, which LOV-Search has been prefiguring for a while now. Be patient: +Pierre-Yves Vandenbussche is working hard on it, and I will not deprive him of the pleasure of announcing it in due time.
Something I would really like to see next year is a real community effort to improve the multilingual aspect of vocabularies in LOV, and that is why I am opening today this discussion category, Multilingualism - Translation. Although the community of vocabulary publishers and users is largely multilingual, an overwhelming majority of vocabularies are still published with labels in a single language, most of the time English. Some of them (more than 40) do not indicate the language of their labels and comments at all, including famous ones such as FOAF, Music ontology, Event ontology and Time ontology. This is of course a very bad practice, blatantly ignoring the diversity of languages.
Out of 469 vocabularies (as of today) in LOV, 419 explicitly use English for labels/comments, and 83 use other languages, among which the leading ones are French (42 vocabs), Spanish (28), German (21), Italian (20), Japanese (12), Portuguese (8), Dutch (7), Russian (6), Czech (6), Greek (5), Polish (5) and Chinese (5). 32 more languages are used by fewer than 3 vocabularies each.
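For readers who want to run this kind of tally on their own vocabularies, here is a minimal sketch in Python. It is not how the LOV figures above were computed: the sample data, the `LABELS` list and the `language_report` function are all made up for illustration, and it assumes the labels have already been extracted as (vocabulary, text, language tag) triples, with `None` standing for a label that carries no language tag.

```python
from collections import Counter

# Hypothetical sample data: (vocabulary prefix, label text, language tag or None).
# None means the label was published without any language tag.
LABELS = [
    ("foaf", "Person", None),
    ("org", "Organization", "en"),
    ("org", "Organisation", "fr"),
    ("org", "Organización", "es"),
    ("time", "Instant", None),
]

def language_report(labels):
    """Return (number of vocabularies per language, vocabularies with untagged labels)."""
    langs_by_vocab = {}
    untagged = set()
    for vocab, _text, lang in labels:
        langs_by_vocab.setdefault(vocab, set())
        if lang is None:
            untagged.add(vocab)
        else:
            # Keep only the primary subtag, so that "en-GB" is counted as "en".
            langs_by_vocab[vocab].add(lang.lower().split("-")[0])
    # Each vocabulary counts once per language, however many labels it has.
    per_language = Counter(
        lang for langs in langs_by_vocab.values() for lang in langs
    )
    return per_language, sorted(untagged)

per_language, untagged = language_report(LABELS)
print(dict(per_language))
print(untagged)
```

Counting vocabularies rather than individual labels matches the way the figures are stated above ("French (42 vocabs)"), and the untagged list is the one a publisher would want to see empty.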
The LinkedGeoData ontology, thanks to its large contributor community, uses more than 40 languages, making it the undisputed champion of multilingualism in LOV ... but unfortunately this vocabulary seems to have been offline for quite a while, and has never met other LOV publication requirements, such as being retrievable from its namespace. The other massively multilingual vocabulary is the DBpedia ontology, which comes in 25 languages, with hopefully more to come, there again thanks to crowdsourcing and the various linguistic instances of DBpedia (125 to date).
But lesser-known vocabularies, even with single publishers, have made an important effort to provide multilingual labels, such as the Military Ontology, which provides labels in 17 languages. Recent vocabularies in the W3C namespace, such as the Core organization ontology, have also made a noticeable translation effort, and we know that +Phil Archer is particularly keen on those issues, given in particular his involvement in European institutions, which makes him aware that "translation is the language of Europe".
All those efforts are still far from what we could dream of: a really multilingual vocabulary ecosystem. Multilingualism is important for at least two reasons. The first and most obvious one is allowing users to search, query and navigate vocabularies in their native language. The second one I would stress is that translating is a process through which the quality of a vocabulary can only improve. Looking at a vocabulary through the eyes of other languages and identifying the difficulties of translation helps to better outline the initial concepts and, if necessary, refine or revise them. Hence multilingualism and translation should be native, built-in features of any vocabulary construction, not a marginal task. And if they are not, vocabulary users and re-users should be willing and able to collaborate on translation into their own language.
How can we improve the current situation?
- As vocabulary creators and publishers, be bold about our own natural languages, and provide labels and comments in those languages along with the "default" English ones as part of the default creation and publication effort. This is not a huge effort compared to the overall time needed to develop a vocabulary/ontology, and as said above, it is likely to improve the vocabulary quality even in its original language.
- Develop services on top of the LOV database and API, allowing collaborative translation of existing vocabularies.