In indic-mashup we had a good discussion on sorting orders and the issues with sorting Indic languages. All the language experts posted their expected data at
http://www.indlinux.org/wiki/index.php/CollationData
I have recently completed the work for mr_IN and it has been upstreamed, so you can check mr_IN sorting in the next glibc releases.
I am blogging this here because it will be useful for linguists who want to test the sorting order for their languages, and it would be nice if we could test and correct the sorting order of all languages. :)
So the first step is to test sorting and file bugs for any wrong sorting order ;) As I have been working on collation for some time, I will surely help in fixing them.
Step 1: Create a text file, for example barakhadi_test
Step 2: Write the data to be sorted into it, with one syllable per line.
The content of your test file will then look like this:
http://pravins.fedorapeople.org/sorting/barakhadi_test
Step 3: Run the following command in a terminal
Syntax: LC_ALL="locale name".utf8 sort "path/test file name"
For Marathi it will be
LC_ALL=mr_IN.utf8 sort barakhadi_test
It will print the sorted data in the terminal, one syllable per line.
If you want to write the sorted data to a file, use the following command instead of the one in step 3:
syntax : LC_ALL="locale name".utf8 sort "path/test file name" > output_file
LC_ALL=mr_IN.utf8 sort barakhadi_test > barakhadi_sorted
http://pravins.fedorapeople.org/sorting/barakhadi_sorted
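The three steps can be put together in one small script. This is only a sketch under some assumptions: the helper have_locale and the scratch directory are made up for illustration, and the five syllables are sample data, not the real barakhadi_test file. The locale check is there because, as far as I know, sort does not complain when the requested locale is missing and simply sorts in plain byte order, which can look like a collation bug when it is not one.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named locale is installed.
# `locale -a` may print mr_IN.utf8 or mr_IN.UTF-8, so case and hyphens
# are normalized before comparing.
have_locale() {
    want=$(printf '%s' "$1" | tr 'A-Z' 'a-z' | tr -d '-')
    locale -a 2>/dev/null | tr 'A-Z' 'a-z' | tr -d '-' | grep -qxF "$want"
}

# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)

# Sample data only: five syllables of the क row, deliberately out of order.
printf '%s\n' 'की' 'क' 'कु' 'का' 'कि' > "$workdir/barakhadi_test"

if ! have_locale mr_IN.utf8; then
    echo "warning: mr_IN.utf8 is not installed; the result may be plain byte order" >&2
fi

# Step 3: sort with the Marathi locale and save the result.
LC_ALL=mr_IN.utf8 sort "$workdir/barakhadi_test" > "$workdir/barakhadi_sorted"

# Sort again in the C locale (byte order) for comparison.
LC_ALL=C sort "$workdir/barakhadi_test" > "$workdir/barakhadi_c"

# Print any lines where the two orders differ; no output means they agree.
diff "$workdir/barakhadi_c" "$workdir/barakhadi_sorted" || true
echo "sorted file: $workdir/barakhadi_sorted"
```

Comparing against the C locale is just one convenient way to see what the locale's collation rules actually change. For a real test, replace the sample syllables with the full data for your language.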
There are other ways to test as well, but I have described the method I use.
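One possible alternative, which is not necessarily among the ones meant above, is to start from a file that is already in the order a linguist expects and ask sort to check it rather than re-sort it. The sketch below uses the standard -c option of sort. The helper name is_sorted_in_locale and the sample syllables are assumptions made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file is already in the order
# that sort would produce under the given locale.
# Usage: is_sorted_in_locale LOCALE FILE
is_sorted_in_locale() {
    LC_ALL="$1" sort -c "$2" 2>/dev/null
}

# Sample data only: the order a linguist expects, one syllable per line.
expected=$(mktemp)
printf '%s\n' 'क' 'का' 'कि' 'की' 'कु' > "$expected"

if is_sorted_in_locale mr_IN.utf8 "$expected"; then
    echo "expected order matches the locale"
else
    echo "expected order differs from the locale; worth filing a bug"
fi
```

This only gives a yes or no answer. To see which syllables are out of place, sort the expected file and compare it with the original using diff.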