How to submit data in CLDR - unicode

My name is Nahid Hossain; I am a software engineer from Bangladesh. We want to develop CLDR data for the Bengali language as used in Bangladesh, but we are having difficulty figuring out how to submit it. As far as I understand, I need an account to do so.
1. How do I open a CLDR account?
2. How do we submit our data?
3. Is CLDR limited to bug fixing only?
Regards,
Nahid Hossain

Nahid, http://cldr.unicode.org/index/survey-tool/accounts is the right page to learn how to contribute to CLDR.
You can contribute entirely new data, or correct entries. The bug reporting form may be more appropriate if there is just a single fix needed.
For contributing new locales, there is minimum data required: which script is used, which characters are customary for writing the language, and some basic translations of the language's own name and related names.
Bengali is already available in CLDR, so you would not need to start from scratch there. I recommend requesting an account for Bengali and contributing to the process.

Related

Add a new language to OpenEars

I've recently started studying OpenEars speech recognition and it's great! But I also need to support speech recognition and dictation in other languages such as Russian, French and German. I've found that various acoustic and language models are available for them.
But I cannot really tell whether that is all I need to integrate support for an extra language in my application.
The question is: what steps should I take to successfully integrate, for example, Russian in OpenEars?
As far as I understand, all the acoustic and language model files for English in the OpenEars demo are located in the folder hub4wsj_sc_8k. The same files can be found in the VoxForge language archives, so I simply replaced them in the demo. One thing is different: the English demo also contains a 2 MB sendump file, which is not present in the VoxForge language archives. There are two other files used in the OpenEars demo:
OpenEars1.languagemodel
OpenEars1.dic
These I replaced with:
msu_ru_nsh.lm.dmp
msu_ru_nsh.dic
since .dmp is similar to .languagemodel. But the application crashes without any error message.
What am I doing wrong? Thank you.
From my comments, reposted as an answer:
[....] Step 1 for issues like this is to turn on OpenEarsLogging and verbosePocketsphinx, which will give you very fine-grained info on what is going wrong (search your console output for the words error and warning to save time). Instructions on doing this can be found in the docs. Feel free to bring questions to the OpenEars forums [....]: http://politepix.com/forums/openears You might also want to check out this thread: http://politepix.com/forums/topic/other-languages
The solution:
To follow up for later readers, after turning on logging we got this working by using the mixture_weights file as a substitute for sendump and by making sure that the phonetic dictionary used the phonemes that were present in the acoustic model rather than the English-language phonemes.
The full discussion in which we accomplished this troubleshooting can be read here: http://www.politepix.com/forums/topic/using-russian-acoustic-model/
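For later readers, the phoneme mismatch described above can be checked before running the app. The following is only a minimal sketch, not part of OpenEars: it assumes a Sphinx-style dictionary in which each line is a word followed by whitespace-separated phonemes, and it assumes you already have the set of phonemes your acoustic model supports. The function name and the sample data are hypothetical.

```python
def find_unknown_phones(dictionary_lines, model_phones):
    """Map each word to the phonemes it uses that the acoustic model lacks."""
    problems = {}
    for line in dictionary_lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, phones = parts[0], parts[1:]
        unknown = [p for p in phones if p not in model_phones]
        if unknown:
            problems[word] = unknown
    return problems


# Hypothetical sample data: a tiny phone set and two dictionary entries,
# one of which still uses English-language phonemes.
model_phones = {"p", "r", "i", "v", "e", "t", "SIL"}
dictionary_lines = ["привет p r i v e t", "hello HH AH L OW"]

print(find_unknown_phones(dictionary_lines, model_phones))
```

Any word reported here uses phonemes the model does not know, which is the situation the troubleshooting above ran into.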
UPDATE: Since OpenEars 1.5 was released this week, it is possible to pass the path to any acoustic model as an argument to the main listening method, and there is a much more standardized method for packaging and referencing any acoustic model so you can have many acoustic models in the same app. The info in this forum post supersedes the info in the discussion I linked to in this answer: http://www.politepix.com/forums/topic/creating-an-acoustic-model-bundle-for-openears-1-5-and-up/ I left the rest of the answer for historical reasons and because there may be details in that discussion that are still useful, but it can be skipped in favor of the new link.

Translate ASP pages on the fly

Recently I came across a software solution which is developed in ASP and works only in Internet Explorer. The software is in English, and I need to translate it into another language in order to present it to the audience more effectively.
The problem is that the software was developed without using resource files, so all the words, sentences, etc. that have to be translated are in the code, and we would have to go line by line to do the translation.
Do you know if there is an IE plugin which can translate the ASP pages into any language according to our input?
Thank you in advance!!!
You may want to take a look at Google Translate: https://translate.google.com/manager/ It runs on the client side and seems to translate pages well enough.

Clients want to copy/paste from word processors; rich text editors will make it a mess. How do we solve this?

After years of experience with custom-made CMS systems, I have come to these conclusions:
Clients really want to copy and paste information from word processors into their website CMS. They don't like to create large texts in a website box, and prefer to do so from their good old word processor. Or they simply have their text already prepared for other purposes, and therefore want to copy and paste.
Clients do not like to lose their format. They've spent time on their boldface text, headings, etc, and they do not like to do this all over again.
Rich text fields (TinyMCE, CKEditor, etc.) are not yet able to properly convert all formatted text into the right HTML. I do not blame them; this has to be very difficult given the odd 'source code' that word processors put on the clipboard. But having read all the SO topics about rich-text-related issues, I feel this is a known limitation.
What do you do in such cases? I've tried the following:
Explain to the client beforehand that this is not a word processor we are implementing, and that it has limitations. They understand, but still want to copy and paste.
Only show very few buttons for formatting (bold, italic, links). That way, we can strip the tags and clean this up quite well, and this limits issues. Works better, but clients keep asking for font options, more colors, headers, etc.
So not a really good solution in sight. Are there others who have tackled this issue successfully?
One solution (and probably the best I've come up with) is to post-process the pasted content. So, catch the publish event and correct all the crappy HTML -- catch all the "mso-normal" styles, for instance, and remove them. You'd have a set of rules which clean stuff coming out of, say, MS Word.
Though, this is not just a word processing problem. You're pasting from one rich text editor to another, and styles just don't transfer between rich editing environments. This is not so much a technical problem as it is a logical problem.
Update: Someone pointed me to this: Copy-Pasting Word to your Web CMS. No real solutions, but just confirmation that it's a sticky problem.
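As a rough illustration of the rule-based post-processing suggested above, here is a minimal sketch in Python. The rules and the function name are my own assumptions about typical Word clipboard markup (conditional comments, `<o:p>` tags, `Mso*` classes, `mso-*` style declarations), not a complete or authoritative list. Regular expressions are used only to keep the example short; a real HTML parser or sanitizer would be more robust.

```python
import re


def clean_word_html(html):
    """Apply a few hypothetical cleanup rules to HTML pasted from MS Word."""
    # Drop Word's conditional comments, e.g. <!--[if gte mso 9]>...<![endif]-->
    html = re.sub(r"<!--\[if.*?<!\[endif\]-->", "", html, flags=re.DOTALL)
    # Drop Office namespace tags such as <o:p> and </o:p>, keeping inner text
    html = re.sub(r"</?o:[^>]*>", "", html)
    # Drop class attributes such as class="MsoNormal" (quoted or unquoted)
    html = re.sub(r'\s+class="?Mso\w*"?', "", html)

    def strip_mso(match):
        # Keep only the style declarations that do not start with "mso-"
        decls = [d.strip() for d in match.group(1).split(";") if d.strip()]
        kept = [d for d in decls if not d.lower().startswith("mso-")]
        return ' style="{}"'.format("; ".join(kept)) if kept else ""

    # Rewrite each double-quoted style attribute, or remove it if left empty
    html = re.sub(r'\s+style="([^"]*)"', strip_mso, html)
    return html


pasted = ('<p class="MsoNormal" style="mso-margin-top-alt:auto; '
          'font-weight:bold">Hello<o:p></o:p></p>')
print(clean_word_html(pasted))
```

A function like this could run when the publish event fires, so the editor itself stays untouched and the rule set can grow as new junk patterns show up.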
I totally agree with you:
Last week I did a very interesting test with a customer for whom I had to prepare some demos of .NET-based CMS systems (Umbraco, Sitefinity, DNN, Composite C1, etc.). The customer himself had a Drupal-based site, and I was ashamed that none of my CMS demos did a 100% job with a complicated Word table (ceteris paribus: I did no CMS fine-tuning and used every CMS out of the box). The worst part was that his Drupal CMS did a 100% good job! The table looked exactly the same as it did in Word. For a client who works a lot with Word, that made my CMSes a showstopper. Of course there are a lot of discussions on the web along the lines of 'you should not copy from Word' or 'do NOT use Word for CMS things'. The fact is: clients work with Word, so we should deal with it.

I have to redesign a website in Joomla from HTML. Basically the HTML site is in 2 languages, English and French

I have to redesign a website in Joomla from HTML. Basically the HTML site is in two languages, English and French. The problem is that the menus and blocks differ between the two language versions of the site. If the menus and the rest of the design were the same, I could easily do this using JoomFish. Please tell me how I can manage this.
The answer to your question really comes down to a choice between two options:
(1) Having a multi-lingual site. This type of site is manually coded to contain duplicate menu items and articles, each written "by hand" in the native language. If your site is one where the language contains nuances that would be missed by auto-translation software like JoomFish, then you had better go this route. There is more planning involved for such a site.
(2) Having a single language site that can be translated to another language using a translator like JoomFish (or a dozen others). If the language in your project is not specifically nuanced, you might consider this route, as it will be FAR easier to build. Translation software like JoomFish does a pretty good job. I even use the translation built into CometChat for some of my clients, and they're pretty happy with the results. The translations aren't 100% perfect, but most web viewers understand this nowadays.
You might consider reading this article:
http://docs.joomla.org/Adding_multi-language_support
And then look at this directory:
http://extensions.joomla.org/index.php?option=com_mtree&task=listcats&cat_id=1838&Itemid=35

How to handle localization of controller names?

I run a site where it is important to have good and simple URLs that need to be localized.
Example for the English version:
example.com/car/?type=fiat
Example for the Swedish version:
example.se/bil/?typ=fiat (bil is car in Swedish)
And of course I would like to handle all of these URLs from the same codebase. What is the best way to handle this?
Should I set up several controllers (CarController, BilController) or is there a "cleaner" way to handle localized controller names?
BR
Niklas
Don't do that. Ever.
Microsoft, a really big, powerful and resourceful company, tried that with Excel. In English versions of Excel, you use IF() in formulas. In the German version, it's WENN(). In French, it's SI(). In Japan, it's probably ば(). Now imagine someone from Japan sends me an Excel sheet ... There are two options:
1. "I'm sorry, I can't open this file"
2. Translate all names on the fly
Doing #2 seems simple enough ... until you run into a word which is spelled the same but has a different meaning in two languages. For example, "see" means "look" in English and "lake" in German. Since you don't know all the languages in the world, you have no chance to figure out which collisions you will have before it is too late.
Also, how do you know which name to use? From the language in the browser? Or do you hate your international customers who occasionally use the Swedish main site? How do you handle Asian languages? Will the URL be server/%E6%AC%80%E6/?%AD%81%E6%AB=fiat?
Don't. Do. That. Ever.
What about rewriting the URL depending on the domain? This way, the Zend framework will get only the English names, while the URL can use localized names.
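To make the rewriting idea concrete, here is a minimal, framework-independent sketch in Python. It is not how Zend does routing: the `ROUTES` table, its contents and the function name are hypothetical, and in practice the same mapping could live in web server rewrite rules or a front-controller plugin. Each domain gets its own table of localized names, so the application only ever sees the English names.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical per-domain tables: localized name -> canonical (English) name
ROUTES = {
    "example.se": {"paths": {"bil": "car"}, "params": {"typ": "type"}},
    "example.com": {"paths": {}, "params": {}},
}


def to_canonical(url):
    """Rewrite a localized URL into the path and query the application expects."""
    parts = urlsplit(url)
    table = ROUTES.get(parts.hostname, {"paths": {}, "params": {}})
    # Translate each path segment; unknown segments pass through unchanged
    segments = [table["paths"].get(s, s) for s in parts.path.split("/")]
    # Translate query parameter names; values are left as they are
    query = [(table["params"].get(k, k), v) for k, v in parse_qsl(parts.query)]
    path = "/".join(segments)
    return path + ("?" + urlencode(query) if query else "")


print(to_canonical("http://example.se/bil/?typ=fiat"))
print(to_canonical("http://example.com/car/?type=fiat"))
```

Because the lookup is keyed by domain, a word that means different things in two languages is only ever translated with the table of the site it arrived on.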