Lex Neva added a comment - 06/May/08 09:39 AM
Oh please, please no. The machine translators currently in existence do a woefully poor job when it comes to translating conversations. People regularly come to me trying to ask questions about my products through a machine translator, and it's pretty much uniformly impossible to understand them. Worse yet, when people are led to believe these things work, they quickly become mad at me if I'm unable and unwilling to answer their question because I can't understand them. Putting something like this in the client would be an implicit statement by LL that machine translators can facilitate meaningful conversation, which is not the case.
Personally, I would like to see this option.
LL is already collecting localization information, which could help with this. The machine translators are dodgy, but with patience they can be used to provide meaningful help. Tonight I spent more than an hour talking with a German visitor, with only occasional help from a native speaker of English and German. I have used several translators, and many people carry them. The problem with bad translations is expectation. And, given that I just had a meaningful conversation with an international friend via a machine translator, your assertion that it is not possible is false in my eyes, and surely in others'.

I would like to set my "Language Preferences" to a list of languages I do not want translated, and have anyone with other language preferences who chats at me be auto-translated. The messages sent could read, for example:

Add another goofy first-time popup: "By enabling auto-translation, you acknowledge that machine translation is not very good and that LL does not control the accuracy or suitability of any translated message. Use at your own risk. Yes/No?"

If someone has the automatic translator OFF, then return a warning to the speaker (once) saying "the person you IMed does not speak (language here) and does not rely on machine translators. Please find alternate means to translate." This would also reduce a lot of ugly chat clutter from people using machine translators: another viewer experience win, especially for open chat.

This is do-able with the software, libraries, and services out there, and will enable whole new realms of potential friendships and interactions between people who do not automatically know that freebie and pay-for translators exist in SL.

I vote yes because it would be a hugely useful feature for me, and is something I've wished to have in the viewer MANY times. There are compelling short-term and long-term reasons to include a translator in the viewer.
I just attended Helen Keller Day this weekend to help support the blind and hard of hearing in Second Life with my "Ferd's Free Translator" and my new "Chat-to-Speech" translator, which I developed specifically for the Helen Keller Day event. This event only reinforced my belief that anyone unable to read or write another person's language in Second Life has a disability. Lex Neva, while trying to converse above with a customer, was by my definition a disabled person, similar to the way that an illiterate person, a blind person, or a mute person is disabled. If I cannot read another's language, then I am the illiterate person. If I cannot speak another language, then I am mute, just as Helen Keller was. 20% of the Second Life users are disabled in some manner. All of us are disabled if we do not speak the right language. I believe that Second Life should be inclusive of all users regardless of disability. This JIRA is a good place to discuss the needs of the community.

This effort to add translation to the viewer is already under way. Dr. Zhang at Carnegie Mellon University has been working on a project like this for some time. Dr. Zhang's version uses CMU's own translation engine, which may be easy to adapt to web-based engines such as Systran from Microsoft, or Google Translate. I am eagerly awaiting a copy of his code. Peter Seebach built a Linguaphile version (local hard drive translation) in his viewer. I have been working with others for 6 months on my own project. This is going to happen.

Useful links:

Client side translation is needed for many, many reasons. I have seen stats that show that over half of Second Life users now come from a non-English-speaking country. English-speaking countries are now the 'new minority'. Yet English is the language of most of Second Life. If you speak English, you will do well. You can speak to many others. You can read most of the note cards. Everyone else has a language barrier that is far worse.
Do you want to know what the Second Life experience is for a 'typical' Second Life user? Try to imagine coming from a minority country, such as Iran, to almost any land in Second Life, where 99.9991% of the residents do not speak Farsi. Do as I did: teleport to a Japanese or other foreign Infohub and try to interact. Nothing makes sense to an English person. Yes, the view is gorgeous, and the girl avatars look great, but you won't know what the signs say or the girls think. You won't be able to understand the note cards, the text chat, the IMs, or the messages from objects. Automatic translation of chat would be a huge step in the right direction for all users of Second Life. Second Life should be inclusive of all users.

Universal Translation is possible:
Chat is actually a small portion of the need for viewer-supported translation. A note card could be automatically opened in the user's chosen language. Google does a great job of dynamically translating large amounts of text. This JIRA eliminates copy and paste. More importantly, this JIRA eliminates the need to know how to copy and paste note cards into Google. Universal translation can be available to all parts of Second Life.

Web links in the built-in and external web browser can be internationalized:
A spin-off of my Ferd's Free Translator is already in use on the Second Life wiki. The very clever avatar Zai Lynch made the top nav bar within hours of my posting example code with a plea for help. The wiki nav bar automatically translates the web pages on the wiki to any language. This same feature can easily be added to the built-in viewer in Second Life for any web page. The format is simple. Since most web page links in Second Life are in English, this simple URL would work in almost all cases. If not, Google handles the problem gracefully.
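As a rough illustration of how a viewer could wrap any web link in a translation link, here is a minimal Python sketch. It assumes the 2008-era `translate.google.com/translate` endpoint with its `u` and `langpair` parameters as quoted in this comment; the `source|target` form of the `langpair` value, the function name `translation_link`, and the default language codes are assumptions made for the example, not something the comment spells out, and the endpoint may no longer behave this way.

```python
from urllib.parse import urlencode

# Endpoint as quoted in the comment (2008-era; may no longer work this way).
TRANSLATE_ENDPOINT = "http://translate.google.com/translate"


def translation_link(page_url: str, source: str = "en", target: str = "es") -> str:
    """Wrap page_url in a translation link.

    Assumption: langpair takes the form "source|target". The comment only
    shows "langpair=en", so the target half is illustrative.
    """
    query = urlencode({"u": page_url, "langpair": f"{source}|{target}"})
    return f"{TRANSLATE_ENDPOINT}?{query}"


if __name__ == "__main__":
    # Hypothetical usage: a Spanish link for this JIRA entry.
    print(translation_link("http://jira.secondlife.com/browse/SVC-2299"))
```

`urlencode` percent-encodes the wrapped URL, so links that carry their own query strings survive being passed through the `u` parameter.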
http://translate.google.com/translate?u=URL&langpair=en

This URL translates this JIRA into Spanish:

http://translate.google.com/translate?u=http://jira.secondlife.com/browse/SVC-2299&langpair=en

This works in both the external and the built-in browser. Web link translation is important enough and simple enough that it possibly qualifies for its own JIRA. A 'do not allow translations from XX language' option should not be implemented. Anyone can easily bypass it by using http://translate.google.com

Machine Translation is communication:
Lex's statement that someone using a translator is "uniformly impossible to understand" is simply not true. There is certainly truth in the opposite: not using a translator means ZERO communication will occur. (Caps lock shouting at them, like I just did to you, does not work. Sadly, I have seen SHOUTS used far too often.) I have had great results with the Google API used in my 'Ferd's Free Translator'. My partner, WavinggirlsAV Voom, and I have successfully conversed with literally thousands of people. We routinely speak to dozens of people at the same time, in a dozen different languages. Yes, there are some words that do not translate well. Some words will always be that way. Machine translation will get better. The more messages that flow through a statistical engine such as Google, the better the translation will become.

Better help will be available:
I put a "10 rules for machine translation" note card in my device. Few people find the note card. The notes tell you to use simple words. Use one thought in one phrase. Avoid slang. Use punctuation. Avoid the word 'it'. Speak in active voice. Use repetition. These suggestions, and others, should become a part of the "F1" help button.

Privacy should be respected:
There is a switch setting in the Ctrl-P menu to allow your language to be shown to the llGetAgentLanguage() function. Many choose not to show it. Some people do not want their country of origin to be known.
I assume that suppression of free speech in some countries is a reason. You could be arrested for using the wrong words. Depending upon your choice, your PC setting may show up as your country of origin to my llScript-based translator. The Restricted Life BDSM viewer does. Not only do I know how kinky you are, I can narrow you down to a country by using the llGetAgentLanguage() function. I chose to convert this country code to a language, and not to display a country of origin, for these privacy reasons. Privacy should be respected for all users. A client-side translator should translate and then send the response to the recipient, and not send the original chat message on channel 0. This 'no echo' option should be switch selectable. Your privacy in being an English speaker should be respected, too, as English speakers are not welcome in some parts of Second Life. Everyone's privacy should be respected.

User Interface would be consistent for all:
This JIRA would standardize the way that machine translation is done and controlled so that everyone knows how to use it. The seven major translators that I have tested all have different UIs and features, which is confusing even to the experienced person. Consistency would allow any experienced user to teach a new user how to use the built-in translator. Consistency is good.

New User Experience (NUE) would be greatly improved:
I have seen hundreds of people come for my FFT as soon as they arrive (sometimes I think they get sent the landmark by people that just want to get rid of 'them'). They do not know how to click things. They do not know how to open up the mysterious, English-only inventory (which is 'stock' in Japanese). They do not know how to wear, attach, attach to HUD, or drag and drop an item onto their avatar. If they do, they drop it on their head and remove their hair. I have spent 6 months experimenting with signs and graphics to help them.
Even with a lot of work, many people just do not 'get it', and often they do not get it for a very long time. Built-in machine translation would be a huge first step in retaining new users and improving the NUE. The Help menus and the web site can be auto-translated, easily.

Auto-Detection and language translation will get better:
The current SL viewer supports only 10 languages. The llGetAgentLanguage() system call returns 3 possible results: the language chosen at installation, the language of the PC, or Null. There are 4 platforms, too: Linux, PC, and two different Macintosh code bases. I have identified over 150 language codes in-world. Auto-detection of the typed language, and a consistent, Linden-supported use of the PC language ISO code setting, could be greatly improved with this JIRA. Google supports 42 languages (43 if you count Mandarin Traditional and Mandarin Simplified separately). The Microsoft (formerly Yahoo) Systran API is also available. They use different techniques; Google is statistical, and Microsoft is heuristic. They both work well. They get better over time. They will continue to get better. I have watched huge improvements at Google happen in just a few months. We can, and should, use the leverage of Google and their 5 billion word database.

Spam Reduction would be total:
Use of the llInstantMessage() function, such as I use in Ferd's Free Translator, cannot spam. It sends an IM only to the end user that is speaking a different language. No one speaking my language knows I am speaking in any language other than what I type. Yet the original message is still visible to all nearby. English typing can be considered to be spam by foreign speakers. If a check box were included to enable/disable original chat, it would reduce the original 'foreign language' chat spam to zero. This same reduction applies to outgoing spam, as well as incoming. This is a useful feature that is only available if the translator is client-sided.
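The delivery rule described under "Spam Reduction" can be sketched roughly as follows. This is only an illustration in Python, not how Ferd's Free Translator or the viewer is implemented: the function `route_chat`, the `echo_original` switch (standing in for the proposed check box), and the `translate` callable are all hypothetical names, and a real translator would call an external engine instead of the stand-in used here.

```python
from typing import Callable, Dict, List, Tuple

# (recipient name, text shown to that recipient)
Delivery = Tuple[str, str]


def route_chat(
    message: str,
    speaker_lang: str,
    listeners: Dict[str, str],
    translate: Callable[[str, str, str], str],
    echo_original: bool,
) -> List[Delivery]:
    """Decide what each nearby listener sees.

    listeners maps a listener name to that listener's language code.
    Listeners who share the speaker's language only ever see the original.
    Other listeners get a private translation, plus the original only when
    echo_original (the hypothetical check box) is switched on.
    """
    deliveries: List[Delivery] = []
    for name, lang in listeners.items():
        if lang == speaker_lang:
            deliveries.append((name, message))
            continue
        if echo_original:
            deliveries.append((name, message))
        deliveries.append((name, translate(message, speaker_lang, lang)))
    return deliveries


def fake_translate(text: str, source: str, target: str) -> str:
    # Stand-in for a real engine: it only tags the text with the language pair.
    return f"[{source}->{target}] {text}"


if __name__ == "__main__":
    nearby = {"Anna": "en", "Kenji": "ja"}
    for who, text in route_chat("Hello", "en", nearby, fake_translate, False):
        print(who, text)
```

With the switch off, a listener in another language receives one line (the translation) instead of two, which is the clutter reduction the comment argues for.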
Instant Messages will work:
This JIRA would allow IMs to be translated. This could reduce the chat clutter to zero! Scripts cannot read IMs. IMs require copying and pasting to and from hidden channels, such as the widely supported /1 and /2 channels. The FFT, the QT, MH, UT, XT and other translation products use these private channels to allow copying and pasting of IMs. These channels are already spammed and clutter the screen with the machine translations of commands to other scripts. Automatic IM translation is the #1 request I get. This is a useful feature that is only available if the translator is client-sided.

HTTP Limits would be reduced:
The limits on translations per second in Second Life would be eliminated. This JIRA would eliminate the server HTTP throttle of 25 HTTP fetches per 20 seconds per object. Hank Ramos, author of the free and excellent HR Universal Translator, has made heroic efforts with his UT to attempt to get around this restriction. 100% automatic translators that detect the language that is 'spoken' suffer from this more. All translators using llScript will shut down for 40 seconds or more when in a heavily loaded sim due to the HTTP throttle. Translators that do not detect the HTTP throttle will stop completely until there is a (very) long interval of silence. All translators have this problem: some more, some less. This JIRA would eliminate the HTTP throttle.

Licensing may not be an issue:
Google Brands Permission has given written approval to me for the use of their API in-world and limited use of their trademark in-world. For-sale translators are technically in violation of Google's API Terms and Conditions. Luckily, they do not seem to be enforcing this at this time. Second Life qualifies to use the API as it is free.

No-Script zones would really work:
I spend a lot of time in Hanja helping others. It was sometimes impossible to use a web-based translator because I could not tell what language they were typing.
I originally made the FFT just to detect an avatar's language while in no-script zones, such as the Hanja region, so that my partner and I could get them badly needed help. A client-side translator would obviously work in no-script zones, with no 'vehicle' hacks required. This anti-no-script script does not always work. If you attach a translator while in a no-script zone, you must fly above 160 meters to activate it, then 'drive' it back down. This interrupts the flow of the 'game'. A viewer translator would work everywhere.

Server Load would be reduced:
A client-side translator would lower script times in the servers to zero. My translator is built for speed and parallelism. Google does most of the work. Yet there is extensive code for queuing and for managing Unicode to UTF-8 conversion that must run on the server. This load would be eliminated.

Font support could be hugely improved:
Unicode to UTF-8 was one of my more complex subroutines. I know that some other translators do not handle the umlauts, accents, and other special characters well, if at all. To be 100% effective, the user must download Microsoft Office in order to view all 43 possible Google translations (or they must swipe the ArialUNI.ttf font from someone), then press Ctrl-Alt-D to get to the Advanced-Debug settings, copy and paste a critical font fallback entry, and then re-log. If they goof it up, it will require a total un-install and a reinstall. I would hope that the Lindens would license this font. If the switch for 'speak original in chat' were on, it would not affect you if you were missing the font: you would see the [][][] replacement characters, and then see the translation in your language and font. If the switch for 'speak original in chat' were off, it would be a non-issue. But I would hope it could be included.

Arabic, Hebrew, Farsi and Chinese Right-to-Left languages can be accomplished far more easily in the viewer:
The letter order is reversed from right to left in these languages.
For Chinese, it is a preference choice. A simple method is to just print these letters in reverse. This works for most conversation. These languages follow a set of rules that are difficult to incorporate in llScript due to memory and character coding conventions. It was difficult for me to overcome the memory limits involved in these small memory spaces. And it took a lot of effort by many people to test it. Wavinggirlsav Voom told me a joke: "A Jew and an Arab go into a bar....and Ferd follows them in", and it is actually true. As a simple example, numbers should not be reversed. They always read left to right. The Yen, Dollar, and other currency signs, when used with numbers, should not be reversed. A number with decimal points, such as $5.00, should not be reversed. In Unicode encoding, all non-punctuation characters are stored in writing order. This means that the writing direction of characters is stored within the characters. A three-pass algorithm is used where each character is judged to be neutral, 'strong', or 'weak'. For further details, see this link: http://en.wikipedia.org/wiki/Bi-directional_text

To summarize, there are compelling short-term and long-term reasons to include a translator in the viewer.

I think we need to try this out and see if it is a good thing.

I've created a new entry ... http://jira.secondlife.com/browse/SNOW-93
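To make the right-to-left discussion above a little more concrete, here is a small Python sketch of the two ideas it mentions: judging each character as strong, weak, or neutral, and the "simple method" of reversing letters while leaving numbers such as $5.00 in left-to-right order. This is not the full Unicode Bidirectional Algorithm and not the llScript code from any of the translators named in this thread; the function names, the currency symbols in the pattern, and the grouping of categories are choices made for the example.

```python
import re
import unicodedata

# Bidirectional categories reported by unicodedata, grouped by strength.
STRONG = {"L", "R", "AL"}
WEAK = {"EN", "ES", "ET", "AN", "CS", "NSM", "BN"}
NEUTRAL = {"B", "S", "WS", "ON"}


def bidi_strength(ch: str) -> str:
    """Classify one character as 'strong', 'weak', 'neutral', or 'other'."""
    category = unicodedata.bidirectional(ch)
    if category in STRONG:
        return "strong"
    if category in WEAK:
        return "weak"
    if category in NEUTRAL:
        return "neutral"
    return "other"  # explicit formatting characters, unassigned code points


# One token is either a number run (optional currency sign, digits, and
# decimal or thousands separators) or any single character.
TOKEN = re.compile(r"[$¥€£]?\d+(?:[.,]\d+)*|.", re.DOTALL)


def naive_reverse(text: str) -> str:
    """Reverse the character order but keep each number run left to right."""
    return "".join(reversed(TOKEN.findall(text)))


if __name__ == "__main__":
    print(naive_reverse("abc $5.00 xyz"))
    print(bidi_strength("a"), bidi_strength("5"), bidi_strength(" "))
```

A real implementation would follow the rules of the Unicode Bidirectional Algorithm rather than a single regular expression; the sketch only shows why numbers and currency amounts need special handling.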
Linden would be willing to fund contract development of this feature; if interested, reply on the thread.