Speech recognition

Dotan Cohen dotancohen at gmail.com
Fri Jul 4 17:58:27 UTC 2008


2008/7/4 Nigel Henry <cave.dnb2m97pp at aliceadsl.fr>:
> I somehow feel that we're still a long way from getting accurate text from speech.
> It works well on Star Trek, where you can have a conversation with the
> computer, and Capt. Picard can submit his log. I remember one Star Trek film,
> where they went back in time, and Scotty tried to talk to an ancient
> computer, then realised that you had to type your request on a keyboard "of
> all things".
>
> Text to speech works ok, but there you have a synthesizer (or is that
> synthesiser), which changes the text to speech. Words like "there", "their",
> and "they're", all sound the same, but have different meanings, but from the
> listener's viewpoint, the context tells you which is which.
>
> Going the other way, speech to text, it's a whole different ballgame. Looking
> at the example above, and assuming that everyone spoke exactly the same way
> (no problems with different dialects), the computer would still need to
> understand the context of what was being dictated, so as to print "there",
> "their", or "they're". Of course, when you bring different dialects into the
> equation, it gets really complex.
>
> The differences I've found are usually in the pronunciation of vowels,
> which can very often, at first, make it difficult to understand what someone
> is saying, but you sort of get tuned in after a while. But we are human,
> not computers.
>
> In the UK there are some strong dialects, Geordie, Glaswegian (in Scotland),
> and many others. Looking at 2 examples from Wales and Northern Ireland, the
> word "tongue" in Wales is pronounced as "tong", and the word "film" in Northern
> Ireland is pronounced as "filim", and the name of the actor who plays Chief
> O'Brien in Star Trek, whose first name is Colm, is pronounced "Colim".
>
> At this point in time, I personally see a problem in computers converting
> speech to text.
>
> I recently listened to a broadcast on the BBC World Service, "Digital
> Planet", and Amtrak in the US seem to be using speech communication to a
> computer to get info for train times, etc.
>
> I recently had a problem with a parcel not being delivered in France, and
> contacting Chronopost by telephone, was asked to speak my parcel reference number
> into the machine. On the premise that you ask, so I do, I spoke each letter
> and number into the phone. Nothing. Then I'm asked to repeat the parcel
> reference, which I do, but still nothing. To be fair, I'm English, and
> perhaps the computer has some problem with my pronunciation. Now I appreciate
> that this was direct communication by speech with another machine, but I
> believe that accurate speech to text is going to take quite some time to
> achieve.
>
> Just some observations, and comments.
>
> Nigel.
>
>

Nigel, you should definitely see this (requires Flash):
http://www.nuance.com/talk/

Dotan Cohen

http://what-is-what.com
http://gibberish.co.il
א-ב-ג-ד-ה-ו-ז-ח-ט-י-ך-כ-ל-ם-מ-ן-נ-ס-ע-ף-פ-ץ-צ-ק-ר-ש-ת

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


More information about the kubuntu-users mailing list