Voice and speech recognition is becoming a fiercely competitive field. Apple has Siri, which will be updated in iOS 6, and Google has Google Now coming in Jelly Bean. AT&T, seeing how rapidly this area is evolving, has decided it’s time to open up its own voice software to developers. Developers can now access AT&T’s “Watson” engine through an SDK made for iOS or Android. The SDK can be found on AT&T’s Developer website.
Similar to Siri and Google Now, Watson receives input, analyzes it, and performs the services it interprets from that input. Unlike Siri and Google Now, however, Watson’s input can include audio files, speech, gestures, face recognition, and text, which is more than the competition accepts. AT&T has put together a video showing what Watson can do.
Watson can not only convert from speech to text but can combine speech with other modalities, such as a touch-screen tap (“show me the closest Starbucks, here”) or other gesture, and send the information to a device.
Watson also converts from speech to speech to do translations, even involving multiple languages: speech input in one language can be converted to text in real time, followed by a text translation (with little delay), followed by the spoken translated sentence at sentence end.
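To make that speech-to-speech flow a little more concrete, here is a rough sketch of the three stages chained together. This is not AT&T’s SDK or API: the function `translate_speech` and its `recognize`, `translate`, and `synthesize` parameters are hypothetical placeholders for whatever recognition, translation, and text-to-speech services a developer would actually plug in, and the sentence-end check is a simplifying assumption.

```python
from typing import Callable, Iterable, Iterator


def translate_speech(
    audio_chunks: Iterable[bytes],
    recognize: Callable[[bytes], str],    # hypothetical speech-to-text step
    translate: Callable[[str], str],      # hypothetical text translation step
    synthesize: Callable[[str], bytes],   # hypothetical text-to-speech step
) -> Iterator[bytes]:
    """Yield one synthesized audio clip per completed sentence."""
    words = []
    for chunk in audio_chunks:
        text = recognize(chunk)           # speech -> text as audio arrives
        words.append(text)
        # Assumption: the recognizer emits punctuation at a sentence end.
        if text.endswith((".", "?", "!")):
            sentence = " ".join(words)
            words.clear()
            # text -> translated text -> spoken translation
            yield synthesize(translate(sentence))
    if words:
        # Flush any trailing words that never reached a sentence end.
        yield synthesize(translate(" ".join(words)))
```

Passing the three stages in as plain functions keeps the sketch independent of any particular service, so each one could be swapped for a real engine without touching the loop.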
Notable features it supports include Web Search Speech to Text, Business Search Speech to Text, Voicemail to Text, SMS Speech to Text, Question and Answer Transcription, TV Speech to Text, and Generic Speech to Text. It is definitely an interesting piece of software, and I’m sure developers will begin to do amazing things with it.
What do you think? Are you excited to see developers get their hands on it? What do you think will be the first third-party app to use it? Let us know in the comments.
Via: AT&T Developer