

Mar 10, 2017 02:18 PM EST

IBM recently announced that it has set a new industry record in speech recognition by extending its deep learning technologies. The resulting system recognizes spoken words with a word error rate of only 5.5 percent.

In ordinary human conversation, it is common to miss a word or two. Only when we lose the thread, or want to confirm what we think we heard, do we ask the person we are speaking with to repeat the last phrase. Imagine how much harder that is for a computer.

Last year, IBM reached a milestone when its conversational speech recognition system achieved a word error rate of 6.9 percent. This year it lowered that rate to 5.5 percent by combining long short-term memory (LSTM) and WaveNet language models with three strong acoustic models, Computer Business Review reported.
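For readers unfamiliar with the metric, word error rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. It is not IBM's evaluation code, the function name and sample sentences are made up for illustration, and real benchmark scoring also normalizes text (case, punctuation, contractions) in ways not shown here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper (hypothetical name); splits on whitespace and
    applies no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up sentences: one substitution ("sat" -> "sit") and one
    # deletion (the second "the") against a six-word reference.
    rate = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {rate:.1%}")
```

On this scale, a 5.5 percent word error rate corresponds to roughly 5 or 6 word errors per 100 words of reference transcript.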

The approach lets the AI learn not only from positive examples but also from negative ones. Like any living learner, the AI gets smarter as it goes, and it performs better when similar speech patterns are repeated.

Last December, the company added diarization to its Watson Speech to Text service, which means the AI can now distinguish individual speakers in a conversation. According to IBM, a long-standing industry goal has been a machine that reaches human parity, meaning an error rate on par with that of humans in conversation.

In reassessing the industry benchmark, IBM collaborated with speech and technology service provider Appen, which found that human parity is 5.1 percent, lower than what anyone has yet achieved. IBM's current result is very close, so it might not be long before we can talk to machines the way we talk with each other every day.

Meanwhile, a security researcher used Google's own speech recognition service to defeat its reCAPTCHA field, bypassing the security feature. According to the researcher, who goes by the name "East-EE," there is a logic vulnerability in Google's reCAPTCHA field.

The audio challenge in reCAPTCHA can be bypassed. This is not the first time security researchers have defeated reCAPTCHA: in 2012, it was defeated 70 percent of the time using deep-learning technology.

The attacker converts the reCAPTCHA audio to a WAV file and sends it to Google's Speech Recognition API, which returns a written version (a string) of the audio challenge. That string is then pasted into the text box, and the attacker clicks 'Verify' on the reCAPTCHA widget.


© 2024 University Herald, All rights reserved. Do not reproduce without permission.
