Kansas State University


IT News

AccessTech: Speech-recognition software diversifies the PC experience

This week is “K-State For ALL!” here at K-State. I encourage you to check out our list of events on the 2011 K-State For All webpage. Join us as we celebrate DiversAbility!

This month I would like to introduce a technology that is very commonplace in our society, although we do not always realize it. If you have ever spoken with a computerized operator when calling a large corporation, used voice commands to tell your phone whom to call, used Google Voice, or played with voice recognition on your computer, you were using speech recognition (SR).

This technology has been around for many decades, though it has become common only in the last two. Many people with disabilities use this software to adapt the computer environment to their needs.

  • People who have limited use of their hands use SR to control the computer with their voice.
  • People with learning disabilities often use SR to dictate a paper because speaking a paper aloud does not depend on their ability to spell.
  • Many others simply find that speaking a paper is better for them than typing and allows for a better environment for processing and composing their thoughts.

You can find SR software on many new computers. Microsoft started using its Speech API in its Office product as early as seven years ago, and SR is now included in the Windows 7 operating system. If you want a more complete product, many computer stores sell software such as ViaVoice or Dragon NaturallySpeaking. SR is even found on mobile devices. Dragon Dictate is a free app for iPhone and iPad devices that works great for taking notes on the go. Best of all, it is very easy to use.

Fifteen years ago, such software required many hours of training, and many of us who tried it found the results very poor even after days of training. Today you can install the software and, within minutes, use it to write a paper, compose an e-mail, or search your computer.

Unfortunately, there are some limits. Anyone with a Google Voice account can tell you just how imperfect SR is. Google Voice uses SR to transcribe your voice mail into text, which it then sends to your phone as a text message. For getting a quick glimpse of a message it is okay, but it does show some of the shortcomings of the current state of SR.

Dictation software must be used in a controlled environment. It works best in a quiet setting where the computer is listening to only your voice. It picks up many of the noises we humans tend to filter out, such as people talking in the hall or a fan in the window. Google Voice is a great idea, but most of the time people call us from outside, from noisy rooms, or from the car while listening to music. All of that background noise confuses the computer, and the output is a rather odd jumble of words.

The same is true if we try to run a video through SR to create captions. There is simply too much background noise to create a reliable transcript. It is far faster to type the text by hand than to edit the text that SR produces from a video.

On the other hand, if you have a quiet place to study and you find that you can speak your thoughts much better than you can write them, SR is worth a try! It can significantly change the way you use a computer and, for many, it allows for much more efficient writing. We will see many more dictation products in the future, and I expect great things.