by Jonathan Lazar
From Dan Frye: Dr. Jonathan Lazar is a professor in the Department of Computer and Information Sciences at Towson University and director of the Universal Usability Laboratory. He addressed the 2008 NFB national convention on making CAPTCHA, an inherently visual computer security system, accessible to blind people. Dr. Lazar and others in the Computer and Information Sciences Department at Towson University have worked closely with the NFB on a variety of blindness-related computer issues throughout the last several years, and the delegates again warmly greeted Dr. Lazar and enthusiastically welcomed his message of progress in making CAPTCHA accessible. The following text is drawn from the comments that he delivered at the convention and is supplemented slightly with a report of subsequent developments:
Good morning, everybody. How are you doing? I love your reaction to the word “CAPTCHA.” He [President Maurer] says “CAPTCHA,” and you all go "Boo." I agree with you. That’s what we’re working on.
At Towson University we’ve partnered for a number of years with the National Federation of the Blind to do research projects related to computer interface issues for blind users. It’s exactly what Dr. Maurer is saying. People say, "You know these things are impossible. You have to have security features. You can’t make them accessible." But do you know what? You can make them accessible; you can make them usable. The key is finding the solutions. Let me take a moment here to acknowledge the other people at Towson University who work with me on finding these solutions: two other professors, Dr. Heidi Feng and Dr. Harry Hochheiser, are working on this project, as are doctoral student Graig Sauer and undergraduate student Jon Holman.
One of my favorite moments was actually my last presentation at the National Federation of the Blind convention in 2005. You might remember that in 2005 I gave a talk about computer frustration, in which I described a study of one hundred blind users that examined what frustrated people on the Web.
Today I want to talk about a specific frustration. I want to talk about CAPTCHAs. Feel free to boo; it's ok. But we’re going to make it better. A CAPTCHA is actually an acronym for a Completely Automated Public Turing Test to Tell Computers and Humans Apart. It's kind of a dorky acronym, isn’t it? You probably know them better as annoying and frustrating.
Developers say that these are software tools designed to separate out who is a human and who is a computer. Now let me tell you that’s what the developers say. In reality CAPTCHAs determine who can see and who can’t. It's not about being human; it's about being able to see. That’s the thing that a lot of these researchers don’t understand.
The idea is that computer bots and viruses can sign up for email accounts and clog systems, so they’ve created these things called CAPTCHAs, which are also known as Human Interaction Proofs--or HIPs--to stop those bots. Now CAPTCHAs are actually not that effective. They don’t work all the time in stopping the bots and the viruses, and they don’t work all the time for determining who actually is a human. Many of the companies we talk with, though, say, "We know these tools aren’t perfect; we know they’re not accessible, but they are the best we have. They stop some of the viruses and bots." But that’s not good enough. CAPTCHAs have to work for people because, not only are you stopping bots and viruses, but you're also stopping people. You’re stopping blind people. You’re stopping people who want to use these systems and can’t use them.
So a visual CAPTCHA--also known as a twisted-text CAPTCHA--is a picture of letters and numbers with a lot of background clutter and noise. It may show the letters and numbers "R-3-B-6" against a field of dots. The idea is that a sighted user can see through all of that background clutter, but a computer bot or a virus can’t. And of course a blind user can’t either.
Now, if you ask some people, they say, "No, no, no. We have audio CAPTCHAs." How many people here really like audio CAPTCHAs? [chorus of nos] First of all, most sites don’t have audio CAPTCHAs. Only a few sites actually use them. An audio CAPTCHA is a series of numbers or letters read with a lot of background noise and clutter. Again, the idea is that a human can filter out the background noise, but a computer bot can’t.
We had heard that many problems were related to online security for blind users, so we started by just doing a focus group at NFB headquarters. The goal of the focus group was just to learn more about what these problems were. The number one problem cited in this focus group, as you probably can guess, was access to CAPTCHAs.
The fact is that everyone has trouble with CAPTCHAs. People who can see have trouble with CAPTCHAs, and we had heard informally from the blind users with whom we worked that they shared this sentiment. So we decided to start with a usability test to learn more about the problems that blind users have with audio CAPTCHAs. We started with the typical audio CAPTCHA from the CAPTCHA project at Carnegie Mellon. We had six blind users take part in the usability testing of the audio CAPTCHA, and each user completed five audio CAPTCHAs.
Interestingly enough we actually found that only 46 percent of the time could blind users successfully complete the audio CAPTCHAs. The average time to complete an audio CAPTCHA correctly was 65 seconds. What made it more interesting was that the only people really able to complete the audio CAPTCHAs successfully were people who were also using Braille notetakers. And at the time they were listening to the audio CAPTCHAs, they were taking notes with their Braille notetakers. How many people does that apply to? Clearly audio CAPTCHAs are not usable for many blind people.
We know that visual CAPTCHAs are inaccessible; we know that audio CAPTCHAs don’t work well. So we decided that we would try to develop some new versions of CAPTCHAs that are more accessible to blind users, that are equally usable for people who can see, and that are just as secure. We decided to call it the Human Interaction Proof Universally Usable (HIPUU). We had two reasons for adopting this nifty name. First of all we like the idea of a little hippo. Everyone should have a cool little mascot for their project. More important, we believe that CAPTCHA is a trademark, so we’re calling our project HIPUU.
The first version of HIPUU is a combination of nontextual images and sounds. For instance, we have an image of a dog, and we have a sound clip of a dog barking. The idea is that, if you are blind, you can use the sound clip; if you have a hearing impairment--or are sighted with no disability, for that matter--you can use the picture of the dog. But, because it is nontextual--because it uses pictures and sound clips--it is actually more secure. It is more secure because image-recognition and speech-recognition technology are much better at recognizing text than at identifying pictures, and certainly than at identifying sounds. Humans, however, are quite adept at recognizing both graphics and sounds.
Depending on whether the user is blind, he or she may use either the picture or the sound clip to respond to the security test. Initially we used visual and sound combinations drawn from transportation, animals, weather, and musical instruments. We started with things that are easy to identify. When you hear the sound clip, you must identify it by selecting the correct answer from a drop-down box. We started with a small set of choices. We enhanced the security of the system by including a wide variety of sounds that rotate for each test. If the user misses three attempts to identify the image or sound, the user is blocked out of the program. This is consistent with the traditional CAPTCHA protocol. Again, the idea is that this should be much easier for both blind and sighted people to use, and it should actually be more secure.
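The flow described above--a rotating image-and-sound pair, a drop-down answer, and a three-attempt lockout--can be sketched in a few lines of code. This is only an illustration under the assumptions just described, not the actual HIPUU implementation; the file names, labels, and the `get_answer` callback standing in for the drop-down are all hypothetical.

```python
import random

# Illustrative sketch only -- not the real HIPUU code.
# Each challenge pairs an image file and a sound clip with one label,
# so the user can rely on whichever medium is accessible to him or her.
CHALLENGES = [
    {"label": "dog",   "image": "dog.png",   "sound": "bark.wav"},
    {"label": "piano", "image": "piano.png", "sound": "piano.wav"},
    {"label": "rain",  "image": "rain.png",  "sound": "rain.wav"},
]
MAX_ATTEMPTS = 3  # matches the three-strike lockout described above


def run_hipuu_test(get_answer):
    """Present a rotating challenge; lock the user out after three misses.

    `get_answer` stands in for the drop-down selection: given the
    challenge's image and sound file names, it returns the user's choice.
    """
    challenge = random.choice(CHALLENGES)  # rotate challenges between tests
    for _ in range(MAX_ATTEMPTS):
        answer = get_answer(challenge["image"], challenge["sound"])
        if answer == challenge["label"]:
            return True   # answered correctly -- treated as human
    return False          # blocked after three failed attempts
```

A real deployment would also need server-side session state and media delivery, but the pass/lockout logic is no more complicated than this.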
We did a usability test of the first version of HIPUU with five blind users and five sighted users, each completing fifteen different tests. First, the interesting thing is that sighted users had a 100 percent task success rate. Users who could see were successful, and they actually liked this better than the traditional twisted-text CAPTCHA. Sighted people don’t like those CAPTCHAs either. By contrast, when we tested the blind users, they actually had a 90 percent success rate--90.6 percent, to be exact--on the first try. And on the second try it actually went up to 100 percent. The average task-completion time was 35.2 seconds.
The foregoing statistics show that the HIPUU we created was much more successful for both sighted and blind users than the conventional CAPTCHAs. Instead of a 46 percent success rate in 65 seconds with an audio CAPTCHA, blind participants completed this accessible version of HIPUU successfully 90 percent of the time in only 35 seconds. The success rate of blind people using HIPUU nearly doubled. Blind users clearly preferred this security system. Sighted testers preferred HIPUU too. It worked out much better than both the audio CAPTCHAs and the visual twisted-text CAPTCHAs.
We’re actually in the process right now of developing a second and more robust prototype. One of the things we're going to do is introduce some sound delays, because one problem is that the screen reader can sometimes overlap a little with the actual sound clip of the dog or the piano. We’re also adding many more sound and image combinations than we included in the first proof of concept. We're doing this for a good reason. One argument that I guarantee we'll hear from defenders of the current CAPTCHA product is, "Well, you know what, CAPTCHA has an unlimited set. If you use a visual CAPTCHA, you can make it secure." So in this second iteration of HIPUU we are adding many more image and sound combinations to our library.

In addition to incorporating more sound and image combinations, we are also asking users to identify multiple sounds or images in sequence. So, rather than being asked to identify a single image or clip of a dog, you could be asked to confirm several items in order, e.g., a dog, a piano, and then a rainstorm. Just with those three sounds, think about how many combinations you could generate. And, you know what, it will still be much easier to remember three sounds or images briefly than it is to struggle with the current forms of CAPTCHA. In some of our new versions we are also looking at whether users can type in text to identify the sound clips, rather than picking items from a drop-down list.

These enhanced measures should go some way toward satisfying the questions related to security. We hope to persuade many companies that HIPUU will work as well as CAPTCHA and will have the added benefit of being accessible to blind people. Usability testing of our new prototypes will take place this summer and fall.
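A quick back-of-the-envelope calculation shows why chaining several items enlarges the space of possible tests so dramatically. The function below is a hypothetical illustration of that arithmetic, not part of the HIPUU code; the library size of fifty pairs is an assumed figure for the example.

```python
# With a library of n distinct sound/image pairs and a challenge that asks
# the user to confirm k items in order, the number of possible challenges
# grows as n**k (if items may repeat) or n*(n-1)*...*(n-k+1) (if not).

def challenge_space(n, k, allow_repeats=True):
    """Count the ordered k-item challenges possible from n pairs."""
    if allow_repeats:
        return n ** k
    count = 1
    for i in range(k):
        count *= n - i  # one fewer choice at each position
    return count

# Even a modest (assumed) library of 50 pairs yields a large space
# of three-item tests:
print(challenge_space(50, 3))         # 125000 with repeats allowed
print(challenge_space(50, 3, False))  # 117600 without repeats
```

Each sound or image added to the library multiplies the number of possible sequences, which is the security argument behind the multi-item version.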
With any luck this initiative and research will ultimately create a new standard in accessible security for blind computer users.
When you encounter inaccessible CAPTCHAs, tell the designers and Webmasters that other alternatives do exist. Let them know that they simply have to think creatively. Share the work that we are doing at Towson University with these people. Together we'll develop a solution for the vexing problem of CAPTCHA for blind computer users.