
Hey Ally!

Ally is a voice assistant, designed primarily for users with visual impairments, that makes web interaction quicker, easier, and more effective.



  • UX Research and Literature survey

  • Customer Journey and User Flows 

  • Ideation, Brainstorming and Sketching

  • Wireframing and Prototyping

  • User Testing


Timeline : September 22 - November 22

Team : 4 members

We followed the User-Centered Design (UCD) process to create a tailored artifact for people with visual impairments


Using a Computer
Braille Keyboard

Since the web is a visual medium, screen readers are the most common way people with visual impairments interact with websites. Users frequently report problems when browsing with screen readers: confusing feedback caused by page layouts, conflicts between screen readers and applications, poorly designed or unlabeled forms, and missing alt text for pictures, as well as misleading links, inaccessible PDFs, and screen reader crashes, which were reported equally often.

Amidst these difficulties, voice assistants are gaining popularity as a way to navigate the web. They also play a vital role in increasing access for users with disabilities. For instance, they can provide hands-free, text-free access to many systems that may be an integral part of users' daily lives.


This project aims to help individuals with visual impairments navigate and interact with a website using a voice assistant. It should be noted that, while secondary, our product can also help people with mobility, cognitive, and situational disabilities.


Nothing about them without them: Understanding user needs and requirements

During the empathy phase, our goal was to understand the critical challenges faced when using screen readers to navigate an e-commerce website, learn about usage and preferences for advanced technologies (such as voice assistants), identify popular accessible e-commerce websites, and learn what the top user flows for our target audience are.


Inspired by one of our class exercises, we began this phase with an immersion exercise. The main goal here was to build team alignment and a solid understanding of the problem space. This exercise helped us:

  • Build empathy

  • Identify unknown pain points

We expected the experience to be frustrating, but going through it ourselves made that frustration all the more real. We also found navigating some e-commerce websites to be time-consuming, unnecessarily complex, and prone to creating misunderstandings about what is on the website.

User using screen reader
Users during Immersion Exercise


Through a literature survey of 14 academic resources, we gained a concrete understanding of:


  • Challenges with Screen Readers

  • Use of Voice Assistants

  • User Needs

  • Technical Considerations

  • Design Considerations

The image shows the set of principles amongst our learnings which were instrumental in our design. 

Additionally, the research on technical considerations encouraged us to pursue this direction as we found that such a solution is possible. More research is needed to understand how adoption can be made easy and cost effective.



We expected users to report difficulty navigating different websites, a lack of standardization across the web (and so limited web access through screen readers), and a need for voice assistants to aid website navigation.

After interviewing one person with a visual impairment (legal blindness), we learned that many e-commerce websites have native mobile applications which serve our target community very well. This brought about an interesting dilemma: do we really need voice assistants to help with navigation and interaction with websites? If they have mobile applications that serve our target community, what is the incentive for individuals with visual impairments and businesses to invest time using this technology on a desktop? 

Shifting gears, the team decided to look for businesses that have AA-level accessible websites but may not have native mobile applications. Our objective was to find something simple and concise, and to focus on one user flow to explore the utility and the 'how' of our goal.

Using Braille Keyboard


We came across businesses such as Spotify, whose mobile application offers only part of the service's functionality and requires users to go to the website for the rest. Our product can be especially useful for brands like this, where the mobile application does not have complete functionality.

After an intense brainstorming session coupled with research, as part of the define phase, the team came to the following conclusion: 

Through smart voice assistants, businesses can help individuals with visual impairments know essential services right off the bat.

Instead of having users dig into this information themselves, we can help enforce core functionalities through this method. It should be noted that our product’s functionality will expand beyond this too. Our solution enables conversations and removes the burden from the user of having to go through all the information themselves. For instance, the voice assistant could be used to alert users to the fact that Spotify has a range of plans, compare plans on Spotify, and help the user make decisions. Through this rationale, we are catering to the critical pain points that our target audience has.


In order to define the process that users go through to accomplish their goals, we decided to create a user journey map. 

Through the journey map, we highlighted the different phases, user actions, goals and experiences, feelings and thoughts, opportunities, and pain points. We started by defining the scenario as using a screen reader to select a premium plan for our persona. We then described each step of the journey, starting with onboarding, opening, navigating to the premium plans page, comparing plans, and selecting a specific plan.

Journey Map


To get an idea of how users will interact with the system, we created a user flow. The user flow we selected was opening and selecting a plan according to the user's requirements. It helped us:

  • Visualize the navigation flow

  • Highlight the users' interaction with the voice assistant

  • Identify possible pathways that could be simplified using voice commands

  • Plan for error prevention and recovery

User Flow


Exploration and Identification of potential solutions

In the ideation phase, we used the insights from the research and analysis phase to inform our design solutions. This phase began with a brainstorming session about our features so that we could sketch solutions and then turn them into workable prototypes. To come up with a variety of solutions, we used the Crazy 8s method.


In this method, the team members sketched multiple solutions to the users' needs. We used different materials, such as pen and paper, an iPad, and a laptop, to make the sketches. Each of us made rough sketches of our ideas in 8 minutes.


This rapid sketching exercise helped us generate a lot of ideas individually and allowed everyone on the team an equal opportunity to share their ideas and select the best design solutions collectively. These solutions assisted us in the prototyping process.

Crazy 8s ideation


Designing a model to test and validate our ideas and design assumptions.

In the prototype phase, our aim was to create a set of preliminary designs that we thought would benefit our users given the use case. Our considerations for this phase were:

  • Tour/Onboarding: Getting users comfortable with how A11y works.

  • Preferences: How we could enable users to customize their experience.

    1. Accents: We wanted A11y to recognize as many accents as possible.

    2. Activation Word: "Hey A11y" was the default we chose, but we wanted to give users the option to change it.

    3. Sleep Word: How users could tell A11y to stop listening.

  • Sound: Feedback was of prime importance to us, so we aimed to give users with visual impairments feedback through sounds.

  • The narrative, i.e. how the system talks to the users.

  • All the possible interactions our users could take and the outcomes of those interactions.

  • Supporting visuals: For those with low vision, we aimed to make the UI elements customizable.
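The activation-word and sleep-word preferences can be pictured as a small listening state machine. The sketch below is only an illustration of that idea, not our prototype: the class names, the `hear` method, and the default sleep phrase are all assumptions made for the example (the only value taken from our design is the default activation word, "Hey A11y").

```python
from dataclasses import dataclass


@dataclass
class Preferences:
    """User-adjustable settings. Field names are hypothetical."""
    activation_word: str = "hey a11y"  # default from the design; users may change it
    sleep_word: str = "go to sleep"    # assumed placeholder; no default was defined
    sound_feedback: bool = True        # whether state changes play an audio cue


class Listener:
    """Tracks whether the assistant is currently listening for commands."""

    def __init__(self, prefs: Preferences) -> None:
        self.prefs = prefs
        self.awake = False

    def hear(self, utterance: str) -> str:
        """Classify an utterance as 'wake', 'sleep', 'command', or 'ignored'."""
        text = utterance.strip().lower()
        if not self.awake:
            # While asleep, only the activation word has any effect.
            if text.startswith(self.prefs.activation_word):
                self.awake = True
                return "wake"
            return "ignored"
        if text == self.prefs.sleep_word:
            self.awake = False
            return "sleep"
        return "command"
```

A user who prefers a different phrase would simply create `Preferences(activation_word="hello helper")`; everything else stays the same, which is the kind of flexibility the preferences were meant to offer.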

Our aim was to provide users with an onboarding experience that would give them just the right amount of information to get started with our solution. Prior to testing, we also wanted to explore features that could be useful to them. For this, we created a set of lo-fi designs and explored how the onboarding and interaction experiences would pan out in our user's journey.

The first phase was followed by a second phase of prototyping, in which we picked a specific service to explore interactions on top of. We looked especially at the flow of purchasing a Spotify plan, and all our designs from this point on targeted that experience.


Evaluating the prototype to get insights into the overall user experience.


After creating an end-to-end prototype experience, we wanted to test a) the concept, b) the interactions, c) ease of use, and d) the usability of the product. First, we created a test script based on our prototype flow. The scenario we focused on was the purchase of a Spotify premium plan using Ally. At the end of the test session, we asked participants a series of questions to get insights into their experience.


We recruited two participants. Both were legally blind and proficient in using technology (computers and phones) and screen readers.


The test sessions were conducted over a virtual meeting on Zoom. The participants were asked to understand the scenario and then interact with Ally to carry it out. Since both participants were blind, the supporting visuals were not used during the test session. Instead, we conducted the session using the Wizard of Oz method: one of our team members assumed control of the system and responded to the user's commands according to a script we had prepared, which covered the potential user actions and how Ally would respond to each of them. Users started the test session by saying "Hey Ally…", and from there a series of actions eventually led them to purchasing their preferred plan. Upon completing or terminating the test scenario, participants were asked follow-up questions to rate their experience and give qualitative feedback. The questions were open-ended to get participants' honest opinions on their experience.
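A Wizard of Oz script of this kind can be thought of as a lookup from "where the participant is in the flow" and "what they said" to "what Ally says next". The sketch below shows that idea under loud assumptions: the step names, keywords, and replies are invented placeholders, not the wording of our actual test script.

```python
# Hypothetical script: (current step, keyword heard) -> (Ally's reply, next step).
# Every entry here is an illustrative placeholder, not the real test script.
SCRIPT = {
    ("start", "hey ally"): ("Hi, I'm Ally. How can I help?", "listening"),
    ("listening", "plans"): ("There are several premium plans. Should I compare them?", "plans"),
    ("plans", "compare"): ("Here is how the plans differ ...", "plans"),
    ("plans", "select"): ("Okay, taking you to checkout for that plan.", "done"),
}

# Reply the "wizard" gives when a command is not covered by the script.
FALLBACK = "Sorry, I didn't catch that. Could you rephrase?"


def respond(step: str, utterance: str) -> tuple[str, str]:
    """Return (reply, next_step) for what the participant said at this step."""
    text = utterance.lower()
    for (script_step, keyword), (reply, next_step) in SCRIPT.items():
        if script_step == step and keyword in text:
            return reply, next_step
    # Unscripted input: answer with the fallback and stay on the same step.
    return FALLBACK, step
```

Writing the script down in this form makes gaps visible before a session: any command that falls through to the fallback is a user action the script has not yet anticipated.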


After conducting the sessions, we reviewed and analyzed the feedback from both participants. Overall, the participants found Ally easy to use. Both gave very useful feedback based on their experience, and the conversations were rich in detail, covering the technology, sound feedback, concept, and overall usefulness.


Participants shared that having Ally would be very beneficial and useful for their web tasks. Both participants mentioned that they would have liked to get some context on what Ally can do when they visit the website. They also added that having context on the website itself would have been beneficial.

"I would have liked some context about the website when I land on the home page" - P1

"Having Ally announce the primary function of the website and what all it can do on the website would be very helpful" - P2


Both participants mentioned that for tasks they are familiar with, Ally would be very helpful because it will make the interaction faster.

"Not having to navigate through every link using a screen reader would definitely result in a better experience" - P2


Both participants also mentioned they would like to know more about how Ally would work alongside a screen reader and complement it.

"For the known services and tasks, a tool like this could be very helpful and faster than screen reader" - P2


Thinking more about the speech recognition engine, the technology and accuracy, participants mentioned that if a tool like this exists, they would definitely try it out.

"Less known fact about blind people: We are very happy to operate the computer away from the computer" - P2

Based on the participants' feedback, we iterated on our design by adding a first interaction prompt that tells users about the website and what they can do on it using Ally.

Video Prototype


  1. A voice assistant has many layers, including accents, dialects, and sound feedback. We intend to conduct more research on these aspects of the technology.

  2. We must ensure that a voice assistant is adaptable when it comes to allowing users to choose their preferred working style.

  3. Because employing a voice assistant could be confusing, considering both audio and visual signals is important.

  4. Based on the findings, it was quite obvious that users needed to be aware of the application's fundamental use cases and the context of the service they are using.

  5. Participants agreed that A11y would be more beneficial when users know exactly what they want on a website. Recognizing the essential, go-to tasks on a web service and optimizing the Ally interactions for those tasks could be beneficial.

Future Scope

  • Based on what we got from participants we can say that A11y would complement a screen reader really well, making it easier for the users to do more tasks in less time. This also gives us a reason to research more on how Ally could work with screen readers and what that relationship looks like for the developers and users.

  • Critical thinking about the findings and how to address the insights.

  • More user research and further iterations.

  • Integrating input fields and forms with Ally

  • Standard guidelines/framework
