Automatic Alt Text

Redesigning Alt Text for Microsoft Office

Mockup of a computer with auto alt text in PowerPoint captioning a photo of two polar bears.

My Role

  • Interaction Designer
  • Project Manager

Summary

This project covers the redesign of the alt text experience in Microsoft Office to make it more efficient, intelligent, and easy to use for the creation and consumption of accessible documents.

Cross-functional teams

  • Engineering
  • Design
  • Privacy & Security
  • Marketing

Duration

  • 1 year

Background

Alternative text, commonly referred to as alt text, is critical to helping people with vision impairments and cognitive disabilities understand graphical content. While emphasis on accessibility is increasing across the tech industry, there is still too much digital content that is inaccessible.


For more than a decade, Microsoft Office employed an alt text authoring experience that was difficult to find and use for people trying to make accessible content. Moreover, it resulted in a poor reading experience for people who rely on screen readers to consume content in an increasingly digital world.


In an effort to make the digital world more accessible, I set out to improve and reimagine the alt text creation and consumption experience in Office.

Screenshot of the original alt text UI with a title field and a description field beneath a series of menus.

Users

  • People with vision impairments who rely on alt text to understand documents
  • People who want or need to make documents accessible

Needs

  • An easy, efficient way to add alt text to pictures
  • A reliable experience for consuming alt text

Business goals

  • Improve accessible authoring experiences to promote Office 365 subscription purchases in enterprise
  • Leverage intelligent technology to improve workflows and provide value to customers

Research

Methods

Interviews

I interviewed people with vision impairments, accessibility experts, and people who make accessible content as part of their job (e.g., government agency employees) to understand how people create and consume alt text.

Competitive analysis

I analyzed alt text experiences across different productivity and social media applications to see how others have approached this problem space.

Cognitive walkthrough

I completed a detailed walkthrough of the existing alt text experience in Office to identify strengths and weaknesses of the design.

Key findings

Consuming alt text

  • Inconsistent experience: Alt text is not always read in a consistent way with screen readers and other assistive technologies
  • Not enough alt text: Most documents don't have alt text at all, which causes frustration

Authoring alt text

  • Difficult to access: The current input is too difficult to get to, causing a lot of wasted time navigating through menus repeatedly
  • Authoring is confusing: Alt text input currently allows for both a title and a description, but most screen readers only read the description reliably. This causes confusion around which field should be used and when
  • Unsure how to write alt text: There is not enough guidance to support writing high quality alt text

Ideation

While ideating solutions to make alt text easier to access, understand, and create, I came across the newly released Microsoft Cognitive Services APIs.


Cognitive Services are able to generate natural language descriptions of images using Machine Learning and Computer Vision. I thought this could be a great opportunity to add machine intelligence to alt text while improving the overall experience for both authors and consumers of alt text.


I pitched the idea and worked with several engineers to prototype a proof of concept during a hackathon. Our team ultimately funded the project.

Microsoft Cognitive Services logo

My roles

During the design and implementation of this project, I filled both a project management role and an interaction design role, each with distinct responsibilities.

Interaction designer

  • Iteratively design solutions based on research, feedback, and technical constraints
  • Use systems thinking to understand where the new solution fits into the broader Office app experience
  • Create visual and voice designs that fit into Office design standards

Project manager

  • Coordinating work across teams and organizations within Microsoft
  • Solving technical problems
  • Driving timelines for the project
  • Ensuring high standards and compliance for privacy, security, globalization, and accessibility

Challenges

  1. Encouraging good alt text

    Problem: In user studies, I found that many people do not know what alt text is. We needed a prompt that conveyed purpose and helped people understand how to write good alt text.
    Solution: I interviewed accessibility experts and people with vision impairments to inform our chosen prompt. I worked with partners to land on something very clear with little room for misinterpretation. We also added a length suggestion, since people we talked to identified overly long alt text as a core pain point.
  2. Fostering trust

    Problem: Machine Learning models are not perfect, and it might be difficult for someone who is blind to understand whether or not they should trust a caption.
    Solution: Once a description was generated, we appended text that marked the description as being machine generated. We also included a confidence rating so that screen reader users could make better decisions about trusting the alt text.
  3. Gathering feedback

    Problem: A key component of intelligent systems is collecting feedback so that the system can improve over time. There are many privacy regulations that prevent gathering this sort of data.
    Solution: We added an opt-in prompt to the alt text pane that asks users to donate their image to science, with a button to agree to the donation.
  4. Improving caption quality

    Problem: The Computer Vision model we were using had been trained almost exclusively on social media content. As a result, it was extremely bad at recognizing the professional content that someone would insert into an Office document. We thought this might block the entire project from continuing.
    Solution: We built a web app to help curate a set of nearly 100,000 photos to be used to train the image captioning Machine Learning model. When re-trained, the model demonstrated a 30% increase in accuracy for describing professional content.

Graphic showing the new alt text pane with numbered components corresponding to listed items. 1: A new prompt reading "How would you describe this picture and its context to someone who is blind? 1-2 sentences recommended." 2: Appended text to an automatic caption that reads "Automatically Generated with High Confidence." 3: A prompt asking users to donate image to science. The prompt has a beaker and a button to agree to donate the image.

Graphic showing the photo filter app, which shows a grid of images. The user chooses the photographs in the set and submits them.
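
To illustrate the trust challenge above, here is a minimal sketch of how a machine-generated caption might be combined with a confidence label before it is stored as alt text. This is not the shipped Office implementation. The function names, the threshold values, and the three-level scale are all hypothetical; the case study only states that a "generated automatically" marker and a confidence rating were appended.

```python
# Hypothetical cut-offs for illustration; the real thresholds are not
# stated in this case study.
HIGH_CONFIDENCE = 0.8
MEDIUM_CONFIDENCE = 0.5


def confidence_label(score: float) -> str:
    """Map a model confidence score in [0, 1] to a coarse, readable label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("confidence score must be between 0 and 1")
    if score >= HIGH_CONFIDENCE:
        return "high"
    if score >= MEDIUM_CONFIDENCE:
        return "medium"
    return "low"


def build_alt_text(caption: str, score: float) -> str:
    """Append a machine-generated marker and confidence label to a caption."""
    description = caption.strip().rstrip(".")
    if not description:
        raise ValueError("caption must not be empty")
    # Capitalize the first letter so the caption reads as a sentence.
    description = description[0].upper() + description[1:]
    label = confidence_label(score)
    return f"{description}. Automatically generated with {label} confidence."


print(build_alt_text("two polar bears standing on ice", 0.93))
# prints "Two polar bears standing on ice. Automatically generated with high confidence."
```

Placing the marker after the description means a screen reader announces the content first and the provenance second, so the listener hears what the picture shows before deciding how much to trust it.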

Interaction flow mapping

The new Alt Text solution that I created reduced the task flow of adding alt text to an image from 6 steps to a single step.

Visual and voice design

Visual design and voice design fell in line with Office Design Standards, while extending the standards to a novel scenario. I wrote the UI text to address the shortcomings of the current alt text experience, as discovered from my research.

Animated GIF showing the alt text design.

Impact

Today, the Automatic Alt Text feature adds descriptions to more than 2 million photos every day, helping to give equal access to documents for everyone.