April Tan

Pitch Noodle:

A Learner-Centered Pronunciation Tool for Pitch Learning

 

UX Researcher · UX Designer · App Developer (Python) · Linguist

Overview

Pitch Noodle is an interactive pronunciation tool designed to help language learners improve their pitch by visually comparing their speech to a model speaker’s. The ultimate goal: to provide just the right amount of technical feedback to empower learners to recognize and control their pitch movement without overwhelming them.

The Problem

Pitch and tone are difficult to master, especially for learners from non-tonal language backgrounds. While educators rely on pitch analysis tools to teach pitch production, these tools were built for linguists, not learners. Their complex interfaces and steep learning curves lead to frustrating learning experiences.

With no commercial solution available, how might we design a more learner-centered approach to pitch training? How might we translate acoustic analysis, a complex concept, into something that the average learner can understand?

The Solution

I developed a minimal interface focused on pitch movements, the key element needed for pitch perception and production. With Pitch Noodle, learners can quickly record, visualize, and compare their pitch contours against a model’s. The demo below shows the application in use:

The Process

1️⃣ Discover: Understanding Learner Struggles

To understand learner needs and pain points, I conducted a 3-week contextual inquiry with 15 students using an existing pitch training tool (PRAAT):

  • Conducted direct observations and took detailed notes on navigation paths, errors made, frustrations, and “aha” moments
  • Conducted user interviews to uncover values, needs, usage behavior, and pain points
  • Analyzed data and synthesized insights using thematic analysis in MaxQDA
 
💬 Key excerpts:
 
“These guys [spectrogram] confuse me. I think it affects my pitch but I’m not sure, so I try random things to try and move the pitch.” (On cognitive overwhelm)
 

“The contours are too hard to read. I don’t understand how it’s supposed to line up with the rest of the stuff like the blue lines.” (On technical complexity)

“I don’t think I’d be able to tell [what my voice is doing] without looking at the screen.” (On visualization helpfulness)

 
📈 Key insights: Users valued pitch visualizations, but the technical interface caused cognitive overload and discouraged repeated practice. This revealed a clear opportunity: reduce friction to support low-stakes, high-frequency practice.
 

To ensure that the application was pedagogically sound, I conducted a literature review to gather pedagogical requirements for pronunciation training. Learners need:

  • Authenticity of Input: Diverse and accurate samples to help perceive new phonetic categories
  • Opportunities for Output: Space to quickly test and adjust production of sounds via real-time proprioceptive feedback
  • Accuracy of Feedback: Clear comparisons between learner and target productions

 

To ensure usability, I applied ISO 9241-11 guidelines as benchmarks for design and evaluation:

  • Effectiveness: Measured through post-task pitch perception and production
  • Efficiency: Measured by task time and button clicks
  • Satisfaction: Measured through post-task surveys
 
📈 Key output: Pedagogical guidelines were translated into design requirements; usability benchmarks were used to guide experimental design.
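The efficiency benchmark above (task time and button clicks) can be illustrated with a small logging helper. This is a minimal sketch, not the instrumentation used in the study: the names `TaskLog`, `record_click`, and `mean_paired_difference` are hypothetical, and it assumes the interface can call a function each time a button is pressed.

```python
import time
from dataclasses import dataclass, field
from typing import Optional, Sequence


@dataclass
class TaskLog:
    """Hypothetical per-task log: counts clicks and measures elapsed time."""
    task_name: str
    clicks: int = 0
    started_at: float = field(default_factory=time.perf_counter)
    finished_at: Optional[float] = None

    def record_click(self) -> None:
        # Call this from every button handler while the task is running.
        if self.finished_at is not None:
            raise RuntimeError("task already finished")
        self.clicks += 1

    def finish(self) -> None:
        # Freeze the end time so the duration no longer changes.
        self.finished_at = time.perf_counter()

    @property
    def duration_seconds(self) -> float:
        end = self.finished_at if self.finished_at is not None else time.perf_counter()
        return end - self.started_at


def mean_paired_difference(baseline: Sequence[float], candidate: Sequence[float]) -> float:
    """Average of (baseline - candidate), one pair per participant."""
    if len(baseline) != len(candidate) or not baseline:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(b - c for b, c in zip(baseline, candidate)) / len(baseline)
```

Under these assumptions, one `TaskLog` would be created per participant, task, and interface, and `mean_paired_difference` would summarize how many fewer clicks or seconds one interface needed than the other.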
 

Step 1: Wireframe Sketches (Lo-Fi Testing)

Began with low-fidelity sketches to test early concepts and gauge how users interpreted visual representations. Conducted quick feedback sessions with 2 users to validate core interactions and layout logic.

Step 2: Cognitive Walkthrough (Early Prototype)

Conducted cognitive walkthroughs with 4 users to identify usability issues in task flow and interface logic. Focused on users’ ability to predict, understand, and complete key actions without guidance.

Step 3: Python Prototype (Functional Testing)

Built a functional prototype in Python and tested it with 2 users. Focused on validating core interactions (recording, pitch visualization, and comparison) within a working interface.
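To make the comparison step concrete, here is one way a learner contour could be compared with a model contour. This is an illustrative sketch under stated assumptions, not Pitch Noodle's actual implementation: it assumes pitch has already been extracted as a list of Hz values per frame (with 0 or None for unvoiced frames), and the function names are hypothetical. Each contour is converted to semitones relative to the speaker's own median, so that voices in different registers can be compared by shape rather than absolute height.

```python
import math
from statistics import median


def hz_to_semitones(contour_hz):
    """Drop unvoiced frames and express pitch in semitones around the speaker's median."""
    voiced = [f for f in contour_hz if f and f > 0]
    if not voiced:
        raise ValueError("contour has no voiced frames")
    ref = median(voiced)
    return [12 * math.log2(f / ref) for f in voiced]


def resample(values, n):
    """Linearly interpolate a sequence onto n evenly spaced points."""
    if n < 2 or len(values) < 2:
        raise ValueError("need at least two input values and two output points")
    step = (len(values) - 1) / (n - 1)
    out = []
    for i in range(n):
        pos = i * step
        lo = min(int(pos), len(values) - 1)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def contour_distance(learner_hz, model_hz, n=50):
    """Root-mean-square difference (in semitones) between two normalized contours."""
    a = resample(hz_to_semitones(learner_hz), n)
    b = resample(hz_to_semitones(model_hz), n)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / n)
```

With this normalization, a learner who produces the same rising shape as the model an octave higher gets a distance near zero, while a falling contour compared against a rising one gets a clearly larger value. Simple length resampling is a deliberate simplification; a time-alignment method such as dynamic time warping would be an alternative when timing differs a lot between speakers.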

I conducted A/B testing with 15 Mandarin tone learners, comparing Pitch Noodle to the existing tool, PRAAT:

  • Efficacy: Learners completed perception tasks (tone identification) and production tasks (tone pronunciation rated by native speakers). Pitch Noodle was found to be 9.7% more effective on perception and 12.4% more effective on production.
  • Efficiency: Participants completed three identical tasks on each interface. Pitch Noodle required 11 fewer clicks and was 27 seconds faster.
  • Satisfaction: Users rated their satisfaction with both interfaces. Pitch Noodle scored 90%, compared with 21% for PRAAT.
 

📈 Key output: Pitch Noodle outperformed PRAAT in efficacy, efficiency, and user satisfaction. Additionally, learners reported feeling a sense of agency and enjoying the learning process.

The Outcome

Pitch Noodle addressed the core challenge of making pitch training accessible to non-experts by simplifying a traditionally technical and overwhelming experience. The solution proved that when learner needs drive design, learning becomes effective and enjoyable.

  • Outperformed a widely used tool (PRAAT) in efficacy, efficiency, and satisfaction
  • Released as an open-source tool for broader use and adaptation in pronunciation training
  • Presented at an academic conference (see edited presentation here), contributing to the conversation on learner-centered tools in language education