
MoCA Cognition

Engineering AI tools for early cognitive decline detection
The MoCA Solo logo.

The context

Our client, MoCA Cognition, is the creator of the Montreal Cognitive Assessment (also known as the MoCA Test), a gold-standard tool used worldwide to detect early signs of cognitive impairment such as Alzheimer's disease. Their one-page, 30-point pen-and-paper test takes just under 15 minutes to complete and is trusted by healthcare professionals across disciplines, from neurologists and psychologists to speech-language pathologists and family doctors. A trained annotator scores responses in real time, which works well for specialists but limits scalability and consistency.

To address this, MoCA developed MoCA Solo, a digital version of the test in which patients are guided by a virtual avatar. They came to us for one specific piece of the puzzle: building the AI models that would automatically (and accurately) score patient responses. Here's how we collaborated closely with both their internal team and their partner agency to engineer the scalable intelligence that powers MoCA Solo.

Project details

Industry
Health
Technologies
YOLO
OpenCV
Whisper
Parakeet
AWS
Lambda functions
LLMs (Claude & Llama)
Langfuse
Services
AI

The challenge

MoCA’s vision was to create a fully autonomous web application, optimized for tablets, that could administer and score the MoCA Test without the need for a trained annotator in the room. Instead of relying on a human to interpret answers, the app would collect patient input, evaluate it, and deliver results directly to a specialist, who could then assess cognitive trends over time. 

Scoring followed strict clinical protocols provided by the MoCA team. The test itself includes a series of 13 audio and visual tasks. On the auditory side, patients might be asked to name their location or repeat a sequence of words. Visually, they might be asked to draw a clock with the hands set to a specific time, replicate a cube from a reference image, or complete a trail-making puzzle (similar to connect-the-dots). 
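To make the trail-making task concrete, here is a minimal sketch of how a tapped sequence could be checked against the expected alternating order of numbers and letters. The node labels and the all-or-nothing point are simplified assumptions for illustration, not MoCA's actual clinical protocol.

```python
# Illustrative sketch of scoring a trail-making response.
# Assumption: the correct path alternates 1, A, 2, B, 3, C, ...
from string import ascii_uppercase

def expected_trail(n_pairs: int) -> list[str]:
    """Build the expected tap order: 1, A, 2, B, ... for n_pairs pairs."""
    order = []
    for i in range(n_pairs):
        order.append(str(i + 1))       # number node
        order.append(ascii_uppercase[i])  # letter node
    return order

def score_trail(taps: list[str], n_pairs: int = 5) -> int:
    """Award the point only if the patient tapped the exact expected sequence."""
    return 1 if taps == expected_trail(n_pairs) else 0
```

In the real app the input would be touch coordinates mapped to on-screen nodes; this sketch assumes that mapping has already been done.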

A key challenge for us was ensuring consistent scoring across the traditional and digital formats. In the in-person test, a small degree of subjectivity is inevitable, as human annotators may interpret the same response differently. For example, in a visual task like drawing a cube, one evaluator might be more forgiving about the angle or whether the lines are truly parallel. With audio, a minor error like a verb tense change might be missed. AI models, in contrast, assess inputs mathematically and apply the protocol uniformly. Accurately interpreting and implementing MoCA’s criteria was essential to delivering reliable, repeatable results.
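The difference between lenient human judgment and uniform machine scoring can be sketched with a toy repetition check: after normalization, the model awards the point only for a verbatim match, so a tense slip that a tired human rater might wave through is always caught. The sentence and the exact-match rule are illustrative assumptions, not MoCA's published criteria.

```python
import re

def normalize(text: str) -> list[str]:
    """Lowercase and tokenize, stripping punctuation."""
    return re.findall(r"[a-z']+", text.lower())

def score_repetition(expected: str, response: str) -> int:
    """Award 1 point only for a verbatim repetition (after normalization).
    A human rater might overlook a verb tense change; this check never does."""
    return 1 if normalize(response) == normalize(expected) else 0
```

For example, "He walks to the store" scored against "He walked to the store" would receive 0, every time, for every patient.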

Another major challenge was working with a large dataset in which only a small portion of the data was usable. For audio tasks in particular, many recordings included multiple patients completing the same exercises in the background. This overlap made it difficult to determine which response belonged to which speaker, rendering the data unreliable. In any AI project, reaching 90% accuracy is a solid milestone, but closing the final 10% is the hardest part. That's where edge cases and unpredictable patterns emerge. Training models to handle those reliably requires more high-quality data. With each iteration, our goal was to narrow the gap between what AI can predict and what human evaluators expect.

Woman doing the MoCA test.

One of the drawing tasks in the MoCA test.

The task at hand

We started the project with our tailored Discovery Phase focused on assessing the business need, the existing data collected by MoCA, and how AI could be effectively applied in this context. This phase was centered on evaluating data quality, reviewing the app’s technical architecture, and mapping out what the AI models needed to achieve clinically. The outcome was a detailed report outlining approaches for both drawing recognition and speech-to-text analysis, the infrastructure required to support them, and our hardware recommendations for accurate and consistent data collection.

Since another agency had built the app’s front end, we first had to understand their infrastructure in order to integrate our models effectively. Technical alignment was essential. We collaborated closely with both teams to ensure everything fit together smoothly, from model integration to data flow. Weekly check-ins kept the momentum strong and progress clear.

Once integrated, we moved into fine-tuning. For visual tasks, we customized the model for each subtest to maintain scoring precision. For audio, we tested speech recognition tools like Whisper and Parakeet to handle nuances like mumbling, accents, and multiple languages. We also introduced data-cleaning steps to filter background noise, ignore annotator interruptions, and validate recordings before analysis.
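A validation gate of the kind described above can be sketched with two simple checks on the raw waveform: reject clips that are too short to contain a response, and reject clips whose energy is so low they are effectively silent. The thresholds here are illustrative placeholders, not the values used in production.

```python
import math

def validate_recording(samples, sample_rate, min_seconds=0.5, min_rms=0.01):
    """Screen a raw audio clip before it reaches the speech model.

    samples: audio amplitudes normalized to [-1.0, 1.0]
    Returns (is_valid, reason). Thresholds are illustrative assumptions.
    """
    duration = len(samples) / sample_rate
    if duration < min_seconds:
        return False, "too short"
    # Root-mean-square level as a crude loudness measure
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms < min_rms:
        return False, "near silence"
    return True, "ok"
```

Clips that fail the gate can be re-prompted on the spot instead of producing an unscoreable response downstream.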

As a full-service innovation firm with expertise in both digital product design and software development, we often bring value that goes beyond the AI scope. We look for simple, high-impact ways to improve the experience. In this project, some patients (often, but not exclusively, older adults) were unsure when the tablet was recording. This sometimes led to incomplete or disrupted audio. Rather than adding complexity to the AI, we recommended clear, practical UX updates. Disabling interaction during prompts and adding a countdown animation helped signal exactly when recording began. These changes made the experience more intuitive for patients, improved data quality, and kept project costs under control for the client.

The MoCA Solo logo.
Osedea are true knowledge experts. They consistently bring solutions before problems surface, with a multifaceted approach spanning product, web, and database. Their partnership has been invaluable.
Steven Ricciardi, Vice President, Strategy and Partnerships, MoCA Cognition

The end result

Internally, this project was a rewarding challenge. We took an iterative approach, continually refining our models by benchmarking different technologies and adapting our code to improve performance at every step. We applied tools like YOLO for visual detection and evaluated their impact within the context of MoCA’s clinical requirements. With just 200 training samples, the first phase of this project delivered models that achieved over 90% accuracy on real-world data.
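The benchmarking loop behind those accuracy figures boils down to comparing model scores against human annotator scores item by item. A minimal sketch of that agreement metric, with hypothetical score lists:

```python
def agreement_rate(model_scores, annotator_scores):
    """Fraction of items where the model's score matches the
    human annotator's ground-truth score."""
    if len(model_scores) != len(annotator_scores):
        raise ValueError("score lists must be the same length")
    matches = sum(m == a for m, a in zip(model_scores, annotator_scores))
    return matches / len(model_scores)
```

Tracking this rate per subtest across iterations makes it clear which tasks are holding back overall accuracy and where new training data is most valuable.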

While our scope focused on the AI models, we approached the project holistically. Along the way, we offered input on UX improvements, flagged functional issues, and ensured the system met the high standards expected of any medical-grade application. We also kept a close eye on cybersecurity and DevOps to make sure the foundation was solid from every angle.

MoCA was thrilled with the results and especially appreciated the depth of our documentation. Every decision, workaround, and recommendation was clearly captured, much like a research paper. This level of detail is essential in AI projects. It helps future partners understand the model’s dependencies and limitations, and gives MoCA the tools to pursue regulatory approvals with confidence.

What’s next

It was a privilege to contribute to a project with such meaningful impact. MoCA Solo has the potential to reshape how cognitive decline is detected by enabling early, consistent screening in primary care settings. With AI managing the scoring and clinicians maintaining oversight, cognitive testing can become more scalable and accessible.

MoCA Solo has been showcased this fall at All In, Canada’s leading artificial intelligence event. In the meantime, we’re continuing to refine the models with new data and expanding language support to include French, Spanish, and potentially other languages. As part of the regulatory path to market, the system will undergo validation to confirm that its AI performs on par with human annotators (an essential milestone for any clinical-grade application).

If you’re building digital health tools, or tackling complex AI challenges that demand thoughtful design and reliable engineering, please reach out. We’d love to collaborate.

Doctor with an elderly patient, both smiling.

Did this project give you some ideas? We’d love to work with you! Get in touch and let’s discover what we can do together.

