Roseline: An AI platform for interpreting infant cry sounds

November 6, 2023

We introduce Roseline, the first deep learning system for monitoring infant health through cry sounds. Roseline is a multi-part AI system that can:

        1. detect infant crying from audio,
        2. identify triggers like pain,
        3. extract acoustic markers of health, and
        4. predict conditions such as brain injury.

In a recently completed clinical validation study involving hospitals across 3 continents (North America, South America and Africa), we found that Roseline is up to 92.5% accurate in detecting neonatal brain injury. Given the geographical diversity and size of the participating cohort, this work marks a significant step in validating the utility of the infant cry as a vital sign. See publication.

We are applying this technology to different areas of infant health, including precise targeting of new therapies, monitoring the effectiveness of baby formulas, probiotics, and vaccines, as well as for identifying the earliest signs of health anomalies.

Cry sound - a baby’s first language

Crying is a baby’s first language but is often difficult to understand. Over years of experience, clinicians learn to identify sick babies based on the sound of their cries. This pattern recognition is possible because, in newborns, crying is an involuntary response directly controlled by the central nervous system (CNS). Thus, an infant’s cry characteristics are altered when the functioning of the CNS is disturbed.

Healthy baby (Melody: “rising-falling”)


For almost half a century, researchers have sought to objectively quantify the impact of pathologies on the infant cry. They have been limited by the lack of a large, high-quality clinical database of cry recordings, which has constrained the application of state-of-the-art machine learning. In collaboration with clinicians and parents from around the world, we gathered the largest database of infant cry sounds, containing 450,000+ unique cry sounds that total more than 140 hours of crying. This has enabled us to apply advanced signal processing and AI techniques to fully characterize healthy patterns in infant crying.

ROSELINE: Precision medicine powered by acoustic AI

Roseline is the heartbeat of our vision for newborn care at Ubenwa. Roseline has four key capabilities:

1. Automatically detect and quantify infant crying over time. This model takes in a stream of audio, then detects and segments individual cry sounds (called “cry units”)

2. Identify triggers of crying such as pain. This model identifies the probable physical needs, such as pain, hunger, discomfort and stress, that may have elicited the cry. This is useful for continuously monitoring pain, or for assessing therapies for digestive problems like colic.

3. Extract acoustic biomarkers of health. This model extracts unique markers from cry sounds to provide physiological insights. This is useful for measuring cry as a digital biomarker in pediatric clinical trials, as well as clinical decision support for newborn care.

4. Detect signs of brain injury. This AI model takes cry sounds as input and quantifies the presence and, where applicable, the severity of neurological injury.
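To make the first capability concrete, here is a minimal sketch of splitting an audio stream into candidate "cry units". It is not Roseline's detection model, which is a learned system. It is a simple energy-threshold baseline, and the function names, the 25 ms frame size, the threshold and the minimum duration are all illustrative assumptions.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def segment_cry_units(samples, sample_rate, frame_ms=25, threshold=0.01, min_ms=100):
    """Return (start_s, end_s) spans whose frame energy stays above `threshold`.

    Hypothetical baseline: a real detector must also tell cries apart
    from other loud sounds, which a plain energy threshold cannot do.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    active = [e > threshold for e in frame_energies(samples, frame_len)]
    active.append(False)  # sentinel closes a span that runs to the end

    spans, start = [], None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i  # a loud run begins
        elif not is_active and start is not None:
            if (i - start) * frame_ms >= min_ms:  # drop very short blips
                spans.append((start * frame_ms / 1000, i * frame_ms / 1000))
            start = None
    return spans


# Synthetic stand-in for a recording: silence, tone, silence, tone.
sr = 8000


def tone(seconds):
    return [0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(int(sr * seconds))]


def silence(seconds):
    return [0.0] * int(sr * seconds)


signal = silence(0.5) + tone(0.5) + silence(0.5) + tone(0.25)
print(segment_cry_units(signal, sr))  # [(0.5, 1.0), (1.5, 1.75)]
```

Each returned span marks where one loud segment starts and ends, in seconds. In this sketch, the duration and count of such spans are what "quantifying crying over time" would build on.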

ROSELINE is an acronym which stands for “Reduction Of Self-supervised Entropy to Learn and Infer Neonatal Encephalopathy”.

Training foundation models for audio analysis

To develop Roseline, we designed a specialized, now patent-pending methodology for training pathology detection models on audio data. The method is based on a large audio model (LAM) trained in 3 sequential stages to produce a model that accurately detects signs of brain injury and is easily applied to multiple downstream tasks.

Diagram of the training of Ubenwa's Large Audio Model (LAM)

First, we created a 76M-parameter foundation model by training it on a large set of generic, diverse audio sounds. Then, we adapted that model to crying sounds by having it learn from unlabelled crying data. Finally, using crying data that was carefully labelled by physicians, we fine-tuned the model to predict targets of interest such as brain injury and triggers such as pain.
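The three stages can be read as a chain in which each stage starts from the weights the previous one produced. The sketch below only illustrates that ordering. The `Stage`, `Checkpoint`, `train` and `run_pipeline` names are hypothetical, `train` is a placeholder rather than a real training loop, and nothing here reflects Ubenwa's actual code or patent-pending method.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Stage:
    name: str
    data: str                 # description of the training data
    labelled: Optional[bool]  # None = not stated in the article


@dataclass
class Checkpoint:
    # Names of the stages whose training produced these weights, in order.
    history: list = field(default_factory=list)


def train(stage: Stage, init: Checkpoint) -> Checkpoint:
    """Placeholder for a training loop: start from `init`, return new weights."""
    return Checkpoint(history=init.history + [stage.name])


STAGES = [
    Stage("pretrain", "generic, diverse audio", labelled=None),
    Stage("adapt", "unlabelled cry recordings", labelled=False),
    Stage("fine-tune", "physician-labelled cry recordings", labelled=True),
]


def run_pipeline(stages):
    """Thread one checkpoint through every stage, in order."""
    ckpt = Checkpoint()
    for stage in stages:
        ckpt = train(stage, ckpt)
    return ckpt


final = run_pipeline(STAGES)
print(final.history)  # ['pretrain', 'adapt', 'fine-tune']
```

The point of the ordering is that the scarce, physician-labelled data is only needed in the last stage. The earlier stages rely on data that is easier to collect at scale.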

The infant cry as a vital sign

With this technology, a baby’s cry is no longer just a distress signal we seek to quiet, but one that parents, clinicians and researchers are equipped to analyze, interpret and rapidly act upon.

Imagine a world in which you could Shazam a baby’s cry to know if they are in pain or at risk of brain damage.

Babies have been trying to talk to us for a long time. Roseline is a major step toward ushering in the era of the infant cry as a vital sign.

Note: This technology has not been approved by a regulatory body as a medical device.
