Cache me if you can
Project Concept
🧠 Project Overview
We’re building a mental-health oriented audio journal that becomes a live emotional companion powered by voice.
Today, millions of people talk to AI as they would to a friend, but most systems analyze only what you say (the text), not how you say it (tone, rhythm, stress, pauses).
Our idea is to use ElevenLabs Conversational Agents + Audio Intelligence to create a companion that listens to you, talks back, and tracks your emotional state over time using prosody.
🎯 Goals
- Live voice conversations with an AI companion using ElevenLabs:
- Natural, low-latency dialog
- Turn-taking that feels like a real conversation
- Real-time prosody analysis of the user’s voice:
- Detect variations in tone, intensity, speed, and hesitations
- Combine them with the transcript to estimate an emotional score
- Emotional timeline over days/weeks:
- Each session contributes to a personal “emotional graph”
- Users can see how their emotional state evolves over time
- Always framed as self-care & awareness, not therapy
🔧 What We’re Building (Technically)
1. Voice Interface with ElevenLabs
- Use ElevenLabs Conversational AI / Realtime API as:
- The ears: streaming STT (transcription) and conversation events
- The voice: TTS with expressive, controllable voices
- The user speaks into a web or mobile client:
- Audio is streamed to ElevenLabs
- We receive live transcripts + timing info
- Eleven’s TTS replies with a warm, adaptive voice
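A minimal sketch of the client-side streaming step, assuming raw 16 kHz, 16-bit mono PCM cut into 20 ms frames. Because section 2 sends the same incoming audio to a prosody service, the sketch pushes each frame to two queues. The names `frames`, `fan_out`, `drain` and `demo` are hypothetical, and `drain` is only a stand-in for the real network senders; no ElevenLabs API call is shown here.

```python
import asyncio
from typing import Iterator, List, Optional, Tuple

SAMPLE_RATE = 16_000      # assumption: 16 kHz mono input
BYTES_PER_SAMPLE = 2      # assumption: 16-bit PCM
FRAME_MS = 20             # assumption: 20 ms frames


def frames(pcm: bytes, frame_ms: int = FRAME_MS) -> Iterator[bytes]:
    """Split raw PCM bytes into fixed-length frames (the last one may be shorter)."""
    step = SAMPLE_RATE * BYTES_PER_SAMPLE * frame_ms // 1000
    for start in range(0, len(pcm), step):
        yield pcm[start:start + step]


async def fan_out(pcm: bytes, sinks: List[asyncio.Queue]) -> None:
    """Push every frame to every sink, then a None sentinel to mark end of stream."""
    for frame in frames(pcm):
        for queue in sinks:
            await queue.put(frame)
    for queue in sinks:
        await queue.put(None)


async def drain(queue: asyncio.Queue) -> List[bytes]:
    """Stand-in consumer; a real one would forward each frame over the network."""
    received: List[bytes] = []
    while True:
        frame: Optional[bytes] = await queue.get()
        if frame is None:
            return received
        received.append(frame)


async def demo(pcm: bytes) -> Tuple[List[bytes], List[bytes]]:
    """Run one producer and two consumers concurrently on the same audio."""
    to_agent: asyncio.Queue = asyncio.Queue()
    to_prosody: asyncio.Queue = asyncio.Queue()
    _, agent_frames, prosody_frames = await asyncio.gather(
        fan_out(pcm, [to_agent, to_prosody]),
        drain(to_agent),
        drain(to_prosody),
    )
    return agent_frames, prosody_frames
```

Calling `asyncio.run(demo(pcm_bytes))` returns the frames each consumer saw. Using one queue per consumer keeps a slow analysis step from blocking the conversational path, at the cost of buffering frames in memory.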
2. Prosody & Emotion Layer (Audio Intelligence)
- In parallel, the same incoming audio is sent to our prosody service:
- Extracts basic features (pitch, loudness, speech rate, pauses…)
- Optionally uses an audio model/embedding for vocal emotion
- We fuse:
- How the person speaks (prosody features)
- What they say (transcript, sentiment / intent)
- This produces:
- A real-time emotional index (e.g. 0–100)
- A label like calm, stressed, sad, energized, etc.
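The extraction and fusion steps above could be sketched as follows. This is an illustration under stated assumptions, not a validated emotion model: it covers loudness, pauses and speech rate only (pitch is left out for brevity), expects samples normalised to [-1, 1], takes a text sentiment score in [-1, 1] from any upstream model, and uses placeholder weights and thresholds. All names (`Prosody`, `extract`, `fuse`) are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple

SILENCE_RMS = 0.02  # assumption: frames below this RMS count as a pause


@dataclass
class Prosody:
    loudness: float       # mean RMS over non-silent frames
    pause_ratio: float    # fraction of frames treated as silence
    words_per_min: float  # speech rate from transcript word count and duration


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one frame of normalised samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0


def extract(frames: List[Sequence[float]], n_words: int, seconds: float) -> Prosody:
    """Compute basic prosody features for one stretch of speech."""
    levels = [frame_rms(f) for f in frames]
    voiced = [v for v in levels if v >= SILENCE_RMS]
    return Prosody(
        loudness=sum(voiced) / len(voiced) if voiced else 0.0,
        pause_ratio=1 - len(voiced) / len(levels) if levels else 1.0,
        words_per_min=60 * n_words / seconds if seconds > 0 else 0.0,
    )


def clamp(x: float, lo: float = 0.0, hi: float = 1.0) -> float:
    return max(lo, min(hi, x))


def fuse(p: Prosody, sentiment: float) -> Tuple[int, str]:
    """Combine how it was said (p) with what was said (sentiment).

    Returns (index in 0..100, label). The 0.3 RMS and 200 wpm scales and the
    0.6 / 0.4 weights are placeholder assumptions.
    """
    sentiment = clamp(sentiment, -1.0, 1.0)
    arousal = 0.5 * clamp(p.loudness / 0.3) + 0.5 * clamp(p.words_per_min / 200)
    fluency = 1 - clamp(p.pause_ratio)
    index = round(100 * (0.6 * (sentiment + 1) / 2 + 0.4 * fluency))
    if sentiment >= 0:
        label = "energized" if arousal >= 0.5 else "calm"
    else:
        label = "stressed" if arousal >= 0.5 else "sad"
    return index, label
```

Splitting the label into two axes (vocal arousal and text sentiment) is one simple way to reach the four example labels; a learned audio embedding could replace the hand-set arousal score without changing the interface.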
This emotional index is:
- Sent back to the agent as metadata to influence how the ElevenLabs voice should respond (calmer, more encouraging, or more neutral).
- Stored in the backend to update the user’s emotional timeline.
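One possible shape for that metadata, as a sketch: the field names (`emotional_index`, `emotion_label`, `response_style`) and the label-to-style mapping are assumptions made for illustration, not an ElevenLabs schema. How the payload actually reaches the agent depends on the integration and is not shown.

```python
import json
from typing import Dict, Union

# Assumption: which reply style each detected label should trigger.
STYLE_BY_LABEL = {"stressed": "calm", "sad": "encouraging"}


def agent_metadata(index: int, label: str) -> Dict[str, Union[int, str]]:
    """Build per-turn metadata; labels without a mapping get a neutral style."""
    return {
        "emotional_index": max(0, min(100, index)),  # keep within 0..100
        "emotion_label": label,
        "response_style": STYLE_BY_LABEL.get(label, "neutral"),
    }


# Serialise once so the same payload can be sent to the agent and stored.
payload = json.dumps(agent_metadata(32, "stressed"))
```

Keeping the mapping in a single table makes it easy to review and adjust the companion's behaviour without touching the audio pipeline.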
3. Emotional Journal & Dashboard
- Every session is saved as:
- Transcript (or key moments)
- Emotional index over time (graph per session)
- Notes / highlights (e.g. “big drop when talking about work”)
- We provide a simple dashboard where users can:
- See how their emotional index evolves across sessions
- Spot recurring patterns (e.g. Sundays are always heavier, mornings vs evenings, etc.)
- Optionally export anonymized stats
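The journal and dashboard logic above might look like this in outline. It is a sketch with hypothetical names (`Session`, `big_drops`, `by_weekday`) and an assumed threshold of 20 index points for a "big drop"; transcripts and storage are left out, and every session is assumed to hold at least one sample.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Dict, List, Tuple

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


@dataclass
class Session:
    started_at: datetime
    samples: List[Tuple[float, int]]  # (seconds from start, emotional index)

    @property
    def mean_index(self) -> float:
        """Average index over the session (assumes samples is non-empty)."""
        return mean(index for _, index in self.samples)


def big_drops(session: Session, min_drop: int = 20) -> List[Tuple[float, int]]:
    """Return (time, drop size) wherever the index falls by at least min_drop
    between two consecutive samples; candidates for session highlights."""
    drops = []
    for (_, before), (t, after) in zip(session.samples, session.samples[1:]):
        if before - after >= min_drop:
            drops.append((t, before - after))
    return drops


def by_weekday(sessions: List[Session]) -> Dict[str, float]:
    """Mean session index per weekday, to surface recurring patterns."""
    buckets: Dict[str, List[float]] = defaultdict(list)
    for s in sessions:
        buckets[DAY_NAMES[s.started_at.weekday()]].append(s.mean_index)
    return {day: mean(values) for day, values in buckets.items()}
```

The output of `by_weekday` contains only aggregate numbers per weekday, so it could serve as a starting point for the anonymized export, although aggregates alone do not guarantee anonymity for a single user.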
Entry
Status: Submitted
Last saved: November 15 at 6:12 PM CET
Team Roster (team is at max capacity)
Robin Quériaux Team Lead
Research Scientist at BNP Paribas
Arnaud Durand
Data scientist at BNP Paribas