Hackathon Showcase
1st Place Winner
Discovery
A team of Ecole Polytechnique/X-HEC students and an Alan engineer, skilled in Python, Playwright, Azure OpenAI, Gemini, Docker, Cloud Run, JS/Node.js, ML, and quant finance.
Project Description
https://github.com/manfredi31/aiagentsirl-hackathon/blob/main/README.md
Discovery
Discovery is an AI travel companion that identifies landmarks through your camera and provides interactive audio guides.
Features
- Landmark Recognition: Point your camera at any landmark and get instant AI-powered identification
- 📝 Smart Descriptions: Receive detailed, contextual descriptions of identified landmarks
- 🎧 Audio Narration: Listen to AI-generated audio descriptions
- 💬 Interactive Chat: Ask questions and learn more about landmarks through an AI chatbot
- 📍 Location-Aware: Uses GPS coordinates to improve landmark identification accuracy
Tech Stack
- Flask API with Server-Sent Events (SSE) for real-time streaming
- AI Agents:
  - Vision Agent: Landmark identification (Gemma 3/Magistral Small, self-hosted on Google Cloud Run with an NVIDIA L4 GPU)
  - Description Agent: Content generation (Google Gemini 2.5 Flash Latest)
  - Audio Agent: Text-to-speech with the ElevenLabs v3 model
  - Chatbot Agent: Powered by Google Gemini 2.5 Flash Latest with Supabase MCP integration
- Google Cloud Run GPU:
  - We self-host Gemma/Magistral Small on Google Cloud Run GPU and use it as our main model to identify landmarks. Its output is then passed to a pipeline of Gemini and ElevenLabs models that generate the landmark description and the TTS audio.
  - We managed to pass an image as input to the self-hosted Gemma/Magistral Small model.
- Agentic Framework:
  - We built our own agentic framework because, with the Google ADK, we could not send images as input to the self-hosted Gemma model.
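The Tech Stack mentions streaming results over Server-Sent Events. As a minimal sketch of what that wire format looks like, the generator below emits one SSE frame per finished pipeline stage. The function names, event names, and payload shapes are illustrative assumptions, not the project's actual code. In Flask, a generator like this could be returned as `Response(stream_pipeline(stages), mimetype="text/event-stream")`.

```python
import json
from typing import Iterable, Iterator


def format_sse(event: str, data: dict) -> str:
    """Serialize one SSE frame: an event name, one JSON data line, then a blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_pipeline(stages: Iterable[tuple[str, dict]]) -> Iterator[str]:
    """Yield one SSE frame per finished stage, followed by a final 'done' frame."""
    for name, payload in stages:
        yield format_sse(name, payload)
    yield format_sse("done", {})


if __name__ == "__main__":
    # Hypothetical stage results; in a real app these would come from the agents.
    demo = [
        ("landmark", {"name": "Eiffel Tower"}),
        ("description", {"text": "A wrought-iron lattice tower in Paris."}),
    ]
    for frame in stream_pipeline(demo):
        print(frame, end="")
```

Sending each stage as its own named event would let the mobile client show the landmark name as soon as it is identified, without waiting for the description or audio.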
Project Structure
travel-hackathon/
├── backend/          # Flask API server
│   ├── agents/       # AI agent modules
│   └── app.py        # Main API endpoints
└── mobile/           # React Native app
    ├── components/   # UI components
    └── hooks/        # Custom React hooks
Setup
Backend
cd backend
uv sync
uv run python app.py
Mobile
cd mobile
npm install
npm start
Environment Variables
Backend requires:
- OLLAMA_HOST: Vision model endpoint
- Supabase credentials for chatbot storage
- Google API credentials for Gemini
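The `OLLAMA_HOST` variable suggests the vision model is served through Ollama. The sketch below shows one way an image could be sent to such an endpoint, using Ollama's `/api/generate` route with a base64-encoded entry in `images`. The function name, the `gemma3` model tag, and the localhost fallback are assumptions for illustration; the project's own client code may differ.

```python
import base64
import json
import os
import urllib.request


def build_vision_request(
    image_bytes: bytes, prompt: str, model: str = "gemma3"
) -> urllib.request.Request:
    """Build (but do not send) a POST to Ollama's /api/generate with one image."""
    # OLLAMA_HOST is read from the environment; the fallback is an assumption
    # for local testing only.
    host = os.environ.get("OLLAMA_HOST", "http://localhost:11434").rstrip("/")
    body = {
        "model": model,  # assumed model tag
        "prompt": prompt,
        # Ollama expects images as a list of base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON reply instead of chunks
    }
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the request to `urllib.request.urlopen` and read the `response` field of the returned JSON.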
Team
Products & Tools
Google
Mistral
NVIDIA