Talk
This talk presents a practical approach to building real-time data systems on top of scraped sources,
using a voice-driven interface as the motivating example.
Voice interfaces powered by modern AI systems are becoming more common, but they typically rely on
structured or pre-indexed data. This creates a gap between how users want to interact (natural
language, real-time) and how data is actually available (unstructured, fragile, slow).
This session explores how to bridge that gap.
We will walk through an end-to-end system built in Python that:
- Converts voice input into structured queries
- Dynamically retrieves data via scraping pipelines (MCP-style abstraction)
- Processes and validates incomplete or inconsistent data
- Returns responses via text-to-speech under real-time constraints
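As a rough illustration of how these four stages might fit together, here is a minimal sketch using only the standard library. It is not the system presented in the talk: speech-to-text and text-to-speech are reduced to plain strings, the scraper is a stub, and every name in it (Query, parse_transcript, scrape_source, validate, to_speech_text, answer, budget_s) is hypothetical.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Query:
    """Structured form of a spoken request (hypothetical schema)."""
    intent: str
    subject: str


def parse_transcript(transcript: str) -> Query:
    """Toy stand-in for the speech-to-text + LLM step: keyword matching only."""
    words = transcript.lower().rstrip("?.!").split()
    intent = "price" if "price" in words else "lookup"
    subject = words[-1] if words else ""
    return Query(intent=intent, subject=subject)


async def scrape_source(query: Query) -> dict:
    """Stubbed scraping tool; a real one would fetch and parse a page."""
    await asyncio.sleep(0.01)  # simulate network latency
    # Scraped records are often incomplete: 'currency' is deliberately missing.
    return {"subject": query.subject, "price": "19.99"}


def validate(record: dict) -> dict:
    """Coerce types and fill gaps with explicit defaults instead of failing."""
    try:
        price = float(record.get("price", ""))
    except (TypeError, ValueError):
        price = None
    return {
        "subject": record.get("subject", "unknown"),
        "price": price,
        "currency": record.get("currency", "unknown"),
    }


def to_speech_text(record: dict) -> str:
    """Build the sentence that would be handed to a text-to-speech engine."""
    if record["price"] is None:
        return f"I could not find a price for {record['subject']}."
    return f"The price of {record['subject']} is {record['price']:.2f}."


async def answer(transcript: str, budget_s: float = 1.0) -> str:
    """Run the full pipeline, giving up on the scrape after budget_s seconds."""
    query = parse_transcript(transcript)
    try:
        raw = await asyncio.wait_for(scrape_source(query), timeout=budget_s)
    except asyncio.TimeoutError:
        return "That is taking too long, please try again."
    return to_speech_text(validate(raw))


if __name__ == "__main__":
    print(asyncio.run(answer("What is the price of coffee?")))
```

The timeout around the scrape is one simple way to respect a latency budget: a spoken fallback is usually preferable to silence while a slow source responds.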
About the Speaker
I’m a Python developer working on data extraction systems and real-time AI applications. My recent work focuses on building scraping pipelines that turn unstructured web data into usable, structured information.
I use LLMs to improve data extraction and interpretation, designing agent-like systems that can handle ambiguous inputs and adapt to changing data sources. This includes building workflows where language models assist in parsing, validating, and structuring scraped data.
In parallel, I’ve been developing voice-driven interfaces using tools like LiveKit, connecting speech-to-text, LLM-based processing, and text-to-speech into interactive assistants.
I’m increasingly interested in combining AI with traditional backend systems, using it not as a replacement but as a layer that makes existing systems more flexible, adaptive, and capable of handling real-world complexity.