It features voice and text input, runs on low-resource machines, and handles over 1,000 multi-context queries per day.
The kiosk queries a Supabase-hosted database of knowledge entries with precomputed embeddings, managed through a separate Svelte-based CMS.
When a query arrives, the system runs a semantic search with pgvector to retrieve the most relevant entries. That retrieved data, along with recent chat history, is injected into a large language model served through Groq's inference API to generate the response.
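The injection step can be sketched as a small prompt-assembly function. The field names, prompt wording, and history-window size here are my assumptions, not the project's actual code:

```typescript
// Sketch of retrieval-augmented prompt assembly (hypothetical names).

interface KnowledgeEntry {
  title: string;
  content: string;
}

interface ChatTurn {
  role: "user" | "assistant";
  text: string;
}

// Combine the top-ranked knowledge entries and recent chat history
// into a single prompt string for the hosted model.
function buildPrompt(
  entries: KnowledgeEntry[],
  history: ChatTurn[],
  question: string,
): string {
  const context = entries
    .map((e) => `## ${e.title}\n${e.content}`)
    .join("\n\n");
  const recent = history
    .slice(-6) // keep only the last few turns to fit the context window
    .map((t) => `${t.role}: ${t.text}`)
    .join("\n");
  return [
    "Answer using only the context below.",
    `# Context\n${context}`,
    `# Conversation\n${recent}`,
    `user: ${question}`,
  ].join("\n\n");
}
```

Keeping the assembly in one pure function makes it easy to test and to tune the context budget without touching the transport layer.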
Stack used:
- React / Svelte
- TypeScript
- Supabase
- Supabase Auth
- pgvector (RAG vector search)
- Xenova Transformers
- FuzzyJS
- Gemma 2B / 9B Instruct
- Whisper Large v3
- All-MiniLM-L6-v2 (Embedding model)
- Atom (Model runner)
- Groq (AI model inference)
Embeddings are computed locally with Xenova Transformers to reduce cloud overhead, and voice input is transcribed to text with Whisper before processing. The entire retrieval and response logic was written from scratch, with no external AI orchestration frameworks.
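The from-scratch semantic-search step can be sketched as cosine-similarity ranking, assuming embeddings are plain number arrays like those the all-MiniLM-L6-v2 model produces. Names are illustrative; in the deployed system this ranking is what pgvector performs server-side:

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

interface EmbeddedEntry {
  id: number;
  embedding: number[];
}

// Rank stored entries against a query embedding and keep the top k.
function topK(
  query: number[],
  entries: EmbeddedEntry[],
  k: number,
): EmbeddedEntry[] {
  return entries
    .map((e) => ({ entry: e, score: cosineSimilarity(query, e.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.entry);
}
```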
Every part of the system—from UI to AI pipeline—was built and integrated manually, with a strong focus on performance, clarity, and deployment constraints.