This use case was realized with ILTIS.

Developing and deploying scalable AI voice transcription

In 2024, ILTIS partnered with Codesphere to implement and deploy a speech recognition and knowledge retrieval use case based on the latest open-source AI technologies.

“Together with Codesphere, we developed a working PoC in just 6 weeks, which allowed us to test and gather feedback incredibly quickly.”

Alexander Ott

CEO @ ILTIS

Achievements in < 6 weeks

In under 6 weeks, the team developed a fully functional, scalable AI voice transcription tool comprising 4 services:

Real-time speech recognition

Uses OpenAI’s Whisper models for real-time voice transcription.

ERP integration

Fully integrated into the ILTIS knowledge system, with automated updates.

Semantic search

Uses the transcribed text to search a vector database.

Operator cockpit UI

UI for ILTIS employees to seamlessly interact with the application.

6 weeks

>2,000 documents embedded

100% GDPR compliant

“Great product, amazing team behind, superb support at any time! We love Codesphere!”

Alexander Woelke

Co-Founder & Co-CEO @ SaaS Titans

via Product Hunt

Fully composable architecture

The application creates numerical representations of the transcribed data and stores them in a vector database.

Frontend

Records voice, handles interaction with services.

Transcription Server

Transcribes audio sequences into text.

Sentence Transformer

Creates numerical representations of spoken input.

PostgreSQL

Database for storing the numerical representations.

Sophisticated ingestion

The ingestion pipeline pulls documents from the ERP, converts and parses them, creates numerical representations of their content, and stores them in a vector database.

Ingestion Pipeline Server

Pulls data from the ERP, parses PDFs, and triggers embedding creation.

PDF Server

Converts legacy .doc files into parseable PDF files.

Sentence Transformer Server

Creates numerical representations of the document text.

ILTIS ERP System

Source of the documents to be ingested.

PostgreSQL

Database for storing the numerical representations.

100% open source stack

All services were built exclusively with open-source technologies and connected easily and seamlessly through Codesphere.

Frontend

The frontend runs a basic SvelteKit application that records the microphone input and sends it as a .wav file to the transcription server.

Transcription Server

A whisper.cpp server running on a Codesphere Pro plan, enabling real-time CPU inference for speech-to-text. No GPU is needed, which keeps costs down.
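To illustrate the hand-off from the frontend to the transcription server, here is a minimal Python sketch of packaging audio as a .wav file and preparing an upload. It assumes the server exposes the `/inference` endpoint of the whisper.cpp example server and accepts 16 kHz mono 16-bit audio; the URL, the helper names (`make_wav`, `build_multipart`, `transcribe`) and the file name are hypothetical and not taken from the ILTIS project.

```python
import io
import json
import struct
import urllib.request
import uuid
import wave


def make_wav(samples, sample_rate=16000):
    """Pack float samples in [-1, 1] into a mono 16-bit PCM WAV byte string."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 16-bit samples
        wav.setframerate(sample_rate)  # 16 kHz is assumed here
        frames = b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
            for s in samples
        )
        wav.writeframes(frames)
    return buf.getvalue()


def build_multipart(wav_bytes, filename="clip.wav", fields=None):
    """Build a multipart/form-data body; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in (fields or {}).items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         "Content-Type: audio/wav\r\n\r\n").encode("utf-8")
        + wav_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(wav_bytes, url="http://127.0.0.1:8080/inference"):
    """POST the audio to the (assumed) endpoint and return the transcript."""
    body, content_type = build_multipart(
        wav_bytes, fields={"response_format": "json"}
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())["text"]
```

Calling `transcribe(make_wav(samples))` would send one recorded clip; in the real application this step is performed by the SvelteKit frontend rather than by Python.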

Sentence Transformer

A FastAPI server running the sentence-transformers library to create the embeddings needed to store and retrieve ILTIS data from the database.

PostgreSQL + pgvector

A PostgreSQL database with the pgvector extension, which stores vector embeddings alongside the existing data and provides fast vector similarity search, well suited to RAG use cases.
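As a rough illustration of what such a vector search involves, the Python sketch below shows a parameterized query using pgvector's cosine distance operator `<=>`, together with a pure-Python equivalent of the ranking it performs. The table name `documents`, its columns, and the helper names are hypothetical; the actual ILTIS schema is not described here.

```python
import math

# Parameterized query in psycopg-style named placeholders.
# Table and column names are assumptions for illustration only.
SEARCH_SQL = """
SELECT id, content, embedding <=> %(q)s::vector AS distance
FROM documents
ORDER BY embedding <=> %(q)s::vector
LIMIT %(k)s
"""


def to_pgvector(vec):
    """Format a sequence of numbers as a pgvector text literal, e.g. '[1.0,0.5]'."""
    return "[" + ",".join(repr(float(x)) for x in vec) + "]"


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity), the quantity `<=>` computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def top_k(query, rows, k=3):
    """In-memory stand-in for the SQL query: rows are (id, embedding) pairs."""
    return sorted(rows, key=lambda row: cosine_distance(query, row[1]))[:k]
```

With a database driver such as psycopg, the query could be run as `cursor.execute(SEARCH_SQL, {"q": to_pgvector(query_embedding), "k": 5})`, where `query_embedding` is the embedding of the transcribed text.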

Data Ingestion Pipeline

Node.js server that pulls data from the ERP, parses all PDFs, creates embeddings and checks and fills the .
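The pipeline steps named above (pull, parse, embed, store) could be sketched as follows. This is a simplified Python illustration, not the Node.js implementation used in the project: the chunk size, the hash-based check for already stored chunks, and the `embed` and `store` callables are all assumptions standing in for the sentence transformer server and the database.

```python
import hashlib


def chunk_text(text, max_words=50):
    """Split parsed document text into chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def ingest(documents, embed, store, seen_hashes):
    """Embed and store every chunk that has not been stored before.

    documents:   iterable of (doc_id, text) pairs, e.g. parsed PDF content
    embed:       callable mapping a text chunk to an embedding (hypothetical)
    store:       callable persisting one record (hypothetical)
    seen_hashes: set of SHA-256 digests of chunks that are already stored
    Returns the number of newly stored chunks.
    """
    added = 0
    for doc_id, text in documents:
        for index, chunk in enumerate(chunk_text(text)):
            digest = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
            if digest in seen_hashes:
                continue  # identical content already stored: skip re-embedding
            store({
                "doc_id": doc_id,
                "chunk": index,
                "content": chunk,
                "embedding": embed(chunk),
            })
            seen_hashes.add(digest)
            added += 1
    return added
```

Passing the embedding and storage steps in as callables keeps the sketch independent of any particular service, so the same loop could be run repeatedly without duplicating records.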

PDF Server

A Stirling-PDF endpoint that takes in legacy .doc files and converts them into PDF files that can be parsed by the main Node.js server.

Unblock your digital transformation now!

Get in touch to learn more about how Codesphere can help you transform your organization and bridge the tech gap.