RAG System for German Historical Speech
RAG System for German Historical Speech, a project under
Dr. Markus Mühling (Distributed Systems & Intelligent Computing Group),
is a Retrieval-Augmented Generation (RAG)
application designed to retrieve relevant historical speech transcripts
and generate context-aware responses using Large Language Models (LLMs).
The system combines semantic retrieval techniques with vector databases
and language models to enable efficient exploration and understanding
of historical speech archives.
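As a rough illustration of the retrieve-then-generate flow described above, the sketch below ranks passages against a query and assembles a prompt for a language model. It is not the project's code, which is private: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory list stands in for a vector database, and the sample passages and the names `retrieve` and `build_prompt` are invented for this example.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(query_vec, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the text that would be sent to a language model."""
    joined = "\n".join(f"- {passage}" for passage in context)
    return f"Answer using only the context below.\n\nContext:\n{joined}\n\nQuestion: {query}"


# Invented sample passages, standing in for transcript chunks.
passages = [
    "the chancellor spoke about reunification in berlin",
    "weather report for the alps",
    "a speech on economic reform",
]
top = retrieve("speech about reunification", passages, k=1)
prompt = build_prompt("speech about reunification", top)
```

In a real deployment the `embed` call would be replaced by a trained embedding model and the sorted list by a nearest-neighbour query against a vector store; the overall shape of the pipeline stays the same.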
Responsibilities
Developed a RAG pipeline for retrieving relevant historical speech transcripts
and generating context-aware responses.
Processed, cleaned, and structured transcript datasets
for efficient semantic retrieval workflows.
Built embedding-based vector stores and integrated retrieval results
with Large Language Models (LLMs).
Designed evaluation pipelines to measure retrieval precision,
semantic relevance, and response quality.
Worked on information retrieval documentation
and semantic search strategies for improved query performance.
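One common way to measure retrieval precision, as mentioned in the responsibilities above, is precision@k averaged over a set of labelled queries. The sketch below is an assumption about how such a metric could be computed, not the project's actual evaluation pipeline; the function names and the document IDs in the usage lines are hypothetical.

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k retrieved document IDs that are labelled relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def mean_precision_at_k(runs: list[tuple[list[str], set[str]]], k: int) -> float:
    """Average precision@k over (retrieved, relevant) pairs, one pair per query."""
    if not runs:
        return 0.0
    return sum(precision_at_k(retrieved, relevant, k) for retrieved, relevant in runs) / len(runs)


# Hypothetical document IDs for two labelled queries.
runs = [
    (["d1", "d2", "d3"], {"d1", "d3"}),
    (["d4", "d5", "d6"], {"d5"}),
]
score = mean_precision_at_k(runs, k=2)
```

Dividing by k rather than by the number of results actually returned penalises queries that retrieve fewer than k documents, which is a design choice worth stating explicitly in any evaluation report.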
Technologies & Domains
Python
Elasticsearch
RAG Systems
Vector Databases
Arctic Embed 2.0
Semantic Splitting
Recursive Text Splitting
LangGraph
Embedding Models
Natural Language Processing
Agentic RAG
Source code is private due to an NDA.