
Obsidian Notes Pipeline: AI-Powered Knowledge Management

A full-stack RAG application that transforms YouTube videos into interconnected Obsidian notes -- 1,000+ notes, 2,757 auto-generated links, 5,000 searchable chunks, and a chatbot with 2.5s latency, all for $1.50.

1,000+ notes → knowledge graph
2,757 bidirectional links
2.5s RAG chatbot response
$1.50 total pipeline cost
FastAPI · React 18 · PostgreSQL · Qdrant · Anthropic Claude · OpenAI Embeddings

Overview#

A comprehensive AI-powered knowledge management system that transforms YouTube videos into structured, interconnected Obsidian notes. The pipeline processed 1,000+ notes into a knowledge graph with 2,757 bidirectional links, 5,000 searchable chunks, and a RAG chatbot that answers questions in 2.5 seconds with zero hallucinations — all for $1.50 in API costs.

The Problem#

YouTube contains invaluable technical knowledge, but watching videos is time-consuming and notes end up scattered without connections. After three years of dumping notes into Obsidian, I had 1,000+ notes and 1,280 chaotic tags — a knowledge base in name only. A note about RAG architectures didn’t link to my notes about vector databases. Finding information meant remembering which video covered it.

Before: Disconnected Notes and Chaotic Tags#

| Before | After |
| --- | --- |
| 1,000+ notes in isolated silos, zero connections | 2,757 bidirectional links forming a navigable knowledge graph |
| 1,280 chaotic tags, including a hex color code (#3498db) | 1,040 curated hierarchical tags with 4-level nesting |
| No frontmatter, inconsistent metadata across 3 formats | Standardized YAML frontmatter on every note |
| Finding information required remembering which video covered it | RAG chatbot answers in 2.5 seconds with source attribution |
| Manual note-taking: watch entire video, type notes by hand | Paste a URL, get a structured note in minutes |

Before and after knowledge graph -- sparse disconnected clusters transformed into 1,024 files connected by 2,757 links

My Approach#

Dual-API Architecture#

Built a system that automatically chooses the right API for the job. Single videos go through the Anthropic Sync API for immediate results. Batch processing of 2+ videos routes to the Anthropic Batch API for 50% cost savings. The backend exposes 50+ API endpoints through FastAPI with Pydantic validation, serving both processing modes through a unified interface.

Dual-mode architecture -- automatic routing between sync and batch processing based on file count
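The routing rule itself is simple. A minimal sketch of the dispatch logic, with hypothetical function names (the actual endpoint and handler names are not shown in this write-up):

```python
def choose_processing_mode(video_urls: list[str]) -> str:
    """Route by job size: single videos go to the Sync API for
    immediate results; 2+ videos go to the Batch API for the
    50% cost discount."""
    return "batch" if len(video_urls) >= 2 else "sync"


def process_videos(video_urls: list[str]) -> str:
    # Hypothetical dispatch table standing in for the real handlers.
    handlers = {
        "sync": lambda urls: f"sync: processing {len(urls)} video now",
        "batch": lambda urls: f"batch: queued {len(urls)} videos at 50% cost",
    }
    mode = choose_processing_mode(video_urls)
    return handlers[mode](video_urls)
```

The same threshold check sits behind the unified FastAPI interface, so callers never choose an API explicitly.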

Semantic Knowledge Graph#

Embedded all 1,000+ notes using OpenAI’s text-embedding-3-small model, producing 1,536-dimensional vectors stored in a self-hosted Qdrant instance. The system splits notes into 5,000 searchable chunks using a 2,000-character window with 400-character overlap. Notes with cosine similarity above 0.70 get automatically linked with bidirectional wiki links — the threshold that balances quality connections with cross-domain discovery.
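The chunking and linking parameters above can be sketched directly. This is a simplified character-window splitter, not the production implementation (which may split on sentence or heading boundaries):

```python
def chunk_note(text: str, window: int = 2000, overlap: int = 400) -> list[str]:
    """Split a note into overlapping chunks: 2,000-char window with
    400-char overlap, so ideas spanning a boundary appear in both chunks."""
    step = window - overlap  # 1,600 new characters per chunk
    chunks = []
    for start in range(0, max(len(text), 1), step):
        chunks.append(text[start:start + window])
        if start + window >= len(text):
            break
    return chunks


def should_link(similarity: float, threshold: float = 0.70) -> bool:
    """Auto-link two notes when cosine similarity clears the 0.70 bar."""
    return similarity >= threshold
```

A 5,000-character note yields three chunks, with each consecutive pair sharing a 400-character overlap.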

Intelligent Vault Curation#

A three-stage Python pipeline (analyze, prompt, apply) cleaned up three years of metadata chaos. The system detects each note’s format — existing YAML, legacy inline metadata, or no metadata at all — and applies tailored prompts for each condition. Claude Code designed a 4-level hierarchical tag taxonomy in minutes, transforming 1,280 chaotic tags into 1,040 organized categories. The entire vault was processed through the Anthropic Batch API: 1,028 files across 8 batches with 100% success rate.

Tag hierarchy before and after -- 1,280 chaotic flat tags transformed into a clean 4-level hierarchical taxonomy
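The format-detection step might look like the following sketch. The exact legacy conventions (which inline keys the old notes used) are an assumption here, marked as such in the comments:

```python
import re

def detect_metadata_format(note: str) -> str:
    """Classify a note so the pipeline can apply a tailored prompt:
    'yaml' (existing frontmatter), 'inline' (legacy metadata lines),
    or 'none' (no metadata at all)."""
    if note.startswith("---\n"):
        return "yaml"
    # Assumed legacy convention: "Tags:"/"Source:"/"Date:" lines near the top.
    head = "\n".join(note.splitlines()[:5])
    if re.search(r"^(Tags|Source|Date):", head, re.MULTILINE):
        return "inline"
    return "none"
```

Each of the three labels then selects a different Claude prompt: convert, standardize, or generate from scratch.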

RAG Chatbot#

Ask your vault anything. The chatbot retrieves relevant chunks from Qdrant, builds a context window for Claude 3 Haiku, and synthesizes answers grounded entirely in your notes — zero hallucinations. Every answer includes source attribution with relevance scores, and an “Open in Obsidian” button that deep-links directly to the source note. Query latency averages 2.5 seconds end-to-end.

RAG chatbot demo -- asking about trading discipline techniques with source citations, 2.5s latency, and zero hallucinations

System Architecture#

+---------------------------------------------------------------------------+
                           YOUTUBE MARKDOWN AGENT
+---------------------------------------------------------------------------+
  INPUT                PROCESSING                       OUTPUT
  YouTube URL(s)  ->   Transcript Extraction   ->       Obsidian Markdown
                             |                                |
                     Anthropic Claude API              Qdrant Indexing
                             |                                |
                       Tag Resolution             Auto-Link Similar Notes
+---------------------------------------------------------------------------+
                              INFRASTRUCTURE
     PostgreSQL   |    Qdrant    |   FastAPI   |    React
     (Metadata)   |   (Vectors)  |  (Backend)  |   (Frontend)
+---------------------------------------------------------------------------+

The four-stage content pipeline -- YouTube URL through extraction, AI processing, semantic indexing, and tag resolution to linked Obsidian note

Key Achievements#

| Metric | Value |
| --- | --- |
| Notes processed | 1,000+ across the entire vault |
| Auto-generated links | 2,757 bidirectional wiki links |
| Searchable chunks | 5,000 indexed in Qdrant |
| Curated tags | 1,040 hierarchical (from 1,280 chaotic originals) |
| API cost savings | 50% using the Anthropic Batch API |
| Chatbot latency | 2.5 seconds per answer, end-to-end |
| Hallucination rate | 0 — every answer traces to a source note |
| Total cost for vault cleanup | $1.50 for 1,028 files |

Technical Implementation#

YouTube to Markdown Pipeline#

Two processing workflows serve different needs. The YouTube-to-Markdown mode generates complete structured notes from video transcripts — paste a URL, select summary or detailed mode, and get an Obsidian-ready note with YAML frontmatter, key takeaways, and timestamped sections. The Obsidian Notes Processing mode adds metadata to existing notes in bulk, preserving original content while standardizing frontmatter, tags, and descriptions.

YouTube to Markdown dashboard -- paste a URL, select options, and generate structured Obsidian notes

Obsidian Notes Processing dashboard -- batch process existing notes to add structured frontmatter and links
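The note-rendering step can be sketched as follows, assuming the transcript arrives as a list of timed segments (the shape most transcript libraries return); the function names and frontmatter keys here are illustrative, not the production ones:

```python
def to_timestamp(seconds: float) -> str:
    """Format seconds as MM:SS for timestamped transcript lines."""
    m, s = divmod(int(seconds), 60)
    return f"{m:02d}:{s:02d}"


def transcript_to_markdown(title: str, url: str, segments: list[dict]) -> str:
    """Render timed transcript segments as an Obsidian note skeleton:
    YAML frontmatter, a title heading, and timestamped bullet points."""
    lines = [
        "---",
        f"title: {title}",
        f"source: {url}",
        "type: youtube-note",
        "---",
        "",
        f"# {title}",
        "",
    ]
    for seg in segments:
        lines.append(f"- **{to_timestamp(seg['start'])}** {seg['text']}")
    return "\n".join(lines)
```

In the real pipeline Claude generates the summary and section structure; this sketch shows only the deterministic scaffolding around it.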

Anthropic Batch API Integration#

The Batch API delivers 50% cost savings for bulk processing, but required solving a critical indexing bug. Per-batch metadata dictionaries restarted their indices at zero, so file 100's data silently overwrote file 0's across batch boundaries. The fix: global indices instead of per-batch local ones. Progressive testing at escalating scales (2, 6, 52, 122, 782 files) caught the bug before production.
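A minimal reconstruction of the bug class (the real code maps Batch API custom_ids to file metadata; names here are hypothetical):

```python
def build_custom_ids(batches: list[list[str]]) -> dict[str, str]:
    """Map Batch API custom_ids back to file paths.

    The bug: using the per-batch index (enumerate restarting at 0 in
    every batch) makes "file-0" from batch 2 silently overwrite
    "file-0" from batch 1. The fix is a counter that keeps counting
    across batch boundaries."""
    mapping: dict[str, str] = {}
    global_index = 0
    for batch in batches:
        for path in batch:
            mapping[f"file-{global_index}"] = path  # global, not per-batch
            global_index += 1
    return mapping
```

With per-batch indices, two batches of 500 files would collapse into 500 keys; with the global counter, all 1,000 survive.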

Batch API economics -- Standard API cost versus 50% Batch API savings across 1,028 files

Note Transformation#

Every note gets a complete metadata overhaul. The system standardizes three different metadata formats (existing YAML, legacy inline, and no metadata) into consistent frontmatter with lowercase keys, formatted dates, hierarchical tags, descriptions, and bidirectional related note links.

Note before processing -- minimal unstructured metadata

Note after processing -- rich YAML frontmatter with curated tags, aliases, description, and bidirectional related note links
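A simplified sketch of the standardization step for one of those rules (lowercase keys, tags as a clean list); the exact transformations the production prompts apply are richer than this:

```python
def normalize_frontmatter(meta: dict) -> dict:
    """Standardize one note's frontmatter: lowercase keys and
    comma-separated tag strings converted to a list without '#'."""
    out: dict = {}
    for key, value in meta.items():
        key = key.lower()
        if key == "tags" and isinstance(value, str):
            value = [t.strip().lstrip("#") for t in value.split(",")]
        out[key] = value
    return out
```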

RAG Chatbot#

The chatbot turns your vault into a conversational interface. It embeds the user’s question, searches Qdrant for the most relevant chunks (using a 0.50 similarity threshold for recall), deduplicates to ensure source diversity, and generates an answer grounded in the retrieved context. Source attribution shows relevance scores for each cited note, and the “Open in Obsidian” button transforms the chatbot from a dead-end answer into a discovery tool.

The Vault Chatbot -- ask questions across your entire knowledge base with source attribution
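The retrieve-then-deduplicate step can be sketched as a filter over Qdrant search hits. This assumes hits arrive as dicts with a `score` and source `note` field; the real payload shape may differ:

```python
def retrieve_context(hits: list[dict], threshold: float = 0.50,
                     top_k: int = 5) -> list[dict]:
    """Keep hits above the recall-oriented 0.50 threshold, then
    deduplicate by source note so one long note cannot crowd out
    the rest of the context window."""
    seen: set[str] = set()
    selected: list[dict] = []
    for hit in sorted(hits, key=lambda h: h["score"], reverse=True):
        if hit["score"] < threshold or hit["note"] in seen:
            continue
        seen.add(hit["note"])
        selected.append(hit)
        if len(selected) == top_k:
            break
    return selected
```

The surviving chunks become the context for Claude 3 Haiku, and their scores are surfaced to the user as relevance percentages.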

The Compounding Architecture#

Every layer of automation compounds. Transcript extraction enabled structured notes. Batch processing made scale affordable. Semantic indexing enabled auto-linking. Clean metadata enabled accurate retrieval. And retrieval enabled conversation with your own knowledge.

Technologies#

Backend:

  • Python 3.11 + FastAPI
  • PostgreSQL (relational data)
  • Qdrant (vector search, self-hosted on Proxmox)
  • Pydantic (data validation and API contracts)

AI:

  • Anthropic Claude API (Haiku 3.5) — note generation, tag curation, chatbot answers
  • Anthropic Batch API — 50% cost savings for bulk processing
  • OpenAI API (text-embedding-3-small) — embeddings only

Frontend:

  • React 18 + TypeScript + Vite
  • Tailwind CSS v4 + shadcn/ui
  • React Query for server state management

Infrastructure:

  • Self-hosted Qdrant on NAS
  • Docker containerization
  • Hot-reload development environment

Tech stack overview -- Backend, Frontend, State Management, and UI architecture

What Happened Next#

This project was later migrated using the Bootstrap & Migrator Framework — a meta-project where Claude Code agents analyze existing codebases and generate tailored agent infrastructure. The most significant change: the Anthropic Batch API (celebrated in Part 2 below) was replaced with asyncio.TaskGroup parallel processing — the Batch API’s 4+ hour completion times had made batch features effectively unusable in production. The migration also split the 2,800-line monolithic API into 7 focused router modules, centralized 8 independent database connections into a single connection pool, added 118 unit tests, and generated 5 specialized agents with 7 domain skills.

The Article Series#

This project is documented in a 5-part deep-dive series:

Part 1: From YouTube to Knowledge Graph — The content pipeline that turns URLs into structured notes

Part 2: Anthropic Batch API in Production — 50% cost savings and the bug that nearly corrupted 782 files

Part 3: Building a Semantic Note Network — Vector search turned 1,024 isolated notes into a dense knowledge graph

Part 4: Obsidian Vault Curation at Scale — Three years of tag chaos, fixed in 30 minutes for $1.50

Part 5: Ask Your Vault Anything — A RAG chatbot that answers from your notes in 2.5 seconds
