Planning & Requirements

Every great project starts with a problem worth solving. Let me take you back to where this all began.

The Problem

It was late October 2024, and I was drowning in research papers. I was trying to stay current with advances in transformer architectures, RAG systems, and multi-agent frameworks - but new papers were being published faster than I could read them.

I'd spend hours:

  • Searching across arXiv, Semantic Scholar, PubMed
  • Copying paper titles and abstracts into notes
  • Trying to remember which paper mentioned what concept
  • Re-reading papers because I forgot their key insights

There had to be a better way.

The Vision

I imagined a research assistant that could:

  1. Automatically collect papers from multiple sources based on my interests
  2. Build a knowledge graph showing how papers, authors, and concepts relate
  3. Answer my questions by synthesizing information across papers
  4. Remember our conversations so I don't have to repeat context
  5. Scale from my laptop to production without rewriting code

But I didn't want to build just another prototype. I wanted something that demonstrated production-grade patterns I could use in real applications.

Core Requirements

I broke this down into functional and non-functional requirements.

Functional Requirements

Data Collection

  • ✅ Support multiple data sources (academic databases, web search)
  • ✅ Automatic deduplication of papers
  • ✅ Scheduled/automated collection in background
  • ✅ Rate limiting and error handling

Knowledge Organization

  • ✅ Knowledge graph with entities (papers, authors, topics)
  • ✅ Relationships (authored, cites, is_about)
  • ✅ Vector embeddings for semantic search
  • ✅ Graph visualization

Query & Reasoning

  • ✅ Natural language question answering
  • ✅ Multi-hop reasoning across papers
  • ✅ Source citation with paper references
  • ✅ Conversation memory (5-turn history)

Session Management

  • ✅ Multiple research sessions
  • ✅ Save/load session state
  • ✅ Session statistics and metadata

User Interface

  • ✅ Modern, responsive web interface
  • ✅ Data collection controls
  • ✅ Interactive graph visualization
  • ✅ Chat interface for queries

Non-Functional Requirements

Performance

  • Data collection: < 2 minutes for 10+ papers
  • Query answering: < 5 seconds
  • Graph queries: < 100ms
  • Vector search: < 100ms

Reliability

  • 99% uptime for production deployment
  • Graceful degradation if services unavailable
  • Automatic retries with exponential backoff
  • Circuit breakers for external APIs

Scalability

  • Support 1000+ papers in knowledge base
  • Handle concurrent users in production
  • Horizontal scaling with Kafka
  • Efficient caching to reduce costs

Maintainability

  • 90%+ test coverage
  • Type safety with TypeScript/Pydantic
  • Clear separation of concerns
  • Comprehensive documentation

Cost Efficiency

  • Intelligent model selection (use cheapest model for each task)
  • Caching to avoid redundant API calls
  • Token budgets to prevent runaway costs
  • Target: < $10/month for moderate usage

Choosing the Tech Stack

This was one of the most important decisions. I needed technologies that were:

  • Mature enough for production
  • Well-documented so I could move fast
  • Composable so I could swap components
  • Cost-effective to run

Here's how I evaluated each component:

LLM Provider

Candidates: OpenAI GPT-4, Anthropic Claude, Google Gemini

Winner: Google Gemini 2.0 Flash

Why?

  • ✅ Fast response times (< 2s average)
  • ✅ Cost-effective ($0.35 per 1M tokens)
  • ✅ Large context window (1M tokens)
  • ✅ Good reasoning capabilities
  • ✅ Free tier for development

I experimented with all three, and Gemini gave the best balance of speed, cost, and quality for research Q&A.
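
For a feel of the developer experience, here's a minimal sketch of calling Gemini from Python via the google-generativeai package (the API key and prompt are placeholders):

import google.generativeai as genai

# Configure the client (free-tier keys work for development)
genai.configure(api_key="YOUR_API_KEY")

# Fast, cheap, and a 1M-token context window
model = genai.GenerativeModel("gemini-2.0-flash")

response = model.generate_content(
    "Summarize the key idea of retrieval-augmented generation."
)
print(response.text)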

Orchestration Framework

Candidates: LangChain, LangGraph, Custom

Winner: LangGraph

Why?

  • ✅ Built for multi-agent workflows
  • ✅ Excellent state management
  • ✅ Visual workflow debugging
  • ✅ Works seamlessly with LlamaIndex
  • ✅ Good documentation and examples

LangChain was too linear for my multi-agent pattern. LangGraph gave me the graph-based orchestration I needed.
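
To make "graph-based orchestration" concrete, here's a minimal LangGraph sketch; the two-node workflow and state fields are illustrative, not the actual ResearcherAI graph:

from typing import TypedDict
from langgraph.graph import StateGraph, END

class ResearchState(TypedDict):
    question: str
    papers: list
    answer: str

def collect(state: ResearchState) -> dict:
    # Fetch candidate papers for the question (stubbed here)
    return {"papers": ["paper-1", "paper-2"]}

def reason(state: ResearchState) -> dict:
    # Synthesize an answer from the collected papers (stubbed here)
    return {"answer": f"Synthesized from {len(state['papers'])} papers"}

graph = StateGraph(ResearchState)
graph.add_node("collect", collect)
graph.add_node("reason", reason)
graph.set_entry_point("collect")
graph.add_edge("collect", "reason")
graph.add_edge("reason", END)

app = graph.compile()
result = app.invoke({"question": "What is RAG?", "papers": [], "answer": ""})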

RAG Framework

Candidates: LlamaIndex, Haystack, Custom

Winner: LlamaIndex

Why?

  • ✅ Best-in-class retrieval strategies
  • ✅ Flexible architecture
  • ✅ Great integration with vector DBs
  • ✅ Built-in evaluation tools
  • ✅ Active community

LlamaIndex saved me weeks of work on chunking strategies, embedding management, and retrieval optimization.
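
To give a sense of the time saved: the whole load-chunk-embed-retrieve pipeline collapses into a few lines (a minimal sketch; the directory path and question are placeholders, and the default embedding/LLM providers still need an API key configured):

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Load paper files; chunking and embedding use sensible defaults
documents = SimpleDirectoryReader("data/papers").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
print(query_engine.query("Which papers discuss multi-hop reasoning?"))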

Knowledge Graph

Candidates: Neo4j, NetworkX, TigerGraph

Winner: Both Neo4j AND NetworkX (dual backend)

Why?

  • ✅ Neo4j for production (persistent, scalable, visual)
  • ✅ NetworkX for development (fast startup, no infrastructure)
  • ✅ Same API for both (abstraction layer)
  • ✅ Easy switching via environment variables

This was a game-changer. I could develop and test on my laptop with NetworkX, then deploy to production with Neo4j without changing code.

Vector Database

Candidates: Pinecone, Weaviate, Qdrant, FAISS

Winner: Both Qdrant AND FAISS (dual backend)

Why?

  • ✅ Qdrant for production (persistent, REST API, dashboard)
  • ✅ FAISS for development (in-memory, no setup)
  • ✅ Same abstraction layer
  • ✅ Cost: $0 (self-hosted Qdrant)

Again, dual backends gave me the flexibility to move fast in development and scale in production.
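
For the development side, FAISS needs nothing beyond a pip install; a minimal sketch (the dimension and random vectors stand in for real embeddings):

import faiss
import numpy as np

dim = 384  # embedding dimension; depends on the embedding model
index = faiss.IndexFlatL2(dim)

# Stand-in embeddings; real ones come from the embedding model
vectors = np.random.rand(100, dim).astype("float32")
index.add(vectors)

# Find the 5 nearest papers to a query embedding
query = np.random.rand(1, dim).astype("float32")
distances, ids = index.search(query, 5)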

Event Streaming

Candidates: Kafka, RabbitMQ, Redis Streams

Winner: Kafka (optional)

Why?

  • ✅ Industry standard for event-driven systems
  • ✅ Event persistence and replay
  • ✅ Horizontal scaling with consumer groups
  • ✅ Rich ecosystem (Kafka UI, connectors)
  • ✅ Optional: falls back to sync mode if unavailable

I made Kafka optional because it's overkill for development but essential for production scalability.
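
The "optional" part is just a guard around the producer. A hedged sketch using kafka-python (the topic name and fallback handler are illustrative, not the actual ResearcherAI code):

import json

def publish_event(topic: str, event: dict, fallback_handler=None) -> None:
    # Try Kafka first; degrade to synchronous handling if no broker is running
    try:
        from kafka import KafkaProducer  # kafka-python
        producer = KafkaProducer(
            bootstrap_servers="localhost:9092",
            value_serializer=lambda v: json.dumps(v).encode("utf-8"),
        )
        producer.send(topic, event)
        producer.flush()
    except Exception:
        if fallback_handler is not None:
            fallback_handler(event)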

ETL Orchestration

Candidates: Airflow, Prefect, Dagster

Winner: Apache Airflow

Why?

  • ✅ Industry standard for data pipelines
  • ✅ Visual DAG editor and monitoring
  • ✅ Automatic retries and error handling
  • ✅ Scalable with Celery workers
  • ✅ Rich integrations

Airflow gave me 3-4x faster data collection through parallel execution and automatic retries.
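
The speedup comes from fanning each source out into its own task; a minimal DAG sketch (IDs, schedule, and source list are illustrative):

from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def collect_from(source: str) -> None:
    print(f"Collecting papers from {source}")  # the real task calls the collector agent

with DAG(
    dag_id="paper_collection",
    start_date=datetime(2024, 11, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Independent tasks run in parallel; Airflow retries failures automatically
    for source in ["arxiv", "semantic_scholar", "pubmed"]:
        PythonOperator(
            task_id=f"collect_{source}",
            python_callable=collect_from,
            op_args=[source],
            retries=3,
        )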

Frontend

Candidates: Next.js, Vite+React, SvelteKit

Winner: Vite + React + TypeScript

Why?

  • ✅ Lightning-fast dev server (< 1s HMR)
  • ✅ React ecosystem and component libraries
  • ✅ TypeScript for type safety
  • ✅ Lightweight (no SSR overhead)
  • ✅ Easy deployment

I didn't need SSR for this app, so Vite's simplicity and speed won.

Architecture Philosophy

I made some key architectural decisions early on:

1. Dual-Backend Strategy

Problem: Setting up Neo4j, Qdrant, and Kafka for development is slow and resource-heavy.

Solution: Abstract backends behind interfaces and provide in-memory alternatives (sketched after the list below).

Benefits:

  • ⚡ Instant startup in development (0s vs 30s)
  • 🧪 Faster test suite (no Docker overhead)
  • 💰 Lower cloud costs (single container vs 7)
  • 🔄 Easy switching via env vars
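
In sketch form, assuming illustrative names rather than the actual ResearcherAI interface:

import os
from abc import ABC, abstractmethod

class GraphBackend(ABC):
    @abstractmethod
    def add_entity(self, entity_id: str, **properties) -> None: ...

    @abstractmethod
    def add_relation(self, source: str, relation: str, target: str) -> None: ...

class NetworkXBackend(GraphBackend):
    # In-memory backend: zero infrastructure, instant startup
    def __init__(self):
        import networkx as nx
        self.graph = nx.MultiDiGraph()

    def add_entity(self, entity_id: str, **properties) -> None:
        self.graph.add_node(entity_id, **properties)

    def add_relation(self, source: str, relation: str, target: str) -> None:
        self.graph.add_edge(source, target, key=relation)

def get_graph_backend() -> GraphBackend:
    # Calling code never changes; the env var picks the implementation
    if os.getenv("GRAPH_BACKEND", "networkx") == "neo4j":
        from backends.neo4j import Neo4jBackend  # hypothetical production module
        return Neo4jBackend()
    return NetworkXBackend()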

2. Multi-Agent Pattern

Problem: A single monolithic agent becomes complex and hard to test.

Solution: Separate concerns into specialized agents coordinated by an orchestrator (sketched after the list below).

Benefits:

  • 🧩 Clear separation of concerns
  • 🧪 Easier unit testing
  • 🔄 Can replace individual agents
  • 📈 Can scale agents independently
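
The shape of this is plain dependency injection; a rough sketch with illustrative method names:

class OrchestratorAgent:
    # Each collaborator is a specialized agent that can be tested or swapped alone
    def __init__(self, collector, graph_agent, vector_agent, reasoner):
        self.collector = collector
        self.graph_agent = graph_agent
        self.vector_agent = vector_agent
        self.reasoner = reasoner

    def answer(self, question: str) -> str:
        papers = self.collector.collect(question)
        self.graph_agent.ingest(papers)
        context = self.vector_agent.search(question)
        return self.reasoner.answer(question, context)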

3. Event-Driven Communication

Problem: Synchronous agent calls create tight coupling and bottlenecks.

Solution: Agents publish events to Kafka; consumers process them asynchronously (see the consumer sketch after the list below).

Benefits:

  • ⚡ Parallel processing (3x faster)
  • 🔌 Loose coupling
  • 🔄 Event replay for debugging
  • 📈 Horizontal scaling
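
On the consuming side, each agent type joins its own consumer group, which is what makes horizontal scaling work: instances in the same group split the topic's partitions between them. A sketch with kafka-python (topic and group names are illustrative):

import json
from kafka import KafkaConsumer

# Adding another instance with the same group_id adds throughput
consumer = KafkaConsumer(
    "papers.collected",
    bootstrap_servers="localhost:9092",
    group_id="knowledge-graph-agents",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    paper = message.value
    print(f"Ingesting: {paper.get('title', 'unknown')}")  # stub for graph ingestion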

4. Production-Grade Patterns

From day one, I implemented patterns that would matter at scale:

Circuit Breakers: Prevent cascade failures when APIs go down

@circuit_breaker(failure_threshold=5, recovery_timeout=60)
def call_external_api():
    ...

Token Budgets: Prevent runaway LLM costs

@token_budget(per_request=10000, per_user=100000)
def generate_answer():
    ...

Intelligent Caching: 40% cost reduction with dual-tier cache

@cache(ttl=3600, strategy="dual-tier")
def expensive_operation():
    ...

Dynamic Model Selection: Use cheapest model that meets requirements

model = select_model(task_type="summarization", max_latency=2.0)
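
A plausible sketch of a helper like select_model: pick the cheapest catalog model whose latency fits the budget (names and numbers are illustrative, not measured values):

# Hypothetical catalog; real prices and latencies would come from measurement
MODEL_CATALOG = [
    {"name": "gemini-2.0-flash", "cost_per_1m_tokens": 0.35, "p50_latency_s": 1.2},
    {"name": "gemini-1.5-pro", "cost_per_1m_tokens": 3.50, "p50_latency_s": 4.0},
]

def select_model(task_type: str, max_latency: float) -> str:
    # task_type could further filter candidates; omitted in this sketch
    fast_enough = [m for m in MODEL_CATALOG if m["p50_latency_s"] <= max_latency]
    return min(fast_enough, key=lambda m: m["cost_per_1m_tokens"])["name"]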

The Plan

With requirements and architecture decided, I created a development plan:

Phase 1: Core Agents (Week 1)

  • Set up project structure
  • Implement DataCollectorAgent with 3 sources
  • Implement KnowledgeGraphAgent with NetworkX
  • Implement VectorAgent with FAISS
  • Implement ReasoningAgent with Gemini
  • Basic OrchestratorAgent

Phase 2: Production Features (Week 2)

  • Add Neo4j backend for graphs
  • Add Qdrant backend for vectors
  • Implement Kafka event system
  • Add 4 more data sources
  • Implement SchedulerAgent
  • Session management and persistence
  • Apache Airflow integration

Phase 3: Frontend & Testing (Week 3)

  • React frontend with glassmorphism design
  • 7 pages (Home, Collect, Ask, Graph, Vector, Upload, Sessions)
  • Comprehensive test suite (90%+ coverage)
  • GitHub Actions CI/CD
  • Docker containerization
  • Documentation

Lessons from Planning

Looking back, here's what I learned:

✅ What Worked

  1. Dual-backend strategy was brilliant - Saved hours of dev time
  2. Starting with requirements - Kept me focused
  3. Choosing mature tech - Less debugging, more building
  4. Production patterns from day 1 - No painful refactoring later

🤔 What I'd Change

  1. Should have added Airflow earlier - Parallel collection is much faster
  2. Could have started with fewer data sources - 3 would have been enough to validate
  3. Frontend design took longer than expected - Glassmorphism is tricky to get right

💡 Key Insights

The best architecture is one that lets you move fast in development and scale in production without rewriting code.

Abstractions are worth the upfront cost when they give you optionality.

Production patterns implemented early save painful refactoring later.

Ready for Architecture?

Now that you understand the "why" behind ResearcherAI, let's dive into the "how". In the next section, I'll walk you through the system architecture and how all these pieces fit together.

โ† Back to Home Next: Architecture Design โ†’