© 2026 Khader Zatari. All rights reserved.

AI Agents & LLM System Design

Agent architectures, LLM system design patterns, and everything you need to know for AI engineering interviews.

Tool Use & Function Calling

Agent Architecture · Medium

How LLMs interact with external tools and APIs through structured function calling, including schema design and error handling.
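A minimal sketch of the dispatch side of function calling, assuming the model emits a JSON payload with `name` and `arguments` fields; the `get_weather` tool and its schema are illustrative stubs, not a real API.

```python
import json

# Hypothetical tool: a stub stands in for a real weather API call.
def get_weather(city: str) -> dict:
    return {"city": city, "temp_c": 21}

# Tool registry: name -> (callable, required parameter names).
TOOLS = {"get_weather": (get_weather, {"city"})}

def call_tool(message: str) -> dict:
    """Parse a model-emitted function call, validate it, and execute it."""
    try:
        call = json.loads(message)
        fn, required = TOOLS[call["name"]]
        args = call.get("arguments", {})
        missing = required - args.keys()
        if missing:
            # Feed this back to the model so it can retry with valid args.
            return {"error": f"missing arguments: {sorted(missing)}"}
        return {"result": fn(**args)}
    except (json.JSONDecodeError, KeyError) as exc:
        return {"error": f"malformed call: {exc}"}

print(call_tool('{"name": "get_weather", "arguments": {"city": "Oslo"}}'))
# → {'result': {'city': 'Oslo', 'temp_c': 21}}
```

Returning errors as structured payloads (rather than raising) lets the agent loop hand them back to the model for self-correction.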

ReAct Pattern

Agent Architecture · Medium

The Reasoning + Acting loop where agents interleave chain-of-thought reasoning with tool execution steps.
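The loop structure can be sketched with a scripted stand-in for the model; the `lookup` tool, the hard-coded answer, and the `finish` convention are all illustrative assumptions.

```python
# Scripted "model": returns (thought, action, action_input) per step.
def scripted_model(history):
    if not history:
        return ("I should look up the population.", "lookup", "France")
    return ("I have the answer.", "finish", history[-1][1])

def lookup(query):
    return {"France": "68 million"}.get(query, "unknown")

def react(model, tools, max_steps=5):
    """Interleave reasoning and tool calls until the model emits 'finish'."""
    history = []  # list of (action, observation) pairs fed back to the model
    for _ in range(max_steps):
        thought, action, arg = model(history)
        if action == "finish":
            return arg
        observation = tools[action](arg)
        history.append((action, observation))
    return None  # step budget exhausted

print(react(scripted_model, {"lookup": lookup}))  # → 68 million
```

The `max_steps` cap is the usual defense against agents that loop without converging.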

Multi-Agent Orchestration

Agent Architecture · Hard

Coordinating multiple specialized agents — routing, delegation, handoffs, and shared state management.
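A toy sketch of routing with shared state: a keyword router delegates to two hypothetical specialists, and both record their handoffs in a shared trace. Real routers are typically LLM classifiers, not keyword checks.

```python
# Specialist agents: each reads/writes the shared state dict.
def math_agent(task, state):
    state["trace"].append("math")
    a, b = map(int, task.split("+"))
    return a + b

def faq_agent(task, state):
    state["trace"].append("faq")
    return state["kb"].get(task, "unknown")

AGENTS = {"math": math_agent, "faq": faq_agent}

def route(task, state):
    # Naive keyword router; an LLM classifier usually sits here.
    name = "math" if any(c.isdigit() for c in task) else "faq"
    return AGENTS[name](task, state)

state = {"trace": [], "kb": {"refund policy": "30 days"}}
print(route("2+3", state), route("refund policy", state), state["trace"])
# → 5 30 days ['math', 'faq']
```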

Memory & Context Management

Agent Architecture · Medium

Strategies for managing conversation history, long-term memory, context windows, and summarization.
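One common strategy can be sketched in a few lines: keep the most recent turns verbatim and fold everything older into a running summary. Here naive truncation stands in for an LLM-generated summary.

```python
def summarize(turns):
    # Placeholder: a real system would ask an LLM to compress these turns.
    return "; ".join(t[:20] for t in turns)

def build_context(turns, window=3):
    """Recent turns stay verbatim; older turns collapse into one summary line."""
    recent = turns[-window:]
    older = turns[:-window]
    context = []
    if older:
        context.append("Summary: " + summarize(older))
    return context + recent

print(build_context(["t1", "t2", "t3", "t4", "t5"], window=3))
# → ['Summary: t1; t2', 't3', 't4', 't5']
```

The trade-off: summarization frees context-window tokens but loses detail, so window size is tuned per application.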

RAG Pipelines

Agent Architecture · Medium

Retrieval-Augmented Generation: chunking, embedding, vector search, re-ranking, and grounding LLM outputs in external knowledge.
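The retrieval half of the pipeline, end to end, in miniature: fixed-size chunking, a toy bag-of-words embedding (a trained embedding model would be used in practice), and cosine-similarity search.

```python
import math
from collections import Counter

def chunk(text, size=40):
    # Fixed-size character chunking; real pipelines often split on structure.
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text):
    # Toy bag-of-words "embedding"; stands in for a neural embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=1):
    """Return the top-k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

docs = ["the cat sat on the mat", "dogs chase cats in the park"]
print(retrieve("cat mat", docs))  # → ['the cat sat on the mat']
```

The retrieved chunks are then prepended to the prompt so the model's answer is grounded in them; re-ranking adds a second, more precise scoring pass over this top-k.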

Agent Evaluation & Testing

Agent Architecture · Medium

How to evaluate agent performance — task completion rates, trajectory analysis, regression testing, and human-in-the-loop evaluation.
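Two of these metrics can be sketched over logged runs; the run records and the exact-match trajectory check are simplified assumptions (real trajectory analysis often tolerates reordering or scores partial credit).

```python
def evaluate(runs):
    """Aggregate task completion rate and trajectory-match rate over runs."""
    n = len(runs)
    completed = sum(r["success"] for r in runs)
    # Strict check: the tool-call sequence must exactly match the expected one.
    ordered = sum(r["trajectory"] == r["expected"] for r in runs)
    return {"completion_rate": completed / n, "trajectory_match": ordered / n}

runs = [
    {"success": True, "trajectory": ["search", "answer"], "expected": ["search", "answer"]},
    {"success": False, "trajectory": ["answer"], "expected": ["search", "answer"]},
]
print(evaluate(runs))  # → {'completion_rate': 0.5, 'trajectory_match': 0.5}
```

Tracking both matters: an agent can reach the right answer via a wrong (expensive or unsafe) trajectory, or follow the right steps and still fail.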

Prompt Engineering Patterns

LLM System Design · Easy

Systematic approaches to prompting: few-shot, chain-of-thought, system prompts, and structured output formatting.
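Assembling these pieces is mostly string templating; a sketch combining a system prompt, few-shot examples, and a structured-output instruction (the classifier task and JSON field are illustrative).

```python
def build_prompt(system, examples, query):
    """Compose system prompt + few-shot examples + the live query."""
    lines = [system, ""]
    for inp, out in examples:
        lines += [f"Input: {inp}", f"Output: {out}", ""]
    # End with an open "Output:" so the model completes in the same format.
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)

prompt = build_prompt(
    "You are a sentiment classifier. Reply with JSON only.",
    [("great movie", '{"sentiment": "pos"}')],
    "awful film",
)
print(prompt)
```

The trailing `Output:` cue plus consistent example formatting is what makes few-shot prompting reliable: the model pattern-matches the demonstrated structure.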

Fine-Tuning vs RAG Decisions

LLM System Design · Medium

When to fine-tune a model and when to use RAG, weighing trade-offs in cost, latency, accuracy, and maintainability.

Embedding Pipelines

LLM System Design · Medium

Designing pipelines for generating, storing, and querying embeddings — vector databases, indexing strategies, and similarity search.
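The storage-and-query core can be sketched as a flat (brute-force) in-memory index, one indexing strategy among several; normalizing vectors at insert time reduces cosine similarity at query time to a plain dot product.

```python
import math

class VectorIndex:
    """Minimal flat vector index: exact brute-force similarity search."""

    def __init__(self):
        self.ids, self.vecs = [], []

    def add(self, doc_id, vec):
        # Normalize once at insert so search needs only a dot product.
        norm = math.sqrt(sum(x * x for x in vec))
        self.ids.append(doc_id)
        self.vecs.append([x / norm for x in vec])

    def search(self, query, k=1):
        norm = math.sqrt(sum(x * x for x in query))
        q = [x / norm for x in query]
        scores = [sum(a * b for a, b in zip(q, v)) for v in self.vecs]
        ranked = sorted(zip(scores, self.ids), reverse=True)
        return [doc_id for _, doc_id in ranked[:k]]

idx = VectorIndex()
idx.add("a", [1.0, 0.0])
idx.add("b", [0.0, 1.0])
print(idx.search([0.9, 0.1]))  # → ['a']
```

Flat search is exact but O(n) per query; vector databases trade a little recall for speed with approximate indexes (e.g. HNSW or IVF) once collections grow.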

Guardrails & Safety

LLM System Design · Medium

Input/output validation, content filtering, PII detection, prompt injection defense, and responsible AI practices.
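An input-side guardrail in miniature: regex PII detection plus a naive phrase-list heuristic for prompt injection. The patterns and markers are illustrative; production systems layer trained classifiers on top of rules like these.

```python
import re

# Illustrative PII patterns; real deployments use broader, tested rule sets.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}
INJECTION_MARKERS = ("ignore previous instructions", "reveal your system prompt")

def check_input(text):
    """Return a list of detected issues; empty means the input passes."""
    issues = [name for name, pat in PII_PATTERNS.items() if pat.search(text)]
    if any(m in text.lower() for m in INJECTION_MARKERS):
        issues.append("prompt_injection")
    return issues

print(check_input("contact me at bob@example.com"))  # → ['email']
```

Keyword lists are trivially bypassed, which is why injection defense in practice combines heuristics, classifier models, and privilege separation between system and user content.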

Scaling Inference

LLM System Design · Hard

Batching, model serving (vLLM, TensorRT), GPU optimization, auto-scaling, and multi-model deployment strategies.
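The batching idea, stripped of the serving machinery, is just grouping queued requests so each forward pass amortizes GPU cost over many inputs; this greedy sketch omits the timeout that real servers use to flush partial batches under low load.

```python
from collections import deque

def dynamic_batches(requests, max_batch=4):
    """Greedily group queued requests into batches of at most max_batch."""
    queue = deque(requests)
    while queue:
        # A real server also flushes on a timeout so lone requests
        # aren't stuck waiting for the batch to fill.
        batch = [queue.popleft() for _ in range(min(max_batch, len(queue)))]
        yield batch  # one forward pass would serve this whole batch

print(list(dynamic_batches(list(range(6)), max_batch=4)))
# → [[0, 1, 2, 3], [4, 5]]
```

Systems like vLLM go further with continuous batching, admitting new requests into a batch as earlier sequences finish generating rather than waiting for whole batches to complete.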

Cost Optimization

LLM System Design · Medium

Token management, caching strategies, model selection, prompt compression, and building cost-efficient LLM applications.
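One caching strategy sketched: an exact-match response cache keyed by a hash of model and prompt, wrapped around a hypothetical `llm_call` function. Semantic caches relax the exact-match requirement by matching on embedding similarity instead.

```python
import hashlib

class CachedClient:
    """Wrap an LLM call with an exact-match response cache."""

    def __init__(self, llm_call):
        self.llm_call = llm_call  # hypothetical (model, prompt) -> str
        self.cache = {}
        self.calls = 0  # billable calls actually made

    def complete(self, model, prompt):
        key = hashlib.sha256(f"{model}\x00{prompt}".encode()).hexdigest()
        if key not in self.cache:
            self.calls += 1
            self.cache[key] = self.llm_call(model, prompt)
        return self.cache[key]

client = CachedClient(lambda model, prompt: f"response to {prompt}")
client.complete("small-model", "hi")
client.complete("small-model", "hi")  # served from cache, no second call
print(client.calls)  # → 1
```

Including the model name in the key matters once you add model selection: the same prompt routed to a cheaper model must not collide with a cached answer from a larger one.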