How LangChain Works — Deep Dive
The Big Picture
LangChain works by chaining together components — each component does one job, and they pass data to each other in a pipeline.
Input → [Component 1] → [Component 2] → [Component 3] → Output
Step-by-Step Execution Flow
```
┌─────────────────────────────────────────────────────┐
│  USER INPUT                                         │
│  "Summarize my uploaded PDF"                        │
└──────────────────────┬──────────────────────────────┘
                       ↓
┌─────────────────────────────────────────────────────┐
│  1. DOCUMENT LOADER                                 │
│  Reads PDF → extracts raw text                      │
└──────────────────────┬──────────────────────────────┘
                       ↓
┌─────────────────────────────────────────────────────┐
│  2. TEXT SPLITTER                                   │
│  Splits text into smaller chunks (e.g. 500          │
│  tokens each) so LLM can process them               │
└──────────────────────┬──────────────────────────────┘
                       ↓
┌─────────────────────────────────────────────────────┐
│  3. EMBEDDINGS + VECTOR STORE                       │
│  Converts chunks into vectors → stores in DB        │
│  (enables semantic search later)                    │
└──────────────────────┬──────────────────────────────┘
                       ↓
┌─────────────────────────────────────────────────────┐
│  4. RETRIEVER                                       │
│  User asks question → finds most relevant chunks    │
└──────────────────────┬──────────────────────────────┘
                       ↓
┌─────────────────────────────────────────────────────┐
│  5. PROMPT TEMPLATE                                 │
│  Injects retrieved chunks + question into prompt    │
└──────────────────────┬──────────────────────────────┘
                       ↓
┌─────────────────────────────────────────────────────┐
│  6. LLM (Claude / GPT etc.)                         │
│  Generates answer based on context                  │
└──────────────────────┬──────────────────────────────┘
                       ↓
┌─────────────────────────────────────────────────────┐
│  7. OUTPUT PARSER                                   │
│  Formats raw LLM response → structured output       │
└──────────────────────┬──────────────────────────────┘
                       ↓
                 FINAL ANSWER
```
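The splitting step can be sketched without LangChain at all. This is a simplified character-based chunker (real splitters such as `RecursiveCharacterTextSplitter` also respect separators and count tokens rather than characters); `split_text` is a hypothetical helper name, not a LangChain API:

```python
# Framework-free sketch of step 2 (text splitting): break a long text into
# overlapping chunks so each one fits in the LLM's context window.
# Chunk sizes are in characters here; real splitters usually count tokens.

def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split `text` into chunks of at most `chunk_size` characters,
    repeating `overlap` characters between consecutive chunks so a
    sentence cut at a boundary still appears whole in one chunk."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

doc = "A" * 1200
chunks = split_text(doc, chunk_size=500, overlap=50)
print(len(chunks))  # → 3 chunks cover the 1200-character document
```

The overlap is the design choice worth noting: without it, a fact straddling a chunk boundary would be split across two chunks and might never be retrieved intact.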
Core Mechanism 1 — Chains
A Chain is the most basic unit. It connects components in sequence.
```python
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

# Step 1: Define a prompt template
prompt = PromptTemplate(
    input_variables=["topic"],
    template="Explain {topic} in simple terms.",
)

# Step 2: Connect prompt → LLM
# (`llm` is assumed to be an already-constructed LLM instance)
chain = LLMChain(llm=llm, prompt=prompt)

# Step 3: Run it
result = chain.run("quantum computing")
# Output: "Quantum computing is..."
```
Data flows like this:
```
"quantum computing"
        ↓
PromptTemplate fills in → "Explain quantum computing in simple terms."
        ↓
LLM generates response
        ↓
Output returned
```
Core Mechanism 2 — Memory
Memory injects past conversation into every new prompt automatically.
```python
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()

# Turn 1
memory.save_context(
    {"input": "My name is Alex"},
    {"output": "Nice to meet you, Alex!"},
)

# Turn 2 — memory auto-injects history into the next prompt
print(memory.load_memory_variables({}))
# → {"history": "Human: My name is Alex\nAI: Nice to meet you, Alex!"}
```
Internally, every prompt becomes:
```
[Past conversation history]   ← injected by memory
[Current user message]        ← new input
[LLM response]
```
Types of memory:
| Type | How it works |
|---|---|
| `ConversationBufferMemory` | Stores the full raw conversation |
| `ConversationSummaryMemory` | Summarizes old turns to save tokens |
| `ConversationBufferWindowMemory` | Keeps only the last N turns |
| `VectorStoreRetrieverMemory` | Retrieves semantically relevant past messages |
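To make the injection mechanism concrete, here is a framework-free sketch of what a buffer memory does under the hood. The `BufferMemory` class here is illustrative, not LangChain's:

```python
# Minimal sketch of buffer-style memory: keep a running transcript and
# prepend it to every new prompt before it reaches the LLM.

class BufferMemory:
    def __init__(self) -> None:
        self.turns: list[tuple[str, str]] = []

    def save_context(self, user_input: str, ai_output: str) -> None:
        # Record one completed conversation turn
        self.turns.append((user_input, ai_output))

    def history(self) -> str:
        # Render the transcript in "Human: ... / AI: ..." form
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.turns)

    def build_prompt(self, new_message: str) -> str:
        # [Past conversation history] + [Current user message]
        return f"{self.history()}\nHuman: {new_message}\nAI:"

memory = BufferMemory()
memory.save_context("My name is Alex", "Nice to meet you, Alex!")
print(memory.build_prompt("What's my name?"))
```

Because the full transcript is re-sent on every turn, this style of memory grows linearly with conversation length, which is exactly why the summary and window variants in the table above exist.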
Core Mechanism 3 — Retrieval (RAG)
RAG = Retrieval-Augmented Generation. It lets the LLM answer questions about YOUR data by retrieving the most relevant passages at query time and feeding them to the model as context.
```
YOUR DATA (PDF, website, DB)
        ↓
Split into chunks
        ↓
Convert to vectors (embeddings)
        ↓
Store in Vector DB (e.g. FAISS, Pinecone)
        ↓
User asks: "What does page 5 say about revenue?"
        ↓
Search vector DB → find top 3 relevant chunks
        ↓
Inject chunks into prompt → LLM answers
```
```python
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

# Store documents as vectors
# (`docs` is assumed to be a list of already-loaded Document chunks)
vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())

# Retrieve relevant chunks for a query
retriever = vectorstore.as_retriever()
relevant_docs = retriever.get_relevant_documents("What is the revenue?")
```
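For intuition about what the vector store is doing, here is a toy, dependency-free version of the retrieval step. Bag-of-words counts stand in for learned embeddings, and all the names and sample chunks are made up for illustration:

```python
# Toy semantic search: embed chunks as vectors, then return the chunk whose
# vector is most similar (by cosine similarity) to the query vector.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in embedding: word-count vector (real systems use learned models)
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = [
    "revenue grew 20 percent in the last quarter",
    "the new office opened in Toronto",
    "quarterly revenue reached 5 million dollars",
]
query = "what was the revenue last quarter"

# Rank chunks by similarity to the query — the "Search vector DB" step
ranked = sorted(chunks, key=lambda c: cosine(embed(query), embed(c)), reverse=True)
print(ranked[0])  # → "revenue grew 20 percent in the last quarter"
```

The top-ranked chunks are what gets injected into the prompt template in the next step of the pipeline.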
Core Mechanism 4 — Agents
Agents are the most powerful part. The LLM dynamically decides which tools to use and in what order.
```
User: "Search the web for today's Bitcoin price and convert it to CAD"
        ↓
Agent thinks: "I need 2 tools — web_search, then currency_converter"
        ↓
Step 1: calls web_search("Bitcoin price today")
        ↓
Step 2: reads result → $63,000 USD
        ↓
Step 3: calls currency_converter(63000, "USD", "CAD")
        ↓
Step 4: reads result → $86,000 CAD
        ↓
Agent responds: "Bitcoin is ~$86,000 CAD today"
```
The internal reasoning loop (ReAct pattern):
```
Thought: What do I need to do?
Action: Call tool X with input Y
Observation: Tool returned Z
Thought: Now I need to...
Action: Call tool A with input B
...repeat until...
Final Answer: [complete response]
```
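The loop can be sketched in plain Python with stubbed tools and a scripted decision sequence standing in for the LLM. The tool names, price, and exchange rate are all made up for illustration; a real agent would let the model choose the next action at each step:

```python
# Skeleton of the ReAct pattern: alternate Thought → Action → Observation
# until enough information is gathered to produce a Final Answer.

def web_search(query: str) -> str:
    return "Bitcoin price: 63000 USD"      # stubbed tool result

def currency_converter(amount: float, src: str, dst: str) -> float:
    return amount * 1.37                   # stubbed USD→CAD exchange rate

TOOLS = {"web_search": web_search, "currency_converter": currency_converter}

def run_agent() -> str:
    # Thought: I need the current price → Action: call web_search
    observation = TOOLS["web_search"]("Bitcoin price today")
    # Observation: parse the number out of the tool's text output
    usd = float(observation.split()[2])
    # Thought: now convert it → Action: call currency_converter
    cad = TOOLS["currency_converter"](usd, "USD", "CAD")
    # Final Answer
    return f"Bitcoin is ~${cad:,.0f} CAD today"

print(run_agent())  # → "Bitcoin is ~$86,310 CAD today"
```

The key structural point: the agent's "reasoning" lives between tool calls, and each tool's output (the observation) becomes input to the next decision.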
How Components Connect — LCEL
Modern LangChain composes components with LCEL (LangChain Expression Language), a clean pipe (`|`) syntax:
```python
from langchain_core.runnables import RunnablePassthrough

# Build a RAG chain using pipes
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt_template
    | llm
    | output_parser
)

# Run it
rag_chain.invoke("What is the company's revenue?")
```
Each `|` passes the output of one component as input to the next — just like Unix pipes.
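The pipe mechanics are easy to reimplement in miniature. This sketch is not LangChain's actual `Runnable` (which supports batching, streaming, and async as well), but it shows how `|` composes callables left to right:

```python
# Minimal reimplementation of the LCEL pipe idea: a Runnable wraps a
# function, and `|` composes two Runnables into a new one.

class Runnable:
    def __init__(self, fn):
        self.fn = fn

    def invoke(self, x):
        return self.fn(x)

    def __or__(self, other: "Runnable") -> "Runnable":
        # (a | b).invoke(x) == b.invoke(a.invoke(x))
        return Runnable(lambda x: other.invoke(self.invoke(x)))

# Three toy components standing in for prompt template, LLM, and parser
prompt = Runnable(lambda q: f"Explain {q} in simple terms.")
fake_llm = Runnable(lambda p: f"[LLM answer to: {p}]")
parser = Runnable(lambda s: s.strip())

chain = prompt | fake_llm | parser
print(chain.invoke("quantum computing"))
# → "[LLM answer to: Explain quantum computing in simple terms.]"
```

Python's `__or__` operator overload is what makes the pipe syntax possible: `a | b` simply builds a new `Runnable` whose `invoke` threads data through `a` and then `b`.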
Full Internal Flow Summary
```
User Query
    │
    ▼
[Memory] ───────────────────────────────────┐
    │                                       │
    ▼                                       │
[Retriever] → finds relevant docs           │
    │                                       │
    ▼                                       ▼
[Prompt Template] ← fills in: query + docs + history
    │
    ▼
[LLM Model] → generates raw text
    │
    ▼
[Output Parser] → structures the response
    │
    ▼
[Memory] ← saves this turn to history
    │
    ▼
Final Response → User
```
Key Takeaway
LangChain works by breaking AI applications into modular, composable pieces — each doing one job well — and connecting them into powerful pipelines that can remember, retrieve, reason, and act.