RAG : LangChain LCEL Vs. LangGraph — Part 1
In the ever-evolving world of AI, Natural Language Processing (NLP) tools are becoming increasingly sophisticated, making it easier for developers to build complex, conversational systems. Two prominent frameworks that have emerged to assist in creating powerful AI applications are LangChain and LangGraph.
LangChain’s LCEL (LangChain Execution Logic) and LangGraph, though both aimed at constructing AI-driven pipelines, differ significantly in their architecture, workflow, and optimization for specific use cases. Understanding these differences can help developers choose the right tool for their project needs, whether they are building a document retriever powered by Azure AI Search, a chatbot, or more advanced AI workflows.
- LangChain LCEL focuses on orchestrating LLM-based chains, simplifying the integration of language models, APIs, databases, and other components. It enables users to create and manipulate chains of operations, leveraging language models to manage tasks such as reasoning, summarizing, and querying in a flexible and modular manner.
- LangGraph, on the other hand, introduces a more graph-based, visual-first approach to building AI workflows. Instead of linear chains, LangGraph uses a node-and-edge structure to represent AI processes, allowing for more dynamic and visually trackable interactions between different models and data sources. This is particularly useful in complex systems where operations may have branching logic or need robust dependency management.
This blog post dives into the comparison between LangChain LCEL and LangGraph, exploring their architectures and real-world applications, with a specific use case in retrieval-augmented generation (RaG) using Azure AI Search for document onboarding. We’ll highlight the strengths and limitations of each framework, helping you understand how to leverage these tools effectively for different AI-driven tasks.
Repository : charotAmine/LangChain-LCEL-Vs-LangGraph (github.com)
Let’s create a Retrieval-Augmented Generation (RaG) example for an onboarding use case using Azure AI Search. The idea behind RAG is that instead of relying solely on a language model to generate responses, we retrieve relevant data from external sources (like Azure AI Search), which enriches the generated response with more accurate and up-to-date information.
Use Case: Employee Onboarding with RaG Using Azure AI Search
In this scenario, a company uses RAG to assist employees with onboarding. Employees upload their onboarding documents to Azure AI Search, and later, they can query this data using a chatbot to retrieve relevant information. The system will use Azure AI Search to retrieve the documents and an LLM (Large Language Model) to generate natural language answers based on those documents.
What we are going to build :
- Upload onboarding documents (PDFs, docs, HR policies, etc.) to Azure AI Search.
- Use RAG to answer employees’ onboarding-related questions by retrieving relevant documents and generating answers using the LLM.
Basic Typical use case :
Step 1: Upload Data to Azure AI Search
We assume that employee onboarding documents, such as company policies, compliance guidelines, or training materials, are uploaded to Azure AI Search for indexing. This step would be done outside of the RaG pipeline, possibly through a data ingestion pipeline or a manual upload process.
Here’s a Python example of how documents could be uploaded to Azure AI Search:
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import SimpleField, SearchIndex
import os
# Azure Cognitive Search details
endpoint = "" # Replace with your Azure search endpoint
api_key = "" # Replace with your Azure search API key
index_name = "" # Define your index name
# Initialize search client and index client
index_client = SearchIndexClient(endpoint=endpoint, credential=AzureKeyCredential(api_key))
search_client = SearchClient(endpoint=endpoint, index_name=index_name, credential=AzureKeyCredential(api_key))
# Define the search index schema
fields = [
SimpleField(name="id", type="Edm.String", key=True),
SimpleField(name="content", type="Edm.String")
]
index = SearchIndex(name=index_name, fields=fields)
# Create the index in Azure Cognitive Search
index_client.create_index(index)
# Load and upload documents from local .txt files
documents = []
file_directory = "./data" # Directory where your .txt files are stored
for filename in os.listdir(file_directory):
with open(os.path.join(file_directory, filename), "r") as file:
content = file.read()
documents.append({"id": filename.split(".")[0], "content": content})
# Upload documents to Azure Cognitive Search
search_client.upload_documents(documents)
print("Documents uploaded successfully.")
Step 2: Retrieval-Augmented Generation (RaG) with Azure AI Search and LLM
Once the documents are indexed, employees can ask questions, and the system will retrieve relevant documents from Azure AI Search and use an LLM to generate a human-friendly response.
RaG Pipeline
- Retrieve: Query Azure AI Search for documents related to the employee’s question.
- Augment: Use the retrieved documents as context.
- Generate: Use the LLM to generate an answer based on the retrieved data.
1. LCEL (LangChain Execution Language) Example
LCEL Workflow:
In LCEL, the workflow would be declarative, using pre-built chains. The steps are as follows:
- Retrieve Documents from Azure AI Search.
- Use an LLM to generate the final response based on retrieved documents.
- The logic is simple and linear: Retrieve → Generate → Respond.
Here’s how you might implement this in LCEL:
from langchain_openai import AzureChatOpenAI
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain_community.retrievers import AzureAISearchRetriever
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory
import os
# Azure Cognitive Search details
endpoint = "" # Replace with your Azure search endpoint
api_key = "" # Replace with your Azure search API key
index_name = "" # Define your index name
store = {}
os.environ["OPENAI_API_VERSION"] = "2023-12-01-preview"
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://eastus.api.cognitive.microsoft.com/"
os.environ["AZURE_OPENAI_API_KEY"] = ""
# Initialize LLM (Azure OpenAI)
llm = AzureChatOpenAI(
azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
azure_deployment="gpt-4o",
api_key=os.environ["AZURE_OPENAI_API_KEY"],
api_version=os.environ["OPENAI_API_VERSION"],
)
retriever = AzureAISearchRetriever(
service_name=endpoint,api_key=api_key, content_key="content", top_k=1, index_name=index_name
)
def create_history_aware(retriever):
contextualize_q_system_prompt = (
"Given a chat history and the latest user question "
"which might reference context in the chat history, "
"formulate a standalone question which can be understood "
"without the chat history. Do NOT answer the question, "
"just reformulate it if needed and otherwise return it as is."
)
contextualize_q_prompt = ChatPromptTemplate.from_messages(
[
("system", contextualize_q_system_prompt),
MessagesPlaceholder("chat_history"),
("human", "{input}"),
]
)
return create_history_aware_retriever(llm, retriever, contextualize_q_prompt)
def create_question_answer_chain():
system_prompt = (
"""
You are an onboarding assistant. Here's the context from company documents: {context}
"""
)
qa_prompt = ChatPromptTemplate.from_messages(
[
("system", system_prompt),
MessagesPlaceholder("chat_history"),
("human", "{input}"),
]
)
return create_stuff_documents_chain(llm, qa_prompt)
def create_retrieval():
return create_retrieval_chain(create_history_aware(retriever), create_question_answer_chain())
def create_conversational_rag_chain(session_id):
def get_session_history(session_id: str) -> BaseChatMessageHistory:
if session_id not in store:
store[session_id] = ChatMessageHistory()
return store[session_id]
return RunnableWithMessageHistory(
create_retrieval(),
get_session_history,
input_messages_key="input",
history_messages_key="chat_history",
output_messages_key="answer",
)
response = create_conversational_rag_chain('testMedium').invoke(
{"input": "who is the company"},
config={"configurable": {"session_id": 'testMecdium'}}
)
print(response["answer"])
Running it :
2. LangGraph Example
LangGraph Workflow:
LangGraph excels when you need to create flexible, complex workflows. In this example, we can also imagine a more advanced onboarding scenario where:
- We first check the type of question the employee is asking.
- Depending on the question, we either retrieve information from Azure AI Search or invoke a secondary process (e.g., FAQs, employee handbooks).
- Generate a response based on the retrieved documents or other data.
LangGraph lets us build nodes representing different operations (retrieve, generate, check), and we can connect them based on conditional logic. It allows branching and more complex decision-making in the process.
LangGraph Code Example:
from langgraph.graph import StateGraph, END
from typing import Dict, TypedDict
from langchain_openai import AzureChatOpenAI
from langchain_community.retrievers import AzureAISearchRetriever
import os
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from typing import TypedDict
from IPython.display import Image, display
# Azure Cognitive Search details
endpoint = "" # Replace with your Azure search endpoint
api_key = "" # Replace with your Azure search API key
index_name = "" # Define your index name
store = {}
os.environ["OPENAI_API_VERSION"] = ""
os.environ["AZURE_OPENAI_ENDPOINT"] = ""
os.environ["AZURE_OPENAI_API_KEY"] = ""
# Initialize LLM (Azure OpenAI)
llm = AzureChatOpenAI(
azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
azure_deployment="gpt-4o",
api_key=os.environ["AZURE_OPENAI_API_KEY"],
api_version=os.environ["OPENAI_API_VERSION"],
)
retriever = AzureAISearchRetriever(
service_name=endpoint,api_key=api_key, content_key="content", top_k=5, index_name=index_name
)
def retrieve(state):
print("RETRIEVE NODE")
state_dict = state["keys"]
question = state_dict["question"]
local = state_dict["local"]
documents = retriever.invoke(question)
return {"keys": {"documents": documents, "local": local,
"question": question}}
def format_docs(state):
print("FORMAT DOCUMENT NODE")
state_dict = state["keys"]
question = state_dict["question"]
local = state_dict["local"]
documents = state_dict["documents"]
document = "\n\n".join(doc.page_content for doc in documents)
return {"keys": {"formatted_documents": document, "local": local,
"question": question}}
def generate(state):
print("generate:", state)
state_dict = state["keys"]
question = state_dict["question"]
formatted_docs = state_dict["formatted_documents"]
result = chain_with_prompt.invoke({"question": question, "context": formatted_docs})
return {"keys": {"formatted_documents": formatted_docs, "response": result,
"question": question}}
template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
chain_with_prompt = prompt | llm | StrOutputParser()
class GraphState(TypedDict):
"""
Represents the state of our graph.
Attributes:
keys: A dictionary where each key is a string.
"""
keys: Dict[str, any]
# Build LangGraph workflow
workflow = StateGraph(GraphState)
workflow.add_node("retrieve", retrieve) # retrieve
workflow.add_node("format_document", format_docs) # format documents
workflow.add_node("generate", generate) # generatae
workflow.add_edge("retrieve", "format_document")
workflow.add_edge("format_document", "generate")
workflow.add_edge("generate", END)
workflow.set_entry_point("retrieve")
app = workflow.compile()
try:
display(Image(app.get_graph(xray=True).draw_mermaid_png()))
except:
pass
# Employee asks a question
employee_query = "What is the company’s policy on remote work?"
result = app.invoke({"keys": { "local": "",
"question": employee_query}})
print("Response : ",result['keys']['response'])
By Running, I have displayed first the nodes and edges like :
Then, let me ask him about the remote work policy and I have an answer :
Which is like the documents.
LangGraph Workflow Summary:
- Node-based architecture: LangGraph allows you to define nodes for each part of the workflow (input, retrieval, generation).
- Conditional logic: You can add branches for different actions based on the type of input. For example, certain queries might not need to hit Azure AI Search, but simply reference pre-built answers (like company FAQs).
- Complex workflows: You can build multi-step, non-linear workflows (e.g., verifying the query type, adding approval steps, sending emails, etc.).
Pros:
- Highly flexible: Ideal for complex workflows where actions depend on user input or other conditions.
- Customizable nodes: You can easily add new nodes (e.g., a node for checking employee role or department).
- Reusable components: Nodes can be reused across different workflows.
Cons:
- More complex: Requires additional setup compared to LCEL, which could be overkill for simple tasks.
- Initial overhead: More work upfront to define all nodes and their interactions.
Scenario 1: Using LCEL for Onboarding Q&A
- Simple and Linear: Employee asks “What is the company’s remote work policy?” LCEL retrieves documents, uses the LLM to generate a response, and sends the answer.
- Ease of use: LCEL is ideal for a straightforward, Q&A-style onboarding process, where every query follows the same linear path (retrieve and answer). No complex decision-making is involved.
Scenario 2: Using LangGraph for Onboarding Q&A
- Dynamic and Flexible: LangGraph can be used for a more adaptive onboarding process. For instance, if the employee asks a general question like “What are my onboarding tasks?”, it might not query Azure Search but instead retrieve pre-stored FAQs. If the question is more specific, like “What’s the company’s remote work policy?”, it uses Azure Search and LLM to generate an answer.
- Conditional branching: For more advanced use cases, you might have different nodes for legal approvals, external API calls, or even scheduling onboarding meetings.
When to Choose LCEL:
- Simplicity: Perfect for small companies or straightforward Q&A processes where you don’t need much customization.
- Minimal Setup: If the onboarding process is simple and doesn’t involve complex decision-making, LCEL does the job without much overhead.
When to Choose LangGraph:
- Complex workflows: If your onboarding system needs to handle a variety of scenarios (e.g., HR approvals, security clearances, role-based onboarding), LangGraph’s node-based flexibility is better.
- Customization: LangGraph allows you to create workflows that dynamically adapt based on input, such as fetching information from different sources based on the type of question.
Conclusion:
- LCEL: Linear, easy-to-implement, and great for simple workflows.
- LangGraph: Non-linear, flexible, and designed for more advanced, dynamic onboarding processes
- Repository : charotAmine/LangChain-LCEL-Vs-LangGraph (github.com)
LangGraph may seem complex at first glance, but it’s actually highly modular and closely mirrors the way the human brain processes information — step-by-step and logically. Humans naturally tend to break problems down into smaller, sequential tasks, and LangGraph aligns with that intuitive workflow. I completely agree with you when it comes to debugging. With LCEL, after writing tons of code, I sometimes struggle to understand my own logic, even when everything works. Now, imagine the frustration when it doesn’t work and I have to debug it!
Next Steps :
Here you go : Next Steps will be about developping and End To End AI Onboarding project ! Your ideas are welcomed !