Core#

The langchain_core package provides tools that are used throughout the entire LangChain ecosystem. This section considers some of them.

Messages#

There are several classes that represent different aspects of prompting with LangChain.

| Class Name | Role | General Description |
|---|---|---|
| SystemMessage | System | Provides instructions or context to "prime" the model's behavior. It sets the persona, tone, or rules for the entire conversation. Typically the first message in a list. |
| HumanMessage | Human | Represents the user's input. This is the message that a human sends to the model to ask a question or provide a command. |
| AIMessage | AI (Assistant) | Represents the response from the language model. This is the output you get after invoking a model. It can contain text, tool calls, or other data. |
| ToolMessage | Tool | Represents the output or result of a tool function that was invoked by the AI. This is used to pass the outcome of a tool call back to the model for further processing. |

The primary design of LangChain is to pass a list of message objects to the model, which returns an output of type AIMessage.
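As a minimal sketch, a typical call looks like this (the ChatOllama model shown in the comment is just one option; any chat model works):

from langchain_core.messages import SystemMessage, HumanMessage

# A typical message list: a system message followed by the user's input.
messages = [
    SystemMessage("You are a terse assistant."),
    HumanMessage("What is LangChain?"),
]

# Any chat model accepts this list and returns an AIMessage, e.g.:
# from langchain_ollama import ChatOllama
# answer = ChatOllama(model="qwen3:8b").invoke(messages)  # -> AIMessage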


All LangChain messages are children of the langchain_core.messages.BaseMessage class. The following cell shows the relationship:

from langchain_core.messages import (
    HumanMessage,
    SystemMessage,
    AIMessage,
    ToolMessage,
    BaseMessage
)

(
    issubclass(HumanMessage, BaseMessage),
    issubclass(SystemMessage, BaseMessage),
    issubclass(AIMessage, BaseMessage),
    issubclass(ToolMessage, BaseMessage)
)
(True, True, True, True)

Prompts#

In the LangChain paradigm, a prompt is a structured input for a model. It can include a system message, user input, or message history. The langchain_core package provides various tools for prompt templating. The following table lists the most popular classes used for templating and their descriptions.

| Class / Function | Description |
|---|---|
| BasePromptTemplate | Abstract base class for all prompt templates. |
| StringPromptTemplate | Base class for string-based templates (like f-string). |
| PromptTemplate | Core template class for generating prompts with variables. Supports methods like from_template, from_file, from_examples, format, invoke, ainvoke, and batching. |
| FewShotPromptTemplate | String-based prompt template with few-shot example support. |
| FewShotPromptWithTemplates | String template variant with embedded few-shot examples. |
| PipelinePromptTemplate | Combines multiple prompt templates into a pipeline. |
| BaseChatPromptTemplate | Base class for chat-style prompt templates. |
| ChatPromptTemplate | Template for chat models; builds multi-role messages. Supports from_messages and dynamic placeholders. |
| AgentScratchPadChatPromptTemplate | Specialized chat prompt for agent scratchpad patterns. |
| AutoGPTPrompt | Chat prompt variant used in AutoGPT-style workflows. |
| BaseMessagePromptTemplate | Base class for message-level prompt templates. |
| BaseStringMessagePromptTemplate | Base class for message templates using string patterns. |
| ChatMessagePromptTemplate | Generates chat messages (with roles, e.g., system/human/AI) from template strings. |
| HumanMessagePromptTemplate | Template specifically for human messages. |
| AIMessagePromptTemplate | Template specifically for AI messages. |
| SystemMessagePromptTemplate | Template specifically for system messages. |
| MessagesPlaceholder | Placeholder to inject dynamic message history into a chat template. |


Consider the PromptTemplate class. You can use the from_template method to create a template. A substitutable variable is specified with curly braces ({}). The format method of the PromptTemplate class returns a string with all values substituted.

from langchain_core.prompts import PromptTemplate

ans = PromptTemplate.from_template("Your input is: {here}")
print(type(ans))
ans.format(here="Hello!")
<class 'langchain_core.prompts.prompt.PromptTemplate'>
'Your input is: Hello!'
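Chat-style templates work the same way. The following sketch (the variable names are illustrative) uses ChatPromptTemplate.from_messages to build a multi-role prompt and MessagesPlaceholder to inject a dynamic message history:

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage

chat_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder("history"),
    ("human", "{question}"),
])

# invoke substitutes the variables and returns a ChatPromptValue
# holding the complete message list.
chat_template.invoke({
    "history": [HumanMessage("Hi!"), AIMessage("Hello, how can I help?")],
    "question": "What can you do?",
})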

Vector stores#

LangChain integrates with various vector stores. The following table shows a few of them:

| Class name | Package |
|---|---|
| InMemoryVectorStore | langchain_core.vectorstores |
| FAISS | langchain_community.vectorstores.faiss |
| PGVector | langchain-postgres (langchain.vectorstores.pgvector) |
| ElasticsearchStore | langchain-elasticsearch (langchain.vectorstores.elasticsearch) |
| AzureCosmosDBMongoVCoreVectorSearch | langchain-azure-ai (langchain.vectorstores.azure_cosmos_db_mongo_vcore) |
| AzureCosmosDBNoSqlVectorSearch | langchain-azure-ai (langchain.vectorstores.azure_cosmos_db_no_sql) |
| AzureSearch | langchain-azure-ai (langchain.vectorstores.azuresearch) |
| SQLServer_VectorStore | langchain-sqlserver (langchain.vectorstores.sqlserver) |

For more details, check the official LangChain vector stores reference.


Consider the simplest launch option, InMemoryVectorStore, for basic operations.

To initialize the corresponding object, you must first create an embeddings object. In this case, we will use OllamaEmbeddings, so Ollama must be running locally first.

from langchain_core.vectorstores import InMemoryVectorStore
from langchain_ollama import OllamaEmbeddings
from langchain_core.documents.base import Document

vector_store = InMemoryVectorStore(OllamaEmbeddings(model="all-minilm"))

Use the add_documents method to add items to the vector store. This method takes a list of documents.

documents = [
    Document(s) for s in [
        "This is dog",
        "This is cat.",
        "My car was crashed"
    ]
]

vector_store.add_documents(documents=documents)
['5895b10e-af40-4263-b0c6-ff4803bd49a6',
 '4ed7ff85-4f40-4881-8dae-59158b608c62',
 'bac54e30-a7f8-4d7e-b682-d312b29c580c']

The similarity_search method locates documents that are similar to the provided text. The following cells show the outputs for selected examples to make the results easier to interpret.

vector_store.similarity_search("This is cow")
[Document(id='5895b10e-af40-4263-b0c6-ff4803bd49a6', metadata={}, page_content='This is dog'),
 Document(id='4ed7ff85-4f40-4881-8dae-59158b608c62', metadata={}, page_content='This is cat.'),
 Document(id='bac54e30-a7f8-4d7e-b682-d312b29c580c', metadata={}, page_content='My car was crashed')]
vector_store.similarity_search("Accidents sometimes happens")
[Document(id='bac54e30-a7f8-4d7e-b682-d312b29c580c', metadata={}, page_content='My car was crashed'),
 Document(id='5895b10e-af40-4263-b0c6-ff4803bd49a6', metadata={}, page_content='This is dog'),
 Document(id='4ed7ff85-4f40-4881-8dae-59158b608c62', metadata={}, page_content='This is cat.')]

Retriever#

The as_retriever method returns a retriever object that can be used for searching.

retriever = vector_store.as_retriever(k=1)
retriever.invoke("09.11")
[Document(id='bac54e30-a7f8-4d7e-b682-d312b29c580c', metadata={}, page_content='My car was crashed'),
 Document(id='4ed7ff85-4f40-4881-8dae-59158b608c62', metadata={}, page_content='This is cat.'),
 Document(id='5895b10e-af40-4263-b0c6-ff4803bd49a6', metadata={}, page_content='This is dog')]
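Note that all three documents were returned even though k=1 was passed: as_retriever does not take k as a direct argument, so it appears to have been ignored here. The supported way to limit the number of results is the search_kwargs parameter; a minimal sketch:

retriever = vector_store.as_retriever(search_kwargs={"k": 1})
retriever.invoke("Accidents sometimes happen")  # returns at most one document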

Output parsers#

An output parser lets you specify the desired format for the model's responses and then parses the model's outputs into that format. The following parsers are currently implemented in LangChain:

  • JsonOutputParser: Parses the output of an LLM call into a JSON object.

  • JsonOutputToolsParser: Parses tool calls from an OpenAI response into JSON format.

  • PydanticOutputParser: Parses the output of an LLM call into an instance of the specified Pydantic model.

  • PydanticToolsParser: Parses tool calls from an OpenAI response into Pydantic objects.

  • BaseOutputParser: Allows you to create subclasses with a custom parsing approach (see the sketch after this list).

  • BaseLLMOutputParser: Abstract base class.

For more details, check the official Output parsers reference.
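As a sketch of the BaseOutputParser approach mentioned above, a custom parser only needs to subclass it and implement parse (the comma-splitting logic is purely illustrative):

from langchain_core.output_parsers import BaseOutputParser

class CommaSeparatedListOutputParser(BaseOutputParser[list[str]]):
    """Splits the model's text output on commas."""

    def parse(self, text: str) -> list[str]:
        return [item.strip() for item in text.split(",")]

list_parser = CommaSeparatedListOutputParser()
list_parser.invoke("red, green, blue")  # ['red', 'green', 'blue']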


Consider the main features of output parsers using PydanticOutputParser as an example.

Imagine that you need to extract some information about the laptop the client wants to buy. The request may look like this:

request = "I want to buy the hp-9000, with 8GB of RAM, intel-i8 processor."

The following cell defines the model's schema. A child of pydantic.BaseModel defines the attributes that you want to extract from the input. The PydanticOutputParser instance is initialized to process this format.

from pydantic import BaseModel, Field
from langchain_core.output_parsers import PydanticOutputParser

class MyModel(BaseModel):
    model: str = Field(description="The model of the device.")
    ram: int = Field(description="RAM of the device in GB.")
    processor: str = Field(description="Model of the processor.")

parser = PydanticOutputParser(pydantic_object=MyModel)

The get_format_instructions method returns the instructions that the parser provides to the model. The following cell shows what this description looks like.

print(parser.get_format_instructions())
The output should be formatted as a JSON instance that conforms to the JSON schema below.

As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.

Here is the output schema:
```
{"properties": {"model": {"description": "The model of the device.", "title": "Model", "type": "string"}, "ram": {"description": "RAM of the device in GB.", "title": "Ram", "type": "integer"}, "processor": {"description": "Model of the processor.", "title": "Processor", "type": "string"}}, "required": ["model", "ram", "processor"]}
```
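A common pattern (shown here as a sketch) is to embed these instructions into a prompt template via partial_variables, so that every rendered prompt carries the schema description automatically:

from langchain_core.prompts import PromptTemplate

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

# The format instructions are filled in automatically; only query remains.
print(prompt.format(query=request))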

The next cell prepares the message sequence, provides it to the model, and displays the response.

from langchain_ollama import ChatOllama
from langchain_core.messages import HumanMessage, SystemMessage

system_message = (
    "Your goal is to extract data according to the following pattern.\n\n" +
    parser.get_format_instructions()
)

model = ChatOllama(model="qwen3:8b", temperature=0)

answer = model.invoke([
    SystemMessage(system_message),
    HumanMessage(request)
])
print(answer.content)
<think>
Okay, let's see. The user wants to buy an HP-9000 with 8GB RAM and an Intel-i8 processor. I need to extract the model, RAM, and processor from their query.

First, the model is mentioned as "hp-9000". I should check if that's the exact model name. The user wrote it in lowercase, but the schema might expect a specific format. Maybe it's better to keep it as is unless there's a standard naming convention. But since the example in the schema isn't provided, I'll stick with the given value.

Next, the RAM is specified as 8GB. The schema requires RAM as an integer in GB. So 8GB would be 8. That's straightforward. No need for units here, just the number.

Then the processor is "intel-i8". The user wrote it in lowercase, but maybe the actual model is "Intel i8" or "Intel-i8". The schema's example might have it as a string, so I'll use "intel-i8" as given. Wait, the user wrote "intel-i8" with a hyphen. Should I capitalize the 'I'? The schema doesn't specify, so I'll follow the user's input exactly.

Now, checking the required fields: model, ram, processor. All three are present. The JSON should have these keys. Let me structure it:

{
  "model": "hp-9000",
  "ram": 8,
  "processor": "intel-i8"
}

I need to make sure there are no typos. The RAM is an integer, so 8 without quotes. The other fields are strings. Looks good. No extra information needed. The user didn't mention anything else, so this should be the correct extraction.
</think>

```json
{
  "model": "hp-9000",
  "ram": 8,
  "processor": "intel-i8"
}
```

At the end of the answer, the parsed information appears in JSON format. Using the parser's invoke method, you can retrieve an instance of the Pydantic model that you defined as the format.

parser.invoke(answer)
MyModel(model='hp-9000', ram=8, processor='intel-i8')
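The parser can also be composed with the model through the | operator, so parsing happens as part of the chain. A minimal sketch, reusing the objects defined above:

# Invoking the chain returns the MyModel instance directly.
chain = model | parser
chain.invoke([
    SystemMessage(system_message),
    HumanMessage(request),
])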