Overview

Contextual AI’s reranker is the first reranker with instruction-following capabilities for handling conflicts in retrieval, and it is the most accurate reranker in the world per industry-leading benchmarks like BEIR. To learn more about the reranker and its importance in RAG pipelines, please see our blog. This how-to guide uses the same example from the blog to demonstrate how to use the reranker via the Contextual API directly, our Python SDK, and our LangChain package. The current reranker models include:
  • ctxl-rerank-v2-instruct-multilingual
  • ctxl-rerank-v2-instruct-multilingual-mini
  • ctxl-rerank-v1-instruct

Global Variables & Examples

First, we will set up the global variables and example data that we'll use with each implementation method.
from google.colab import userdata

# Read the API key from a Colab secret named API_TOKEN
api_key = userdata.get("API_TOKEN")
base_url = "https://api.contextual.ai/v1"
rerank_api_endpoint = f"{base_url}/rerank"
query = "What is the current enterprise pricing for the RTX 5090 GPU for bulk orders?"

instruction = "Prioritize internal sales documents over market analysis reports. More recent documents should be weighted higher. Enterprise portal content supersedes distributor communications."

documents = [
    "Following detailed cost analysis and market research, we have implemented the following changes: AI training clusters will see a 15% uplift in raw compute performance, enterprise support packages are being restructured, and bulk procurement programs (100+ units) for the RTX 5090 Enterprise series will operate on a $2,899 baseline.",
    "Enterprise pricing for the RTX 5090 GPU bulk orders (100+ units) is currently set at $3,100-$3,300 per unit. This pricing for RTX 5090 enterprise bulk orders has been confirmed across all major distribution channels.",
    "RTX 5090 Enterprise GPU requires 450W TDP and 20% cooling overhead."
]

metadata = [
    "Date: January 15, 2025. Source: NVIDIA Enterprise Sales Portal. Classification: Internal Use Only",
    "TechAnalytics Research Group. 11/30/2023.",
    "January 25, 2025; NVIDIA Enterprise Sales Portal; Internal Use Only"
]

model = "ctxl-rerank-v2-instruct-multilingual"

REST API implementation

import requests

headers = {
    "accept": "application/json",
    "content-type": "application/json",
    "authorization": f"Bearer {api_key}"
}
payload = {
    "query": query,
    "instruction": instruction,
    "documents": documents,
    "metadata": metadata,
    "model": model
}

rerank_response = requests.post(rerank_api_endpoint, json=payload, headers=headers)

print(rerank_response.json())
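
To work with the reranked order, you can map each result back to its original document. The sketch below assumes the response body has the shape {"results": [{"index": ..., "relevance_score": ...}, ...]}; check the rerank API reference for the exact schema.

# Map the reranked results back to the original documents
# (assumes each result carries an "index" into `documents` and a "relevance_score")
results = rerank_response.json().get("results", [])
for result in results:
    print(f'{result["relevance_score"]:.4f}  {documents[result["index"]][:80]}...')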

Python SDK

try:
  from contextual import ContextualAI
except ImportError:
  %pip install contextual-client
  from contextual import ContextualAI

client = ContextualAI(api_key=api_key, base_url=base_url)
rerank_response = client.rerank.create(
    query=query,
    instruction=instruction,
    documents=documents,
    metadata=metadata,
    model=model
)

print(rerank_response.to_dict())
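
The SDK returns a typed response object. Here is a minimal sketch for inspecting it, assuming the response exposes a results list whose items carry index and relevance_score attributes, mirroring the REST payload above:

# Print the documents in reranked order
# (assumes each result has `index` and `relevance_score` attributes)
for result in rerank_response.results:
    print(f"{result.relevance_score:.4f}  {documents[result.index][:80]}...")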

LangChain

try:
  from langchain_contextual import ContextualRerank
except ImportError:
  %pip install langchain-contextual
  from langchain_contextual import ContextualRerank

from langchain_core.documents import Document
# initialize the Contextual reranker via langchain_contextual
compressor = ContextualRerank(
    model=model,
    api_key=api_key,
)

# Prepare metadata in dictionary format for Langchain Document class
metadata_dict = [
    {
        "Date": "January 15, 2025",
        "Source": "NVIDIA Enterprise Sales Portal",
        "Classification": "Internal Use Only"
    },
    {
        "Date": "11/30/2023",
        "Source": "TechAnalytics Research Group"
    },
    {
        "Date": "January 25, 2025",
        "Source": "NVIDIA Enterprise Sales Portal",
        "Classification": "Internal Use Only"
    }
]


# prepare documents as langchain Document objects
# metadata stored in document objects will be extracted and used for reranking
langchain_documents = [
    Document(page_content=content, metadata=metadata_dict[i])
    for i, content in enumerate(documents)
]

# print to validate langchain document
print(langchain_documents[0])
# use compressor.compress_documents to rerank the documents
reranked_documents = compressor.compress_documents(
    query=query,
    instruction=instruction,
    documents=langchain_documents,
)
print(reranked_documents)
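
compress_documents returns the Document objects in reranked order. The short sketch below inspects them; note that the relevance_score metadata key is an assumption (many LangChain reranker integrations attach the score there), so it falls back gracefully if the key is absent:

# Inspect the reranked documents; `relevance_score` in metadata is assumed, not guaranteed
for doc in reranked_documents:
    score = doc.metadata.get("relevance_score", "n/a")
    print(f"{score}  {doc.metadata.get('Source')}  {doc.page_content[:60]}...")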

Additional Resources