Predictive Hacks

How to Connect Wikipedia with ChatGPT and LangChain

ChatGPT’s knowledge is limited to its training data, which has a cutoff of 2021. This means it cannot answer questions about events that occurred after the cutoff. However, we can work around this limit by passing Wikipedia content to ChatGPT as context. Let’s go straight to an example: our goal is to extract information about Juancho Hernangomez, the new star of Panathinaikos BC.

The wikipedia Python Package

We need to install the wikipedia Python package by running:

pip install wikipedia

LangChain’s WikipediaLoader relies on the wikipedia package to fetch articles. The loader has the following arguments:

  • query: your query to Wikipedia.
  • optional lang: the language of the Wikipedia edition to search, where the default is “en”.
  • optional load_max_docs: default=100. The maximum number of documents to download.
  • optional load_all_available_meta: default=False. By default, only the most important fields are downloaded, such as the published date of the Wikipedia document, the title, and a summary.

Integrate Wikipedia with ChatGPT

Let’s start by loading the required libraries.

from langchain.document_loaders import WikipediaLoader

from langchain.chat_models import ChatOpenAI
from langchain.llms import OpenAI
from langchain.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    AIMessagePromptTemplate,
)

Our query to Wikipedia will be “Juancho Hernangomez”. Let’s see how we can get the related Wikipedia articles as plain text. We will set a limit of 2 loaded documents.

# The number of max documents
n = 2

# The loader
loader = WikipediaLoader(query='Juancho Hernangomez', load_max_docs=n)

# Load the documents once (each call to load() queries Wikipedia again)
docs = loader.load()

# Concatenate the text into a variable called context_text
context_text = ' '.join(d.page_content for d in docs)
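
Concatenating several articles can produce more text than the model accepts in a single prompt. Below is a minimal sketch of one way to cap the context length. The helper name build_context, the max_chars parameter, and the stand-in documents are assumptions made for illustration and are not part of LangChain; the helper only expects objects with a page_content attribute, like the ones returned by loader.load().

```python
from types import SimpleNamespace


def build_context(docs, max_chars=6000):
    """Join the page_content of each document with spaces,
    cutting the result off at max_chars characters in total."""
    parts = []
    used = 0
    for doc in docs:
        remaining = max_chars - used
        if remaining <= 0:
            break
        text = doc.page_content[:remaining]
        parts.append(text)
        used += len(text) + 1  # +1 accounts for the joining space
    return ' '.join(parts)


# Stand-in documents (hypothetical); in the tutorial they come from loader.load()
docs = [
    SimpleNamespace(page_content='a' * 10),
    SimpleNamespace(page_content='b' * 10),
]

context_text = build_context(docs, max_chars=15)
print(len(context_text))
```

A character cap is only a rough proxy for the model's token limit, so the right value for max_chars depends on the model you use.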

Now, we will use LangChain’s chat prompt templates to pass the question and the document to ChatGPT. The document is the concatenated Wikipedia text and the question is:

‘Which is the current team of Juancho Hernangomez’

model = ChatOpenAI()

template = "Answer this question:\n{question}\n Here is some extra context: \n{document}"
human_prompt = HumanMessagePromptTemplate.from_template(template)

chat_prompt = ChatPromptTemplate.from_messages([human_prompt])

question = 'Which is the current team of Juancho Hernangomez'

document = context_text

# Fill in the template, send the messages to the model and print the answer
result = model(chat_prompt.format_prompt(question=question, document=document).to_messages())
print(result.content)


The current team of Juancho Hernangomez is Panathinaikos of the Greek Basket League and the EuroLeague
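
To see what the model actually receives, it helps to render the template by hand. The sketch below approximates the substitution with Python's built-in str.format instead of calling LangChain, and the document value is a placeholder rather than real Wikipedia text.

```python
# Same template and question as in the tutorial
template = "Answer this question:\n{question}\n Here is some extra context: \n{document}"
question = 'Which is the current team of Juancho Hernangomez'

# Placeholder (assumption); in the tutorial this is context_text
document = '(Wikipedia article text goes here)'

# Substitute both variables into the template
rendered = template.format(question=question, document=document)
print(rendered)
```

The question comes first and the Wikipedia text follows it as extra context, so the model is asked to answer from the supplied text, not only from its training data.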

Closing Remarks

We showed you how to unlock the power of ChatGPT by integrating it with Wikipedia. Stay tuned for more examples.

