ChatGPT’s knowledge is limited to its training data, which has a cutoff year of 2021. This means that it cannot answer questions about events that occurred after the cutoff. However, we can work around this by integrating Wikipedia with ChatGPT. Let’s go straight to an example. Our goal is to extract information about Juancho Hernangomez, the new star of Panathinaikos BC.
The wikipedia Python Package
We will need to install the wikipedia Python package by running:
pip install wikipedia
The wikipedia package is used under the hood by LangChain's WikipediaLoader, which has the following arguments:
- query: your query to Wikipedia.
- lang (optional): the language, where the default is “en”.
- load_max_docs (optional): default=100. The maximum number of documents to download.
- load_all_available_meta (optional): default=False. By default, only the most important fields are downloaded, such as the published date of the Wikipedia document, the title, and a summary.
Integrate Wikipedia with ChatGPT
Let’s start by loading the required libraries.
from langchain.document_loaders import WikipediaLoader
from langchain.chat_models import ChatOpenAI
from langchain.llms import OpenAI
from langchain.prompts.chat import (
    PromptTemplate,
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
    AIMessagePromptTemplate
)
Our query to Wikipedia will be “Juancho Hernangomez”. Let’s see how we can get the related Wikipedia articles as plain text. We will set a limit of 2 loaded documents.
# The maximum number of documents to load
n = 2

# The loader
loader = WikipediaLoader(query='Juancho Hernangomez', load_max_docs=n)

# Load the documents once; every call to load() queries Wikipedia again
docs = loader.load()

# Concatenate the text into a variable called context_text
context_text = ''
for doc in docs:
    context_text = context_text + ' ' + doc.page_content
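Wikipedia articles can be long, and the concatenated text may end up larger than what the model accepts in a single prompt. Below is a minimal sketch of one way to cap the context size. The Page class, the build_context helper, and the 8000-character default are all hypothetical and used only for illustration; Page stands in for the loaded documents, of which only the page_content attribute is used.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Page:
    """Hypothetical stand-in for a loaded document; only page_content is used."""
    page_content: str


def build_context(pages: Iterable[Page], max_chars: int = 8000) -> str:
    """Join page texts with spaces, truncating so the result is at most max_chars long."""
    parts = []
    remaining = max_chars
    for page in pages:
        if remaining <= 0:
            break
        text = page.page_content[:remaining]
        parts.append(text)
        # The extra 1 accounts for the space that joins this part to the next one
        remaining -= len(text) + 1
    return " ".join(parts)


pages = [Page("first article text"), Page("second article text")]
context_text = build_context(pages, max_chars=25)
print(context_text)
```

A character budget is only a rough proxy for the model's token limit, so in practice the cap would need to be chosen conservatively.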
Now, we will use LangChain's chat prompt templates to pass the question and the document to ChatGPT. The document is the Wikipedia text we loaded above and the question is:
‘Which is the current team of Juancho Hernangomez’
# ChatOpenAI reads the API key from the OPENAI_API_KEY environment variable
model = ChatOpenAI()

template = "Answer this question:\n{question}\n Here is some extra context: \n{document}"
human_prompt = HumanMessagePromptTemplate.from_template(template)
chat_prompt = ChatPromptTemplate.from_messages([human_prompt])

question = 'Which is the current team of Juancho Hernangomez'
document = context_text

# Fill in the template and send the resulting messages to the model
result = model(chat_prompt.format_prompt(question=question, document=document).to_messages())
result.content
The current team of Juancho Hernangomez is Panathinaikos of the Greek Basket League and the EuroLeague
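To see what text the model actually receives, it can help to render the template by hand. The sketch below approximates the placeholder substitution with plain Python string formatting, without calling LangChain or the OpenAI API. The render_prompt helper and the placeholder document text are hypothetical and used only for illustration.

```python
# Same template string as in the example above
TEMPLATE = "Answer this question:\n{question}\n Here is some extra context: \n{document}"


def render_prompt(question: str, document: str) -> str:
    """Fill the two placeholders of the template with plain str.format."""
    return TEMPLATE.format(question=question, document=document)


prompt_text = render_prompt(
    question="Which is the current team of Juancho Hernangomez",
    document="(Wikipedia text goes here)",
)
print(prompt_text)
```

Printing the rendered prompt is a quick way to check that the Wikipedia text was loaded and placed where expected before spending an API call.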
Closing Remarks
We showed you how to unlock the power of ChatGPT by integrating it with Wikipedia. Stay tuned for more examples.