People always complain of never having enough time for reading. And with so much information being shared every day, it’s getting increasingly harder to keep up with everything. Fortunately, LLMs such as ChatGPT come to the rescue.
These can be used to extract the essence of web pages, articles, and even PDFs, so we can ingest more information faster than ever.
In this article, we’ll cover the best ways to summarize articles and PDFs with ChatGPT, going over the pros and cons of each method in terms of accuracy and costs.
Here’s the fastest way to summarize almost any article with ChatGPT:
- Copy the text of the article (or the URL if you use ChatGPT Plus).
- Log in to chat.open.ai and create a new chat (select Model > Web Browsing for ChatGPT Plus customers).
- Paste the text (or URL for ChatGPT Plus members) and ask ChatGPT to “summarize it” by typing that right after the text.
Here’s how it looks:
You can always change the prompt to suit your needs. For example, you can ask ChatGPT for an “ELI5 summary”, which will oversimply the text, or even write a “shorter summary with a bullet list that starts with emojis”.
Regardless of how you want the summary to sound like, you have to keep in mind that ChatGPT has two main limitations:
- The summary might be incorrect - LLMs are notorious for the way they tend to hallucinate, meaning they can write plausible-sounding text that is actually completely wrong. You should keep this in mind and always read the article to double-check that the output is correct.
- There is a max token limit - The free version of ChatGPT has a limit of around 3000 words (4,096 tokens), while ChatGTP Plus’ limit is double. This includes everything you have typed in the chat before asking for the summary, as well as ChatGPT’s responses, so make sure you always create a new Chat before you ask for a summary. It’s also important to note that some models available in the OpenAI playground have a higher token limit (such as the gpt-3.5-turbo-16k, which has a 16,384 token limit).
Online articles are usually no longer than 2500 words, which means that ChatGPT can easily summarize them in less than 500 words. On the other hand, PDF files can easily surpass the maximum number of tokens, making the whole process a little more difficult.
Even so, here are two methods to summarize any PDF file, regardless of its size.
This method involves using a chunking strategy with ChatGPT. Here’s what you have to do:
- Manually divide the text from the PDF document into smaller, manageable chunks that fit within ChatGPT's maximum message limit.
- Input each chunk separately into the ChatGPT interface.
- Ask ChatGPT to summarize each chunk.
- Combine the summaries of each chunk to get a preliminary summary of the document.
- Input the preliminary summary into ChatGPT and ask it to summarize it again to get the final summary.
Even though this might seem tedious, it’s the best way to summarize a PDF using the free version of ChatGPT. You should also bear in mind that the output might not be completely accurate.
The approach involves using the AI PDF plugin (or any other PDF plugin) within the ChatGPT interface. This method is only available to ChatGPT Plus subscribers, but it’s one of the best ways to summarize PDFs using ChatGPT’s UI.
Here are the steps:
- Open ChatGPT, navigate to the settings by selecting your name in the bottom left, then choose “Settings & Beta”.
- Inside the settings menu, click on “Beta features” and toggle on the “Plugins”.
- Open a new chat, select GPT-4, and click on the “Plugins” button.
- Now open the Plugin store, search for Ai PDF, and install it.
- Provide the URL of the PDF document to ChatGPT and ask it to summarize it.
Plugins can handle the entire process, from extracting the text to generating the summary, making it a great option for any PDF length.
We have tried the plugin with documents of up to 300 pages, and it was always able to correctly summarize the contents and answer any questions about the PDF.
So far, we’ve explored various methods to summarize text using the ChatGPT UI. Now, let's shift gears and go over the best ways to summarize online articles with the GPT API.
You should only follow this method if you're comfortable with Python and want a high degree of customization. It involves using the programming language in conjunction with the GPT API, as detailed in this tutorial.
One thing to keep in mind is the cost associated with this method. GPT models charge per token, and costs can quickly add up depending on the size of the text you're summarizing. Not to mention that you must use the GPT-4 model if you want the highest possible accuracy, which is 20 times more expensive than GPT-3.5-Turbo.
This approach breaks down the content into smaller, more manageable pieces, making it easier for the model to process and summarize.
The real game-changer here is Url2Text, a tool that can extract the body text of a web page and convert it to markdown.
Why is this important?
Well, by converting to markdown and stripping unnecessary HTML tags, inline CSS, and JS scripts, you can significantly reduce token usage. This will lead to cost savings, especially when summarizing complex web pages.
You should also keep in mind that text chunking requires a bit of finesse. You need to carefully chunk the text to maintain context across each piece. If done correctly, this can lead to high-quality summaries. But remember, the accuracy of the summary largely depends on the quality of the chunking and conversion process.
This method is a hybrid approach that combines the power of the GPT API with the efficiency of extractive summarization.
The first step involves using an extractive summarization algorithm. This type of algorithm works by pinpointing the key sentences or phrases in the text - the ones that capture the essence of the content. It's like mining for gold, where the gold nuggets are the most essential pieces of information in your text.
Once you've identified these key elements, the next step is to feed these sentences into the GPT API. The GPT model then takes these inputs and generates a coherent summary. It's like taking the gold nuggets you've mined and melting them into a gold bar.
One of the key advantages of this method is its cost-effectiveness, mainly because you're feeding the GPT model with the essence of the text rather than the entire body.
But like all the previous methods, the quality of the output depends on the quality of the input. The accuracy of the summary can depend on the quality of the extractive summarization and the GPT model's ability to generate a coherent summary from the provided sentences.
Last but not least, you can use the GPT API with a post-processing step. This method is a two-step process that first generates preliminary summaries and then combines and refines them into a polished version.
The first step involves using the GPT API to generate a preliminary summary. This is where the GPT model does its magic, taking your text and condensing it into a shorter version that captures the main points.
Once you have your preliminary summary, it's time for the second step: post-processing. This is where you refine the summary, removing redundant information, ensuring coherence, and checking for grammatical correctness.
One of the key benefits of this method is the potential for high accuracy. By refining the summary, you can improve its quality and ensure it accurately represents the original text, but this process will surely increase the overall token usage and cost.
The best way to summarize an article with ChatGPT largely depends on the type and size of the document, your budget, and how accurate you want the summary to be.
If you're dealing with smaller articles and prefer a straightforward approach, copy-pasting text into the ChatGPT UI might be best. You should use the AI PDF plugin or the GPT API in conjunction with other tools for larger documents.
Yes, GPT-4 can summarize a webpage. It can process the text content of the webpage and generate a concise summary, although the effectiveness can depend on the complexity and length of the original content.
Of course, you can use ChatGPT to summarize articles. Simply input the text of the article, or URL, into the ChatGPT interface and the model will generate a concise summary of the main points.
To get ChatGPT to summarize a PDF, you can manually copy and paste the text into the ChatGPT interface. For larger PDFs, you can use the text chunking strategy or a plugin.