Render screenshots and HTML using ChatGPT

With the announcement of ChatGPT plugins, it's now possible to use ChatGPT to interact with many external services and APIs.

Urlbox has created its own ChatGPT plugin, which means you can ask ChatGPT to render any HTML or to 'see' a web page. It can also generate PDFs from HTML, or even scrolling videos of webpages.

Install the plugin

At the time of writing, the plugin model is in alpha, and only certain OpenAI accounts have access to install and test plugins.

To install the plugin, first ensure that you are using the plugin-specific model of GPT:

Then, visit the plugin store from the ChatGPT UI and click "install unverified plugin":

Then, enter urlbox.com to install the plugin:

Click through the disclaimer and the next dialog will ask for your HTTP access token. This is where you enter your Urlbox secret API key from any of your Urlbox projects.

Great! Now you have the plugin installed and you can start using it.

Using the plugin to generate a screenshot

Let's take it for a quick spin.

I simply asked ChatGPT "what does apples homepage look like today?" and it understood what I meant, using the Urlbox plugin to render a screenshot of apple.com:

Here are the options ChatGPT sent to the Urlbox API:

You can see it passed in the correct URL and chose quite sensible options for the width and height.
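The request is just a small JSON object of render options. As a rough sketch, a payload of this shape could be assembled as follows. The helper name `build_render_options` is hypothetical and the values are illustrative, not the exact ones ChatGPT chose; `url`, `width` and `height` are the options discussed above.

```python
import json


def build_render_options(url: str, width: int, height: int) -> dict:
    """Assemble a dict of render options (hypothetical helper, illustrative only)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return {"url": url, "width": width, "height": height}


# Illustrative values only; not the actual options from the session above.
options = build_render_options("apple.com", width=1280, height=1024)
payload = json.dumps(options, indent=2)
print(payload)
```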

Using the plugin to generate a PDF

You can also ask ChatGPT to render a PDF. Building on the previous context, I just prompted ChatGPT with "how about in pdf format", and here is the result:

Warning - Alpha software

With the announcement of ChatGPT plugins less than a week old at the time of writing, there are of course still some rough edges, which I'm sure will be smoothed out over time.

One example I encountered was that the model would sometimes add comments to the request JSON payload, causing a syntax error:

You can just prompt ChatGPT to never use comments in the request payload, and it will work more smoothly.
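To see why a comment breaks the request: strict JSON has no comment syntax, so a standard parser rejects the whole payload. The sketch below assumes `//`-style comments and uses an illustrative payload, not the one from my session. The `strip_line_comments` helper is hypothetical and only shows how such a payload could be cleaned up if you were handling it yourself.

```python
import json

# Illustrative payload: JSON with a // comment in it (an assumed comment style).
bad_payload = '''{
  "url": "apple.com", // the page to render
  "width": 1280
}'''


def strip_line_comments(text: str) -> str:
    """Remove // comments that sit outside of JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            while i < len(text) and text[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


try:
    json.loads(bad_payload)
except json.JSONDecodeError as err:
    print("rejected:", err.msg)

cleaned = json.loads(strip_line_comments(bad_payload))
print(cleaned)
```

Tracking whether the scanner is inside a string matters here, because URLs such as `https://...` contain `//` and must be left alone.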

See the latest news

I asked to see the latest news, and it chose to render a screenshot of the CNN homepage:

Unfortunately there is a large ad across the top of the screenshot, so I simply asked ChatGPT "without the ads":

and it correctly chose to set the block_ads option in its request to Urlbox's API.
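Conceptually, a follow-up prompt like this amounts to merging one change into the previous set of options. Here is a minimal sketch of that idea. The function `apply_follow_up` is hypothetical and the starting values are illustrative; only `block_ads` is taken from the request described above.

```python
def apply_follow_up(previous: dict, changes: dict) -> dict:
    """Return a new options dict: the previous options plus the requested changes."""
    return {**previous, **changes}


# Illustrative starting options; the real values from the session are not shown here.
first_request = {"url": "cnn.com", "width": 1280, "height": 1024}

without_ads = apply_follow_up(first_request, {"block_ads": True})
print(without_ads)
```

Because the merge returns a new dict, the earlier request is left untouched and any other single option can be overridden the same way.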

Modifying the request

The ad is blocked, but there is still a huge empty space at the top of the page, pushing most of the actual content out of view.

I simply asked for a taller screenshot and, et voilà, ChatGPT correctly modified the height option, keeping all other options the same as before:

Rendering multiple screenshots

You can get ChatGPT to render multiple screenshots without even being that explicit. Here I asked "can i see the webpages of the most popular saas applications", and it asked for confirmation:

and it called the Urlbox API eight times to render screenshots of SaaS applications such as Zoom, Slack and Salesforce:

Rendering HTML with the Urlbox ChatGPT plugin

One of the most powerful features of the Urlbox ChatGPT plugin is that you can ask ChatGPT to render any HTML you can think of. Well, any HTML you can describe to ChatGPT adequately enough :)

Having asked for multiple screenshots of the most popular SaaS applications, I then asked ChatGPT to render those screenshots as a grid in HTML.

This was my prompt:

thanks, can you put them into a html gallery in a grid and then render the gallery

and ChatGPT generated some HTML, passed it into the html option of the Urlbox API and returned the output:

It initially returned the screenshots stacked on top of each other, which was not a bad attempt considering the prompt was not very explicit.

I had imagined a classic gallery kind of grid structure, so I attempted to get ChatGPT to improve the HTML by asking:

display the images smaller and make it a 3x3 grid

and it did just that:
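For a sense of what is involved, here is a rough sketch of building a gallery page of this kind and placing it in the `html` option. This is not the HTML ChatGPT generated. The function `build_gallery_html`, the placeholder image URLs and the width and height values are all assumptions for illustration.

```python
from html import escape


def build_gallery_html(image_urls: list[str], columns: int = 3) -> str:
    """Build a minimal HTML page showing images in a CSS grid (illustrative sketch)."""
    cells = "\n".join(
        f'    <img src="{escape(url, quote=True)}" style="width: 100%;">'
        for url in image_urls
    )
    return (
        "<html>\n<body>\n"
        f'  <div style="display: grid; grid-template-columns: repeat({columns}, 1fr); gap: 8px;">\n'
        f"{cells}\n"
        "  </div>\n</body>\n</html>"
    )


# Placeholder URLs; in the session these were the rendered screenshot URLs.
urls = [f"https://example.com/screenshot-{n}.png" for n in range(1, 9)]
gallery = build_gallery_html(urls)

# The resulting string is what would go into the html option (other values illustrative).
options = {"html": gallery, "width": 1280, "height": 1024}
print(gallery)
```

With eight images and three columns, the last row of the grid holds two images and one empty cell.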

Using Tailwind CSS

Whilst that result was pretty cool, I decided to go a little further and see how far I could push ChatGPT to generate and render HTML using Urlbox's API via the plugin.

Here is my prompt:

amazing. now lets add a caption to each screenshot below it, and add some padding and margin between the screenshots. let's also add a drop shadow to each image. you can use tailwind for the css.

and the initial response resulted in a syntax error:

I figured that ChatGPT was taking too long to generate the HTML string, and that the syntax error came from it failing to finish generating the HTML in time, leaving an unclosed quotation mark:

By asking ChatGPT to show me the HTML it was generating, I was able to see that it was indeed generating invalid/unfinished HTML.

To keep the generated HTML a little shorter, I asked it to render just 6 of the screenshots (rather than the 8 it originally created):
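The failure mode is easy to reproduce: if the HTML string inside the JSON payload is cut off before its closing quotation mark, the payload as a whole no longer parses. A minimal sketch follows, using an illustrative payload rather than the one from my session. The helper `is_complete_json` is hypothetical.

```python
import json


def is_complete_json(text: str) -> bool:
    """Return True if text parses as JSON, False if it is malformed or cut off."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Illustrative payload whose "html" string value was cut off before the closing quote.
truncated = '{"html": "<div class=\\"grid\\"><img src=\\"a.png\\">'
complete = truncated + '</div>"}'

print(is_complete_json(truncated))
print(is_complete_json(complete))
```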

That seemed to do the trick, and the resulting output was:

Improving the styling

Going a little bit further, I asked it to increase the padding, give the images softer rounded edges and put a subtle grey gradient on the background.

Here was the result:

As you can see, the result was pretty good, but we can improve the output by asking ChatGPT to render at a higher resolution and to render only the body element.

ChatGPT correctly calls the Urlbox API with the selector option set to body and the retina option set to true:

and the resulting output:

Adding a title and subtitle

Finally, after adding a title and subtitle, we have a pretty cool screenshot gallery that could be used as the open graph image for this very blog post!

and the result:

View the final generated image in full resolution here

The possibilities are endless

I'm sure you can come up with some pretty cool use cases for this plugin.

Here's another gallery I generated after asking for screenshots of bootstrapped SaaS products:

ChatGPT combined with plugins is going to be a pretty powerful combination when plugins become generally available.

Using the Urlbox plugin, you can now extend ChatGPT to quickly generate screenshots of webpages, and to iterate on and render HTML, all using Urlbox's API.

And with the recent release of pix2struct, you'll soon be able to query and ask questions about a screenshot too...