Generating Website Screenshots With Ruby

Bilal Budhani

Providing a great user experience is probably one of the best ways to build a loyal customer base. That's why, whenever I have to deal with external websites, I like to provide previews of those URLs inside the interface.

There are, of course, numerous other reasons why an application might need to generate screenshots.

In this blog post, I will explore different approaches to generating website screenshots with Ruby.

Let's create a folder first:

mkdir screenshot-ruby
cd screenshot-ruby

Selenium Webdriver

Selenium provides official Ruby bindings that we can use to navigate to websites and generate screenshots.

First, we need to download and set up an appropriate web driver to work with Selenium. Follow the driver installation instructions for your operating system and browser version.

Once done, we will install the gem to get started:

gem install selenium-webdriver

Now let's write the script to generate a screenshot and save it as selenium.rb:

require "selenium-webdriver"
 
driver = Selenium::WebDriver.for :chrome
driver.navigate.to "https://google.com"
 
driver.save_screenshot("selenium.png")
driver.quit

We will run this with:

ruby selenium.rb

We should now see that the script has generated a screenshot named selenium.png.

Using Selenium, we can control every aspect of the browser and mimic user behaviour. Let's take a screenshot of a specific element:

require "selenium-webdriver"
 
driver = Selenium::WebDriver.for :chrome
driver.navigate.to "https://urlbox.io"
 
element = driver.find_element(tag_name: "main")
 
element.save_screenshot("selenium2.png")
driver.quit

This will capture only the contents of the main element.

Although Selenium provides a low-level API on top of browsers, it comes with a learning curve and depends heavily on browsers and their drivers. Moreover, running full browsers on a server can be very expensive in terms of server resources.
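On a server there is usually no display, so the browser has to run headless. The following is a minimal sketch of that setup, assuming a recent Chrome that accepts the `--headless=new` switch; the helper names `headless_chrome_args` and `capture_headless`, the viewport size, and the output file name are illustrative choices, not part of Selenium.

```ruby
# Chrome command-line switches for a headless run with a fixed viewport.
# Assumption: the installed Chrome understands "--headless=new".
def headless_chrome_args(width: 1280, height: 800)
  ["--headless=new", "--window-size=#{width},#{height}"]
end

# Capture `url` into `path`, always quitting the browser afterwards.
def capture_headless(url, path)
  # Loaded lazily so the helper above stays dependency-free.
  require "selenium-webdriver"

  options = Selenium::WebDriver::Chrome::Options.new
  headless_chrome_args.each { |arg| options.add_argument(arg) }

  driver = Selenium::WebDriver.for(:chrome, options: options)
  begin
    driver.navigate.to(url)
    driver.save_screenshot(path)
  ensure
    # Quit even when navigation fails, so no Chrome process is left behind.
    driver.quit
  end
end

# Example call (needs Chrome and a matching driver installed):
# capture_headless("https://urlbox.io", "selenium-headless.png")
```

Wrapping the capture in `begin`/`ensure` matters more on a server than on a laptop, because every browser process that is not closed keeps holding memory.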

Grover

Grover is a wrapper on top of Puppeteer that automates the browser to convert web pages to images. Let's install the required dependencies:

gem install grover
npm init -y
npm install puppeteer
touch grover.rb

Open grover.rb in your favorite code editor:

require "grover"
 
grover = Grover.new('https://urlbox.io')
 
# Pick your preferred format
png = grover.to_png
jpeg = grover.to_jpeg
 
# Image data is binary, so write it in binary mode
File.binwrite("grover.png", png)
File.binwrite("grover.jpeg", jpeg)

Now when we run the script with `ruby grover.rb`, it should generate a screenshot of the specified URL.

Grover Screenshot

When we open the image file, we can see it has only captured the hero part of the website. Let's change that to capture the entire page:

 
# ...

grover = Grover.new('https://urlbox.io', full_page: true)

# ...
 

When we run this script again, we should see an image of the entire page. Grover can be configured further based on our requirements.

While this approach works, it has its own limitations and challenges.

First and foremost, it is a resource-hungry approach. As soon as we start generating more screenshots, it will occupy a larger chunk of our server memory, leaving less room for other processes, and it may become a bottleneck as we scale.

Secondly, this approach introduces multiple dependencies into our application, namely Node.js and Chromium, which will require separate effort to maintain in the future.
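One common way to soften the memory cost is to avoid rendering the same URL twice. The sketch below caches each screenshot on disk under a digest of its URL; the helper name `cached_screenshot` and the `screenshots` directory are hypothetical, and the renderer is passed in as a block so the caching logic itself does not depend on Grover.

```ruby
require "digest"
require "fileutils"

# Return the path of a cached screenshot for `url`, calling the given block
# to render it only when no cached file exists yet.
def cached_screenshot(url, cache_dir: "screenshots")
  FileUtils.mkdir_p(cache_dir)
  path = File.join(cache_dir, "#{Digest::SHA256.hexdigest(url)}.png")

  # The block must return the image bytes as a String.
  File.binwrite(path, yield(url)) unless File.exist?(path)
  path
end

# Usage with Grover (requires the grover gem and puppeteer):
# path = cached_screenshot("https://urlbox.io") { |u| Grover.new(u).to_png }
```

This sketch never expires entries, so a real application would also need a rule for refreshing stale screenshots.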

Playwright

Playwright is a cross-browser automation engine developed by Microsoft. There is a Playwright Ruby gem available that can be used to generate website screenshots.

Again, we will install the dependencies before we start writing code:

gem install playwright-ruby-client
npm install playwright

Let's get down to writing code:

require 'playwright'
 
Playwright.create(playwright_cli_executable_path: './node_modules/.bin/playwright') do |playwright|
  playwright.chromium.launch(headless: true) do |browser|
    page = browser.new_page
    page.goto('https://playwright.dev')
    page.screenshot(path: './playwright.png')
  end
end

We will save this as playwright.rb and run it with ruby playwright.rb. Now we should see it generate a screenshot of the URL.

Playwright Screenshot

Playwright can be useful if there is a requirement to generate screenshots from different browsers. However, it creates the same dependency maintenance challenges as Puppeteer.
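To illustrate the cross-browser case, here is a rough sketch that repeats the earlier capture in Chromium, Firefox and WebKit. It assumes all three browsers have already been installed for Playwright; the helper names `screenshot_path` and `capture_in_all_browsers` and the output file names are made up for this example.

```ruby
# The browser engines Playwright exposes.
BROWSERS = %i[chromium firefox webkit].freeze

# Build a per-browser output path, e.g. "./playwright-firefox.png".
def screenshot_path(browser_name, dir: ".")
  File.join(dir, "playwright-#{browser_name}.png")
end

# Take one screenshot of `url` in each browser engine.
def capture_in_all_browsers(url, cli_path: "./node_modules/.bin/playwright")
  # Loaded lazily so the path helper above stays dependency-free.
  require "playwright"

  Playwright.create(playwright_cli_executable_path: cli_path) do |playwright|
    BROWSERS.each do |name|
      # playwright.chromium, playwright.firefox and playwright.webkit
      # all return a browser type with the same launch API.
      playwright.public_send(name).launch(headless: true) do |browser|
        page = browser.new_page
        page.goto(url)
        page.screenshot(path: screenshot_path(name))
      end
    end
  end
end

# Example call (needs the gem, the npm package and the browsers installed):
# capture_in_all_browsers("https://playwright.dev")
```

Each engine is launched and closed in turn here, which keeps peak memory lower than running all three at once.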

Urlbox

Urlbox is a website screenshot API which drastically reduces the effort required to generate website screenshots. Let's take a look at how.

We will first grab the credentials to use with their API:

Urlbox Credentials

Now let's install their official website screenshot gem:

gem install urlbox

Then we will create a file called urlbox.rb and use the gem to generate a screenshot:

require  'urlbox/client'
 
urlbox_client = Urlbox::Client.new(api_key: '', api_secret: '')
 
screenshot_url = urlbox_client.generate_url({url: 'https://urlbox.io/'})
# This can be used for image url <%= image_tag screenshot_url %>
 
response = urlbox_client.get({
  url: "https://urlbox.io/",
  format: "jpeg"
})
 
File.write('urlbox.jpeg', response.body)
 

The above script will generate the following image:

Urlbox screenshot

Now we have generated screenshots of the website without introducing any JavaScript dependency, using an approach which can scale as we grow.
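The script above leaves `api_key` and `api_secret` as empty strings to be filled in. Rather than pasting real credentials into the file, one option is to read them from the environment. This is only a sketch: the variable names `URLBOX_API_KEY` and `URLBOX_API_SECRET` and the helper `urlbox_credentials` are hypothetical and not something the gem defines.

```ruby
# Read Urlbox credentials from an environment-like hash (ENV by default).
# URLBOX_API_KEY and URLBOX_API_SECRET are hypothetical variable names.
def urlbox_credentials(env = ENV)
  {
    # fetch raises KeyError when a variable is missing, so a
    # misconfigured deployment fails early instead of sending empty keys.
    api_key: env.fetch("URLBOX_API_KEY"),
    api_secret: env.fetch("URLBOX_API_SECRET")
  }
end

# Usage with the client from the script above (requires the urlbox gem):
# urlbox_client = Urlbox::Client.new(**urlbox_credentials)
```

Accepting the environment as an argument also makes the helper easy to exercise with a plain hash in tests.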

Moreover, Urlbox supports a wide range of options such as "retina", "dark_mode" and "block_ads", which can improve the end result drastically.

Let's update the above script to try out some options:

 
response = urlbox_client.get({
  url: "https://urlbox.io/",
  format: "jpeg",
  retina: true,
  full_page: true,
  wait_until: "domloaded"
})
 

This will generate a full-page screenshot.

What have we learned so far?

Generating website screenshots is a memory-intensive operation and comes with a bunch of dependencies to maintain. Options like Urlbox.io, on the other hand, are easier to get started with and come with many useful options that give us a high-quality final product.