Hire the author: Saurav A

Introduction:

In this post, we will see how to develop a simple Chrome extension that downloads all the images from a Google page. The extension is designed to download every image that has loaded on a Google Images results page, up to the point the user has scrolled to. This is particularly helpful when collecting data for deep learning models. The entire code is available on GitHub.

My goal behind this project was to start learning JavaScript, and what better way than to build a Chrome extension, which runs entirely on JavaScript.

I estimated no more than 10 hours of effort. But since I am new to JavaScript, it took me around 15 hours to complete the work.

Glossary:

  1. Manifest file: A file that gives the browser all the information about the extension.
  2. Background file: A file that runs in the background of the browser. It can be an event page, which is loaded only when it is needed (for example, when an event it listens for fires), or a persistent background page, which stays loaded for as long as the browser runs.
  3. PageAction: A Chrome extension can be a page action or a browser action extension. A browser action is available on every page. A page action is available only on specific pages, which are defined in the background file.

JavaScript Methods:
1. addEventListener()
2. chrome.runtime.sendMessage()
3. chrome.runtime.onMessage
4. button.onclick

Steps to Develop the extension

1. Create Manifest file

To develop the extension, we first need a manifest file. Create a file called manifest.json containing details about the extension. When the extension is loaded into Chrome, this file tells the browser about it. The following is a simple example of manifest.json:

{
    "name": "Google Image Downloader",
    "version": "1.0",
    "manifest_version": 2,

    "description": "This extension helps you to download all the images from a google page in a click."
}

Since we need our extension to be active only on Google pages, we want it to be a page action extension. The alternative, a browser action extension, is available on every page, which is not what we want. So we declare a page action in manifest.json, which now looks like this:

{
    "name": "Google Image Downloader",
    "version": "1.0",
    "manifest_version": 2,

    "description": "This extension helps you to download all the images from a google page in a click.",
    "page_action": {
        "default_title": "Download Images from Google",
        "default_popup": "popup.html"
    }
}

2. Create extension page

Next, we build the extension popup. We would like two buttons in it, one for collecting all the images and one for downloading them. Therefore, we need a popup.html that defines what the popup displays:

<!DOCTYPE html>
<html>
<head>
	<title>Download images from google</title>
</head>
<body>
	<script src="jquery-3.4.1.min.js"></script>
	<script src="popup.js"></script>
	<button id="collect_images">collect All Images</button>
	<p id="textCollect"></p>
	<button id="download_images">download All Images</button>
	<br>
    <div id="image_div"></div>
</body>
</html>

This HTML page contains the two buttons mentioned earlier and a text element that shows how many images were collected when the collect button is clicked.

Next, we need a script, popup.js, that defines what happens when the user clicks the buttons. When the user clicks the collect button, we want the extension to find all the ‘img’ tags on the page and extract each image's source. We then keep only the sources that are non-empty. To pass these sources along, we store them in Chrome's storage, which requires the storage permission in manifest.json:

{
    "name": "Google Image Downloader",
    "version": "1.0",
    "manifest_version": 2,

    "description": "This extension helps you to download all the images from a google page in a click.",
    "page_action": {
        "default_title": "Download Images from Google",
        "default_popup": "popup.html"
    },

    "permissions": [
        "storage",
        "downloads",
        "activeTab",
        "declarativeContent"
    ]
}

The manifest also requests permissions for downloads, the active tab and declarative content. We will explain these as we go through the post.

After collecting, we show the user how many images are available to download using the text element. Then, when the user presses the download button, we send a message containing the image sources. All of this behavior is defined in popup.js:

window.onload = function() {
	// "Collect" button: inject the collecting script into the active tab.
	let collectButton = document.getElementById('collect_images');
	collectButton.onclick = function() {
		chrome.tabs.executeScript({code : scriptCodeCollect});
		let textCollect = document.getElementById('textCollect');
		// executeScript is asynchronous, so this read may run before the
		// injected script has saved its results and show an older count.
		chrome.storage.local.get('savedImages', function(result) {
			textCollect.innerHTML = "collected " + result.savedImages.length + " images";
		});
	};

	// "Download" button: inject the script that messages the background page.
	let downloadButton = document.getElementById('download_images');
	downloadButton.onclick = function() {
		downloadButton.innerHTML = "Downloaded ";
		chrome.tabs.executeScript({code : scriptCodeDownload});
	};
};
// Code injected into the page to collect the image sources.
const scriptCodeCollect =
  `(function() {
		// collect all images
		let images = document.querySelectorAll('img');
		let srcArray = Array.from(images).map(function(image) {
			return image.currentSrc;
		});
		// remove empty sources
		let imagesToDownload = srcArray.filter(function(src) {
			return Boolean(src);
		});
		// save the result so the download step can read it later
		chrome.storage.local.set({ savedImages: imagesToDownload }, function() {
			console.log("local collection setting success:" + imagesToDownload.length);
		});
    })();`;

// Code injected into the page to send the saved sources to background.js.
const scriptCodeDownload =
  `(function() {
		chrome.storage.local.get('savedImages', function(result) {
			let message = {
				"savedImages" : result.savedImages
			};
			chrome.runtime.sendMessage(message, function() {
				console.log("sending success");
			});
		});
    })();`;
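
The filtering step in the collect script could also be factored into a small helper. The following is only a sketch: the name `cleanSources` is hypothetical, and removing duplicates is an extra assumption that the original code does not make. Because the collect script is injected as a string, a helper like this would have to be included inside that string to be used on the page.

```javascript
// Hypothetical helper (not part of the original extension):
// keeps only usable image sources, dropping empty values and
// duplicates while preserving first-seen order.
function cleanSources(srcArray) {
  const seen = new Set();
  const kept = [];
  for (const src of srcArray) {
    if (!src || seen.has(src)) continue; // skip "", null, undefined, repeats
    seen.add(src);
    kept.push(src);
  }
  return kept;
}

// Example: two distinct, non-empty sources remain.
console.log(cleanSources(["a.jpg", "", "b.jpg", "a.jpg", null]));
```

Keeping the filter as a pure function makes it easy to try out in a plain JavaScript console before wiring it into the extension.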

3. Write background script

Now, we want to tell Chrome when to show our extension's action. For this, we need a background script, background.js. Telling Chrome when to show the action requires the declarativeContent permission, and the manifest must also declare the background page. Additionally, we want to make use of Chrome tabs, so we have added the activeTab permission to the manifest as well:

{
    "name": "Google Image Downloader",
    "version": "1.0",
    "manifest_version": 2,

    "description": "This extension helps you to download all the images from a google page in a click.",
    "page_action": {
        "default_title": "Download Images from Google",
        "default_popup": "popup.html"
    },

    "permissions": [
        "storage",
        "downloads",
        "activeTab",
        "declarativeContent"
    ],

    "background" : {
        "scripts" : ["background.js", "jquery-3.4.1.min.js"],
        "persistent" : false
    }
}

The background script, background.js, looks like this:

// Start with an empty list of saved images.
let downloadsArray = [];
let initialState = {
	'savedImages': downloadsArray
};
chrome.runtime.onInstalled.addListener(function() {
	// Replace any existing rules with one that shows the page action on
	// hosts containing '.google' when the page has at least one img element.
	chrome.declarativeContent.onPageChanged.removeRules(undefined, function() {
		chrome.declarativeContent.onPageChanged.addRules([{
			conditions: [
				new chrome.declarativeContent.PageStateMatcher({
					pageUrl: { hostContains: '.google'},
					css: ['img']
				})
			],
			actions: [ new chrome.declarativeContent.ShowPageAction() ]
		}]);
	});
	chrome.storage.local.set(initialState);
	console.log("initialState set");
});

chrome.runtime.onMessage.addListener(
    function(message, sender, sendResponse) {
      console.log("message coming");
      console.log(message);
      let srcArray = message.savedImages;
      // Number the files 1.jpg, 2.jpg, ... inside a GoogleImages folder.
      let counter = 1;
      for (let src of srcArray) {
        chrome.downloads.download({url: src,
          filename: "GoogleImages/" + counter + ".jpg"});
        counter++;
      }
    });

Here we have defined a listener for the message sent by the code defined in popup.js. This message carries the image sources when the user clicks the download button. The onMessage listener extracts the sources and asks Chrome to download each image into a folder named GoogleImages in the user's download directory.
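
The other half of background.js, the declarativeContent rule, decides which pages show the icon. As a rough illustration of what `hostContains: '.google'` matches, here is a small approximation. It is a sketch and not Chrome's actual implementation: the function name `hostContainsApprox` is hypothetical, and it relies on the documented behaviour that Chrome adds an implicit dot in front of the host name before matching. The real rule additionally requires the page to contain an `img` element.

```javascript
// Rough approximation (an assumption, not Chrome's real code) of how a
// hostContains filter is evaluated: an implicit dot is placed in front
// of the host name, then a plain substring check is done.
function hostContainsApprox(url, fragment) {
  const host = new URL(url).hostname;
  return ("." + host).includes(fragment);
}

const samples = [
  "https://www.google.com/search?q=mango",
  "https://google.co.in/",
  "https://example.com/google",          // "google" is in the path, not the host
  "https://www.googleusercontent.com/"   // host component merely starts with "google"
];
for (const url of samples) {
  console.log(url, hostContainsApprox(url, ".google"));
}
```

Note that the last sample also matches under this approximation, so a stricter filter may be worth considering if the icon should appear only on Google search pages.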

Steps to See the extension in action

Finally, we need to load the extension into the browser. To do so, go to chrome://extensions, switch on developer mode, click on Load unpacked and select the extension's folder. The extension will be loaded into the browser.

Now it's time to see our extension in action. Since the extension works on Google pages, open a Google tab, search for mango images, click on the Images tab and scroll down until enough images have loaded. Then click on the G icon, which is the default icon because I have not added any icons to the extension. The popup shows two buttons.

Next, we click on the “collect All Images” button, and the popup displays “collected x images”, where x is the number of images found. If it shows “collected 0 images”, try clicking the button once more.
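
A likely reason for a count of 0 on the first click is timing: popup.js reads chrome.storage right after calling chrome.tabs.executeScript, which is asynchronous, so the read can happen before the injected script has saved anything. One possible way around this is sketched below. It assumes Manifest V2, where executeScript accepts a callback that receives the value of the last expression evaluated by the injected code (one entry per frame). The names `collectViaResult`, `tabsApi` and `collectCode` are hypothetical. Passing the API in as a parameter lets the sketch run outside Chrome with a stub.

```javascript
// Hypothetical helper: runs `code` in the active tab and reports how many
// non-empty image sources it returned. `tabsApi` is expected to behave
// like chrome.tabs in Manifest V2 (assumption).
function collectViaResult(tabsApi, code, onCount) {
  tabsApi.executeScript({ code: code }, function (results) {
    // results holds one entry per frame the code ran in; use the first.
    const sources = (results && results[0]) || [];
    onCount(sources.filter(Boolean).length);
  });
}

// Injected code whose last expression (the array of sources) becomes the result.
const collectCode =
  "Array.from(document.querySelectorAll('img'))" +
  ".map(function (image) { return image.currentSrc; })";

// In popup.js this might be used roughly as follows:
// collectViaResult(chrome.tabs, collectCode, function (n) {
//   document.getElementById('textCollect').textContent =
//     "collected " + n + " images";
// });
```

With this approach the count is shown only after the page script has actually run, so a second click should no longer be needed to see it.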

Next, to download these images, we click on the “download All Images” button. Its label changes to “Downloaded” once clicked, and the downloads start.

All the downloaded images go to a folder named GoogleImages in the download directory. With that, we have completed our Chrome extension to download all images from a Google page. It should be a real help to users who need to collect images to build datasets for deep learning models. Unfortunately, we have not added any filtering by image type or size, so the extension saves every image with a .jpg extension and at the same size as the source.
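
One way to soften the fixed .jpg limitation is to guess the extension from the image source itself. The sketch below is only an illustration: `guessExtension` and `buildFilename` are hypothetical names, the list of recognised extensions is an assumption, and the guess is based purely on the text of the URL or data URI, not on the actual image bytes. When nothing can be guessed, it falls back to "jpg", matching the current behaviour.

```javascript
// Hypothetical helper: guess a file extension from an image source string.
function guessExtension(src) {
  // Data URIs look like "data:image/png;base64,...."
  const dataMatch = /^data:image\/([a-z0-9+.-]+)[;,]/i.exec(src);
  if (dataMatch) {
    const subtype = dataMatch[1].toLowerCase();
    if (subtype === "jpeg") return "jpg";
    if (subtype === "svg+xml") return "svg";
    return subtype;
  }
  // For ordinary URLs, look at the end of the path, ignoring query and hash.
  const path = src.split(/[?#]/)[0];
  const urlMatch = /\.(png|jpe?g|gif|webp|svg|bmp)$/i.exec(path);
  if (!urlMatch) return "jpg"; // fallback: same extension the post uses
  const ext = urlMatch[1].toLowerCase();
  return ext === "jpeg" ? "jpg" : ext;
}

// Builds the same "GoogleImages/<n>.<ext>" names used in background.js,
// but with the guessed extension instead of a fixed ".jpg".
function buildFilename(src, counter) {
  return "GoogleImages/" + counter + "." + guessExtension(src);
}

console.log(buildFilename("https://example.com/pics/cat.PNG?size=large", 1));
console.log(buildFilename("data:image/jpeg;base64,AAAA", 2));
```

In background.js, the `filename` passed to chrome.downloads.download could then come from `buildFilename(src, counter)`.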

Learning Tools:

  1. developer.chrome.com/extensions/devguide is a good place to start if you already have knowledge of JavaScript.
  2. For JavaScript learners, W3Schools is a good place to start.
  3. Additionally, I found these Stack Overflow posts useful:
    a. https://stackoverflow.com/questions/23596972/chrome-runtime-sendmessage-in-content-script-doesnt-send-message/23597052
    b. https://stackoverflow.com/questions/13667176/chrome-extension-onmessage

Learning Strategy:

During the course of this project, I had some difficulties with JavaScript events, and I learned a lot about them through Stack Overflow. I first tried calling chrome.downloads.download directly from the popup.js script, but that did not work. So I switched to messaging: I used sendMessage to send a message and defined an onMessage listener in the background file. That solved the problem.

Reflective Analysis:

Previously, when collecting images from Google, we had to download each image individually. Now, with this extension installed, a single click downloads all the images on the page.

Conclusion and Future Directions:

With this post, one should now be able to develop simple Chrome extensions of the page action or browser action type. The extension developed here simply downloads all images when clicked. A possible next step is an options page that first shows the user all the collected images and downloads an image only when the user clicks on it.

So, that’s it! Thanks for reading. The complete source code is on GitHub.

Citations:
1. https://developer.chrome.com/home
2. https://developer.chrome.com/extensions/devguide
