Generating MD5 Hashes in a Browser

One day I stumbled upon an interesting question: "How can I generate a list of hashes for all the images on a page, from inside the browser?" It got me curious about how versatile JavaScript actually is. Here is the result of that endeavor.

Prelude

I will be using https://picsum.photos/ since it has no CORS issues. For any other website, an extension or browser flag might be required to disable CORS protection inside the browser.

There are many MD5 libraries to choose from; through trial and error I selected https://github.com/emn178/js-md5/. The most common issue with the others was a lack of support for ArrayBuffer input, which resulted in incorrect hash values.

Loading an External Package

While this might sound daunting at first, it's not that bad. Most packages can be executed inside the browser without any compilation via babel/webpack.

Inside the browser console this is done by creating a script element, pointing its src at the script file, and appending it to the page. In our case the source will be https://cdn.jsdelivr.net/npm/js-md5@0.7.3/src/md5.min.js.

var script = document.createElement("script");
script.src = "https://cdn.jsdelivr.net/npm/js-md5@0.7.3/src/md5.min.js";

document.querySelector("head").appendChild(script);

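Once the script has loaded, an md5 function will be available inside the console. Script tags are not subject to CORS, but a strict Content Security Policy on the page can still block the load. Note: the same technique can be used to load jQuery or similar libraries to do some DOM heavy lifting.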
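
Because the script loads asynchronously, it can help to know exactly when it is ready. Below is a minimal sketch of a Promise-based loader. The name loadScript and its optional doc parameter are my own assumptions for illustration (doc defaults to the page's document and only exists so the helper can be exercised outside a browser); they are not part of js-md5.

```javascript
// Hypothetical helper: inject a script tag and return a Promise that
// resolves once the browser reports the script as loaded.
function loadScript(url, doc = document) {
  return new Promise((resolve, reject) => {
    const script = doc.createElement("script");
    script.src = url;
    script.onload = () => resolve(script);
    script.onerror = () => reject(new Error("Failed to load " + url));
    doc.head.appendChild(script);
  });
}

// Usage in the console:
// loadScript("https://cdn.jsdelivr.net/npm/js-md5@0.7.3/src/md5.min.js")
//   .then(() => console.log(typeof md5));
```

With this in place, anything that depends on md5 can be chained after the returned Promise instead of relying on timing.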

Getting the Sources

Next, it is necessary to find all the image sources on the page. With ES6 spread syntax and arrow functions it is a trivial task.

var imgs = [...document.querySelectorAll("img")];
var imgSrcs = imgs.map((i) => i.src);
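
On real pages the list can contain empty strings (images without a src), duplicates, and data: URLs. The sketch below is an optional clean-up step; cleanSources is a hypothetical name, and dropping everything that is not http(s) is a design choice of mine rather than a requirement.

```javascript
// Hypothetical helper: keep only unique http(s) sources.
function cleanSources(srcs) {
  const httpOnly = srcs.filter((src) => /^https?:\/\//.test(src));
  // A Set preserves insertion order, so the first occurrence wins.
  return [...new Set(httpOnly)];
}

// Usage in the console, assuming imgSrcs from the snippet above:
// var imgSrcs = cleanSources(imgSrcs);
```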

Preparing to Fetch

Now that the imgSrcs array holds all the sources, a function is needed to fetch each source and convert it into a Blob. This Blob object must then be converted into an ArrayBuffer. Most modern browsers have the FileReader API, which facilitates working with Files and Blobs. And last but not least, Fetch is a modern replacement for XMLHttpRequest.

var getData = (url) =>
  fetch(url)
    .then((response) => response.blob())
    .then(
      (blob) =>
        new Promise((resolve, reject) => {
          const reader = new FileReader();
          reader.onloadend = () => resolve(reader.result);
          reader.onerror = reject;
          reader.readAsArrayBuffer(blob);
        })
    );

This function will be used later to generate a list of Promises. Inside the second then block a new Promise is created. This is necessary because reader.readAsArrayBuffer is an asynchronous operation that fires onloadend or onerror once the read has finished.
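
As an alternative, the Response object returned by fetch has its own arrayBuffer() method, so the FileReader step can be skipped. The sketch below also checks response.ok, since fetch does not reject on HTTP errors such as 404. The name getBuffer is hypothetical and this is not the approach used in the rest of this post.

```javascript
// Sketch: produces the same kind of ArrayBuffer as getData, but reads it
// straight from the Response. getBuffer is a hypothetical name.
var getBuffer = (url) =>
  fetch(url).then((response) => {
    if (!response.ok) {
      // fetch only rejects on network errors, so HTTP errors are raised here.
      throw new Error("HTTP " + response.status + " for " + url);
    }
    return response.arrayBuffer();
  });
```

It can be swapped in wherever getData is used, for example imgSrcs.map(getBuffer).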

Making Promises

It is time to send out the request for each image.

var promises = Promise.all(imgSrcs.map(getData));

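Because imgSrcs.map(getData) starts all the fetch requests at once, the overall delay is reduced, and Promise.all combines them into a single promise that resolves with the results in the same order as imgSrcs. Of course, it rejects as soon as any one request fails, for example on a network or CORS error.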
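
If one bad image should not sink the whole batch, Promise.allSettled is worth a look. Here is a minimal sketch; loadAllSettled and the shape of its result are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper: run a loader over every URL and split the outcome
// into successes and failures instead of failing as a whole.
function loadAllSettled(urls, load) {
  return Promise.allSettled(urls.map((url) => load(url))).then((results) => {
    const loaded = [];
    const failed = [];
    results.forEach((result, i) => {
      if (result.status === "fulfilled") {
        loaded.push({ url: urls[i], buffer: result.value });
      } else {
        failed.push({ url: urls[i], reason: result.reason });
      }
    });
    return { loaded, failed };
  });
}

// Usage in the console, assuming imgSrcs and getData from above:
// loadAllSettled(imgSrcs, getData).then(({ loaded, failed }) => {
//   console.log(loaded.length + " loaded, " + failed.length + " failed");
// });
```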

Creating Hashes

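The final step is to wait for the combined promise to resolve, convert each ArrayBuffer into an MD5 hash, and print the results. Despite its name, promises is a single promise that resolves to an array of ArrayBuffers.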

promises
  .then((buffers) => buffers.map(md5))
  .then((hashes) => console.log(hashes));
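
A bare list of hashes is hard to read without knowing which image each one belongs to. Since Promise.all keeps the results in input order, the hashes can be paired with their sources by index. A small sketch follows; pairHashes is a hypothetical helper name.

```javascript
// Hypothetical helper: pair each source URL with the hash at the same index.
function pairHashes(srcs, hashes) {
  return srcs.map((src, i) => ({ src, hash: hashes[i] }));
}

// Usage in the console, assuming imgSrcs, promises and md5 from above:
// promises
//   .then((buffers) => pairHashes(imgSrcs, buffers.map((b) => md5(b))))
//   .then((rows) => console.table(rows));
```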

Final Thoughts

There are many other things that could be tried. Another hashing algorithm, such as SHA-256, could be used. The Web Crypto API could replace the injected external library (for SHA only, since it does not provide MD5). And what about error handling, given that Promise.all rejects if any of the promises fail?

Code:

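// Get md5 library (the element is named script rather than md5 to avoid
// confusion with the global md5 function that the library defines)
var script = document.createElement("script");
script.src = "https://cdn.jsdelivr.net/npm/js-md5@0.7.3/src/md5.min.js";
document.querySelector("head").appendChild(script);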

// Prepare function to get image blobs
var getData = (url) =>
  fetch(url)
    .then((response) => response.blob())
    .then(
      (blob) =>
        new Promise((resolve, reject) => {
          const reader = new FileReader();
          reader.onloadend = () => resolve(reader.result);
          reader.onerror = reject;
          reader.readAsArrayBuffer(blob);
        })
    );

// Get image sources
var imgSrcs = [...document.querySelectorAll("img")].map((i) => i.src);

// Load images
var promises = Promise.all(imgSrcs.map(getData));

// Calculate hashes
promises
  .then((buffers) => buffers.map(md5))
  .then((hashes) => console.log(hashes));

In the end I was satisfied with the results: modern JavaScript is a powerful tool to know and use.