Compare images in Ruby

Last week I ran into a problem that cost me a couple of hours. I am using RMagick at Lifebooker to generate images with stickers on them. In short, we need to put a sticker such as “Last chance” on our deals, so that people who receive an email with their discounts notice which ones are about to expire.

We aim for 100% test coverage at Lifebooker, so we try to embrace TDD as much as possible. In this case I had to test that I was always generating the same image, with the sticker exactly where I wanted it. Given two images, the original and the sticker, I want to produce exactly the combined image shown below them.

 

[Image: lifebooker original]

Easy enough. I came up with a function that did this job very well.

So you might be wondering, what’s the big deal? Testing this was a big pain in the ass. The normal approach would be to compute the file’s digest, using Digest::MD5.hexdigest(file). That kept failing, and after many tries I gave up and decided to compare the files by hand with an awesome hex diff tool called dhex. And I found something weird. Check out the image: you’ll see that 8 bytes are different. And they differ in the same way, 0c 13 becomes 1c 20.. twice. A lot could be said about why this happened, but in short it all came down to the fact that I passed the file path to RMagick and let it do its ‘magick’: opening the file, writing the new content to it, and saving it.
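For reference, a digest-based check of this kind can be sketched with nothing but Ruby's standard library. The `same_digest?` helper and the made-up byte strings below are illustrative assumptions, not the original test code. The sketch hashes file contents with `Digest::MD5.file` and shows that a couple of changed bytes are enough to produce a different digest.

```ruby
require 'digest/md5'
require 'tmpdir'

# Hypothetical helper: true when both files are byte-for-byte identical,
# judged by the MD5 digest of their contents.
def same_digest?(path_a, path_b)
  Digest::MD5.file(path_a).hexdigest == Digest::MD5.file(path_b).hexdigest
end

Dir.mktmpdir do |dir|
  expected = File.join(dir, 'expected.bin')
  copy     = File.join(dir, 'copy.bin')
  changed  = File.join(dir, 'changed.bin')

  # Made-up payload standing in for real image data.
  bytes = "\x89PNG".b + ("\x0c\x13".b * 8)
  File.binwrite(expected, bytes)
  File.binwrite(copy, bytes)
  # Same length, but the first "0c 13" pair is replaced by "1c 20".
  File.binwrite(changed, bytes.sub("\x0c\x13".b, "\x1c\x20".b))

  puts same_digest?(expected, copy)     # true
  puts same_digest?(expected, changed)  # false
end
```

A check like this is only as stable as the bytes on disk, which is exactly why a few differing bytes were enough to break the test.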

 

 

It turns out that probably the best approach is to open the file in “rb” (read binary) mode, change everything with RMagick in memory, and then save it back using Ruby’s standard File class. That would yield a stable MD5 hash, which was what I was comparing. I did not realize this at the time and ended up looking for a way to compare the images directly in RMagick. And lucky me, there are many methods to compare images in RMagick!

It turns out that the best way to do this is not to calculate the digest of the binary data, which will be different (really, no matter how hard you try, it always generates an image with those 8 different bytes, which mess up the digest). What about taking a different approach and comparing the pixels’ RGB values instead? At the end of the day, they should be the same for two identical images. And it works. Glenn Randers-Pehrson (libpng maintainer and ImageMagick contributor) wrote about this on the ImageMagick mailing list, so I just use RMagick’s signature.

 

Lesson learned: if you have to compare images, compare their pixels. The guys at libpng and ImageMagick probably know better than I do.