Create a Colour Palette from an RGB Image with the Gauguin Ruby Gem
How many colors do you see?
Have you ever wondered how many colors a picture actually has? How many colors do you think there are in the following image?
Many people would say 2, but actually there are 1942.
That's because, to give the shapes smooth edges, not all of the pixels are pure black: many of them are shades of gray.
Existing solutions
I came across this interesting problem recently, when I needed to validate the color count of images that users upload to our Rails application.
My first thought was that this should be trivial with RMagick. Unfortunately, the color_histogram method returns all these grayscale colors too, and I couldn't find a way to exclude them using RMagick alone. I tried every image processing option I could find, flattening the colors as much as possible to reduce the color count, but I couldn't find parameters that worked for all the images that people might upload.
I found a partial solution in a Python library, colorific, but I wasn't able to configure it so that it would work for all my sample images either.
Algorithm
So I decided to change my approach completely. Using RMagick I got a color histogram, so I had all the existing colors with their occurrence count, and I was able to easily calculate the percentage of each particular color in the image.
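As a rough illustration of that step, here is a minimal sketch that turns a histogram into percentages. It assumes the histogram has already been reduced to a plain Ruby hash mapping an [r, g, b] array to a pixel count; the color_percentages helper and the sample data are hypothetical and not part of RMagick or the gem.

```ruby
# Hypothetical helper: converts a hash of color => pixel count
# into a hash of color => percentage of the whole image.
def color_percentages(histogram)
  total = histogram.values.sum
  histogram.transform_values { |count| count * 100.0 / total }
end

# Made-up histogram of a 1000-pixel image.
histogram = {
  [0, 0, 0]       => 900,
  [255, 255, 255] => 80,
  [128, 128, 128] => 20
}

percentages = color_percentages(histogram)
percentages.each { |color, share| puts "#{color.inspect}: #{share}%" }
```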
Now I needed to decide which colors to take into consideration and which to ignore. I didn't want to just set an arbitrary percentage threshold. Imagine we set the threshold to 1%, so every color that doesn't cover more than 1% is ignored. Now let's take an image 10 pixels wide and 10 pixels high consisting of 100 different colors, so each one covers exactly 1% of the image. With such a threshold, all colors would be ignored, so the image would be reported as having 0 colors while it actually has 100. Setting the threshold too low would not work either: for other pictures it would count more colors than expected.
So I came up with the idea of sorting colors by percentage in descending order and taking colors from the top until their cumulative share reaches an arbitrary percentage. After experimenting with many samples I chose 98.1%.
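The failure case can be reproduced in a few lines. This is only a sketch of the naive approach being rejected here; colors_above_threshold is a hypothetical helper, and the input is assumed to be a hash of color => percentage.

```ruby
# Hypothetical helper for the naive approach: keep only colors whose
# share of the image is strictly above the threshold (in percent).
def colors_above_threshold(percentages, threshold)
  percentages.select { |_color, share| share > threshold }
end

# A 10x10 image in which every pixel has a different color:
# 100 colors, each covering exactly 1% of the image.
uniform = (0...100).to_h { |i| [[i, i, i], 1.0] }

puts colors_above_threshold(uniform, 1.0).size # prints 0
```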
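A minimal sketch of this cumulative cutoff could look like the following. The dominant_colors name is hypothetical, and the choice to keep the color that crosses the cutoff is an assumption of the sketch; the text does not say how the boundary is handled.

```ruby
# Hypothetical helper: sorts colors by share (descending) and keeps them
# until the shares collected so far reach the cutoff.
# Assumption: the color that crosses the cutoff is still kept.
def dominant_colors(percentages, cutoff = 98.1)
  covered = 0.0
  kept = percentages.sort_by { |_color, share| -share }.take_while do |_color, share|
    still_needed = covered < cutoff
    covered += share
    still_needed
  end
  kept.to_h
end

# Two real colors plus a little anti-aliasing noise (made-up numbers).
percentages = {
  [0, 0, 0]       => 90.0,
  [255, 255, 255] => 8.5,
  [120, 120, 120] => 1.0,
  [200, 200, 200] => 0.5
}

puts dominant_colors(percentages).keys.inspect
```

With this rule the two gray shades are dropped as noise, while an image made of 100 equally frequent colors keeps nearly all of them instead of losing every one.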
But then I conducted an experiment on the image below:
As you can see, there are gradients on every letter and here these gradients actually matter.
Take a look at the magnified letters “o” and “r”:
If you take just one of the green colors from the “o”, its share of the image is so low that the reduction algorithm will ignore it.
I realized that I needed to group similar colors before reducing. So first I had to decide what “similar” actually means. I tried comparing two colors in different color spaces (RGB, YUV and Lab), and the last one turned out to be the most appropriate, because it matches more closely how the human eye perceives color.
For performance reasons I also added a color limiting phase, for cases where the number of colors is so large that calculating groups would take too long. By default I take into consideration only the 10000 colors with the highest percentage value.
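To make “similar in Lab” concrete, here is a sketch that converts sRGB to CIE Lab with the standard D65 formulas and groups colors greedily by Euclidean distance. The text does not describe the clustering actually used, so the greedy strategy, the helper names and the max_distance value of 25 are all assumptions made for illustration.

```ruby
# Reference white for the D65 illuminant (X, Y, Z).
D65_WHITE = [0.95047, 1.0, 1.08883].freeze

# Undo the sRGB gamma curve for one 0..255 channel.
def srgb_to_linear(channel)
  c = channel / 255.0
  c <= 0.04045 ? c / 12.92 : ((c + 0.055) / 1.055)**2.4
end

# Convert an [r, g, b] array (0..255) to [L, a, b] via CIE XYZ.
def rgb_to_lab(rgb)
  r, g, b = rgb.map { |channel| srgb_to_linear(channel) }
  xyz = [
    0.4124564 * r + 0.3575761 * g + 0.1804375 * b,
    0.2126729 * r + 0.7151522 * g + 0.0721750 * b,
    0.0193339 * r + 0.1191920 * g + 0.9503041 * b
  ]
  fx, fy, fz = xyz.zip(D65_WHITE).map do |value, white|
    t = value / white
    t > 0.008856 ? Math.cbrt(t) : (7.787 * t) + (16.0 / 116.0)
  end
  [116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)]
end

# Euclidean distance between two RGB colors, measured in Lab space.
def lab_distance(rgb_a, rgb_b)
  lab_a = rgb_to_lab(rgb_a)
  lab_b = rgb_to_lab(rgb_b)
  Math.sqrt(lab_a.zip(lab_b).sum { |a, b| (a - b)**2 })
end

# Hypothetical greedy grouping: walk colors from most to least frequent and
# attach each one to the first main color within max_distance, or make it
# a new main color. Returns main color => array of absorbed colors.
def group_similar(percentages, max_distance = 25.0)
  groups = {}
  percentages.sort_by { |_color, share| -share }.each do |color, _share|
    main = groups.keys.find { |candidate| lab_distance(candidate, color) <= max_distance }
    if main
      groups[main] << color
    else
      groups[color] = []
    end
  end
  groups
end

sample = { [0, 0, 0] => 60.0, [255, 255, 255] => 30.0, [10, 10, 10] => 6.0, [245, 245, 245] => 4.0 }
puts group_similar(sample).inspect
```

In this made-up sample the near-black and near-white shades end up grouped under pure black and pure white, so their shares can be counted together before any noise reduction.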
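The limiting phase is the simplest of the three. A sketch, assuming a hash of color => percentage and a hypothetical limit_colors helper:

```ruby
# Hypothetical helper: keep only the `limit` colors with the highest share.
def limit_colors(percentages, limit = 10_000)
  percentages.max_by(limit) { |_color, share| share }.to_h
end

percentages = { [0, 0, 0] => 70.0, [255, 0, 0] => 5.0, [255, 255, 255] => 25.0 }
puts limit_colors(percentages, 2).inspect
```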
Summary
To sum up, the phases run in this order: limiting colors, clustering, and noise reduction. The result is a Ruby hash with the main colors as keys and grouped colors as values.
You can use it to display a color palette:
After doing so much work on this, I decided to make it publicly available and extracted the code into a gem. I named it gauguin in honor of my favourite artist.
Recoloring
One of my Lunar colleagues, Hania Seweryn, came up with a great idea: a feature that would take the image and its calculated palette and return a new image, colored only with the main colors. I loved the idea, so I implemented it in the gauguin gem.
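The idea can be sketched without any image library by treating the image as rows of [r, g, b] pixels and the palette as the hash described in the summary. This is not the gem's implementation: recolor_pixels is a hypothetical helper, and leaving colors that are missing from the palette unchanged is an assumption of the sketch.

```ruby
# Hypothetical helper: replaces every pixel that belongs to a group
# with the group's main color.
# Assumption: pixels whose color is not in the palette stay unchanged.
def recolor_pixels(pixels, palette)
  replacement = {}
  palette.each do |main, grouped|
    replacement[main] = main
    grouped.each { |color| replacement[color] = main }
  end
  pixels.map { |row| row.map { |pixel| replacement.fetch(pixel, pixel) } }
end

# Made-up palette: main color => grouped colors.
palette = {
  [0, 0, 0]       => [[10, 10, 10]],
  [255, 255, 255] => [[245, 245, 245]]
}

pixels = [
  [[0, 0, 0], [10, 10, 10]],
  [[245, 245, 245], [255, 255, 255]]
]

puts recolor_pixels(pixels, palette).inspect
```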
For the above image it would be:
What can I use this for?
You can use the palette method whenever you need a realistic color count of an image or want to create color palettes based on images. The recolor method can be used to reduce noise, for example before vectorising an image.
It was very useful in my original task – I needed to validate the color count of the images uploaded to our application because they were meant to be printed, and the price depended on the number of colors. Images were to be vectorised before printing, and thanks to the recolor feature I could show the user how the colors will be reduced when the image is printed.