Intro
This was the original application of the dominant color detection described in my “Color Palette Extractor” blog post. These scripts take a video (ideally a movie) and extract the dominant colors every n frames, creating a “fingerprint” of the colors used throughout it.
Code Dev
Re-scaling
The first step was to pre-process the movie, lowering its resolution to a more manageable size (mainly to speed up the clustering). Even though resizing can affect the colors due to compression, we aren’t downsizing the video by much, so the impact should be small. Here’s a snippet of the code that does this down-sampling by using ffmpeg:
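The repo drives ffmpeg from Python (it lists ffmpeg-python as a dependency); as a minimal sketch of the same down-scaling step, here is the equivalent ffmpeg invocation built with the standard library. The file names are hypothetical placeholders, and the 320 px target width is an assumed value, not necessarily what the scripts use:

```python
import subprocess

# Hypothetical input/output names; adjust to your own files
src, dst = "movie.mp4", "movie_small.mp4"

# scale=320:-2 resizes to 320 px wide; -2 keeps the aspect
# ratio while forcing an even height (required by many codecs)
cmd = ["ffmpeg", "-i", src, "-vf", "scale=320:-2", dst]

# subprocess.run(cmd, check=True)  # uncomment to actually transcode
print(" ".join(cmd))
```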
Exporting frames
Now, to do the clustering to identify the dominant color of the images, there are two approaches we can take:
- Calculate the dominant colors “on-the-fly” by loading the video and going through the frames in memory (more efficient)
- Export the frames to independent images, and then calculate the clustering on the images in a different script (more flexible)
As this is an exploratory script that requires various tweaks and incremental improvements, and because I wanted to build upon it for other applications, I chose the second route (although I do provide a deprecated version of the “on-the-fly” analysis for anyone interested). The exportFrames file does this by taking a video file and the number of frames we want exported:
This gives us all our frames in a folder, ready for further processing in this and other applications (such as “Color Palette Extractor”).
Dominant colors
With our images ready, we go on to calculate the clusters of dominant colors with fingerprint.py:
The function that does all the work is aux.calculateDominantColors, so we’re gonna break it down:
Assembling
Here, we iterate through the images in the folder: we load each one, convert it from BGR to RGB (OpenCV loads images in BGR format), and reshape the array to collapse the image’s two spatial dimensions (turning a 2D array of RGB entries into a 1D vector of RGB rows). Finally, we cluster the colors to obtain the dominant color palette (this process is described in more detail in my other blog post, “Color Palette Extractor”), and append the result to a vector that will contain the palettes of all the frames.
Further Thoughts
I’d like to automate this process even further in the near future by creating a wrapper that takes care of all the steps in the process, so stay tuned!
Code repo
- Repository: Github repo
- Dependencies: opencv-python, ffmpeg-python, Pillow, numpy, scikit-learn, matplotlib, and the ffmpeg binary