Connecting color and language

10:30 AM - 10:55 AM on August 15, 2015, Room 704

Dean Hillan

This project maps descriptive, human-generated color names onto a collection of images, so that the images can be searched by their color content. The technique is applied to the Paperless Post card collection to help improve user search.


A few years ago the online comic xkcd ran a color survey in which volunteers were asked to name randomly displayed colors. These data provide a unique and fun way to map detailed human descriptions to color space. I use some straightforward transformations between the word corpus of colors and the RGB color space, and leverage libraries such as numpy and scikit-learn to create ranked results for any user search on a collection of images. I apply the technique to the card collection at Paperless Post to demonstrate its accuracy and its ability to return images containing very specific color combinations.
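
As a rough illustration of the idea, and not the method used in the talk, the sketch below maps a few color names to RGB values and ranks images by how much of each queried color they contain. Everything in it is an assumption made for the example: the small `COLOR_NAMES` table and its RGB values, the Gaussian similarity with width `sigma`, and the helper names `color_score` and `rank_images`. It uses only numpy; a real system would build the name table from the survey responses and would likely use a more careful color distance.

```python
import numpy as np

# Hypothetical name -> RGB table with illustrative values. A real table
# would be aggregated from the survey responses.
COLOR_NAMES = {
    "red": (229, 0, 0),
    "navy blue": (0, 17, 70),
    "teal": (2, 147, 134),
    "mustard": (206, 179, 1),
}


def color_score(image, rgb, sigma=40.0):
    """Mean Gaussian similarity between every pixel and a target RGB color.

    `image` is an (H, W, 3) array of RGB values in 0..255. The score is
    close to 1 when most pixels are near the target and close to 0 when
    none are. `sigma` (an assumed tuning knob) sets how near is "near".
    """
    pixels = image.reshape(-1, 3).astype(float)
    dist2 = ((pixels - np.asarray(rgb, dtype=float)) ** 2).sum(axis=1)
    return float(np.exp(-dist2 / (2.0 * sigma ** 2)).mean())


def rank_images(images, query_names):
    """Return image keys sorted from best to worst match.

    An image's score is the product of its per-color scores, so it must
    contain every queried color to rank highly.
    """
    scores = {}
    for key, image in images.items():
        per_color = [color_score(image, COLOR_NAMES[name]) for name in query_names]
        scores[key] = float(np.prod(per_color))
    return sorted(scores, key=scores.get, reverse=True)


def solid(rgb, shape=(4, 4)):
    """Build a small synthetic image filled with one color."""
    return np.full(shape + (3,), rgb, dtype=np.uint8)


# Three synthetic "cards": all red, all teal, and half of each.
images = {
    "all_red": solid(COLOR_NAMES["red"]),
    "all_teal": solid(COLOR_NAMES["teal"]),
    "red_and_teal": np.concatenate(
        [solid(COLOR_NAMES["red"]), solid(COLOR_NAMES["teal"])], axis=1
    ),
}

print(rank_images(images, ["red"]))
print(rank_images(images, ["red", "teal"]))
```

Multiplying the per-color scores is one simple way to favor images that contain all of the requested colors; a weighted sum or a nearest-neighbor lookup over per-image color histograms would be reasonable alternatives.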