How Shutterstock Searches 35 Million Images by Color Using Apache Solr
How Shutterstock uses Apache Solr to extract color data from images and index them into Solr search.
As we countdown to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Shutterstock engineer Chris Becker’s session on how they use Apache Solr to search 35 million images by color.
This talk covers some of the methods they’ve used for building color search applications at Shutterstock using Solr to search 40 million images. A couple of these applications can be found in Shutterstock Labs – notably Spectrum and Palette. We’ll go over the steps for extracting color data from images and indexing them into Solr, as well as looking at some ways to query color data in your Solr index. We’ll cover some issues such as what does relevance mean when you’re searching for colors rather than text, and how you can achieve various effects by ranking on different visual attributes.
At the timeof this presetnation, Chris was the Principal Engineer of Search at Shutterstock– a stock photography marketplace selling over 35 million images– where he’s worked on image search since 2008. In that time he’s worked on all the pieces of Shutterstock’s search technology ecosystem from the core platform, to relevance algorithms, search analytics, image processing, similarity search, internationalization, and user experience. He started using Solr in 2011 and has used it for building various image search and analytics applications.
Join us at Lucene/Solr Revolution 2015, the biggest open source conference dedicated to Apache Lucene/Solr on October 13-16, 2015 in Austin, Texas. Come meet and network with the thought leaders building and deploying Lucene/Solr open source search technology. Full details and registration…
LEARN MORE
Contact us today to learn how Lucidworks can help your team create powerful search and discovery applications for your customers and employees.