ImageSiftBot

ImageSiftBot is a web crawler that scrapes the internet for publicly available images to support our suite of web intelligence products.

Requests from ImageSiftBot set the User-Agent to:

Mozilla/5.0 (compatible; ImagesiftBot; +imagesift.com)
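
For site operators who want to spot these requests in server logs or application code, the following is a minimal sketch that checks a User-Agent header for the ImagesiftBot token. The function name is_imagesiftbot is hypothetical, and the case-insensitive substring match is an assumption made for illustration. Because any client can set its own User-Agent header, a match is only a hint and not proof of origin.

```python
def is_imagesiftbot(user_agent: str) -> bool:
    """Return True if the User-Agent header contains the ImagesiftBot token.

    Assumption: a case-insensitive substring match is good enough for
    log filtering. The header can be spoofed, so this is not authentication.
    """
    return "imagesiftbot" in user_agent.lower()


if __name__ == "__main__":
    header = "Mozilla/5.0 (compatible; ImagesiftBot; +imagesift.com)"
    print(is_imagesiftbot(header))
```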

Contact Us

If you have any questions about ImageSiftBot or would like to opt out of being crawled, please contact us by email at

support@imagesift.com

FAQ

Does ImageSiftBot follow robots.txt rules?

Standard directives in robots.txt that target ImagesiftBot are respected. For example, the following will allow ImagesiftBot to crawl all pages, except those under /private/:

User-Agent: ImagesiftBot
Allow: /
Disallow: /private/
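
The outcome described for this example (everything allowed except /private/) matches the longest-match precedence defined in RFC 9309, where the most specific matching rule wins regardless of order. The sketch below illustrates that rule under the assumption that ImagesiftBot evaluates Allow and Disallow this way; the function name is_allowed and the tuple representation of rules are hypothetical, and wildcard patterns are not handled.

```python
def is_allowed(rules, path):
    """Decide whether a path may be fetched under longest-match precedence.

    rules: list of (directive, prefix) tuples, e.g. ("disallow", "/private/").
    Assumption: plain prefix matching only, no "*" or "$" patterns.
    """
    best_length = -1
    allowed = True  # a path that matches no rule is allowed
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # an empty value or non-matching prefix has no effect
        longer = len(prefix) > best_length
        tie_prefers_allow = len(prefix) == best_length and directive == "allow"
        if longer or tie_prefers_allow:
            best_length = len(prefix)
            allowed = directive == "allow"
    return allowed


if __name__ == "__main__":
    rules = [("allow", "/"), ("disallow", "/private/")]
    for path in ("/gallery/cat.jpg", "/private/cat.jpg"):
        print(path, is_allowed(rules, path))
```

Note that a parser using first-match ordering would treat the same file differently, because "Allow: /" appears first and matches every path.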

ImagesiftBot also supports the crawl-delay directive in robots.txt files. It interprets the value as the minimum duration, in seconds, between the start of consecutive requests. For example, assume you have specified the following in your robots.txt file:

User-Agent: ImagesiftBot
Crawl-delay: 5

ImagesiftBot will split each day into 5-second intervals and issue at most one request to your domain within each interval.
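
The interval behavior can be sketched as a small rate limiter that maps each timestamp to an interval index and permits one request per index. This is an illustration of the description above and not ImagesiftBot's actual scheduler; the class name IntervalLimiter and its method try_acquire are hypothetical, and timestamps are assumed to be seconds since midnight.

```python
class IntervalLimiter:
    """Permit at most one request per fixed-length interval.

    Assumption: timestamps are seconds since midnight, so interval 0
    starts at the beginning of the day.
    """

    def __init__(self, crawl_delay_seconds):
        if crawl_delay_seconds <= 0:
            raise ValueError("crawl delay must be positive")
        self.crawl_delay_seconds = crawl_delay_seconds
        self.last_interval = None  # index of the interval last used

    def try_acquire(self, now_seconds):
        """Return True if a request may be issued at now_seconds."""
        interval = int(now_seconds // self.crawl_delay_seconds)
        if interval == self.last_interval:
            return False  # this interval already carried a request
        self.last_interval = interval
        return True


if __name__ == "__main__":
    limiter = IntervalLimiter(crawl_delay_seconds=5)
    for t in (0.0, 3.0, 5.0, 9.9, 10.0):
        print(t, limiter.try_acquire(t))
```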

If there is no rule targeting ImagesiftBot, but there is a rule targeting Googlebot, then ImagesiftBot will follow the Googlebot directives. For example, with the following robots.txt, ImagesiftBot will fetch all pages except those under /private/:

User-Agent: *
Disallow: /
User-Agent: Googlebot
Allow: /
Disallow: /private/
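
The fallback order can be sketched as a group lookup: parse robots.txt into per-agent rule lists, then pick the first group found for ImagesiftBot, then Googlebot. The function names parse_groups and select_rules are hypothetical, and the final fallback to the "*" group is an assumption based on common robots.txt practice, since the text above only describes the Googlebot case.

```python
def parse_groups(robots_txt):
    """Parse robots.txt text into {lowercased agent: [(directive, value), ...]}.

    Only User-Agent, Allow, and Disallow lines are handled in this sketch.
    """
    groups = {}
    current_agents = []
    seen_rule = False
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:  # a User-Agent line after rules starts a new group
                current_agents = []
                seen_rule = False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            seen_rule = True
            for agent in current_agents:
                groups[agent].append((field, value))
    return groups


def select_rules(groups):
    """Pick the rule list by preference: ImagesiftBot, Googlebot, then "*"."""
    for agent in ("imagesiftbot", "googlebot", "*"):  # "*" is an assumption
        if agent in groups:
            return groups[agent]
    return []


if __name__ == "__main__":
    example = """User-Agent: *
Disallow: /
User-Agent: Googlebot
Allow: /
Disallow: /private/
"""
    print(select_rules(parse_groups(example)))
```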

What information does ImageSiftBot save?

Along with images, ImageSiftBot saves the following information:

  • Host URL and text on the page

  • Alt text associated with the image
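
To show what "alt text associated with the image" refers to in page markup, here is a minimal sketch that collects each img element's src and alt attributes using Python's standard html.parser module. It is not ImageSift's pipeline; the names ImageAltCollector and collect_images are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser


class ImageAltCollector(HTMLParser):
    """Collect (src, alt) pairs from the <img> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src")
        if src:
            # A missing or empty alt attribute is recorded as an empty string.
            self.images.append((src, attributes.get("alt") or ""))


def collect_images(html):
    """Return the list of (src, alt) pairs found in the given HTML text."""
    collector = ImageAltCollector()
    collector.feed(html)
    return collector.images


if __name__ == "__main__":
    page = '<p>Our cat</p><img src="/cat.jpg" alt="A sleeping cat">'
    print(collect_images(page))
```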

How do we use this information?

Once images and text are downloaded from a webpage, ImageSift analyzes them and stores the resulting information in an index. Our web intelligence products use this index to enable search and retrieval of similar images.