Essential Insights on How Google Indexes Images!

Apr 26, 2012 | 4,526 views | by Navneet Kaushal

Google has explained in a blog post how it indexes images. The search engine’s bots look at the textual content of the page on which an image appears, including the page's title and body, to judge what the image is about. Google also considers the image’s filename, the anchor text pointing to it, its alt text and any caption provided in the Image Sitemap.

Google has advised webmasters to make sure that an image’s filename is related to the image’s content. Google also answered several common questions from webmasters:

Q: Why do I sometimes see Googlebot crawling my images, rather than Googlebot-Image?

A: Generally this happens when it’s not clear that a URL will lead to an image, so we crawl the URL with Googlebot first. If we find the URL leads to an image, we’ll usually revisit with Googlebot-Image. Because of this, it’s generally a good idea to allow crawling of your images and pages by both Googlebot and Googlebot-Image.
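As a rough way to check this advice against your own site, the sketch below uses Python's standard `urllib.robotparser` to test whether a robots.txt file lets both crawlers fetch an image URL. The function name `crawlers_allowed`, the example.com URL and the robots.txt rules are hypothetical, and Python's parser only approximates how Google itself matches user agents, so treat the result as a first check rather than a definitive answer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks Googlebot-Image from /images/
# while leaving Googlebot free to crawl everything.
BLOCKING_ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /images/

User-agent: Googlebot
Allow: /
"""


def crawlers_allowed(robots_txt, url, agents=("Googlebot", "Googlebot-Image")):
    """Return {agent: True/False} for whether each agent may fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    image_url = "https://www.example.com/images/photo.jpg"  # placeholder URL
    for agent, allowed in crawlers_allowed(BLOCKING_ROBOTS_TXT, image_url).items():
        print(agent, "allowed" if allowed else "blocked")
```

If either crawler shows up as blocked for an image you want indexed, the corresponding robots.txt rule is the place to look.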

Q: Is it true that there’s a maximum file size for the images?

A: We’re happy to index images of any size; there’s no file size restriction.

Q: What happens to the EXIF, XMP and other metadata my images contain?

A: We may use any information we find to help our users find what they’re looking for more easily. Additionally, information like EXIF data may be displayed in the right-hand sidebar of the interstitial page that appears when you click on an image.

Q: Should I really submit an Image Sitemap? What are the benefits?

A: Yes! Image Sitemaps help us learn about your new images and may also help us learn what the images are about.

Q: I’m using a CDN to host my images; how can I still use an Image Sitemap?

A: Cross-domain restrictions apply only to the Sitemaps’ <loc> tag. In Image Sitemaps, the <image:loc> tag is allowed to point to a URL on another domain, so using a CDN for your images is fine. We also encourage you to verify the CDN’s domain name in Webmaster Tools so that we can inform you of any crawl errors that we might find.
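To make the CDN case concrete, here is a minimal sketch that builds an Image Sitemap with Python's standard `xml.etree.ElementTree`. The page stays on the main site while the image entry points at a CDN host. The function name `build_image_sitemap` and the example.com / example.net URLs are placeholders; the two namespace URIs are the ones used by the Sitemaps protocol and Google's image extension.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages):
    """Build sitemap XML from a {page_url: [image_url, ...]} mapping."""
    # Register prefixes so the output uses a default namespace and "image:".
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)

    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # The page location belongs to the site that hosts the sitemap.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            # The image location may live on another host, such as a CDN.
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_image_sitemap({
        "https://www.example.com/gallery.html": [
            "https://cdn.example.net/img/sunset.jpg",
        ],
    }))
```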

Q: Is it a problem if my images can be found on multiple domains or subdomains I own — for example, CDNs or related sites?

A: Generally, the best practice is to have only one copy of any type of content. If you’re duplicating your images across multiple hostnames, our algorithms may pick one copy as the canonical copy of the image, which may not be your preferred version. This can also lead to slower crawling and indexing of your images.
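Site owners who want to find out whether they are serving the same image from several hostnames could run an audit along the lines of the sketch below. It groups image URLs by a SHA-256 hash of their bytes and reports groups that span more than one hostname. This is only an illustration for your own housekeeping, not a description of how Google picks a canonical copy; the function name `find_cross_host_duplicates` is hypothetical, the input is assumed to be already-downloaded image bytes, and only byte-identical files are detected.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlparse


def find_cross_host_duplicates(images):
    """Given {url: image_bytes}, return lists of URLs that share identical
    bytes and are spread across more than one hostname."""
    by_digest = defaultdict(list)
    for url, data in images.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(url)

    duplicates = []
    for urls in by_digest.values():
        hosts = {urlparse(u).hostname for u in urls}
        # Only report content that appears on two or more hostnames.
        if len(hosts) > 1:
            duplicates.append(sorted(urls))
    return duplicates


if __name__ == "__main__":
    # Placeholder URLs and byte strings standing in for real image files.
    sample = {
        "https://www.example.com/a.jpg": b"same-bytes",
        "https://cdn.example.net/a.jpg": b"same-bytes",
        "https://www.example.com/b.jpg": b"other-bytes",
    }
    for group in find_cross_host_duplicates(sample):
        print(group)
```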

Q: We sometimes see the original source of an image ranked lower than other sources; why is this?

A: Keep in mind that we use the textual content of a page when determining the context of an image. For example, if the original source is a page from an image gallery that has very little text, it can happen that a page with more textual context is chosen to be shown in search. If you feel you've identified very bad search results for a particular query, feel free to use the feedback link below the search results or to share your example in our Webmaster Help Forum.

Were these insights from Google helpful for you? Do share your views.


Navneet Kaushal

Navneet Kaushal is the founder and CEO of PageTraffic, an SEO Agency in India with offices in Chicago, Mumbai and London. A leading search strategist, Navneet helps clients maintain an edge in search engines and the online media. Navneet's expertise has established PageTraffic as one of the most awarded and successful search marketing agencies.