Understanding The Meta Image: Why it is Important for Image Search and SEO
What happens to rankings in image search when making changes to image files?
Whether you’re planning a relaunch or a server relocation, or wondering whether it was really such a good idea for your SEO to name your images in Klingon and whether it would be safe to rename them without wreaking havoc on your rankings: people often ask us what might happen to their rankings if they made changes to their images, such as a little cropping here and there, compressing the file size, changing the URL (because the image moved to another folder), or changing the file name (because, well, Klingon).
Once you become more familiar with how Google processes images in its index, these questions largely answer themselves. Since Google does not provide any information on this, Martin has developed a model which illustrates these processes in a comprehensible way.
How does Google index images?
When Google picks up a new image, it won’t save “your” image, but “the” image. In other words: Google creates an internal copy of this image and adds it to its index. This is what Martin calls “the meta image” in his ebook “Image SEO” (which is, to date, by far the most comprehensive and compelling source of information on image optimisation for SEO; unfortunately, it is currently only available in German, but we’re working on the translation). Initially, Google runs sophisticated algorithms (“artificial intelligence”) on this meta image in an attempt to identify automatically what can be seen in the picture. These algorithms are getting better and better at interpreting images correctly. You can test this for yourself with Google’s Cloud Vision API; it’s quite impressive. The image analysis process may take some time though – see more about this below …
Once the image has finally made its way into the Google index as a meta image, Google creates a reference to its original source, analyses on-page signals, and maps important keywords to the image. As long as this specific image remains unique, image search will always link to the original source page – that is, to yours.
How does Google handle alternate image versions?
Yet this very image can reappear on the web in various altered versions, either because you change the image or because others use it on their own websites. That’s why Google compares each new indexable image, upon its discovery, with all others that are already in the index. This is done with the help of a simplification algorithm (describing how this works would most certainly provide enough material for another article, and too much distraction from our current topic, so we’ll leave it at that for now).
If the newly discovered image looks similar to an already indexed meta image, it will be added to it – or, to be more specific, the new image URL is added to the meta image. The figure below illustrates a meta image file card sample:
As we have seen, two images that are very similar to each other are represented by just one single meta image in the Google index. The meta image then has two reference URLs pointing to the websites that embed the image (or any version of it).
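To make the file card idea more tangible, here is a minimal sketch of the model in Python. Please treat everything in it as an assumption for illustration: Google does not disclose its simplification algorithm, so the toy `average_hash` fingerprint, the `MetaImage` and `ImageIndex` classes, and the `max_distance` threshold are hypothetical stand-ins, not Google’s method.

```python
from dataclasses import dataclass, field


def average_hash(pixels):
    """Toy fingerprint: one bit per pixel, set when the pixel is brighter than the mean.

    `pixels` is a tiny grayscale thumbnail given as a list of rows (values 0-255).
    This is a stand-in for the undisclosed simplification algorithm.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(a, b):
    """Number of positions where two equally long fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


@dataclass
class MetaImage:
    """Hypothetical file card: one fingerprint plus every page that embeds a version."""
    fingerprint: tuple
    reference_urls: list = field(default_factory=list)


class ImageIndex:
    def __init__(self, max_distance=1):
        self.max_distance = max_distance  # hypothetical similarity threshold
        self.meta_images = []

    def add(self, page_url, pixels):
        """Attach page_url to a similar meta image, or create a new meta image."""
        fingerprint = average_hash(pixels)
        for meta in self.meta_images:
            if hamming_distance(meta.fingerprint, fingerprint) <= self.max_distance:
                if page_url not in meta.reference_urls:
                    meta.reference_urls.append(page_url)
                return meta
        meta = MetaImage(fingerprint, [page_url])
        self.meta_images.append(meta)
        return meta
```

Adding a slightly brightened copy of an image from a second website returns the same `MetaImage` object with two reference URLs, while a visually different image gets a file card of its own.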
Any meta image appears only once in a Google image search. In this way, Google aims to avoid showing too many similar versions of the same image in image search. Below you will find an infographic summarising the process. The whole process is, of course, much more complex than that, but we hope to give you at least an idea (Martin’s eBook has got nearly 170 pages for a reason).
Fundamentals for Image SEO: This is how Google Image Search works
Change Image File Name
Change Image URL
Image Compression (File Size Optimization)
Be careful here! From Google’s point of view, any change to an image’s dimensions or file size produces an entirely new image. So here’s what’s going to happen in this case: Google can no longer find the reference from the original meta image to the original website (because the former image has been replaced by a compressed version). Google will therefore delete the link between the meta image and the website, while nevertheless keeping the meta image in the Google index (!). If any other website uses this meta image, Google will link to that website in image search (via hotlink) instead. If the meta image no longer has any references at all (because there is no alternate version of it anywhere else on the web), the image will no longer appear in Google Image Search, which looks as if it had been deindexed.
The new image, which might look identical to the human eye, has to run through the “security check” of the image analysis process again (as explained above). Afterwards it is run through the comparison process once more. Each of these processes can take several days. Only after the image has passed through both will Google notice that it is just an altered version of the already existing meta image and add the source URL (the original website where it was first embedded) back to the meta image. Only then will the image reappear in image search (if it had seemed to be gone). In most cases, the image will quickly return to its previous ranking position.
Please note, however: The whole process of detecting and mapping may take several days.
Even a small change like cropping an image just a tiny bit alters the size and dimensions of the image file. In that case you have to expect the same thing to happen (boohoo!) as discussed in section 3 (Image Compression).
The number of images you’re dealing with is the crucial question in this matter. If you only need to make changes to one or two images: not an issue. Just compress them and give it a few days and you’ll see your images gain back their previous rankings soon (that’s the way (aha aha) I like it!).
However, you are always going to run into issues when dealing with larger quantities of images. Most of the time, Googlebot-Image is busy retrieving already known image URLs to check whether an image is still there (which accounts for approx. 95% of its activities; see the note on this figure further below). If an image is no longer accessible after 3-4 requests, Googlebot deletes the link between the meta image and the previous website. If, for instance, hundreds or thousands of images are changed in bulk during a relaunch, the bot will gradually toss all of these images out of the index over the course of a few days. (More accurately, the links between the meta images and the domain will be removed.) After 1-2 weeks, all of your rankings will have vanished.
When it comes to indexing new images, however, Googlebot invests very little time (about 5%). That means it is going to take weeks or even months before the image bot has recollected all the images, run them through the image recognition loops again, and finally reassigned them to the original meta images. Since the resources for image recognition are limited for each domain, this re-indexation process takes a long time.
In a nutshell: Do NOT do this with large quantities of images, ever! (Unless you’re comfortable with losing your rankings for a long time. Have a heart for your images, will you?)
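If you want to catch vanished images before the image bot does, a small check like the following sketch can help. It is only an illustration under assumptions: `head_status` and `images_at_risk` are names we made up, and the list of image URLs is something you would have to export yourself (for example from your sitemap or log files). The sketch sends a HEAD request per URL without following redirects and reports every URL that does not answer with a plain 200.

```python
import http.client
from urllib.parse import urlsplit


def head_status(url, timeout=10):
    """Return the raw HTTP status of a HEAD request, without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


def images_at_risk(image_urls, get_status=head_status):
    """Map every URL that does not answer with a plain 200 to the status it returned."""
    at_risk = {}
    for url in image_urls:
        status = get_status(url)
        if status != 200:
            at_risk[url] = status
    return at_risk
```

The `get_status` parameter is injectable so that the check can be tried out with canned status codes before it is pointed at a live server.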
The Dreadful Relaunch: What To Do To Avoid Losing Your Rankings
Instead, you should leave the original (unchanged) images right where they are (meaning you have to keep the image files accessible under their previous URLs). The compressed (Googlebot speak: new) images should be located at a different image URL (e.g. a slightly changed file name or another location, like a different subfolder, so you can distinguish “old” files from “new” files).
On the website, however, you should change the image reference to the new image version, so that it no longer points to the previous one.
Why would you do that? What is going to happen now? Googlebot will continue to find the old images and will therefore not change the link between the meta image and the original source website. It will also find new images that are then sent into the indexing process. In the end, all new images are mapped to the corresponding meta image.
That way, the image will keep its ranking position and the link target can point to the website where it’s embedded without any interruption.
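The procedure above can be sketched in a few lines of Python. This is a simplified illustration, not a finished migration tool: the function names and the `-v2` suffix are our own assumptions, the sketch only handles site-relative paths such as `/img/cat.jpg`, and a real project would rather change the image references in the CMS or with a proper HTML parser than with a regular expression. Note that the old files are never touched; only the references in the HTML change.

```python
import re
from pathlib import PurePosixPath


def new_image_url(old_url, suffix="-v2"):
    """Derive a distinct URL for the edited image, e.g. /img/cat.jpg -> /img/cat-v2.jpg.

    Expects a site-relative path; the suffix is a hypothetical naming convention.
    """
    path = PurePosixPath(old_url)
    return str(path.with_name(f"{path.stem}{suffix}{path.suffix}"))


def rewrite_image_sources(html, old_urls, suffix="-v2"):
    """Point <img src="..."> at the new versions; old files stay online unchanged."""
    mapping = {old: new_image_url(old, suffix) for old in old_urls}

    def swap(match):
        src = match.group(2)
        # Leave every image alone that is not part of the migration.
        return match.group(1) + mapping.get(src, src) + match.group(3)

    pattern = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")')
    return pattern.sub(swap, html), mapping
```

The returned mapping doubles as a checklist: every key is an old URL that has to stay reachable, and every value is a new URL where the compressed file needs to be uploaded.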
301: To Redirect Or Not To Redirect
You also have to make sure that redirects are set correctly, or rather, that they are NOT set at all. In our current case, of course, you must not set a 301 redirect from the old to the new image. If you set a 301 redirect, Googlebot would interpret this as the old image no longer being there. Consequently, it would remove the referring link as described above. The old image can only remain visible to Googlebot (and hence keep its rankings in image search) if there is NO 301 redirect.
In Googlebot speak, a 301 for (edited) images will be interpreted as: the old image has moved on to greener pastures (meaning: it is gone). The new image has to go through the whole indexing process either way. A 301 redirect only ever makes sense if the image is completely identical (in file size and dimensions). It is regarded as identical if you only changed the image’s file name or location, but nothing else (see above).
Now you might ask: for how long should we keep the old images? Unfortunately, there is no clear time frame for this. So the answer is: until all new images have been detected and processed successfully. Depending on the crawl budget and number of images, this can take months to years. If we had to name a number, we’d roughly suggest 12 months. It might be a good idea to monitor your rankings in Google image search to decide when it would be a good time to remove and bury your old images.
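As a rule of thumb, this decision can be expressed as a tiny check. The following sketch is our own simplification: it compares the two files byte for byte via a SHA-256 digest, which is a stricter test than “same file size and dimensions” but errs on the safe side. The function names and the return values `"redirect-301"` and `"keep-both"` are hypothetical labels.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """SHA-256 digest of the raw file content."""
    return hashlib.sha256(data).hexdigest()


def relocation_plan(old_bytes: bytes, new_bytes: bytes) -> str:
    """Suggest how to handle an image that now lives at a new URL.

    Byte-identical files were only renamed or moved, so a 301 is fine.
    Anything else counts as a new image: keep the old file online, no redirect.
    """
    if file_digest(old_bytes) == file_digest(new_bytes):
        return "redirect-301"
    return "keep-both"
```

In practice you would feed it the contents of the old and the new file, for instance via `pathlib.Path(...).read_bytes()`.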
Why does it take so long?
Thinking Is Silver, Testing Is Gold
Interesting insights & updates from Martin’s comment section
- How do we know it’s 95% of all activities?
Martin has been doing random evaluations of his log files for years, and that’s how he found out that most requests by Googlebot-Image go to files that have been in the index for a long time. So he statistically evaluated this at some point (only for a period of 2 weeks and about 20 of his domains). With a larger sample, it might just as well turn out to be 90 / 10, but the exact numbers don’t really matter here; what’s crucial is to know that the image bot spends a lot of time checking old images.
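If you would like to run a similar evaluation on your own log files, the following sketch shows one possible approach. It is not Martin’s actual method, which we don’t reproduce here: the sketch assumes access logs in the common “combined” format, identifies the bot simply by the string `Googlebot-Image` in the user agent (which can be faked), and relies on a set of already known image paths that you supply yourself. `recrawl_share` is a hypothetical helper name.

```python
import re

# Matches the request, status and final quoted field (user agent) of a combined log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"$'
)


def recrawl_share(log_lines, known_paths):
    """Share of Googlebot-Image requests that hit already known image paths."""
    seen = set(known_paths)
    rechecks = 0
    total = 0
    for line in log_lines:
        match = LOG_LINE.search(line.strip())
        if not match or "Googlebot-Image" not in match.group("agent"):
            continue
        total += 1
        path = match.group("path")
        if path in seen:
            rechecks += 1
        # After its first request, a path counts as known for later requests.
        seen.add(path)
    return rechecks / total if total else 0.0
```

A result close to 1.0 would mean that the bot spends nearly all of its requests on re-checking images it already knows, which is the pattern described above.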