Sitemap.xml using cached version of image URL

We are using FPC (full-page cache) in Magento 2.

We configured the sitemap.xml file to include image URLs.

 

The problem:

The sitemap file lists the cached versions of the images rather than the original image URLs. This doesn't help with SEO.

 

URL appearing in sitemap.xml file: https://mysite.com/pub/media/catalog/product/cache/10f519365b01716ddb90abc57de5a837/g/r/image_1.jpg

Correct URL: https://mysite.com/pub/media/catalog/product/g/r/image_1.jpg

 

Anyone know how to resolve this issue?

 

-hk

4 REPLIES

Re: Sitemap.xml using cached version of image URL

Hello @hetul,

 

This is not an issue. The sitemap contains the same image path that is rendered on the actual page. If you are wondering what to do when the cache directory path changes: the Magento cron regenerates the sitemap daily, so the new paths are picked up. Please check the screenshot: https://www.screencast.com/t/HI9PARybN

2018-11-23_10-56-16.jpg

--
If my answer is useful, please Accept as Solution & give Kudos

 

Re: Sitemap.xml using cached version of image URL

I put together this command for this issue. It removes "/cache/a-z0-9somefoolishcacheid/" from your links:

sed -i 's!/cache/[a-z0-9]*/!/!g' sitemap.xml

It edits the file in place, so it needs to be rerun whenever the sitemap is regenerated.
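The same substitution can be sketched in Python for anyone who prefers to run it from a script. This is only an illustration, assuming the cache id is always a 32-character lowercase hex string as in the URL from the original post; the name `strip_cache_segment` is made up for this example and is not part of Magento.

```python
import re

# Assumption: the cache id is a 32-character lowercase hex hash,
# as in the example URL from the original post.
CACHE_SEGMENT = re.compile(r"/cache/[0-9a-f]{32}/")


def strip_cache_segment(url: str) -> str:
    """Return the URL with the first '/cache/<hash>/' segment collapsed to '/'."""
    return CACHE_SEGMENT.sub("/", url, count=1)


if __name__ == "__main__":
    cached = (
        "https://mysite.com/pub/media/catalog/product/cache/"
        "10f519365b01716ddb90abc57de5a837/g/r/image_1.jpg"
    )
    print(strip_cache_segment(cached))
```

URLs without a cache segment are returned unchanged, so running it over every link in the file is harmless.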

Re: Sitemap.xml using cached version of image URL

In my opinion it could be an issue: if not a Magento 2 issue, then at least an SEO-related one.

When we flush the image cache, the image URLs get regenerated. We all agree on that, and it is expected Magento 2 behaviour.

 

The problem is that images with an excellent position in Google get removed from the search engine, and the new ones are indexed instead.

There is no guarantee that they will be indexed again with the same position in Google. And there is no redirect mechanism as there is for product URLs (at least, not that I know of).

 

We saw thousands of images disappear from Google just 2 days after an upgrade (from 2.2.x to 2.3.x). Btw, we don't see any errors in Search Console (and robots.txt looks just fine).

I'm trying to understand whether the issue could be linked to other factors, but so far the indexing of a cached version (before the upgrade the image cache had not been flushed for months) seems a major SEO drawback.

I reckon that sooner or later I'll end up trying to develop a module that generates the XML sitemap with image URLs that do NOT come from the cached version. Crawling of the cache must stay allowed so Google can test mobile issues with the images, but telling the crawler to index the catalog images (the ones not in the cache) won't hurt if we find a way to avoid (possible) duplicate content.
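Until such a module exists, the idea can be approximated by post-processing the generated file. The following is a rough sketch, not a Magento module: it assumes the sitemap uses the standard sitemaps.org namespace and the Google image sitemap namespace with `image:image` / `image:loc` elements, and that the cache id is a 32-character hex hash. The function name `uncache_image_locs` is hypothetical. It rewrites only the image entries and drops images that end up pointing at the same original file within one `url` entry.

```python
import re
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Assumption: the cache id is a 32-character lowercase hex hash.
CACHE_SEGMENT = re.compile(r"/cache/[0-9a-f]{32}/")


def uncache_image_locs(xml_text: str) -> str:
    """Rewrite <image:loc> values to the non-cached path and drop duplicates."""
    # Keep the usual prefixes when serialising the tree again.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)

    root = ET.fromstring(xml_text)
    for url in root.findall(f"{{{SITEMAP_NS}}}url"):
        seen = set()
        # findall returns a list, so removing children while looping is safe.
        for image in url.findall(f"{{{IMAGE_NS}}}image"):
            loc = image.find(f"{{{IMAGE_NS}}}loc")
            if loc is None or not loc.text:
                continue
            loc.text = CACHE_SEGMENT.sub("/", loc.text.strip(), count=1)
            if loc.text in seen:
                # Two cached variants of the same original image.
                url.remove(image)
            else:
                seen.add(loc.text)
    return ET.tostring(root, encoding="unicode")
```

Page URLs in the plain `loc` elements are left untouched, so only the image entries change.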

 

 

Re: Sitemap.xml using cached version of image URL

Hi!

 

We are facing the same problem. The sitemap uses cached images, and the product page uses cached images too. So once that cache is flushed, that would generate a 404 problem for Google, right?
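One way to check this on a given store would be to request every image URL listed in the sitemap after a flush and see which ones do not answer with 200. Below is a minimal sketch, assuming the sitemap uses the Google image sitemap namespace for its `image:loc` elements; `head_status` and `broken_image_urls` are names made up for this example.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from typing import Callable, List

IMAGE_LOC = "{http://www.google.com/schemas/sitemap-image/1.1}loc"


def head_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the HTTP status code.

    Connection-level failures (DNS, timeouts) are not caught and propagate.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; report their code.
        return err.code


def broken_image_urls(
    xml_text: str, status_of: Callable[[str], int] = head_status
) -> List[str]:
    """Return the sitemap image URLs whose status is not 200."""
    root = ET.fromstring(xml_text)
    urls = [el.text.strip() for el in root.iter(IMAGE_LOC) if el.text]
    return [url for url in urls if status_of(url) != 200]
```

The `status_of` parameter is there so the check can be tried against a stub before pointing it at a live site, for example `broken_image_urls(open("sitemap.xml").read())` once it behaves as expected.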