Hi All,
I have a cronjob that does a reindexall for all indexes in my magento setup, and it takes more than 6 hours to finish the job, making the server slower to respond during that period.
Since flat tables are another way to store data, I wonder what I can do to optimize this process, and I have the following questions:
Thanks.
Topic moved to the Technical Issues board which I believe is more suitable for your questions.
Can you identify which particular index it is that is taking the time?
If you run the indexes via the command line e.g
php shell/indexer.php --reindexall
And let me know which one takes the longest.. on the later versions of Magento - it will give you how long each process takes to run but if yours doesn't do this you should check visually which one is the 'problem index' so-to speak.
You can also run them individually by running
php shell/indexer.php --reindex {code}
Where code is one of:
catalog_product_attribute Product Attributes catalog_product_price Product Prices catalogsearch_fulltext Catalog Search Index cataloginventory_stock Stock Status tag_summary Tag Aggregation Data catalog_url Catalog URL Rewrites catalog_product_flat Product Flat Data catalog_category_flat Category Flat Data catalog_category_product Category Products
Hi @sarix I have a few questions
1)How many products and cateogires do you have in your catalog?
2)How are you reindexing , using command lline or Magento admin?
3)Which version of Magento are you using?
Thanks
Hi all,
Let me first update you on what I've done while waiting for my post to get moved
Now I have a lot of activity on products (adding, editing, deleting), so its really annoying to make the page (or API) load for 5 minutes on an action which makes "Update On Save" not an option.
So, I had this code to run the index process:
$pCollection = Mage::getSingleton('index/indexer')->getProcessesCollection(); foreach ($pCollection as $process) { $process->indexEvents(); }
And after I realized which take more time i changed the code to this:
$pCollection = Mage::getSingleton('index/indexer')->getProcessesCollection(); foreach ($pCollection as $process) { if($process->getData('indexer_code')!='catalog_url' && $process->getData('indexer_code')!='catalogsearch_fulltext'){ $process->indexEvents(); } }
And now I have injected a code that runs on adding a code that does the needed modifications to the core_url_rewrite table (which what the catalog_url_rewrite index does), (I'm also working on ditching the product_flat indexer by doing the same)
The problem I have now is the catalogsearch_fulltext index, which is the biggest problem.
Hi,
core_url_rewrite is most likely your problem it can grow very large because of duplicate url_keys on products. When magento builds a url it uses the url_key it then checks in the core_url_rewrite table if that key exists if it does it appends a number to the url to make it unique. Every index run after this it will create a new url with a new number. You can verify this by checking the target_paths in the core_url_rewrite table, you will see several url entries like:
request_path | target_path
----------------------------------------------------+----------------------------------------------------
women/dresses-skirts/sheath-317.html | women/dresses-skirts/sheath-405.html
women/dresses-skirts/sheath-318.html | women/dresses-skirts/sheath-405.html
women/dresses-skirts/sheath-319.html | women/dresses-skirts/sheath-405.html
women/dresses-skirts/sheath-320.html | women/dresses-skirts/sheath-405.html
women/dresses-skirts/sheath-321.html | women/dresses-skirts/sheath-405.html
Part of the problem is that magento includes simple products even if they are only available as part of a configurable product.
The solution is to ensure all url_keys are unique.
You could try clearing out the duplicates in the core_url_rewrite table - but be careful you don't delete any custom urls you may have added. There is also an SEO risk, a search engine may have indexed an old url i.e. women/dresses-skirts/sheath-320.html which would 404 if it does not exist.
Regards,
Ed
In normal Magento shops, you don't have to schedule a regular full reindex at all. If you are updating products and categories from the backend or via the SOAP api, the index will keep up to date automatically.
You will only need the full reindex if you import products through other methods, like Import/Export which uses CSV files.
Please note that there are import modules which support partial reindexing, see my blog entry at http://www.integer-net.com/2014/05/27/product-import-with-magento/.
If you can't get around the full reindex every night, I can recommend a faster URL indexer by Ivan Chepurniy which you can find at https://github.com/EcomDev/EcomDev_UrlRewrite. But please test it on a development / test instance before deploying it to the live server!