Magento 2.4 - Import products programmatically - performance

SOLVED

Hi community, 

 

I have been writing a complicated script to import a vast number of products (> 5000).

The import is quite fast to start with (2 or 3 products per second) but gets slower and slower the more products run through.

After 500 - 1000 products it is maybe 1 per second, which is not really ideal. It gets even slower later. Bundle products can take up to 3 or 4 seconds.

 

I admit that there is a lot happening in my script, but seeing that it is fast at the beginning, I'm asking myself if there is a trick to clean up the process every 500 items so that it starts running fast again... (any idea there?)

 

Here are some of the main sections in my script:

- The script reads the data from a WSDL-Interface and puts it in serialized form into a file

$storageFile = 'SERIALIZED_PRODUCTS-'.date("Ymd-Hi").'.tmp';
foreach( $products as $row ) {
    file_put_contents($storageFile, serialize($row)."-fxf-", FILE_APPEND);
}

I am adding a delimiter (-fxf-) after each sequence which I will use later to read the file.

$f = fopen($storageFile,"r");
while ( ! feof($f) ):

        $line = stream_get_line($f, 1000000, '-fxf-');
        // stream_get_line() returns false once nothing is left to read
        if ( $line === false ) break;
        $p = unserialize($line);

        $cats  = explode(",", $p->CATEGORIES);      
        $sku = $p->SKU;
        $pType = strtolower($p->TYPE);

        // Initiate current product (check if exists or if new):
        try {
                $product = $productRepository->get($sku);
                $isNew = false;
        } catch (\Magento\Framework\Exception\NoSuchEntityException $e) {
                $product = $productFactory->create();
                $isNew = true;
        }

        try {
                ...
                ...
                $product->setSku($sku);
                $product->setName($p->NAME);
                $product->setAttributeSetId(4);                                 
                $product->setStatus( ($p->STATUS=='Enabled'?1:0) );             
                $product->setPrice($p->PRICE);                                  
                $product->setShortDescription($sdesc);
                $product->setDescription($desc);
                $product->setTypeId($pType);                                    
                $product->setWebsiteIds(array(1));
                $product->setWeight($p->WEIGHT);                                
                $product->setVisibility(4);                                     
                $product->setTaxClassId($tax_c[$p->TAX_CLASS_ID]);              
                $product->setSpecialPrice($p->SPECIAL_PRICE);
                $product->setSpecialFromDate($p->SPECIAL_FROM_DATE);
                $product->setSpecialToDate($p->SPECIAL_TO_DATE);
                $product->setMetaTitle($p->META_TITLE);
                $product->setCountryOfManufacture($p->COUNTRY_OF_MANUFACTURE);
                ...
                ...
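
For readers following along, the read side of the delimiter approach above can be sketched as a small generator. This is a minimal sketch under stated assumptions, not the original poster's code: `readRecords` and its parameters are hypothetical names, and it assumes every record was written with serialize() followed by the -fxf- delimiter, as in the snippet above.

```php
<?php
// Hypothetical helper (not part of the original script): yields one
// unserialized record per delimiter-terminated chunk of the storage file.
function readRecords(string $path, string $delimiter = '-fxf-', int $maxLength = 1000000): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        // stream_get_line() returns false when nothing is left to read,
        // so the loop ends without a separate feof() check.
        while (($chunk = stream_get_line($handle, $maxLength, $delimiter)) !== false) {
            if ($chunk === '') {
                continue; // empty chunk, e.g. two delimiters in a row
            }
            $record = unserialize($chunk);
            if ($record === false) {
                continue; // unreadable chunk (a serialized false is skipped too)
            }
            yield $record;
        }
    } finally {
        fclose($handle);
    }
}
```

Only one record is held in memory at a time. Two limits carry over from the original approach: a record longer than `$maxLength` bytes would be split, and a payload that itself contains the delimiter would be cut in the wrong place.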

There is a whole section for Bundle products, a section for categories, a section for images.

I understand that this is a lot but, as I've said, it runs well at the beginning and I would like to keep it running that way for the rest of the process.

 

I have tried to clear cache and re-index every 500 items, but it doesn't seem to be very helpful:

        $count++;
        if ( $count%500 === 0 ) :
                // re-index
                error_log("START INDEX.....");
                foreach ($_indextypes as $indexid)
                {
                        $indexidarray = $indexFactory->create()->load($indexid);
                        $indexidarray->reindexAll($indexid);
                }
                // clear cache
                foreach ($_cachetypes as $type) {
                    $typeListInterface->cleanType($type);
                }
                foreach ($pool as $cacheFrontend) {
                    $cacheFrontend->getBackend()->clean();
                }
        endif; // mod 500

Cache and index arrays are defined as such:

$_indextypes = [
        'catalog_category_product',
        'catalog_product_category',
        'catalog_product_attribute',
        'cataloginventory_stock',
        'inventory'
            ];
$_cachetypes = [
        'collections',
        'db_ddl',
        'compiled_config',
        'eav'
            ];

I have set the following at the beginning of the script:

ini_set('default_socket_timeout', 900);
ini_set('memory_limit','4G');

Is there anything else I can try to get the most out of the performance?

 

Again, the script is quite fast at the beginning; it just gets tired after 500 to 1000 products.

 

Thanks for any help or hint.

 

Jerome

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Magento 2.4 - Import products programmatically - performance

Check that the indexers are set to "Update by Schedule" and not to "Update on Save". In Update on Save mode every product save triggers indexing work, which adds up over a long import.

If the indexers are already set to schedule, try dividing the import into two steps.
First import the products with their various attributes, then import the images (data is more important than pictures). Image processing takes more time.
So import the products without images and without reindexing or cleaning the cache every 500 items.
Once the data import has finished, start the image import.
Let us know if you see improvements or find other solutions.
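
As a rough illustration of the first suggestion, the indexer mode could be switched from the import script itself before the loop starts. This is a sketch, not a tested Magento module: `switchToSchedule` is a hypothetical name, and `$indexers` is assumed to be an iterable of objects exposing `getId()`, `isScheduled()` and `setScheduled()`, which is what Magento's `\Magento\Framework\Indexer\IndexerInterface` provides.

```php
<?php
// Hypothetical helper: switches every indexer that is still in
// "Update on Save" mode to "Update by Schedule" and returns the ids it changed.
// $indexers is assumed to hold IndexerInterface-like objects (see lead-in).
function switchToSchedule(iterable $indexers): array
{
    $changed = [];
    foreach ($indexers as $indexer) {
        if (!$indexer->isScheduled()) {
            $indexer->setScheduled(true);
            $changed[] = $indexer->getId();
        }
    }
    return $changed;
}
```

Inside Magento the iterable could come from `\Magento\Indexer\Model\Indexer\CollectionFactory` via `create()->getItems()`. The same switch is available on the command line as `bin/magento indexer:set-mode schedule`. In scheduled mode the indexes are updated by cron, so cron has to be running, or a reindex has to be run once after the import.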

7 REPLIES

Re: Magento 2.4 - Import products programmatically - performance

Thanks Zoyascootg for pointing that out, but it's not really what I am looking for. As mentioned above, my code is working fine so far and products are imported correctly. Only the performance is questionable: the first 500 products are OK, but then it gets slower and takes too long.

The best would be something that relieves the CPU every 500 items so that it gets faster again...

Any idea?

Re: Magento 2.4 - Import products programmatically - performance

... does nobody have any idea or tip?

 

I can provide more details if needed. 
It would be great to know how to speed up importing 1000s of products programmatically :)

Re: Magento 2.4 - Import products programmatically - performance

Thanks for your reply, @tunnel_dev!

I will give it a go in the next few days and report back.

Re: Magento 2.4 - Import products programmatically - performance

@tunnel_dev!! Your tip was my salvation!

Thank you so much!

I've only tried the solution with the indexers set to "schedule" and the import is already so much faster!

Re: Magento 2.4 - Import products programmatically - performance

@jeromeclic79eb -

Thanks for the post. I have the same issue. I had some errors running the script. Are you able to post the full script? Thanks.

Re: Magento 2.4 - Import products programmatically - performance

It's better to keep indexing in mind, and the data is also more important than the images.
Think about how the input is read and the output is written as a whole.

When the program starts reading the input file (JSON or text), it gets through the first products quickly. After 500 products, though,
its memory is still holding the data of those 500 products by the time it reaches product 501. That costs time and memory.

You already did the hardest part by identifying the limit of about 500 products. Try dividing the import into batches, so that no garbage builds up in memory and CPU load and run time stay low.
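
The batching idea could look roughly like the following. This is a minimal sketch: `inBatches`, `importInBatches` and the `$importOne` callback are hypothetical names, and the callback stands in for the per-product logic of the original script. Whether memory is really what slows this particular import down is an assumption; in this thread the reported speed-up came from the indexer mode.

```php
<?php
// Hypothetical helper: groups any iterable into arrays of at most $size items.
function inBatches(iterable $items, int $size): Generator
{
    if ($size < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1');
    }
    $batch = [];
    foreach ($items as $item) {
        $batch[] = $item;
        if (count($batch) === $size) {
            yield $batch;
            $batch = [];
        }
    }
    if ($batch !== []) {
        yield $batch; // last, partially filled batch
    }
}

// Hypothetical driver: runs $importOne on every item, batch by batch,
// and logs time and memory per batch so a slowdown becomes visible.
function importInBatches(iterable $items, int $size, callable $importOne): int
{
    $done = 0;
    foreach (inBatches($items, $size) as $batch) {
        $start = microtime(true);
        foreach ($batch as $item) {
            $importOne($item);
            $done++;
        }
        gc_collect_cycles(); // collect cyclic garbage left over from this batch
        error_log(sprintf(
            '%d imported, last batch %.2fs, memory %.1f MB',
            $done,
            microtime(true) - $start,
            memory_get_usage(true) / 1048576
        ));
    }
    return $done;
}
```

The per-batch log line shows whether batches really get slower and whether memory keeps growing. Note that `gc_collect_cycles()` only frees objects nothing references any more; data still held by a long-lived object stays in memory.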