As a .NET developer, every time I was asked to create an online shop I turned to NopCommerce, since it was free and had every feature one could image. The code is very professional and the architecture a beauty.
Well the last few weeks I also saw the ugly side of NopCommerce. Apparently the devs only tested it on “Mom and pop’s” shops with few products. My project required integration with some distributors APIs and automatic importing of products. 27000 products later NopCommerce was shacking, and the VPS that was running it struggling with hard cpu usage and huge memory consumptions. Considering other e-commerce applications have no problem with this number of products my attention tured at profiling and to the SQL queries used.
The worst part was the admin interface. The entire product management is centered around the same stored procedure that searches and fetches products by a complicated set of criterias, such as categories, min and max prices, keywords, stock, etc. This proc, named NOP_ProductLoadAllPaged, is called on every single page and there is no caching on this. Not even per request (not saying that caching per request would help anything – I’d try sql dependency cache if I was NopCommerce dev team). So I open out sql profiler and to my horror the query took 21 seconds to run. Can you guess why I though the admin panel was unusable?
After some googleing it turns out I’m not the only one to notice this. Jared Nielsen wrote on his blog here http://www.fuzionagency.com/post/2010/09/18/NOP-Commerce-Product-Administration-Slow-with-Large-Number-of-Products-and-Variants.aspx
Jared didn’t have that many products, but he had a lot of variants. He noticed that the actual search query ( a very two page long query) joins too many tables for it’s own good, in an attempt to get all the required information right from the start, and then started to work on the paging with two temporary tables and some queries. The problem with joining every product with categories, variants, manufacturer mapping, product mapping, tags etc is that even with a few hundred products you quickly end up doing matches on 1 million rows. That’s a real performance killer. Jared’s solution was to do the search and only extract productId. He fetches the actual fields only after the paging and everything done, right before the rowset is returned.
Initially I was very happy with his idea, but after I’ve implemented it I realized how different our situations were. I didn’t have lots of variants (one for each product), nor tags, etc. Turns out that in my case it only shaved out 1 sec from the 21 seconds the stored procedure required 😦
After looking at the execution plan I immediately understood what was wrong. Let me explain how NOP_ProductLoadAllPaged works:
First there is a very large query that searches everything that matches the criteria. Going to product admin in the admin panel, calls this proc immediately with no criteria, returning all 27k products. All results from this query is entered in a temp table, with the purpose of executing a query on them and adding and extra column with a rownumber:
CREATE TABLE #PageIndex ( [IndexID] int IDENTITY (1, 1) NOT NULL, [ProductID] int NOT NULL ) INSERT INTO #PageIndex ([ProductID]) SELECT ProductID FROM #DisplayOrderTmp with (NOLOCK) GROUP BY ProductID ORDER BY min([ID])
See the identity column? It’s their way of number the results, so that in the next query they can select the results from any page (for instance results from row nuber 20 to 30):
SELECT p.ProductId, p.Name, p.ShortDescription, p.FullDescription, p.AdminComment, p.ProductTypeId, p.TemplateId, p.ShowOnHomePage, p.MetaKeywords, p.MetaDescription, p.MetaTitle, p.SEName, p.AllowCustomerReviews, p.AllowCustomerRatings, p.RatingSum, p.TotalRatingVotes, p.Published, p.Deleted, p.CreatedOn, p.UpdatedOn FROM #PageIndex [pi] INNER JOIN Nop_Product p on p.ProductID = [pi].ProductID WHERE [pi].IndexID > @PageLowerBound AND [pi].IndexID < @PageUpperBound ORDER BY IndexID
So the stored procedure just created two temporary tables with 27000 products each, and all this just to create another column with the rownumber. The irony is that exactly this operation, of inserting huge amount of data on a table with an identity index is causing the performance problem. I’m not saying the other temp table isn’t a bad idea either but this not the way to number rows. That’s what the ROW_NUMBER() function is there for!!!
So if we don’t need a temp table to number rows we can rewrite the whole thing into one large query with either a derived tabled or CTE (Common table expressions). I rewrote everything without temp tables and i managed to get the time down to 3 seconds for a full not-pages result set, and a few milliseconds for a 15 item page of results. The problem was that now I didn’t have a way to return the total number of items before the paging, since you can mix data retrieval and variable assignments in the same query. This mean that I did have to split the query in two with a temp table. So I did, but the difference is that my temp table only contained the results from one page (by default only 15 results), and only the productId column with an extra ‘TotalResults’ column, with no indexes to make inserting slow.
The resulting stored proc executes on my server in 4 sec with a large page of 1 mil items and in 1 sec for 15 item page. This is a massive improvement from the original 21 sec.
So what I basically did is to put the large search query inside a CTE (Common Table Expression), added rownumbering with ROW_NUMBER() OVER (ORDER BY p.ProductID), counted the initial result set with COUNT(*) OVER(PARTITION BY NULL) and make a simple paging query on the CTE, that inserts only the content of the page in a small temp table. Then I retrieve the total results set from, which I now have as an extra column in the temp table and put it as the return value of the stored proc. At the end I simply return all the results from my temp table.
We started with two large 27000 items temp tables, and ended up with only one, 15 items table with no indexes to slow down inserts. The time dropped from 21 sec to 4 sec, so that’s a 500% reduction in time. Not bad. Of course it can still be cut in half (to 1-2 sec) if you implement proper paging in the C# code of the NopCommerce. Right now the gridview gets ALL results and does the paging himself, which is not optimal. In a future article I might show you how I implemented the custom paging in NopCommerce. In the mean time you can read this article about custom paging: http://www.codeproject.com/KB/aspnet/ASPNET_DataGrid_Paging.aspx
Here is the optimized stored procedure, that I wrote starting with from Jared Nielsen’s version. I’ve kept some of his ideas but decided that it’s better to join another table (Nop_ProductVariant) that to run 2 scalar functions on each row, each doing a select. I’m not sure why Jared though scalar functions get optimized by the server. As far as I know they don’t get included in the execution plan at all, so using them not only is an O(n*n) operation like the join he tried to avoid, but you loose the benefit of execution plan optimization and compiling.
-- ============================================= -- Author: www.bitstarsolutions.com - Tudor Carean -- ============================================= ALTER PROCEDURE [dbo].[Nop_ProductLoadAllPaged] ( @CategoryID int = 0, @ManufacturerID int = 0, @ProductTagID int = 0, @FeaturedProducts bit = null, --0 featured only , 1 not featured only, null - load all products @PriceMin money = null, @PriceMax money = null, @Keywords nvarchar(MAX) = '', @SearchDescriptions bit = 0, @ShowHidden bit = 0, @PageIndex int = 0, @PageSize int = 20, --2147483644, @FilteredSpecs nvarchar(300) = null, --filter by attributes (comma-separated list). e.g. 14,15,16 @LanguageID int = 7, --0, @OrderBy int = 0, --0 position, 5 - Name, 10 - Price @TotalRecords int = null OUTPUT ) AS BEGIN --init SET @Keywords = isnull(@Keywords, '') SET @Keywords = '%' + rtrim(ltrim(@Keywords)) + '%' SET @PriceMin = isnull(@PriceMin, 0) SET @PriceMax = isnull(@PriceMax, 2147483644) --DECLARE @PageIndexTable TABLE CREATE TABLE #PageIndexTable ( [ProductID] int NOT NULL, [total] int ) --filter by attributes SET @FilteredSpecs = isnull(@FilteredSpecs, '') CREATE TABLE #FilteredSpecs ( SpecificationAttributeOptionID int not null ) INSERT INTO #FilteredSpecs (SpecificationAttributeOptionID) SELECT CAST(data as int) FROM dbo.[NOP_splitstring_to_table](@FilteredSpecs, ','); DECLARE @SpecAttributesCount int SELECT @SpecAttributesCount = COUNT(1) FROM #FilteredSpecs --paging DECLARE @PageLowerBound int DECLARE @PageUpperBound int DECLARE @RowsToReturn int SET @RowsToReturn = @PageSize * (@PageIndex + 1) SET @PageLowerBound = @PageSize * @PageIndex SET @PageUpperBound = @PageLowerBound + @PageSize + 1; WITH DispOrderCTE (ProductIDCTE, /*RowNumber,*/total) AS ( SELECT DISTINCT p.ProductID, --ROW_NUMBER() OVER (ORDER BY p.ProductID), COUNT(*) OVER(PARTITION BY NULL) FROM Nop_Product p with (NOLOCK) INNER JOIN Nop_ProductVariant pv with (NOLOCK) ON p.ProductID = pv.ProductID LEFT OUTER JOIN Nop_Product_Category_Mapping pcm with (NOLOCK) ON p.ProductID=pcm.ProductID LEFT OUTER JOIN Nop_Product_Manufacturer_Mapping pmm with (NOLOCK) ON p.ProductID=pmm.ProductID LEFT OUTER JOIN Nop_ProductTag_Product_Mapping ptpm with (NOLOCK) ON p.ProductID=ptpm.ProductID WHERE ( --P.ProductId = 9 --AND ( -- See if product is published @ShowHidden = 1 OR p.Published = 1 ) AND ( -- See if at least one product variant is published @ShowHidden = 1 OR (SELECT Max(Convert(int,Published)) FROM nop_ProductVariant where ProductID=p.ProductID) = 1 ) AND ( p.Deleted=0 ) AND ( -- Make sure the product is listed between a certain price range (SELECT Min(Price) FROM nop_ProductVariant WHERE ProductID=p.ProductID GROUP BY ProductID) BETWEEN IsNull(@PriceMin,0) AND IsNull(@PriceMax,999999999999) ) AND ( @CategoryID IS NULL OR @CategoryID=0 OR ( pcm.CategoryID=@CategoryID AND ( @FeaturedProducts IS NULL OR pcm.IsFeaturedProduct=@FeaturedProducts ) ) ) AND ( @ManufacturerID IS NULL OR @ManufacturerID=0 OR ( pmm.ManufacturerID=@ManufacturerID AND ( @FeaturedProducts IS NULL OR pmm.IsFeaturedProduct=@FeaturedProducts ) ) ) AND ( @ProductTagID IS NULL OR @ProductTagID=0 OR ptpm.ProductTagID=@ProductTagID ) AND ( -- search standard content patindex(@Keywords, isnull(p.name, '')) > 0 or patindex(@Keywords, isnull(pv.name, '')) > 0 or patindex(@Keywords, isnull(pv.sku , '')) > 0 or (@SearchDescriptions = 1 and patindex(@Keywords, isnull(p.ShortDescription, '')) > 0) or (@SearchDescriptions = 1 and patindex(@Keywords, isnull(p.FullDescription, '')) > 0) or (@SearchDescriptions = 1 and patindex(@Keywords, isnull(pv.Description, '')) > 0) -- search language content -- commented out localized product variants to save speed --or patindex(@Keywords, isnull(pl.name, '')) > 0 --or patindex(@Keywords, isnull(pvl.name, '')) > 0 --or (@SearchDescriptions = 1 and patindex(@Keywords, isnull(pl.ShortDescription, '')) > 0) --or (@SearchDescriptions = 1 and patindex(@Keywords, isnull(pl.FullDescription, '')) > 0) --or (@SearchDescriptions = 1 and patindex(@Keywords, isnull(pvl.Description, '')) > 0) ) AND ( @ShowHidden = 1 OR (getutcdate() between isnull(pv.AvailableStartDateTime, '1/1/1900') and isnull(pv.AvailableEndDateTime, '1/1/2999')) ) AND ( --filter by specs @SpecAttributesCount = 0 OR ( NOT EXISTS ( SELECT 1 FROM #FilteredSpecs [fs] WHERE [fs].SpecificationAttributeOptionID NOT IN ( SELECT psam.SpecificationAttributeOptionID FROM dbo.Nop_Product_SpecificationAttribute_Mapping psam WHERE psam.AllowFiltering = 1 AND psam.ProductID = p.ProductID ) ) ) ) ) ) INSERT INTO #PageIndexTable (ProductId,total) SELECT templist.ProductIDCTE,templist.total from (select fulllist.ProductIDCTE, fulllist.total, ROW_NUMBER() OVER (ORDER BY fulllist.ProductIDCTE) as RowNumber from DispOrderCTE fulllist ) templist WHERE RowNumber > @PageLowerBound AND RowNumber < @PageUpperBound ORDER BY RowNumber SELECT p.ProductId, p.Name, p.ShortDescription, p.FullDescription, p.AdminComment, p.ProductTypeId, p.TemplateId, p.ShowOnHomePage, p.MetaKeywords, p.MetaDescription, p.MetaTitle, p.SEName, p.AllowCustomerReviews, p.AllowCustomerRatings, p.RatingSum, p.TotalRatingVotes, p.Published, p.Deleted, p.CreatedOn, p.UpdatedOn FROM #PageIndexTable [pi] INNER JOIN Nop_Product p on p.ProductID = [pi].ProductID if ((select COUNT(*) from #PageIndexTable)>0) SELECT @TotalRecords=max(total) from #PageIndexTable else select @TotalRecords = 0 SET ROWCOUNT 0 DROP TABLE #FilteredSpecs DROP TABLE #PageIndexTable /* --- Test Harness exec [dbo].[Nop_ProductLoadAllPaged] @PageSize=2147483644 */ END
August 26, 2011 at 2:13 am
Thanks for putting the effor to rewrite the SP with CTEs, I just had a site down all day and when I opened the code and saw the original SQL I was so upset, how come there are still querys written like that, it’s 2011 for gods sake.
Anyway I had the code rewritten for NopCommerce 1.9, if you want it go to the following Nopcommerce board post:
http://www.nopcommerce.com/boards/t/9831/nop_productloadallpaged-needs-to-be-rewritenoptimized.aspx#46728