Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WebApp.Api.Content.GetAllAsync() - Very slow #1696

Closed
hrvoje-grabusic opened this issue Aug 22, 2021 · 9 comments
Closed

WebApp.Api.Content.GetAllAsync() - Very slow #1696

hrvoje-grabusic opened this issue Aug 22, 2021 · 9 comments

Comments

@hrvoje-grabusic
Copy link

hrvoje-grabusic commented Aug 22, 2021

Hi,

I made a simple product list page where i just get all products like this:
var products = await WebApp.Api.Content.GetAllAsync<StandardProduct>();
And it takes from 600-750ms to return the html.
There is only 34 items, that seems way to much.

I checked the source to try to find the problem.
The GetAllMethod is fetching items from the database one by one.
This is a critical performance problem.

/// <summary>
        /// Gets all of the available content for the optional
        /// group id.
        /// </summary>
        /// <typeparam name="T">The model type</typeparam>
        /// <param name="groupId">The optional group id</param>
        /// <returns>The available content</returns>
        public async Task<IEnumerable<T>> GetAllAsync<T>(string groupId = null) where T : GenericContent
        {
            var models = new List<T>();
            var all = await _repo.GetAll(groupId).ConfigureAwait(false);

            foreach (var contentId in all)
            {
                var content = await GetByIdAsync<T>(contentId).ConfigureAwait(false);

                if (content != null)
                {
                    models.Add(content);
                }
            }
            return models;
        }

Is there some way around this ?

@tidyui
Copy link
Member

tidyui commented Aug 22, 2021

This issue already exists and will be implemented with the next release. #1364

@tidyui
Copy link
Member

tidyui commented Aug 22, 2021

Besides the fact that the current implementation isn’t optimal, 600-700ms just for retrieving the data seems long, how is your setup?

  • Are you using cache?
  • Where is the database positioned in regards to the database, is there network lag?

Best regards

@hrvoje-grabusic
Copy link
Author

Hi,

I haven't changed the startup settings.
The cache config is present in the startup.
I think it should be enabled.

I am using SQLite on localhost.

public class Startup
    {
        // This method gets called by the runtime. Use this method to add services to the container.
        // For more information on how to configure your application, visit https://go.microsoft.com/fwlink/?LinkID=398940
        public void ConfigureServices(IServiceCollection services)
        {
            

            services.AddPiranha(options =>
            {
                options.AddRazorRuntimeCompilation = false;

                options.UseFileStorage(naming: FileStorageNaming.UniqueFolderNames);
                options.UseImageSharp();
                options.UseManager();
                options.UseTinyMCE();
                options.UseMemoryCache();

                options.UseEF<SQLiteDb>(db =>
                    db.UseSqlite("Filename=./RazorWeb.web.db"));
                options.UseIdentity<IdentitySQLiteDb>(db =>
                    db.UseSqlite("Filename=./RazorWeb.web.db"));

                options.UseSecurity(o =>
                {
                    o.UsePermission("Subscriber");
                });
            });

            services.AddScoped<ProductRepository>();
        }

        // This method gets called by the runtime. Use this method to configure the HTTP request pipeline.
        public void Configure(IApplicationBuilder app, IWebHostEnvironment env, IApi api)
        {
            if (env.IsDevelopment())
            {
                app.UseDeveloperExceptionPage();
            }

            App.Init(api);

            // Configure cache level
            App.CacheLevel = Piranha.Cache.CacheLevel.Full;

            // Register custom components
            App.Blocks.Register<RazorWeb.Models.Blocks.MyGenericBlock>();
            App.Blocks.Register<RazorWeb.Models.Blocks.RawHtmlBlock>();
            App.Modules.Manager().Scripts.Add("~/assets/custom-blocks.js");
            App.Modules.Manager().Styles.Add("~/assets/custom-blocks.css");
            
            // Build content types
            new ContentTypeBuilder(api)
                .AddAssembly(typeof(Startup).Assembly)
                .Build()
                .DeleteOrphans();

            // Configure editor
            Piranha.Manager.Editor.EditorConfig.FromFile("editorconfig.json");

            /**
             *
             * Test another culture in the UI
             *
            var cultureInfo = new System.Globalization.CultureInfo("en-US");
            System.Globalization.CultureInfo.DefaultThreadCurrentCulture = cultureInfo;
            System.Globalization.CultureInfo.DefaultThreadCurrentUICulture = cultureInfo;
             */

            app.UsePiranha(options =>
            {
                options.UseManager();
                options.UseTinyMCE();
                options.UseIdentity();
            });

            Seed.RunAsync(api).GetAwaiter().GetResult();
        }

The cache seems to not be implemented in the ContentService.cs ?

/// <summary>
        /// Gets the content model with the specified id.
        /// </summary>
        /// <typeparam name="T">The model type</typeparam>
        /// <param name="id">The unique id</param>
        /// <param name="languageId">The optional language id</param>
        /// <returns>The content model</returns>
        public async Task<T> GetByIdAsync<T>(Guid id, Guid? languageId = null) where T : GenericContent
        {
            T model = null;

            // Make sure we have a language id
            if (languageId == null)
            {
                languageId = (await _langService.GetDefaultAsync())?.Id;
            }

            // First, try to get the model from cache
            if (typeof(IDynamicContent).IsAssignableFrom(typeof(T)))
            {
                model = null; // TODO: _cache?.Get<T>($"DynamicContent_{ id.ToString() }");
            }
            else
            {
                model = null; // TODO: _cache?.Get<T>(id.ToString());
            }

            // If we have a model, let's initialize it
            if (model != null)
            {
                if (model is IDynamicContent dynamicModel)
                {
                    await _factory.InitDynamicAsync(dynamicModel, App.ContentTypes.GetById(model.TypeId)).ConfigureAwait(false);
                }
                else
                {
                    await _factory.InitAsync(model, App.ContentTypes.GetById(model.TypeId)).ConfigureAwait(false);
                }
            }

            // If we don't have a model, get it from the repository
            if (model == null)
            {
                model = await _repo.GetById<T>(id, languageId.Value).ConfigureAwait(false);

                await OnLoadAsync(model).ConfigureAwait(false);
            }

            // Check that we got back the requested type from the
            // repository
            if (model != null && model is T)
            {
                return model;
            }
            return null;
        }

I just pulled the source i didn't create the site from template with dot net new.

@hrvoje-grabusic
Copy link
Author

About 300ms of that time seems to be because i have DataSelectFields in my product region.

[Field(Title ="Product type")]
        public DataSelectField<ProductTypeItem> ProductType { get; set; }

        [Field(Title = "Product material")]
        public DataSelectField<ProductMaterialItem> ProductMaterial { get; set; }

The GetById method of those classes is called for every product item automatically.
By just setting the model to null those 300ms disappear.
This way I will have to load the categories manually but i prefer to have control over loading of related data.

static async Task<ProductTypeItem> GetById(string id, IApi api)
        {
            return new ProductTypeItem
            {
                Id = new Guid(id),
                Model = null// await api.Content.GetByIdAsync<ProductTypeCategory>(new Guid(id))
            };
        }

I managed to get it down to 32-45ms with a custom repository.

using Microsoft.EntityFrameworkCore;
using Piranha;
using Piranha.Data;
using Piranha.Models;
using Piranha.Services;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

namespace RazorWeb.Models.Repository
{
    public class ProductRepository
    {
        private readonly IDb _db;
        private readonly IContentFactory _factory;
        //private readonly IContentService _contentService;
        private readonly IContentService<Piranha.Data.Content, ContentField, GenericContent> _efCcontentService;
        //private readonly ILanguageService _langService;
        public ProductRepository(IDb db, 
            IContentFactory factory,
           // IContentService contentService,
            IContentServiceFactory ef_content_service_factory
            )
        {
            _db = db;
            _factory = factory;
            //_contentService = contentService;
            _efCcontentService = ef_content_service_factory.CreateContentService();
            //_langService = langService;
        }

        public IQueryable<Piranha.Data.Content> GetQuery()
        {
            return _db.Content
                .AsNoTracking()
                .Include(c => c.Category)
                .Include(c => c.Translations)
                //.Include(c => c.Blocks).ThenInclude(b => b.Fields).ThenInclude(f => f.Translations)
                .Include(c => c.Fields).ThenInclude(f => f.Translations)

                .Include(c => c.Fields).ThenInclude(f => f.Content).ThenInclude(f => f.Translations)

                //.Include(c => c.Tags).ThenInclude(t => t.Taxonomy)

                .Where(c=> c.Type.Id == "StandardProduct")
                .AsSplitQuery()
                .AsQueryable();
        }

        public async Task<List<StandardProduct>> GetAllProducts(Guid? languageId=null)
        {
            var results = new List<StandardProduct>();

            var items = await GetQuery().ToListAsync();
            var type = App.ContentTypes.GetById("StandardProduct");
            foreach (var item in items)
            {
                StandardProduct model = null;
                //model = await _factory.InitAsync(model, type).ConfigureAwait(false);

                model = await _efCcontentService.TransformAsync<StandardProduct>(item,
                    type, 
                    ProcessAsync, 
                    languageId);

                await OnLoadAsync(model);
                results.Add(model);
            }

            return results;
        }

        private async Task OnLoadAsync(GenericContent model)
        {
            // Make sure we have a model
            if (model == null) return;

            // Initialize the model
            await _factory.InitAsync(model, App.ContentTypes.GetById(model.TypeId)).ConfigureAwait(false);
            

            // Initialize primary image
            if (model.PrimaryImage == null)
            {
                model.PrimaryImage = new Piranha.Extend.Fields.ImageField();
            }

            if (model.PrimaryImage.Id.HasValue)
            {
                await _factory.InitFieldAsync(model.PrimaryImage).ConfigureAwait(false);
            }

        }

        /// <summary>
        /// Performs additional processing and loads related models.
        /// </summary>
        /// <param name="content">The source content</param>
        /// <param name="model">The target model</param>
        private Task ProcessAsync<T>(Piranha.Data.Content content, T model) where T : Piranha.Models.GenericContent
        {
            return Task.Run(() => {
                // Map category
                //if (content.Category != null && model is Piranha.Models.ICategorizedContent categorizedModel)
                //{
                //    categorizedModel.Category = new Piranha.Models.Taxonomy
                //    {
                //        Id = content.Category.Id,
                //        Title = content.Category.Title,
                //        Slug = content.Category.Slug
                //    };
                //}

                // Map tags
                //if (model is Piranha.Models.ITaggedContent taggedContent)
                //{
                //    foreach (var tag in content.Tags)
                //    {
                //        taggedContent.Tags.Add(new Piranha.Models.Taxonomy
                //        {
                //            Id = tag.Taxonomy.Id,
                //            Title = tag.Taxonomy.Title,
                //            Slug = tag.Taxonomy.Slug
                //        });
                //    }
                //}

                // Map Blocks
                //if (!(model is Piranha.Models.IContentInfo) && model is Piranha.Models.IBlockContent blockModel)
                //{
                //    blockModel.Blocks = _efCcontentService.TransformBlocks(content.Blocks.OrderBy(b => b.SortOrder), 
                //        content.SelectedLanguageId);
                //}
            });
        }
    }
}

@tidyui
Copy link
Member

tidyui commented Aug 22, 2021

Yes, even after the GetByIds is implemented, which will make a single database operation for all items missing from cache, the Init() method on the fields will be called for each model nevertheless. This is when GetById is called on your data select items.

The best solution from the framework would probably be to provide separate load methods from within the manager UI and from the client application, this way you can load the external entity when constructing the edit models, but do nothing when loading it in the client and then make a combined call to load the external data like you’ve done.

Best regards

@hrvoje-grabusic
Copy link
Author

GetByIds is relatively easy to add in the source code.

In the ContentService.cs i just changed the GetAllAsync and added a new method and changed the interface.
That was easier then making a custom repo. I should have tried that first.

public async Task<IEnumerable<T>> GetAllAsync<T>(string groupId = null) where T : GenericContent
        {
            var models = new List<T>();
            var all = await _repo.GetAll(groupId).ConfigureAwait(false);

            models = await GetByIdsAsync<T>(all.ToArray()).ConfigureAwait(false);

            //foreach (var contentId in all)
            //{
            //    var content = await GetByIdAsync<T>(contentId).ConfigureAwait(false);

            //    if (content != null)
            //    {
            //        models.Add(content);
            //    }
            //}
            return models;
        }

public async Task<List<T>> GetByIdsAsync<T>(Guid[] ids, Guid? languageId = null)
            where T : GenericContent
        {
            // Make sure we have a language id
            if (languageId == null)
            {
                languageId = (await _langService.GetDefaultAsync())?.Id;
            }

            List<T> list = new List<T>();

            var models = await _repo.GetByIds<T>(ids, languageId.Value);

            foreach (var id in ids)
            {

                T model = models.Where(i=>i!=null).FirstOrDefault(i=>i.Id==id);


                if (model != null)
                {
                    //await _factory.InitAsync(model, App.ContentTypes.GetById(model.TypeId)).ConfigureAwait(false);
                    await OnLoadAsync(model).ConfigureAwait(false);
                }

                // If we don't have a model, get it from the repository
                //if (model == null)
                //{
                //    model = await _repo.GetById<T>(id, languageId.Value).ConfigureAwait(false);

                //    await OnLoadAsync(model).ConfigureAwait(false);
                //}

                // Check that we got back the requested type from the
                // repository
                if (model != null && model is T)
                {
                    list.Add(model);
                }

            }
            return list;
        }

That takes care of the first problem.
Will look into optimizing the transformation next.

@hrvoje-grabusic
Copy link
Author

Yes, even after the GetByIds is implemented, which will make a single database operation for all items missing from cache, the Init() method on the fields will be called for each model nevertheless. This is when GetById is called on your data select items.

The best solution from the framework would probably be to provide separate load methods from within the manager UI and from the client application, this way you can load the external entity when constructing the edit models, but do nothing when loading it in the client and then make a combined call to load the external data like you’ve done.

Best regards
That sounds good.

I just tried uncommenting the TODO caching lines in ContentService.cs
The caching does work and it does solve the performance problem.
It loads now in about 40-55ms with the DataSelectField models.
Not sure what was left TODO with the caching.

// First, try to get the model from cache
            if (typeof(IDynamicContent).IsAssignableFrom(typeof(T)))
            {
                model =  _cache?.Get<T>($"DynamicContent_{ id.ToString() }");
            }
            else
            {
                model = _cache?.Get<T>(id.ToString());
            }

@tidyui
Copy link
Member

tidyui commented Aug 23, 2021

The implementation (roughly) that will be moved into all services can be found in the PageService here: https://github.com/PiranhaCMS/piranha.core/blob/master/core/Piranha/Services/PageService.cs#L314 This method first tries to get the requested models from cache, then makes a single databas call for the models that aren't available in the cache, and lastly ensures that the returned entries are in the same order as the input array. We will optimize the implementation slightly by using Dictionaries instead of lists internally which will make the sorting faster for bigger collections.

Also, like you mentioned, why the cache is commented out in the ContentService in the first place, I'll add a separate issue for this scheduled for the next minor release as well.

Best regards

@tidyui
Copy link
Member

tidyui commented Oct 7, 2021

Closing as this is a duplicate

@tidyui tidyui closed this as completed Oct 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants