AutoBogus taking too long to generate data #12
Comments
Hi @jjroth Apologies for the delayed response. What version of I have just added some more unit tests to verify this behaviour based on the following scenarios:
If you are on a 2+ version, I'd be interested in understanding why it doesn't stop at the 2 levels. As for performance, there is a lot of reflection going on under the hood. When I get a chance, I am hoping to review the code base and address any bottlenecks like this. Nick. |
Hi @nickdodd79, No problem on the delay in responding. I totally understand. I am using AutoBogus 2.2.1. And yeah, I did see the documentation about it stopping at two levels, but it doesn't seem to be fully working, as you can see from my test above. Regarding performance, on my machine each test takes about 1 second, which really adds up after hundreds of tests. What I don't understand is that one colleague of mine has a machine that takes 40 seconds or more per test, and our virtual machines are fairly identical. Are you able to see the performance issue on your side that is caused by excessive data generation? I have been able to find a workaround of sorts and that is to create a RuleFor that forces any reference objects such as |
Hey @jjroth I will invest some more time trying to replicate your setup. My tests showed it was working as expected so there must be some factor not being considered by them. As for the performance, I shall also attempt to run testing on some larger data sets to see where the impact and bottlenecks are. Nick. |
I have the same issue: the entities have multiple dependencies and circular dependencies, and generating them consumes 100% CPU. Demonstrating entities:
public class Team : Entity
{
// Empty constructor for EF
public Team() { }
public new string Id { get; protected set; }
public string CompanyId { get; private set; }
public virtual Company Company { get; private set; }
public string AffiliateId { get; private set; }
public virtual Affiliate Affiliate { get; private set; }
public string CostCenterId { get; private set; }
public virtual CostCenter CostCenter { get; private set; }
public string Name { get; private set; }
public virtual ICollection<EmployeeTeam> EmployeeTeams { get; private set; }
public virtual ICollection<Order> Orders { get; private set; }
public virtual ICollection<Measurement> Measurements { get; private set; }
public bool IsAdmin { get; private set; }
public DateTime ChangedAt { get; private set; }
}
public class Company : Entity
{
// Empty constructor for EF
protected Company() { }
public new string Id { get; protected set; }
public string Name { get; protected set; }
public virtual ICollection<Affiliate> Affiliates { get; private set; }
public virtual ICollection<CostCenter> CostCenters { get; private set; }
}
public class Affiliate : Entity
{
// Empty constructor for EF
public Affiliate() { }
public new string Id { get; protected set; }
public string Name { get; private set; }
public string CompanyId { get; private set; }
public virtual Company Company { get; private set; }
public virtual ICollection<CostCenter> CostCenters { get; private set; }
}
public class CostCenter : Entity
{
// Empty constructor for EF
public CostCenter() { }
public new string Id { get; protected set; }
public string CompanyId { get; private set; }
public virtual Company Company { get; private set; }
public string AffiliateId { get; private set; }
public virtual Affiliate Affiliate { get; private set; }
public string CoreBusinessId { get; private set; }
public virtual CoreBusiness CoreBusiness { get; private set; }
public string Name { get; private set; }
public DateTime ChangedAt { get; private set; }
public bool AuthorizesOrderValueRange { get; private set; }
public int MinimumOrderAuthorization { get; private set; }
}
public class EmployeeTeam : ValueObject
{
// Empty constructor for EF
private EmployeeTeam() { }
public string TeamId { get; private set; }
public virtual Team Team { get; private set; }
public int EmployeeId { get; private set; }
public virtual Employee Employee { get; private set; }
public string CompanyId { get; private set; }
public virtual Company Company { get; private set; }
public string AffiliateId { get; private set; }
public virtual Affiliate Affiliate { get; private set; }
public string CostCenterId { get; private set; }
public virtual CostCenter CostCenter { get; private set; }
public string YearMonth { get; private set; }
public float Factor { get; private set; }
}
public class Order : Entity
{
// Empty constructor for EF
public Order() { }
public string CompanyId { get; private set; }
public virtual Company Company { get; private set; }
public string AffiliateId { get; private set; }
public virtual Affiliate Affiliate { get; private set; }
public string CostCenterId { get; private set; }
public virtual CostCenter CostCenter { get; private set; }
public string TeamId { get; private set; }
public virtual Team Team { get; private set; }
public DateTime Date { get; private set; }
public virtual Authorization Authorization { get; private set; }
public virtual ICollection<ServiceOrder> ServiceOrders { get; private set; }
public virtual ICollection<MeasurementService> MeasurementServices { get; private set; }
public string Assignee { get; private set; }
public DateTime ChangedAt { get; private set; }
public DateTime? AppDownloadedAt { get; set; }
public DateTime? ChangedByManagerAt { get; set; }
public DateTime? UnblockedByManagerAt { get; set; }
}
public class Measurement : Entity
{
public Measurement(int id, string companyId, string affiliateId, string costCenterId, int executionTermId, Authorization authorization,
string teamId, decimal contractAmount, decimal measurementAmount, decimal netAmount, decimal payAmount, bool isAdmin,
Audit audit, IList<MeasurementService> measurementServices, IList<MeasurementEmployee> measurementEmployees)
{
Id = id;
CompanyId = companyId;
AffiliateId = affiliateId;
CostCenterId = costCenterId;
ExecutionTermId = executionTermId;
Date = DateTime.Now.Date;
Audit = audit;
Authorization = authorization;
ContractAmount = contractAmount;
TeamId = teamId;
MeasurementAmount = measurementAmount;
NetAmount = netAmount;
PayAmount = payAmount;
IsAdmin = isAdmin;
CalculateFactorSalary = false;
CalculateFactorProduction = false;
MeasurementServices = measurementServices;
MeasurementEmployees = measurementEmployees;
ChangedAt = DateTime.Now.Date;
}
// Empty constructor for EF
public Measurement() { }
public string CompanyId { get; private set; }
public virtual Company Company { get; private set; }
public string AffiliateId { get; private set; }
public virtual Affiliate Affiliate { get; private set; }
public string CostCenterId { get; private set; }
public virtual CostCenter CostCenter { get; private set; }
public virtual Authorization Authorization { get; private set; }
public virtual Audit Audit { get; private set; }
public int? ExecutionTermId { get; private set; }
public virtual ExecutionTerm ExecutionTerm { get; private set; }
public DateTime Date { get; private set; }
public string TeamId { get; private set; }
public virtual Team Team { get; private set; }
public decimal? ContractAmount { get; private set; }
public float? ReadjustmentIndex { get; private set; }
public decimal? ReadjustmentAmount { get; private set; }
public decimal? MeasurementAmount { get; private set; }
public decimal? WithheldAmount { get; private set; }
public decimal? InssAmount { get; private set; }
public decimal? IrrfAmount { get; private set; }
public decimal? IssAmount { get; private set; }
public decimal? CofinsAmount { get; private set; }
public decimal? PisAmount { get; private set; }
public decimal? CsslAmount { get; private set; }
public decimal? DiscountAmount { get; private set; }
public decimal? NetAmount { get; private set; }
public decimal? PayAmount { get; private set; }
public bool IsAdmin { get; private set; }
public DateTime? DateInit { get; private set; }
public DateTime? DateEnd { get; private set; }
public decimal? PayrollAmount { get; private set; }
public string Description { get; private set; }
public bool CalculateFactorSalary { get; private set; }
public bool CalculateFactorProduction { get; private set; }
public virtual ICollection<MeasurementService> MeasurementServices { get; private set; }
public virtual ICollection<MeasurementEmployee> MeasurementEmployees { get; private set; }
public DateTime ChangedAt { get; private set; }
} |
Hey @afranioce Thanks for providing another instance of this. I think the amount of reflection going on is quite intense when generating EF entities. I have a plan to introduce a type cache so reflected info doesn't get loaded multiple times. At the moment I am in the process of adding some generator override mechanisms which I plan on releasing soon. Once that is done, I will be looking at how the bottlenecks can be alleviated. Nick. |
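The type cache mentioned above could look roughly like the following. This is only a minimal sketch of the general idea, not AutoBogus code: `PropertyCache` and `GetProperties` are hypothetical names, and the real implementation may cache different reflection data.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;

// Hypothetical helper (not part of AutoBogus): caches the reflected
// properties of each type so repeated generation of the same entity
// type only pays the reflection cost once.
public static class PropertyCache
{
    private static readonly ConcurrentDictionary<Type, PropertyInfo[]> Cache =
        new ConcurrentDictionary<Type, PropertyInfo[]>();

    public static PropertyInfo[] GetProperties(Type type) =>
        Cache.GetOrAdd(type, t => t.GetProperties(BindingFlags.Public | BindingFlags.Instance));
}
```

A generator would call `PropertyCache.GetProperties(typeof(Team))` instead of `typeof(Team).GetProperties(...)` each time it populates an instance.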
Hey Nick, I don't know if this helps, but I received a PR for some performance improvements to Bogus that had some reflection improvements. These might be the same caching mechanisms you're thinking about, so I'm not sure how helpful they are: Also, there was a small bug in the original PR, so something to watch out for. Here's the proper fix for the bug: |
Thanks @bchavez. I will take a look. |
I'm having the same issue. Using EF .NET Core 2.2, which produced all the POCOs (around 40 tables, which have a fair few one-to-many relationships). The Generate() never actually returns and uses a fair bit of CPU. |
I'm also having this issue with .NET Core 2.2 and EF Core, with the following dependencies.
My EF dependencies
|
I have the same issue too, trying to fake EF Core data. But in fact, I do think the Generate will return something. It just takes a huge amount of time to generate data. |
Hey all, Apologies for my delayed response. I have looked into this several times, but haven't got far with it. I have some spare time over the next few weeks so plan on dedicating some effort to it. I suspect it is something related specifically to EF and have some theories around why it is slow. Just need to do some exploration. Thanks for your patience. |
Hi Nick, Any luck with this issue? I'm facing the same problem when generating fake objects. |
Hey @srihere17 I have attempted to look into and resolve this several times, but haven't got far with the reproduction. I suspect there is something in the EF setup that is making As another use case to pin it down, could you post what your code looks like here. I can then run over the framework to see if it highlights anything. Thanks, |
I encountered a similar problem while using AutoBogus with autogenerated POCOs. The problem seemed to be the complexity of the NavigationProperty reference tree, which was surprisingly wide with just 2 levels of depth for my database model. It was fixed by adjusting the AutoFaker configuration (recursion depth: down to 0-1 / number of elements for collections). |
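For anyone wanting to try the configuration change described above, a sketch along these lines should be close. It assumes the global `AutoFaker.Configure` entry point with the `WithRecursiveDepth` and `WithRepeatCount` builder options; the values of 1 are illustrative, not a recommendation from the maintainers.

```csharp
using AutoBogus;

// Apply once before generating, e.g. in a test fixture constructor.
// The values below are illustrative assumptions; tune them for your model.
AutoFaker.Configure(builder => builder
    .WithRecursiveDepth(1)  // how many times a type may recurse into itself
    .WithRepeatCount(1));   // how many items are generated per collection
```

Lowering both numbers shrinks the generated object graph, which is where most of the time appears to go for EF-style models with many navigation properties.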
I do believe the issue has to do with Navigation properties and my POCOs/Entities were not autogenerated. Probably similar to a JSON circular reference issue. If there was a way we could set AutoBogus to ignore virtual properties (or set a default of null to all virtual properties) that might resolve the issue. Here's a class for an example of one of my entities.
|
Should be rather easy to reproduce. I just had a large number of classes referencing themselves, with a large number of properties and possible circular dependencies. |
Adding a +1 for this issue. Using EFCore5, and without introducing a FWIW, looks like the grief is occurring in |
OK... apologies for the previous post. It turns out I had a leftover navigation property that was pointing back to my domain model and that was causing the issue. It turns out that this issue is simply due to traversing navigation properties. I have been able to reproduce this... and this time posted it as working code. 😉 https://github.com/Mike-E-angelo/AutoBogus.Performance Basically, the following model will do the trick (notice that the project does not have EF installed whatsoever):
Running this on my machine will take about two full seconds. Interestingly enough, commenting out this line will reduce that by over half (about 0.9 seconds total runtime). Anyway, I wanted to get that out there as more data towards a possible diagnosis. |
This issue is rather complex and not really solvable. In my opinion, AutoBogus should scan the data, build a picture of the navigation property tree, and check for circular dependencies. If it finds some, it should throw a runtime exception asking the dev to specify an action for those properties. |
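To make the suggestion above concrete, here is a rough sketch of such a scan. It is not something AutoBogus does; `CycleDetector`, `DemoCompany` and `DemoAffiliate` are hypothetical names, and the sketch makes simplifying assumptions: every generic type is treated as a container of its type arguments, and anything in a namespace starting with `System` is treated as a leaf.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical model, loosely based on the entities posted earlier in the thread.
public class DemoCompany { public ICollection<DemoAffiliate> Affiliates { get; set; } }
public class DemoAffiliate { public DemoCompany Company { get; set; } }

// Hypothetical helper: depth-first search over public property types.
public static class CycleDetector
{
    public static bool HasCycle(Type root) =>
        Visit(root, new HashSet<Type>(), new HashSet<Type>());

    private static bool Visit(Type type, HashSet<Type> path, HashSet<Type> done)
    {
        // Unwrap arrays and generic containers such as ICollection<T> or Nullable<T>.
        if (type.IsArray) return Visit(type.GetElementType(), path, done);
        if (type.IsGenericType) return type.GetGenericArguments().Any(a => Visit(a, path, done));

        // Treat framework types (string, DateTime, int, ...) as leaves.
        if (type.Namespace != null && type.Namespace.StartsWith("System", StringComparison.Ordinal))
            return false;

        if (done.Contains(type)) return false; // already fully explored, no cycle found
        if (!path.Add(type)) return true;      // type is already on the current path

        bool found = type.GetProperties(BindingFlags.Public | BindingFlags.Instance)
                         .Any(p => Visit(p.PropertyType, path, done));
        path.Remove(type);
        done.Add(type);
        return found;
    }
}
```

A generator could call `CycleDetector.HasCycle(typeof(DemoCompany))` up front and throw a descriptive exception when it returns true, instead of silently generating a very large graph.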
I've run into this same issue. I'm a bit confused about how |
The way it seems to work is it will stop if the number of parents of the current type being generated that match the current type is >= depth. |
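Expressed as code, the rule described above might look like this. It is a paraphrase of the comment, not the actual AutoBogus source; `RecursionGuard` and `ShouldStop` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper mirroring the described check: stop when the type
// being generated already appears `recursiveDepth` or more times among
// its parents in the current generation chain.
public static class RecursionGuard
{
    public static bool ShouldStop(IEnumerable<Type> parents, Type current, int recursiveDepth) =>
        parents.Count(parent => parent == current) >= recursiveDepth;
}
```

Under this rule the count is kept per type, so a chain that passes through many different types can still grow long before any single type reaches the limit, which may explain why the generated tree ends up so wide.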
I've added a TreeDepth option in this PR. #64 |
There's a package here if it helps https://github.com/Ian1971/AutoBogus/packages/637294 |
Hey All, Thanks to the effort of @Ian1971, I think we have a performance improvement in v2.13.0. It does involve using a I did some top level testing and a simple model that took 4s to generate now takes 18ms 👍 Nick. |
@nickdodd79 Wanted to add my $0.02 here, but possibly from a different angle. I've got a test that's generating a set of
This is the entire AutoFaker portion of the code, redacted:
My questions are
|
Following up: I was able to get the runtime down from an average of above 2s per transaction to around 300ms per transaction by explicitly setting rules for each of the navigation properties I'm not using (returning Before:
Using
Using
So that basically addresses the concern I raised in my first question, although I'm still curious whether there's a way I'm not aware of to have the setup be opt-in instead of opt-out. As an aside: from a tiny sampling, using |
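For readers looking for the shape of this opt-out workaround, a minimal sketch follows. The `Invoice` and `Customer` types are hypothetical stand-ins, not the redacted model from this comment; the sketch assumes that `AutoFaker<T>` accepts Bogus `RuleFor` rules and leaves members with an explicit rule alone.

```csharp
using AutoBogus;

// Opt out of generating a navigation property by giving it an explicit rule.
var faker = new AutoFaker<Invoice>()
    .RuleFor(i => i.Customer, _ => (Customer)null); // unused navigation: skip it

var invoice = faker.Generate();

// Hypothetical entities for illustration only.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Invoice
{
    public int Id { get; set; }
    public decimal Total { get; set; }
    public Customer Customer { get; set; }
}
```

Each navigation property that a test does not need gets its own rule, so the approach stays opt-out: any property without a rule is still auto-generated.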
I think we also could improve some performance (not sure if this is already possible)
Of course, this won't fix all the cases. But it's maybe a better solution (faster and better) than assigning a tree depth |
Hi @nickdodd79
will throw StackOverflow exception. EDIT: |
Are you using the WithTreeDepth or WithRecursiveDepth config option? |
@Ian1971 What actually worked is moving the property from the constructor to the record body, as follows:
|
I've not actually used records yet so I guess these need checking out |
@Ian1971 the link goes to a different page than the title would suggest... ;) |
I've come across a performance issue when using AutoFaker to generate test objects for use in unit tests.
I've been able to create a super simplified example of my domain model to demonstrate two things.
Here is my unit test demonstrating the issue:
The following code represents a simplified version of my domain model. Please note that I've tried to make it make sense without giving away our exact domain model. Modeling a document so that unique sentences can be applied to other paragraphs is not realistic; it was just the closest analogy that I could come up with. We basically have objects whose configuration needs to be tracked in branches (like version control). The links between objects are the foreign key relationships in the database and our entities are created using Entity Framework.
The things that seem to contribute to the issue are:
Any thoughts on how this could be fixed? At the moment I am having to add rules so that for most objects Branch will be null or new Branch(), but it would be better if AutoBogus could stop itself from generating so many levels of data.