
investigate memory utilization post-native engine #4101

Closed · kwlzn opened this issue Nov 29, 2016 · 10 comments

kwlzn (Member) commented Nov 29, 2016

currently, running ./pants --enable-v2-engine list :: in our monorepo leads to memory utilization of ~7GB in the main pants python2.7 process on my OSX machine. this is likely problematic for obvious reasons.

we'll want to dig into optimizing the memory footprint now that the native engine has landed.
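
For anyone reproducing this, the peak figure can also be sampled from inside the pants process with the stdlib resource module. A minimal sketch (not part of the original report; note that ru_maxrss is reported in bytes on OSX but in kilobytes on Linux):

import resource
import sys

def peak_rss_mb():
  # ru_maxrss units differ by platform: bytes on OSX ('darwin'), kilobytes on Linux.
  rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
  return rss / (1024.0 * 1024.0) if sys.platform == 'darwin' else rss / 1024.0

print('peak RSS: ~%.0fMB' % peak_rss_mb())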

JieGhost (Contributor) commented Nov 29, 2016

By "main" pants process do you mean the main daemon process? or you are not using daemon?

kwlzn (Member, Author) commented Nov 29, 2016

@JieGhost the main pants python2.7 process, without the daemon.

JieGhost (Contributor) commented

@kwlzn Did the python-backed v2 engine also consume that much memory when doing "list ::"?

kwlzn (Member, Author) commented Nov 30, 2016

@JieGhost looks like the peak memory utilization for a v1 list :: is ~921MB on my machine - so, considerably lower.

JieGhost (Contributor) commented

@kwlzn I mean the v2 engine with the python implementation. Not sure if we ever captured that before, but it should be much higher than v1 usage (921MB), right?

kwlzn (Member, Author) commented Nov 30, 2016

@JieGhost ah, my bad. yeah, we did capture v2-python timings and iirc, memory was roughly the same. we were thinking the native engine would help, but it doesn't seem to have moved the needle on mem utilization at all. because of this, I suspect the high memory utilization is coming from the python side of the code vs the native bit.

JieGhost (Contributor) commented Dec 1, 2016

I can repro the 7GB mem usage.

I checked the mem usage of the ExternContext object, in particular the mem usage of _id_to_obj, _obj_to_id and _handles, using the sys.getsizeof method.

It seems _id_to_obj and _obj_to_id both use around 50MB, and _handles uses around 134MB.
Since sys.getsizeof only returns the size of the object itself, not the sizes of the objects it references, I also added up the sizes of the referenced objects:

import sys

# Shallow size of every value in the id -> object map.
size = 0
for obj in self.scheduler._context._id_to_obj.values():
  size += sys.getsizeof(obj)

This returns around 68MB.

For _handles, it is around 461MB.
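
The _handles figure was presumably computed with an analogous loop; as a sketch, assuming _handles is a flat iterable of handle objects:

# Same shallow-size caveat applies here.
handles_size = 0
for handle in self.scheduler._context._handles:
  handles_size += sys.getsizeof(handle)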

One thing I am not sure about is whether those referenced objects hold further references of their own.
Since the product graph now lives in Rust code, I also need to find a way to measure the memory usage there.

stuhood (Member) commented Feb 21, 2017

This definitely needs to include the transitive references of the object.
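
For what it's worth, a transitive size can be approximated by walking gc.get_referents with a seen set, so shared objects are only counted once. A minimal sketch (not code from this thread):

import gc
import sys

def deep_getsizeof(root):
  # Transitively walks gc.get_referents, counting each object id once
  # so shared and cyclic references are not double-counted.
  seen = set()
  stack = [root]
  total = 0
  while stack:
    obj = stack.pop()
    if id(obj) in seen:
      continue
    seen.add(id(obj))
    total += sys.getsizeof(obj)
    stack.extend(gc.get_referents(obj))
  return total

Something like deep_getsizeof(self.scheduler._context._id_to_obj) would then count the dict, its keys, its values, and everything they reference. The estimate is still approximate: gc.get_referents can also reach type objects and other interpreter state shared beyond the structure being measured.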

stuhood (Member) commented Mar 15, 2017

I think this is probably approaching resolution once #4333, #4331, and #4334 land.

kwlzn (Member, Author) commented Mar 15, 2017

sgtm

kwlzn closed this as completed Mar 21, 2017