Adding Supply Explodes Memory Usage #36

Closed
clukewatson opened this issue Jul 24, 2019 · 12 comments

@clukewatson

I am running a fairly big problem with about 9,000 product firms, absorbing a single fixed effect (with ~250 levels), with five random coefficients (price being endogenous). I can give more details if needed. Using only the demand side, I can estimate the model in Matlab and via pyBLP and get the same results (though pyBLP is much faster). There is no memory strain on my 16GB RAM laptop; in fact, I can run three or four notebooks with no problem.

Issue: I wanted to try out pyBLP's Supply Side which I have not coded in Matlab. When I did this, the memory usage exploded and the kernel died. Do you know why this could be the case? Is there anything that could be done given the problem size?

Note: I chose the log option, used the cost bounds from the BLP example, and absorbed the FEs as in the X1 matrix for demand.

Aside: thank you for all your work in developing pyBLP; I've implemented some of its features in my Matlab code and am looking forward to using your package in other projects going forward.

Best,
Luke

@jeffgortmaker
Owner

So the most demanding part of supply-side estimation is computing the gradient. Doing so requires lots of tensor products. When markets are large, this can use a lot of memory, especially if you're using a numerical integration routine with lots of nodes.

Two things I'd try out:

  1. To see if your memory bottleneck is computing the analytic gradient, try setting compute_gradient=False in your Optimization configuration. Optimization will be much slower (the routine will use finite differences), so ultimately I wouldn't recommend this, but it's a fast way to check if analytic supply-side gradients are the issue.

  2. Use an Integration configuration that requires fewer nodes. For example, sparse grid integration tends to perform well in high dimensions. With five random coefficients, the number of nodes (agents/individuals) in each market can be very large for comparably accurate Monte Carlo or product rule configurations. (A rough sketch of both suggestions is below.)
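
Roughly, something like this (an untested sketch; `problem`, `product_formulations`, and `product_data` stand in for whatever you already have):

```python
import pyblp

# 1. Diagnose: drop the analytic gradient so the optimizer falls back to finite
#    differences. Much slower, but a quick way to see whether the analytic
#    supply-side gradient is what blows up memory.
optimization = pyblp.Optimization('l-bfgs-b', compute_gradient=False)
# results = problem.solve(..., optimization=optimization)

# 2. Reduce nodes: sparse grid integration scales much better than Monte Carlo
#    or product rules with five random coefficients. Passing an Integration
#    configuration to Problem builds the agent data for you.
integration = pyblp.Integration('grid', size=7)
# problem = pyblp.Problem(product_formulations, product_data, integration=integration)
```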

And thanks for your interest in the project! Let me know what you find. Maybe there's something here that I can do to reduce memory usage.

@chrisconlon
Collaborator

If you look at Appendix p. 2 of Conlon and Gortmaker, you can see the problem:

I bet the problematic step is the $\partial \Delta / \partial \xi$ calculation (where $\Delta$ is the $\partial Q / \partial P$ matrix), which is a tensor Jacobian of a $J \times J$ matrix with respect to a $J \times 1$ vector. The resulting tensor has $J^3$ elements, and $9000^3$ is about 729 billion. Since it ultimately gets accumulated into a $J \times 1$ vector, it might be better to avoid storing the $J^3$ tensor and just unroll the loop. This looks like a discrete project that would benefit from some low-level optimization, in case anyone is interested.
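
To make the unrolling concrete, a toy NumPy sketch (not the actual pyBLP contraction; `jacobian_slice` and `W` are just stand-ins):

```python
import numpy as np

J = 9000  # products in the problem above; shrink for a quick test
W = np.random.rand(J, J)  # stand-in for whatever the J^3 tensor is contracted against

def jacobian_slice(k):
    """Stand-in for the (J, J) slice of d Delta / d xi with respect to xi_k."""
    return np.random.rand(J, J)

# Infeasible: materializing the full (J, J, J) tensor takes J^3 doubles, roughly 6 TB.
# tensor = np.stack([jacobian_slice(k) for k in range(J)], axis=2)
# result = np.einsum('ij,ijk->k', W, tensor)

# Unrolled: accumulate into the final J x 1 vector one (J, J) slice at a time,
# so peak memory is a few J x J arrays (~650 MB each) instead of the full tensor.
result = np.empty(J)
for k in range(J):
    result[k] = np.sum(W * jacobian_slice(k))
```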

@chrisconlon
Collaborator

Otherwise that tensor needs about 6TB of memory.

@jeffgortmaker
Owner

Yeah, that sounds right. I think the loop (over $\delta$/$\xi$) would probably start with the first 3D array and end by filling the last 2D array that depends on the 3D ones.

Micro moments' contribution to the gradient probably suffers from the same problem, so it unfortunately might be necessary to unroll that as well.

jeffgortmaker added the enhancement label on Jul 24, 2019
@clukewatson
Author

Update: setting compute_gradient=False allows the notebook to run (and it's still running), so I think your diagnosis is confirmed. Coding the gradient is what kept me from doing this in Matlab, so I was hoping this would work out.

Do you guys want me to close this issue?

Thanks for the quick response!

@jeffgortmaker
Owner

Great, thanks for confirming! If you don't mind, I'm going to keep this issue open as a reminder to work on this at some point.

@jeffgortmaker
Owner

As a follow-up to this, 95404c7 re-structures how we do supply-side gradients so that we no longer need to compute a J^3 tensor. This is in the dev version of the code for now, but I'll include it in the next stable release.

@zhijianli9999

I'm facing an issue that looks similar to what's described here. My problem has the dimensions below, and memory usage peaks at around 400GB, but only when I set compute_gradient=True. Is this expected?

Dimensions:
==============================================================================
  T      N     F      I      K1    K2    K3    D    MD    MS    MC    ED    H 
-----  -----  ----  ------  ----  ----  ----  ---  ----  ----  ----  ----  ---
14915  81364  3831  745750   2     2     1     1    6     1     1     2     1 
==============================================================================

Thanks for your help and for your work in maintaining the package.

Best,
Li

@jeffgortmaker
Owner

jeffgortmaker commented Jun 12, 2024

You have a large problem, but I'm not sure it's 400GB large. A few questions to help pinpoint the issue:

  1. Where does memory peak when setting compute_gradient=False?
  2. What about when removing the nesting structure and separately setting compute_gradient=True vs. False?
  3. What about when removing the covariance moment and separately setting compute_gradient=True vs. False?
  4. What's the maximum number of products per market $\max_t J_t$?
  5. I'm guessing you have 50 agents per market, i.e. $I_t = 50$?
  6. Are you using any micro moments?
  7. Does memory usage scale fairly linearly with $T$ and $I_t$?

jeffgortmaker reopened this on Jun 12, 2024
jeffgortmaker added the question label on Jun 12, 2024
@zhijianli9999

  1. There's a spike in memory at every optimization iteration: memory stays very low for most of the iteration, gradually builds up to 400GB, then drops to around 200GB for a bit before the status update line for that iteration is printed.
  2. Removing the nesting did not noticeably reduce memory use.
  3. Removing the covariance moment worked. I replaced my covariance_instruments column with a demand instrument and saw memory peak at around 0.5 GB (for both compute_gradient=True and False).
  4. There's a maximum of 10 products per market.
  5. Yes, $I_t = 50$ for each market.
  6. No, I'm not using micro moments.
  7. Memory usage seems to scale with $T^2$: removing half the markets reduced it to roughly a quarter of the original. Peak memory usage does not change with $I_t$.

@jeffgortmaker
Owner

jeffgortmaker commented Jun 12, 2024

Thanks, I think the issue is with how I was adjusting Jacobians after the linear IV regression. This is only needed with covariance moments, so it makes sense that disabling those would reduce memory usage.

Specifically, my approach created a $2N \times 2N$ "annihilator matrix". With your setting's quite large $N$, this matrix alone takes up just under 200GB, which lines up with your estimate in (1).

I just pushed a commit that should do the calculation without ever creating a large matrix (just re-ordering matrix operations). Can you (1) pip uninstall pyblp, (2) clone this repo to some directory, (3) add it to your PYTHONPATH environment variable, and (4) restart your terminal?
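
For reference, the re-ordering amounts to the usual trick of never forming the annihilator matrix explicitly; a minimal sketch with generic `Z` and `X` (not the actual pyBLP code):

```python
import numpy as np

def annihilate(Z, X):
    """Compute M @ X with M = I - Z (Z'Z)^{-1} Z', without forming the n x n matrix M."""
    return X - Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)

# Explicit version for comparison: at n = 2N ~ 160,000, the n x n matrix M alone
# is on the order of 200GB of doubles, which matches the spike you saw.
# M = np.eye(Z.shape[0]) - Z @ np.linalg.solve(Z.T @ Z, Z.T)
# MX = M @ X

# Small example:
rng = np.random.default_rng(0)
Z = rng.standard_normal((1_000, 5))
X = rng.standard_normal((1_000, 3))
MX = annihilate(Z, X)
```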

Hopefully that fixes the issue. And thanks for the report! This is definitely a useful improvement for anyone doing covariance moments with a nontrivial number of observations.

@zhijianli9999

That fixed it! Thanks so much!
