oom error #27

zoepiran · 2022-01-20T18:57:20Z

hi,

Please excuse me in advance if there is a basic misuse in the attached code/settings.
In this drive you can find the following folders:
[1] code: two code files that reproduce the OOM errors we encounter when running on GPUs. They differ in the cost object type: in 'gwlr.py' the costs are point clouds, and in 'gwlr_geom.py' we use a general Geometry.
** We noticed that after changing alpha to 0.75 we are able to avoid OOM with low rank; however, while the Geometry run returned quickly, the point-cloud calculation took >2 hrs.
[2] logs: 4 log files documenting the errors in regular (quad) and low-rank (lr) runs using point-cloud costs and Geometry (geom) objects.
[3] data: the data needed to run the code.
We realize the data may simply be too large, but we would be grateful for your input.
(I am still trying to set up a different memory allocation on our GPUs that may allow this.)

Thank you in advance :)


olivierteboul · 2022-01-24T15:18:57Z

Hi @zoepiran, would you mind shrinking your code down to a minimal example and pasting it here? It would be much easier to track down the bug. Also, I believe you can probably replace your data with random data: if the error is memory, the problem is most likely linked to the amount of data, not the actual values.
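A minimal sketch of the random-data idea, using NumPy only. The sizes, feature dimensions, and helper names (`dense_cost_bytes`, `random_inputs`) are hypothetical and not taken from the attached data; the point is only that a dense n x m cost matrix grows with the product of the two sample counts, so random arrays of the same shape should reproduce a memory problem.

```python
import numpy as np

def dense_cost_bytes(n, m, dtype=np.float32):
    """Bytes needed to hold one dense n x m cost matrix."""
    return n * m * np.dtype(dtype).itemsize

def random_inputs(n, m, d_x, d_y, d_shared, seed=0):
    """Random stand-ins with the same shapes as the real inputs."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, d_x)).astype(np.float32)         # e.g. an embedding
    y = rng.normal(size=(m, d_y)).astype(np.float32)         # e.g. spatial coordinates
    fx = rng.normal(size=(n, d_shared)).astype(np.float32)   # shared features, side x
    fy = rng.normal(size=(m, d_shared)).astype(np.float32)   # shared features, side y
    return x, y, fx, fy

# Hypothetical sizes; replace with the real number of cells/spots.
n, m = 2_000, 3_000
x, y, fx, fy = random_inputs(n, m, d_x=10, d_y=2, d_shared=50)

for name, (rows, cols) in {"xx": (n, n), "yy": (m, m), "xy": (n, m)}.items():
    print(name, dense_cost_bytes(rows, cols) / 1e6, "MB")
```

Increasing `n` and `m` until the estimate approaches the GPU memory gives a rough idea of where a dense cost matrix stops fitting.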

marcocuturi · 2022-01-24T15:22:34Z

Hi Zoe, I agree with Olivier that a working colab should be easier for us to parse.

In the meantime, maybe you could help me parse the log files. In quad-0.001-0.75-pointcloud.log I see

Traceback (most recent call last):
  File "/cs/labs/mornitzan/zoe.piran/research/projects/moscot_framework_analysis/notebooks/seqfish/gwlr.py", line 127, in <module>
    gwlr(args)
  File "/cs/labs/mornitzan/zoe.piran/research/projects/moscot_framework_analysis/notebooks/seqfish/gwlr.py", line 78, in gwlr
    ot_quad = solve(args.alpha, C1=Cexp_geom, C2=Cphys_geom, M=M_geom, epsilon=args.epsilon)

but I can't see immediately where the Cexp_geom and Cphys_geom geometry objects are instantiated?

zoepiran · 2022-01-24T15:27:39Z

Here;
this should do:

import scanpy as sc
import ott

epsilon = 1e-3
alpha = 0.5
rank_order = 6
gamma = 1

adata_sq = sc.read('~/adata_sq_processed.h5ad')
adata_sc = sc.read('~/adata_processed.h5ad')

Cphys_geom = ott.geometry.pointcloud.PointCloud(adata_sq.obsm['spatial'])
Cexp_geom = ott.geometry.pointcloud.PointCloud(adata_sc.obsm['X_scvi'])

adata_sc_tmp = adata_sc[:, adata_sc.var['marker']]
adata_spatial_tmp = adata_sq[:, adata_sc_tmp.var['SYMBOL'].astype(str).values]
M_geom = ott.geometry.pointcloud.PointCloud(adata_sc_tmp.X.toarray(), adata_spatial_tmp.X.toarray(), epsilon=epsilon)

Cphys_geom.rescale_cost(1 / Cphys_geom.mean_cost_matrix)
Cexp_geom.rescale_cost(1 / Cexp_geom.mean_cost_matrix)
M_geom.rescale_cost(1 / M_geom.mean_cost_matrix)

ot_prob = ott.core.quad_problems.QuadraticProblem(geom_xx=Cexp_geom,
                                                  geom_yy=Cphys_geom,
                                                  geom_xy=M_geom,
                                                  fused_penalty=(1 - alpha) / alpha)

# solver_type is 'lr' for the low-rank solver; any other value selects the regular (quad) solver.
solver_type = 'lr'
if solver_type == 'lr':
    n, m = M_geom.shape
    rank = int(min(n, m) / rank_order)
    solver = ott.core.gromov_wasserstein.GromovWasserstein(rank=rank, gamma=gamma)
else:
    solver = ott.core.gromov_wasserstein.GromovWasserstein(epsilon=epsilon)

ot_gw = solver(ot_prob)

zoepiran · 2022-01-24T15:30:44Z

It's missing from the log files. I will open a colab real quick (based on this notebook).

marcocuturi · 2022-01-24T15:34:19Z

I suspect the issue lies in this

Cphys_geom.rescale_cost(1 / Cphys_geom.mean_cost_matrix)
Cexp_geom.rescale_cost(1 / Cexp_geom.mean_cost_matrix)
M_geom.rescale_cost(1 / M_geom.mean_cost_matrix)

Here, mean_cost_matrix is fairly naive and will instantiate the entire cost matrix.

We might be able to come up with an efficient implementation of this for squared Euclidean distance, but at this point I would suggest directly rescaling the entry vectors in geom.x and geom.y when you instantiate the geometries.

Additionally, we just removed the rescale_cost function, BTW, but we might add it back if needed!
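A minimal sketch of what rescaling the entry vectors could look like, assuming the cost is the squared Euclidean distance. It is written in plain NumPy and the helper names (`mean_sq_euclidean_cost`, `rescale_points`) are hypothetical, not part of ott. It relies on the identity mean over (i, j) of ||x_i - y_j||^2 = mean ||x_i||^2 + mean ||y_j||^2 - 2 <mean(x), mean(y)>, so the mean cost is obtained without ever building the n x m matrix. Since multiplying the points by s multiplies every squared distance by s^2, dividing the points by the square root of the mean cost yields a mean cost of 1.

```python
import numpy as np

def mean_sq_euclidean_cost(x, y=None):
    """Mean of ||x_i - y_j||^2 over all pairs, without forming the cost matrix."""
    x = np.asarray(x, dtype=np.float64)
    y = x if y is None else np.asarray(y, dtype=np.float64)
    mean_sq_x = np.mean(np.sum(x ** 2, axis=1))
    mean_sq_y = np.mean(np.sum(y ** 2, axis=1))
    cross = np.dot(x.mean(axis=0), y.mean(axis=0))
    return mean_sq_x + mean_sq_y - 2.0 * cross

def rescale_points(x, y=None):
    """Scale the points so that the mean squared Euclidean cost becomes 1."""
    scale = 1.0 / np.sqrt(mean_sq_euclidean_cost(x, y))
    if y is None:
        return np.asarray(x) * scale
    return np.asarray(x) * scale, np.asarray(y) * scale

# Small random example (illustrative data only).
rng = np.random.default_rng(0)
pts_a = rng.normal(size=(6, 3))
pts_b = rng.normal(size=(4, 3))
pts_a_scaled, pts_b_scaled = rescale_points(pts_a, pts_b)
print(mean_sq_euclidean_cost(pts_a_scaled, pts_b_scaled))
```

The rescaled arrays can then be passed to the geometry constructor in place of the raw ones. The mean here includes the zero diagonal when a single point cloud is compared with itself, which is assumed to match what a plain mean over the full cost matrix would give.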

zoepiran · 2022-01-24T16:21:25Z

Excuse my delay; I created the following colab.
When running the colab, Low Rank does not hit OOM; however, the returned matrix is either all NaNs or does not define a valid coupling (i.e. .sum() ≠ 1), and it does not respect the desired (uniform) marginals .a, .b.
@olivierteboul regarding general random data: I agree, and we could do general benchmarking experiments to test maximal inputs, but I assume it's best to first verify that all functions are acting properly.
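For reference, the checks described above can be sketched as a small NumPy helper. The function name `check_coupling` and the tolerance are hypothetical; `P` stands for the dense transport matrix, however it is obtained from the solver output, and `a`, `b` are the desired marginals.

```python
import numpy as np

def check_coupling(P, a, b, atol=1e-6):
    """Report whether P is NaN-free, sums to 1, and matches marginals a and b."""
    P = np.asarray(P, dtype=np.float64)
    if np.isnan(P).any():
        return {"has_nan": True, "valid": False}
    total = float(P.sum())
    row_err = float(np.abs(P.sum(axis=1) - a).max())   # deviation from a
    col_err = float(np.abs(P.sum(axis=0) - b).max())   # deviation from b
    valid = bool(abs(total - 1.0) <= atol and row_err <= atol and col_err <= atol)
    return {
        "has_nan": False,
        "total_mass": total,
        "row_marginal_err": row_err,
        "col_marginal_err": col_err,
        "valid": valid,
    }

# Illustrative check on the independent coupling of two uniform marginals.
n_rows, n_cols = 4, 5
a_unif = np.full(n_rows, 1.0 / n_rows)
b_unif = np.full(n_cols, 1.0 / n_cols)
print(check_coupling(np.outer(a_unif, b_unif), a_unif, b_unif))
```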
