Is your feature request related to a problem? Please describe.
Issue #224 reported problems with a multi-GPU setup and #234 introduced a quick fix. However, after commenting out the fix, I can no longer reproduce the earlier problems.
Describe the solution you'd like
I wonder if the quick fix that avoids applying DataParallel twice can be removed by deleting the following three lines of code in the reader and ranker nodes:
```python
# Round-trip the model through disk: saving and reloading strips the
# DataParallel wrapper so it is not applied a second time later on.
self.inferencer.model.save("tmp_model")
model = BaseAdaptiveModel.load(load_dir="tmp_model", device=device, strict=True)
shutil.rmtree("tmp_model")
```
Throughout the rest of the code, `model` should then be replaced with `self.inferencer.model` if these lines are removed.
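For reference, the effect of the disk round-trip (stripping a DataParallel wrapper) can also be achieved in memory. This is a minimal sketch for illustration only, not part of the proposal; `unwrap` is a hypothetical helper:

```python
import torch
from torch import nn

def unwrap(model: nn.Module) -> nn.Module:
    """Return the underlying module if `model` is wrapped in (Distributed)DataParallel."""
    if isinstance(model, (nn.DataParallel, nn.parallel.DistributedDataParallel)):
        return model.module
    return model

# Toy example: unwrapping returns the original module; plain modules pass through.
net = nn.Linear(4, 2)
assert unwrap(nn.DataParallel(net)) is net
assert unwrap(net) is net
```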
Additional context
Having apex installed on a machine with 4 GPUs and running tutorial 5 with `python -m torch.distributed.launch`, I couldn't find any difference in the logging output with or without the quick fix. It seems to run fine, but I have never used apex before, so I might have overlooked something. I also could not find a check in FARM's optimize_model() that prevents applying DataParallel there if it was already applied before (a sketch of such a guard follows below): https://github.com/deepset-ai/FARM/blob/816b4e3e65c142f8a31a63833058b75fe0419ed4/farm/modeling/optimization.py#L272
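For context, a guard of that kind could look like the following. This is a hedged sketch of what such a check might do, not FARM's actual code; `maybe_wrap_data_parallel` is a hypothetical helper:

```python
import torch
from torch import nn

def maybe_wrap_data_parallel(model: nn.Module, device: torch.device) -> nn.Module:
    """Wrap `model` in DataParallel for multi-GPU use, unless it is already wrapped."""
    if isinstance(model, (nn.DataParallel, nn.parallel.DistributedDataParallel)):
        return model  # already wrapped: avoid applying DataParallel twice
    if device.type == "cuda" and torch.cuda.device_count() > 1:
        return nn.DataParallel(model)
    return model
```

With such a check in place, calling optimize_model() on an already-wrapped model would be a no-op with respect to wrapping, which would make the disk round-trip in the reader and ranker nodes unnecessary.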