Cherry pick PR #7003 (#7441)
* Pass tp config via hydra

Signed-off-by: Jan Baczek <[email protected]>

* Remove self.ub_cfgs field - it isn't used anywhere else

Signed-off-by: Jan Baczek <[email protected]>

* Allow tp_overlap tree substitution in hydra config

Signed-off-by: Jan Baczek <[email protected]>

* Add warning in case of usage of the default tp config

Signed-off-by: Jan Baczek <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jan Baczek <[email protected]>

* Change warning message

Signed-off-by: Jan Baczek <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: Jan Baczek <[email protected]>

* Add compute capability resolver

Signed-off-by: Jan Baczek <[email protected]>

* Bugfix

Signed-off-by: Jan Baczek <[email protected]>

* Fix cherry pick

Signed-off-by: Jan Baczek <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Jan Baczek <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2 people authored and web-flow committed Sep 18, 2023
1 parent 33f5b9f commit f285ae2
Showing 2 changed files with 1 addition and 5 deletions.
@@ -551,7 +551,6 @@ def initialize_ub_func(self):
             self.cfg.get('encoder_seq_length') * self.cfg.get('micro_batch_size'),
             self.cfg.get('hidden_size'),
         ]
-
         te_module.base.initialize_ub(
             shape=input_shape,
             tp_size=self.cfg.get('tensor_model_parallel_size'),
5 changes: 1 addition & 4 deletions nemo/core/config/hydra_runner.py
@@ -25,10 +25,7 @@


 def _get_gpu_name():
-    try:
-        import pynvml
-    except (ImportError, ModuleNotFoundError):
-        return None
+    import pynvml
 
     pynvml.nvmlInit()
     handle = pynvml.nvmlDeviceGetHandleByIndex(0)
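
The hunk above replaces a guarded `import pynvml` with an unconditional one, so calling `_get_gpu_name` on a machine without `pynvml` now raises `ImportError` instead of returning `None`. For comparison, here is a minimal sketch of what a guarded compute-capability lookup could look like. The names `gpu_name_from_capability` and `get_gpu_name` and the capability-to-name table are hypothetical and not taken from this diff; the rest of the real function is not shown in the hunk.

```python
from typing import Optional

# Hypothetical mapping from CUDA compute-capability major version to a
# config key; the actual values used by the commit are not shown in the diff.
_CAPABILITY_TO_NAME = {8: "a100", 9: "h100"}


def gpu_name_from_capability(major: int) -> Optional[str]:
    """Translate a compute-capability major version into a config key."""
    return _CAPABILITY_TO_NAME.get(major)


def get_gpu_name() -> Optional[str]:
    """Return a config key for GPU 0, or None if it cannot be determined."""
    try:
        import pynvml  # optional dependency; may be absent on CPU-only hosts
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return None

    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        # Library is installed but no usable NVIDIA driver was found.
        return None

    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(0)
        major, _minor = pynvml.nvmlDeviceGetCudaComputeCapability(handle)
    except pynvml.NVMLError:
        return None
    finally:
        # Runs on both the success and the error path after a successful init.
        pynvml.nvmlShutdown()

    return gpu_name_from_capability(major)
```

Returning `None` on every failure path keeps the lookup safe to call during config resolution; the unconditional import in the commit instead surfaces a missing `pynvml` as an immediate error.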