Add profile decorator to simpa utils #241

Merged
merged 3 commits into from
Jun 12, 2024

Contributor

@lkeegan lkeegan commented Sep 20, 2023

This is a convenience wrapper around several profilers. I'm opening a PR in case you also find it useful.

  • Allows @profile decorator to be added to functions
  • By default this decorator does nothing
  • But if the SIMPA_PROFILE environment variable is set to:
    • TIME: line_profiler is used for line-by-line run-time profiling
    • MEMORY: memory_profiler is used for line-by-line RAM use profiling
    • GPU_MEMORY: pytorch_memlab is used for line-by-line GPU RAM profiling
  • Profiling output is written to the console when the script finishes
  • For GPU_MEMORY, a summary of gpu memory use (torch.cuda.memory_summary()) is also written to the console
  • Add these profiling dependencies to tool.poetry.group.profile.dependencies
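
For readers who want to see the shape of such a wrapper, here is a minimal sketch of the dispatch described in the list above. It is not necessarily the code in this PR: `make_profile_decorator` is a hypothetical helper name, and the sketch assumes that `line_profiler.LineProfiler` instances can be used directly as decorators and that `memory_profiler` and `pytorch_memlab` each export a `profile` decorator. The profiler packages are imported lazily, so nothing extra is needed when the variable is unset.

```python
import atexit
import os


def make_profile_decorator(mode=None):
    """Return a decorator for the requested profiling mode.

    Hypothetical helper for illustration only; `mode` falls back to the
    SIMPA_PROFILE environment variable when it is not given.
    """
    if mode is None:
        mode = os.getenv("SIMPA_PROFILE")

    if mode is None:
        # Default: the decorator does nothing and returns the function unchanged.
        return lambda func: func

    if mode == "TIME":
        # Assumption: a LineProfiler instance is callable as a decorator.
        from line_profiler import LineProfiler
        profiler = LineProfiler()
        # Print the line-by-line timings when the script finishes.
        atexit.register(profiler.print_stats)
        return profiler

    if mode == "MEMORY":
        # Assumption: memory_profiler exports a ready-made `profile` decorator.
        from memory_profiler import profile as memory_profile
        return memory_profile

    if mode == "GPU_MEMORY":
        # Assumption: pytorch_memlab exports a ready-made `profile` decorator.
        import torch
        from pytorch_memlab import profile as gpu_profile
        # Also print the CUDA memory summary when the script finishes.
        atexit.register(lambda: print(torch.cuda.memory_summary()))
        return gpu_profile

    raise ValueError(
        f"Invalid SIMPA_PROFILE value {mode!r}: expected TIME, MEMORY or GPU_MEMORY"
    )


# Module-level decorator, resolved once at import time.
profile = make_profile_decorator()


@profile
def example(n):
    # Stand-in workload; any function or method can be decorated the same way.
    return sum(i * i for i in range(n))
```

Resolving the mode once at import time keeps the undecorated path free of overhead: with the variable unset, `profile(f)` returns `f` itself.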

Please check the following before creating the pull request (PR):

  • Did you run automatic tests?
  • Did you run manual tests?
  • Is the code provided in the PR still backwards compatible to previous SIMPA versions?

List any specific code review questions

List any special testing requirements

Additional context (e.g. papers, documentation, blog posts, ...)

Provide issue / feature request fixed by this PR

Example output after adding the @profile decorator to get_enclosed_indices in utils/libraries/structure_library/EllipticalTubularStructure.py:

With SIMPA_PROFILE=TIME:

Total time: 1.13564 s
File: /export/home/lkeegan/simpa/simpa/utils/libraries/structure_library/EllipticalTubularStructure.py
Function: get_enclosed_indices at line 54

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    54                                               @profile
    55                                               def get_enclosed_indices(self):
    56         4       8631.0   2157.8      0.0          start_mm, end_mm, radius_mm, eccentricity, partial_volume = self.params
    57         4     914321.0 228580.2      0.1          start_mm = torch.tensor(start_mm, dtype=torch.float).to(self.torch_device)
    58         4     116221.0  29055.2      0.0          end_mm = torch.tensor(end_mm, dtype=torch.float).to(self.torch_device)
    59         4     122881.0  30720.2      0.0          radius_mm = torch.tensor(radius_mm, dtype=torch.float).to(self.torch_device)
    60         4      92632.0  23158.0      0.0          eccentricity = torch.tensor(eccentricity, dtype=torch.float).to(self.torch_device)
    61         4      96431.0  24107.8      0.0          partial_volume = torch.tensor(partial_volume, dtype=torch.float).to(self.torch_device)
    62                                           
    63         4     263872.0  65968.0      0.0          start_voxels = start_mm / self.voxel_spacing
    64         4      43171.0  10792.8      0.0          end_voxels = end_mm / self.voxel_spacing
    65         4      39302.0   9825.5      0.0          radius_voxels = radius_mm / self.voxel_spacing
    66                                           
    67         4     277034.0  69258.5      0.0          x, y, z = torch.meshgrid(torch.arange(self.volume_dimensions_voxels[0]).to(self.torch_device),
    68         4      92551.0  23137.8      0.0                                   torch.arange(self.volume_dimensions_voxels[1]).to(self.torch_device),
    69         4      90311.0  22577.8      0.0                                   torch.arange(self.volume_dimensions_voxels[2]).to(self.torch_device),
    70         4       1770.0    442.5      0.0                                   indexing='ij')
    71                                           
    72         4     313413.0  78353.2      0.0          x = x + 0.5
    73         4      58640.0  14660.0      0.0          y = y + 0.5
    74         4      46590.0  11647.5      0.0          z = z + 0.5
    75                                           
    76         4    6628647.0 1657161.8      0.6          if partial_volume:
    77         4       1250.0    312.5      0.0              radius_margin = 0.5
    78                                                   else:
    79                                                       radius_margin = 0.7071
    80                                           
    81         4     357833.0  89458.2      0.0          target_vector = torch.subtract(torch.stack([x, y, z], axis=-1), start_voxels)
    82         4       1930.0    482.5      0.0          if self.do_deformation:
    83                                                       # the deformation functional needs mm as inputs and returns the result in reverse indexing order...
    84         4    8636566.0 2159141.5      0.8              deformation_values_mm = self.deformation_functional_mm(torch.arange(self.volume_dimensions_voxels[0]) *
    85         4       1030.0    257.5      0.0                                                                     self.voxel_spacing,
    86         4      35750.0   8937.5      0.0                                                                     torch.arange(self.volume_dimensions_voxels[1]) *
    87         4       4641.0   1160.2      0.0                                                                     self.voxel_spacing).T
    88         4       8921.0   2230.2      0.0              deformation_values_mm = deformation_values_mm.reshape(self.volume_dimensions_voxels[0],
    89         4       1460.0    365.0      0.0                                                                    self.volume_dimensions_voxels[1], 1, 1)
    90         8   23343404.0 2917925.5      2.1              deformation_values_mm = torch.tile(torch.from_numpy(deformation_values_mm).to(
    91         8       4270.0    533.8      0.0                  self.torch_device), (1, 1, self.volume_dimensions_voxels[2], 3))
    92         4    5772348.0 1443087.0      0.5              target_vector = (target_vector + (deformation_values_mm / self.voxel_spacing)).float()
    93         4      50350.0  12587.5      0.0          cylinder_vector = torch.subtract(end_voxels, start_voxels)
    94                                           
    95         4     503735.0 125933.8      0.0          main_axis_length = radius_voxels/(1-eccentricity**2)**0.25
    96         4   44072171.0 11018042.8      3.9          main_axis_vector = torch.tensor([cylinder_vector[1], -cylinder_vector[0], 0]).to(self.torch_device)
    97         4     467274.0 116818.5      0.0          main_axis_vector = main_axis_vector/torch.linalg.norm(main_axis_vector) * main_axis_length
    98                                           
    99         4     305813.0  76453.2      0.0          minor_axis_length = main_axis_length*torch.sqrt(1-eccentricity**2)
   100         4     180841.0  45210.2      0.0          minor_axis_vector = torch.cross(cylinder_vector, main_axis_vector)
   101         4     129381.0  32345.2      0.0          minor_axis_vector = minor_axis_vector / torch.linalg.norm(minor_axis_vector) * minor_axis_length
   102                                           
   103         4  435438607.0 108859651.8     38.3          dot_product = torch.matmul(target_vector, cylinder_vector)/torch.linalg.norm(cylinder_vector)
   104                                           
   105         4     284523.0  71130.8      0.0          target_vector_projection = torch.multiply(dot_product[:, :, :, None], cylinder_vector)
   106         4      59810.0  14952.5      0.0          target_vector_from_projection = target_vector - target_vector_projection
   107                                           
   108         4     204562.0  51140.5      0.0          main_projection = torch.matmul(target_vector_from_projection, main_axis_vector) / main_axis_length
   109                                           
   110         4     131242.0  32810.5      0.0          minor_projection = torch.matmul(target_vector_from_projection, minor_axis_vector) / minor_axis_length
   111                                           
   112         4     253364.0  63341.0      0.0          radius_crit = torch.sqrt(((main_projection/main_axis_length)**2 + (minor_projection/minor_axis_length)**2) *
   113         4      38501.0   9625.2      0.0                                   radius_voxels**2)
   114                                           
   115         4  417708728.0 104427182.0     36.8          volume_fractions = torch.zeros(tuple(self.volume_dimensions_voxels), dtype=torch.float).to(self.torch_device)
   116         4     864170.0 216042.5      0.1          filled_mask = radius_crit <= radius_voxels - 1 + radius_margin
   117         4     146841.0  36710.2      0.0          border_mask = (radius_crit > radius_voxels - 1 + radius_margin) & \
   118         4     100772.0  25193.0      0.0                        (radius_crit < radius_voxels + 2 * radius_margin)
   119                                           
   120         4     261661.0  65415.2      0.0          volume_fractions[filled_mask] = 1
   121         4   17686319.0 4421579.8      1.6          volume_fractions[border_mask] = 1 - (radius_crit - (radius_voxels - radius_margin))[border_mask]
   122         4     151731.0  37932.8      0.0          volume_fractions[volume_fractions < 0] = 0
   123         4      80371.0  20092.8      0.0          volume_fractions[volume_fractions < 0] = 0
   124                                           
   125         4   12488797.0 3122199.2      1.1          if partial_volume:
   126                                           
   127         4      91050.0  22762.5      0.0              mask = filled_mask | border_mask
   128                                                   else:
   129                                                       mask = filled_mask
   130                                           
   131         4  156560720.0 39140180.0     13.8          return mask.cpu().numpy(), volume_fractions[mask].cpu().numpy()
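
As a reading aid for the table above, the sketch below recomputes the Per Hit and % Time columns for source line 103 from its Hits and Time values. The "Timer unit" header line is not part of the pasted output, so the 1 ns unit used here is an assumption, chosen because it makes the Time column add up to roughly the reported total of 1.13564 s.

```python
# Values copied from the row for source line 103 in the table above.
hits = 4
time_units = 435438607.0      # "Time" column, in timer units
total_seconds = 1.13564       # "Total time" header

# Assumption: one timer unit is 1 ns (the "Timer unit" line is not shown).
timer_unit_seconds = 1e-9

per_hit = time_units / hits
percent_time = 100 * time_units * timer_unit_seconds / total_seconds

print(f"Per Hit: {per_hit:.1f}")
print(f"% Time:  {percent_time:.1f}")
```

Both values agree with the row as printed (108859651.8 and 38.3).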

With SIMPA_PROFILE=MEMORY:

Filename: /export/home/lkeegan/simpa/simpa/utils/libraries/structure_library/EllipticalTubularStructure.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    54   3507.3 MiB   3507.3 MiB           1       @profile
    55                                             def get_enclosed_indices(self):
    56   3507.3 MiB      0.0 MiB           1           start_mm, end_mm, radius_mm, eccentricity, partial_volume = self.params
    57   3507.3 MiB      0.0 MiB           1           start_mm = torch.tensor(start_mm, dtype=torch.float).to(self.torch_device)
    58   3507.3 MiB      0.0 MiB           1           end_mm = torch.tensor(end_mm, dtype=torch.float).to(self.torch_device)
    59   3507.3 MiB      0.0 MiB           1           radius_mm = torch.tensor(radius_mm, dtype=torch.float).to(self.torch_device)
    60   3507.3 MiB      0.0 MiB           1           eccentricity = torch.tensor(eccentricity, dtype=torch.float).to(self.torch_device)
    61   3507.3 MiB      0.0 MiB           1           partial_volume = torch.tensor(partial_volume, dtype=torch.float).to(self.torch_device)
    62                                         
    63   3507.3 MiB      0.0 MiB           1           start_voxels = start_mm / self.voxel_spacing
    64   3507.3 MiB      0.0 MiB           1           end_voxels = end_mm / self.voxel_spacing
    65   3507.3 MiB      0.0 MiB           1           radius_voxels = radius_mm / self.voxel_spacing
    66                                         
    67   3507.3 MiB      0.0 MiB           2           x, y, z = torch.meshgrid(torch.arange(self.volume_dimensions_voxels[0]).to(self.torch_device),
    68   3507.3 MiB      0.0 MiB           1                                    torch.arange(self.volume_dimensions_voxels[1]).to(self.torch_device),
    69   3507.3 MiB      0.0 MiB           1                                    torch.arange(self.volume_dimensions_voxels[2]).to(self.torch_device),
    70   3507.3 MiB      0.0 MiB           1                                    indexing='ij')
    71                                         
    72   3507.3 MiB      0.0 MiB           1           x = x + 0.5
    73   3507.3 MiB      0.0 MiB           1           y = y + 0.5
    74   3507.3 MiB      0.0 MiB           1           z = z + 0.5
    75                                         
    76   3507.3 MiB      0.0 MiB           1           if partial_volume:
    77   3507.3 MiB      0.0 MiB           1               radius_margin = 0.5
    78                                                 else:
    79                                                     radius_margin = 0.7071
    80                                         
    81   3507.3 MiB      0.0 MiB           1           target_vector = torch.subtract(torch.stack([x, y, z], axis=-1), start_voxels)
    82   3507.3 MiB      0.0 MiB           1           if self.do_deformation:
    83                                                     # the deformation functional needs mm as inputs and returns the result in reverse indexing order...
    84   3507.3 MiB      0.0 MiB           4               deformation_values_mm = self.deformation_functional_mm(torch.arange(self.volume_dimensions_voxels[0]) *
    85   3507.3 MiB      0.0 MiB           1                                                                      self.voxel_spacing,
    86   3507.3 MiB      0.0 MiB           2                                                                      torch.arange(self.volume_dimensions_voxels[1]) *
    87   3507.3 MiB      0.0 MiB           2                                                                      self.voxel_spacing).T
    88   3507.3 MiB      0.0 MiB           2               deformation_values_mm = deformation_values_mm.reshape(self.volume_dimensions_voxels[0],
    89   3507.3 MiB      0.0 MiB           1                                                                     self.volume_dimensions_voxels[1], 1, 1)
    90   3507.7 MiB      0.4 MiB           3               deformation_values_mm = torch.tile(torch.from_numpy(deformation_values_mm).to(
    91   3507.3 MiB      0.0 MiB           2                   self.torch_device), (1, 1, self.volume_dimensions_voxels[2], 3))
    92   3507.7 MiB      0.0 MiB           1               target_vector = (target_vector + (deformation_values_mm / self.voxel_spacing)).float()
    93   3507.7 MiB      0.0 MiB           1           cylinder_vector = torch.subtract(end_voxels, start_voxels)
    94                                         
    95   3507.7 MiB      0.0 MiB           1           main_axis_length = radius_voxels/(1-eccentricity**2)**0.25
    96   3507.7 MiB      0.0 MiB           1           main_axis_vector = torch.tensor([cylinder_vector[1], -cylinder_vector[0], 0]).to(self.torch_device)
    97   3507.7 MiB      0.0 MiB           1           main_axis_vector = main_axis_vector/torch.linalg.norm(main_axis_vector) * main_axis_length
    98                                         
    99   3507.7 MiB      0.0 MiB           1           minor_axis_length = main_axis_length*torch.sqrt(1-eccentricity**2)
   100   3507.7 MiB      0.0 MiB           1           minor_axis_vector = torch.cross(cylinder_vector, main_axis_vector)
   101   3507.7 MiB      0.0 MiB           1           minor_axis_vector = minor_axis_vector / torch.linalg.norm(minor_axis_vector) * minor_axis_length
   102                                         
   103   4379.6 MiB    871.9 MiB           1           dot_product = torch.matmul(target_vector, cylinder_vector)/torch.linalg.norm(cylinder_vector)
   104                                         
   105   4379.6 MiB      0.0 MiB           1           target_vector_projection = torch.multiply(dot_product[:, :, :, None], cylinder_vector)
   106   4379.6 MiB      0.0 MiB           1           target_vector_from_projection = target_vector - target_vector_projection
   107                                         
   108   4379.6 MiB      0.0 MiB           1           main_projection = torch.matmul(target_vector_from_projection, main_axis_vector) / main_axis_length
   109                                         
   110   4379.6 MiB      0.0 MiB           1           minor_projection = torch.matmul(target_vector_from_projection, minor_axis_vector) / minor_axis_length
   111                                         
   112   4379.6 MiB      0.0 MiB           2           radius_crit = torch.sqrt(((main_projection/main_axis_length)**2 + (minor_projection/minor_axis_length)**2) *
   113   4379.6 MiB      0.0 MiB           1                                    radius_voxels**2)
   114                                         
   115   4380.0 MiB      0.4 MiB           1           volume_fractions = torch.zeros(tuple(self.volume_dimensions_voxels), dtype=torch.float).to(self.torch_device)
   116   4380.0 MiB      0.0 MiB           1           filled_mask = radius_crit <= radius_voxels - 1 + radius_margin
   117   4380.0 MiB      0.0 MiB           2           border_mask = (radius_crit > radius_voxels - 1 + radius_margin) & \
   118   4380.0 MiB      0.0 MiB           1                         (radius_crit < radius_voxels + 2 * radius_margin)
   119                                         
   120   4380.0 MiB      0.0 MiB           1           volume_fractions[filled_mask] = 1
   121   4380.0 MiB      0.0 MiB           1           volume_fractions[border_mask] = 1 - (radius_crit - (radius_voxels - radius_margin))[border_mask]
   122   4380.0 MiB      0.0 MiB           1           volume_fractions[volume_fractions < 0] = 0
   123   4380.0 MiB      0.0 MiB           1           volume_fractions[volume_fractions < 0] = 0
   124                                         
   125   4380.0 MiB      0.0 MiB           1           if partial_volume:
   126                                         
   127   4380.0 MiB      0.0 MiB           1               mask = filled_mask | border_mask
   128                                                 else:
   129                                                     mask = filled_mask
   130                                         
   131   4470.8 MiB     90.7 MiB           1           return mask.cpu().numpy(), volume_fractions[mask].cpu().numpy()

With SIMPA_PROFILE=GPU_MEMORY:

## EllipticalTubularStructure.get_enclosed_indices

active_bytes reserved_bytes line code                                                                                                                  
         all            all                                                                                                                            
        peak           peak                                                                                                                            
       8.45G         11.73G   54     @profile                                                                                                          
                              55     def get_enclosed_indices(self):                                                                                   
       8.12M         11.73G   56         start_mm, end_mm, radius_mm, eccentricity, partial_volume = self.params                                       
       8.13M         11.73G   57         start_mm = torch.tensor(start_mm, dtype=torch.float).to(self.torch_device)                                    
       8.13M         11.73G   58         end_mm = torch.tensor(end_mm, dtype=torch.float).to(self.torch_device)                                        
       8.13M         11.73G   59         radius_mm = torch.tensor(radius_mm, dtype=torch.float).to(self.torch_device)                                  
       8.13M         11.73G   60         eccentricity = torch.tensor(eccentricity, dtype=torch.float).to(self.torch_device)                            
       8.13M         11.73G   61         partial_volume = torch.tensor(partial_volume, dtype=torch.float).to(self.torch_device)                        
                              62                                                                                                                       
       8.13M         11.73G   63         start_voxels = start_mm / self.voxel_spacing                                                                  
       8.13M         11.73G   64         end_voxels = end_mm / self.voxel_spacing                                                                      
       8.13M         11.73G   65         radius_voxels = radius_mm / self.voxel_spacing                                                                
                              66                                                                                                                       
       8.14M         11.73G   67         x, y, z = torch.meshgrid(torch.arange(self.volume_dimensions_voxels[0]).to(self.torch_device),                
       8.14M         11.73G   68                                  torch.arange(self.volume_dimensions_voxels[1]).to(self.torch_device),                
       8.14M         11.73G   69                                  torch.arange(self.volume_dimensions_voxels[2]).to(self.torch_device),                
       8.14M         11.73G   70                                  indexing='ij')                                                                       
                              71                                                                                                                       
     372.06M         11.73G   72         x = x + 0.5                                                                                                   
     735.98M         11.73G   73         y = y + 0.5                                                                                                   
       1.07G         11.73G   74         z = z + 0.5                                                                                                   
                              75                                                                                                                       
       1.07G         11.73G   76         if partial_volume:                                                                                            
       1.07G         11.73G   77             radius_margin = 0.5                                                                                       
                              78         else:                                                                                                         
                              79             radius_margin = 0.7071                                                                                    
                              80                                                                                                                       
       3.21G         11.73G   81         target_vector = torch.subtract(torch.stack([x, y, z], axis=-1), start_voxels)                                 
       2.14G         11.73G   82         if self.do_deformation:                                                                                       
                              83             # the deformation functional needs mm as inputs and returns the result in reverse indexing order...       
       2.14G         11.73G   84             deformation_values_mm = self.deformation_functional_mm(torch.arange(self.volume_dimensions_voxels[0]) *   
       2.14G         11.73G   85                                                                    self.voxel_spacing,                                
       2.14G         11.73G   86                                                                    torch.arange(self.volume_dimensions_voxels[1]) *   
       2.14G         11.73G   87                                                                    self.voxel_spacing).T                              
       2.14G         11.73G   88             deformation_values_mm = deformation_values_mm.reshape(self.volume_dimensions_voxels[0],                   
       2.14G         11.73G   89                                                                   self.volume_dimensions_voxels[1], 1, 1)             
       4.27G         11.73G   90             deformation_values_mm = torch.tile(torch.from_numpy(deformation_values_mm).to(                            
       2.14G         11.73G   91                 self.torch_device), (1, 1, self.volume_dimensions_voxels[2], 3))                                      
       8.54G         11.73G   92             target_vector = (target_vector + (deformation_values_mm / self.voxel_spacing)).float()                    
       4.27G         11.73G   93         cylinder_vector = torch.subtract(end_voxels, start_voxels)                                                    
                              94                                                                                                                       
       4.27G         11.73G   95         main_axis_length = radius_voxels/(1-eccentricity**2)**0.25                                                    
       4.27G         11.73G   96         main_axis_vector = torch.tensor([cylinder_vector[1], -cylinder_vector[0], 0]).to(self.torch_device)           
       4.27G         11.73G   97         main_axis_vector = main_axis_vector/torch.linalg.norm(main_axis_vector) * main_axis_length                    
                              98                                                                                                                       
       4.27G         11.73G   99         minor_axis_length = main_axis_length*torch.sqrt(1-eccentricity**2)                                            
       4.27G         11.73G  100         minor_axis_vector = torch.cross(cylinder_vector, main_axis_vector)                                            
       4.27G         11.73G  101         minor_axis_vector = minor_axis_vector / torch.linalg.norm(minor_axis_vector) * minor_axis_length              
                             102                                                                                                                       
       4.98G         11.73G  103         dot_product = torch.matmul(target_vector, cylinder_vector)/torch.linalg.norm(cylinder_vector)                 
                             104                                                                                                                       
       5.70G         11.73G  105         target_vector_projection = torch.multiply(dot_product[:, :, :, None], cylinder_vector)                        
       6.76G         11.73G  106         target_vector_from_projection = target_vector - target_vector_projection                                      
                             107                                                                                                                       
       7.47G         11.73G  108         main_projection = torch.matmul(target_vector_from_projection, main_axis_vector) / main_axis_length            
                             109                                                                                                                       
       7.83G         11.73G  110         minor_projection = torch.matmul(target_vector_from_projection, minor_axis_vector) / minor_axis_length         
                             111                                                                                                                       
       8.54G         11.73G  112         radius_crit = torch.sqrt(((main_projection/main_axis_length)**2 + (minor_projection/minor_axis_length)**2) *  
       7.83G         11.73G  113                                  radius_voxels**2)                                                                    
                             114                                                                                                                       
       8.18G         11.73G  115         volume_fractions = torch.zeros(tuple(self.volume_dimensions_voxels), dtype=torch.float).to(self.torch_device) 
       8.27G         11.73G  116         filled_mask = radius_crit <= radius_voxels - 1 + radius_margin                                                
       8.54G         11.73G  117         border_mask = (radius_crit > radius_voxels - 1 + radius_margin) & \                                           
       8.45G         11.73G  118                       (radius_crit < radius_voxels + 2 * radius_margin)                                               
                             119                                                                                                                       
       8.36G         11.73G  120         volume_fractions[filled_mask] = 1                                                                             
       8.72G         11.73G  121         volume_fractions[border_mask] = 1 - (radius_crit - (radius_voxels - radius_margin))[border_mask]              
       8.45G         11.73G  122         volume_fractions[volume_fractions < 0] = 0                                                                    
       8.45G         11.73G  123         volume_fractions[volume_fractions < 0] = 0                                                                    
                             124                                                                                                                       
       8.36G         11.73G  125         if partial_volume:                                                                                            
                             126                                                                                                                       
       8.45G         11.73G  127             mask = filled_mask | border_mask                                                                          
                             128         else:                                                                                                         
                             129             mask = filled_mask                                                                                        
                             130                                                                                                                       
       8.45G         11.73G  131         return mask.cpu().numpy(), volume_fractions[mask].cpu().numpy()                                               
|===========================================================================|
|                  PyTorch CUDA memory summary, device ID 0                 |
|---------------------------------------------------------------------------|
|            CUDA OOMs: 0            |        cudaMalloc retries: 0         |
|===========================================================================|
|        Metric         | Cur Usage  | Peak Usage | Tot Alloc  | Tot Freed  |
|---------------------------------------------------------------------------|
| Allocated memory      |   8320 KiB |   8652 MiB |  10072 MiB |  18716 MiB |
|       from large pool |   8320 KiB |   8652 MiB |  10072 MiB |  18716 MiB |
|       from small pool |      0 KiB |      0 MiB |      0 MiB |      0 MiB |
|---------------------------------------------------------------------------|
| Active memory         |   8320 KiB |   8652 MiB |  10072 MiB |  18716 MiB |
|       from large pool |   8320 KiB |   8652 MiB |  10072 MiB |  18716 MiB |
|       from small pool |      0 KiB |      0 MiB |      0 MiB |      0 MiB |
|---------------------------------------------------------------------------|
| Requested memory      |   8320 KiB |   8651 MiB |  10070 MiB |  18713 MiB |
|       from large pool |   8320 KiB |   8651 MiB |  10070 MiB |  18713 MiB |
|       from small pool |      0 KiB |      0 MiB |      0 MiB |      0 MiB |
|---------------------------------------------------------------------------|
| GPU reserved memory   |   2184 MiB |  12016 MiB |      0 B   |   9832 MiB |
|       from large pool |   2184 MiB |  12012 MiB |      0 B   |   9828 MiB |
|       from small pool |      0 MiB |      4 MiB |      0 B   |      4 MiB |
|---------------------------------------------------------------------------|
| Non-releasable memory |   2175 MiB |   5907 MiB |   9893 MiB |   8895 MiB |
|       from large pool |   2175 MiB |   5907 MiB |   9889 MiB |   8889 MiB |
|       from small pool |      0 MiB |      1 MiB |      4 MiB |      6 MiB |
|---------------------------------------------------------------------------|
| Allocations           |       1    |      29    |      54    |      82    |
|       from large pool |       1    |      16    |      26    |      41    |
|       from small pool |       0    |      13    |      28    |      41    |
|---------------------------------------------------------------------------|
| Active allocs         |       1    |      29    |      54    |      82    |
|       from large pool |       1    |      16    |      26    |      41    |
|       from small pool |       0    |      13    |      28    |      41    |
|---------------------------------------------------------------------------|
| GPU reserved segments |       1    |       8    |       0    |       7    |
|       from large pool |       1    |       6    |       0    |       5    |
|       from small pool |       0    |       2    |       0    |       2    |
|---------------------------------------------------------------------------|
| Non-releasable allocs |       2    |       8    |      18    |      21    |
|       from large pool |       2    |       8    |      13    |      14    |
|       from small pool |       0    |       3    |       5    |       7    |
|---------------------------------------------------------------------------|
| Oversize allocations  |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| Oversize GPU segments |       0    |       0    |       0    |       0    |
|===========================================================================|

Collaborator

@kdreher kdreher left a comment

This looks sensible! :)
If we are starting the simulations from an editor, though, it's not that easy to set the environment variables, no? I took a brief look and setting them with os.environ["VARIABLE"] = x didn't work for me.

Am I missing something?

@lkeegan
Contributor Author

lkeegan commented Sep 21, 2023

Which editor are you using? I think in VS Code you can add an env key to your launch.json, and in PyCharm you can edit/add a Python run configuration where you can set environment variables.
(I think if you want to set them in the script with os.environ["VARIABLE"] = x, this would have to be done before importing simpa, since the env var is only checked once on import.)
But I only used env vars for my own convenience since I'm running in a console on a server. Happy to change this if there's a better / more convenient way for you to do this (e.g. using simpa settings?)
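
A minimal sketch of why setting the variable after import has no effect, assuming the decorator is selected once when the module is loaded. The names `build_profile_decorator`, `profile_early` and `profile_late` are hypothetical, and the timing wrapper is only a stand-in for a real profiler such as `line_profiler`; this is not simpa's actual implementation.

```python
import functools
import os
import time


def build_profile_decorator():
    """Pick a decorator once, based on SIMPA_PROFILE at the time of this call."""
    mode = os.getenv("SIMPA_PROFILE", "")
    if mode == "TIME":
        def timing_decorator(func):
            # Stand-in for a real profiler: record wall-clock time per call.
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                try:
                    return func(*args, **kwargs)
                finally:
                    wrapper.elapsed.append(time.perf_counter() - start)
            wrapper.elapsed = []
            return wrapper
        return timing_decorator
    # Default: leave the function untouched.
    return lambda func: func


# Simulates the library being imported while the variable is still unset.
os.environ.pop("SIMPA_PROFILE", None)
profile_early = build_profile_decorator()

# Setting the variable afterwards does not change the decorator chosen above;
# only a decorator built after this point sees the new value.
os.environ["SIMPA_PROFILE"] = "TIME"
profile_late = build_profile_decorator()


def square(x):
    return x * x


early = profile_early(square)  # the original function, unprofiled
late = profile_late(square)    # wrapped, records timings
late(3)
```

Here `early` is the undecorated function even though `SIMPA_PROFILE` is set by the time it is called, which matches the behaviour described above.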

@kdreher
Collaborator

kdreher commented Sep 21, 2023

I'm using PyCharm and adding environment variables to the run configuration worked but since we'll call everything else in simpa in a script, I wanted to set them there.
Yes, I think having it in the settings would make sense :)

@lkeegan
Contributor Author

lkeegan commented Sep 21, 2023

So I haven't really found a clean way of passing settings to the @profile decorator, but at least for me it works fine to set the env var in the script as long as you do it before importing simpa, e.g. the following works with your script in pycharm:

# new lines:
import os
os.environ["SIMPA_PROFILE"] = 'GPU_MEMORY'

# rest of existing script:
from simpa.utils import Tags
import simpa as sp
...

Would that be ok, or are there situations where you can't set the env var before importing simpa?

@kdreher
Collaborator

kdreher commented Sep 22, 2023

The only thing that comes to mind is that, according to PEP8, imports should be at the top of the file, before any constant declarations or other statements.

Other than that, I don't see a problem

@lkeegan
Contributor Author

lkeegan commented Sep 22, 2023

While it technically violates PEP8, as it's only modifying the environment it doesn't really seem against the spirit of PEP8 to me, and as this code is not going into the simpa library but would only be in a user script I think it's fine.

(Also checking at run-time which profiler to use with every call to the @profile decorator would likely impact the accuracy of the timing, so I'd avoid doing that)
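
To illustrate the overhead concern, here is a rough sketch contrasting a decorator that consults the environment on every call with one that was resolved ahead of time to a no-op. Both decorators and the `counter` dictionary are hypothetical names for illustration only; neither is taken from the simpa code.

```python
import functools
import os

# Counts how often the environment is consulted by the per-call variant.
counter = {"env_lookups": 0}


def profile_checked_per_call(func):
    """Hypothetical variant: looks up SIMPA_PROFILE on every single call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        counter["env_lookups"] += 1
        if os.getenv("SIMPA_PROFILE"):
            pass  # a real implementation would dispatch to a profiler here
        return func(*args, **kwargs)
    return wrapper


def profile_resolved_once(func):
    """Hypothetical variant: profiling is off, so return the function as-is."""
    return func


def add(a, b):
    return a + b


checked = profile_checked_per_call(add)
resolved = profile_resolved_once(add)

for _ in range(1000):
    checked(1, 2)   # extra wrapper frame plus an environment lookup each time
    resolved(1, 2)  # identical to calling add directly
```

The per-call variant adds a wrapper frame and an environment lookup to every invocation, and that extra work would be included in any timing measured around the function, whereas the pre-resolved no-op returns the very same function object.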

kdreher previously approved these changes Jun 7, 2024
lkeegan added 2 commits June 10, 2024 08:25
- Allows `@profile` decorator to be added to functions
- By default this decorator does nothing
- But if the `SIMPA_PROFILE` environment variable is set to:
  - `TIME`: `line_profiler` is used for line-by-line run-time profiling
  - `MEMORY`: `memory_profiler` is used for line-by-line RAM use profiling
  - `GPU_MEMORY`: `pytorch_memlab` is used for line-by-line GPU RAM profiling
- Profiling output is written to the console when the script finishes
- For GPU_MEMORY, a summary of gpu memory use (torch.cuda.memory_summary()) is also written to the console
- Add these profiling dependencies to tool.poetry.group.profile.dependencies
- avoids `SIMPA_PROFILE` env var being checked at first import of simpa
- `@profile` decorator is now imported using `from simpa.utils.profiling import profile`
- allows env var to be set later in script
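
Following the commit notes above, a user script might look like the sketch below. The import path `simpa.utils.profiling` is taken from the commit message; the `try`/`except` fallback to a no-op decorator and the `sum_of_squares` function are illustrative assumptions so that the snippet also runs where simpa or the profiling dependencies are not installed.

```python
import os

# Choose the profiler before importing the decorator.
os.environ["SIMPA_PROFILE"] = "TIME"

try:
    # Import path as stated in the commit message.
    from simpa.utils.profiling import profile
except ImportError:
    # Assumption for illustration: fall back to a no-op decorator
    # when simpa or its profiling dependencies are unavailable.
    def profile(func):
        return func


@profile
def sum_of_squares(n):
    # Hypothetical workload standing in for a simpa function.
    return sum(i * i for i in range(n))


result = sum_of_squares(10)
```

With simpa and the profile dependency group installed, the line-by-line timing report would be written to the console when the script finishes, as described in the PR summary.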
@lkeegan lkeegan force-pushed the profile_decorator branch from c7f01c0 to cec9d9e Compare June 10, 2024 06:26
@lkeegan lkeegan changed the base branch from main to develop June 10, 2024 06:27
@kdreher kdreher merged commit 4ab4e1d into IMSY-DKFZ:develop Jun 12, 2024
12 checks passed