Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hazenlib init tests coverage #144

Merged
merged 28 commits into from
Dec 22, 2021

Conversation

Lucrezia-Cester
Copy link
Contributor

@Lucrezia-Cester Lucrezia-Cester commented Nov 15, 2021

Hello,

I have been writing test functions for the hazenlib.init script.

There are a couple of things which seem odd to me:

1: the get_manufacturer function (in init ), which extracts the manufacturer tag https://dicom.innolitics.com/ciods/rt-plan/general-equipment/00080070), outputs not one but many manufacturers' names.

2: the rescale_to_byte function, which is then used in the spatial_resolution script rescales the data, but it is unclear to me why this is necessary. It also seems to decrease the image resolution.

I added the first test function for hazenlib __init__. It checks if a file is an enhanced dicom. In the test data I could only find files with the class: 1.2.840.10008.5.1.4.1.1.4 and no file with the class 1.2.840.10008.5.1.4.1.1.4.1 so I only tested part of the conditional statement. I will gradually add more tests functions. This pull request is not complete.
@Lucrezia-Cester Lucrezia-Cester linked an issue Nov 15, 2021 that may be closed by this pull request
@hshuaib90
Copy link
Collaborator

1: the get_manufacturer function (in init ), which extracts the manufacturer tag https://dicom.innolitics.com/ciods/rt-plan/general-equipment/00080070), outputs not one but many manufacturers' names.

def get_manufacturer(dcm: pydicom.Dataset) -> str:
    supported = ['ge', 'siemens', 'philips', 'toshiba']
    manufacturer = dcm.Manufacturer.lower()
    for item in supported:
        if item in manufacturer:
            return item

    raise Exception(f'{manufacturer} not recognised manufacturer')

I'm not sure what you mean. It just returns item once.

2: the rescale_to_byte function, which is then used in the spatial_resolution script rescales the data, but it is unclear to me why this is necessary. It also seems to decrease the image resolution.

This is in so that you can use OpenCV functions, which only work on 8 byte images. It doesn't directly change the image resolution but it does change the image contrast.

@Lucrezia-Cester
Copy link
Contributor Author

Lucrezia-Cester commented Nov 17, 2021

Hello,

  • right, sorry, I hadn't noticed the manufacturer function was used in child classes.

  • Would it make sense to try to use another function to rescale, so that it only normalises the image? changing the contrast might alter the MTF for MRI images which have both magnitude and phase. However I am not entirely sure of this, maybe it would be worth testing it out?

  • should the main function be tested?

-all the other functions besides the main and get_field_of_view have a test function now. The get_field_of_view function returns an error when the manufacturer is ge. This is the error it outputs:

Traceback (most recent call last):
File "C:\Users\lucrezia\Documents\GitHub\hazen\tests\test_hazenlib.py", line 163, in setUp
print(self.dcm[0x19, 0x101e].value)
File "C:\Users\lucrezia\AppData\Local\Programs\Python\Python36\lib\site-packages\pydicom\dataset.py", line 852, in getitem
data_elem = self._dict[tag]
KeyError: (0019, 101e)

I am trying to work this out now.

@github-actions
Copy link

github-actions bot commented Nov 22, 2021

@laurencejackson
Copy link
Collaborator

Hey Lu, is this ready for review?

@Lucrezia-Cester
Copy link
Contributor Author

@laurencejackson Hi Laurence, yes it is.

Just one thing to note: the get_field_of_view function wasn't working when the manufacturer was 'GE'.

@laurencejackson
Copy link
Collaborator

Great, and sorry, I've just seen your questions above!

  • I don't think we should be investigating a different rescaling function here, it's probably beyond the scope of this PR. I think @tomaroberts has been looking at an updating rescaling function as part of Porting hazen to python 3.9 #143, but that is still being investigated.

  • Ideally yes the main function should be tested - if it's not clear how to do this then I'm happy to help

  • looking at the error for GE scanners, could you push your most recent changes? when I look at the code on the PR branch I can't see a test for get_field_of_view? The GE data will almost definitely have that dicom field in the metadata so it's odd that it can't find it now. Could you double check that it is a GE dataset that raises this error and it's not getting a Philips/Siemens/etc dataset instead?

@Lucrezia-Cester
Copy link
Contributor Author

Lucrezia-Cester commented Nov 22, 2021

Hi @laurencejackson, the get_field_of_view tests are at lines 84, 107, 128, 149. These tests work for all manufacturers but 'GE'.

For 'GE, If I just try to do

file = str(TEST_DATA_DIR / 'ge' / 'ge_eFilm.dcm')
dcm = pydicom.read_file(file)
fov = hazenlib.get_field_of_view(dcm)
print(fov)

I get this error (sorry, I just noticed I wrote the same thing in my previous comment)

Traceback (most recent call last):
File "C:\Users\lucrezia\Documents\GitHub\hazen\tests\test_hazenlib.py", line 170, in
fov = hazenlib.get_field_of_view(dcm)
File "c:\users\lucrezia\documents\github\hazen\hazenlib_init_.py", line 308, in get_field_of_view
fov = dcm[0x19, 0x101e].value
File "C:\Users\lucrezia\AppData\Local\Programs\Python\Python36\lib\site-packages\pydicom\dataset.py", line 852, in getitem
data_elem = self._dict[tag]
KeyError: (0019, 101e)

I am gonna give it a shot at making a test function for the main now, does that sound good?

@laurencejackson
Copy link
Collaborator

laurencejackson commented Nov 22, 2021

I am gonna give it a shot at making a test function for the main now, does that sound good?

yeah sounds great, let me know when ready for review.

I've had a look at it seems like none of the available GE test data in the hazen repo has the (0019, 101e) dicom tag filled, it is however in an old GE QA dataset I have left over from a few years ago. I think this is a limitation of the test data. @hshuaib90 could you advise on these options :

  1. we can add some newer GE test data with the tag present for this test
  2. we could add a try-except to adjust the get_field_of_view function to try an alternative method ? In the below, running lines 296 and 298 give the same answer on my GE dataset (fov = 256.0)- we could add a try-except clause here to get the FOV using the logic in line 298 when the dicom tag doesn't exist?
  3. Or finally, is there a particular reason we choose to get the FOV from a GE specific dicom tag that isn't always present? Have you come across any examples where the dcm.Columns *dcm.PixelSpacing[0] doesn't work? If we could use this for all manufacturers first it seems like the most easily maintainable method.

image

@Lucrezia-Cester
Copy link
Contributor Author

@laurencejackson this is now ready for review. Cheers

Copy link
Collaborator

@laurencejackson laurencejackson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Lu, this looks good and it's nearly there! Just requested a few things below.

One thing I'd like to suggest is that package up each test dicom with its true values in dictionaries so we can see where everything is easily. So at the top of the file we can define something like:

test_dicoms = {'philips': {'file': str(TEST_DATA_DIR / 'resolution' / 'philips' / 'IM-0004-0002.dcm'),
                          'MANUFACTURER': 'philips',
                          'ROWS': 512,
                          'COLUMNS': 512,
                          'TR_CHECK': 500,
                          'BW': 205.0,
                          'ENHANCED': False,
                          'PIX_ARRAY': 1},
                'siemens': {'file': str(TEST_DATA_DIR / 'resolution' / 'siemens' / 'IM-0004-0002.dcm'),
                            'MANUFACTURER': 'siemens', ...

We can then use this identical structure to loop over a test example for each manufacturer in the test cases where the function should work on all manufacturers (everything in the base TestHazenLib class) e.g.

def test_get_pixel_size(self):
  for manufactuer in test_dicoms.keys():
      with pydicom.read_file(test_dicoms[manufactuer]['file']) as dcm:
          PIX_SIZE = dcm.PixelSpacing
          PIX_SIZE=tuple(PIX_SIZE)
          pix_size = hazenlib.get_pixel_size(dcm)
          assert pix_size == PIX_SIZE

Structuring the test dicoms this way should let us easily expand the scope of our tests!, which would be great!

Hazen_logger.log Outdated
@@ -0,0 +1,3 @@
2021-11-15 21:19:24 I [spatial_resolution.py:main:514] 'FileDataset' object has no attribute 'SeriesDescription'
2021-11-15 21:20:56 I [spatial_resolution.py:main:514] 'FileDataset' object has no attribute 'SeriesDescription'
2021-11-15 21:28:24 I [spatial_resolution.py:main:514] 'FileDataset' object has no attribute 'SeriesDescription'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add all *.log files to the .gitignore so we don't track them?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, can you also delete this log from the repo?

task = importlib.import_module(f"hazenlib.{arguments['<task>']}")
folder = arguments['<folder>']
files = [os.path.join(folder, x) for x in os.listdir(folder) if x not in EXCLUDED_FILES]
dicom_objects = [pydicom.read_file(x, force=True) for x in files if is_dicom_file(x)]
dicom_objects = [pydicom.read_file(x, force=True) for x in files]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you re-add your if is_dicom_file condition here? It looks like this branch was forked before the is_dicom_file check was added. We need to add this back in here or it will overwrite this line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am giving thumbs up to comments I have acted upon

if not arguments['<task>'] == 'snr' and arguments['--measured_slice_width']:
raise Exception("the (--measured_slice_width) option can only be used with snr")
elif arguments['<task>'] == 'snr' and arguments['--measured_slice_width']:
measured_slice_width = float(arguments['--measured_slice_width'])
return pp.pprint(task.main(dicom_objects, measured_slice_width, report_path=report))
return task.main(dicom_objects, measured_slice_width, report_path=report)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have you chosen to return the output outside of a pprint object? How does this effect the output the user sees?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not done to change the output the user sees, rather for me to be able to test the function because print returns NONE

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean print returns none or pp.pprint returns none? Could you add some screenshots to example outputs to explain what you mean? I don't think we want to change this if we can.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lucrezia-Cester, would that mean that the user does not see the result printed to the command line when they run the SNR task? Could you try something like the below so we print the result and we also return a value?

    if not arguments['<task>'] == 'snr' and arguments['--measured_slice_width']:
        raise Exception("the (--measured_slice_width) option can only be used with snr")
    elif arguments['<task>'] == 'snr' and arguments['--measured_slice_width']:
        measured_slice_width = float(arguments['--measured_slice_width'])
        result = task.main(dicom_objects, measured_slice_width, report_path=report)
    elif arguments['<task>'] == 'relaxometry':
        result = parse_relaxometry_data(task, arguments, dicom_objects, report)
    else:
        result = task.main(dicom_objects, report_path=report)

    pp.pprint(result)

    return result


# Pass arguments with dictionary, stripping initial double dash ('--')
relaxometry_args = {}
return parse_relaxometry_data(task, arguments, dicom_objects, report)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above we need to re-add this relaxometry code due to the earlier branching from develop

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still there, I have put this relaxometry code in a function outside the main function.

@@ -37,9 +52,47 @@ def test_get_TR(self):
TR = hazenlib.get_TR(self.dcm)
assert TR == self.TR_CHECK

def test_is_enhanced_dicom(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a second test for an ENHANCED = True dicom too? just need to copy/paste this one but with self.dcm overwritten with an enhanced dicom and assert that returns True

@@ -49,41 +102,178 @@ def setUp(self):
self.file = str(TEST_DATA_DIR / 'resolution' / 'philips' / 'IM-0004-0002.dcm')
self.dcm = pydicom.read_file(self.file)

file = str(TEST_DATA_DIR / 'resolution' / 'philips' / 'IM-0004-0002.dcm')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above, this bit looks like it needs to be ina function


def setUp(self):
self.file = str(TEST_DATA_DIR / 'resolution' / 'eastkent' / '256_sag.IMA')
self.dcm = pydicom.read_file(self.file)
FOV = self.dcm.Columns * self.dcm.PixelSpacing[0]

file = str(TEST_DATA_DIR / 'resolution' / 'eastkent' / '256_sag.IMA')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above.

FOV = self.dcm.Columns * self.dcm.PixelSpacing[0]


file = str(TEST_DATA_DIR / 'toshiba' / 'TOSHIBA_TM_MR_DCM_V3_0.dcm')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above


def setUp(self):
self.file = str(TEST_DATA_DIR / 'ge' / 'ge_eFilm.dcm')
self.dcm = pydicom.read_file(self.file)


file = str(TEST_DATA_DIR / 'ge' / 'ge_eFilm.dcm')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above




from docopt import docopt
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move all import statements so they're at the top of the file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @laurencejackson, thank you for all the feedback. I'll get to it in the following days

Hi @laurencejackson, I have added the changes you suggested (If my interpretation was correct). I could not add the test_field_of_view with the other tests because the GE manufacturer was not working with the hazenlib function. 
Also, I could not add the enhanced_dicom = True bit because I do not have a test image which returns true for that condition.
@Lucrezia-Cester Lucrezia-Cester changed the base branch from develop to release/0.5.0 December 9, 2021 13:27
@Lucrezia-Cester Lucrezia-Cester changed the base branch from release/0.5.0 to develop December 9, 2021 14:50
@Lucrezia-Cester Lucrezia-Cester changed the base branch from develop to release/0.5.0 December 9, 2021 15:12
@tomaroberts tomaroberts changed the title Add hazelib. init tests coverage Add hazenlib init tests coverage Dec 10, 2021
@tomaroberts tomaroberts changed the base branch from release/0.5.0 to release/0.5.1 December 10, 2021 16:40
@Lucrezia-Cester
Copy link
Contributor Author

Lucrezia-Cester commented Dec 15, 2021

Hi @laurencejackson, this is now ready for review

@laurencejackson laurencejackson changed the base branch from release/0.5.1 to release/0.5.2 December 16, 2021 10:25
Copy link
Collaborator

@laurencejackson laurencejackson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry that this took so long to review @Lucrezia-Cester, but it looks fantastic! thank you! MY My one main suggestion is that we split out the tests again so they are individual functions, this helps a lot with debugging. thanks!

Hazen_logger.log Outdated
@@ -0,0 +1,3 @@
2021-11-15 21:19:24 I [spatial_resolution.py:main:514] 'FileDataset' object has no attribute 'SeriesDescription'
2021-11-15 21:20:56 I [spatial_resolution.py:main:514] 'FileDataset' object has no attribute 'SeriesDescription'
2021-11-15 21:28:24 I [spatial_resolution.py:main:514] 'FileDataset' object has no attribute 'SeriesDescription'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, can you also delete this log from the repo?

.gitignore Outdated
@@ -5,6 +5,7 @@ coverage.xml
**/__pycache__
.idea
logs
*.logs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you meant this to be *.log

if not arguments['<task>'] == 'snr' and arguments['--measured_slice_width']:
raise Exception("the (--measured_slice_width) option can only be used with snr")
elif arguments['<task>'] == 'snr' and arguments['--measured_slice_width']:
measured_slice_width = float(arguments['--measured_slice_width'])
return pp.pprint(task.main(dicom_objects, measured_slice_width, report_path=report))
return task.main(dicom_objects, measured_slice_width, report_path=report)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean print returns none or pp.pprint returns none? Could you add some screenshots to example outputs to explain what you mean? I don't think we want to change this if we can.

tests/test_hazenlib.py Show resolved Hide resolved
@tomaroberts tomaroberts added this to the 0.5.2 milestone Dec 21, 2021
@tomaroberts tomaroberts changed the base branch from release/0.5.2 to release/0.5.1 December 21, 2021 16:39
@Lucrezia-Cester Lucrezia-Cester changed the base branch from release/0.5.1 to release/0.5.2 December 22, 2021 06:12
@Lucrezia-Cester
Copy link
Contributor Author

Hi @laurencejackson
Screenshot 2021-12-22 at 08 29 11

regarding the pprint issue, here is what happens if I don't just use return rather than using pprint. I get None as output rather than a dictionary.

if not arguments['<task>'] == 'snr' and arguments['--measured_slice_width']:
raise Exception("the (--measured_slice_width) option can only be used with snr")
elif arguments['<task>'] == 'snr' and arguments['--measured_slice_width']:
measured_slice_width = float(arguments['--measured_slice_width'])
return pp.pprint(task.main(dicom_objects, measured_slice_width, report_path=report))
return task.main(dicom_objects, measured_slice_width, report_path=report)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lucrezia-Cester, would that mean that the user does not see the result printed to the command line when they run the SNR task? Could you try something like the below so we print the result and we also return a value?

    if not arguments['<task>'] == 'snr' and arguments['--measured_slice_width']:
        raise Exception("the (--measured_slice_width) option can only be used with snr")
    elif arguments['<task>'] == 'snr' and arguments['--measured_slice_width']:
        measured_slice_width = float(arguments['--measured_slice_width'])
        result = task.main(dicom_objects, measured_slice_width, report_path=report)
    elif arguments['<task>'] == 'relaxometry':
        result = parse_relaxometry_data(task, arguments, dicom_objects, report)
    else:
        result = task.main(dicom_objects, report_path=report)

    pp.pprint(result)

    return result

@laurencejackson laurencejackson merged commit c292347 into release/0.5.2 Dec 22, 2021
@laurencejackson laurencejackson deleted the add-hazelib.__init__-tests-coverage branch December 22, 2021 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[HZN-90] Hazenlib: improve __init__.py coverage
4 participants