218 coda transforms #244

Merged
merged 81 commits into from
Dec 12, 2023

Conversation

em-t
Collaborator

@em-t em-t commented Nov 28, 2023

Implement the requested logratio transformations for compositional data (issue #218):

  • ILR is only partially implemented: it is included only as a single ILR operation (instead of a full set for a given dataset), and the inverse ILR hasn't been implemented. (ILR is waiting on further details.)
  • Add a decorator/wrapper for checking that the requirements for compositional data are fulfilled.
  • Include a notebook demonstrating use.
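
To illustrate the second bullet, here is a minimal sketch of what such a checking decorator could look like. The names `check_compositional` and `clr_sketch` are hypothetical and are not the functions added in this PR. The specific checks (no NaN values, strictly positive parts, rows closing to a common sum) are assumptions about what the "requirements for compositional data" cover.

```python
import functools

import numpy as np
import pandas as pd


def check_compositional(func):
    """Hypothetical decorator: validate a DataFrame before calling func."""

    @functools.wraps(func)
    def wrapper(df: pd.DataFrame, *args, **kwargs):
        values = df.to_numpy(dtype=float)
        if values.size == 0:
            raise ValueError("Compositional data must not be empty.")
        if np.isnan(values).any():
            raise ValueError("Compositional data must not contain NaN values.")
        if (values <= 0).any():
            raise ValueError("All components must be strictly positive.")
        row_sums = values.sum(axis=1)
        # Assumption: every row should close to the same constant (e.g. 1 or 100).
        if not np.allclose(row_sums, row_sums[0]):
            raise ValueError("Rows do not sum to the same constant.")
        return func(df, *args, **kwargs)

    return wrapper


@check_compositional
def clr_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Centered logratio: log of each part minus the row mean of the logs."""
    logs = np.log(df)
    return logs.sub(logs.mean(axis=1), axis=0)
```

A useful sanity check when testing a CLR implementation is that each transformed row sums to zero (up to floating-point error).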

Emmu T added 30 commits October 31, 2023 12:40
example of using ALR, CLR & ILR from the package pyrolite.
arg for keeping the redundant column. Use existing exception classes
where possible. Move some utility functions into more appropriate
places.
versions of simplex check and normalizing functions. Fix some issues discovered during writing tests.
@em-t
Collaborator Author

em-t commented Nov 28, 2023

For the purposes of testing that the functionality is correct, this paper was a good overall source (and the first source I found with a proper definition for pivot logratio transform), and this paper has a simple, follow-along example for CLR, ALR and ILR.

Collaborator

@nmaarnio nmaarnio left a comment

Hey! Overall the code looks great and is easy to follow! My comments mostly concern some naming conventions and suggestions – feel free to disagree or counter-suggest something for these. I tried to check the different logratio logics too, but I have to admit I am not familiar with these methods myself.

eis_toolkit/transformations/coda/alr.py Outdated
Comment on lines 44 to 48
denominator_column = df.columns[idx]
columns = [col for col in df.columns]

if not keep_redundant_column and denominator_column in columns:
    columns.remove(denominator_column)
Collaborator

If we decide to keep the denominator column, we will divide it by itself in the inner function, or am I wrong here? This seems intuitively unintended to me, but I could be wrong.

Collaborator Author

Yup, that's the case. I thought there should be an option to keep the column even though it becomes "redundant", since I don't know what the actual use cases for the function are. But the default behavior will still be to return a dataframe with one fewer column than the input frame.
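
A minimal sketch of the behavior discussed in this thread, not the code under review: `alr_sketch` and its parameter names are hypothetical, and selecting the denominator by column name is an assumption made for readability. It shows that keeping the denominator column yields a column of zeros, because log(x / x) = 0.

```python
import numpy as np
import pandas as pd


def alr_sketch(df: pd.DataFrame, denominator: str, keep_denominator: bool = False) -> pd.DataFrame:
    """Additive logratio: log(x_i / x_D) for a chosen denominator column D."""
    # Divide every column by the denominator column, row by row.
    ratios = df.div(df[denominator], axis=0)
    result = np.log(ratios)
    if not keep_denominator:
        # Default: drop the denominator, leaving one column fewer than the input.
        result = result.drop(columns=[denominator])
    return result


df = pd.DataFrame({"a": [0.2, 0.5], "b": [0.3, 0.25], "c": [0.5, 0.25]})
default_result = alr_sketch(df, "c")  # columns: a, b
kept_result = alr_sketch(df, "c", keep_denominator=True)  # column c is all zeros
```

Keeping the column adds no information, but it preserves the input's shape and column labels, which may be convenient for some downstream uses.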

eis_toolkit/transformations/coda/alr.py Outdated
eis_toolkit/transformations/coda/alr.py Outdated
eis_toolkit/transformations/coda/alr.py
eis_toolkit/transformations/coda/pairwise.py Outdated
eis_toolkit/transformations/coda/pairwise.py Outdated
eis_toolkit/utilities/aitchison_geometry.py Outdated
eis_toolkit/utilities/checks/compositional.py Outdated
eis_toolkit/utilities/miscellaneous.py Outdated
@em-t
Collaborator Author

em-t commented Dec 4, 2023

@nmaarnio All the suggested changes should now be addressed!

Collaborator

@nmaarnio nmaarnio left a comment

Thanks for the quick changes – looks really good to me now! I only had one comment/question about the alr_transform parameterization.

Comment on lines 29 to 30
column: The integer position based index of the column of the dataframe to be used as denominator.
If not provided, the last column will be used.
Collaborator

Did you leave the type of this parameter int on purpose? What I suggested was to let the user give the desired column name (str) instead of the index. This solution isn't bad, but can be a bit tricky if there are tens of columns in a DF :)

Collaborator Author

I must have missed it or forgotten halfway through going through the change suggestions. 😄 It's now fixed!

…ete check_column_index_in_dataframe function as unused. Update notebook.
@nmaarnio
Collaborator

nmaarnio commented Dec 7, 2023

One more thing I noticed: the doc files are missing for these functions. After you have added them I will merge!

@em-t
Collaborator Author

em-t commented Dec 11, 2023

I added doc files similar to those for other modules. But I couldn't get mkdocs serve or build to work as described in the instructions.

I'm getting the following (the same error occurs for all of them):

ERROR   -  mkdocstrings: eis_toolkit.transformations.coda.alr could not be found
ERROR   -  Error reading page 'transformations/coda/alr.md':
ERROR   -  Could not collect 'eis_toolkit.transformations.coda.alr'

Is there something that needs to be done in addition to adding the files for them to work?

@nmaarnio
Collaborator

nmaarnio commented Dec 12, 2023

You need to add an empty __init__.py file in the coda folder (the folder with the implementations) so that mkdocs will recognize it. It should work after that.
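
A small helper along these lines can spot the same problem elsewhere in a source tree. This is only a sketch: `find_dirs_missing_init` is a hypothetical name, and the rule it applies (any folder containing `.py` files should also contain `__init__.py`) is an assumption based on the fix described above, not a documented mkdocs requirement.

```python
from pathlib import Path


def find_dirs_missing_init(package_root):
    """Return subdirectories that contain .py files but no __init__.py."""
    root = Path(package_root)
    missing = []
    for directory in sorted(p for p in root.rglob("*") if p.is_dir()):
        has_modules = any(directory.glob("*.py"))
        if has_modules and not (directory / "__init__.py").exists():
            missing.append(directory)
    return missing
```

Running it on the package root (for example `find_dirs_missing_init("eis_toolkit")`) would list folders such as `transformations/coda` until the empty `__init__.py` is added.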

Collaborator

@nmaarnio nmaarnio left a comment

LGTM, merging!

@nmaarnio nmaarnio merged commit 014b825 into master Dec 12, 2023
4 checks passed
@nmaarnio nmaarnio mentioned this pull request Dec 13, 2023