T018: Deprecation warnings and images not showing #334
Comments
Adding this line after generating the molecule column should fix it:
See PR #330, for example.
This does not fix the error from the screenshot, unfortunately.
@AndreaVolkamer @mbackenkoehler I submitted a pull request (#350) that solves the problems in T018. However, there are still some points to discuss, which are also important for other talktorials. One of the problems was the rendering of RDKit structures inside Pandas DataFrames, and all other problems were deprecation warnings. Both are discussed separately below.

Rendering RDKit structures in Pandas

TLDR

The problem arises from the pinned RDKit version (2021.09.5) in the TeachOpenCADD environment, and is not affected by using this suggested method (see code snippets below). This is not unique to T018 and will happen in all talktorials. The only solution I could find to fix the problem in all situations is to remove the pin from the TeachOpenCADD environment (at

Problem Description

There are two factors contributing to this problem in T018; one factor is unique to T018, and the other one affects all TeachOpenCADD notebooks. The one unique to T018 is straightforward and discussed first:

Unique problem in T018

Earlier RDKit versions were able to render

from IPython.display import display
import rdkit
import pandas as pd
from rdkit.Chem import PandasTools
PandasTools.RenderImagesInAllDataFrames(True)
mol = rdkit.Chem.MolFromSmiles("c1ccccc1")
# Create dataframe containing `Mol` objects
df = pd.DataFrame({"mol":[mol]})
display(df) # Dataframe displays correctly
# Put Mol object in index
display(
pd.DataFrame({"col1":[1]}, index=df.mol)
) # Doesn't display correctly
# Another way of putting Mol in an index column
display(
pd.concat({df.loc[0, "mol"]: pd.DataFrame({"a":[1,2,3],"b":[3,4,4]})}, names=["Structure"])
) # Doesn't display correctly

Assuming there won't be a fix anytime soon, I simply solved this in the PR by moving the structure into a normal column.

Bug in pinned RDKit version (2021.09.5)

The main problem affecting all talktorials is a bit more complex and requires further work: There is a bug in the RDKit version (2021.09.5) pinned in the current environment for TeachOpenCADD, regarding rendering structures in dataframes. The bug occurs regardless of the method used for rendering (i.e. even when this suggested method is used). The good news is that this bug is already fixed in later RDKit versions. However, picking up the fix requires unpinning the RDKit version in the TeachOpenCADD environment, which in turn requires re-checking all other talktorials for issues that might arise from doing so.

Bug description and examples

During the runtime, regardless of the method used to render

Below are several code snippets to reproduce the bug in the currently pinned RDKit version, using different methods.

Current method used in TeachOpenCADD

Here, we are using the old method that is currently used in talktorials, i.e. using

from IPython.display import display
import rdkit
import pandas as pd
from rdkit.Chem import PandasTools
print(f"RDKit version: {rdkit.__version__}")
print(f"Pandas version: {pd.__version__}")
# Create dataframe with SMILES column
df = pd.DataFrame({"smiles":["c1ccccc1"]})
# Add Mol column using RDKit
PandasTools.AddMoleculeColumnToFrame(df, "smiles")
print("First dataframe:")
display(df) # Dataframe displays correctly
# Now do the same for a second dataframe
df2 = pd.DataFrame({"smiles":["c1ccccc1"]})
PandasTools.AddMoleculeColumnToFrame(df2, "smiles")
print("Second dataframe:")
display(df2) # Dataframe does not display correctly
print("First dataframe again:")
display(df) # The first dataframe also becomes corrupted

As shown in the output below, the first dataframe renders correctly the first time, but when another dataframe is created both of them stop working correctly:

Using
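The T018 workaround mentioned earlier in this comment (moving the structure out of the index and into a normal column) might look roughly like the sketch below. This is an illustration only, not the code from PR #350. The level name `Structure` is taken from the snippet above; the helper name `structure_index_to_column` is hypothetical, and a SMILES string stands in for an RDKit `Mol` object so that the snippet runs without RDKit installed.

```python
import pandas as pd


def structure_index_to_column(frame: pd.DataFrame, level: str = "Structure") -> pd.DataFrame:
    """Move the index level that holds the structures into a regular column.

    Hypothetical helper for illustration; not part of RDKit or TeachOpenCADD.
    """
    return frame.reset_index(level=level)


# Stand-in for a `Mol` object; in a talktorial this would be an RDKit molecule.
structure = "c1ccccc1"

# Same construction as in the snippet above: the structure ends up in the outer index level.
nested = pd.concat(
    {structure: pd.DataFrame({"a": [1, 2, 3], "b": [3, 4, 4]})},
    names=["Structure"],
)

# After the move, the structure lives in an ordinary column next to "a" and "b".
flat = structure_index_to_column(nested)
print(flat.columns.tolist())
```

Because only the index level is moved, the remaining row labels and the data columns are left untouched.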
Thanks! This really helps a lot! I will go through the details later on.
The related PR #350 that Armin made updates RDKit from the previously pinned version. This solves the dataframe/molecule rendering issue. @dominiquesydow Do you think this will cause any problems?
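While the environment is being migrated, a notebook could check whether it is still running on the pinned release named in this thread. The sketch below is only one possible way to do that, under two assumptions: the only affected version it knows about is 2021.09.5 (the thread does not say which later release contains the fix), and the helper names `release_tuple` and `is_pinned_version` are hypothetical.

```python
import re

# Version named in this thread as pinned and affected by the rendering bug.
PINNED_RDKIT_VERSION = "2021.09.5"


def release_tuple(version: str) -> tuple:
    """Return the leading numeric parts of a version string as integers."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Not a version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_pinned_version(version: str) -> bool:
    """True if `version` is the release that was pinned in the environment."""
    return release_tuple(version) == release_tuple(PINNED_RDKIT_VERSION)


print(is_pinned_version("2021.09.5"))
```

In a notebook, the argument would be `rdkit.__version__`, which the reproduction snippets above already print.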
Currently, there are several issues:
- distutils Version classes are deprecated. Use packaging.version instead.
- The `get_pdb_file` function within pypdb.py is deprecated. See `pypdb/clients/pdb/pdb_client.py` for a near-identical function to use
- Passing unrecognized arguments to super(NGLWidget).__init__(height='600px').
- broken image in dataframe in the "Putting the pieces together" section (see image)
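To see which call triggers which warning, one option is to record the warnings with the standard-library `warnings` module. The following is a minimal sketch, not something taken from the talktorial: `collect_deprecations` and `old_api` are hypothetical names, and `old_api` merely imitates a library call that emits one of the messages listed above.

```python
import warnings


def collect_deprecations(func, *args, **kwargs):
    """Call `func` and return its result plus the deprecation-style warnings it raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeated ones
        result = func(*args, **kwargs)
    deprecation_types = (DeprecationWarning, PendingDeprecationWarning, FutureWarning)
    messages = [
        f"{w.category.__name__}: {w.message}"
        for w in caught
        if issubclass(w.category, deprecation_types)
    ]
    return result, messages


def old_api():
    # Stand-in for a library call that emits a deprecation warning.
    warnings.warn(
        "distutils Version classes are deprecated. Use packaging.version instead.",
        DeprecationWarning,
    )
    return 42


result, messages = collect_deprecations(old_api)
print(messages)
```

Wrapping one suspicious call at a time makes it easier to tell whether a warning comes from the notebook's own code or from a dependency.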