[BUG] error with tqmd and pandas #919

miguelgfierro · 2019-09-09T10:54:26Z

Description

============================= test session starts ==============================
platform linux -- Python 3.6.8, pytest-5.0.1, py-1.8.0, pluggy-0.12.0
rootdir: /data/home/recocat/cicd/28/s
collected 29 items / 12 deselected / 17 selected

tests/integration/test_criteo.py .                                       [  5%]
tests/integration/test_movielens.py .........                            [ 58%]
tests/integration/test_notebooks_python.py ......F                       [100%]

=================================== FAILURES ===================================
__________________________ test_wikidata_integration ___________________________

notebooks = {'als_deep_dive': '/data/home/recocat/cicd/28/s/notebooks/02_model/als_deep_dive.ipynb', 'als_pyspark': '/data/home/re...aseline_deep_dive.ipynb', 'data_split': '/data/home/recocat/cicd/28/s/notebooks/01_prepare_data/data_split.ipynb', ...}
tmp = '/tmp/pytest-of-recocat/pytest-1697/tmpruv_um4p'

    @pytest.mark.integration
    def test_wikidata_integration(notebooks, tmp):
        notebook_path = notebooks["wikidata_KG"]
        MOVIELENS_SAMPLE_SIZE = 5
        pm.execute_notebook(notebook_path, OUTPUT_NOTEBOOK, kernel_name=KERNEL_NAME,
                            parameters=dict(MOVIELENS_DATA_SIZE='100k',
                                            MOVIELENS_SAMPLE=True,
>                                           MOVIELENS_SAMPLE_SIZE=MOVIELENS_SAMPLE_SIZE))

tests/integration/test_notebooks_python.py:173: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/anaconda/envs/nightly_reco_base/lib/python3.6/site-packages/papermill/execute.py:94: in execute_notebook
    raise_for_execution_errors(nb, output_path)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

nb = {'cells': [{'cell_type': 'code', 'metadata': {'inputHidden': True, 'hide_input': True}, 'execution_count': None, 'sour...nd_time': '2019-09-07T23:48:50.115802', 'duration': 14.995972, 'exception': True}}, 'nbformat': 4, 'nbformat_minor': 2}
output_path = 'output.ipynb'

    def raise_for_execution_errors(nb, output_path):
        """Assigned parameters into the appropriate place in the input notebook
    
        Parameters
        ----------
        nb : NotebookNode
           Executable notebook object
        output_path : str
           Path to write executed notebook
        """
        error = None
        for cell in nb.cells:
            if cell.get("outputs") is None:
                continue
    
            for output in cell.outputs:
                if output.output_type == "error":
                    error = PapermillExecutionError(
                        exec_count=cell.execution_count,
                        source=cell.source,
                        ename=output.ename,
                        evalue=output.evalue,
                        traceback=output.traceback,
                    )
                    break
    
        if error:
            # Write notebook back out with the Error Message at the top of the Notebook.
            error_msg = ERROR_MESSAGE_TEMPLATE % str(error.exec_count)
            error_msg_cell = nbformat.v4.new_code_cell(
                source="%%html\n" + error_msg,
                outputs=[
                    nbformat.v4.new_output(output_type="display_data", data={"text/html": error_msg})
                ],
                metadata={"inputHidden": True, "hide_input": True},
            )
            nb.cells = [error_msg_cell] + nb.cells
            write_ipynb(nb, output_path)
>           raise error
E           papermill.exceptions.PapermillExecutionError: 
E           ---------------------------------------------------------------------------
E           Exception encountered at "In [19]":
E           ---------------------------------------------------------------------------
E           ImportError                               Traceback (most recent call last)
E           /anaconda/envs/nightly_reco_base/lib/python3.6/site-packages/tqdm/_tqdm.py in pandas(tclass, *targs, **tkwargs)
E               612             # pandas>=0.23.0
E           --> 613             from pandas.core.groupby.groupby import DataFrameGroupBy, \
E               614                 SeriesGroupBy, GroupBy, PanelGroupBy
E           
E           ImportError: cannot import name 'DataFrameGroupBy'
E           
E           During handling of the above exception, another exception occurred:
E           
E           ImportError                               Traceback (most recent call last)
E           <ipython-input-19-6ccb9974139b> in <module>
E           ----> 1 tqdm().pandas(desc="Number of movies completed")
E                 2 result = pd.concat(list(movies.progress_apply(lambda x: wikidata_KG_from_movielens(x), axis=1)))
E           
E           /anaconda/envs/nightly_reco_base/lib/python3.6/site-packages/tqdm/_tqdm.py in pandas(tclass, *targs, **tkwargs)
E               614                 SeriesGroupBy, GroupBy, PanelGroupBy
E               615         except ImportError:
E           --> 616             from pandas.core.groupby import DataFrameGroupBy, \
E               617                 SeriesGroupBy, GroupBy, PanelGroupBy
E               618 
E           
E           ImportError: cannot import name 'PanelGroupBy'

In which platform does it happen?

How do we replicate the issue?

Expected behavior (i.e. solution)

Other Comments

The text was updated successfully, but these errors were encountered:

miguelgfierro added the bug Something isn't working label Sep 9, 2019

miguelgfierro added a commit that referenced this issue Sep 16, 2019

🐛 #919

7a0f836

miguelgfierro mentioned this issue Sep 16, 2019

Wikidata fix and unit tests #932

Merged

3 tasks

miguelgfierro closed this as completed Sep 17, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] error with tqmd and pandas #919

[BUG] error with tqmd and pandas #919

miguelgfierro commented Sep 9, 2019