Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added "-p" cli option to match kedro pipeline options #1961

Merged
merged 5 commits into from
Jul 5, 2024

Conversation

VladCozma
Copy link
Contributor

@VladCozma VladCozma commented Jun 28, 2024

Signed-off-by: Vlad Cozma [email protected]

Description

CLI pipeline option for kedro viz is different than the option for kedro run.

Development notes

Added the -p option for kedro viz along --pipeline option.

QA notes

  • all test ran successfully
  • option behaviour tested with demo-project:
(.venv)$ demo-project % kedro viz -p "Data ingestion" 
Starting Kedro Viz ...
Kedro Viz started successfully. 

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added new entries to the RELEASE.md file
  • Added tests to cover my changes

fixed ann anchor in CONTRIBUTING.md
Signed-off-by: Vlad Cozma <[email protected]>
@ravi-kumar-pilla
Copy link
Contributor

Hi @VladCozma ,

Thank you for the PR. Could you please add a test parameter with the new option at https://github.com/kedro-org/kedro-viz/blob/main/package/tests/test_launchers/test_cli.py#L183 ?

Thank you

Signed-off-by: Vlad <Vlad>
@VladCozma
Copy link
Contributor Author

Hi @VladCozma ,

Thank you for the PR. Could you please add a test parameter with the new option at https://github.com/kedro-org/kedro-viz/blob/main/package/tests/test_launchers/test_cli.py#L183 ?

Thank you

Done!

@astrojuanlu
Copy link
Member

Hi @VladCozma, thanks for your PR! What would kedro viz -p "Data ingestion" do? And also, my understanding is that this wouldn't be 100 % consistent because in Kedro it would be kedro run -p data_ingestion (no "beautiful name")

@VladCozma
Copy link
Contributor Author

Hi @astrojuanlu, it is consistent with the definition of the demo-project. The pipelines registered in settings.py are these:

    return {
        "__default__": (
            ingestion_pipeline
            + feature_pipeline
            + modelling_pipeline
            + reporting_pipeline
        ),
        "Data ingestion": ingestion_pipeline,
        "Modelling stage": modelling_pipeline,
        "Feature engineering": feature_pipeline,
        "Reporting stage": reporting_pipeline,
        "Pre-modelling": ingestion_pipeline + feature_pipeline,
    }

Here is the execution of kedro run and kedro viz:

~kedro run -p "Data ingestion"
[06/30/24 12:05:16] INFO     Kedro project demo-project                                                                     session.py:324
[06/30/24 12:05:17] INFO     Using synchronous mode for loading and saving data. Use the --async flag for          sequential_runner.py:64
                             potential performance gains.                                                                                 
                             https://docs.kedro.org/en/stable/nodes_and_pipelines/run_a_pipeline.html#load-and-sav                        
                             e-asynchronously                                                                                             
                    INFO     Loading data from companies (CSVDataset)...                                               data_catalog.py:508
                    INFO     Running node: apply_types_to_companies: apply_types_to_companies([companies]) ->                  node.py:361
                             [ingestion.int_typed_companies]                                                                              
                    INFO     Saving data to ingestion.int_typed_companies (ParquetDataset)...                          data_catalog.py:550
                    INFO     Completed 1 out of 6 tasks                                                            sequential_runner.py:90
                    INFO     Loading data from reviews (CSVDataset)...                                                 data_catalog.py:508
[06/30/24 12:05:18] INFO     Loading data from params:ingestion.typing.reviews.columns_as_floats (MemoryDataset)...    data_catalog.py:508
                    INFO     Running node: apply_types_to_reviews:                                                             node.py:361
                             apply_types_to_reviews([reviews;params:ingestion.typing.reviews.columns_as_floats]) ->                       
                             [ingestion.int_typed_reviews]                                                                                
                    INFO     Saving data to ingestion.int_typed_reviews (ParquetDataset)...                            data_catalog.py:550
                    INFO     Completed 2 out of 6 tasks                                                            sequential_runner.py:90
                    INFO     Loading data from shuttles (ExcelDataset)...                                              data_catalog.py:508
[06/30/24 12:05:23] INFO     Running node: apply_types_to_shuttles: apply_types_to_shuttles([shuttles]) ->                     node.py:361
                             [ingestion.int_typed_shuttles@pandas1]                                                                       
                    INFO     Saving data to ingestion.int_typed_shuttles@pandas1 (ParquetDataset)...                   data_catalog.py:550
                    INFO     Completed 3 out of 6 tasks                                                            sequential_runner.py:90
                    INFO     Loading data from ingestion.int_typed_companies (ParquetDataset)...                       data_catalog.py:508
                    INFO     Running node: company_agg: aggregate_company_data([ingestion.int_typed_companies]) ->             node.py:361
                             [ingestion.prm_agg_companies]                                                                                
[06/30/24 12:05:24] INFO     Saving data to ingestion.prm_agg_companies (MemoryDataset)...                             data_catalog.py:550
                    INFO     Completed 4 out of 6 tasks                                                            sequential_runner.py:90
                    INFO     Loading data from ingestion.int_typed_shuttles@pandas2 (ParquetDataset)...                data_catalog.py:508
                    INFO     Loading data from ingestion.prm_agg_companies (MemoryDataset)...                          data_catalog.py:508
                    INFO     Loading data from ingestion.int_typed_reviews (ParquetDataset)...                         data_catalog.py:508
                    INFO     Running node: combine_step:                                                                       node.py:361
                             combine_shuttle_level_information([ingestion.int_typed_shuttles@pandas2;ingestion.prm_agg_compani            
                             es;ingestion.int_typed_reviews]) -> [prm_shuttle_company_reviews;prm_spine_table]                            
                    INFO     Saving data to prm_shuttle_company_reviews (ParquetDataset)...                            data_catalog.py:550
                    INFO     Saving data to prm_spine_table (ParquetDataset)...                                        data_catalog.py:550
                    INFO     Completed 5 out of 6 tasks                                                            sequential_runner.py:90
                    INFO     Loading data from prm_spine_table (ParquetDataset)...                                     data_catalog.py:508
                    INFO     Running node: <lambda>([prm_spine_table]) -> [ingestion.prm_spine_table_clone]                    node.py:361
                    INFO     Saving data to ingestion.prm_spine_table_clone (MemoryDataset)...                         data_catalog.py:550
                    INFO     Completed 6 out of 6 tasks                                                            sequential_runner.py:90
                    INFO     Pipeline execution completed successfully.                                                      runner.py:119
                    INFO     Loading data from ingestion.prm_spine_table_clone (MemoryDataset)...                      data_catalog.py:508
~kedro viz -p "Data ingestion"      
Starting Kedro Viz ...
Kedro Viz started successfully. 

✨ Kedro Viz is running at 
 http://127.0.0.1:4141/

kedro run -p data_ingestion will throw an error:

~kedro run -p ingestion_pipeline
[06/30/24 12:09:24] INFO     Kedro project demo-project                                                                     
[...]
    raise ValueError(
ValueError: Failed to find the pipeline named 'ingestion_pipeline'. It needs to be generated and returned by the 'register_pipelines' function.

Cheers

Copy link
Contributor

@ravi-kumar-pilla ravi-kumar-pilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ! Thank you @VladCozma

Copy link
Contributor

@rashidakanchwala rashidakanchwala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. thanks @VladCozma

@rashidakanchwala
Copy link
Contributor

FYI - While testing your PR, I accidentally included some of your changes in cli.py file in my own PR, which has now been merged. As a result, your PR no longer shows those changes in cli.py when compared to the main branch.

Copy link
Member

@astrojuanlu astrojuanlu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @VladCozma !

@rashidakanchwala rashidakanchwala merged commit da92447 into kedro-org:main Jul 5, 2024
23 checks passed
@VladCozma VladCozma deleted the chore/add-p-cli-option branch July 10, 2024 10:12
@ravi-kumar-pilla ravi-kumar-pilla mentioned this pull request Jul 25, 2024
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants