Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exploratory data analysis with PyCaret #311

Merged
merged 16 commits into from
Feb 27, 2024
Merged

Conversation

marijaselakovic
Copy link
Contributor

@marijaselakovic marijaselakovic commented Feb 20, 2024

Summary of the changes / Why this is an improvement

  • Exploratory data analysis with PyCaret
  • Time series decomposition with PyCaret
  • Updated README
  • Added requirements.txt

Checklist

  • Link to issue this PR refers to (if applicable):
  • CLA is signed

Copy link
Member

@amotl amotl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than a few more suggestions, I am approving this patch. Thanks again, Marija!

However, I did not do a thorough review of the notebook content, shape, and prose. Let's also have @ckurze a final voice on it if he wants to. Thanks!

topic/timeseries/exploratory_data_analysis.ipynb Outdated Show resolved Hide resolved
topic/timeseries/exploratory_data_analysis.ipynb Outdated Show resolved Hide resolved
Copy link
Contributor

@ckurze ckurze left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for extending the notebook! Very good work. I added a few notes about typos.

topic/timeseries/exploratory_data_analysis.ipynb Outdated Show resolved Hide resolved
topic/timeseries/exploratory_data_analysis.ipynb Outdated Show resolved Hide resolved
topic/timeseries/exploratory_data_analysis.ipynb Outdated Show resolved Hide resolved
topic/timeseries/exploratory_data_analysis.ipynb Outdated Show resolved Hide resolved
topic/timeseries/exploratory_data_analysis.ipynb Outdated Show resolved Hide resolved
@marijaselakovic marijaselakovic force-pushed the marija/timeseries-analysis branch from 0a904a3 to 1805a48 Compare February 23, 2024 08:50
Copy link
Contributor

@ckurze ckurze left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed dependency to SQLAlchemy/Crate, and removed " at the end of requirements file.

topic/timeseries/exploratory_data_analysis.ipynb Outdated Show resolved Hide resolved
topic/timeseries/requirements.txt Outdated Show resolved Hide resolved
@marijaselakovic
Copy link
Contributor Author

marijaselakovic commented Feb 26, 2024

Can someone merge this PR? For me, the merge is blocked.

@amotl
Copy link
Member

amotl commented Feb 27, 2024

For me, the merge is blocked.

The reason is this:
image

@ckurze needs to acknowledge this PR, because he requested changes on it. For this to optimally happen, you would have needed to re-request a review from him. I've just invoked the corresponding action for you.

Can someone merge this PR?

Sure. Thanks! Can I ask you to squash your commits beforehand?

@amotl amotl requested a review from ckurze February 27, 2024 09:14
@marijaselakovic
Copy link
Contributor Author

marijaselakovic commented Feb 27, 2024

I added a time series decomposition notebook as part of this pull request. Let's have that reviewed as well @ckurze @amotl

@amotl
Copy link
Member

amotl commented Feb 27, 2024

I don't think it is a good idea to add yet another piece to this PR, now that we have concluded the review process on the previous one. Please commandeer it into a different PR. Thanks!

@marijaselakovic
Copy link
Contributor Author

Is the review process really completed? Merging was still blocked. The new piece is just a continuation of the previous one but put in a separate notebook for organizational reasons (easier for subsequent video creation.)

@amotl amotl requested review from amotl and removed request for amotl February 27, 2024 09:48
@amotl
Copy link
Member

amotl commented Feb 27, 2024

I think after resolving @ckurze's comments, it was just a technicality to acknowledge the review. Maybe he is not available now, so I also could have merged the PR after squashing the commit. I can't do the merge now after you added more content, apologies.

Copy link
Contributor

@ckurze ckurze left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor changes needed in Readme.md and the timeseries decomposition notebook.

This notebook explores how to access timeseries data from CrateDB via SQL,
and do the exploratory data analysis with PyCaret.

It also shows how you can generate various plots and charts for EDA, helping you understand data distributions, relationships between variables, and identify patterns.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Time Series decomposition should be added to the readme file as well, incl. link to Colab.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last commit updates README file.

topic/timeseries/time-series-decomposition.ipynb Outdated Show resolved Hide resolved
Copy link
Contributor

@ckurze ckurze left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@ckurze ckurze merged commit fd8c00d into main Feb 27, 2024
@ckurze ckurze deleted the marija/timeseries-analysis branch February 27, 2024 17:07
@ckurze ckurze restored the marija/timeseries-analysis branch February 27, 2024 17:08
@ckurze
Copy link
Contributor

ckurze commented Feb 27, 2024

@marijaselakovic Just restored the branch, can we harmonize the file names, please?

image

I'd prefer if (similar to the folder name), timeseries is just one word, pls. see screenshot.

Thank you!

@amotl amotl deleted the marija/timeseries-analysis branch February 27, 2024 17:12
@amotl amotl restored the marija/timeseries-analysis branch February 27, 2024 17:13
@amotl
Copy link
Member

amotl commented Feb 27, 2024

Hi, thanks for your contribution, and congratulations for merging.

It is fine when a PR includes multiple commits which have a semantic meaning, but not like this:

image

In a situation like this, you usually squash the commits into one or multiple groups of commits before integrating them into the main branch.

I've just amended the history and force-pushed to main, in order to fix that. In that spirit, I am asking you to wipe your main branches, and pull again from remote.

-- https://github.com/crate/cratedb-examples/commits/main/

@amotl
Copy link
Member

amotl commented Feb 27, 2024

Just restored the branch, can we harmonize the file names, please?
I'd prefer if (similar to the folder name), timeseries is just one word, pls. see screenshot.

Please branch off from a freshly fetched main, and submit a separate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants