Add ClusterSeries to Hero Series Types #170

Open
wants to merge 17 commits into
base: master

Conversation

henrifroese
Collaborator

@henrifroese henrifroese commented Aug 29, 2020

We have noticed that, with topic modelling and similar features coming up, we get more and more use out of the clustering functions. It therefore makes sense to introduce a dedicated HeroSeries type, ClusterSeries, which this PR adds.

A ClusterSeries has dtype "category" and every entry is a cluster-ID (e.g. 5 or "topic 1"). For example, pd.Series([0, 3, 0, 1], dtype="category") is a valid ClusterSeries.
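To make the definition concrete, here is a minimal sketch of what such a check could look like. The helper name `is_cluster_series` is hypothetical and is not the implementation in this PR; it only assumes the rule stated above (a pandas Series with dtype "category").

```python
import pandas as pd


def is_cluster_series(s) -> bool:
    """Hypothetical helper (not texthero's API): True if `s` is a
    pandas Series whose dtype is categorical, as described above."""
    return isinstance(s, pd.Series) and isinstance(s.dtype, pd.CategoricalDtype)


# Cluster IDs can be integers or labels such as "topic 1".
clusters = pd.Series([0, 3, 0, 1], dtype="category")
topics = pd.Series(["topic 1", "topic 2", "topic 1"], dtype="category")

# Same values, but without the categorical dtype.
plain = pd.Series([0, 3, 0, 1])

print(is_cluster_series(clusters))  # True
print(is_cluster_series(topics))    # True
print(is_cluster_series(plain))     # False

# The distinct cluster IDs are available through the .cat accessor.
print(list(clusters.cat.categories))  # [0, 1, 3]
```

A plain integer Series can be turned into this form with `plain.astype("category")`.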

NOTE: this PR only shows so many commits/lines because it builds on #157.

mk2510 and others added 14 commits August 18, 2020 22:06
support MultiIndex as function parameter

returns MultiIndex, where Representation was returned

* missing: correct test


Co-authored-by: Henri Froese <[email protected]>
* missing: adapting tests for new types


Co-authored-by: Henri Froese <[email protected]>
- add functionality for decorator @InputSeries to handle several allowed input types
- Add typing decorator/hints to representation.py
- add tests for _types DocumentTermDF

Co-authored-by: Maximilian Krahn <[email protected]>
@henrifroese
Collaborator Author

henrifroese commented Aug 29, 2020

Note: Black (our formatter) rolled out version 20.8b1 three days ago. It causes whitespace-related errors in preprocessing when we run ./tests.sh. I will investigate this further, but for now we pin the Black version in .travis.yml and setup.cfg to the last working version (19.10b1).

EDIT: found the issue, see the issue opened at Black here

@henrifroese henrifroese added the enhancement New feature or request label Sep 6, 2020
@jbesomi
Owner

jbesomi commented Sep 8, 2020

Cool, good idea!

Will review and merge once #156 and #157 are merged and #180 is solved

@jbesomi jbesomi marked this pull request as draft September 8, 2020 11:42
@jbesomi jbesomi linked an issue Sep 8, 2020 that may be closed by this pull request
@mk2510
Collaborator

mk2510 commented Sep 22, 2020

Merged master branch into this one, so it is ready for review/to be merged 🦂 🐼

@mk2510 mk2510 marked this pull request as ready for review September 22, 2020 13:21
Development

Successfully merging this pull request may close these issues.

Add ClusterSeries to Hero Types
3 participants