-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add special domain keys to data #2
Conversation
if ( | ||
not hasattr(obj, f"{name}_mask") | ||
or not getattr(obj, f"{name}_mask").all() | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we remove this whole logic? I believe "trials" is not a general concept, and so we shouldn't have an if-else for it in temporaldata
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tagged likes 2902-2914 here. GH only shows a few for some reason
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was old code that I wanted to deprecate so that I don't break too many things for people (or at least provide a clear error so they know what to fix), but I am down to also completely remove it.
I get why this is needed, and I don't mind current implementations. Though I have a reservation about using object names to decide what logic will be executed on them. It's an opinion, but I don't think it's clean. Another way, which I right now think avoids this is - create a new class |
I am open to that. The only drawback is that people will not think that domains are also intervals, or that they can do & and | between domains and intervals. also how do we resolve Domain & Interval, will the output be a Domain or an Interval? |
I don't see that as a problem. Equivalently, people wouldn't know if they name something "_domain", then it behaves differently. One has to read the docs to understand what's going on in either case. |
the
Data
object is defined by adomain
which is an interval object. but more generally we would want to associate different domains with this object. In particular there are two types:train_domain
,test_domain
etc... which help define the domains from which to sample for a given splitThese domains are "higher order entities" and are exempt from splitting, so we skip them.
In current code,
trials
is not checked for data leakage. instead of removing this, currently it is simply deprecated