How to represent binding of a gene product in LEGO: #29
Note: in annotation, the binding partner is typically (traditionally) buried in the 'with' field (yuk).
Did you mean there to be 3 options here? Or is the second option a single model with distinct direction-specific instances of the binding process? Can this be simplified with a SWRL rule? e.g.
The latter. Without stating both directions, one of the partners will not get an annotation.
So you'd only have to state the enables link to get the implication of has_input. I like it.
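The rule referred to here is not preserved in the thread. As a minimal sketch of the idea, assume a rule of the shape "if a binding process is enabled by x, then it also has x as input", written in Python over plain triples rather than SWRL. The relation names `enabled_by` and `has_input` follow the discussion; the function name, node names, and gene product identifiers are hypothetical.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def infer_has_input(triples: Set[Triple], binding_nodes: Set[str]) -> Set[Triple]:
    """Return the triples plus has_input links implied by enabled_by on binding nodes.

    Assumed rule (an illustration, not taken from the thread):
        binding(?b) ^ enabled_by(?b, ?x) -> has_input(?b, ?x)
    """
    inferred = set(triples)
    for subject, predicate, obj in triples:
        if predicate == "enabled_by" and subject in binding_nodes:
            inferred.add((subject, "has_input", obj))
    return inferred


# Hypothetical single-node model: geneA enables the binding, geneB is the stated input.
model = {("b1", "enabled_by", "geneA"), ("b1", "has_input", "geneB")}
result = infer_has_input(model, {"b1"})
```

Under this assumption, stating only the enabler is enough for both partners to end up as inputs of the same binding node.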
Where should these live - RO or GO? If GO, presumably we need a new file for rules?
Hmmm... Just thought. Given this equivalent class axiom, the rule would end up inferring that all binding is protein binding. Strictly, this may reflect the limitations of our binding design pattern rather than a mistake in the logic: an active participant (mediator) of a process (in this case the gene product) is arguably an input to that process. Still, it may be best to stick with the two-node pattern for annotation in these cases.
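The axiom itself is not preserved above. To illustrate the concern, the sketch below assumes a definition of the shape "protein binding = binding and (has_input some protein)" together with a rule that copies every enabler into `has_input`. Both are assumptions made for illustration, and all identifiers are hypothetical.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def apply_rule(triples: Set[Triple]) -> Set[Triple]:
    """Assumed rule: enabled_by(?b, ?x) -> has_input(?b, ?x)."""
    return triples | {(s, "has_input", o) for s, p, o in triples if p == "enabled_by"}


def is_protein_binding(node: str, triples: Set[Triple], types: Dict[str, str]) -> bool:
    """Assumed definition: binding and (has_input some protein)."""
    return types.get(node) == "binding" and any(
        s == node and p == "has_input" and types.get(o) == "protein"
        for s, p, o in triples
    )


# Hypothetical model: a protein (tf1) binds a DNA site, so this is not protein binding.
types = {"b1": "binding", "tf1": "protein", "site1": "DNA"}
model = {("b1", "enabled_by", "tf1"), ("b1", "has_input", "site1")}

before = is_protein_binding("b1", model, types)             # enabler not yet an input
after = is_protein_binding("b1", apply_rule(model), types)  # enabler copied into has_input
```

In this sketch the classification flips only because the protein enabler becomes an input, which is the over-inference described above.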
The confusion of what is correct here comes from treating MFs as processes but calling them functions. Each gene product has its own binding function, but those functions are simultaneously realized in a single binding process. (This is not to say that we should start treating MFs as 'realizables', but this framing makes the problem clearer).
Possible solution: Change logical definition of protein binding.
CC @cmungall - would be good to chat about this.
Easy part first: I think SWRL rules should live in RO, in a separate module, but imported by default. That makes them most amenable to global consistency checks with other relations.

Harder part: good catch, and I agree with the analysis. Can we first explore your original option number 2? (I don't know what the cell adhesion mediator story is and how that fits in.) Is it the case that there are always two complementary binding processes? I am imagining two scenarios:
Here I use 'bind' in the process sense, and 'function' as shorthand for evolved-to-do. In case 1, we would place two activities in the LEGO model. In case 2, only one. It seems it may be difficult to tease apart these scenarios. But the ability to tease them apart could be very useful.

If we decide to go with original option number 1, then your solution should work in theory, but is not totally straightforward. The el-shunt will help us with TBox reasoning, but for ABox (LEGO) reasoning we need to actually count distinct proteins. I'm not even sure if this is possible with a SWRL rule. There are subtleties here to do with the unique name assumption (UNA), which we implicitly make. We could have something like a SPARQL rule that makes the UNA and injects the PATO quality. Oh, and what about RNAs whose function is to bind a protein?

I think I am tending towards your option 2. It's how I think of this naturally, FWIW. There are some counter-intuitive aspects. E.g. if we consider has_input as a subproperty of has_participant (we could model it differently) then the two binding processes b1 and b2 are spatiotemporally identical. However, they are differentiated by their enablers; they are different views of the same process. Crudely, I have an analogy with a fight between two people (not sure where that came from). It's IMO more useful to look at this as two coincident processes seen from two different perspectives, each with different properties.
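The "count distinct proteins" step can be sketched procedurally. This is a minimal Python illustration, assuming the unique name assumption holds (distinct identifiers denote distinct individuals) and that participants are reached via `enabled_by` and `has_input`. The function names and identifiers are hypothetical, and no particular PATO term is implied.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def distinct_protein_participants(
    node: str, triples: Set[Triple], types: Dict[str, str]
) -> Set[str]:
    """Collect protein participants of a node; distinct IDs count as distinct (UNA)."""
    return {
        o
        for s, p, o in triples
        if s == node and p in ("enabled_by", "has_input") and types.get(o) == "protein"
    }


def is_protein_protein_binding(
    node: str, triples: Set[Triple], types: Dict[str, str]
) -> bool:
    """True when at least two distinct proteins participate in the binding node."""
    return len(distinct_protein_participants(node, triples, types)) >= 2


types = {"pA": "protein", "pB": "protein", "rna1": "RNA"}
model = {
    ("b1", "enabled_by", "pA"), ("b1", "has_input", "pB"),    # two distinct proteins
    ("b2", "enabled_by", "pA"), ("b2", "has_input", "pA"),    # enabler echoed as input
    ("b3", "enabled_by", "rna1"), ("b3", "has_input", "pA"),  # RNA binding a protein
}
```

One consequence of this sketch is that a gene product binding another copy of itself is counted once, so it would not be flagged; whether that is desirable is a separate modeling question.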
I tend to agree that option 2 is the better choice. David OS and I discussed this today and he had a good point, namely that anything we can do to reduce the number of nodes in the graph is helpful. I think this is right, but I note that only one of the two directional nodes will generally be a "function" node in a LEGO graph. The reverse direction node is usually only a subfunction of the overall downstream MF.
I agree with Paul, and it is consistent with the way I have been modeling. The function of the binding tends to be from a given perspective of an active participant/enabler. I think we have done some of these at the various workshops. In the old annotation paradigm, we always made the reciprocal binding annotation, with the caveat that the actual binding partner went in the 'with' field. If it was a mouse protein binding a human protein we would make the mouse annotation and the human protein went in the 'with' field. We didn't/couldn't make the reciprocal annotation for the human protein.
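The two-perspective pattern can be sketched as a small Python helper that builds one binding node per partner from a legacy-style record. The record layout, function name, and identifiers are hypothetical, chosen only to illustrate the direction-specific nodes discussed in this thread.

```python
from typing import Dict, List


def directional_binding_nodes(gene_product: str, partner: str) -> List[Dict[str, str]]:
    """Build one binding node per perspective for a pair of binding partners."""
    return [
        # Perspective of the annotated gene product.
        {"activity": "binding", "enabled_by": gene_product, "has_input": partner},
        # Reciprocal perspective of the partner.
        {"activity": "binding", "enabled_by": partner, "has_input": gene_product},
    ]


# Hypothetical legacy record: the annotated product plus its 'with' field partner.
legacy = {"gene_product": "mouse_protein_X", "with": "human_protein_Y"}
nodes = directional_binding_nodes(legacy["gene_product"], legacy["with"])
```

In a model, the reciprocal node can be stated even when, as in the mouse/human case above, the old paradigm had no place to record it.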
Having one node makes some important inferences easier.
Linking protein binding effector to sensor: In this case, we need some association between the two protein binding nodes in order to keep a continuous chain of regulates relations (essential for inference). Perhaps, rather than a new relationship for '?', Noctua should have something like Scratch, where two compatible nodes can 'snap' together.
@ukemi @vanaukenk
For GO:0005515 protein binding: @thomaspd proposes that we use 'has input' for both proteins (no 'enables')
This issue was moved to geneontology/go-annotation#2280
When a gene product binds to another gene product, do we need to choose one side as enabling?
Should we do this:
Or this:
The former is potentially useful in some templates - e.g. defining cell-adhesion mediator activity.