How to represent binding of a gene product in LEGO: #29

Closed
dosumis opened this issue Feb 8, 2017 · 16 comments

@dosumis
Contributor

dosumis commented Feb 8, 2017

When a gene product binds to another gene product, do we need to
choose one side as enabling?

Should we do this:

  GP1 -enables-> binding <-enables- GP2 
    ^__has_input___| |___has_input___^

Or this:

 GP1-enables->binding-has_input->GP2

 GP2-enables->binding-has_input->GP1

The former is potentially useful in some templates - e.g. defining
cell-adhesion mediator activity
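
To make the comparison concrete, here is a minimal sketch of both options as plain triples. The node names (`b`, `b1`, `b2`) and the helper `binding_annotations` are illustrative assumptions, not part of any LEGO/Noctua API.

```python
# Hypothetical triple encoding of the two options; all names are illustrative.
option_1 = {
    ("GP1", "enables", "b"),
    ("GP2", "enables", "b"),
    ("b", "has_input", "GP1"),
    ("b", "has_input", "GP2"),
}
option_2 = {
    ("GP1", "enables", "b1"),
    ("b1", "has_input", "GP2"),
    ("GP2", "enables", "b2"),
    ("b2", "has_input", "GP1"),
}

def binding_annotations(triples):
    """Return (enabler, partner) pairs: each enabler with every other input of its node."""
    pairs = set()
    for gp, pred, node in triples:
        if pred != "enables":
            continue
        for s, p, o in triples:
            if s == node and p == "has_input" and o != gp:
                pairs.add((gp, o))
    return pairs

# Both encodings yield an annotation for each partner.
print(binding_annotations(option_1) == binding_annotations(option_2))
```

Under these assumptions the two options carry the same partner information; the difference is one binding node versus two.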

@dosumis
Contributor Author

dosumis commented Feb 8, 2017

Note - in annotation the binding partner is typically (traditionally) buried in the with statement (yuk).

@cmungall
Member

cmungall commented Feb 9, 2017

Did you mean there to be 3 options here? Or is the second option a single model with distinct direction-specific instances of the binding process?

Can this be simplified with a swrl rule?

e.g.

?x enables ?p
?p type binding
->
?p has-input ?x

@dosumis
Contributor Author

dosumis commented Feb 9, 2017

> Did you mean there to be 3 options here? Or is the second option a single model with distinct direction-specific instances of the binding process?

The latter. Without stating both directions, one of the partners will not get an annotation.

> Can this be simplified with a swrl rule?
>
> e.g.
>
> ?x enables ?p
> ?p type binding
> ->
> ?p has-input ?x

So you'd only have to state the enables link to get the implication of has_input. I like it.
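
A rough sketch of what the rule would do, written as forward chaining over plain triples rather than actual SWRL. The function name and the `binding_nodes` argument are assumptions made for illustration.

```python
def apply_enabler_is_input_rule(triples, binding_nodes):
    """If ?x enables ?p and ?p is a binding node, add ?p has_input ?x."""
    inferred = set(triples)
    for s, p, o in triples:
        if p == "enables" and o in binding_nodes:
            inferred.add((o, "has_input", s))
    return inferred

# Only the enables links are asserted; the has_input links are inferred.
asserted = {("GP1", "enables", "b"), ("GP2", "enables", "b")}
inferred = apply_enabler_is_input_rule(asserted, binding_nodes={"b"})
print(sorted(inferred - asserted))
```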

CC @thomaspd @vanaukenk @ukemi

@dosumis
Contributor Author

dosumis commented Feb 11, 2017

- [ ] TODO: add swrl rule as detailed above.

Where should these live: RO or GO? If GO, presumably we need a new file for rules?

@dosumis
Contributor Author

dosumis commented Feb 16, 2017

Hmmm... Just thought. Given this equivalent class axiom, the rule would end up inferring that all binding is protein binding:

image

Strictly, this may reflect the limitations of our binding design pattern rather than a mistake in the logic: an active participant (mediator) of a process (in this case the gene product) is arguably an input to that process. Still, it may be best to stick with the two node pattern for annotation in these cases.
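
A toy illustration of the problem. It assumes the axiom in the image is roughly "'protein binding' EquivalentTo binding and has_input some protein", which is a guess from context; the node and entity names are made up.

```python
PROTEINS = {"GP1"}  # hypothetical: GP1 is a protein

def is_protein_binding(node, triples):
    """Assumed definition: a binding node with at least one protein input."""
    return any(s == node and p == "has_input" and o in PROTEINS
               for s, p, o in triples)

# A protein enabling DNA binding; the only asserted input is a DNA region.
asserted = {
    ("GP1", "enables", "dna_binding_1"),
    ("dna_binding_1", "has_input", "DNA_region"),
}
# Apply the proposed rule: every enabler also becomes an input.
with_rule = asserted | {(o, "has_input", s) for s, p, o in asserted if p == "enables"}

print(is_protein_binding("dna_binding_1", asserted))   # before the rule
print(is_protein_binding("dna_binding_1", with_rule))  # after the rule
```

Under that assumed definition, any binding enabled by a protein would be classified as protein binding once the rule fires.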

@dosumis
Contributor Author

dosumis commented Feb 19, 2017

The confusion of what is correct here comes from treating MFs as processes but calling them functions. Each gene product has its own binding function, but those functions are simultaneously realized in a single binding process. (This is not to say that we should start treating MFs as 'realizables', but this framing makes the problem clearer).

@dosumis
Contributor Author

dosumis commented Feb 20, 2017

Possible solution:

- Change the logical definition of protein binding.
  - In DL we could say: binding that has_input > 1 protein
  - EL shunt pattern: add a qualifier to indicate this: multiprotein
    - 'protein binding' bearer_of some 'multiprotein'
    - GCI: bearer_of multiprotein EquivalentTo has_input > 1 protein
    - GCI: bearer_of multiprotein SubClassOf has_input some protein
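
A sketch of how the revised definition would behave, again over plain triples. The function names and the `proteins` set are illustrative assumptions; this is not how a DL or EL reasoner would evaluate it.

```python
def protein_input_count(node, triples, proteins):
    """Number of distinct protein inputs of a node."""
    return len({o for s, p, o in triples
                if s == node and p == "has_input" and o in proteins})

def is_protein_binding_revised(node, triples, proteins):
    """Mirrors 'binding that has_input > 1 protein'."""
    return protein_input_count(node, triples, proteins) > 1

proteins = {"GP1", "GP2"}
# DNA binding where the enabler GP1 has been added as an input by the rule.
dna_case = {("b_dna", "has_input", "GP1"), ("b_dna", "has_input", "DNA_region")}
# Protein-protein binding with both partners as inputs.
pp_case = {("b_pp", "has_input", "GP1"), ("b_pp", "has_input", "GP2")}

print(is_protein_binding_revised("b_dna", dna_case, proteins))
print(is_protein_binding_revised("b_pp", pp_case, proteins))
```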

@dosumis
Contributor Author

dosumis commented Feb 20, 2017

CC @cmungall - would be good to chat about this.

@cmungall
Member

Easy part first: I think SWRL rules should live in RO, separate module, but imported by default. That makes them most amenable to global consistency checks with other relations.

Harder part: good catch and I agree with the analysis.

Can we first explore your original option number 2? (I don't know what the cell adhesion mediator story is or how that fits in.)

Is it the case that there are always two complementary binding processes? I am imagining two scenarios:

  1. it is the function of p1 to bind with proteins such as p2, and the function of p2 to bind with proteins such as p1
  2. it is the function of p1 to bind to p2 (for example, to disable it, in the case of foreign proteins, or simply proteins that are ubiquitinated)

Here I use 'bind' in the process sense, and 'function' as shorthand for evolved-to-do.

In case 1, we would place two activities in the lego model. In case 2, only one.

It seems it may be difficult to tease apart these scenarios. But the ability to tease them apart could be very useful.

If we decide to go with original option number 1, then your solution should work in theory, but is not totally straightforward. The EL shunt will help us with TBox reasoning, but for ABox (LEGO) reasoning we need to actually count distinct proteins. I'm not even sure if this is possible with a SWRL rule. There are subtleties here to do with the unique name assumption (which we implicitly make). We could have something like a SPARQL rule that makes the UNA and injects the PATO quality.
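
A sketch of the counting step under an explicit unique name assumption: different names count as different individuals unless a pair is declared the same. This is plain Python standing in for the SPARQL-style rule; `count_distinct`, `quality_for`, and the `same_as` argument are hypothetical names, and "multiprotein" is the qualifier proposed earlier in this thread.

```python
def count_distinct(inputs, same_as=()):
    """Count individuals under the UNA, merging only explicitly equated names."""
    parent = {x: x for x in inputs}

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b in same_as:
        if a in parent and b in parent:
            parent[find(a)] = find(b)
    return len({find(x) for x in inputs})

def quality_for(protein_inputs, same_as=()):
    """Inject the 'multiprotein' quality only when more than one distinct protein is counted."""
    return "multiprotein" if count_distinct(protein_inputs, same_as) > 1 else None

print(quality_for(["GP1", "GP2"]))
print(quality_for(["GP1", "GP1_alias"], same_as=[("GP1", "GP1_alias")]))
```

Without the UNA (or explicit different-individual assertions), an OWL reasoner cannot conclude that two differently named inputs are two proteins, which is why the count has to be made outside standard ABox reasoning.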

Oh, and what about RNAs whose function is to bind a protein?

I think I am tending towards your option 2. It's how I think of this naturally, FWIW. There are some counter-intuitive aspects. E.g. if we consider has_input as a subprop of has_participant (we could model it differently) then the two binding processes b1 and b2 are spatiotemporally identical. However, they are differentiated by their enablers; they are different views of the same process. Crudely, I have an analogy with a fight between two people (not sure where that came from). It's IMO more useful to look at this as two coincident processes seen from two different perspectives, each with different properties.

@thomaspd

I tend to agree that option 2 is the better choice. David OS and I discussed this today and he had a good point, namely that anything we can do to reduce the number of nodes in the graph is helpful. I think this is right, but I note that only one of the two directional nodes will generally be a "function" node in a LEGO graph. The reverse direction node is usually only a subfunction of the overall downstream MF.

@ukemi

ukemi commented Feb 20, 2017

I agree with Paul, and it is consistent with the way I have been modeling. The function of the binding tends to be from a given perspective of an active participant/enabler. I think we have done some of these at the various workshops. In the old annotation paradigm, we always made the reciprocal binding annotation, with the caveat that the actual binding partner went in the 'with' field. If it was a mouse protein binding a human protein we would make the mouse annotation and the human protein went in the 'with' field. We didn't/couldn't make the reciprocal annotation for the human protein.

@dosumis
Contributor Author

dosumis commented Feb 21, 2017

Having one node makes some important inference easier.

See: https://github.com/geneontology/molecular_function_refactoring/blob/master/direct_reg_inf_notes.md

@dosumis
Contributor Author

dosumis commented Feb 24, 2017

Linking protein binding effector to sensor:

image

In this case, we need some association between the two protein binding nodes in order to keep a continuous chain of regulates relations (essential for inference).
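
A toy sketch of why the association matters for inference. The image is not available here, so the node names, the direction of the edges, and the placeholder relation `linked_to` (standing in for '?') are all assumptions.

```python
def reachable(start, edges, relations):
    """Nodes reachable from start by following only the given relations."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for s, p, o in edges:
            if s == node and p in relations and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen

# Two separate protein binding nodes, one on the effector side, one on the sensor side.
edges = {
    ("upstream_activity", "directly_regulates", "effector_binding"),
    ("sensor_binding", "directly_regulates", "downstream_activity"),
}
# Add a hypothetical association between the two binding nodes.
linked = edges | {("effector_binding", "linked_to", "sensor_binding")}

print("downstream_activity" in reachable("upstream_activity", edges,
                                         {"directly_regulates"}))
print("downstream_activity" in reachable("upstream_activity", linked,
                                         {"directly_regulates", "linked_to"}))
```

With two unconnected binding nodes the chain breaks; any association that the traversal can follow restores it.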

Perhaps, rather than a new relationship for '?', Noctua should have something like Scratch, where two compatible nodes can 'snap' together.

CC @cmungall @ukemi

@pgaudet
Contributor

pgaudet commented Feb 21, 2019

@ukemi @vanaukenk
Is there anything left to do here? Seems like there is some agreement. (Do we need to document?)
If not, can you please close?

@pgaudet
Contributor

pgaudet commented Mar 1, 2019

For GO:0005515 protein binding: @thomaspd proposes that we use 'has input' for both proteins (no 'enables')
Otherwise this 'overloads' the 'enables' relation, since both proteins participate sort of equally.

@pgaudet
Contributor

pgaudet commented Mar 1, 2019

This issue was moved to geneontology/go-annotation#2280

@pgaudet pgaudet closed this as completed Mar 1, 2019