How to represent binding of a gene product in LEGO: #29

dosumis · 2017-02-08T13:53:39Z

When a gene product binds to another gene product, do we need to
choose one side as enabling?

Should we do this:

  GP1 -enables-> binding <-enables- GP2 
    ^__has_input___| |___has_input___^

Or this:

 GP1-enables->binding-has_input->GP2

 GP2-enables->binding-has_input->GP1

The former is potentially useful in some templates - e.g. defining
cell-adhesion mediator activity


dosumis · 2017-02-08T15:26:13Z

Note - in annotation the binding partner is typically (traditionally) buried in the with statement (yuk).

cmungall · 2017-02-09T06:47:47Z

Did you mean there to be 3 options here? Or is the second option a single model with distinct direction-specific instances of the binding process?

Can this be simplified with a swrl rule?

e.g.

?x enables ?p
?p type binding
->
?p has-input ?x
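
As a rough illustration of what this rule would do (not SWRL itself, and not how any GO tooling implements it), here is a minimal Python sketch that forward-chains the rule over a set of triples. The triple layout and the identifiers `b1`, `GP1`, `GP2` are assumptions made up for the example.

```python
# Triples are (subject, predicate, object). All identifiers are illustrative.
def apply_binding_rule(triples):
    """If ?x enables ?p and ?p type binding, infer ?p has_input ?x."""
    binding_instances = {
        s for (s, pred, o) in triples if pred == "type" and o == "binding"
    }
    inferred = {
        (proc, "has_input", gp)
        for (gp, pred, proc) in triples
        if pred == "enables" and proc in binding_instances
    }
    return set(triples) | inferred


# Option 1 from the first comment: one binding node enabled by both partners.
model = {
    ("b1", "type", "binding"),
    ("GP1", "enables", "b1"),
    ("GP2", "enables", "b1"),
}
closed = apply_binding_rule(model)
```

With only the two `enables` edges asserted, `closed` also contains `has_input` edges from `b1` to both `GP1` and `GP2`.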

dosumis · 2017-02-09T07:03:30Z

> Did you mean there to be 3 options here? Or is the second option a single model with distinct direction-specific instances of the binding process?

The latter. Without stating both directions, one of the partners will not get an annotation.
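
To make the point concrete, here is a small Python sketch. It assumes, purely for illustration, that a gene product receives an annotation only when it is the subject of an `enables` edge; the node names are hypothetical.

```python
def annotated_gene_products(edges):
    """Gene products that would receive an annotation, assuming only
    the subject of an 'enables' edge is annotated."""
    return {s for (s, rel, o) in edges if rel == "enables"}


# Only one direction stated: GP1 -enables-> b1 -has_input-> GP2
one_direction = [
    ("GP1", "enables", "b1"),
    ("b1", "has_input", "GP2"),
]

# Both directions stated, as two direction-specific binding instances.
both_directions = one_direction + [
    ("GP2", "enables", "b2"),
    ("b2", "has_input", "GP1"),
]

only_one = annotated_gene_products(one_direction)
both = annotated_gene_products(both_directions)
```

Under that assumption, `only_one` contains just `GP1`, while `both` contains `GP1` and `GP2`.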

> Can this be simplified with a swrl rule?
>
> e.g.
>
> ?x enables ?p
> ?p type binding
> ->
> ?p has-input ?x

So you'd only have to state the enables link to get the implication of has_input. I like it.

CC @thomaspd @vanaukenk @ukemi

dosumis · 2017-02-11T13:32:20Z

~~- [ ] TODO: add swrl rule as detailed above.~~

Where should these live - RO or GO? If GO, presumably we need a new file for rules?

dosumis · 2017-02-16T20:59:35Z

Hmmm... Just thought. Given this equivalent class axiom, the rule would end up inferring that all binding is protein binding:

Strictly, this may reflect the limitations of our binding design pattern rather than a mistake in the logic: an active participant (mediator) of a process (in this case the gene product) is arguably an input to that process. Still, it may be best to stick with the two node pattern for annotation in these cases.
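
A minimal Python sketch of the problem follows. It assumes the equivalence axiom has roughly the shape "protein binding = binding and has_input some protein"; that shape is an assumption for the example, not a quotation of the ontology. The individuals `P1`, `D1` and `b1` are hypothetical.

```python
def apply_binding_rule(triples):
    """The proposed rule: ?x enables ?p, ?p type binding -> ?p has_input ?x."""
    bindings = {s for (s, p, o) in triples if p == "type" and o == "binding"}
    inferred = {
        (proc, "has_input", gp)
        for (gp, p, proc) in triples
        if p == "enables" and proc in bindings
    }
    return set(triples) | inferred


def classify(process, triples):
    """Types of a process, applying the ASSUMED definition
    'protein binding' = binding and has_input some protein."""
    types = {o for (s, p, o) in triples if s == process and p == "type"}
    inputs = {o for (s, p, o) in triples if s == process and p == "has_input"}
    has_protein_input = any((i, "type", "protein") in triples for i in inputs)
    if "binding" in types and has_protein_input:
        types.add("protein binding")
    return types


# A protein P1 enabling the binding of a DNA molecule D1.
model = {
    ("P1", "type", "protein"),
    ("D1", "type", "DNA"),
    ("b1", "type", "binding"),
    ("P1", "enables", "b1"),
    ("b1", "has_input", "D1"),
}
before = classify("b1", model)
after = classify("b1", apply_binding_rule(model))
```

Before the rule is applied, `b1` is only a binding. After it, the protein enabler has become an input, so `b1` is also classified as protein binding even though its asserted input is DNA.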

dosumis · 2017-02-19T11:25:15Z

The confusion of what is correct here comes from treating MFs as processes but calling them functions. Each gene product has its own binding function, but those functions are simultaneously realized in a single binding process. (This is not to say that we should start treating MFs as 'realizables', but this framing makes the problem clearer).

dosumis · 2017-02-20T18:05:14Z

Possible solution:

Change logical definition of protein binding.
In DL we could say binding that has_input > 1 protein
EL shunt pattern: Add qualifier to indicate this: multiprotein
'protein binding' bearer_of some 'multiprotein'
GCI: bearer_of multiprotein equivalentTo has_input > 1 protein
GCI: bearer_of multiprotein subClassOf has_input some protein

dosumis · 2017-02-20T18:12:11Z

CC @cmungall - would be good to chat about this.

cmungall · 2017-02-20T19:22:45Z

Easy part first: I think SWRL rules should live in RO, separate module, but imported by default. That makes them most amenable to global consistency checks with other relations.

Harder part: good catch and I agree with the analysis.

Can we first explore your original option number 2? (I don't know what the cell adhesion mediator story is and how that fits in.)

Is it the case that there are always two complementary binding processes? I am imagining two scenarios:

1. It is the function of p1 to bind with proteins such as p2, and the function of p2 to bind with proteins such as p1.
2. It is the function of p1 to bind to p2 (for example, to disable it, in the case of foreign proteins, or simply proteins that are ubiquitinated).

Here I use 'bind' in the process sense, and 'function' as shorthand for evolved-to-do.

In case 1, we would place two activities in the lego model. In case 2, only one.

It seems it may be difficult to tease apart these scenarios. But the ability to tease them apart could be very useful.

If we decide to go with original option number 1, then your solution should work in theory, but is not totally straightforward. The EL shunt will help us with TBox reasoning, but for ABox (LEGO) reasoning we need to actually count distinct proteins. I'm not even sure if this is possible with a SWRL rule. There are subtleties here to do with the unique name assumption (which we implicitly make). We could have something like a SPARQL rule that makes the UNA and injects the PATO quality.
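
As a sketch of the counting step only, the following Python stands in for such a rule (it is not SPARQL and not an existing tool). It makes the unique name assumption by treating different identifiers as different individuals, and the `multiprotein` quality name is taken from the proposal above; everything else is hypothetical.

```python
def mark_multiprotein(triples):
    """Under the unique name assumption, add (process, bearer_of, multiprotein)
    for every process with more than one distinct protein input."""
    proteins = {s for (s, p, o) in triples if p == "type" and o == "protein"}
    protein_inputs = {}
    for (s, p, o) in triples:
        if p == "has_input" and o in proteins:
            # Distinct identifiers are counted as distinct individuals (UNA).
            protein_inputs.setdefault(s, set()).add(o)
    added = {
        (proc, "bearer_of", "multiprotein")
        for proc, inputs in protein_inputs.items()
        if len(inputs) > 1
    }
    return set(triples) | added


model = {
    ("P1", "type", "protein"),
    ("P2", "type", "protein"),
    ("R1", "type", "RNA"),
    ("b1", "has_input", "P1"),
    ("b1", "has_input", "P2"),  # two distinct proteins
    ("b2", "has_input", "P1"),
    ("b2", "has_input", "R1"),  # one protein, one RNA
}
result = mark_multiprotein(model)
```

Here `b1` gets the quality and `b2` does not. Without the UNA, `P1` and `P2` could denote the same individual and the count would not be safe.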

Oh, and what about RNAs whose function is to bind a protein?

I think I am tending towards your option 2. It's how I think of this naturally, FWIW. There are some counter-intuitive aspects. E.g. if we consider has_input as a subprop of has_participant (we could model it differently) then the two binding processes b1 and b2 are spatiotemporally identical. However, they are differentiated by their enablers; they are different views of the same process. Crudely, I have an analogy with a fight between two people (not sure where that came from). It's IMO more useful to look at this as two coincident processes from two different perspectives, each with different properties.

thomaspd · 2017-02-20T20:12:28Z

I tend to agree that option 2 is the better choice. David OS and I discussed this today and he had a good point, namely that anything we can do to reduce the number of nodes in the graph is helpful. I think this is right, but I note that only one of the two directional nodes will generally be a "function" node in a LEGO graph. The reverse direction node is usually only a subfunction of the overall downstream MF.

ukemi · 2017-02-20T20:24:25Z

I agree with Paul, and it is consistent with the way I have been modeling. The function of the binding tends to be from a given perspective of an active participant/enabler. I think we have done some of these at the various workshops. In the old annotation paradigm, we always made the reciprocal binding annotation, with the caveat that the actual binding partner went in the 'with' field. If it was a mouse protein binding a human protein we would make the mouse annotation and the human protein went in the 'with' field. We didn't/couldn't make the reciprocal annotation for the human protein.

dosumis · 2017-02-21T11:52:22Z

Having one node makes some important inference easier.

See: https://github.com/geneontology/molecular_function_refactoring/blob/master/direct_reg_inf_notes.md

dosumis · 2017-02-24T04:16:21Z

Linking protein binding effector to sensor:

In this case, we need some association between the two protein binding nodes in order to keep a continuous chain of regulates relations (essential for inference).

Perhaps, rather than a new relationship for '?', Noctua should have something like Scratch, where two compatible nodes can 'snap' together.

CC @cmungall @ukemi

pgaudet · 2019-02-21T15:17:49Z

@ukemi @vanaukenk
Is there anything left to do here? There seems to be some agreement. (Do we need to document it?)
If not, can you please close?

pgaudet · 2019-03-01T17:27:30Z

For GO:0005515 protein binding: @thomaspd proposes that we use 'has input' for both proteins (no 'enables')
Otherwise this 'overloads' the 'enables' relation, since both proteins participate sort of equally.

pgaudet · 2019-03-01T17:29:05Z

This issue was moved to geneontology/go-annotation#2280

cmungall mentioned this issue Apr 12, 2017

Clarify protein binding representation geneontology/go-site#334

Merged

pgaudet mentioned this issue Mar 1, 2019

How to represent binding of a gene product in LEGO: geneontology/go-annotation#2280

Open

pgaudet closed this as completed Mar 1, 2019
