Add nodal attribute subsetting to gwesp and friends #478

Closed
sgoodreau opened this issue Aug 2, 2022 · 13 comments

@sgoodreau
Contributor

We've had a request from a long-standing friend of the group (Jim Moody) to add attribute subsetting to gwesp, akin to the version in triangle. Yes, he could do it, but in the end, making it consistent involves a lot of terms and is probably best done in-house. The functionality makes good sense, and is in our interest, in that we've been trying to encourage the use of gwesp over triangle for more than a decade, so it's good to have it provide a superset of triangle's functionality.

By "friends" I mean the directed versions, as well as gwdegree and gwdsp and their directed versions. There are probably others I'm missing.

The terms should take a diff argument accepting TRUE or FALSE, as triangle does.

@CarterButts I know you've always made the strong case for thinking about triangle closure as an inhomogeneous phenomenon. Do you already have a version of this somewhere that I'm not seeing?

@chad-klumb, @martinamorris, @krivit, @drh20drh20 thoughts?

@krivit
Member

krivit commented Aug 2, 2022

What do they mean by "attribute subsetting"? Evaluating the terms on an induced subgraph of nodes with a specific attribute value? Or something else?

@CarterButts

CarterButts commented Aug 2, 2022 via email

@handcock
Contributor

handcock commented Aug 2, 2022

Hi Steve,

If you mean statistics like a homogeneous GWESP (i.e., all incident nodes have the same value of an attribute), then I have implemented GWESP, GWDSP and GWDEG. It is a start, and I can share it if that is of interest.

Best,

Mark

@krivit
Member

krivit commented Aug 3, 2022

@handcock, @sgoodreau, that's what I am wondering about. If we are talking about GWESP (or similar) evaluated on an induced subgraph defined by vertices having a certain attribute value, we already have machinery for that with term operators. I believe something like

S(~gwesp, ~a==1)

will evaluate gwesp on a subgraph comprising vertices for which vertex attribute a has value 1.

Doing it for each level requires several terms, i.e.,

S(~gwesp, ~a==1) + S(~gwesp, ~a==2) + S(~gwesp, ~a==3)

but I can see implementing something like a For operator, i.e.,

For(~S(~gwesp, ~a==x), "x", 1:3)

which would be expanded into the above.
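For readers unfamiliar with the induced-subgraph idea, here is a minimal sketch in plain Python (not ergm code) of what evaluating a statistic separately for each attribute level amounts to. It assumes an undirected network stored as an edge list and uses the count of edges with at least one shared partner, which is what gwesp reduces to when decay is fixed at 0. The function names and the toy data are hypothetical.

```python
def edges_with_shared_partner(edges):
    """Count undirected edges whose two endpoints share at least one neighbor."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return sum(1 for u, v in edges if nbrs[u] & nbrs[v])


def induced_subgraph(edges, attr, level):
    """Keep only the edges whose endpoints both have attribute value `level`."""
    return [(u, v) for u, v in edges if attr[u] == level and attr[v] == level]


# Toy data (made up for illustration): vertex -> attribute value.
attr = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2, 7: 3}
edges = [
    (1, 2), (2, 3), (1, 3),  # triangle inside level 1
    (4, 5), (5, 6),          # two-edge path inside level 2
    (3, 4), (3, 5), (6, 7),  # edges that cross levels
]

# One statistic per level, analogous to one S() term per level.
per_level = {
    level: edges_with_shared_partner(induced_subgraph(edges, attr, level))
    for level in (1, 2, 3)
}
print(per_level)
```

Only the level-1 triangle contributes here: the mixed triangle on vertices 3, 4 and 5 disappears once each subgraph is restricted to a single level.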

@krivit
Member

krivit commented Aug 3, 2022

I've opened a ticket for a "foreach" operator.

krivit added a commit that referenced this issue Aug 3, 2022
…e formula length(list) times, substituting each element of list for counter.

references #478, fixes #479
@krivit
Member

krivit commented Aug 3, 2022

@CarterButts , too.

@sgoodreau
Contributor Author

Thanks all! I tend to forget about the full flexibility of the term operators.

And yes, I meant the cases in which all members of the relevant structure have the same value of the attribute.

So, to make sure I'm understanding correctly: let's say I have a vertex attribute named "group", with values 1:4. And I want to consider the number of edges that are in at least one attribute-homogeneous triangle, regardless of what the specific attribute value is. I think that would currently mean combining four uses of S() with one of Sum(), into something like:

~Sum(~S(~gwesp(decay=0, fied=TRUE), ~group==1) + S(~gwesp(decay=0, fied=TRUE), ~group==2) + S(~gwesp(decay=0, fied=TRUE), ~group==3) + S(~gwesp(decay=0, fied=TRUE), ~group==4))

and with the new foreach operator that Pavel just mentioned, this code would be simplified considerably.

Is that right? I admit I can't quite follow all of the nuance in the ERGM 4.0 paper regarding the use of ~Sum in a case like this.

@krivit
Member

krivit commented Aug 3, 2022

Almost: you also need to tell Sum to add up everything on the formula:

~Sum("sum"~S(~gwesp(decay=0, fied=TRUE), ~group==1) + S(~gwesp(decay=0, fied=TRUE), ~group==2) + S(~gwesp(decay=0, fied=TRUE), ~group==3) + S(~gwesp(decay=0, fied=TRUE), ~group==4))

@sgoodreau
Contributor Author

Can you explain why the word "sum" appears twice, once capitalized and once not? That I don't quite get. Thanks.

PS You also missed my misspelling of fixed :-)

@krivit
Member

krivit commented Aug 3, 2022

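The lowercase "sum" on the left-hand side of the formula tells Sum() to add up all the statistics on that formula. By default, Sum() instead adds up the formulas in its list elementwise, so a single formula would come back unchanged.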
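To make the distinction concrete, here is a toy model in plain Python of the two behaviours described above. It is only a sketch of the idea, not ergm's implementation: the function name, the (lhs, stats) representation and the numbers are all assumptions made for illustration.

```python
def sum_operator(formulas):
    """Toy model of Sum(): `formulas` is a list of (lhs, stats) pairs.

    lhs is None (use the formula's statistics as they are) or "sum"
    (collapse the formula's statistics into their total). The resulting
    vectors are then added elementwise across the formulas in the list.
    """
    vectors = []
    for lhs, stats in formulas:
        if lhs == "sum":
            vectors.append([sum(stats)])
        elif lhs is None:
            vectors.append(list(stats))
        else:
            raise ValueError(f"unsupported LHS in this sketch: {lhs!r}")
    if len({len(v) for v in vectors}) != 1:
        raise ValueError("all formulas must yield the same number of statistics")
    return [sum(column) for column in zip(*vectors)]


# Hypothetical per-group statistics from four S(~gwesp(...), ~group==k) terms.
per_group = [3, 0, 2, 1]

without_lhs = sum_operator([(None, per_group)])   # one formula, nothing to add it to
with_lhs = sum_operator([("sum", per_group)])     # statistics on the formula are totalled
print(without_lhs, with_lhs)
```

With a single formula and no left-hand side, the four statistics come back as they were; with "sum" on the left-hand side they collapse into one total, which is the quantity asked for earlier in the thread.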

@krivit
Member

krivit commented Aug 4, 2022

We really should start adding examples to terms. Any volunteers?

Also, can y'all take a look at the #479 ticket? I have a preliminary implementation, but I would like some feedback on the user interface.

@krivit
Member

krivit commented Aug 8, 2022

@sgoodreau, actually, if

  1. the statistics are local in the sense that if the network has multiple connected components, the value of the statistic is the sum of its values on each component (true for gwesp), and
  2. what is wanted is the total of the statistics over all levels of a rather than broken down by a,

then there is a simpler and probably more computationally efficient way to do this:

F(~gwesp, ~nodematch("a"))
NodematchFilter(~gwesp, "a")

What these do is start with the LHS network, delete all edges for which nodematch("a") does not hold, and then evaluate gwesp() on that.

If you want a gwesp for each separately, you still need the subgraph method, though perhaps a future version of F()

I'll still add the foreach operator, but if the above works, I'd recommend that.
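The equivalence behind the edge-filter approach can be checked on a small example. The sketch below is plain Python rather than ergm code, with hypothetical function names and toy data. It assumes an undirected edge list and uses the count of edges with at least one shared partner (gwesp with decay fixed at 0). After deleting every edge whose endpoints differ on the attribute, each connected component lies within a single attribute level, so a local statistic evaluated on the filtered network equals the sum of its values on the per-level induced subgraphs.

```python
def edges_with_shared_partner(edges):
    """Count undirected edges whose two endpoints share at least one neighbor."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return sum(1 for u, v in edges if nbrs[u] & nbrs[v])


def nodematch_filter(edges, attr):
    """Drop every edge whose endpoints have different attribute values."""
    return [(u, v) for u, v in edges if attr[u] == attr[v]]


def induced_subgraph(edges, attr, level):
    """Keep only the edges whose endpoints both have attribute value `level`."""
    return [(u, v) for u, v in edges if attr[u] == level and attr[v] == level]


# Toy data (made up for illustration): vertex -> attribute value.
attr = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2, 7: 3}
edges = [
    (1, 2), (2, 3), (1, 3),  # triangle inside level 1
    (4, 5), (5, 6),          # two-edge path inside level 2
    (3, 4), (3, 5), (6, 7),  # edges that cross levels
]

unfiltered = edges_with_shared_partner(edges)
filtered = edges_with_shared_partner(nodematch_filter(edges, attr))
summed_over_levels = sum(
    edges_with_shared_partner(induced_subgraph(edges, attr, level))
    for level in sorted(set(attr.values()))
)
print(unfiltered, filtered, summed_over_levels)
```

The filtered value and the summed per-level value agree, while the unfiltered value is larger because it also counts the mixed triangle on vertices 3, 4 and 5. The filter needs a single pass over the edges, whereas the subgraph route needs one evaluation per level.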

@sgoodreau
Contributor Author

Wonderful, thanks. We definitely need a gallery of these examples, because I doubt many users will be able to ascertain all of this added flexibility, even after reading the ergm 4.0 paper. I know I couldn't!
