Recover block memberships with dcsbm? #35
Comments
We reorder It sounds like you may also want the undirected version of the sbm, in which case you should use If you would like to combine the latent information in your
set.seed(32)
bm <- as.matrix(cbind(
  c(.3, .005, .005, .005, .005),
  c(.002, .3, .005, .005, .005),
  c(.002, .01, .3, .005, .005),
  c(.002, .01, .005, .2, .005),
  c(.002, .005, .005, .005, .2)
))
pi <- c(5, 50, 20, 25, 100)
sbm <- fastRG::directed_dcsbm(
  n = 200,
  B = bm,
  pi_in = pi,
  pi_out = pi,
  expected_out_degree = 3,
  allow_self_loops = FALSE,
  sort_nodes = TRUE
)
#> Generating random degree heterogeneity parameters `theta_in` and `theta_out` from LogNormal(2, 1) distributions. This distribution may change in the future. Explicitly set `theta_in` and `theta_out` for reproducible results.
net <- fastRG::sample_igraph(sbm)
## the order of the blocks as given for the block probabilities doesn't align with the order of the block memberships in the factor model:
pi
#> [1] 5 50 20 25 100
table(sbm$z_in)
#>
#> 1 2 3 4 5
#> 3 25 24 47 101
net |>
  igraph::set_vertex_attr("in_block", value = sbm$z_in)
#> Error in i_set_vertex_attr(graph = graph, name = name, value = value, : Length of new attribute value must be 1 or 156, the number of target vertices, not 200
Created on 2023-08-14 with reprex v2.0.2
However, it looks like there is an issue creating the
Fixed. You'll need to update to the dev version with
remotes::install_github("RoheLab/fastRG")
Then I think you'll want something like the following:
set.seed(32)
bm <- as.matrix(cbind(
  c(.3, .005, .005, .005, .005),
  c(.002, .3, .005, .005, .005),
  c(.002, .01, .3, .005, .005),
  c(.002, .01, .005, .2, .005),
  c(.002, .005, .005, .005, .2)
))
pi <- c(5, 50, 20, 25, 100)
latent <- fastRG::dcsbm(
  n = 200,
  B = bm,
  pi = pi,
  expected_degree = 3,
  allow_self_loops = FALSE,
  sort_nodes = TRUE,
  poisson_edges = FALSE # my guess is that you want this! I would read the documentation about this carefully.
)
#> Generating random degree heterogeneity parameters `theta` from a LogNormal(2, 1) distribution. This distribution may change in the future. Explicitly set `theta` for reproducible results.
ig <- fastRG::sample_igraph(latent)
# node orders between `latent` and `ig` object will match up :)
ig_with_block <- ig |>
  igraph::set_vertex_attr("block", value = latent$z)
igraph::V(ig_with_block)$block
#> [1] block1 block1 block1 block1 block1 block1 block1 block1 block2 block2
#> [11] block2 block2 block2 block2 block2 block2 block2 block2 block2 block2
#> [21] block2 block2 block2 block2 block2 block2 block2 block2 block2 block2
#> [31] block2 block2 block3 block3 block3 block3 block3 block3 block3 block3
#> [41] block3 block3 block3 block3 block3 block3 block3 block3 block3 block3
#> [51] block3 block3 block4 block4 block4 block4 block4 block4 block4 block4
#> [61] block4 block4 block4 block4 block4 block4 block4 block4 block4 block4
#> [71] block4 block4 block4 block4 block4 block4 block4 block4 block4 block4
#> [81] block4 block4 block4 block4 block4 block4 block4 block4 block4 block4
#> [91] block4 block4 block4 block4 block4 block4 block4 block4 block4 block4
#> [101] block4 block4 block4 block4 block4 block4 block4 block4 block4 block4
#> [111] block4 block4 block4 block5 block5 block5 block5 block5 block5 block5
#> [121] block5 block5 block5 block5 block5 block5 block5 block5 block5 block5
#> [131] block5 block5 block5 block5 block5 block5 block5 block5 block5 block5
#> [141] block5 block5 block5 block5 block5 block5 block5 block5 block5 block5
#> [151] block5 block5 block5 block5 block5 block5 block5 block5 block5 block5
#> [161] block5 block5 block5 block5 block5 block5 block5 block5 block5 block5
#> [171] block5 block5 block5 block5 block5 block5 block5 block5 block5 block5
#> [181] block5 block5 block5 block5 block5 block5 block5 block5 block5 block5
#> [191] block5 block5 block5 block5 block5 block5 block5 block5 block5 block5
#> Levels: block1 block2 block3 block4 block5
Again, note that the blocks are ordered by block size.
Created on 2023-08-14 with reprex v2.0.2
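The note above that blocks are ordered by block size can be illustrated with a small base R sketch. It does not call fastRG, and whether the package reorders `B` internally in exactly this way is an assumption; the names `ord`, `pi_sorted`, and `B_sorted` are hypothetical. The point is only that, when blocks are sorted by expected size, the rows and columns of the block matrix have to be permuted with the same ordering for block labels and block-matrix entries to stay aligned.

```r
# Block matrix and block weights as given in the thread
bm <- cbind(
  c(.3, .005, .005, .005, .005),
  c(.002, .3, .005, .005, .005),
  c(.002, .01, .3, .005, .005),
  c(.002, .01, .005, .2, .005),
  c(.002, .005, .005, .005, .2)
)
pi <- c(5, 50, 20, 25, 100)

# Permutation that sorts blocks from smallest to largest expected size.
# order() is stable, so blocks with equal weights keep their input order.
ord <- order(pi)

# Apply the same permutation to the block weights and to both
# dimensions of the block matrix (assumed convention, not fastRG's API)
pi_sorted <- pi[ord] / sum(pi)
B_sorted <- bm[ord, ord]

ord
diag(B_sorted)
```

Here the original block 2 (weight 50) ends up in position 4 after sorting, so "block4" in a sorted model would correspond to row and column 2 of the matrix as it was typed in.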
Thanks for the quick and helpful reply!
Just to be clear: imagine you have two blocks of the same size, but very different entries in the block matrix. How could you be sure that you're associating the right block with the right entries in the block matrix? e.g., if you run the above with
Also, with
(Finally, I do actually want a directed network to result (and the block matrix given wasn't symmetric), so is there any chance of setting
When we sort
Right, they're sorted by expected size.
Yes, although it's a little hacky.
set.seed(32)
bm <- as.matrix(cbind(
  c(.3, .005, .005, .005, .005),
  c(.002, .3, .005, .005, .005),
  c(.002, .01, .3, .005, .005),
  c(.002, .01, .005, .2, .005),
  c(.002, .005, .005, .005, .2)
))
pi <- c(5, 50, 20, 25, 100)
# note: this is a Poisson DCSBM, rather than a Bernoulli DCSBM
latent <- fastRG::directed_dcsbm(
  n = 200,
  B = bm,
  pi_in = pi,
  pi_out = pi,
  expected_out_degree = 3,
  allow_self_loops = FALSE,
  sort_nodes = TRUE
)
#> Generating random degree heterogeneity parameters `theta_in` and `theta_out` from LogNormal(2, 1) distributions. This distribution may change in the future. Explicitly set `theta_in` and `theta_out` for reproducible results.
# for sampling to work as expected, all you need is this, which forces
# blocks and degree-correction parameters to match across incoming and
# outgoing blocks
latent$Y <- latent$X
# fix meta-data
latent$theta_out <- latent$theta_in
latent$z_out <- latent$z_in
latent$pi_out <- latent$pi_in
ig <- fastRG::sample_igraph(latent)
# node orders between `latent` and `ig` object will match up :)
ig_with_block <- ig |>
  igraph::set_vertex_attr("block", value = latent$z_in)
igraph::V(ig_with_block)$block
#> [1] 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3
#> [38] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
#> [75] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5
#> [112] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
#> [149] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
#> [186] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
#> Levels: 1 2 3 4 5
Created on 2023-08-15 with reprex v2.0.2
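One way to sanity-check the workaround above is to cross-tabulate the incoming and outgoing memberships, as with the `table(sbm$z_in, sbm$z_out)` call mentioned in the original report. The sketch below uses base R only; `z_in` and `z_out` are hypothetical stand-ins for `latent$z_in` and `latent$z_out`, built from the block sizes printed above rather than from a real fastRG object.

```r
# Hypothetical stand-ins for latent$z_in and latent$z_out after the fix;
# block sizes are taken from the printed output in this thread
z_in <- factor(rep(1:5, times = c(3, 25, 24, 47, 101)))
z_out <- z_in

# If incoming and outgoing memberships agree for every node, all counts
# fall on the diagonal of the cross-tabulation
xtab <- table(z_in, z_out)
off_diagonal <- sum(xtab) - sum(diag(xtab))
same_labels <- identical(z_in, z_out)

xtab
off_diagonal
```

A non-zero `off_diagonal` count would mean some nodes still have different incoming and outgoing blocks.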
- See discussion in #35
- Also flip incoming and outgoing blocks, such that X now contains info about outgoing blocks and Y now contains info about incoming blocks, as you would expect if A[i, j] encodes an edge from node i to node j
- Update NEWS accordingly
Hi! Thanks for your work on the package!
I was hoping to use the `directed_dcsbm` function to generate simple degree-corrected SBMs. I've run into some issues surrounding the block memberships.
At the very least, I would like to be able to recover each node's membership as we move from the factor model to the sampled network. I know that the factor model given as input has parameters `z_in` and `z_out`, but as the samples that result from that (I'm using `sample_igraph`) don't necessarily have the same number of nodes, there can't be a direct mapping of those vectors to what we get in any graph drawn from that.
I'm also specifying a vector of block probabilities, but their ordering doesn't seem to carry through to the factor model. I'm not sure what that implies for the use with the block matrix. Code to demonstrate that while the blocks are sized roughly in agreement with what's given to `pi_in` or `pi_out`, they don't appear in the same order in `z_in` or `z_out`:
All of this means that I'm unsure of the block memberships of each node, and whether they align appropriately with the block matrix given as input. (In a perfect world, the igraph object that's created would have vertex attributes that are the block memberships.)
Ideally, I'd further be able to specify that `z_in == z_out`. Even if I specify both `p_in` and `p_out`, they don't align (clear with `table(sbm$z_in, sbm$z_out)`).
Let me know if these issues are unclear! Thanks again!