On some input files, bwtool matrix -cluster=10 crashes as follows:
“./bwtool matrix -keep-be…” terminated by signal SIGSEGV (Address boundary error)
I traced the error to this code in beato/cluster.c:
static int *k_means(struct cluster_bed_matrix *cbm, double t)
{
    /* output cluster label for each data point */
    int *labels; /* Labels for each cluster (size n) */
    int h, i, j; /* loop counters, of course :) */
    double old_error;
    double error = DBL_MAX; /* sum of squared euclidean distance */
    double **tmp_centroids; /* centroids and temp centroids (size k x m) */
    int n = cbm->n;
    int m = cbm->m;
    int k = cbm->k;
    AllocArray(labels, n);
    AllocArray(tmp_centroids, k);
    printf("k_means: 0\n");
    for (i = 0; i < k; i++)
        AllocArray(tmp_centroids[i], m);
    /* assert(data && k > 0 && k <= n && m > 0 && t >= 0); /\* for debugging *\/ */
    /* initialization */
    printf("k_means: 1\n");
    for (i = 0, h = cbm->num_na; i < k; h += (cbm->n - cbm->num_na) / k, i++)
    {
        printf("k_means: 1:%d\n", i);
        /* pick k points as initial centroids */
        for (j = 0; j < m; j++) {
            printf("k_means: 1:%d %d %d %d\n", i, j, m, h);
            cbm->centroids[i][j] = cbm->pbm->matrix[h][j];
        }
    }
...
Thanks for finding this. Man, I used the cluster feature a lot in my research and never came across this. I'll investigate. Thanks for the debugging information.

Is it possible to put data on the web that can reproduce the error? I understand if that's not possible with real data. Maybe you have toy/fake data. I just haven't been in this code in a while, and without data it'll be hard to dissect the problem.

I imagine it's a case where, if all the numbers involved are perfect multiples of the size of an array, something gets calculated off by one. Well... that's what I imagine. Of course I won't rule out something more sinister. I've made a lot of mistakes here and there. But I thought I would've noticed if the algorithm was producing incorrect results.
For a working file:
For a non-working file:
I think h (=20000) is calculated wrongly:
I have a window size of 10000 and 20000 regions in my BED file.
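To make the arithmetic easier to check, here is a minimal standalone sketch of the index computation in the seeding loop quoted above. The helper names `max_seed_row` and `seeds_in_bounds` are hypothetical and not part of bwtool. The sketch assumes that `cbm->pbm->matrix` has exactly `n` rows (valid indices 0 to n-1), which the thread does not confirm.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of bwtool): returns the largest row index h
 * that the centroid-seeding loop would read, using the same update rule:
 *   h starts at num_na and grows by (n - num_na) / k on each of k iterations.
 */
static int max_seed_row(int n, int num_na, int k)
{
    int i, h, max_h = -1;

    assert(k > 0); /* the division below requires k != 0 */
    for (i = 0, h = num_na; i < k; h += (n - num_na) / k, i++)
        if (h > max_h)
            max_h = h;
    return max_h;
}

/*
 * Returns 1 if every seed row index is below n, else 0.
 * ASSUMPTION: the matrix has exactly n rows.
 */
static int seeds_in_bounds(int n, int num_na, int k)
{
    return max_seed_row(n, num_na, k) < n;
}
```

With n = 20000 and k = 10, `max_seed_row(20000, 0, 10)` gives 18000, which is in bounds. Because of the integer division, the last index is num_na + (k-1) * ((n - num_na) / k), which stays below n whenever num_na < n. Under the assumption above, h = 20000 can therefore only occur when num_na equals n, that is, when every row was counted as NA. That would point at the NA counting for this input rather than at the step size, but this is an inference from the arithmetic and not something verified against the data.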