Hi, the pairs.csv file produced by a Dolos analysis includes two fields that I don't understand: leftCovered and rightCovered. Can you please explain what these are and how to interpret them? Thanks!
Replies: 1 comment
These are the absolute numbers of shared fingerprints in the left and right file, respectively. They are not necessarily equal, because a shared fingerprint can occur a different number of times in the left and right file.
A fingerprint is a series of $$k$$ consecutive tokens (a k-gram) in the syntax tree, selected out of a window of $$w$$ k-grams.
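To make the k-gram and window idea more concrete, here is a minimal winnowing-style sketch in Python. It is not the Dolos implementation: the function names, the SHA-1 based hash, and the rule of keeping the rightmost smallest hash per window are all assumptions made for illustration.

```python
import hashlib


def kgram_hashes(tokens, k):
    """Hash every run of k consecutive tokens (one hash per k-gram)."""
    hashes = []
    for i in range(len(tokens) - k + 1):
        kgram = " ".join(tokens[i:i + k])
        digest = hashlib.sha1(kgram.encode("utf-8")).digest()
        # Truncate to 32 bits; the hash function itself is an assumption.
        hashes.append(int.from_bytes(digest[:4], "big"))
    return hashes


def select_fingerprints(tokens, k, w):
    """Pick one k-gram hash out of every window of w consecutive k-grams.

    Returns (position, hash) pairs. A k-gram that wins several overlapping
    windows is recorded only once.
    """
    hashes = kgram_hashes(tokens, k)
    selected = []
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        smallest = min(window)
        # Rightmost minimum; this tie-breaking rule is an assumption.
        offset = max(i for i, h in enumerate(window) if h == smallest)
        pick = (start + offset, smallest)
        if not selected or selected[-1] != pick:
            selected.append(pick)
    return selected
```

For example, `select_fingerprints("def f ( x ) : return x + 1".split(), k=3, w=4)` hashes 8 k-grams and keeps at least one of them for each of the 5 windows.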
The similarity between two source files $$a$$ and $$b$$ is computed as

$$sim(a,b) = \frac{S_a + S_b}{T_a + T_b}$$

with $$T_x$$ the total number of fingerprints in file $$x$$ and $$S_x$$ the number of fingerprints in file $$x$$ that also occur in the other file.
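As a rough illustration of how the covered counts and the similarity relate, the sketch below computes them from two lists of fingerprints. It assumes that every occurrence of a shared fingerprint is counted separately, which is how I read "a different number of occurrences" above. The function names and the toy data are hypothetical and this is not Dolos code.

```python
def covered(own, other):
    """Count the fingerprint occurrences in `own` that also appear in `other`."""
    other_set = set(other)
    return sum(1 for fp in own if fp in other_set)


def similarity(left, right):
    """(S_left + S_right) / (T_left + T_right), or 0.0 if both files are empty."""
    s_left = covered(left, right)   # plays the role of leftCovered
    s_right = covered(right, left)  # plays the role of rightCovered
    total = len(left) + len(right)
    return (s_left + s_right) / total if total else 0.0


# Toy fingerprints: "A" occurs twice on the left but only once on the right.
left = ["A", "B", "A", "C"]
right = ["A", "D"]
```

With this toy data, `covered(left, right)` is 2 and `covered(right, left)` is 1, so the two values differ even though only one distinct fingerprint is shared, and `similarity(left, right)` is (2 + 1) / (4 + 2) = 0.5.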
The naming is indeed a bit awkward. It was chosen a few years ago and we try to keep our API somewhat stable.