Error using chol Matrix must be positive definite on liblsl 1.14 + LabRecorderCLI 1.13.1 #44
Comments
What's the longest duration xdf file you've had with this problem? Can you attach a problematic one that is at least 1 minute long to this issue? If I recall correctly, the clock offsets are retrieved via UDP whereas the data come via TCP. If you have any reason to think that your network configuration might be dropping a large number of UDP packets then this could be the source of your problem. |
I've been encountering this bug lately too. But, I have only gotten it
when sending data from a Linux computer and recording on a Windows PC.
However, since this has been a pseudo-random problem, this may just be
a coincidence. On the other hand, the clock offsets between Windows and
non-Windows PCs are getting bigger and bigger all the time. At some point
there will be numerical problems when trying to invert the matrix when
load_xdf.m does the clock synchronization. I can report that in these
cases, I have not seen missing clock offset measurements in the xdf
footer. If memory serves, this was the case when we had the UDP packet
bug a couple of years ago.
…On 16/10/2020 12:30, Chadwick Boulay wrote:
What's the longest duration xdf file you've had with this problem? Can
you attach a problematic one that is at least 1 minute long to this issue?
And you're using the Matlab loader to import it?
If I recall correctly, the clock offsets are retrieved via UDP whereas
the data come via TCP. If you have any reason to think that your
network configuration might be dropping a large number of UDP packets
then this could be the source of your problem.
|
@garygan89, can you please confirm what Chad asked and also give details of your setup? Specifically, I am interested in which OS you are using on the outlet side and which OS you are using to host LabRecorder. I can also say that my recent bouts with this issue have only occurred when using liblsl (or is it just LSL now?) >= 1.14. When I have a chance I will downgrade to 1.13 and see if the problem persists. |
The problem seems to be unrelated to file size or duration, since I have had the error in both 5 MB and 20 MB files. I will upload the problematic file when I'm at the lab later.
Yes I loaded using
I ran both the LSL consumer and the producer (SendDataC) in a closed-loop system, specifically on the Freescale IMX8 SOM (ARM64/aarch64 architecture) mounted on our custom board, with no IP assigned to eth0 because our custom board does not have a real Ethernet port. I suspected the missing IP at first, but the error also happened on my reference board, which has an IP assigned to eth0. |
Yes, I just followed up on that with Chad. I'm running it on a Freescale IMX8 SOM on our custom board; the OS is Debian 10 (bullseye). The consumer (LabRecorderCLI v1.13.1 with liblsl v1.14) and the producer (SendDataC from the liblsl examples) both run on the same host so that we can form a closed-loop system. I didn't assign an IP to the eth0 interface. This issue seems to have started with liblsl v1.13, as reported, but I'm not sure whether it somehow crept into v1.14. |
I've never tried that kind of network setup. I'm happy you're using LSL and we'll try to fix this problem as best we can, but if you're running everything in a closed system on a custom platform, why not use shared memory? Shared memory will definitely have lower latency and be more efficient than LSL. LSL wins on flexibility, network synchronization across computers, and compatibility with many devices, but it sounds like you aren't using any of those features. Maybe you plan to? As you're debugging this, please use https://github.com/xdf-modules/xdf-Matlab instead of the loader that comes with EEGLAB. Ultimately they should be the same thing, but if we provide a fix then it'll appear in xdf-Matlab before EEGLAB. Also note in the load_xdf function there are many command line options like I hope to get Matlab again in a couple weeks. Until then I'll use pyxdf. Please attach the file when you can so I can try loading it in pyxdf to see if it loads and if it doesn't where the error is coming from, then maybe work backwards to find the source. |
Thanks Chad. The primary motivation for using LSL in our closed-loop setting is how LSL is able to synchronize multiple streams (we have EEG, visual stimulus presentation, and markers all running on the same board). And I reckon using LSL is the fastest way for me to pipe them together. Here is the list of XDF files I uploaded to MF: http://www.mediafire.com/folder/osblwmlc4u9at/LSL_XDF The one that gave the error is in the "Problematic" folder. The producer is the SendC code from the liblsl examples. The I will further try to load them using xdf-Matlab after the weekend and see if that improves things. |
I believe this problem results from a numerical issue. When calculating the mapping from outlet to LabRecorder inlet, load_xdf.m must perform a Cholesky decomposition of the matrix that is a combination of timestamps on the outlet PC. What appears to be happening is that when the timestamps are very, very high---which they are when timestamps are the number of seconds since January 1, 1970---the combination For example, when examining
And here
Note the negative sign in the first value. In both cases the first eigenvalue should be 0, or very close to it on the positive side, but due to limited precision it sometimes ends up on the negative side, and this (I am guessing) is what stops The workaround seems to be to increase the WinsorThreshold value. I confess that I have never fully understood how this works or how this parameter truly has a Winsorizing effect on the ADMM algorithm, but when I set it to 1 (as I mentioned above, the default is .0001), the matrix A is smaller by a factor of 10^4 and the numerical problem disappears:
by calling load_xdf.m with this option ( Again, I am not sure how this affects the precision of the clock offset mapping, but this will allow you to load these problematic sets. I am also unsure what to do to fix this. If we are at the point where the time since the Epoch is so great that this is going to happen, then this whole mechanism needs to be fixed. After all, this workaround will stop working in about 100 million seconds ;-). I am also unsure where to re-open this issue. Is it a problem with xdf-Matlab or liblsl? It is definitely not, however, a problem with LabRecorder, and that is a good thing. |
Also, 100 million seconds is only 3 years, so the clock is literally ticking! |
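The precision hypothesis in the comments above can be illustrated with a small sketch. It is not the load_xdf.m code: it assumes, for illustration only, that the matrix being factorized is the 2x2 Gram matrix of a design [1, t] built from timestamps t, and the helper name `second_pivot` and the sample values are hypothetical. A Cholesky factorization of that matrix needs its second pivot, s2 - s1^2 / n, to be strictly positive. Doubles near (1.6e9)^2 = 2.56e18 are spaced 512 apart, so the small terms that make the pivot positive can be rounded away entirely.

```python
def second_pivot(timestamps):
    """Second Cholesky pivot of the Gram matrix of the design [1, t].

    For G = [[n, s1], [s1, s2]] a Cholesky factor exists only if
    n > 0 and s2 - s1**2 / n > 0. Mathematically the second quantity
    equals n * var(t), which is positive whenever the timestamps differ.
    """
    n = float(len(timestamps))
    s1 = sum(timestamps)                 # sum of t
    s2 = sum(t * t for t in timestamps)  # sum of t squared
    return s2 - s1 * s1 / n


# Hypothetical values: two clock-offset measurements taken 5 s apart,
# stamped in seconds since 1970 (about 1.6e9 in late 2020).
epoch = [1.6e9, 1.6e9 + 5.0]
# The same two measurements with the first timestamp subtracted.
shifted = [t - epoch[0] for t in epoch]

if __name__ == "__main__":
    print(second_pivot(epoch))    # 0.0, so a Cholesky routine would reject G
    print(second_pivot(shifted))  # 12.5, the exact value (5**2 / 2)
```

With the raw timestamps the pivot collapses to exactly 0.0 instead of 12.5, which a Cholesky routine would report as "not positive definite". Subtracting a common reference time before forming the matrix keeps every intermediate value small enough to be exact. Whether the merged fix works this way is not stated in this thread, so treat the shift purely as an illustration.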
Thanks for the detailed investigation @dmedine! I must admit I have little knowledge of Cholesky decomposition, but it is certainly good that these problematic XDF files are still loadable with no data loss. The loss of clock offset precision might not be as important in my case, since everything is streamed and timestamped on the same closed-loop host (I hope this is a correct statement for recording and streaming on the same host). More investigation probably needs to be done to see its effect on synchronizing multiple data streams. As mentioned, it does sound like this is more of an issue inherent to I will try your method and see if I can salvage all of those problematic ones. |
I have been trying to do some more experiments with Raspberry Pi, and I believe this problem is very terrible. Currently, I am unable to synchronize streams between Windows and Raspbian when recording on Windows. The winsor threshold trick is distorting the signal beyond recognition. I am not sure how to confirm my original hypothesis that this is a numerical issue, but I will try some XDF surgery and see what I can figure out. In the meantime, I would say that you should proceed with extreme caution. Sorry. |
The fix has been merged in both xdf-Matlab and pyxdf. |
Environment Info
Tried both LabRecorderCLI tagged v1.13.1 and the latest from [master] branch, but still the same problem.
Issue
When using LSL v1.14 with LabRecorder (master branch from https://github.com/labstreaminglayer/App-LabRecorder), the following error occurs in about 1 out of 5 XDF recordings (or in none if I'm lucky). It is really hard to reproduce, and a Google search pointed me to #15, which mentions a clock offset bug introduced in liblsl 1.13. Is it still unsolved, and has it somehow crept into 1.14?
I'm running the producer using the sample code, SendDataC. I also make sure to press 'Enter' to close the file correctly, as follows:
Not sure if this issue is related to liblsl or LabRecorderCLI.