Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple fixes and adaption for VS test explorer #255

Merged
merged 4 commits into from
May 21, 2024

Conversation

MushMal
Copy link
Contributor

@MushMal MushMal commented May 2, 2024

Issue #, if available: None

What was changed?

  • Cmake files to adapt test explorer, fixes to sources, fixes to tests, fixing warnings.
    Multiple updates include:
  • Adding support for Visual Studio Test Explorer via Google Test Adapter so the tests can be run/debugged in VS
  • Fixing multiple warnings - some have major impact on Windows platform/compiler which doesn't automatically zero structs
  • Fixing the flaky and explicitly broken unit tests

Why was it changed?

  • More Windows platform adoption and fixing broken code and tests

How was it changed?

  • Added some fixes in the unit test to work on Windows.

What testing was done for the changes?

  • Clean unit test run locally. Ensure CI passes.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

* Adding support for Visual Studio Test Explorer via Google Test Adapter so the tests can be run/debugged in VS
* Fixing multiple warnings - some have major impact on Windows platform/compiler which doesn't automatically zero structs
* Fixing the flaky and explicitly broken unit tests
@MushMal
Copy link
Contributor Author

MushMal commented May 2, 2024

I see the PR Description Check failed but I am at a loss as to what exactly needs to be provided there and why this was flagged as "short". Help?

@codecov-commenter
Copy link

codecov-commenter commented May 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.35%. Comparing base (e96735d) to head (97fb863).

❗ Current head 97fb863 differs from pull request most recent head b69d1b9. Consider uploading reports for the commit b69d1b9 to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #255      +/-   ##
===========================================
- Coverage    80.36%   80.35%   -0.01%     
===========================================
  Files           53       53              
  Lines        10826    10823       -3     
===========================================
- Hits          8700     8697       -3     
  Misses        2126     2126              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@disa6302 disa6302 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, the changes look good to me. Were there any specific unit tests from the list of exempted tests in the CI that were failing too?

tst/utils/ThreadsafeBlockingQueue.cpp Show resolved Hide resolved
Copy link
Contributor

@hassanctech hassanctech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

I looked into the TSAN that's failing it's happening across two different test runs.

The testThreadRoutine thread from the test ThreadCreateAndCancel is still running when the test for ThreadCreateAndReleaseSimpleCheckWithStack is kicked off. Due to the nature of the test itself we're not joining the thread. Since pthread_cancel returns right away we may avoid this by adding a sleep at the end of the test. Alternatively we could join the thread at the end of the function and make sure retval has PTHREAD_CANCELED .

@hassanctech
Copy link
Contributor

I see the PR Description Check failed but I am at a loss as to what exactly needs to be provided there and why this was flagged as "short". Help?

I fixed it :). The issue was you were missing some italics :), unfortunately the template does an exact match of the text including the format -- easiest way to currently satisfy it is copy/paste the template from a previously passing PR and then fill in the details.

@MushMal
Copy link
Contributor Author

MushMal commented May 7, 2024

Nice!

I looked into the TSAN that's failing it's happening across two different test runs.

The testThreadRoutine thread from the test ThreadCreateAndCancel is still running when the test for ThreadCreateAndReleaseSimpleCheckWithStack is kicked off. Due to the nature of the test itself we're not joining the thread. Since pthread_cancel returns right away we may avoid this by adding a sleep at the end of the test. Alternatively we could join the thread at the end of the function and make sure retval has PTHREAD_CANCELED .

I just looked at the test and there are few modifications I would suggest. First, if we are running 500 threads, having a mutex just to up the value might not be the best solution. An atomic counter would be the best way. But, that's not the main issue.

If I remember it correctly, GTest would create a new process to run each test so two separate tests would not conflict with each other even if they have threads with the same name. Mutex, however is another story. On Windows, mutexes are kernel objects and they will clash. That said, the process from the first test would terminate first before running the second test and it will close handle all kernel objects. This is again on the presumption that GTest is running tests sequentially. I am not sure how the test runs are configured in the jobs.

The biggest issue with the test is that it's very platform and time sensitive. TSAN will inevitably break or flag this test. I would recommend a separate PR to just fix this (and if there are other tests like this) test. The fix would be to just block indefinitely in the routine and perhaps move to use atomics. The main test body would then spin wait until the test count is reached (or use a semaphore) and would cancel the threads.

@MushMal
Copy link
Contributor Author

MushMal commented May 7, 2024

Sorry, I missed the point about the TSAN failing. It's hard to reproduce this locally but let me try to modify these tests. Will provide another commit

@MushMal
Copy link
Contributor Author

MushMal commented May 10, 2024

Hello, is there anything else that needs to be done for this PR? I am not sure how to restart the checks on this - I was under the assumption that it will be kicked off automagically with a new commit.

@MushMal
Copy link
Contributor Author

MushMal commented May 10, 2024

@hassanctech, @disa6302 can you please look into this?

@disa6302
Copy link
Contributor

I started the run. Github has an approve and run workflow for forked PRs, that is why the CI did not start on its own.

@MushMal
Copy link
Contributor Author

MushMal commented May 13, 2024

Looking at the TSAN failure (relevant part is attached). The TSAN output makes no sense to me. The times array is declared on LN 115

struct sleep_times st[TEST_THREAD_COUNT] = {0};

and is a local stack variable. This can not and should not interfere with the previous test run in LN 80. I must be missing something or not understanding how the tests are executed..???

2024-05-10T18:59:41.5607407Z ================== 2024-05-10T18:59:41.5607863Z WARNING: ThreadSanitizer: data race (pid=3692) 2024-05-10T18:59:41.5608534Z Write of size 8 at 0x7ffc92aa8260 by main thread: 2024-05-10T18:59:41.5609501Z #0 memset (kvspic_test+0xdd31d) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) 2024-05-10T18:59:41.5629370Z #1 ThreadFunctionalityTest_ThreadCreateAndReleaseSimpleCheckWithStack_Test::TestBody() /home/runner/work/amazon-kinesis-video-streams-pic/amazon-kinesis-video-streams-pic/tst/utils/Thread.cpp:115:24 (kvspic_test+0x32ca20) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) 2024-05-10T18:59:41.5636229Z #2 void testing::internal::HandleSehExceptionsInMethodIfSupported(testing::Test*, void (testing::Test::*)(), char const*) (kvspic_test+0x44442a) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) 2024-05-10T18:59:41.5638410Z #3 (libc.so.6+0x29d8f) (BuildId: 962015aa9d133c6cbcfb31ec300596d7f44d3348) 2024-05-10T18:59:41.5639112Z 2024-05-10T18:59:41.5639433Z Previous read of size 8 at 0x7ffc92aa8260 by thread T2752: 2024-05-10T18:59:41.5641789Z #0 testThreadRoutine(void*) /home/runner/work/amazon-kinesis-video-streams-pic/amazon-kinesis-video-streams-pic/tst/utils/Thread.cpp:24:28 (kvspic_test+0x32b926) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) 2024-05-10T18:59:41.5643540Z 2024-05-10T18:59:41.5643723Z Location is stack of main thread. 2024-05-10T18:59:41.5644087Z 2024-05-10T18:59:41.5644489Z Location is global '??' at 0x7ffc92a8b000 ([stack]+0x1d260) 2024-05-10T18:59:41.5644998Z 2024-05-10T18:59:41.5645301Z Thread T2752 (tid=6474, running) created by main thread at: 2024-05-10T18:59:41.5646393Z #0 pthread_create (kvspic_test+0xd33ad) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) 2024-05-10T18:59:41.5649074Z #1 defaultCreateThreadWithParams /home/runner/work/amazon-kinesis-video-streams-pic/amazon-kinesis-video-streams-pic/src/utils/src/Thread.c:186:14 (kvspic_test+0x3e7571) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) 2024-05-10T18:59:41.5652264Z #2 defaultCreateThread /home/runner/work/amazon-kinesis-video-streams-pic/amazon-kinesis-video-streams-pic/src/utils/src/Thread.c:225:5 (kvspic_test+0x3e7571) 2024-05-10T18:59:41.5655919Z #3 ThreadFunctionalityTest_ThreadCreateAndCancel_Test::TestBody() /home/runner/work/amazon-kinesis-video-streams-pic/amazon-kinesis-video-streams-pic/tst/utils/Thread.cpp:80:9 (kvspic_test+0x32c1be) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) 2024-05-10T18:59:41.5659424Z #4 void testing::internal::HandleSehExceptionsInMethodIfSupported(testing::Test*, void (testing::Test::*)(), char const*) (kvspic_test+0x44442a) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) 2024-05-10T18:59:41.5661501Z #5 (libc.so.6+0x29d8f) (BuildId: 962015aa9d133c6cbcfb31ec300596d7f44d3348) 2024-05-10T18:59:41.5662200Z 2024-05-10T18:59:41.5663967Z SUMMARY: ThreadSanitizer: data race (/home/runner/work/amazon-kinesis-video-streams-pic/amazon-kinesis-video-streams-pic/build/tst/kvspic_test+0xdd31d) (BuildId: 50a8c112fff50e2b536ef37ad34f116c7f4b341b) in memset 2024-05-10T18:59:41.5665788Z ==================

@MushMal
Copy link
Contributor Author

MushMal commented May 17, 2024

Can I request a new run with the latest?

@MushMal
Copy link
Contributor Author

MushMal commented May 17, 2024

The run is failing for some odd reason. The format validator is failing on the following file src/mkvgen/include/com/amazonaws/kinesis/video/mkvgen/Include.h which is not part of the changes.

The mac gcc is still running and I don't see the logs to understand what's wrong. It has been running for 40 mins and I think something is wrong with the run itself.

disa6302
disa6302 previously approved these changes May 20, 2024
Copy link
Contributor

@disa6302 disa6302 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Mushegh! Changes look good to me. Will wait for @hassanctech to take a look too.

@hassanctech
Copy link
Contributor

Apologies for the delay here, looks great @MushMal!

Since we're making fixes for Windows I see this in the MSVC CI run logs:

[  0%] Building C object CMakeFiles/kvspic.dir/src/client/src/AckParser.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
AckParser.c
[  1%] Building C object CMakeFiles/kvspic.dir/src/client/src/AuthIntegration.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
AuthIntegration.c
[  1%] Building C object CMakeFiles/kvspic.dir/src/client/src/Callbacks.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
Callbacks.c
[  2%] Building C object CMakeFiles/kvspic.dir/src/client/src/Client.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
Client.c
D:\a\amazon-kinesis-video-streams-pic\amazon-kinesis-video-streams-pic\src\client\src\Client.c(115): warning C4102: 'CleanUp': unreferenced label
[  2%] Building C object CMakeFiles/kvspic.dir/src/client/src/ClientEvent.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'

Seems like Windows doesn't like -fPIC? Anything we can do to get rid of that for Windows, like maybe don't add -fPIC for Windows? I'll approve this as-is since this issue doesn't appear to be newly introduced.

Copy link
Contributor

@hassanctech hassanctech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@MushMal
Copy link
Contributor Author

MushMal commented May 21, 2024

Apologies for the delay here, looks great @MushMal!

Since we're making fixes for Windows I see this in the MSVC CI run logs:

[  0%] Building C object CMakeFiles/kvspic.dir/src/client/src/AckParser.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
AckParser.c
[  1%] Building C object CMakeFiles/kvspic.dir/src/client/src/AuthIntegration.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
AuthIntegration.c
[  1%] Building C object CMakeFiles/kvspic.dir/src/client/src/Callbacks.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
Callbacks.c
[  2%] Building C object CMakeFiles/kvspic.dir/src/client/src/Client.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
Client.c
D:\a\amazon-kinesis-video-streams-pic\amazon-kinesis-video-streams-pic\src\client\src\Client.c(115): warning C4102: 'CleanUp': unreferenced label
[  2%] Building C object CMakeFiles/kvspic.dir/src/client/src/ClientEvent.c.obj
cl : Command line warning D9002 : ignoring unknown option '-fPIC'

Seems like Windows doesn't like -fPIC? Anything we can do to get rid of that for Windows, like maybe don't add -fPIC for Windows? I'll approve this as-is since this issue doesn't appear to be newly introduced.

There are couple of other warnings I looked at but didn't know what to do with. As these are not actual hinderances to Windows and specially VS development, I would rather prefer getting the fixes in first and perhaps iterate on those later. I plan to enhance the code in the near future so might look into those too.

@disa6302 disa6302 merged commit acc14a2 into awslabs:develop May 21, 2024
14 checks passed
@MushMal MushMal deleted the develop branch May 21, 2024 17:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants