Differences in runtime between build and test sparcv9 boxes #2729

Closed
Haroon-Khel opened this issue Sep 6, 2022 · 10 comments

@Haroon-Khel
Contributor

The test job https://ci.adoptopenjdk.net/job/Test_openjdk8_hs_extended.openjdk_sparcv9_solaris_testList_1/ runs for 9h+ (and gets aborted) on build-siteox-solaris10u11-sparcv9-1, but only 4h on test-siteox-solaris10u11-sparcv9-1.

@Haroon-Khel
Contributor Author

bash-3.2# psrinfo -pv
The physical processor has 4 virtual processors (0-3)
  UltraSPARC-T2+ (chipid 0, clock 1415 MHz)

Both machines have 4 virtual processors

@zdtsw
Contributor

zdtsw commented Sep 7, 2022

So the test case "hotspot_jre_0_FAILED" failed every time it was run on build-siteox-solaris10u11-sparcv9-1:

[2022-09-03T15:34:54.606Z] TEST RESULT: Error. Program `/export/home/jenkins/workspace/Test_openjdk8_hs_extended.openjdk_sparcv9_solaris_testList_1/openjdkbinary/j2sdk-image/bin/java' timed out (timeout set to 2400000ms, elapsed time including timeout handling was 2401769ms).
[2022-09-03T15:34:54.606Z] --------------------------------------------------
[2022-09-03T16:24:42.542Z] Test results: passed: 711; error: 1

It also looks like most of the tests take longer on this machine than on the other one.
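
For reference, the timeout in that message works out to 40 minutes. A minimal Python sketch that pulls both values out of the message; the long java path is shortened to a placeholder here, and `parse_timeout` is a hypothetical helper name, not part of any test tooling:

```python
import re

# Message text taken from the log above; the full java path is replaced
# by a short placeholder purely for readability.
MESSAGE = (
    "TEST RESULT: Error. Program `java' timed out "
    "(timeout set to 2400000ms, elapsed time including timeout handling "
    "was 2401769ms)."
)

TIMEOUT_RE = re.compile(
    r"timeout set to (\d+)ms, elapsed time including timeout handling was (\d+)ms"
)


def parse_timeout(message):
    """Return (timeout_ms, elapsed_ms), or None if no timeout is reported."""
    match = TIMEOUT_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


timeout_ms, elapsed_ms = parse_timeout(MESSAGE)
print(f"timeout: {timeout_ms / 60000:.1f} min")               # 40.0 min
print(f"overrun: {(elapsed_ms - timeout_ms) / 1000:.3f} s")   # 1.769 s
```

So the test was killed by the timeout itself rather than failing on its own.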

@sxa
Member

sxa commented Sep 8, 2022

FYI as discussed @steelhead31

@sxa
Member

sxa commented Sep 8, 2022

@Haroon-Khel Can you summarise the data in here - I believe you've been running comparable openjdk runs on the TC machine too now as a third machine/data point for that comparison.

@Haroon-Khel
Contributor Author

Annoyingly, the extended job that I ran on our TC sparc solaris machine did not run smoothly. Every test failed with an error about not being able to find the test file. I'll kick it off again.

@Haroon-Khel
Contributor Author

So far the differences in runtime have all come from the https://ci.adoptopenjdk.net/job/Test_openjdk8_hs_extended.openjdk_sparcv9_solaris_testList_1/ job. I'm interested to see whether the runtimes also differ when the extended openjdk job is run in a grinder instead:

https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5682/console (build-siteox-solaris10u11-sparcv9-1)
https://ci.adoptopenjdk.net/view/Test_grinder/job/Grinder/5683/console (test-siteox-solaris10u11-sparcv9-1)

@Haroon-Khel
Contributor Author

May have found the difference

Build machine

bash-3.2# /usr/sbin/psrinfo -v
Status of virtual processor 0 as of: 09/11/2022 23:42:23
  on-line since 05/13/2022 08:09:57.
  The sparcv9 processor operates at 1415 MHz,
        and has a sparcv9 floating point processor.
Status of virtual processor 1 as of: 09/11/2022 23:42:23
  on-line since 05/13/2022 08:09:59.
  The sparcv9 processor operates at 1415 MHz,
        and has a sparcv9 floating point processor.
Status of virtual processor 2 as of: 09/11/2022 23:42:23
  on-line since 05/13/2022 08:09:59.
  The sparcv9 processor operates at 1415 MHz,
        and has a sparcv9 floating point processor.
Status of virtual processor 3 as of: 09/11/2022 23:42:23
  on-line since 05/13/2022 08:09:59.
  The sparcv9 processor operates at 1415 MHz,
        and has a sparcv9 floating point processor.

Test machine

bash-3.2# /usr/sbin/psrinfo -v
Status of virtual processor 0 as of: 09/12/2022 09:02:26
  on-line since 01/09/2022 21:46:53.
  The sparcv9 processor operates at 3600 MHz,
        and has a sparcv9 floating point processor.
Status of virtual processor 1 as of: 09/12/2022 09:02:26
  on-line since 01/09/2022 21:47:38.
  The sparcv9 processor operates at 3600 MHz,
        and has a sparcv9 floating point processor.
Status of virtual processor 2 as of: 09/12/2022 09:02:26
  on-line since 01/09/2022 21:47:38.
  The sparcv9 processor operates at 3600 MHz,
        and has a sparcv9 floating point processor.
Status of virtual processor 3 as of: 09/12/2022 09:02:26
  on-line since 01/09/2022 21:47:38.
  The sparcv9 processor operates at 3600 MHz,
        and has a sparcv9 floating point processor.

Each virtual processor on the test machine runs at 3600 MHz, more than double the 1415 MHz of the build machine. That is consistent with the runtime on the build machine being roughly double that of the test machine. @sxa Can we look at getting build-siteox-solaris10u11-sparcv9-1 upgraded?
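
To put a number on the clock difference, here is a rough Python sketch that extracts the MHz values from `psrinfo -v` output and computes the ratio. The two samples are the first processor entries copied from the outputs above; `clock_speeds_mhz` is a hypothetical helper name, and the regex assumes the "operates at N MHz" wording shown here:

```python
import re

# First virtual processor entry from each machine, copied from the
# psrinfo -v output above.
BUILD_SAMPLE = """\
Status of virtual processor 0 as of: 09/11/2022 23:42:23
  on-line since 05/13/2022 08:09:57.
  The sparcv9 processor operates at 1415 MHz,
        and has a sparcv9 floating point processor.
"""

TEST_SAMPLE = """\
Status of virtual processor 0 as of: 09/12/2022 09:02:26
  on-line since 01/09/2022 21:46:53.
  The sparcv9 processor operates at 3600 MHz,
        and has a sparcv9 floating point processor.
"""

CLOCK_RE = re.compile(r"processor operates at (\d+) MHz")


def clock_speeds_mhz(psrinfo_output):
    """Return the clock speed of every processor listed, in MHz."""
    return [int(mhz) for mhz in CLOCK_RE.findall(psrinfo_output)]


build = clock_speeds_mhz(BUILD_SAMPLE)
test = clock_speeds_mhz(TEST_SAMPLE)
ratio = test[0] / build[0]
print(f"clock ratio test/build: {ratio:.2f}")  # 2.54
```

The ratio is about 2.5, in the same ballpark as the 9h+ versus 4h runtimes, although clock speed alone does not have to explain the whole difference.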

@sxa
Member

sxa commented Nov 22, 2024

Testing with a sanity.openjdk run at:

Note: To make the comparison possible, I had to temporarily adjust the job restrictions parameters configuration and re-add ci.role.test to the machine.

Also note: I'm fairly sure these issues got resolved at some point in the last couple of years, but I'm not sure we've explicitly tested that, so I'm not overly concerned by this.

@sxa self-assigned this Nov 22, 2024
@sxa moved this from Todo to In Progress in 2024 4Q Adoptium Plan Nov 22, 2024
@sxa
Member

sxa commented Nov 22, 2024

Results of the above (spoiler: they are sufficiently close that I'm going to close this):

14:20:24 jdk_math_0 ...         14:22:41 jdk_math_0 ...
14:25:54 jdk_math_1 ...         14:28:13 jdk_math_1 ...
14:25:54 jdk_math_2 ...         14:28:13 jdk_math_2 ...
14:25:54 jdk_security1_0 ...    14:28:13 jdk_security1_0 ...
14:41:34 jdk_security1_1 ...    14:43:08 jdk_security1_1 ...
14:41:34 jdk_security1_2 ...    14:43:08 jdk_security1_2 ...
14:41:34 jdk_security2_0 ...    14:43:08 jdk_security2_0 ...
14:49:49 jdk_security2_1 ...    14:51:13 jdk_security2_1 ...
14:49:49 jdk_security2_2 ...    14:51:13 jdk_security2_2 ...
14:49:49 jdk_security4_0 ...    14:51:13 jdk_security4_0 ...
15:17:29 jdk_security4_1 ...    15:18:20 jdk_security4_1 ...
15:17:29 jdk_security4_2 ...    15:18:20 jdk_security4_2 ...
15:17:29 jdk_util_0 ...         15:18:20 jdk_util_0 ...
15:50:27 jdk_util_1 ...         15:50:36 jdk_util_1 ...
15:50:27 jdk_util_2 ...         15:50:36 jdk_util_2 ...
15:50:27 jdk_jdi_jdk8_0 ...     15:50:36 jdk_jdi_jdk8_0 ...
16:02:25 jdk_jdi_jdk8_1 ...     16:02:38 jdk_jdi_jdk8_1 ...
16:02:25 jdk_jdi_jdk8_2 ...     16:02:38 jdk_jdi_jdk8_2 ...
16:02:25 langtools_all_0 ...    16:02:38 langtools_all_0 ...

Results collated with curl https://ci.adoptium.net/view/Test_openjdk/job/Test_openjdk8_hs_sanity.openjdk_sparcv9_solaris/430/consoleText | grep Running.test | sed -e 's/^.2024-11-..T//g' -e s'/....Z. Running test//g'
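
The same extraction can be sketched in Python, which also makes it easy to compute the gap between consecutive "Running test" lines. This is only a sketch under assumptions: the line format is inferred from the sed expressions above, the date and the millisecond values in the sample are invented for illustration, and `running_tests` and `gaps` are hypothetical helper names.

```python
import re
from datetime import datetime

# Sample lines in the assumed console format. Times and test names are
# from the first column of the table above; the date (22nd) and the
# ".000" milliseconds are placeholders.
SAMPLE = """\
[2024-11-22T14:20:24.000Z] Running test jdk_math_0 ...
[2024-11-22T14:25:54.000Z] Running test jdk_math_1 ...
[2024-11-22T14:25:54.000Z] Running test jdk_math_2 ...
[2024-11-22T14:25:54.000Z] Running test jdk_security1_0 ...
[2024-11-22T14:41:34.000Z] Running test jdk_security1_1 ...
"""

LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d{3}Z\] Running test (\S+)"
)


def running_tests(console_text):
    """Return (start_time, test_name) for every 'Running test' line."""
    rows = []
    for line in console_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            start = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
            rows.append((start, match.group(2)))
    return rows


def gaps(rows):
    """Seconds between each 'Running test' line and the next one."""
    return [
        (name, int((next_start - start).total_seconds()))
        for (start, name), (next_start, _) in zip(rows, rows[1:])
    ]


for name, seconds in gaps(running_tests(SAMPLE)):
    print(f"{name:20s} {seconds:5d} s")
```

Running this over the console text of each machine gives per-test gaps that can be compared directly instead of eyeballing the two timestamp columns.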

@sxa closed this as completed Nov 22, 2024
@github-project-automation bot moved this from In Progress to Done in 2024 4Q Adoptium Plan Nov 22, 2024
@sxa
Member

sxa commented Nov 22, 2024

Also kicked off extended runs:

I've also restored the settings on the build machine so that it can no longer run non-build jobs.

Both took almost the same time, so there is no longer an issue here.
