Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feat](lock)add deadlock detection tool and monitored lock implementations #39015

Merged
merged 7 commits into from
Aug 8, 2024

Conversation

CalvinKirs
Copy link
Member

@CalvinKirs CalvinKirs commented Aug 7, 2024

Proposed changes

Description:

This issue proposes the addition of new features to the project, including a deadlock detection tool and monitored lock implementations. These features will help in identifying and debugging potential deadlocks and monitoring lock usage.
Features:

AbstractMonitoredLock:

A monitored version of Lock that tracks and logs lock acquisition and release times.
Functionality:
Overrides lock(), unlock(), tryLock(), and tryLock(long timeout, TimeUnit unit) methods.
Logs information about lock acquisition time, release time, and any failure to acquire the lock within the specified timeout.

eg
2024-08-07 12:02:59  [ Thread-2:2006 ] - [ WARN ]  Thread ID: 12, Thread Name: Thread-2 - Lock held for 1912 ms, exceeding hold timeout of 1000 ms 
Thread stack trace:
	at java.lang.Thread.getStackTrace(Thread.java:1564)
	at org.example.lock.AbstractMonitoredLock.afterUnlock(AbstractMonitoredLock.java:49)
	at org.example.lock.MonitoredReentrantLock.unlock(MonitoredReentrantLock.java:32)
	at org.example.ExampleService.timeout(ExampleService.java:17)
	at org.example.Main.lambda$test2$1(Main.java:39)
	at java.lang.Thread.run(Thread.java:750)

DeadlockCheckerTool:

Uses ScheduledExecutorService for periodic deadlock checks.
Logs deadlock information including thread names, states, lock info, and stack traces.

ThreadMXBean accesses thread information in the local JVM, which is already in memory, so accessing it is less expensive than fetching data from external resources such as disk or network.
Thread state cache: The JVM typically maintains a cache of thread states, reducing the need for real-time calculations or additional data processing.

eg
Thread Name: Thread-0
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Lock Owner Name: Thread-1
Lock Owner Id: 12
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Stack Trace: 
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$3(Main.java:79)
	at org.example.Main$$Lambda$1/1221555852.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)


2024-08-07 14:11:28  [ pool-1-thread-1:2001 ] - [ WARN ]  Deadlocks detected:
Thread Name: Thread-1
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Lock Owner Name: Thread-0
Lock Owner Id: 11
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Stack Trace: 
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$4(Main.java:93)
	at org.example.Main$$Lambda$2/1556956098.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)


benchmark
    @Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @Measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @Threads(1)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   15889.407          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     678.061          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.000            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     668.249          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.080            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.075          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.006            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      20.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       6.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  103130.635          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2      ≈ 10⁻⁴          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁶            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts

    @Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @Measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @Threads(100)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   10994.606          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     488.508          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.002            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     481.390          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.163            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.020          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.002            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      18.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       9.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  558652.036          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2       0.016          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁴            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts

@doris-robot
Copy link

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@CalvinKirs
Copy link
Member Author

run buildall

@CalvinKirs CalvinKirs marked this pull request as ready for review August 8, 2024 03:27
@CalvinKirs
Copy link
Member Author

run buildall

@doris-robot
Copy link

TPC-H: Total hot run time: 39822 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 9a7d28c84bd08fe223c2875606ef04ee21b964b0, data reload: false

------ Round 1 ----------------------------------
q1	18506	6643	4324	4324
q2	2027	172	171	171
q3	10567	1174	1053	1053
q4	10233	781	743	743
q5	7526	2595	2530	2530
q6	227	138	137	137
q7	978	601	595	595
q8	9224	1929	1931	1929
q9	8578	6536	6580	6536
q10	7073	2253	2241	2241
q11	458	245	250	245
q12	392	228	219	219
q13	18705	3010	3000	3000
q14	293	241	233	233
q15	525	474	494	474
q16	532	384	390	384
q17	991	773	721	721
q18	8167	7623	7566	7566
q19	5003	1120	1058	1058
q20	650	337	332	332
q21	5459	4400	4322	4322
q22	1086	1043	1009	1009
Total cold run time: 117200 ms
Total hot run time: 39822 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4489	4286	4249	4249
q2	391	275	268	268
q3	2892	2623	2684	2623
q4	1918	1597	1640	1597
q5	5287	5224	5248	5224
q6	222	134	134	134
q7	2079	1657	1658	1657
q8	3164	3326	3333	3326
q9	8360	8408	8415	8408
q10	3381	3180	3143	3143
q11	606	489	481	481
q12	776	575	614	575
q13	16525	3012	3004	3004
q14	300	272	282	272
q15	525	476	471	471
q16	477	408	410	408
q17	1801	1494	1449	1449
q18	7688	7554	7429	7429
q19	1662	1603	1628	1603
q20	1996	1770	1798	1770
q21	5294	4940	4978	4940
q22	1105	1021	1012	1012
Total cold run time: 70938 ms
Total hot run time: 54043 ms

@CalvinKirs
Copy link
Member Author

run performance

@doris-robot
Copy link

TPC-H: Total hot run time: 39892 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 9a7d28c84bd08fe223c2875606ef04ee21b964b0, data reload: false

------ Round 1 ----------------------------------
q1	17613	4506	4316	4316
q2	2034	171	176	171
q3	10510	1196	1112	1112
q4	10141	822	692	692
q5	7504	2549	2522	2522
q6	228	140	138	138
q7	979	594	589	589
q8	9226	1919	1934	1919
q9	8725	6627	6597	6597
q10	7084	2229	2227	2227
q11	455	242	246	242
q12	467	225	218	218
q13	17757	2961	3004	2961
q14	273	250	236	236
q15	525	469	485	469
q16	505	395	393	393
q17	981	674	722	674
q18	8204	7557	7485	7485
q19	1359	1023	1129	1023
q20	666	326	335	326
q21	5379	4623	4582	4582
q22	1139	1000	1001	1000
Total cold run time: 111754 ms
Total hot run time: 39892 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4363	4249	4265	4249
q2	392	274	259	259
q3	2845	2584	2665	2584
q4	1913	1651	1663	1651
q5	5220	5207	5232	5207
q6	217	133	131	131
q7	2094	1656	1667	1656
q8	3163	3353	3292	3292
q9	8344	8369	8366	8366
q10	3405	3104	3176	3104
q11	573	498	497	497
q12	786	615	637	615
q13	16324	2952	2963	2952
q14	310	279	272	272
q15	516	480	479	479
q16	478	434	410	410
q17	1794	1493	1470	1470
q18	7594	7548	7477	7477
q19	1654	1457	1587	1457
q20	1981	1796	1816	1796
q21	5268	5084	5024	5024
q22	1096	1023	1032	1023
Total cold run time: 70330 ms
Total hot run time: 53971 ms


@ConfField(description = {"表示最大锁持有时间,超过该时间会打印告警日志,单位秒",
"Maximum lock hold time; logs a warning if exceeded"})
public static long max_lock_hold_threshold_seconds = 30;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public static long max_lock_hold_threshold_seconds = 30;
public static long max_lock_hold_threshold_seconds = 10;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And make it mutable

@CalvinKirs CalvinKirs requested a review from dataroaring as a code owner August 8, 2024 06:28
Copy link
Contributor

@morningman morningman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

github-actions bot commented Aug 8, 2024

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added approved Indicates a PR has been approved by one committer. reviewed labels Aug 8, 2024
Copy link
Contributor

github-actions bot commented Aug 8, 2024

PR approved by anyone and no changes requested.

@CalvinKirs
Copy link
Member Author

run buildall

CalvinKirs added a commit to CalvinKirs/incubator-doris that referenced this pull request Aug 8, 2024
…tions apache#39015 (#22)

## Proposed changes

### Description:

This issue proposes the addition of new features to the project, including a deadlock detection tool and monitored lock implementations. These features will help in identifying and debugging potential deadlocks and monitoring lock usage.
Features:


#### AbstractMonitoredLock:

A monitored version of Lock that tracks and logs lock acquisition and release times.
Functionality:
Overrides lock(), unlock(), tryLock(), and tryLock(long timeout, TimeUnit unit) methods.
Logs information about lock acquisition time, release time, and any failure to acquire the lock within the specified timeout.
##### eg
```log
2024-08-07 12:02:59  [ Thread-2:2006 ] - [ WARN ]  Thread ID: 12, Thread Name: Thread-2 - Lock held for 1912 ms, exceeding hold timeout of 1000 ms 
Thread stack trace:
	at java.lang.Thread.getStackTrace(Thread.java:1564)
	at org.example.lock.AbstractMonitoredLock.afterUnlock(AbstractMonitoredLock.java:49)
	at org.example.lock.MonitoredReentrantLock.unlock(MonitoredReentrantLock.java:32)
	at org.example.ExampleService.timeout(ExampleService.java:17)
	at org.example.Main.lambda$test2$1(Main.java:39)
	at java.lang.Thread.run(Thread.java:750)
```












#### DeadlockCheckerTool:

Uses ScheduledExecutorService for periodic deadlock checks.
Logs deadlock information including thread names, states, lock info, and stack traces.

**ThreadMXBean accesses thread information in the local JVM, which is already in memory, so accessing it is less expensive than fetching data from external resources such as disk or network.
Thread state cache: The JVM typically maintains a cache of thread states, reducing the need for real-time calculations or additional data processing.**
##### eg
```log
Thread Name: Thread-0
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Lock Owner Name: Thread-1
Lock Owner Id: 12
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Stack Trace: 
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$3(Main.java:79)
	at org.example.Main$$Lambda$1/1221555852.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)


2024-08-07 14:11:28  [ pool-1-thread-1:2001 ] - [ WARN ]  Deadlocks detected:
Thread Name: Thread-1
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Lock Owner Name: Thread-0
Lock Owner Id: 11
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Stack Trace: 
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$4(Main.java:93)
	at org.example.Main$$Lambda$2/1556956098.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)


```
##### benchmark
```
    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(1)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   15889.407          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     678.061          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.000            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     668.249          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.080            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.075          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.006            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      20.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       6.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  103130.635          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2      ≈ 10⁻⁴          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁶            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts

    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(100)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   10994.606          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     488.508          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.002            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     481.390          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.163            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.020          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.002            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      18.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       9.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  558652.036          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2       0.016          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁴            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts
```
@doris-robot
Copy link

TPC-H: Total hot run time: 39002 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 6e905dd9f7c68a59e375b294257d7749b39e09ff, data reload: false

------ Round 1 ----------------------------------
q1	17614	4321	4264	4264
q2	2024	179	170	170
q3	10505	1136	1030	1030
q4	10139	766	807	766
q5	7493	2530	2503	2503
q6	220	138	136	136
q7	974	588	612	588
q8	9222	1904	1878	1878
q9	8685	6541	6509	6509
q10	7048	2201	2230	2201
q11	466	237	248	237
q12	461	218	217	217
q13	18160	2978	2968	2968
q14	270	243	238	238
q15	512	478	502	478
q16	498	395	389	389
q17	972	671	717	671
q18	7986	7428	7288	7288
q19	6601	1026	986	986
q20	701	330	325	325
q21	5208	4180	4723	4180
q22	1109	998	980	980
Total cold run time: 116868 ms
Total hot run time: 39002 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4457	4241	4314	4241
q2	373	266	272	266
q3	2872	2613	2784	2613
q4	1965	1808	1704	1704
q5	5709	5523	5487	5487
q6	227	130	131	130
q7	2111	1747	1759	1747
q8	3292	3461	3421	3421
q9	8836	8762	8850	8762
q10	3488	3345	3201	3201
q11	594	484	497	484
q12	816	637	647	637
q13	17261	3099	3192	3099
q14	322	284	295	284
q15	545	485	515	485
q16	519	440	424	424
q17	1830	1524	1525	1524
q18	8000	7917	7865	7865
q19	1721	1661	1738	1661
q20	2184	1919	1874	1874
q21	5637	5334	5259	5259
q22	1095	1030	1055	1030
Total cold run time: 73854 ms
Total hot run time: 56198 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 203402 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 6e905dd9f7c68a59e375b294257d7749b39e09ff, data reload: false

query1	942	432	390	390
query2	6447	2002	1952	1952
query3	6654	205	213	205
query4	30208	23302	23117	23117
query5	3707	486	470	470
query6	286	190	196	190
query7	4584	311	299	299
query8	258	222	196	196
query9	8483	2372	2350	2350
query10	908	881	881	881
query11	17600	15036	14974	14974
query12	131	99	93	93
query13	1632	382	375	375
query14	10517	7653	7909	7653
query15	390	296	341	296
query16	7609	461	489	461
query17	1720	571	541	541
query18	1908	385	403	385
query19	260	224	199	199
query20	120	110	120	110
query21	214	107	111	107
query22	4612	4483	4192	4192
query23	34580	33699	33990	33699
query24	10764	2931	2906	2906
query25	570	414	367	367
query26	704	156	149	149
query27	2179	283	283	283
query28	5512	2014	2000	2000
query29	727	413	422	413
query30	253	155	155	155
query31	981	721	739	721
query32	101	53	52	52
query33	623	278	271	271
query34	860	462	465	462
query35	950	853	779	779
query36	1082	941	910	910
query37	138	83	76	76
query38	4337	4150	4175	4150
query39	1439	1372	1382	1372
query40	207	112	115	112
query41	47	45	43	43
query42	110	97	95	95
query43	520	463	479	463
query44	1061	741	758	741
query45	393	390	372	372
query46	1095	766	786	766
query47	1834	1822	1766	1766
query48	381	299	300	299
query49	834	426	421	421
query50	805	408	404	404
query51	6706	6570	6727	6570
query52	110	87	93	87
query53	246	180	176	176
query54	881	451	463	451
query55	80	76	77	76
query56	298	254	255	254
query57	1127	1055	1056	1055
query58	237	240	246	240
query59	3030	2845	2809	2809
query60	289	256	260	256
query61	102	98	97	97
query62	786	647	652	647
query63	209	183	178	178
query64	9290	2444	1946	1946
query65	3196	3156	3188	3156
query66	733	375	335	335
query67	15249	15036	14745	14745
query68	4494	539	546	539
query69	428	399	435	399
query70	1150	1125	1079	1079
query71	389	273	263	263
query72	17359	16658	16545	16545
query73	755	328	328	328
query74	9162	8778	8785	8778
query75	3348	2663	2691	2663
query76	2206	1069	1007	1007
query77	457	310	307	307
query78	10369	9050	9652	9050
query79	2453	519	521	519
query80	2354	492	478	478
query81	597	230	230	230
query82	749	138	133	133
query83	300	147	158	147
query84	269	82	81	81
query85	1608	340	359	340
query86	473	303	296	296
query87	4808	4552	4632	4552
query88	4102	2484	2496	2484
query89	387	279	283	279
query90	1777	190	188	188
query91	150	117	124	117
query92	59	47	50	47
query93	1739	530	527	527
query94	838	288	306	288
query95	357	255	256	255
query96	601	272	283	272
query97	3217	3018	3062	3018
query98	227	199	197	197
query99	1645	1256	1235	1235
Total cold run time: 303673 ms
Total hot run time: 203402 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 30.83 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 6e905dd9f7c68a59e375b294257d7749b39e09ff, data reload: false

query1	0.04	0.04	0.04
query2	0.08	0.04	0.04
query3	0.22	0.05	0.05
query4	1.66	0.10	0.08
query5	0.51	0.47	0.50
query6	1.13	0.72	0.72
query7	0.02	0.02	0.01
query8	0.05	0.04	0.04
query9	0.55	0.48	0.51
query10	0.54	0.54	0.52
query11	0.16	0.11	0.11
query12	0.16	0.12	0.12
query13	0.59	0.60	0.58
query14	0.76	0.80	0.77
query15	0.90	0.82	0.82
query16	0.35	0.35	0.38
query17	0.97	1.02	0.98
query18	0.22	0.21	0.22
query19	1.76	1.73	1.74
query20	0.01	0.01	0.02
query21	15.39	0.76	0.66
query22	4.49	6.92	2.09
query23	18.36	1.40	1.27
query24	2.10	0.22	0.22
query25	0.16	0.08	0.09
query26	0.30	0.20	0.21
query27	0.45	0.22	0.22
query28	13.28	1.01	1.00
query29	12.61	3.36	3.30
query30	0.25	0.06	0.07
query31	2.86	0.38	0.38
query32	3.29	0.47	0.47
query33	2.89	2.98	2.87
query34	17.14	4.38	4.35
query35	4.43	4.48	4.45
query36	0.64	0.47	0.47
query37	0.19	0.16	0.16
query38	0.16	0.15	0.15
query39	0.05	0.04	0.04
query40	0.15	0.13	0.12
query41	0.10	0.05	0.06
query42	0.06	0.05	0.05
query43	0.04	0.04	0.04
Total cold run time: 110.07 s
Total hot run time: 30.83 s

CalvinKirs added a commit to CalvinKirs/incubator-doris that referenced this pull request Aug 8, 2024
…tions apache#39015 (#22)

## Proposed changes

### Description:

This issue proposes the addition of new features to the project, including a deadlock detection tool and monitored lock implementations. These features will help in identifying and debugging potential deadlocks and monitoring lock usage.
Features:

#### AbstractMonitoredLock:

A monitored version of Lock that tracks and logs lock acquisition and release times.
Functionality:
Overrides lock(), unlock(), tryLock(), and tryLock(long timeout, TimeUnit unit) methods.
Logs information about lock acquisition time, release time, and any failure to acquire the lock within the specified timeout.
##### eg
```log
2024-08-07 12:02:59  [ Thread-2:2006 ] - [ WARN ]  Thread ID: 12, Thread Name: Thread-2 - Lock held for 1912 ms, exceeding hold timeout of 1000 ms
Thread stack trace:
	at java.lang.Thread.getStackTrace(Thread.java:1564)
	at org.example.lock.AbstractMonitoredLock.afterUnlock(AbstractMonitoredLock.java:49)
	at org.example.lock.MonitoredReentrantLock.unlock(MonitoredReentrantLock.java:32)
	at org.example.ExampleService.timeout(ExampleService.java:17)
	at org.example.Main.lambda$test2$1(Main.java:39)
	at java.lang.Thread.run(Thread.java:750)
```

#### DeadlockCheckerTool:

Uses ScheduledExecutorService for periodic deadlock checks.
Logs deadlock information including thread names, states, lock info, and stack traces.

**ThreadMXBean accesses thread information in the local JVM, which is already in memory, so accessing it is less expensive than fetching data from external resources such as disk or network.
Thread state cache: The JVM typically maintains a cache of thread states, reducing the need for real-time calculations or additional data processing.**
##### eg
```log
Thread Name: Thread-0
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Lock Owner Name: Thread-1
Lock Owner Id: 12
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Stack Trace:
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$3(Main.java:79)
	at org.example.Main$$Lambda$1/1221555852.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)

2024-08-07 14:11:28  [ pool-1-thread-1:2001 ] - [ WARN ]  Deadlocks detected:
Thread Name: Thread-1
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Lock Owner Name: Thread-0
Lock Owner Id: 11
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Stack Trace:
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$4(Main.java:93)
	at org.example.Main$$Lambda$2/1556956098.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)

```
##### benchmark
```
    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(1)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   15889.407          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     678.061          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.000            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     668.249          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.080            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.075          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.006            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      20.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       6.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  103130.635          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2      ≈ 10⁻⁴          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁶            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts

    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(100)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   10994.606          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     488.508          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.002            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     481.390          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.163            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.020          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.002            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      18.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       9.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  558652.036          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2       0.016          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁴            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts
```
wyxxxcat pushed a commit to wyxxxcat/doris that referenced this pull request Aug 14, 2024
…tions (apache#39015)

## Proposed changes

### Description:

This issue proposes the addition of new features to the project,
including a deadlock detection tool and monitored lock implementations.
These features will help in identifying and debugging potential
deadlocks and monitoring lock usage.
Features:


#### AbstractMonitoredLock:

A monitored version of Lock that tracks and logs lock acquisition and
release times.
Functionality:
Overrides lock(), unlock(), tryLock(), and tryLock(long timeout,
TimeUnit unit) methods.
Logs information about lock acquisition time, release time, and any
failure to acquire the lock within the specified timeout.
##### eg
```log
2024-08-07 12:02:59  [ Thread-2:2006 ] - [ WARN ]  Thread ID: 12, Thread Name: Thread-2 - Lock held for 1912 ms, exceeding hold timeout of 1000 ms 
Thread stack trace:
	at java.lang.Thread.getStackTrace(Thread.java:1564)
	at org.example.lock.AbstractMonitoredLock.afterUnlock(AbstractMonitoredLock.java:49)
	at org.example.lock.MonitoredReentrantLock.unlock(MonitoredReentrantLock.java:32)
	at org.example.ExampleService.timeout(ExampleService.java:17)
	at org.example.Main.lambda$test2$1(Main.java:39)
	at java.lang.Thread.run(Thread.java:750)
```












#### DeadlockCheckerTool:

Uses ScheduledExecutorService for periodic deadlock checks.
Logs deadlock information including thread names, states, lock info, and
stack traces.

**ThreadMXBean accesses thread information in the local JVM, which is
already in memory, so accessing it is less expensive than fetching data
from external resources such as disk or network.
Thread state cache: The JVM typically maintains a cache of thread
states, reducing the need for real-time calculations or additional data
processing.**
##### eg
```log
Thread Name: Thread-0
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Lock Owner Name: Thread-1
Lock Owner Id: 12
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Stack Trace: 
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$3(Main.java:79)
	at org.example.Main$$Lambda$1/1221555852.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)


2024-08-07 14:11:28  [ pool-1-thread-1:2001 ] - [ WARN ]  Deadlocks detected:
Thread Name: Thread-1
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Lock Owner Name: Thread-0
Lock Owner Id: 11
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Stack Trace: 
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$4(Main.java:93)
	at org.example.Main$$Lambda$2/1556956098.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)


```
##### benchmark
```
    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(1)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   15889.407          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     678.061          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.000            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     668.249          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.080            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.075          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.006            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      20.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       6.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  103130.635          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2      ≈ 10⁻⁴          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁶            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts

    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(100)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   10994.606          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     488.508          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.002            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     481.390          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.163            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.020          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.002            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      18.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       9.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  558652.036          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2       0.016          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁴            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts
```
morningman pushed a commit to morningman/doris that referenced this pull request Aug 18, 2024
…tions apache#39015  (apache#39099)

## Proposed changes
apache#39015
### Description:

This issue proposes the addition of new features to the project,
including a deadlock detection tool and monitored lock implementations.
These features will help in identifying and debugging potential
deadlocks and monitoring lock usage. Features:


#### AbstractMonitoredLock:

A monitored version of Lock that tracks and logs lock acquisition and
release times. Functionality:
Overrides lock(), unlock(), tryLock(), and tryLock(long timeout,
TimeUnit unit) methods. Logs information about lock acquisition time,
release time, and any failure to acquire the lock within the specified
timeout. ##### eg
```log
2024-08-07 12:02:59  [ Thread-2:2006 ] - [ WARN ]  Thread ID: 12, Thread Name: Thread-2 - Lock held for 1912 ms, exceeding hold timeout of 1000 ms 
Thread stack trace:
	at java.lang.Thread.getStackTrace(Thread.java:1564)
	at org.example.lock.AbstractMonitoredLock.afterUnlock(AbstractMonitoredLock.java:49)
	at org.example.lock.MonitoredReentrantLock.unlock(MonitoredReentrantLock.java:32)
	at org.example.ExampleService.timeout(ExampleService.java:17)
	at org.example.Main.lambda$test2$1(Main.java:39)
	at java.lang.Thread.run(Thread.java:750)
```












#### DeadlockCheckerTool:

Uses ScheduledExecutorService for periodic deadlock checks. Logs
deadlock information including thread names, states, lock info, and
stack traces.

**ThreadMXBean accesses thread information in the local JVM, which is
already in memory, so accessing it is less expensive than fetching data
from external resources such as disk or network. Thread state cache: The
JVM typically maintains a cache of thread states, reducing the need for
real-time calculations or additional data processing.** ##### eg
```log
Thread Name: Thread-0
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Lock Owner Name: Thread-1
Lock Owner Id: 12
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Stack Trace: 
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$3(Main.java:79)
	at org.example.Main$$Lambda$1/1221555852.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)


2024-08-07 14:11:28  [ pool-1-thread-1:2001 ] - [ WARN ]  Deadlocks detected:
Thread Name: Thread-1
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Lock Owner Name: Thread-0
Lock Owner Id: 11
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Stack Trace: 
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$4(Main.java:93)
	at org.example.Main$$Lambda$2/1556956098.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)


```
##### benchmark
```
    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(1)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   15889.407          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     678.061          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.000            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     668.249          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.080            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.075          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.006            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      20.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       6.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  103130.635          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2      ≈ 10⁻⁴          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁶            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts

    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(100)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   10994.606          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     488.508          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.002            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     481.390          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.163            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.020          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.002            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      18.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       9.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  558652.036          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2       0.016          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁴            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts
```

## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->
CalvinKirs added a commit to CalvinKirs/incubator-doris that referenced this pull request Sep 13, 2024
…tions (apache#39015)

## Proposed changes

### Description:

This issue proposes the addition of new features to the project,
including a deadlock detection tool and monitored lock implementations.
These features will help in identifying and debugging potential
deadlocks and monitoring lock usage.
Features:

#### AbstractMonitoredLock:

A monitored version of Lock that tracks and logs lock acquisition and
release times.
Functionality:
Overrides lock(), unlock(), tryLock(), and tryLock(long timeout,
TimeUnit unit) methods.
Logs information about lock acquisition time, release time, and any
failure to acquire the lock within the specified timeout.
##### eg
```log
2024-08-07 12:02:59  [ Thread-2:2006 ] - [ WARN ]  Thread ID: 12, Thread Name: Thread-2 - Lock held for 1912 ms, exceeding hold timeout of 1000 ms
Thread stack trace:
    at java.lang.Thread.getStackTrace(Thread.java:1564)
    at org.example.lock.AbstractMonitoredLock.afterUnlock(AbstractMonitoredLock.java:49)
    at org.example.lock.MonitoredReentrantLock.unlock(MonitoredReentrantLock.java:32)
    at org.example.ExampleService.timeout(ExampleService.java:17)
    at org.example.Main.lambda$test2$1(Main.java:39)
    at java.lang.Thread.run(Thread.java:750)
```

#### DeadlockCheckerTool:

Uses ScheduledExecutorService for periodic deadlock checks.
Logs deadlock information including thread names, states, lock info, and
stack traces.

**ThreadMXBean accesses thread information in the local JVM, which is
already in memory, so accessing it is less expensive than fetching data
from external resources such as disk or network.
Thread state cache: The JVM typically maintains a cache of thread
states, reducing the need for real-time calculations or additional data
processing.**
##### eg
```log
Thread Name: Thread-0
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Lock Owner Name: Thread-1
Lock Owner Id: 12
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Stack Trace:
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
    at org.example.Main.lambda$testDeadLock$3(Main.java:79)
    at org.example.Main$$Lambda$1/1221555852.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:750)

2024-08-07 14:11:28  [ pool-1-thread-1:2001 ] - [ WARN ]  Deadlocks detected:
Thread Name: Thread-1
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Lock Owner Name: Thread-0
Lock Owner Id: 11
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Stack Trace:
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
    at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
    at org.example.Main.lambda$testDeadLock$4(Main.java:93)
    at org.example.Main$$Lambda$2/1556956098.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:750)

```
##### benchmark
```
    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(1)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   15889.407          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     678.061          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.000            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     668.249          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.080            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.075          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.006            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      20.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       6.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  103130.635          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2      ≈ 10⁻⁴          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁶            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts

    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(100)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   10994.606          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     488.508          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.002            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     481.390          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.163            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.020          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.002            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      18.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       9.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  558652.036          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2       0.016          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁴            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts
```
CalvinKirs added a commit to CalvinKirs/incubator-doris that referenced this pull request Sep 13, 2024
…k implementations (apache#39015)

## Proposed changes

### Description:

This issue proposes the addition of new features to the project,
including a deadlock detection tool and monitored lock implementations.
These features will help in identifying and debugging potential
deadlocks and monitoring lock usage.
Features:

#### AbstractMonitoredLock:

A monitored version of Lock that tracks and logs lock acquisition and
release times.
Functionality:
Overrides lock(), unlock(), tryLock(), and tryLock(long timeout,
TimeUnit unit) methods.
Logs information about lock acquisition time, release time, and any
failure to acquire the lock within the specified timeout.
##### eg
```log
2024-08-07 12:02:59  [ Thread-2:2006 ] - [ WARN ]  Thread ID: 12, Thread Name: Thread-2 - Lock held for 1912 ms, exceeding hold timeout of 1000 ms
Thread stack trace:
	at java.lang.Thread.getStackTrace(Thread.java:1564)
	at org.example.lock.AbstractMonitoredLock.afterUnlock(AbstractMonitoredLock.java:49)
	at org.example.lock.MonitoredReentrantLock.unlock(MonitoredReentrantLock.java:32)
	at org.example.ExampleService.timeout(ExampleService.java:17)
	at org.example.Main.lambda$test2$1(Main.java:39)
	at java.lang.Thread.run(Thread.java:750)
```

#### DeadlockCheckerTool:

Uses ScheduledExecutorService for periodic deadlock checks.
Logs deadlock information including thread names, states, lock info, and
stack traces.

**ThreadMXBean accesses thread information in the local JVM, which is
already in memory, so accessing it is less expensive than fetching data
from external resources such as disk or network.
Thread state cache: The JVM typically maintains a cache of thread
states, reducing the need for real-time calculations or additional data
processing.**
##### eg
```log
Thread Name: Thread-0
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Lock Owner Name: Thread-1
Lock Owner Id: 12
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Stack Trace:
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$3(Main.java:79)
	at org.example.Main$$Lambda$1/1221555852.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)

2024-08-07 14:11:28  [ pool-1-thread-1:2001 ] - [ WARN ]  Deadlocks detected:
Thread Name: Thread-1
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Lock Owner Name: Thread-0
Lock Owner Id: 11
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Stack Trace:
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$4(Main.java:93)
	at org.example.Main$$Lambda$2/1556956098.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)

```
##### benchmark
```
    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(1)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   15889.407          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     678.061          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.000            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     668.249          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.080            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.075          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.006            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      20.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       6.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  103130.635          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2      ≈ 10⁻⁴          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁶            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts

    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(100)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   10994.606          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     488.508          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.002            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     481.390          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.163            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.020          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.002            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      18.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       9.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  558652.036          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2       0.016          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁴            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts
```

(cherry picked from commit f0ca491)
liutang123 pushed a commit to liutang123/doris that referenced this pull request Sep 14, 2024
…tions apache#39015  (apache#39099)

This issue proposes the addition of new features to the project,
including a deadlock detection tool and monitored lock implementations.
These features will help in identifying and debugging potential
deadlocks and monitoring lock usage. Features:

A monitored version of Lock that tracks and logs lock acquisition and
release times. Functionality:
Overrides lock(), unlock(), tryLock(), and tryLock(long timeout,
TimeUnit unit) methods. Logs information about lock acquisition time,
release time, and any failure to acquire the lock within the specified
timeout. ##### eg
```log
2024-08-07 12:02:59  [ Thread-2:2006 ] - [ WARN ]  Thread ID: 12, Thread Name: Thread-2 - Lock held for 1912 ms, exceeding hold timeout of 1000 ms
Thread stack trace:
	at java.lang.Thread.getStackTrace(Thread.java:1564)
	at org.example.lock.AbstractMonitoredLock.afterUnlock(AbstractMonitoredLock.java:49)
	at org.example.lock.MonitoredReentrantLock.unlock(MonitoredReentrantLock.java:32)
	at org.example.ExampleService.timeout(ExampleService.java:17)
	at org.example.Main.lambda$test2$1(Main.java:39)
	at java.lang.Thread.run(Thread.java:750)
```

Uses ScheduledExecutorService for periodic deadlock checks. Logs
deadlock information including thread names, states, lock info, and
stack traces.

**ThreadMXBean accesses thread information in the local JVM, which is
already in memory, so accessing it is less expensive than fetching data
from external resources such as disk or network. Thread state cache: The
JVM typically maintains a cache of thread states, reducing the need for
real-time calculations or additional data processing.** ##### eg
```log
Thread Name: Thread-0
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Lock Owner Name: Thread-1
Lock Owner Id: 12
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Stack Trace:
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$3(Main.java:79)
	at org.example.Main$$Lambda$1/1221555852.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)

2024-08-07 14:11:28  [ pool-1-thread-1:2001 ] - [ WARN ]  Deadlocks detected:
Thread Name: Thread-1
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Lock Owner Name: Thread-0
Lock Owner Id: 11
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Stack Trace:
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$4(Main.java:93)
	at org.example.Main$$Lambda$2/1556956098.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)

```
```
    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(1)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   15889.407          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     678.061          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.000            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     668.249          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.080            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.075          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.006            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      20.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       6.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  103130.635          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2      ≈ 10⁻⁴          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁶            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts

    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(100)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   10994.606          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     488.508          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.002            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     481.390          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.163            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.020          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.002            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      18.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       9.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  558652.036          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2       0.016          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁴            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts
```

Issue Number: close #xxx

<!--Describe your changes.-->
liutang123 pushed a commit to liutang123/doris that referenced this pull request Sep 14, 2024
…tions apache#39015  (apache#39099)

This issue proposes the addition of new features to the project,
including a deadlock detection tool and monitored lock implementations.
These features will help in identifying and debugging potential
deadlocks and monitoring lock usage. Features:

A monitored version of Lock that tracks and logs lock acquisition and
release times. Functionality:
Overrides lock(), unlock(), tryLock(), and tryLock(long timeout,
TimeUnit unit) methods. Logs information about lock acquisition time,
release time, and any failure to acquire the lock within the specified
timeout. ##### eg
```log
2024-08-07 12:02:59  [ Thread-2:2006 ] - [ WARN ]  Thread ID: 12, Thread Name: Thread-2 - Lock held for 1912 ms, exceeding hold timeout of 1000 ms
Thread stack trace:
	at java.lang.Thread.getStackTrace(Thread.java:1564)
	at org.example.lock.AbstractMonitoredLock.afterUnlock(AbstractMonitoredLock.java:49)
	at org.example.lock.MonitoredReentrantLock.unlock(MonitoredReentrantLock.java:32)
	at org.example.ExampleService.timeout(ExampleService.java:17)
	at org.example.Main.lambda$test2$1(Main.java:39)
	at java.lang.Thread.run(Thread.java:750)
```

Uses ScheduledExecutorService for periodic deadlock checks. Logs
deadlock information including thread names, states, lock info, and
stack traces.

**ThreadMXBean accesses thread information in the local JVM, which is
already in memory, so accessing it is less expensive than fetching data
from external resources such as disk or network. Thread state cache: The
JVM typically maintains a cache of thread states, reducing the need for
real-time calculations or additional data processing.** ##### eg
```log
Thread Name: Thread-0
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Lock Owner Name: Thread-1
Lock Owner Id: 12
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Stack Trace:
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$3(Main.java:79)
	at org.example.Main$$Lambda$1/1221555852.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)

2024-08-07 14:11:28  [ pool-1-thread-1:2001 ] - [ WARN ]  Deadlocks detected:
Thread Name: Thread-1
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Lock Owner Name: Thread-0
Lock Owner Id: 11
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Stack Trace:
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$4(Main.java:93)
	at org.example.Main$$Lambda$2/1556956098.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)

```
```
    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(1)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   15889.407          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     678.061          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.000            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     668.249          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.080            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.075          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.006            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      20.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       6.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  103130.635          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2      ≈ 10⁻⁴          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁶            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts

    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(100)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   10994.606          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     488.508          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.002            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     481.390          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.163            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.020          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.002            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      18.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       9.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  558652.036          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2       0.016          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁴            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts
```

Issue Number: close #xxx

<!--Describe your changes.-->
liutang123 pushed a commit to liutang123/doris that referenced this pull request Sep 14, 2024
…tions apache#39015  (apache#39099)

This issue proposes the addition of new features to the project,
including a deadlock detection tool and monitored lock implementations.
These features will help in identifying and debugging potential
deadlocks and monitoring lock usage. Features:

A monitored version of Lock that tracks and logs lock acquisition and
release times. Functionality:
Overrides lock(), unlock(), tryLock(), and tryLock(long timeout,
TimeUnit unit) methods. Logs information about lock acquisition time,
release time, and any failure to acquire the lock within the specified
timeout. ##### eg
```log
2024-08-07 12:02:59  [ Thread-2:2006 ] - [ WARN ]  Thread ID: 12, Thread Name: Thread-2 - Lock held for 1912 ms, exceeding hold timeout of 1000 ms
Thread stack trace:
	at java.lang.Thread.getStackTrace(Thread.java:1564)
	at org.example.lock.AbstractMonitoredLock.afterUnlock(AbstractMonitoredLock.java:49)
	at org.example.lock.MonitoredReentrantLock.unlock(MonitoredReentrantLock.java:32)
	at org.example.ExampleService.timeout(ExampleService.java:17)
	at org.example.Main.lambda$test2$1(Main.java:39)
	at java.lang.Thread.run(Thread.java:750)
```

Uses ScheduledExecutorService for periodic deadlock checks. Logs
deadlock information including thread names, states, lock info, and
stack traces.

**ThreadMXBean accesses thread information in the local JVM, which is
already in memory, so accessing it is less expensive than fetching data
from external resources such as disk or network. Thread state cache: The
JVM typically maintains a cache of thread states, reducing the need for
real-time calculations or additional data processing.** ##### eg
```log
Thread Name: Thread-0
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Lock Owner Name: Thread-1
Lock Owner Id: 12
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Stack Trace:
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$3(Main.java:79)
	at org.example.Main$$Lambda$1/1221555852.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)

2024-08-07 14:11:28  [ pool-1-thread-1:2001 ] - [ WARN ]  Deadlocks detected:
Thread Name: Thread-1
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Lock Owner Name: Thread-0
Lock Owner Id: 11
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Stack Trace:
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$4(Main.java:93)
	at org.example.Main$$Lambda$2/1556956098.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)

```
```
    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(1)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   15889.407          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     678.061          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.000            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     668.249          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.080            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.075          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.006            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      20.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       6.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  103130.635          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2      ≈ 10⁻⁴          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁶            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts

    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(100)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   10994.606          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     488.508          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.002            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     481.390          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.163            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.020          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.002            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      18.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       9.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  558652.036          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2       0.016          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁴            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts
```

Issue Number: close #xxx

<!--Describe your changes.-->
liutang123 pushed a commit to liutang123/doris that referenced this pull request Sep 14, 2024
…tions apache#39015  (apache#39099)

This issue proposes the addition of new features to the project,
including a deadlock detection tool and monitored lock implementations.
These features will help in identifying and debugging potential
deadlocks and monitoring lock usage. Features:

A monitored version of Lock that tracks and logs lock acquisition and
release times. Functionality:
Overrides lock(), unlock(), tryLock(), and tryLock(long timeout,
TimeUnit unit) methods. Logs information about lock acquisition time,
release time, and any failure to acquire the lock within the specified
timeout. ##### eg
```log
2024-08-07 12:02:59  [ Thread-2:2006 ] - [ WARN ]  Thread ID: 12, Thread Name: Thread-2 - Lock held for 1912 ms, exceeding hold timeout of 1000 ms
Thread stack trace:
	at java.lang.Thread.getStackTrace(Thread.java:1564)
	at org.example.lock.AbstractMonitoredLock.afterUnlock(AbstractMonitoredLock.java:49)
	at org.example.lock.MonitoredReentrantLock.unlock(MonitoredReentrantLock.java:32)
	at org.example.ExampleService.timeout(ExampleService.java:17)
	at org.example.Main.lambda$test2$1(Main.java:39)
	at java.lang.Thread.run(Thread.java:750)
```

Uses ScheduledExecutorService for periodic deadlock checks. Logs
deadlock information including thread names, states, lock info, and
stack traces.

**ThreadMXBean accesses thread information in the local JVM, which is
already in memory, so accessing it is less expensive than fetching data
from external resources such as disk or network. Thread state cache: The
JVM typically maintains a cache of thread states, reducing the need for
real-time calculations or additional data processing.** ##### eg
```log
Thread Name: Thread-0
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Lock Owner Name: Thread-1
Lock Owner Id: 12
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Stack Trace:
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$3(Main.java:79)
	at org.example.Main$$Lambda$1/1221555852.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)

2024-08-07 14:11:28  [ pool-1-thread-1:2001 ] - [ WARN ]  Deadlocks detected:
Thread Name: Thread-1
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Lock Owner Name: Thread-0
Lock Owner Id: 11
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Stack Trace:
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$4(Main.java:93)
	at org.example.Main$$Lambda$2/1556956098.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)

```
```
    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(1)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   15889.407          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     678.061          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.000            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     668.249          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.080            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.075          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.006            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      20.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       6.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  103130.635          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2      ≈ 10⁻⁴          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁶            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts

    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(100)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   10994.606          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     488.508          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.002            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     481.390          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.163            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.020          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.002            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      18.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       9.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  558652.036          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2       0.016          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁴            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts
```

Issue Number: close #xxx

<!--Describe your changes.-->
liutang123 pushed a commit to liutang123/doris that referenced this pull request Sep 19, 2024
…tions apache#39015  (apache#39099)

This issue proposes the addition of new features to the project,
including a deadlock detection tool and monitored lock implementations.
These features will help in identifying and debugging potential
deadlocks and monitoring lock usage. Features:

A monitored version of Lock that tracks and logs lock acquisition and
release times. Functionality:
Overrides lock(), unlock(), tryLock(), and tryLock(long timeout,
TimeUnit unit) methods. Logs information about lock acquisition time,
release time, and any failure to acquire the lock within the specified
timeout. ##### eg
```log
2024-08-07 12:02:59  [ Thread-2:2006 ] - [ WARN ]  Thread ID: 12, Thread Name: Thread-2 - Lock held for 1912 ms, exceeding hold timeout of 1000 ms
Thread stack trace:
	at java.lang.Thread.getStackTrace(Thread.java:1564)
	at org.example.lock.AbstractMonitoredLock.afterUnlock(AbstractMonitoredLock.java:49)
	at org.example.lock.MonitoredReentrantLock.unlock(MonitoredReentrantLock.java:32)
	at org.example.ExampleService.timeout(ExampleService.java:17)
	at org.example.Main.lambda$test2$1(Main.java:39)
	at java.lang.Thread.run(Thread.java:750)
```

Uses ScheduledExecutorService for periodic deadlock checks. Logs
deadlock information including thread names, states, lock info, and
stack traces.

**ThreadMXBean accesses thread information in the local JVM, which is
already in memory, so accessing it is less expensive than fetching data
from external resources such as disk or network. Thread state cache: The
JVM typically maintains a cache of thread states, reducing the need for
real-time calculations or additional data processing.** ##### eg
```log
Thread Name: Thread-0
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Lock Owner Name: Thread-1
Lock Owner Id: 12
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@1d653213
Stack Trace:
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$3(Main.java:79)
	at org.example.Main$$Lambda$1/1221555852.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)

2024-08-07 14:11:28  [ pool-1-thread-1:2001 ] - [ WARN ]  Deadlocks detected:
Thread Name: Thread-1
Thread State: WAITING
Lock Name: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Lock Owner Name: Thread-0
Lock Owner Id: 11
Waited Time: -1
Blocked Time: -1
Lock Info: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Blocked by: java.util.concurrent.locks.ReentrantLock$NonfairSync@13a2dfcf
Stack Trace:
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
	at org.example.lock.MonitoredReentrantLock.lock(MonitoredReentrantLock.java:22)
	at org.example.Main.lambda$testDeadLock$4(Main.java:93)
	at org.example.Main$$Lambda$2/1556956098.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:750)

```
```
    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(1)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   15889.407          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     678.061          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.000            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     668.249          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.080            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.075          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.006            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      20.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       6.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  103130.635          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2      ≈ 10⁻⁴          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁶            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts

    @WarmUp(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
    @threads(100)

Benchmark                                                          Mode  Cnt       Score   Error   Units
LockBenchmark.testMonitoredLock                                   thrpt    2   10994.606          ops/ms
LockBenchmark.testMonitoredLock:·gc.alloc.rate                    thrpt    2     488.508          MB/sec
LockBenchmark.testMonitoredLock:·gc.alloc.rate.norm               thrpt    2      56.002            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space           thrpt    2     481.390          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Eden_Space.norm      thrpt    2      55.163            B/op
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space       thrpt    2       0.020          MB/sec
LockBenchmark.testMonitoredLock:·gc.churn.PS_Survivor_Space.norm  thrpt    2       0.002            B/op
LockBenchmark.testMonitoredLock:·gc.count                         thrpt    2      18.000          counts
LockBenchmark.testMonitoredLock:·gc.time                          thrpt    2       9.000              ms
LockBenchmark.testNativeLock                                      thrpt    2  558652.036          ops/ms
LockBenchmark.testNativeLock:·gc.alloc.rate                       thrpt    2       0.016          MB/sec
LockBenchmark.testNativeLock:·gc.alloc.rate.norm                  thrpt    2      ≈ 10⁻⁴            B/op
LockBenchmark.testNativeLock:·gc.count                            thrpt    2         ≈ 0          counts
```

Issue Number: close #xxx

<!--Describe your changes.-->
liutang123 pushed a commit to liutang123/doris that referenced this pull request Sep 20, 2024
liutang123 pushed a commit to liutang123/doris that referenced this pull request Sep 20, 2024
lide-reed pushed a commit that referenced this pull request Sep 20, 2024
…ock (#40746)

Use the fe config like #39015, print table lock usage duration.
In 2.0.6, I found `replayer` thread is blocked for a long time.
```
2024-09-12 11:00:58,797 INFO (replayer|89) [DatabaseTransactionMgr.replayUpsertTransactionState():1977] replay a visible transaction TransactionState. transaction id: 471728922, label: flink_connector_20240912_103150_269f8d635ffb455bb4043eecd1a8406b, db id: 475180, table id list: 14473765, callback id: -1, coordinator: BE: 10.217.0.100, transaction status: VISIBLE, error replicas num: 0, replica ids: , prepare time: 1726108310505, commit time: 1726108310579, finish time: 1726108310604, reason: 
2024-09-12 11:00:58,798 WARN (replayer|89) [Env.replayJournal():2575] entityCost:940256 loadJournalCost:940256 logId:1392353661 replayedJournalId:1392353661 code:100 size:562
```
And, there are many table lock fail log:
```
2024-09-12 10:56:53,167 WARN (mysql-nio-pool-32036|412611) [Table.tryReadLock():155] Failed to try table atdpersonpaycode's read lock. timeout 1 MINUTES. Current owner: null
```
But, I don't kown what thread hold the table lock.
So, add some logs to print lock holding info.
CalvinKirs added a commit to CalvinKirs/incubator-doris that referenced this pull request Sep 23, 2024
…branch-3.0-apache#39015

# Conflicts:
#	fe/fe-common/src/main/java/org/apache/doris/common/Config.java
CalvinKirs added a commit to CalvinKirs/incubator-doris that referenced this pull request Oct 11, 2024
morningman pushed a commit that referenced this pull request Oct 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. dev/2.1.6-merged dev/3.0.3-merged reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants