Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HLS: Fix on_hls and hls_dispose critical zone issue. v5.0.174 v6.0.69 #3781

Merged
merged 5 commits into from
Aug 28, 2023

Conversation

winlinvip
Copy link
Member

@winlinvip winlinvip commented Aug 23, 2023

on_hls and hls_dispose are two coroutines, with potential race conditions. That is, during on_hls, if the API Server being accessed is slower, it will switch to the hls_dispose coroutine to start cleaning up. However, when the API Server is processing the slice, a situation may occur where the slice does not exist, resulting in the following log:

[2023-08-22 12:03:20.309][WARN][40][x5l48q7b][11] ignore task failed code=4005(HttpStatus)(Invalid HTTP status code) : callback on_hls http://localhost:2024/terraform/v1/hooks/srs/hls : http: post http://localhost:2024/terraform/v1/hooks/srs/hls with {"server_id":"vid-5d7dxn8","service_id":"cu153o7g","action":"on_hls","client_id":"x5l48q7b","ip":"172.17.0.1","vhost":"__defaultVhost__","app":"live","tcUrl":"srt://172.17.0.2/live","stream":"stream-44572-2739617660809856576","param":"secret=1ed8e0ffbc53439c8fc8da30ab8c19f0","duration":4.57,"cwd":"/usr/local/srs-stack/platform","file":"./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts","url":"live/stream-44572-2739617660809856576-1.ts","m3u8":"./objs/nginx/html/live/stream-44572-2739617660809856576.m3u8","m3u8_url":"live/stream-44572-2739617660809856576.m3u8","seq_no":1,"stream_url":"/live/stream-44572-2739617660809856576","stream_id":"vid-0n9zoz3"}, status=500, res=invalid ts file ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts: stat ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts: no such file or directory
thread [40][x5l48q7b]: call() [./src/app/srs_app_hls.cpp:122][errno=11]
thread [40][x5l48q7b]: on_hls() [./src/app/srs_app_http_hooks.cpp:401][errno=11]
thread [40][x5l48q7b]: do_post() [./src/app/srs_app_http_hooks.cpp:638][errno=11]


[error] 2023/08/22 12:03:20.076984 [52][1001] Serve /terraform/v1/hooks/srs/hls failed, err is stat ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts: no such file or directory
invalid ts file ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts
main.handleOnHls.func1.1
	/g/platform/srs-hooks.go:684
main.handleOnHls.func1
	/g/platform/srs-hooks.go:720
net/http.HandlerFunc.ServeHTTP
	/usr/local/go/src/net/http/server.go:2084
net/http.(*ServeMux).ServeHTTP
	/usr/local/go/src/net/http/server.go:2462
net/http.serverHandler.ServeHTTP
	/usr/local/go/src/net/http/server.go:2916
net/http.(*conn).serve
	/usr/local/go/src/net/http/server.go:1966
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1571

Similarly, when stopping the stream, on_hls will also be called to handle the last slice. If the API Server is slower at this time, it will enter hls_dispose and call unpublish repeatedly. Since the previous unpublish is still blocked in on_hls, the following interference log will appear:

[2023-08-22 12:03:18.748][INFO][40][6498088c] hls cycle to dispose hls /live/stream-44572-2739617660809856576, timeout=10000000ms
[2023-08-22 12:03:18.752][WARN][40][6498088c][115] flush audio ignored, for segment is not open.
[2023-08-22 12:03:18.752][WARN][40][6498088c][115] ignore the segment close, for segment is not open.

Although this log will not cause problems, it can interfere with judgment.

The solution is to add an 'unpublishing' status. If it is in the 'unpublishing' status, then do not clean up the slices.


TRANS_BY_GPT4


Co-authored-by: Haibo Chen [email protected]

@winlinvip winlinvip linked an issue Aug 23, 2023 that may be closed by this pull request
@winlinvip winlinvip added this to the 5.0 milestone Aug 23, 2023
@winlinvip winlinvip changed the title HLS: Fix on_hls and reload critical zone issue. HLS: Fix on_hls and hls_dispose critical zone issue. Aug 23, 2023
@winlinvip winlinvip changed the title HLS: Fix on_hls and hls_dispose critical zone issue. HLS: Fix on_hls and hls_dispose critical zone issue. v5.0.174 v6.0.69 Aug 28, 2023
@winlinvip winlinvip added the RefinedByAI Refined by AI/GPT. label Aug 28, 2023
@winlinvip winlinvip merged commit 5147e1c into ossrs:develop Aug 28, 2023
17 checks passed
winlinvip added a commit that referenced this pull request Aug 28, 2023
…#3781)

on_hls and hls_dispose are two coroutines, with potential race
conditions. That is, during on_hls, if the API Server being accessed is
slower, it will switch to the hls_dispose coroutine to start cleaning
up. However, when the API Server is processing the slice, a situation
may occur where the slice does not exist, resulting in the following
log:

```
[2023-08-22 12:03:20.309][WARN][40][x5l48q7b][11] ignore task failed code=4005(HttpStatus)(Invalid HTTP status code) : callback on_hls http://localhost:2024/terraform/v1/hooks/srs/hls : http: post http://localhost:2024/terraform/v1/hooks/srs/hls with {"server_id":"vid-5d7dxn8","service_id":"cu153o7g","action":"on_hls","client_id":"x5l48q7b","ip":"172.17.0.1","vhost":"__defaultVhost__","app":"live","tcUrl":"srt://172.17.0.2/live","stream":"stream-44572-2739617660809856576","param":"secret=1ed8e0ffbc53439c8fc8da30ab8c19f0","duration":4.57,"cwd":"/usr/local/srs-stack/platform","file":"./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts","url":"live/stream-44572-2739617660809856576-1.ts","m3u8":"./objs/nginx/html/live/stream-44572-2739617660809856576.m3u8","m3u8_url":"live/stream-44572-2739617660809856576.m3u8","seq_no":1,"stream_url":"/live/stream-44572-2739617660809856576","stream_id":"vid-0n9zoz3"}, status=500, res=invalid ts file ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts: stat ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts: no such file or directory
thread [40][x5l48q7b]: call() [./src/app/srs_app_hls.cpp:122][errno=11]
thread [40][x5l48q7b]: on_hls() [./src/app/srs_app_http_hooks.cpp:401][errno=11]
thread [40][x5l48q7b]: do_post() [./src/app/srs_app_http_hooks.cpp:638][errno=11]

[error] 2023/08/22 12:03:20.076984 [52][1001] Serve /terraform/v1/hooks/srs/hls failed, err is stat ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts: no such file or directory
invalid ts file ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts
main.handleOnHls.func1.1
	/g/platform/srs-hooks.go:684
main.handleOnHls.func1
	/g/platform/srs-hooks.go:720
net/http.HandlerFunc.ServeHTTP
	/usr/local/go/src/net/http/server.go:2084
net/http.(*ServeMux).ServeHTTP
	/usr/local/go/src/net/http/server.go:2462
net/http.serverHandler.ServeHTTP
	/usr/local/go/src/net/http/server.go:2916
net/http.(*conn).serve
	/usr/local/go/src/net/http/server.go:1966
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1571
```

Similarly, when stopping the stream, on_hls will also be called to
handle the last slice. If the API Server is slower at this time, it will
enter hls_dispose and call unpublish repeatedly. Since the previous
unpublish is still blocked in on_hls, the following interference log
will appear:

```
[2023-08-22 12:03:18.748][INFO][40][6498088c] hls cycle to dispose hls /live/stream-44572-2739617660809856576, timeout=10000000ms
[2023-08-22 12:03:18.752][WARN][40][6498088c][115] flush audio ignored, for segment is not open.
[2023-08-22 12:03:18.752][WARN][40][6498088c][115] ignore the segment close, for segment is not open.
```

Although this log will not cause problems, it can interfere with
judgment.

The solution is to add an 'unpublishing' status. If it is in the
'unpublishing' status, then do not clean up the slices.

---------

Co-authored-by: Haibo Chen <[email protected]>
winlinvip added a commit that referenced this pull request Aug 28, 2023
on_hls and hls_dispose are two coroutines, with potential race
conditions. That is, during on_hls, if the API Server being accessed is
slower, it will switch to the hls_dispose coroutine to start cleaning
up. However, when the API Server is processing the slice, a situation
may occur where the slice does not exist, resulting in the following
log:

```
[2023-08-22 12:03:20.309][WARN][40][x5l48q7b][11] ignore task failed code=4005(HttpStatus)(Invalid HTTP status code) : callback on_hls http://localhost:2024/terraform/v1/hooks/srs/hls : http: post http://localhost:2024/terraform/v1/hooks/srs/hls with {"server_id":"vid-5d7dxn8","service_id":"cu153o7g","action":"on_hls","client_id":"x5l48q7b","ip":"172.17.0.1","vhost":"__defaultVhost__","app":"live","tcUrl":"srt://172.17.0.2/live","stream":"stream-44572-2739617660809856576","param":"secret=1ed8e0ffbc53439c8fc8da30ab8c19f0","duration":4.57,"cwd":"/usr/local/srs-stack/platform","file":"./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts","url":"live/stream-44572-2739617660809856576-1.ts","m3u8":"./objs/nginx/html/live/stream-44572-2739617660809856576.m3u8","m3u8_url":"live/stream-44572-2739617660809856576.m3u8","seq_no":1,"stream_url":"/live/stream-44572-2739617660809856576","stream_id":"vid-0n9zoz3"}, status=500, res=invalid ts file ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts: stat ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts: no such file or directory
thread [40][x5l48q7b]: call() [./src/app/srs_app_hls.cpp:122][errno=11]
thread [40][x5l48q7b]: on_hls() [./src/app/srs_app_http_hooks.cpp:401][errno=11]
thread [40][x5l48q7b]: do_post() [./src/app/srs_app_http_hooks.cpp:638][errno=11]

[error] 2023/08/22 12:03:20.076984 [52][1001] Serve /terraform/v1/hooks/srs/hls failed, err is stat ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts: no such file or directory
invalid ts file ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts
main.handleOnHls.func1.1
	/g/platform/srs-hooks.go:684
main.handleOnHls.func1
	/g/platform/srs-hooks.go:720
net/http.HandlerFunc.ServeHTTP
	/usr/local/go/src/net/http/server.go:2084
net/http.(*ServeMux).ServeHTTP
	/usr/local/go/src/net/http/server.go:2462
net/http.serverHandler.ServeHTTP
	/usr/local/go/src/net/http/server.go:2916
net/http.(*conn).serve
	/usr/local/go/src/net/http/server.go:1966
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1571
```

Similarly, when stopping the stream, on_hls will also be called to
handle the last slice. If the API Server is slower at this time, it will
enter hls_dispose and call unpublish repeatedly. Since the previous
unpublish is still blocked in on_hls, the following interference log
will appear:

```
[2023-08-22 12:03:18.748][INFO][40][6498088c] hls cycle to dispose hls /live/stream-44572-2739617660809856576, timeout=10000000ms
[2023-08-22 12:03:18.752][WARN][40][6498088c][115] flush audio ignored, for segment is not open.
[2023-08-22 12:03:18.752][WARN][40][6498088c][115] ignore the segment close, for segment is not open.
```

Although this log will not cause problems, it can interfere with
judgment.

The solution is to add an 'unpublishing' status. If it is in the
'unpublishing' status, then do not clean up the slices.

---------

Co-authored-by: Haibo Chen <[email protected]>
duiniuluantanqin added a commit to duiniuluantanqin/srs that referenced this pull request Sep 14, 2023
…3781)

on_hls and hls_dispose are two coroutines, with potential race
conditions. That is, during on_hls, if the API Server being accessed is
slower, it will switch to the hls_dispose coroutine to start cleaning
up. However, when the API Server is processing the slice, a situation
may occur where the slice does not exist, resulting in the following
log:

```
[2023-08-22 12:03:20.309][WARN][40][x5l48q7b][11] ignore task failed code=4005(HttpStatus)(Invalid HTTP status code) : callback on_hls http://localhost:2024/terraform/v1/hooks/srs/hls : http: post http://localhost:2024/terraform/v1/hooks/srs/hls with {"server_id":"vid-5d7dxn8","service_id":"cu153o7g","action":"on_hls","client_id":"x5l48q7b","ip":"172.17.0.1","vhost":"__defaultVhost__","app":"live","tcUrl":"srt://172.17.0.2/live","stream":"stream-44572-2739617660809856576","param":"secret=1ed8e0ffbc53439c8fc8da30ab8c19f0","duration":4.57,"cwd":"/usr/local/srs-stack/platform","file":"./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts","url":"live/stream-44572-2739617660809856576-1.ts","m3u8":"./objs/nginx/html/live/stream-44572-2739617660809856576.m3u8","m3u8_url":"live/stream-44572-2739617660809856576.m3u8","seq_no":1,"stream_url":"/live/stream-44572-2739617660809856576","stream_id":"vid-0n9zoz3"}, status=500, res=invalid ts file ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts: stat ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts: no such file or directory
thread [40][x5l48q7b]: call() [./src/app/srs_app_hls.cpp:122][errno=11]
thread [40][x5l48q7b]: on_hls() [./src/app/srs_app_http_hooks.cpp:401][errno=11]
thread [40][x5l48q7b]: do_post() [./src/app/srs_app_http_hooks.cpp:638][errno=11]

[error] 2023/08/22 12:03:20.076984 [52][1001] Serve /terraform/v1/hooks/srs/hls failed, err is stat ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts: no such file or directory
invalid ts file ./objs/nginx/html/live/stream-44572-2739617660809856576-1.ts
main.handleOnHls.func1.1
	/g/platform/srs-hooks.go:684
main.handleOnHls.func1
	/g/platform/srs-hooks.go:720
net/http.HandlerFunc.ServeHTTP
	/usr/local/go/src/net/http/server.go:2084
net/http.(*ServeMux).ServeHTTP
	/usr/local/go/src/net/http/server.go:2462
net/http.serverHandler.ServeHTTP
	/usr/local/go/src/net/http/server.go:2916
net/http.(*conn).serve
	/usr/local/go/src/net/http/server.go:1966
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1571
```

Similarly, when stopping the stream, on_hls will also be called to
handle the last slice. If the API Server is slower at this time, it will
enter hls_dispose and call unpublish repeatedly. Since the previous
unpublish is still blocked in on_hls, the following interference log
will appear:

```
[2023-08-22 12:03:18.748][INFO][40][6498088c] hls cycle to dispose hls /live/stream-44572-2739617660809856576, timeout=10000000ms
[2023-08-22 12:03:18.752][WARN][40][6498088c][115] flush audio ignored, for segment is not open.
[2023-08-22 12:03:18.752][WARN][40][6498088c][115] ignore the segment close, for segment is not open.
```

Although this log will not cause problems, it can interfere with
judgment.

The solution is to add an 'unpublishing' status. If it is in the
'unpublishing' status, then do not clean up the slices.

---------

Co-authored-by: Haibo Chen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
RefinedByAI Refined by AI/GPT. TransByAI Translated by AI/GPT.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SRT: Callback on_hls failed when disposing HLS.
2 participants