could not reduce file size #1605
Comments
I do not see any failure logs in the shared log file, nor could I find any reference to the file "/rman-backup/step/0/2025-01-03_1146/STEP_148867_1_c33ei92k_20250106.incr0" in the logs. Are you sure the correct logs were shared? As per the logs, the last mount was just trying to create the blobfuse2.log file on the container, which makes me curious about the log file location. Are you trying to save the blobfuse logs in the mounted container itself?
@vibhansa-msft , @ashruti-msft : Attached the fresh logs now.
Hi @sandip094, your logs end at timestamp 11:29:25, and after a break the logs indicate blobfuse was mounted again. Can you explain what might be the reason for this? Since the error occurred at 11:29:34 and there are no logs around that time, I cannot debug further.
Hello @ashruti-msft , As soon as the error occurs, /rman-backup gets disconnected and has to be remounted manually, which I did.
Hi @sandip094, after analyzing the logs I could not find any indication of an issue from Blobfuse during the transfer. It looks like there is a connection abort due to which the backup fails. Could you please confirm that there are no network issues or any other reason causing this error?
@ashruti-msft : I can confirm that there has been no network issue. However, what’s puzzling is why the /rman-backup mount gets disconnected every time the error occurs. Is there a way to ensure that this mount remains connected and doesn’t get disconnected?
Do you mean to say that in case of an error the mount is no longer valid?
@vibhansa-msft : Yes, that's right. Attached the dump logs as requested:
`/usr/local/go/src/bufio/bufio.go:148 +0x53 fp=0xc000c67d80 sp=0xc000c67d60 pc=0x8d59f3
goroutine 224 gp=0xc0005df6c0 m=nil [IO wait, 1 minutes]:
goroutine 3166 gp=0xc00080e000 m=nil [IO wait, 1 minutes]:
goroutine 219 gp=0xc000235c00 m=nil [select, 1 minutes]:
goroutine 3167 gp=0xc00080e700 m=nil [select, 1 minutes]:
goroutine 3170 gp=0xc00080ee00 m=nil [IO wait, 1 minutes]:
goroutine 3191 gp=0xc000892700 m=nil [IO wait, 1 minutes]:
goroutine 3116 gp=0xc0007a7340 m=nil [select, 1 minutes]:
goroutine 225 gp=0xc0007a76c0 m=nil [select, 1 minutes]:
goroutine 3218 gp=0xc000838a80 m=nil [select]:
goroutine 341 gp=0xc000978a80 m=44 mp=0xc0006df808 [syscall, locked to thread]:
goroutine 342 gp=0xc000978fc0 m=45 mp=0xc00060f808 [syscall, locked to thread]:
goroutine 343 gp=0xc0006ba380 m=46 mp=0xc000103108 [syscall, locked to thread]:
goroutine 362 gp=0xc000979500 m=49 mp=0xc00075c708 [syscall, locked to thread]:
goroutine 507 gp=0xc0006bae00 m=50 mp=0xc00075ce08 [syscall, locked to thread]:
goroutine 508 gp=0xc00074e8c0 m=51 mp=0xc0008a4008 [syscall, locked to thread]:
goroutine 3154 gp=0xc00074fc00 m=nil [select]:
goroutine 3159 gp=0xc0006bb6c0 m=nil [IO wait, 1 minutes]:
goroutine 3171 gp=0xc0006bba40 m=nil [select, 1 minutes]:
goroutine 3219 gp=0xc0006bbdc0 m=nil [IO wait, 1 minutes]:
goroutine 3133 gp=0xc000568380 m=nil [select, 1 minutes]:
goroutine 3132 gp=0xc0008921c0 m=nil [IO wait, 1 minutes]:
goroutine 3220 gp=0xc0001736c0 m=nil [select, 1 minutes]:
goroutine 3192 gp=0xc0008ec540 m=nil [select, 1 minutes]:
goroutine 3196 gp=0xc0008ec8c0 m=nil [select, 1 minutes]:
goroutine 3216 gp=0xc0008ecfc0 m=nil [select, 1 minutes]:
goroutine 3173 gp=0xc0008ed340 m=nil [select, 1 minutes]:
`
Hey Sandip, thanks for sharing the screen logs. Unfortunately the logs are not complete: the top part has scrolled out of the terminal buffer. We need the console logs right from the beginning, as that part shows the most important details of this crash. You can either redirect the output to a file or use a larger scrollback buffer so that you can scroll up and capture the entire log.
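A minimal sketch of the redirect-to-file option, assuming a POSIX shell. The helper name `capture_console` and the log path `/tmp/blobfuse2-console.log` are hypothetical; the mount point and config path in the usage comment are the ones from this issue, and the `--config-file` and `--foreground` flags are assumed to be available in the installed blobfuse2 version.

```shell
#!/bin/sh
# capture_console LOGFILE COMMAND [ARGS...]
# Runs COMMAND and appends both stdout and stderr to LOGFILE, so the
# first lines of a crash dump are kept even if the terminal scrolls.
capture_console() {
    logfile=$1
    shift
    "$@" >>"$logfile" 2>&1
}

# Example (not executed here; a foreground mount blocks until it exits):
#   capture_console /tmp/blobfuse2-console.log \
#       blobfuse2 mount /rman-backup \
#       --config-file=/etc/blobfuse/blobfuseconfig.yaml --foreground
```

The log file should live on a local disk rather than inside the mounted container, otherwise the crash output is lost together with the mount.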
Hello @vibhansa-msft , Additional info:
# Refer ./setup/baseConfig.yaml for full set of config parameters
#allow-other: false
logging:
libfuse:
file_cache:
azstorage:
From the crash log it appears you are running out of memory; the very first line of the output says so. While blobfuse is running, try to observe the free memory on your system. From your config it does not appear that blobfuse should be consuming too much memory, but for some reason the system is running out of memory, causing blobfuse to crash. This may also happen if the system is already low on memory and your backup application and blobfuse combined then put the system under memory pressure. You need to monitor the memory usage of blobfuse and your backup application while your workflow is running.
Hello @vibhansa-msft , |
For that you need to monitor the memory usage while your RMAN process is running. If blobfuse is found to be using more, you could migrate to a block-cache based config, where memory is preallocated and blobfuse tries to run within that limit. If RMAN is using more, you could try to reduce the number of parallel file operations, or check whether it has any config to control its memory usage.
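One way to do this monitoring is sketched below, assuming a Linux host with `/proc`, `awk` and `pgrep` available. The helper names `rss_kb` and `log_rss`, the output file `/tmp/mem-usage.log`, and the process names in the usage comment are assumptions for illustration; the Oracle processes that actually write the backup may run under a different name than `rman`.

```shell
#!/bin/sh
# rss_kb PID: print the resident set size (VmRSS) of PID in kB,
# as reported by /proc/PID/status.
rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# log_rss NAME: print one timestamped line for every running process
# whose name is exactly NAME. Prints nothing if no such process exists.
log_rss() {
    for pid in $(pgrep -x "$1"); do
        printf '%s %s pid=%s rss_kb=%s\n' \
            "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$pid" "$(rss_kb "$pid")"
    done
}

# Example (not executed here): sample every 5 seconds during the backup.
#   while sleep 5; do log_rss blobfuse2; log_rss rman; done >>/tmp/mem-usage.log
```

Comparing the two series over the run should show which process grows before the crash.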
@vibhansa-msft , I work with Sandeep. I notice this warning: config: (just added max-size-mb again as we thought it made a difference, but not the case)
Mount point disappears ... |
The file open flags message is only a warning and it is safe to ignore. When you say "mount disappears", what do you mean?
Hello @vibhansa-msft , The issue occurs only with the latest version (2.4.0). After downgrading to version 2.3.2, the backup runs without any problems. With the latest version, the system is running out of memory, causing Blobfuse to crash and the mount to disappear. Logs generated using the foreground option have already been shared.
Interesting that it used to work fine on 2.3.2. Do you observe the same with the block-cache config as well? I see that the console log shared earlier shows a crash with "out of memory", which suggests some sort of memory leak. We observed the same in one of our other tools some time back, and it was root-caused to a Go language upgrade. I assume we are hitting the same here. Is it possible for you to build the code from source and validate this? I can help you with building the binary.
Hello @vibhansa-msft , Regards |
You can git clone this repo and then check out the "blobfuse/2.4.1" branch, where we have pushed this change set. To build the code, just execute "./build.sh" in the workspace root and it will create the blobfuse binary for you. You then only need to replace your existing binary with the new one and you are good to go. In case you do not have golang installed on your system, execute "./go_installer.sh ../" in the workspace root first, and then run the build as described above.
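The steps above can be sketched as a small script. The function names `run` and `build_blobfuse`, the `DRY_RUN` switch, and the clone directory name are hypothetical; the repository URL is passed in as an argument and is assumed to be the URL of this repo. The branch name and the two scripts are the ones named in the comment above.

```shell
#!/bin/sh
# run CMD...: execute CMD, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# build_blobfuse REPO_URL: clone the repo, switch to the fix branch,
# install Go only if it is missing, then build. The body runs in a
# subshell so the caller's working directory is left unchanged.
build_blobfuse() (
    run git clone "$1" azure-storage-fuse &&
    run cd azure-storage-fuse &&
    run git checkout blobfuse/2.4.1 &&
    { command -v go >/dev/null 2>&1 || run ./go_installer.sh ../; } &&
    run ./build.sh
)

# Example (not executed here):
#   DRY_RUN=1 build_blobfuse <repo-url>   # preview the commands
#   build_blobfuse <repo-url>             # actually clone and build
```

Replacing the installed binary with the freshly built one is left as a manual step, since the install location depends on how blobfuse2 was originally installed.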
Did it help?
Hello @vibhansa-msft , |
Great, then we will close this item here. If you hit the issue again, you can reopen this or create a new one and we can investigate from there.
Which version of blobfuse was used?
blobfuse2 version: 2.4.0
Which OS distribution and version are you using?
CentOS Linux release 7.9.2009 (Core)
What was the issue encountered?
Getting the below error while running the RMAN backup.
released channel: C1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup plus archivelog command at 01/06/2025 11:53:13
ORA-19510: failed to set size of 1758986 blocks for file "/rman-backup/step/0/2025-01-03_1146/STEP_148867_1_c33ei92k_20250106.incr0" (block size=8192)
ORA-27059: could not reduce file size
Linux-x86_64 Error: 103: Software caused connection abort
Additional information: 2
The configuration file, /etc/blobfuse/blobfuseconfig.yaml, is as below:
`# Refer ./setup/baseConfig.yaml for full set of config parameters
#allow-other: false
logging:
type: base
level: log_info
max-file-size-mb: 32
file-count: 10
track-time: true
max-concurrency: 8
components:
libfuse:
default-permission: 0644
attribute-expiration-sec: 120
entry-expiration-sec: 120
negative-entry-expiration-sec: 240
ignore-open-flags: true
file_cache:
path: /mnt/blobfusetmp
timeout-sec: 20
max-size-mb: 75776
allow-non-empty-temp: true
cleanup-on-start: true
azstorage:
type: block
account-name: x
account-key: x
mode: key
container: x
blobfuse2.log
`
So far tried these options: