Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

blobfuse2 command keeps running at 100% CPU #1196

Closed
ggrrll opened this issue Jul 10, 2023 · 20 comments · Fixed by #1205
Closed

blobfuse2 command keeps running at 100% CPU #1196

ggrrll opened this issue Jul 10, 2023 · 20 comments · Fixed by #1205

Comments

@ggrrll
Copy link

ggrrll commented Jul 10, 2023

for v. and settings see #1195

(they might be the same issue -- not sure)

@vibhansa-msft
Copy link
Member

Can you enable log_debug and share the log file. That shall give us some indication on what is going on with the blobfuse instance.

@vibhansa-msft vibhansa-msft self-assigned this Jul 10, 2023
@vibhansa-msft vibhansa-msft added this to the V2-2.0.5 milestone Jul 10, 2023
@ggrrll
Copy link
Author

ggrrll commented Jul 10, 2023

isn't already enabled?
#1195 (comment)

(/var/log/syslog has no reference to blobfuse and there is no $HOME/.blobfuse2/blobfuse2.log )

@gilzimmerman
Copy link

gilzimmerman commented Jul 11, 2023

Also seeing this with rsync of 1 GB file to blobfuse2 mount.

@mariobarrenechea
Copy link

mariobarrenechea commented Jul 13, 2023

blobfuse2_logs.zip
Hi Vikas: any idea what could be causing 100% CPU usage on blobfuse2 version 2.0.4 mount? No additional activity. Only mount

Running on Ubuntu 20.04.6 LTS, Debian and Centos supported environments with same results.

@vibhansa-msft
Copy link
Member

Without the debug logs I can not comment on what is going on with the system. Kindly check your /var/log/syslog if blobfuse logs are being punted there. If you are not able to figure out the log location you can configure blobfuse to dump its log in a given file for this you need to set logging config in your yaml file like:
logging:
type: base
file-path: /home/userABCD/blobfuse2.log
level: log_debug

@ggrrll
Copy link
Author

ggrrll commented Jul 13, 2023

here are my logs

mind that, by default the logs are not in /var/log/syslog but in /var/log/blobfuse2.log
(maybe, updated the docs ;) )

I also have

logging:
  type: syslog
  level: log_debug

blobfuse2.log

@tom-seechange
Copy link

tom-seechange commented Jul 13, 2023

Hi, I have this issue as well.

It appears to be related to this change:

#1189

It's now sat spinning in func (lcm *lifecycleMgr) watchInputs()

(gdb) bt
#0 runtime.(*timeHistogram).record (h=, duration=) at /usr/local/go/src/runtime/histogram.go:135
#1 0x000000000043f187 in runtime.casgstatus (gp=0xc00039b1e0, oldval=3, newval=2) at /usr/local/go/src/runtime/proc.go:1088
#2 0x000000000046b55b in runtime.exitsyscall () at /usr/local/go/src/runtime/proc.go:3951
#3 0x00000000004c823c in syscall.Syscall (trap=0, a1=0, a2=4096, a3=0, r1=, r2=, err=) at /usr/local/go/src/syscall/syscall_linux.go:83
#4 0x00000000004c5e05 in syscall.read (fd=, p=..., n=, err=...) at /usr/local/go/src/syscall/zsyscall_linux_amd64.go:711
#5 0x00000000004e3b0e in syscall.Read (fd=0, n=, err=..., p=...) at /usr/local/go/src/syscall/syscall_unix.go:178
#6 internal/poll.ignoringEINTRIO (fd=0, fn=, p=...) at /usr/local/go/src/internal/poll/fd_unix.go:794
#7 internal/poll.(*FD).Read (fd=0xc000078060, p=..., ~r0=, ~r0=, ~r1=..., ~r1=...) at /usr/local/go/src/internal/poll/fd_unix.go:163
#8 0x00000000004ef8be in os.(*File).read (f=0xc000014018, b=..., n=, err=...) at /usr/local/go/src/os/file_posix.go:31
#9 os.(*File).Read (f=0xc000014018, b=..., n=, err=...) at /usr/local/go/src/os/file.go:118
#10 0x000000000058c5bf in bufio.(*Reader).fill (b=0xc000065f60) at /usr/local/go/src/bufio/bufio.go:106
#11 0x000000000058d1af in bufio.(*Reader).ReadSlice (b=0x167fde0 <runtime[sched]+320>, delim=10 '\n', line=..., err=...) at /usr/local/go/src/bufio/bufio.go:372
#12 0x000000000058d5f4 in bufio.(*Reader).collectFragments (b=0xc000065f60, delim=10 '\n', fullBuffers=..., finalFragment=..., totalLen=, err=...) at /usr/local/go/src/bufio/bufio.go:447
#13 0x000000000058d9cb in bufio.(*Reader).ReadString (b=0x167fde0 <runtime[sched]+320>, delim=0 '\000', ~r0=..., ~r1=...) at /usr/local/go/src/bufio/bufio.go:494
#14 0x0000000000bf9405 in github.com/Azure/azure-storage-azcopy/v10/common.(*lifecycleMgr).watchInputs (lcm=0xc0000fec60) at /home/cloudtest/go/pkg/mod/github.com/!azure/azure-storage-azcopy/[email protected]/common/lifecyleMgr.go:125
#15 0x0000000000bf9206 in github.com/Azure/azure-storage-azcopy/v10/common.glob..func2.3 () at /home/cloudtest/go/pkg/mod/github.com/!azure/azure-storage-azcopy/[email protected]/common/lifecyleMgr.go:43
#16 0x00000000004708a1 in runtime.goexit () at /usr/local/go/src/runtime/asm_amd64.s:1598
#17 0x0000000000000000 in ?? ()

and with a breakpoint on ReadString

gdb) b bufio.(*Reader).ReadString
Breakpoint 1 at 0x58d9a0: file /usr/local/go/src/bufio/bufio.go, line 493.
(gdb) continue
Continuing.

Breakpoint 1, bufio.(*Reader).ReadString (b=0xc000065f60, delim=10 '\n', ~r0=..., ~r1=...) at /usr/local/go/src/bufio/bufio.go:493
493 /usr/local/go/src/bufio/bufio.go: No such file or directory.
(gdb) bt
#0 bufio.(*Reader).ReadString (b=0xc000065f60, delim=10 '\n', ~r0=..., ~r1=...) at /usr/local/go/src/bufio/bufio.go:493
#1 0x0000000000bf9405 in github.com/Azure/azure-storage-azcopy/v10/common.(*lifecycleMgr).watchInputs (lcm=0xc00010cc60) at /home/cloudtest/go/pkg/mod/github.com/!azure/azure-storage-azcopy/[email protected]/common/lifecyleMgr.go:125
#2 0x0000000000bf9206 in github.com/Azure/azure-storage-azcopy/v10/common.glob..func2.3 () at /home/cloudtest/go/pkg/mod/github.com/!azure/azure-storage-azcopy/[email protected]/common/lifecyleMgr.go:43
#3 0x00000000004708a1 in runtime.goexit () at /usr/local/go/src/runtime/asm_amd64.s:1598
#4 0x0000000000000000 in ?? ()
(gdb) continue
Continuing.

Breakpoint 1, bufio.(*Reader).ReadString (b=0xc000065f60, delim=10 '\n', ~r0=..., ~r1=...) at /usr/local/go/src/bufio/bufio.go:493
493 in /usr/local/go/src/bufio/bufio.go
(gdb) bt
#0 bufio.(*Reader).ReadString (b=0xc000065f60, delim=10 '\n', ~r0=..., ~r1=...) at /usr/local/go/src/bufio/bufio.go:493
#1 0x0000000000bf9405 in github.com/Azure/azure-storage-azcopy/v10/common.(*lifecycleMgr).watchInputs (lcm=0xc00010cc60) at /home/cloudtest/go/pkg/mod/github.com/!azure/azure-storage-azcopy/[email protected]/common/lifecyleMgr.go:125
#2 0x0000000000bf9206 in github.com/Azure/azure-storage-azcopy/v10/common.glob..func2.3 () at /home/cloudtest/go/pkg/mod/github.com/!azure/azure-storage-azcopy/[email protected]/common/lifecyleMgr.go:43
#3 0x00000000004708a1 in runtime.goexit () at /usr/local/go/src/runtime/asm_amd64.s:1598
#4 0x0000000000000000 in ?? ()
(gdb) continue
Continuing.

Breakpoint 1, bufio.(*Reader).ReadString (b=0xc000065f60, delim=10 '\n', ~r0=..., ~r1=...) at /usr/local/go/src/bufio/bufio.go:493
493 in /usr/local/go/src/bufio/bufio.go
(gdb) bt
#0 bufio.(*Reader).ReadString (b=0xc000065f60, delim=10 '\n', ~r0=..., ~r1=...) at /usr/local/go/src/bufio/bufio.go:493
#1 0x0000000000bf9405 in github.com/Azure/azure-storage-azcopy/v10/common.(*lifecycleMgr).watchInputs (lcm=0xc00010cc60) at /home/cloudtest/go/pkg/mod/github.com/!azure/azure-storage-azcopy/[email protected]/common/lifecyleMgr.go:125
#2 0x0000000000bf9206 in github.com/Azure/azure-storage-azcopy/v10/common.glob..func2.3 () at /home/cloudtest/go/pkg/mod/github.com/!azure/azure-storage-azcopy/[email protected]/common/lifecyleMgr.go:43
#3 0x00000000004708a1 in runtime.goexit () at /usr/local/go/src/runtime/asm_amd64.s:1598
#4 0x0000000000000000 in ?? ()
(gdb) continue

My guess is that ReadString is returning err != nil and therefore it's running the "continue", which in turn makes it go back into ReadString. This wasn't a problem before because allowWatchInput was false which had a 500 microsecond sleep.

Downgrading back to 2.0.3 clears it for now.

There is nothing useful in the debug logs.

@tom-seechange
Copy link

I should also mention that I'm running in systemd which might be causing quirks with os.stdin vs running in a shell.

Thanks,

Tom

@vibhansa-msft
Copy link
Member

Can you share your config file as well. As per logs you have both "stream" and "file-cache" component plugged in. These two components are mutually exclusive and shall not be plugged in together.

@vibhansa-msft
Copy link
Member

Jul 13 08:31:32 ubuntudev blobfuse2[73124]: LOG_TRACE [stream.go (100)]: Stream::Configure : stream
Jul 13 08:31:32 ubuntudev blobfuse2[73124]: LOG_INFO [read.go (55)]: ReadCache::Configure : Streamonly set to true
Jul 13 08:31:32 ubuntudev blobfuse2[73124]: LOG_INFO [stream.go (132)]: Stream::Configure : Buffer size 0, Block size 0, Handle limit 0
Jul 13 08:31:32 ubuntudev blobfuse2[73124]: LOG_TRACE [file_cache.go (205)]: FileCache::Configure : file_cache

@tom-seechange
Copy link

Hi,

I do not have both file_cache and stream enabled, only stream and have this problem. Did you notice the stack traces I posted above?

Log and config attached.

blobfuse high cpu config.txt
blobfuse high cpu log

Thanks,

Tom

@vibhansa-msft vibhansa-msft linked a pull request Jul 17, 2023 that will close this issue
@vibhansa-msft
Copy link
Member

@tom-seechange thanks for pointing this out. This was observed in local tests as well and has been fixed in above mentioned PR. Either you can try out top of the line main branch or wait for our next release.

@ggrrll Can you also try top of the main branch and validate this fixes your issue as well.

@ggrrll
Copy link
Author

ggrrll commented Jul 17, 2023

so, after cloning I got this error, during building (go build)

build github.com/Azure/azure-storage-fuse/v2: cannot load embed: malformed module path "embed": missing dot in first path element

not sure if related to #1036 (comment)

@vibhansa-msft
Copy link
Member

Which version of go you are using, can you upgrade to go-1.20 and try it out.

@ggrrll
Copy link
Author

ggrrll commented Jul 24, 2023

on go1.20.6 , but still get

# github.com/Azure/azure-storage-fuse/v2/component/libfuse
In file included from component/libfuse/libfuse_handler.go:44:
./libfuse_wrapper.h:53:10: fatal error: fuse3/fuse.h: No such file or directory
   53 | #include <fuse3/fuse.h>
      |          ^~~~~~~~~~~~~~
compilation terminated.

@vibhansa-msft
Copy link
Member

This is some setup issue where its not able to find some header file that it needs.
Execute "sudo apt-get install fuse3 libfuse3-dev gcc -y" and then try to rebuild the code.

@ggrrll
Copy link
Author

ggrrll commented Jul 26, 2023

not solved:
now, after go build the is not even the blobfuse command ...

@vibhansa-msft
Copy link
Member

"./build.sh" you can run then you will see "blobfuse2" binary.

@ggrrll
Copy link
Author

ggrrll commented Jul 26, 2023

(ok thanks -- please update the readme )

@ggrrll
Copy link
Author

ggrrll commented Jul 26, 2023

I can confirm the issues is solved -- normal CPU load < 0.1 %
(building on latest commit in main)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants