Bookkeeper operator upgrade from 0.1.1 to 0.1.3 fails #105

Closed
SrishT opened this issue Jan 27, 2021 · 0 comments · Fixed by #106

SrishT commented Jan 27, 2021

Description

When we upgrade BK-operator 0.1.1 (with journalDirectories /bk/journal/j0,/bk/journal/j1,/bk/journal/j2,/bk/journal/j3 and ledgerDirectories /bk/ledgers/l0,/bk/ledgers/l1,/bk/ledgers/l2,/bk/ledgers/l3 specified in the bookkeeper options) to BK-operator 0.1.3, the new operator no longer mounts the parent /bk/journal folder to the existing volume. Instead, it uses subPath to mount each /bk/journal/j<n> subfolder to a separate subpath on the volume.
These subPath mounts do not resolve to the existing j<n> subfolders on the volume, so new mount points are created on the volume that do not contain the old data.
As a result, the bookie pod fails with the following errors:

2021-01-22 20:22:07,145 - ERROR - [main:Bookie@467] - There are directories without a cookie, and this is neither a new environment, nor is storage expansion enabled. Empty directories are [/bk/journal/j0/current, /bk/journal/j1/current, /bk/journal/j2/current, /bk/journal/j3/current, /bk/ledgers/l0/current, /bk/ledgers/l1/current, /bk/ledgers/l2/current, /bk/ledgers/l3/current]
2021-01-22 20:22:07,145 - INFO  - [main:BookieNettyServer@424] - Shutting down BookieNettyServer
2021-01-22 20:22:07,154 - ERROR - [main:Main@228] - Failed to build bookie server
org.apache.bookkeeper.bookie.BookieException$InvalidCookieException: 
	at org.apache.bookkeeper.bookie.Bookie.checkEnvironmentWithStorageExpansion(Bookie.java:471)
	at org.apache.bookkeeper.bookie.Bookie.checkEnvironment(Bookie.java:253)
	at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:699)
	at org.apache.bookkeeper.proto.BookieServer.newBookie(BookieServer.java:140)
	at org.apache.bookkeeper.proto.BookieServer.<init>(BookieServer.java:108)
	at org.apache.bookkeeper.server.service.BookieService.<init>(BookieService.java:52)
	at org.apache.bookkeeper.server.Main.buildBookieServer(Main.java:313)
	at org.apache.bookkeeper.server.Main.doMain(Main.java:226)
	at org.apache.bookkeeper.server.Main.main(Main.java:208)

Importance

blocker

Location

pkg/controller/bookkeepercluster/bookie.go

Suggestions for an improvement

We need to teach the operator to recognize the case where the legacy bookie pod had the parent folder mounted to the volume without the subPath option, and to remount the volume on the new pod in such a way that all the old data can be retrieved.

Looking at the mounts produced for new deployments:

    - mountPath: /bk/journal/j0
      name: journal
      subPath: journal0

So we might be able to recover the data if we used subPath: j0 instead of subPath: journal0.
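
To make that idea concrete, here is a minimal Go sketch that derives each subPath from the last element of the configured directory, so that /bk/journal/j0 maps to j0 on the volume. It is only an illustration: the VolumeMount struct is a local stand-in for the Kubernetes type, legacyCompatibleMounts is a hypothetical helper and not operator code, and it assumes the legacy volume was mounted at the parent folder so the j<n> folders sit at the volume root.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// VolumeMount is a local stand-in for the Kubernetes VolumeMount type,
// reduced to the three fields shown in this issue (assumption for the sketch).
type VolumeMount struct {
	Name      string
	MountPath string
	SubPath   string
}

// legacyCompatibleMounts (hypothetical helper) builds one mount per entry of
// a comma-separated directory list, using the directory's base name as the
// subPath. This only matches the legacy layout if the old volume was mounted
// at the common parent folder.
func legacyCompatibleMounts(volumeName, dirs string) []VolumeMount {
	var mounts []VolumeMount
	for _, dir := range strings.Split(dirs, ",") {
		dir = strings.TrimSpace(dir)
		if dir == "" {
			continue // ignore empty entries such as a trailing comma
		}
		mounts = append(mounts, VolumeMount{
			Name:      volumeName,
			MountPath: dir,
			SubPath:   path.Base(dir),
		})
	}
	return mounts
}

func main() {
	for _, m := range legacyCompatibleMounts("journal", "/bk/journal/j0,/bk/journal/j1") {
		fmt.Printf("mountPath=%s name=%s subPath=%s\n", m.MountPath, m.Name, m.SubPath)
	}
}
```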

One solution would be for bookkeeper-operator to take two parameters (ledgerSubPath and journalSubPath) on the BK-cluster resource that override the LedgerDiskName and JournalDiskName parameters used to produce the mount subpaths, depending on the kind of deployment we are upgrading from.
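
A rough Go sketch of how such an override could behave is below. It assumes the subpath is built as a name prefix followed by the directory index, which is inferred from the journal0 example above and not confirmed against the operator source. subPathFor is a hypothetical helper, and the exact semantics of journalSubPath and ledgerSubPath are an assumption.

```go
package main

import "fmt"

// subPathFor (hypothetical helper) returns the subPath for the directory at
// the given index. An empty override keeps the default disk name, as a fresh
// deployment would; a non-empty override replaces it, as an upgrade from a
// legacy layout might need.
func subPathFor(override, defaultName string, index int) string {
	prefix := defaultName
	if override != "" {
		prefix = override
	}
	return fmt.Sprintf("%s%d", prefix, index)
}

func main() {
	// Fresh deployment: no override, default naming is kept.
	fmt.Println(subPathFor("", "journal", 0)) // journal0
	// Upgrade from the legacy layout: journalSubPath set to "j".
	fmt.Println(subPathFor("j", "journal", 0)) // j0
}
```

Leaving the override empty by default would keep new deployments unchanged, so only clusters upgraded from the legacy layout would need to set it.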
