Update skyhook to ceph-nautilus #86

Open
wants to merge 10,000 commits into
base: skyhook-luminous
from

Conversation

aditigupta17
Collaborator

No description provided.

yaarith and others added 30 commits March 25, 2020 10:26
Plus details about license agreement.

Fixes: https://tracker.ceph.com/issues/43648
Signed-off-by: Yaarit Hatuka <[email protected]>
(cherry picked from commit 66c41f3d0e516e217dc89ab20c53563ea10f97f7)
nautilus: doc/mgr/telemetry: added device channel details
After 5ace82e65c72847fb875fc01c419937a26a59d70 was merged, I found
three more instances of the code being patched.

The commit message of 5ace82e65c72847fb875fc01c419937a26a59d70 was/is:

"The default values are handled by mgr_module.py's _get_module_option();
the or here means that we break any non-true (0, false, none) value and
override it with the default."
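The pitfall described in the quoted message can be shown with a minimal sketch. The function names and the default value of 24 are hypothetical and are not taken from `mgr_module.py`; the fix in the commit simply drops the `or`, because `_get_module_option()` already supplies the default.

```python
DEFAULT_INTERVAL = 24  # hypothetical default, for illustration only


def get_option_buggy(stored, default=DEFAULT_INTERVAL):
    # `or` treats every falsy value (0, False, None, "") as "unset",
    # so an explicit 0 or False is silently replaced by the default.
    return stored or default


def get_option_fixed(stored, default=DEFAULT_INTERVAL):
    # Fall back to the default only when the option is really unset.
    return default if stored is None else stored


print(get_option_buggy(0))  # the explicit 0 is lost
print(get_option_fixed(0))  # the explicit 0 is kept
```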

Fixes: 5ace82e65c72847fb875fc01c419937a26a59d70
Fixes: https://tracker.ceph.com/issues/43746
Signed-off-by: Nathan Cutler <[email protected]>
(cherry picked from commit 30585f44de6147c824ed2df1a477ad17f7b51fff)

Conflicts:
	src/pybind/mgr/osd_support/module.py
- file does not exist in nautilus
Additionally, introduce `task status` field in manager report
messages to forward status of executing tasks in daemons (e.g.,
status of executing scrubs in ceph metadata servers).

`task status` makes its way up to the service map, which is then used
to display the relevant information in ceph status.

Signed-off-by: Venky Shankar <[email protected]>
(cherry picked from commit 5c25a018643b10aa78db8270cae1476f71d8f4f4)

 Conflicts:
	src/messages/MMgrReport.h
	src/mgr/DaemonServer.cc
	src/mgr/ServiceMap.h
Signed-off-by: Venky Shankar <[email protected]>
(cherry picked from commit 625dffe65c0f8001b3b6ca6d0b12732a1a103849)

 Conflicts:
	src/mds/MDSRank.cc
... also log new and completed scrubs.

Signed-off-by: Venky Shankar <[email protected]>
(cherry picked from commit 49520156414cb0667e8edd6ea4b31462e7cb7752)
Signed-off-by: Venky Shankar <[email protected]>
(cherry picked from commit 465a3adc6c31fd9b8359920ab47adf8e1f45d5f1)
Fixes: http://tracker.ceph.com/issues/36370
Signed-off-by: Venky Shankar <[email protected]>
(cherry picked from commit 05d17994a879995b56bda5d770a938d0aabaaed9)
* partially revert 5c25a018, which is not backward compatible.
* change `ServiceMap::get_daemon()` so it returns a
  `pair<Daemon*,bool>`.

git is an `optional<uint64_t>`, so we cannot dump it without checking.

Fixes: http://tracker.ceph.com/issues/41424
Signed-off-by: Kefu Chai <[email protected]>
(cherry picked from commit 3e65551d0ab4acafaf4012601987b3391b2411ed)
Fixes: http://tracker.ceph.com/issues/42169
Introduced-by: 5c25a01864
Signed-off-by: Venky Shankar <[email protected]>
(cherry picked from commit cda3eadbfbd3e3a8a057cc7ed042ba0a6d7fef11)
Introduced by 625dffe65c0f8001b3b6ca6d0b12732a1a103849, which added
periodic scrub stats reporting.

Fixes: https://tracker.ceph.com/issues/42494
Fixes: https://tracker.ceph.com/issues/41525
Signed-off-by: Sage Weil <[email protected]>
(cherry picked from commit a1a220d137f7fa128d08f96d1eff83865388e5fc)
This also cleans up the output to be more readable/useful in debug
output.

Signed-off-by: Patrick Donnelly <[email protected]>
(cherry picked from commit 1c56632e88a126d0eac75235d9a9716833cba6b7)

 Conflicts:
	PendingReleaseNotes

Fix minor conflict in PendingReleaseNotes.
This is a trivial refactor.

Signed-off-by: Patrick Donnelly <[email protected]>
(cherry picked from commit bafa3b731ff93a356fcf60ebbaf0e5f299182992)
Signed-off-by: Patrick Donnelly <[email protected]>
(cherry picked from commit 3355581a298963711ad058c6e47d21b7eab887a4)
Few things here:

- Make explicit the check for getting removed from the MDSMap. This was
  only done before by checking if MDS held a rank which does not check the
  case where a standby is removed from the FSMap.

- Use mds_info_t::dump to simplify various debug output.

- Add a few sanity asserts for invalid state transitions.

Signed-off-by: Patrick Donnelly <[email protected]>
(cherry picked from commit 26a08df2adc7495fc660113cd26c33d5debd3ee6)

 Conflicts:
	src/mds/MDSDaemon.h

Nautilus still uses "const_ref" rather than "cref_t<>" in master.
The Monitors send an empty MDSMap to an MDS they are removing. The MDS
can't diagnose the cause. Instead, suggest looking at the cluster/monitor
logs.

Signed-off-by: Patrick Donnelly <[email protected]>
(cherry picked from commit c385178b0c04d09bc6774acf5bdb4537fa6803ad)
This commit undoes the service daemon registration for the MDS. It doesn't look
absolutely necessary and it causes the MDS to be listed twice in the `ceph
versions` output:

    $ ceph versions
        ...
        "mds": {
            "ceph version v15.0.0-6915-g0143b904676 (0143b9046763ea1801efa8358a0c033ec862cea9) octopus (dev)": 3
        },
        "mds": {
            "unknown": 3
        },
        "overall": {
            "ceph version v15.0.0-6915-g0143b904676 (0143b9046763ea1801efa8358a0c033ec862cea9) octopus (dev)": 10,
            "unknown": 3
        }

Fixing that requires looking for duplicates or ignoring MDSs in the
service daemons when the mon processes `ceph versions`. I have a feeling
that it wasn't actually designed to be used by the MDS this way however.
Additionally, the version shows up as "unknown" because the metadata
sent to the mgr does not include "ceph_version".
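One practical consequence of the duplicated "mds" key is that a standard JSON parser silently drops one of the two entries. The sketch below uses Python's `json` module on an abbreviated stand-in for the output quoted above (the version string is shortened); `reject_duplicates` is a hypothetical helper, not part of Ceph.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that raises if a JSON object repeats a key."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


# Abbreviated stand-in for the `ceph versions` output quoted above.
raw = '{"mds": {"ceph version v15.0.0": 3}, "mds": {"unknown": 3}}'

# The default parser keeps only the last "mds" entry.
print(json.loads(raw))

# With the hook, the duplicate is reported instead of being dropped.
try:
    json.loads(raw, object_pairs_hook=reject_duplicates)
except ValueError as err:
    print(err)
```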

Signed-off-by: Patrick Donnelly <[email protected]>
(cherry picked from commit 046137f819aae72f1423e3feb213f0e46c97c9ce)
Note that we now sub to the mgrmap after init because the MgrClient
connection to the mgr is driven by receipt of the MgrMap.

This is important so that the MDS does not have metadata with the mgr
when the mons are ignoring the MDS otherwise due to CompatSet
incompatibilities.

Fixes: https://tracker.ceph.com/issues/41538
Fixes: https://tracker.ceph.com/issues/42635
Signed-off-by: Patrick Donnelly <[email protected]>
(cherry picked from commit e765f2d533440cfc4189f36fcaba24617a302e84)

 Conflicts:
	src/mds/MDSDaemon.cc

Nautilus uses Lock/Unlock rather than lock/unlock in master.
This would be widely required since ceph metadata server entries are
maintained in the service map (DaemonServer::pending_service_map). Such
normal ceph services would need to be filtered when processing the service
map to avoid extraneous entries getting processed.
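The filtering idea can be sketched in Python as follows. This is an illustration only: the actual change is in the C++ manager code, and both the helper names and the set of entity types below are assumptions, since the commit message does not list them.

```python
# Assumed set of "normal" ceph entity types; not taken from the commit.
NORMAL_CEPH_ENTITIES = {"osd", "client", "mon", "mds", "mgr"}


def is_normal_ceph_entity(service_type):
    """Return True for core daemons that should be skipped."""
    return service_type in NORMAL_CEPH_ENTITIES


def non_core_services(services):
    """Return a copy of a {service_type: daemons} mapping without core entities."""
    return {
        service_type: daemons
        for service_type, daemons in services.items()
        if not is_normal_ceph_entity(service_type)
    }


example = {"mds": ["a", "b"], "rgw": ["gateway1"]}
print(non_core_services(example))
```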

Signed-off-by: Venky Shankar <[email protected]>
(cherry picked from commit 79503fc16749ed0cfe8a89ea3b3c8c792d6b8809)
This is done in a couple of places in the ceph manager: when culling
entries from the service map and when dumping service status.

Signed-off-by: Venky Shankar <[email protected]>
(cherry picked from commit 20139d12423fc6154b57499c576d1e4bb3f1eade)

 Conflicts:
	src/mgr/DaemonServer.cc

Switch to non-initializer list in range-based for loop.
…tadata

Signed-off-by: Venky Shankar <[email protected]>
(cherry picked from commit dfc056028e44cdbda84d7118e29120ae70edb35a)
The MDS does not register with the manager, so `task status`
(scrub status updates) was not displayed in the ceph
status output.

Signed-off-by: Venky Shankar <[email protected]>
(cherry picked from commit d6ad0a255a27c5e3da20c7ae03066a692a6d49bb)

 Conflicts:
	src/mgr/DaemonServer.cc
	src/mgr/DaemonServer.h

mgr/DaemonServer.{h,cc} deals with raw pointers while master uses ref_t<>
cast -- adjust to that. There is a minor conflict in the header, and the metrics
templatization is not backported to nautilus. Also, DaemonKey is a std::pair
in nautilus but a struct in master -- that requires a change in referencing
daemon type and name.
Fixes: http://tracker.ceph.com/issues/42835
Signed-off-by: Venky Shankar <[email protected]>
(cherry picked from commit fb4cd42cbcdaa4812838b846ae33ee26356ce733)
nautilus: mds: display scrub status in ceph status

Reviewed-by: Patrick Donnelly <[email protected]>
Reviewed-by: Greg Farnum <[email protected]>
Introduce a config option called 'mon_warn_on_pool_no_redundancy' that is
used to show a health warning if any pool in the ceph cluster is
configured with a size of 1. The user can mute/unmute the warning using
'ceph health mute/unmute POOL_NO_REDUNDANCY'.

Add standalone test to verify warning on setting pool size=1. Set the
associated warning to 'false' in ceph.conf.template under qa/tasks so
that existing tests do not break.
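The condition behind the warning can be sketched as below. This is a simplified illustration, not the monitor's implementation: the function name, the `warn_enabled` flag (standing in for 'mon_warn_on_pool_no_redundancy'), and the `pool_name`/`size` dictionary fields are assumptions.

```python
def pools_without_redundancy(pools, warn_enabled=True):
    """Return the names of pools configured with size 1.

    `warn_enabled` stands in for the config option; when it is False
    no pool is reported at all.
    """
    if not warn_enabled:
        return []
    return [pool["pool_name"] for pool in pools if pool.get("size") == 1]


pools = [
    {"pool_name": "rbd", "size": 3},
    {"pool_name": "scratch", "size": 1},
]
print(pools_without_redundancy(pools))
print(pools_without_redundancy(pools, warn_enabled=False))
```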

Fixes: https://tracker.ceph.com/issues/41666
Signed-off-by: Sridhar Seshasayee <[email protected]>
(cherry picked from commit 33c647e8114b37404d8d62a08c85664cea709118)

 Conflicts:
	PendingReleaseNotes
- Added release notes under 14.2.9
	qa/standalone/mon/health-mute.sh
- Deleted the script as 'health mute/unmute' cmd is unavailable in nautilus
	qa/tasks/ceph.conf.template
- Removed a flag not available in nautilus
	src/common/options.cc
- Removed a flag not available in nautilus
	src/osd/OSDMap.cc
nautilus: mgr/dashboard: Updated existing E2E tests to match new format 

Reviewed-by: Laura Paduano <[email protected]>
nautilus: mgr/telemetry: catch exception during requests.put

Reviewed-by: Dan Mick <[email protected]>
nautilus: mgr/dashboard: Pool read/write OPS shows too many decimal places

Reviewed-by: Stephan Müller <[email protected]>
Reviewed-by: Laura Paduano <[email protected]>
nautilus: mgr/dashboard: do not show RGW API keys if only read-only privileges

Reviewed-by: Lenz Grimmer <[email protected]>
Reviewed-by: Laura Paduano <[email protected]>
Reviewed-by: Ernesto Puerta <[email protected]>
Reviewed-by: Volker Theile <[email protected]>
nautilus: pybind/mgr/*: fix config_notify handling of default values

Reviewed-by: Sage Weil <[email protected]>
yuriw and others added 30 commits April 28, 2020 09:09
nautilus: rgw: reshard: skip stale bucket id entries from reshard queue

Reviewed-by: Casey Bodley <[email protected]>
Fixes: https://tracker.ceph.com/issues/44774
Signed-off-by: Yang Honggang <[email protected]>
(cherry picked from commit ade4d46981e660c8d57cec64180b5afa4561b945)
nautilus: qa/workunits/rbd: wait for nbd map to close after unmap

Reviewed-by: Jason Dillaman <[email protected]>
osd_pool_default_pg_autoscale_mode is the right parameter to
set placement-group autoscale mode.

Signed-off-by: Changcheng Liu <[email protected]>
(cherry picked from commit c0df98fc7e78bbd366333d810f78ddbeed0e6729)
nautilus: doc: fix parameter to set pg autoscale mode

Reviewed-by: Neha Ojha <[email protected]>
The pg count needs to be a power-of-two since
dff5697464edb9931d5dfa08cd4a30f85c1f237e.

Also, mon_pg_warn_min_per_osd is disabled by default now (or set to a
low value in vstart/testing) so there's no need to base the pg count on
this value.

Ideally someday we can remove this so that the default cluster value is
used but we need to keep this for deployments of older versions of Ceph.
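For reference, a power-of-two constraint like the one above can be checked, and a candidate pg count rounded up, with a few lines of Python. The helper names are hypothetical and are not taken from `qa/tasks/cephfs/filesystem.py`.

```python
def is_power_of_two(n):
    """True if n is a positive power of two (1, 2, 4, 8, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def round_up_to_power_of_two(n):
    """Smallest power of two that is >= n, for n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n - 1).bit_length()


for candidate in (8, 9, 100):
    print(candidate, is_power_of_two(candidate), round_up_to_power_of_two(candidate))
```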

Fixes: https://tracker.ceph.com/issues/42228
Signed-off-by: Patrick Donnelly <[email protected]>
(cherry picked from commit fc88e6c6c55402120a432ea47f05f321ba4c9bb1)

Conflicts:
	qa/tasks/cephfs/filesystem.py: this commit was originally
backported by #34055, but it failed to cherry-pick all necessary
bits. In this change, the missing bit is picked up.
The other parts of the struct are initialized by their
ctors. Only for uint64_t there is no ctor.

Otherwise ceph-dencoder tests will fail in comparing
the exported output.
[~/master36] [email protected]> build/bin/ceph-dencoder type bluestore_bdev_label_t select_test 1 encode export /tmp/typ-yFISvjvgj
[~/master36] [email protected]> hexdump -C !$
hexdump -C /tmp/typ-yFISvjvgj
00000000  62 6c 75 65 73 74 6f 72  65 20 62 6c 6f 63 6b 20 |bluestore block |
00000010  64 65 76 69 63 65 0a 30  30 30 30 30 30 30 30 2d |device.00000000-|
00000020  30 30 30 30 2d 30 30 30  30 2d 30 30 30 30 2d 30 |0000-0000-0000-0|
00000030  30 30 30 30 30 30 30 30  30 30 30 0a 02 01 28 00 |00000000000...(.|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 |................|
00000050  00 00 70 74 03 04 00 00  00 00 00 00 00 00 00 00 |..pt............|
00000060  00 00 00 00 00 00 00 00  00 00 |..........|
0000006a

[~/master36] [email protected]> build/bin/ceph-dencoder type bluestore_bdev_label_t select_test 1 encode decode encode export /tmp/typ-MjWXdCpzJ
[~/master36] [email protected]> hexdump -C !$
hexdump -C /tmp/typ-MjWXdCpzJ
00000000  62 6c 75 65 73 74 6f 72  65 20 62 6c 6f 63 6b 20 |bluestore block |
00000010  64 65 76 69 63 65 0a 30  30 30 30 30 30 30 30 2d |device.00000000-|
00000020  30 30 30 30 2d 30 30 30  30 2d 30 30 30 30 2d 30 |0000-0000-0000-0|
00000030  30 30 30 30 30 30 30 30  30 30 30 0a 02 01 28 00 |00000000000...(.|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00 |................|
00000050  00 00 73 64 00 00 00 00  00 00 00 00 00 00 00 00 |..sd............|
00000060  00 00 00 00 00 00 00 00  00 00 |..........|
0000006a
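The two dumps above differ only in the row at offset 0x50. A small Python sketch makes the differing bytes explicit; the row contents are copied from the hexdumps, while the helper name is hypothetical.

```python
# Row 0x50 of each dump, copied from the hexdump output above.
first = bytes.fromhex("00 00 70 74 03 04 00 00 00 00 00 00 00 00 00 00")
second = bytes.fromhex("00 00 73 64 00 00 00 00 00 00 00 00 00 00 00 00")


def differing_offsets(a, b, base=0):
    """Return the absolute offsets at which two equal-length rows differ."""
    return [base + i for i, (x, y) in enumerate(zip(a, b)) if x != y]


print([hex(offset) for offset in differing_offsets(first, second, base=0x50)])
```

Only four bytes differ between the plain encode and the encode/decode/encode round trip.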

Signed-off-by: Willem Jan Withagen <[email protected]>
(cherry picked from commit d411ee26fb2f6cbe610f8bbc81b777cf28d839c2)
Signed-off-by: Ali Maredia <[email protected]>
(cherry picked from commit 90cce10fc1364df16ab12632b2dca403894cbe44)
mimic does not support auto split/merge, but we do test mimic-x on
nautilus, which ends up with failures like:

Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/contextutil.py", line 34, in nested
    yield vars
  File "/home/teuthworker/src/git.ceph.com_ceph_nautilus/qa/tasks/ceph.py", line 1928, in task
    ctx.managers[config['cluster']].stop_pg_num_changes()
  File "/home/teuthworker/src/git.ceph.com_ceph_nautilus/qa/tasks/ceph_manager.py", line 1806, in stop_pg_num_changes
    if pool['pg_num'] != pool['pg_num_target']:
KeyError: 'pg_num_target'

so we need to skip this if 'pg_num_target' is not in pg_pool_t::dump().
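A minimal sketch of such a guard, assuming each pool is a dict as in the traceback above. The function name and the `pool_name` field are hypothetical; this is not the code in `qa/tasks/ceph_manager.py`.

```python
def pools_with_pending_pg_changes(pools):
    """Return names of pools whose pg_num differs from pg_num_target.

    Pools dumped by a release that has no 'pg_num_target' field are
    skipped instead of raising KeyError.
    """
    pending = []
    for pool in pools:
        if "pg_num_target" not in pool:
            continue  # older dump format: nothing to compare against
        if pool["pg_num"] != pool["pg_num_target"]:
            pending.append(pool["pool_name"])
    return pending


pools = [
    {"pool_name": "old", "pg_num": 8},
    {"pool_name": "steady", "pg_num": 8, "pg_num_target": 8},
    {"pool_name": "growing", "pg_num": 8, "pg_num_target": 16},
]
print(pools_with_pending_pg_changes(pools))
```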

this change is not cherry-picked from master, as we don't test
mimic-x on master.

Signed-off-by: Kefu Chai <[email protected]>
this change is not cherry-picked from master, as master is using el8
already

Signed-off-by: Kefu Chai <[email protected]>
nautilus: qa/tasks/ceph_manager: do not cancel pending pg num changes on mimic

Reviewed-by: Brad Hubbard <[email protected]>
Reviewed-by: Neha Ojha <[email protected]>
Reviewed-by: Yuri Weinstein <[email protected]>
nautilus: mon/FSCommands: Fix 'add_data_pool' command and 'fs new' command

Reviewed-by: Nathan Cutler <[email protected]>
…utilus

nautilus: os/bluestore: open DB in read-only when expanding DB/WAL

Reviewed-by: Adam Kupczyk <[email protected]>
nautilus: mon: Get session_map_lock before remove_session

Reviewed-by: xie xingguo <[email protected]>
nautilus: os/bluestore: Don't pollute old journal when add new device

Reviewed-by: Igor Fedotov <[email protected]>
nautilus: bluestore/bdev: initialize size when creating object

Reviewed-by: Igor Fedotov <[email protected]>
nautilus: qa/distros: point {centos,rhel}_latest.yaml to 7.8

Reviewed-by: Yuri Weinstein <[email protected]>
Reviewed-by: David Galloway <[email protected]>
nautilus: rgw: increase log level for same or older period pull msg

Reviewed-by: Casey Bodley <[email protected]>
disable the TOO_FEW_PGS warning. as
1ac34a5ea3d1aca299b02e574b295dd4bf6167f4 is not backported to mimic,
upgrading from mimic raises TOO_FEW_PGS warnings where a healthy
cluster is expected.

this change disables this warning by setting "mon_pg_warn_min_per_osd" to
"0".

this change is not cherry-picked from master, as
1ac34a5ea3d1aca299b02e574b295dd4bf6167f4 is already included in master,
and we don't perform upgrades from mimic on the master branch.

Signed-off-by: Kefu Chai <[email protected]>
nautilus: mgr: force purge normal ceph entities from service map

Reviewed-by: Ramana Raja <[email protected]>
nautilus: qa: use small default pg count for CephFS pools

Reviewed-by: Ramana Raja <[email protected]>
nautilus: ceph-fuse: link to libfuse3 and pass "-o big_writes" to libfuse if libfuse < 3.0.0

Reviewed-by: Zheng Yan <[email protected]>
Reviewed-by: Xiubo Li <[email protected]>
Reviewed-by: Ramana Raja <[email protected]>
…pg-per-osd-warning

nautilus: qa/suites/upgrade: disable min pg per osd warning

Reviewed-by: Yuri Weinstein <[email protected]>
nautilus: rgw/notifications: backporting features and bug fix

Reviewed-by: Casey Bodley <[email protected]>