Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow larger SPA names in stats #6481

Merged
merged 1 commit into from
Aug 11, 2017
Merged

Allow larger SPA names in stats #6481

merged 1 commit into from
Aug 11, 2017

Conversation

gaurkuma
Copy link
Contributor

@gaurkuma gaurkuma commented Aug 8, 2017

Signed-off-by: gaurkuma [email protected]

Description

The pool name can be 256 chars long. Today, in /proc/spl/kstat/zfs/ the name is limited to < 32 characters. This change is to allow bigger pool names.

Motivation and Context

It solves two main problems:

  1. In a system, with several pools each having a large pool name, there is a possibility of having the first 31 characters of "zfs/<pool name>" same, in which case, we will loose stats for all the pools except one.
  2. Same is applicable for pool/pools running on different VMs in a cluster, and upon failover, when the pool is moved from one VM to another, the name clash due to same initial characters can lead to losing the stats.

Pool name can be constructed in different ways. It can be any random name or a combination of some strings appended to some UUID etc. For e.g. we have the following pools in one of our nodes:

VM:~$ sudo zpool list
NAME SIZE ALLOC FREE EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
zpool-ABCD-WT-e1c0d782-ca7f-4916-bf68-8fc0650a1172-023c8d8c-baa0-4cbb-b26a-bcc6eb502ccf 39.8T 98.5G 39.7T - 0% 0% 162.82x ONLINE -
zpool-ABCD-WT-e1c0d782-ca7f-4916-bf68-8fc0650a1172-4986f9c6-5603-4fac-aae7-f0e0319af2a9 39.8T 94.3G 39.7T - 0% 0% 162.06x ONLINE -
zpool-ABCD-WT-e1c0d782-ca7f-4916-bf68-8fc0650a1172-85eef403-6d5b-455f-87a6-22aafc396a39 39.8T 96.7G 39.7T - 0% 0% 162.49x ONLINE -
zpool-ABCD-WT-e1c0d782-ca7f-4916-bf68-8fc0650a1172-fb64fbb7-d4fb-4087-8278-c92d5a41f2d6 39.8T 98.3G 39.7T - 0% 0% 161.72x ONLINE -
zpool-ABCD-WT-e1c0d782-ca7f-4916-bf68-8fc0650a1172-fd66d34a-43c1-44de-8970-5e659863b235 39.8T 174G 39.6T - 0% 0% 236.73x ONLINE -

In the above example, the first few characters are same across all the pools.

SPL Changes: openzfs/spl#641

How Has This Been Tested?

As in the above multi-pool scenario, we need to have stats for each of the pool under /proc/spl/kstats/zfs/

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • All commit messages are properly formatted and contain Signed-off-by.
  • Change has been approved by a ZFS on Linux member.

@behlendorf
Copy link
Contributor

@gaurkuma if you add Requires-spl: refs/pull/641/head to the commit message in this PR the buildbot will test this change with the matching SPL PR. That should resolve the build failures.

As for the patch itself it looks like we might be able to simply increase KSTAT_STRLEN to 255. The original 31 character limit is something we inherited from Illumos but it looks like the proc limit is 255 is recent kernels. Is this something you tried and ruled out? Let's also get @loli10K's thoughts on this.

@behlendorf behlendorf requested a review from loli10K August 9, 2017 17:57
@gaurkuma
Copy link
Contributor Author

gaurkuma commented Aug 9, 2017

@behlendorf Increasing KSTAT_STRLEN to 255 was considered as it was very straightforward..The only reason we went with the proposed change was to make sure we do not allocate too much on the kernel stack.

@behlendorf
Copy link
Contributor

Good point, that is definitely a concern. In this case I suspect it wouldn't be an issue since it's only placed on the stack in a few places and those are initialization call paths which should be relatively shallow.

@gaurkuma
Copy link
Contributor Author

gaurkuma commented Aug 9, 2017

@behlendorf You are right we may not have enough things on the stack by the time we initialize these stats. We just wanted to be extra careful and future-proof ourselves.

@@ -59,7 +59,8 @@ typedef int kid_t; /* unique kstat id */
* kcid = ioctl(kd, KSTAT_IOC_WRITE, kstat_t *);
*/

#define KSTAT_STRLEN 31 /* 30 chars + NULL; must be 16 * n - 1 */
#define KSTAT_STRLEN 31 /* 30 chars + NULL; must be 16 * n-1 */
#define KSTAT_MOD_STRLEN 255 /* Support large pool name */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/large/long/?

Also in the commit message, s/larger/longer/?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@loli10K Thanks i will update this.

(void) snprintf(name, KSTAT_STRLEN, "zfs/%s", spa_name(spa));

ksp = kstat_create(name, 0, "reads", "misc",
ksp = kstat_create("", 0, "reads", "misc",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kstat_create() accepts the module name as the first parameter, why are we passing an empty string here just to set it manually later via snprintf()? Is this to minimize stack space usage?

What happens if kstat_create() does some kind of input validation on the first parameter and we are now bypassing it? Even if it's not doing it now this may change in the future.

Copy link
Contributor Author

@gaurkuma gaurkuma Aug 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@loli10K the reason for this change is to minimize the stack usage. In kstat create we already have checks for null pointer etc and passing a empty string is just a place holder for the actual name to be assigned later. We can perhaps pass the string (31 chars) that we were passing before which means we keep the original code but reassign the ks_module to a longer name just before calling kstat_install (as per the new code). I passed empty string because didnt want to do snprintf twice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From kstat_create(9F):

kstat_create() is used in conjunction with kstat_install(9F) to allocate and initialize a kstat(9S) structure.

kstat_create() allocates and performs necessary system initialization of a kstat(9S) structure. kstat_create() allocates memory for the entire kstat (header plus data), initializes all header fields, initializes the data section to all zeroes, assigns a unique kstat ID (KID), and puts the kstat onto the system's kstat chain. The returned kstat is marked invalid because the provider (caller) has not yet had a chance to initialize the data section.

After a successful call to kstat_create() the driver must perform any necessary initialization of the data section (such as setting the name fields in a kstat of type KSTAT_TYPE_NAMED). Virtual kstats must have the ks_data field set at this time. The provider may also set the ks_update, ks_private, and ks_lock fields if necessary.

Once the kstat is completely initialized, kstat_install(9F) is used to make the kstat accessible to the outside world.

I don't think kstat_t.ks_module is part of the "data section", so touching it outside of kstat_create() is breaking the API: my preference is leaning towards compatibility and correctness rather than stack optimization.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are only a half dozen or so places where a KSTAT_STRLEN sized buffer is put on the stack. In all those cases it's used as part of a snprintf() to print a custom name. These few instances could be easily switched to use kmem_asprintf() which would further reduce our stack usage and allow us to make the minimal change of increasing the size of KSTAT_STRLEN.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@loli10K @behlendorf..Thanks for the feedback. I agree with the points laid out here.

kmem_asprintf() is definitely is better choice for dynamically allocating the string. Will update the patch soon.

Requires-spl: refs/pull/641/head

Signed-off-by: gaurkuma <[email protected]>
Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Contributor

@loli10K loli10K left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tested with 2.6.32-696.6.3.el6.x86_64 and i've had no issues with long SPA names and procfs.

@behlendorf behlendorf merged commit 761b8ec into openzfs:master Aug 11, 2017
@gaurkuma gaurkuma deleted the stats branch August 11, 2017 16:19
FransUrbo pushed a commit to FransUrbo/zfs that referenced this pull request Apr 28, 2019
The pool name can be 256 chars long. Today, in /proc/spl/kstat/zfs/
the name is limited to < 32 characters. This change is to allows
bigger pool names.

Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: loli10K <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: gaurkuma <[email protected]>
Closes openzfs#6481
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants