Default ashift for Amazon EC2 NVMe devices #7676

Merged (1 commit) on Jul 6, 2018

@tnn
Contributor

tnn commented Jul 3, 2018

Description

Add a default 4 KiB ashift for Amazon EC2 NVMe devices on instances with NVMe ephemeral devices, such as the types c5d, f1, i3 and m5d.

Motivation and Context

As per the official documentation [1], a 4096-byte block size should be
used to match the underlying hardware.
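
For reference, ashift is the base-2 logarithm of the sector size ZFS assumes, so 4096 bytes corresponds to ashift 12 and 512 bytes to ashift 9. A minimal shell sketch of that conversion follows; `ashift_for_block_size` is a hypothetical helper name, not something shipped with ZFS.

```shell
# Hypothetical helper (not part of ZFS): convert a block size in bytes
# to the ashift exponent n, where 2^n equals the block size.
ashift_for_block_size() {
    size=$1
    shift_val=0
    # Reject zero, negative, and non-power-of-two sizes such as 520.
    if [ "$size" -le 0 ] || [ $(( size & (size - 1) )) -ne 0 ]; then
        echo "not a power of two: $size" >&2
        return 1
    fi
    # Count how many times the size can be halved before reaching 1.
    while [ "$size" -gt 1 ]; do
        size=$(( size >> 1 ))
        shift_val=$(( shift_val + 1 ))
    done
    echo "$shift_val"
}
```

Calling `ashift_for_block_size 4096` prints 12, which is the value this change is meant to produce by default.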

The string was identified via:

$ sudo sginfo -M /dev/nvme0n1
INQUIRY response (cmd: 0x12)
----------------------------
Device Type                        0
Vendor:                    NVMe
Product:                   Amazon EC2 NVMe
Revision level:

$ lsblk -io KNAME,TYPE,SIZE,MODEL
KNAME   TYPE    SIZE MODEL
nvme0n1 disk  442.4G Amazon EC2 NVMe Instance Storage
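
The idea behind the change can be sketched as a prefix lookup from the INQUIRY vendor and product strings to a sector size. This is only an illustration in shell, not the code in the patch; the `OVERRIDES` table format and the `lookup_sector_size` name are assumptions made for the example.

```shell
# Hypothetical override table, one "vendor|product prefix|sector size" entry
# per line. Only the Amazon entry reflects this PR; the format is made up.
OVERRIDES='NVMe|Amazon EC2 NVMe|4096'

# Print the override sector size for a vendor/product pair.
# Returns non-zero (and prints nothing) when no entry matches.
lookup_sector_size() {
    vendor=$1
    product=$2
    echo "$OVERRIDES" | while IFS='|' read -r v p size; do
        case "$vendor|$product" in
            # The quoted part is literal; the trailing * makes it a prefix match.
            "$v|$p"*) echo "$size"; break ;;
        esac
    done | grep .
}
```

With the strings shown above, `lookup_sector_size NVMe "Amazon EC2 NVMe Instance Storage"` prints 4096, while an unknown device prints nothing and the caller would fall back to whatever the kernel reports.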

[1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/storage-optimized-instances.html
Retrieved 2018-07-03

How Has This Been Tested?

Compiled and verified via zpool create ...:

zpool create dozer /dev/nvme0n1 && zdb -C dozer | grep ashift
                ashift: 12

Instance: Amazon EC2 i3.xlarge
OS: Ubuntu 16.04
Kernel: 4.4.0-1048-aws

See the comments below for the full test plan executed.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the contributing document.
  • [] I have added tests to cover my changes.
  • All new and existing tests passed.
  • All commit messages are properly formatted and contain Signed-off-by.
  • Change has been approved by a ZFS on Linux member.

@behlendorf
Contributor

@tnn thanks for the PR. Yes, this is the right way to go about it. Could you verify that the physical block size is being reported as 512, and that we do in fact need to add an override? And thanks for the link to the official documentation.

cat /sys/block/nvme0n1/queue/physical_block_size

@tnn
Contributor Author

tnn commented Jul 3, 2018

Thanks! I'll report back tomorrow with the before/after data (plus the other instance types).
So far it looks like it is required:

$ sudo blockdev --getss --getpbsz /dev/nvme0n1
512
512
$ cat /sys/block/nvme0n1/queue/physical_block_size
512
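
The same check can be scripted. The sketch below reads the sysfs attribute used above and reports whether the kernel-reported size is smaller than the size wanted; `SYSFS_ROOT`, `reported_pbs` and `needs_override` are hypothetical names, and the root is overridable only so the functions can be tried against a fake directory tree.

```shell
# Sysfs root; overridable so the functions can be exercised without real hardware.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

# Print the physical block size the kernel reports for a device name
# such as nvme0n1.
reported_pbs() {
    cat "$SYSFS_ROOT/block/$1/queue/physical_block_size"
}

# Succeed when the reported size is below the wanted size, i.e. when
# relying on the reported value alone would give a smaller block size.
needs_override() {
    dev=$1
    wanted=$2
    [ "$(reported_pbs "$dev")" -lt "$wanted" ]
}
```

On the instance above, `needs_override nvme0n1 4096` would succeed, since the device reports 512.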

@richardelling
Contributor

NB, some NVMe SSDs can be formatted for 512, 520, 528, 4096, 4104 or 4224 byte blocks. To determine what the drive is formatted to, query it with nvme-cli, or (hopefully) it is properly reported by physical_block_size in sysfs.

@tnn
Contributor Author

tnn commented Jul 3, 2018

Thanks for the pointer, @richardelling. nvme-cli appears to report the same story:

$ sudo nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev  
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     AWS3433290E944XXXXXX Amazon EC2 NVMe Instance Storage         1         475.00  GB / 475.00  GB    512   B +  0 B   0
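
For scripting, the Format cell can be split into its data and metadata sizes. This is a rough sketch that only understands the byte-unit form shown in the output above; `parse_nvme_format` is a hypothetical name, and other unit spellings are deliberately rejected rather than guessed at.

```shell
# Split an nvme-cli "Format" cell such as "512   B +  0 B" into
# "<data bytes> <metadata bytes>". Only the byte-unit form is handled.
parse_nvme_format() {
    # Intentional word splitting: yields "512" "B" "+" "0" "B".
    set -- $1
    if [ "$#" -ne 5 ] || [ "$2" != "B" ] || [ "$3" != "+" ] || [ "$5" != "B" ]; then
        echo "unrecognized format" >&2
        return 1
    fi
    echo "$1 $4"
}
```

For the device above, `parse_nvme_format "512   B +  0 B"` prints `512 0`, i.e. 512-byte data blocks with no metadata.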

@tnn
Contributor Author

tnn commented Jul 4, 2018

I could not get master to compile (missing ./configure / autogen.sh), but I cherry-picked my commit on top of 0.7.x and ran the following tests on an AWS EC2 i3.large instance:

$ curl http://169.254.169.254/latest/meta-data/instance-type
i3.large

$ uname -a
Linux secret-name-i-054359d2538a5baa8 4.4.0-1062-aws #71-Ubuntu SMP Fri Jun 15 10:07:39 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

$ dpkg -l | grep zfs
ii  libzfs2linux                     0.6.5.6-0ubuntu18                          amd64        Native OpenZFS filesystem library for Linux
ii  zfs-dkms                         0.6.5.6-0ubuntu18                          amd64        Native OpenZFS filesystem kernel modules for Linux
ii  zfs-doc                          0.6.5.6-0ubuntu18                          all          Native OpenZFS filesystem documentation and examples.
ii  zfs-initramfs                    0.6.5.6-0ubuntu18                          all          Native OpenZFS root filesystem capabilities for Linux
ii  zfs-zed                          0.6.5.6-0ubuntu18                          amd64        OpenZFS Event Daemon (zed)
ii  zfsutils-linux                   0.6.5.6-0ubuntu18                          amd64        Native OpenZFS management utilities for Linux

$ sudo zpool create dozer /dev/nvme0n1 && sudo zdb -C dozer | grep ashift
                ashift: 9

$ sudo sginfo -M /dev/nvme0n1
INQUIRY response (cmd: 0x12)
----------------------------
Device Type                        0
Vendor:                    NVMe
Product:                   Amazon EC2 NVMe
Revision level:

$ lsblk -io KNAME,TYPE,SIZE,MODEL
KNAME   TYPE    SIZE MODEL
nvme0n1 disk  442.4G Amazon EC2 NVMe Instance Storage

$ cat /sys/block/nvme0n1/queue/physical_block_size
512

## Get latest spl-0.7-release and zfs-0.7-release branches, cherry-pick my commit
## Verify the change is applied to cmd/zpool/zpool_vdev.c
## Build the source and package following https://github.com/zfsonlinux/zfs/wiki/Custom-Packages#debian-and-ubuntu


# Remove the default zfs packages, since the package names differ between 0.6.x and 0.7.x
$ sudo apt-get remove libzfs2linux zfs-doc zfs-initramfs zfs-zed zfsutils-linux

$ sudo dpkg -i spl/spl-dkms_0.7.9-1_amd64.deb \
   spl/spl_0.7.9-1_amd64.deb \
   zfs/libnvpair1_0.7.9-1_amd64.deb \
   zfs/libuutil1_0.7.9-1_amd64.deb \
   zfs/libzfs2_0.7.9-1_amd64.deb \
   zfs/libzpool2_0.7.9-1_amd64.deb \
   zfs/zfs-dkms_0.7.9-1_amd64.deb \
   zfs/zfs-initramfs_0.7.9-1_amd64.deb \
   zfs/zfs_0.7.9-1_amd64.deb
[...]

$ sudo dkms status |grep -E 'zfs|spl'
spl, 0.7.9, 4.4.0-1062-aws, x86_64: installed
zfs, 0.7.9, 4.4.0-1062-aws, x86_64: installed


$ sudo zpool create dozer /dev/nvme0n1 && sudo zdb -C dozer | grep ashift
                ashift: 12
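
The before/after comparison above can be turned into a small check. The sketch below filters `zdb -C` output read from stdin; `ashift_values` and `all_ashift_equal` are hypothetical helper names, and the parsing assumes the `ashift: N` line layout shown in the output above.

```shell
# Print every ashift value found in `zdb -C <pool>` output read from stdin.
ashift_values() {
    awk '$1 == "ashift:" { print $2 }'
}

# Succeed only when at least one ashift line is present and every one
# of them equals the expected value.
all_ashift_equal() {
    expected=$1
    values=$(ashift_values)
    # grep -v -x succeeds if any line differs from the expected value,
    # so its result is negated.
    [ -n "$values" ] && ! echo "$values" | grep -qvx "$expected"
}
```

Usage would look like `sudo zdb -C dozer | all_ashift_equal 12 && echo ok`.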

Add a default 4 KiB ashift for Amazon EC2 NVMe devices on instances with
NVMe ephemeral devices, such as the types c5d, f1, i3 and m5d.
As per the official documentation [1] a 4096 byte blocksize should be
used to match the underlying hardware.

The string was identified via:

$ sudo sginfo -M /dev/nvme0n1
INQUIRY response (cmd: 0x12)
----------------------------
Device Type                        0
Vendor:                    NVMe
Product:                   Amazon EC2 NVMe
Revision level:

$ lsblk -io KNAME,TYPE,SIZE,MODEL
KNAME   TYPE    SIZE MODEL
nvme0n1 disk  442.4G Amazon EC2 NVMe Instance Storage

[1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/
    storage-optimized-instances.html
    Retrieved 2018-07-03

Signed-off-by: Troels Nørgaard <[email protected]>
@richardelling
Contributor

I see no proof that the NVMe SSDs are 4k or 8k. I only see proof the SSDs are 512.

@behlendorf
Contributor

Since the tools really can't be trusted in this environment, let's follow the guidance from the documentation.

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/storage-optimized-instances.html#ssd-io-performance

"This decrease in performance is even larger if the write operations are not in multiples
of 4,096 bytes or not aligned to a 4,096-byte boundary. If you write a smaller amount of
bytes or bytes that are not aligned, the SSD controller must read the surrounding data
and store the result in a new location. This pattern results in significantly increased
write amplification, increased latency, and dramatically reduced I/O performance."
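
To make the alignment condition in the quoted text concrete, here is a small sketch that tests whether a write's offset and length are both multiples of the block size. `is_aligned_write` is a hypothetical name, and the 4096 default simply follows the figure in the documentation quoted above.

```shell
# Succeed when both the offset and the length of a write are multiples
# of the block size (third argument, default 4096 bytes).
is_aligned_write() {
    offset=$1
    length=$2
    block=${3:-4096}
    [ $(( offset % block )) -eq 0 ] && [ $(( length % block )) -eq 0 ]
}
```

For example, `is_aligned_write 8192 4096` succeeds, while a 512-byte write at offset 512 fails the check against 4096. That second case is the kind of I/O a pool created with ashift 9 may issue to the device.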

@tnn
Contributor Author

tnn commented Jul 6, 2018

Matt Wilson from AWS (engineer on the Nitro devices) has been kind enough to confirm that 4 KiB block alignment should be beneficial.

@behlendorf behlendorf merged commit 94370f5 into openzfs:master Jul 6, 2018
@tnn tnn deleted the ashift-aws-nvme branch July 7, 2018 06:44
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Aug 15, 2018
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Troels Nørgaard <[email protected]>
Closes openzfs#7676
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Aug 23, 2018
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Aug 27, 2018
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Sep 5, 2018