diff --git a/doc/io.rst b/doc/io.rst
index 192890e112a..b6652ab5ef1 100644
--- a/doc/io.rst
+++ b/doc/io.rst
@@ -266,6 +266,39 @@ converting ``NaN`` to ``-9999``, we would use
 ``encoding={'foo': {'dtype': 'int16', 'scale_factor': 0.1, '_FillValue': -9999}}``.
 Compression and decompression with such discretization is extremely fast.
 
+.. _io.string-encoding:
+
+String encoding
+...............
+
+xarray can write Unicode strings to netCDF files in two ways:
+
+- As variable-length strings. This is only supported on netCDF4 (HDF5) files.
+- By encoding strings into bytes, and writing encoded bytes as a character
+  array. The default encoding is UTF-8.
+
+By default, we use variable-length strings for compatible files and fall back
+to using encoded character arrays. Character arrays can be selected even for
+netCDF4 files by setting the ``dtype`` field in ``encoding`` to ``S1``
+(corresponding to NumPy's single-character bytes dtype).
+
+If character arrays are used, the string encoding that was used is stored on
+disk in the ``_Encoding`` attribute, which matches an ad-hoc convention
+`adopted by the netCDF4-Python library <https://github.com/Unidata/netcdf4-python/pull/665>`_.
+At the time of this writing (October 2017), a standard convention for indicating
+string encoding for character arrays in netCDF files was
+`still under discussion <https://github.com/Unidata/netcdf-c/issues/402>`_.
+Technically, you can use
+`any string encoding recognized by Python <https://docs.python.org/3/library/codecs.html#standard-encodings>`_ if you feel the need to deviate from UTF-8,
+by setting the ``_Encoding`` field in ``encoding``. But
+`we don't recommend it <http://utf8everywhere.org/>`_.
+
+.. warning::
+
+  By default, missing values in bytes or Unicode string arrays (represented by
+  ``NaN`` in xarray) are currently written to disk as empty strings ``''``, so
+  missing values will not be restored when data is loaded from disk.
+  This behavior is likely to change in the future (:issue:`1647`).
 
 Chunk based compression
 .......................
@@ -390,7 +423,7 @@ over the network until we look at particular values:
 
 Some servers require authentication before we can access the data. For this
 purpose we can explicitly create a :py:class:`~xarray.backends.PydapDataStore`
-and pass in a `Requests`__ session object. For example for 
+and pass in a `Requests`__ session object. For example for
 HTTP Basic authentication::
 
     import xarray as xr
@@ -403,7 +436,7 @@ HTTP Basic authentication::
                                             session=session)
     ds = xr.open_dataset(store)
 
-`Pydap's cas module`__ has functions that generate custom sessions for 
+`Pydap's cas module`__ has functions that generate custom sessions for
 servers that use CAS single sign-on. For example, to connect to servers
 that require NASA's URS authentication::
 
diff --git a/doc/whats-new.rst b/doc/whats-new.rst
index e3d63e7f525..03b50b90616 100644
--- a/doc/whats-new.rst
+++ b/doc/whats-new.rst
@@ -73,6 +73,13 @@ Breaking changes
   produce a warning encouraging users to adopt the new syntax.
   By `Daniel Rothenberg <https://github.com/darothen>`_.
 
+- Unicode strings (``str`` on Python 3) are now round-tripped successfully even
+  when written as character arrays (e.g., as netCDF3 files or when using
+  ``engine='scipy'``) (:issue:`1638`). This is controlled by the ``_Encoding``
+  attribute convention, which is also understood directly by the netCDF4-Python
+  interface. See :ref:`io.string-encoding` for full details.
+  By `Stephan Hoyer <https://github.com/shoyer>`_.
+
 - ``repr`` and the Jupyter Notebook won't automatically compute dask variables.
   Datasets loaded with ``open_dataset`` won't automatically read coords from
   disk when calling ``repr`` (:issue:`1522`).