apache · RussellSpitzer · Nov 17, 2021 · Nov 15, 2021 · Nov 15, 2021 · Nov 15, 2021
diff --git a/site/docs/spec.md b/site/docs/spec.md
@@ -212,6 +212,24 @@ Columns in Iceberg data files are selected by field id. The table schema's colum
 
 For example, a file may be written with schema `1: a int, 2: b string, 3: c double` and read using projection schema `3: measurement, 2: name, 4: a`. This must select file columns `c` (renamed to `measurement`), `b` (now called `name`), and a column of `null` values called `a`; in that order.
 
+Tables may also define a property `schema.name-mapping.default` with a JSON name mapping containing a list of field mapping objects. These mappings provide fallback field ids to be used when a data file does not contain field id information. Each object should contain:
+
+* `names`: A required list of 0 or more names for a field. 
+* `field-id`: An optional Iceberg field ID used when a field's name is present in `names`.
+* `fields`: An optional list of field mappings for child fields of structs, maps, and lists.
+
+Field mapping fields are constrained by the following rules:
+
+* A name may contain `.` but this refers to a literal name, not a nested field. For example, `a.b` refers to a field named `a.b`, not child field `b` of field `a`. 
+* Each child field should be defined with its own field mapping under `fields`.
+* Multiple values for `names` may be mapped to a single field ID to support cases where a field may have different names in different data files. For example, all Avro field aliases should be listed in `names`.
+* Fields which exist only in the Iceberg schema and not in imported data files may use an empty `names` list.
+* Fields that exist in imported files but not in the Iceberg schema may omit `field-id`.
+* List types should contain a mapping in `fields` for `element`. 
+* Map types should contain mappings in `fields` for `key` and `value`. 
+* Struct types should contain mappings in `fields` for their child fields.
+
+For details on serialization, see [Appendix C](#name-mapping-serialization).
 
 #### Identifier Field IDs
 
@@ -990,6 +1008,27 @@ Table metadata is serialized as a JSON object according to the following table.
 |**`default-sort-order-id`**|`JSON int`|`0`|
 
 
+### Name Mapping Serialization
+
+A name mapping is serialized as a list of field mapping JSON objects, each of which is serialized as follows:
+
+|Field mapping field|JSON representation|Example|
+|--- |--- |--- |
+|**`names`**|`JSON list of strings`|`["latitude", "lat"]`|
+|**`field-id`**|`JSON int`|`1`|
+|**`fields`**|`JSON field mappings (list of objects)`|`[{ `<br />&nbsp;&nbsp;`"field-id": 4,`<br />&nbsp;&nbsp;`"names": ["latitude", "lat"]`<br />`}, {`<br />&nbsp;&nbsp;`"field-id": 5,`<br />&nbsp;&nbsp;`"names": ["longitude", "long"]`<br />`}]`|
+
+Example:
+```json
+[ { "field-id": 1, "names": ["id", "record_id"] },
+   { "field-id": 2, "names": ["data"] },
+   { "field-id": 3, "names": ["location"], "fields": [
+       { "field-id": 4, "names": ["latitude", "lat"] },
+       { "field-id": 5, "names": ["longitude", "long"] }
+     ] } ]
+```
+
+
 ## Appendix D: Single-value serialization
 
 This serialization scheme is for storing single values as individual binary values in the lower and upper bounds maps of manifest files.
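
To illustrate the name mapping rules added in this diff, here is a minimal Python sketch of how a reader might resolve a fallback field id from a column's name path. It is not Iceberg's implementation: `lookup_field_id` is a hypothetical helper, and the assumption that a reader walks the mapping one nesting level at a time is ours. The JSON is the example from the Name Mapping Serialization section.

```python
import json

# Example name mapping taken from the serialization section of the diff.
NAME_MAPPING_JSON = """
[ { "field-id": 1, "names": ["id", "record_id"] },
  { "field-id": 2, "names": ["data"] },
  { "field-id": 3, "names": ["location"], "fields": [
      { "field-id": 4, "names": ["latitude", "lat"] },
      { "field-id": 5, "names": ["longitude", "long"] }
  ] } ]
"""


def lookup_field_id(mappings, path):
    """Return the fallback field id for a column path, or None if unmapped.

    Hypothetical helper for illustration only. `path` holds one name per
    nesting level; a part such as "a.b" is matched literally and is never
    split on the dot.
    """
    current = mappings
    found = None
    for part in path:
        # Any entry in `names` may match, which covers aliases across files.
        found = next((m for m in current if part in m.get("names", [])), None)
        if found is None:
            return None
        # Child fields are looked up only under the parent's `fields` list.
        current = found.get("fields", [])
    # `field-id` is optional, so a matched mapping may still yield None.
    return found.get("field-id") if found else None


mappings = json.loads(NAME_MAPPING_JSON)
print(lookup_field_id(mappings, ["record_id"]))        # 1 (alias of "id")
print(lookup_field_id(mappings, ["location", "lat"]))  # 4 (nested child)
print(lookup_field_id(mappings, ["location.lat"]))     # None (literal name)
```

The last call returns `None` because a dotted name is treated as a single literal field name, not as a path into `location`.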