Data normalization
Source plugins are expected to add data to two specific data buckets: `models` and `objects`.
Products like content management systems typically have the concept of models or content types. Source plugins are expected to add information about these models to the `models` data bucket, as objects with the following properties:
Property | Type | Description | Example |
---|---|---|---|
`id` | String | A unique ID that identifies the object | 123456789 |
`source` | String | The name of the source plugin as used in its `package.json` | sourcebit-source-contentful |
`modelName` | String | The ID or machine-friendly name of the model | blog |
`modelLabel` | String | The human-friendly name of the model | Blog Posts |
`projectId` | String | The ID of the project within the source platform | Contentful space ID |
`projectEnvironment` | String | The environment within the source platform | Contentful space environment |
💡 For data sources that don't have the concept of a project ID or environment, these values can be set to an empty string.
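As an illustration, a source plugin might build a model object along these lines. This is a minimal sketch, not Sourcebit's API: the `buildModel` helper and the shape of its `contentType` and `options` arguments are hypothetical. Only the property names of the returned object come from the table above.

```javascript
// Hypothetical helper: builds one object for the `models` data bucket.
// The `contentType` and `options` argument shapes are assumptions made for
// illustration; only the returned property names come from the table.
function buildModel(contentType, options = {}) {
  return {
    id: String(contentType.id),
    source: "sourcebit-source-contentful",
    modelName: contentType.name,
    modelLabel: contentType.label,
    // Sources with no project ID or environment use an empty string.
    projectId: options.projectId || "",
    projectEnvironment: options.projectEnvironment || ""
  };
}

const exampleModel = buildModel({ id: 123456789, name: "blog", label: "Blog Posts" });
console.log(exampleModel);
```

Because no `options` are passed here, `projectId` and `projectEnvironment` fall back to empty strings, as the note above recommends.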
The `objects` data bucket contains all entries coming from the various data sources. Source plugins must normalize all entries before adding them to the data bucket. This normalization consists of adding a property called `__metadata`, containing an object with the following properties:
Property | Type | Description | Example |
---|---|---|---|
`source` | String | The name of the source plugin as used in its `package.json` | sourcebit-source-contentful |
`modelName` | String | The ID or machine-friendly name of the model | blog |
`modelLabel` | String | The human-friendly name of the model | Blog Posts |
`projectId` | String | The ID of the project within the source platform | Contentful space ID |
`projectEnvironment` | String | The environment within the source platform | Contentful space environment |
`createdAt` | String | The ISO 8601 representation of the entry's creation date | 2011-10-05T14:48:00.000Z |
`updatedAt` | String | The ISO 8601 representation of the entry's last update date | 2011-10-05T15:30:00.000Z |
Additionally, all content fields should be placed at the root level of the entry object with the ID field named `id`.
- 🚫 { "type": "blog", "meta": { "_id": "123456789", "created_at": "2011-10-05T14:48:00.000Z", "updated_at": "2011-10-05T15:30:00.000Z" }, "fields": { "title": "Normalizing entries", "subtitle": "Because normal is good" } }
- ✅ { "title": "Normalizing entries", "subtitle": "Because normal is good", "__metadata": { "id": "123456789", "source": "sourcebit-source-contentful", "modelName": "blog", "modelLabel": "Blog posts", "projectId": "1q2w3e4r", "projectEnvironment": "master", "createdAt": "2011-10-05T14:48:00.000Z", "updatedAt": "2011-10-05T15:30:00.000Z" } }
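The transformation between the two shapes above could be sketched as follows. The `normalizeEntry` function and its `context` argument are hypothetical names introduced for illustration, and the raw entry shape (`type`, `meta`, `fields`) is assumed to match the first example. A real data source will expose its own structure.

```javascript
// Hypothetical normalizer for a raw entry shaped like the non-normalized
// example. The raw shape (`type`, `meta`, `fields`) and the `context`
// argument are assumptions for illustration only.
function normalizeEntry(rawEntry, context) {
  return {
    // Content fields move to the root level of the entry.
    ...rawEntry.fields,
    __metadata: {
      id: rawEntry.meta._id,
      source: context.source,
      modelName: rawEntry.type,
      // Fall back to the machine-friendly name if no label is known.
      modelLabel: context.modelLabels[rawEntry.type] || rawEntry.type,
      projectId: context.projectId || "",
      projectEnvironment: context.projectEnvironment || "",
      createdAt: rawEntry.meta.created_at,
      updatedAt: rawEntry.meta.updated_at
    }
  };
}

const rawEntry = {
  type: "blog",
  meta: {
    _id: "123456789",
    created_at: "2011-10-05T14:48:00.000Z",
    updated_at: "2011-10-05T15:30:00.000Z"
  },
  fields: { title: "Normalizing entries", subtitle: "Because normal is good" }
};

const normalizedEntry = normalizeEntry(rawEntry, {
  source: "sourcebit-source-contentful",
  modelLabels: { blog: "Blog posts" },
  projectId: "1q2w3e4r",
  projectEnvironment: "master"
});

console.log(JSON.stringify(normalizedEntry, null, 2));
```

Spreading `rawEntry.fields` first keeps the content fields at the root, while everything describing the entry's origin is collected under `__metadata`.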
💡 For data sources that don't have the concept of a unique ID for each entry, you're advised to auto-generate one using a package like https://www.npmjs.com/package/uuid.
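One possible way to apply this fallback is sketched below. To keep the example free of dependencies, it swaps the `uuid` package for Node's built-in `crypto.randomUUID()` (available in recent Node.js versions). That substitution is an assumption of this sketch, and the `ensureId` helper is a hypothetical name.

```javascript
// Uses Node's built-in crypto module in place of the `uuid` package
// (an assumption made to keep this sketch dependency-free).
const { randomUUID } = require("crypto");

// Hypothetical helper: returns the source's own ID when one exists,
// otherwise a freshly generated random UUID.
function ensureId(sourceId) {
  if (sourceId !== undefined && sourceId !== null && sourceId !== "") {
    return String(sourceId);
  }
  return randomUUID();
}

console.log(ensureId(123456789)); // ID provided by the source, as a string
console.log(ensureId(undefined)); // generated UUID, different on every call
```

Note that a randomly generated ID changes on every run, so it identifies an entry only within a single run of the plugin.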
Asset objects are subject to an additional normalization routine. The structure of the objects will be changed so that they contain only the following properties:
- `contentType` (String): The MIME type describing the asset type
- `fileName` (String): The name of the original file
- `url` (String): The full asset URL
A `__metadata` block will still be added, but the value of the `modelName` property will be set to `__asset`.
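The asset routine described above might look roughly like this. It is a sketch under stated assumptions: `normalizeAsset`, the raw asset shape (`id`, `mimeType`, `name`, `href`, `createdAt`, `updatedAt`), the `context` argument, and the "Assets" label are all hypothetical and do not come from Sourcebit itself.

```javascript
// Hypothetical normalizer for an asset. The raw asset shape and the
// `context` argument are assumptions made for illustration only.
function normalizeAsset(rawAsset, context) {
  return {
    // Only these three content properties are kept.
    contentType: rawAsset.mimeType,
    fileName: rawAsset.name,
    url: rawAsset.href,
    __metadata: {
      id: rawAsset.id,
      source: context.source,
      // Assets always use this reserved model name.
      modelName: "__asset",
      // Assumption: the label for assets is not specified by the source text.
      modelLabel: "Assets",
      projectId: context.projectId || "",
      projectEnvironment: context.projectEnvironment || "",
      createdAt: rawAsset.createdAt,
      updatedAt: rawAsset.updatedAt
    }
  };
}

const normalizedAsset = normalizeAsset(
  {
    id: "asset-1",
    mimeType: "image/png",
    name: "logo.png",
    href: "https://example.com/logo.png",
    width: 640, // extra properties like this one are dropped
    createdAt: "2011-10-05T14:48:00.000Z",
    updatedAt: "2011-10-05T15:30:00.000Z"
  },
  { source: "sourcebit-source-contentful", projectId: "1q2w3e4r", projectEnvironment: "master" }
);

console.log(JSON.stringify(normalizedAsset, null, 2));
```

Building a new object rather than mutating the raw one makes it easy to guarantee that only `contentType`, `fileName`, `url`, and `__metadata` are present in the result.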