dplace-data/datasets at master · D-PLACE/dplace-data

Binford
EA
GMTED2010
GSHHS
Jenkins
Kreft
MODIS
SCCS
TEOW
WNAI
_template
ecoClimate
README.md
index.csv

Datasets

Variables and associated values coded for D-PLACE societies are organized into datasets, according to their source.

Each dataset is identified by a short textual ID, e.g. "EA" for data originating from the Ethnographic Atlas or "Binford" for data from Binford's hunter-gatherer database. The data files for a dataset are kept in a subdirectory of datasets named after the dataset ID and must include the following files:

variables.csv: The list of variables coded in the dataset; must contain the columns
- id: D-PLACE-wide unique identifier for the variable
- title
- definition
- type: one of Ordinal, Continuous, Categorical
- category: comma-separated list of categories a variable belongs to.
- units
- source
- changes
- notes
data.csv: The coded values; must contain the columns
- soc_id: Reference to a D-PLACE society ID.
- var_id: Reference to a D-PLACE variable ID.
- code: Reference to a categorical value described in codes.csv or a literal value.
- sub_case
- year
- comment
- references: Semicolon-separated list of reference keys.
- source_coded_data
- admin_comment

and may optionally also provide files:

codes.csv: A list of category descriptions for categorical variables:
- var_id: Reference to a D-PLACE variable ID.
- code
- description
- name
references.csv: A list of references:
- key: The key used to refer to this source in the data
- citation: The full citation.
societies.csv: A list of additional societies coded in the dataset with columns:
- soc_id: D-PLACE-wide unique identifier for the society
- xd_id
- pref_name_for_society
- ORIG_name_and_ID_in_this_dataset
- alt_names_by_society
- main_focal_year
- HRAF_name_ID
- HRAF_link
- origLat
- origLong
- Lat
- Long
- Comment on location
- glottocode: Code for the most specific Glottolog languoid which can be assigned to this society.
- glottocode_comment: Comment on the assignment of a glottocode to this society.
societies_mapping.csv: A CSV file mapping society IDs to similar societies in other datasets.
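
The column requirements above lend themselves to a simple header check. The following is a minimal sketch, not part of the D-PLACE tooling: the function name `missing_columns` and the sample IDs are hypothetical, and only the column names are taken from the lists above.

```python
import csv
import io

# Required columns, copied from the lists in this README.
REQUIRED_COLUMNS = {
    "variables.csv": ["id", "title", "definition", "type", "category",
                      "units", "source", "changes", "notes"],
    "data.csv": ["soc_id", "var_id", "code", "sub_case", "year", "comment",
                 "references", "source_coded_data", "admin_comment"],
}


def missing_columns(filename, fileobj):
    """Return the required columns absent from the header row of a CSV file."""
    # An empty file has no header, so every required column is reported.
    header = next(csv.reader(fileobj), [])
    return [col for col in REQUIRED_COLUMNS[filename] if col not in header]


# An in-memory file stands in for datasets/<ID>/data.csv; the row is made up.
sample = io.StringIO("soc_id,var_id,code,year\nX1,V1,3,1900\n")
print(missing_columns("data.csv", sample))
```

For a file on disk, the same function could be called with `open(path, newline="", encoding="utf-8")` in place of the in-memory sample.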

A dataset that provides societies is also considered a "society set"; a dataset that provides only societies is exclusively a society set. The D-PLACE web interface distinguishes these two ways of contributing to D-PLACE, but the data model does not, because the property can be computed.
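
One way this property might be computed is sketched below. It assumes that "provides societies" means societies.csv has at least one data row and that "provides data" means data.csv has at least one data row; both the rule and the names `count_rows` and `dataset_roles` are illustrative assumptions, not the actual D-PLACE import logic.

```python
import csv
from pathlib import Path


def count_rows(path):
    """Number of rows after the header in a CSV file; 0 if the file is missing."""
    path = Path(path)
    if not path.exists():
        return 0
    with path.open(newline="", encoding="utf-8") as f:
        return max(sum(1 for _ in csv.reader(f)) - 1, 0)


def dataset_roles(dataset_dir):
    """Derive the two roles of a dataset directory from its files (assumed rule)."""
    d = Path(dataset_dir)
    return {
        "provides_data": count_rows(d / "data.csv") > 0,
        "society_set": count_rows(d / "societies.csv") > 0,
    }
```

Under these assumptions, a directory whose data.csv contains only a header row would be reported as exclusively a society set.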

For a dataset to be considered for import into D-PLACE it must be registered, i.e. listed in the file index.csv, which also provides additional metadata for the dataset. index.csv has the following columns:

- id: The dataset ID, i.e. the name of the subdirectory of datasets the data is kept in.
- type: one of environmental, cultural.
- name
- description
- year
- author
- reference: Full citation of the source

Explicit registration may be somewhat redundant, since it keeps the dataset ID in two places (the registry and the directory name), but it allows better control over what is considered ready for import and thus makes it possible to work on datasets in their "final place" until they are finished.
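
The relationship between the registry and the directory names can be sketched as a set comparison. This is a minimal illustration: `registration_report`, the directory name `draft_dataset`, and the placeholder metadata values are hypothetical; only the index.csv column names come from the list above.

```python
import csv
import io


def registered_ids(index_file):
    """Read the dataset IDs from the id column of index.csv."""
    return [row["id"] for row in csv.DictReader(index_file)]


def registration_report(index_file, directory_names):
    """Compare registered dataset IDs with the subdirectories that exist."""
    registered = set(registered_ids(index_file))
    present = set(directory_names)
    return {
        "ready_for_import": sorted(registered & present),
        "unregistered": sorted(present - registered),      # work in progress
        "missing_directory": sorted(registered - present),  # registry out of sync
    }


# Placeholder metadata; only the column names are taken from this README.
index = io.StringIO(
    "id,type,name,description,year,author,reference\n"
    "EA,cultural,placeholder,placeholder,placeholder,placeholder,placeholder\n"
)
print(registration_report(index, ["EA", "draft_dataset"]))
```

In this sketch, a directory that is not listed in index.csv is simply reported as unregistered, which mirrors the idea of working on a dataset in place before it is registered.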

Relations

Each dataset may contribute its own set of societies. Relations among the societies from different datasets are stored in a CSV mapping file societies_mapping.csv in the form

id,related
<soc-id>,<qualified-soc-id>[;<qualified-soc-id>]*

where <qualified-soc-id> is a string composed as <dataset-id>: <original name> [<soc-id>].
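
A value of the related column could be split into its parts as follows. This is a sketch under stated assumptions: the regular expression and `parse_related` are illustrative, original names are assumed not to contain semicolons, and the IDs in the usage example are invented.

```python
import re

# <dataset-id>: <original name> [<soc-id>]
QUALIFIED_ID = re.compile(
    r"^(?P<dataset>[^:]+):\s*(?P<name>.*?)\s*\[(?P<soc_id>[^\]]+)\]$"
)


def parse_related(related):
    """Split a related value into (dataset ID, original name, society ID) tuples."""
    parsed = []
    for part in related.split(";"):
        part = part.strip()
        if not part:
            continue
        match = QUALIFIED_ID.match(part)
        if match is None:
            raise ValueError(f"not a qualified society ID: {part!r}")
        parsed.append((match["dataset"], match["name"], match["soc_id"]))
    return parsed


# Invented identifiers, for illustration only.
print(parse_related("DS1: Some Name [s1];DS2: Other (group) [s2]"))
```

The society ID is taken from the last bracketed group, so an original name that itself contains brackets is still parsed as part of the name.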

Currently, the only type of relation specified in the data is "equivalence". This may be a misnomer, since equivalence implies that the sets of equivalent societies form a partition of the set of all societies, which is not the case.
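
Why the relation does not yield a partition can be illustrated with a small check. The function `is_partition` and the society IDs a, b, c are hypothetical; the sketch only shows that a relation which is not transitive cannot group societies into disjoint classes.

```python
def is_partition(relations):
    """Check whether a relation groups societies into disjoint classes.

    relations maps each society ID to the set of IDs it is related to.
    """
    # Treat every society as related to itself.
    closed = {soc: set(rel) | {soc} for soc, rel in relations.items()}
    for soc, related in closed.items():
        for other in related:
            # In a true equivalence, every member sees the same class.
            if closed.get(other, {other}) != related:
                return False
    return True


# a~b and b~c, but a is not related to c: not a partition.
print(is_partition({"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}))
# a~b, with c on its own: a partition.
print(is_partition({"a": {"b"}, "b": {"a"}, "c": set()}))
```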

Note that changing the xd_id of a society requires re-computing the D-PLACE internal society relations.
