Libraries, archives, museums, and other holders of cultural heritage artifacts are releasing metadata (e.g. artist or author information, medium, physical dimensions) about those artifacts in the open for public reuse and analysis. For the "Cultural Heritage Data" theme at Data Tables, we want to explore these repositories of "cultural source code"!
The collection data of the Carnegie Museum of Art in Pittsburgh, Pennsylvania, provided as CSV and JSON files uploaded to GitHub. The datasets are described using a datapackage.json file: #22
https://github.com/cmoa/collection
The collection data of The Metropolitan Museum of Art in New York: #23
https://github.com/metmuseum/openaccess
The collection data of Tate, a family of four galleries holding British art: #24
https://github.com/tategallery/collection
The collection data of the Cooper Hewitt, Smithsonian Design Museum in New York: #27
https://github.com/cooperhewitt/collection
The collection data of the Museum of Modern Art (MoMA) in New York: #28
https://github.com/MuseumofModernArt/collection
- Inspect the data and describe the schema if it doesn't yet exist
- You can do this with a datapackage.json (http://specs.frictionlessdata.io/data-package/) or CSV on the Web metadata file (http://w3c.github.io/csvw/)
- Clean the data
- As an example, Srini Kadamati explored the particular issues relevant to collection data and how they can be cleaned with Python.
- Create a basic analysis
- As an example, FiveThirtyEight's Oliver Roeder produced articles analyzing New York’s Metropolitan Museum of Art and the MoMA. These analyses yielded some interesting insights, including:
- Top 10 countries of origin for items in the collection
- Top 10 categories of items in the collection by type and year of origin
- Map year acquired to year painted as a measure of the rate of modernization
- A focus on vases by year and civilization of origin
- Create a bot
- As an example, [@NYPLEmoji](https://twitter.com/nyplemoji) is a Twitter bot that receives emoji replies and responds with an image related to that emoji.
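
For the "inspect the data and describe the schema" task above, here is a minimal sketch of how a first-draft datapackage.json descriptor could be generated from a CSV file using only the Python standard library. The sample rows, the column names, and the helper names `infer_type` and `describe_csv` are hypothetical and only for illustration. The type guessing is deliberately naive (integer, number, or string), so a real descriptor would need manual review against the Data Package specification.

```python
import csv
import io
import json


def infer_type(values):
    """Guess a Table Schema type from a column's non-empty string values."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "string"
    # Try the strictest type first, then fall back to looser ones.
    for type_name, cast in (("integer", int), ("number", float)):
        try:
            for v in non_empty:
                cast(v)
            return type_name
        except ValueError:
            continue
    return "string"


def describe_csv(text, name, path):
    """Build a minimal Data Package descriptor for one CSV resource."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    field_names = reader.fieldnames or []
    fields = [
        {"name": f, "type": infer_type([r[f] or "" for r in rows])}
        for f in field_names
    ]
    return {
        "name": name,
        "resources": [{"name": name, "path": path, "schema": {"fields": fields}}],
    }


# Hypothetical sample rows; real collection files have many more columns.
SAMPLE = """id,title,artist,year,height_cm
1,Untitled,Jane Doe,1950,30.5
2,Study in Blue,,1962,45
"""

descriptor = describe_csv(SAMPLE, "sample-collection", "sample.csv")
print(json.dumps(descriptor, indent=2))
```

To try this on one of the repositories listed above, read the CSV file's text and pass it to `describe_csv`, then edit the resulting JSON by hand to add titles, descriptions, and more precise types such as dates.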
- How do these collection datasets relate to each other?
- How is the same artist, location, medium, etc. represented across datasets?
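
One way to start on the question of how the same artist is represented across datasets is to reduce each name to a rough comparison key and look for keys that appear in more than one source. The sketch below assumes hypothetical name strings and a hypothetical helper, `normalize_artist`; it only handles "Last, First" ordering, case, accents, and punctuation, so it will miss aliases, transliterations, and shared names.

```python
import unicodedata


def normalize_artist(name):
    """Reduce an artist name to a rough comparison key."""
    name = name.strip()
    # Turn "Last, First" into "First Last" when there is exactly one comma.
    if name.count(",") == 1:
        last, first = (part.strip() for part in name.split(","))
        name = f"{first} {last}"
    # Strip accents by decomposing characters and dropping combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    unaccented = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Replace punctuation with spaces, collapse whitespace, and lowercase.
    cleaned = "".join(c if c.isalnum() or c.isspace() else " " for c in unaccented)
    return " ".join(cleaned.lower().split())


# Hypothetical artist strings as they might appear in two different exports.
dataset_a = ["Cézanne, Paul", "Warhol, Andy", "Turner, Joseph Mallord William"]
dataset_b = ["Paul Cezanne", "Andy Warhol", "Georgia O'Keeffe"]

keys_a = {normalize_artist(n): n for n in dataset_a}
keys_b = {normalize_artist(n): n for n in dataset_b}

# Pairs of original spellings that share a normalized key.
matches = sorted((keys_a[k], keys_b[k]) for k in keys_a.keys() & keys_b.keys())
for a, b in matches:
    print(f"{a!r} <-> {b!r}")
```

Exact matching on a normalized key is a conservative starting point. Where the datasets provide birth and death years or external identifiers, comparing those as well would reduce false matches between different artists with the same name.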
Decentralized data storage systems, like InterPlanetary File System (IPFS), are currently being explored as a way to store and share cultural heritage data.
Palladio is a tool for visualizing complex historical data.
TimelineJS is a tool for building interactive timelines.
If you are interested in "Collections as Data", there is a working group aimed at developing a strategic approach to sharing and describing such datasets.