Combine Image Metadata #14

Closed
mattmotoki opened this issue Oct 24, 2018 · 4 comments

@mattmotoki
Contributor

mattmotoki commented Oct 24, 2018

Create a single file with a row for each of the following plants:

Add a column for each of the following fields:

  • Hawaiian Name
  • Common Name
  • Genus
  • Species
  • Status (Endangered, Native, Endemic, Invasive)
  • Story
  • Uses

Upload all data to our shared data directory.
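As a rough sketch of the layout described above, the snippet below writes a header plus one row per plant using Python's standard csv module. The snake_case column names, the plant_name label column, and the write_metadata helper are assumptions for illustration, not names confirmed in this issue.

```python
import csv
import io

# Assumed column names, derived from the field list in this issue.
# "plant_name" is the label column referred to later in the thread.
FIELDS = [
    "plant_name", "hawaiian_name", "common_name",
    "genus", "species", "status", "story", "uses",
]

# Status values listed in the issue description.
ALLOWED_STATUS = {"Endangered", "Native", "Endemic", "Invasive"}


def write_metadata(rows, handle):
    """Write one CSV row per plant; fields that are not supplied stay empty."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        status = row.get("status", "")
        # Reject status values outside the agreed list, but allow blanks.
        if status and status not in ALLOWED_STATUS:
            raise ValueError(f"unexpected status: {status!r}")
        writer.writerow({field: row.get(field, "") for field in FIELDS})


if __name__ == "__main__":
    buffer = io.StringIO()
    write_metadata(
        [{"plant_name": "pineapple", "common_name": "Pineapple",
          "genus": "Ananas", "species": "comosus"}],
        buffer,
    )
    print(buffer.getvalue())
```

Validating the status column on write is optional; it is included here only because the issue restricts that field to four values.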

mattmotoki added the "in progress" label Oct 24, 2018
mattmotoki changed the title from "Obtain rare plant dataset" to "Obtain native plant dataset" Oct 24, 2018
mattmotoki changed the title from "Obtain native plant dataset" to "Collect images for plants on the KUPU/DLNR suggested list" Oct 24, 2018
mattmotoki changed the title from "Collect images for plants on the KUPU/DLNR suggested list" to "Collect images on KUPU/DLNR list" Oct 24, 2018
mattmotoki changed the title from "Collect images on KUPU/DLNR list" to "Combine Image Meta Data" Oct 28, 2018
@mattmotoki
Contributor Author

mattmotoki commented Oct 29, 2018

Combined the Native, Invasive, and Common plants into a single file called plant_categories.csv.
The file is in our shared data directory.

All fields except for the "plant_name" (the category that we are predicting) need to be filled in.
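One way to picture this step: merge the separate plant lists into one table keyed by plant_name, leave every other column blank, and report which columns still need to be filled in. This is only a sketch under assumptions; the column names other than plant_name, the lower-casing of names, and the combine and missing_fields helpers are hypothetical and not taken from the actual file.

```python
# Assumed columns other than the "plant_name" label.
OTHER_FIELDS = [
    "hawaiian_name", "common_name", "genus",
    "species", "status", "story", "uses",
]


def combine(lists):
    """Merge several plant lists into one list of rows.

    `lists` maps a source list name (e.g. "native") to plant names.
    Names are normalized to lower case and duplicates are dropped,
    keeping the first occurrence.
    """
    rows = {}
    for names in lists.values():
        for name in names:
            key = name.strip().lower()
            if key and key not in rows:
                rows[key] = {"plant_name": key,
                             **{field: "" for field in OTHER_FIELDS}}
    return list(rows.values())


def missing_fields(row):
    """Return the columns of a row that are still empty."""
    return [field for field in OTHER_FIELDS if not row[field].strip()]


if __name__ == "__main__":
    # "plant a" and "plant b" are placeholder names, not real entries.
    combined = combine({
        "native": ["Plant A"],
        "invasive": ["plant b", "Plant A "],
        "common": ["pineapple"],
    })
    for row in combined:
        print(row["plant_name"], "still needs:", missing_fields(row))
```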

mattmotoki changed the title from "Combine Image Meta Data" to "Combine Image Metadata" Oct 29, 2018
xyl012 closed this as completed Oct 29, 2018
@xyl012

xyl012 commented Oct 29, 2018

Added plant_categories_v2 with most fields filled in.

@mattmotoki
Contributor Author

mattmotoki commented Oct 29, 2018

Manually removed extra information, e.g., pineapple nutritional information. Cleaned up the text as follows:

  • replace new lines with space
  • remove references matching \[\d+\]
  • remove extra whitespace

For example,

The ovaries develop into berries, which coalesce into a large, compact, multiple fruit. The fruit of a pineapple is arranged in two interlocking helices, eight in one direction, 13 in the other, each being a Fibonacci number.[12]

The pineapple carries out CAM photosynthesis,[13] fixing carbon dioxide at night and storing it as the acid malate, then releasing it during the day aiding photosynthesis.

In the wild, pineapples are pollinated primarily by hummingbirds.[2][14] Certain wild pineapples are foraged and pollinated at night by bats.[15]

gets converted to

The ovaries develop into berries, which coalesce into a large, compact, multiple fruit. The fruit of a pineapple is arranged in two interlocking helices, eight in one direction, 13 in the other, each being a Fibonacci number. The pineapple carries out CAM photosynthesis, fixing carbon dioxide at night and storing it as the acid malate, then releasing it during the day aiding photosynthesis. In the wild, pineapples are pollinated primarily by hummingbirds. Certain wild pineapples are foraged and pollinated at night by bats. 
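The three cleanup steps listed above can be sketched with Python's re module. The clean_text name is hypothetical and this is not necessarily the script that produced the converted text; it only illustrates the same transformations in the same order.

```python
import re


def clean_text(text):
    """Apply the three cleanup steps in order."""
    # 1. Replace new lines with a space.
    text = re.sub(r"[\r\n]+", " ", text)
    # 2. Remove bracketed numeric references such as [12] or [2][14].
    text = re.sub(r"\[\d+\]", "", text)
    # 3. Collapse runs of whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    raw = (
        "each being a Fibonacci number.[12]\n\n"
        "The pineapple carries out CAM photosynthesis,[13] fixing carbon "
        "dioxide at night.\n\n"
        "In the wild, pineapples are pollinated primarily by "
        "hummingbirds.[2][14] Certain wild pineapples are foraged and "
        "pollinated at night by bats.[15]"
    )
    print(clean_text(raw))
```

Note that the brackets must be escaped in the pattern; an unescaped [\d+] would be read as a character class matching a single digit or plus sign.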

New data in plant_categories_v3.csv

mattmotoki removed the "in progress" label Oct 29, 2018
mattmotoki reopened this Oct 30, 2018
mattmotoki added the "in progress" label Oct 30, 2018
@mattmotoki
Contributor Author

Added Sora's "canoe plants" to the list. These plants are pretty well-known in Hawaii and add about 500 images per plant to our dataset. The updated data is in local_meta_v4.csv.

I changed the names of the files because I am working with metadata from multiple sources, and having both plant_categories and plant_meta was getting confusing. The prefix "local" refers to the set of plants that will be predicted in our app.

Each plant in local_meta_v4.csv has a corresponding sub-folder in our shared images drive.
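A quick consistency check between the metadata file and the image folders might look like the sketch below. It assumes that the CSV has a plant_name column and that each sub-folder is named exactly after that value; the paths in the usage section and both helper names are hypothetical.

```python
import csv
from pathlib import Path


def read_plant_names(csv_path, column="plant_name"):
    """Read the label column from the metadata CSV."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [row[column] for row in csv.DictReader(handle)]


def missing_folders(plant_names, images_root):
    """Return the plants that have no matching sub-folder under images_root."""
    root = Path(images_root)
    return [name for name in plant_names if not (root / name).is_dir()]


if __name__ == "__main__":
    # Hypothetical local paths; adjust to wherever the shared drive is mounted.
    names = read_plant_names("local_meta_v4.csv")
    for name in missing_folders(names, "images"):
        print("no image folder for:", name)
```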

mattmotoki removed the "in progress" label Nov 7, 2018