Prepare documentation for Worker.coop vocabs #8

Closed
ColmDC opened this issue Apr 7, 2023 · 17 comments

@ColmDC

ColmDC commented Apr 7, 2023

**Context:**
The Worker.coop project is at a stage where it can tweak its CRM so that the data it collects, and then publishes in its MykoMap, is well aligned with the international standards we want to support. This minimises the need to provide term mappings to those standards.

**Requirement:**
Prepare documentation to help the Worker.coop team understand the ICA vocabs we are using/recommending for describing co-operatives. This should include, but not be limited to, the autogenerated ESSGlobal documentation (which currently needs rebuilding and redeploying to the web).

@ColmDC ColmDC assigned ColmMassey, wu-lee and ColmDC and unassigned ColmMassey Apr 7, 2023
@ColmDC
Author

ColmDC commented May 15, 2023

Co-ops UK have now provided an interim drop of their Open Data. Internal copy here: https://drive.google.com/drive/folders/1NyaImSOYCP8Zk9rkEduw7Ld_2UrazCZb?usp=share_link

@wu-lee

wu-lee commented May 19, 2023

We've shared a document listing the ESSGLOBAL and other taxonomies in use in DCC currently, called TaxonomiesForCoops.xlsx.

Is this what you had in mind? Is there anything which we should add?

It was shared via an editable folder link here, so John and Graham can edit/upload if required.

https://nextcloud.digitalcommons.coop/s/mkso8BL43pxbARK

The folder's path is /DCC Personnel/SEA_Archive/Open Data and Maps/Delivery/CodeOperatives/Worker.Coop, and the internal link for it is:

https://nextcloud.digitalcommons.coop/f/3047

@ColmDC
Author

ColmDC commented May 22, 2023

> Is this what you had in mind? Is there anything which we should add?

No. It is this one.

@ColmDC
Author

ColmDC commented May 23, 2023

@wu-lee Do you feel that document is now clear enough for you to produce the augmented version of Organisations_2023-05-15.csv? If not, please specify what you think is still missing in comments in that document.

@wu-lee

wu-lee commented May 23, 2023

Based on reading it through, it seems like it - but the proof of the pudding is in the making. Should I attempt it and then feed back?

@ColmDC
Author

ColmDC commented May 23, 2023

I note that this ticket is still in NextUp, so best follow the Kanban board priorities. Are you blocked on what is currently in In Progress?

@wu-lee

wu-lee commented May 23, 2023

Not blocked in the sense of being unable to make progress. What's in progress is the mm version upgrades, plus some related tweaks and build configs for hook-runner, so it may appear blocked because that work isn't being made visible.

@ColmDC
Author

ColmDC commented May 24, 2023

As we didn't get to review the Kanban yesterday, if you have time to look at what it takes to populate those new fields before the call today, we'd have more to talk about, but no worries if not.

@wu-lee

wu-lee commented Jun 7, 2023

Having another look at this since it's Wednesday (we don't have a meeting scheduled, but it seems an apt time to spend a few minutes on this).

Squinting at our code, it looks like a sensible way to proceed is to add a conversion script within the existing coopsuk/ folder of the open-data project. Many of the configs and schemas are the same. We're essentially reading the same input and outputting something a bit like our standard.csv, with some extra fields passed through as-is.

So I'm wondering, how much do we care about the titles of the output fields? If we use the existing code, then the path of least resistance is to use the standard.csv names for fields in the output. Specifically, this sort of thing (although I've not been exhaustive yet):

Co-ops UK Identifier -> Identifier
Trading Name -> Name # (this is how the old scheme did it, as Organisation Name is a new field)
Registered Street -> Street Address
Registered City -> Locality
Registered State/Province -> Region
Registered Postcode -> Postcode
Registered Country -> Country ID
Website -> Website

Then those fields listed in your notes in that document above which are not retained would be added in. There may be some fields which are included but not in your notes - I guess these are harmless to include, but could be removed.
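To make the renaming idea concrete, here is a minimal Python sketch, assuming only the heading pairs listed above. It is not the existing conversion code in the open-data project: the function name `rename_headings`, the `passthrough` parameter, and the sample row are all hypothetical, and it renames headings only, without converting any values (for example, it does not turn a country name into a country ID).

```python
import csv
import io

# Source heading -> standard.csv heading, as listed in the comment above.
HEADING_MAP = {
    "Co-ops UK Identifier": "Identifier",
    "Trading Name": "Name",
    "Registered Street": "Street Address",
    "Registered City": "Locality",
    "Registered State/Province": "Region",
    "Registered Postcode": "Postcode",
    "Registered Country": "Country ID",
    "Website": "Website",
}


def rename_headings(src, dst, passthrough=()):
    """Copy CSV rows from src to dst, renaming the mapped columns.

    Columns named in `passthrough` keep their original headings.
    Any other unmapped column is dropped; missing columns become blank.
    """
    reader = csv.DictReader(src)
    out_fields = list(HEADING_MAP.values()) + list(passthrough)
    writer = csv.DictWriter(dst, fieldnames=out_fields, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        out = {new: row.get(old, "") for old, new in HEADING_MAP.items()}
        out.update({name: row.get(name, "") for name in passthrough})
        writer.writerow(out)


# Made-up sample input, for illustration only.
sample = io.StringIO(
    "Co-ops UK Identifier,Trading Name,Registered Postcode,Legal Form,Unused\n"
    "R000001,Example Co-op,AB1 2CD,Company,x\n"
)
result = io.StringIO()
rename_headings(sample, result, passthrough=["Legal Form"])
print(result.getvalue())
```

Listing the pass-through columns explicitly, rather than copying everything unmapped, keeps stray source columns out of the output unless someone asks for them.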

It looks to me like Co-ops UK have altered their CSV format yet again, so I'll need to tackle that as part of DigitalCommons/coopsuk#19 first.

@ColmDC
Author

ColmDC commented Jun 7, 2023

Sounds like a plan. Good to keep in mind that this is a one-time activity, so manually changing titles, deleting columns, etc. is all fine.

@ColmDC
Author

ColmDC commented Jun 7, 2023

> It looks to me like Co-ops UK have altered their CSV format yet again, so I'll need to tackle that as part of DigitalCommons/coopsuk#19 first.

Is it just column titles, or something more? Also, this is an interim drop, so they probably didn't do many checks to see if it was consistent with the last drop. So, if you can, manually change the CSV file before changing the code to cope with the changed file.

@wu-lee

wu-lee commented Jun 13, 2023

More than just column titles, as per the issue thread linked above.

But hopefully we're mostly insulated from that by using the standard csv file. The main PITA was the loss of the "SIC Section" field, which I had to infer from "SIC Code". I see a lot of organisations don't even have a SIC Section, so I wonder how good its coverage is.
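For anyone wanting to reproduce the "SIC Section" inference, here is a minimal Python sketch. It is not necessarily how the conversion does it: it assumes the codes are UK SIC 2007 codes held as strings with any leading zero preserved, takes the first two digits as the division, and looks the division up in the published section ranges. The function name `sic_section` is hypothetical.

```python
# UK SIC 2007 sections, each with the inclusive range of two-digit
# divisions it covers. Divisions outside these ranges are unused.
SIC_SECTIONS = [
    ("A", 1, 3), ("B", 5, 9), ("C", 10, 33), ("D", 35, 35), ("E", 36, 39),
    ("F", 41, 43), ("G", 45, 47), ("H", 49, 53), ("I", 55, 56), ("J", 58, 63),
    ("K", 64, 66), ("L", 68, 68), ("M", 69, 75), ("N", 77, 82), ("O", 84, 84),
    ("P", 85, 85), ("Q", 86, 88), ("R", 90, 93), ("S", 94, 96), ("T", 97, 98),
    ("U", 99, 99),
]


def sic_section(sic_code):
    """Return the section letter for a SIC code, or None if it can't be inferred.

    Assumes the code is a string whose first two digits are the division,
    i.e. that a leading zero has not been stripped by a spreadsheet.
    """
    digits = "".join(ch for ch in str(sic_code or "") if ch.isdigit())
    if len(digits) < 2:
        return None
    division = int(digits[:2])
    for letter, first, last in SIC_SECTIONS:
        if first <= division <= last:
            return letter
    return None


print(sic_section("47110"))  # a code in division 47
print(sic_section(""))       # blank input cannot be inferred
```

Returning `None` for blank or out-of-range codes keeps the gaps visible, which matters here given how many organisations have no section at all.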

@wu-lee

wu-lee commented Jun 13, 2023

Also doing DigitalCommons/coopsuk#17 first.

@wu-lee

wu-lee commented Jun 14, 2023

I've made a best effort of mapping the data to what is described in the notes.

I did it by adapting the mapping to our standard CSV fields for CUK, which adds our taxonomies, then including some extra fields as-is from the CUK data, plus a new synthesised field, WC Legal Form.

  • WC Legal Form is generated from the directions in the notes, as best I can interpret them - see note there.
  • CUK's Simplified Sector field is absent in the new dump from them. I've added a best-effort inference, using the mappings from the old data and adding new ones where necessary.

Both these fields will need reviewing, the latter not least because it's blank in a lot of cases (as given by CUK).

The mapping from Legal Form and Organisation Type to Organisational Structure and Base Membership may need reviewing as per the instructions, which say that Portland Works and Brighton Jazz need to be reclassified. I've not done those manually for now, as I've run out of time.

There's a CSV file in here, called wc_standard.csv:

https://nextcloud.digitalcommons.coop/s/mkso8BL43pxbARK

The heading mappings used are in headings.csv (see cols G:N if you open it in a spreadsheet; the other columns map the old CUK data fields to their latest, different set).

Correction: It's "SIC Section" which is missing in the new data, not "Sector - Simplified" (which has become "Co-ops UK Industry Sector").

@wu-lee

wu-lee commented Jun 14, 2023

I've also uploaded:

  • legal-ownership-to-org-membership.csv - maps Legal Form and Ownership Type fields to Org Structure and Base Membership vocabs
  • sic-division-to-ica-activity.csv - maps SIC Division to ICA Activity
  • sic-code-to-sic-section.csv - maps SIC Code to SIC Section
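As a rough illustration of how a two-key lookup file such as legal-ownership-to-org-membership.csv could be applied, here is a minimal Python sketch. The column headings, the helper names `load_mapping` and `classify`, and the sample values are all assumptions made for the example; the real file's layout may differ, and the values shown are placeholders rather than real vocabulary terms.

```python
import csv
import io

# Assumed column headings; the uploaded file may use different ones.
KEYS = ("Legal Form", "Ownership Type")
VALUES = ("Org Structure", "Base Membership")


def load_mapping(src, key_cols, value_cols):
    """Read a mapping CSV into a dict of {key tuple: value tuple}."""
    return {
        tuple(row[c].strip() for c in key_cols):
            tuple(row[c].strip() for c in value_cols)
        for row in csv.DictReader(src)
    }


# Placeholder contents standing in for the real mapping file.
mapping_csv = io.StringIO(
    "Legal Form,Ownership Type,Org Structure,Base Membership\n"
    "Form X,Type Y,structure-1,membership-1\n"
)
mapping = load_mapping(mapping_csv, KEYS, VALUES)


def classify(record):
    """Look up one record; unmatched combinations come back blank."""
    key = tuple(record.get(c, "").strip() for c in KEYS)
    blank = ("",) * len(VALUES)
    return dict(zip(VALUES, mapping.get(key, blank)))


print(classify({"Legal Form": "Form X", "Ownership Type": "Type Y"}))
print(classify({"Legal Form": "Unknown", "Ownership Type": "Type Y"}))
```

Leaving unmatched combinations blank, rather than guessing, makes it easy to filter for the rows that still need a manual decision.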

@ColmDC
Author

ColmDC commented Jun 28, 2023

Colm to do a final review and pass over to John & Graham

@ColmDC
Author

ColmDC commented Jul 20, 2023

They are now uploading to the Civi, so can close.

@ColmDC ColmDC closed this as completed Jul 20, 2023