full FUA(s) for full algo run #38

jGaboardi · 2024-10-11T17:18:36Z

Which (if any) FUAs should we test against in our CI and where should we store the data?

1133 – Aleppo, Syria, Middle East / Asia
869 – Auckland, New Zealand, Oceania / Asia
809 – Douala, Cameroon, Africa
1656 – Liège, Belgium, Europe
4617 – Bucaramanga, Colombia, S. America
4881 – Salt Lake City, Utah, USA, N. America
8989 – Wuhan, China, Far East / Asia

martinfleis · 2024-10-11T17:36:56Z

If we want full coverage, we may even need more of those from the simplification repo. Plus potentially those that recently errored on Central Europe.

Everything apart from Wuhan takes minutes. But every single one of these triggered some fixes of the algo.

jGaboardi · 2024-10-11T18:43:39Z

So maybe let's just include them all and "suffer" from longer testing runs? It beats spending hours debugging.

The question remains: where do we store the data? We are probably looking at ≈20 MB of storage for the "original" and "known" results for the 6 FUAs (excluding Wuhan). Not the end of the world if we have copies that live here in sgeop, but does anyone have a better idea for sourcing the data directly from simplification?

martinfleis · 2024-10-11T19:50:02Z

I think that this repo should be self-sufficient so we should store our test data here. But certainly not package them.

jGaboardi · 2024-10-11T19:55:34Z

> I think that this repo should be self-sufficient so we should store our test data here. But certainly not package them.

Agreed

jGaboardi · 2024-10-11T20:32:16Z

But let's agree on where to put the data before I start on a PR for it. The small testing dataset we just merged lives in sgeop/sgeop/tests/data/*. Should these full FUA results perhaps live in a top-level directory simply named data/, like over in simplification/data/*? Though perhaps a more friendly storage and naming schema than we have over there.

anastassiavybornova · 2024-10-13T13:27:06Z

> Should these full FUA results perhaps live in a top-level directory simply named data/, like over in simplification/data/*? Though perhaps a more friendly storage and naming schema than we have over there.

it sounds good to me with sgeop/data/*/file

(where * is currently the FUA code, and if we want a more friendly naming schema - use city name instead?)
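
For illustration only, here is one way a friendlier directory name could be derived from the city names and FUA codes listed at the top of this issue. The helper `friendly_dir_name` and the exact format (ASCII, lowercase, underscores, code as suffix) are assumptions made for this sketch, not something the thread has agreed on.

```python
import unicodedata

# FUA codes and city names as listed in the opening comment of this issue.
FUAS = {
    1133: "Aleppo",
    869: "Auckland",
    809: "Douala",
    1656: "Liège",
    4617: "Bucaramanga",
    4881: "Salt Lake City",
    8989: "Wuhan",
}


def friendly_dir_name(city: str, code: int) -> str:
    """Hypothetical helper: build an ASCII, lowercase directory name."""
    # Decompose accented characters, then drop the combining marks (è -> e).
    ascii_city = (
        unicodedata.normalize("NFKD", city)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse whitespace into single underscores.
    slug = "_".join(ascii_city.lower().split())
    return f"{slug}_{code}"


# One directory name per FUA, in the order the FUAs were listed.
dir_names = [friendly_dir_name(city, code) for code, city in FUAS.items()]
```

Keeping the code in the name would preserve the link back to the FUA identifiers while still being readable at a glance.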

jGaboardi · 2024-10-13T16:19:56Z

@anastassiavybornova How about like the following (which does not include everything in the repo for brevity):

--- sgeop
      |--- ...
      |      |--- ...
      |
      |--- ci
      |      |--- ...
      |
      |--- data
      |      |--- README.md (for stuff in data dir)
      |      |--- <creation_script>.py
      |      |--- <fua_name>_<fua_number>
      |      |      |--- original.parquet
      |      |      |--- simplified.parquet
      |      |--- ...
      |
      |--- sgeop
      |      |--- ...
      |
      |--- README.md (top-dir)
      |--- ...

Here we'd have in the ./data/ directory:

a README.md + some script for (re)creating the data to assist with potential testing schema #7
1 directory for each FUA's data with an input (original.parquet) and known result (simplified.parquet)
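
As a rough sketch of how a test suite might consume this layout, the snippet below walks a data directory and collects every `<fua_name>_<fua_number>` folder that contains both parquet files. The names `FuaCase` and `discover_fua_cases` are hypothetical and only the standard library is used; nothing here is taken from the sgeop code base.

```python
from pathlib import Path
from typing import NamedTuple


class FuaCase(NamedTuple):
    """One FUA test case: its name, code, and the two parquet paths."""

    name: str
    number: int
    original: Path
    simplified: Path


def discover_fua_cases(data_dir: Path) -> list[FuaCase]:
    """Hypothetical helper: find <fua_name>_<fua_number> dirs with both files."""
    cases = []
    for sub in sorted(p for p in data_dir.iterdir() if p.is_dir()):
        # Split on the last underscore so names like "salt_lake_city" survive.
        name, sep, number = sub.name.rpartition("_")
        if not sep or not name or not number.isdigit():
            continue  # not a FUA directory under the assumed naming scheme
        original = sub / "original.parquet"
        simplified = sub / "simplified.parquet"
        # Only keep directories that hold both the input and the known result.
        if original.is_file() and simplified.is_file():
            cases.append(FuaCase(name, int(number), original, simplified))
    return cases
```

Each discovered case could then be fed to something like `pytest.mark.parametrize`, with the input read via `geopandas.read_parquet`, passed through `sgeop.simplify_network()`, and compared against the known result. How that comparison is done is left open here.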

anastassiavybornova · 2024-10-14T15:05:16Z

looks good to me! @jGaboardi

jGaboardi added the tests/CI label Oct 11, 2024

jGaboardi self-assigned this Oct 11, 2024

jGaboardi mentioned this issue Oct 11, 2024

small integrated testing data - apalachicola, fl #29

Merged

jGaboardi mentioned this issue Oct 13, 2024

testing, refactor, & docstrings #20

Closed

5 tasks

This was referenced Oct 14, 2024

Liège fails sgeop.simplify_network() – ValueError: No threshold for artifact detection found. #40

Closed

full fua testing for simplify_network() #41

Merged

jGaboardi closed this as completed in #41 Oct 16, 2024
