Add merge analyser for French Enedis power distribution poles #2435

Merged
merged 4 commits into from
Feb 3, 2025

Conversation

flacombe
Contributor

@flacombe flacombe commented Jan 26, 2025

It is proposed to add a new analyser to merge Enedis open data regarding distribution power poles in France.

As the original dataset contains more than 1.5M suitable features, a manual preprocessing step restricts it to the first 300k rows for now. It will therefore generate fewer than 300k warnings.
The dataset will be regularly updated with new features as the number of Osmose warnings goes down (this would be automated with #2035).

Covered area:
[image: map of the covered area]

This analyser is planned to be updated later with pole material, once it is available in the Enedis dataset at the end of the year.

@frodrigo
Member

frodrigo commented Jan 26, 2025 via email

@flacombe
Contributor Author

flacombe commented Jan 26, 2025

I am publishing this file out of the original one, with a select (...) limit 300000;
There is no département selection.

The sample won't change until I publish a new extract.
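
For readers who want to reproduce this kind of extract without a database, here is a minimal sketch in Python rather than the SQL `select (...) limit` actually used above. The helper name `head_rows` is hypothetical, and the sketch assumes the file has a single header line followed by one feature per line.

```python
import itertools


def head_rows(lines, limit):
    """Yield the header line, then at most `limit` data lines.

    `lines` is any iterable of text lines (for example an open file).
    Assumption: the first line is a header and every other line is one feature.
    """
    iterator = iter(lines)
    header = next(iterator, None)
    if header is None:
        # Empty input: nothing to yield.
        return
    yield header
    # islice stops after `limit` items without reading the rest of the input.
    yield from itertools.islice(iterator, limit)
```

Because the rows are taken in file order, the sample is deterministic and stays the same until a new extract is published, which matches the behaviour described above.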

@Famlam
Collaborator

Famlam commented Jan 26, 2025

Wouldn't this lead to the user thinking this is a never-ending challenge? "Finally done" "oh wait, there's a new 300000 coming up after another Osmose run". Just my thought, but for me this would probably feel identical to a challenge that has all 1.5M at once, where you can actually see the number go down with time.

Are those 300000 randomly distributed over the full area, or could it be that they're all within the same geographical area? E.g. some villages have all of them and some have none? Or are they selected per power line (so that half the poles of a line aren't missing now and the other half missing the next time)?

@flacombe
Contributor Author

flacombe commented Jan 26, 2025

"Everyone" agrees that there are approximately 12 million poles to be found in France. Even the utility operator only knows about 5 million of them so far.
We are helping it find the remainder.

So for now the counter certainly won't go down.
The whole point of #2035 is to prevent too many warnings from being raised at once in Osmose, primarily for technical and performance reasons.

I'm not able to make a clever filter, per power line or per geographical area, so at least the natural sort makes points appear from south to north.

Eventually, the analyser will cover all of Metropolitan France, once most of the poles have been found.

@flacombe
Contributor Author

The proposed analyser has been modified to use the main Enedis dataset (1.3 GB).
It has been enabled on the following areas:

  • Alpes-Maritimes
  • Bouches-du-Rhône
  • Var
  • Vaucluse
  • Gard
  • Hérault
  • Aude
  • Pyrénées-Orientales
  • Haute-Garonne

@frodrigo
Member

I was thinking about the CSV loader's fields parameter, which limits the number of columns loaded into the database: https://github.com/osm-fr/osmose-backend/blob/dev/analysers/Analyser_Merge.py#L592

Also, filter the data on Code Département using dep_code = config.options.get('dep_code') or config.options.get('country').split('-')[1]
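
To make the fallback in that expression concrete, here is a minimal standalone sketch. It is not the Osmose code: `options` is assumed to be a plain dict standing in for `config.options`, the country value is assumed to look like `FR-31`, and the helper names `resolve_dep_code` and `keep_row` are hypothetical.

```python
def resolve_dep_code(options):
    """Return the département code to filter on.

    Uses the explicit 'dep_code' option when it is set, otherwise takes the
    part after the dash in 'country' (assumed format: 'FR-31').
    """
    # `or` is lazy: the split is only evaluated when 'dep_code' is missing or empty.
    return options.get('dep_code') or options.get('country').split('-')[1]


def keep_row(row, dep_code):
    """Keep only rows whose 'Code Département' column matches dep_code.

    Assumption: `row` is a dict keyed by CSV header name.
    """
    return row.get('Code Département') == dep_code
```

With `{'country': 'FR-31'}` the resolved code is `'31'`, while an explicit `dep_code` option takes precedence over the country suffix.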

@flacombe
Contributor Author

flacombe commented Jan 30, 2025

Good points indeed. I hope they will remove all the administrative data in future releases.

@frodrigo
Member

frodrigo commented Feb 3, 2025

OK, thank you. Merged. Let's see if something explodes once deployed.

@frodrigo frodrigo merged commit a2b46f2 into osm-fr:dev Feb 3, 2025
3 checks passed
@flacombe
Contributor Author

flacombe commented Feb 3, 2025

Thank you!
You could save some space with #2438

@flacombe flacombe deleted the merge/enedis branch February 3, 2025 11:06
@frodrigo
Member

frodrigo commented Feb 7, 2025

@flacombe
Contributor Author

flacombe commented Feb 7, 2025

I see no changes yet, do you?

The current warnings come from existing merge analysers, not from Enedis.

@frodrigo
Member

frodrigo commented Feb 7, 2025

Oh. You reused an existing class id (1001...). You should use a new one, e.g. 1011...

@flacombe
Contributor Author

flacombe commented Feb 7, 2025

It's done on purpose: all Analyser_Merge_power_pole_FR_xxxx analysers give "Power pole not integrated" (1001) warnings.
The Enedis analyser gives the same warnings as the others.

@frodrigo
Member

frodrigo commented Feb 7, 2025 via email

@flacombe
Contributor Author

flacombe commented Feb 7, 2025

@frodrigo
Member

frodrigo commented Feb 7, 2025 via email

@flacombe
Contributor Author

flacombe commented Feb 7, 2025

Do you mean on the same area?

Because these sources:

  • Merge_power_pole_FR_spec_sde18 - france_centre_cher
  • Merge_power_pole_FR_gracethd3_jura - france_franche_comte_jura

currently both produce 8290/1001 warnings on two different areas and don't clear each other.

@frodrigo
Member

frodrigo commented Feb 8, 2025

Do you mean on the same area?

Yes, a class has to be unique per area. Nevertheless, it's a bad idea to reuse a class id.
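
The "unique per area" rule can be illustrated with a small sketch. This is not how Osmose itself checks anything: the tuple layout `(analyser, area, item, class_id)` and the helper name `find_class_collisions` are assumptions made only for illustration.

```python
from collections import defaultdict


def find_class_collisions(deployments):
    """Return the (area, item, class_id) keys shared by more than one analyser.

    `deployments` is an iterable of (analyser, area, item, class_id) tuples.
    This layout is hypothetical, chosen only to illustrate the rule.
    """
    seen = defaultdict(list)
    for analyser, area, item, class_id in deployments:
        seen[(area, item, class_id)].append(analyser)
    # A collision is any key that more than one analyser reports on.
    return {key: names for key, names in seen.items() if len(names) > 1}


# The two sources quoted earlier run on different areas, so they do not collide.
existing = [
    ("Merge_power_pole_FR_spec_sde18", "france_centre_cher", 8290, 1001),
    ("Merge_power_pole_FR_gracethd3_jura", "france_franche_comte_jura", 8290, 1001),
]
print(find_class_collisions(existing))
```

Under this model a nationwide analyser that reuses 1001 would collide in every area where another 8290/1001 analyser already runs, which is the argument for giving it its own class id.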

Isn't it fine? https://osmose.openstreetmap.fr/en/issues/open?item=8290

In fact the issue is that there is no merge_power_pole_FR_spec_enedis in the list

It fails with

2025-02-08 07:31:30 france_midi_pyrenees_haute_garonne : merge_power_pole_FR_spec_enedis
2025-02-08 07:31:30   error: Fails to get status from frontend: 404
2025-02-08 07:31:30   error: No remote timestamp to resume from, start a full run
2025-02-08 07:31:30   run osmosis all analyser Analyser_Merge_power_pole_FR_spec_enedis
2025-02-08 07:31:30   Analyser_Merge.py:1286 sql
2025-02-08 07:31:31   Analyser_Merge.py:871 sql
2025-02-08 07:31:31   Analyser_Merge.py:883 sql
2025-02-08 07:31:31   Load raw data into database
2025-02-08 07:31:51   Analyser_Merge.py:901 sql
2025-02-08 07:31:51   Analyser_Merge.py:906 sql
2025-02-08 07:31:55   error: error on analyse merge_power_pole_FR_spec_enedis...
2025-02-08 07:31:55     Traceback (most recent call last):
2025-02-08 07:31:55       File "/data/project/osmose/backend/./osmose_run.py", line 275, in execc
2025-02-08 07:31:55         analyser_obj.analyser()
2025-02-08 07:31:55       File "/data/project/osmose/backend/analysers/Analyser_Osmosis.py", line 321, in analyser
2025-02-08 07:31:55         self.analyser_osmosis_common()
2025-02-08 07:31:55       File "/data/project/osmose/backend/analysers/Analyser_Merge.py", line 1444, in analyser_osmosis_common
2025-02-08 07:31:55         table = super().analyser_osmosis_common()
2025-02-08 07:31:55                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-02-08 07:31:55       File "/data/project/osmose/backend/analysers/Analyser_Merge.py", line 1288, in analyser_osmosis_common
2025-02-08 07:31:55         table = self.load.run(self, self.conflate, self.config.db_user, self.__class__.__name__.lower()[15:], self.analyser_version())
2025-02-08 07:31:55                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-02-08 07:31:55       File "/data/project/osmose/backend/analysers/Analyser_Merge.py", line 1046, in run
2025-02-08 07:31:55         return super().run(osmosis, conflate, db_schema, default_table_base_name, version)
2025-02-08 07:31:55                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-02-08 07:31:55       File "/data/project/osmose/backend/analysers/Analyser_Merge.py", line 907, in run
2025-02-08 07:31:55         self.parser.import_(table, osmosis)
2025-02-08 07:31:55       File "/data/project/osmose/backend/analysers/Analyser_Merge.py", line 623, in import_
2025-02-08 07:31:55         osmosis.giscurs.copy_expert(copy, self.f)
2025-02-08 07:31:55     psycopg2.errors.BadCopyFileFormat: extra data after last expected column
2025-02-08 07:31:55     CONTEXT:  COPY power_pole_fr_spec_enedis, line 2: "11206;Limoux;200071926;CC du Limouxin;11;Aude;76;Occitanie;43.04921412635185, 2.2299138532345695;"{"..."
2025-02-08 07:31:55     

@flacombe
Contributor Author

flacombe commented Feb 9, 2025

Yes, a class has to be unique per area. Nevertheless, it's a bad idea to reuse a class id.

As this Enedis analyser is supposed to be nationwide, I've changed the classes for every analyser of item 8290 in #2444.

It fails with

I don't know how the CSV source actually works, but isn't it weird that it loads all columns from the CSV file instead of only the ones set in fields=['Code Département', 'Geo Point', 'PREC']?

Or is it an issue with the ; separator, while it expects , ?
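
Both hypotheses can be explored outside Osmose with Python's standard `csv` module. The sketch below is only an illustration, not the loader's actual code path: the sample is a shortened, partly hypothetical version of the row in the log (the header names other than 'Code Département', 'Geo Point' and 'PREC', and the PREC value, are invented), and the helper names are hypothetical.

```python
import csv
import io

# Shortened sample: only 'Code Département', 'Geo Point' and 'PREC' are real
# column names from the thread; the other headers and the PREC value are made up.
SAMPLE = (
    "Code INSEE;Commune;Code Département;Geo Point;PREC\n"
    "11206;Limoux;11;43.04921412635185, 2.2299138532345695;1\n"
)


def count_columns(text, delimiter):
    """Return the number of columns the csv module sees on each line."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [len(row) for row in reader]


def select_fields(text, fields, delimiter=';'):
    """Keep only the named columns of each row, as a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [{name: row[name] for name in fields} for row in reader]


print(count_columns(SAMPLE, ';'))  # header and data row agree
print(count_columns(SAMPLE, ','))  # the comma inside 'Geo Point' splits the data row
print(select_fields(SAMPLE, ['Code Département', 'Geo Point', 'PREC']))
```

With `;` the header and the data row have the same number of columns. With `,` the header is read as a single column while the data row is split at the comma inside 'Geo Point', which is one way a data line could end up with more columns than expected. Whether that is what happened in the failing run is not established by this thread.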


3 participants