Questions about parsl workflow: making 3D tiles #2
Deduplication of files seems to occur twice in the workflow: once when we stage the tiles initially, and again in the

There is also a comment further down the workflow about this:

# Deduplicate & make leaf 3D tiles all staged tiles (only highest
# z-level).
# TODO: COMBINE WITH STEP 2, so we only read in and deduplicate
# each staged file once.

Update: After reviewing the README for the

The centroids of the polygons are assigned to only 1 tile (a centroid is never assigned to multiple tiles, because even if it falls on a tile boundary it is assigned to the SE tile). This centroid tile assignment may differ from the polygon tile assignment if the polygon falls within 2+ tiles (these tile assignments are 2 separate properties assigned when we execute the staging step). When we execute
I'm confused about a few differences between objects in the section where we define a bunch of variables:

workflow_config = '/home/jcohen/sample_data/ingmar-config__updated.json'
logging_config = '/home/jcohen/sample_data/logging.json'
batch_size_staging = 1
batch_size_rasterization = 30
batch_size_3dtiles = 20 # leaf tiles? higher resolution, more zoomed in, which is why we process fewer of them in a batch relative to the parent tiles
batch_size_parent_3dtiles = 500
batch_size_geotiffs = 200
batch_size_web_tiles = 200
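
For context on what these batch sizes control: each value is the number of items (files or tiles) handed to a single task. A minimal sketch, assuming a hypothetical helper named `make_batches` (the workflow's own batching function may look different):

```python
def make_batches(items, batch_size):
    """Split a list into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


# Hypothetical example: 45 leaf tiles processed with batch_size_3dtiles = 20
leaf_tiles = [f"tile_{n}" for n in range(45)]
batches = make_batches(leaf_tiles, 20)
batch_lengths = [len(batch) for batch in batches]
print(batch_lengths)
```

A smaller batch size means more, shorter tasks; a batch size of 1 (as for staging here) means one input file per task.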
Short answer: Because we create 3D tiles from vector data (e.g. shapefiles or GeoPackage files). 3D tiles are just another format of vector data.

Long answer:
I tried correcting the first diagram here (please just ignore the part that says "Text" 🙃).

But your second diagram is correct! 🎉 To answer questions from the second diagram:
Here's a little bit more about how the lower z-level GeoTiffs are created during the rasterization step:
This depends on what is set for the
Yes, this is all correct, but it's important to note that this is a different case of deduplication (confusing, I know). In this case, the data is duplicated because our workflow duplicated it during staging. We duplicate it to make sure that the rasterization step has access to polygons that overlap a tile even just a little bit, otherwise there will be weird edge effects in the resulting PNGs. However, we do not want identical polygons in the resulting 3D tiles, so we ALWAYS remove them at this step. The OTHER deduplication which is configurable in the workflow is related to input files overlapping the same area. For the lakes dataset, where files overlap, the same lakes are detected twice, once in each file. Because the images that lakes are detected from differ a little, the same lakes won't give the exact same polygons, so the deduplication strategy is a little more complex. This part of the pdgstaging docs gives a detailed overview of these deduplication strategies.
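
To make the first kind of deduplication concrete, here is a minimal sketch. It assumes that a polygon duplicated across tiles during staging is kept only in the tile that contains its centroid. The property names `staging_tile` and `staging_centroid_tile` are the ones in the config shared in this thread; the tile IDs "A" and "B", the lake IDs, and the function `drop_staging_duplicates` are hypothetical, and the real implementation may differ.

```python
def drop_staging_duplicates(rows, tile_id,
                            prop_centroid_tile="staging_centroid_tile"):
    """Keep only the polygons whose centroid falls in the given tile."""
    return [row for row in rows if row[prop_centroid_tile] == tile_id]


# lake_2 overlaps tiles A and B, so staging wrote it to both tiles.
# Its centroid is assigned to tile B only.
tile_a = [
    {"id": "lake_1", "staging_tile": "A", "staging_centroid_tile": "A"},
    {"id": "lake_2", "staging_tile": "A", "staging_centroid_tile": "B"},
]
tile_b = [
    {"id": "lake_2", "staging_tile": "B", "staging_centroid_tile": "B"},
]

kept_a = drop_staging_duplicates(tile_a, "A")
kept_b = drop_staging_duplicates(tile_b, "B")
kept_ids = [row["id"] for row in kept_a + kept_b]
print(kept_ids)
```

After filtering, each polygon appears exactly once across all tiles, which is what we want for the 3D tiles.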
3D tiles are the Cesium 3D Tiles (b3dm & json); web tiles are the image tiles we create just for showing in Cesium (PNG).
Yes, the way we are making our 3D Tile tree is such that ONLY the leaf tiles have B3DM content. So we only show the Cesium 3D tiles when a user is very zoomed into the map. Parent tiles, in our case, are all JSON that references their child JSON or B3DM content.
Leaf tiles are children tiles. But we can have child tiles that are not leaf tiles. Page 2 of the Cesium 3D Tiles Reference card is a good reference here.
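
The parent/child/leaf distinction can be sketched with plain dictionaries. This is a simplified illustration only: real Cesium tilesets also need fields such as `boundingVolume` and `geometricError`, and the helper names here are hypothetical.

```python
def make_leaf(name):
    """A leaf tile: has B3DM content and no children."""
    return {"content": {"uri": f"{name}.b3dm"}}


def make_parent(children):
    """A parent tile: no content of its own, only references to children."""
    return {"children": children}


def is_leaf(tile):
    return not tile.get("children")


def walk(tile):
    """Yield a tile and all of its descendants."""
    yield tile
    for child in tile.get("children", []):
        yield from walk(child)


leaves = [make_leaf(f"leaf_{i}") for i in range(4)]
mid_a = make_parent(leaves[:2])   # child of root, but not a leaf
mid_b = make_parent(leaves[2:])   # child of root, but not a leaf
root = make_parent([mid_a, mid_b])

tiles = list(walk(root))
n_with_content = sum("content" in tile for tile in tiles)
print(len(tiles), n_with_content)
```

Here `mid_a` and `mid_b` are child tiles that are not leaf tiles, and only the four leaves carry content.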
The .B3DM tiles are created by the
@robyngit thanks for all that detailed feedback, your drawings and differentiation between the different types of deduplication and child vs leaf tiles are very helpful! Yes, now I do see the .B3DM tiles that were created by the

However, I am still working through producing the

Since I finished the rasterization step, I tried jumping straight to

Each time I try to execute
Hopefully this is helpful in learning the different parts of the script, but any time you are stuck, I'm happy to jump on a zoom call and help you debug! :)
I was able to resolve that error regarding

import viz_3dtiles
from viz_3dtiles import TreeGenerator, BoundingVolumeRegion

Just importing

Thank you for offering to zoom to debug. I might take you up on that offer later today.
Update 10/24

Creating parent geotiffs for all z-levels

Since I am not working with batches, and am instead processing all files in one batch:

for z in parent_zs:
    # Determine which tiles we need to make for the next z-level based on the
    # path names of the files just created
    child_paths = tile_manager.get_filenames_from_dir('geotiff', z=z + 1)
    parent_tiles = set()
    for child_path in child_paths:
        parent_tile = tile_manager.get_parent_tile(child_path)
        parent_tiles.add(parent_tile)
    parent_tiles = list(parent_tiles)
    # Robyn explained here that I do not run the following function in a loop, that's just if we were iterating over many batches
    create_composite_geotiffs(tiles=parent_tiles, config=workflow_config, logging_dict=logging_dict)

I am not sure if that actually created any new files. It seems that the

Create web tiles from geotiffs

rasterizer.update_ranges()  # not sure what this does
# create a file path for every .tiff within the `geotiff` dir, resulting in 7768 paths
geotiff_paths = tile_manager.get_filenames_from_dir('geotiff')

# create function for creating web tiles
def create_web_tiles(geotiff_paths, config, logging_dict=None):
    """
    Create a batch of webtiles from geotiffs (step 4)
    """
    import pdgraster
    if logging_dict:
        import logging.config
        logging.config.dictConfig(logging_dict)
    rasterizer = pdgraster.RasterTiler(config)
    return rasterizer.webtiles_from_geotiffs(
        geotiff_paths, update_ranges=False)

create_web_tiles(geotiff_paths, workflow_config, logging_dict)

This code ran fine. But similarly to the last step, it does not seem that any new files were created by this code. But it matches Robyn's

It also produced a concerning output (even though it did not error). This output was repeated many times, specifically 373 considering the length of the
This might be the reason I am not able to create a

Deduplicate and make leaf 3D tiles all staged tiles (only highest z-level)

staged_paths = stager.tiles.get_filenames_from_dir('staged')

# define the function
def create_leaf_3dtiles(staged_paths, config, logging_dict=None):
    """
    Create a batch of leaf 3d tiles from staged vector tiles
    """
    #from pdg_workflow import StagedTo3DConverter
    if logging_dict:
        import logging.config
        logging.config.dictConfig(logging_dict)
    converter3d = StagedTo3DConverter(config)
    tilesets = []
    for path in staged_paths:
        ces_tile, ces_tileset = converter3d.staged_to_3dtile(path)  # tiles3dmaker = converter3d if converter3d = StagedTo3DConverter(workflow_config)
        tilesets.append(ces_tileset)
    return tilesets

# apply the function
create_leaf_3dtiles(staged_paths=staged_paths, config=workflow_config, logging_dict=logging_dict)
While that function ran fine, it did not create a dir called

Moving on without this

Create parent cesium 3d tilesets for all z-levels (except highest):

max_z_tiles = [tile_manager.tile_from_path(path) for path in staged_paths]
# get the bounding box of each max z-level tile
max_z_bounds = [tile_manager.get_bounding_box(tile) for tile in max_z_tiles]
# get the total bounds for all the tiles
polygons = [box(bounds['left'],
                bounds['bottom'],
                bounds['right'],
                bounds['top']) for bounds in max_z_bounds]
max_z_bounds = gpd.GeoSeries(polygons, crs=tile_manager.tms.crs)
bound_volume_limit = max_z_bounds.total_bounds

# loop that reads from the 3dtiles folder that should have been created in the previous step:
for z in parent_zs:
    # Determine which tiles we need to make for the next z-level based on the
    # path names of the files just created
    all_child_paths = tiles3dmaker.tiles.get_filenames_from_dir('3dtiles', z=z + 1)
    parent_tiles = set()
    for child_path in all_child_paths:
        parent_tile = tile_manager.get_parent_tile(child_path)
        parent_tiles.add(parent_tile)
    parent_tiles = list(parent_tiles)

# define function
def create_parent_3dtiles(tiles, config, limit_bv_to=None, logging_dict=None):
    """
    Create a batch of cesium 3d tileset parent files that point to child
    tilesets
    """
    #from pdg_workflow import StagedTo3DConverter
    if logging_dict:
        import logging.config
        logging.config.dictConfig(logging_dict)
    converter3d = StagedTo3DConverter(config)
    return converter3d.parent_3dtiles_from_children(tiles, limit_bv_to)

# apply function
create_parent_3dtiles(parent_tiles, workflow_config, bound_volume_limit, logging_dict)

Output because there was no 3dstaging dir to read from:
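
Both parent loops in this update do the same thing: collect the unique parent tiles of everything at z + 1. The idea can be sketched without the library, assuming a quadtree-style tile matrix set in which each zoom level doubles the number of tiles in x and y, so the parent of (z, x, y) is (z - 1, x // 2, y // 2). This is not the implementation of `tile_manager.get_parent_tile`; the functions and tile indices below are hypothetical.

```python
def parent_of(tile):
    """Return the parent (z, x, y) of a tile in a quadtree tiling scheme."""
    z, x, y = tile
    if z == 0:
        raise ValueError("a z=0 tile has no parent")
    return (z - 1, x // 2, y // 2)


def parents_for_level(child_tiles):
    """Unique parent tiles for a collection of child tiles, sorted."""
    return sorted({parent_of(tile) for tile in child_tiles})


# Hypothetical tiles at z=11; the first two share a parent
children = [(11, 4, 6), (11, 5, 7), (11, 6, 6)]
parents = parents_for_level(children)
print(parents)
```

If no child files exist at z + 1 (for example because the `3dtiles` dir was never created), the set of parents is empty and there is nothing to build at that level.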
10/25 troubleshooting approach: use updated config file!

I was using an older version of the config file that was linked in the issue I'm following, but now I am trying the workflow with the updated config file found in Robyn's workflow. This was one of those realizations that can only happen after stepping away from the code for a night and returning in the morning.
This is a really important step that does the following:

We use the min and max pixel value for each z-level to map the entire range of values to the color palette when creating the PNG web tiles. If we were only to use the min and max within a tile, then the colors would not be mapped evenly across the layer.

This could have something to do with the error you're seeing in web tile generation:

If you ever want to know what one of the methods does, search for the method in the repo, or run:

Creating the 3D tiles is independent from creating the GeoTIFFs & webtiles. You could, for example, run the staging step and then run the 3D tiles steps, and skip rasterization altogether. So the issues with 3D tiles are unrelated to the raster code.

This sounds like

Could you share the config you're using?
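
To illustrate why the min and max are taken across a whole z-level rather than per tile, here is a minimal sketch of linear scaling. The function `normalize` and the pixel values are hypothetical, and the library may scale values differently.

```python
def normalize(value, vmin, vmax):
    """Scale value linearly to 0..1 given the range [vmin, vmax]."""
    if vmax == vmin:
        # Subtracting the two gives 0, so the value cannot be scaled
        raise ValueError("val_range has zero width")
    return (value - vmin) / (vmax - vmin)


# Hypothetical pixel values in two tiles at the same z-level
tile_1 = [1, 2, 3]
tile_2 = [10, 20, 30]

# Per-tile ranges: the brightest pixel of each tile maps to 1.0,
# even though 3 is far smaller than 30
per_tile = [normalize(max(t), min(t), max(t)) for t in (tile_1, tile_2)]

# Layer-wide range: colors are comparable across tiles
all_values = tile_1 + tile_2
low, high = min(all_values), max(all_values)
layer_wide = [normalize(max(t), low, high) for t in (tile_1, tile_2)]
print(per_tile, layer_wide)
```

The `ValueError` branch shows the degenerate case where min and max are equal, which is the situation to rule out when the ranges look wrong.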
Great, thanks Robyn! That makes sense that we want the min and max pixel values for each z-level to map the range of values with a color palette when creating the PNG web tiles. I agree that subtracting those values should not yield 0. I'm currently running the rasterization step again (after re-staging the files too). Hopefully re-generating the

Ah yes, good point that the

I did check the object

Here is the updated config I'm using:

{
"version": null,
"dir_geotiff": "/home/jcohen/lake_change_sample/geotiff",
"dir_web_tiles": "/home/jcohen/lake_change_sample/web_tiles",
"dir_3dtiles": "/home/jcohen/lake_change_sample/3dtiles",
"dir_staged": "/home/jcohen/lake_change_sample/staged",
"dir_input": "/home/jcohen/lake_change_sample/input",
"dir_footprints": "/home/jcohen/lake_change_sample/footprints",
"filename_staging_summary": "/home/jcohen/lake_change_sample/staging_summary.csv",
"filename_rasterization_events": "/home/jcohen/lake_change_sample/rasterization_events.csv",
"filename_rasters_summary": "/home/jcohen/lake_change_sample/rasters_summary.csv",
"filename_config": "/home/jcohen/lake_change_sample/config.json",
"ext_web_tiles": ".png",
"ext_input": ".shp",
"ext_staged": ".gpkg",
"ext_footprints": ".gpkg",
"prop_centroid_x": "staging_centroid_x",
"prop_centroid_y": "staging_centroid_y",
"prop_area": "staging_area",
"prop_tile": "staging_tile",
"prop_centroid_tile": "staging_centroid_tile",
"prop_filename": "staging_filename",
"prop_identifier": "staging_identifier",
"prop_centroid_within_tile": "staging_centroid_within_tile",
"input_crs": null,
"simplify_tolerance": 0.0001,
"tms_id": "WorldCRS84Quad",
"tile_path_structure": [
"style",
"tms",
"z",
"x",
"y"
],
"z_range": [
0,
11
],
"tile_size": [
256,
256
],
"statistics": [
{
"name": "polygon_count",
"weight_by": "count",
"property": "centroids_per_pixel",
"aggregation_method": "sum",
"resampling_method": "sum",
"val_range": [
0,
null
],
"nodata_val": 0,
"nodata_color": "#ffffff00",
"palette": "#d93fce",
"z_config": {
"0": {
"val_range": [
null,
4533.000000001244
]
},
"1": {
"val_range": [
null,
1520.9999999982012
]
},
"2": {
"val_range": [
null,
533.0000000026628
]
},
"3": {
"val_range": [
null,
143.0000000016093
]
},
"4": {
"val_range": [
null,
45.99999999881213
]
},
"5": {
"val_range": [
null,
17.99999999962709
]
},
"6": {
"val_range": [
null,
6.999999999865963
]
},
"7": {
"val_range": [
null,
3.9999999996430233
]
},
"8": {
"val_range": [
null,
1.9999999998981368
]
},
"9": {
"val_range": [
null,
2.0
]
},
"10": {
"val_range": [
null,
2.0
]
},
"11": {
"val_range": [
null,
1.0
]
}
}
},
{
"name": "coverage",
"weight_by": "area",
"property": "area_per_pixel_area",
"aggregation_method": "sum",
"resampling_method": "average",
"val_range": [
0,
1
],
"nodata_val": 0,
"nodata_color": "#ffffff00",
"palette": "#d93fce"
}
],
"geometricError": null,
"z_coord": 0,
"deduplicate_at": [
"raster", "3dtiles"
],
"deduplicate_method": "neighbor",
"deduplicate_keep_rules": [
[
"staging_filename",
"larger"
]
],
"deduplicate_overlap_tolerance": 0.1,
"deduplicate_overlap_both": false,
"deduplicate_centroid_tolerance": null,
"deduplicate_distance_crs": "EPSG:3857",
"deduplicate_clip_to_footprint": false,
"deduplicate_clip_method": "within"
}

The only changes I made to this were the palette in 2 places and the file paths at the top.
Thanks for sharing the config @julietcohen! This version, with the suffix

In the

The palette that you are configuring is also not valid; see the docstring in ConfigManager. Palette needs to be either the name of a color palette available in the Colormaps library or a list of color strings in any format accepted by the coloraide library. Maybe try using two colors for your palette, for example
I did use that config file you linked,

So I should use

I did try a few different palettes with 2 colors before I settled on the single color in my config file because it did not error. My syntax must have been wrong, because using 2 values produced an error, but that certainly is correct according to the documentation you linked. I'll choose a valid palette and re-run the staging and rasterization steps with the original config file. Thanks for the help, as always!
You shouldn't need to switch to the
It's interesting you didn't get an error! I would expect that you'd at least have to put your color in a list, but I haven't tested this 🤷🏻
Ohh got it. I see that by running

Perhaps it would be helpful to add in an error message if the user inputs only 1 color for the palette in the config like I had. I'll note that in the issue I made for the palette in case we want to implement it.
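
A rough sketch of what such a check could look like. The function `check_palette` is hypothetical and not part of ConfigManager. It only recognizes hex color strings, while coloraide accepts many more formats, and it does not verify palette names. The second color in the example is just an illustration.

```python
import re

# Matches #rgb, #rgba, #rrggbb and #rrggbbaa
HEX_COLOR = re.compile(
    r"^#(?:[0-9a-fA-F]{3,4}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})$")


def check_palette(palette):
    """Return a list of problems found with a palette setting."""
    if isinstance(palette, str):
        if HEX_COLOR.match(palette):
            return ["palette is a single color; use a palette name "
                    "or a list of at least 2 colors"]
        return []  # assumed to be a palette name, not verified here
    if not isinstance(palette, (list, tuple)):
        return ["palette must be a palette name or a list of colors"]
    if len(palette) < 2:
        return ["palette list needs at least 2 colors"]
    return []


single = check_palette("#d93fce")
pair = check_palette(["#ffffff", "#d93fce"])
print(single, pair)
```

A check like this would have flagged the single hex string in my config instead of letting it pass silently.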
I ended yesterday trying to produce the 3d tiles dir by running the staging, rasterization, and 3d tiles steps from start to finish using

This morning I noticed that the config file

I am wondering if the
The differences that you are seeing between the config object and the config json file are just differences in the syntax for python (

The following are how you translate between python and JSON:

If you read the JSON file into python, it will be parsed as a dict object, and all of these translations will be done for you.

import json
with open('path_to_file/ingmar-config.json', 'r') as f:
    config = json.load(f)
print(config)

For the tile size that I specified in one and not the other: 256x256 is the default, so it makes no difference here. There was no reason to specify it in the

You can see the default config options like this:

from pdgstaging import ConfigManager
print(ConfigManager.defaults)
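
A small self-contained illustration of the JSON to python translation, using a few keys and values from the config shared above (parsed from a string here rather than a file):

```python
import json

snippet = """
{
    "version": null,
    "deduplicate_overlap_both": false,
    "z_range": [0, 11],
    "tms_id": "WorldCRS84Quad"
}
"""

config = json.loads(snippet)

# JSON null  -> python None
# JSON false -> python False
# JSON array -> python list
print(config)

# And back again: python None/True become JSON null/true
print(json.dumps({"version": None, "flag": True}))
```

So `null` in the file and `None` in the printed config object are the same setting, just written in two syntaxes.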
Awesome, thanks for clarifying!