Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Give users the full process instead of just the graph #54

Closed
m-mohr opened this issue Nov 12, 2020 · 24 comments
Closed

Give users the full process instead of just the graph #54

m-mohr opened this issue Nov 12, 2020 · 24 comments
Assignees
Labels
enhancement New feature or request medium
Milestone

Comments

@m-mohr
Copy link
Member

m-mohr commented Nov 12, 2020

In general, but especially also in the QGIS plugin and the Web Editor, openEO processes in JSON need to be fully defined including their potential metadata (see example 1), i.e. it is not enough to just list the graph nodes (see example 2). Unfortunately, the R client and Python client export the graphs without their metadata. This leads reportedly to some confusion when users try to copy/paste the "incomplete" processes into the Web Editor, QGIS Plugin or the Hub for example, which then can't handle them properly. Therefore, I'd really like to see that the R and Python clients print out the full process (see example 1).

Example 1:

{
  "process_graph": {
    "1": {
      "process_id": "add",
      "arguments": {
        "x": 1,
        "y": 2
      },
      "result": true
    }
  }
}

Example 2:

{
  "1": {
    "process_id": "add",
    "arguments": {
      "x": 1,
      "y": 2
    },
    "result": true
  }
}
@flahn
Copy link
Member

flahn commented Nov 12, 2020

I think, I had implemented something like that. Just use create_user_process and set submit=FALSE. The resulting object is a ProcessInfo class which has the same representation as an openEO process.

@m-mohr
Copy link
Member Author

m-mohr commented Nov 12, 2020

Thank you, I'll look into it tomorrow, I just saw again that someone using the R client exported it "incorrectly" again. Wasn't there a graph or toJson function or something like that that may still export the old version?
Also, this is nothing I expect you to fix soon (I can't expect anything anyway ;-) ), but is more a reminder for later...

@m-mohr
Copy link
Member Author

m-mohr commented Nov 23, 2020

Okay, the issue is the following code used in many examples (e.g. GEE + EURAC):

...

final = p$save_result(data = apply_linear_transform,format = formats$output$PNG)
graph = as(final,"Graph")
graph

The graph outputted is just the "graph" (example 2), not a complete process (example 1).

Such behavior should either be changed or not be promoted in examples.

@flahn
Copy link
Member

flahn commented Nov 23, 2020

Yes, it is as I imagined, you wan't to have an openEO process (JSON) from this graph. But as the name difference suggests you have to create a process by adding metadata to the graph using create_user_process first. When you set the parameter submit=FALSE you won't do the HTTP call, but an object (a process) is returned instead. This object is list based, but the class will state that it is something process related (don't know the exact class name, right now). If you print this object, it will get a nice visualization like every other openEO process or user-defined process. To get it as JSON you would need to call jsonlite::toJSON and set force=TRUE to treat the object as a simple list which it is. Additional parameter like auto_unbox and pretty might be set as well.

@flahn
Copy link
Member

flahn commented Nov 23, 2020

@m-mohr Can you point me to the examples where it is promoted, that you can 1:1 copy the graph into the Webeditor? Then I can change this.

@m-mohr
Copy link
Member Author

m-mohr commented Nov 26, 2020

@m-mohr
Copy link
Member Author

m-mohr commented Dec 4, 2020

Actually, as(final,"Graph") was used again in the review meeting although it should be discouraged. I'd actually propose to completely remove as(final,"Graph") and replace it with something to just print the full process or let as(final,"Graph") print the full process. Just the process graph without metadata will not be useful in the openEO world and will confuse users. @MilutinMM

@MilutinMM
Copy link
Member

MilutinMM commented Dec 5, 2020

I see the point, and as(graph_test1,"Graph") is now removed from the script. Instead, I add the following code:

# create a bach job from the graph:
job_id = create_job(con = eurac, graph = graph_test1, title = "job_final_review_demo", description = "job_final_review_demo") # batch call, works on udf!
# get the JSON text of the process graph:
graph_info = create_user_process( graph_test1,id=job_id$id, submit = FALSE, con = eurac )
graph_json = jsonlite::toJSON(graph_info$process_graph)
print(graph_json)

and this is the result:

{"load_collection_ZHZZM4626F":{"process_id":["load_collection"],"arguments":{"id":["openEO_WUR_Usecase"],"spatial_extent":{"west":[-54.815],"south":[-3.515],"east":[-54.81],"north":[-3.51]},"temporal_extent":[["2017-01-01T00:00:00Z"],["2019-12-29T00:00:00Z"]],"bands":[["VH"]]}},"run_udf_ZIAXO9027Y":{"process_id":["run_udf"],"arguments":{"data":{"from_node":["load_collection_ZHZZM4626F"]},"udf":["# @require x:stars\nlibrary(bfast)\nlibrary(abind)\n# set_fast_options()\n# this works because band is a empty dimension\nx = adrop(x)\n\n# Define the pixel-wise function\nSpatialBFM = function(pixels)\n{\n  lsts = ts(pixels, c(2017, 1), frequency=30.666667)\n  bfastmonitor(lsts, 2019, formula=response~trend)$breakpoint\n}\n# either use adrop() to drop the band dimension... or include here to reduce.\n#StarsResult = st_apply(x, c(\"x\", \"y\", \"band\"), SpatialBFM, PROGRESS=TRUE)\nStarsResult = st_apply(x, c(\"x\", \"y\"), SpatialBFM, PROGRESS=TRUE)\n# deal with NA-a:\nStarsResult[is.na(StarsResult)] = -9999\n#\nStarsResult"],"runtime":["R"],"context":{}}},"save_result_VWHJC2286A":{"process_id":["save_result"],"arguments":{"data":{"from_node":["run_udf_ZIAXO9027Y"]},"format":["GTiff"],"options":{}},"result":[true]}}

then I used additional step to visualize the json in a web editor to get the following:

{
  "load_collection_ZHZZM4626F": {
    "process_id": [
      "load_collection"
    ],
    "arguments": {
      "id": [
        "openEO_WUR_Usecase"
      ],
      "spatial_extent": {
        "west": [
          -54.815
        ],
        "south": [
          -3.515
        ],
        "east": [
          -54.81
        ],
        "north": [
          -3.51
        ]
      },
      "temporal_extent": [
        [
          "2017-01-01T00:00:00Z"
        ],
        [
          "2019-12-29T00:00:00Z"
        ]
      ],
      "bands": [
        [
          "VH"
        ]
      ]
    }
  },
  "run_udf_ZIAXO9027Y": {
    "process_id": [
      "run_udf"
    ],
    "arguments": {
      "data": {
        "from_node": [
          "load_collection_ZHZZM4626F"
        ]
      },
      "udf": [
        "# @require x:stars\nlibrary(bfast)\nlibrary(abind)\n# set_fast_options()\n# this works because band is a empty dimension\nx = adrop(x)\n\n# Define the pixel-wise function\nSpatialBFM = function(pixels)\n{\n  lsts = ts(pixels, c(2017, 1), frequency=30.666667)\n  bfastmonitor(lsts, 2019, formula=response~trend)$breakpoint\n}\n# either use adrop() to drop the band dimension... or include here to reduce.\n#StarsResult = st_apply(x, c(\"x\", \"y\", \"band\"), SpatialBFM, PROGRESS=TRUE)\nStarsResult = st_apply(x, c(\"x\", \"y\"), SpatialBFM, PROGRESS=TRUE)\n# deal with NA-a:\nStarsResult[is.na(StarsResult)] = -9999\n#\nStarsResult"
      ],
      "runtime": [
        "R"
      ],
      "context": {
        
      }
    }
  },
  "save_result_VWHJC2286A": {
    "process_id": [
      "save_result"
    ],
    "arguments": {
      "data": {
        "from_node": [
          "run_udf_ZIAXO9027Y"
        ]
      },
      "format": [
        "GTiff"
      ],
      "options": {
        
      }
    },
    "result": [
      true
    ]
  }
}

@MilutinMM
Copy link
Member

MilutinMM commented Dec 5, 2020

As my main purpose was to present how a graph could look like, I find print(job_id) even better and with more relevant detail. The question is if the graph in the print(job_id) result below is ready to be copy-pasted to the web editor and QGIS:

Job ID:		949b1db4-5fad-4e8b-8ba1-aecb7b716bd2
Title:		job_final_review_demo
Description:	job_final_review_demo
Status:		created
Created:	2020-12-04T13:27:38.161Z
Updated:	2020-12-04T13:27:38.161Z
Progress:	0
Plan:		free
Costs:		
Budget:		---
User defined process:
Process:	
Summary:	
Description:	
Returns:	

Stored process graph:
{
  "run_udf_ZIAXO9027Y": {
    "process_id": "run_udf",
    "arguments": {
      "data": {
        "from_node": "load_collection_ZHZZM4626F"
      },
      "udf": "# @require x:stars\nlibrary(bfast)\nlibrary(abind)\n# set_fast_options()\n# this works because band is a empty dimension\nx = adrop(x)\n\n# Define the pixel-wise function\nSpatialBFM = function(pixels)\n{\n  lsts = ts(pixels, c(2017, 1), frequency=30.666667)\n  bfastmonitor(lsts, 2019, formula=response~trend)$breakpoint\n}\n# either use adrop() to drop the band dimension... or include here to reduce.\n#StarsResult = st_apply(x, c(\"x\", \"y\", \"band\"), SpatialBFM, PROGRESS=TRUE)\nStarsResult = st_apply(x, c(\"x\", \"y\"), SpatialBFM, PROGRESS=TRUE)\n# deal with NA-a:\nStarsResult[is.na(StarsResult)] = -9999\n#\nStarsResult",
      "context": {},
      "runtime": "R"
    }
  },
  "save_result_VWHJC2286A": {
    "result": true,
    "process_id": "save_result",
    "arguments": {
      "data": {
        "from_node": "run_udf_ZIAXO9027Y"
      },
      "format": "GTiff",
      "options": {}
    }
  },
  "load_collection_ZHZZM4626F": {
    "process_id": "load_collection",
    "arguments": {
      "temporal_extent": [
        "2017-01-01T00:00:00Z",
        "2019-12-29T00:00:00Z"
      ],
      "spatial_extent": {
        "east": -54.81,
        "south": -3.515,
        "north": -3.51,
        "west": -54.815
      },
      "id": "openEO_WUR_Usecase",
      "bands": [
        "VH"
      ]
    }
  }
} 

@m-mohr
Copy link
Member Author

m-mohr commented Dec 6, 2020

No, print() is also giving the imcomplete process (can't be visualized in the Web Editor for example), so yet another place that should be fixed.

@przell
Copy link
Member

przell commented Dec 17, 2020

@m-mohr, @flahn:
I was just talking to MeterGroup and @tim-salabim. They ran into the same problem.
They used as(xxx, "Graph") and tried to visualize in the webeditor.

  • We should keep that in mind for the R4openEO Project and put it into our requirements to fix this unambiguously
  • Metergroup also asked for process graph visualization directly in R.
    I think it would be a cool feature! To not leave the workflow and have the possibility to present the process visually to clients, your team, etc.

@m-mohr
Copy link
Member Author

m-mohr commented Dec 17, 2020

I'm working towards having the process visualizations standalone as a component that it can be integrated like the Collections component for example, but it's not an easy task as it has quite a lot of dependencies so it will take a bit more time...

@flahn
Copy link
Member

flahn commented May 13, 2021

I'm coming a bit late to the discussion, sorry. "Graph" and "Process" (user defined or predefined) are different concepts. The graph as such is just the part of the process that tells the back-end what it has to do. Process is more, it contains also meta data about the graph like name, description, etc.
To add the meta data to the graph you have to create a process, hence create_user_process. Usually that process is not really of interest for the user, because it should go straight to the back-end. But it seems that there is a bigger use case for that and I see what I can do. The best option would be indeed the webeditor components embedded in the RStudio viewer.

@przell
Copy link
Member

przell commented May 14, 2021

How is the python client handling this at the moment? @m-mohr @clausmichele

@clausmichele
Copy link
Member

I've opened an issue a couple of days ago exactly about this for the python client Open-EO/openeo-python-client#209

@m-mohr
Copy link
Member Author

m-mohr commented May 14, 2021

Both R client and Python client still work a bit like in API v0.4 and it seems both never fully adopted the new process schema internally. In 0.4 there was only process graphs without metadata, but a process graph without any metadata and without the wrapper is pretty much useless in API 1.0, so the clients should adopt that structure.

@przell
Copy link
Member

przell commented Jul 6, 2021

Just an update on this topic to pick up when we continue. On the create_user_process version there are square brackets added where there shouldn't be any.

Process graph definition:

data = p$load_collection(id = collection, 
                         spatial_extent = bbox,
                         temporal_extent = time_range, 
                         bands = bands)
result = p$save_result(data = data, format = "NetCDF")

as() ouptut:

process_graph = as(result, "Graph")
{
  "load_collection_GXORC7003Z": {
    "process_id": "load_collection",
    "arguments": {
      "id": "SENTINEL2_L1C_SENTINELHUB",
      "spatial_extent": {
        "west": 13.0974,
        "east": 13.1003,
        "south": 47.8242,
        "north": 47.8257
      },
      "temporal_extent": [
        "2018-01-01T00:00:00.000Z",
        "2018-12-31T00:00:00.000Z"
      ],
      "bands": [
        "B04"
      ]
    }
  },
  "save_result_SPVBN1400V": {
    "process_id": "save_result",
    "arguments": {
      "data": {
        "from_node": "load_collection_GXORC7003Z"
      },
      "format": "NetCDF",
      "options": {}
    },
    "result": true
  }
} 

create_user_process() output:

graph_info = create_user_process(result,
                                 id = result$id,
                                 submit = FALSE)

print(jsonlite::toJSON(graph_info, pretty = TRUE))
{
  "id": {},
  "process_graph": {
    "load_collection_GXORC7003Z": {
      "process_id": ["load_collection"],
      "arguments": {
        "id": ["SENTINEL2_L1C_SENTINELHUB"],
        "spatial_extent": {
          "west": [13.0974],
          "east": [13.1003],
          "south": [47.8242],
          "north": [47.8257]
        },
        "temporal_extent": [
          ["2018-01-01T00:00:00.000Z"],
          ["2018-12-31T00:00:00.000Z"]
        ],
        "bands": [
          ["B04"]
        ]
      }
    },
    "save_result_SPVBN1400V": {
      "process_id": ["save_result"],
      "arguments": {
        "data": {
          "from_node": ["load_collection_GXORC7003Z"]
        },
        "format": ["NetCDF"],
        "options": {}
      },
      "result": [true]
    }
  },
  "parameters": [],
  "returns": {
    "schema": {
      "type": ["boolean"],
      "subtype": [],
      "pattern": [],
      "parameters": [],
      "items": {
        "type": {}
      },
      "minItems": [],
      "maxItems": []
    }
  }
} 

@flahn
Copy link
Member

flahn commented Jul 6, 2021

@przell, you can use the parameter auto_unbox=TRUE to get rid of the brackets, when using toJSON

@m-mohr m-mohr added enhancement New feature or request high medium and removed high labels Sep 30, 2021
@m-mohr m-mohr added this to the v1.1.0 / CRAN milestone Oct 7, 2021
@flahn
Copy link
Member

flahn commented Oct 8, 2021

As for now, I have implemented the as(x,"Process") function to create a simple user defined process representation. The returned object is of type "Process", but the underlying structure of predefined and user defined process are the same. You can print that object now to print the JSON object.

As a deprecation note: now that we can obtained a simple UDP on the client by coercion, we don't necessarily need the parameter submit in create_user_process any longer. It will still work, but I have added a note, when using the function to obtain a Process object.

@m-mohr
Copy link
Member Author

m-mohr commented Oct 8, 2021

Great! Could we also deprecate (and print a warning?) for as(x,"Graph")?

@flahn
Copy link
Member

flahn commented Oct 14, 2021

I advise against it, because i use this internally to coerce into the process graph part of a UDP. I cannot differentiate if the user called the function explicitly or if it was called from an internal function, which would be annoying to read. However, I removed examples in the code documentation which used as(x,"Graph") and pointed out that this might be revoked from the function export of the package (087c9f3).

@m-mohr
Copy link
Member Author

m-mohr commented Oct 14, 2021

Okay, then a warning is not a good idea, but you could still deprecate it in the docs, right?

@flahn
Copy link
Member

flahn commented Oct 14, 2021

Kind of. I haven't found any option to denote deprecation in the documentation. That is, what I did

\code{\link{as.Process}}. This function might be removed from the package function export in the future.

@flahn
Copy link
Member

flahn commented Oct 22, 2021

I removed printing the process graph in UDPs and services. Instead I advertise / encourage the user to coerce it into a Process. Printing the process will then of course show a valid process graph. I hope that I have eliminated now all potential misuses. If not feel free to open this issue again or create a new one ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request medium
Projects
None yet
Development

No branches or pull requests

5 participants