
Dev/gfql endpoint #615

Open · lmeyerov wants to merge 16 commits into master

Conversation

lmeyerov (Contributor) commented Nov 28, 2024:

Some very exciting things happening here -- maybe time to call pygraphistry 1.x.x ??

Remote dataset binding

Bind to remote (lazy):

g1 = graphistry.bind(dataset_id="abc123")

Decouple upload from plot(), and bind the upload:

g_uploaded = g_unuploaded.upload()
print(g_uploaded._dataset_id)
print(g_uploaded._nodes_file_id)
print(g_uploaded._edges_file_id)
print(g_uploaded._url)
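
A minimal sketch of the round trip this enables, for illustration: persist the id from an upload, then lazily re-bind in a later session instead of re-uploading (the variable names are illustrative):

import graphistry

dataset_id = g_uploaded._dataset_id  # from the upload above

# ... later, or in another session ...
g_later = graphistry.bind(dataset_id=dataset_id)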

Remote GFQL

# + auto-uploads if not already
g2 = g1.chain_remote([....], engine='gpu')

# no need to download results if you just care whether matched vs not
df_meta = g1.chain_remote_shape([...])
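
A concrete sketch of the two calls above, assuming chain_remote accepts the same operator list as local chain(); the node filters are illustrative:

from graphistry import n, e_forward

# Two-hop pattern run server-side on GPU (filters illustrative)
g2 = g1.chain_remote(
    [n({"type": "user"}), e_forward(), n({"flagged": True})],
    engine='gpu',
)

# Metadata-only variant: check how much matched without downloading results
df_meta = g1.chain_remote_shape([n({"type": "user"}), e_forward(), n()])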

Remote Python too!

json_obj = g1.python_remote("""
from graphistry import Plottable

def task(g: Plottable):
  return {'status': True}
""")

Notes

Changes

  • Now tracks dataset_id, nodes_file_id, edges_file_id, and url, including clearing them out whenever the nodes/edges get re-bound with new data

  • plot(render=...) expanded from bool to Union[bool, Literal["auto", "g", "ipython", "databricks", "browser"]] to make rendering behavior more predictable (sketch below)
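
A hedged sketch of the expanded render= values for some Plottable g, assuming the bool cases keep their old meaning:

g.plot(render=False)        # don't render; return the viz URL
g.plot(render="auto")       # pick the best target for the environment
g.plot(render="browser")    # force opening a local browser tab
g.plot(render="ipython")    # force inline IPython/Jupyter display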

Review thread on a docstring:

:param api_token: Optional JWT token. If not provided, refreshes JWT and uses that.
:type api_token: Optional[str]

:param dataset_id: Optional dataset_id. If not provided, will upload current data, store that dataset_id, and run GFQL against that.
Contributor:

comment references gfql here

Contributor:

Also could clarify that if a dataset_id exists on the Plottable it will be reused, as opposed to re-uploading every time python_remote is called with it omitted.

lmeyerov (Contributor, Author):

So this is a bit problematic in that this would trigger re-upload, and I'm not sure that's 'right':

g2 = g1....

# auto-upload 1
res1 = g2.python_remote(script1)

# auto-upload 2
res2 = g2.python_remote(script2)

A few ideas:

  • Inplace update of g2, breaking the purity of g3 = g2.xyz()
  • Some sort of global memoization trick, so even if g2._dataset_id is not mutated on first upload, we can detect that g2 did have a recent upload. Ex: some sort of global weak-reference lookup table of the last n g objects (see the sketch after this list)?
  • Don't worry about it, and encourage users to do an explicit g2 = g1.upload()...
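
A minimal sketch of the weak-reference idea, assuming Plottable instances are hashable and weak-referenceable; the cache and helper names here are hypothetical, not part of the PR:

import weakref

# Hypothetical module-level cache: maps a Plottable to its dataset_id
# without mutating it; entries vanish once the object is garbage-collected
_upload_cache: "weakref.WeakKeyDictionary" = weakref.WeakKeyDictionary()

def _remember_upload(g, dataset_id: str) -> None:
    _upload_cache[g] = dataset_id

def _lookup_upload(g):
    # Returns the recorded dataset_id, or None if g was never uploaded
    return _upload_cache.get(g)

One nice property: since re-binding nodes/edges returns a fresh object, a rebound g naturally misses the cache and triggers a new upload, which matches the clearing behavior described in the Changes above.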

mj3cheun (Contributor) commented Nov 29, 2024:

Didn't consider this... this can't happen in python_remote because that never modifies the Plottable, but chain_remote definitely does.

My opinion is that perhaps we should avoid self-mutation of the Plottable and instead return a "clone" (with identical metadata, etc.) of the original Plottable with the updated node and edge lists? This should also better mirror the behaviour of non-remote chain anyway (see the sketch below).

I don't have strong opinions on auto-uploading the clone.
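
A minimal sketch of that clone-and-return shape, assuming the usual immutable Plottable API where nodes()/edges() return copies; _fetch_remote_result is a hypothetical stand-in for the endpoint call:

from typing import Any, List, Tuple
import pandas as pd

def _fetch_remote_result(dataset_id: str, ops: List[Any], engine: str) -> Tuple[pd.DataFrame, pd.DataFrame]:
    # Hypothetical stand-in for the GFQL endpoint round trip
    raise NotImplementedError

def chain_remote_clone(g, ops: List[Any], engine: str = 'gpu'):
    nodes_df, edges_df = _fetch_remote_result(g._dataset_id, ops, engine)
    # Build a new Plottable carrying the results; g itself is left
    # unmodified, mirroring the behavior of local chain()
    return g.nodes(nodes_df).edges(edges_df)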

lmeyerov (Contributor, Author):

Does the python endpoint return a Plottable, JSON, or any? How would I know ahead of time, or how would the python client sniff?

mj3cheun (Contributor) commented Nov 29, 2024:

For now the python endpoint returns either a string or JSON. Which one it returns depends entirely on the return type of the function defined (by the user) in execute. The python endpoint code will detect the type of the returned value and use the appropriate response format.

You can read the MIME type of the response to tell which.
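
A minimal client-side sniffing sketch, assuming an HTTP response from the python endpoint; the URL and payload shape here are illustrative, not the real API:

import requests

resp = requests.post(
    "https://hub.graphistry.com/api/...",  # illustrative endpoint
    json={"execute": "..."},               # illustrative payload
)
content_type = resp.headers.get("Content-Type", "")

if content_type.startswith("application/json"):
    result = resp.json()  # the user's function returned a JSON-able value
else:
    result = resp.text    # the user's function returned a string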

lmeyerov (Contributor, Author) commented Nov 29, 2024:

Hmm, it might not be so bad to extend the python endpoint to reuse the gfql endpoint's machinery here, I'll take a look.

lmeyerov (Contributor, Author):

(mimetype is cool, that helps!)

lmeyerov (Contributor, Author) commented Nov 29, 2024:

I agree on avoiding mutation of the base obj, maybe we do some variants like:

# generic helper
_ : Any = g1.remote_python("...", output_mode=...)

# explicit mypy-friendly
g2 = g1.remote_python_g("...")
g2, g1_bound = g1.remote_python_g2("...")
o = g1.remote_python_json("...")
o, g1_bound = g1.remote_python_json2("...")

I'm tempted to not do the <xyz>2 variants above, and if someone wants to reuse uploads, they have to be explicit:

g1 = g0.upload(...)
# or g1 = graphistry.bind(dataset_id="...")

g2 = g1.remote_python_g(...)
