-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add compatibility to work with py2neo version 3.1.2 #6
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution! I like the elegance of your code. I requested some changes, but overall this is a great enhancement to hetio.
I also added a progress bar to the neo4j export process, which allows me to watch how things are going from the command line. This option is off by default in order to ensure it does not break your previous work.
Cool that progress bar will be really helpful!
hetio/neo4j.py
Outdated
@@ -71,7 +101,13 @@ def append(self, x): | |||
def create(self): | |||
if not self: | |||
return | |||
self.db_graph.create(*self) | |||
|
|||
# http://stackoverflow.com/questions/37676731/typeerror-create-takes-2-positional-arguments-but-4-were-given |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indent
hetio/neo4j.py
Outdated
|
||
# http://stackoverflow.com/questions/37676731/typeerror-create-takes-2-positional-arguments-but-4-were-given | ||
if PY2NEO_VER == 3: | ||
self.db_graph.create(union(self)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's put
union = lambda v: reduce(or_, v)
under this if statement, since it's not really needed anywhere else
hetio/neo4j.py
Outdated
|
||
queue = graph.get_nodes() | ||
if show_progress: | ||
total_nodes = hetio.stats.get_metanode_df(graph)["nodes"].sum() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try to remove the hetio.stats
import:
total_nodes = len(graph.node_dict)
hetio/neo4j.py
Outdated
|
||
queue = graph.get_edges(exclude_inverts=True) | ||
if show_progress: | ||
total_edges = hetio.stats.get_metaedge_df(graph)["edges"].sum() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could wrap graph.get_edges(exclude_inverts=True)
in list()
. Then you could do total_edges = len(queue)
. I don't think it's likely that this list will push us over a memory limit in any real setting.
On second thought, perhaps I should add an n_nodes
and n_edges
attribute to graphs, which get updated when nodes or edges are created. What do you prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In 903a4c8, I added n_nodes
and n_edges
as graph attributes. Therefore, you should be able to do total_edges = graph.n_edges
(and similar for nodes).
You would have to fetch the upstream changes and rebase your branch on them. Then force push. Do you know how to do that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I will figure it out.
setup.py
Outdated
with open('README.rst') as read_file: | ||
long_description = read_file.read() | ||
#with open('README.rst') as read_file: | ||
# long_description = read_file.read() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for commenting this out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah README.rst
wasn't part of the repo. Added it, if you rebase, you should be good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't worry about these commented lines. See #7. Will delete them after we merge this PR to avoid conflicts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah doing a pip install -e .
for the repo was giving me an error with README.rst
, so I just commented it out to get it to install correctly.
hetio/neo4j.py
Outdated
@@ -8,14 +8,28 @@ | |||
import py2neo.packages.httpstream | |||
import pandas | |||
|
|||
from tqdm import tqdm | |||
from functools import reduce |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The functools and operator imports should go in the first group of standard library imports. tqdm
import should go with second group (third party imports). See https://www.python.org/dev/peps/pep-0008/#imports.
hetio/neo4j.py
Outdated
self.db_graph.create(*self) | ||
|
||
# http://stackoverflow.com/questions/37676731/typeerror-create-takes-2-positional-arguments-but-4-were-given | ||
if PY2NEO_VER == 3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps if PY2NEO_VER >= 3
makes a little more sense, assuming v4 would be more like 3 than 2
hetio/neo4j.py
Outdated
"""Export hetnet to neo4j""" | ||
db_graph = py2neo.Graph(uri) | ||
|
||
if PY2NEO_VER == 2: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we change the if statement to evaluate isinstance(uri, py2neo.Graph)
rather than to evaluate the version? This way for py2neo 2 you could pass a py2neo.Graph
. And I'm hopeful that https://github.com/technige/py2neo/issues/605 will eventually be fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, that's an easy fix I can make. I guess it might just confuse other people if they try to use the py2neo 3 URI resolver with bolt and have difficulties with getting it working.
@veleritas I'd love to get your contributions merged. Let me know if you want me to pick up where you've left off on this PR. But if you have the time, I think everything left to do should be pretty quick to implement. |
Yes, sorry, I've been busy, but will implement the changes you mentioned this week. |
Tqdm progress bars will be used to monitor Neo4j network import progress.
Add tqdm progress bars to the node and edge importing steps.
@veleritas this is available in v0.2.2, but I'd recommend v0.2.3 which fixes a PyPI formatting issue. Thanks for the contribution, you're now an author on the Zenodo https://zenodo.org/record/345182 |
Hi Daniel,
I've finished making my changes to the
hetio
package to make it compatible with py2neo version 3.1.2. I didn't actually have to end up changing too much in the package. The only real difference between py2neo versions is that when nodes and edges are added to neo4j, they are done as a single subgraph object, and not as multiple objects.I also added a progress bar to the neo4j export process, which allows me to watch how things are going from the command line. This option is off by default in order to ensure it does not break your previous work. In that regard I have also tested things with your original code, and things worked fine.
I have only modified the
neo4j.py
file, and found that no other changes inhetio
were needed to make the rest of the pipeline work. All other changes were done in the respectiveintegrate
andlearn
repositories.One thing you might notice is that the
uri
variable passed toexport_neo4j()
is actually expected to be apy2neo.Graph
object, but only if using py2neo version 3. In my testing it seemed that I could not get the py2neo URI resolver to work with the new Bolt protocol. Instead I had to open my own py2neo connection explicitly, and that seems to work. I believe the problem is that both the Bolt and HTTP ports have to be open, and on different ports, and the URI resolver might not support doing that.Let me know if you have any questions!
Toby