fix: Spark loader Task not serializable #471

Merged
merged 14 commits into from
May 22, 2023
@@ -77,7 +77,7 @@ public class HugeGraphSparkLoader implements Serializable {
private final LoadOptions loadOptions;
private final Map<ElementBuilder, List<GraphElement>> builders;

- private final ExecutorService executor;
+ private final transient ExecutorService executor;
Member

Is this the root cause of the problem? And how can we verify that?

Contributor Author

Thanks. I think the root cause is that we use fields of HugeGraphSparkLoader, such as loadOptions and builders, when building the vertex and edge data. Because of that, Spark has to ship the entire HugeGraphSparkLoader object over the network, which requires both HugeGraphSparkLoader itself and all of its fields to be serializable. ExecutorService does not implement Serializable, so Spark throws a "Task not serializable" error. Marking the executor field as transient excludes it when HugeGraphSparkLoader is serialized. That is safe because the executor is not needed while the vertices and edges are being built after the job is submitted. So I think adding transient solves the problem.
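
To illustrate the reasoning above, here is a minimal sketch using plain Java serialization, without Spark. The classes BrokenLoader and FixedLoader, the class TransientDemo, and the helper methods are hypothetical stand-ins, not code from this PR. The only difference between the two loaders is the transient modifier on the executor field.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TransientDemo {

    // Hypothetical stand-in for the loader before the fix:
    // the non-serializable executor is part of the serialized state.
    static class BrokenLoader implements Serializable {
        private static final long serialVersionUID = 1L;
        final String options;
        final ExecutorService executor = Executors.newSingleThreadExecutor();

        BrokenLoader(String options) {
            this.options = options;
        }
    }

    // Hypothetical stand-in for the loader after the fix:
    // transient excludes the executor from the serialized state.
    static class FixedLoader implements Serializable {
        private static final long serialVersionUID = 1L;
        final String options;
        final transient ExecutorService executor = Executors.newSingleThreadExecutor();

        FixedLoader(String options) {
            this.options = options;
        }
    }

    // Serialize an object to a byte array with standard Java serialization.
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Rebuild an object from the bytes produced by serialize().
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Returns false if some reachable field is not serializable.
    static boolean canSerialize(Object obj) throws IOException {
        try {
            serialize(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        BrokenLoader broken = new BrokenLoader("opts");
        FixedLoader fixed = new FixedLoader("opts");

        System.out.println("broken loader serializable: " + canSerialize(broken));
        System.out.println("fixed loader serializable: " + canSerialize(fixed));

        // The transient field is not restored; it comes back as null.
        FixedLoader copy = (FixedLoader) deserialize(serialize(fixed));
        System.out.println("options after round trip: " + copy.options);
        System.out.println("executor after round trip: " + copy.executor);

        broken.executor.shutdown();
        fixed.executor.shutdown();
    }
}
```

In this sketch the deserialized copy has a null executor, so the approach only holds as long as the code that runs on the receiving side never touches that field, which matches the assumption stated in the comment above.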

Member

Makes sense.


public static void main(String[] args) {
HugeGraphSparkLoader loader;