
[BUG]: Trying to follow the "Getting Started" guide step by step #49

Closed
jalchr opened this issue Apr 25, 2019 · 26 comments
Labels
bug: Something isn't working
duplicate: This issue or pull request already exists

Comments

@jalchr

jalchr commented Apr 25, 2019

Describe the bug
I was following the Getting Started guide step by step. When I execute the following:

C:\Users\j.shaer\source\repos\HelloSpark\HelloSpark\bin\Debug\netcoreapp2.1>spark-submit 
`--class org.apache.spark.deploy.DotnetRunner ` --master local ` microsoft-spark-2.4.x-0.1.0.jar ` HelloSpark

I get this:

Exception in thread "main" org.apache.spark.SparkException: Cannot load main class from JAR file:/C:/Users/j.shaer/source/repos/HelloSpark/HelloSpark/bin/Debug/netcoreapp2.1/%60--class
        at org.apache.spark.deploy.SparkSubmitArguments.error(SparkSubmitArguments.scala:657)
        at org.apache.spark.deploy.SparkSubmitArguments.loadEnvironmentArguments(SparkSubmitArguments.scala:221)
        at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:116)
        at org.apache.spark.deploy.SparkSubmit$$anon$2$$anon$3.<init>(SparkSubmit.scala:911)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.parseArguments(SparkSubmit.scala:911)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:81)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
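
Note: the %60 in the path above is the URL-encoded backtick, so cmd.exe passed the literal backtick-prefixed token to spark-submit as if it were the application JAR. Backtick line continuation only works in PowerShell, and only when the backtick is the last character on the line. A sketch of the same submission without that problem, reusing the jar and app name from above:

spark-submit --class org.apache.spark.deploy.DotnetRunner --master local microsoft-spark-2.4.x-0.1.0.jar HelloSpark

or, in PowerShell, with each continuation backtick ending its line:

spark-submit `
    --class org.apache.spark.deploy.DotnetRunner `
    --master local `
    microsoft-spark-2.4.x-0.1.0.jar `
    HelloSpark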
@jalchr jalchr added the bug Something isn't working label Apr 25, 2019
@jalchr
Author

jalchr commented Apr 25, 2019

I ran the code again, but without the backticks:

C:\Users\j.shaer\source\repos\HelloSpark\HelloSpark\bin\Debug\netcoreapp2.1>
spark-submit --class org.apache.spark.deploy.DotnetRunner --master local microsoft-spark-2.4.x-0.1.0.jar HelloSpark

Now, I'm getting this:

19/04/25 12:35:16 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/internal/Logging$class
        at org.apache.spark.deploy.DotnetRunner$.<init>(DotnetRunner.scala:34)
        at org.apache.spark.deploy.DotnetRunner$.<clinit>(DotnetRunner.scala)
        at org.apache.spark.deploy.DotnetRunner.main(DotnetRunner.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.internal.Logging$class
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 15 more
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

@jalchr jalchr changed the title [BUG]: Cannot load main class from JAR [BUG]: Trying to follow the "Getting Started" guide step by step Apr 25, 2019
@GoEddie
Contributor

GoEddie commented Apr 25, 2019

Were you able to run spark-shell?

@marcoparenzan

marcoparenzan commented Apr 25, 2019

I have the same issue. And please note that the jar to be used is 0.1.0, not 1.0.0-alpha (microsoft-spark-2.4.x-0.1.0.jar).

@imback82
Contributor

It appears that you are using Spark 2.4.2, which is not supported yet. Can you please try either 2.4.0, 2.4.1, or 2.3.*? More info: #43.
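
If you want to double-check which Spark version is actually on the PATH, spark-submit can report it:

spark-submit --version

The banner it prints includes the Spark version (e.g. 2.4.2) and the Scala version it was built against.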

@marcoparenzan

@imback82 Yes, it works! I used 2.3.3. Thanks!

@marcoparenzan

After the result is computed, I get this error:
2019-04-25 14:56:48 INFO DAGScheduler:54 - Job 23 finished: showString at NativeMethodAccessorImpl.java:0, took 1.502159 s
+----+-------+----+-------+
| age| name| age| name|
+----+-------+----+-------+
|null|Michael|null|Michael|
| 30| Andy| 30| Andy|
| 19| Justin| 19| Justin|
+----+-------+----+-------+

2019-04-25 14:56:48 INFO AbstractConnector:318 - Stopped Spark@38ef1355{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2019-04-25 14:56:48 INFO SparkUI:54 - Stopped Spark web UI at http://mpdatascience.nr4hyhiis0zenpze54untwy3dd.bx.internal.cloudapp.net:4040
2019-04-25 14:56:48 INFO MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2019-04-25 14:56:48 INFO MemoryStore:54 - MemoryStore cleared
2019-04-25 14:56:48 INFO BlockManager:54 - BlockManager stopped
2019-04-25 14:56:48 INFO BlockManagerMaster:54 - BlockManagerMaster stopped
2019-04-25 14:56:48 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2019-04-25 14:56:48 WARN SparkEnv:87 - Exception while deleting Spark temp dir: C:\Users\trainer\AppData\Local\Temp\2\spark-13915bb9-62e2-4e4e-a933-6a63400131f5\userFiles-598c330c-29da-4123-8859-e44f9fe0e936
java.io.IOException: Failed to delete: C:\Users\trainer\AppData\Local\Temp\2\spark-13915bb9-62e2-4e4e-a933-6a63400131f5\userFiles-598c330c-29da-4123-8859-e44f9fe0e936
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1074)
at org.apache.spark.SparkEnv.stop(SparkEnv.scala:103)
at org.apache.spark.SparkContext$$anonfun$stop$11.apply$mcV$sp(SparkContext.scala:1947)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1361)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1946)
at org.apache.spark.sql.SparkSession.stop(SparkSession.scala:712)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.api.dotnet.DotnetBackendHandler.handleMethodCall(DotnetBackendHandler.scala:162)
at org.apache.spark.api.dotnet.DotnetBackendHandler.handleBackendRequest(DotnetBackendHandler.scala:102)
at org.apache.spark.api.dotnet.DotnetBackendHandler.channelRead0(DotnetBackendHandler.scala:29)
at org.apache.spark.api.dotnet.DotnetBackendHandler.channelRead0(DotnetBackendHandler.scala:24)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:138)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
2019-04-25 14:56:48 INFO SparkContext:54 - Successfully stopped SparkContext
2019-04-25 14:56:48 INFO DotnetRunner:54 - Closing DotnetBackend
2019-04-25 14:56:48 INFO DotnetBackend:54 - Requesting to close all call back sockets
2019-04-25 14:56:48 INFO DotnetRunner:54 - .NET application exited successfully
2019-04-25 14:56:51 INFO ShutdownHookManager:54 - Shutdown hook called
2019-04-25 14:56:51 INFO ShutdownHookManager:54 - Deleting directory C:\Users\trainer\AppData\Local\Temp\2\spark-02e6417e-88f2-4134-bee4-ccdfab21420b
2019-04-25 14:56:51 INFO ShutdownHookManager:54 - Deleting directory C:\Users\trainer\AppData\Local\Temp\2\spark-13915bb9-62e2-4e4e-a933-6a63400131f5
2019-04-25 14:56:51 ERROR ShutdownHookManager:91 - Exception while deleting Spark temp dir: C:\Users\trainer\AppData\Local\Temp\2\spark-13915bb9-62e2-4e4e-a933-6a63400131f5
java.io.IOException: Failed to delete: C:\Users\trainer\AppData\Local\Temp\2\spark-13915bb9-62e2-4e4e-a933-6a63400131f5
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1074)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:65)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:62)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:62)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1992)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
2019-04-25 14:56:51 INFO ShutdownHookManager:54 - Deleting directory C:\Users\trainer\AppData\Local\Temp\2\spark-13915bb9-62e2-4e4e-a933-6a63400131f5\userFiles-598c330c-29da-4123-8859-e44f9fe0e936
2019-04-25 14:56:51 ERROR ShutdownHookManager:91 - Exception while deleting Spark temp dir: C:\Users\trainer\AppData\Local\Temp\2\spark-13915bb9-62e2-4e4e-a933-6a63400131f5\userFiles-598c330c-29da-4123-8859-e44f9fe0e936
java.io.IOException: Failed to delete: C:\Users\trainer\AppData\Local\Temp\2\spark-13915bb9-62e2-4e4e-a933-6a63400131f5\userFiles-598c330c-29da-4123-8859-e44f9fe0e936
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1074)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:65)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:62)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:62)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1992)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

@imback82
Contributor

This is a known issue in Spark: https://issues.apache.org/jira/browse/SPARK-12216

@marcoparenzan

Great! Thanks!

@amgadmadkour

amgadmadkour commented Apr 25, 2019

One temporary workaround to avoid seeing the Spark temp-file deletion error is to add the following two lines to the log4j.properties file of your Spark installation:

log4j.logger.org.apache.spark.util.ShutdownHookManager=OFF
log4j.logger.org.apache.spark.SparkEnv=ERROR

Source: Stack Overflow

@GoEddie
Contributor

GoEddie commented Apr 25, 2019

Just to help others: in your Spark directory there is a conf directory; add those two lines to the log4j.properties file there. If there is no log4j.properties, there should be a log4j.properties.template: copy it, remove the .template extension, then add those lines at the top and the error will be hidden.
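
On Windows that amounts to something like the following, assuming SPARK_HOME points at your Spark installation (otherwise substitute the full path):

copy %SPARK_HOME%\conf\log4j.properties.template %SPARK_HOME%\conf\log4j.properties

Then open the copied file and add the two log4j.logger lines from the comment above at the top.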

@imback82
Contributor

@jalchr I am closing this as a duplicate of #48. Thanks for reporting!

@imback82 imback82 added the duplicate This issue or pull request already exists label Apr 26, 2019
@tang2087

tang2087 commented Apr 26, 2019

I got the same issue and found out it was because I was using Spark 2.4.2.
After I changed to Spark 2.4.1, it worked.
Please refer to my post below for the details, with some examples, e.g. reading/writing Parquet files and reading from HDFS/Hive using C#:
.NET for Apache Spark Preview with Examples

@imback82
Contributor

Wow, thanks for the nice blog @FahaoTang! (BTW, you may want to fix the link in your comment.)

@tang2087

> Wow, thanks for the nice blog @FahaoTang! (BTW, you may want to fix the link in your comment.)

Thanks for the reminder. I just fixed the link. :)

@jalchr
Author

jalchr commented Apr 29, 2019

@imback82 I installed Spark v2.4.1 ... yes, it is running. But I'm still getting this:
Cannot run program ".\HelloSpark": CreateProcess error=2, The system cannot find the file specified

Microsoft Windows [Version 10.0.17763.437]
(c) 2018 Microsoft Corporation. All rights reserved.

C:\Users\j.shaer\source\repos\HelloSpark\HelloSpark\bin\Debug\netcoreapp2.1>spark-submit --class org.apache.spark.deploy.DotnetRunner --master local microsoft-spark-2.4.x-0.1.0.jar HelloSpark
19/04/29 11:50:30 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/04/29 11:50:30 INFO DotnetRunner: Starting DotnetBackend with HelloSpark.
19/04/29 11:50:30 INFO DotnetRunner: Port number used by DotnetBackend is 49958
19/04/29 11:50:30 INFO DotnetRunner: Adding key=spark.jars and value=file:/C:/Users/j.shaer/source/repos/HelloSpark/HelloSpark/bin/Debug/netcoreapp2.1/microsoft-spark-2.4.x-0.1.0.jar to environment
19/04/29 11:50:30 INFO DotnetRunner: Adding key=spark.app.name and value=org.apache.spark.deploy.DotnetRunner to environment
19/04/29 11:50:30 INFO DotnetRunner: Adding key=spark.submit.deployMode and value=client to environment
19/04/29 11:50:30 INFO DotnetRunner: Adding key=spark.master and value=local to environment
19/04/29 11:50:30 ERROR DotnetRunner: Cannot run program ".\HelloSpark": CreateProcess error=2, The system cannot find the file specified
 [Ljava.lang.StackTraceElement;@4b213651
19/04/29 11:50:30 INFO ShutdownHookManager: Shutdown hook called
19/04/29 11:50:30 INFO ShutdownHookManager: Deleting directory C:\Users\j.shaer\AppData\Local\Temp\spark-6b79c40a-b20d-425c-8c8c-60e98b1f2f1d

C:\Users\j.shaer\source\repos\HelloSpark\HelloSpark\bin\Debug\netcoreapp2.1>

@jalchr
Author

jalchr commented Apr 29, 2019

Interesting ... I added the extension, using HelloSpark.dll, and now I'm getting this:

C:\Users\j.shaer\source\repos\HelloSpark\HelloSpark\bin\Debug\netcoreapp2.1>spark-submit --class org.apache.spark.deploy.DotnetRunner --master local microsoft-spark-2.4.x-0.1.0.jar HelloSpark.dll
19/04/29 11:57:19 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/04/29 11:57:19 INFO DotnetRunner: Starting DotnetBackend with HelloSpark.dll.
19/04/29 11:57:19 INFO DotnetRunner: Port number used by DotnetBackend is 50019
19/04/29 11:57:19 INFO DotnetRunner: Adding key=spark.jars and value=file:/C:/Users/j.shaer/source/repos/HelloSpark/HelloSpark/bin/Debug/netcoreapp2.1/microsoft-spark-2.4.x-0.1.0.jar to environment
19/04/29 11:57:19 INFO DotnetRunner: Adding key=spark.app.name and value=org.apache.spark.deploy.DotnetRunner to environment
19/04/29 11:57:19 INFO DotnetRunner: Adding key=spark.submit.deployMode and value=client to environment
19/04/29 11:57:19 INFO DotnetRunner: Adding key=spark.master and value=local to environment

Unhandled Exception: System.IO.FileNotFoundException: Could not load file or assembly 'System.Runtime, Version=4.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies. The system cannot find the file specified.
19/04/29 11:57:21 INFO DotnetRunner: Closing DotnetBackend
19/04/29 11:57:21 INFO DotnetBackend: Requesting to close all call back sockets
19/04/29 11:57:21 INFO ShutdownHookManager: Shutdown hook called
19/04/29 11:57:21 INFO ShutdownHookManager: Deleting directory C:\Users\j.shaer\AppData\Local\Temp\spark-3af1ceef-aa4a-4917-8cc9-34c66acb10b4

C:\Users\j.shaer\source\repos\HelloSpark\HelloSpark\bin\Debug\netcoreapp2.1>

@tang2087

> Interesting ... I added the extension, using HelloSpark.dll, and now I'm getting this: [same spark-submit output quoted from the comment above]
Did you follow my post? It should work.

@imback82
Contributor

@jalchr Do you have HelloSpark.exe in that directory? If not, you should run <dotnet-exe-fullpath> HelloSpark.dll instead.
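
For example, assuming the default dotnet installation path (adjust if dotnet.exe lives elsewhere on your machine):

spark-submit --class org.apache.spark.deploy.DotnetRunner --master local microsoft-spark-2.4.x-0.1.0.jar "C:\Program Files\dotnet\dotnet.exe" HelloSpark.dll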

@jalchr
Author

jalchr commented Apr 30, 2019

I was running in debug mode, which produces a .dll file.
I had to publish it:

dotnet publish -c Release -r win10-x64

Now I'm getting an .exe file and it works perfectly.
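
For reference, dotnet publish with a runtime identifier drops a self-contained build (including HelloSpark.exe) under a publish folder; the path below assumes the default layout for a netcoreapp2.1 project and that the microsoft-spark jar and input files were copied to the output, so adjust to your setup:

cd bin\Release\netcoreapp2.1\win10-x64\publish
spark-submit --class org.apache.spark.deploy.DotnetRunner --master local microsoft-spark-2.4.x-0.1.0.jar HelloSpark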

Thanks all

@unruledboy

unruledboy commented May 1, 2019

I followed the steps, and managed to run it with this command:

C:\bin\spark-2.4.1-bin-hadoop2.7\bin\spark-submit --class org.apache.spark.deploy.DotnetRunner --master local microsoft-spark-2.4.x-0.1.0.jar HelloSpark.dll

But I encountered this exception:

Missing Python executable 'python', defaulting to 'C:\bin\spark-2.4.1-bin-hadoop2.7\bin..' for SPARK_HOME environment variable. Please install Python or specify the correct Python executable in PYSPARK_DRIVER_PYTHON or PYSPARK_PYTHON environment variable to detect SPARK_HOME safely.
19/05/02 09:56:36 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/05/02 09:56:37 INFO DotnetRunner: Starting DotnetBackend with HelloSpark.dll.
19/05/02 09:56:38 INFO DotnetRunner: Port number used by DotnetBackend is 60764
19/05/02 09:56:38 INFO DotnetRunner: Adding key=spark.jars and value=file:/C:/Users/user/source/repos/HelloSpark/bin/Debug/netcoreapp2.2/publish/microsoft-spark-2.4.x-0.1.0.jar to environment
19/05/02 09:56:38 INFO DotnetRunner: Adding key=spark.app.name and value=org.apache.spark.deploy.DotnetRunner to environment
19/05/02 09:56:38 INFO DotnetRunner: Adding key=spark.submit.deployMode and value=client to environment
19/05/02 09:56:38 INFO DotnetRunner: Adding key=spark.master and value=local to environment

Unhandled Exception: System.IO.FileNotFoundException: Could not load file or assembly 'System.Runtime, Version=4.2.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies. The system cannot find the file specified.
19/05/02 09:56:40 INFO DotnetRunner: Closing DotnetBackend
19/05/02 09:56:40 INFO DotnetBackend: Requesting to close all call back sockets
19/05/02 09:56:40 INFO ShutdownHookManager: Shutdown hook called
19/05/02 09:56:40 INFO ShutdownHookManager: Deleting directory C:\Users\user\AppData\Local\Temp\spark-9691705d-03f6-490a-a92c-900f32b167b0

Looks like it could not find the runtimes folder, but it is there.

The files in the output folder:

  • HelloSpark.dll
  • HelloSpark.deps.json
  • microsoft-spark-2.3.x-0.1.0.jar
  • microsoft-spark-2.4.x-0.1.0.jar
  • people.json
  • HelloSpark.runtimeconfig.json
  • Runtimes folder

@imback82
Contributor

imback82 commented May 1, 2019

Please try:

C:\bin\spark-2.4.1-bin-hadoop2.7\bin\spark-submit --class org.apache.spark.deploy.DotnetRunner --master local microsoft-spark-2.4.x-0.1.0.jar <full-path-to-dotnet.exe> HelloSpark.dll

as the README says here: https://github.com/dotnet/spark#get-started

@unruledboy

@imback82 sorry, I found the cause and moved on, and now I have the same problem as discussed here: #77

@unruledboy

unruledboy commented May 2, 2019

Now I managed to run it, but it throws some exceptions after it finishes.

First, I would suggest giving a working sample of the command (in one line), like:

C:\bin\spark-2.4.1-bin-hadoop2.7\bin\spark-submit --class org.apache.spark.deploy.DotnetRunner --master local microsoft-spark-2.4.x-0.1.0.jar "C:\Program Files\dotnet\dotnet.exe" HelloSpark.dll

Second, at the end of the program, it throws some Spark exceptions:

19/05/02 10:05:24 WARN SparkEnv: Exception while deleting Spark temp dir: C:\Users\user\AppData\Local\Temp\spark-e362e9c1-4189-4c58-9988-a415148242f5\userFiles-7b6adef9-4da3-49a8-bc4b-4ecdae22ad36
java.io.IOException: Failed to delete: C:\Users\user\AppData\Local\Temp\spark-e362e9c1-4189-4c58-9988-a415148242f5\userFiles-7b6adef9-4da3-49a8-bc4b-4ecdae22ad36\microsoft-spark-2.4.x-0.1.0.jar
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:144)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:128)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:91)
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1062)
at org.apache.spark.SparkEnv.stop(SparkEnv.scala:103)
at org.apache.spark.SparkContext$$anonfun$stop$11.apply$mcV$sp(SparkContext.scala:1974)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1340)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1973)
at org.apache.spark.SparkContext$$anonfun$2.apply$mcV$sp(SparkContext.scala:575)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1945)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
19/05/02 10:05:24 INFO SparkContext: Successfully stopped SparkContext
19/05/02 10:05:24 INFO ShutdownHookManager: Shutdown hook called
19/05/02 10:05:24 INFO ShutdownHookManager: Deleting directory C:\Users\user\AppData\Local\Temp\spark-9fabe1ce-5533-402d-abd2-a01dccc0bea4
19/05/02 10:05:24 INFO ShutdownHookManager: Deleting directory C:\Users\user\AppData\Local\Temp\spark-e362e9c1-4189-4c58-9988-a415148242f5\userFiles-7b6adef9-4da3-49a8-bc4b-4ecdae22ad36
19/05/02 10:05:24 ERROR ShutdownHookManager: Exception while deleting Spark temp dir: C:\Users\user\AppData\Local\Temp\spark-e362e9c1-4189-4c58-9988-a415148242f5\userFiles-7b6adef9-4da3-49a8-bc4b-4ecdae22ad36
java.io.IOException: Failed to delete: C:\Users\user\AppData\Local\Temp\spark-e362e9c1-4189-4c58-9988-a415148242f5\userFiles-7b6adef9-4da3-49a8-bc4b-4ecdae22ad36\microsoft-spark-2.4.x-0.1.0.jar
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:144)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:128)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:91)
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1062)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:65)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:62)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:62)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1945)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
19/05/02 10:05:24 INFO ShutdownHookManager: Deleting directory C:\Users\user\AppData\Local\Temp\spark-e362e9c1-4189-4c58-9988-a415148242f5
19/05/02 10:05:24 ERROR ShutdownHookManager: Exception while deleting Spark temp dir: C:\Users\user\AppData\Local\Temp\spark-e362e9c1-4189-4c58-9988-a415148242f5
java.io.IOException: Failed to delete: C:\Users\user\AppData\Local\Temp\spark-e362e9c1-4189-4c58-9988-a415148242f5\userFiles-7b6adef9-4da3-49a8-bc4b-4ecdae22ad36\microsoft-spark-2.4.x-0.1.0.jar
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:144)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:128)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:128)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:91)
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1062)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:65)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:62)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:62)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1945)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

@imback82
Contributor

imback82 commented May 2, 2019

Please refer to #49 (comment)

@unruledboy

unruledboy commented May 2, 2019

@imback82 my bad; I did search before I made the comment. I suspect GitHub's search (backed by Elasticsearch) has some keyword-parsing issues.

Anyway, all good. Thanks a lot for the prompt reply. I have come a long way with Spark: more than a year ago I evaluated Mobius, but found it had issues and couldn't meet our requirements, so we ended up using a native Scala solution, which is not ideal for .NET developers.

Finally we have this .NET solution; let's hope for the best and work together to make it beneficial to big data projects in the .NET community.

@imback82
Contributor

imback82 commented May 2, 2019

> Finally we have this .NET solution; let's hope for the best and work together to make it beneficial to big data projects in the .NET community.

We look forward to working together, @unruledboy!
