#251 Fix glob support and divisibility check for a large number of files. #253
Conversation
-- The Toree `AddDeps` magic has a `--repository` option that could be used to choose a repository other than Maven Central. -- The basic example below, without anything extra, fails for a reason that may be related to missing dependencies.
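For context, Toree's `%AddDeps` line magic resolves a dependency by Maven coordinates and accepts a `--repository` flag for a non-default repository. The coordinates, version, and repository URL below are placeholders, not taken from this thread:

```
%AddDeps za.co.absa.cobrix spark-cobol_2.11 X.Y.Z-SNAPSHOT --transitive --repository https://example.org/snapshots
```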
```scala
//import org.apache.spark.sql.SparkSession
//spark.udf.register("get_file_name", (path: String) => path.split("/").last)
val cobolDataframe = spark
```
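The snippet above is truncated after `val cobolDataframe = spark`. For reference, a typical read with the spark-cobol data source looks roughly like the following sketch; the copybook and data paths are hypothetical placeholders, and the exact options depend on your data:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("cobrix-example")
  .getOrCreate()

// Read a mainframe file using the "cobol" data source provided by spark-cobol.
// Both paths below are placeholders, not taken from this thread.
val cobolDataframe = spark
  .read
  .format("cobol")
  .option("copybook", "/path/to/copybook.cpy") // hypothetical copybook path
  .load("/path/to/data")                        // hypothetical data path

cobolDataframe.show()
```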
How do you build your Spark Application? If you use Maven, you can allow snapshot repositories by adding this profile to your pom.xml:
and using this dependency:
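The actual `pom.xml` snippets from this comment were lost in extraction. A reconstruction sketch follows; the repository URL, artifact coordinates, and version are assumptions and should be taken from the project's README rather than from here:

```xml
<!-- Hypothetical profile enabling a snapshot repository; the URL is an assumption. -->
<profiles>
  <profile>
    <id>allow-snapshots</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <repositories>
      <repository>
        <id>snapshots-repo</id>
        <url>https://example.org/snapshots</url>
        <releases><enabled>false</enabled></releases>
        <snapshots><enabled>true</enabled></snapshots>
      </repository>
    </repositories>
  </profile>
</profiles>
```

```xml
<!-- Hypothetical coordinates and version; verify against the project's documentation. -->
<dependency>
  <groupId>za.co.absa.cobrix</groupId>
  <artifactId>spark-cobol_2.11</artifactId>
  <version>X.Y.Z-SNAPSHOT</version>
</dependency>
```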
Hi @yruslan, @kriswijnants,

In short: I confirm the solution. @yruslan, thank you for the solution. @kriswijnants, please take note of this.

Details: the repository URL you provided in the Maven configuration allowed me to add the correct dependency to the "Apache Toree - Scala" Jupyter kernel using its magic. I basically coded in a Jupyter notebook running in a Docker container. This approach is a good simulation of the actual deployment platform, a Databricks cluster, and it allows me to analyse the data format, extraction method, library stability, and issues without interfering with the infrastructure and data platform set up by colleagues like @kriswijnants.

Regards,
Bart Debersaques
Great! This will be released next week.