-
My company is just moving to Azure Databricks, so I am a bit fuzzy on how all of this works. What I have done already is set up a third-party ODBC driver on my machine, and I was able to run some simple queries against one of our DBs. Which leads me to Microsoft.Spark. What is it for? Can I use it as a native .NET driver to query the company database? If so, how would I set up the config with my cluster, token, etc. to make simple queries? Thanks
-
.NET for Apache Spark allows you to write your Spark application in .NET (instead of having to learn Scala or Python) and can be used with .NET Interactive notebooks to run queries in .NET against your Spark cluster (e.g., Azure Synapse). This is different from using a client-side ODBC connector to run SparkSQL queries against a Spark cluster. So the question for you now is whether you primarily want to send SparkSQL queries over ODBC, or whether you want to submit Spark applications as batch programs or use an interactive notebook with .NET. If it is the first one, you are already set. If it is the second one, please check out our documentation. If you want to use it with notebooks, then you will need to request .NET notebook support for Azure Databricks from Databricks.
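For illustration, a minimal .NET for Apache Spark program looks roughly like this (a sketch; the app name and table name are placeholders, not anything from your environment). Note that it runs on the cluster after being packaged and launched with spark-submit, not as a client-side program:

```csharp
using Microsoft.Spark.Sql;

namespace MySparkApp
{
    class Program
    {
        static void Main(string[] args)
        {
            // Entry point of a Spark application: this runs on the
            // cluster's driver node after being launched via spark-submit,
            // not on the end user's machine.
            SparkSession spark = SparkSession
                .Builder()
                .AppName("example-app")   // placeholder name
                .GetOrCreate();

            // SparkSQL queries are planned by the driver and scaled out
            // across the worker nodes. "some_table" is a placeholder.
            DataFrame df = spark.Sql("SELECT * FROM some_table LIMIT 10");
            df.Show();

            spark.Stop();
        }
    }
}
```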
-
@sindizzy Thanks for your question! In addition to what @MikeRys has mentioned, you can also check out more docs on the official documentation page. Please let us know if you have more questions.
-
I just want to query our company db and present the results to end users. I am doing this through a .NET program. So something like:
1. connect to my company's Azure Databricks db
2. send a query such as "select * from db.table where empid=0005 and trip_date = '2020-09-05'"
3. format the results and present them to the user

As mentioned, I can already do this via an ODBC driver, but it has to be installed by every one of my end users. If Microsoft.Spark is a native .NET driver, that would work better, since end users would not have to install anything. I also think the big source of confusion for me is what is meant by the term "Spark application".
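For reference, steps 1–3 above with the existing ODBC driver look roughly like this in C# via ADO.NET's ODBC provider (a minimal sketch; the DSN name and the string typing of the parameters are placeholder assumptions about how the installed driver is configured):

```csharp
using System;
using System.Data.Odbc;

class Program
{
    static void Main()
    {
        // "DatabricksDSN" is a placeholder for whatever DSN the
        // installed ODBC driver was configured with on the client machine.
        using (var conn = new OdbcConnection("DSN=DatabricksDSN"))
        {
            conn.Open();

            // ODBC uses positional '?' markers; parameter names are ignored
            // and only the order matters.
            var cmd = new OdbcCommand(
                "select * from db.table where empid = ? and trip_date = ?",
                conn);
            cmd.Parameters.AddWithValue("@empid", "0005");
            cmd.Parameters.AddWithValue("@trip_date", "2020-09-05");

            using (OdbcDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    // Format each row however the UI needs it.
                    Console.WriteLine(reader[0]);
                }
            }
        }
    }
}
```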
-
A Spark application is an application that uses the Spark libraries, gets compiled (if written in Scala, Java, or .NET), and is then submitted to a Spark cluster, which runs the application on the Spark coordination (aka master) node and scales out the Spark expressions in the application onto the worker nodes. This is a different paradigm from a client application that uses a client API such as ODBC or ADO.NET to submit a query to a server for execution.
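For illustration, the submission step for a compiled .NET for Apache Spark application looks roughly like this (a sketch; the jar version and app DLL name are placeholders):

```
spark-submit \
  --class org.apache.spark.deploy.dotnet.DotnetRunner \
  --master local \
  microsoft-spark-<version>.jar \
  dotnet MySparkApp.dll
```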
It looks like you are interested in the latter. In that case, .NET for Apache Spark is not going to be of much use, since it is designed for building Spark applications as described above (similar to how you would write them in Scala, Java, or PySpark).
If you want client applications written in .NET, you would have to check whether there is either an ODBC library available in .NET or an ADO.NET way to connect to a Databricks service.
Best regards
Michael