-
My company is just moving to Azure Databricks, so I am a bit fuzzy on how all of this works. What I have done already is set up a third-party ODBC driver on my machine, and I was able to run some simple queries against one of our DBs. Which leads me to Microsoft.Spark. What is it for? Can I use it as a native .NET driver to query the company database? If so, how would I set up the config with my cluster, token, etc. to make simple queries? Thanks
-
.NET for Apache Spark allows you to write your Spark application in .NET (instead of having to learn Scala or Python) and can be used with .NET Interactive notebooks to run queries in .NET against your Spark cluster (e.g., Azure Synapse). This is different from using a client-side ODBC connector to run SparkSQL queries against a Spark cluster. So the question for you now is whether you primarily want to send SparkSQL queries over ODBC, or whether you want to submit Spark applications as batch programs or use an interactive notebook with .NET. If it is the first one, you are already set. If it is the second one, please check out our documentation. If you want to use it with notebooks, then you will need to request .NET notebook support for Azure Databricks from Databricks.
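For illustration, a minimal .NET for Apache Spark program looks roughly like this (a sketch; the app name and table name are placeholders, not anything from your environment). Note that it runs on the cluster after being packaged and launched with spark-submit, not as a client-side program:

```csharp
using Microsoft.Spark.Sql;

namespace MySparkApp
{
    class Program
    {
        static void Main(string[] args)
        {
            // Entry point of a Spark application: this runs on the
            // cluster's driver node after being launched via spark-submit,
            // not on the end user's machine.
            SparkSession spark = SparkSession
                .Builder()
                .AppName("example-app")   // placeholder name
                .GetOrCreate();

            // SparkSQL queries are planned by the driver and scaled out
            // across the worker nodes. "some_table" is a placeholder.
            DataFrame df = spark.Sql("SELECT * FROM some_table LIMIT 10");
            df.Show();

            spark.Stop();
        }
    }
}
```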
-
@sindizzy Thanks for your question! In addition to what @MikeRys has mentioned, you can also check out more docs on the official documentation page. Please let us know if you have more questions.
-
I just want to query our company db and present the results to end users. I am doing this through a .NET program. So something like:
1. connect to my company's Azure Databricks db
2. send a query such as "select * from db.table where empid=0005 and trip_date = '2020-09-05'"
3. format the results and present them to the user

As mentioned, I can already do this via an ODBC driver, but it has to be installed by every one of my end users. If Microsoft.Spark is a native .NET driver, that would work better, since end users would not have to install anything. I also think the big source of confusion for me is what is meant by the term "Spark application".
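For reference, steps 1–3 above with the existing ODBC driver look roughly like this in C# via ADO.NET's ODBC provider (a minimal sketch; the DSN name and the string typing of the parameters are placeholder assumptions about how the installed driver is configured):

```csharp
using System;
using System.Data.Odbc;

class Program
{
    static void Main()
    {
        // "DatabricksDSN" is a placeholder for whatever DSN the
        // installed ODBC driver was configured with on the client machine.
        using (var conn = new OdbcConnection("DSN=DatabricksDSN"))
        {
            conn.Open();

            // ODBC uses positional '?' markers; parameter names are ignored
            // and only the order matters.
            var cmd = new OdbcCommand(
                "select * from db.table where empid = ? and trip_date = ?",
                conn);
            cmd.Parameters.AddWithValue("@empid", "0005");
            cmd.Parameters.AddWithValue("@trip_date", "2020-09-05");

            using (OdbcDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    // Format each row however the UI needs it.
                    Console.WriteLine(reader[0]);
                }
            }
        }
    }
}
```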
-
A Spark application is an application that uses the Spark libraries, gets compiled (if written in Scala, Java, or .NET), and is then submitted to a Spark cluster, which runs the application on the Spark coordination (aka master) node and scales out the Spark expressions in the application onto the worker nodes. This is a different paradigm from a client application that uses a client API such as ODBC or ADO.NET to submit a query to a server for execution.
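For illustration, the submission step for a compiled .NET for Apache Spark application looks roughly like this (a sketch; the jar version and app DLL name are placeholders):

```
spark-submit \
  --class org.apache.spark.deploy.dotnet.DotnetRunner \
  --master local \
  microsoft-spark-<version>.jar \
  dotnet MySparkApp.dll
```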
It looks like you are interested in the latter. In that case, .NET for Apache Spark is not going to be of much use, since it is designed for building Spark applications as described above (similar to how you would write them in Scala, Java, or PySpark).
If you want client applications written in .NET, you would have to check whether there is either an ODBC library available in .NET or an ADO.NET way to connect to a Databricks service.
Best regards
Michael