Adds new blog post announcing opensearch hadoop #1650

Closed

Conversation

harshavamsi
Contributor

Description

Adds a new blog post announcing the availability of the hadoop client

Issues Resolved

[List any issues this PR will resolve]

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.

Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
@harshavamsi force-pushed the opensearch_hadoop_blog branch from 4a471a9 to b28eb91 on June 6, 2023 16:01
Contributor

@vagimeli left a comment

See editorial review comments and changes. Please reach out with any questions.

_posts/2023-06-05-opensearch-hadoop-launch.markdown (7 review threads, each marked outdated and resolved)
harshavamsi and others added 8 commits June 6, 2023 10:55
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
Collaborator

@nknize left a comment

This is a good start. I like having the compatibility matrices. Might be good, though, to also add a simple "Getting Started Examples"?

Maybe an example on how to write to a dataframe in scala?

e.g.,

val spark = SparkSession.builder().master("local[*]")
    .config("opensearch.nodes", "127.0.0.1")
    .config("opensearch.net.http.auth.user", "admin")
    .config("opensearch.net.http.auth.pass", "admin")
    .config("opensearch.net.ssl", "true")
    .config("opensearch.batch.size.bytes", "1kb")
    .config("opensearch.net.ssl.cert.allow.self.signed", "true")
    .getOrCreate()

or how to use it with pyspark like I demonstrate in my comment on #153.

I'm happy to add if you'd like.


We are excited to announce the release of the new OpenSearch-Hadoop connector. This tool enables efficient interaction between your Hadoop-based Big Data operations and OpenSearch clusters, supporting all versions of OpenSearch.

## OpenSearch Hadoop connector features:
Contributor

Can we add an opening paragraph here? For example, "The OpenSearch-Hadoop connector includes the following features:" (and remove the colon from the heading)

@harshavamsi
Contributor Author

This would be awesome to have!

harshavamsi and others added 2 commits June 6, 2023 13:17
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
Co-authored-by: Heather Halter <[email protected]>
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
harshavamsi and others added 4 commits June 6, 2023 15:30
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
@wbeckler
Contributor

wbeckler commented Jun 7, 2023

This is a good start. I like having the compatibility matrices. Might be good, though, to also add a simple "Getting Started Examples"?

Maybe an example on how to write to a dataframe in scala?

e.g.,

val spark = SparkSession.builder().master("local[*]")
    .config("opensearch.nodes", "127.0.0.1")
    .config("opensearch.net.http.auth.user", "admin")
    .config("opensearch.net.http.auth.pass", "admin")
    .config("opensearch.net.ssl", "true")
    .config("opensearch.batch.size.bytes", "1kb")
    .config("opensearch.net.ssl.cert.allow.self.signed", "true")
    .getOrCreate()

or how to use it with pyspark like I demonstrate in my comment on #153.

I'm happy to add if you'd like.

Go for it!

@hdhalter
Contributor

Is anyone making any updates to this (@nknize )? We are targeting next week to publish it. Thanks!

@nknize
Collaborator

nknize commented Jun 15, 2023

Is anyone making any updates to this (@nknize )? We are targeting next week to publish it. Thanks!

Yes. I'll put the example in tomorrow.

@wbeckler
Contributor

Is anyone making any updates to this (@nknize )? We are targeting next week to publish it. Thanks!

Yes. I'll put the example in tomorrow.

Hi Nick, this is still awaiting your input. Thank you!!

categories:
- releases
meta_keywords: opensearch hadoop, apache spark, apache hive, apache hadoop, opensearch, mapreduce, hdfs
meta_description: OpenSearch Hadoop is now generally available with support for multiple versions of OpenSearch to run on Spark and Hive.

Please update the meta with the following:

Meta_keywords: OpenSearch Hadoop, Apache Hadoop, OpenSearch Hadoop client
Meta_description: The OpenSearch Hadoop connector is now generally available with support for multiple versions of OpenSearch running on Spark and Hive.

@pajuric

pajuric commented Jun 29, 2023

@nknize @mnkugler @wbeckler - If you can make the final edits, update the blog date, and let @krisfreedain know when it's ready to go, we can get this posted to the blog tomorrow. Otherwise, we'll need to hold this until next Wednesday.

@nknize
Collaborator

nknize commented Jun 29, 2023

Otherwise, we'll need to hold this until next Wednesday.

Let's hold to Wednesday. I was working up the example with the published artifacts and noticed they don't support Spark 3. We may want to republish the Spark 3 artifacts before publishing the blog.

@pajuric

pajuric commented Jul 7, 2023

@mnkugler and @wbeckler - Are we good to publish this today?

@wbeckler
Contributor

wbeckler commented Jul 7, 2023

Still waiting on @nknize's changes.

@nknize
Collaborator

nknize commented Jul 7, 2023

@pajuric The blocker right now is that the released OpenSearch-Hadoop artifacts are not compatible with Spark 3. Thus the compatibility matrix in this blog post is not correct, and the example code I'm providing will not work for users / readers running Spark 3:

e.g.,

[error] Modules were resolved with conflicting cross-version suffixes in ProjectRef(uri("file:/...
[error]    org.apache.spark:spark-core _2.13, _2.11

From example build.sbt

ThisBuild / scalaVersion := "2.13.0"

lazy val root = (project in file("."))
  .settings(
    name := "opensearch-spark-example"
  )

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.2.4" exclude("javax", "servlet") exclude("org.apache", "hadoop"),
  "org.opensearch.client" % "opensearch-hadoop" % "1.0.1",
  "org.antlr" % "antlr4-runtime" % "4.8",
  "org.codehaus.janino" % "commons-compiler" % "3.0.8",
  "org.codehaus.janino" % "janino" % "3.0.8"
)

We need to publish the Spark 3 compatible version, which is built and packaged with the artifacts from the spark/sql-30 module.
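
For readers unfamiliar with this sbt error, here is a toy model of the check behind it, written in plain Scala with no dependencies. It is a sketch, not sbt's actual implementation: the `CrossVersionCheck` object, the `Module` case class, and the `conflicts` function are hypothetical names introduced for illustration. The `_2.13` entry corresponds to `%%` expanding `spark-core` for `scalaVersion` 2.13.0. The `_2.11` entry is assumed to arrive transitively through `opensearch-hadoop` 1.0.1, which is inferred from the error output above rather than confirmed.

```scala
// Toy model only: this is NOT sbt's implementation, just the idea behind the error.
object CrossVersionCheck {
  // A resolved module: organization, base artifact name, and Scala binary suffix.
  final case class Module(org: String, name: String, scalaSuffix: String)

  // Group modules by "organization:name" and keep only those that were
  // resolved with more than one distinct Scala binary suffix.
  def conflicts(modules: Seq[Module]): Map[String, Set[String]] =
    modules
      .groupBy(m => s"${m.org}:${m.name}")
      .map { case (key, ms) => key -> ms.map(_.scalaSuffix).toSet }
      .filter { case (_, suffixes) => suffixes.size > 1 }

  def main(args: Array[String]): Unit = {
    val resolved = Seq(
      // From `"org.apache.spark" %% "spark-core"` with scalaVersion 2.13.0.
      Module("org.apache.spark", "spark-core", "_2.13"),
      // Assumption: pulled in transitively by opensearch-hadoop 1.0.1.
      Module("org.apache.spark", "spark-core", "_2.11")
    )
    // Print each conflicting module with its suffixes, sorted for stable output.
    conflicts(resolved).foreach { case (key, suffixes) =>
      println(s"$key ${suffixes.toSeq.sorted.mkString(", ")}")
    }
  }
}
```

Lowering `scalaVersion` to 2.11 would not resolve the conflict, because Spark 3.x is published only for Scala 2.12 and 2.13. That is why a Spark 3 build of the connector is needed.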

@nknize
Collaborator

nknize commented Jul 7, 2023

I opened an issue to move this forward: opensearch-project/opensearch-hadoop#304

@pajuric

pajuric commented Aug 21, 2023

@vagimeli @nknize - Just checking the status on this blog to see if there are any updates?

@vagimeli
Contributor

@vagimeli @nknize - Just checking the status on this blog to see if there are any updates?

@pajuric I've not heard from the authors in a while. I'm adding them to this comment, as they need to provide the update.

@nknize @harshavamsi Please update on the status of this blog. Is the text final and ready for an editorial review?

@pajuric

pajuric commented Nov 2, 2023

@wbeckler @Xtansia - Please provide an update on the blog, as I understand it has been transferred over to you both.

@pajuric

pajuric commented Jul 25, 2024

@wbeckler - Are we OK to close this blog?

@pajuric closed this Dec 17, 2024
7 participants