Add spark job #140

Merged
merged 9 commits into master on Dec 4, 2018
Conversation

pavolloffay (Member)

Resolves #44

Signed-off-by: Pavol Loffay [email protected]

@jpkrohling (Contributor)

This change is Reviewable


codecov bot commented Nov 29, 2018

Codecov Report

Merging #140 into master will increase coverage by 0.11%.
The diff coverage is 97.89%.


@@            Coverage Diff             @@
##           master     #140      +/-   ##
==========================================
+ Coverage   96.21%   96.32%   +0.11%     
==========================================
  Files          28       29       +1     
  Lines        1267     1362      +95     
==========================================
+ Hits         1219     1312      +93     
- Misses         37       39       +2     
  Partials       11       11
| Impacted Files | Coverage Δ |
|---|---|
| pkg/apis/io/v1alpha1/jaeger_types.go | 100% <ø> (ø) ⬆️ |
| pkg/strategy/production.go | 100% <100%> (ø) ⬆️ |
| pkg/strategy/all-in-one.go | 100% <100%> (ø) ⬆️ |
| pkg/cronjob/spark_dependencies.go | 97.64% <97.64%> (ø) |
| pkg/strategy/controller.go | 100% <0%> (ø) ⬆️ |
| pkg/util/util.go | 100% <0%> (ø) ⬆️ |
| pkg/service/query.go | 100% <0%> (ø) ⬆️ |
| pkg/config/ui/ui.go | |
| pkg/configmap/ui.go | 96.42% <0%> (ø) |

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 610d415...3be05ea.

@pavolloffay (Member Author)

At the moment this does not change any UI configuration. Based on the job configuration (enabled) and the storage, the operator could enable/disable the dependencies tab in the UI. I will add it after this PR is merged.

@pavolloffay (Member Author)

Also, the job is disabled by default - we can enable it by default for supported storages.

@@ -2,3 +2,13 @@ apiVersion: io.jaegertracing/v1alpha1
kind: Jaeger
metadata:
name: simplest
strategy: allInOne
pavolloffay (Member Author)

I will remove this

@jpkrohling jpkrohling (Contributor) left a comment

Reviewed 6 of 10 files at r1, 1 of 1 files at r2, 9 of 9 files at r3.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @pavolloffay)

a discussion (no related file):
Looks really nice. Just need to adjust a few imports. I had a couple of other minor comments as well, but LGTM otherwise.



deploy/examples/simple-prod.yaml, line 15 at r3 (raw file):

        username: elastic
        password: changeme
  sparkDependencies:

This is related to Elasticsearch, isn't it? Or is this job also available for Cassandra? If it's relevant only for ES, I'd place it under the StorageSpec. I would abstract the "how" from the "what", and name it dependenciesJob (or something like that).


pkg/cronjob/spark_dependencies.go, line 8 at r3 (raw file):

	"github.com/jaegertracing/jaeger-operator/pkg/storage"
	"github.com/spf13/viper"
	batchv1 "k8s.io/api/batch/v1"

Same comment as in the other PR, regarding the imports: you are probably using different tooling, but we are splitting the imports into three sections: stdlib, external, internal. I think we follow the same pattern in the Jaeger core repo.
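For illustration, the grouping described here would look roughly like the fragment below; the external and internal entries come from the quoted snippet, while the stdlib group is only indicated as a placeholder:

```go
import (
	// stdlib first (whatever the file actually uses, e.g. "fmt")

	// external
	"github.com/spf13/viper"
	batchv1 "k8s.io/api/batch/v1"

	// internal
	"github.com/jaegertracing/jaeger-operator/pkg/storage"
)
```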


pkg/cronjob/spark_dependencies.go, line 15 at r3 (raw file):

)

var supportedStorageTypes = map[string]bool{"elasticsearch": true, "cassandra": true}

This answers my previous question about the job being ES-specific.


pkg/cronjob/spark_dependencies_test.go, line 15 at r3 (raw file):

		expected  *v1alpha1.Jaeger
	}{
		{underTest: &v1alpha1.Jaeger{}, expected: &v1alpha1.Jaeger{Spec: v1alpha1.JaegerSpec{SparkDependencies: v1alpha1.JaegerSparkDependenciesSpec{Schedule: "55 23 * * *"}}}},

Could you split this into multiple lines, to make it easier to read?
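For example, the quoted table entry could be laid out like this (same content, just reformatted across lines):

```go
{
	underTest: &v1alpha1.Jaeger{},
	expected: &v1alpha1.Jaeger{
		Spec: v1alpha1.JaegerSpec{
			SparkDependencies: v1alpha1.JaegerSparkDependenciesSpec{
				Schedule: "55 23 * * *",
			},
		},
	},
},
```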


pkg/cronjob/spark_dependencies_test.go, line 30 at r3 (raw file):

	}{
		{},
		{underTest: []v1.EnvVar{{Name: "foo", Value: "bar"}, {Name: "foo3"}, {Name: "foo2", ValueFrom: &v1.EnvVarSource{}}},

Same here, each pair on its own line would make it more readable.


pkg/strategy/all-in-one.go, line 76 at r3 (raw file):

	if c.jaeger.Spec.SparkDependencies.Enabled && cronjob.SupportedStorage(c.jaeger.Spec.Storage.Type) {
		os = append(os, cronjob.Create(c.jaeger))

What's the default value for this Enabled? If it's false, then reaching this point means the user has explicitly set the flag to true. As such, the user deserves a log message saying that the dependency job will be disabled because the storage isn't supported.

We have similar log messages in a function called normalize(), as part of the controller. You could place it there.
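Speculatively, the check being suggested might look like the sketch below. The helper name, the log fields, and the use of logrus are illustrative assumptions; the spec fields and cronjob.SupportedStorage come from this PR:

```go
package strategy

import (
	"github.com/sirupsen/logrus"

	"github.com/jaegertracing/jaeger-operator/pkg/apis/io/v1alpha1"
	"github.com/jaegertracing/jaeger-operator/pkg/cronjob"
)

// warnIfDependenciesSkipped is a hypothetical helper for normalize(): it logs only
// when the user explicitly enabled the dependencies job but the configured storage
// cannot run it; the default (disabled) case produces no log at all.
func warnIfDependenciesSkipped(jaeger *v1alpha1.Jaeger) {
	if jaeger.Spec.SparkDependencies.Enabled && !cronjob.SupportedStorage(jaeger.Spec.Storage.Type) {
		logrus.WithFields(logrus.Fields{
			"instance": jaeger.Name,
			"storage":  jaeger.Spec.Storage.Type,
		}).Info("spark dependencies job is enabled, but the storage type does not support it; the job will not be created")
	}
}
```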


pkg/strategy/all-in-one_test.go, line 167 at r3 (raw file):

		s := fce(test.jaeger)
		objs := s.Create()
		cronJobs := getTypesOf(objs, reflect.TypeOf(&batchv1beta1.CronJob{}))

Nice trick. We have a getDeployments func in some tests that could make use of this.


test/e2e/spark_dependencies_test.go, line 85 at r3 (raw file):

	}

	return e2eutil.WaitForDeployment(t, f.KubeClient, namespace, name, 1, retryInterval, timeout)

Do you want to wait for the SmokeTest to be merged? If so, you could add the smoke test here.

@pavolloffay (Member Author)

Just need to adjust a few imports

I prefer to get #151 merged first.

@pavolloffay (Member Author)

PR updated

@pavolloffay (Member Author)

@jpkrohling could you please review this PR?

@objectiser objectiser (Contributor) left a comment

Reviewable status: 4 of 15 files reviewed, 12 unresolved discussions (waiting on @jpkrohling and @pavolloffay)


deploy/examples/simple-prod.yaml, line 15 at r3 (raw file):

Previously, jpkrohling (Juraci Paixão Kröhling) wrote…

This is related to Elasticsearch, isn't it? Or is this job also available for Cassandra? If it's relevant only for ES, I'd place it under the StorageSpec. I would abstract the "how" from the "what", and name it dependenciesJob (or something like that).

Can this be renamed to "dependencies" rather than "sparkDependencies", as in the future we may use other technology to derive the dependencies?


deploy/examples/simple-prod.yaml, line 7 at r4 (raw file):

  name: simple-prod
spec:
  strategy: allInOne

This example is supposed to relate to a simple production deployment using ES, so the strategy needs to be production.


pkg/apis/io/v1alpha1/jaeger_types.go, line 128 at r4 (raw file):

	Options               Options                         `json:"options"`
	CassandraCreateSchema JaegerCassandraCreateSchemaSpec `json:"cassandraCreateSchema"`
	SparkDependencies     JaegerSparkDependenciesSpec     `json:"sparkDependencies"`

As mentioned above - I think it would be better to keep Spark out of the type and field names - it's just an implementation detail. For now, just changing the JSON node name would be fine if you didn't want to change all of the types.
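In other words, the minimal variant of that suggestion would touch only the JSON tag on the existing field; a sketch of the relevant lines from jaeger_types.go:

```go
Options               Options                         `json:"options"`
CassandraCreateSchema JaegerCassandraCreateSchemaSpec `json:"cassandraCreateSchema"`
// the Go type keeps its Spark-specific name; only the serialized node is renamed
SparkDependencies     JaegerSparkDependenciesSpec     `json:"dependencies"`
```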

@pavolloffay (Member Author)

As mentioned above - I think it would be better to keep Spark out of the type and field names - it's just an implementation detail.

I am not sure we could use that unknown future project with this interface, as the interface is highly dependent on the properties exposed by the spark-dependencies project. For this reason I would keep the spark prefix.

@pavolloffay pavolloffay merged commit 3aa97dd into jaegertracing:master Dec 4, 2018
@jpkrohling jpkrohling (Contributor) left a comment

Reviewed 2 of 12 files at r4, 11 of 11 files at r5.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @pavolloffay and @objectiser)


pkg/apis/io/v1alpha1/jaeger_types.go, line 128 at r4 (raw file):

Previously, objectiser (Gary Brown) wrote…

As mentioned above - I think it would be better to keep Spark out of the type and field names - it's just an implementation detail. For now, just changing the JSON node name would be fine if you didn't want to change all of the types.

+1, Spark shouldn't be there. Instead, we could have two levels:

dependencies:
  spark:
    option1: value1
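A corresponding spec type for that two-level layout could look something like the sketch below; the wrapper type and field names are illustrative assumptions, not something from this PR:

```go
// JaegerDependenciesSpec is a hypothetical wrapper exposing a generic "dependencies"
// node, with the Spark-specific options nested one level below it.
type JaegerDependenciesSpec struct {
	Enabled bool                        `json:"enabled"`
	Spark   JaegerSparkDependenciesSpec `json:"spark"`
}
```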

pkg/cronjob/spark_dependencies_test.go, line 30 at r3 (raw file):

Previously, jpkrohling (Juraci Paixão Kröhling) wrote…

Same here, each pair on its line would make it more readable.

Looks like this was missing in the final version that got merged.


pkg/strategy/all-in-one.go, line 76 at r3 (raw file):

Previously, jpkrohling (Juraci Paixão Kröhling) wrote…

What's the default value for this Enabled? If it's false, then reaching this point means the user has explicitly set the flag to true. As such, the user deserves a log message saying that the dependency job will be disabled because the storage isn't supported.

We have similar log messages in a function called normalize(), as part of the controller. You could place it there.

I don't think the code that got merged makes sense regarding this...

  1. The log.Info will always be displayed in the default case, i.e., when the user has omitted the flag. We do not want this. We want info logging when the user wants something and we cannot deliver it.
  2. There's no logging when the user sets the enabled flag to true and uses an unsupported storage...

pkg/strategy/production.go, line 89 at r5 (raw file):

	if cronjob.SupportedStorage(c.jaeger.Spec.Storage.Type) {
		if c.jaeger.Spec.Storage.SparkDependencies.Enabled {

Same as the comment above: the logging should be about telling the user that the flag that was explicitly set is being ignored due to the usage of an unsupported storage.

@pavolloffay (Member Author)

@jpkrohling could you please create a separate issue for any missing comments?
