
Setup Storage interface for log persistence #186

Merged
merged 10 commits into release-v0.17.3-lyft.1 on Feb 16, 2022

Conversation

@Aayyush (Author) commented on Feb 9, 2022:

This PR updates the multiplexer to check for the job in the storage backend before registering with the partition registry. For persisting the job, the project output wrapper calls the completeJob() method, which handles closing active websocket connections, marking the operation complete, and clearing buffers if the job is successfully persisted. (A sketch of the multiplexer check follows below.)
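
For context, a minimal sketch of that check; this is illustrative, not the PR's actual code, and the Multiplexer fields and PartitionRegistry interface are stand-ins:

package server

import "io"

// Illustrative stand-ins for the real types in this PR.
type StorageBackend interface {
	IsKeyExists(key string) bool
	Read(key string) io.ReadCloser
}

type PartitionRegistry interface {
	Register(jobID string) error
}

type Multiplexer struct {
	storageBackend StorageBackend
	registry       PartitionRegistry
}

// handle serves a job's logs: completed jobs are streamed from the
// storage backend; in-progress jobs are followed live via the registry.
func (m *Multiplexer) handle(jobID string, w io.Writer) error {
	if m.storageBackend.IsKeyExists(jobID) {
		reader := m.storageBackend.Read(jobID)
		defer reader.Close()
		_, err := io.Copy(w, reader)
		return err
	}
	return m.registry.Register(jobID)
}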

}

// Read from the s3 client and write to the websocket.
buf := make([]byte, 4)
Aayyush (Author):
Should we increase the buffer size?
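
For reference, a sketch of what the copy loop could look like with a larger buffer, assuming a gorilla/websocket connection (illustrative, not the PR's actual code); each chunk becomes one websocket message, so a larger buffer means fewer reads and fewer frames:

package server

import (
	"io"

	"github.com/gorilla/websocket"
)

// writeLogsToWS copies persisted logs from the backend reader to the
// websocket in fixed-size chunks.
func writeLogsToWS(r io.ReadCloser, conn *websocket.Conn) error {
	defer r.Close()
	buf := make([]byte, 4096) // e.g. 4 KiB instead of 4 bytes
	for {
		n, err := r.Read(buf)
		if n > 0 {
			if werr := conn.WriteMessage(websocket.TextMessage, buf[:n]); werr != nil {
				return werr
			}
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}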

p.closeActiveChannels(jobID)

// Persist logs to storage backend
ok, err := p.storageBackend.Write(jobID, outputBuffer.Buffer)
Aayyush (Author):
Here, we persist the job synchronously. We could spin up new goroutines for this, but I thought it would be good to see how long it actually takes and how much it impacts our response time (after we have our instrumented implementation) before making premature optimizations!
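
For illustration, a rough sketch of this synchronous path with pared-down stand-in types; only the Write signature matches the StorageBackend interface in this PR:

package jobs

import "sync"

// Stand-in types for illustration.
type StorageBackend interface {
	Write(key string, logs []string) (bool, error)
}

type OutputBuffer struct {
	Buffer []string
}

type handler struct {
	lock                 sync.Mutex
	projectOutputBuffers map[string]OutputBuffer
	storageBackend       StorageBackend
}

func (p *handler) closeActiveChannels(jobID string) { /* close ws channels */ }

// completeJob closes the job's active channels, persists its buffered
// logs synchronously, and clears the buffer only if persistence succeeded.
func (p *handler) completeJob(jobID string) error {
	p.lock.Lock()
	defer p.lock.Unlock()

	outputBuffer := p.projectOutputBuffers[jobID]
	p.closeActiveChannels(jobID)

	// Synchronous write: persistence latency lands in this call, which is
	// exactly what an instrumented implementation would measure.
	ok, err := p.storageBackend.Write(jobID, outputBuffer.Buffer)
	if err != nil {
		return err
	}
	// ok == false with a nil error (the Noop backend) means "not
	// persisted, not an error": keep the buffer.
	if ok {
		delete(p.projectOutputBuffers, jobID)
	}
	return nil
}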

Contributor:
I'd like to see more encapsulation here, instead of having ProjectCommandOutputHandler doing a million things. It seems like this could have another struct as a field here that abstracts usage of projectOutputBuffers in addition to the StorageBackend, making the integration almost seamless.

One idea for this is as follows:

type JobStatus int

const (
	Processing JobStatus = iota
	Complete
)

type Job struct {
	Output []string
	Status JobStatus
}

type JobStore interface {
	// Gets the job from the in-memory buffers if available; if not,
	// reaches out to the storage backend.
	Get(jobID string) Job

	// Appends a given string to the job's output; no more output is
	// accepted once the job's status is Complete.
	AppendOutput(jobID string, output string)

	// Sets a job's status and triggers any associated workflow,
	// e.g. if the status is Complete, the job is flushed to the associated
	// storage backend.
	SetJobStatus(jobID string, status JobStatus)
}

JobStore here basically abstracts away dealing with all the in-memory buffers and the storage backend so that the project command output handler doesn't have to know about them.
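
For illustration, a minimal in-memory implementation along these lines could look like the following; it reuses the Job and JobStatus types above, and the flush-on-complete wiring is an assumption about how SetJobStatus could trigger persistence:

package jobs

import "sync"

// Stand-in matching the Write method of this PR's StorageBackend.
type StorageBackend interface {
	Write(key string, logs []string) (bool, error)
}

type InMemoryJobStore struct {
	lock    sync.Mutex
	jobs    map[string]*Job
	backend StorageBackend
}

func (s *InMemoryJobStore) Get(jobID string) Job {
	s.lock.Lock()
	defer s.lock.Unlock()
	if job, ok := s.jobs[jobID]; ok {
		return *job
	}
	// Not in memory: the job is complete, so reach out to the storage
	// backend (reading and decoding elided here).
	return Job{Status: Complete}
}

func (s *InMemoryJobStore) AppendOutput(jobID string, output string) {
	s.lock.Lock()
	defer s.lock.Unlock()
	job, ok := s.jobs[jobID]
	if !ok {
		job = &Job{Status: Processing}
		s.jobs[jobID] = job
	}
	if job.Status == Complete {
		return // completed jobs accept no more output
	}
	job.Output = append(job.Output, output)
}

func (s *InMemoryJobStore) SetJobStatus(jobID string, status JobStatus) {
	s.lock.Lock()
	defer s.lock.Unlock()
	job, ok := s.jobs[jobID]
	if !ok {
		return
	}
	job.Status = status
	if status == Complete {
		// Flush to the storage backend; drop the buffer only on success.
		if persisted, err := s.backend.Write(jobID, job.Output); err == nil && persisted {
			delete(s.jobs, jobID)
		}
	}
}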

Aayyush (Author):
Yeah I like this approach better! I've made the changes accordingly.

Read(key string) io.ReadCloser

// Write logs to the storage backend
Write(key string, logs []string) (success bool, err error)
Aayyush (Author):
This returns a success bool which indicates whether the logs were successfully persisted; it makes the code path cleaner when log persistence is not configured. The NoopStorageBackend returns (false, nil), meaning the logs were not persisted but no error occurred, so we leave the output buffer as it is instead of clearing it.
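
For illustration, the Noop backend described here could be as small as this; a sketch against the StorageBackend interface in this PR, not necessarily the exact code:

package jobs

import (
	"io"
	"strings"
)

// NoopStorageBackend satisfies the StorageBackend interface while
// persisting nothing.
type NoopStorageBackend struct{}

func (s *NoopStorageBackend) IsKeyExists(key string) bool { return false }

func (s *NoopStorageBackend) Read(key string) io.ReadCloser {
	// Nothing was ever persisted, so there is nothing to read.
	return io.NopCloser(strings.NewReader(""))
}

// Write reports (false, nil): not persisted, but not an error, so the
// caller keeps the in-memory output buffer.
func (s *NoopStorageBackend) Write(key string, logs []string) (bool, error) {
	return false, nil
}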

@nishkrishnan (Contributor) left a comment:
Took a first pass and have some comments.

server/controllers/websocket/writer.go (outdated; resolved)
server/controllers/websocket/mux.go (outdated; resolved)
Comment on lines 11 to 20
type StorageBackend interface {
// Checks the backend storage for the specified key
IsKeyExists(key string) bool

// Read logs from the storage backend. The caller must close the reader
Read(key string) io.ReadCloser

// Write logs to the storage backend
Write(key string, logs []string) (success bool, err error)
}
Contributor:
If you have an interface on the read side, you don't need this; that's just too much indirection.

server/controllers/websocket/mux.go (outdated; resolved)
server/controllers/websocket/mux.go (outdated; resolved)
@nishkrishnan (Contributor) left a comment:

Looks SO much better! You're basically good to go, just have a couple of things.

server/events/project_command_runner.go (resolved)
server/jobs/job_receiver_registry.go (outdated; resolved)
server/jobs/job_receiver_registry.go (outdated; resolved)
server/jobs/job_store.go (outdated; resolved)
server/jobs/job_store.go (outdated; resolved)
server/jobs/job_store.go (outdated; resolved)
server/jobs/project_command_output_handler.go (outdated; resolved)
server/jobs/project_command_output_handler.go (resolved)
@Aayyush merged commit 59ae206 into release-v0.17.3-lyft.1 on Feb 16, 2022
@Aayyush deleted the aay/s3-log-interface branch on February 16, 2022 at 19:57