Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(plugin): new jobs plugin #726

Merged
merged 122 commits into from
Aug 12, 2021
Merged

feat(plugin): new jobs plugin #726

merged 122 commits into from
Aug 12, 2021

Conversation

rustatian
Copy link
Member

@rustatian rustatian commented Jun 15, 2021

Reason for This PR

closes: #403

Description of Changes

  • New SQS jobs based on the updated AWS SDK to v2 Link.
  • PQ is based on the binary heaps algorithm with max_queue_len option. In the future will be replaced with the: https://arxiv.org/pdf/1407.3377.pdf
  • Approx. JPS (jobs per second) with the echo worker and Batch endpoint (5950x, 64Gb RAM, nvme 980PRO, Linux (5.12.15), 64 pollers (configuration), and 10 workers):
    1. Ephemeral - 350k JPS
    2. RabbitMQ - Push: 45k JPS, Consume (directly from RabbitMQ): approx. 100k JPS (RabbitMQ in docker), Push (1000 concurrent connections) + Consume: 80k JPS. Push operation also included delayed jobs.
    3. SQS via the docker-ed ElasticMQ (SQS compatible, because SQS doesn't provide official Docker images): 1-2k JPS. Haven't tested on the real amazon SQS instance.
    4. Beanstalk - v0.12, docker. ~10k JPS.
  • Now it's possible to declare pipelines in the RUNTIME.
  • SQS, AMQP, Beanstalk drivers support redial with backoff.
  • Architecture diagrams are located in the https://github.com/spiral/roadrunner/blob/master/plugins/jobs/doc/jobs_arch.drawio.
  • Durability tests with near-to-real situations (connections drops, latency, etc) using Toxicproxy.
  • Replace third-party amqp091 implementation with the official amqp091 provided by the rabbitmq core team.
  • New protocol to control the execution from the worker side: https://github.com/spiral/roadrunner/blob/master/plugins/jobs/doc/response_protocol.md

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the MIT license.

PR Checklist

[Author TODO: Meet these criteria.]
[Reviewer TODO: Verify that these criteria are met. Request changes if not]

Implementation progress:

  • Move old sources to the RR2 and remove all unneeded files/folders.
  • Drivers' design
  • Root plugin design
  • Configuration section
  • Root plugin draft implementation
  • Update jobs event to be compatible with RR2
  • Algorithms for the pipelines (skip list, bst ???), wildcards, dispatcher.
  • RPC protobuf schema (v1beta)
  • Limit the number of concurrent Jobs per driver. Add option to the configuration: pq_prefetch (priority queue prefetch). So at every moment of time, each driver would have no more than pq_prefethc number of jobs in the priority queue.
  • Jobs pause, reset, resume.
  • Pluggable queues design and draft implementation (amqp,sqs, beanstalk, memory)
  • Toxicproxy for the durability tests.
  • amqp, beanstalk,ephemeral (memory),sqs in initial implementation. nast, nsq, etc - later
  • Protocol to send commands to the server. For example, to Nack failed job, or to store UUID of the job and notify the user when the job is completed.
  • Binary heaps PQ.
  • Update PQ to Strengthened Lazy Heaps (https://arxiv.org/pdf/1407.3377.pdf) [💡FEATURE REQUEST]: Update PQ algorithm #758
  • Adopt CFS algorithm (or other algo from this family) to the PQ. [💡FEATURE REQUEST]: Adopt CFS algorithm (or other algo from this family) to the PQ #759
  • Metrics. Custom exporter for the workers and general observers for the req/errors. [RR2, JOBS, METRICS] Expose jobs metrics #760
  • Stat() RPC method to provide internal jobs statistic. [RR2, JOBS, METRICS] Expose Stats RPC call #761

# AMQP jobs driver
#
# This option is required to use AMQP driver
amqp:
  # AMQP Uri to connect to the rabbitmq server https://www.rabbitmq.com/uri-spec.html
  #
  # This option is required for the production. Default: amqp://guest:[email protected]:5672
  addr: amqp://guest:[email protected]:5672/

# Beanstalk jobs driver
#
# This option is required to use Beanstalk driver
beanstalk:
  # Beanstalk address
  #
  # This option is required for the production. Default: tcp://127.0.0.1:11300
  addr: tcp://127.0.0.1:11300

  # Beanstalk connect timeout.
  #
  # Default: 30s
  timeout: 10s

# SQS jobs driver (https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html)
#
# This option is required to use SQS driver
sqs:
  # AccessKey ID
  #
  # This option is required for the production. Default: empty
  key: api-key

  # Secret access key
  #
  # This option is required for the production. Default: empty
  secret: api-secret

  # AWS region
  #
  # This option is required for the production. Default: empty
  region: us-west-1

  # AWS session token
  #
  # This option is required for the production. Default: empty
  session_token: test

  # AWS SQS endpoint to connect
  #
  # This option is required for the production. Default: http://127.0.0.1:9324
  endpoint: http://127.0.0.1:9324

jobs:
  # Number of threads which will try to obtain the job from the priority queue
  #
  # Default: number of the logical CPU cores
  num_pollers: 32

  # Size of the internal priority queue
  #
  # Default: 1_000_000
  pipeline_size: 100000

  # worker pool configuration
  pool:
    num_workers: 10
    max_jobs: 0
    allocate_timeout: 60s
    destroy_timeout: 60s

  # List of broker pipelines associated with the drivers.
  #
  # This option is not required since you can declare pipelines in the runtime. Pipeline driver should exist.
  pipelines:
    # Pipeline name
    #
    # This option is required when defining pipelines via configuration.
    test-local:
      # Driver associated with the pipeline
      #
      # This option is required. Possible values: amqp, ephemeral, sqs, beanstalk
      driver: ephemeral
      # Pipeline priority
      #
      # If the job has priority set to 0, it will inherit the pipeline's priority. Default: 10.
      priority: 10
      # Number of job to prefetch from the driver.
      #
      # Default: 100_000.
      prefetch: 10000

    test-local-2:
      # Driver name
      #
      # This option is required.
      driver: amqp
      # QoS - prefetch.
      #
      # Default: 10
      prefetch: 10
      # Queue name
      #
      # Default: default
      queue: test-1-queue
      # Pipeline jobs priority, 1 - highest
      #
      # Default: 10
      priority: 1
      # Exchange name
      #
      # Default: amqp.default
      exchange: default
      # Exchange type
      #
      # Default: direct.
      exchange_type: direct
      # Routing key for the queue
      #
      # Default: empty.
      routing_key: test
      # Declare a queue exclusive at the exchange
      #
      # Default: false
      exclusive: false
      # When multiple is true, this delivery and all prior unacknowledged deliveries
      # on the same channel will be acknowledged.  This is useful for batch processing
      # of deliveries
      #
      # Default:false
      multiple_ack: false
      # When multiple is true, this delivery and all prior unacknowledged deliveries
      # on the same channel will be acknowledged.  This is useful for batch processing
      # of deliveries
      #
      # Default: false
      requeue_on_fail: false


    test-local-3:
      # Driver name
      #
      # This option is required.
      driver: beanstalk
      # Pipeline jobs priority, 1 - highest
      #
      # Default: 10
      priority: 11
      # Beanstalk internal tube priority
      #
      # Default: 1
      tube_priority: 1
      # Tube name
      #
      # Default: default
      tube: default-1
      # If no job is available before this timeout has passed, Reserve returns a ConnError recording ErrTimeout.
      #
      # Default: 5s
      reserve_timeout: 10s

    test-local-4:
      # Driver name
      #
      # This option is required.
      driver: sqs
      # Number of jobs to prefetch from the SQS. mazon SQS never returns more messages than this value
      # (however, fewer messages might be returned). Valid values: 1 to 10.
      #
      # Default: 10
      prefetch: 10
      # The duration (in seconds) that the received messages are hidden from subsequent
      #	retrieve requests after being retrieved by a ReceiveMessage request
      #
      # Default: 0
      visibility_timeout: 0
      # The duration (in seconds) for which the call waits for a message to arrive
      #	in the queue before returning. If a message is available, the call returns
      #	sooner than WaitTimeSeconds. If no messages are available and the wait time
      #	expires, the call returns successfully with an empty list of messages.
      #
      # Default: 0
      wait_time_seconds: 0
      # Queue name.
      #
      # Default: default
      queue: default
      # List of the AWS SQS attributes https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_SetQueueAttributes.html.
      attributes:
        DelaySeconds: 0
        MaximumMessageSize: 262144
        MessageRetentionPeriod: 345600
        ReceiveMessageWaitTimeSeconds: 0
        VisibilityTimeout: 30
      # Tags don't have any semantic meaning. Amazon SQS interprets tags as character
      #	strings.
      tags:
        test: "tag"


  # list of pipelines to be consumed by the server, keep empty if you want to start consuming manually
  consume: [ "test-local", "test-local-2", "test-local-3", "test-local-4" ]

Proto

syntax = "proto3";

package jobs.v1beta;
option go_package = "./;jobsv1beta";

// single job request
message PushRequest {
    Job job = 1;
}

// batch jobs request
message PushBatchRequest {
    repeated Job jobs = 1;
}

// request to pause/resume/list/Destroy
message Pipelines {
    repeated string pipelines = 1;
}

// some endpoints receives nothing
// all endpoints returns nothing, except error
message Empty {}

message DeclareRequest {
    map<string, string> pipeline = 1;
}

message Job {
    string job = 1;
    string id = 2;
    string payload = 3;
    map<string, HeaderValue> headers = 5;
    Options options = 4;
}

message Options {
    int64 priority = 1;
    string pipeline = 2;
    int64 delay = 3;
}

message HeaderValue {
    repeated string value = 1;
}

Dynamic pipeline declaration request sample:

	pipe := &jobsv1beta.DeclareRequest{Pipeline: map[string]string{
		"driver":          "amqp",
		"name":            "test-3",
		"routing-key":     "test-3",
		"queue":           "default",
		"exchange-type":   "direct",
		"exchange":        "amqp.default",
		"prefetch":        "100",
		"priority":        "3",
		"exclusive":       "true",
		"multiple_ask":    "true",
		"requeue_on_fail": "true",
	}}

  • All commits in this PR are signed (git commit -s).
  • The reason for this PR is clearly provided (issue no. or explanation).
  • The description of changes is clear and encompassing.
  • Any required documentation changes (code and docs) are included in this PR.
  • Any user-facing changes are mentioned in CHANGELOG.md.
  • All added/changed functionality is tested.

Signed-off-by: Valery Piashchynski <[email protected]>
@rustatian rustatian added C-enhancement Category: enhancement. Meaning improvements of current module, transport, etc.. A-plugin Area: module labels Jun 15, 2021
@rustatian rustatian added this to the 2.4.0 milestone Jun 15, 2021
@rustatian rustatian requested a review from wolfy-j June 15, 2021 19:15
@rustatian rustatian self-assigned this Jun 15, 2021
@lgtm-com
Copy link

lgtm-com bot commented Jun 15, 2021

This pull request introduces 1 alert when merging d4c92e4 into 9dc98d4 - view on LGTM.com

new alerts:

  • 1 for Missing error check

@codecov
Copy link

codecov bot commented Jun 15, 2021

Codecov Report

Merging #726 (ecbfc5c) into master (cea3f6a) will decrease coverage by 1.37%.
The diff coverage is 60.12%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #726      +/-   ##
==========================================
- Coverage   67.64%   66.26%   -1.38%     
==========================================
  Files          94      126      +32     
  Lines        4846    10337    +5491     
==========================================
+ Hits         3278     6850    +3572     
- Misses       1154     2881    +1727     
- Partials      414      606     +192     
Impacted Files Coverage Δ
common/pubsub/psmessage.go 0.00% <ø> (ø)
pkg/events/general.go 76.47% <ø> (+6.47%) ⬆️
pkg/events/jobs_events.go 0.00% <0.00%> (ø)
pkg/pool/config.go 83.33% <ø> (+1.51%) ⬆️
plugins/broadcast/plugin.go 63.93% <ø> (+3.69%) ⬆️
plugins/broadcast/rpc.go 69.81% <ø> (+7.31%) ⬆️
plugins/http/attributes/attributes.go 69.44% <ø> (-0.13%) ⬇️
plugins/http/config/ssl.go 57.14% <ø> (ø)
plugins/informer/rpc.go 73.91% <0.00%> (-26.09%) ⬇️
plugins/kv/drivers/boltdb/plugin.go 75.86% <0.00%> (+3.63%) ⬆️
... and 190 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 67db4b5...ecbfc5c. Read the comment docs.

- Update Arch diagramm

Signed-off-by: Valery Piashchynski <[email protected]>
@lgtm-com
Copy link

lgtm-com bot commented Jun 16, 2021

This pull request introduces 1 alert when merging cee4bc4 into 9dc98d4 - view on LGTM.com

new alerts:

  • 1 for Missing error check

@lgtm-com
Copy link

lgtm-com bot commented Jun 21, 2021

This pull request introduces 1 alert when merging bdcfdd2 into 87d023d - view on LGTM.com

new alerts:

  • 1 for Missing error check

Create a config sample with RR2 support. Progress on root JOBS plugin.

Signed-off-by: Valery Piashchynski <[email protected]>
@lgtm-com
Copy link

lgtm-com bot commented Jun 21, 2021

This pull request introduces 1 alert when merging 41bb9fa into 87d023d - view on LGTM.com

new alerts:

  • 1 for Missing error check

- Remove old PHP tests

Signed-off-by: Valery Piashchynski <[email protected]>
@lgtm-com
Copy link

lgtm-com bot commented Jun 22, 2021

This pull request introduces 1 alert when merging 260d69c into 87d023d - view on LGTM.com

new alerts:

  • 1 for Missing error check

rustatian added 12 commits June 22, 2021 11:44
- Update tests

Signed-off-by: Valery Piashchynski <[email protected]>
- Initial ephemeral broker commit
- Initial RPC

Signed-off-by: Valery Piashchynski <[email protected]>
Signed-off-by: Valery Piashchynski <[email protected]>
- Add binary heap mock
- Connect first sub-plugin (ephemeral) with root jobs plugin

Signed-off-by: Valery Piashchynski <[email protected]>
Signed-off-by: Valery Piashchynski <[email protected]>
Signed-off-by: Valery Piashchynski <[email protected]>
Signed-off-by: Valery Piashchynski <[email protected]>
rustatian added 13 commits July 24, 2021 12:21
Signed-off-by: Valery Piashchynski <[email protected]>
Signed-off-by: Valery Piashchynski <[email protected]>
Signed-off-by: Valery Piashchynski <[email protected]>
branch to handle dead workers inside the channel.
Update docker-compose.yaml used for the tests. Update rabbitmq to 3.9.1.
Replace third-party amqp091 with the official implementation.

Signed-off-by: Valery Piashchynski <[email protected]>
spawned goroutine might stuck on the channel send operation and leak
memory.

Signed-off-by: Valery Piashchynski <[email protected]>
Signed-off-by: Valery Piashchynski <[email protected]>
Signed-off-by: Valery Piashchynski <[email protected]>
Signed-off-by: Valery Piashchynski <[email protected]>
logger. Fix bugs discovered during testing.

Signed-off-by: Valery Piashchynski <[email protected]>
Signed-off-by: Valery Piashchynski <[email protected]>
jobs_ok.php/jobs_err.php workers.

Signed-off-by: Valery Piashchynski <[email protected]>
@rustatian rustatian changed the title [WIP] feat(plugin): new jobs plugin feat(plugin): new jobs plugin Aug 12, 2021
@rustatian rustatian marked this pull request as ready for review August 12, 2021 11:20
@rustatian rustatian merged commit df27287 into master Aug 12, 2021
@bors bors bot deleted the feature/jobs_plugin branch August 12, 2021 12:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-plugin Area: module C-enhancement Category: enhancement. Meaning improvements of current module, transport, etc..
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RR2] Update Jobs plugin
1 participant