Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add AWS SQS publisher and example job file. #431

Closed
wants to merge 10 commits into from

Conversation

Wonong
Copy link
Contributor

@Wonong Wonong commented Jan 17, 2021

Summary of Changes

  • AWS SQS publisher takes two folders of csv files(nodes and relations) and publish it to AWS SQS queue
  • Common expected usage is running job on client-side cluster(environment of DBMS) and push extracted metadata to message queue
  • Its message format is like below.
{
  "nodes": [
    {
      "LABEL": "...",
      "KEY": "...",
      "description_source": "...",
      "description": "..."
    }, ...
  ],
  "relations": [ 
    {
      "START_KEY": "...",
      "START_LABEL": "...",
      "END_KEY": "...",
      "END_LABEL": "...",
      "TYPE": "...",
      "REVERSE_TYPE": "..."
    }, ...
  ]
}

Tests

Test it using container executing sample job periodically using crontab and it works.
test repo: https://github.com/Wonong/ab-metadata-pusher

Documentation

https://github.com/Wonong/ab-metadata-pusher

CheckList

Make sure you have checked all steps below to ensure a timely review.

  • PR title addresses the issue accurately and concisely. Example: "Updates the version of Flask to v1.0.2"
  • PR includes a summary of changes.
  • PR adds unit tests, updates existing unit tests, OR documents why no test additions or modifications are needed.
  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does
  • PR passes make test

Wonyeong Choi and others added 5 commits January 17, 2021 16:52
Signed-off-by: Wonyeong Choi <[email protected]>
Signed-off-by: Wonyeong Choi <[email protected]>
Signed-off-by: Wonyeong Choi <[email protected]>
Signed-off-by: Wonyeong Choi <[email protected]>
Signed-off-by: Wonyeong Choi <[email protected]>
Signed-off-by: Wonyeong Choi <[email protected]>
Signed-off-by: Wonyeong Choi <[email protected]>
@feng-tao
Copy link
Member

besides unit test, is the pr ready for review? exciting about supporting sqs publisher in databuilder etl pattern!

@Wonong
Copy link
Contributor Author

Wonong commented Jan 26, 2021

This pr is ready. but, I want you to review both this pr and the other pr which implement extractor which extracts metadata from sqs. I'm currently working on extractor PR.
So, I will reopen this PR when I finish extractor PR and also add unit tests of this PR as you said :)

@Wonong Wonong closed this Jan 26, 2021
@feng-tao
Copy link
Member

sg, thanks @Wonong

@feng-tao
Copy link
Member

@Wonong just to check, have you had the chance to continue the pr?

@feng-tao feng-tao reopened this Feb 26, 2021
@feng-tao feng-tao added the keep fresh Disables stalebot from closing an issue label Feb 26, 2021
@Wonong
Copy link
Contributor Author

Wonong commented Feb 26, 2021

@feng-tao sorry for no progress... I had been busy these days. but, I can continue the pr soon if you don't mind

@feng-tao
Copy link
Member

@Wonong that would be great!

Signed-off-by: Wonyeong Choi <[email protected]>
@Wonong
Copy link
Contributor Author

Wonong commented Mar 4, 2021

@feng-tao
I just wonder your opinion.
How about update job class to be able to create a non-task job(only publisher)?
I think it is easy to update. but original concept of job(described in readme) say publisher is only optional.
Below is simple implementation of none-task available job class. I think implementing a NoopTask like NoopPublisher works too.
https://github.com/Wonong/amundsendatabuilder/blob/non-task-job/databuilder/job/job.py#L64

@dorianj
Copy link
Contributor

dorianj commented Apr 20, 2021

Hey @Wonong, I'd suggest using a NoopPublisher would work fine here.

We're now in a code-freeze period as part of the monorepo transition. If you can get this landed in the next few days, I'll cleanly transfer it over to the new repo. If not, it'll need to be manually re-opened against the new repo.

@Wonong
Copy link
Contributor Author

Wonong commented Apr 23, 2021

Hey @Wonong, I'd suggest using a NoopPublisher would work fine here.

We're now in a code-freeze period as part of the monorepo transition. If you can get this landed in the next few days, I'll cleanly transfer it over to the new repo. If not, it'll need to be manually re-opened against the new repo.

sorry for late answer. I think i need more time. so, i will manually open it if needed. thank you for inform :)

@dorianj
Copy link
Contributor

dorianj commented Apr 25, 2021

OK, happy to review on the monorepo :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
keep fresh Disables stalebot from closing an issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants