[Feature]Flink operator and help wanted #387

Open
jentle opened this issue Jan 24, 2021 · 0 comments
Labels
help wanted Extra attention is needed

jentle commented Jan 24, 2021

Is your feature request related to a problem? Please describe.
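An operator for submitting flink applications. However, workflow does not yet support continuous tasks, so a streaming application can only be run as a oneshot that never stops.
Code branch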

Describe the solution you'd like
See README.md

Additional context
During development I ran into the following problems, which are still unresolved.

  1. Classpath conflicts when workflow runs an operator
    When workflow runs an operator, it explicitly adds the current Java classpath to the launch command; see here:
    command.add(buildClassPath());
    command.add("com.miotech.kun.workflow.worker.local.OperatorLauncher");

    private String buildClassPath() {
        String classPath = System.getProperty("java.class.path");
        checkState(StringUtils.isNotEmpty(classPath), "launcher jar should exist.");
        return classPath;
    }

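In practice, only the uber jar containing OperatorLauncher's dependencies is needed. But even after adding only the OperatorLauncher dependencies, they still conflict with the flink operator's own dependencies (guava-3.0, required by hadoop-2.7.3), and I have not found a way to resolve this. During development I used some hardcoding to build the uber jar. For operators developed by users, the chance of jar conflicts is fairly high.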
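One common way to reduce this kind of conflict is a child-first class loader, so that classes found in the operator's own jars win over the launcher's classpath. The sketch below is only an illustration of that technique, not how workflow currently loads operators. The class name `ChildFirstClassLoader` and the prefix `com.example.operator.api.` are hypothetical; the prefix stands in for whatever API package the launcher and the operator must share.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: looks in the operator's own jars before asking the parent loader.
public class ChildFirstClassLoader extends URLClassLoader {
    // Packages that must always come from the parent (for example a shared operator API).
    private final List<String> parentFirstPrefixes;

    public ChildFirstClassLoader(URL[] urls, ClassLoader parent, String... parentFirstPrefixes) {
        super(urls, parent);
        this.parentFirstPrefixes = Arrays.asList(parentFirstPrefixes);
    }

    private boolean isParentFirst(String name) {
        if (name.startsWith("java.")) {
            return true; // JDK classes are never loaded from operator jars
        }
        for (String prefix : parentFirstPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null && !isParentFirst(name)) {
                try {
                    c = findClass(name); // search the operator jars first
                } catch (ClassNotFoundException e) {
                    // not in the operator jars; fall through to the parent
                }
            }
            if (c == null) {
                c = super.loadClass(name, false); // normal parent-first delegation
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    public static void main(String[] args) throws Exception {
        // No operator jars are given here, so every lookup falls back to the parent.
        try (ChildFirstClassLoader loader = new ChildFirstClassLoader(
                new URL[0], ChildFirstClassLoader.class.getClassLoader(), "com.example.operator.api.")) {
            Class<?> list = loader.loadClass("java.util.ArrayList");
            System.out.println(list == java.util.ArrayList.class);
        }
    }
}
```

With a loader like this, the launcher process would only need its own uber jar on `java.class.path`, and each operator jar would be passed in as a URL. Whether this resolves the guava and hadoop conflict described above would still need to be verified against the real jars.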

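  2. Resource configuration files

At runtime the flink operator depends on several hadoop configuration files. For now their contents are simply passed in as plain text through the operator config, which is rather ugly.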

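| name | type | note |
| --- | --- | --- |
| hadoopConfYarn | string | Contents of the hadoop yarn-site.xml configuration file, in XML format |
| hadoopConfCore | string | Contents of the hadoop core-site.xml configuration file, in XML format |
| hadoopConfHdfs | string | Contents of the hadoop hdfs-site.xml configuration file, in XML format |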

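I looked at spark, which has similar configuration. It is currently hard coded, which is not very safe; the file system connection information could be configured entirely through core-site.xml. As I recall, workflow used to have a concept of resource. The configuration files could be fetched through the resource-related interfaces, which would avoid exposing them directly in code.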
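Until a resource-based approach exists, the three plain-text values could at least be turned back into files before the flink client starts. The sketch below assumes the operator writes them into a temporary directory and points `HADOOP_CONF_DIR` at it for the child process. The class and method names (`HadoopConfMaterializer`, `writeConfDir`, `withHadoopConf`) are hypothetical and are not part of workflow or the flink operator.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns the three config strings into a Hadoop conf directory.
public class HadoopConfMaterializer {

    // Writes yarn-site.xml, core-site.xml and hdfs-site.xml into a fresh temp directory.
    static Path writeConfDir(String hadoopConfYarn, String hadoopConfCore, String hadoopConfHdfs)
            throws IOException {
        Path dir = Files.createTempDirectory("hadoop-conf-");
        Map<String, String> files = new LinkedHashMap<>();
        files.put("yarn-site.xml", hadoopConfYarn);
        files.put("core-site.xml", hadoopConfCore);
        files.put("hdfs-site.xml", hadoopConfHdfs);
        for (Map.Entry<String, String> e : files.entrySet()) {
            Files.write(dir.resolve(e.getKey()), e.getValue().getBytes(StandardCharsets.UTF_8));
        }
        return dir;
    }

    // Exposes the directory to a child process through the HADOOP_CONF_DIR variable.
    static ProcessBuilder withHadoopConf(ProcessBuilder builder, Path confDir) {
        builder.environment().put("HADOOP_CONF_DIR", confDir.toString());
        return builder;
    }

    public static void main(String[] args) throws IOException {
        String empty = "<configuration/>"; // placeholder content for the demo
        Path dir = writeConfDir(empty, empty, empty);
        // The command is only illustrative; the process is never started here.
        ProcessBuilder pb = withHadoopConf(new ProcessBuilder("flink", "run"), dir);
        System.out.println(pb.environment().get("HADOOP_CONF_DIR"));
    }
}
```

This only changes how the values reach the flink client. The contents still travel as plain text in the operator config, so it does not address the security concern raised above.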

Further Plan
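For now the flink operator only submits and runs flink applications; it is not integrated with workflow.
One option is to define stream tables on top of the metadata and register them in flink's metastore, then use flink sql directly. The result is also a stream table, which can be written straight to sink storage such as es, db, graph, or kafka. This still depends on how workflow itself is defined and planned.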

@jentle jentle added the help wanted Extra attention is needed label Jan 24, 2021
@jentle jentle changed the title Flink operator and help wanted [Feature]Flink operator and help wanted Jan 24, 2021
@JoshOY JoshOY pinned this issue Jan 28, 2021
@jentle jentle unpinned this issue Jan 28, 2021