exchange add udf #2780

Merged: 1 commit merged on May 19, 2023
@@ -102,6 +102,9 @@
|`tags.fields`|list\[string\]|-|Yes|The header or column names corresponding to the properties. If the file has a header or column names, use those names directly. If a CSV file has no header, use the form `[_c0, _c1, _c2]` to represent the first, second, and third columns, and so on.|
|`tags.nebula.fields`|list\[string\]|-|Yes|The property names defined in {{nebula.name}}, in an order that matches `tags.fields` one-to-one. For example, `[_c1, _c2]` corresponding to `[name, age]` means that the second column holds the value of the property name and the third column holds the value of the property age.|
|`tags.vertex.field`|string|-|Yes|The column of vertex IDs. For example, when a CSV file has no header, `_c0` indicates that the values of the first column are used as vertex IDs.|
|`tags.vertex.udf.separator`|string|-|No|When merging multiple columns by a custom rule, this parameter specifies the separator used to join them.|
|`tags.vertex.udf.oldColNames`|list|-|No|When merging multiple columns by a custom rule, this parameter specifies the names of the columns to merge. Separate multiple column names with commas (,).|
|`tags.vertex.udf.newColName`|string|-|No|When merging multiple columns by a custom rule, this parameter specifies the name of the new column.|
|`tags.batch`|int|`256`|Yes|The maximum number of vertices written to {{nebula.name}} in a single batch.|
|`tags.partition`|int|`32`|Yes|The number of Spark partitions.|
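The three `udf` parameters above describe a simple column-merge rule: join the values of the `oldColNames` columns with `separator` and expose the result under `newColName`. A minimal Python sketch of that semantics (the function name and the dict-based row are illustrative only, not Exchange's actual implementation):

```python
def merge_columns(row, old_col_names, separator, new_col_name):
    """Join the values of old_col_names with separator and store the
    result in a new column, mirroring the udf.* parameters above."""
    merged = dict(row)  # leave the input row untouched
    merged[new_col_name] = separator.join(str(row[c]) for c in old_col_names)
    return merged

# A header-less CSV row where columns _c0 and _c1 together form the vertex ID.
row = {"_c0": "player", "_c1": "100", "_c2": "Tim"}
merged = merge_columns(row, ["_c0", "_c1"], "_", "vid")
print(merged["vid"])  # player_100
```

This corresponds to `separator:"_"`, `oldColNames:[_c0,_c1]`, and `newColName:vid` in the configuration file.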

15 changes: 15 additions & 0 deletions docs-2.0/nebula-exchange/use-exchange/ex-ug-export-from-nebula.md
@@ -229,6 +229,11 @@ CentOS 7.9.2009
nebula.fields: [target_nebula-field-0, target_nebula-field-1, target_nebula-field-2]
limit:10000
vertex: _vertexId # must be `_vertexId`
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
batch: 2000
partition: 60
}
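Enabled (uncommented), the `udf` block shown in the snippets might look like the following; the column names `field-0`, `field-1`, `field-2`, and `new-field` are placeholders from the example, to be replaced with real column names:

```conf
udf: {
  separator: "_"
  oldColNames: [field-0, field-1, field-2]
  newColName: new-field
}
```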
@@ -250,7 +255,17 @@
nebula.fields: [target_nebula-field-0, target_nebula-field-1, target_nebula-field-2]
limit:1000
source: _srcId # must be `_srcId`
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
target: _dstId # must be `_dstId`
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
ranking: source_nebula-field-2
batch: 2000
partition: 60
@@ -168,6 +168,11 @@
# Specify a column in the table as the source of vertex VIDs in {{nebula.name}}.
vertex: {
field:playerid
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
# policy:hash
}

@@ -238,11 +243,21 @@
# In source, specify a column of the follow table as the data source of the edge's source vertices.
source: {
field:src_player
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# In target, specify a column of the follow table as the data source of the edge's destination vertices.
target: {
field:dst_player
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# Specify a column as the source of rank (optional).
15 changes: 15 additions & 0 deletions docs-2.0/nebula-exchange/use-exchange/ex-ug-import-from-csv.md
@@ -187,6 +187,11 @@
# Currently, {{nebula.name}} {{nebula.release}} supports only VIDs of the string or integer type.
vertex: {
field:_c0
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
# policy:hash
}

@@ -285,9 +290,19 @@
# Currently, {{nebula.name}} {{nebula.release}} supports only VIDs of the string or integer type.
source: {
field: _c0
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}
target: {
field: _c1
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# The specified delimiter. The default value is a comma (,).
15 changes: 15 additions & 0 deletions docs-2.0/nebula-exchange/use-exchange/ex-ug-import-from-hbase.md
@@ -199,6 +199,11 @@ ROW COLUMN+CELL
# For example, to use the rowkey as the source of VIDs, enter "rowkey".
vertex:{
field:rowkey
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# The number of data rows written to {{nebula.name}} in a single batch.
@@ -260,10 +265,20 @@ ROW COLUMN+CELL
# In target, specify a column of the follow table as the data source of the edge's destination vertices. The example uses the column dst_player.
source:{
field:rowkey
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

target:{
field:dst_player
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# Specify a column as the source of rank (optional).
15 changes: 15 additions & 0 deletions docs-2.0/nebula-exchange/use-exchange/ex-ug-import-from-hive.md
@@ -236,6 +236,11 @@ scala> sql("select playerid, teamid, start_year, end_year from basketball.serve"
# Specify a column in the table as the source of vertex VIDs in {{nebula.name}}.
vertex:{
field:playerid
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# The maximum number of data rows written to {{nebula.name}} in a single batch.
@@ -292,10 +297,20 @@ scala> sql("select playerid, teamid, start_year, end_year from basketball.serve"
# In target, specify a column of the follow table as the data source of the edge's destination vertices.
source: {
field: src_player
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

target: {
field: dst_player
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# Specify a column as the source of rank (optional).
15 changes: 15 additions & 0 deletions docs-2.0/nebula-exchange/use-exchange/ex-ug-import-from-jdbc.md
@@ -226,6 +226,11 @@ mysql> desc serve;
# Specify a column in the table as the source of vertex VIDs in {{nebula.name}}.
vertex: {
field:playerid
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# The number of data rows written to {{nebula.name}} in a single batch.
@@ -304,10 +309,20 @@ mysql> desc serve;
# In target, specify a column of the follow table as the data source of the edge's destination vertices.
source: {
field: src_player
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

target: {
field: dst_player
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# Specify a column as the source of rank (optional).
15 changes: 15 additions & 0 deletions docs-2.0/nebula-exchange/use-exchange/ex-ug-import-from-json.md
@@ -215,6 +215,11 @@
# Currently, {{nebula.name}} {{nebula.release}} supports only VIDs of the string or integer type.
vertex: {
field:id
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# The maximum number of vertices written to {{nebula.name}} in a single batch.
@@ -297,9 +302,19 @@
# Currently, {{nebula.name}} {{nebula.release}} supports only VIDs of the string or integer type.
source: {
field: src
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}
target: {
field: dst
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# Specify a column as the source of rank (optional).
15 changes: 15 additions & 0 deletions docs-2.0/nebula-exchange/use-exchange/ex-ug-import-from-kafka.md
@@ -152,6 +152,11 @@
# The key here duplicates the key above, indicating that the key is used both as the VID and as the value of the property name.
vertex:{
field:personId
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# The number of data rows written to {{nebula.name}} in a single batch.
@@ -213,10 +218,20 @@
# In target, specify a column of the topic as the data source of the edge's destination vertices.
source:{
field:srcPersonId
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

target:{
field:dstPersonId
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# Specify a column as the source of rank (optional).
@@ -179,6 +179,11 @@
# Specify a column in the table as the source of vertex VIDs in {{nebula.name}}.
vertex:{
field: playerid
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# The number of data rows written to {{nebula.name}} in a single batch.
@@ -259,11 +264,21 @@
# In source, specify a column of the follow table as the data source of the edge's source vertices.
source:{
field: src_player
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# In target, specify a column of the follow table as the data source of the edge's destination vertices.
target:{
field: dst_player
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# Specify a column as the source of rank (optional).
15 changes: 15 additions & 0 deletions docs-2.0/nebula-exchange/use-exchange/ex-ug-import-from-mysql.md
@@ -204,6 +204,11 @@ mysql> desc serve;
# Specify a column in the table as the source of vertex VIDs in {{nebula.name}}.
vertex: {
field:playerid
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# The number of data rows written to {{nebula.name}} in a single batch.
@@ -273,10 +278,20 @@ mysql> desc serve;
# In target, specify a column of the follow table as the data source of the edge's destination vertices.
source: {
field: src_player
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

target: {
field: dst_player
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}

# Specify a column as the source of rank (optional).
15 changes: 15 additions & 0 deletions docs-2.0/nebula-exchange/use-exchange/ex-ug-import-from-neo4j.md
@@ -182,6 +182,11 @@ To read Neo4j data, Exchange needs to complete the following tasks:
nebula.fields: [age,name]
vertex: {
field:id
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}
partition: 10
batch: 1000
@@ -229,9 +234,19 @@ To read Neo4j data, Exchange needs to complete the following tasks:
nebula.fields: [degree]
source: {
field: src
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}
target: {
field: dst
# udf:{
# separator:"_"
# oldColNames:[field-0,field-1,field-2]
# newColName:new-field
# }
}
#ranking: rank
partition: 10