Optimize doc quickstart deployment #41

Merged
merged 4 commits into from
Oct 25, 2022
80 changes: 73 additions & 7 deletions README.md
@@ -42,7 +42,13 @@ The latest version of bitsail has the following minimal requirements:
<tr>
<td>Hadoop</td>
<td>-</td>
<td>❎</td>
<td>✅</td>
<td> </td>
</tr>
<tr>
<td>Hbase</td>
<td>-</td>
<td>✅</td>
<td>✅</td>
</tr>
<tr>
@@ -58,10 +64,40 @@ The latest version of bitsail has the following minimal requirements:
<td>✅</td>
</tr>
<tr>
<td>StreamingFile(Hadoop Streaming mode.)</td>
<td>RocketMQ</td>
<td>-</td>
<td> </td>
<td>✅</td>
</tr>
<tr>
<td>StreamingFile (Hadoop Streaming mode.)</td>
<td>-</td>
<td> </td>
<td>✅</td>
</tr>
<tr>
<td>Redis</td>
<td>-</td>
<td> </td>
<td>✅</td>
</tr>
<tr>
<td>Doris</td>
<td>-</td>
<td> </td>
<td>✅</td>
</tr>
<tr>
<td>MongoDB</td>
<td>-</td>
<td>✅</td>
<td>✅</td>
</tr>
<tr>
<td>Doris</td>
<td>-</td>
<td>❎</td>
<td>✅</td>
<td> </td>
</tr>
<tr>
<td rowspan="4">JDBC</td>
@@ -82,12 +118,12 @@ The latest version of bitsail has the following minimal requirements:
<td>Fake</td>
<td>-</td>
<td>✅</td>
<td></td>
<td> </td>
</tr>
<tr>
<td>Print</td>
<td>-</td>
<td></td>
<td> </td>
<td>✅</td>
</tr>
</table>
@@ -111,9 +147,39 @@ We also prepare a profile for `flink-embedded`, you can use follow command:
mvn clean package -pl bitsail-dist -am -Dmaven.test.skip=true -Pflink-embedded
```

## QuickStart & Architecture
After building the project, the build output has the following file structure:

``` simple
bitsail-archive-${version}-SNAPSHOT
    /bin
        /bitsail #Startup script
    /conf
        /bitsail.conf #bitsail system config
    /embedded
        /flink #embedded flink
    /examples #example configuration files
        /example-datas #example data
        /Fake_xx_Example.json #Fake source to xx example config files
        /xx_Print_Example.json #xx to Print sink example config files
    /libs #jar libs
        /bitsail-core.jar #entry jar package
        /connectors #connector plugin jars
            /mapping #connector plugin config files
        /components #component jars, such as metric and dirty-collector
        /clients #bitsail client jar
```

## Environment Setup

Link to [Environment Setup](docs/env_setup.md).

## Deployment Guide

Link to [Deployment Guide](docs/deployment.md).

## Developer Guide

Reference [QuickStart](./docs/quickstart.md)
Link to [Developer Guide](docs/developer_guide.md).

## Contact

80 changes: 71 additions & 9 deletions README_zh.md
@@ -34,7 +34,13 @@
<tr>
<td>Hadoop</td>
<td>-</td>
<td>❎</td>
<td>✅</td>
<td> </td>
</tr>
<tr>
<td>Hbase</td>
<td>-</td>
<td>✅</td>
<td>✅</td>
</tr>
<tr>
@@ -50,11 +56,41 @@
<td>✅</td>
</tr>
<tr>
<td>StreamingFile(Hadoop Streaming mode.)</td>
<td>RocketMQ</td>
<td>-</td>
<td> </td>
<td>✅</td>
</tr>
<tr>
<td>StreamingFile (Hadoop Streaming mode.)</td>
<td>-</td>
<td> </td>
<td>✅</td>
</tr>
<tr>
<td>Redis</td>
<td>-</td>
<td> </td>
<td>✅</td>
</tr>
<tr>
<td>Doris</td>
<td>-</td>
<td> </td>
<td>✅</td>
</tr>
<tr>
<td>MongoDB</td>
<td>-</td>
<td></td>
<td></td>
<td>✅</td>
</tr>
<tr>
<td>Doris</td>
<td>-</td>
<td>✅</td>
<td> </td>
</tr>
<tr>
<td rowspan="4">JDBC</td>
<td>MySQL</td>
@@ -74,12 +110,12 @@
<td>Fake</td>
<td>-</td>
<td>✅</td>
<td></td>
<td> </td>
</tr>
<tr>
<td>Print</td>
<td>-</td>
<td></td>
<td> </td>
<td>✅</td>
</tr>
</table>
@@ -99,13 +135,39 @@ mvn clean package -pl bitsail-dist -am -Dmaven.test.skip=true
mvn clean package -pl bitsail-dist -am -Dmaven.test.skip=true -Pflink-embedded
```

## Quick Start
After packaging, the output directory structure is as follows:

``` simple
bitsail-archive-${version}-SNAPSHOT
    /bin
        /bitsail #Startup script
    /conf
        /bitsail.conf #bitsail system config
    /embedded
        /flink #embedded flink
    /examples #example configuration files
        /example-datas #example data
        /Fake_xx_Example.json #Fake source to xx example config files
        /xx_Print_Example.json #xx to Print sink example config files
    /libs #jar libs
        /bitsail-core.jar #entry jar package
        /connectors #connector plugin jars
            /mapping #connector plugin config files
        /components #component jars, such as metric and dirty-collector
        /clients #bitsail client jar
```

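## Environment Setup

See [Environment Setup](docs/env_setup_zh.md).

## Deployment Guide

See the [Quick Start](docs/quickstart.md) document
Link to [Deployment Guide](docs/deployment_zh.md).

## Architecture
## Developer Guide

See the [Architecture](docs/introduction.md) document
Link to [Developer Guide](docs/developer_guide_zh.md).

## Contact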

111 changes: 111 additions & 0 deletions docs/deployment.md
@@ -0,0 +1,111 @@
# Deployment Guide

> At present, ***BitSail*** only supports Flink deployment on Yarn.<br>
Support for other platforms, such as `native kubernetes`, will be released soon.

-----

Here are the contents of this part:

- [Configure Hadoop Environment](#jump_configure_hadoop)
- [Configure Flink Cluster](#jump_configure_flink)
- [Submit to Yarn](#jump_submit_to_yarn)
- [Submit an example job](#jump_submit_example)
- [Log for Debugging](#jump_log)

The step-by-step guide below walks through deploying BitSail on Yarn.

## <span id="jump_configure_hadoop">Configure Hadoop Environment</span>


To support Yarn deployment, `HADOOP_CLASSPATH` must be set as an environment variable. There are two ways to set it:

1. Set `HADOOP_CLASSPATH` directly.

2. Set `HADOOP_HOME` to the Hadoop directory in the deployment environment. The [bitsail](https://github.com/bytedance/bitsail/blob/master/bitsail-dist/src/main/archive/bin/bitsail) script then uses the following command to generate `HADOOP_CLASSPATH`.

```shell
if [ -n "$HADOOP_HOME" ]; then
export HADOOP_CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath)
fi
```
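
Before submitting a job, it can help to confirm which of the two options is actually in effect. The following is a minimal sketch of such a pre-flight check; `resolve_hadoop_classpath` is a hypothetical helper name and is not part of BitSail. It assumes the same precedence as the snippet above, where a set `HADOOP_HOME` is used to generate the classpath.

```shell
# Hypothetical pre-flight helper (not part of BitSail).
# Prints the Hadoop classpath that would be used, or fails with a message.
resolve_hadoop_classpath() {
  if [ -n "${HADOOP_HOME:-}" ] && [ -x "$HADOOP_HOME/bin/hadoop" ]; then
    # Option 2: derive the classpath from HADOOP_HOME, as the startup script does.
    "$HADOOP_HOME/bin/hadoop" classpath
  elif [ -n "${HADOOP_CLASSPATH:-}" ]; then
    # Option 1: the variable was set directly.
    printf '%s\n' "$HADOOP_CLASSPATH"
  else
    echo "Neither a usable HADOOP_HOME nor HADOOP_CLASSPATH is set" >&2
    return 1
  fi
}
```

Calling `resolve_hadoop_classpath` in the shell that will run `bin/bitsail` shows the effective classpath, or returns a non-zero status if neither variable is usable.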

## <span id="jump_configure_flink">Configure Flink Cluster</span>

After packaging, the build output contains the file [conf/bitsail.conf](https://github.com/bytedance/bitsail/blob/master/bitsail-dist/src/main/resources/bitsail.conf).
This file describes the system configuration of the deployment environment, including the Flink path and other default parameters.

Here are some frequently used options in the configuration file:


<table>
<tr>
<th>Prefix</th>
<th>Parameter name</th>
<th>Description</th>
<th>Example</th>
</tr>

<tr>
<td rowspan="3">sys.flink.</td>
<td>flink_home</td>
<td>The root directory of Flink.</td>
<td>${BITSAIL_HOME}/embedded/flink</td>
</tr>

<tr>
<td>checkpoint_dir</td>
<td>The path where the metadata file and data files of checkpoints are stored.<br/>Reference: <a href="https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/checkpoints/">Flink Checkpoints</a></td>
<td>"hdfs://opensource/bitsail/flink-1.11/checkpoints/"</td>
</tr>

<tr>
<td>flink_default_properties</td>
<td>General Flink runtime options configured by "-D".</td>
<td>{<br/>
classloader.resolve-order: "child-first"<br/>
akka.framesize: "838860800b"<br/>
rest.client.max-content-length: 838860800<br/>
rest.server.max-content-len<br/>}
</td>
</tr>
</table>


## <span id="jump_submit_to_yarn">Submit to Yarn</span>

> ***BitSail*** currently supports only the `yarn-per-job` mode of the Yarn resource provider; others, such as `native kubernetes`, will be released soon.

You can use the startup script `bin/bitsail` to submit Flink jobs to Yarn.

The command is as follows:


``` bash
bash ./bin/bitsail run --engine flink --conf [job_conf_path] --execution-mode run --queue [queue_name] --deployment-mode yarn-per-job [--priority [yarn_priority] -p/--props [name=value]]
```

Parameter description

* Required parameters
  * **queue_name**: Target Yarn queue
  * **job_conf_path**: Path of the job configuration file
* Optional parameters
  * **yarn_priority**: Job priority on Yarn
  * **name=value**: Flink properties, for example `classloader.resolve-order=child-first`
    * **name**: Property key. Configurable Flink parameters that are passed through to the Flink task.
    * **value**: Property value.
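
The parameters above can be assembled into a command line as sketched below. `build_submit_cmd` is a hypothetical wrapper, not something BitSail ships; it only prints the command so it can be inspected before running. It assumes at most one `name=value` property and arguments without spaces.

```shell
# Hypothetical wrapper (not part of BitSail).
# Usage: build_submit_cmd <job_conf_path> <queue_name> [yarn_priority] [name=value]
# Prints the submit command instead of running it. Arguments containing
# spaces are not quoted in the output, so this is for illustration only.
build_submit_cmd() {
  conf=$1
  queue=$2
  priority=${3:-}
  prop=${4:-}
  cmd="bash ./bin/bitsail run --engine flink --conf $conf --execution-mode run --queue $queue --deployment-mode yarn-per-job"
  # Optional parameters are appended only when they are non-empty.
  if [ -n "$priority" ]; then cmd="$cmd --priority $priority"; fi
  if [ -n "$prop" ]; then cmd="$cmd --props $prop"; fi
  printf '%s\n' "$cmd"
}
```

For example, `build_submit_cmd examples/Fake_Print_Example.json default` prints the command with only the required parameters filled in.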

## <span id="jump_submit_example">Submit an example job</span>
Submit a test job that reads from the Fake source and writes to the Print sink on Yarn.
``` bash
bash ./bin/bitsail run --engine flink --conf ~/bitsail-archive-1.0.0-SNAPSHOT/examples/Fake_Print_Example.json --execution-mode run -p 1=1 --deployment-mode yarn-per-job --queue default
```

## <span id="jump_log">Log for Debugging</span>

### Client side log file
Please check the `${FLINK_HOME}/log/` folder for the log file of the BitSail client.

### Yarn task log file
Please go to the Yarn WebUI to check the logs of the Flink JobManager and TaskManager.
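
As a convenience, the newest client-side log can be located from the shell. This is a minimal sketch; `latest_client_log` is a hypothetical helper, and the exact log file names depend on the Flink installation.

```shell
# Hypothetical helper (not part of BitSail).
# Prints the path of the most recently modified file in the given log directory.
latest_client_log() {
  log_dir=$1
  # ls -t sorts by modification time, newest first.
  newest=$(ls -t "$log_dir" 2>/dev/null | head -n 1)
  [ -n "$newest" ] || return 1
  printf '%s/%s\n' "$log_dir" "$newest"
}

# Usage (assumes FLINK_HOME is set):
#   tail -n 100 "$(latest_client_log "$FLINK_HOME/log")"
# If log aggregation is enabled on the cluster, Yarn task logs can also be
# fetched with the Yarn CLI instead of the WebUI:
#   yarn logs -applicationId <application_id>
```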