specific language governing permissions and limitations
under the License.
-->
# Introduction
Kubernetes is a popular container-orchestration system for automating computer application deployment, scaling, and management.
Flink's native Kubernetes integration allows you to directly deploy Flink on a running Kubernetes cluster.
Moreover, Flink is able to dynamically allocate and de-allocate TaskManagers depending on the required resources because it can directly talk to Kubernetes.
Apache Flink also provides a Kubernetes operator for managing Flink clusters on Kubernetes. It supports both standalone and native deployment modes and greatly simplifies deployment, configuration, and lifecycle management of Flink resources on Kubernetes.
For more information, please refer to the [Flink Kubernetes Operator documentation](https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/concepts/overview/).
The doc assumes a running Kubernetes cluster that fulfills the following requirements:
- Kubernetes >= 1.9.
- KubeConfig, which has access to list, create, and delete pods and services, configurable via `~/.kube/config`. You can verify permissions by running `kubectl auth can-i <list|create|edit|delete> pods`.
- Enabled Kubernetes DNS.
- `default` service account with [RBAC](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#rbac) permissions to create, delete pods.
Flink runs on all UNIX-like environments, i.e. Linux, Mac OS X, and Cygwin (for Windows).
You can refer to the [overview]({{< ref "docs/connectors/pipeline-connectors/overview" >}}) to check supported versions, download [the binary release](https://flink.apache.org/downloads/) of Flink, and then extract the archive:
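The archive name depends on the Flink version and Scala suffix you downloaded; for example, for a 1.18.0 release:

```shell
# Extract the downloaded Flink binary release and enter the directory
tar -xzf flink-1.18.0-bin-scala_2.12.tgz
cd flink-1.18.0
```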
{{< hint info >}}
Please refer to the [Flink documentation](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#accessing-flinks-web-ui) to expose Flink's Web UI and REST endpoint.
Make sure that the REST endpoint can be accessed from the node where you submit jobs.
{{< /hint >}}
Then, you need to add these two configs to your `flink-conf.yaml`:
```yaml
rest.bind-port: {{REST_PORT}}
rest.address: {{NODE_IP}}
```
`{{REST_PORT}}` and `{{NODE_IP}}` should be replaced with the actual port and address of your JobManager Web Interface.
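For example, if the JobManager's REST server listens on port 8081 on a node reachable at 10.0.0.8 (illustrative values, not defaults you should rely on):

```yaml
rest.bind-port: 8081
rest.address: 10.0.0.8
```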
### Set up Flink CDC
Download the tar file of Flink CDC from the [release page](https://github.com/apache/flink-cdc/releases), then extract the archive:
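For example, assuming the 3.1.0 release archive was downloaded (adjust the version to match yours):

```shell
# Extract the Flink CDC distribution and enter the directory
tar -xzf flink-cdc-3.1.0-bin.tar.gz
cd flink-cdc-3.1.0
```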
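With a Flink cluster reachable as configured above and `FLINK_HOME` set, a pipeline definition such as `mysql-to-doris.yaml` is submitted with the script shipped in the Flink CDC distribution (paths assume the extracted 3.1.0 archive):

```shell
# Submit the pipeline definition to the remote Flink cluster
cd flink-cdc-3.1.0
./bin/flink-cdc.sh mysql-to-doris.yaml
```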
Then you can find a job named `Sync MySQL Database to Doris` running in the Flink Web UI.
## Kubernetes Operator Mode
The doc assumes that a [Flink Kubernetes Operator](https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/concepts/overview/) has been deployed on your K8s cluster; then you only need to build a Docker image of Flink CDC.
### Build a custom Docker image
1. Download the tar file of Flink CDC and the needed connectors from the [release page](https://github.com/apache/flink-cdc/releases), then move them to the Docker image build directory.
Assuming your Docker image build directory is `/opt/docker/flink-cdc`, the structure of this directory is as follows:
```text
/opt/docker/flink-cdc
├── flink-cdc-3.1.0-bin.tar.gz
├── flink-cdc-pipeline-connector-doris-3.1.0.jar
├── flink-cdc-pipeline-connector-mysql-3.1.0.jar
├── mysql-connector-java-8.0.27.jar
└── ...
```
2. Create a Dockerfile to build a custom image from the `flink` official image and add Flink CDC dependencies.
```dockerfile
FROM flink:1.18.0-java8
ADD *.jar $FLINK_HOME/lib/
ADD flink-cdc*.tar.gz $FLINK_HOME/
RUN mv $FLINK_HOME/flink-cdc-3.1.0/lib/flink-cdc-dist-3.1.0.jar $FLINK_HOME/lib/
```
Finally, the structure is as follows:
```text
/opt/docker/flink-cdc
├── Dockerfile
├── flink-cdc-3.1.0-bin.tar.gz
├── flink-cdc-pipeline-connector-doris-3.1.0.jar
├── flink-cdc-pipeline-connector-mysql-3.1.0.jar
├── mysql-connector-java-8.0.27.jar
└── ...
```
3. Build the custom Docker image and push it.
```bash
docker build -t flink-cdc-pipeline:3.1.0 .
docker push flink-cdc-pipeline:3.1.0
```
### Create a ConfigMap for mounting Flink CDC configuration files
Here is an example file; please change the connection parameters to your actual values:
```yaml
---
apiVersion: v1
data:
  flink-cdc.yaml: |-
    parallelism: 4
    schema.change.behavior: EVOLVE
  mysql-to-doris.yaml: |-
    source:
      type: mysql
      hostname: localhost
      port: 3306
      username: root
      password: 123456
      tables: app_db.\.*
      server-id: 5400-5404
      server-time-zone: UTC
    sink:
      type: doris
      fenodes: 127.0.0.1:8030
      username: root
      password: ""
    pipeline:
      name: Sync MySQL Database to Doris
      parallelism: 2
kind: ConfigMap
metadata:
  name: flink-cdc-pipeline-configmap
```
### Create a FlinkDeployment YAML
Here is an example file `flink-cdc-pipeline-job.yaml`:
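A minimal sketch of such a file, mounting the ConfigMap from the previous step. The image tag, mount path, jar location, and entry class here follow the build steps earlier in this doc but are assumptions; adjust them to your environment:

```yaml
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: flink-cdc-pipeline-job
spec:
  image: flink-cdc-pipeline:3.1.0
  flinkVersion: v1_18
  flinkConfiguration:
    classloader.resolve-order: parent-first
  serviceAccount: flink
  jobManager:
    resource:
      cpu: 1
      memory: 1024m
  taskManager:
    resource:
      cpu: 1
      memory: 1024m
  podTemplate:
    spec:
      containers:
        - name: flink-main-container
          volumeMounts:
            - name: flink-cdc-pipeline-config
              mountPath: /opt/flink/flink-cdc-3.1.0/conf
      volumes:
        - name: flink-cdc-pipeline-config
          configMap:
            name: flink-cdc-pipeline-configmap
  job:
    # jar was moved to $FLINK_HOME/lib in the Dockerfile above
    jarURI: local:///opt/flink/lib/flink-cdc-dist-3.1.0.jar
    entryClass: org.apache.flink.cdc.cli.CliFrontend
    args:
      - --use-mini-cluster
      - /opt/flink/flink-cdc-3.1.0/conf/mysql-to-doris.yaml
    parallelism: 1
    upgradeMode: stateless
```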
{{< hint info >}}
1. Due to Flink's class loading mechanism, the `classloader.resolve-order` option must be set to `parent-first`.
2. Flink CDC submits jobs to a remote Flink cluster by default; in Operator mode you should start a standalone Flink cluster inside the pod by passing `--use-mini-cluster`.
{{< /hint >}}
### Submit a Flink CDC Job
After the ConfigMap and FlinkDeployment YAML are created, you can submit the Flink CDC job to the Operator through kubectl like:
```bash
kubectl apply -f flink-cdc-pipeline-job.yaml
```
After successful submission, the output is as follows:
```shell
flinkdeployment.flink.apache.org/flink-cdc-pipeline-job created
```
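The deployment's status and JobManager logs can then be inspected with kubectl; the resource and deployment name match the `metadata.name` of your FlinkDeployment (`flink-cdc-pipeline-job` here is an assumption):

```shell
# Check the custom resource managed by the operator
kubectl get flinkdeployment flink-cdc-pipeline-job
# Follow the JobManager logs
kubectl logs -f deployment/flink-cdc-pipeline-job
```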
If you want to trace the logs or expose the Flink Web UI, please refer to the [Flink Kubernetes Operator documentation](https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/concepts/overview/).
{{< hint info >}}
Please note that submitting with **native application mode** is not supported for now.
{{< /hint >}}