From fa774c86d24e4fede1f429423bb49bb94ece12a6 Mon Sep 17 00:00:00 2001
From: Leonard Xu
Date: Fri, 6 Aug 2021 19:02:56 +0800
Subject: [PATCH] [doc] Add readme and support markdown list

---
 README.md                               | 112 ++++++++++++++++++++++++
 docs/_static/theme_overrides.css        |  10 +++
 docs/_templates/versions.html           |  20 +----
 docs/build_docs.sh                      |   4 +-
 docs/content/about.md                   |   8 +-
 docs/content/connectors/mysql-cdc.md    |  10 +--
 docs/content/connectors/postgres-cdc.md |  10 +--
 docs/content/formats/changelog-json.md  |   6 +-
 8 files changed, 142 insertions(+), 38 deletions(-)

diff --git a/README.md b/README.md
index bc4c7e4ad..a6db81771 100644
--- a/README.md
+++ b/README.md
@@ -3,4 +3,116 @@
 Flink CDC Connectors is a set of source connectors for Apache Flink, ingesting changes from different databases using change data capture (CDC).
 The Flink CDC Connectors integrates Debezium as the engine to capture data changes. So it can fully leverage the ability of Debezium. See more about what is [Debezium](https://github.com/debezium/debezium).
 
+This README is meant as a brief walkthrough of the core features of Flink CDC Connectors. For fully detailed documentation, please see [Documentation](https://github.com/ververica/flink-cdc-connectors/wiki).
+
+## Supported (Tested) Connectors
+
+| Database | Version |
+| --- | --- |
+| MySQL | Database: 5.7, 8.0.x <br> JDBC Driver: 8.0.16 |
+| PostgreSQL | Database: 9.6, 10, 11, 12 <br> JDBC Driver: 42.2.12 |
+
+## Features
+
+1. Supports reading database snapshots and continues to read binlogs with **exactly-once processing**, even when failures happen.
+2. CDC connectors for the DataStream API: users can consume changes on multiple databases and tables in a single job without deploying Debezium and Kafka.
+3. CDC connectors for the Table/SQL API: users can use SQL DDL to create a CDC source to monitor changes on a single table.
+
+## Usage for Table/SQL API
+
+Several steps are needed to set up a Flink cluster with the provided connectors:
+
+1. Set up a Flink cluster with version 1.12+ and Java 8+ installed.
+2. Download the connector SQL jars from the [Download](https://github.com/ververica/flink-cdc-connectors/wiki/Downloads) page (or [build yourself](#building-from-source)).
+3. Put the downloaded jars under `FLINK_HOME/lib/`.
+4. Restart the Flink cluster.
+
+The following example shows how to create a MySQL CDC source in the [Flink SQL Client](https://ci.apache.org/projects/flink/flink-docs-release-1.13/dev/table/sqlClient.html) and execute queries on it.
+
+```sql
+-- create a MySQL CDC table source
+CREATE TABLE mysql_binlog (
+  id INT NOT NULL,
+  name STRING,
+  description STRING,
+  weight DECIMAL(10,3)
+) WITH (
+  'connector' = 'mysql-cdc',
+  'hostname' = 'localhost',
+  'port' = '3306',
+  'username' = 'flinkuser',
+  'password' = 'flinkpw',
+  'database-name' = 'inventory',
+  'table-name' = 'products'
+);
+
+-- read snapshot and binlog data from MySQL, apply a transformation, and show the result in the client
+SELECT id, UPPER(name), description, weight FROM mysql_binlog;
+```
+
+## Usage for DataStream API
+
+Include the following Maven dependency (available through Maven Central):
+
+```
+<dependency>
+  <groupId>com.ververica</groupId>
+  <artifactId>flink-connector-mysql-cdc</artifactId>
+  <version>2.0.0</version>
+</dependency>
+```
+
+```java
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.functions.source.SourceFunction;
+import com.ververica.cdc.debezium.StringDebeziumDeserializationSchema;
+import com.ververica.cdc.connectors.mysql.MySqlSource;
+
+public class MySqlBinlogSourceExample {
+  public static void main(String[] args) throws Exception {
+    SourceFunction<String> sourceFunction = MySqlSource.<String>builder()
+      .hostname("localhost")
+      .port(3306)
+      .databaseList("inventory") // monitor all tables under the inventory database
+      .username("flinkuser")
+      .password("flinkpw")
+      .deserializer(new StringDebeziumDeserializationSchema()) // converts SourceRecord to String
+      .build();
+
+    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+
+    env
+      .addSource(sourceFunction)
+      .print().setParallelism(1); // use parallelism 1 for the sink to keep message ordering
+
+    env.execute();
+  }
+}
+```
+
+## Building from source
+
+- Prerequisites:
+    - git
+    - Maven
+    - At least Java 8
+
+```
+git clone https://github.com/ververica/flink-cdc-connectors.git
+cd flink-cdc-connectors
+mvn clean install -DskipTests
+```
+
+Flink CDC Connectors is now available in your local `.m2` repository.
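A local build can then be consumed by other Maven projects straight from the `.m2` repository. A minimal sketch of the consuming `pom.xml` entry follows; the `2.1-SNAPSHOT` version is an assumption, so use whatever version the checked-out branch's `pom.xml` actually declares:

```
<dependency>
  <groupId>com.ververica</groupId>
  <artifactId>flink-connector-mysql-cdc</artifactId>
  <!-- assumed version: match the version declared by the branch you built -->
  <version>2.1-SNAPSHOT</version>
</dependency>
```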
+
+## License
+
+The code in this repository is licensed under the [Apache Software License 2](https://github.com/ververica/flink-cdc-connectors/blob/master/LICENSE).
+
+## Contributing
+
+Flink CDC Connectors welcomes anyone who wants to help out in any way, whether that includes reporting problems, helping with documentation, or contributing code changes to fix bugs, add tests, or implement new features. You can report problems or request features in the [GitHub Issues](https://github.com/ververica/flink-cdc-connectors/issues).
+
+## Documents
+
+To get started, please see https://ververica.github.io/flink-cdc-connectors/

diff --git a/docs/_static/theme_overrides.css b/docs/_static/theme_overrides.css
index fc4e609c1..b32ee806d 100644
--- a/docs/_static/theme_overrides.css
+++ b/docs/_static/theme_overrides.css
@@ -26,3 +26,13 @@
   max-width: 100%;
   overflow: visible;
 }
+
+/* override the list style of li elements under ul */
+.wy-nav-content ul li {
+    list-style: disc;
+    margin-left: 36px;
+}
+
+.wy-nav-content ul li p {
+    margin: 0 0 8px;
+}

diff --git a/docs/_templates/versions.html b/docs/_templates/versions.html
index 2eb36ef7a..94c196df7 100644
--- a/docs/_templates/versions.html
+++ b/docs/_templates/versions.html
@@ -25,17 +25,7 @@
-    {% if languages|length >= 1 %}
-    <dl>
-      <dt>{{ _('Languages') }}</dt>
-      {% for slug, url in languages %}
-        {% if slug == current_language %}<strong>{% endif %}
-        <dd><a href="{{ url }}">{{ slug }}</a></dd>
-        {% if slug == current_language %}</strong>{% endif %}
-      {% endfor %}
-    </dl>
-    {% endif %}
-    {% if versions|length >= 1 %}
+    {% if versions %}
     <dl>
       <dt>{{ _('Versions') }}</dt>
       {% for slug, url in versions %}
@@ -45,14 +35,6 @@
       {% endfor %}
     </dl>
     {% endif %}
-    {% if downloads|length >= 1 %}
-    <dl>
-      <dt>{{ _('Downloads') }}</dt>
-      {% for type, url in downloads %}
-        <dd><a href="{{ url }}">{{ type }}</a></dd>
-      {% endfor %}
-    </dl>
-    {% endif %}
     {% if READTHEDOCS %}
     <dl>
       <dt>{{ _('On Read the Docs') }}</dt>
diff --git a/docs/build_docs.sh b/docs/build_docs.sh
index fad7b4687..5e87b8227 100755
--- a/docs/build_docs.sh
+++ b/docs/build_docs.sh
@@ -21,8 +21,8 @@
 set -x
 
 # step-1: install dependencies
 apt-get update
-apt-get -y install git rsync python3-pip python3-sphinx python3-git python3-stemmer python3-virtualenv python3-setuptools
-python3 -m pip install myst-parser pygments sphinx-rtd-theme
+apt-get -y install git rsync python3-pip python3-git python3-stemmer python3-virtualenv python3-setuptools
+python3 -m pip install -U sphinx==4.1.1 myst-parser pygments sphinx-rtd-theme
 
 export SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)
 export REPO_NAME="${GITHUB_REPOSITORY##*/}"

diff --git a/docs/content/about.md b/docs/content/about.md
index 23dc8f397..8fc9a538b 100644
--- a/docs/content/about.md
+++ b/docs/content/about.md
@@ -71,18 +71,18 @@ Include following Maven dependency (available through Maven Central):
 
 ```
 <dependency>
-  <groupId>com.alibaba.ververica</groupId>
+  <groupId>com.ververica</groupId>
   <artifactId>flink-connector-mysql-cdc</artifactId>
-  <version>1.2.0</version>
+  <version>2.0.0</version>
 </dependency>
 ```
 
 ```java
 import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
 import org.apache.flink.streaming.api.functions.source.SourceFunction;
-import com.alibaba.ververica.cdc.debezium.StringDebeziumDeserializationSchema;
-import com.alibaba.ververica.cdc.connectors.mysql.MySqlSource;
+import com.ververica.cdc.debezium.StringDebeziumDeserializationSchema;
+import com.ververica.cdc.connectors.mysql.MySqlSource;
 
 public class MySqlBinlogSourceExample {
   public static void main(String[] args) throws Exception {

diff --git a/docs/content/connectors/mysql-cdc.md b/docs/content/connectors/mysql-cdc.md
index ed62a7674..3c900a4ce 100644
--- a/docs/content/connectors/mysql-cdc.md
+++ b/docs/content/connectors/mysql-cdc.md
@@ -11,15 +11,15 @@ In order to setup the MySQL CDC connector, the following table provides dependen
 
 ```
 <dependency>
-  <groupId>com.alibaba.ververica</groupId>
+  <groupId>com.ververica</groupId>
   <artifactId>flink-connector-mysql-cdc</artifactId>
-  <version>1.4.0</version>
+  <version>2.0.0</version>
 </dependency>
 ```
 
 ### SQL Client JAR
 
-Download [flink-sql-connector-mysql-cdc-1.4.0.jar](https://repo1.maven.org/maven2/com/alibaba/ververica/flink-sql-connector-mysql-cdc/1.4.0/flink-sql-connector-mysql-cdc-1.4.0.jar) and put it under `<FLINK_HOME>/lib/`.
+Download [flink-sql-connector-mysql-cdc-2.0.0.jar](https://repo1.maven.org/maven2/com/ververica/flink-sql-connector-mysql-cdc/2.0.0/flink-sql-connector-mysql-cdc-2.0.0.jar) and put it under `<FLINK_HOME>/lib/`.
 
 Setup MySQL server
 ----------------
 
@@ -261,8 +261,8 @@ The MySQL CDC connector can also be a DataStream source. You can create a Source
 
 ```java
 import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
 import org.apache.flink.streaming.api.functions.source.SourceFunction;
-import com.alibaba.ververica.cdc.debezium.StringDebeziumDeserializationSchema;
-import com.alibaba.ververica.cdc.connectors.mysql.MySQLSource;
+import com.ververica.cdc.debezium.StringDebeziumDeserializationSchema;
+import com.ververica.cdc.connectors.mysql.MySQLSource;
 
 public class MySqlBinlogSourceExample {
   public static void main(String[] args) throws Exception {

diff --git a/docs/content/connectors/postgres-cdc.md b/docs/content/connectors/postgres-cdc.md
index 6a0c157c6..100f74bc9 100644
--- a/docs/content/connectors/postgres-cdc.md
+++ b/docs/content/connectors/postgres-cdc.md
@@ -11,15 +11,15 @@ In order to setup the Postgres CDC connector, the following table provides depen
 
 ```
 <dependency>
-  <groupId>com.alibaba.ververica</groupId>
+  <groupId>com.ververica</groupId>
   <artifactId>flink-connector-postgres-cdc</artifactId>
-  <version>1.4.0</version>
+  <version>2.0.0</version>
 </dependency>
 ```
 
 ### SQL Client JAR
 
-Download [flink-sql-connector-postgres-cdc-1.4.0.jar](https://repo1.maven.org/maven2/com/alibaba/ververica/flink-sql-connector-postgres-cdc/1.4.0/flink-sql-connector-postgres-cdc-1.4.0.jar) and put it under `<FLINK_HOME>/lib/`.
+Download [flink-sql-connector-postgres-cdc-2.0.0.jar](https://repo1.maven.org/maven2/com/ververica/flink-sql-connector-postgres-cdc/2.0.0/flink-sql-connector-postgres-cdc-2.0.0.jar) and put it under `<FLINK_HOME>/lib/`.
 
 How to create a Postgres CDC table
 ----------------
 
@@ -170,8 +170,8 @@ The Postgres CDC connector can also be a DataStream source. You can create a Sou
 
 ```java
 import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
 import org.apache.flink.streaming.api.functions.source.SourceFunction;
-import com.alibaba.ververica.cdc.debezium.StringDebeziumDeserializationSchema;
-import com.alibaba.ververica.cdc.connectors.postgres.PostgreSQLSource;
+import com.ververica.cdc.debezium.StringDebeziumDeserializationSchema;
+import com.ververica.cdc.connectors.postgres.PostgreSQLSource;
 
 public class PostgreSQLSourceExample {
   public static void main(String[] args) throws Exception {

diff --git a/docs/content/formats/changelog-json.md b/docs/content/formats/changelog-json.md
index 8a9bb6cc8..6f63f1802 100644
--- a/docs/content/formats/changelog-json.md
+++ b/docs/content/formats/changelog-json.md
@@ -11,15 +11,15 @@ In order to setup the Changelog JSON format, the following table provides depend
 
 ```
 <dependency>
-  <groupId>com.alibaba.ververica</groupId>
+  <groupId>com.ververica</groupId>
   <artifactId>flink-format-changelog-json</artifactId>
-  <version>1.4.0</version>
+  <version>2.0.0</version>
 </dependency>
 ```
 
 ### SQL Client JAR
 
-Download [flink-format-changelog-json-1.4.0.jar](https://repo1.maven.org/maven2/com/alibaba/ververica/flink-format-changelog-json/1.4.0/flink-format-changelog-json-1.4.0.jar) and put it under `<FLINK_HOME>/lib/`.
+Download [flink-format-changelog-json-2.0.0.jar](https://repo1.maven.org/maven2/com/ververica/flink-format-changelog-json/2.0.0/flink-format-changelog-json-2.0.0.jar) and put it under `<FLINK_HOME>/lib/`.
 
 How to use Changelog JSON format
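The changelog-json format is declared as the `format` of a table like any other Flink SQL format, typically on a Kafka table. A minimal sketch follows; the connector options, topic name, and schema here are illustrative assumptions, not part of the patch:

```sql
-- illustrative: write change records as changelog-json to a Kafka topic
CREATE TABLE products_changelog (
  id INT,
  name STRING,
  description STRING,
  weight DECIMAL(10,3)
) WITH (
  'connector' = 'kafka',
  'topic' = 'products-changelog',                    -- assumed topic name
  'properties.bootstrap.servers' = 'localhost:9092', -- assumed broker address
  'format' = 'changelog-json'                        -- the format documented on this page
);
```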