Flink CDC Connectors

Flink CDC Connectors is a set of source connectors for Apache Flink that ingest changes from different databases using change data capture (CDC). The connectors integrate Debezium as the engine that captures data changes, so they can take full advantage of Debezium's capabilities. See the Debezium project for more about what Debezium is.

This README is a brief walkthrough of the core features of Flink CDC Connectors. For full documentation, please see the Documentation.

Supported (Tested) Databases

| Connector | Database | Driver |
|-----------|----------|--------|
| mysql-cdc | MySQL: 5.6, 5.7, 8.0.x; RDS MySQL: 5.6, 5.7, 8.0.x; PolarDB MySQL: 5.6, 5.7, 8.0.x; Aurora MySQL: 5.6, 5.7, 8.0.x; MariaDB: 10.x; PolarDB X: 2.0.1 | JDBC Driver: 8.0.16 |
| postgres-cdc | PostgreSQL: 9.6, 10, 11, 12 | JDBC Driver: 42.2.12 |
| mongodb-cdc | MongoDB: 3.6, 4.x, 5.0 | MongoDB Driver: 4.3.1 |
| oracle-cdc | Oracle: 11, 12, 19 | Oracle Driver: 19.3.0.0 |
| sqlserver-cdc | SQL Server: 2017, 2019 | JDBC Driver: 7.2.2.jre8 |
| oceanbase-cdc | OceanBase CE: 3.1.x | JDBC Driver: 5.7.4x |

Features

1. Supports reading a database snapshot and then continuing to read the transaction log, with exactly-once processing even when failures happen.
2. CDC connectors for the DataStream API: users can consume changes from multiple databases and tables in a single job, without deploying Debezium and Kafka.
3. CDC connectors for the Table/SQL API: users can use SQL DDL to create a CDC source that monitors changes on a single table.
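
As a rough illustration of the first feature, the toy model below uses plain Java with no Flink or Debezium involved. It loads a snapshot taken at a known log offset and then applies change events, skipping any event at or before the last applied offset, so replaying the log after a failure does not change the result. All names here (`SnapshotThenLogSketch`, `Change`, `State`) and the offset scheme are hypothetical. This is a sketch of the idea only, not how the connectors implement it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of "snapshot first, then change log". Hypothetical; not the connector's code. */
class SnapshotThenLogSketch {

    /** Hypothetical change event: op is "c" (create), "u" (update) or "d" (delete). */
    static final class Change {
        final long offset;  // position in the simulated transaction log
        final String op;
        final int id;       // primary key of the affected row
        final String name;  // new value; ignored for deletes

        Change(long offset, String op, int id, String name) {
            this.offset = offset;
            this.op = op;
            this.id = id;
            this.name = name;
        }
    }

    /** Materialized rows plus the last applied log offset; assumed to be saved together. */
    static final class State {
        final Map<Integer, String> rows = new LinkedHashMap<>();
        long lastOffset = -1;
    }

    /** Phase 1: start from a consistent snapshot taken at snapshotOffset. */
    static State fromSnapshot(Map<Integer, String> snapshot, long snapshotOffset) {
        State state = new State();
        state.rows.putAll(snapshot);
        state.lastOffset = snapshotOffset;
        return state;
    }

    /** Phase 2: apply log entries in order, skipping those already reflected in the state. */
    static void apply(State state, List<Change> log) {
        for (Change change : log) {
            if (change.offset <= state.lastOffset) {
                continue; // already covered by the snapshot or by an earlier run
            }
            if ("d".equals(change.op)) {
                state.rows.remove(change.id);
            } else {
                state.rows.put(change.id, change.name);
            }
            state.lastOffset = change.offset;
        }
    }

    public static void main(String[] args) {
        Map<Integer, String> snapshot = new LinkedHashMap<>();
        snapshot.put(1, "scooter");
        snapshot.put(2, "car battery");
        State state = fromSnapshot(snapshot, 10);

        List<Change> log = new ArrayList<>();
        log.add(new Change(9, "u", 1, "stale value"));   // older than the snapshot
        log.add(new Change(11, "u", 1, "scooter v2"));
        log.add(new Change(12, "c", 3, "hammer"));
        log.add(new Change(13, "d", 2, null));

        apply(state, log);
        apply(state, log); // simulate replaying the same log after a failure
        System.out.println(state.rows + " @ offset " + state.lastOffset);
    }
}
```

Storing the offset together with the rows is what makes the replay harmless in this sketch: the second `apply` call finds nothing newer than the stored offset and leaves the state untouched.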

Usage for Table/SQL API

Setting up a Flink cluster with the provided connector takes several steps:

1. Set up a Flink cluster with version 1.12+ and Java 8+ installed.
2. Download the connector SQL jars from the Download page (or build them yourself).
3. Put the downloaded jars under FLINK_HOME/lib/.
4. Restart the Flink cluster.

The following example shows how to create a MySQL CDC source in the Flink SQL Client and execute queries on it.

    -- creates a mysql cdc table source
    CREATE TABLE mysql_binlog (
     id INT NOT NULL,
     name STRING,
     description STRING,
     weight DECIMAL(10,3)
    ) WITH (
     'connector' = 'mysql-cdc',
     'hostname' = 'localhost',
     'port' = '3306',
     'username' = 'flinkuser',
     'password' = 'flinkpw',
     'database-name' = 'inventory',
     'table-name' = 'products'
    );
    
    -- read snapshot and binlog data from MySQL, apply a transformation, and show the result on the client
    SELECT id, UPPER(name), description, weight FROM mysql_binlog;
    

Usage for DataStream API

Include the following Maven dependency (available through Maven Central):

    <dependency>
      <groupId>com.ververica</groupId>
      <!-- add the dependency matching your database -->
      <artifactId>flink-connector-mysql-cdc</artifactId>
      <!-- The dependency is available only for stable releases; a SNAPSHOT version has to be built from source. -->
      <version>2.2-SNAPSHOT</version>
    </dependency>
    
    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import com.ververica.cdc.debezium.JsonDebeziumDeserializationSchema;
    import com.ververica.cdc.connectors.mysql.source.MySqlSource;
    
    public class MySqlSourceExample {
      public static void main(String[] args) throws Exception {
        MySqlSource<String> mySqlSource = MySqlSource.<String>builder()
                .hostname("yourHostname")
                .port(yourPort) // placeholder: replace with your MySQL port number, e.g. 3306
                .databaseList("yourDatabaseName") // set captured database
                .tableList("yourDatabaseName.yourTableName") // set captured table
                .username("yourUsername")
                .password("yourPassword")
                .deserializer(new JsonDebeziumDeserializationSchema()) // converts SourceRecord to JSON String
                .build();
    
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    
        // enable checkpoint
        env.enableCheckpointing(3000);
    
        env
          .fromSource(mySqlSource, WatermarkStrategy.noWatermarks(), "MySQL Source")
          // set 4 parallel source tasks
          .setParallelism(4)
          .print().setParallelism(1); // use parallelism 1 for sink to keep message ordering
    
        env.execute("Print MySQL Snapshot + Binlog");
      }
    }
    

Building from source

• Prerequisites:
  • git
  • Maven
  • At least Java 8

    git clone https://github.com/ververica/flink-cdc-connectors.git
    cd flink-cdc-connectors
    mvn clean install -DskipTests

Flink CDC Connectors is now available in your local .m2 repository.

License

The code in this repository is licensed under the Apache Software License 2.

Contributing

Flink CDC Connectors welcomes anyone who wants to help out in any way, whether by reporting problems, helping with documentation, or contributing code changes to fix bugs, add tests, or implement new features. You can report problems or request features in GitHub Issues.

Community

• DingTalk Chinese User Group

  You can join the group by searching for the group number [33121212] or by scanning the group's QR code.

Documents

To get started, please see https://ververica.github.io/flink-cdc-connectors/