| Microsoft SQL Server | Database | Schema | Table |
| StarRocks | Database | - | Table |
| Doris | Database | - | Table |
# Functions
Flink CDC uses [Calcite](https://calcite.apache.org/) to parse expressions and [Janino script](https://www.janino.net/) to evaluate expressions with function call.
| value1 <= value2 | value1 <= value2 | Returns TRUE if value1 is less than or equal to value2; returns FALSE if value1 or value2 is NULL. |
| value IS NULL | null == value | Returns TRUE if value is NULL. |
| value IS NOT NULL | null != value | Returns TRUE if value is not NULL. |
| value1 BETWEEN value2 AND value3 | betweenAsymmetric(value1, value2, value3) | Returns TRUE if value1 is greater than or equal to value2 and less than or equal to value3. |
| value1 NOT BETWEEN value2 AND value3 | notBetweenAsymmetric(value1, value2, value3) | Returns TRUE if value1 is less than value2 or greater than value3. |
| string1 LIKE string2 | like(string1, string2) | Returns TRUE if string1 matches pattern string2. |
| string1 NOT LIKE string2 | notLike(string1, string2) | Returns TRUE if string1 does not match pattern string2. |
| value1 IN (value2 [, value3]* ) | in(value1, value2 [, value3]*) | Returns TRUE if value1 exists in the given list (value2, value3, …). |
| value1 NOT IN (value2 [, value3]* ) | notIn(value1, value2 [, value3]*) | Returns TRUE if value1 does not exist in the given list (value2, value3, …). |
| string1 || string2 | concat(string1, string2) | Returns the concatenation of STRING1 and STRING2. |
| CHAR_LENGTH(string) | charLength(string) | Returns the number of characters in STRING. |
| UPPER(string) | upper(string) | Returns string in uppercase. |
| LOWER(string) | lower(string) | Returns string in lowercase. |
| TRIM(string1) | trim('BOTH',string1) | Returns a string that removes whitespaces at both sides. |
| REGEXP_REPLACE(string1, string2, string3) | regexpReplace(string1, string2, string3) | Returns a string from STRING1 with all the substrings that match a regular expression STRING2 consecutively being replaced with STRING3. E.g., 'foobar'.regexpReplace('oo\|ar', '') returns "fb". |
| SUBSTRING(string FROM integer1 [ FOR integer2 ]) | substring(string,integer1,integer2) | Returns a substring of STRING starting from position INT1 with length INT2 (to the end by default). |
| LOCALTIME | localtime() | Returns the current SQL time in the local time zone, the return type is TIME(0). |
| LOCALTIMESTAMP | localtimestamp() | Returns the current SQL timestamp in local time zone, the return type is TIMESTAMP(3). |
| CURRENT_TIME | currentTime() | Returns the current SQL time in the local time zone, this is a synonym of LOCAL_TIME. |
| CURRENT_DATE | currentDate() | Returns the current SQL date in the local time zone. |
| CURRENT_TIMESTAMP | currentTimestamp() | Returns the current SQL timestamp in the local time zone, the return type is TIMESTAMP_LTZ(3). |
| NOW() | now() | Returns the current SQL timestamp in the local time zone, this is a synonym of CURRENT_TIMESTAMP. |
| DATE_FORMAT(timestamp, string) | dateFormat(timestamp, string) | Converts timestamp to a value of string in the format specified by the date format string. The format string is compatible with Java's SimpleDateFormat. |
| TIMESTAMPDIFF(timepointunit, timepoint1, timepoint2) | timestampDiff(timepointunit, timepoint1, timepoint2) | Returns the (signed) number of timepointunit between timepoint1 and timepoint2. The unit for the interval is given by the first argument, which should be one of the following values: SECOND, MINUTE, HOUR, DAY, MONTH, or YEAR. |
| TO_DATE(string1[, string2]) | toDate(string1[, string2]) | Converts a date string string1 with format string2 (by default 'yyyy-MM-dd') to a date. |
| TO_TIMESTAMP(string1[, string2]) | toTimestamp(string1[, string2]) | Converts date time string string1 with format string2 (by default: 'yyyy-MM-dd HH:mm:ss') to a timestamp, without time zone. |
| CASE value WHEN value1_1 [, value1_2]* THEN RESULT1 (WHEN value2_1 [, value2_2 ]* THEN result_2)* (ELSE result_z) END | Nested ternary expression | Returns resultX when the first time value is contained in (valueX_1, valueX_2, …). When no value matches, returns result_z if it is provided and returns NULL otherwise. |
| CASE WHEN condition1 THEN result1 (WHEN condition2 THEN result2)* (ELSE result_z) END | Nested ternary expression | Returns resultX when the first conditionX is met. When no condition is met, returns result_z if it is provided and returns NULL otherwise. |
| COALESCE(value1 [, value2]*) | coalesce(Object... objects) | Returns the first argument that is not NULL.If all arguments are NULL, it returns NULL as well. The return type is the least restrictive, common type of all of its arguments. The return type is nullable if all arguments are nullable as well. |
| IF(condition, true_value, false_value) | condition ? true_value : false_value | Returns the true_value if condition is met, otherwise false_value. E.g., IF(5 > 3, 5, 3) returns 5. |
# Example
This feature will support soon.
## Add computed columns
Evaluation expressions can be used to generate new columns. For example, if we want to append two computed columns based on the table `web_order` in the database `mydb`, we may define a transform rule as follows:
```yaml
transform:
- source-table: mydb.web_order
projection: id, order_id, UPPER(product_name) as product_name, localtimestamp as new_timestamp
description: append calculated columns based on source table
```
## Reference metadata columns
We may reference metadata column in projection expressions. For example, given a table `web_order` in the database `mydb`, we may define a transform rule as follows:
description: access metadata columns from source table
```
## Use wildcard character to project all fields
A wildcard character (`*`) can be used to reference all fields in a table. For example, given two tables `web_order` and `app_order` in the database `mydb`, we may define a transform rule as follows:
```yaml
transform:
- source-table: mydb.web_order
projection: \*, UPPER(product_name) as product_name
description: project fields with wildcard character from source table
- source-table: mydb.app_order
projection: UPPER(product_name) as product_name, *
description: project fields with wildcard character from source table
```
Notice: When `*` character presents at the beginning of expressions, an escaping backslash is required.
## Add filter rule
Use reference columns when adding filtering rules to the table `web_order` in the database `mydb`, we may define a transform rule as follows:
```yaml
transform:
- source-table: mydb.web_order
filter: id > 10 AND order_id > 100
description: filtering rows from source table
```
Computed columns can be used in filtering conditions, too. For example, given a table `web_order` in the database `mydb`, we may define a transform rule as follows:
```yaml
transform:
- source-table: mydb.web_order
projection: id, order_id, UPPER(province) as new_province
filter: new_province = 'SHANGHAI'
description: filtering rows based on computed columns
```
## Reassign primary key
We can reassign the primary key in transform rules. For example, given a table `web_order` in the database `mydb`, we may define a transform rule as follows:
```yaml
transform:
- source-table: mydb.web_order
projection: id, order_id
primary-keys: order_id
description: reassign primary key example
```
Composite primary keys are also supported:
```yaml
transform:
- source-table: mydb.web_order
projection: id, order_id, UPPER(product_name) as product_name
primary-keys: order_id, product_name
description: reassign composite primary keys example
```
## Reassign partition key
We can reassign the partition key in transform rules. For example, given a table web_order in the database mydb, we may define a transform rule as follows:
```yaml
transform:
- source-table: mydb.web_order
projection: id, order_id, UPPER(product_name) as product_name
partition-keys: product_name
description: reassign partition key example
```
## Specify table creation configuration
Extra options can be defined in a transform rule, and will be applied when creating downstream tables. Given a table `web_order` in the database `mydb`, we may define a transform rule as follows:
```yaml
transform:
- source-table: mydb.web_order
projection: id, order_id, UPPER(product_name) as product_name
table-options: comment=web order
description: auto creating table options example
```
Tips: The format of table-options is `key1=value1,key2=value2`.
## Classification mapping
Multiple transform rules can be defined to classify input data rows and apply different processings. For example, we may define a transform rule as follows:
```yaml
transform:
- source-table: mydb.web_order
projection: id, order_id
filter: UPPER(province) = 'SHANGHAI'
description: classification mapping example
- source-table: mydb.web_order
projection: order_id as id, id as order_id
filter: UPPER(province) = 'BEIJING'
description: classification mapping example
```
# Known limitations
* Currently, transform doesn't work with route rules. It will be supported in future versions.
* Computed columns cannot reference trimmed columns that do not present in final projection results. This will be fixed in future versions.
* Regular matching of tables with different schemas is not supported. If necessary, multiple rules need to be written.
| Microsoft SQL Server | Database | Schema | Table |
| StarRocks | Database | - | Table |
| Doris | Database | - | Table |
# Functions
Flink CDC uses [Calcite](https://calcite.apache.org/) to parse expressions and [Janino script](https://www.janino.net/) to evaluate expressions with function call.
| value1 <= value2 | value1 <= value2 | Returns TRUE if value1 is less than or equal to value2; returns FALSE if value1 or value2 is NULL. |
| value IS NULL | null == value | Returns TRUE if value is NULL. |
| value IS NOT NULL | null != value | Returns TRUE if value is not NULL. |
| value1 BETWEEN value2 AND value3 | betweenAsymmetric(value1, value2, value3) | Returns TRUE if value1 is greater than or equal to value2 and less than or equal to value3. |
| value1 NOT BETWEEN value2 AND value3 | notBetweenAsymmetric(value1, value2, value3) | Returns TRUE if value1 is less than value2 or greater than value3. |
| string1 LIKE string2 | like(string1, string2) | Returns TRUE if string1 matches pattern string2. |
| string1 NOT LIKE string2 | notLike(string1, string2) | Returns TRUE if string1 does not match pattern string2. |
| value1 IN (value2 [, value3]* ) | in(value1, value2 [, value3]*) | Returns TRUE if value1 exists in the given list (value2, value3, …). |
| value1 NOT IN (value2 [, value3]* ) | notIn(value1, value2 [, value3]*) | Returns TRUE if value1 does not exist in the given list (value2, value3, …). |
| string1 || string2 | concat(string1, string2) | Returns the concatenation of STRING1 and STRING2. |
| CHAR_LENGTH(string) | charLength(string) | Returns the number of characters in STRING. |
| UPPER(string) | upper(string) | Returns string in uppercase. |
| LOWER(string) | lower(string) | Returns string in lowercase. |
| TRIM(string1) | trim('BOTH',string1) | Returns a string that removes whitespaces at both sides. |
| REGEXP_REPLACE(string1, string2, string3) | regexpReplace(string1, string2, string3) | Returns a string from STRING1 with all the substrings that match a regular expression STRING2 consecutively being replaced with STRING3. E.g., 'foobar'.regexpReplace('oo\|ar', '') returns "fb". |
| SUBSTRING(string FROM integer1 [ FOR integer2 ]) | substring(string,integer1,integer2) | Returns a substring of STRING starting from position INT1 with length INT2 (to the end by default). |
| LOCALTIME | localtime() | Returns the current SQL time in the local time zone, the return type is TIME(0). |
| LOCALTIMESTAMP | localtimestamp() | Returns the current SQL timestamp in local time zone, the return type is TIMESTAMP(3). |
| CURRENT_TIME | currentTime() | Returns the current SQL time in the local time zone, this is a synonym of LOCAL_TIME. |
| CURRENT_DATE | currentDate() | Returns the current SQL date in the local time zone. |
| CURRENT_TIMESTAMP | currentTimestamp() | Returns the current SQL timestamp in the local time zone, the return type is TIMESTAMP_LTZ(3). |
| NOW() | now() | Returns the current SQL timestamp in the local time zone, this is a synonym of CURRENT_TIMESTAMP. |
| DATE_FORMAT(timestamp, string) | dateFormat(timestamp, string) | Converts timestamp to a value of string in the format specified by the date format string. The format string is compatible with Java's SimpleDateFormat. |
| TIMESTAMPDIFF(timepointunit, timepoint1, timepoint2) | timestampDiff(timepointunit, timepoint1, timepoint2) | Returns the (signed) number of timepointunit between timepoint1 and timepoint2. The unit for the interval is given by the first argument, which should be one of the following values: SECOND, MINUTE, HOUR, DAY, MONTH, or YEAR. |
| TO_DATE(string1[, string2]) | toDate(string1[, string2]) | Converts a date string string1 with format string2 (by default 'yyyy-MM-dd') to a date. |
| TO_TIMESTAMP(string1[, string2]) | toTimestamp(string1[, string2]) | Converts date time string string1 with format string2 (by default: 'yyyy-MM-dd HH:mm:ss') to a timestamp, without time zone. |
| CASE value WHEN value1_1 [, value1_2]* THEN RESULT1 (WHEN value2_1 [, value2_2 ]* THEN result_2)* (ELSE result_z) END | Nested ternary expression | Returns resultX when the first time value is contained in (valueX_1, valueX_2, …). When no value matches, returns result_z if it is provided and returns NULL otherwise. |
| CASE WHEN condition1 THEN result1 (WHEN condition2 THEN result2)* (ELSE result_z) END | Nested ternary expression | Returns resultX when the first conditionX is met. When no condition is met, returns result_z if it is provided and returns NULL otherwise. |
| COALESCE(value1 [, value2]*) | coalesce(Object... objects) | Returns the first argument that is not NULL.If all arguments are NULL, it returns NULL as well. The return type is the least restrictive, common type of all of its arguments. The return type is nullable if all arguments are nullable as well. |
| IF(condition, true_value, false_value) | condition ? true_value : false_value | Returns the true_value if condition is met, otherwise false_value. E.g., IF(5 > 3, 5, 3) returns 5. |
# Example
This feature will support soon.
## Add computed columns
Evaluation expressions can be used to generate new columns. For example, if we want to append two computed columns based on the table `web_order` in the database `mydb`, we may define a transform rule as follows:
```yaml
transform:
- source-table: mydb.web_order
projection: id, order_id, UPPER(product_name) as product_name, localtimestamp as new_timestamp
description: append calculated columns based on source table
```
## Reference metadata columns
We may reference metadata column in projection expressions. For example, given a table `web_order` in the database `mydb`, we may define a transform rule as follows:
description: access metadata columns from source table
```
## Use wildcard character to project all fields
A wildcard character (`*`) can be used to reference all fields in a table. For example, given two tables `web_order` and `app_order` in the database `mydb`, we may define a transform rule as follows:
```yaml
transform:
- source-table: mydb.web_order
projection: \*, UPPER(product_name) as product_name
description: project fields with wildcard character from source table
- source-table: mydb.app_order
projection: UPPER(product_name) as product_name, *
description: project fields with wildcard character from source table
```
Notice: When `*` character presents at the beginning of expressions, an escaping backslash is required.
## Add filter rule
Use reference columns when adding filtering rules to the table `web_order` in the database `mydb`, we may define a transform rule as follows:
```yaml
transform:
- source-table: mydb.web_order
filter: id > 10 AND order_id > 100
description: filtering rows from source table
```
Computed columns can be used in filtering conditions, too. For example, given a table `web_order` in the database `mydb`, we may define a transform rule as follows:
```yaml
transform:
- source-table: mydb.web_order
projection: id, order_id, UPPER(province) as new_province
filter: new_province = 'SHANGHAI'
description: filtering rows based on computed columns
```
## Reassign primary key
We can reassign the primary key in transform rules. For example, given a table `web_order` in the database `mydb`, we may define a transform rule as follows:
```yaml
transform:
- source-table: mydb.web_order
projection: id, order_id
primary-keys: order_id
description: reassign primary key example
```
Composite primary keys are also supported:
```yaml
transform:
- source-table: mydb.web_order
projection: id, order_id, UPPER(product_name) as product_name
primary-keys: order_id, product_name
description: reassign composite primary keys example
```
## Reassign partition key
We can reassign the partition key in transform rules. For example, given a table web_order in the database mydb, we may define a transform rule as follows:
```yaml
transform:
- source-table: mydb.web_order
projection: id, order_id, UPPER(product_name) as product_name
partition-keys: product_name
description: reassign partition key example
```
## Specify table creation configuration
Extra options can be defined in a transform rule, and will be applied when creating downstream tables. Given a table `web_order` in the database `mydb`, we may define a transform rule as follows:
```yaml
transform:
- source-table: mydb.web_order
projection: id, order_id, UPPER(product_name) as product_name
table-options: comment=web order
description: auto creating table options example
```
Tips: The format of table-options is `key1=value1,key2=value2`.
## Classification mapping
Multiple transform rules can be defined to classify input data rows and apply different processings. For example, we may define a transform rule as follows:
```yaml
transform:
- source-table: mydb.web_order
projection: id, order_id
filter: UPPER(province) = 'SHANGHAI'
description: classification mapping example
- source-table: mydb.web_order
projection: order_id as id, id as order_id
filter: UPPER(province) = 'BEIJING'
description: classification mapping example
```
# Known limitations
* Currently, transform doesn't work with route rules. It will be supported in future versions.
* Computed columns cannot reference trimmed columns that do not present in final projection results. This will be fixed in future versions.
* Regular matching of tables with different schemas is not supported. If necessary, multiple rules need to be written.