doc-exports/docs/dli/sqlreference/dli_08_0409.html
Su, Xiaomeng 04d4597cf3 dli_sqlreference_0511_version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
Co-committed-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
2023-11-02 14:34:08 +00:00


<a name="dli_08_0409"></a><a name="dli_08_0409"></a>
<h1 class="topictitle1">Canal</h1>
<div id="body8662426"><div class="section" id="dli_08_0409__en-us_topic_0000001310015813_section167371042163516"><h4 class="sectiontitle">Function</h4><p id="dli_08_0409__en-us_topic_0000001310015813_p613994415365">Canal is a Change Data Capture (CDC) tool that can stream changes from MySQL into other systems in real time. Canal provides a unified format schema for changelogs and supports serializing messages using JSON and protobuf (the default format for Canal).</p>
<p id="dli_08_0409__en-us_topic_0000001310015813_p191398442368">Flink can interpret Canal JSON messages as INSERT, UPDATE, and DELETE messages in the Flink SQL system. This feature is useful in many scenarios, such as:</p>
<ul id="dli_08_0409__en-us_topic_0000001310015813_ul7139164493616"><li id="dli_08_0409__en-us_topic_0000001310015813_li18139114419367">Synchronizing incremental data from databases to other systems</li><li id="dli_08_0409__en-us_topic_0000001310015813_li1913917445364">Auditing logs</li><li id="dli_08_0409__en-us_topic_0000001310015813_li0139134416368">Maintaining real-time materialized views on databases</li><li id="dli_08_0409__en-us_topic_0000001310015813_li181391444153617">Performing temporal joins against the changing history of a database table</li></ul>
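<p>As a minimal sketch of the materialized view scenario, the following statements aggregate over a Canal JSON source so that the result is updated whenever a row changes in MySQL. The sketch assumes the <strong>kafkaSource</strong> table defined in the Example section of this topic. The sink name <strong>printWeightPerName</strong> and its columns are hypothetical and used only for illustration.</p>
<pre class="screen">-- Hypothetical print sink that receives the continuously updated aggregate.
create table printWeightPerName(
  name string,
  total_weight DECIMAL(38, 2)
) with (
  'connector' = 'print'
);
-- Each UPDATE or DELETE in the Canal changelog retracts the old row and
-- applies the new one, so the total per name follows the source table.
insert into printWeightPerName
select name, cast(sum(weight) as DECIMAL(38, 2)) as total_weight
from kafkaSource
group by name;</pre>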
<p id="dli_08_0409__en-us_topic_0000001310015813_p413911444367">Flink can also encode the INSERT, UPDATE, and DELETE messages in Flink SQL as Canal JSON messages and emit them to storage such as Kafka. However, Flink currently cannot combine UPDATE_BEFORE and UPDATE_AFTER into a single UPDATE message. Therefore, Flink encodes UPDATE_BEFORE and UPDATE_AFTER as DELETE and INSERT Canal messages, respectively.</p>
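<p>The encoding direction could look like the following sketch, which forwards a changelog to another Kafka topic as Canal JSON. It assumes the <strong>kafkaSource</strong> table defined in the Example section of this topic and assumes that the Kafka connector in your environment accepts <strong>canal-json</strong> on the sink side. The table name <strong>kafkaCanalSink</strong> and the placeholder <strong>&lt;yourSinkTopic&gt;</strong> are hypothetical.</p>
<pre class="screen">-- Hypothetical Kafka sink that serializes changelog rows as Canal JSON.
create table kafkaCanalSink(
  id bigint,
  name string,
  description string,
  weight DECIMAL(10, 2)
) with (
  'connector' = 'kafka',
  'topic' = '&lt;yourSinkTopic&gt;',
  'properties.bootstrap.servers' = '&lt;yourKafkaAddress&gt;:&lt;yourKafkaPort&gt;',
  'format' = 'canal-json'
);
-- An UPDATE read from the source is written as a DELETE message
-- followed by an INSERT message, as described above.
insert into kafkaCanalSink select * from kafkaSource;</pre>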
</div>
<div class="section" id="dli_08_0409__en-us_topic_0000001310015813_section1392435673512"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="dli_08_0409__en-us_topic_0000001310015813_table435879171415" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameter description</caption><thead align="left"><tr id="dli_08_0409__en-us_topic_0000001310015813_row93599914149"><th align="left" class="cellrowborder" valign="top" width="18.34%" id="mcps1.3.2.2.2.6.1.1"><p id="dli_08_0409__en-us_topic_0000001310015813_p12919181516141">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="11.959999999999999%" id="mcps1.3.2.2.2.6.1.2"><p id="dli_08_0409__en-us_topic_0000001310015813_p4359194143">Mandatory</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="10.93%" id="mcps1.3.2.2.2.6.1.3"><p id="dli_08_0409__en-us_topic_0000001310015813_p635989181414">Default Value</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="13.780000000000001%" id="mcps1.3.2.2.2.6.1.4"><p id="dli_08_0409__en-us_topic_0000001310015813_p143594916144">Type</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="44.99%" id="mcps1.3.2.2.2.6.1.5"><p id="dli_08_0409__en-us_topic_0000001310015813_p93590913145">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="dli_08_0409__en-us_topic_0000001310015813_row1335959131413"><td class="cellrowborder" valign="top" width="18.34%" headers="mcps1.3.2.2.2.6.1.1 "><p id="dli_08_0409__en-us_topic_0000001310015813_p15430123321510">format</p>
</td>
<td class="cellrowborder" valign="top" width="11.959999999999999%" headers="mcps1.3.2.2.2.6.1.2 "><p id="dli_08_0409__en-us_topic_0000001310015813_p144302033111510">Yes</p>
</td>
<td class="cellrowborder" valign="top" width="10.93%" headers="mcps1.3.2.2.2.6.1.3 "><p id="dli_08_0409__en-us_topic_0000001310015813_p1643043361513">None</p>
</td>
<td class="cellrowborder" valign="top" width="13.780000000000001%" headers="mcps1.3.2.2.2.6.1.4 "><p id="dli_08_0409__en-us_topic_0000001310015813_p6430103312158">String</p>
</td>
<td class="cellrowborder" valign="top" width="44.99%" headers="mcps1.3.2.2.2.6.1.5 "><p id="dli_08_0409__en-us_topic_0000001310015813_p643033311519">Format to be used. Set this parameter to <strong id="dli_08_0409__en-us_topic_0000001310015813_b12755964512">canal-json</strong>.</p>
</td>
</tr>
<tr id="dli_08_0409__en-us_topic_0000001310015813_row935919951413"><td class="cellrowborder" valign="top" width="18.34%" headers="mcps1.3.2.2.2.6.1.1 "><p id="dli_08_0409__en-us_topic_0000001310015813_p16430123351510">canal-json.ignore-parse-errors</p>
</td>
<td class="cellrowborder" valign="top" width="11.959999999999999%" headers="mcps1.3.2.2.2.6.1.2 "><p id="dli_08_0409__en-us_topic_0000001310015813_p154301433141513">No</p>
</td>
<td class="cellrowborder" valign="top" width="10.93%" headers="mcps1.3.2.2.2.6.1.3 "><p id="dli_08_0409__en-us_topic_0000001310015813_p11430133131513">false</p>
</td>
<td class="cellrowborder" valign="top" width="13.780000000000001%" headers="mcps1.3.2.2.2.6.1.4 "><p id="dli_08_0409__en-us_topic_0000001310015813_p154301933101516">Boolean</p>
</td>
<td class="cellrowborder" valign="top" width="44.99%" headers="mcps1.3.2.2.2.6.1.5 "><p id="dli_08_0409__en-us_topic_0000001310015813_p5430333171512">Whether to skip fields and rows with parse errors instead of failing. The default value is <strong id="dli_08_0409__en-us_topic_0000001310015813_b15814394712">false</strong>, indicating that an error is thrown. If this parameter is set to <strong>true</strong>, fields with parse errors are set to null.</p>
</td>
</tr>
<tr id="dli_08_0409__en-us_topic_0000001310015813_row835959111418"><td class="cellrowborder" valign="top" width="18.34%" headers="mcps1.3.2.2.2.6.1.1 "><p id="dli_08_0409__en-us_topic_0000001310015813_p1943063321514">canal-json.timestamp-format.standard</p>
</td>
<td class="cellrowborder" valign="top" width="11.959999999999999%" headers="mcps1.3.2.2.2.6.1.2 "><p id="dli_08_0409__en-us_topic_0000001310015813_p1643023351517">No</p>
</td>
<td class="cellrowborder" valign="top" width="10.93%" headers="mcps1.3.2.2.2.6.1.3 "><p id="dli_08_0409__en-us_topic_0000001310015813_p835811243316">'SQL'</p>
</td>
<td class="cellrowborder" valign="top" width="13.780000000000001%" headers="mcps1.3.2.2.2.6.1.4 "><p id="dli_08_0409__en-us_topic_0000001310015813_p7430433101519">String</p>
</td>
<td class="cellrowborder" valign="top" width="44.99%" headers="mcps1.3.2.2.2.6.1.5 "><p id="dli_08_0409__en-us_topic_0000001310015813_p54301433181514">Input and output timestamp formats. Currently supported values are <strong id="dli_08_0409__en-us_topic_0000001310015813_b20801173974119">SQL</strong> and <strong id="dli_08_0409__en-us_topic_0000001310015813_b1815545144115">ISO-8601</strong>:</p>
<ul id="dli_08_0409__en-us_topic_0000001310015813_ul14430123311518"><li id="dli_08_0409__en-us_topic_0000001310015813_li13430133171512"><strong id="dli_08_0409__en-us_topic_0000001310015813_b1989117269439">SQL</strong> parses input timestamps in the "yyyy-MM-dd HH:mm:ss.s{precision}" format, for example, <strong id="dli_08_0409__en-us_topic_0000001310015813_b095610444413">2020-12-30 12:13:14.123</strong>, and outputs timestamps in the same format.</li><li id="dli_08_0409__en-us_topic_0000001310015813_li19430233111512"><strong id="dli_08_0409__en-us_topic_0000001310015813_b1696253584317">ISO-8601</strong> parses input timestamps in the "yyyy-MM-ddTHH:mm:ss.s{precision}" format, for example, <strong id="dli_08_0409__en-us_topic_0000001310015813_b59303490430">2020-12-30T12:13:14.123</strong>, and outputs timestamps in the same format.</li></ul>
</td>
</tr>
<tr id="dli_08_0409__en-us_topic_0000001310015813_row1736016971410"><td class="cellrowborder" valign="top" width="18.34%" headers="mcps1.3.2.2.2.6.1.1 "><p id="dli_08_0409__en-us_topic_0000001310015813_p743113391514">canal-json.map-null-key.mode</p>
</td>
<td class="cellrowborder" valign="top" width="11.959999999999999%" headers="mcps1.3.2.2.2.6.1.2 "><p id="dli_08_0409__en-us_topic_0000001310015813_p4431153361520">No</p>
</td>
<td class="cellrowborder" valign="top" width="10.93%" headers="mcps1.3.2.2.2.6.1.3 "><p id="dli_08_0409__en-us_topic_0000001310015813_p76701236153311">'FAIL'</p>
</td>
<td class="cellrowborder" valign="top" width="13.780000000000001%" headers="mcps1.3.2.2.2.6.1.4 "><p id="dli_08_0409__en-us_topic_0000001310015813_p4431123361515">String</p>
</td>
<td class="cellrowborder" valign="top" width="44.99%" headers="mcps1.3.2.2.2.6.1.5 "><p id="dli_08_0409__en-us_topic_0000001310015813_p74311333151519">Handling mode when serializing null keys for map data. Available values are as follows:</p>
<ul id="dli_08_0409__en-us_topic_0000001310015813_ul14431533201515"><li id="dli_08_0409__en-us_topic_0000001310015813_li1143111337157"><strong id="dli_08_0409__en-us_topic_0000001310015813_b1321417016423">FAIL</strong> throws an exception when a map entry with a null key is encountered.</li><li id="dli_08_0409__en-us_topic_0000001310015813_li943113341518"><strong id="dli_08_0409__en-us_topic_0000001310015813_b473413410422">DROP</strong> drops map entries that have a null key.</li><li id="dli_08_0409__en-us_topic_0000001310015813_li1543173319157"><strong id="dli_08_0409__en-us_topic_0000001310015813_b9974171234218">LITERAL</strong> replaces null keys in the map with a string constant. The string literal is defined by the <strong id="dli_08_0409__en-us_topic_0000001310015813_b1896016276445">canal-json.map-null-key.literal</strong> option.</li></ul>
</td>
</tr>
<tr id="dli_08_0409__en-us_topic_0000001310015813_row436119971415"><td class="cellrowborder" valign="top" width="18.34%" headers="mcps1.3.2.2.2.6.1.1 "><p id="dli_08_0409__en-us_topic_0000001310015813_p84316338150">canal-json.map-null-key.literal</p>
</td>
<td class="cellrowborder" valign="top" width="11.959999999999999%" headers="mcps1.3.2.2.2.6.1.2 "><p id="dli_08_0409__en-us_topic_0000001310015813_p64311339152">No</p>
</td>
<td class="cellrowborder" valign="top" width="10.93%" headers="mcps1.3.2.2.2.6.1.3 "><p id="dli_08_0409__en-us_topic_0000001310015813_p943173315151">'null'</p>
</td>
<td class="cellrowborder" valign="top" width="13.780000000000001%" headers="mcps1.3.2.2.2.6.1.4 "><p id="dli_08_0409__en-us_topic_0000001310015813_p164315335154">String</p>
</td>
<td class="cellrowborder" valign="top" width="44.99%" headers="mcps1.3.2.2.2.6.1.5 "><p id="dli_08_0409__en-us_topic_0000001310015813_p184311133121512">String literal to replace null key when <strong id="dli_08_0409__en-us_topic_0000001310015813_b63473105523">canal-json.map-null-key.mode</strong> is <strong id="dli_08_0409__en-us_topic_0000001310015813_b43571212155212">LITERAL</strong>.</p>
</td>
</tr>
<tr id="dli_08_0409__en-us_topic_0000001310015813_row63611795147"><td class="cellrowborder" valign="top" width="18.34%" headers="mcps1.3.2.2.2.6.1.1 "><p id="dli_08_0409__en-us_topic_0000001310015813_p8431033101518">canal-json.database.include</p>
</td>
<td class="cellrowborder" valign="top" width="11.959999999999999%" headers="mcps1.3.2.2.2.6.1.2 "><p id="dli_08_0409__en-us_topic_0000001310015813_p124311333181515">No</p>
</td>
<td class="cellrowborder" valign="top" width="10.93%" headers="mcps1.3.2.2.2.6.1.3 "><p id="dli_08_0409__en-us_topic_0000001310015813_p18431123310157">None</p>
<p id="dli_08_0409__en-us_topic_0000001310015813_p11431533151520"></p>
</td>
<td class="cellrowborder" valign="top" width="13.780000000000001%" headers="mcps1.3.2.2.2.6.1.4 "><p id="dli_08_0409__en-us_topic_0000001310015813_p543112336155">String</p>
<p id="dli_08_0409__en-us_topic_0000001310015813_p1143153318159"></p>
</td>
<td class="cellrowborder" valign="top" width="44.99%" headers="mcps1.3.2.2.2.6.1.5 "><p id="dli_08_0409__en-us_topic_0000001310015813_p943114330152">An optional regular expression. Only changelog rows whose <strong id="dli_08_0409__en-us_topic_0000001310015813_b44821618195316">database</strong> meta field in the Canal record matches the expression are read.</p>
</td>
</tr>
<tr id="dli_08_0409__en-us_topic_0000001310015813_row17129103923716"><td class="cellrowborder" valign="top" width="18.34%" headers="mcps1.3.2.2.2.6.1.1 "><p id="dli_08_0409__en-us_topic_0000001310015813_p121300392376">canal-json.table.include</p>
</td>
<td class="cellrowborder" valign="top" width="11.959999999999999%" headers="mcps1.3.2.2.2.6.1.2 "><p id="dli_08_0409__en-us_topic_0000001310015813_p31306396372">No</p>
</td>
<td class="cellrowborder" valign="top" width="10.93%" headers="mcps1.3.2.2.2.6.1.3 "><p id="dli_08_0409__en-us_topic_0000001310015813_p14130133918371">None</p>
</td>
<td class="cellrowborder" valign="top" width="13.780000000000001%" headers="mcps1.3.2.2.2.6.1.4 "><p id="dli_08_0409__en-us_topic_0000001310015813_p0130133953718">String</p>
</td>
<td class="cellrowborder" valign="top" width="44.99%" headers="mcps1.3.2.2.2.6.1.5 "><p id="dli_08_0409__en-us_topic_0000001310015813_p113013953710">An optional regular expression. Only changelog rows whose <strong id="dli_08_0409__en-us_topic_0000001310015813_b186504238533">table</strong> meta field in the Canal record matches the expression are read.</p>
</td>
</tr>
</tbody>
</table>
</div>
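<p>The optional parameters in Table 1 are set in the <strong>with</strong> clause next to <strong>format</strong>. The following is a minimal sketch rather than a recommended configuration. The table name <strong>kafkaSourceFiltered</strong> is hypothetical, and the values <strong>inventory</strong> and <strong>products</strong> are assumptions chosen to match the sample message in the Example section.</p>
<pre class="screen">create table kafkaSourceFiltered(
  id bigint,
  name string,
  description string,
  weight DECIMAL(10, 2)
) with (
  'connector' = 'kafka',
  'topic' = '&lt;yourTopic&gt;',
  'properties.group.id' = '&lt;yourGroupId&gt;',
  'properties.bootstrap.servers' = '&lt;yourKafkaAddress&gt;:&lt;yourKafkaPort&gt;',
  'scan.startup.mode' = 'latest-offset',
  'format' = 'canal-json',
  -- Skip fields and rows that cannot be parsed instead of failing the job.
  'canal-json.ignore-parse-errors' = 'true',
  -- Read only changelog rows whose database meta field matches this expression.
  'canal-json.database.include' = 'inventory',
  -- Read only changelog rows whose table meta field matches this expression.
  'canal-json.table.include' = 'products'
);</pre>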
</div>
<div class="section" id="dli_08_0409__en-us_topic_0000001310015813_section8503847368"><h4 class="sectiontitle">Supported Connectors</h4><ul id="dli_08_0409__en-us_topic_0000001310015813_ul11921748173912"><li id="dli_08_0409__en-us_topic_0000001310015813_li79211348183917">Kafka</li></ul>
</div>
<div class="section" id="dli_08_0409__en-us_topic_0000001310015813_section1713602117369"><h4 class="sectiontitle">Example</h4><p id="dli_08_0409__en-us_topic_0000001310015813_p15881132116016">Read Canal JSON data from Kafka and write it to the print sink.</p>
<ol id="dli_08_0409__en-us_topic_0000001310015813_ol06367232518"><li id="dli_08_0409__en-us_topic_0000001310015813_li04031578234"><span>Create a datasource connection to the VPC and subnet where Kafka is located, and bind the connection to the queue. Configure a security group with an inbound rule that allows access from the queue, and then test the connectivity from the queue to the Kafka address. For example, locate the general-purpose queue where the job runs and choose <strong id="dli_08_0409__en-us_topic_0000001310015813_b198477295114">More</strong> &gt; <strong id="dli_08_0409__en-us_topic_0000001310015813_b884720293112">Test Address Connectivity</strong> in the <strong id="dli_08_0409__en-us_topic_0000001310015813_b0847162912114">Operation</strong> column. If the address is reachable, the datasource connection is bound to the queue. Otherwise, the binding has failed.</span></li><li id="dli_08_0409__en-us_topic_0000001310015813_li1599913011242"><span>Create a Flink OpenSource SQL job and select Flink 1.12. Copy the following statements and submit the job:</span><p><pre class="screen" id="dli_08_0409__en-us_topic_0000001310015813_screen177262050142516">create table kafkaSource(
id bigint,
name string,
description string,
weight DECIMAL(10, 2)
) with (
'connector' = 'kafka',
'topic' = '&lt;yourTopic&gt;',
'properties.group.id' = '&lt;yourGroupId&gt;',
'properties.bootstrap.servers' = '&lt;yourKafkaAddress&gt;:&lt;yourKafkaPort&gt;',
'scan.startup.mode' = 'latest-offset',
'format' = 'canal-json'
);
create table printSink(
id bigint,
name string,
description string,
weight DECIMAL(10, 2)
) with (
'connector' = 'print'
);
insert into printSink select * from kafkaSource;</pre>
</p></li><li id="dli_08_0409__en-us_topic_0000001310015813_li185918297252"><span>Send the following data to the corresponding Kafka topic:</span><p><pre class="screen" id="dli_08_0409__en-us_topic_0000001310015813_screen2451108112611">{
"data": [
{
"id": "111",
"name": "scooter",
"description": "Big 2-wheel scooter",
"weight": "5.18"
}
],
"database": "inventory",
"es": 1589373560000,
"id": 9,
"isDdl": false,
"mysqlType": {
"id": "INTEGER",
"name": "VARCHAR(255)",
"description": "VARCHAR(512)",
"weight": "FLOAT"
},
"old": [
{
"weight": "5.15"
}
],
"pkNames": [
"id"
],
"sql": "",
"sqlType": {
"id": 4,
"name": 12,
"description": 12,
"weight": 7
},
"table": "products",
"ts": 1589373560798,
"type": "UPDATE"
}</pre>
</p></li><li id="dli_08_0409__en-us_topic_0000001310015813_li4353143193117"><span>View the output through either of the following methods:</span><p><ul id="dli_08_0409__en-us_topic_0000001310015813_ul173531443113114"><li id="dli_08_0409__en-us_topic_0000001310015813_li735314312316">Method 1: Locate the job and click <strong id="dli_08_0409__en-us_topic_0000001310015813_b36033121329">More</strong> &gt; <strong id="dli_08_0409__en-us_topic_0000001310015813_b0603171215219">FlinkUI</strong>. Choose <strong id="dli_08_0409__en-us_topic_0000001310015813_b760313121629">Task Managers</strong> &gt; <strong id="dli_08_0409__en-us_topic_0000001310015813_b4604121219215">Stdout</strong>.</li><li id="dli_08_0409__en-us_topic_0000001310015813_li33538437315">Method 2: If you allow DLI to save job logs in OBS, view the output in the <strong id="dli_08_0409__en-us_topic_0000001310015813_b12785191315212">taskmanager.out</strong> file.</li></ul>
<pre class="screen" id="dli_08_0409__en-us_topic_0000001310015813_screen1526419222268">-U(111,scooter,Big 2-wheel scooter,5.15)
+U(111,scooter,Big 2-wheel scooter,5.18)</pre>
</p></li></ol>
<p id="dli_08_0409__en-us_topic_0000001310015813_p13300127285"></p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="dli_08_0407.html">Format</a></div>
</div>
</div>