Flink supports the Apache Avro format for reading and writing Avro data based on an Avro schema. The Avro schema is derived from the table schema.
| Parameter | Mandatory | Default Value | Type | Description |
|---|---|---|---|---|
| format | Yes | None | String | Format to be used. Set the value to avro. |
| avro.codec | No | None | String | Avro compression codec, available for the filesystem connector only. The codec is disabled by default. Available values are deflate, snappy, bzip2, and xz. |
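For example, to write compressed Avro files through a filesystem sink, set avro.codec in the WITH clause. The following is a minimal sketch; the table schema and the path /tmp/avro-output are placeholders, and snappy is only one of the supported codecs.

```sql
-- Minimal sketch: a filesystem sink writing snappy-compressed Avro files.
-- The schema and output path below are placeholders.
CREATE TABLE avroFileSink (
  order_id string,
  pay_amount double
) WITH (
  'connector' = 'filesystem',
  'path' = '/tmp/avro-output',
  'format' = 'avro',
  'avro.codec' = 'snappy'
);
```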
Currently, the Avro schema is derived from the table schema and cannot be explicitly defined. The following table lists the mappings from Flink types to Avro types.

In addition to the types listed below, Flink supports reading and writing nullable types. Flink maps a nullable type to the Avro union(something, null), where something is the Avro type converted from the corresponding non-null Flink type.

You can refer to the Apache Avro 1.11.0 Specification for more information about Avro types.
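For instance, a nullable Flink STRING column maps to a union of string and null. A sketch of the corresponding field entry in the derived Avro schema (the field name is illustrative):

```json
{"name": "user_name", "type": ["string", "null"]}
```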
| Flink SQL Type | Avro Type | Avro Logical Type |
|---|---|---|
| CHAR / VARCHAR / STRING | string | - |
| BOOLEAN | boolean | - |
| BINARY / VARBINARY | bytes | - |
| DECIMAL | fixed | decimal |
| TINYINT | int | - |
| SMALLINT | int | - |
| INT | int | - |
| BIGINT | long | - |
| FLOAT | float | - |
| DOUBLE | double | - |
| DATE | int | date |
| TIME | int | time-millis |
| TIMESTAMP | long | timestamp-millis |
| ARRAY | array | - |
| MAP (keys must be of the STRING, CHAR, or VARCHAR type) | map | - |
| MULTISET (elements must be of the STRING, CHAR, or VARCHAR type) | map | - |
| ROW | record | - |
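To illustrate how these mappings combine, below is a sketch of the Avro record schema that could be derived for a table with the columns order_id STRING, pay_amount DOUBLE, and order_time TIMESTAMP(3). The record name and exact layout are illustrative; note how each nullable column becomes a union with null, and how TIMESTAMP carries the timestamp-millis logical type.

```json
{
  "type": "record",
  "name": "record",
  "fields": [
    {"name": "order_id", "type": ["string", "null"]},
    {"name": "pay_amount", "type": ["double", "null"]},
    {"name": "order_time",
     "type": [{"type": "long", "logicalType": "timestamp-millis"}, "null"]}
  ]
}
```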
The following example reads data from a Kafka topic, deserializes it from the Avro format, and writes the result to the Print connector.
```sql
CREATE TABLE kafkaSource (
  order_id string,
  order_channel string,
  order_time string,
  pay_amount double,
  real_pay double,
  pay_time string,
  user_id string,
  user_name string,
  area_id string
) WITH (
  'connector' = 'kafka',
  'topic' = '<yourTopic>',
  'properties.bootstrap.servers' = '<yourKafkaAddress1>:<yourKafkaPort>,<yourKafkaAddress2>:<yourKafkaPort>,<yourKafkaAddress3>:<yourKafkaPort>',
  'properties.group.id' = '<yourGroupId>',
  'scan.startup.mode' = 'latest-offset',
  'format' = 'avro'
);

CREATE TABLE printSink (
  order_id string,
  order_channel string,
  order_time string,
  pay_amount double,
  real_pay double,
  pay_time string,
  user_id string,
  user_name string,
  area_id string
) WITH (
  'connector' = 'print'
);

INSERT INTO printSink SELECT * FROM kafkaSource;
```
{"order_id":"202103241000000001","order_channel":"webShop","order_time":"2021-03-24 10:00:00","pay_amount":100.0,"real_pay":100.0,"pay_time":"2021-03-24 10:02:03","user_id":"0001","user_name":"Alice","area_id":"330106"} {"order_id":"202103241606060001","order_channel":"appShop","order_time":"2021-03-24 16:06:06","pay_amount":200.0,"real_pay":180.0,"pay_time":"2021-03-24 16:10:06","user_id":"0001","user_name":"Alice","area_id":"330106"}
The Print connector outputs the following:

```
+I(202103241000000001,webShop,2021-03-24 10:00:00,100.0,100.0,2021-03-24 10:02:03,0001,Alice,330106)
+I(202103241606060001,appShop,2021-03-24 16:06:06,200.0,180.0,2021-03-24 16:10:06,0001,Alice,330106)
```