Hi all, could someone please help me take a look at this?

Flink version: 1.10.0
Planner: blink

I ran into a problem while using Flink SQL. I have looked for a solution, but nothing I tried has worked. For example, I found a similar problem described in
https://www.mail-archive.com/user-zh@.../msg03916.html
and set the time zone as suggested in that mail, but the problem is still there.

Flink Table Env configuration:

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
    env.setParallelism(1);
    EnvironmentSettings envSetting = EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build();
    StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env, envSetting);
    tableEnv.getConfig().setLocalTimeZone(ZoneId.of("Asia/Shanghai"));

My job defines two tables. The source table, "sqlDdlAnaTable":

    String sqlDdlAnaTable = "CREATE TABLE ana_Source(type INT, datatime BIGINT, list ARRAY <ROW(id STRING, v FLOAT, q INTEGER)>, ts AS TO_TIMESTAMP(FROM_UNIXTIME(datatime)), WATERMARK FOR ts AS ts - INTERVAL '5' SECOND)" +
        " WITH (" +
        "'connector.type' = 'pravega'," +
        "'connector.version' = '1'," +
        "'connector.connection-config.controller-uri'= 'tcp://192.168.188.130:9090'," +
        "'connector.connection-config.default-scope' = 'Demo'," +
        "'connector.reader.stream-info.0.stream' = 'test'," +
        "'format.type' = 'json'," +
        "'format.fail-on-missing-field' = 'false', " +
        "'update-mode' = 'append')";

and the sink table, "sqlDdlSinkTable":

    String sqlDdlSinkTable = "CREATE TABLE tb_sink" +
        "(id STRING, " +
        "wStart TIMESTAMP(3) , " +
        "v FLOAT)" +
        " WITH (" +
        "'connector.type' = 'pravega'," +
        "'connector.version' = '1'," +
        "'connector.connection-config.controller-uri'= 'tcp://192.168.188.130:9090'," +
        "'connector.connection-config.default-scope' = 'Demo'," +
        "'connector.writer.stream' = 'result'," +
        "'connector.writer.routingkey-field-name' = 'id'," +
        "'connector.writer.mode' = 'atleast_once'," +
        "'format.type' = 'json'," +
        "'update-mode' = 'append')";

The processing logic is fairly simple: compute the average of the values over a 10 s tumbling window.
At first I printed the result directly and could clearly see a result emitted every 10 s, with the watermark advancing normally.

    String sqlAna = "SELECT ts, id, v " +
        "FROM tb_JsonRecord " +
        "WHERE q=1 AND type=1";
    Table tableAnaRecord = tableEnv.sqlQuery(sqlAna);
    tableEnv.registerTable("tb_AnaRecord", tableAnaRecord);

    tableEnv.toAppendStream(tableAnaRecord, Row.class).print();

However, when I try to insert the result into the sink table, nothing is written at all.

    String sqlAnaAvg = "INSERT INTO tb_sink(id, wStart, v) " +
        "SELECT id, " +
        "TUMBLE_START(ts, INTERVAL '10' SECOND) as wStart, " +
        "AVG(v) FROM tb_AnaRecord " +
        "GROUP BY TUMBLE(ts, INTERVAL '10' SECOND), id";
    tableEnv.sqlUpdate(sqlAnaAvg);

Thanks in advance!

BR//Chao
Hi,
The pravega connector [1] does not appear to be provided by the community, and I have not read its code before. From your description, I would check whether any parameters need to be set for writing.

Best,
Leonard Xu

[1] https://github.com/pravega/flink-connectors

> On Jun 19, 2020, at 13:31, 王超 <[hidden email]> wrote:
>
> Hi all, could someone please help me take a look at this?
> [...]
Apart from the pravega connector, is there anything else that could be wrong?

On Fri, Jun 19, 2020 at 18:43 Leonard Xu <[hidden email]> wrote:

> Hi,
> The pravega connector [1] does not appear to be provided by the community [...]
> From your description, I would check whether any parameters need to be set for writing.
> [...]
Hi,
From your description, the problem most likely lies in the sink, because the plain query produces output while nothing shows up once you insert into the sink.
You can look at the job's metrics, i.e. the input and output of each operator, to see whether the window operator emits anything.
You can also look at the implementation of this sink to see whether it has some batching logic, for example only writing once a certain number of records has accumulated.

On Sat, Jun 20, 2020 at 1:18 PM 王超 <[hidden email]> wrote:

> Apart from the pravega connector, is there anything else that could be wrong?
> [...]
> --
> Sent from Gmail Mobile

--
Best,
Benchao Li
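When checking whether the window operator emits anything, it may help to recall when an event-time tumbling window can fire at all. The following is a minimal plain-Java sketch, not Flink code: the class and method names are hypothetical, the window-start arithmetic mirrors Flink's `TimeWindow.getWindowStartWithOffset` with a zero offset, and the watermark is assumed to be exactly the largest `ts` seen minus the 5-second delay declared in the source DDL.

```java
final class WindowFiring {
    // Start (in ms) of the tumbling window that contains timestampMs, zero offset.
    public static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs + sizeMs) % sizeMs;
    }

    // Assumed watermark for "WATERMARK FOR ts AS ts - INTERVAL '5' SECOND":
    // the largest timestamp seen so far minus the declared delay.
    public static long watermark(long maxSeenTimestampMs, long delayMs) {
        return maxSeenTimestampMs - delayMs;
    }

    // An event-time window can fire once the watermark reaches its
    // maximum timestamp, which is the window end minus 1 ms.
    public static boolean fires(long windowStartMs, long sizeMs, long watermarkMs) {
        return watermarkMs >= windowStartMs + sizeMs - 1;
    }

    public static void main(String[] args) {
        long size = 10_000L;   // TUMBLE(ts, INTERVAL '10' SECOND)
        long delay = 5_000L;   // watermark delay from the DDL
        long start = windowStart(12_000L, size);
        System.out.println("window start: " + start);
        // The window [10 s, 20 s) stays open until an event with ts >= 25 s arrives.
        System.out.println("fires at ts=24s: " + fires(start, size, watermark(24_000L, delay)));
        System.out.println("fires at ts=25s: " + fires(start, size, watermark(25_000L, delay)));
    }
}
```

Under these assumptions a window only produces a row once later events push the watermark past its end, so a source that has stopped receiving data leaves the last window open and the sink sees nothing for it.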
You could swap in a different sink, such as kafka, to verify whether the sink is the problem.

Btw, in 1.11 it is very easy to create a test sink, for example:

    CREATE TABLE print_table WITH ('connector' = 'print')
    LIKE tb_sink (EXCLUDING ALL)

Best,
Jark

On Sat, 20 Jun 2020 at 15:51, Benchao Li <[hidden email]> wrote:

> Hi,
>
> From your description, the problem most likely lies in the sink [...]
> You can look at the job's metrics [...] to see whether the window operator emits anything.
> [...]
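For Flink 1.10, where the `print` connector shown above is not available, the suggestion to try a kafka sink could be sketched as follows, in the same DDL-as-Java-string style as the original post. This is an illustration under assumptions: the class name, the table name `tb_sink_kafka`, the topic and the broker address are hypothetical, and the property keys are the ones I recall from the Flink 1.10 Kafka connector documentation, so please verify them against your version.

```java
final class KafkaTestSink {
    // Builds a DDL string for a Kafka sink with the same columns as tb_sink.
    // topic and bootstrapServers are placeholders supplied by the caller.
    public static String ddl(String topic, String bootstrapServers) {
        return "CREATE TABLE tb_sink_kafka" +
            "(id STRING, " +
            "wStart TIMESTAMP(3), " +
            "v FLOAT)" +
            " WITH (" +
            "'connector.type' = 'kafka'," +
            "'connector.version' = 'universal'," +
            "'connector.topic' = '" + topic + "'," +
            "'connector.properties.bootstrap.servers' = '" + bootstrapServers + "'," +
            "'format.type' = 'json'," +
            "'update-mode' = 'append')";
    }

    public static void main(String[] args) {
        // Hypothetical topic and local broker, for illustration only.
        System.out.println(ddl("result", "localhost:9092"));
    }
}
```

The resulting string would be passed to `tableEnv.sqlUpdate(...)` just like `sqlDdlSinkTable`, with the `INSERT INTO` statement pointed at `tb_sink_kafka`. If rows arrive in Kafka but not in Pravega, the sink is the likely culprit. As far as I know, in Flink 1.10 statements registered via `sqlUpdate` are only submitted when `tableEnv.execute(jobName)` is called, so that call is worth double-checking as well.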