hive-1.2.1
The checkpoints have succeeded (I looked in the chk directory and the checkpoint data is indeed there, and Kafka has data too), but the Hive table has no data. Am I missing something?

String hiveSql = "CREATE TABLE stream_tmp.fs_table (\n" +
        "  host STRING,\n" +
        "  url STRING," +
        "  public_date STRING" +
        ") partitioned by (public_date string) " +
        "stored as PARQUET " +
        "TBLPROPERTIES (\n" +
        "  'sink.partition-commit.delay'='0 s',\n" +
        "  'sink.partition-commit.trigger'='partition-time',\n" +
        "  'sink.partition-commit.policy.kind'='metastore,success-file'" +
        ")";
tableEnv.executeSql(hiveSql);

tableEnv.executeSql("INSERT INTO stream_tmp.fs_table SELECT host, url, " +
        "DATE_FORMAT(public_date, 'yyyy-MM-dd') FROM stream_tmp.source_table");
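For readability, the concatenated string above assembles to roughly the following SQL (reconstructed from the Java code; whether it parses as written depends on which SQL dialect is active, which comes up later in the thread):

```sql
CREATE TABLE stream_tmp.fs_table (
  host STRING,
  url STRING,
  public_date STRING
) PARTITIONED BY (public_date STRING)
STORED AS PARQUET
TBLPROPERTIES (
  'sink.partition-commit.delay'='0 s',
  'sink.partition-commit.trigger'='partition-time',
  'sink.partition-commit.policy.kind'='metastore,success-file'
);
```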
Could you try configuring a rolling policy?
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/filesystem.html#sink-rolling-policy-rollover-interval

Best,
Jark

On Tue, 21 Jul 2020 at 20:38, JasonLee <[hidden email]> wrote:
> hi
> Has the Hive table never had any data, or does data show up after a while?
>
> JasonLee
> Email: [hidden email]
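Concretely, Jark's suggestion would mean adding rolling-policy options to the table's TBLPROPERTIES. A sketch with illustrative values (the option keys are from the Flink 1.11 filesystem connector page linked above; the values here are only examples, not recommendations):

```sql
-- illustrative values; see the linked docs for defaults and semantics
'sink.rolling-policy.file-size' = '128MB',
'sink.rolling-policy.rollover-interval' = '1 min',
'sink.rolling-policy.check-interval' = '1 min'
```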
It has never had any data, and I can't tell what's wrong. The table does already exist in Hive. I also tested writing to HDFS with a filesystem DDL and that works fine.
Hi,

Was the Hive table created in Flink? If so, did you use the Hive dialect when creating it? You can refer to [1] for how to set it.

Best,
Leonard Xu
[1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/hive/hive_dialect.html#use-hive-dialect
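A minimal sketch of switching to the Hive dialect before running the Hive-style DDL, per [1]. In the SQL client this is a one-line setting; in the Java API the equivalent is `tableEnv.getConfig().setSqlDialect(SqlDialect.HIVE)`:

```sql
-- switch dialect before the CREATE TABLE ... STORED AS PARQUET statement
SET table.sql-dialect=hive;
```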
How is your source table defined? Are you sure the watermark is advancing? (You can check in the Flink UI.)

Could you try removing 'sink.partition-commit.trigger'='partition-time'?

Best,
Jingsong Lee
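For a 'partition-time' commit trigger to fire, the source table needs an event-time attribute with a watermark, which is what Jingsong is asking about. A hypothetical stream_tmp.source_table definition along those lines (column names taken from the thread; the connector options and the 5-second watermark delay are placeholders, not from the original post):

```sql
CREATE TABLE stream_tmp.source_table (
  host STRING,
  url STRING,
  public_date TIMESTAMP(3),
  -- watermark makes public_date an event-time attribute so partitions can commit
  WATERMARK FOR public_date AS public_date - INTERVAL '5' SECOND
) WITH (
  'connector' = 'kafka',
  'topic' = 'my_topic',                                -- placeholder
  'properties.bootstrap.servers' = 'localhost:9092',   -- placeholder
  'format' = 'json'                                    -- placeholder
);
```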