Flink 1.11.1 integration with Hive on HDP 3.0.1: cannot query Hive table data


Flink 1.11.1 integration with Hive on HDP 3.0.1: cannot query Hive table data

黄蓉
Hi all,

I am using the HDP 3.0.1 sandbox with the latest Flink, 1.11.1, using the pre-built binaries downloaded from the official site. I want to test Flink's integration with Hive, including querying data from Hive tables and writing data into them. The problem I am hitting is that queries through the Flink SQL client return no table data, and no error is reported either. Yet the same table does have records when queried from Hive itself. Other statements such as SHOW TABLES and SHOW DATABASES display correctly.
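
The behavior can be sketched like this (the table name is illustrative, not from the original report):

```sql
-- In the Flink SQL CLI, with the Hive catalog selected:
SHOW TABLES;            -- lists the Hive tables as expected
SELECT * FROM my_table; -- completes without error, but returns zero rows
```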

The Hadoop environment variables are configured as follows:
export HADOOP_CONF_DIR="/etc/hadoop/conf"
export HADOOP_HOME="/usr/hdp/3.0.1.0-187/hadoop"
export HADOOP_CLASSPATH="/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:/usr/hdp/current/hadoop-mapreduce-client/*:/usr/hdp/current/hadoop-mapreduce-client/lib/*"

The sql-client configuration file is as follows:
tables: []
functions: []
catalogs:
    - name: myhive
      type: hive
      hive-conf-dir: /opt/hive-conf
execution:
   planner: blink
   type: batch
   result-mode: table
   max-table-result-rows: 1000000
   parallelism: 3
   max-parallelism: 128
   min-idle-state-retention: 0
   max-idle-state-retention: 0
   current-catalog: myhive
   current-database: default
   restart-strategy:
     type: fallback
deployment:
   response-timeout: 5000
   gateway-address: ""
   gateway-port: 0


Could this be an incompatibility between the official Flink distribution and HDP 3.0.1? Do I need to build Flink from source myself?

Jessie
[hidden email]


Re: Flink 1.11.1 integration with Hive on HDP 3.0.1: cannot query Hive table data

Rui Li
Hello,

Which Hive version ships with your HDP? Also, what does the table you are querying look like (check with DESCRIBE FORMATTED)?
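
For example, from the Hive shell (table name illustrative); for a transactional table, the Table Parameters section of the output will include the transactional property:

```sql
-- In beeline / the Hive CLI:
DESCRIBE FORMATTED my_table;
-- For an ACID table, Table Parameters will show something like:
--   transactional              true
--   transactional_properties   default
```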

On Mon, Aug 24, 2020 at 3:02 AM 黄蓉 <[hidden email]> wrote:


--
Best regards!
Rui Li

Re: Flink 1.11.1 integration with Hive on HDP 3.0.1: cannot query Hive table data

china_tao
In reply to this post by 黄蓉
Have you downloaded flink-sql-connector-hive-3.1.2 (see https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/hive/) and put it into Flink's lib directory?

On 2020/8/24 3:01, 黄蓉 wrote:


Re[2]: Flink 1.11.1 integration with Hive on HDP 3.0.1: cannot query Hive table data

黄蓉
Thanks, everyone:

I have found the cause. Hive 3.1.0 in HDP 3.0.1 enables transactions (ACID tables) by default, while Flink 1.11 does not yet support reading from or writing to transactional Hive tables, so the two are incompatible. After I disabled the transaction-related settings in Hive, everything worked.
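
The exact settings changed are not listed in the original message; on HDP 3.x, the defaults that cause new managed tables to be created as ACID typically include the following hive-site.xml properties (a sketch only; verify the keys against your own cluster configuration):

```xml
<!-- hive-site.xml: HDP 3.x defaults that make new managed tables transactional.
     Setting these to false makes CREATE TABLE produce plain, non-ACID tables. -->
<property>
  <name>hive.strict.managed.tables</name>
  <value>false</value>
</property>
<property>
  <name>hive.create.as.insert.only</name>
  <value>false</value>
</property>
<property>
  <name>metastore.create.as.acid</name>
  <value>false</value>
</property>
```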

Jessie
[hidden email]

------ Original Message ------
From: "taochanglian" <[hidden email]>
To: [hidden email]
Sent: 8/24/2020 5:28:56 AM
Subject: Re: Flink 1.11.1 integration with Hive on HDP 3.0.1: cannot query Hive table data



Re: Flink 1.11.1 integration with Hive on HDP 3.0.1: cannot query Hive table data

china_tao
Hive 3.0 creates transactional tables by default; add TBLPROPERTIES('transactional'='false') to your CREATE TABLE statement.
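
A minimal sketch (table and column names illustrative):

```sql
-- Create a plain, non-transactional table that Flink 1.11 can read and write:
CREATE TABLE flink_demo (
  id   INT,
  name STRING
)
STORED AS ORC
TBLPROPERTIES ('transactional' = 'false');
```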


On 2020/8/24 15:43, 黄蓉 wrote:
