Flink StreamingFileSink.forBulkFormat to HDFS

Flink StreamingFileSink.forBulkFormat to HDFS

yanggang_it_job
When consuming Kafka data into HDFS, can this support ORC-format Hive tables?


1. Guarantee EXACTLY_ONCE semantics
2. Support the ORC format with Snappy and ZLIB compression

Re: Flink StreamingFileSink.forBulkFormat to HDFS

Congxian Qiu
Hi,

Writing ORC is possible: you can control which HDFS path the sink writes to, and the Hive table can simply be defined over that path, so the problem reduces to writing ORC-format files to HDFS. You can also refer to these two links [1][2].

For Exactly Once, take a look at this document [3].

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/connectors/streamfile_sink.html
[2]
https://stackoverflow.com/questions/47669729/how-to-write-to-orc-files-using-bucketingsink-in-apache-flink
[3]
https://flink.apache.org/features/2018/03/01/end-to-end-exactly-once-apache-flink.html
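To make the suggestion above concrete, here is a minimal sketch (not from the original thread) of wiring `StreamingFileSink.forBulkFormat` to an ORC writer. It assumes Flink 1.10+ with the `flink-orc` dependency, which provides `OrcBulkWriterFactory`; the `Event` record type, the `EventVectorizer`, and the HDFS path are illustrative only.

```java
import org.apache.flink.core.fs.Path;
import org.apache.flink.orc.vector.Vectorizer;
import org.apache.flink.orc.writer.OrcBulkWriterFactory;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class OrcSinkSketch {

    // Hypothetical record type, for illustration only.
    public static class Event {
        public long id;
        public String name;
    }

    // A Vectorizer turns each record into ORC's columnar batch representation.
    // The schema string must match the Hive table definition over the same path.
    public static class EventVectorizer extends Vectorizer<Event> {
        public EventVectorizer() {
            super("struct<id:bigint,name:string>");
        }

        @Override
        public void vectorize(Event e, VectorizedRowBatch batch) throws IOException {
            int row = batch.size++;
            ((LongColumnVector) batch.cols[0]).vector[row] = e.id;
            byte[] bytes = e.name.getBytes(StandardCharsets.UTF_8);
            ((BytesColumnVector) batch.cols[1]).setVal(row, bytes);
        }
    }

    public static StreamingFileSink<Event> buildSink() {
        // ORC writer properties control compression: SNAPPY or ZLIB, as asked.
        Properties props = new Properties();
        props.setProperty("orc.compress", "SNAPPY");

        OrcBulkWriterFactory<Event> factory =
                new OrcBulkWriterFactory<>(new EventVectorizer(), props, new Configuration());

        // Bulk formats roll files on checkpoint, which is what makes the
        // sink's exactly-once file commits work.
        return StreamingFileSink
                .forBulkFormat(new Path("hdfs:///warehouse/events"), factory)
                .build();
    }
}
```

With this, pointing an external Hive table (`STORED AS ORC`) at `hdfs:///warehouse/events` makes the finished files queryable.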

Best,
Congxian


yanggang_it_job <[hidden email]> wrote on Sun, Oct 13, 2019 at 6:21 PM:

> When consuming Kafka data into HDFS, can this support ORC-format Hive tables?
>
>
> 1. Guarantee EXACTLY_ONCE semantics
> 2. Support the ORC format with Snappy and ZLIB compression
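For the EXACTLY_ONCE requirement in the question, the key point (covered in link [3] above) is that `StreamingFileSink` only commits in-progress files when a checkpoint completes, so checkpointing must be enabled. A minimal sketch, assuming the standard `StreamExecutionEnvironment` setup:

```java
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointSetup {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Without checkpointing the sink leaves files in an in-progress state
        // forever; with it, files are finalized atomically per checkpoint,
        // giving exactly-once output. The 60s interval is an example value.
        env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);
    }
}
```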