Flink StreamingFileSink rolling policy

Flink StreamingFileSink rolling policy

guoliang_wang1335
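Can a custom rolling policy be used when Flink StreamingFileSink writes the Hadoop SequenceFile format in bulk mode? I would like to roll files based on file size, the maximum time a file may go without updates, and checkpoints. Can this be done by implementing the RollingPolicy interface? Thanks!

According to a note in the documentation <https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/connectors/streamfile_sink.html>, bulk encoding only has OnCheckpointRollingPolicy by default, which rolls on every checkpoint. If the checkpoint interval is poorly chosen, this produces a lot of small files.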

Re: Flink StreamingFileSink rolling policy

Jingsong Li
As long as you extend CheckpointRollingPolicy, you can implement shouldRollOnEvent and shouldRollOnProcessingTime however you like.
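
As a rough illustration of the suggestion above, here is a minimal sketch of a policy that rolls on every checkpoint and additionally on file size and inactivity. So that it compiles without Flink on the classpath, it declares small stand-ins for Flink's `PartFileInfo` and `CheckpointRollingPolicy`. Those stand-ins, the class name `SizeAndInactivityRollingPolicy`, and the two threshold parameters are assumptions made for this example and are not taken from the thread.

```java
import java.io.IOException;

// Stand-in for Flink's PartFileInfo (assumption: only the accessors used below).
interface PartFileInfo<BucketID> {
    long getSize() throws IOException;   // current size of the in-progress part file, in bytes
    long getLastUpdateTime();            // timestamp (ms) of the last write to the part file
}

// Stand-in for Flink's abstract CheckpointRollingPolicy: it always rolls on checkpoint
// and leaves the event-based and time-based decisions to subclasses.
abstract class CheckpointRollingPolicy<IN, BucketID> {
    public boolean shouldRollOnCheckpoint(PartFileInfo<BucketID> partFileState) {
        return true;
    }

    public abstract boolean shouldRollOnEvent(PartFileInfo<BucketID> partFileState, IN element)
            throws IOException;

    public abstract boolean shouldRollOnProcessingTime(
            PartFileInfo<BucketID> partFileState, long currentTime) throws IOException;
}

// Hypothetical policy: roll on checkpoint (inherited), on size, and on inactivity.
final class SizeAndInactivityRollingPolicy<IN, BucketID>
        extends CheckpointRollingPolicy<IN, BucketID> {

    private final long maxPartSizeBytes;
    private final long inactivityIntervalMs;

    SizeAndInactivityRollingPolicy(long maxPartSizeBytes, long inactivityIntervalMs) {
        this.maxPartSizeBytes = maxPartSizeBytes;
        this.inactivityIntervalMs = inactivityIntervalMs;
    }

    // Called for each incoming element: roll once the part file has reached the size limit.
    @Override
    public boolean shouldRollOnEvent(PartFileInfo<BucketID> partFileState, IN element)
            throws IOException {
        return partFileState.getSize() >= maxPartSizeBytes;
    }

    // Called with the current processing time: roll if nothing was written for too long.
    @Override
    public boolean shouldRollOnProcessingTime(
            PartFileInfo<BucketID> partFileState, long currentTime) {
        return currentTime - partFileState.getLastUpdateTime() >= inactivityIntervalMs;
    }
}
```

In a real job the two stand-in types would be dropped in favor of the Flink classes of the same names, and the policy would be handed to the bulk-format builder, assuming the Flink version in use exposes `withRollingPolicy` there. Because the checkpoint roll is inherited, a short checkpoint interval still yields small files; the size and inactivity checks only add further roll points.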

On Wed, Aug 19, 2020 at 6:20 PM guoliang_wang1335 <[hidden email]>
wrote:

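> Can a custom rolling policy be used when Flink StreamingFileSink writes the Hadoop SequenceFile
> format in bulk mode? I would like to roll files based on file size, the maximum time a file may go without updates, and checkpoints. Can this be done by implementing the RollingPolicy interface? Thanks!
>
> According to a note in the documentation <
> https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/connectors/streamfile_sink.html
> >, bulk encoding only has OnCheckpointRollingPolicy by default, which rolls on every checkpoint. If the checkpoint interval is poorly chosen, this produces a lot of small files.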

--
Best, Jingsong Lee
Re:Re: Flink StreamingFileSink rolling policy

guoliang_wang1335
I'll give it a try, thanks.

On 2020-08-20 14:19:41, "Jingsong Li" <[hidden email]> wrote:

>As long as you extend CheckpointRollingPolicy, you can implement shouldRollOnEvent and shouldRollOnProcessingTime however you like.
Re: Re:Re: Flink StreamingFileSink rolling policy

bradyMk
CONTENTS DELETED
The author has deleted this message.
Re:Re: Re:Re: Flink StreamingFileSink rolling policy

hailongwang
Hi bradyMk,


Bulk-encoded formats can only roll on checkpoint; see the documentation [1] for details.


Best,
Hailong Wang


[1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/connectors/streamfile_sink.html#bulk-encoded-formats

On 2020-11-06 10:47:33, "bradyMk" <[hidden email]> wrote:
>Hi guoliang_wang1335,
>When using StreamingFileSink with the forBulkFormat method, can the rolling policy be customized? Did you get it working on your side?
>
>
>
>-----
>Best Wishes
>--
>Sent from: http://apache-flink.147419.n8.nabble.com/
Re: Re:Re: Re:Re: Flink StreamingFileSink rolling policy

bradyMk
CONTENTS DELETED
The author has deleted this message.