FLINK WEEKLY 2019/43

tison
FLINK WEEKLY 2019/43 <https://zhuanlan.zhihu.com/p/88918722>

User questions

Why does any transformation on a KeyedStream turn it back into a DataStream?
<https://lists.apache.org/x/thread.html/aacea54f4f0e04b57ccabedc45791599b4855f4154342aa29352ebcd@%3Cuser-zh.flink.apache.org%3E>

You can use DataStreamUtils.reinterpretAsKeyedStream to declare up front that the stream is already partitioned by key.

Window does not trigger computation when Flink consumes from Kafka
<https://lists.apache.org/x/thread.html/2f024571a6993ed24a97ad2ee650b057e2c80a239edf407f4437511c@%3Cuser-zh.flink.apache.org%3E>

Ways to actively trigger computation when the source intermittently produces no data.
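As a rough illustration of the idea, the sketch below models a watermark generator in plain Python that falls back to wall-clock progress once no event has arrived for a while, so that open windows can still close. It is not Flink API code and not necessarily the approach recommended in the thread; the class name `IdleAwareWatermark` and the parameters `max_out_of_orderness_ms` and `idle_timeout_ms` are hypothetical names chosen for this example.

```python
class IdleAwareWatermark:
    """Toy model: the watermark follows event time, but keeps advancing
    with wall-clock time once the source has been idle for too long."""

    def __init__(self, max_out_of_orderness_ms, idle_timeout_ms):
        self.max_out_of_orderness_ms = max_out_of_orderness_ms
        self.idle_timeout_ms = idle_timeout_ms
        self.max_event_ts = None        # highest event timestamp seen so far
        self.last_event_wall_ms = None  # wall-clock time of the last event
        self.emitted = None             # last watermark handed out

    def on_event(self, event_ts_ms, now_ms):
        if self.max_event_ts is None or event_ts_ms > self.max_event_ts:
            self.max_event_ts = event_ts_ms
        self.last_event_wall_ms = now_ms

    def current_watermark(self, now_ms):
        """Meant to be called periodically, also when no events arrive."""
        if self.max_event_ts is None:
            return None
        watermark = self.max_event_ts - self.max_out_of_orderness_ms
        idle_for = now_ms - self.last_event_wall_ms
        if idle_for > self.idle_timeout_ms:
            # Source looks idle: advance by the time spent beyond the timeout.
            watermark += idle_for - self.idle_timeout_ms
        # Watermarks must never move backwards.
        if self.emitted is None or watermark > self.emitted:
            self.emitted = watermark
        return self.emitted
```

The trade-off in this kind of scheme is that events arriving after an idle period may be treated as late, so the idle timeout has to be chosen with the expected gaps in the data in mind.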

Watermark won't advance in ProcessFunction
<https://lists.apache.org/x/thread.html/336586e4cfb4810f6e3fad045f5d5d1fc744868f95ca9736721fcbcb@%3Cuser.flink.apache.org%3E>

How to set timers correctly and write the logic that advances the watermark.

Guarantee of event-time order in FlinkKafkaConsumer
<https://lists.apache.org/x/thread.html/27534fcce59241e053ded810ab1021d264b6c26e0071f37b3616dc71@%3Cuser.flink.apache.org%3E>

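Flink does not inherently guarantee that records are processed in event-time order; the discussion describes how to implement this logic at the user level.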
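One common user-level technique is to buffer records and release them in timestamp order only once the watermark has passed them. The plain-Python sketch below models that idea; it is an assumption about what such logic could look like, not code from the thread, and `EventTimeSorter` is a hypothetical name. In a real Flink job the buffer would typically live in keyed state and be flushed from event-time timers.

```python
import heapq
import itertools


class EventTimeSorter:
    """Buffers out-of-order events and releases them sorted by timestamp
    once the watermark guarantees nothing earlier can still arrive."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so events with equal timestamps keep arrival order
        # and payloads never need to be comparable.
        self._seq = itertools.count()

    def on_event(self, timestamp, value):
        heapq.heappush(self._heap, (timestamp, next(self._seq), value))

    def on_watermark(self, watermark):
        """Return all buffered events with timestamp <= watermark, in order."""
        ready = []
        while self._heap and self._heap[0][0] <= watermark:
            ts, _, value = heapq.heappop(self._heap)
            ready.append((ts, value))
        return ready
```

Events that arrive with a timestamp at or below an already processed watermark are late in this model and would need separate handling.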

Monitor number of keys per Taskmanager
<https://lists.apache.org/x/thread.html/12be05a2927eddf92808cbe905715521edbc09a0da72ac8e8a6784ed@%3Cuser.flink.apache.org%3E>

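Flink's key groups are not guaranteed to be spread evenly across TaskManagers, so small deviations can occur, and with very few keys the data skew can be severe. It is also not possible to tell whether different TaskManagers run on the same machine, so a single machine may end up carrying a heavy load. This is a known issue with no fix planned at the moment.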
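To see why a small number of keys can skew badly, the sketch below models the two-step assignment in plain Python: a key is hashed into a key group, and contiguous ranges of key groups are mapped to parallel subtasks. The range formula is intended to mirror Flink's mapping but should be treated as an assumption here, and the CRC32 hash is only a stand-in for the hash Flink actually applies. Subtasks stand in for the slots that end up on TaskManagers; all function names are hypothetical.

```python
import zlib


def key_group(key, max_parallelism):
    # Stand-in hash (CRC32) chosen for determinism; this is an assumption,
    # not the hash function Flink uses.
    return zlib.crc32(str(key).encode("utf-8")) % max_parallelism


def subtask_for_key_group(group, max_parallelism, parallelism):
    # Contiguous ranges of key groups belong to the same parallel subtask.
    return group * parallelism // max_parallelism


def keys_per_subtask(keys, max_parallelism=128, parallelism=4):
    counts = {i: 0 for i in range(parallelism)}
    for key in keys:
        group = key_group(key, max_parallelism)
        counts[subtask_for_key_group(group, max_parallelism, parallelism)] += 1
    return counts


if __name__ == "__main__":
    # With only a handful of keys, some subtasks may receive none at all.
    print(keys_per_subtask(["a", "b", "c", "d", "e"]))
    # With many keys the counts tend to even out.
    print(keys_per_subtask(range(10000)))
```

Counting keys per subtask this way only shows key skew; it says nothing about how many records each key carries, nor about which subtasks share a machine.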

The RMClient's and YarnResourceManagers internal state about the number of
pending container requests has diverged
<https://lists.apache.org/x/thread.html/173a678c02246af6b6a7b3dcfac0d548f5be27496fb5b53145b853e5@%3Cuser.flink.apache.org%3E>

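On excessive YARN resource usage under the FLIP-6 architecture: the configuration needs to be tuned so that unused resources and failed requests are returned to YARN as quickly as possible.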

Can flink 1.9.1 use flink-shaded 2.8.3-1.8.2
<https://lists.apache.org/x/thread.html/28bc50be1f5579a8121303e87680aad01d5517309412538cd7aca9cd@%3Cuser.flink.apache.org%3E>

A discussion of obtaining dependencies from flink-shaded when Flink needs Hadoop support.

Does operator uid() have to be unique across all jobs?
<https://lists.apache.org/x/thread.html/bb853ee17c28e823dd3153fa9f13875110de10bdbd97a86d183f9bcc@%3Cuser.flink.apache.org%3E>

Whether the ID set via uid() on an operator must be unique within a single job or across all jobs.

Comparing Storm and Flink resource requirements
<https://lists.apache.org/x/thread.html/0caf2693f38366b5eb013f254463d40c1218dd173c96bdefe337cfe5@%3Cuser.flink.apache.org%3E>

Estimating the difference in resource requirements when migrating Storm jobs to Flink.

Known issues

PostgreSQL JDBC sink generates invalid SQL in upsert mode
<https://issues.apache.org/jira/browse/FLINK-14524>

A defect in the PostgreSQL JDBC connector.

Watermark display not working with high parallelism job
<https://issues.apache.org/jira/browse/FLINK-14470>

A defect in the Web UI's watermark display.

Development discussions

[REMINDER] Ensuring build stability
<http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/REMINDER-Ensuring-build-stability-td34158.html>

Gary Yao reminds developers to follow the builds mailing list so that failing tests on master are dealt with promptly.

[DISCUSS] FLIP-76: Unaligned checkpoints
<http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-76-Unaligned-checkpoints-td33651.html>

Piotr Nowojski suggested a change to the FLIP-76 design that reduces the storage overhead of the current proposal for implementing unaligned checkpoints.

[DISCUSS] Introduce a location-oriented two-stage query mechanism to
improve the queryable state.
<http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Introduce-a-location-oriented-two-stage-query-mechanism-to-improve-the-queryable-state-td34265.html>

Vino Yang proposed a scheme for improving Queryable State.

FLIP-81: Executor-related new ConfigOptions
<http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/FLIP-81-Executor-related-new-ConfigOptions-td34236.html>

Kostas's FLIP-81, a sub-FLIP of FLIP-73, aims to introduce a set of ConfigOptions for Executors.

Community

Flink or Flunk? Why Ele.me Is Developing a Taste for Apache Flink
<https://hackernoon.com/flink-or-flunk-why-ele-me-is-developing-a-taste-for-apache-flink-7d2a74e4d6c0>

The Ele.me team published an article describing their experience with Flink.