FLINK 1.9.1 版本,线上任务运行的时候,偶现这个checkpoint 被decline的问题。能帮忙确认一下根本原因是什么吗? 是kafka
出问题还是代码有bug? > 2020-02-21 08:32:15,738 INFO > org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Discarding > checkpoint 941 of job 0e16cf38a0bff313544e1f31d078f75b. > org.apache.flink.runtime.checkpoint.CheckpointException: Could not > complete snapshot 941 for operator PctrLogJoin -> (Sink: hdfsSink, Sink: > kafkaSink) (8/36). Failure reason: Checkpoint was declined.Checkpoint was > declined. at > org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:431) > at > org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.checkpointStreamOperator(StreamTask.java:1282) > at > org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.executeCheckpointing(StreamTask.java:1216) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.checkpointState(StreamTask.java:872) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:777) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:708) > at > org.apache.flink.streaming.runtime.io.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:88) > at > org.apache.flink.streaming.runtime.io.CheckpointBarrierAligner.processBarrier(CheckpointBarrierAligner.java:177) > at > org.apache.flink.streaming.runtime.io.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:155) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.pollNextNullable(StreamTaskNetworkInput.java:102) > at > org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.pollNextNullable(StreamTaskNetworkInput.java:47) > at > org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:135) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:279) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.run(StreamTask.java:301) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:406) > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.IllegalStateException: Pending record count must be > zero at this point: 2 > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.flush(FlinkKafkaProducer.java:964) > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.preCommit(FlinkKafkaProducer.java:892) > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.preCommit(FlinkKafkaProducer.java:98) > at > org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction.snapshotState(TwoPhaseCommitSinkFunction.java:311) > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.snapshotState(FlinkKafkaProducer.java:973) > at > org.apache.flink.streaming.util.functions.StreamingFunctionUtils.trySnapshotFunctionState(StreamingFunctionUtils.java:118) > at > org.apache.flink.streaming.util.functions.StreamingFunctionUtils.snapshotFunctionState(StreamingFunctionUtils.java:99) > at > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.snapshotState(AbstractUdfStreamOperator.java:90) > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:399) > ... 17 more |
hi
可以看一下 PctrLogJoin -> (Sink: hdfsSink, Sink:> kafkaSink) (8/36) 这个的 tm log 看看具体是什么原因导致的 checkpoint 失败 Best, Congxian tao wang <[hidden email]> 于2020年2月21日周五 下午4:42写道: > FLINK 1.9.1 版本,线上任务运行的时候,偶现这个checkpoint 被decline的问题。能帮忙确认一下根本原因是什么吗? 是kafka > 出问题还是代码有bug? > > > > > 2020-02-21 08:32:15,738 INFO > > org.apache.flink.runtime.checkpoint.CheckpointCoordinator - > Discarding > > checkpoint 941 of job 0e16cf38a0bff313544e1f31d078f75b. > > org.apache.flink.runtime.checkpoint.CheckpointException: Could not > > complete snapshot 941 for operator PctrLogJoin -> (Sink: hdfsSink, Sink: > > kafkaSink) (8/36). Failure reason: Checkpoint was declined.Checkpoint > was > > declined. > > at > > > org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:431) > > at > > > org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.checkpointStreamOperator(StreamTask.java:1282) > > at > > > org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.executeCheckpointing(StreamTask.java:1216) > > at > > > org.apache.flink.streaming.runtime.tasks.StreamTask.checkpointState(StreamTask.java:872) > > at > > > org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:777) > > at > > > org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:708) > > at > > org.apache.flink.streaming.runtime.io > .CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:88) > > at > > org.apache.flink.streaming.runtime.io > .CheckpointBarrierAligner.processBarrier(CheckpointBarrierAligner.java:177) > > at > > org.apache.flink.streaming.runtime.io > .CheckpointedInputGate.pollNext(CheckpointedInputGate.java:155) > > at > > org.apache.flink.streaming.runtime.io > .StreamTaskNetworkInput.pollNextNullable(StreamTaskNetworkInput.java:102) > > at > > org.apache.flink.streaming.runtime.io > .StreamTaskNetworkInput.pollNextNullable(StreamTaskNetworkInput.java:47) > > at > > org.apache.flink.streaming.runtime.io > .StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:135) > > at > > > org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:279) > > at > > > org.apache.flink.streaming.runtime.tasks.StreamTask.run(StreamTask.java:301) > > at > > > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:406) > > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705) > > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530) > > at java.lang.Thread.run(Thread.java:748) > > Caused by: java.lang.IllegalStateException: Pending record count must be > > zero at this point: 2 > > at > > > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.flush(FlinkKafkaProducer.java:964) > > at > > > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.preCommit(FlinkKafkaProducer.java:892) > > at > > > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.preCommit(FlinkKafkaProducer.java:98) > > at > > > org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction.snapshotState(TwoPhaseCommitSinkFunction.java:311) > > at > > > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.snapshotState(FlinkKafkaProducer.java:973) > > at > > > org.apache.flink.streaming.util.functions.StreamingFunctionUtils.trySnapshotFunctionState(StreamingFunctionUtils.java:118) > > at > > > org.apache.flink.streaming.util.functions.StreamingFunctionUtils.snapshotFunctionState(StreamingFunctionUtils.java:99) > > at > > > org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.snapshotState(AbstractUdfStreamOperator.java:90) > > at > > > org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:399) > > ... 17 more > |
Free forum by Nabble | Edit this page |