Hi,
I have a Flink job whose checkpoints start timing out after it has been running for a while. The INFO log shows:
checkpoint xxx of job xxx expired before completing.
Trying to recover from a global failure.
org.apache.flink.util.FlinkRuntimeException: Exceeded checkpoint tolerable
failure threshold.
I then checked the TaskManager log, and just before the error there is this WARN entry:
WARN akka.remote.Remoting [] -
Association to [akka.tcp://flink@hadoop43:38839] with unknown UID is
irrecoverably failed. Address cannot be quarantined without knowing the UID,
gating instead for 50 ms.
Right after this WARN, the task logs "Attempting to cancel task Source". I don't know what is causing this and would appreciate any replies.
Best
--
Sent from:
http://apache-flink.147419.n8.nabble.com/