Flink 1.9 Failed to take leadership with session id 异常

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

Flink 1.9 Failed to take leadership with session id 异常

王佩-2
Flink 1.9 DataStream程序,运行一段时间后报如下错误:

2019-10-09 21:07:44 INFO org.apache.flink.runtime.jobmaster.JobMaster
dissolveResourceManagerConnection 1010 Close ResourceManager
connection be4e0b96b331165ff9f4bd7ef4868d94: JobManager is no longer
the leader..
2019-10-09 21:07:44 INFO org.apache.flink.runtime.jobmaster.JobMaster
onStop 335 Stopping the JobMaster for job
flinkx_liveme_microservice_rawdata(d2f6aa0115f4116f690451ef64c56fd4).
2019-10-09 21:07:44 INFO
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService
stop 154 Stopping ZooKeeperLeaderElectionService
ZooKeeperLeaderElectionService{leaderPath='/leader/d2f6aa0115f4116f690451ef64c56fd4/job_manager_lock'}.
2019-10-09 21:07:44 INFO
org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore
recoverJobGraph 197 Recovered
SubmittedJobGraph(d2f6aa0115f4116f690451ef64c56fd4).
2019-10-09 21:07:44 ERROR
org.apache.flink.runtime.entrypoint.ClusterEntrypoint onFatalError 374
Fatal error occurred in the cluster entrypoint.
org.apache.flink.runtime.dispatcher.DispatcherException: Failed to
take leadership with session id a2a48489-b1b4-4f70-a721-edeaf543e1da.
        at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$null$30(Dispatcher.java:915)
        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
        at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
        at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
        at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929)
        at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:397)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:190)
        at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:88)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
        at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
        at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
        at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
        at akka.actor.ActorCell.invoke(ActorCell.scala:561)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
        at akka.dispatch.Mailbox.run(Mailbox.scala:225)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
        at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.runtime.dispatcher.DispatcherException:
Termination of previous JobManager for job
d2f6aa0115f4116f690451ef64c56fd4 failed. Cannot submit job under the
same job id.
        at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$waitForTerminatingJobManager$33(Dispatcher.java:949)
        at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
        at java.util.concurrent.CompletableFuture.uniExceptionallyStage(CompletableFuture.java:884)
        at java.util.concurrent.CompletableFuture.exceptionally(CompletableFuture.java:2196)
        at org.apache.flink.runtime.dispatcher.Dispatcher.waitForTerminatingJobManager(Dispatcher.java:946)
        at org.apache.flink.runtime.dispatcher.Dispatcher.tryAcceptLeadershipAndRunJobs(Dispatcher.java:933)
        at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$null$28(Dispatcher.java:892)
        at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
        at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
        ... 23 more
Caused by: java.util.concurrent.CompletionException:
org.apache.flink.util.FlinkException: Could not properly shut down the
JobManagerRunner
        at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
        at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
        at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
        at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:899)
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
        at org.apache.flink.runtime.jobmaster.JobManagerRunner.lambda$closeAsync$0(JobManagerRunner.java:207)
        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
        at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.postStop(AkkaRpcActor.java:132)
        at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.postStop(FencedAkkaRpcActor.java:40)
        at akka.actor.Actor$class.aroundPostStop(Actor.scala:536)
        at akka.actor.AbstractActor.aroundPostStop(AbstractActor.scala:225)
        at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:210)
        at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:172)
        at akka.actor.ActorCell.terminate(ActorCell.scala:429)
        at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:533)
        at akka.actor.ActorCell.systemInvoke(ActorCell.scala:549)
        at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:283)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:261)
        ... 6 more
Caused by: org.apache.flink.util.FlinkException: Could not properly
shut down the JobManagerRunner
        ... 22 more
Caused by: org.apache.flink.runtime.rpc.akka.exceptions.AkkaRpcException:
Failure while stopping RpcEndpoint jobmanager_0.
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor$StartedState.terminate(AkkaRpcActor.java:513)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleControlMessage(AkkaRpcActor.java:175)
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
        at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
        at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
        at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
        at akka.actor.ActorCell.invoke(ActorCell.scala:561)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
        ... 6 more
Caused by: java.lang.NullPointerException
        at org.apache.flink.runtime.jobmaster.JobMaster.disconnectTaskManager(JobMaster.java:425)
        at org.apache.flink.runtime.jobmaster.JobMaster.onStop(JobMaster.java:343)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor$StartedState.terminate(AkkaRpcActor.java:509)
        ... 18 more
2019-10-09 21:07:44 INFO org.apache.flink.runtime.blob.BlobServer
close 340 Stopped BLOB server at 0.0.0.0:37947


运行好久了都不出问题,突然出现的问题。帮忙看下!