---------- Forwarded message --------- 发件人: 陈某 <[hidden email]> Date: 2019年8月8日周四 下午7:25 Subject: need help To: <[hidden email]> 你好,我是一个刚接触flink的新手,在搭建完flink on yarn集群后,依次启动zookeeper,hadoop,yarn,flkink集群,并提交认识到yarn上时运行遇到问题,网上搜索相关问题,暂未找到解决方式,希望能得到帮助,谢谢。
使用的运行指令为: [root@flink01 logs]# flink run -m yarn-cluster ./examples/streaming/WordCount.jar 查看log后错误信息如下:(附件中为完整的log文件) org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result. (JobID: 91e82fd8626bde4c901bf0b1639e12e7) at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:261) at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:483) at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66) at org.apache.flink.streaming.examples.wordcount.WordCount.main(WordCount.java:89) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529) at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421) at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:423) at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813) at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287) at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213) at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050) at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836) at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126) Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph. at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:388) at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870) at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:208) at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561) at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929) at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Internal server error., <Exception on server side: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/dispatcher#2035575525]] after [10000 ms]. Sender[null] sent message of type "org.apache.flink.runtime.rpc.messages.LocalFencedMessage". at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604) at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126) at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601) at scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109) at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599) at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329) at akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280) at akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284) at akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236) at java.lang.Thread.run(Thread.java:748) |
你好,
异常里可以看出 AskTimeoutException, 可以调整两个参数 akka.ask.timeout 和 web.timeout 再试一下,默认值如下 akka.ask.timeout: 10 s web.timeout: 10000 PS: 搜 “AskTimeoutException Flink” 可以搜到很多相关答案 Thanks, Biao /'bɪ.aʊ/ On Thu, Aug 8, 2019 at 7:33 PM 陈某 <[hidden email]> wrote: > > > ---------- Forwarded message --------- > 发件人: 陈某 <[hidden email]> > Date: 2019年8月8日周四 下午7:25 > Subject: need help > To: <[hidden email]> > > > 你好,我是一个刚接触flink的新手,在搭建完flink on > yarn集群后,依次启动zookeeper,hadoop,yarn,flkink集群,并提交认识到yarn上时运行遇到问题,网上搜索相关问题,暂未找到解决方式,希望能得到帮助,谢谢。 > > 使用的运行指令为: > [root@flink01 logs]# flink run -m yarn-cluster > ./examples/streaming/WordCount.jar > 查看log后错误信息如下:(附件中为完整的log文件) > org.apache.flink.client.program.ProgramInvocationException: Could not > retrieve the execution result. (JobID: 91e82fd8626bde4c901bf0b1639e12e7) > at > org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:261) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:483) > at > org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66) > at > org.apache.flink.streaming.examples.wordcount.WordCount.main(WordCount.java:89) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529) > at > org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:423) > at > org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813) > at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287) > at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213) > at > org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050) > at > org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836) > at > org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) > at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126) > Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed > to submit JobGraph. > at > org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:388) > at > java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870) > at > java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) > at > org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:208) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561) > at > java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929) > at > java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.flink.runtime.rest.util.RestClientException: > [Internal server error., <Exception on server side: > akka.pattern.AskTimeoutException: Ask timed out on > [Actor[akka://flink/user/dispatcher#2035575525]] after [10000 ms]. > Sender[null] sent message of type > "org.apache.flink.runtime.rpc.messages.LocalFencedMessage". > at > akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604) > at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126) > at > scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601) > at > scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109) > at > scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599) > at > akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329) > at > akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280) > at > akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284) > at > akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236) > at java.lang.Thread.run(Thread.java:748) > |
异常原因如上所说是 akka ask timeout 的问题,这个问题前两天有人在部署 k8s 的时候也遇过[1]
他的情况是配置资源过少导致 JM 未能及时响应。除了调整上述参数外也可看看是不是这个问题。 Best, tison. [1] https://lists.apache.org/thread.html/84db9dca2e990dd0ebc30aa35390ac75a0e9c7cbfcdbc2029986d4d7@%3Cuser-zh.flink.apache.org%3E Biao Liu <[hidden email]> 于2019年8月8日周四 下午8:00写道: > 你好, > > 异常里可以看出 AskTimeoutException, 可以调整两个参数 akka.ask.timeout 和 web.timeout > 再试一下,默认值如下 > > akka.ask.timeout: 10 s > web.timeout: 10000 > > PS: 搜 “AskTimeoutException Flink” 可以搜到很多相关答案 > > Thanks, > Biao /'bɪ.aʊ/ > > > > On Thu, Aug 8, 2019 at 7:33 PM 陈某 <[hidden email]> wrote: > > > > > > > ---------- Forwarded message --------- > > 发件人: 陈某 <[hidden email]> > > Date: 2019年8月8日周四 下午7:25 > > Subject: need help > > To: <[hidden email]> > > > > > > 你好,我是一个刚接触flink的新手,在搭建完flink on > > > yarn集群后,依次启动zookeeper,hadoop,yarn,flkink集群,并提交认识到yarn上时运行遇到问题,网上搜索相关问题,暂未找到解决方式,希望能得到帮助,谢谢。 > > > > 使用的运行指令为: > > [root@flink01 logs]# flink run -m yarn-cluster > > ./examples/streaming/WordCount.jar > > 查看log后错误信息如下:(附件中为完整的log文件) > > org.apache.flink.client.program.ProgramInvocationException: Could not > > retrieve the execution result. (JobID: 91e82fd8626bde4c901bf0b1639e12e7) > > at > > > org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:261) > > at > > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:483) > > at > > > org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66) > > at > > > org.apache.flink.streaming.examples.wordcount.WordCount.main(WordCount.java:89) > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > > at > > > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > > at > > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > > at java.lang.reflect.Method.invoke(Method.java:498) > > at > > > org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529) > > at > > > org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421) > > at > > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:423) > > at > > > org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813) > > at > org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287) > > at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213) > > at > > > org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050) > > at > > > org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126) > > at java.security.AccessController.doPrivileged(Native Method) > > at javax.security.auth.Subject.doAs(Subject.java:422) > > at > > > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836) > > at > > > org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) > > at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126) > > Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed > > to submit JobGraph. > > at > > > org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:388) > > at > > > java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870) > > at > > > java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852) > > at > > > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > > at > > > java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) > > at > > > org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:208) > > at > > > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) > > at > > > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) > > at > > > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > > at > > > java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561) > > at > > > java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929) > > at > > > java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442) > > at > > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > > at > > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > > at java.lang.Thread.run(Thread.java:748) > > Caused by: org.apache.flink.runtime.rest.util.RestClientException: > > [Internal server error., <Exception on server side: > > akka.pattern.AskTimeoutException: Ask timed out on > > [Actor[akka://flink/user/dispatcher#2035575525]] after [10000 ms]. > > Sender[null] sent message of type > > "org.apache.flink.runtime.rpc.messages.LocalFencedMessage". > > at > > > akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604) > > at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126) > > at > > > scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601) > > at > > > scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109) > > at > > > scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599) > > at > > > akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329) > > at > > > akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280) > > at > > > akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284) > > at > > > akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236) > > at java.lang.Thread.run(Thread.java:748) > > > |
Free forum by Nabble | Edit this page |