Flink 1.9.1: job fails when submitted with -d (detached mode), but runs normally without -d

bradyMk
CONTENTS DELETED
The author has deleted this message.

Re: Flink 1.9.1: job fails when submitted with -d (detached mode), but runs normally without -d

Congxian Qiu
Hi
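   If I understand correctly, whether or not you add -d determines which mode the job is started in (per-job or session mode). Judging from the stack trace, my guess is that this is caused by a version conflict. Have you checked whether the problem still occurs on the latest release, 1.11?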
Best,
Congxian


On Fri, Aug 14, 2020 at 6:52 PM, bradyMk <[hidden email]> wrote:

> Hi all,
> I submit the job with the following command:
> flink run \
> -m yarn-cluster \
> -yn 3 \
> -ys 3 \
> -yjm 2048m \
> -ytm 2048m \
> -ynm flink_test \
> -d \
> -c net.realtime.app.FlinkTest ./hotmall-flink.jar
> and it fails with the following error:
> [AMRM Callback Handler Thread] ERROR org.apache.flink.yarn.YarnResourceManager - Fatal error occurred in ResourceManager.
> java.lang.NoSuchMethodError: org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest.newInstance(IFLjava/util/List;Ljava/util/List;Ljava/util/List;Lorg/apache/hadoop/yarn/api/records/ResourceBlacklistRequest;)Lorg/apache/hadoop/yarn/api/protocolrecords/AllocateRequest;
>         at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:279)
>         at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:273)
> [AMRM Callback Handler Thread] ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Fatal error occurred in the cluster entrypoint.
> java.lang.NoSuchMethodError: org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest.newInstance(IFLjava/util/List;Ljava/util/List;Ljava/util/List;Lorg/apache/hadoop/yarn/api/records/ResourceBlacklistRequest;)Lorg/apache/hadoop/yarn/api/protocolrecords/AllocateRequest;
>         at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:279)
>         at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:273)
> [flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.yarn.YarnResourceManager - ResourceManager akka.tcp://[hidden email]-174460:33650/user/resourcemanager was granted leadership with fencing token 00000000000000000000000000000000
> [BlobServer shutdown hook] INFO org.apache.flink.runtime.blob.BlobServer - Stopped BLOB server at 0.0.0.0:36247
> <http://apache-flink.147419.n8.nabble.com/file/t802/%E6%8D%95%E8%8E%B71111.png>
>
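> However, if I leave out -d, the job submits and runs normally. Stranger still, another job of mine submits fine with -d.
> The job that now fails was originally started with the following command: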
> nohup flink run \
> -m yarn-cluster \
> -yn 3 \
> -ys 3 \
> -yjm 2048m \
> -ytm 2048m \
> -ynm flink_test \
> -c net.realtime.app.FlinkTest ./hotmall-flink.jar > /logs/flink.log 2>&1 &
>  > /logs/nohup.out 2>&1 &
>
> After that job died, restarting it with -d produces the problem I described at the beginning. It is very strange; does anyone know why?
>
>
>
> -----
> Best Wishes
> --
> Sent from: http://apache-flink.147419.n8.nabble.com/
>

Re: Flink 1.9.1: job fails when submitted with -d (detached mode), but runs normally without -d

bradyMk
CONTENTS DELETED
The author has deleted this message.

Re: Flink 1.9.1: job fails when submitted with -d (detached mode), but runs normally without -d

Congxian Qiu
Hi
   As I said earlier, submitting with -d and without -d starts the job in different modes. Judging from your stack trace, this looks like a class conflict. Take a look at this document [1] and see whether it helps.
java.lang.NoSuchMethodError: org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest.newInstance(IFLjava/util/List;Ljava/util/List;Ljava/util/List;Lorg/apache/hadoop/yarn/api/records/ResourceBlacklistRequest;)Lorg/apache/hadoop/yarn/api/protocolrecords/AllocateRequest;
        at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:279)
        at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:273)


[1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/monitoring/debugging_classloading.html
Best,
Congxian
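
A NoSuchMethodError like the one above generally means that the calling class (here AMRMClientImpl) was compiled against a version of AllocateRequest that has a newInstance overload taking (int, float, List, List, List, ResourceBlacklistRequest), while the AllocateRequest class actually loaded at runtime does not have it. One possible cause is a job jar that bundles its own copy of the Hadoop YARN classes. The following is a minimal sketch for checking that; it is not taken from the thread. The helper names class_to_entry and jar_has_class are hypothetical, and the sketch assumes the Info-ZIP unzip tool is installed.

```shell
#!/bin/sh
# Hypothetical helper: convert a fully qualified class name into the
# path of its .class entry inside a jar (dots become slashes).
class_to_entry() {
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Hypothetical helper: exit status 0 if the jar listing contains the entry
# for the given class, 2 if the jar file does not exist, non-zero otherwise.
# The match is a substring match, so relocated (shaded) copies such as
# some/prefix/org/apache/... are reported as well.
# Assumption: the Info-ZIP `unzip` tool is available on PATH.
jar_has_class() {
    jar_path=$1
    entry=$(class_to_entry "$2")
    [ -f "$jar_path" ] || return 2
    unzip -l "$jar_path" 2>/dev/null | grep -qF "$entry"
}

# Example with the jar and class named in the thread.
jar=./hotmall-flink.jar
class=org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest
if [ ! -f "$jar" ]; then
    echo "jar not found: $jar"
elif jar_has_class "$jar" "$class"; then
    echo "$jar contains a copy of $class"
else
    echo "$jar does not contain $class"
fi
```

If the class does turn up in the job jar, the class-loading document referenced as [1] describes ways of dealing with such conflicts. The same helper can be pointed at the jars in Flink's lib directory to see which other jars carry a copy of the class.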


On Mon, Aug 17, 2020 at 2:36 PM, bradyMk <[hidden email]> wrote:

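> Hello,
>
> I have not tried a newer version, but this does not seem to be a version problem: all my other Flink jobs run normally with -d, and only this one fails. This job also runs if I submit it without -d. I find it puzzling too.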
>
>
>
> -----
> Best Wishes

Re: Flink 1.9.1: job fails when submitted with -d (detached mode), but runs normally without -d

bradyMk
CONTENTS DELETED
The author has deleted this message.

Re: Flink 1.9.1: job fails when submitted with -d (detached mode), but runs normally without -d

Congxian Qiu
Hi
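    In 1.9, whether or not -d is given should determine which mode the job is started in (per-job or session), and the behavior in the two modes is probably not exactly the same. For details, see the code here [1].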

[1] https://github.com/apache/flink/blob/5125b1123dfcfff73b5070401dfccb162959080c/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java#L211

Best,
Congxian


On Wed, Aug 19, 2020 at 10:54 AM, bradyMk <[hidden email]> wrote:

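> Thank you so much!
> The problem is solved. It really was a dependency (jar) problem; I naively assumed that because the job ran without -d, the jars could not be the cause.
> So with and without -d, different methods from different packages get called, right?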
>
>
>
> -----
> Best Wishes

Re: Flink 1.9.1: job fails when submitted with -d (detached mode), but runs normally without -d

bradyMk
CONTENTS DELETED
The author has deleted this message.