Apache Flink 中文用户邮件列表

flink 1.7.2 YARN Session模式提交任务问题求助

Classic

List

Threaded

5 messages Options

Chief

flink 1.7.2 YARN Session模式提交任务问题求助

大家好
目前环境是flink 1.7.2，使用YARN Session模式提交任务，Hadoop 版本2.7.3，hdfs namenode配置了ha模式，提交任务的时候报以下错误，系统环境变量中已经设置了HADOOP_HOME,YARN_CONF_DIR，HADOOP_CONF_DIR，HADOOP_CLASSPATH，在flink_conf.yaml中配置了fs.hdfs.hadoopconf

2020-04-10 19:12:02,908 INFO  org.apache.flink.runtime.jobmaster.JobMaster                  - Connecting to ResourceManager akka.tcp://flink@trusfortpoc1:23584/user/resourcemanager(00000000000000000000000000000000)
2020-04-10 19:12:02,909 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPool          - Cannot serve slot request, no ResourceManager connected. Adding as pending request [SlotRequestId{0feacbb4fe16c8c7a70249f1396565d0}]
2020-04-10 19:12:02,911 INFO  org.apache.flink.runtime.jobmaster.JobMaster                  - Resolved ResourceManager address, beginning registration
2020-04-10 19:12:02,911 INFO  org.apache.flink.runtime.jobmaster.JobMaster                  - Registration at ResourceManager attempt 1 (timeout=100ms)
2020-04-10 19:12:02,912 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPool          - Cannot serve slot request, no ResourceManager connected. Adding as pending request [SlotRequestId{35ad2384e9cd0efd30b43f5302db24b6}]
2020-04-10 19:12:02,913 INFO  org.apache.flink.yarn.YarnResourceManager                     - Registering job manager [hidden email]://flink@trusfortpoc1:23584/user/jobmanager_0 for job 24691b33c18d7ad73b1f52edb3d68ae4.
2020-04-10 19:12:02,917 INFO  org.apache.flink.yarn.YarnResourceManager                     - Registered job manager [hidden email]://flink@trusfortpoc1:23584/user/jobmanager_0 for job 24691b33c18d7ad73b1f52edb3d68ae4.
2020-04-10 19:12:02,919 INFO  org.apache.flink.runtime.jobmaster.JobMaster                  - JobManager successfully registered at ResourceManager, leader id: 00000000000000000000000000000000.
2020-04-10 19:12:02,919 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPool          - Requesting new slot [SlotRequestId{35ad2384e9cd0efd30b43f5302db24b6}] and profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} from resource manager.
2020-04-10 19:12:02,920 INFO  org.apache.flink.yarn.YarnResourceManager                     - Request slot with profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} for job 24691b33c18d7ad73b1f52edb3d68ae4 with allocation id AllocationID{5a12237c7f2bd8b1cc760ddcbab5a1c0}.
2020-04-10 19:12:02,921 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPool          - Requesting new slot [SlotRequestId{0feacbb4fe16c8c7a70249f1396565d0}] and profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} from resource manager.
2020-04-10 19:12:02,924 INFO  org.apache.flink.yarn.YarnResourceManager                     - Requesting new TaskExecutor container with resources <memory:4096, vCores:6>. Number pending requests 1.
2020-04-10 19:12:02,926 INFO  org.apache.flink.yarn.YarnResourceManager                     - Request slot with profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} for job 24691b33c18d7ad73b1f52edb3d68ae4 with allocation id AllocationID{37dd666a18040bf63ffbf2e022b2ea9b}.
2020-04-10 19:12:06,531 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl         - Received new token for : trusfortpoc3:35206
2020-04-10 19:12:06,543 INFO  org.apache.flink.yarn.YarnResourceManager                     - Received new container: container_1586426824930_0006_01_000002 - Remaining pending container requests: 1
2020-04-10 19:12:06,543 INFO  org.apache.flink.yarn.YarnResourceManager                     - Removing container request Capability[<memory:4096, vCores:6>]Priority[1]. Pending container requests 0.
2020-04-10 19:12:06,568 ERROR org.apache.flink.yarn.YarnResourceManager                     - Could not start TaskManager in container container_1586426824930_0006_01_000002.
java.lang.IllegalArgumentException: java.net.UnknownHostException: hdfsClusterForML
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:320)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:687)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:628)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2701)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2683)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:372)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.flink.yarn.Utils.createTaskExecutorContext(Utils.java:453)
at org.apache.flink.yarn.YarnResourceManager.createTaskExecutorLaunchContext(YarnResourceManager.java:555)
at org.apache.flink.yarn.YarnResourceManager.lambda$onContainersAllocated$1(YarnResourceManager.java:390)
at org.apache.flink.yarn.YarnResourceManager$$Lambda$183/1182651376.run(Unknown Source)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:332)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:158)
at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:70)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142)
at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40)
at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
at akka.actor.ActorCell.invoke(ActorCell.scala:495)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.net.UnknownHostException: hdfsClusterForML
... 33 more

这个hdfsClusterForML是namenode ha 的nameservice，经过分析是没加载hdfs-site.xml配置导致的，
也尝试过把Hadoop的几个配置文件放到flink 的conf目录下但都无效，最终通过改YarnResourceManager源码后能够正常提交任务。
public YarnResourceManager(
      RpcService rpcService,
      String resourceManagerEndpointId,
      ResourceID resourceId,
      Configuration flinkConfig,
      Map<String, String> env,
      HighAvailabilityServices highAvailabilityServices,
      HeartbeatServices heartbeatServices,
      SlotManager slotManager,
      MetricRegistry metricRegistry,
      JobLeaderIdService jobLeaderIdService,
      ClusterInformation clusterInformation,
      FatalErrorHandler fatalErrorHandler,
      @Nullable String webInterfaceUrl,
      JobManagerMetricGroup jobManagerMetricGroup) {
   super(
      rpcService,
      resourceManagerEndpointId,
      resourceId,
      highAvailabilityServices,
      heartbeatServices,
      slotManager,
      metricRegistry,
      jobLeaderIdService,
      clusterInformation,
      fatalErrorHandler,
      jobManagerMetricGroup);
   this.flinkConfig  = flinkConfig;
   this.yarnConfig = new YarnConfiguration(HadoopUtils.getHadoopConfiguration(flinkConfig));
但我认为这肯定不是解决问题的方法，所以向大家求助，是不是我忽略什么。

Yangze Guo

Re: flink 1.7.2 YARN Session模式提交任务问题求助

Flink需要设置hadoop相关conf位置的环境变量 YARN_CONF_DIR or HADOOP_CONF_DIR [1]

[1] https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/yarn_setup.html

Best,
Yangze Guo

On Mon, Apr 13, 2020 at 10:52 PM Chief <[hidden email]> wrote:

>
> 大家好
> 目前环境是flink 1.7.2，使用YARN Session模式提交任务，Hadoop 版本2.7.3，hdfs namenode配置了ha模式，提交任务的时候报以下错误，系统环境变量中已经设置了HADOOP_HOME,YARN_CONF_DIR，HADOOP_CONF_DIR，HADOOP_CLASSPATH，在flink_conf.yaml中配置了fs.hdfs.hadoopconf
>
>
> 2020-04-10 19:12:02,908 INFO  org.apache.flink.runtime.jobmaster.JobMaster                  - Connecting to ResourceManager akka.tcp://flink@trusfortpoc1:23584/user/resourcemanager(00000000000000000000000000000000)
> 2020-04-10 19:12:02,909 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPool          - Cannot serve slot request, no ResourceManager connected. Adding as pending request [SlotRequestId{0feacbb4fe16c8c7a70249f1396565d0}]
> 2020-04-10 19:12:02,911 INFO  org.apache.flink.runtime.jobmaster.JobMaster                  - Resolved ResourceManager address, beginning registration
> 2020-04-10 19:12:02,911 INFO  org.apache.flink.runtime.jobmaster.JobMaster                  - Registration at ResourceManager attempt 1 (timeout=100ms)
> 2020-04-10 19:12:02,912 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPool          - Cannot serve slot request, no ResourceManager connected. Adding as pending request [SlotRequestId{35ad2384e9cd0efd30b43f5302db24b6}]
> 2020-04-10 19:12:02,913 INFO  org.apache.flink.yarn.YarnResourceManager                     - Registering job manager [hidden email]://flink@trusfortpoc1:23584/user/jobmanager_0 for job 24691b33c18d7ad73b1f52edb3d68ae4.
> 2020-04-10 19:12:02,917 INFO  org.apache.flink.yarn.YarnResourceManager                     - Registered job manager [hidden email]://flink@trusfortpoc1:23584/user/jobmanager_0 for job 24691b33c18d7ad73b1f52edb3d68ae4.
> 2020-04-10 19:12:02,919 INFO  org.apache.flink.runtime.jobmaster.JobMaster                  - JobManager successfully registered at ResourceManager, leader id: 00000000000000000000000000000000.
> 2020-04-10 19:12:02,919 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPool          - Requesting new slot [SlotRequestId{35ad2384e9cd0efd30b43f5302db24b6}] and profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} from resource manager.
> 2020-04-10 19:12:02,920 INFO  org.apache.flink.yarn.YarnResourceManager                     - Request slot with profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} for job 24691b33c18d7ad73b1f52edb3d68ae4 with allocation id AllocationID{5a12237c7f2bd8b1cc760ddcbab5a1c0}.
> 2020-04-10 19:12:02,921 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPool          - Requesting new slot [SlotRequestId{0feacbb4fe16c8c7a70249f1396565d0}] and profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} from resource manager.
> 2020-04-10 19:12:02,924 INFO  org.apache.flink.yarn.YarnResourceManager                     - Requesting new TaskExecutor container with resources <memory:4096, vCores:6>. Number pending requests 1.
> 2020-04-10 19:12:02,926 INFO  org.apache.flink.yarn.YarnResourceManager                     - Request slot with profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} for job 24691b33c18d7ad73b1f52edb3d68ae4 with allocation id AllocationID{37dd666a18040bf63ffbf2e022b2ea9b}.
> 2020-04-10 19:12:06,531 INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl         - Received new token for : trusfortpoc3:35206
> 2020-04-10 19:12:06,543 INFO  org.apache.flink.yarn.YarnResourceManager                     - Received new container: container_1586426824930_0006_01_000002 - Remaining pending container requests: 1
> 2020-04-10 19:12:06,543 INFO  org.apache.flink.yarn.YarnResourceManager                     - Removing container request Capability[<memory:4096, vCores:6>]Priority[1]. Pending container requests 0.
> 2020-04-10 19:12:06,568 ERROR org.apache.flink.yarn.YarnResourceManager                     - Could not start TaskManager in container container_1586426824930_0006_01_000002.
> java.lang.IllegalArgumentException: java.net.UnknownHostException: hdfsClusterForML
> at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
> at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:320)
> at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:687)
> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:628)
> at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
> at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2701)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2683)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:372)
> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
> at org.apache.flink.yarn.Utils.createTaskExecutorContext(Utils.java:453)
> at org.apache.flink.yarn.YarnResourceManager.createTaskExecutorLaunchContext(YarnResourceManager.java:555)
> at org.apache.flink.yarn.YarnResourceManager.lambda$onContainersAllocated$1(YarnResourceManager.java:390)
> at org.apache.flink.yarn.YarnResourceManager$$Lambda$183/1182651376.run(Unknown Source)
> at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:332)
> at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:158)
> at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:70)
> at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142)
> at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40)
> at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
> at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
> at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
> at akka.actor.ActorCell.invoke(ActorCell.scala:495)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
> at akka.dispatch.Mailbox.run(Mailbox.scala:224)
> at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.net.UnknownHostException: hdfsClusterForML
> ... 33 more
>
>
> 这个hdfsClusterForML是namenode ha 的nameservice，经过分析是没加载hdfs-site.xml配置导致的，
> 也尝试过把Hadoop的几个配置文件放到flink 的conf目录下但都无效，最终通过改YarnResourceManager源码后能够正常提交任务。
> public YarnResourceManager(
>       RpcService rpcService,
>       String resourceManagerEndpointId,
>       ResourceID resourceId,
>       Configuration flinkConfig,
>       Map<String, String> env,
>       HighAvailabilityServices highAvailabilityServices,
>       HeartbeatServices heartbeatServices,
>       SlotManager slotManager,
>       MetricRegistry metricRegistry,
>       JobLeaderIdService jobLeaderIdService,
>       ClusterInformation clusterInformation,
>       FatalErrorHandler fatalErrorHandler,
>       @Nullable String webInterfaceUrl,
>       JobManagerMetricGroup jobManagerMetricGroup) {
>    super(
>       rpcService,
>       resourceManagerEndpointId,
>       resourceId,
>       highAvailabilityServices,
>       heartbeatServices,
>       slotManager,
>       metricRegistry,
>       jobLeaderIdService,
>       clusterInformation,
>       fatalErrorHandler,
>       jobManagerMetricGroup);
>    this.flinkConfig  = flinkConfig;
>    this.yarnConfig = new YarnConfiguration(HadoopUtils.getHadoopConfiguration(flinkConfig));
> 但我认为这肯定不是解决问题的方法，所以向大家求助，是不是我忽略什么。

Chief

回复： flink 1.7.2 YARN Session模式提交任务问题求助

hi Yangze Guo
您说的环境变量已经在当前用户的环境变量文件里面设置了，您可以看看我的问题描述，现在如果checkpoint的路径设置不是namenode ha的nameservice就不会报错，checkpoint都正常。

------------------ 原始邮件 ------------------
发件人: "Yangze Guo"<[hidden email]>;
发送时间: 2020年4月15日(星期三) 下午3:00
收件人: "user-zh"<[hidden email]>;

主题: Re: flink 1.7.2 YARN Session模式提交任务问题求助

Flink需要设置hadoop相关conf位置的环境变量 YARN_CONF_DIR or HADOOP_CONF_DIR [1]

[1] https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/yarn_setup.html

Best,
Yangze Guo

On Mon, Apr 13, 2020 at 10:52 PM Chief <[hidden email]> wrote:
>
> 大家好
> 目前环境是flink 1.7.2，使用YARN Session模式提交任务，Hadoop 版本2.7.3，hdfs namenode配置了ha模式，提交任务的时候报以下错误，系统环境变量中已经设置了HADOOP_HOME,YARN_CONF_DIR，HADOOP_CONF_DIR，HADOOP_CLASSPATH，在flink_conf.yaml中配置了fs.hdfs.hadoopconf
>
>
> 2020-04-10 19:12:02,908 INFO&nbsp; org.apache.flink.runtime.jobmaster.JobMaster&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; - Connecting to ResourceManager akka.tcp://flink@trusfortpoc1:23584/user/resourcemanager(00000000000000000000000000000000)
> 2020-04-10 19:12:02,909 INFO&nbsp; org.apache.flink.runtime.jobmaster.slotpool.SlotPool&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; - Cannot serve slot request, no ResourceManager connected. Adding as pending request [SlotRequestId{0feacbb4fe16c8c7a70249f1396565d0}]
> 2020-04-10 19:12:02,911 INFO&nbsp; org.apache.flink.runtime.jobmaster.JobMaster&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; - Resolved ResourceManager address, beginning registration
> 2020-04-10 19:12:02,911 INFO&nbsp; org.apache.flink.runtime.jobmaster.JobMaster&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; - Registration at ResourceManager attempt 1 (timeout=100ms)
> 2020-04-10 19:12:02,912 INFO&nbsp; org.apache.flink.runtime.jobmaster.slotpool.SlotPool&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; - Cannot serve slot request, no ResourceManager connected. Adding as pending request [SlotRequestId{35ad2384e9cd0efd30b43f5302db24b6}]
> 2020-04-10 19:12:02,913 INFO&nbsp; org.apache.flink.yarn.YarnResourceManager&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;- Registering job manager [hidden email]://flink@trusfortpoc1:23584/user/jobmanager_0 for job 24691b33c18d7ad73b1f52edb3d68ae4.
> 2020-04-10 19:12:02,917 INFO&nbsp; org.apache.flink.yarn.YarnResourceManager&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;- Registered job manager [hidden email]://flink@trusfortpoc1:23584/user/jobmanager_0 for job 24691b33c18d7ad73b1f52edb3d68ae4.
> 2020-04-10 19:12:02,919 INFO&nbsp; org.apache.flink.runtime.jobmaster.JobMaster&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; - JobManager successfully registered at ResourceManager, leader id: 00000000000000000000000000000000.
> 2020-04-10 19:12:02,919 INFO&nbsp; org.apache.flink.runtime.jobmaster.slotpool.SlotPool&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; - Requesting new slot [SlotRequestId{35ad2384e9cd0efd30b43f5302db24b6}] and profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} from resource manager.
> 2020-04-10 19:12:02,920 INFO&nbsp; org.apache.flink.yarn.YarnResourceManager&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;- Request slot with profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} for job 24691b33c18d7ad73b1f52edb3d68ae4 with allocation id AllocationID{5a12237c7f2bd8b1cc760ddcbab5a1c0}.
> 2020-04-10 19:12:02,921 INFO&nbsp; org.apache.flink.runtime.jobmaster.slotpool.SlotPool&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; - Requesting new slot [SlotRequestId{0feacbb4fe16c8c7a70249f1396565d0}] and profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} from resource manager.
> 2020-04-10 19:12:02,924 INFO&nbsp; org.apache.flink.yarn.YarnResourceManager&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;- Requesting new TaskExecutor container with resources <memory:4096, vCores:6&gt;. Number pending requests 1.
> 2020-04-10 19:12:02,926 INFO&nbsp; org.apache.flink.yarn.YarnResourceManager&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;- Request slot with profile ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0, nativeMemoryInMB=0, networkMemoryInMB=0} for job 24691b33c18d7ad73b1f52edb3d68ae4 with allocation id AllocationID{37dd666a18040bf63ffbf2e022b2ea9b}.
> 2020-04-10 19:12:06,531 INFO&nbsp; org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;- Received new token for : trusfortpoc3:35206
> 2020-04-10 19:12:06,543 INFO&nbsp; org.apache.flink.yarn.YarnResourceManager&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;- Received new container: container_1586426824930_0006_01_000002 - Remaining pending container requests: 1
> 2020-04-10 19:12:06,543 INFO&nbsp; org.apache.flink.yarn.YarnResourceManager&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;- Removing container request Capability[<memory:4096, vCores:6&gt;]Priority[1]. Pending container requests 0.
> 2020-04-10 19:12:06,568 ERROR org.apache.flink.yarn.YarnResourceManager&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;- Could not start TaskManager in container container_1586426824930_0006_01_000002.
> java.lang.IllegalArgumentException: java.net.UnknownHostException: hdfsClusterForML
>         at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
>         at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:320)
>         at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
>         at org.apache.hadoop.hdfs.DFSClient.<init&gt;(DFSClient.java:687)
>         at org.apache.hadoop.hdfs.DFSClient.<init&gt;(DFSClient.java:628)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
>         at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
>         at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
>         at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2701)
>         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2683)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:372)
>         at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
>         at org.apache.flink.yarn.Utils.createTaskExecutorContext(Utils.java:453)
>         at org.apache.flink.yarn.YarnResourceManager.createTaskExecutorLaunchContext(YarnResourceManager.java:555)
>         at org.apache.flink.yarn.YarnResourceManager.lambda$onContainersAllocated$1(YarnResourceManager.java:390)
>         at org.apache.flink.yarn.YarnResourceManager$$Lambda$183/1182651376.run(Unknown Source)
>         at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:332)
>         at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:158)
>         at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:70)
>         at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142)
>         at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40)
>         at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
>         at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
>         at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
>         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
>         at akka.actor.ActorCell.invoke(ActorCell.scala:495)
>         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
>         at akka.dispatch.Mailbox.run(Mailbox.scala:224)
>         at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
>         at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>         at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>         at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>         at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.net.UnknownHostException: hdfsClusterForML
>         ... 33 more
>
>
> 这个hdfsClusterForML是namenode ha 的nameservice，经过分析是没加载hdfs-site.xml配置导致的，
> 也尝试过把Hadoop的几个配置文件放到flink 的conf目录下但都无效，最终通过改YarnResourceManager源码后能够正常提交任务。
> public YarnResourceManager(
> &nbsp; &nbsp; &nbsp; RpcService rpcService,
> &nbsp; &nbsp; &nbsp; String resourceManagerEndpointId,
> &nbsp; &nbsp; &nbsp; ResourceID resourceId,
> &nbsp; &nbsp; &nbsp; Configuration flinkConfig,
> &nbsp; &nbsp; &nbsp; Map<String, String&gt; env,
> &nbsp; &nbsp; &nbsp; HighAvailabilityServices highAvailabilityServices,
> &nbsp; &nbsp; &nbsp; HeartbeatServices heartbeatServices,
> &nbsp; &nbsp; &nbsp; SlotManager slotManager,
> &nbsp; &nbsp; &nbsp; MetricRegistry metricRegistry,
> &nbsp; &nbsp; &nbsp; JobLeaderIdService jobLeaderIdService,
> &nbsp; &nbsp; &nbsp; ClusterInformation clusterInformation,
> &nbsp; &nbsp; &nbsp; FatalErrorHandler fatalErrorHandler,
> &nbsp; &nbsp; &nbsp; @Nullable String webInterfaceUrl,
> &nbsp; &nbsp; &nbsp; JobManagerMetricGroup jobManagerMetricGroup) {
> &nbsp; &nbsp;super(
> &nbsp; &nbsp; &nbsp; rpcService,
> &nbsp; &nbsp; &nbsp; resourceManagerEndpointId,
> &nbsp; &nbsp; &nbsp; resourceId,
> &nbsp; &nbsp; &nbsp; highAvailabilityServices,
> &nbsp; &nbsp; &nbsp; heartbeatServices,
> &nbsp; &nbsp; &nbsp; slotManager,
> &nbsp; &nbsp; &nbsp; metricRegistry,
> &nbsp; &nbsp; &nbsp; jobLeaderIdService,
> &nbsp; &nbsp; &nbsp; clusterInformation,
> &nbsp; &nbsp; &nbsp; fatalErrorHandler,
> &nbsp; &nbsp; &nbsp; jobManagerMetricGroup);
> &nbsp; &nbsp;this.flinkConfig&nbsp; = flinkConfig;
> &nbsp; &nbsp;this.yarnConfig = new YarnConfiguration(HadoopUtils.getHadoopConfiguration(flinkConfig));
> 但我认为这肯定不是解决问题的方法，所以向大家求助，是不是我忽略什么。

tison

Re: flink 1.7.2 YARN Session模式提交任务问题求助

注意环境变量和 fs.hdfs.hdfsdefault 要配置成 HDFS 路径或 YARN
集群已知的本地路径，不要配置成客户端的路径。因为实际起作用是在拉起 TM 的那台机器上解析拉取的。

Best,
tison.

Chief <[hidden email]> 于2020年4月15日周三下午7:40写道：

> hi Yangze Guo
> 您说的环境变量已经在当前用户的环境变量文件里面设置了，您可以看看我的问题描述，现在如果checkpoint的路径设置不是namenode
> ha的nameservice就不会报错，checkpoint都正常。
>
>
>
>
> ------------------ 原始邮件 ------------------
> 发件人: "Yangze Guo"<[hidden email]>;
> 发送时间: 2020年4月15日(星期三) 下午3:00
> 收件人: "user-zh"<[hidden email]>;
>
> 主题: Re: flink 1.7.2 YARN Session模式提交任务问题求助
>
>
>
> Flink需要设置hadoop相关conf位置的环境变量 YARN_CONF_DIR or HADOOP_CONF_DIR [1]
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/yarn_setup.html
>
> Best,
> Yangze Guo
>
> On Mon, Apr 13, 2020 at 10:52 PM Chief <[hidden email]> wrote:
> >
> > 大家好
> > 目前环境是flink 1.7.2，使用YARN Session模式提交任务，Hadoop 版本2.7.3，hdfs
> namenode配置了ha模式，提交任务的时候报以下错误，系统环境变量中已经设置了HADOOP_HOME,YARN_CONF_DIR，HADOOP_CONF_DIR，HADOOP_CLASSPATH，在flink_conf.yaml中配置了fs.hdfs.hadoopconf
> >
> >
> > 2020-04-10 19:12:02,908 INFO&nbsp;
> org.apache.flink.runtime.jobmaster.JobMaster&nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
> &nbsp; - Connecting to ResourceManager akka.tcp://flink@trusfortpoc1
> :23584/user/resourcemanager(00000000000000000000000000000000)
> > 2020-04-10 19:12:02,909 INFO&nbsp;
> org.apache.flink.runtime.jobmaster.slotpool.SlotPool&nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp; - Cannot serve slot request, no
> ResourceManager connected. Adding as pending request
> [SlotRequestId{0feacbb4fe16c8c7a70249f1396565d0}]
> > 2020-04-10 19:12:02,911 INFO&nbsp;
> org.apache.flink.runtime.jobmaster.JobMaster&nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
> &nbsp; - Resolved ResourceManager address, beginning registration
> > 2020-04-10 19:12:02,911 INFO&nbsp;
> org.apache.flink.runtime.jobmaster.JobMaster&nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
> &nbsp; - Registration at ResourceManager attempt 1 (timeout=100ms)
> > 2020-04-10 19:12:02,912 INFO&nbsp;
> org.apache.flink.runtime.jobmaster.slotpool.SlotPool&nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp; - Cannot serve slot request, no
> ResourceManager connected. Adding as pending request
> [SlotRequestId{35ad2384e9cd0efd30b43f5302db24b6}]
> > 2020-04-10 19:12:02,913 INFO&nbsp;
> org.apache.flink.yarn.YarnResourceManager&nbsp; &nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
> &nbsp; &nbsp;- Registering job manager
> [hidden email]://flink@trusfortpoc1:23584/user/jobmanager_0
> for job 24691b33c18d7ad73b1f52edb3d68ae4.
> > 2020-04-10 19:12:02,917 INFO&nbsp;
> org.apache.flink.yarn.YarnResourceManager&nbsp; &nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
> &nbsp; &nbsp;- Registered job manager
> [hidden email]://flink@trusfortpoc1:23584/user/jobmanager_0
> for job 24691b33c18d7ad73b1f52edb3d68ae4.
> > 2020-04-10 19:12:02,919 INFO&nbsp;
> org.apache.flink.runtime.jobmaster.JobMaster&nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
> &nbsp; - JobManager successfully registered at ResourceManager, leader
> id: 00000000000000000000000000000000.
> > 2020-04-10 19:12:02,919 INFO&nbsp;
> org.apache.flink.runtime.jobmaster.slotpool.SlotPool&nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp; - Requesting new slot
> [SlotRequestId{35ad2384e9cd0efd30b43f5302db24b6}] and profile
> ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0,
> nativeMemoryInMB=0, networkMemoryInMB=0} from resource manager.
> > 2020-04-10 19:12:02,920 INFO&nbsp;
> org.apache.flink.yarn.YarnResourceManager&nbsp; &nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
> &nbsp; &nbsp;- Request slot with profile
> ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0,
> nativeMemoryInMB=0, networkMemoryInMB=0} for job
> 24691b33c18d7ad73b1f52edb3d68ae4 with allocation id
> AllocationID{5a12237c7f2bd8b1cc760ddcbab5a1c0}.
> > 2020-04-10 19:12:02,921 INFO&nbsp;
> org.apache.flink.runtime.jobmaster.slotpool.SlotPool&nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp; - Requesting new slot
> [SlotRequestId{0feacbb4fe16c8c7a70249f1396565d0}] and profile
> ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0,
> nativeMemoryInMB=0, networkMemoryInMB=0} from resource manager.
> > 2020-04-10 19:12:02,924 INFO&nbsp;
> org.apache.flink.yarn.YarnResourceManager&nbsp; &nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
> &nbsp; &nbsp;- Requesting new TaskExecutor container with resources
> <memory:4096, vCores:6&gt;. Number pending requests 1.
> > 2020-04-10 19:12:02,926 INFO&nbsp;
> org.apache.flink.yarn.YarnResourceManager&nbsp; &nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
> &nbsp; &nbsp;- Request slot with profile
> ResourceProfile{cpuCores=-1.0, heapMemoryInMB=-1, directMemoryInMB=0,
> nativeMemoryInMB=0, networkMemoryInMB=0} for job
> 24691b33c18d7ad73b1f52edb3d68ae4 with allocation id
> AllocationID{37dd666a18040bf63ffbf2e022b2ea9b}.
> > 2020-04-10 19:12:06,531 INFO&nbsp;
> org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl&nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp;- Received new token for :
> trusfortpoc3:35206
> > 2020-04-10 19:12:06,543 INFO&nbsp;
> org.apache.flink.yarn.YarnResourceManager&nbsp; &nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
> &nbsp; &nbsp;- Received new container:
> container_1586426824930_0006_01_000002 - Remaining pending container
> requests: 1
> > 2020-04-10 19:12:06,543 INFO&nbsp;
> org.apache.flink.yarn.YarnResourceManager&nbsp; &nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
> &nbsp; &nbsp;- Removing container request Capability[<memory:4096,
> vCores:6&gt;]Priority[1]. Pending container requests 0.
> > 2020-04-10 19:12:06,568 ERROR
> org.apache.flink.yarn.YarnResourceManager&nbsp; &nbsp; &nbsp;
> &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;
> &nbsp; &nbsp;- Could not start TaskManager in container
> container_1586426824930_0006_01_000002.
> > java.lang.IllegalArgumentException: java.net.UnknownHostException:
> hdfsClusterForML
> >         at
> org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
> >         at
> org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:320)
> >         at
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
> >         at
> org.apache.hadoop.hdfs.DFSClient.<init&gt;(DFSClient.java:687)
> >         at
> org.apache.hadoop.hdfs.DFSClient.<init&gt;(DFSClient.java:628)
> >         at
> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
> >         at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
> >         at
> org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
> >         at
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2701)
> >         at
> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2683)
> >         at
> org.apache.hadoop.fs.FileSystem.get(FileSystem.java:372)
> >         at
> org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
> >         at
> org.apache.flink.yarn.Utils.createTaskExecutorContext(Utils.java:453)
> >         at
> org.apache.flink.yarn.YarnResourceManager.createTaskExecutorLaunchContext(YarnResourceManager.java:555)
> >         at
> org.apache.flink.yarn.YarnResourceManager.lambda$onContainersAllocated$1(YarnResourceManager.java:390)
> >         at
> org.apache.flink.yarn.YarnResourceManager$$Lambda$183/1182651376.run(Unknown
> Source)
> >         at
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:332)
> >         at
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:158)
> >         at
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:70)
> >         at
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142)
> >         at
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40)
> >         at
> akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
> >         at
> akka.actor.Actor$class.aroundReceive(Actor.scala:502)
> >         at
> akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
> >         at
> akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
> >         at
> akka.actor.ActorCell.invoke(ActorCell.scala:495)
> >         at
> akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
> >         at
> akka.dispatch.Mailbox.run(Mailbox.scala:224)
> >         at
> akka.dispatch.Mailbox.exec(Mailbox.scala:234)
> >         at
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> >         at
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> >         at
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> >         at
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> > Caused by: java.net.UnknownHostException: hdfsClusterForML
> >         ... 33 more
> >
> >
> > 这个hdfsClusterForML是namenode ha
> 的nameservice，经过分析是没加载hdfs-site.xml配置导致的，
> > 也尝试过把Hadoop的几个配置文件放到flink
> 的conf目录下但都无效，最终通过改YarnResourceManager源码后能够正常提交任务。
> > public YarnResourceManager(
> > &nbsp; &nbsp; &nbsp; RpcService rpcService,
> > &nbsp; &nbsp; &nbsp; String resourceManagerEndpointId,
> > &nbsp; &nbsp; &nbsp; ResourceID resourceId,
> > &nbsp; &nbsp; &nbsp; Configuration flinkConfig,
> > &nbsp; &nbsp; &nbsp; Map<String, String&gt; env,
> > &nbsp; &nbsp; &nbsp; HighAvailabilityServices
> highAvailabilityServices,
> > &nbsp; &nbsp; &nbsp; HeartbeatServices heartbeatServices,
> > &nbsp; &nbsp; &nbsp; SlotManager slotManager,
> > &nbsp; &nbsp; &nbsp; MetricRegistry metricRegistry,
> > &nbsp; &nbsp; &nbsp; JobLeaderIdService
> jobLeaderIdService,
> > &nbsp; &nbsp; &nbsp; ClusterInformation
> clusterInformation,
> > &nbsp; &nbsp; &nbsp; FatalErrorHandler fatalErrorHandler,
> > &nbsp; &nbsp; &nbsp; @Nullable String webInterfaceUrl,
> > &nbsp; &nbsp; &nbsp; JobManagerMetricGroup
> jobManagerMetricGroup) {
> > &nbsp; &nbsp;super(
> > &nbsp; &nbsp; &nbsp; rpcService,
> > &nbsp; &nbsp; &nbsp; resourceManagerEndpointId,
> > &nbsp; &nbsp; &nbsp; resourceId,
> > &nbsp; &nbsp; &nbsp; highAvailabilityServices,
> > &nbsp; &nbsp; &nbsp; heartbeatServices,
> > &nbsp; &nbsp; &nbsp; slotManager,
> > &nbsp; &nbsp; &nbsp; metricRegistry,
> > &nbsp; &nbsp; &nbsp; jobLeaderIdService,
> > &nbsp; &nbsp; &nbsp; clusterInformation,
> > &nbsp; &nbsp; &nbsp; fatalErrorHandler,
> > &nbsp; &nbsp; &nbsp; jobManagerMetricGroup);
> > &nbsp; &nbsp;this.flinkConfig&nbsp; = flinkConfig;
> > &nbsp; &nbsp;this.yarnConfig = new
> YarnConfiguration(HadoopUtils.getHadoopConfiguration(flinkConfig));
> > 但我认为这肯定不是解决问题的方法，所以向大家求助，是不是我忽略什么。

Lori681

Re: flink 1.7.2 YARN Session模式提交任务问题求助

In reply to this post by Chief