This is the error output from the startup log. The job was submitted in Flink on YARN mode.
[root@node01 flink-1.9.1]# bin/flink run -m yarn-cluster -yn 2 ./examples/batch/WordCount.jar
2020-09-06 14:30:00,803 INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2020-09-06 14:30:00,938 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2020-09-06 14:30:00,938 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2020-09-06 14:30:00,947 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - The argument yn is deprecated in will be ignored.
2020-09-06 14:30:00,947 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - The argument yn is deprecated in will be ignored.
2020-09-06 14:30:01,105 INFO org.apache.hadoop.conf.Configuration - resource-types.xml not found
2020-09-06 14:30:01,105 INFO org.apache.hadoop.yarn.util.resource.ResourceUtils - Unable to find 'resource-types.xml'.
2020-09-06 14:30:01,136 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=1024, numberTaskManagers=2, slotsPerTaskManager=2}
2020-09-06 14:30:01,193 WARN org.apache.flink.yarn.AbstractYarnClusterDescriptor - The file system scheme is 'file'. This indicates that the specified Hadoop configuration path is wrong and the system is using the default Hadoop configuration values.The Flink YARN client needs to store its files in a distributed file system
2020-09-06 14:30:01,196 WARN org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration directory ('/export/servers/flink-1.9.1/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
2020-09-06 14:30:01,600 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application master application_1599371603539_0004
2020-09-06 14:30:01,635 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1599371603539_0004
2020-09-06 14:30:01,635 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster to be allocated
2020-09-06 14:30:01,644 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, current state ACCEPTED

------------------------------------------------------------
 The program finished with the following exception:

org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
    at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:385)
    at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:251)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:205)
    at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1010)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1083)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1083)
Caused by: org.apache.flink.yarn.AbstractYarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
Diagnostics from YARN: Application application_1599371603539_0004 failed 2 times due to AM Container for appattempt_1599371603539_0004_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://node01:8088/cluster/app/application_1599371603539_0004Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1599371603539_0004_02_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:585)
    at org.apache.hadoop.util.Shell.run(Shell.java:482)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:776)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
If log aggregation is enabled on your cluster, use this command to further investigate the issue:
yarn logs -applicationId application_1599371603539_0004
    at org.apache.flink.yarn.AbstractYarnClusterDescriptor.startAppMaster(AbstractYarnClusterDescriptor.java:1024)
    at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deployInternal(AbstractYarnClusterDescriptor.java:507)
    at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:378)
    ... 9 more
2020-09-06 14:30:06,357 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cancelling deployment from Deployment Failure Hook
2020-09-06 14:30:06,357 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Killing YARN application
2020-09-06 14:30:06,378 INFO org.apache.hadoop.io.retry.RetryInvocationHandler - java.io.IOException: The client is stopped, while invoking ApplicationClientProtocolPBClientImpl.forceKillApplication over null. Trying to failover immediately.
2020-09-06 14:30:06,382 INFO org.apache.hadoop.io.retry.RetryInvocationHandler - java.io.IOException: The client is stopped, while invoking ApplicationClientProtocolPBClientImpl.forceKillApplication over null after 1 failover attempts. Trying to failover after sleeping for 27593ms.
2020-09-06 14:30:33,977 INFO org.apache.hadoop.io.retry.RetryInvocationHandler - java.io.IOException: The client is stopped, while invoking ApplicationClientProtocolPBClientImpl.forceKillApplication over null after 2 failover attempts. Trying to failover after sleeping for 39571ms.
2020-09-06 14:31:13,549 INFO org.apache.hadoop.io.retry.RetryInvocationHandler - java.io.IOException: The client is stopped, while invoking ApplicationClientProtocolPBClientImpl.forceKillApplication over null after 3 failover attempts. Trying to failover after sleeping for 26075ms.

This is the error output I get from yarn logs -applicationId application_1599371603539_0004:

Container: container_1599371603539_0004_02_000001 on node01_35016
===================================================================
LogType:jobmanager.err
Log Upload Time:星期日 九月 06 14:30:07 +0800 2020
LogLength:589
Log Contents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/export/servers/hadoop-2.7.7/hadoopDatas/tempDatas/nm-local-dir/usercache/root/appcache/application_1599371603539_0004/filecache/10/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/export/servers/hadoop-2.7.7/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
End of LogType:jobmanager.err

LogType:jobmanager.out
Log Upload Time:星期日 九月 06 14:30:07 +0800 2020
LogLength:0
Log Contents:
End of LogType:jobmanager.out

I am using Hadoop 2.7.7. Could anyone tell me what is going wrong here?

--
Sent from: http://apache-flink.147419.n8.nabble.com/
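One thing the client log itself points at is the WARN saying the file system scheme is 'file', together with the ResourceManager address /0.0.0.0:8032. Both suggest the Flink client may not be picking up the cluster's Hadoop configuration. The check below is only a sketch: the etc/hadoop directory is an assumption derived from the /export/servers/hadoop-2.7.7 path that appears in the SLF4J lines, and this may or may not be what makes the AM container exit.

```shell
# Assumption: Hadoop is installed under /export/servers/hadoop-2.7.7 (path taken
# from the SLF4J lines in the log) and keeps its config in the standard etc/hadoop.
# An already exported HADOOP_CONF_DIR is left untouched.
HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/export/servers/hadoop-2.7.7/etc/hadoop}"
export HADOOP_CONF_DIR

# Report whether the two files the YARN client relies on are actually there.
for f in core-site.xml yarn-site.xml; do
  if [ -f "$HADOOP_CONF_DIR/$f" ]; then
    echo "found   $HADOOP_CONF_DIR/$f"
  else
    echo "missing $HADOOP_CONF_DIR/$f"
  fi
done
```

If both files are found, exporting HADOOP_CONF_DIR in the same shell before calling bin/flink should at least make the 'file' scheme warning go away, which narrows down what is left to investigate.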
Try removing the -yn 2 argument and see what happens. That argument stopped taking effect a long time ago; TaskManagers are now requested and released dynamically.

Best,
Yang

xzw0223 <[hidden email]> wrote on Monday, September 7, 2020 at 9:50 AM:

> This is the error output from the startup log. The job was submitted in Flink on YARN mode.
>
> [root@node01 flink-1.9.1]# bin/flink run -m yarn-cluster -yn 2 ./examples/batch/WordCount.jar
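Following the suggestion above, the same submission without the deprecated flag could look like the sketch below. The command and paths are the ones from the original post; the FLINK_CMD variable and the check for bin/flink are illustrative additions so the snippet does nothing harmful when pasted outside a Flink distribution directory.

```shell
# Same submission as in the original post, minus the deprecated "-yn 2".
FLINK_CMD="bin/flink run -m yarn-cluster ./examples/batch/WordCount.jar"

# Only submit when we are actually inside a Flink distribution directory
# (e.g. /export/servers/flink-1.9.1); otherwise just print the command.
if [ -x bin/flink ]; then
  $FLINK_CMD
else
  echo "bin/flink not found; would run: $FLINK_CMD"
fi
```

Since the client log already reports that the yn argument is ignored, dropping it mainly removes noise; if the AM container still exits with code 1, the container logs remain the place to look.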