Hi community,

I recently upgraded from Flink 1.10 to Flink 1.12. When I submit a job to YARN, it keeps failing with the following error:

org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: Failed to execute sql
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:365)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:218)
    at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114)
    at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:812)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:246)
    at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1054)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
Caused by: org.apache.flink.table.api.TableException: Failed to execute sql
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:699)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeOperation(TableEnvironmentImpl.java:767)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:666)
    at com.youzan.bigdata.FlinkStreamSQLDDLJob.lambda$main$0(FlinkStreamSQLDDLJob.java:95)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1380)
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
    at com.youzan.bigdata.FlinkStreamSQLDDLJob.main(FlinkStreamSQLDDLJob.java:93)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:348)
    ... 11 more
Caused by: org.apache.flink.client.deployment.ClusterDeploymentException: Could not deploy Yarn job cluster.
    at org.apache.flink.yarn.YarnClusterDescriptor.deployJobCluster(YarnClusterDescriptor.java:481)
    at org.apache.flink.client.deployment.executors.AbstractJobClusterExecutor.execute(AbstractJobClusterExecutor.java:81)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1905)
    at org.apache.flink.client.program.StreamContextEnvironment.executeAsync(StreamContextEnvironment.java:135)
    at org.apache.flink.table.planner.delegation.ExecutorBase.executeAsync(ExecutorBase.java:55)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:681)
    ... 22 more
Caused by: org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
Diagnostics from YARN: Application application_1613992328588_4441 failed 2 times due to AM Container for appattempt_1613992328588_4441_000002 exited with exitCode: 1
Diagnostics: Exception from container-launch.
Container id: container_xxx
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:575)
    at org.apache.hadoop.util.Shell.run(Shell.java:478)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:766)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Some background:
1. My Flink job has no Hadoop-related dependencies.
2. The machine that submits the job and every machine in the Hadoop cluster have the HADOOP_CLASSPATH environment variable set.
3. After the job is submitted to YARN, the application state goes from ACCEPTED to FAILED.

I hope someone can help me figure this out. Thanks.

Best,
LakeShen
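What does your HADOOP_CLASSPATH environment variable actually contain? Is it identical to the output of the "hadoop classpath" command?

According to [1], starting with Flink 1.11 the HDFS-related jars are no longer bundled into the Flink distribution by default. Instead, users have to provide the path to those jars at run time. The statement export HADOOP_CLASSPATH=`hadoop classpath` runs the hadoop classpath command and assigns its output to the HADOOP_CLASSPATH environment variable.

Also, to isolate variables, could you submit the simplest possible job? Judging from the error log above, you submitted a SQL job. Try submitting a DataStream word count and see what error it reports.

Reference:
[1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/resource-providers/yarn.html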
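The comparison suggested above can be sketched as a small shell check. This is only an illustration, not something from the Flink documentation: the helper names normalize_cp and same_classpath are made up here, and the check treats the two classpaths as equal when they contain the same entries regardless of order.

```shell
# Normalize a classpath string: one entry per line, sorted, de-duplicated.
# (Hypothetical helper, not part of Flink or Hadoop.)
normalize_cp() {
  printf '%s\n' "$1" | tr ':' '\n' | sort -u
}

# Succeed when both classpath strings hold the same set of entries.
same_classpath() {
  [ "$(normalize_cp "$1")" = "$(normalize_cp "$2")" ]
}

# Only compare when the hadoop CLI is actually on the PATH.
if command -v hadoop >/dev/null 2>&1; then
  if same_classpath "${HADOOP_CLASSPATH:-}" "$(hadoop classpath)"; then
    echo "HADOOP_CLASSPATH matches the output of 'hadoop classpath'"
  else
    echo "HADOOP_CLASSPATH differs from the output of 'hadoop classpath'"
  fi
fi
```

If the two differ, re-exporting the variable with the export statement quoted above in the same shell that runs the submission is the simplest thing to try.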
In reply to this post by LakeShen
I hit the same problem when submitting a job to a YARN cluster. The client-side error is also:

org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
Diagnostics from YARN: Application application_1610671284452_0243 failed 10 times due to AM Container for appattempt_1610671284452_0243_000010 exited with exitCode: 1
Failing this attempt.Diagnostics: [2021-02-23 18:51:00.021]Exception from container-launch.
Container id: container_e48_1610671284452_0243_10_000001
Exit code: 1

[2021-02-23 18:51:00.024]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :

[2021-02-23 18:51:00.027]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :

The log on the YARN side shows: Could not find or load main class org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint

Note that I am using the Flink 1.12 API while the cluster I submit to is still on Flink 1.10.1, so I am not sure where the problem lies.

凌战
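One way to narrow down a "Could not find or load main class" message is to check whether the flink-dist jar used for submission contains the entrypoint class at all. The sketch below assumes unzip is installed and uses a jar path that is only an example. The helper has_entrypoint is a made-up name. The full container log can usually be fetched with yarn logs -applicationId followed by the application id.

```shell
# Path of the entrypoint class inside a jar, derived from the class name in the YARN log.
ENTRYPOINT_PATH='org/apache/flink/yarn/entrypoint/YarnJobClusterEntrypoint.class'

# Read a jar listing on stdin and succeed if the entrypoint class is listed.
# (Hypothetical helper, not part of Flink.)
has_entrypoint() {
  grep -q -F "$ENTRYPOINT_PATH"
}

# Example location only; set DIST_JAR to the jar your client really uses.
DIST_JAR="${DIST_JAR:-/opt/flink-1.12/lib/flink-dist_2.11-1.12.0.jar}"
if [ -f "$DIST_JAR" ] && command -v unzip >/dev/null 2>&1; then
  if unzip -l "$DIST_JAR" | has_entrypoint; then
    echo "entrypoint class found in $DIST_JAR"
  else
    echo "entrypoint class missing from $DIST_JAR"
  fi
fi
```

If the class is present in the jar, the more likely explanation is that the jar never reached the container classpath, which is a shipping or configuration question rather than a packaging one.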
In reply to this post by 凌战
For this, the Flink directory configured locally on the submitting machine should be the 1.12 version, that is, the flink-dist directory.
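To see which Flink version a local installation carries, one option is to read it off the flink-dist jar file name. This is a rough sketch that assumes the usual flink-dist_<scala version>-<flink version>.jar naming. The helper dist_version and the fallback directory are assumptions for illustration.

```shell
# Extract the Flink version from a flink-dist jar file name,
# e.g. flink-dist_2.11-1.12.0.jar gives 1.12.0.
# (Hypothetical helper; relies on the usual file naming.)
dist_version() {
  basename "$1" .jar | sed 's/^flink-dist_[0-9.]*-//'
}

# FLINK_HOME is a common convention; the fallback path is only an example.
FLINK_DIR="${FLINK_HOME:-/opt/flink-1.12}"
for jar in "$FLINK_DIR"/lib/flink-dist_*.jar; do
  [ -e "$jar" ] || continue
  echo "$jar -> Flink $(dist_version "$jar")"
done
```

Running bin/flink --version from the same directory should report the same version. If more than one flink-dist jar shows up in lib, that alone could explain a mixed-version submission.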
Do you mean that the flink-dist jar used at submission time needs to be version 1.12? I have now switched it to 1.12 and it still does not work.

凌战
可以试试在flink-conf.yaml里添加如下配置:
    yarn.flink-dist-jar: /opt/flink-1.12/lib/flink-dist_2.11-1.12.0.jar
    yarn.ship-files: /data/dfl2/lib

This behavior is actually rather odd: in our environment, some of the machines that submit jobs work without these settings, while others fail with the "main class not found" error unless they are added.

PS: Another possible cause of the missing main class is a mismatch between the Flink version the program depends on and the Flink version that is deployed. This can happen when the Flink dependency has been upgraded but the deployed Flink was not updated, or was only partially updated.

马阳阳
[hidden email]

On 2021-02-23 22:36, m183 <[hidden email]> wrote:

Do you mean that the flink-dist jar used at submission time has to be the 1.12 version? I have switched it to 1.12 and it still does not work.

On 2021-02-23 21:27, LakeShen <[hidden email]> wrote:

Your local Flink directory, that is, the one containing flink-dist, should be the 1.12 version.

On Tue, 2021-02-23 19:33, 凌战 <[hidden email]> wrote:

Same here. When submitting a job to a cluster on YARN, the client-side error is also:

org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
Diagnostics from YARN: Application application_1610671284452_0243 failed 10 times due to AM Container for appattempt_1610671284452_0243_000010 exited with exitCode: 1
Failing this attempt.Diagnostics: [2021-02-23 18:51:00.021]Exception from container-launch.
Container id: container_e48_1610671284452_0243_10_000001
Exit code: 1
[2021-02-23 18:51:00.024]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
[2021-02-23 18:51:00.027]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :

The log on the YARN side shows: Could not find or load main class org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint

However, I am using the Flink 1.12 API while the cluster I submit to is still Flink 1.10.1, so I am not sure where the problem lies.

凌战
[hidden email]

On 2021-02-23 18:46, LakeShen <[hidden email]> wrote:

Hi community,

I recently upgraded from Flink 1.10 to Flink 1.12, and when submitting a job to YARN it keeps failing with the following error:

org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: Failed to execute sql
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:365)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:218)
    at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114)
    at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:812)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:246)
    at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1054)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
Caused by: org.apache.flink.table.api.TableException: Failed to execute sql
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:699)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeOperation(TableEnvironmentImpl.java:767)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:666)
    at com.youzan.bigdata.FlinkStreamSQLDDLJob.lambda$main$0(FlinkStreamSQLDDLJob.java:95)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1380)
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
    at com.youzan.bigdata.FlinkStreamSQLDDLJob.main(FlinkStreamSQLDDLJob.java:93)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:348)
    ... 11 more
Caused by: org.apache.flink.client.deployment.ClusterDeploymentException: Could not deploy Yarn job cluster.
    at org.apache.flink.yarn.YarnClusterDescriptor.deployJobCluster(YarnClusterDescriptor.java:481)
    at org.apache.flink.client.deployment.executors.AbstractJobClusterExecutor.execute(AbstractJobClusterExecutor.java:81)
    at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:1905)
    at org.apache.flink.client.program.StreamContextEnvironment.executeAsync(StreamContextEnvironment.java:135)
    at org.apache.flink.table.planner.delegation.ExecutorBase.executeAsync(ExecutorBase.java:55)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:681)
    ... 22 more
Caused by: org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
Diagnostics from YARN: Application application_1613992328588_4441 failed 2 times due to AM Container for appattempt_1613992328588_4441_000002 exited with exitCode: 1
Diagnostics: Exception from container-launch.
Container id: container_xxx
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:575)
    at org.apache.hadoop.util.Shell.run(Shell.java:478)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:766)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Some related information:

1. My Flink job has no Hadoop-related dependencies.
2. The machine that submits the job and every machine in the Hadoop cluster have the HADOOP_CLASSPATH environment variable set.
3. After the job is submitted to YARN, its state goes from Accepted to FAILED.

I hope someone can help me figure this out. Thanks.

Best,
LakeShen
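The version-mismatch cause raised in this thread (a job built against Flink 1.12 while the deployed Flink is older or only partially updated) can be checked before submitting. Below is a minimal Python sketch, not part of Flink. It assumes the dist jar is named like `flink-dist_2.11-1.12.0.jar`, as in the path quoted in this thread. The function names `dist_versions` and `check_dist_version` are made up for illustration.

```python
import re
from pathlib import Path

# Assumed naming scheme, taken from the jar path quoted in this thread:
# flink-dist_<scala version>-<flink version>.jar
_DIST_JAR = re.compile(r"^flink-dist_(\d+\.\d+)-(\d+\.\d+\.\d+)\.jar$")


def dist_versions(lib_dir):
    """Return the sorted Flink versions of flink-dist jars directly under lib_dir."""
    found = set()
    for jar in Path(lib_dir).glob("flink-dist_*.jar"):
        match = _DIST_JAR.match(jar.name)
        if match:
            found.add(match.group(2))
    return sorted(found)


def check_dist_version(lib_dir, expected):
    """Compare the flink-dist version(s) in lib_dir with the expected version."""
    versions = dist_versions(lib_dir)
    if not versions:
        return "no flink-dist jar found"
    if versions == [expected]:
        return "ok"
    # Either a different version, or several versions side by side
    # (for example after a partial upgrade).
    return "mismatch: found " + ", ".join(versions)
```

For example, `check_dist_version("/opt/flink-1.12/lib", "1.12.0")` would inspect the directory used in the configuration quoted above. Jar names that do not follow the assumed pattern (snapshot builds, for instance) are ignored by this sketch.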
To clarify: the directory configured in yarn.ship-files must contain the flink-yarn jar. It can simply be set to the lib directory under the Flink home.
马阳阳
[hidden email]
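Since the YARN log reports that `org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint` cannot be found, one way to verify the `yarn.ship-files` directory is to look for that class inside the jars it contains. The following is a rough Python sketch under that assumption. It makes no claim about which jar is supposed to ship the class and only reports where it is found. The function name `jars_containing` is hypothetical.

```python
import zipfile
from pathlib import Path

# Class name copied from the YARN log quoted in this thread.
ENTRYPOINT = "org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint"


def jars_containing(directory, class_name=ENTRYPOINT):
    """Return the names of jars directly under `directory` that contain class_name."""
    # A class a.b.C is stored in a jar as the entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(directory).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                if entry in archive.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid archives (e.g. a truncated copy).
            continue
    return hits
```

An empty result for the directory given in `yarn.ship-files` would be consistent with the "Could not find or load main class" error, although other causes, such as the version mismatch mentioned earlier in the thread, remain possible.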