Flink standalone on k8s HA exception


casel.chen
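I am trying to deploy a Flink standalone cluster on k8s. Before enabling HA the cluster worked normally, but after adding the following two HA options to the ConfigMap, the JM throws an exception. Why does this happen?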


high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
high-availability.storageDir: oss:///odps-prd/rtdp/flink/recovery


2021-02-09 00:03:04,421 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Could not start cluster entrypoint StandaloneSessionClusterEntrypoint.
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to initialize the cluster entrypoint StandaloneSessionClusterEntrypoint.
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:200) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:569) [flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint.main(StandaloneSessionClusterEntrypoint.java:59) [flink-dist_2.12-1.12.1.jar:1.12.1]
Caused by: org.apache.flink.util.FlinkException: Could not create the ha services from the instantiated HighAvailabilityServicesFactory org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory.
at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:268) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:124) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:332) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:290) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:223) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:178) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_282]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_282]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:175) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
... 2 more
Caused by: java.lang.NullPointerException
at org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:59) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.<init>(Fabric8FlinkKubeClient.java:84) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.kubernetes.kubeclient.DefaultKubeClientFactory.fromConfiguration(DefaultKubeClientFactory.java:88) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory.createHAServices(KubernetesHaServicesFactory.java:38) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:265) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:124) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:332) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:290) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:223) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:178) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_282]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_282]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:175) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
... 2 more

Re: Flink standalone on k8s HA exception

Yang Wang
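Once HA is enabled, you need to create a service account that has permission to create/watch ConfigMaps,
and then mount it to the JobManager and TaskManager.
Judging from your error, the service account has probably not been configured.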
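
A minimal sketch of what such a service account could look like, assuming Kubernetes RBAC is enabled. The names `flink-service-account` and `flink-ha-configmaps` and the `default` namespace are placeholders, not values from this thread. The verb list is an assumption: it is deliberately broader than create/watch, since leader election also has to read and update ConfigMaps.

```yaml
# Hypothetical names and namespace; adjust them to your cluster.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flink-service-account
  namespace: default
---
# Role granting access to ConfigMaps in the namespace (verb list is an assumption).
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: flink-ha-configmaps
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
# Bind the Role to the service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: flink-ha-configmaps
  namespace: default
subjects:
  - kind: ServiceAccount
    name: flink-service-account
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: flink-ha-configmaps
```

To attach it, the JobManager and TaskManager pod templates would set `serviceAccountName: flink-service-account` under `spec.template.spec`. The Flink Kubernetes HA documentation also lists `kubernetes.cluster-id` alongside the two options shown in the question, so it may be worth checking that it is set as well.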

Best,
Yang


On Tue, Feb 9, 2021 at 12:10 AM, casel.chen <[hidden email]> wrote:

> I am trying to deploy a Flink standalone cluster on k8s. Before enabling HA
> the cluster worked normally, but after adding the following two HA options
> to the ConfigMap, the JM throws an exception. Why does this happen?
>
>
> high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
> high-availability.storageDir: oss:///odps-prd/rtdp/flink/recovery