flink 1.12.0 kubernetes-session deployment problem

flink 1.12.0 kubernetes-session deployment problem

casel.chen
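This is my first attempt at deploying Flink on k8s. I am using Flink 1.12.0, JDK 1.8.0_275 and Scala 2.12.12, with a single-node minikube environment installed on my Mac. These are the steps I followed: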


git clone https://github.com/apache/flink-docker
cd flink-docker/1.12/scala_2.12-java8-debian
docker build --tag flink:1.12.0-scala_2.12-java8 .


cd flink-1.12.0
./bin/kubernetes-session.sh \
  -Dkubernetes.container.image=flink:1.12.0-scala_2.12-java8 \
  -Dkubernetes.rest-service.exposed.type=NodePort \
  -Dtaskmanager.numberOfTaskSlots=2 \
  -Dkubernetes.cluster-id=flink-session-cluster


The output says the JM has started, but I cannot reach it from a browser:

2020-12-27 22:08:12,387 INFO  org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] - Create flink session cluster session001 successfully, JobManager Web Interface: http://192.168.99.100:8081




`kubectl get pods` shows that the pod stays in the ContainerCreating state:

NAME                                               READY   STATUS              RESTARTS   AGE
flink-session-cluster-858bd55dff-bzjk2             0/1     ContainerCreating   0          5m59s
kubernetes-dashboard-1608509744-6bc8455756-mp47w   1/1     Running             0          6d14h




So I ran `kubectl describe pod flink-session-cluster-858bd55dff-bzjk2` to see the details. The result is:




Name:         flink-session-cluster-858bd55dff-bzjk2
Namespace:    default
Priority:     0
Node:         minikube/192.168.99.100
Start Time:   Sun, 27 Dec 2020 22:21:56 +0800
Labels:       app=flink-session-cluster
              component=jobmanager
              pod-template-hash=858bd55dff
              type=flink-native-kubernetes
Annotations:  <none>
Status:       Pending
IP:           172.17.0.4
IPs:
  IP:           172.17.0.4
Controlled By:  ReplicaSet/flink-session-cluster-858bd55dff
Containers:
  flink-job-manager:
    Container ID:
    Image:         flink:1.12.0-scala_2.12-java8
    Image ID:
    Ports:         8081/TCP, 6123/TCP, 6124/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Command:
      /docker-entrypoint.sh
    Args:
      native-k8s
      $JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx1073741824 -Xms1073741824 -XX:MaxMetaspaceSize=268435456 -Dlog.file=/opt/flink/log/jobmanager.log -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties -Dlog4j.configurationFile=file:/opt/flink/conf/log4j-console.properties org.apache.flink.kubernetes.entrypoint.KubernetesSessionClusterEntrypoint -D jobmanager.memory.off-heap.size=134217728b -D jobmanager.memory.jvm-overhead.min=201326592b -D jobmanager.memory.jvm-metaspace.size=268435456b -D jobmanager.memory.heap.size=1073741824b -D jobmanager.memory.jvm-overhead.max=201326592b
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     1
      memory:  1600Mi
    Requests:
      cpu:     1
      memory:  1600Mi
    Environment:
      _POD_IP_ADDRESS:   (v1:status.podIP)
      HADOOP_CONF_DIR:  /opt/hadoop/conf
    Mounts:
      /opt/flink/conf from flink-config-volume (rw)
      /opt/hadoop/conf from hadoop-config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-s47ht (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  hadoop-config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      hadoop-config-flink-session-cluster
    Optional:  false
  flink-config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      flink-config-flink-session-cluster
    Optional:  false
  default-token-s47ht:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-s47ht
    Optional:    false
QoS Class:       Guaranteed
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason       Age                  From               Message
  ----     ------       ----                 ----               -------
  Normal   Scheduled    21m                  default-scheduler  Successfully assigned default/flink-session-cluster-858bd55dff-bzjk2 to minikube
  Warning  FailedMount  21m (x2 over 21m)    kubelet            MountVolume.SetUp failed for volume "flink-config-volume" : configmap "flink-config-flink-session-cluster" not found
  Warning  FailedMount  21m (x2 over 21m)    kubelet            MountVolume.SetUp failed for volume "hadoop-config-volume" : configmap "hadoop-config-flink-session-cluster" not found
  Normal   Pulling      13m (x4 over 21m)    kubelet            Pulling image "flink:1.12.0-scala_2.12-java8"
  Warning  Failed       13m (x4 over 15m)    kubelet            Failed to pull image "flink:1.12.0-scala_2.12-java8": rpc error: code = Unknown desc = Error response from daemon: manifest for flink:1.12.0-scala_2.12-java8 not found: manifest unknown: manifest unknown
  Normal   BackOff      13m (x5 over 15m)    kubelet            Back-off pulling image "flink:1.12.0-scala_2.12-java8"
  Warning  Failed       11m (x5 over 15m)    kubelet            Error: ErrImagePull
  Warning  Failed       100s (x53 over 15m)  kubelet            Error: ImagePullBackOff




At first I suspected that the local image had not been built, so I checked with `docker images`:

REPOSITORY                                             TAG                       IMAGE ID       CREATED        SIZE
flink                                                  1.12.0-scala_2.12-java8   f7dd9b9e020b   12 hours ago   642MB




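So the image does exist, which is odd. Why does pulling the image locally fail? Is something misconfigured? Also, with minikube, how do I reach the dashboard of the Flink cluster running on k8s from a browser on my machine?

This is my first time using k8s, so any pointers are appreciated. Thanks!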









Re: flink 1.12.0 kubernetes-session deployment problem

Yang Wang
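There are two problems in your workflow:

1. The image cannot be found
The cause is probably related to your minikube driver setting. If you use hyperkit or another VM-based driver, you need to minikube ssh into the VM and check whether the image actually exists there.

2. The JM link is not reachable
2020-12-27 22:08:12,387 INFO
org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] - Create
flink session cluster session001 successfully, JobManager Web Interface:
http://192.168.99.100:8081

I suspect this log line was not printed by the command you posted, because that command uses NodePort, and the JM address it prints should not use port 8081.
As long as the session you submit on minikube sets kubernetes.rest-service.exposed.type=NodePort and the JM comes up, the printed JM address will be reachable.

You can also assemble the link by hand: use minikube ip to get the APIServer address, then use kubectl get svc to find the NodePort of the rest svc that belongs to the Flink Session Cluster you created, and put the two together.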
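
A minimal sketch of the two checks described above. It assumes a VM-based minikube driver with Docker as the container runtime inside the VM, and it assumes the rest Service is named `<cluster-id>-rest` (confirm the actual name with kubectl get svc). The helper names image_in_minikube and flink_rest_url are hypothetical; the image tag and cluster id are the ones from the original post.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, values come from the original post.

IMAGE="flink:1.12.0-scala_2.12-java8"
CLUSTER_ID="flink-session-cluster"

# Succeeds if the image is known to the Docker daemon inside the minikube VM.
# That daemon, not the one on the Mac host, is where the kubelet looks for images.
image_in_minikube() {
  [ -n "$(minikube ssh -- docker images -q "$1" | tr -d '[:space:]')" ]
}

# Joins a node IP and a NodePort into the dashboard URL.
flink_rest_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}

# Only touch the cluster when both CLIs are installed.
if command -v minikube >/dev/null 2>&1 && command -v kubectl >/dev/null 2>&1; then
  if ! image_in_minikube "$IMAGE"; then
    echo "$IMAGE not found inside the minikube VM."
    echo "One option: run 'eval \$(minikube docker-env)' and repeat the docker build."
  fi
  # Assumption: the rest Service is called <cluster-id>-rest and exposes a single port.
  node_port="$(kubectl get svc "${CLUSTER_ID}-rest" -o jsonpath='{.spec.ports[0].nodePort}')"
  flink_rest_url "$(minikube ip)" "$node_port"
fi
```

Running `eval $(minikube docker-env)` points the local docker CLI at the daemon inside the VM, so an image built afterwards is visible to the kubelet without being pushed to a registry.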


Best,
Yang

In reply to the message 陈帅 <[hidden email]> sent on Sunday, December 27, 2020 at 10:51 PM.


Re: flink 1.12.0 kubernetes-session deployment problem

casel.chen
In reply to this post by casel.chen
Today I switched to the latest officially released Flink image, version 1.11.3, and it does not start either.
This is my command:
./bin/kubernetes-session.sh \
  -Dkubernetes.cluster-id=rtdp \
  -Dtaskmanager.memory.process.size=4096m \
  -Dkubernetes.taskmanager.cpu=2 \
  -Dtaskmanager.numberOfTaskSlots=4 \
  -Dresourcemanager.taskmanager-timeout=3600000 \
  -Dkubernetes.container.image=flink:1.11.3-scala_2.12-java8 \
  -Dkubernetes.namespace=rtdp



Events:
  Type     Reason          Age                From               Message
  ----     ------          ----               ----               -------
  Normal   Scheduled       88s                default-scheduler  Successfully assigned rtdp/rtdp-6d7794d65d-g6mb5 to cn-shanghai.192.168.16.130
  Warning  FailedMount     88s                kubelet            MountVolume.SetUp failed for volume "flink-config-volume" : configmap "flink-config-rtdp" not found
  Warning  FailedMount     88s                kubelet            MountVolume.SetUp failed for volume "hadoop-config-volume" : configmap "hadoop-config-rtdp" not found
  Normal   AllocIPSucceed  87s                terway-daemon      Alloc IP 192.168.32.25/22 for Pod
  Normal   Pulling         87s                kubelet            Pulling image "flink:1.11.3-scala_2.12-java8"
  Normal   Pulled          31s                kubelet            Successfully pulled image "flink:1.11.3-scala_2.12-java8"
  Normal   Created         18s (x2 over 26s)  kubelet            Created container flink-job-manager
  Normal   Started         18s (x2 over 26s)  kubelet            Started container flink-job-manager
  Normal   Pulled          18s                kubelet            Container image "flink:1.11.3-scala_2.12-java8" already present on machine
  Warning  BackOff         10s                kubelet            Back-off restarting failed container







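Two ConfigMaps are reported as not found here. Do they need to be created in advance? The official documentation does not mention this, or did I miss it?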
https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html#start-flink-session









In reply to the message "陈帅" <[hidden email]> sent at 2020-12-27 22:50:32.


Re: flink 1.12.0 kubernetes-session deployment problem

Yang Wang
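The ConfigMaps do not need to be created in advance. That Warning can be ignored; it is expected, because the deployment is created first and the ConfigMaps afterwards.
You can follow the community documentation [1] to send the JM log to the console and take a look.

I suspect the cause is that you have not created the service account [2].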

[1].
https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html#log-files
[2].
https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html#rbac
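
A minimal sketch of the RBAC step, assuming the session runs under the default service account of the rtdp namespace used in the command above. The function name rbac_binding_command and the generated binding name are hypothetical; the function only prints the kubectl command so it can be reviewed before it is run.

```shell
#!/usr/bin/env bash
# Sketch only: the function name and the binding name are hypothetical.

# Prints the kubectl command that grants the "edit" ClusterRole to a service account.
# $1 = namespace, $2 = service account (defaults to "default").
rbac_binding_command() {
  local ns="$1" sa="${2:-default}"
  printf 'kubectl create clusterrolebinding flink-role-binding-%s-%s --clusterrole=edit --serviceaccount=%s:%s\n' \
    "$ns" "$sa" "$ns" "$sa"
}

# The session in this thread was started with -Dkubernetes.namespace=rtdp.
rbac_binding_command rtdp
```

If a dedicated service account is used instead of default, pass its name as the second argument and check the RBAC section in [2] for the option that tells Flink which account the JobManager should run under.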

Best,
Yang

In reply to the message 陈帅 <[hidden email]> sent on Monday, December 28, 2020 at 5:54 PM.


Re: Re: flink 1.12.0 kubernetes-session deployment problem

casel.chen
In reply to this post by Yang Wang
I set up minikube on a MacBook Pro, with VirtualBox installed. What are the correct steps to start Flink v1.11.3 on K8S?
The steps I followed are:


minikube start
cd /Users/admin/dev/flink-1.11.3
./bin/kubernetes-session.sh
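At this point the image being pulled is shown as flink:1.11.3-scala_2.12, not flink:1.11.3-scala_2.12-java8, the tag that the official Flink repository on Docker Hub provides.
So I ran the command again as: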
./bin/kubernetes-session.sh \
  -Dkubernetes.cluster-id=my-flink-cluster \
  -Dkubernetes.container.image=flink:1.11.3-scala_2.12-java8


After waiting a while for the image to be pulled, get pod shows:



SJ-DN0393:flink-1.11.3 admin$ kubectl get pods
NAME                                               READY   STATUS             RESTARTS   AGE
kubernetes-dashboard-1608509744-6bc8455756-mp47w   1/1     Running            3          10d
my-flink-cluster-77c6f85879-9vcx8                  0/1     CrashLoopBackOff   5          29m




describe pod shows:




Events:
  Type     Reason       Age                    From               Message
  ----     ------       ----                   ----               -------
  Normal   Scheduled    29m                    default-scheduler  Successfully assigned default/my-flink-cluster-77c6f85879-9vcx8 to minikube
  Warning  FailedMount  29m                    kubelet            MountVolume.SetUp failed for volume "flink-config-volume" : configmap "flink-config-my-flink-cluster" not found
  Warning  FailedMount  29m                    kubelet            MountVolume.SetUp failed for volume "hadoop-config-volume" : configmap "hadoop-config-my-flink-cluster" not found
  Normal   Pulling      29m                    kubelet            Pulling image "flink:1.11.3-scala_2.12-java8"
  Normal   Pulled       2m41s (x5 over 4m34s)  kubelet            Container image "flink:1.11.3-scala_2.12-java8" already present on machine
  Normal   Created      2m41s (x5 over 4m33s)  kubelet            Created container flink-job-manager
  Normal   Started      2m41s (x5 over 4m33s)  kubelet            Started container flink-job-manager
  Warning  BackOff      2m8s (x10 over 4m18s)  kubelet            Back-off restarting failed container
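
The events only show that the container keeps restarting, not why. A minimal sketch of one way to get at the reason, assuming the crashed container wrote something to stdout (by default the JM may log to a file under /opt/flink/log instead, in which case this prints little). The helper name crashloop_pods is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: crashloop_pods is a hypothetical helper name.

# Reads `kubectl get pods` output on stdin and prints the names of pods
# whose STATUS column is CrashLoopBackOff.
crashloop_pods() {
  awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Only query the cluster when kubectl is installed.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pods | crashloop_pods | while read -r pod; do
    echo "== output of the previous container instance of $pod =="
    # --previous selects the container instance that exited, not the one being restarted.
    kubectl logs "$pod" --previous
  done
fi
```

Applied to the listing above, the filter selects my-flink-cluster-77c6f85879-9vcx8 and skips the running dashboard pod.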




















In reply to the message "Yang Wang" <[hidden email]> sent at 2020-12-28 10:40:59.

>你整个流程理由有两个问题:
>
>1. 镜像找不到
>原因应该是和minikube的driver设置有关,如果是hyperkit或者其他vm的方式,你需要minikube
>ssh到虚拟机内部查看镜像是否正常存在
>
>2. JM链接无法访问
>2020-12-27 22:08:12,387 INFO
>org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] - Create
>flink session cluster session001 successfully, JobManager Web Interface:
>http://192.168.99.100:8081
>
>我猜你上面的这行log应该不是你贴出来的命令打印的,因为你给的命令是NodePort方式,打印出来的JM地址不应该是8081端口的。
>只要你在minikube上提交的任务加上kubernetes.rest-service.exposed.type=NodePort,并且JM能起来,打印出来的JM地址就是可以访问的
>
>当然你也可以手动拼接出来这个链接,minikube ip拿到APIServer地址,然后用kubectl get svc 去查看你创建的Flink
>Session Cluster对应的rest svc的NodePort,拼起来访问就好了
>
>
>Best,
>Yang
>

Re:Re: flink 1.12.0 kubernetes-session deployment problem

casel.chen
In reply to this post by Yang Wang
Environment: a MacBook Pro with a single-node installation of minikube v1.15.1 and Kubernetes v1.19.4.
I ran the following commands from the Flink v1.11.3 distribution:
kubectl create namespace flink-session-cluster


kubectl create serviceaccount flink -n flink-session-cluster


kubectl create clusterrolebinding flink-role-binding-flink \
  --clusterrole=edit \
  --serviceaccount=flink-session-cluster:flink


./bin/kubernetes-session.sh \
  -Dkubernetes.namespace=flink-session-cluster \
  -Dkubernetes.jobmanager.service-account=flink \
  -Dkubernetes.cluster-id=session001 \
  -Dtaskmanager.memory.process.size=8192m \
  -Dkubernetes.taskmanager.cpu=1 \
  -Dtaskmanager.numberOfTaskSlots=4 \
  -Dresourcemanager.taskmanager-timeout=3600000


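The console output shows the Flink web UI started at http://192.168.64.2:8081 rather than on a five-digit port such as http://192.168.50.135:31753. What is wrong here? Is the host IP here supposed to be the minikube ip? I cannot reach http://192.168.64.2:8081 from my local browser.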



2021-01-02 10:28:04,177 INFO  org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils [] - The derived from fraction jvm overhead memory (160.000mb (167772162 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead

2021-01-02 10:28:04,907 INFO  org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] - Create flink session cluster session001 successfully, JobManager Web Interface: http://192.168.64.2:8081




I checked the pods, service, and deployment; all of them started normally and show green.


Next I submitted a job:
./bin/flink run -d \
  -e kubernetes-session \
  -Dkubernetes.namespace=flink-session-cluster \
  -Dkubernetes.cluster-id=session001 \
  examples/streaming/WindowJoin.jar



Using windowSize=2000, data rate=3

To customize example, use: WindowJoin [--windowSize <window-size-in-millis>] [--rate <elements-per-second>]

2021-01-02 10:21:48,658 INFO  org.apache.flink.kubernetes.KubernetesClusterDescriptor      [] - Retrieve flink cluster session001 successfully, JobManager Web Interface: http://10.106.136.236:8081




I can reach the http://10.106.136.236:8081 shown here from my browser. It shows the job as running, but the available slots field shows 0, and the JM log contains the following error:




Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Could not allocate the required slot within slot request timeout. Please make sure that the cluster has enough resources.
    at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeWrapWithNoResourceAvailableException(DefaultScheduler.java:441) ~[flink-dist_2.12-1.11.3.jar:1.11.3]
    ... 47 more
Caused by: java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_275]
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_275]
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607) ~[?:1.8.0_275]
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) ~[?:1.8.0_275]
    ... 27 more
Caused by: java.util.concurrent.TimeoutException
    ... 25 more


Why is this insufficient-resources error reported? Thanks for any answers!








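On 2020-12-29 09:53:48, "Yang Wang" <[hidden email]> wrote:

>The ConfigMaps do not need to be created in advance. That Warning can be ignored; it is normal, and mainly happens because the deployment is created first and the ConfigMaps afterwards.
>You can follow the community documentation [1] to send the JM log to the console and take a look.
>
>I suspect this is caused by not having created a service account [2].
>
>[1].
>https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html#log-files
>[2].
>https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html#rbac
>
>Best,
>Yang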
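
One way to test the service account suspicion is to ask the API server whether the account the JobManager runs as may create pods, which it needs in order to start TaskManagers. This is a minimal sketch; `sa_can_create_pods` is a hypothetical helper name, and it assumes your own kubectl user is allowed to impersonate service accounts.

```shell
# Ask whether a service account may create pods in its namespace.
# Prints "yes" or "no".
# Usage: sa_can_create_pods <namespace> <service-account>
sa_can_create_pods() {
  local ns="$1"
  local sa="$2"
  # --as impersonates the service account for this single permission check.
  kubectl auth can-i create pods --namespace "$ns" \
    --as "system:serviceaccount:${ns}:${sa}"
}
```

For a session started without any service account option, `sa_can_create_pods default default` would be the check to run. If it prints "no", the RBAC section linked as [2] describes binding the `edit` cluster role to the service account.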
>
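>陈帅 <[hidden email]> wrote on Monday, 2020-12-28 at 5:54 PM:
>
>> Today I switched to the latest officially released Flink image, version 1.11.3, and it does not start either.
>> This is my command: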
>> ./bin/kubernetes-session.sh \
>>   -Dkubernetes.cluster-id=rtdp \
>>   -Dtaskmanager.memory.process.size=4096m \
>>   -Dkubernetes.taskmanager.cpu=2 \
>>   -Dtaskmanager.numberOfTaskSlots=4 \
>>   -Dresourcemanager.taskmanager-timeout=3600000 \
>>   -Dkubernetes.container.image=flink:1.11.3-scala_2.12-java8 \
>>   -Dkubernetes.namespace=rtdp
>>
>>
>>
>> Events:
>>   Type     Reason          Age                From               Message
>>   ----     ------          ----               ----               -------
>>   Normal   Scheduled       88s                default-scheduler  Successfully assigned rtdp/rtdp-6d7794d65d-g6mb5 to cn-shanghai.192.168.16.130
>>   Warning  FailedMount     88s                kubelet            MountVolume.SetUp failed for volume "flink-config-volume" : configmap "flink-config-rtdp" not found
>>   Warning  FailedMount     88s                kubelet            MountVolume.SetUp failed for volume "hadoop-config-volume" : configmap "hadoop-config-rtdp" not found
>>   Normal   AllocIPSucceed  87s                terway-daemon      Alloc IP 192.168.32.25/22 for Pod
>>   Normal   Pulling         87s                kubelet            Pulling image "flink:1.11.3-scala_2.12-java8"
>>   Normal   Pulled          31s                kubelet            Successfully pulled image "flink:1.11.3-scala_2.12-java8"
>>   Normal   Created         18s (x2 over 26s)  kubelet            Created container flink-job-manager
>>   Normal   Started         18s (x2 over 26s)  kubelet            Started container flink-job-manager
>>   Normal   Pulled          18s                kubelet            Container image "flink:1.11.3-scala_2.12-java8" already present on machine
>>   Warning  BackOff         10s                kubelet            Back-off restarting failed container
>>
>>
>>
>>
>>
>>
>>
>> Two ConfigMaps are reported as not found here. Do they need to be created in advance? Do the official docs not mention this, or did I miss it?
>>
>> https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html#start-flink-session

Re: Re: flink 1.12.0 kubernetes-session deployment problem

Yang Wang
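The native mode exposes the REST service through a LoadBalancer by default, so it prints an address that you cannot reach.
You can add -Dkubernetes.rest-service.exposed.type=NodePort to expose it through a NodePort instead.
The address printed by the Flink client will then be correct.

Alternatively, you can use minikube ip to look up the IP address and kubectl get svc to get the NodePort of the svc of the Flink cluster you created,
then combine the two.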
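
The manual approach can be sketched as a small helper. This is only a sketch under two assumptions: that the REST service is named `<cluster-id>-rest`, and that its first listed port is the REST port. `flink_rest_url` is a hypothetical helper name.

```shell
# Build the web UI URL from the minikube IP and the NodePort of the REST service.
# Usage: flink_rest_url <cluster-id> [namespace]
flink_rest_url() {
  local cluster_id="$1"
  local ns="${2:-default}"
  local ip port
  ip="$(minikube ip)"
  # Assumption: the REST service is "<cluster-id>-rest"; verify with `kubectl get svc`.
  port="$(kubectl get svc "${cluster_id}-rest" --namespace "$ns" \
    -o jsonpath='{.spec.ports[0].nodePort}')"
  echo "http://${ip}:${port}"
}
```

For the session in this thread the call would be `flink_rest_url session001 flink-session-cluster`. The NodePort field is only populated when the service type is NodePort or LoadBalancer.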


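As for the NoResourceAvailableException you mentioned, check whether the TaskManager Pod has been created but is stuck in the pending state.
If so, your minikube does not have enough resources; you can give minikube more resources or reduce the resources of the JobManager and TaskManager Pods.
If not, please send the complete JobManager log so that the problem is easier to investigate.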
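
A minimal sketch of the pending check described above; `pending_pods` is a hypothetical helper name.

```shell
# List pods in a namespace that are still in the Pending phase.
# Usage: pending_pods <namespace>
pending_pods() {
  local ns="$1"
  # The field selector filters on the pod phase on the server side.
  kubectl get pods --namespace "$ns" --field-selector=status.phase=Pending
}
```

Running `pending_pods flink-session-cluster` and then `kubectl describe pod` on any TaskManager pod it lists shows the scheduling events, including any complaint about insufficient memory or CPU. A TaskManager requesting 8192m may simply not fit in the minikube VM; in that case either restart minikube with larger `--memory` and `--cpus` values or lower `-Dtaskmanager.memory.process.size`.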


Best,
Yang
