HDFS_DELEGATION_TOKEN automatic expiration problem

HDFS_DELEGATION_TOKEN automatic expiration problem

hss
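Hi all,

Our Hadoop cluster has Kerberos authentication enabled, and we submit jobs with Flink on YARN in per-job mode. Once a job has been running for more than seven days, the HDFS_DELEGATION_TOKEN expires automatically and checkpoints fail. Has anyone run into this problem?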
 

2019-12-02 00:00:00.283 ERROR org.apache.flink.yarn.YarnResourceManager                     - Could not start TaskManager in container container_e39_1563434037485_0606_01_552751.
 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (token for BDATA_UME_ADM: HDFS_DELEGATION_TOKEN owner=[hidden email], renewer=yarn, realUser=, issueDate=1574414126899, maxDate=1575018926899, sequenceNumber=800, masterKeyId=225) can't be found in cache
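
For anyone reading the exception above: issueDate and maxDate are epoch timestamps in milliseconds, so the gap between them can be checked directly. The following is a minimal Python sketch (the variable names are illustrative, not from any Flink or Hadoop API) that reproduces the seven-day window using the two values from the log. Seven days also happens to be the default of the HDFS setting dfs.namenode.delegation.token.max-lifetime, though whether this cluster uses the default is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the InvalidToken message above (epoch milliseconds).
issue_ms = 1574414126899
max_ms = 1575018926899

# Maximum lifetime of the token: the gap between issueDate and maxDate.
lifetime = timedelta(milliseconds=max_ms - issue_ms)

# Convert both timestamps to UTC datetimes for easier reading.
issued = datetime.fromtimestamp(issue_ms / 1000, tz=timezone.utc)
expires = datetime.fromtimestamp(max_ms / 1000, tz=timezone.utc)

print("lifetime:", lifetime)  # lifetime: 7 days, 0:00:00
print("issued:  ", issued.isoformat())
print("expires: ", expires.isoformat())
```

The error was logged after maxDate had passed, which is consistent with the token having reached its maximum lifetime rather than a renewal simply being missed.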

Re: HDFS_DELEGATION_TOKEN automatic expiration problem

Paul Lam
Hi,

You need to ship the keytab to the cluster together with the job. See the descriptions of the two configuration options security.kerberos.login.principal and security.kerberos.login.keytab [1].

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/config.html#kerberos-based-security
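
As a rough illustration of the two options mentioned above, a flink-conf.yaml fragment might look like the sketch below. The keytab path and principal are placeholders, not values from this thread, and the use-ticket-cache line is an optional extra shown on the assumption that a cached ticket should not be relied on for a long-running job.

```yaml
# Hypothetical values: replace the path and principal with your own.
security.kerberos.login.keytab: /path/to/user.keytab
security.kerberos.login.principal: user@EXAMPLE.COM
# Optional: do not fall back to the local ticket cache.
security.kerberos.login.use-ticket-cache: false
```

With a keytab configured, Flink can log in again from the keytab instead of depending only on the delegation token obtained at submission time.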

Best,
Paul Lam

> On Dec 10, 2019, at 11:03, hss <[hidden email]> wrote:
>
> Hi all,
>
> Our Hadoop cluster has Kerberos authentication enabled, and we submit jobs with Flink on YARN in per-job mode. Once a job has been running for more than seven days, the HDFS_DELEGATION_TOKEN expires automatically and checkpoints fail. Has anyone run into this problem?
>
> 2019-12-02 00:00:00.283 ERROR org.apache.flink.yarn.YarnResourceManager - Could not start TaskManager in container container_e39_1563434037485_0606_01_552751.
>
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (token for BDATA_UME_ADM: HDFS_DELEGATION_TOKEN owner=[hidden email], renewer=yarn, realUser=, issueDate=1574414126899, maxDate=1575018926899, sequenceNumber=800, masterKeyId=225) can't be found in cache