Flink 1.11.1 consuming from Kafka 0.10.1.1, followed by a windowed deduplicated count. The job uses event time and the window size is 1 minute.

The program is structured roughly as follows:

    kafkaStream.keyBy(<keyselector>).window(<windowassigner>).aggregate(new AverageAggregate());

The job runs as Flink on YARN. It runs, but it cannot checkpoint. Looking at the TaskManager log, I found the error below. I checked, and the nodes involved are all in the normal RUNNING state. If I remove the window aggregation code and just print kafkaStream, the job checkpoints normally. I can't see any other problem in the logs and I'm completely stuck. Any help would be appreciated.

2020-11-18 13:30:52,475 INFO org.apache.kafka.common.utils.AppInfoParser [] - Kafka version : 0.10.2.2
2020-11-18 13:30:52,475 INFO org.apache.kafka.common.utils.AppInfoParser [] - Kafka commitId : cd80bc412b9b9701
2020-11-18 13:31:09,668 INFO org.apache.hadoop.io.retry.RetryInvocationHandler [] - org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not supported in state standby. Visit https://s.apache.org/sbnn-error
    at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:88)
    at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1952)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1423)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:776)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:475)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
, while invoking ClientNamenodeProtocolTranslatorPB.create over xxx:8020 after 1 failover attempts. Trying to failover after sleeping for 864ms.

-- kingdomad
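
For readers unfamiliar with the job shape described in the post, here is a minimal plain-Java sketch of what a 1-minute event-time tumbling window with a deduplicated count computes per key. It deliberately does not use the Flink API, and it ignores watermarks and late data; the class name `WindowedDistinctCount`, the method names, and the sample events are all hypothetical and are not taken from the poster's program.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only: not the poster's code and not Flink's API.
public class WindowedDistinctCount {
    // Window size from the post: 1 minute, in milliseconds.
    static final long WINDOW_MS = 60_000L;

    // Start of the tumbling window (zero offset) that contains the event time.
    static long windowStart(long eventTimeMs) {
        return eventTimeMs - Math.floorMod(eventTimeMs, WINDOW_MS);
    }

    // Counts distinct values per (key, window). Result keys look like "key@windowStart".
    static Map<String, Integer> distinctCounts(String[] keys, long[] eventTimesMs, String[] values) {
        Map<String, Set<String>> seen = new TreeMap<>();
        for (int i = 0; i < keys.length; i++) {
            String bucket = keys[i] + "@" + windowStart(eventTimesMs[i]);
            seen.computeIfAbsent(bucket, b -> new HashSet<>()).add(values[i]);
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : seen.entrySet()) {
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample events: key, event time in ms, value.
        String[] keys = {"a", "a", "a", "a", "b"};
        long[] times = {1_000L, 30_000L, 59_999L, 60_000L, 61_000L};
        String[] values = {"x", "y", "x", "y", "x"};
        System.out.println(distinctCounts(keys, times, values));
    }
}
```

In a real Flink job the per-window sets would live in keyed window state, which is what gets written to the checkpoint storage when a checkpoint is taken.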
The problem seems to be solved.

With FlinkKafkaConsumer010 from flink-connector-kafka-0.10_2.12, the job cannot checkpoint and reports this error. After switching to FlinkKafkaConsumer from flink-connector-kafka_2.12, checkpointing works normally and no error is reported. The behavior is the same whether CheckpointingMode is EXACTLY_ONCE or AT_LEAST_ONCE. I don't yet know the cause.

-- kingdomad

On 2020-11-18 17:19:29, "kingdomad" <[hidden email]> wrote:
> Flink 1.11.1 consuming from Kafka 0.10.1.1, followed by a windowed deduplicated count.
> [...]
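
The connector swap described in the reply could look like the following dependency change. This is only a sketch: it assumes a Maven build and assumes the connector version matches the Flink version from the post (1.11.1); neither detail is stated in the thread.

```xml
<!-- Before: Kafka 0.10-specific connector, which provides FlinkKafkaConsumer010 -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-connector-kafka-0.10_2.12</artifactId>
  <version>1.11.1</version> <!-- assumed to match the Flink version in the post -->
</dependency>

<!-- After: universal Kafka connector, which provides FlinkKafkaConsumer -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-connector-kafka_2.12</artifactId>
  <version>1.11.1</version> <!-- assumed to match the Flink version in the post -->
</dependency>
```

The source code then constructs FlinkKafkaConsumer instead of FlinkKafkaConsumer010. As the reply notes, why this change makes checkpointing succeed was not determined in the thread.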