Versions:
Flink: flink-1.7.1
Kafka client: flink-connector-kafka-0.11_2.12
Kafka broker: 2.0.0 or 1.0.1
Flink checkpointing is not enabled; the Kafka settings are all defaults, and offsets are stored in Kafka.

The flink-kafka-connector starts with the following consumer parameters:
auto.commit.interval.ms = 5000
auto.offset.reset = earliest
bootstrap.servers = [ip:9092]
check.crcs = true
client.id =
connections.max.idle.ms = 540000
enable.auto.commit = true
exclude.internal.topics = true
fetch.max.bytes = 52428800
fetch.max.wait.ms = 500
fetch.min.bytes = 1
group.id = test_user_action_sdb_java2
heartbeat.interval.ms = 3000
interceptor.classes = null
internal.leave.group.on.close = true
isolation.level = read_uncommitted
key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
max.partition.fetch.bytes = 1048576
max.poll.interval.ms = 300000
max.poll.records = 500
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
receive.buffer.bytes = 65536
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 305000
retry.backoff.ms = 100
sasl.jaas.config = null
sasl.kerberos.kinit.cmd = /usr/bin/kinit
sasl.kerberos.min.time.before.relogin = 60000
sasl.kerberos.service.name = null
sasl.kerberos.ticket.renew.jitter = 0.05
sasl.kerberos.ticket.renew.window.factor = 0.8
sasl.mechanism = GSSAPI
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
session.timeout.ms = 10000
ssl.cipher.suites = null
ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
ssl.endpoint.identification.algorithm = null
ssl.key.password = null
ssl.keymanager.algorithm = SunX509
ssl.keystore.location = null
ssl.keystore.password = null
ssl.keystore.type = JKS
ssl.protocol = TLS
ssl.provider = null
ssl.secure.random.implementation = null
ssl.trustmanager.algorithm = PKIX
ssl.truststore.location = null
ssl.truststore.password = null
ssl.truststore.type = JKS
value.deserializer = class org.apache.kafka.common.serialization.StringDeserializer

Symptom:
After running normally for a while, one partition suddenly fails to commit its offset. Messages are still consumed and processed normally; only the offset commit fails. The problem is hard to reproduce and only occurs occasionally.

Exceptions on the Flink side:
2019-06-03 13:30:56,827 INFO org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Marking the coordinator 1.1.1.1:9092 (id: 2147483030 rack: null) dead for group flink-ad-realtime-useraction
2019-06-03 13:30:56,829 WARN org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Auto-commit of offsets {user_action-96=OffsetAndMetadata{offset=3208384842, metadata=''}, user_action-48=OffsetAndMetadata{offset=3204869414, metadata=''}, user_action-0=OffsetAndMetadata{offset=3208651598, metadata=''}, user_action-120=OffsetAndMetadata{offset=3208633960, metadata=''}, user_action-24=OffsetAndMetadata{offset=3205592887, metadata=''}, user_action-72=OffsetAndMetadata{offset=3209105919, metadata=''}} failed for group flink-ad-realtime-useraction: Offset commit failed with a retriable exception. You should retry committing offsets. The underlying error was: The request timed out.

Exceptions on the Kafka broker side:
None

We have been investigating for a long time but cannot figure out what causes the offset commit failures. Please take a look and help if you can.
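For context, here is a minimal sketch of how such a job is typically wired up with flink-connector-kafka-0.11 (the topic name "user_action" is inferred from the partition names in the log above; the SimpleStringSchema and the class name are assumptions, not taken from the original post):

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

public class UserActionJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Checkpointing is deliberately NOT enabled, matching the setup described
        // above, so offsets are committed by the Kafka client's auto-commit thread
        // (enable.auto.commit=true, every auto.commit.interval.ms=5000 ms).

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "ip:9092");
        props.setProperty("group.id", "test_user_action_sdb_java2");
        props.setProperty("auto.offset.reset", "earliest");

        env.addSource(new FlinkKafkaConsumer011<>(
                "user_action",            // inferred from user_action-96 etc. in the log
                new SimpleStringSchema(), // assumption; matches the StringDeserializer above
                props))
           .print();

        env.execute("user-action");
    }
}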
You are pulling 500 records per poll, and processing them took longer than the consumer timeout, so the coordinator concluded the consumer was dead.
Possible fixes:
1. Lower max.poll.records somewhat.
2. Increase the consumer timeout.

------------------ Original message ------------------
From: "孙福" <[hidden email]>
Sent: Friday, August 16, 2019, 5:27 PM
To: "user-zh" <[hidden email]>
Subject: flink提交offset失败

[quoted original message trimmed; identical to the first post above]
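A sketch of how those two knobs would be set on the properties passed to the Flink consumer (the concrete values are illustrative starting points, not tested recommendations):

import java.util.Properties;

public class TunedConsumerProps {
    static Properties tuned() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "ip:9092");
        props.setProperty("group.id", "test_user_action_sdb_java2");

        // 1. Fetch fewer records per poll so each batch finishes sooner.
        props.setProperty("max.poll.records", "100");         // default: 500

        // 2. Give the consumer more headroom before the group coordinator
        //    considers it dead.
        props.setProperty("session.timeout.ms", "30000");     // default: 10000
        props.setProperty("max.poll.interval.ms", "600000");  // default: 300000
        // session.timeout.ms must stay within the broker's
        // group.min.session.timeout.ms .. group.max.session.timeout.ms range.
        return props;
    }
}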
Hello:
max.poll.interval.ms = 300000 = 5 minutes, and we definitely finish processing 500 records within 5 minutes.

On 2019-08-16 17:34:16, "龚中强" <[hidden email]> wrote:
>You are pulling 500 records per poll, and processing them took longer than the consumer timeout, so the coordinator concluded the consumer was dead.
>
>Possible fixes:
>1. Lower max.poll.records somewhat.
>2. Increase the consumer timeout.
>
>[quoted original message trimmed]
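One way to verify that 500 records really are processed well within max.poll.interval.ms is a standalone consumer (outside Flink) that measures the gap between successive poll() calls. The group.id here is a hypothetical one chosen so the check does not disturb the real job, and this only approximates what the Flink connector does internally:

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PollTimingCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "ip:9092");
        props.setProperty("group.id", "poll-timing-check"); // hypothetical, separate group
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("user_action"));
            long lastPoll = System.currentTimeMillis();
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                long now = System.currentTimeMillis();
                // If this gap ever approached max.poll.interval.ms (300000 ms),
                // the coordinator would be entitled to evict the consumer.
                System.out.printf("poll gap: %d ms, records: %d%n",
                        now - lastPoll, records.count());
                lastPoll = now;
            }
        }
    }
}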