Flink version: flink-1.11
taskmanager memory: 8G
jobmanager memory: 2G
akka.ask.timeout: 20s
akka.retry-gate-closed-for: 5000
client.timeout: 600s

After running for a while, the job fails with "the remote task manager was lost". The error log is as follows:

2020-10-28 00:25:30,608 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Completed checkpoint 411 for job 031e5f122711786fcc11ee6eb47291fa (2703770 bytes in 336 ms).
2020-10-28 00:27:30,273 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Triggering checkpoint 412 (type=CHECKPOINT) @ 1603816050239 for job 031e5f122711786fcc11ee6eb47291fa.
2020-10-28 00:27:30,776 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Completed checkpoint 412 for job 031e5f122711786fcc11ee6eb47291fa (3466688 bytes in 509 ms).
2020-10-28 00:29:30,246 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Triggering checkpoint 413 (type=CHECKPOINT) @ 1603816170239 for job 031e5f122711786fcc11ee6eb47291fa.
2020-10-28 00:29:30,597 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Completed checkpoint 413 for job 031e5f122711786fcc11ee6eb47291fa (2752681 bytes in 334 ms).
2020-10-28 00:29:47,353 WARN akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://[hidden email]:13912] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
2020-10-28 00:29:47,353 WARN akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://[hidden email]:31260] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
2020-10-28 00:29:47,377 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess -> async wait operator -> Map (1/3) (f84731e57528b326ad15ddc17821d1b8) switched from RUNNING to FAILED on org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot@538198b8.
org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: Connection unexpectedly closed by remote task manager 'hadoop01.dev.test.cn/192.168.1.21:7527'. This might indicate that the remote task manager was lost.
    at org.apache.flink.runtime.io.network.netty.CreditBasedPartitionRequestClientHandler.channelInactive(CreditBasedPartitionRequestClientHandler.java:144) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:236) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:81) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.runtime.io.network.netty.NettyMessageClientDecoderDelegate.channelInactive(NettyMessageClientDecoderDelegate.java:97) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:236) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1416) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:912) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:816) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:416) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:331) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[flink-dist_2.11-1.11.1.jar:1.11.1]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_131]
2020-10-28 00:29:47,442 INFO org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionFailoverStrategy [] - Calculating tasks to restart to recover the failed task abf129c3bc11e5b145c2f3103110a0b2_0.
2020-10-28 00:29:47,443 INFO org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionFailoverStrategy [] - 19 tasks should be restarted to recover the failed task abf129c3bc11e5b145c2f3103110a0b2_0.
2020-10-28 00:29:47,444 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job static_order_gmv_by_paydate (031e5f122711786fcc11ee6eb47291fa) switched from state RUNNING to RESTARTING.
2020-10-28 00:29:47,445 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Sink: Unnamed (1/1) (c9d8de20cf8d58d3cd5e9f2dfadd7b70) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,447 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Custom Source -> Flat Map -> Timestamps/Watermarks (2/3) (828066cde4cda22eb4756366eafac229) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,447 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Custom Source -> Flat Map -> Timestamps/Watermarks (1/3) (ae5e40830a57bbd118db2f8ee86a00ae) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,447 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess -> async wait operator -> Map (2/3) (70eb6b6d5a363910f8fd808024d68b8a) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,447 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess -> async wait operator -> Map (3/3) (a42963633bf0a142c082ec0e424666b3) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,448 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Custom Source -> Flat Map -> Timestamps/Watermarks (3/3) (591b6fa2ad487cc2fe91cb9ac5a0d19e) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,448 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess (1/3) (9a35a07b539502ec2d23ec35d3d507db) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,448 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess (2/3) (82734fa6851b2dcd769b34f7d8d1afaa) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,448 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess (3/3) (f13b2ef5feba6b65ad276cf87bdf2218) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,448 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Co-Flat Map (2/3) (6442e15db194a591c32a821e18198686) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,448 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Co-Flat Map (1/3) (6961b6cff72d1c41d8345944d246b433) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,448 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Co-Flat Map (3/3) (41edc64886544d8a542b23074c99f614) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,448 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Window(TumblingEventTimeWindows(120000), EventTimeTrigger, ViewAggregateFunction, ViewSumWindowFunction) (1/3) (e9bd1a3fb4f3d0786831a439189e6240) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,448 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess (1/3) (057233f7fa678b0a54e5c3d682caab24) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,448 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess (2/3) (11cde122ba8a22ef37269c8cd051e079) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,448 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Window(TumblingEventTimeWindows(120000), EventTimeTrigger, ViewAggregateFunction, ViewSumWindowFunction) (3/3) (40b1bb8ce62b6b2062dc68bd63c2f60a) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,449 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Window(TumblingEventTimeWindows(120000), EventTimeTrigger, ViewAggregateFunction, ViewSumWindowFunction) (2/3) (88e1242700ba1d5a9cba5c466f51cac2) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,449 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess (3/3) (e67f3c240663d5949872fa5988568e40) switched from RUNNING to CANCELING.
2020-10-28 00:29:47,452 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Window(TumblingEventTimeWindows(120000), EventTimeTrigger, ViewAggregateFunction, ViewSumWindowFunction) (1/3) (e9bd1a3fb4f3d0786831a439189e6240) switched from CANCELING to CANCELED.
2020-10-28 00:29:47,452 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution e9bd1a3fb4f3d0786831a439189e6240.
2020-10-28 00:29:47,457 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution e9bd1a3fb4f3d0786831a439189e6240.
2020-10-28 00:29:47,459 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess (1/3) (057233f7fa678b0a54e5c3d682caab24) switched from CANCELING to CANCELED.
2020-10-28 00:29:47,459 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 057233f7fa678b0a54e5c3d682caab24.
2020-10-28 00:29:47,460 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 057233f7fa678b0a54e5c3d682caab24.
2020-10-28 00:29:47,460 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Co-Flat Map (1/3) (6961b6cff72d1c41d8345944d246b433) switched from CANCELING to CANCELED.
2020-10-28 00:29:47,460 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 6961b6cff72d1c41d8345944d246b433.
2020-10-28 00:29:47,460 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 6961b6cff72d1c41d8345944d246b433.
2020-10-28 00:29:47,461 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess (1/3) (9a35a07b539502ec2d23ec35d3d507db) switched from CANCELING to CANCELED.
2020-10-28 00:29:47,461 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 9a35a07b539502ec2d23ec35d3d507db.
2020-10-28 00:29:47,461 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 9a35a07b539502ec2d23ec35d3d507db.
2020-10-28 00:29:47,517 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Custom Source -> Flat Map -> Timestamps/Watermarks (3/3) (591b6fa2ad487cc2fe91cb9ac5a0d19e) switched from CANCELING to CANCELED.
2020-10-28 00:29:47,566 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Sink: Unnamed (1/1) (c9d8de20cf8d58d3cd5e9f2dfadd7b70) switched from CANCELING to CANCELED.
2020-10-28 00:29:47,567 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess (3/3) (f13b2ef5feba6b65ad276cf87bdf2218) switched from CANCELING to CANCELED.
2020-10-28 00:29:47,568 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Co-Flat Map (3/3) (41edc64886544d8a542b23074c99f614) switched from CANCELING to CANCELED.
2020-10-28 00:29:47,568 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess (3/3) (e67f3c240663d5949872fa5988568e40) switched from CANCELING to CANCELED.
2020-10-28 00:29:47,569 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Window(TumblingEventTimeWindows(120000), EventTimeTrigger, ViewAggregateFunction, ViewSumWindowFunction) (3/3) (40b1bb8ce62b6b2062dc68bd63c2f60a) switched from CANCELING to CANCELED.
2020-10-28 00:29:47,570 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Custom Source -> Flat Map -> Timestamps/Watermarks (1/3) (ae5e40830a57bbd118db2f8ee86a00ae) switched from CANCELING to CANCELED.
2020-10-28 00:29:47,594 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess -> async wait operator -> Map (3/3) (a42963633bf0a142c082ec0e424666b3) switched from CANCELING to CANCELED.
2020-10-28 00:29:50,845 INFO org.apache.flink.yarn.YarnResourceManager [] - Closing TaskExecutor connection container_1591067037248_153639_01_000003 because: Container killed on request. Exit code is 137
Container exited with a non-zero exit code 137
Killed by external signal
2020-10-28 00:29:50,846 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Custom Source -> Flat Map -> Timestamps/Watermarks (2/3) (828066cde4cda22eb4756366eafac229) switched from CANCELING to CANCELED.
2020-10-28 00:29:50,846 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 828066cde4cda22eb4756366eafac229.
2020-10-28 00:29:50,846 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 828066cde4cda22eb4756366eafac229.
2020-10-28 00:29:50,846 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess -> async wait operator -> Map (2/3) (70eb6b6d5a363910f8fd808024d68b8a) switched from CANCELING to CANCELED.
2020-10-28 00:29:50,847 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 70eb6b6d5a363910f8fd808024d68b8a.
2020-10-28 00:29:50,847 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 70eb6b6d5a363910f8fd808024d68b8a.
2020-10-28 00:29:50,847 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess (2/3) (11cde122ba8a22ef37269c8cd051e079) switched from CANCELING to CANCELED.
2020-10-28 00:29:50,847 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 11cde122ba8a22ef37269c8cd051e079.
2020-10-28 00:29:50,847 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 11cde122ba8a22ef37269c8cd051e079.
2020-10-28 00:29:50,847 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Window(TumblingEventTimeWindows(120000), EventTimeTrigger, ViewAggregateFunction, ViewSumWindowFunction) (2/3) (88e1242700ba1d5a9cba5c466f51cac2) switched from CANCELING to CANCELED.
2020-10-28 00:29:50,847 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 88e1242700ba1d5a9cba5c466f51cac2.
2020-10-28 00:29:50,847 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 88e1242700ba1d5a9cba5c466f51cac2.
2020-10-28 00:29:50,847 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Co-Flat Map (2/3) (6442e15db194a591c32a821e18198686) switched from CANCELING to CANCELED.
2020-10-28 00:29:50,847 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 6442e15db194a591c32a821e18198686.
2020-10-28 00:29:50,847 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 6442e15db194a591c32a821e18198686.
2020-10-28 00:29:50,847 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - KeyedProcess (2/3) (82734fa6851b2dcd769b34f7d8d1afaa) switched from CANCELING to CANCELED.
2020-10-28 00:29:50,847 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 82734fa6851b2dcd769b34f7d8d1afaa.
2020-10-28 00:29:50,847 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Discarding the results produced by task execution 82734fa6851b2dcd769b34f7d8d1afaa.
2020-10-28 00:29:50,850 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job static_order_gmv_by_paydate (031e5f122711786fcc11ee6eb47291fa) switched from state RESTARTING to RUNNING.
2020-10-28 00:29:50,851 INFO org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore [] - Recovering checkpoints from ZooKeeper.
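
The "Exit code is 137" reported by YarnResourceManager in the log above follows the POSIX shell convention of 128 plus the signal number, so it corresponds to signal 9 (SIGKILL), which matches "Killed by external signal". A minimal sketch for decoding such codes follows; `describe_exit_code` is a hypothetical helper name, and the log itself does not say which process sent the signal or why.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret an exit status using the shell convention 128 + signal number."""
    if code > 128:
        signum = code - 128
        try:
            # Map the number to its symbolic name on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with status {code}"


# The container in the log above exited with 137.
print(describe_exit_code(137))
```

A SIGKILL from outside the JVM leaves no Java stack trace in the TaskManager log, so the NodeManager log and the kernel log on the host named in the exception are likely better places to look for the sender.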
I usually allocate resources on the order of 80G or 100G...
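
For a sense of scale, the sketch below estimates how Flink 1.11 would split a TaskManager's memory under its documented default fractions. It rests on two assumptions that this thread does not confirm: that the 8G in the original post is `taskmanager.memory.process.size`, and that no other `taskmanager.memory.*` option is overridden. `breakdown_mib` and `clamp` are made-up helper names.

```python
def clamp(value, low, high):
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(high, value))


def breakdown_mib(process_mib):
    """Estimate TaskManager memory components in MiB from the total process size.

    Constants are the Flink 1.11 defaults (assumed, not taken from the thread).
    """
    # JVM overhead: 10% of process memory, bounded to 192 MiB .. 1 GiB.
    jvm_overhead = clamp(0.1 * process_mib, 192, 1024)
    jvm_metaspace = 256
    flink_total = process_mib - jvm_overhead - jvm_metaspace
    # Network buffers: 10% of total Flink memory, bounded to 64 MiB .. 1 GiB.
    network = clamp(0.1 * flink_total, 64, 1024)
    # Managed (off-heap) memory: 40% of total Flink memory.
    managed = 0.4 * flink_total
    framework_heap = 128
    framework_off_heap = 128
    task_off_heap = 0
    # Task heap is whatever remains of total Flink memory.
    task_heap = (flink_total - network - managed
                 - framework_heap - framework_off_heap - task_off_heap)
    return {
        "jvm_overhead": jvm_overhead,
        "jvm_metaspace": jvm_metaspace,
        "network": network,
        "managed": managed,
        "framework_heap": framework_heap,
        "framework_off_heap": framework_off_heap,
        "task_off_heap": task_off_heap,
        "task_heap": task_heap,
    }


# Breakdown for the 8G setting from the original post.
for name, mib in breakdown_mib(8 * 1024).items():
    print(f"{name:>20}: {mib:8.1f} MiB")
```

Whether memory is actually behind the exit code 137 is not established by the log; the numbers only show how much of the 8G would be heap versus off-heap under those assumptions.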
guanxianchun <[hidden email]> wrote on Wed, Oct 28, 2020 at 5:02 PM:
> Flink version: flink-1.11
> taskmanager memory: 8G
> jobmanager memory: 2G
> akka.ask.timeout: 20s
> akka.retry-gate-closed-for: 5000
> client.timeout: 600s
>
> After running for a while, the job fails with "the remote task manager was lost". The error log is as follows:
> [...]
You could take a look at the TM (TaskManager) log for the remote task and see whether it shows any exceptions.
Best,
Congxian

赵一旦 <[hidden email]> wrote on Wed, Dec 2, 2020 at 6:17 PM:
> I always allocate resources on the order of 80G or 100G...
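As a rough sketch of how one might follow the suggestion to check the TaskManager log: the JobManager log quoted in this thread names the lost container and its exit code, and on YARN the container's own log can usually be fetched with `yarn logs`, assuming log aggregation is enabled (the exact flags can vary between Hadoop versions). The Python helper below is only an illustration; the function names are made up for this example. It derives the application id from the container id and decodes the exit code. Exit code 137 is 128 + 9, i.e. the process received signal 9 (SIGKILL); a common cause is the container exceeding its memory limit or the OS OOM killer, but the thread does not confirm which one applied here.

```python
import re

# Sample line copied from the JobManager log quoted in this thread.
JM_LOG_LINE = (
    "2020-10-28 00:29:50,845 INFO org.apache.flink.yarn.YarnResourceManager [] - "
    "Closing TaskExecutor connection container_1591067037248_153639_01_000003 "
    "because: Container killed on request. Exit code is 137"
)

# YARN container ids look like container_[e<epoch>_]<clusterTs>_<appSeq>_<attempt>_<n>.
CONTAINER_RE = re.compile(r"container_(?:e\d+_)?(\d+)_(\d+)_\d+_\d+")
EXIT_CODE_RE = re.compile(r"Exit code is (\d+)")


def describe_exit_code(code: int) -> str:
    """Exit codes above 128 conventionally mean 'terminated by signal (code - 128)'."""
    if code > 128:
        return f"killed by signal {code - 128}"
    return f"exited with status {code}"


def yarn_logs_command(log_line: str) -> str:
    """Build a `yarn logs` invocation for the container named in a log line."""
    match = CONTAINER_RE.search(log_line)
    if match is None:
        raise ValueError("no YARN container id found in log line")
    cluster_ts, app_seq = match.group(1), match.group(2)
    app_id = f"application_{cluster_ts}_{app_seq}"
    return f"yarn logs -applicationId {app_id} -containerId {match.group(0)}"


if __name__ == "__main__":
    exit_match = EXIT_CODE_RE.search(JM_LOG_LINE)
    if exit_match:
        # On Linux, signal 9 is SIGKILL.
        print(describe_exit_code(int(exit_match.group(1))))
    print(yarn_logs_command(JM_LOG_LINE))
```

If the fetched TaskManager log ends abruptly without an exception, that would be consistent with an external kill, and the NodeManager log on that host is the next place to look.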