Hi all,

My Flink cluster is deployed on YARN as a session cluster. This morning the JobManager exited on its own and YARN then restarted it, which caused the jobs running in it to restart. However, when I checked the logs, the JobManager log showed no exceptions, and the JobManager had no long full GCs or frequent GCs either. The JobManager log is below. At 06:44 the log records that a stop request was received, and then the JobManager simply shut down. What could have caused this?

2019-08-06 06:43:58,891 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 7843 for job e49624208fe771c4c9527799fd46f2a3 (5645215 bytes in 801 ms).
2019-08-06 06:43:59,336 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 7852 @ 1565045039321 for job a9a7464ead55474bea6f42ed8e5de60f.
2019-08-06 06:44:00,971 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 7852 @ 1565045040957 for job 79788b218e684cb31c1ca0fcc641e89f.
2019-08-06 06:44:01,357 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 7852 for job a9a7464ead55474bea6f42ed8e5de60f (25870658 bytes in 1806 ms).
2019-08-06 06:44:02,887 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 7852 for job 79788b218e684cb31c1ca0fcc641e89f (29798945 bytes in 1849 ms).
2019-08-06 06:44:05,101 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 7852 @ 1565045045092 for job 03f3a0bd53c21f90f70ea01916dc9f78.
2019-08-06 06:44:06,547 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 7844 @ 1565045046522 for job 486a1949d75863f823013d87b509d228.
2019-08-06 06:44:07,311 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 7844 for job 486a1949d75863f823013d87b509d228 (62458942 bytes in 736 ms).
2019-08-06 06:44:07,506 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 7852 for job 03f3a0bd53c21f90f70ea01916dc9f78 (105565032 bytes in 2366 ms).
2019-08-06 06:44:08,087 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 7853 @ 1565045048055 for job 32783d371464265ef536454055ae6182.
2019-08-06 06:44:09,626 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Checkpoint 7050 of job 4b542195824ff7b7cdf749543fd368cb expired before completing.
2019-08-06 06:44:09,647 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 7051 @ 1565045049626 for job 4b542195824ff7b7cdf749543fd368cb.
2019-08-06 06:44:12,006 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 7853 for job 32783d371464265ef536454055ae6182 (299599482 bytes in 3912 ms).
2019-08-06 06:44:12,972 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 7853 @ 1565045052962 for job 16db5afe9a8cd7c6278030d5dec4c80c.
2019-08-06 06:44:13,109 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 7853 @ 1565045053080 for job 9c1394a2d2ff47c7852eff9f1f932535.
2019-08-06 06:44:16,779 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 7853 for job 16db5afe9a8cd7c6278030d5dec4c80c (152643149 bytes in 3666 ms).
2019-08-06 06:44:18,598 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 7828 for job 8df2b47f2a4c1ba0f7019ee5989f6e71 (837558245 bytes in 23472 ms).
2019-08-06 06:44:19,193 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 7853 for job 9c1394a2d2ff47c7852eff9f1f932535 (594628825 bytes in 6067 ms).
2019-08-06 06:44:19,238 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 5855 for job 108ce7f6f5f3e76b12fad9dbdbc8feba (45917615 bytes in 61819 ms).
2019-08-06 06:44:19,248 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 5856 @ 1565045059238 for job 108ce7f6f5f3e76b12fad9dbdbc8feba.
2019-08-06 06:44:22,092 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 7802 @ 1565045062084 for job 430689e0f202fcb29ce9d6403e6825f9.
2019-08-06 06:44:22,838 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 2940 for job fea51fd74006de69e265adc13e802229 (122562953 bytes in 174336 ms).
2019-08-06 06:44:22,888 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 2941 @ 1565045062838 for job fea51fd74006de69e265adc13e802229.
2019-08-06 06:44:24,348 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 613 @ 1565045064328 for job 5a75d77312f29c714af0a2994f0e8b1a.
2019-08-06 06:44:25,327 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 7802 for job 430689e0f202fcb29ce9d6403e6825f9 (358649788 bytes in 2788 ms).
2019-08-06 06:44:25,769 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 613 for job 5a75d77312f29c714af0a2994f0e8b1a (583594 bytes in 1341 ms).
2019-08-06 06:44:27,547 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 7844 @ 1565045067534 for job fb32bbf35ed002961b9dfb1417799ae6.
2019-08-06 06:44:28,738 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 7844 for job fb32bbf35ed002961b9dfb1417799ae6 (11017757 bytes in 1178 ms).
2019-08-06 06:44:37,576 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 7853 @ 1565045077573 for job d73c940cf0a996e12ecb93a146f93293.
2019-08-06 06:44:38,167 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 7853 for job d73c940cf0a996e12ecb93a146f93293 (123726 bytes in 562 ms).
2019-08-06 06:44:45,957 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
2019-08-06 06:44:45,957 INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache
Hi,

You can check the YARN logs on the node where the JobManager was running and search for why the corresponding container was killed.

Regards

On 2019/8/6, 11:40 AM, "戴嘉诚" <[hidden email]> wrote:
> 2019-08-06 06:44:45,957 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
Hello,

> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.

This means the process received signal 15 [1]. Wong is right: search the YARN NodeManager or YARN ResourceManager logs.

1. https://access.redhat.com/solutions/737033

Thanks,
Biao /'bɪ.aʊ/

On Tue, Aug 6, 2019 at 12:30 PM Wong Victor <[hidden email]> wrote:
> Hi,
> You can check the YARN logs on the node where the JobManager was running and search for why the corresponding container was killed.
>
> Regards
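The suggestion above, searching the YARN NodeManager or ResourceManager logs for the reason the container was killed, could be sketched roughly as follows. This is only an illustration: the function name `find_container_events` and the keyword list are assumptions made for this example, and the exact wording of YARN log messages varies between versions, so the pattern will likely need adjusting for a real cluster.

```python
import re
from typing import Iterable, List

# Hypothetical keyword list, chosen for illustration only.
# Real YARN messages differ between versions; adjust as needed.
KILL_HINTS = re.compile(
    r"kill|SIGTERM|exit code|beyond .* memory limits|preempt",
    re.IGNORECASE,
)


def find_container_events(lines: Iterable[str], container_id: str) -> List[str]:
    """Return the log lines that mention the given container ID
    and also contain one of the kill-related keywords."""
    return [
        line.rstrip("\n")
        for line in lines
        if container_id in line and KILL_HINTS.search(line)
    ]


# Usage sketch (path and container ID are placeholders, not real values):
#
#   with open("/path/to/nodemanager.log", errors="replace") as fh:
#       for hit in find_container_events(fh, "container_..."):
#           print(hit)
```

If log aggregation is enabled on the cluster, `yarn logs -applicationId <application ID>` may be a convenient way to fetch the container logs first. The NodeManager and ResourceManager daemon logs themselves usually have to be read on the hosts where those daemons run.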
Hello,

Thanks! I have found the cause.

From: Biao Liu
Sent: 2019-08-06 13:55
To: user-zh
Subject: Re: JobManager log anomaly

> This means the process received signal 15 [1]. Wong is right: search the YARN NodeManager or YARN ResourceManager logs.