Hello, I have two questions.

1: Every time I restart the service, the chk- directories under the checkpoint directory start again from chk-1, chk-2, ... instead of continuing from the last number.

2: After restarting the service, the state data is not restored from the checkpoint.

Below is my configuration. I am debugging locally, on a single machine.

    final StreamExecutionEnvironment streamExecutionEnvironment =
            StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(conf);

    // StateBackend stateBackend = new RocksDBStateBackend("hdfs://10.100.51.101:9000/flink/checkpoints", true);
    StateBackend stateBackend = new FsStateBackend("file:///flink/checkpoints");
    // StateBackend stateBackend = new MemoryStateBackend();
    streamExecutionEnvironment.setStateBackend(stateBackend);

    streamExecutionEnvironment.enableCheckpointing(1000);
    streamExecutionEnvironment.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
    streamExecutionEnvironment.getCheckpointConfig().setMinPauseBetweenCheckpoints(500);
    streamExecutionEnvironment.getCheckpointConfig().setCheckpointTimeout(60000);
    streamExecutionEnvironment.getCheckpointConfig().setMaxConcurrentCheckpoints(1);
    streamExecutionEnvironment.getCheckpointConfig()
            .enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
When you start the service again, you need to specify the path of the checkpoint to restore from. Here you have only configured the directory that checkpoints are written to.

On 2020-09-03 16:03:41, "sun" <[hidden email]> wrote:
> Hello, I have two questions
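To make the reply above concrete: retained checkpoints are normally laid out as `<checkpoint-dir>/<job-id>/chk-<n>/_metadata`, and a job started without a restore path is a new job with a new job ID, which is consistent with the numbering beginning at chk-1 again. The sketch below is a plain-JDK helper that looks for the newest complete checkpoint under such a root so that its path can be passed as the restore path. `LatestCheckpointFinder` and `findLatest` are hypothetical names for illustration and are not part of Flink; "newest" is assumed to mean the most recently modified `_metadata` file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Optional;
import java.util.stream.Stream;

/** Hypothetical helper (not a Flink class): finds the newest complete retained checkpoint. */
final class LatestCheckpointFinder {

    private LatestCheckpointFinder() {}

    /**
     * Scans root/<job-id>/chk-<n> and returns the chk-<n> directory whose
     * _metadata file was modified most recently. Directories without a
     * _metadata file are treated as incomplete and skipped.
     */
    static Optional<Path> findLatest(Path checkpointRoot) {
        if (!Files.isDirectory(checkpointRoot)) {
            return Optional.empty();
        }
        // Depth 2 covers root (0), <job-id> (1) and chk-<n> (2).
        try (Stream<Path> paths = Files.walk(checkpointRoot, 2)) {
            return paths
                    .filter(Files::isDirectory)
                    .filter(p -> p.getFileName() != null
                            && p.getFileName().toString().matches("chk-\\d+"))
                    .filter(p -> Files.isRegularFile(p.resolve("_metadata")))
                    .max(Comparator
                            .comparingLong(LatestCheckpointFinder::metadataModifiedMillis)
                            // Tie-break on the checkpoint number for equal timestamps.
                            .thenComparingLong(LatestCheckpointFinder::checkpointNumber));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Last-modified time of the _metadata file inside the given chk-<n> directory. */
    private static long metadataModifiedMillis(Path chkDir) {
        try {
            return Files.getLastModifiedTime(chkDir.resolve("_metadata")).toMillis();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Parses <n> out of a directory named chk-<n>. */
    private static long checkpointNumber(Path chkDir) {
        return Long.parseLong(chkDir.getFileName().toString().substring("chk-".length()));
    }
}
```

With the configuration from the question, the root would be `/flink/checkpoints`. On a cluster the resulting path can be handed to `flink run -s <path>`. For a local environment, one option to try is setting the `execution.savepoint.path` key on the `Configuration` passed to `createLocalEnvironmentWithWebUI`, though whether it is honoured there may depend on the Flink version.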
What?
------------------ Original message ------------------
From: "user-zh" <[hidden email]>;
Sent: Thursday, September 3, 2020, 4:10 PM
To: "user-zh" <[hidden email]>;
Subject: Re: Unable to restore state from checkpoint
In reply to this post by 程龙
/opt/flink/bin/flink run -d -s /opt/flink/savepoints -c com.xxx.flink.ohlc.kafka.OrderTickCandleView /home/service-ohlc-*-SNAPSHOT.jar

I already specify this directory when starting the job, but it fails with the following error:

Caused by: org.apache.flink.runtime.client.JobExecutionException: Could not instantiate JobManager.
    at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$createJobManagerRunner$6(Dispatcher.java:398)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    ... 6 more
Caused by: java.io.FileNotFoundException: Cannot find meta data file '_metadata' in directory '/opt/flink/savepoints'. Please try to load the checkpoint/savepoint directly from the metadata file instead of the directory.

--
Sent from: http://apache-flink.147419.n8.nabble.com/
Hi
Judging from the error, the path you specified is a directory that does not contain a _metadata file. That means it is not a complete checkpoint/savepoint, so it cannot be used for restoring.

Best,
Congxian

[hidden email] <[hidden email]> wrote on Tuesday, October 27, 2020 at 4:06 PM:
> Caused by: java.io.FileNotFoundException: Cannot find meta data file
> '_metadata' in directory '/opt/flink/savepoints'.
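The rule described in this reply can be checked before submitting the job. The following is a minimal sketch, assuming only what the error message states: the restore path must either be a metadata file itself or a directory that directly contains `_metadata`. `RestorePathCheck` and `resolveMetadata` are hypothetical names, and this is a simplified stand-in rather than Flink's own validation code.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/** Hypothetical pre-flight check (not a Flink class) for a path passed to `flink run -s`. */
final class RestorePathCheck {

    /** File name quoted in the FileNotFoundException from the thread. */
    static final String METADATA_FILE_NAME = "_metadata";

    private RestorePathCheck() {}

    /**
     * Returns the metadata file that the restore path points to, or empty if
     * the path is a directory without a _metadata file (the failing case in
     * the thread) or does not exist at all.
     */
    static Optional<Path> resolveMetadata(Path restorePath) {
        // A regular file is assumed to be the metadata file itself.
        if (Files.isRegularFile(restorePath)) {
            return Optional.of(restorePath);
        }
        // Otherwise expect the file directly inside the given directory.
        Path candidate = restorePath.resolve(METADATA_FILE_NAME);
        return Files.isRegularFile(candidate) ? Optional.of(candidate) : Optional.empty();
    }
}
```

For the command in the question, `resolveMetadata` on `/opt/flink/savepoints` would come back empty if that directory only holds per-savepoint subdirectories. In that case the `-s` argument likely needs to point one level deeper, at the subdirectory that actually holds `_metadata`.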