FREM: A Fast Restart Mechanism for General Checkpoint/Restart

  Authors: Li, YW; Lan, ZL  
  Source: Journal, IEEE Transactions on Computers;  Volume/Issue: 2011, vol. 60, no. 5;  Pages: 639-652  
 

[Abstract] As failure rates keep increasing in large systems, the applications running on them restart more frequently than ever. Existing research on checkpoint/restart mainly focuses on optimizing the checkpoint operation and pays little attention to the restart operation. As a result, application restart latency may be substantial, which greatly threatens system dependability and performance. To attack the restart latency problem, we present FREM, a fast restart mechanism for general checkpoint/restart protocols. By dynamically tracking the data accesses of a process after each checkpoint, FREM masks restart latency by overlapping application recovery with the retrieval of its checkpoint image. We have implemented FREM as a prototype system and tested it under Linux environments. Extensive experiments with real applications demonstrate that it reduces restart latency by over 50 percent on average compared to conventional restart mechanisms.
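
The overlap idea in the abstract can be illustrated with a small sketch. This is not the FREM implementation: it is a toy model that assumes a checkpoint image is a dictionary of pages, that a "touch set" of pages accessed soon after the checkpoint has already been recorded, and that `fetch` stands in for a slow image retrieval. The names `restart`, `touch_set`, and `fetch` are all hypothetical.

```python
import threading


def restart(image, touch_set, fetch):
    """Toy model: restore the touch set first, fetch the rest in the background.

    image     -- hypothetical checkpoint image, a dict mapping page id to bytes
    touch_set -- hypothetical list of page ids accessed right after the checkpoint
    fetch     -- hypothetical callable (image, page_id) -> bytes, e.g. a slow read
    """
    memory = {}
    lock = threading.Lock()

    # Phase 1 (blocking): load only the pages the process is expected to
    # need first, so it could resume before the whole image has arrived.
    for page in touch_set:
        memory[page] = fetch(image, page)

    def load_rest():
        # Phase 2 (background): retrieve the remaining pages while the
        # application would already be running.
        for page in image:
            with lock:
                if page in memory:
                    continue
            data = fetch(image, page)
            with lock:
                memory.setdefault(page, data)

    worker = threading.Thread(target=load_rest)
    worker.start()
    return memory, worker
```

In this sketch the caller gets control back as soon as the touch set is loaded and can call `worker.join()` later to wait for the full image. How a real system records the touch set and handles an access to a page that has not yet arrived is outside what the abstract states and is not modeled here.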

 
