tutorials:checkpointing_overview

Differences

This shows you the differences between two versions of the page.

tutorials:checkpointing_overview [2025/11/11 18:51] – created, initial formatting ibchadmin
tutorials:checkpointing_overview [2025/11/14 21:27] (current) – [Task-based Checkpointing] ibchadmin
Line 7: Line 7:
 The major software challenges of designing an intermittent device include:
  
-  * Ensuring proper (re)initialization of system state and peripherals after power loss: even a device using a matched operation approach may encounter extended periods of power loss (e.g. a device harvesting solar energy at night), so an intermittent device should be able to properly resume operation depending on its previous state and the length of power loss.
+  * **Ensuring proper (re)initialization of system state and peripherals after power loss**: even a device using a matched operation approach may encounter extended periods of power loss (e.g. a device harvesting solar energy at night), so an intermittent device should be able to properly resume operation depending on its previous state and the length of power loss.
-  * Ensuring forward application progress: some loss of progress is generally unavoidable on a device with unpredictable power.  However, a device should be able to ensure that some progress is made on each power cycle, or at least have the ability to recognize when it is stuck and react accordingly.
+  * **Ensuring forward application progress**: some loss of progress is generally unavoidable on a device with unpredictable power.  However, a device should be able to ensure that some progress is made on each power cycle, or at least have the ability to recognize when it is stuck and react accordingly.
-  * Data consistency: most intermittent devices maintain volatile and non-volatile memory stores, and most checkpointing strategies rely on storing at least some information in non-volatile memory.  Intermittent devices need both to ensure that writes to non-volatile memory complete successfully before power loss and to account for possible inconsistencies between non-volatile memory and program state (such as write after read errors).
+  * **Data consistency**: most intermittent devices maintain volatile and non-volatile memory stores, and most checkpointing strategies rely on storing at least some information in non-volatile memory.  Intermittent devices need both to ensure that writes to non-volatile memory complete successfully before power loss and to account for possible inconsistencies between non-volatile memory and program state (such as [[https://en.wikipedia.org/wiki/Data_dependency#Write_after_read_(WAR)|write after read]] errors).
-  * Limited power budget: capacitors can hold only a fraction of the energy of a similar-sized battery.  Capacitor sizing and the method of energy harvesting can further reduce the amount of energy a device has available, with some small intermittent devices only having access to a few microwatts of power at any given point.  As a result, the method(s) chosen should have as small a footprint as feasible in order to ensure that most energy is used for useful work.
+  * **Limited power budget**: capacitors can hold only a fraction of the energy of a similar-sized battery.  Capacitor sizing and the method of energy harvesting can further reduce the amount of energy a device has available, with some small intermittent devices only having access to a few microwatts of power at any given point.  As a result, the method(s) chosen should have as small a footprint as feasible in order to ensure that most energy is used for useful work.
-  * Adaptability: an intermittent device should be able to adjust its operation to account for varying environments and energy availability, especially if it will be deployed across a variety of environments.  Even for narrower deployments, an inflexible device may fail if the testing environment fails to accurately model its real-world application.
+  * **Adaptability**: an intermittent device should be able to adjust its operation to account for varying environments and energy availability, especially if it will be deployed across a variety of environments.  Even for narrower deployments, an inflexible device may fail if the testing environment fails to accurately model its real-world application.
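
The write after read hazard named in the data consistency item can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions: the struct ''nv_state_t'' and all function names are hypothetical, the "non-volatile memory" is an ordinary C struct, and a power failure is modeled simply by running the same task body twice, as would happen if power were lost before the task's completion was recorded.

```c
/* Stand-in for a region of non-volatile memory (e.g. FRAM). In this
 * host-side simulation it is an ordinary struct; on a real device it
 * would persist across power loss. All names here are hypothetical. */
typedef struct {
    int value;   /* committed data */
    int shadow;  /* private working copy, also non-volatile */
} nv_state_t;

/* Unsafe: reads nv->value, then overwrites the same location.
 * Re-executing it after a power failure increments a second time. */
static void unsafe_increment(nv_state_t *nv) {
    nv->value = nv->value + 1;
}

/* Safer: reads nv->value but writes only nv->shadow, so re-running it
 * any number of times leaves nv->value untouched. */
static void safe_increment(nv_state_t *nv) {
    nv->shadow = nv->value + 1;
}

/* Commit step: copies the shadow into place. Repeating it is harmless
 * because it does not read the location it writes. */
static void commit(nv_state_t *nv) {
    nv->value = nv->shadow;
}

/* Simulated scenario: the task body runs, power is lost before its
 * completion is recorded, and the body is re-executed after reboot. */
static int run_unsafe_with_failure(int start) {
    nv_state_t nv = { start, 0 };
    unsafe_increment(&nv);  /* first attempt, interrupted afterwards */
    unsafe_increment(&nv);  /* re-execution after reboot */
    return nv.value;        /* off by one: start + 2 */
}

static int run_safe_with_failure(int start) {
    nv_state_t nv = { start, 0 };
    safe_increment(&nv);    /* first attempt, interrupted afterwards */
    safe_increment(&nv);    /* re-execution after reboot */
    commit(&nv);
    return nv.value;        /* start + 1, as intended */
}
```

Starting from 10, the in-place version ends at 12 while the shadow-copy version ends at 11. The sketch deliberately ignores a failure in the middle of the commit itself; a real design would also need the commit to be atomic or guarded by a flag.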
  
-The two main approaches to tackling these challenges are surveyed below.  This list is not intended to be exhaustive, as checkpointing/state management in intermittent computing is a significant and evolving area of research: rather, the objective is to provide a general overview of the most common methods currently available, along with their benefits and tradeoffs.  These approaches also assume a standard (von Neumann) device architecture: while other hardware configurations are being explored in intermittent computing, low-power devices with traditional architectures such as the MSP430 are cheaper and more accessible and so currently make up the vast majority of intermittent devices being designed and tested.
+The two main approaches to tackling these challenges are surveyed below.  This list is not intended to be exhaustive, as checkpointing/state management in intermittent computing is a significant and evolving area of research: rather, the objective is to provide a general overview of the most common methods currently available, along with their benefits and tradeoffs.  These approaches also assume a standard (von Neumann) device architecture: while other hardware configurations are being explored in intermittent computing, low-power devices with traditional architectures such as the [[microcontrollers:msp430|MSP430]] are cheaper and more accessible and so currently make up the vast majority of intermittent devices being designed and tested.
  
 ===== Matched/Energy-Neutral Operation =====
Line 24: Line 24:
  
 **Examples**
-  * AsTAR
+  * [[https://anrg.usc.edu/www/papers/AsTAR-yang.pdf|AsTAR]]
-  * Flute
+  * [[https://lirias.kuleuven.be/retrieve/734534|Flute]]
  
 ===== Checkpointing =====
Line 50: Line 50:
  
 **Examples**
-  * Mementos
+  * [[https://dl.acm.org/doi/10.1145/1961295.1950386|MementOS]]
-  * Hibernus
+  * [[https://ieeexplore.ieee.org/document/6960060|Hibernus]]
-  * HarvOS
+  * [[https://dl.acm.org/doi/10.1145/3055031.3055082|HarvOS]]
-  * Broken Time Machine (https://dl.acm.org/doi/10.1145/2618128.2618136)
  
 ==== Task-based Checkpointing ====
Line 62: Line 61:
  
 **Examples**
-  * Alpaca?
+  * [[https://dl.acm.org/doi/10.1145/3133920|Alpaca]]
-  * Mayfly
+  * [[https://dl.acm.org/doi/10.1145/3131672.3131673|Mayfly]]
   * Artemis
   * Chain
-  * Ink
+  * [[https://dl.acm.org/doi/10.1145/3274783.3274837|InK]]
  
-==== Loop Continuation ==== 
- 
-TBD(?) 
  
 ===== Checkpointing and Hardware Considerations =====
Line 88: Line 84:
 The issue arises when the requirements of the device and its peripherals each prioritize differing capacitances.  Take a simple device with a low-power environmental sensor and a transmitter.  The sensor favors smaller capacitance, in order to charge more frequently and retrieve more samples (while being less likely to miss interesting events).  Transmitting packets, however, is energy intensive (with even low-power methods requiring energy equal to tens of thousands of operations on a low-power microcontroller).  The transmitter may not even be able to broadcast successfully below a certain capacitance, but the larger capacitance is in direct conflict with the preference of the sensor to have shorter (but more frequent) bursts of energy for detection purposes.
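
The sensor/transmitter conflict above can be made concrete with a rough energy budget. The sketch below assumes the standard capacitor energy relation, E = ½·C·(V_on² − V_off²), for the energy usable between a turn-on voltage and a brown-out voltage. The function names and every numeric value mentioned afterwards (capacitances, voltage thresholds, per-packet cost) are hypothetical, chosen only for illustration and not taken from any particular device.

```c
#include <stdbool.h>

/* Usable energy in joules when a capacitor of cap_f farads discharges
 * from the turn-on voltage v_on down to the brown-out voltage v_off:
 * E = 0.5 * C * (v_on^2 - v_off^2). */
static double usable_energy_j(double cap_f, double v_on, double v_off) {
    return 0.5 * cap_f * (v_on * v_on - v_off * v_off);
}

/* True if a task needing task_energy_j joules can finish on a single
 * charge. Harvested input during the task and converter losses are
 * ignored, so this is an optimistic first-order estimate. */
static bool fits_in_one_charge(double cap_f, double v_on, double v_off,
                               double task_energy_j) {
    return usable_energy_j(cap_f, v_on, v_off) >= task_energy_j;
}
```

With assumed thresholds of 3.3 V and 1.8 V, a 10 µF capacitor offers about 38 µJ per charge, so ''fits_in_one_charge(10e-6, 3.3, 1.8, 100e-6)'' is false for a hypothetical 100 µJ packet, while a 100 µF capacitor (about 383 µJ) passes the same check at the cost of a longer recharge time between sensor samples.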
  
-To this end, a variety of capacitor configurations have been explored.  These configurations are the subject of their own article (TBD), but can range from dynamically adjustable capacitor banks to individual capacitors for each microcontroller and/or peripheral.  Knowing the exact capacitor configuration can impact checkpointing: a strategy that assumes a fixed capacitance will obviously struggle with a dynamic bank, and multiple capacitors can complicate energy availability predictions depending on arrangement.
+To this end, a variety of capacitor configurations have been explored.  These configurations are the subject of [[tutorials:capacitor_sizing|their own article]], but can range from dynamically adjustable capacitor banks to individual capacitors for each microcontroller and/or peripheral.  Knowing the exact capacitor configuration can impact checkpointing: a strategy that assumes a fixed capacitance will obviously struggle with a dynamic bank, and multiple capacitors can complicate energy availability predictions depending on arrangement.
  
 ===== Designing Adaptable Strategies =====
Line 96: Line 92:
 ===== References =====
  
-  * https://cmuabstract.github.io/intermittence_tutorial/tutorial/
+  * [[https://cmuabstract.github.io/intermittence_tutorial/tutorial/|Getting Started With Intermittent Computing]]
+  * [[https://dl.acm.org/doi/10.1145/2618128.2618136|Nonvolatile Memory is a Broken Time Machine]]
  
  
tutorials/checkpointing_overview.1762887086.txt.gz · Last modified: 2025/11/11 18:51 by ibchadmin
