
Showing papers in "ACM Transactions on Embedded Computing Systems in 2012"


Journal ArticleDOI
TL;DR: This work presents a method for analyzing greedy shapers, and embeds this analysis method into a well-established modular performance analysis framework for real-time embedded systems.
Abstract: Traffic shaping is a well-known technique in the area of networking and is proven to reduce global buffer requirements and end-to-end delays in networked systems. Due to these properties, shapers also play an increasingly important role in the design of multiprocessor embedded systems that exhibit a considerable amount of on-chip traffic. Despite the growing importance of traffic shaping in this area, no methods exist for analyzing shapers in distributed embedded systems and for incorporating them into a system-level performance analysis. Until now, it has not been possible to determine the effect of shapers on end-to-end delay guarantees or buffer requirements in such systems. In this work, we present a method for analyzing greedy shapers, and we embed this analysis method into a well-established modular performance analysis framework for real-time embedded systems. The presented approach enables system-level performance analysis of complete systems with greedy shapers, and we prove its applicability by analyzing three case study systems.
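A greedy shaper delays backlogged traffic just enough that its output conforms to a shaping curve. A discrete-time leaky-bucket sketch (illustrative parameters, not the article's network-calculus analysis):

```python
def greedy_shaper(arrivals, rate, burst):
    """Simulate a greedy (leaky-bucket) shaper in discrete time.

    arrivals[t] is the traffic amount arriving in slot t. The shaper
    releases backlogged traffic as early as possible while the cumulative
    output stays below the shaping curve burst + rate * t.
    Returns (per-slot outputs, maximum backlog observed).
    """
    tokens = burst          # bucket starts full
    backlog = 0.0
    outputs = []
    max_backlog = 0.0
    for a in arrivals:
        tokens = min(burst, tokens + rate)   # replenish, capped at burst
        backlog += a
        sent = min(backlog, tokens)          # release as much as tokens allow
        tokens -= sent
        backlog -= sent
        max_backlog = max(max_backlog, backlog)
        outputs.append(sent)
    return outputs, max_backlog
```

A burst of 10 units into a shaper with rate 2 and burst 4 is smoothed to [4, 2, 2, 2], which is exactly the buffer-requirement effect the analysis quantifies.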

94 citations


Journal ArticleDOI
TL;DR: A novel parallel partition approach is proposed to map the motion estimation DAG onto the multiple DRC units in a PRC system, capable of design optimization of parallel processing reconfigurable systems for a given number of processing elements in different search ranges.
Abstract: The computational load of motion estimation in the advanced video coding (AVC) standard is significantly high, and even higher for HDTV and super-resolution sequences. In this article, a video processing algorithm is dynamically mapped onto a new parallel reconfigurable computing (PRC) architecture which consists of multiple dynamic reconfigurable computing (DRC) units. First, we construct a directed acyclic graph (DAG) to represent video coding algorithms in which motion estimation is the focus. A novel parallel partition approach is then proposed to map the motion estimation DAG onto the multiple DRC units in a PRC system. This partitioning algorithm is capable of design optimization of parallel processing reconfigurable systems for a given number of processing elements in different search ranges. This speeds up video processing with minimal sacrifice.

78 citations


Journal ArticleDOI
TL;DR: The design and implementation of BeepBeep is presented, a high-accuracy acoustic-based system for ranging and localization that works without any infrastructure and is applicable to sensor platforms and commercial-off-the-shelf mobile devices.
Abstract: We present the design and implementation of BeepBeep, a high-accuracy acoustic-based system for ranging and localization. It is a pure software-based solution and uses the most basic set of commodity hardware -- a speaker, a microphone, and some form of interdevice communication. The ranging scheme works without any infrastructure and is applicable to sensor platforms and commercial-off-the-shelf mobile devices. It achieves high accuracy through three techniques: two-way sensing, self-recording, and sample counting. We further devise a scalable and fast localization scheme. Our experiments show that up to one-centimeter ranging accuracy and three-centimeter localization accuracy can be achieved.
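The two-way sensing idea cancels the unknown clock offset between devices: each device counts samples between the two beeps in its own recording, and the difference of the two counts isolates twice the propagation delay. A simplified sketch ignoring the speaker-to-microphone offsets on each device (sample counts and rate below are illustrative):

```python
SPEED_OF_SOUND = 343.0  # m/s at roughly 20 degrees C (assumed constant)

def two_way_range(n_a, n_b, fs):
    """Estimate the distance between devices A and B.

    n_a: samples A counts between recording its own beep and recording
         B's beep; n_b: samples B counts between recording A's beep and
         its own beep; fs: sampling rate in Hz.
    With d the distance and c the speed of sound, the two intervals
    differ by exactly 2*d/c, so d = c * (n_a - n_b) / (2 * fs).
    """
    return SPEED_OF_SOUND * (n_a - n_b) / (2.0 * fs)
```

Because both intervals are measured by sample counting in local recordings, neither clock synchronization nor OS timestamping latency enters the estimate.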

74 citations


Journal ArticleDOI
TL;DR: An abstract cyber-physical model of BANs, called BAN-CPS, is proposed that captures the undesirable side-effects of the medical devices (cyber) on the human body (physical) and a design and analysis tool, named BAND-AiDe, is developed that enables safety and sustainability analysis ofBANs.
Abstract: Body area networks (BANs) are networks of medical devices implanted within or worn on the human body. Analysis and verification of BAN designs require (i) early feedback on the BAN design and (ii) high-confidence evaluation of BANs without requiring any hazardous, intrusive, and costly deployment. Any design of BAN further has to ensure (i) the safety of the human body, that is, limiting any undesirable side-effects (e.g., heat dissipation) of BAN operations (involving sensing, computation, and communication among the devices) on the human body, and (ii) the sustainability of the BAN operations, that is, the continuation of the operations under constrained resources (e.g., limited battery power in the devices) without requiring any redeployments. This article uses the Model Based Engineering (MBE) approach to perform design and analysis of BANs. In this regard, first, an abstract cyber-physical model of BANs, called BAN-CPS, is proposed that captures the undesirable side-effects of the medical devices (cyber) on the human body (physical); second, a design and analysis tool, named BAND-AiDe, is developed that allows specification of BAN-CPS using industry standard Abstract Architecture Description Language (AADL) and enables safety and sustainability analysis of BANs; and third, the applicability of BAND-AiDe is shown through a case study using both single and a network of medical devices for health monitoring applications.

40 citations


Journal ArticleDOI
TL;DR: This article presents a novel approach for implementing cache reconfiguration in soft real-time systems by efficiently leveraging static analysis during runtime to minimize energy while maintaining the same service level.
Abstract: In recent years, efficient dynamic reconfiguration techniques have been widely employed for system optimization. Dynamic cache reconfiguration is a promising approach for reducing energy consumption as well as for improving overall system performance. It is a major challenge to introduce cache reconfiguration into real-time multitasking systems, since dynamic analysis may adversely affect tasks with timing constraints. This article presents a novel approach for implementing cache reconfiguration in soft real-time systems by efficiently leveraging static analysis during runtime to minimize energy while maintaining the same service level. To the best of our knowledge, this is the first attempt to integrate dynamic cache reconfiguration into real-time scheduling techniques. Our experimental results using a wide variety of applications demonstrate that our approach can significantly reduce the cache energy consumption in soft real-time systems (by up to 74%).

39 citations


Journal ArticleDOI
TL;DR: This work presents a gesture recognition system minimizing power while maintaining a run-time, application-defined performance target through dynamic sensor selection; it can extend network lifetime by 4 times with accuracy >90% and by 9 times with accuracy >70%.
Abstract: Wearable gesture recognition enables context-aware applications and unobtrusive HCI. It is realized by applying machine learning techniques to data from on-body sensor nodes. We present a gesture recognition system minimizing power while maintaining a run-time, application-defined performance target through dynamic sensor selection. Compared to the non-managed approach optimized for recognition accuracy (95% accuracy), our technique can extend network lifetime by 4 times with accuracy >90% and by 9 times with accuracy >70%. We characterize the approach and outline its applicability to other scenarios.
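Dynamic sensor selection trades accuracy for power at run time. A greedy sketch under the simplifying assumption (unrealistic for real classifier ensembles) that per-sensor accuracy contributions are additive; all numbers are hypothetical:

```python
def select_sensors(sensors, accuracy_target):
    """Greedily pick sensors until the run-time accuracy target is met.

    sensors: list of (name, power_mW, marginal_accuracy) tuples.
    Sensors are taken in order of accuracy gained per milliwatt, so the
    cheapest sufficient subset is approximated, not guaranteed optimal.
    Returns (chosen sensor names, estimated accuracy).
    """
    chosen, acc = [], 0.0
    by_efficiency = sorted(sensors, key=lambda s: s[2] / s[1], reverse=True)
    for name, power, gain in by_efficiency:
        if acc >= accuracy_target:
            break
        chosen.append(name)
        acc += gain
    return chosen, acc
```

Lowering the target lets the manager shed power-hungry sensors, which is the mechanism behind the 4x and 9x lifetime extensions reported above.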

35 citations


Journal ArticleDOI
TL;DR: A new game-theoretic approach to analyzing quantitative properties that is based on performing systematic measurements to automatically learn a model of the environment that can accurately predict properties such as worst-case execution time or estimate the distribution of execution times is presented.
Abstract: The analysis of quantitative properties, such as timing and power, is central to the design of reliable embedded software and systems. However, the verification of such properties on a program is made difficult by their heavy dependence on the program’s environment, such as the processor it runs on. Modeling the environment by hand can be tedious, error prone, and time consuming. In this article, we present a new game-theoretic approach to analyzing quantitative properties that is based on performing systematic measurements to automatically learn a model of the environment. We model the problem as a game between our algorithm (player) and the environment of the program (adversary) in which the player seeks to accurately predict the property of interest, while the adversary sets environment states and parameters. To solve this problem, we employ a randomized strategy that repeatedly tests the program along a linear-sized set of program paths called basis paths, using the resulting measurements to infer a weighted-graph model of the environment from which quantitative properties can be predicted. Test cases are automatically generated using satisfiability modulo theories (SMT) solving. We prove that our algorithm can, under certain assumptions and with arbitrarily high probability, accurately predict properties such as worst-case execution time or estimate the distribution of execution times. Experimental results for execution time analysis demonstrate that our approach is efficient, accurate, and highly portable.
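In the weighted-graph model, a path's execution time is the sum of its edge weights, so every program path is a linear combination of basis paths, and its time can be predicted from the measured basis-path times alone. A minimal sketch of that prediction step (toy program with two independent branches; the timings are illustrative and real measurements are noisy):

```python
def predict_path_time(measured, combo):
    """Predict a path's time from basis-path measurements.

    measured: dict mapping basis-path name to its measured time.
    combo: dict mapping basis-path name to its coefficient in the
    linear combination equal to the target path (in edge space).
    Assumes additive per-edge timings, as in the weighted-graph model.
    """
    return sum(c * measured[p] for p, c in combo.items())
```

For two branches in sequence there are four paths but only three basis paths: in edge space, path BB equals BA + AB - AA, so its time is predicted without ever measuring it.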

34 citations


Journal ArticleDOI
TL;DR: A novel top-down methodology for automatically generating register transfer-level (RTL) tests from SystemC TLM specifications is presented and a test refinement specification for automatically converting TLM tests to RTL tests is developed in order to reduce overall validation effort.
Abstract: SystemC transaction-level modeling (TLM) is widely used to enable early exploration for both hardware and software designs. It can reduce the overall design and validation effort of complex system-on-chip (SOC) architectures. However, due to lack of automated techniques coupled with limited reuse of validation efforts between abstraction levels, SOC validation is becoming a major bottleneck. This article presents a novel top-down methodology for automatically generating register transfer-level (RTL) tests from SystemC TLM specifications. It makes two important contributions: (i) it proposes a method that can automatically generate TLM tests using various coverage metrics, and (ii) it develops a test refinement specification for automatically converting TLM tests to RTL tests in order to reduce overall validation effort. We have developed a tool which incorporates these activities to enable automated RTL test generation from SystemC TLM specifications. Case studies using a router example and a 64-bit Alpha AXP pipelined processor demonstrate that our approach can achieve intended functional coverage of the RTL designs, as well as capture various functional errors and inconsistencies between specifications and implementations.

33 citations


Journal ArticleDOI
Kai Huang1, Wolfgang Haid1, Iuliana Bacivarov1, Matthias Keller1, Lothar Thiele1 
TL;DR: An MPSoC software design flow that allows for automatically generating the system implementation, together with an analysis model for system verification, is presented and modular performance analysis (MPA) is integrated into the distributed operation layer (DOL) MP soC programming environment.
Abstract: Modern real-time streaming applications are increasingly implemented on multiprocessor systems-on-chip (MPSoC). The implementation, as well as the verification of real-time applications executing on MPSoCs, are difficult tasks, however. A major challenge is the performance analysis of MPSoCs, which is required for early design space exploration and final system verification. Simulation-based methods are not well-suited for this purpose, due to long runtimes and non-exhaustive corner-case coverage. To overcome these limitations, formal performance analysis methods that provide guarantees for meeting real-time constraints have been developed. Embedding formal performance analysis into the MPSoC design cycle requires the generation of a faithful analysis model and its calibration with the system-specific parameters. In this article, a design flow that automates these steps is presented. In particular, we integrate modular performance analysis (MPA) into the distributed operation layer (DOL) MPSoC programming environment. The result is an MPSoC software design flow that allows for automatically generating the system implementation, together with an analysis model for system verification.

27 citations


Journal ArticleDOI
TL;DR: The proposed flash translation layer is implemented as a Linux device driver and evaluated with respect to ext2 and ext3 file systems, showing significant performance improvement over ext2, ext3, and NTFS with limited system overheads.
Abstract: As flash memory becomes popular over various platforms, there is a strong demand regarding the performance degradation problem, due to the special characteristics of flash memory. This research proposes the design of a file-system-oriented flash translation layer, in which a filter mechanism is designed to separate the access requests of file-system metadata and file contents for better performance. A recovery scheme is then proposed for maintaining the integrity of a file system. The proposed flash translation layer is implemented as a Linux device driver and evaluated with respect to ext2 and ext3 file systems. Experiments were also done over NTFS by a series of realistic traces. The experimental results show significant performance improvement over ext2, ext3, and NTFS file systems with limited system overheads.
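The filter idea, steering metadata and content requests to different handling, can be sketched with a hypothetical address-range classifier standing in for the article's file-system-aware filter:

```python
def route_request(lba, metadata_ranges):
    """Classify a logical block address as metadata or content.

    metadata_ranges: list of (lo, hi) LBA ranges known to hold
    file-system metadata (hypothetical; a real filter uses richer
    file-system knowledge). Metadata sees small, hot, random writes,
    so the FTL can map it page-by-page while mapping bulk file
    content at coarser block granularity.
    """
    for lo, hi in metadata_ranges:
        if lo <= lba <= hi:
            return "metadata"   # e.g. fine-grained page-level mapping
    return "content"            # e.g. coarse block-level mapping
```

Separating the two streams is what prevents hot metadata updates from fragmenting the mapping of large, sequentially written file contents.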

25 citations


Journal ArticleDOI
Suk-Hyun Seo1, Jin-Ho Kim1, Sung-Ho Hwang1, Key Ho Kwon1, Jae Wook Jeon1 
TL;DR: This article proposes a reliable gateway based on the OSEK/VDX components for in-vehicle networks, and examines the gateway system developed and the performance of the proposed system is evaluated.
Abstract: This article describes a reliable gateway for in-vehicle networks. Such networks include local interconnect networks, controller area networks, and FlexRay. There is some latency when transferring a message from one node (source) to another node (destination). A high probability of error exists due to different protocol specifications such as baud-rate, and message frame format. Therefore, deploying a reliable gateway is a challenge to the automotive industry. We propose a reliable gateway based on the OSEK/VDX components for in-vehicle networks. We also examine the gateway system developed, and then we evaluate the performance of our proposed system.

Journal ArticleDOI
TL;DR: In this paper, the authors propose a fine-grained transparent recovery, where the property of transparency can be selectively applied to processes and messages to hide the recovery actions in a selected part of the application so that they do not affect the schedule of other processes and messages.
Abstract: In this article, we propose a strategy for the synthesis of fault-tolerant schedules and for the mapping of fault-tolerant applications. Our techniques handle transparency/performance trade-offs and use the fault-occurrence information to reduce the overhead due to fault tolerance. Processes and messages are statically scheduled, and we use process reexecution for recovering from multiple transient faults. We propose a fine-grained transparent recovery, where the property of transparency can be selectively applied to processes and messages. Transparency hides the recovery actions in a selected part of the application so that they do not affect the schedule of other processes and messages. While leading to longer schedules, transparent recovery has the advantage of both improved debuggability and less memory needed to store the fault-tolerant schedules.
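The re-execution overhead such schedules must budget for can be captured by the standard worst-case bound for recovery by re-execution (a sketch consistent with the recovery model described; mu is an assumed per-fault recovery overhead, e.g. restoring state before re-running):

```python
def wc_execution_with_faults(c, k, mu):
    """Worst-case completion time of one process under re-execution.

    c:  the process's worst-case execution time
    k:  maximum number of transient faults to tolerate
    mu: recovery overhead paid per fault (assumed parameter)
    Each fault forces one full re-execution plus the recovery overhead,
    so the bound is (k + 1) * c + k * mu.
    """
    return (k + 1) * c + k * mu
```

This per-process bound is what the scheduler charges against the slack of transparent processes, since their recovery must not disturb the rest of the schedule.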

Journal ArticleDOI
TL;DR: Based on periodicity and subtangential conditions, a new sufficient condition for verifying invariant properties of PCHAs is presented and is used to manually verify safety and progress properties of a fairly complex planner-controller subsystem of an autonomous ground vehicle.
Abstract: This article introduces Periodically Controlled Hybrid Automata (PCHA) for modular specification of embedded control systems. In a PCHA, control actions that change the control input to the plant occur roughly periodically, while other actions that update the state of the controller may occur in the interim. Such actions could model, for example, sensor updates and information received from higher-level planning modules that change the set point of the controller. Based on periodicity and subtangential conditions, a new sufficient condition for verifying invariant properties of PCHAs is presented. For PCHAs with polynomial continuous vector fields, it is possible to check these conditions automatically using, for example, quantifier elimination or sum of squares decomposition. We examine the feasibility of this automatic approach on a small example. The proposed technique is also used to manually verify safety and progress properties of a fairly complex planner-controller subsystem of an autonomous ground vehicle. Geometric properties of planner-generated paths are derived which guarantee that such paths can be safely followed by the controller.
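The subtangential condition requires the vector field not to point out of the candidate invariant set along its boundary. A numeric spot-check sketch (sampling is only a sanity check, not the quantifier-elimination or sum-of-squares proof the article relies on):

```python
def subtangential_ok(grad_v, field, boundary_pts, tol=1e-9):
    """Spot-check invariance of the sublevel set {x : V(x) <= 0}.

    grad_v(p): gradient of V at point p; field(p): vector field at p.
    At each sampled boundary point the field must satisfy
    grad V . f <= 0, i.e. it may not point out of the set.
    """
    return all(
        sum(g * f for g, f in zip(grad_v(p), field(p))) <= tol
        for p in boundary_pts
    )
```

For a polynomial V and polynomial field this dot product is itself a polynomial, which is why the condition is decidable by quantifier elimination rather than sampling.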

Journal ArticleDOI
TL;DR: A model and an associated methodology are presented that can be used to schedule tasks in DCPSs to ensure that the thermal effects of the task execution are within acceptable levels, and verify that a given schedule meets the constraints.
Abstract: A distributed cyber-physical system (DCPS) may receive and induce energy-based interference to and from its environment. This article presents a model and an associated methodology that can be used to (i) schedule tasks in DCPSs to ensure that the thermal effects of the task execution are within acceptable levels, and (ii) verify that a given schedule meets the constraints. The model uses coarse discretization of space and linearity of interference. The methodology involves characterizing the interference of the task execution and fitting it into the model, then using the fitted model to verify a solution or explore the solution space.
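Under the model's coarse discretization of space and linearity of interference, a schedule's thermal effect is a superposition of per-task contributions, so verification reduces to checking a linear map against a limit. A sketch with hypothetical influence coefficients:

```python
def thermal_ok(influence, activity, limit):
    """Verify a schedule against a thermal limit.

    influence[j][i]: steady-state temperature rise at grid cell j per
    unit activity of task i (fitted from characterization runs; the
    numbers here are hypothetical). activity[i]: the task's activity
    level under the candidate schedule. Every cell's total rise,
    a linear superposition, must stay at or below the limit.
    """
    for row in influence:
        rise = sum(a * u for a, u in zip(row, activity))
        if rise > limit:
            return False
    return True
```

The same linear model supports exploration: scaling a task's activity scales its contribution proportionally, so infeasible schedules can be repaired by solving linear constraints.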

Journal ArticleDOI
TL;DR: The KNOWME platform is presented, a complete, end-to-end, body area sensing system that integrates off-the-shelf biometric sensors with a Nokia N95 mobile phone to continuously monitor the metabolic signals of a subject and develop a low-complexity sensor sampling algorithm.
Abstract: The use of biometric sensors for monitoring an individual’s health and related behaviors, continuously and in real time, promises to revolutionize healthcare in the near future. In an effort to better understand the complex interplay between one’s medical condition and social, environmental, and metabolic parameters, this article presents the KNOWME platform, a complete, end-to-end, body area sensing system that integrates off-the-shelf biometric sensors with a Nokia N95 mobile phone to continuously monitor the metabolic signals of a subject. With a current focus on pediatric obesity, KNOWME employs metabolic signals to monitor and evaluate physical activity. KNOWME development and in-lab deployment studies have revealed three major challenges: (1) the need for robustness to highly varying operating environments due to subject-induced variability, such as mobility or sensor placement; (2) balancing the tension between achieving high fidelity data collection and minimizing network energy consumption; and (3) accurate physical activity detection using a modest number of sensors. The KNOWME platform described herein directly addresses these three challenges. Design robustness is achieved by creating a three-tiered sensor data collection architecture. The system architecture is designed to provide robust, continuous, multichannel data collection and scales without compromising normal mobile device operation. Novel physical activity detection methods which exploit new representations of sensor signals provide accurate and efficient physical activity detection. The physical activity detection method employs personalized training phases and accounts for intersession variability. Finally, exploiting the features of the hardware implementation, a low-complexity sensor sampling algorithm is developed, resulting in significant energy savings without loss of performance.

Journal ArticleDOI
TL;DR: The measurements-based channel model captures large and small time-scale signal correlations, giving an accurate picture of the signal variation, specifically, the deep fades which are the features that mostly affect the behavior of the MAC.
Abstract: We investigate the impact of wireless channel temporal variations on the design of medium access control (MAC) protocols for body area networks (BANs). Our measurements-based channel model captures large and small time-scale signal correlations, giving an accurate picture of the signal variation, specifically, the deep fades which are the features that mostly affect the behavior of the MAC. We test the effect of the channel model on the performance of the 802.15.4 MAC both in contention access mode and TDMA access mode. We show that there are considerable differences in the performance of the MAC compared to simulations that do not model channel temporal variation. Furthermore, explaining the behavior of the MAC under a temporal varying channel, we can suggest specific design choices for the emerging BAN MAC standard.

Journal ArticleDOI
TL;DR: This work has formulated optimal methods for code generation with integer linear programming for acyclic code and then extends this method to modulo scheduling of loops and shows that, for an architecture with two clusters, the integrated method finds a better solution than the nonintegrated method for 27% of the instances.
Abstract: Code generation in a compiler is commonly divided into several phases: instruction selection, scheduling, register allocation, spill code generation, and, in the case of clustered architectures, cluster assignment. These phases are interdependent; for instance, a decision in the instruction selection phase affects how an operation can be scheduled We examine the effect of this separation of phases on the quality of the generated code. To study this we have formulated optimal methods for code generation with integer linear programming; first for acyclic code and then we extend this method to modulo scheduling of loops. In our experiments we compare optimal modulo scheduling, where all phases are integrated, to modulo scheduling, where instruction selection and cluster assignment are done in a separate phase. The results show that, for an architecture with two clusters, the integrated method finds a better solution than the nonintegrated method for 27p of the instances.

Journal ArticleDOI
TL;DR: This work is able to create certificates that come with an algorithmic description of the proof of the desired property as justification and is applied to the certification of the verdicts of a deadlock-detection tool for an asynchronous component-based language.
Abstract: Automatic verification tools, such as model checkers and tools based on static analysis or on abstract interpretation, have become popular in software and hardware development. They increase confidence and potentially provide rich feedback. However, with increasing complexity, verification tools themselves are more likely to contain errors. In contrast to automatic verification tools, higher-order theorem provers use mathematically founded proof strategies checked by a small proof checker to guarantee selected properties. Thus, they enjoy a high level of trustability. Properties of software and hardware systems and their justifications can be encapsulated into a certificate, thereby guaranteeing correctness of the systems with respect to the properties. These results offer a much higher degree of confidence than results achieved by verification tools. However, higher-order theorem provers are usually slow, due to their general and minimalistic nature. Even for small systems, a lot of human interaction is required for establishing a certificate. In this work, we combine the advantages of automatic verification tools (i.e., speed and automation) with those of higher-order theorem provers (i.e., high level of trustability). The verification tool generates a certificate for each invocation. This is checked by the higher-order theorem prover, thereby guaranteeing the desired property. The generation of certificates is much easier than producing the analysis results of the verification tool in the first place. In our work, we are able to create certificates that come with an algorithmic description of the proof of the desired property as justification. We concentrate on verification tools that generate invariants of systems and certify automatically that these do indeed hold. Our approach is applied to the certification of the verdicts of a deadlock-detection tool for an asynchronous component-based language.

Journal ArticleDOI
TL;DR: A novel flash file system that has a lightweight index structure that introduces the hybrid indexing scheme and intra-inode index logging, and an efficient GC scheme that adopts a dirty list with an on-demand GC approach as well as fine-grained data separation and erase-unit data allocation is presented.
Abstract: A very promising approach for using NAND flash memory as a storage medium is a flash file system. In order to design a higher-performance flash file system, two issues should be considered carefully. One issue is the design of an efficient index structure that contains the locations of both files and data in the flash memory. For large-capacity storage, the index structure must be stored in the flash memory to realize low memory consumption; however, this may degrade the system performance. The other issue is the design of a novel garbage collection (GC) scheme that reclaims obsolete pages. This scheme can induce considerable additional read and write operations while identifying and migrating valid pages. In this article, we present FlashLight, a novel flash file system that has the following features: (i) a lightweight index structure that introduces the hybrid indexing scheme and intra-inode index logging, and (ii) an efficient GC scheme that adopts a dirty list with an on-demand GC approach as well as fine-grained data separation and erase-unit data allocation. We implemented FlashLight in a Linux OS with kernel version 2.6.21 on an embedded device. The experimental results obtained using several benchmark programs confirm that FlashLight improves the performance by up to 27.4% over UBIFS by alleviating index management and GC overheads by up to 33.8%.
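The cost intuition behind GC victim selection is that migrating valid pages is the expensive part, so the cheapest victim is the erase block with the most invalid pages. A greedy sketch (FlashLight's dirty-list, on-demand policy is richer than this):

```python
def pick_victim(blocks):
    """Greedy garbage-collection victim selection.

    blocks: dict mapping block name -> (valid_pages, invalid_pages).
    Reclaiming a block frees its invalid pages but forces every valid
    page to be copied elsewhere first, so the block with the most
    invalid pages maximizes reclaimed space per migration cost.
    """
    return max(blocks, key=lambda name: blocks[name][1])
```

Fine-grained data separation helps precisely here: grouping hot data together produces blocks that turn almost entirely invalid, making victims cheap.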

Journal ArticleDOI
TL;DR: This article uses the RMA theory to calculate the cost of the model and analyze the circumstances under which it can provide the most value, and presents its experimental evaluation to show evidence of its temporal determinism and overhead.
Abstract: Real-time scheduling algorithms like RMA or EDF and their corresponding schedulability tests have proven to be powerful tools for developing predictable real-time systems. However, the traditional interrupt management model presents multiple inconsistencies that break the assumptions of many of the real-time scheduling tests, diminishing their utility. In this article, we analyze these inconsistencies and present a model that resolves them by integrating interrupts and tasks in a single scheduling model. We then use RMA theory to calculate the cost of the model and analyze the circumstances under which it can provide the most value. This model was implemented in a kernel module. The portability of the design of our module is discussed in terms of its independence from both the hardware and the kernel. We also discuss the implementation issues of the model over conventional PC hardware, along with its cost and novel optimizations for reducing the overhead. Finally, we present our experimental evaluation to show evidence of its temporal determinism and overhead.
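Once interrupts and tasks live in one fixed-priority scheduling model, standard response-time analysis applies, with interrupt handlers modeled as the highest-priority entries. A sketch of the classic fixed-point iteration (deadlines equal to periods assumed; task parameters are illustrative):

```python
import math

def response_time(tasks, i):
    """Worst-case response time under fixed-priority scheduling.

    tasks: list of (C, T) pairs, worst-case execution time and period,
    sorted by descending priority; interrupt handlers can simply be the
    first entries, which is the spirit of the integrated model.
    Iterates R = C_i + sum_j ceil(R / T_j) * C_j over higher-priority j
    until it converges, or returns None if R exceeds the period.
    """
    C, T = tasks[i]
    R = C
    while True:
        interference = sum(math.ceil(R / Tj) * Cj for Cj, Tj in tasks[:i])
        nxt = C + interference
        if nxt == R:
            return R
        if nxt > T:          # misses its deadline (D = T assumed)
            return None
        R = nxt
```

Treating an interrupt as just another task with a cost and a minimum inter-arrival time is what restores the assumptions the schedulability tests need.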

Journal ArticleDOI
TL;DR: A small number of additional sensor nodes are deployed to help key establishment between sensor nodes, to achieve both high resilience to node compromises and high efficiency in key establishment.
Abstract: Many techniques have been developed recently for establishing pairwise keys in sensor networks. However, some of them are vulnerable to a few compromised sensor nodes, while others could involve expensive protocols for establishing keys. This article introduces a much better alternative that can achieve both high resilience to node compromises and high efficiency in key establishment. The main idea is to deploy a small number of additional sensor nodes, called assisting nodes, to help key establishment between sensor nodes. The proposed approach has many advantages over existing approaches. In particular, a sensor node only needs to make a few local communications and perform a few efficient hash operations to set up a key with any other sensor node in the network with very high probability. The majority of sensor nodes only need to store a single key. It also provides high resilience to node compromises. The theoretical analysis, simulation studies, and experiments on TelosB sensor motes also demonstrate the advantages of this key establishment protocol in sensor networks.
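The "few efficient hash operations" claim can be illustrated with a hypothetical hash-based derivation at an assisting node; this is not the article's protocol, only a sketch of why hashing keeps the per-node cost low compared to public-key operations:

```python
import hashlib

def derive_pairwise_key(master, id_a, id_b):
    """Derive a pairwise key for two node IDs from a master secret.

    Hypothetical sketch: an assisting node holding `master` derives the
    key for the unordered pair (id_a, id_b) with a single SHA-256 call,
    so each sensor node only ever performs cheap hash operations.
    Sorting the IDs makes the derivation symmetric.
    """
    lo, hi = sorted([id_a, id_b])
    return hashlib.sha256(master + lo.encode() + b"|" + hi.encode()).hexdigest()
```

A hash evaluation costs orders of magnitude less energy on a mote than a public-key operation, which is the efficiency argument behind hash-based key establishment.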

Journal ArticleDOI
TL;DR: A filter switching architecture that allows for dynamic switching between 5/3 and 9/7 wavelet filters and leads to a more power efficient design and a multiplier-free design with a low adder requirement demonstrates the potential of Poly-DWT for embedded systems.
Abstract: Many modern computing applications have been enabled through the use of real-time multimedia processing. While several hardware architectures have been proposed in the research literature to support such primitives, these fail to address applications whose performance and resource requirements have a dynamic aspect. Embedded multimedia systems typically need a power and computation efficient design in addition to good compression performance. In this article, we introduce a Polymorphic Wavelet Architecture (Poly-DWT) as a crucial building block towards the development of embedded systems to address such challenges. We illustrate how our Poly-DWT architecture can potentially make dynamic resource allocation decisions, such as the internal bit representation and the processing kernel, according to the application requirements. We introduce a filter switching architecture that allows for dynamic switching between 5/3 and 9/7 wavelet filters and leads to a more power efficient design. Further, a multiplier-free design with a low adder requirement demonstrates the potential of Poly-DWT for embedded systems. Through an FPGA prototype, we perform a quantitative analysis of our Poly-DWT architecture, and compare our filter to existing approaches to illustrate the area and performance benefits inherent in our approach.
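The multiplier-free claim is easy to see in the 5/3 filter's lifting form, which needs only additions and shifts. A minimal integer sketch of one forward level (symmetric boundary extension; even-length input assumed for brevity):

```python
def dwt53_forward(x):
    """One level of the reversible integer 5/3 lifting wavelet transform.

    Predict step: each odd sample minus the average of its even
    neighbors gives the detail (high-pass) coefficients. Update step:
    each even sample plus a rounded quarter-sum of neighboring details
    gives the approximation (low-pass) coefficients. Only adds and
    shifts appear, which is why a 5/3 datapath needs no multipliers.
    """
    n = len(x)
    d = []  # detail (high-pass)
    for i in range(n // 2):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[2 * i]  # symmetric ext.
        d.append(x[2 * i + 1] - ((left + right) >> 1))
    s = []  # approximation (low-pass)
    for i in range(n // 2):
        dl = d[i - 1] if i > 0 else d[0]                     # symmetric ext.
        s.append(x[2 * i] + ((dl + d[i] + 2) >> 2))
    return s, d
```

The 9/7 filter, by contrast, has irrational lifting coefficients and needs multipliers (or wide shift-add networks), which is exactly the power trade-off the filter-switching architecture exploits.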

Journal ArticleDOI
TL;DR: A novel approach to cryptosystem design based on dynamic voltage and frequency scaling (DVFS), which hides processor state to make it harder for an attacker to gain access to a secure system is proposed.
Abstract: This article proposes a novel approach to cryptosystem design to prevent power analysis attacks. Such attacks infer program behavior by continuously monitoring the power supply current going into the processor core. They form an important class of security attacks. Our approach is based on dynamic voltage and frequency scaling (DVFS), which hides processor state to make it harder for an attacker to gain access to a secure system. Three designs are studied to test the efficacy of the DVFS method against power analysis attacks. The advanced realization of our cryptosystem is presented, which achieves power- and time-trace entropies high enough to block various kinds of power analysis attacks on the DES algorithm. We observed 27% energy reduction and 16% time overhead in these algorithms. Finally, a DVFS hardness analysis is presented.

Journal ArticleDOI
TL;DR: A randomized instruction injection technique (RIJID) that overcomes the pitfalls of previous countermeasures and scrambles the power profile of a cryptographic application by injecting random instructions at random points of execution and therefore protects the system against power analysis attacks.
Abstract: Side-channel attacks in general, and power analysis attacks in particular, are becoming a major security concern in embedded systems. Countermeasures proposed against power analysis attacks include data and table masking, current flattening, dummy instruction insertion, and bit-flip balancing. All these techniques are either susceptible to multi-order power analysis attacks, not sufficiently generic to cover all encryption algorithms, or burden the system with high area, runtime, or energy cost. In this article, we propose a randomized instruction injection technique (RIJID) that overcomes the pitfalls of previous countermeasures. RIJID scrambles the power profile of a cryptographic application by injecting random instructions at random points of execution and therefore protects the system against power analysis attacks. Two different ways of triggering the instruction injection are also presented: (1) softRIJID, a hardware/software approach, where special instructions are used in the code for triggering the injection at runtime; and (2) autoRIJID, a hardware approach, where the code injection is triggered by the processor itself via detecting signatures of encryption routines at runtime. A novel signature detection technique is also introduced for identifying encryption routines within application programs at runtime. Further, a simple obfuscation metric (RIJIDindex) based on cross-correlation is introduced to measure the scrambling provided by any code injection technique, coarsely indicating the level of scrambling achieved. Our processor models cost 1.9% additional area in the hardware/software approach and 1.2% in the hardware approach for a RISC-based processor, and cost on average 29.8% in runtime and 27.1% in energy for the former, and 25.0% in runtime and 28.5% in energy for the latter, for industry-standard cryptographic applications.
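A minimal sketch of the injection idea (the dummy opcodes, injection rate, and function name are our assumptions, not the RIJID implementation): dummy instructions are interleaved at random points, so the power profile of the real instruction stream is displaced unpredictably from run to run:

```python
import random

DUMMIES = ["nop", "add r0, r0, 0", "xor r1, r1, 0"]  # hypothetical filler ops

def inject(instructions, rate, seed):
    """Return a copy of the instruction stream with dummy instructions
    injected at random points (a geometric number before each real one)."""
    rng = random.Random(seed)
    out = []
    for ins in instructions:
        while rng.random() < rate:
            out.append(rng.choice(DUMMIES))
        out.append(ins)
    return out

prog = ["ld", "xor", "st"]
obf = inject(prog, rate=0.5, seed=42)
```

Filtering the dummies back out recovers the original program, i.e., the transformation preserves functional behavior while stretching and shifting the power trace.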

Journal ArticleDOI
TL;DR: This work provides a general architecture of micro-solar power systems---comprising key components and interconnections among the components---and formalize each component in an analytical or empirical model of its behavior, to provide more accurate estimations of solar radiation and close the loop for micro-Solar power system modeling.
Abstract: Micro-solar power system design is challenging because it must address long-term system behavior under highly variable solar energy conditions and consider a large space of design options. Several micro-solar power systems and models have been made, validating particular points in the whole design space. We provide a general architecture of micro-solar power systems---comprising key components and interconnections among the components---and formalize each component in an analytical or empirical model of its behavior. To model the variability of solar energy, we provide three solar radiation models, depending on the degree of information available: an astronomical model for ideal conditions, an obstructed astronomical model for estimating solar radiation under the presence of shadows and obstructions, and a weather-effect model for estimating solar radiation under weather variation. Our solar radiation models are validated with a concrete design, the HydroWatch node, thus achieving small deviation from the long-term measurement. They can be used in combination with other micro-solar system models to improve the utility of the load and estimate the behavior of micro-solar power systems more accurately. Thus, our solar radiation models provide more accurate estimations of solar radiation and close the loop for micro-solar power system modeling.
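The first two radiation models can be hedged into a toy form as follows; the solar constant is standard, but the transmittance value, the air-mass attenuation formula, and the shadow handling below are simplifying assumptions of ours rather than the article's models:

```python
import math

SOLAR_CONSTANT = 1361.0  # W/m^2, extraterrestrial irradiance

def ideal_irradiance(elevation_deg, atmos_transmittance=0.7):
    """Astronomical (clear-sky) model sketch: irradiance on a horizontal
    surface from solar elevation; zero when the sun is below the horizon.
    The transmittance and air-mass term are illustrative assumptions."""
    if elevation_deg <= 0:
        return 0.0
    sin_el = math.sin(math.radians(elevation_deg))
    # Crude air-mass attenuation: transmittance ** (1 / sin(elevation)).
    return SOLAR_CONSTANT * sin_el * atmos_transmittance ** (1.0 / sin_el)

def obstructed(irradiance, shadow_fraction):
    """Obstructed model sketch: scale by the unshadowed fraction."""
    return irradiance * (1.0 - shadow_fraction)
```

A weather-effect model would further scale these values by an empirical cloud-cover factor, closing the loop from astronomy to measured harvest.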

Journal ArticleDOI
TL;DR: The LEAP design approach is described, in which the system is able to adaptively select the most energy-efficient hardware components matching an application’s needs, and it is demonstrated that by exploiting high energy-efficiency components and enabling proper on-demand scheduling, the LEAP architecture may meet both sensing performance and energy dissipation objectives for a broad class of applications.
Abstract: A broad range of embedded networked sensing (ENS) applications have appeared for large-scale systems, introducing new requirements that lead to new embedded architectures, associated algorithms, and supporting software systems. These new requirements include the need for diverse and complex sensor systems that present demands for energy and computational resources, as well as for broadband communication. To satisfy application demands while maintaining critical support for low-energy operation, a new multiprocessor node hardware and software architecture, Low Power Energy Aware Processing (LEAP), has been developed. In this article, we describe the LEAP design approach, in which the system adaptively selects the most energy-efficient hardware components matching an application’s needs. The LEAP platform supports highly dynamic requirements in sensing fidelity, computational load, storage media, and network bandwidth. It focuses on episodic operation of each component and considers the energy dissipation of each platform task by integrating fine-grained energy-dissipation monitoring and sophisticated power-control scheduling for all subsystems, including sensors. In addition to the LEAP platform’s unique hardware capabilities, its software architecture has been designed to provide an easy-to-use power management interface and a robust, fault-tolerant operating environment, and to enable remote upgrade of all software components. LEAP platform capabilities are demonstrated by example implementations, such as a network protocol design and a light source event detection algorithm. Through the use of a distributed node testbed, we demonstrate that by exploiting highly energy-efficient components and enabling proper on-demand scheduling, the LEAP architecture may meet both sensing performance and energy dissipation objectives for a broad class of applications.
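The component-selection idea can be sketched as a table lookup: pick the lowest-power subsystem whose capability covers the task at hand. The component names and numbers below are invented for illustration and are not LEAP's actual hardware table:

```python
# Hypothetical component table: (name, capability units, power in mW).
COMPONENTS = [
    ("low-power MCU", 1, 5.0),
    ("DSP", 10, 60.0),
    ("applications processor", 100, 400.0),
]

def select_component(required_capability):
    """LEAP-style selection sketch: among components able to meet the
    task's demand, choose the one with the lowest power draw."""
    feasible = [c for c in COMPONENTS if c[1] >= required_capability]
    if not feasible:
        return None
    return min(feasible, key=lambda c: c[2])[0]
```

A scheduler built on this idea would also power-gate the components not selected, which is where the episodic-operation savings come from.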

Journal ArticleDOI
TL;DR: Results show that with only randomly generated events, the design can effectively localize nodes with great flexibility while adding little extra cost at the resource constrained sensor node side.
Abstract: Event-driven localization has been proposed as a low-cost solution for node positioning in wireless sensor networks. In order to eliminate the costly requirement for accurate event control in existing methods, we present a practical design using uncontrolled events. The main idea is to estimate both the event generation parameters and the locations of sensor nodes simultaneously, by processing node sequences that can be easily obtained from event detections. Beyond the basic design, we propose two enhancements that further extract the information embedded in node orderings for two scenarios: (i) node density is high; and (ii) abundant events are available. To demonstrate the generality of our design, both straight-line scan and circular wave propagation events are addressed in the article, and we evaluate the design with extensive simulation as well as a testbed implementation with 41 MICAz motes. Results show that with only randomly generated events, our design can effectively localize nodes with great flexibility while adding little extra cost at the resource-constrained sensor node side. In addition, localization via uncontrolled events offers a potential path to achieving node positioning through long-term ambient events.
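For the straight-line scan case, the key observation is that a node sequence is just the ordering of nodes by their projection onto the scan direction; a candidate layout can then be scored by how well its predicted orderings match the observed ones. A toy sketch (function names and the pairwise scoring are our assumptions):

```python
def detection_sequence(nodes, direction):
    """Predicted node sequence for a straight-line scan event: nodes are
    detected in order of their projection onto the scan direction."""
    dx, dy = direction
    return sorted(nodes, key=lambda n: nodes[n][0] * dx + nodes[n][1] * dy)

def sequence_match(observed, predicted):
    """Fraction of node pairs ordered the same way in both sequences;
    1.0 means the candidate layout fully explains the observation."""
    pos = {n: i for i, n in enumerate(predicted)}
    pairs = [(a, b) for i, a in enumerate(observed) for b in observed[i + 1:]]
    good = sum(1 for a, b in pairs if pos[a] < pos[b])
    return good / len(pairs)

layout = {"A": (0, 0), "B": (1, 0), "C": (2, 1)}
seq = detection_sequence(layout, (1, 0))  # scan sweeping in +x
```

Estimating both the unknown scan direction and the node positions amounts to searching for the combination that maximizes this match score across many events.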

Journal ArticleDOI
TL;DR: This work proposes an algorithm that combines exploration of the system trajectories with state-space reduction using merging based on a bisimulation metric, and establishes a procedure that allows unbounded safety to be proved from the result of the bounded safety algorithm via a refinement step.
Abstract: We consider verification problems for transition systems enriched with a metric structure. We believe that these metric transition systems are particularly suitable for the analysis of cyber-physical systems in which metrics can be naturally defined on the numerical variables of the embedded software and on the continuous states of the physical environment. We consider verification of bounded and unbounded safety properties, as well as bounded liveness properties. The transition systems we consider are nondeterministic, finitely branching, and with a finite set of initial states. Therefore, bounded safety/liveness properties can always be verified by exhaustive exploration of the system trajectories. However, this approach may be intractable in practice, as the number of trajectories usually grows exponentially with respect to the considered bound. Furthermore, since the system we consider can have an infinite set of states, exhaustive exploration cannot be used for unbounded safety verification. For bounded safety properties, we propose an algorithm which combines exploration of the system trajectories and state space reduction using merging based on a bisimulation metric. The main novelty compared to an algorithm presented recently by Lerda et al. [2008] consists in introducing a tuning parameter that improves the performance drastically. We also establish a procedure that allows us to prove unbounded safety from the result of the bounded safety algorithm via a refinement step. We then adapt the algorithm to handle bounded liveness verification. Finally, the effectiveness of the approach is demonstrated by applying it to the analysis of implementations of an embedded control loop.
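The merging idea can be sketched as follows: during bounded exploration, a successor state lying within eps of an already-kept state (under the metric) is discarded, which collapses the exponential trajectory tree at the price of eps-coarse reachability. The 1-D system below is our own toy example, not one from the article:

```python
def explore(init_states, step, dist, eps, depth):
    """Bounded exploration with metric merging: keep a new state only if
    it is farther than eps from every state kept so far. The tuning
    parameter eps trades precision for fewer explored trajectories."""
    kept = list(init_states)
    frontier = list(init_states)
    for _ in range(depth):
        nxt = []
        for s in frontier:
            for t in step(s):
                if all(dist(t, k) > eps for k in kept):
                    kept.append(t)
                    nxt.append(t)
        frontier = nxt
    return kept

# Toy nondeterministic 1-D system: x' in {0.5*x, 0.5*x + 0.01}.
step = lambda x: [0.5 * x, 0.5 * x + 0.01]
dist = lambda a, b: abs(a - b)
states = explore([1.0], step, dist, eps=0.05, depth=6)
```

Exhaustive exploration of this system to depth 6 would visit 2^6 = 64 leaf trajectories; with eps = 0.05 the merged exploration keeps only a handful of representative states.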

Journal ArticleDOI
TL;DR: The main result is to prove that basic blocks, and thus program points, can be totally ordered so that live-ranges of variables correspond to intervals on a line, a result that holds for both variants of SSI form.
Abstract: The static single information (SSI) form is an extension of the static single assignment (SSA) form, a well-established compiler intermediate representation that has been successfully used for numerous compiler analysis and optimizations. Several interesting results have also been shown for SSI form concerning liveness analysis and the representation of live-ranges of variables, which could make SSI form appealing for just-in-time compilation. Unfortunately, we have uncovered several mistakes in the previous literature on SSI form, which, admittedly, is already quite sparse. This article corrects the mistakes that are most germane to SSI form. We first explain why the two definitions of SSI form proposed in past literature, first by C. S. Ananian, then by J. Singer, are not equivalent. Our main result is then to prove that basic blocks, and thus program points, can be totally ordered so that live-ranges of variables correspond to intervals on a line, a result that holds for both variants of SSI form. In other words, in SSI form, the intersection graph defined by live-ranges is an interval graph, a stronger structural property than for SSA form for which the intersection graph of live-ranges is chordal. Finally, we show how this structure of live-ranges can be used to simplify liveness analysis.
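One practical payoff of interval live-ranges is that liveness questions reduce to endpoint scans: for an interval graph, the chromatic number (here, the minimum number of registers) equals the maximum number of simultaneously live variables. A sketch under half-open interval semantics (function name and example data are ours):

```python
def min_registers(intervals):
    """For interval live-ranges, a single sweep over sorted endpoints
    gives the maximum number of simultaneously live variables, which for
    an interval graph equals the minimum number of colors (registers)."""
    events = []
    for start, end in intervals:
        events.append((start, 1))    # variable becomes live
        events.append((end, -1))     # variable dies
    live = peak = 0
    for _, delta in sorted(events):  # at equal points, deaths sort first
        live += delta
        peak = max(peak, live)
    return peak

regs = min_registers([(0, 4), (1, 3), (2, 6), (5, 8)])
```

For general programs (chordal graphs, as in SSA form) the same bound holds but requires a traversal of the dominance tree rather than a simple linear sweep.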

Journal ArticleDOI
TL;DR: This article presents three improvements to save energy while performing the computation on the mobile system: selective loading, adaptive loading, and caching features in memory, and investigates if energy can be saved by migrating parts of the computation to a server, called computation offloading.
Abstract: Mobile systems such as PDAs and cell phones play an increasing role in handling visual contents such as images. Thousands of images can be stored in a mobile system with the advances in storage technology: this creates the need for better organization and retrieval of these images. Content Based Image Retrieval (CBIR) is a method to retrieve images based on their visual contents. In CBIR, images are compared by matching their numerical representations called features; CBIR is computation and memory intensive and consumes significant amounts of energy. This article examines energy conservation for CBIR on mobile systems. We present three improvements to save energy while performing the computation on the mobile system: selective loading, adaptive loading, and caching features in memory. Using these improvements adaptively reduces the features to be loaded into memory for each search. The reduction is achieved by estimating the difficulty of the search. If the images in the collection are dissimilar, fewer features are sufficient; less computation is performed and energy can be saved. We also consider the effect of consecutive user queries and show how features can be cached in memory to save energy. We implement a CBIR algorithm on an HP iPAQ hw6945 and show that these improvements can save energy and allow CBIR to scale up to 50,000 images on a mobile system. We further investigate if energy can be saved by migrating parts of the computation to a server, called computation offloading. We analyze the impact of the wireless bandwidth, server speed, number of indexed images, and the number of image queries on the energy consumption. Using our scheme, CBIR can be made energy efficient under all conditions.
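The offloading decision in this setting boils down to an energy comparison: shipping the query data and idling while the server computes, versus computing locally. The power figures, speeds, and default parameters below are illustrative assumptions only, not measurements from the iPAQ study:

```python
def should_offload(cycles, data_bytes, bandwidth_bps,
                   p_compute=0.8, p_radio=1.2, p_idle=0.15,
                   local_ips=2e8, server_speedup=10.0):
    """Offloading tradeoff sketch (constants are assumptions): offload
    when the radio energy to ship the data plus the idle energy spent
    waiting for the server is below the energy of computing locally."""
    e_local = p_compute * (cycles / local_ips)          # joules, local run
    t_tx = 8.0 * data_bytes / bandwidth_bps             # transmit time
    t_server = cycles / (local_ips * server_speedup)    # server compute time
    e_offload = p_radio * t_tx + p_idle * t_server
    return e_offload < e_local

offload_heavy = should_offload(cycles=1e10, data_bytes=1e4, bandwidth_bps=1e6)
offload_chatty = should_offload(cycles=1e6, data_bytes=1e7, bandwidth_bps=1e6)
```

The pattern matches the abstract's analysis: compute-heavy queries with small payloads favor the server, while data-heavy queries over slow links favor local search.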