
Showing papers in "Eurasip Journal on Embedded Systems in 2017"


Journal ArticleDOI
TL;DR: This article presents data collection, analysis, and monitoring software for a reference smart grid and discusses two possible architectures for collecting data from energy analyzers and their performance with respect to real-time monitoring, load peak analysis, and automated regulation of the power grid.
Abstract: Smart grid, smart metering, electromobility, and the regulation of the power network are keywords of the transition in energy politics. In the future, the power grid will be smart. Based on different works, this article presents data collection, analysis, and monitoring software for a reference smart grid. We discuss two possible architectures for collecting data from energy analyzers and analyze their performance with respect to real-time monitoring, load peak analysis, and automated regulation of the power grid. In the first architecture, we analyze the latency, needed bandwidth, and scalability for collecting data over the Modbus TCP/IP protocol, and in the second one over a RESTful web service. The analysis results show that the solution with Modbus is more scalable than the one with the RESTful web service. However, the performance and scalability of both architectures are sufficient for our reference smart grid and use cases.

26 citations
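As an aside on the first architecture: Modbus TCP requests are compact, fixed-layout binary frames, which is one reason polling many energy analyzers scales well. A minimal sketch (not code from the paper; the transaction ID, unit ID, and register values are arbitrary examples) of how a "read holding registers" request is assembled:

```python
import struct

def modbus_read_holding_registers(transaction_id: int, unit_id: int,
                                  start_addr: int, count: int) -> bytes:
    """Build a Modbus TCP request for function code 0x03 (read holding registers)."""
    # PDU: function code (1 byte) + start address (2 bytes) + register count (2 bytes)
    pdu = struct.pack(">BHH", 0x03, start_addr, count)
    # MBAP header: transaction ID, protocol ID (0), remaining length, unit ID
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu

frame = modbus_read_holding_registers(1, 0x11, 0x006B, 3)  # 12-byte request
```

The whole request fits in 12 bytes, against the considerably larger headers of an HTTP request in the RESTful architecture.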


Journal ArticleDOI
TL;DR: This research work introduces an efficient reconfigurable hardware architecture for principal component analysis (PCA), a widely used dimensionality reduction technique in data mining, with generic, parameterized, and scalable hardware designs.
Abstract: With the advancement of mobile and embedded devices, many applications such as data mining have found their way into these devices. These devices come with various design constraints, including stringent area and power limitations, high speed-performance, reduced cost, and time-to-market requirements. Also, applications running on mobile devices are becoming more complex, requiring significant processing power. Our previous analysis illustrated that FPGA-based dynamic reconfigurable systems are currently the best avenue to overcome these challenges. In this research work, we introduce an efficient reconfigurable hardware architecture for principal component analysis (PCA), a widely used dimensionality reduction technique in data mining. For mobile applications such as signature verification and handwritten analysis, PCA is applied initially to reduce the dimensionality of the data, followed by a similarity measure. Experiments are performed, using a handwritten analysis application together with a benchmark dataset, to evaluate and illustrate the feasibility, efficiency, and flexibility of reconfigurable hardware for data mining applications. Our hardware designs are generic, parameterized, and scalable. Furthermore, our partial and dynamic reconfigurable hardware design achieved a 79-times speedup compared to its software counterpart, and a 71% space saving compared to its static reconfigurable hardware design.

26 citations
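The PCA step that the paper accelerates in hardware projects feature vectors onto a few principal components before the similarity measure. The underlying computation, sketched in software for reference (the data here are random placeholders; the paper's contribution is the reconfigurable hardware design, not this math):

```python
import numpy as np

def pca_reduce(X: np.ndarray, k: int) -> np.ndarray:
    """Project the samples (rows of X) onto the top-k principal components."""
    Xc = X - X.mean(axis=0)              # center each feature
    # SVD of the centered data; the rows of Vt are the principal directions
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                 # scores in the reduced space

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))            # e.g. 8 handwriting features per sample
Z = pca_reduce(X, 2)                     # reduced to 2 dimensions
```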


Journal ArticleDOI
TL;DR: This paper presents an approach to use an artificial DNA for self-organization in embedded systems, and presents a prototypic implementation and conducts a real-time evaluation using a flexible robot vehicle.
Abstract: Embedded systems are growing more and more complex because of the increasing chip integration density, larger number of chips in distributed applications, and demanding application fields (e.g., in cars and in households). Bio-inspired techniques like self-organization are a key feature to handle this complexity. However, self-organization needs a guideline for setting up and managing the system. In biology the structure and organization of a system is coded in its DNA. In this paper we present an approach to use an artificial DNA for that purpose. Since many embedded systems can be composed from a limited number of basic elements, the structure and parameters of such systems can be stored in a compact way representing an artificial DNA deposited in each processor core. This leads to a self-describing system. Based on the DNA, the self-organization mechanisms can build the system autonomously providing a self-building system. System repair and optimization at runtime are also possible, leading to higher robustness, dependability, and flexibility. We present a prototypic implementation and conduct a real-time evaluation using a flexible robot vehicle. Depending on the DNA, this vehicle acts as a self-balancing vehicle, an autonomous guided vehicle, a follower, or a combination of these.

16 citations


Journal ArticleDOI
TL;DR: A sleep stage detection algorithm is proposed that uses only the heart rate signal, derived from electrocardiogram (ECG), as a discriminator, which would make it possible for sleep analysis to be performed at home, saving a lot of effort and money.
Abstract: To evaluate the quality of sleep, it is important to determine how much time was spent in each sleep stage during the night. The gold standard in this domain is overnight polysomnography (PSG). But the recording of the necessary electrophysiological signals is extensive and complex, and the environment of the sleep laboratory, which is unfamiliar to the patient, might lead to distorted results. In this paper, a sleep stage detection algorithm is proposed that uses only the heart rate signal, derived from the electrocardiogram (ECG), as a discriminator. This would make it possible for sleep analysis to be performed at home, saving a lot of effort and money. From the heart rate, three parameters were calculated using the fast Fourier transform (FFT) in order to distinguish between the different sleep stages. ECG data, along with hypnograms scored by professionals, was taken from the PhysioNet database, making it easy to compare the results. With an agreement rate of 41.3%, this approach is a good foundation for future research.

16 citations
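The abstract does not define the three FFT-derived parameters. One plausible example of such a parameter, shown purely as an illustration, is the spectral power of the heart-rate signal in a frequency band; the 0.15–0.40 Hz and 0.04–0.15 Hz bands below are the standard HF/LF bands of heart-rate-variability analysis, and the 4 Hz resampling rate is an assumption:

```python
import numpy as np

def band_power(hr: np.ndarray, fs: float, lo: float, hi: float) -> float:
    """Power of the heart-rate signal in the [lo, hi) Hz band via the FFT."""
    hr = hr - hr.mean()                       # remove the DC component
    spectrum = np.abs(np.fft.rfft(hr)) ** 2   # one-sided power spectrum
    freqs = np.fft.rfftfreq(len(hr), d=1.0 / fs)
    band = (freqs >= lo) & (freqs < hi)
    return float(spectrum[band].sum() / len(hr))

fs = 4.0                                      # heart rate resampled at 4 Hz (assumption)
t = np.arange(0, 300, 1 / fs)                 # 5 minutes of data
hr = 60 + 2 * np.sin(2 * np.pi * 0.25 * t)    # synthetic 0.25 Hz oscillation
hf = band_power(hr, fs, 0.15, 0.40)           # high-frequency band
lf = band_power(hr, fs, 0.04, 0.15)           # low-frequency band
```

A ratio such as lf/hf would then serve as one discriminating feature between sleep stages.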


Journal ArticleDOI
TL;DR: An intelligent wireless communication system aiming at implementing an adaptive OFDM-based transmitter and performing a vertical handover in heterogeneous networks is presented, and a unified physical layer for WIFI-WIMAX networks is proposed.
Abstract: Today, wireless devices generally feature multiple radio access technologies (LTE, WIFI, WIMAX, ...) to handle a rich variety of standards or technologies. These devices should be intelligent and autonomous enough either to reach a given level of performance or to automatically select the best available wireless technology according to standards availability. On the hardware side, system on chip (SoC) devices integrate processors and field-programmable gate array (FPGA) logic fabrics on the same chip with fast interconnection. This allows designing software/hardware systems and implementing new techniques and methodologies that greatly improve the performance of communication systems. In these devices, dynamic partial reconfiguration (DPR) constitutes a well-known technique for reconfiguring only a specific area within the FPGA while other parts continue to operate independently. To evaluate when it is advantageous to perform DPR, adaptive techniques have been proposed. They consist in reconfiguring parts of the system automatically according to specific parameters. In this paper, an intelligent wireless communication system aiming at implementing an adaptive OFDM-based transmitter and performing a vertical handover in heterogeneous networks is presented. A unified physical layer for WIFI-WIMAX networks is also proposed. The system was implemented and tested on a ZedBoard, which features a Xilinx Zynq-7000 SoC. The performance of the system is described, and simulation results are presented in order to validate the proposed architecture.

12 citations


Journal ArticleDOI
TL;DR: This is the first attempt where a NN is used as a vehicle to model smartphone power consumption, and the results obtained demonstrate that NN models can provide reasonably accurate estimates; therefore, further investigation of their use in this modeling problem is justified.
Abstract: In the work presented in this paper, we use data collected from mobile users over several weeks to develop a neural network-based prediction model for the power consumed by a smartphone. Battery life is critical to the designers of smartphones, and being able to assess scenarios of power consumption, and hence energy usage, is of great value. The models developed attempt to correlate power consumption with users’ behavior by using power-related data collected from smartphones with the help of a specially designed logging tool or application. Experiences gained while developing the model regarding the selection of input parameters, the identification of the most suitable NN (neural network) structure, and the training methodology applied are all described in this paper. To the best of our knowledge, this is the first attempt where a NN is used as a vehicle to model smartphone power consumption, and the results obtained demonstrate that NN models can provide reasonably accurate estimates; therefore, further investigation of their use in this modeling problem is justified.

12 citations
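The abstract does not disclose the chosen NN structure or input features. The general recipe — a small feed-forward network trained by gradient descent to regress power from usage features — can be sketched as follows, with entirely synthetic data and hypothetical feature names:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical features: screen brightness, CPU load, Wi-Fi throughput (normalized)
X = rng.uniform(size=(200, 3))
y = (0.5 * X[:, 0] + 1.2 * X[:, 1] + 0.3 * X[:, 2])[:, None]  # synthetic power (W)

# One hidden layer of tanh units, trained by plain gradient descent on MSE
W1 = rng.normal(scale=0.5, size=(3, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)

losses = []
lr = 0.1
for _ in range(500):
    h = np.tanh(X @ W1 + b1)          # hidden activations
    pred = h @ W2 + b2                # predicted power
    err = pred - y
    losses.append(float((err ** 2).mean()))
    # Backpropagate the mean-squared-error gradient
    g2 = 2 * err / len(X)
    g1 = (g2 @ W2.T) * (1 - h ** 2)   # tanh derivative
    W2 -= lr * h.T @ g2; b2 -= lr * g2.sum(axis=0)
    W1 -= lr * X.T @ g1; b1 -= lr * g1.sum(axis=0)
```

In the paper, real logged data replaces these synthetic features, and the structure and training methodology are selected experimentally.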


Journal ArticleDOI
TL;DR: This paper introduces a novel embedded hardware-software solution aimed at emulating a wide spectrum of energy sources usually exploited to power sensor network motes, consisting of a modular architecture featuring a small form factor, low power requirements, and limited cost.
Abstract: The capability to either minimize energy consumption in battery-operated devices, or to adequately exploit energy harvesting from various ambient sources, is central to the development and engineering of energy-neutral wireless sensor networks. However, the design of effective networked embedded systems targeting unlimited lifetime poses several challenges at different architectural levels. In particular, the heterogeneity, variability, and unpredictability of many energy sources, combined with changes in the energy required by powered devices, make it difficult to obtain reproducible testing conditions, thus prompting the need for novel solutions addressing these issues. This paper introduces a novel embedded hardware-software solution aimed at emulating a wide spectrum of energy sources usually exploited to power sensor network motes. The proposed system consists of a modular architecture featuring a small form factor, low power requirements, and limited cost. An extensive experimental characterization confirms the validity of the embedded emulator in terms of flexibility, accuracy, and latency, while a case study on the emulation of a lithium battery shows that the hardware-software platform does not introduce any measurable reduction in the accuracy of the model. The presented platform is therefore a convenient solution for testing large-scale testbeds under realistic energy supply scenarios for wireless sensor networks.

11 citations


Journal ArticleDOI
TL;DR: A novel resource allocation approach dedicated to hard real-time systems with distinctive operational modes is proposed to reduce the energy dissipation of the computing cores by either powering them off or switching them into energy-saving states while still guaranteeing to meet all timing constraints.
Abstract: In this paper, a novel resource allocation approach dedicated to hard real-time systems with distinctive operational modes is proposed. The aim of this approach is to reduce the energy dissipation of the computing cores by either powering them off or switching them into energy-saving states while still guaranteeing to meet all timing constraints. The approach is illustrated with two industrial applications, an engine control management and an engine control unit. Moreover, the amount of data to be migrated during the mode change is minimised. Since the number of processing cores and their energy dissipation are often negatively correlated with the amount of data to be migrated during the mode change, there is some trade-off between these values, which is also analysed in this paper.

10 citations


Journal ArticleDOI
TL;DR: This work proposes a methodology for the development and validation of an embedded multiprocessor system that assumes the use of a portable, open source API to support the parallelization and the possibility of prototyping the system on a field-programmable gate array.
Abstract: In recent years, the use of multiprocessor systems has become increasingly common. Even in the embedded domain, the development of platforms based on multiprocessor systems or the porting of legacy single-core applications are frequent needs. However, such designs are often complicated, as embedded systems are characterized by numerous non-functional requirements and a tight hardware/software integration. This work proposes a methodology for the development and validation of an embedded multiprocessor system. Specifically, the proposed method assumes the use of a portable, open source API to support the parallelization and the possibility of prototyping the system on a field-programmable gate array. On this basis, the proposed flow allows an early exploration of the hardware configuration space, a preliminary estimate of performance, and the rapid development of a system able to satisfy the design specifications. An accurate assessment of the actual performance of the system is then enforced by the use of a hardware-based profiling subsystem. The proposed design flow is described, and a version specifically designed for the LEON3 processor is presented and validated. The application of the proposed methodology in a real industrial case study is then presented and analyzed.

10 citations


Journal ArticleDOI
TL;DR: Simulation results show that the proposed precoding design for a massive MIMO system with limited feedback via minimizing mean square error (MSE) is robust to the channel uncertainties caused by quantization errors.
Abstract: Compared with traditional multiple-input multiple-output (MIMO) systems, the large number of transmit antennas in massive MIMO makes it more dependent on limited feedback in practical systems. In this paper, we study the problem of precoding design for a massive MIMO system with limited feedback via minimizing the mean square error (MSE). We first consider the feedback from mobile users to the base station (BS), through which the BS obtains quantized information regarding the direction of the channels. Then, the precoding is designed by considering the effect of both the noise term and the quantization error under a transmit power constraint. Simulation results show that the proposed scheme is robust to the channel uncertainties caused by quantization errors.

9 citations
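The abstract does not state the precoder in closed form. A common MSE-minimizing design for the downlink, the regularized zero-forcing precoder, is sketched here as an illustration, assuming K single-antenna users and perfect (unquantized) channel knowledge — the paper's contribution is precisely to account for the quantization error this sketch ignores:

```python
import numpy as np

def mmse_precoder(H: np.ndarray, snr: float) -> np.ndarray:
    """Regularized zero-forcing (MMSE) precoder for a channel H of shape (users, antennas)."""
    K = H.shape[0]
    # W = H^H (H H^H + (K/snr) I)^{-1}, then normalized to unit transmit power
    W = H.conj().T @ np.linalg.inv(H @ H.conj().T + (K / snr) * np.eye(K))
    return W / np.linalg.norm(W)

rng = np.random.default_rng(2)
K, M = 4, 64                                  # 4 users, 64 BS antennas (massive MIMO)
H = (rng.normal(size=(K, M)) + 1j * rng.normal(size=(K, M))) / np.sqrt(2)
W = mmse_precoder(H, snr=10.0)
power = float(np.linalg.norm(W))              # transmit power after normalization
```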


Journal ArticleDOI
Ismat Chaib Draa, Smail Niar, Jamel Tayeb, Emmanuelle Grislin, Mikael Desertot
TL;DR: A tool to analyze user/application interaction to understand how the different hardware components are used at run-time and optimize them, using machine learning methods to identify and classify user behaviors and habit information is proposed.
Abstract: Optimizing energy consumption in modern mobile handheld devices plays a very important role, as lowering energy consumption impacts battery life and system reliability. With next-generation smartphones and tablets, the number of sensors and communication tools will increase, and more and more communication interfaces and protocols such as Wi-Fi, Bluetooth, GPRS, UMTS, and LTE will be incorporated. Consequently, the fraction of energy consumed by these components will be larger. Nevertheless, the large amount of data from the different sensors can be used beneficially to detect the changing user context, to understand habits, and to detect running application needs. All this information, when used properly, may lead to efficient energy consumption control. This paper proposes a tool to analyze user/application interaction in order to understand how the different hardware components are used at run-time and to optimize them. The idea here is to use machine learning methods to identify and classify user behaviors and habit information. Using this tool, software has been developed to control at run-time the system component activities that have a high impact on energy consumption. The tool also allows predicting future application usage. In this way, screen brightness, CPU frequency, Wi-Fi connectivity, and playback sound level can be optimized while meeting the application and user requirements. Our experimental results show that the proposed solution can lower the energy consumption by up to 30% versus the out-of-the-box power governor, while maintaining a negligible system overhead.

Journal ArticleDOI
TL;DR: The objective of this work is to provide an alternative scheduling technique that takes advantage of the semi-partitioned properties to accommodate fork-join tasks that cannot be scheduled in any pure partitioned environment and to reduce the migration overheads.
Abstract: This paper extends the work presented in Maia et al. (Semi-partitioned scheduling of fork-join tasks using work-stealing, 2015), where we address the semi-partitioned scheduling of real-time fork-join tasks on multicore platforms. The proposed approach consists of two phases: an offline phase, where we adopt a multi-frame task model to perform the task-to-core mapping so as to improve the schedulability and the performance of the system, and an online phase, where we use the work-stealing algorithm to exploit tasks’ parallelism among cores with the aim of improving the system responsiveness. The objective of this work is twofold: (1) to provide an alternative scheduling technique that takes advantage of the semi-partitioned properties to accommodate fork-join tasks that cannot be scheduled in any pure partitioned environment and (2) to reduce the migration overheads, which have been shown to be a major source of non-determinism for global scheduling approaches. In this paper, we consider different allocation heuristics and evaluate the behavior of two of them when integrated within our approach. The simulation results show an improvement of up to 15% of the proposed heuristic over the state-of-the-art in terms of the average response time per task set.
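The work-stealing principle used in the online phase can be illustrated with a toy single-threaded model (a sketch of the idea only, not the paper's scheduler): each core owns a double-ended queue, pops work locally from one end, and steals from the opposite end of a random victim when idle:

```python
from collections import deque
import random

class Worker:
    """Each core keeps a double-ended queue of ready subtasks."""
    def __init__(self, tasks):
        self.deque = deque(tasks)

    def pop_local(self):
        # The owner takes work from the bottom (LIFO: good cache locality)
        return self.deque.pop() if self.deque else None

    def steal(self):
        # A thief takes from the top (FIFO: the oldest, typically largest, task)
        return self.deque.popleft() if self.deque else None

def run(workers):
    random.seed(0)
    done = []
    while any(w.deque for w in workers):
        for w in workers:
            task = w.pop_local()
            if task is None:  # local deque empty: steal from a random victim
                victims = [v for v in workers if v is not w and v.deque]
                task = random.choice(victims).steal() if victims else None
            if task is not None:
                done.append(task)
    return done

workers = [Worker(range(0, 4)), Worker(range(4, 8)), Worker([])]
finished = run(workers)   # the initially idle worker finishes work it stole
```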

Journal ArticleDOI
TL;DR: This paper proposes a novel Pareto-based scheduler for identifying near-optimal resource allocations for user workloads with respect to performance and monetary cost, and develops an automatic configuration of basic tasks’ parameters that allows to further minimize the user’s spending budget and the jobs’ execution times.
Abstract: In recent years, we are observing an increased demand for processing large amounts of data. The MapReduce programming model has been utilized by major computing companies and has been integrated into novel cyber physical systems (CPS) in order to perform large-scale data processing. However, the problem of efficiently scheduling MapReduce workloads in cluster environments, like Amazon’s EC2, can be challenging due to the observed trade-off between the need for performance and the corresponding monetary cost. The problem is exacerbated by the fact that cloud providers tend to charge users based on their I/O operations, increasing dramatically the spending budget. In this paper, we describe our approach for scheduling MapReduce workloads in cluster environments taking into consideration the performance/budget trade-off. Our approach makes the following contributions: (i) we propose a novel Pareto-based scheduler for identifying near-optimal resource allocations for user workloads with respect to performance and monetary cost, and (ii) we develop an automatic configuration of basic tasks’ parameters that allows us to further minimize the user’s spending budget and the jobs’ execution times. Our detailed experimental evaluation using both real and synthetic datasets illustrates that our approach improves the performance of the workloads by as much as 50% compared to its competitors.
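The core of a Pareto-based scheduler is retaining only the non-dominated (time, cost) allocations. A minimal sketch with hypothetical candidate allocations (the paper's actual search and cost model are more elaborate):

```python
def pareto_front(points):
    """Keep allocations not dominated in both execution time and monetary cost."""
    front = []
    for p in points:
        dominated = any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)
        if not dominated:
            front.append(p)
    return front

# Hypothetical (execution time in s, cost in $) per candidate resource allocation
candidates = [(120, 4.0), (90, 6.0), (90, 5.0), (200, 2.0), (150, 3.5)]
best = pareto_front(candidates)   # (90, 6.0) is dominated by (90, 5.0)
```

The scheduler would then pick from `best` according to the user's performance/budget preference.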

Journal ArticleDOI
TL;DR: The results show that the proposed method can obtain a shorter transmission delay and ensure a higher transmission success rate; the theoretical analysis herein proves the validity of the method.
Abstract: Because traditional vehicular network communication architecture is based on dedicated short-range communication, it is difficult to meet the quality-of-service demands of vehicular data transmission. The relevant data can be uploaded to the server through a mobile gateway and then transmitted to the target vehicle at the server’s decision, which extends the data broadcast domain and greatly reduces the data transmission delay. Combining the idea of mobile cloud services, a new network architecture and data transmission method is proposed in this paper. We first describe the specific process by which a gateway registers its service information with the cloud. Secondly, we propose a method to select the cloud service gateway. The method combines historical cloud data and real-time data and dynamically determines the gateway service provider and its service scope. Service consumers learn of gateways through broadcast messages; they consider the communication load, link stability, channel quality, and other performance parameters to select the best gateway service provider, and then transmit the data to that provider for upload to its cloud. Finally, the transmission performance of the proposed method is evaluated for different traffic scenarios. The results show that the proposed method can obtain a shorter transmission delay and ensure a higher transmission success rate; the theoretical analysis herein proves the validity of the method.

Journal ArticleDOI
TL;DR: This paper addresses the MBPTA representativeness problems caused by set-associative caches and presents a novelRepresentativeness validation method (ReVS) for cache placement that explores the probability and impact of those cache placements that can occur during operation.
Abstract: Measurement-Based Probabilistic Timing Analysis (MBPTA) has been shown to be an industrially viable method to estimate the Worst-Case Execution Time (WCET) of real-time programs running on processors including several high-performance features. MBPTA requires hardware/software support so that the program’s execution time, and hence its WCET, has a probabilistic behaviour and can be modelled with probabilistic and statistical methods. MBPTA also requires that those events with high impact on execution time are properly captured in the R runs made at analysis time. Thus, a representativeness argument is needed to provide evidence that those events have been captured. This paper addresses the MBPTA representativeness problems caused by set-associative caches and presents a novel representativeness validation method (ReVS) for cache placement. Building on cache simulation, ReVS explores the probability and impact (miss count) of those cache placements that can occur during operation. ReVS determines the number of runs R′, which can be higher than R, such that those cache placements with the highest impact are effectively observed in the analysis runs, and hence, MBPTA can be reliably applied to estimate the WCET.
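ReVS builds on cache simulation to score the impact (miss count) of candidate placements. The basic mechanic — replaying an address trace through an LRU set-associative cache — can be sketched as follows; the cache geometry and trace are arbitrary examples, not from the paper:

```python
class SetAssociativeCache:
    """LRU set-associative cache that reports hit/miss for each access."""
    def __init__(self, num_sets: int, ways: int, line_size: int):
        self.num_sets, self.ways, self.line_size = num_sets, ways, line_size
        self.sets = [[] for _ in range(num_sets)]  # each set is an LRU-ordered list

    def access(self, addr: int) -> bool:
        line = addr // self.line_size
        s = self.sets[line % self.num_sets]
        if line in s:                 # hit: move to most-recently-used position
            s.remove(line)
            s.append(line)
            return True
        if len(s) == self.ways:       # miss in a full set: evict the LRU line
            s.pop(0)
        s.append(line)
        return False

def miss_count(trace):
    cache = SetAssociativeCache(num_sets=4, ways=2, line_size=32)
    return sum(0 if cache.access(a) else 1 for a in trace)

# Three lines that all map to the same set thrash a 2-way set: every access misses
trace = [0, 128, 256, 0, 128, 256] * 2
misses = miss_count(trace)
```

This pathological placement is exactly the kind of high-impact event ReVS is designed to make observable in the analysis runs.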

Journal ArticleDOI
TL;DR: The article presents AMBER, an innovative embedded platform leveraging a design based on System-on-Modules (SOM) and Extender modules that allows a smooth industrial-oriented redesign of the embedded solution.
Abstract: The proliferation of low-cost embedded system platforms has allowed the creation of large communities of developers, as well as the development of new advanced applications. Even though some of these applications can be of industrial relevance, their immediate application to real products is not straightforward, since most of them require a complete and expensive hardware redesign of the considered embedded solution. To speed up the technological transfer of custom embedded solutions while overcoming the limits imposed by a complete hardware redesign, the article presents AMBER, an innovative embedded platform leveraging a design based on System-on-Modules (SOM) and Extender modules. AMBER decouples the processing part of the system, which is fully contained on the SOM, from the peripherals, which are contained on the main board and Extender modules. This allows a smooth industrial-oriented redesign of the embedded solution. In the article, AMBER is first presented starting from its philosophy and design choices while highlighting its main features. Then, an application of AMBER as an enhanced gateway for the Industrial Internet of Things (IIoT) scenario is reported, considering a monitoring and actuation use case. The IIoT-oriented AMBER solution is hardware- and software-configured to support real-time communications with actuators compliant with the Powerlink standard, as well as to interact with sensors compliant with Bluetooth Low Energy. Performance results show the effectiveness of the proposed solution in the selected industrial scenario while promoting a fast and immediate transfer to new embedded products targeted at IIoT applications.

Journal ArticleDOI
TL;DR: This paper proposes a generic solution to trace embedded heterogeneous systems and overcome the challenges brought by their peculiar architectures (little available memory, bare-metal CPUs, or exotic components for instance).
Abstract: Tracing is a common method used to debug, analyze, and monitor various systems. Even though standard tools and tracing methodologies exist for standard and distributed environments, it is not the case for heterogeneous embedded systems. This paper proposes to fill this gap and discusses how efficient tracing can be achieved without having common system tools, such as the Linux Trace Toolkit (LTTng), at hand on every core. We propose a generic solution to trace embedded heterogeneous systems and overcome the challenges brought by their peculiar architectures (little available memory, bare-metal CPUs, or exotic components for instance). The solution described in this paper focuses on a generic way of correlating traces among different kinds of processors through traces synchronization, to analyze the global state of the system as a whole. The proposed solution was first tested on the Adapteva Parallella board. It was then improved and thoroughly validated on TI’s Keystone 2 System-on-Chip (SoC).

Journal ArticleDOI
TL;DR: This work proposes an intelligible SLAM solution for an embedded processing platform to reduce computer processing time using a low-variance resampling technique and is able to recognise artificial landmarks in a real environment.
Abstract: The simultaneous localisation and mapping (SLAM) algorithm has drawn increasing interest in autonomous robotic systems. However, SLAM has not been widely explored in embedded system design spaces yet, due to the limited processing resources in embedded systems. Especially when landmarks are not identifiable, the amount of computer processing will dramatically increase due to unknown data association. In this work, we propose an intelligible SLAM solution for an embedded processing platform to reduce computer processing time using a low-variance resampling technique. Our prototype includes a low-cost Pixy camera, a robot kit with an L298N motor board, and a Raspberry Pi V2.0. Our prototype is able to recognise artificial landmarks in a real environment, identifying on average 75% of landmarks in corner and corridor detection while consuming only 1.14 W on average.
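Low-variance (systematic) resampling is a standard particle-filter component: instead of N independent random draws, it uses a single random offset and N evenly spaced pointers into the cumulative weights, which is cheap and well suited to embedded targets. A sketch under that standard formulation (the weights below are arbitrary):

```python
import random

def low_variance_resample(weights):
    """Systematic resampling: one random draw, then N evenly spaced pointers."""
    n = len(weights)
    step = sum(weights) / n
    r = random.uniform(0, step)        # the single random offset
    indices, cumulative, i = [], weights[0], 0
    for m in range(n):
        u = r + m * step               # evenly spaced sampling points
        while u > cumulative:          # advance to the particle covering u
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices

random.seed(3)
idx = low_variance_resample([0.1, 0.1, 0.7, 0.1])  # particle 2 dominates
```

The heavy particle (weight 0.7) is duplicated while light particles are dropped, using only one call to the random number generator.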

Journal ArticleDOI
TL;DR: The purpose of the paper is to present an efficient quantization and fixed-point representation for turbo-detection and turbo-demapping and the impact of floating-to-fixed-point conversion is illustrated upon the error-rate performance of the receiver for different system configurations.
Abstract: In the domain of wireless digital communication, floating-point arithmetic is generally used to conduct performance evaluation studies of algorithms. This is typically limited to theoretical performance evaluation in terms of communication quality and error rates. From a practical implementation perspective, using fixed-point arithmetic instead of floating-point significantly reduces implementation costs in terms of area occupation and energy consumption. However, this implies a complex conversion process, particularly if the considered algorithm includes complex arithmetic operations with high accuracy requirements and if the target system presents many configuration parameters. In this context, the purpose of the paper is to present an efficient quantization and fixed-point representation for turbo-detection and turbo-demapping. The impact of floating-to-fixed-point conversion on the error-rate performance of the receiver is illustrated for different system configurations. Only a slight degradation in the error-rate performance of the receiver is observed when the detector and demapper modules utilize the devised quantization and fixed-point arithmetic rather than floating-point arithmetic.
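The quantization step at the heart of such a conversion maps each floating-point value to a scaled, saturated integer. A generic sketch (the 16-bit word length, the number of fractional bits, and the sample value here are illustrative, not the paper's devised formats):

```python
def to_fixed(x: float, frac_bits: int, word_bits: int = 16) -> int:
    """Quantize x to a signed fixed-point integer with frac_bits fractional bits."""
    scaled = round(x * (1 << frac_bits))
    lo, hi = -(1 << (word_bits - 1)), (1 << (word_bits - 1)) - 1
    return max(lo, min(hi, scaled))    # saturate instead of wrapping around

def to_float(q: int, frac_bits: int) -> float:
    """Recover the real value represented by a fixed-point integer."""
    return q / (1 << frac_bits)

# A value such as an LLR round-trips with quantization error below 2**-13
llr = 1.37
q = to_fixed(llr, frac_bits=13)
err = abs(to_float(q, 13) - llr)
```

Choosing the split between integer and fractional bits per module is the core of the quantization study: too few fractional bits degrades the error rate, too few integer bits causes saturation.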

Journal ArticleDOI
TL;DR: A didactic framework in embedded electronics systems that is used to raise awareness among students and engineers of the design issues arising in the realization of a class of underactuated robots and aerial vehicles that need to be robustly controlled due to their intrinsic instability.
Abstract: This paper presents a didactic framework in embedded electronics systems that is used to raise awareness among students and engineers of the design issues arising in the realization of a class of underactuated robots and aerial vehicles that need to be robustly controlled due to their intrinsic instability. The applications prototyped on the embedded platform presented here are conceived, by design, to be compliant with tiny collaborative robotics applications in order to adhere to the needs of the complex cyber-physical systems problem. The proposed platform is self-contained, with on-board sensing and computation. Its engineering uses only off-the-shelf and mass-production components. The system is based on a general purpose embedded board equipped with a 32-bit microcontroller which is able to manage all the basic tasks of this robotic platform: sensing, actuation, control, and communication. The framework is described, and initial experimental results are introduced. Three applications are presented in this work as a validation of the methodology: a ballbot robot, a legged robot, and a quadrotor aerial vehicle. The chosen case studies are robotics applications that are specialized in performing maneuvers when operating in tight spaces, such as human living environments.

Journal ArticleDOI
TL;DR: Flexibility and performance of the model make it a valuable tool for low power system-on-chip design, either for efficient design space exploration or as part of a HW/SW codesign synthesis flow.
Abstract: Large fractions of today’s embedded systems’ power consumption can be attributed to the memory subsystem. In order to reduce this fraction, we propose a mathematical model to optimize on-chip memory configurations for minimal power. We exploit the power reduction effect of splitting memory into subunits, with frequently accessed addresses mapped to small memories. The definition of an integer linear programming model enables us to solve the twofold problem of allocating an optimal set of memory instances with varying size on the one hand and finding an optimal mapping of application segments to allocated memories on the other hand. Experimental results yield power reductions of up to 82% for instruction memory and 73% for data memory. Area usage, at the same time, increases by only 2.1% and 1.2% on average, respectively, and even improves in some cases. The flexibility and performance of our model make it a valuable tool for low power system-on-chip design, either for efficient design space exploration or as part of a HW/SW codesign synthesis flow.
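The paper solves the joint allocation/mapping problem with integer linear programming. For a toy instance, the same objective can be shown by exhaustive enumeration (the memories, segments, and energy figures below are invented numbers; an ILP solver replaces the brute-force search at realistic sizes):

```python
from itertools import product

# Hypothetical memory instances: (capacity in KiB, energy per access in pJ)
memories = [(2, 1.0), (8, 2.5), (32, 6.0)]
# Hypothetical application segments: (size in KiB, accesses per frame)
segments = [(1, 9000), (4, 3000), (16, 500)]

def mapping_energy(assign):
    """Total access energy of a segment-to-memory assignment, or None if infeasible."""
    used = [0] * len(memories)
    energy = 0.0
    for (seg_size, accesses), m in zip(segments, assign):
        used[m] += seg_size
        if used[m] > memories[m][0]:
            return None                # memory capacity exceeded
        energy += accesses * memories[m][1]
    return energy

# Enumerate every assignment and keep the cheapest feasible one
best = min((e, a)
           for a in product(range(len(memories)), repeat=len(segments))
           if (e := mapping_energy(a)) is not None)
```

The optimum places the small, frequently accessed segment in the small low-energy memory, which is the effect the paper's model exploits.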

Journal ArticleDOI
TL;DR: A crowd cloud routing protocol based on opportunistic computing is proposed to improve data transmission efficiency and reliability, reduce routing overhead in wireless sensor networks, and eliminate the factors that may cause link instability.
Abstract: We propose a crowd cloud routing protocol based on opportunistic computing to improve data transmission efficiency and reliability and to reduce routing overhead in wireless sensor networks. First, based on an analysis of the demands of big data processing in wireless sensor networks, a data analysis and processing platform for wireless sensor networks is designed in combination with cloud computing. The cloud platform includes main nodes, core nodes, and ordinary nodes, with an engine and a driver mediating between the wireless sensor network and the cloud server. Secondly, to address the problem of data transmission in the cloud platform, we design an opportunistic computing model suitable for wireless sensor networks that minimizes routing management weight and network overhead, and we use this opportunistic model to guarantee the cloud platform's data transmission scheme. Finally, by eliminating the factors that may cause link instability, the crowd cloud routing protocol is obtained. The experimental results show that the proposed protocol provides real-time, reliable transmission and reduces the cost of routing requests.

Journal ArticleDOI
TL;DR: In order to improve the bandwidth utilization of embedded systems and the working efficiency of mobile systems, a crowd Petri network and bandwidth allocation scheme is proposed; simulation results show the effectiveness and feasibility of the embedded protocol based on bandwidth allocation over the crowd Petri network.
Abstract: In order to improve the bandwidth utilization of embedded systems and the working efficiency of mobile systems, we propose a crowd Petri network and bandwidth allocation scheme suitable for mobile embedded systems. On the one hand, we establish a mobile crowd network system based on crowd Petri nets. The system exploits the advantages of concurrent and distributed data to provide a formal description of the data control behavior of the mobile system and asynchronous concurrency protection for mobile services. On the other hand, through opportunistic bandwidth allocation, the system efficiency and network resources of the crowd Petri network are brought to their most appropriate configuration. In the process of optimizing the crowd data, an embedded control protocol is studied that combines user demand and data element characteristics with service quality and resource consumption. Simulation results show the effectiveness and feasibility of the embedded protocol based on bandwidth allocation over the crowd Petri network.

Journal ArticleDOI
TL;DR: Experimental evaluation shows that the hydraulic analysis performance and mechanical equipment support ability of the proposed scheme are better than those of the static node scheme.
Abstract: In order to improve the efficiency of mechanical and hydraulic control of mechanical equipment, an analysis scheme of mechanical hydraulic characteristics based on lightweight crowd data is proposed for mobile embedded devices. For mobile and embedded machinery, a crowd lightweight data-driven analysis model is proposed to solve for hydraulic mechanical properties using nonlinear filtering with coarse-grained service detection. The engine of the mechanical equipment is connected to the hydraulic control module through a harmonic filter. Based on the output array of hydraulic characteristics and the transmission power of the mobile embedded node, the analysis scheme of mechanical hydraulic characteristics based on lightweight crowd data is derived for mobile embedded devices. Experimental evaluation shows that the hydraulic analysis performance and mechanical equipment support ability of the proposed scheme are better than those of the static node scheme.

Journal ArticleDOI
TL;DR: A feedback recurrent neural network-based topic model is proposed that not only consumes less running time and memory but also achieves better effectiveness during topic analysis.
Abstract: To capture the relationships between words when mining topics in a document collection, and thereby improve the effectiveness of the discovered topics, this paper proposes a feedback recurrent neural network-based topic model. We represent each word as a one-hot vector and embed each document into a low-dimensional vector space. During document embedding, we apply the long short-term memory method to capture the backward relationships between words and propose a feedback recurrent neural network to capture the forward relationships between words. In the topic model, we use pairs of original and muted documents as positive samples and pairs of original and random documents as negative samples to train the model. The experiments show that the proposed model not only consumes less running time and memory but also achieves better effectiveness during topic analysis.
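The training setup described (positive pairs of an original document with a perturbed copy, negative pairs with a random document) can be sketched as follows. The mean-of-embeddings encoder is a hypothetical stand-in for the paper's feedback RNN, and the function names, vocabulary size, and mutation rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_hot(index, vocab_size):
    """One-hot word representation, as used for the model's input words."""
    v = np.zeros(vocab_size)
    v[index] = 1.0
    return v

def mutate(doc, vocab_size, p=0.1):
    """A 'muted' variant of a document: randomly replace a fraction of words."""
    doc = doc.copy()
    mask = rng.random(len(doc)) < p
    doc[mask] = rng.integers(0, vocab_size, mask.sum())
    return doc

def embed(doc, W):
    """Document embedding; mean of word embeddings stands in for the RNN."""
    return W[doc].mean(axis=0)

def contrastive_loss(anchor, positive, negative, margin=1.0):
    """Pull the muted copy close to the original, push the random doc away."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, margin + d_pos - d_neg)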

Journal ArticleDOI
TL;DR: This paper presents an extension to the routing strategy originally implemented in the recently proposed “tree or linear hopping network” (ToLHnet) protocol, aimed at better handling the special but important case of linear routing over a (possibly very long) wired link, such as an RS-485 bus.
Abstract: As the adoption of sensing and control networks rises to encompass the most diverse fields, the need for simple, efficient interconnection between many different devices will become ever more pressing. Though wireless communication is certainly appealing, current technological limits still prevent its usage where high reliability is needed or where the electromagnetic environment is not really apt to let radio waves through. In these cases, a wired link, based on a robust and well-consolidated standard such as an RS-485 bus, might prove to be a good choice. In this paper, we present an extension to the routing strategy originally implemented in the recently proposed “tree or linear hopping network” (ToLHnet) protocol, aimed at better handling the special but important case of linear routing over a (possibly very long) wired link, such as an RS-485 bus. The ToLHnet protocol was especially developed to suit the need for low complexity in deployments on large control networks. Indeed, using it over RS-485 already makes it possible to overcome many of the traditional limitations on cable length without segmenting the bus to install repeaters. With the extension proposed here, it will also be possible to simultaneously reduce latency (i.e., increase throughput, should it matter) for short-distance communications over the same cable, largely increasing the overall network efficiency with a negligible increase in the complexity of the nodes’ firmware.

Journal ArticleDOI
TL;DR: An electronic design automation (EDA) methodology for the high-level design of hierarchical memory architectures in embedded data-intensive applications, mainly in the area of multidimensional signal processing, using techniques specific to the integral polyhedra based dependence analysis.
Abstract: In real-time data-intensive multimedia processing applications, data transfer and storage significantly influence, if not dominate, all the major cost parameters of the design space—namely energy consumption, performance, and chip area. This paper presents an electronic design automation (EDA) methodology for the high-level design of hierarchical memory architectures in embedded data-intensive applications, mainly in the area of multidimensional signal processing. Different from the previous works, the problems of data assignment to the memory layers, of mapping the signals into the physical memories, and of banking the on-chip memory are addressed in a consistent way, based on the same formal model. This memory management framework employs techniques specific to the integral polyhedra based dependence analysis. The main design target is the reduction of the static and dynamic energy consumption in the hierarchical memory subsystem.

Journal ArticleDOI
TL;DR: Experimental measurements prove that the proposed novel configurable DC/DC converter architecture can operate in harsh automotive environments since it meets stringent requirements in terms of electrostatic discharge (ESD) protection, operating temperature range, out-of-range current, or voltage conditions.
Abstract: A novel configurable DC/DC converter architecture, to be integrated as a hard macrocell in automotive embedded systems, is proposed in this paper. It aims at realizing an intelligent voltage regulator. With respect to the state of the art, the challenge is the integration into an automotive-qualified chip of several advanced features: dithering of the switching frequency, nested control loops with both current and voltage feedback, asynchronous hysteretic control for low-power mode, slope control of the power FET gate driver, and a diagnostic block against out-of-range current, voltage, or temperature conditions. Moreover, the converter macrocell can be connected to the in-vehicle digital network, exchanging status/diagnostic flags and commands with the main vehicle control unit. The proposed design can be configured to work in both step-up and step-down modes, to face a very wide operating input voltage range from 2.5 to 60 V and an absolute range from −0.3 to 70 V. The main target is regulating all voltages required in emerging hybrid/electric vehicles where, besides the conventional 12 V DC bus, a 48 V DC bus is also present. The proposed design also supports digital configurability of the output regulated voltage, through a programmable divider, and of the coefficients of the proportional-integral controller inside the nested control loops. Fabricated in 0.35 μm CMOS technology, experimental measurements prove that the IC can operate in harsh automotive environments, since it meets stringent requirements in terms of electrostatic discharge (ESD) protection, operating temperature range, and out-of-range current or voltage conditions.
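The step-down/step-up configurability and the switching-frequency dithering mentioned above can be illustrated with the ideal converter relations. The functions below are textbook first-order sketches (continuous-conduction mode, lossless converter, triangular dither profile); the nominal frequency, spread, and step values are illustrative assumptions, not the macrocell's actual parameters.

```python
def buck_duty(v_in, v_out):
    """Ideal step-down (buck) duty cycle: V_out = D * V_in."""
    assert 0 < v_out <= v_in
    return v_out / v_in

def boost_duty(v_in, v_out):
    """Ideal step-up (boost) duty cycle: V_out = V_in / (1 - D)."""
    assert 0 < v_in <= v_out
    return 1.0 - v_in / v_out

def dither_sequence(f_nominal, spread, step, n):
    """First n values of a triangular sweep of the switching frequency over
    [f_nominal - spread, f_nominal + spread], spreading EMI energy in band."""
    f, direction = f_nominal - spread, +1
    out = []
    for _ in range(n):
        out.append(f)
        nxt = f + direction * step
        if nxt > f_nominal + spread or nxt < f_nominal - spread:
            direction = -direction  # reverse at the band edges
        f += direction * step
    return out
```

For the 48 V to 12 V case mentioned in the abstract, the ideal buck duty cycle is 0.25, and 12 V to 48 V requires a boost duty cycle of 0.75.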

Journal ArticleDOI
TL;DR: An embedded GPU computing platform is considered as a potential real-time implementation platform for the minimum-variance beamforming algorithm; the GPU implementation performed more than 100 times faster than the ARM implementation on the same heterogeneous embedded platform.
Abstract: Medical ultrasonic imaging has been utilized in a variety of clinical diagnoses for many years. Recently, driven by the need for portable and mobile medical ultrasonic diagnosis, the development of real-time medical ultrasonic imaging algorithms on embedded computing platforms has become a rising research direction. Typically, the delay-and-sum beamforming algorithm is implemented on embedded medical ultrasonic scanners. It is the easiest to implement at real-time frame rates, but its image quality is not high enough for complicated diagnostic cases. As a result, this paper considers the minimum-variance adaptive beamforming algorithm for medical ultrasonic imaging, which yields much higher image quality than delay-and-sum beamforming. However, minimum-variance adaptive beamforming is a complicated algorithm with O(n³) computational complexity, so it is not easy to implement on an embedded computing platform at real-time frame rates. On the other hand, the GPU is a well-known parallel computing platform for image processing. Therefore, an embedded GPU computing platform is considered in this paper as a potential real-time implementation platform for the minimum-variance beamforming algorithm. By applying the described effective implementation strategies, the GPU implementation of the minimum-variance beamforming algorithm performed more than 100 times faster than the ARM implementation on the same heterogeneous embedded platform. Furthermore, platform power consumption, computation energy efficiency, and platform cost efficiency of the experimental heterogeneous embedded platforms were also evaluated, demonstrating that the investigated heterogeneous embedded computing platforms are suitable for constructing real-time portable or mobile high-quality medical ultrasonic imaging devices.
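For reference, the minimum-variance (Capon) weight computation that dominates the O(n³) cost can be sketched in a few lines of NumPy. The diagonal-loading parameter and the per-sample snapshot handling here are common textbook choices, not necessarily the paper's exact pipeline.

```python
import numpy as np

def mvdr_weights(R, a, diagonal_loading=1e-3):
    """Minimum-variance (Capon) apodization weights for one imaging point.
    R: spatial covariance of the delayed channel data; a: steering vector.
    w = R^{-1} a / (a^H R^{-1} a); solving with R is the O(n^3) step."""
    n = R.shape[0]
    # Diagonal loading regularizes R, a standard robustness measure.
    Rl = R + diagonal_loading * np.trace(R).real / n * np.eye(n)
    Ri_a = np.linalg.solve(Rl, a)          # O(n^3) linear solve
    return Ri_a / (a.conj() @ Ri_a)        # enforce distortionless response

def beamform_sample(x, R, a):
    """Adaptive beamformed output w^H x for one delayed channel snapshot x."""
    w = mvdr_weights(R, a)
    return w.conj() @ x
```

By construction the weights satisfy the distortionless constraint a^H w = 1; for an uncorrelated field (R proportional to the identity) they reduce to the uniform weights of delay-and-sum, which is why minimum variance never does worse than delay-and-sum in this idealized setting.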

Journal ArticleDOI
TL;DR: A hybrid system spanning a fixed-function microarchitecture and a general-purpose microprocessor is presented, designed to amplify the throughput and decrease the power dissipation of collision detection relative to what can be achieved using CPUs or GPUs alone.
Abstract: We present a hybrid system spanning a fixed-function microarchitecture and a general-purpose microprocessor, designed to amplify the throughput and decrease the power dissipation of collision detection relative to what can be achieved using CPUs or GPUs alone. The primary component is one of two novel microarchitectures designed to perform the principal elements of broad-phase collision detection. Both microarchitectures consist of pipelines comprising a plurality of memories, which rearrange the input into a format that maximises parallelism and bandwidth. The two microarchitectures are combined with the remainder of the system through an original method for sharing data between a ray tracer and the collision-detection microarchitectures to minimise data-structure construction costs. We demonstrate our system using several benchmarks of varying object counts. These benchmarks reveal that, for over one million objects, our design achieves an acceleration of 812× relative to a CPU and 161× relative to a GPU. We also achieve energy efficiencies that enable the mitigation of silicon power-density challenges, while making the design amenable to both mobile and wearable computing devices.
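The input rearrangement at the heart of broad-phase collision detection is commonly realised in software as sweep-and-prune over axis-aligned bounding boxes. The sketch below shows only that algorithmic kernel, as a point of comparison; it reflects nothing of the paper's fixed-function pipeline or memory organisation.

```python
def broad_phase_pairs(boxes):
    """Sweep-and-prune broad phase along the x axis over AABBs.
    boxes: list of (min_xyz, max_xyz) tuples. Returns overlapping index pairs."""
    # Sort box indices by their minimum x coordinate (the sweep order).
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][0][0])
    pairs, active = [], []
    for i in order:
        lo_i, hi_i = boxes[i]
        # Prune boxes whose x interval ended before this one starts.
        active = [j for j in active if boxes[j][1][0] >= lo_i[0]]
        for j in active:
            lo_j, hi_j = boxes[j]
            # x already overlaps by construction; confirm overlap on y and z.
            if all(lo_i[k] <= hi_j[k] and lo_j[k] <= hi_i[k] for k in (1, 2)):
                pairs.append(tuple(sorted((i, j))))
        active.append(i)
    return pairs
```

Sorting keeps the active set small for well-distributed scenes, so the inner test runs far fewer times than the naive all-pairs check; the hardware approach in the paper attacks the same bottleneck with dedicated memories instead.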