
Showing papers by "Walter Stechele published in 2013"


Proceedings ArticleDOI
09 Dec 2013
TL;DR: An approach to evaluating denoising algorithms with respect to realistic camera noise is presented: a new camera noise model that includes the full processing chain of a single-sensor camera is described, and the noise characteristics are shown to have a significant effect on visual quality.
Abstract: The development and tuning of denoising algorithms is usually based on readily processed test images that are artificially degraded with additive white Gaussian noise (AWGN). While AWGN allows us to easily generate test data in a repeatable manner, it does not reflect the noise characteristics in a real digital camera. Realistic camera noise is signal-dependent and spatially correlated due to the demosaicking step required to obtain full-color images. Hence, the noise characteristic is fundamentally different from AWGN. Using such unrealistic data to test, optimize and compare denoising algorithms may lead to incorrect parameter tuning or suboptimal choices in research on denoising algorithms. In this paper, we therefore propose an approach to evaluate denoising algorithms with respect to realistic camera noise: we describe a new camera noise model that includes the full processing chain of a single-sensor camera. We determine the visual quality of noisy and denoised test sequences using a subjective test with 18 participants. We show that the noise characteristics have a significant effect on visual quality. Quality metrics, which are required to compare denoising results, are applied, and we evaluate the performance of 10 full-reference metrics and one no-reference metric with our realistic test data. We conclude that a more realistic noise model should be used in future research to improve the quality estimation of digital images and videos and to improve the research on denoising algorithms.
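
For illustration, the following Python sketch shows one way to generate signal-dependent, spatially correlated test noise of the kind described above: Poisson-like shot noise plus Gaussian read noise is added to a raw RGGB Bayer mosaic, which is then bilinearly demosaicked so that the interpolation correlates neighbouring pixels. The gain and read-noise values and the simple bilinear interpolation are illustrative assumptions, not the calibrated processing chain from the paper.

```python
import numpy as np
from scipy.signal import convolve2d

def add_sensor_noise(raw, gain=0.01, read_noise=0.002, rng=None):
    """Signal-dependent (Poisson-like) shot noise plus Gaussian read noise.
    Parameters are illustrative, not calibrated to a real sensor."""
    rng = np.random.default_rng() if rng is None else rng
    shot = rng.normal(scale=np.sqrt(np.maximum(raw, 0.0) * gain), size=raw.shape)
    read = rng.normal(scale=read_noise, size=raw.shape)
    return np.clip(raw + shot + read, 0.0, 1.0)

def bilinear_demosaic(bayer):
    """Crude bilinear demosaicking of an RGGB mosaic; the interpolation
    spatially correlates the per-pixel noise, unlike AWGN on RGB images."""
    h, w = bayer.shape
    kernel = np.array([[0.25, 0.5, 0.25],
                       [0.5,  1.0, 0.5 ],
                       [0.25, 0.5, 0.25]])
    r_mask = np.zeros((h, w), bool); r_mask[0::2, 0::2] = True
    b_mask = np.zeros((h, w), bool); b_mask[1::2, 1::2] = True
    g_mask = ~(r_mask | b_mask)
    rgb = np.zeros((h, w, 3))
    for c, mask in enumerate((r_mask, g_mask, b_mask)):
        num = convolve2d(np.where(mask, bayer, 0.0), kernel, mode="same")
        den = convolve2d(mask.astype(float), kernel, mode="same")
        rgb[..., c] = num / np.maximum(den, 1e-6)
    return rgb

# Example: clean Bayer frame -> noisy frame -> demosaicked noisy RGB frame.
clean = np.full((64, 64), 0.5)
noisy_rgb = bilinear_demosaic(add_sensor_noise(clean))
```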

24 citations


Book ChapterDOI
19 Feb 2013
TL;DR: This work proposes a biologically inspired self-adaptation scheme to enable complex algorithms to adapt to different environments and performs preliminary experiments using a graph-based Visual Simultaneous Localization and Mapping algorithm and a publicly available benchmark set.
Abstract: Many mobile robot algorithms require tedious tuning of parameters and are therefore often suited to only a limited number of situations. Yet, as mobile robots are to be employed in various fields, from industrial settings to our private homes, changes in the environment will occur frequently. Organic computing principles such as self-organization, self-adaptation, or self-healing can provide solutions to react to new situations, e.g. by providing fault tolerance. We therefore propose a biologically inspired self-adaptation scheme to enable complex algorithms to adapt to different environments. The proposed scheme is implemented using the Organic Robot Control Architecture (ORCA) and Learning Classifier Tables (LCT). Preliminary experiments are performed using a graph-based Visual Simultaneous Localization and Mapping (SLAM) algorithm and a publicly available benchmark set, showing improvements in terms of runtime and accuracy.
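
The sketch below only illustrates the general flavour of a rule-table adaptation loop (condition, parameter set, fitness) that a Learning-Classifier-Table-style scheme could use to switch SLAM parameters between environments; the situation attributes, parameter names and update rule are hypothetical and do not reproduce the ORCA/LCT implementation from the paper.

```python
import random
from dataclasses import dataclass

@dataclass
class Rule:
    condition: dict   # e.g. {"scene": "corridor"} -- hypothetical attributes
    params: dict      # SLAM parameters to apply -- hypothetical names
    fitness: float = 0.0

class SelfAdaptation:
    """Match the observed situation against the rule table, apply the best
    rule's parameters, and reward that rule based on the achieved quality."""
    def __init__(self, rules, explore=0.1, lr=0.2):
        self.rules, self.explore, self.lr = rules, explore, lr

    def select(self, situation):
        matching = [r for r in self.rules
                    if all(situation.get(k) == v for k, v in r.condition.items())]
        if not matching:
            return None
        if random.random() < self.explore:        # occasional exploration
            return random.choice(matching)
        return max(matching, key=lambda r: r.fitness)

    def reward(self, rule, quality):
        # Exponential moving average of the observed quality (e.g. inverse ATE).
        rule.fitness += self.lr * (quality - rule.fitness)

rules = [Rule({"scene": "corridor"}, {"keyframe_dist": 0.5}),
         Rule({"scene": "open"},     {"keyframe_dist": 2.0})]
adapt = SelfAdaptation(rules)
chosen = adapt.select({"scene": "corridor"})
adapt.reward(chosen, quality=0.8)
```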

8 citations


Proceedings ArticleDOI
14 Nov 2013
TL;DR: This work investigates the potential of network-coded Network-on-Chip (ncNoC) compared to classical 2D-mesh/dimension-routing NoCs and identifies multi-source scenarios with only a limited number of sinks per source to be the most advantageous connection settings for coded NoCs.
Abstract: Networks-on-Chip (NoCs) have become the favored choice for on-chip communication, especially with the ever rising number of communication partners in future manycore systems-on-chip. NoCs based on mesh topologies with dimension routing are well established as they scale well with the increasing number of communication partners and allow efficient router design. To serve application demands efficiently, sophisticated features such as multicasting become an increasingly important factor in future NoC-based systems. The 2D-mesh/dimension-routing combination, however, suffers from performance degradation especially in the case of multicast communication, as the network infrastructure is utilized suboptimally. Approaching this problem, we investigate the potential of network-coded Network-on-Chip (ncNoC) compared to classical 2D-mesh/dimension-routing NoCs. We adapt a high-level evaluation method to compute the minimum hop count equivalent using network coding, which enables us to compare network-coded and dimension-routed hop count cost. Within this environment we can demonstrate the full potential of network coding for both the butterfly and generalized multicast connection settings. We show that network coding is never outperformed by dimension routing in terms of required hop counts and, more importantly, identify multi-source scenarios with only a limited number of sinks per source as the most advantageous connection settings for coded NoCs.
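
As a rough sketch of the dimension-routed side of such a hop-count comparison, the snippet below counts the links used when a packet is XY-routed from one source to several sinks in a 2D mesh, both as independent unicasts and as a tree that shares common path prefixes; the mesh size and node coordinates are made up, and the network-coded minimum (a min-cut/Steiner-type bound per sink) is not computed here.

```python
def xy_path(src, dst):
    """Links visited by XY dimension-ordered routing in a 2D mesh."""
    (x, y), (dx, dy) = src, dst
    links = []
    while x != dx:                              # route along X first
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:                              # then along Y
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links

def multicast_hop_costs(src, sinks):
    """Hop cost of serving all sinks as separate unicasts vs. the number of
    distinct links when common prefixes are shared (tree-like replication)."""
    paths = [xy_path(src, d) for d in sinks]
    unicast = sum(len(p) for p in paths)
    shared = len({link for p in paths for link in p})
    return unicast, shared

# One source, three sinks in a 4x4 mesh (coordinates are illustrative).
print(multicast_hop_costs((0, 0), [(3, 0), (3, 3), (0, 3)]))
```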

6 citations


Proceedings ArticleDOI
01 Sep 2013
TL;DR: A novel approach for FPGA-based, probabilistic circuit fault simulation is proposed; the system is mainly hardware-based, which makes the simulation fast while keeping the hardware overhead on the FPGA low by exploiting FPGA-specific features.
Abstract: The reduction of CMOS structures into the nanometer regime, as well as the high demand for low-power applications, which motivates reducing supply voltages further towards the threshold voltage, results in an increased susceptibility of integrated circuits to soft errors. Hence, circuit reliability has become a major concern in today's VLSI design process. A new approach to further support these trends is to relax the reliability requirements of a circuit while ensuring that the functionality of the circuit remains unaffected, or that effects remain unnoticed by the user. To realize such an approach it is necessary to determine the probability of an error at the output of a circuit, given an error probability distribution at the circuit's elements. Purely software-based simulation approaches are unsuitable due to the large simulation times. Hardware-accelerated approaches exist, but lack the ability to inject errors based on probabilities, are slow, or have a large area overhead. In this paper we propose a novel approach for FPGA-based, probabilistic circuit fault simulation. The proposed system is mainly hardware-based, which makes the simulation fast, but also keeps the hardware overhead on the FPGA low by exploiting FPGA-specific features.
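
A minimal software sketch of the kind of probabilistic fault simulation such a system accelerates: each gate output is flipped with a per-gate error probability and the output error rate is estimated by Monte Carlo runs. The tiny netlist, the probabilities and the purely sequential evaluation are illustrative assumptions; the paper's contribution is the FPGA architecture, not this software model.

```python
import random

# A tiny gate-level netlist: output net -> (op, inputs, p_err), where p_err is
# the probability that this gate's output flips in a given evaluation.
NETLIST = {
    "n1":  ("AND", ["a", "b"], 0.01),
    "n2":  ("OR",  ["n1", "c"], 0.02),
    "out": ("XOR", ["n2", "a"], 0.00),
}
OPS = {"AND": lambda x, y: x & y, "OR": lambda x, y: x | y, "XOR": lambda x, y: x ^ y}

def evaluate(inputs, inject=True):
    """Evaluate the netlist once (assumed topological order); flip each gate
    output with its error probability when injection is enabled."""
    nets = dict(inputs)
    for net, (op, ins, p_err) in NETLIST.items():
        val = OPS[op](nets[ins[0]], nets[ins[1]])
        if inject and random.random() < p_err:
            val ^= 1                              # probabilistic bit flip
        nets[net] = val
    return nets["out"]

def output_error_probability(inputs, trials=50_000):
    golden = evaluate(inputs, inject=False)
    errors = sum(evaluate(inputs) != golden for _ in range(trials))
    return errors / trials

print(output_error_probability({"a": 1, "b": 1, "c": 0}))
```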

6 citations


Journal ArticleDOI
TL;DR: The proposed algorithm estimates and compensates ego-motion, using a first-order-flow motion model, to allow object detection from a continuously moving robot; it offers a significant improvement in performance over the state of the art in harsh environments and performs equally well under smooth motion.
Abstract: In a rescue operation, walking robots offer a great deal of flexibility in traversing uneven terrain in an uncontrolled environment. For such a rescue robot, each motion is a potential vital sign, and the robot should be sensitive enough to detect such motion while maintaining high accuracy to avoid false alarms. However, the existing techniques for motion detection have severe limitations in dealing with strong levels of ego-motion on walking robots. This paper proposes an optical-flow-based method for the detection of moving objects using a single camera mounted on a hexapod robot. The proposed algorithm estimates and compensates ego-motion to allow for object detection from a continuously moving robot, using a first-order-flow motion model. Our algorithm can deal with strong rotation and translation in 3D, with four degrees of freedom. Two alternative object detection methods, using 2D-histogram-based vector clustering and motion-compensated frame differencing, respectively, are examined for the detection of slow- and fast-moving objects. The FPGA implementation with optimized resource utilization using SW/HW codesign can process video frames in real time at 31 fps. The new algorithm offers a significant improvement in performance over the state of the art in harsh environments and performs equally well under smooth motion.
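
The sketch below illustrates the general idea of fitting a four-degree-of-freedom first-order flow model (translation, scale, rotation) to sparse flow vectors and flagging vectors that deviate from it as candidate moving objects; the exact parametrisation, thresholds and the FPGA data path from the paper are not reproduced, and the model form used here is an assumption.

```python
import numpy as np

def fit_first_order_flow(pts, flows):
    """Least-squares fit of a 4-DOF first-order flow model
        u = tx + s*x - r*y,   v = ty + r*x + s*y
    to sparse flow vectors; returns (tx, ty, s, r)."""
    x, y = pts[:, 0], pts[:, 1]
    u, v = flows[:, 0], flows[:, 1]
    rows_u = np.stack([np.ones_like(x), np.zeros_like(x), x, -y], axis=1)
    rows_v = np.stack([np.zeros_like(x), np.ones_like(x), y, x], axis=1)
    A = np.vstack([rows_u, rows_v])
    b = np.concatenate([u, v])
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    return params

def ego_motion_outliers(pts, flows, params, thresh=2.0):
    """Flow vectors deviating from the ego-motion model are candidate
    independently moving objects."""
    tx, ty, s, r = params
    x, y = pts[:, 0], pts[:, 1]
    pred = np.stack([tx + s * x - r * y, ty + r * x + s * y], axis=1)
    residual = np.linalg.norm(flows - pred, axis=1)
    return residual > thresh
```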

5 citations


01 Jan 2013
TL;DR: This work will propose a self-reconfigurable software and hardware architecture in order to enable the dynamic optimization of a robot system depending on the current situation, i.e. the current task, robot state, and environment.
Abstract: Advanced robot systems need to carry out increasingly complex task sets. However, they are typically optimized to a very restricted set of tasks and environments to solve demanding problems. This work will therefore propose a self-reconfigurable software and hardware architecture in order to enable the dynamic optimization of a robot system depending on the current situation, i.e. the current task, robot state, and environment. The proposed framework is based on organic computing principles and unsupervised machine learning techniques. It further uses dynamically reconfigurable Field Programmable Gate Arrays (FPGA) as hardware accelerators.

5 citations


Journal ArticleDOI
01 Sep 2013
TL;DR: An implementation is proposed that accelerates the computation of complex motion estimation vectors on programmable tightly-coupled processor arrays, which offer high flexibility enabled by coarse-grained reconfiguration capabilities, and that is 18 times faster than an FPGA-based soft processor implementation.
Abstract: Optical flow is widely used in many applications of portable mobile devices and automotive embedded systems for the determination of motion of objects in a visual scene. In robotics, it is also used for motion detection, object segmentation, time-to-contact information, focus-of-expansion calculations, robot navigation, and automatic parking for vehicles. Like many other image processing algorithms, optical flow applies pixel operations repeatedly over whole image frames. Thus, it provides a high degree of fine-grained parallelism which can be efficiently exploited on massively parallel processor arrays. In this context, we propose to accelerate the computation of complex motion estimation vectors on programmable tightly-coupled processor arrays, which offer high flexibility enabled by coarse-grained reconfiguration capabilities. A further novelty is that the degree of parallelism may be adapted to the number of processors that are available to the application. Finally, we present an implementation that (a) is 18 times faster than an FPGA-based soft processor implementation and (b) may be adapted to different QoS requirements, hence being more flexible than a dedicated hardware implementation.
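
As a loose illustration of adapting the degree of parallelism to the number of available processing elements, the sketch below splits a block-matching motion-estimation pass into one horizontal stripe per worker; the block-matching formulation, the stripe partitioning and the use of Python threads are illustrative stand-ins for the coarse-grained processor-array mapping described in the paper.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def block_motion(prev, curr, y0, y1, block=8, search=4):
    """Full-search block matching for the image rows [y0, y1)."""
    vectors = []
    h, w = prev.shape
    for by in range(y0, min(y1, h - block), block):
        for bx in range(0, w - block, block):
            ref = curr[by:by + block, bx:bx + block]
            best, best_mv = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    sy, sx = by + dy, bx + dx
                    if 0 <= sy and sy + block <= h and 0 <= sx and sx + block <= w:
                        cand = prev[sy:sy + block, sx:sx + block]
                        sad = np.abs(ref.astype(int) - cand.astype(int)).sum()
                        if sad < best:
                            best, best_mv = sad, (dx, dy)
            vectors.append(((bx, by), best_mv))
    return vectors

def motion_field(prev, curr, workers=4):
    """Split the frame into horizontal stripes, one per available worker;
    the degree of parallelism is simply the number of stripes."""
    h = prev.shape[0]
    stripe = (h + workers - 1) // workers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futs = [pool.submit(block_motion, prev, curr, i * stripe, (i + 1) * stripe)
                for i in range(workers)]
        return [mv for f in futs for mv in f.result()]
```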

4 citations


Proceedings Article
14 Nov 2013
TL;DR: A new resource-aware nearest-neighbor search algorithm for kd-trees on many-core processors that can adapt itself to the dynamically varying load on a many-core processor, achieve a good response time, and avoid frame drops.
Abstract: Kd-tree search is widely used today in computer vision - for example in object recognition to process a large set of features and identify the objects in a scene. However, the search times vary widely based on the size of the data set to be processed, the number of objects present in the frame, the size and shape of the kd-tree, etc. Constraining the search interval is extremely critical for real-time applications in order to avoid frame drops and to achieve a good response time. The inherent parallelism in the algorithm can be exploited by using massively parallel architectures like many-core processors. However, the variation in execution time is more pronounced on such hardware (HW) due to the presence of shared resources and dynamically varying load situations created by applications running concurrently. In this work, we propose a new resource-aware nearest-neighbor search algorithm for kd-trees on many-core processors. The novel algorithm can adapt itself to the dynamically varying load on a many-core processor and can achieve a good response time and avoid frame drops. The results show significant improvements in performance and detection rate compared to the conventional approach while the simplicity of the conventional algorithm is retained in the new model.
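
A minimal sketch, under assumed data structures, of a kd-tree nearest-neighbour search with an explicit node-visit budget: lowering the budget when the processor is heavily loaded trades accuracy for bounded response time, which is the flavour of resource awareness described above. The paper's actual adaptation policy and many-core mapping are not shown.

```python
import heapq

class KDNode:
    __slots__ = ("point", "axis", "left", "right")
    def __init__(self, point, axis, left=None, right=None):
        self.point, self.axis, self.left, self.right = point, axis, left, right

def build(points, depth=0):
    """Build a balanced kd-tree by splitting on the median along each axis."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return KDNode(points[mid], axis,
                  build(points[:mid], depth + 1),
                  build(points[mid + 1:], depth + 1))

def budgeted_nn(root, query, budget):
    """Best-bin-first nearest-neighbour search that stops after visiting
    `budget` nodes; the budget can be lowered under heavy load."""
    best, best_d = None, float("inf")
    heap = [(0.0, 0, root)]          # (lower bound on distance, tiebreak, node)
    visited = tie = 0
    while heap and visited < budget:
        bound, _, node = heapq.heappop(heap)
        if node is None or bound >= best_d:
            continue
        visited += 1
        d = sum((a - b) ** 2 for a, b in zip(node.point, query))
        if d < best_d:
            best, best_d = node.point, d
        diff = query[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        tie += 1; heapq.heappush(heap, (0.0, tie, near))
        tie += 1; heapq.heappush(heap, (diff * diff, tie, far))
    return best, best_d

tree = build([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
print(budgeted_nn(tree, (6, 3), budget=4))   # budget would shrink under load
```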

4 citations



Proceedings ArticleDOI
27 May 2013
TL;DR: The presented E/E design space exploration framework provides a graphical modeling and simulation environment with an interactive visualization of simulation-based results and contains an advanced business logic which administrates the modeling, storing, retrieving and cloning of evaluation sessions consisting of complex experiments.
Abstract: The E/E (electric/electronic) architecture of a modern vehicle is a complex distributed system in which up to 80 electronic control units (ECUs), interconnected by several communication buses, need to collaborate with each other in order to implement the various comfort and safety features. The presented E/E design space exploration framework supports engineers during the development process of new E/E architectures by providing a graphical modeling and simulation environment with an interactive visualization of simulation-based results. Furthermore, it contains an advanced business logic which manages the modeling, storing, retrieving and cloning of evaluation sessions consisting of complex experiments. Due to a high-level modeling approach, future architectures can be evaluated with respect to power consumption and performance already in an early stage of the design process, and design alternatives can be easily compared with each other.

3 citations


Proceedings ArticleDOI
20 May 2013
TL;DR: In this article, the authors use a simulation library, ReSim, to perform functional verification of a DRS application before, during, and after reconfigurations, using the Auto Vision driver assistance system as a case study.
Abstract: Dynamically Reconfigurable Systems (DRS) allow hardware logic to be partially reconfigured while the rest of the design continues to operate. For example, the Auto Vision driver assistance system swaps video processing engines when the driving conditions change. However, the architectural flexibility of DRS also introduces challenges for verifying system functionality. Using Auto Vision as a case study, this paper studies the use of a recent RTL simulation library, ReSim, to perform functional verification of DRS designs. Compared with the conventional Virtual Multiplexing approach, ReSim more accurately simulates the Auto Vision system before, during and after reconfigurations. With trivial development and simulation overhead, ReSim assisted in detecting significantly more bugs than found using Virtual Multiplexing. To the best of our knowledge, this paper is the first significant effort towards functionally verifying a cutting-edge, complex, real-world DRS application.


Proceedings ArticleDOI
24 Oct 2013
TL;DR: A metric for weighted partitioning of pre-defined processing element sequences is presented and a partitioning algorithm with linear complexity is formulated, which yields a set of reconfigurable partitions that are balanced in terms of resources while jointly having minimal data throughput.
Abstract: Temporal runtime reconfiguration of FPGAs allows for a resource-efficient sequential execution of signal processing modules. Approaches for partitioning processing chains into modules have been derived in various previous works. We present a metric for weighted partitioning of pre-defined processing element sequences. The proposed method yields a set of reconfigurable partitions which are balanced in terms of resources while jointly having minimal data throughput. Using this metric, we formulate a partitioning algorithm with linear complexity and compare our approach to the state of the art.
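
The weighted metric and the linear-time algorithm themselves are given in the paper; as a loose illustration of the underlying problem, the sketch below greedily cuts a linear processing chain into resource-bounded partitions and tallies the data rate that must cross the reconfiguration boundaries. Module names, resource costs and data rates are hypothetical.

```python
def partition_chain(modules, max_resources):
    """Greedy single pass over a linear processing chain.

    `modules` is a list of (name, resource_cost, output_rate) tuples in
    processing order; a new reconfigurable partition is started whenever
    adding the next module would exceed `max_resources`. The data that must
    cross a partition boundary is the output rate of the module placed just
    before the cut."""
    partitions, current, used = [], [], 0
    boundary_traffic = 0
    for name, cost, out_rate in modules:
        if current and used + cost > max_resources:
            partitions.append(current)
            boundary_traffic += prev_rate      # data crossing the cut
            current, used = [], 0
        current.append(name)
        used += cost
        prev_rate = out_rate
    if current:
        partitions.append(current)
    return partitions, boundary_traffic

# Hypothetical chain: (name, LUT cost, MB/s out)
chain = [("debayer", 30, 120), ("denoise", 50, 120),
         ("edge", 20, 40), ("hough", 60, 5), ("overlay", 25, 120)]
print(partition_chain(chain, max_resources=100))
```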

Proceedings ArticleDOI
04 Sep 2013
TL;DR: This paper presents a new DAQ system with real-time data processing capability for X-ray detector systems and includes new algorithms for dynamic adaptation of important processing parameters during a measurement.
Abstract: This paper presents a new DAQ system with real-time data processing capability for X-ray detector systems. The data processing algorithms work directly on the serialised input data stream from the detector system and run in real time on an FPGA processing system. The DAQ system includes new algorithms for dynamic adaptation of important processing parameters during a measurement. These dynamic algorithms enable better measurement precision and autonomous system operation and avoid detector recalibration cycles. The developed and presented data processing algorithms are optimised for low memory bandwidth and hardware resource utilisation. High data reduction and live monitoring of important system parameters are achieved with the real-time processing. The concept of the new DAQ system was tested and evaluated with the Mercury Imaging X-ray Spectrometer (MIXS). MIXS is an instrument on board the ESA cornerstone mission BepiColombo, which will be launched in 2015 into an orbit around Mercury. The 64x64 detector matrix used in the MIXS detector system for X-ray detection delivers 6000 frames per second and provides an optimal evaluation platform for the new DAQ system.

Proceedings ArticleDOI
04 Sep 2013
TL;DR: The proposed algorithm enables real-time noise estimation directly on the raw data stream of an X-ray detection system, allowing for better detector monitoring and increasing the measurement accuracy, especially in long-term measurements and under changing environmental conditions.
Abstract: The proposed algorithm enables real-time noise estimation directly on the raw data stream of an X-ray detection system. This allows for better detector monitoring and increases the measurement accuracy, especially in long-term measurements and under changing environmental conditions. The noise value is used as an event extraction threshold and is therefore the most critical parameter for the performance of X-ray data processing. Inaccurately estimated noise values can result in significantly poorer energy resolution. The new algorithm is designed for low memory bandwidth, precise estimation results and low processing complexity. This allows real-time data processing even for large and fast detector systems with an output rate of hundreds of MPixels/s. Compared to the calculation method used up to now, the memory bandwidth can be reduced by a factor of up to one hundred with the new algorithm. The newly developed real-time Dynamic Noise Estimation (DNE) algorithm uses an iterative approach for the estimation. Only an increment/decrement operation and an integer comparison are required for the calculation. This simple approach allows a resource-efficient hardware implementation on FPGAs. Simulation results confirm the excellent estimation accuracy and stability with real-time performance. Measurements with Fano-limited energy resolution, the theoretically best achievable energy resolution with semiconductor detectors, confirm this outstanding performance on real detector systems. The reliability and stability of the system were demonstrated during a long-term irradiation campaign for the Mercury Imaging X-ray Spectrometer (MIXS) [5], [3]. MIXS is an instrument on board ESA's cornerstone mission BepiColombo [1] and will be launched in 2015 into an orbit around Mercury.
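
A minimal sketch of an increment/decrement-only estimator of the kind hinted at above: each raw sample is compared with the current estimate, which is then stepped up or down by a fixed integer amount, so the estimate converges towards a quantile of the noise distribution that can serve as an event-extraction threshold. The step sizes, start value and synthetic samples are assumptions; the actual DNE algorithm is specified in the paper.

```python
import random

def dynamic_noise_estimate(samples, start=0, up=1, down=1):
    """Sign-based iterative estimator using only a comparison and an
    increment/decrement per sample. With up == down the estimate tracks the
    median of the noise distribution; asymmetric steps track a higher
    quantile usable directly as a threshold."""
    est = start
    for s in samples:
        if s > est:
            est += up
        elif s < est:
            est -= down
    return est

# Toy usage on synthetic integer noise samples (hypothetical values):
noise = [int(random.gauss(100, 5)) for _ in range(10_000)]
print(dynamic_noise_estimate(noise, start=0, up=3, down=1))  # ~ upper quantile
```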