Showing papers presented at the "International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing" in 2021


Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this paper, the authors explore virtualization technologies and cloud computing for migrating an existing real-time safety-critical railway use-case from dedicated hardware solutions to a cloud environment.
Abstract: This paper explores virtualization technologies and cloud computing for migrating an existing real-time safety-critical railway use-case from dedicated hardware solutions. Cloud computing is rapidly gaining popularity in many domains as it provides benefits such as higher availability, scalability, and efficient hardware resource utilization. We examine existing virtualization technologies for deploying a (private) Real-Time (RT) Cloud on COTS server hardware to run an existing railway use-case while meeting stringent safety and security requirements. We base our migration review on comparison and relevant benchmarking of KVM and Xen virtualization technologies for the specific railway requirements. Based on the insights gained, we provide suggestions for using existing virtualization technologies with new RT-cloud components to safely and securely run the railway use-case applications.

10 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this article, the authors investigated the performance of MPLS channels versus normal routing, both using the Open Shortest Path First (OSPF) routing protocol, and compared them under both single and dual failure scenarios within the two architectures.
Abstract: MPLS has been at the forefront of high-speed Wide Area Networks (WANs) for almost two decades [1], [12]. The performance advantages in implementing Multi-Protocol Label Switching (MPLS) are mainly its superior speed based on fast label switching and its capability to perform Fast Reroute rapidly when failure(s) occur – in theory under 50 ms [16], [17], which makes MPLS also interesting for real-time applications. We investigate the aforementioned advantages of MPLS by creating two real testbeds using actual routers that commercial Internet Service Providers (ISPs) use, one with a ring and one with a partial mesh architecture. In those two testbeds we compare the performance of MPLS channels versus normal routing, both using the Open Shortest Path First (OSPF) routing protocol. The speed of the MPLS Fast Reroute mechanism when failures occur is investigated. First, baseline experiments comparing MPLS with normal routing are performed. Results are evaluated and compared using both single and dual failure scenarios within the two architectures. Our results confirm recovery times within 50 ms.
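
As a rough illustration of how such recovery times can be measured (a generic sketch, not the authors' testbed setup; the addresses, port, and 1 ms probe interval are assumptions), one can stream timestamped UDP probes across the path under test and record the longest reception gap around an injected link failure:

```python
# Hypothetical probe-based measurement of reroute/recovery time.
import socket, struct, time

PROBE_INTERVAL = 0.001            # 1 ms between probes (assumption)
DEST = ("10.0.0.2", 50000)        # hypothetical receiver behind the MPLS/OSPF path

def sender(duration_s=30.0):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    seq, t_end = 0, time.time() + duration_s
    while time.time() < t_end:
        sock.sendto(struct.pack("!Qd", seq, time.time()), DEST)
        seq += 1
        time.sleep(PROBE_INTERVAL)

def receiver(port=50000):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", port))
    last_arrival, worst_gap = None, 0.0
    while True:
        sock.recvfrom(64)
        now = time.time()
        if last_arrival is not None and now - last_arrival > worst_gap:
            # The longest silence between consecutive probes approximates the
            # traffic interruption caused by the failure plus the reroute.
            worst_gap = now - last_arrival
            print(f"worst observed interruption so far: {worst_gap * 1000:.1f} ms")
        last_arrival = now

# Run receiver() on the far end of the path and sender() on the near end.
```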

5 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this paper, the authors highlight experiences gained while adjusting the publisher of the open62541 OPC UA stack to enable WCET analysis, following a simple process combined with the open-source platform T-CREST.
Abstract: Worst-case execution time (WCET) analysis is a prevalent way to ensure the timely execution of programs in time-critical systems. With the advent of new technologies such as fog computing and time-sensitive networking (TSN), the interest in timing analysis has increased in industrial communication. This paper highlights experiences gained while adjusting the publisher of the open62541 OPC UA stack to enable WCET analysis, following a simple process combined with the open-source platform T-CREST. The main challenges are the required knowledge about the code and the specific communication software characteristics like variable message sizes. Further findings indicate the need for additional types of annotation for indirect recursion or callback functions. The paper provides the foundation for further research on adjusting the implementation of existing industrial communication protocols for WCET analysis.
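
To illustrate why bounded message sizes matter for WCET analysis (a simplified sketch, not the open62541 or T-CREST tooling; the region names, cycle costs, and 512-byte bound are invented), a static WCET bound can only be formed once every data-dependent loop, such as the encoding loop over a variable-size message, carries an annotation:

```python
# A WCET bound is the sum over code regions of (per-iteration cost x loop bound);
# without the MAX_MSG_BYTES annotation the payload loop would have no bound at all.
MAX_MSG_BYTES = 512  # hypothetical annotation: upper bound on the message size

# (cost per iteration in cycles, loop bound) per region -- made-up numbers
regions = {
    "header_setup":   (120, 1),
    "encode_payload": (14, MAX_MSG_BYTES),  # bounded by the annotation above
    "send_frame":     (300, 1),
}

def wcet_bound(regions):
    return sum(cost * bound for cost, bound in regions.values())

print(f"WCET bound: {wcet_bound(regions)} cycles")
```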

5 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this article, an adaptive system optimization and reconfiguration approach that dynamically adapts the scheduling parameters and processor speeds to satisfy dynamic deadlines while consuming as little energy as possible is presented.
Abstract: The increasing computing demands of autonomous driving applications make energy optimizations critical for reducing battery capacity and vehicle weight. Current energy optimization methods typically target traditional real-time systems with static deadlines, resulting in conservative energy savings that are unable to exploit additional energy optimizations due to dynamic deadlines arising from the vehicle's change in velocity and driving context. We present an adaptive system optimization and reconfiguration approach that dynamically adapts the scheduling parameters and processor speeds to satisfy dynamic deadlines while consuming as little energy as possible. Our experimental results with an autonomous driving task set from Bosch and real-world driving data show energy reductions of up to 46.4% on average in typical dynamic driving scenarios compared with traditional static energy optimization methods, demonstrating great potential for dynamic energy optimization gains by exploiting dynamic deadlines.
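
The core idea can be sketched as follows (a hypothetical example, not Bosch's task set or the paper's actual algorithm; the frequencies, cycle count, and deadline model are assumptions): derive the deadline from the current driving context and run at the slowest processor speed that still meets it:

```python
# Dynamic deadline from velocity, then pick the lowest sufficient frequency,
# since energy grows superlinearly with processor speed.
FREQS_GHZ = [0.6, 1.0, 1.4, 2.0]   # hypothetical available processor speeds
WCEC_GCYCLES = 1.2                  # hypothetical worst-case execution cycles (x10^9)

def dynamic_deadline_s(velocity_mps, reaction_distance_m=30.0):
    """Faster driving leaves less time to react, so the deadline shrinks."""
    return reaction_distance_m / max(velocity_mps, 0.1)

def slowest_sufficient_freq(deadline_s):
    for f in FREQS_GHZ:                       # ascending order
        if WCEC_GCYCLES / f <= deadline_s:    # execution time at speed f
            return f
    return FREQS_GHZ[-1]                      # fall back to the maximum speed

for v in (5.0, 15.0, 30.0):  # vehicle speed in m/s
    d = dynamic_deadline_s(v)
    print(f"v={v:5.1f} m/s  deadline={d:5.2f} s  freq={slowest_sufficient_freq(d)} GHz")
```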

5 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this paper, a new scheduler called Reliable Job Allocation scheduler (RJA) is proposed to improve the reliability of real-time wireless systems, which uses an approach called Frame Replication and Elimination for Reliability (FRER) to replicate the communication flows through redundant routes.
Abstract: The job allocation problem is challenging in wireless systems, because the spatial and temporal distribution of dependent jobs to hosts must satisfy the precedence constraints, prevent communication interference and minimize energy consumption. In this paper, a new scheduler called the Reliable Job Allocation scheduler (RJA) is proposed to improve the reliability of real-time wireless systems. The proposed scheduler uses an approach called Frame Replication and Elimination for Reliability (FRER) to replicate the communication flows through redundant routes. RJA considers the periodicity of Time-Triggered (TT) flows and the impact of the induced interference, by applying a physical interference model which is used to ensure that all flows are transmitted successfully in the assigned time-slots. The scheduling efficiency and the system reliability are improved through allocating jobs to hosts with high performance in terms of flow arrival time, energy consumption and failure rates. A reliability model is also introduced to determine the reliability of the system. The reliability model computes the reliability of each job depending on the reliability of all its incoming flows. The reliability of the leaf job, which has no forwarding flows, represents the global reliability of the overall system. RJA is compared with state-of-the-art TT schedulers that use either the shortest or load-aware routes to send flows without addressing reliability. The experimental results show that the reliability of the system computed by RJA is improved compared to the other schedulers while also ensuring scalability in the network design (i.e., increasing numbers of jobs and hosts) and timeliness. We also study the impact of the injected link failures on the flow delivery ratio.
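
The reliability model described above can be illustrated with a short sketch (the exact formulas are an assumption, not taken from the paper): with FRER, a flow fails only if all of its redundant routes fail, and a job's reliability combines its host's reliability with that of all incoming flows:

```python
# Toy reliability computation for replicated flows and a leaf job.
from functools import reduce

def flow_reliability(route_reliabilities):
    """FRER-style replication: the flow fails only if every redundant route fails."""
    p_all_fail = reduce(lambda acc, r: acc * (1.0 - r), route_reliabilities, 1.0)
    return 1.0 - p_all_fail

def job_reliability(host_reliability, incoming_flow_reliabilities):
    r = host_reliability
    for fr in incoming_flow_reliabilities:
        r *= fr
    return r

# Example: a leaf job on a 0.999-reliable host, fed by two flows that are each
# replicated over two routes (made-up numbers).
f1 = flow_reliability([0.97, 0.95])
f2 = flow_reliability([0.99, 0.90])
print(f"system reliability ~ {job_reliability(0.999, [f1, f2]):.6f}")
```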

4 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this article, the synchronization of the task execution schedule with the underlying communication schedule is investigated, and an open-source software framework for time-triggered end-systems is proposed.
Abstract: In order to guarantee end-to-end latency and minimal jitter in distributed real-time systems, it is necessary to provide tight synchronization between computation and communication. This requires time-predictable execution of tasks across all processing nodes, and the use of a network protocol that can provide a global time base and bounded communication latency. TTEthernet is one such industrial communication protocol. This paper investigates the synchronization of the task execution schedule with the underlying communication schedule, and we propose an open-source software framework for time-triggered end-systems. We present the implementation of a static cyclic task schedule, on a time-predictable platform that is integrated within a TTEthernet network and synchronized with the communication schedule. We evaluate the presented framework by developing a simple one-sensor, one-actuator industrial control example, distributed over three nodes that communicate over a single TTEthernet switch. The presented real-time system can exchange messages with minimal jitter as the distributed tasks are synchronized over the TTEthernet network with about 1.6 µs precision. Due to the tight time synchronization, the system can operate stably with zero missed frames, using a single receiver and a single transmitter buffer.
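
A static cyclic task schedule of this kind can be sketched as follows (a conceptual example, not the paper's open-source framework; the 10 ms cycle, offsets, and task names are assumptions), with task releases at fixed offsets inside a cycle whose start is aligned with the TTEthernet cluster cycle:

```python
# Conceptual static cyclic executive: release each task at a fixed offset
# within a repeating cycle.
import time

CYCLE_US = 10_000  # hypothetical cluster cycle length: 10 ms

def read_sensor():     pass   # placeholder task bodies
def compute():         pass
def send_actuation():  pass

# (release offset within the cycle in microseconds, task function)
SCHEDULE = [(0, read_sensor), (2_000, compute), (6_000, send_actuation)]

def run_cycles(n_cycles, cycle_start_s):
    for k in range(n_cycles):
        base = cycle_start_s + k * CYCLE_US / 1e6
        for offset_us, task in SCHEDULE:
            # Wait for the task's release point in this cycle; on the real system
            # this would be driven by the synchronized network time base.
            while time.time() < base + offset_us / 1e6:
                time.sleep(0.0001)
            task()

run_cycles(3, time.time())
```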

3 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this paper, the authors present an approach to eliminate the interference of low-level non-deterministic I/O interfaces for real-time tasks with high predictability demands while preserving flexibility for tasks with lower requirements.
Abstract: Predictable and analyzable I/O is one of the considerable challenges in the design of multi-core real-time systems. A common approach to tackle this issue is to partition and schedule I/O transactions such that interference between tasks is minimized. While this works for packet-oriented interfaces with deterministic blocking times, such as Ethernet, these techniques are inapplicable to a whole range of I/O devices with nondeterministic behavior that is commonly found in embedded applications. Interfaces, such as SPI, do not allow for fine-grained scheduling and thus exhibit uncontrolled blocking times. Even worse, their configuration and use must be considered as independent transactions requiring costly synchronization between tasks. The resulting detrimental effects are particularly pronounced in settings with mixed task requirements on predictability and determinism. All this makes the temporal analysis of such systems cumbersome and overly pessimistic. To solve these issues, we present LOW_I/O, an approach to eliminate the interference of low-level non-deterministic I/O interfaces for real-time tasks with high predictability demands (i.e., critical tasks) while preserving flexibility for tasks with lower requirements (i.e., uncritical tasks). To this end, we leverage knowledge about the application-specific I/O usage patterns, obtained by static analysis, to derive a tailored hardware architecture. Its key features are the anticipatory reservation of individual time slots for critical tasks and the mimicking of I/O-unit preemptivity for the remaining system. We have implemented our approach as a toolchain for OSEK-based real-time systems that automatically generates an application-specific SoC design along with a hardware and timing model for subsequent WCET analysis. Our experimental results prove predictable timing for critical tasks with limited impact on uncritical tasks.
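
The reservation idea can be illustrated with a small admission check (a software sketch of the concept only; the actual approach is realized in generated hardware, and the slot times and durations below are assumptions): an uncritical SPI transaction is started only if it completes before the next time slot reserved for a critical task:

```python
# Admission check sketch: an uncritical transaction may only start if it does not
# overlap any time slot reserved for critical-task I/O.
RESERVED_SLOTS_US = [(1_000, 1_200), (5_000, 5_300)]  # hypothetical critical slots

def may_start(now_us, transaction_len_us):
    end = now_us + transaction_len_us
    for start, stop in RESERVED_SLOTS_US:
        if now_us < stop and end > start:   # transaction would overlap the slot
            return False                    # defer it, mimicking preemption
    return True

print(may_start(900, 150))    # False: would run into the 1000-1200 us reservation
print(may_start(1_300, 150))  # True: fits between the reserved slots
```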

3 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this article, the authors use dynamic analysis paired with machine learning techniques to train a binary classifier on a collection of system-level metrics, database metrics, and MySQL status variables, and the classifier is loaded into the dynamic analysis code where the model can classify unlabeled data in real-time.
Abstract: This paper presents an approach for real-time detection of the More is Less software performance anti-pattern in MySQL databases. This project uses dynamic analysis paired with machine learning techniques to train a binary classifier on a collection of system-level metrics, database metrics, and MySQL status variables. After training, the classifier is loaded, at runtime, into the dynamic analysis code where the model can classify unlabeled data in real-time. The results of our approach show that the binary classifier can predict the More is Less software performance anti-pattern with 99.1% sensitivity.
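
The detection pipeline can be sketched roughly as follows (the feature names, training data, and random-forest model are assumptions for illustration, not the paper's metric set or classifier): train a binary classifier offline on labelled metric samples, then score live samples at runtime:

```python
# Offline training plus online classification of metric samples.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["cpu_util", "disk_read_kbps", "threads_running", "innodb_row_lock_waits"]

# Offline: rows of metric samples labelled 1 = anti-pattern present, 0 = normal.
X_train = np.array([[0.20, 150, 4, 0],
                    [0.25, 180, 5, 1],
                    [0.90, 40, 120, 300],
                    [0.95, 30, 140, 420]])
y_train = np.array([0, 0, 1, 1])

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# Online: the dynamic-analysis component would assemble the same features from
# system- and MySQL-level counters and ask the model for a verdict.
live_sample = np.array([[0.88, 35, 110, 250]])
print("More-is-Less suspected:", bool(model.predict(live_sample)[0]))
```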

2 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this paper, the authors propose a strategy to degrade distributed systems based on the Artificial Hormone System (AHS) in the event of node failures, by automatically degrading the system in overload situations through de-allocation of low-priority tasks so that only high-priority tasks are running.
Abstract: This paper presents a novel strategy to degrade distributed systems based on the Artificial Hormone System (AHS) in the event of node failures. The AHS is a middleware based on Organic Computing principles to distribute tasks to computing nodes in a self-organizing way. Node failures are automatically detected, resulting in the affected tasks being relocated to healthy nodes, thus exhibiting self-healing capabilities. If tasks are assigned priorities, it is even possible to heal the system if the remaining nodes' combined resources are no longer sufficient to allocate all tasks. This is done by automatically degrading the system in such overload situations through de-allocation of low-priority tasks so that only high-priority tasks are running. We present a novel strategy that governs the order in which low-priority tasks are stopped and prove hard time bounds for the duration of the resulting self-healing process, outperforming a prior strategy. Initial evaluations conducted in an AHS simulator are in accordance with the theoretical considerations.
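
The degradation step can be sketched as a simple greedy procedure (an illustration only; the paper's actual ordering strategy and its proven time bounds are not reproduced here): after a node failure, stop low-priority tasks, lowest priority first, until the surviving capacity suffices:

```python
# Greedy degradation: de-allocate the lowest-priority tasks until the remaining
# load fits the capacity of the surviving nodes.
def degrade(tasks, remaining_capacity):
    """tasks: list of (name, priority, load); higher priority = more important."""
    kept = sorted(tasks, key=lambda t: t[1])          # lowest priority first
    stopped = []
    while kept and sum(load for _, _, load in kept) > remaining_capacity:
        stopped.append(kept.pop(0))                   # stop the lowest-priority task
    return kept, stopped

tasks = [("lane_keeping", 10, 3.0), ("logging", 1, 2.0),
         ("diagnostics", 2, 1.5), ("comfort_ui", 1, 1.0)]
kept, stopped = degrade(tasks, remaining_capacity=4.0)
print("running:", [t[0] for t in kept], "stopped:", [t[0] for t in stopped])
```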

2 citations


Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this article, a software library is presented that can detect the improper use of GPUs in safety-critical computer-vision applications, revealing the presence of issues in all ten applications considered; a case study is also presented, detailing the response-time improvements to one of the applications when such issues are corrected.
Abstract: Computer-vision applications typically rely on graphics processing units (GPUs) to accelerate computations. However, prior work has shown that care must be taken when using GPUs in real-time systems subject to strict timing constraints; without such care, GPU use can easily lead to unexpected delays not only on the GPU device but also on the host CPU. In this paper, a software library is presented that can detect the improper use of GPUs for safety-critical computer-vision applications. This library was used to analyze several GPU-using sample applications available as part of OpenCV, a popular computer-vision library, revealing the presence of issues in all ten applications considered. Additionally, a case study is presented, detailing the response-time improvements to one of the applications when such issues are corrected.

1 citation


Proceedings ArticleDOI
01 Jun 2021
TL;DR: An IDK classifier is a software component that categorizes each input provided to it into one of a fixed set of “classes,” or outputs an “I don't know” (IDK) to indicate that it is unable to classify this input.
Abstract: An IDK classifier is a software component that categorizes each input provided to it into one of a fixed set of “classes,” or outputs an “I don't know” (IDK) to indicate that it is unable to classify this input. An IDK-cascade is a linear arrangement of different IDK classifiers for the same classification problem, which are executed in sequence on a given input until one outputs an actual class (rather than IDK). Given a multistage computation that must be completed within a specified hard end-to-end deadline and a choice of classifiers, one deterministic and one an IDK-cascade, for each stage, the problem of determining which IDK-cascades to schedule in order to minimize the expected end-to-end response time while guaranteeing to meet the specified deadline is considered. Different variants of this problem are defined, and optimal algorithms for solving them are derived.
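
A simplified single-stage version of this trade-off can be sketched as follows (the numbers are invented, and the paper additionally handles multistage pipelines and proves optimality): an IDK-cascade is preferable if its worst case, where every classifier answers IDK before the deterministic one runs, still meets the deadline, and its expected time beats the deterministic classifier alone:

```python
# Expected and worst-case response time of an IDK-cascade with a deterministic fallback.
def cascade_expected_time(stages):
    """stages: list of (exec_time, p_classify), where p_classify is the probability
    of returning an actual class rather than IDK."""
    expected, p_reach = 0.0, 1.0
    for c, p in stages:
        expected += p_reach * c     # this stage runs only if all earlier ones said IDK
        p_reach *= (1.0 - p)
    return expected, p_reach        # p_reach = probability the whole cascade says IDK

DEADLINE = 50.0
DET_TIME = 30.0                                   # deterministic classifier
CASCADE = [(5.0, 0.7), (12.0, 0.8)]               # fast but fallible classifiers

exp_cascade, p_idk = cascade_expected_time(CASCADE)
exp_total = exp_cascade + p_idk * DET_TIME        # fall back to the deterministic one
wc_total = sum(c for c, _ in CASCADE) + DET_TIME  # worst case: everything says IDK

if wc_total <= DEADLINE and exp_total < DET_TIME:
    print(f"use cascade: expected {exp_total:.1f} vs {DET_TIME} (worst case {wc_total})")
else:
    print("use the deterministic classifier alone")
```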

Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this paper, the authors investigated the quality impact and security properties of redundant PTP deployment and proposed an observation-window-based, multi-domain PTP end-system design to increase fault tolerance and security.
Abstract: Distributed real-time systems often rely on time-triggered communication and task execution to guarantee end-to-end latency and time-predictable computation. Such systems require a reliable synchronized network time to be shared among end-systems. The IEEE 1588 Precision Time Protocol (PTP) enables such clock synchronization throughout an Ethernet-based network. While security was not addressed in previous versions of the IEEE 1588 standard, in its most recent iteration (IEEE 1588-2019), several security mechanisms and recommendations were included describing different measures that can be taken to improve system security and safety. One proposal to improve security and reliability is to add redundancy to the network through modifications in the topology. However, this recommendation omits implementation details and leaves open the question of how it affects synchronization quality. This work investigates the quality impact and security properties of redundant PTP deployment and proposes an observation-window-based, multi-domain PTP end-system design to increase fault tolerance and security. We implement the proposed design inside a discrete-event network simulator and evaluate its clock synchronization quality using two test-case network topologies with simulated faults.
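
One way such a multi-domain, observation-window-based end-system could combine its inputs is sketched below (an illustrative assumption, not the authors' design; the window length and offsets are made up): keep recent offset estimates per PTP domain and apply the median across domains, so a single faulty or attacked master is outvoted:

```python
# Median-across-domains correction over a sliding observation window.
from collections import deque
from statistics import median

WINDOW = 8  # hypothetical observation window length (number of sync intervals)
history = {dom: deque(maxlen=WINDOW) for dom in ("domain0", "domain1", "domain2")}

def record_offset(domain, offset_ns):
    history[domain].append(offset_ns)

def clock_correction_ns():
    # Average each domain over its window, then take the median across domains.
    per_domain = [sum(h) / len(h) for h in history.values() if h]
    return median(per_domain) if per_domain else 0.0

# Example: domain2 misbehaves (e.g. a delay attack) and reports a large offset.
for i in range(WINDOW):
    record_offset("domain0", 40 + i)
    record_offset("domain1", 55 - i)
    record_offset("domain2", 5_000)
print(f"applied correction: {clock_correction_ns():.0f} ns")
```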

Proceedings ArticleDOI
01 Jun 2021
TL;DR: T-Pack as discussed by the authors analyzes end-to-end transmission times of packets and detects compromised systems or networks based on deviation of observed time from the expected time on end nodes, well in advance of a task's deadline.
Abstract: Network communication between real-time control systems raises system vulnerability to malware attacks over the network. Such attacks not only result in alteration of system behavior but also incur timing dilation due to executing injected code or, in the case of network attacks, due to dropped, added, rerouted, or modified packets. This work proposes to detect intrusions based on the timing dilation induced by delays within the network, which can result in system malfunction due to missed deadlines. A new method of timed packet protection, T-Pack, analyzes end-to-end transmission times of packets and detects a compromised system or network based on deviation of observed time from the expected time on end nodes, well in advance of a task's deadline. First, the Linux network stack is extended with timing information maintained within the kernel and further embedded within packets for TCP and UDP communication. Second, real-time application scenarios are analyzed in terms of their susceptibility to malware attacks. Results are evaluated on a distributed system of embedded platforms running a PREEMPT_RT Linux kernel to demonstrate the approach's real-time capabilities.
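
The basic check can be sketched in user space as follows (T-Pack itself lives in the Linux network stack; the expected transit time, alarm margin, message format, and addresses here are assumptions): the sender embeds a timestamp, and the receiver compares the observed transit time against the expected bound well before the task's deadline:

```python
# User-space sketch of timestamp-based transit-time checking.
import socket, struct, time

EXPECTED_TRANSIT_S = 0.002   # hypothetical expected end-to-end transmission time
ALARM_MARGIN = 2.0           # flag packets that take more than 2x the expected time

def send(sock, dest, payload: bytes):
    # Prepend the send timestamp to the payload (the kernel-level T-Pack design
    # instead maintains and embeds this timing information inside the stack).
    sock.sendto(struct.pack("!d", time.time()) + payload, dest)

def receive_and_check(sock):
    data, _ = sock.recvfrom(2048)
    (t_sent,) = struct.unpack("!d", data[:8])
    transit = time.time() - t_sent   # assumes synchronized clocks between end nodes
    if transit > ALARM_MARGIN * EXPECTED_TRANSIT_S:
        print(f"possible intrusion / timing dilation: {transit * 1e3:.2f} ms transit")
    return data[8:], transit

tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send(tx, ("10.0.0.2", 50001), b"control update")   # placeholder peer address
```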

Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this article, an adaptive partitioning approach for scheduling real-time tasks on symmetric multicore systems is proposed based on combining partitioned EDF scheduling with an adaptive migration policy that moves tasks across processors only when strictly needed to respect their temporal constraints.
Abstract: This paper provides an open implementation and an experimental evaluation of an adaptive partitioning approach for scheduling real-time tasks on symmetric multicore systems. The proposed technique is based on combining partitioned EDF scheduling with an adaptive migration policy that moves tasks across processors only when strictly needed to respect their temporal constraints. The implementation of the technique within the Linux kernel, via modifications to the SCHED_DEADLINE code base, is presented. An extensive experimental evaluation has been conducted by applying the technique on a real multi-core platform with several randomly generated synthetic task sets. The obtained experimental results highlight that the approach exhibits promising performance for scheduling real-time workloads on a real system, with a greatly reduced number of migrations compared to the original global EDF available in SCHED_DEADLINE.
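
The placement policy can be sketched roughly as follows (a simplification; the real implementation modifies SCHED_DEADLINE inside the kernel and uses more precise admission tests): a task stays on its CPU as long as a simple EDF utilization check passes, and it is migrated only when it would otherwise make its CPU infeasible:

```python
# Utilization-based placement: keep a task local if possible, migrate only on demand.
def cpu_utilization(tasks):
    return sum(c / t for c, t in tasks)          # (runtime, period) pairs

def place(task, cpus):
    """cpus: list of task lists, with the task's current CPU listed first.
    Migrate only if the current CPU would exceed the EDF bound of 1.0."""
    for cpu in cpus:
        if cpu_utilization(cpu + [task]) <= 1.0:
            cpu.append(task)
            return cpus.index(cpu)
    return None                                   # not schedulable anywhere

cpus = [[(2, 10), (3, 10)], [(1, 10)]]            # CPU0 at 0.5, CPU1 at 0.1
print("placed on CPU", place((6, 10), cpus))      # 0.5 + 0.6 > 1 -> migrates to CPU1
```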

Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this paper, the authors propose two recovery measures to detect spurious multiple task assignments caused by communication dropouts, quickly remove the spurious tasks, and prevent output conflicts, and evaluate the effectiveness of these measures using the example of a self-balancing robot vehicle.
Abstract: Embedded systems are growing very complex because of increasing chip integration density, the larger number of chips in distributed applications, and demanding application fields. Bio-inspired techniques like self-organization are a key feature to handle this increasing complexity. The artificial hormone system (AHS) and the artificial DNA (ADNA) are exploiting such principles to enable self-organizing, self-building and self-healing distributed embedded real-time systems. The ADNA represents the building plan for the system stored in each node of the distributed structure. The AHS manages the decentralized allocation of the components (tasks) of the building plan to the nodes. Overall, a flexible and robust system is created. However, the AHS/ADNA system is vulnerable to communication dropouts between the nodes. If such dropouts last long enough, the system structure is disturbed by spurious multiple assignment of tasks. This causes conflicts in the data output of these tasks. This paper presents an approach to recover from such dropouts. We propose two recovery measures to detect such situations, quickly remove the spurious tasks, and prevent output conflicts. The evaluation shows the effectiveness of these measures using the example of a self-balancing robot vehicle.

Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this paper, the authors present an instruction filter, a simple architecture extension that adds support for fully predicated execution to existing processor cores that do not natively support it, which makes single-path code execution and hence high quality and easily derivable worst-case execution time (WCET) information available for a wide range of processors.
Abstract: In this paper, we present an instruction filter, a simple architecture extension that adds support for fully predicated execution to existing processor cores that do not natively support it. This makes single-path code execution and hence high quality and easily derivable worst-case execution time (WCET) information available for a wide range of processors. We have implemented the single-path instruction filter for two processors and evaluated it on the TACLe benchmark collection. The results demonstrate that despite the seeming inefficiency of single-path code, our method does not substantially increase the WCET. Therefore, running single-path code on processors with our instruction filter represents a competitive method for time-predictable code execution.