
Showing papers on "Application software" published in 2009


Proceedings ArticleDOI
18 May 2009
TL;DR: This work presents Eucalyptus -- an open-source software framework for cloud computing that implements what is commonly referred to as Infrastructure as a Service (IaaS); systems that give users the ability to run and control entire virtual machine instances deployed across a variety of physical resources.
Abstract: Cloud computing systems fundamentally provide access to large pools of data and computational resources through a variety of interfaces similar in spirit to existing grid and HPC resource management and programming systems. These types of systems offer a new programming target for scalable application developers and have gained popularity over the past few years. However, most cloud computing systems in operation today are proprietary, rely upon infrastructure that is invisible to the research community, or are not explicitly designed to be instrumented and modified by systems researchers. In this work, we present Eucalyptus -- an open-source software framework for cloud computing that implements what is commonly referred to as Infrastructure as a Service (IaaS); systems that give users the ability to run and control entire virtual machine instances deployed across a variety of physical resources. We outline the basic principles of the Eucalyptus design, detail important operational aspects of the system, and discuss architectural trade-offs that we have made in order to allow Eucalyptus to be portable, modular and simple to use on infrastructure commonly found within academic settings. Finally, we provide evidence that Eucalyptus enables users familiar with existing Grid and HPC systems to explore new cloud computing functionality while maintaining access to existing, familiar application development software and Grid middleware.
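
Eucalyptus is known for exposing an EC2-compatible front end, so standard client tooling can drive a private installation. The snippet below is a minimal sketch of launching and inspecting an instance against such an endpoint; the endpoint URL, credentials, and image ID are placeholders rather than values from the paper, and boto3 is used purely for illustration.

```python
# Minimal sketch: driving an EC2-compatible IaaS endpoint (e.g., a private
# Eucalyptus cloud) with boto3. Endpoint, credentials, and image ID are
# illustrative placeholders, not values from the paper.
import boto3

ec2 = boto3.client(
    "ec2",
    endpoint_url="http://eucalyptus.example.edu:8773/services/compute",  # hypothetical front end
    aws_access_key_id="EXAMPLE_KEY",
    aws_secret_access_key="EXAMPLE_SECRET",
    region_name="eucalyptus",
)

# Launch a single virtual machine instance from a registered image.
resp = ec2.run_instances(ImageId="emi-12345678", MinCount=1, MaxCount=1,
                         InstanceType="m1.small")
instance_id = resp["Instances"][0]["InstanceId"]

# Query the instance state after launch.
desc = ec2.describe_instances(InstanceIds=[instance_id])
print(desc["Reservations"][0]["Instances"][0]["State"]["Name"])
```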

1,962 citations


Proceedings ArticleDOI
17 May 2009
TL;DR: Native Client is a sandbox for untrusted x86 native code that uses software fault isolation and a secure runtime to direct system interaction and side effects through interfaces managed by Native Client.
Abstract: This paper describes the design, implementation and evaluation of Native Client, a sandbox for untrusted x86 native code. Native Client aims to give browser-based applications the computational performance of native applications without compromising safety. Native Client uses software fault isolation and a secure runtime to direct system interaction and side effects through interfaces managed by Native Client. Native Client provides operating system portability for binary code while supporting performance-oriented features generally absent from web application programming environments, such as thread support, instruction set extensions such as SSE, and use of compiler intrinsics and hand-coded assembler. We combine these properties in an open architecture that encourages community review and 3rd-party tools.
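
Software fault isolation of the kind Native Client relies on confines untrusted code by forcing computed control transfers into an aligned, bounded region. The snippet below is only a conceptual illustration of that masking arithmetic, written in Python with made-up sandbox constants; it is not NaCl's actual validator or instruction-rewriting scheme.

```python
# Conceptual illustration of software-fault-isolation style masking:
# computed jump targets are forced onto fixed-size instruction "bundles"
# inside a bounded sandbox region. All constants are illustrative only.
SANDBOX_BASE = 0x10000000      # hypothetical start of the untrusted region
SANDBOX_SIZE = 1 << 28         # hypothetical 256 MiB sandbox
BUNDLE_SIZE  = 32              # NaCl-style 32-byte instruction bundles

def sandboxed_jump_target(target: int) -> int:
    """Clamp an arbitrary computed target into the sandbox, bundle-aligned."""
    offset = (target - SANDBOX_BASE) % SANDBOX_SIZE   # keep it inside the region
    offset &= ~(BUNDLE_SIZE - 1)                      # align to a bundle boundary
    return SANDBOX_BASE + offset

# An out-of-range, misaligned target gets forced back into the sandbox.
print(hex(sandboxed_jump_target(0xdeadbeef)))
```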

560 citations


Journal ArticleDOI
TL;DR: This paper presents a state of the art of software architecture reconstruction approaches, surveying and comparing the plethora of approaches and techniques that support architecture reconstruction.
Abstract: To maintain and understand large applications, it is important to know their architecture. The first problem is that unlike classes and packages, architecture is not explicitly represented in the code. The second problem is that successful applications evolve over time, so their architecture inevitably drifts. Reconstructing the architecture and checking whether it is still valid is therefore an important aid. While there is a plethora of approaches and techniques supporting architecture reconstruction, there is no comprehensive software architecture reconstruction state of the art and it is often difficult to compare the approaches. This paper presents a state of the art in software architecture reconstruction approaches.

355 citations


Proceedings ArticleDOI
13 Apr 2009
TL;DR: The architecture, design, and preliminary evaluation of ACme, a wireless sensor and actuator network for monitoring AC energy usage and controlling AC devices in a large and diverse building environment, is presented.
Abstract: We present the architecture, design, and preliminary evaluation of ACme, a wireless sensor and actuator network for monitoring AC energy usage and controlling AC devices in a large and diverse building environment. The ACme system consists of three tiers: the ACme node, which provides a metering and control interface to a single outlet; a network fabric, which allows this interface to be exported to arbitrary IP endpoints; and application software that uses this networked interface to provide various power-centric applications. The ACme node integrates an Epic core module with a dedicated energy metering IC to provide real, reactive, and apparent power measurements, with optional control of an attached load. The network comprises a complete IPv6/6LoWPAN stack on every node and an edge router that connects to other IP networks. The application tier receives and stores readings in a database and uses a web server for visualization. Nodes automatically join the IPv6 subnet after being plugged in, and begin interactions with the application layer. We evaluate our system in a preliminary green building deployment with 49 nodes spread over several floors of a Computer Science Building and present energy consumption data from this preliminary deployment.
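
The application tier described above reduces to receiving periodic power readings from IPv6 nodes and persisting them for visualization. The sketch below shows such a collector, assuming a hypothetical newline-delimited JSON report format and port; the actual ACme wire format is not reproduced here.

```python
# Minimal sketch of an application-tier collector for metering nodes:
# listen on an IPv6 UDP socket, parse JSON power reports, store in SQLite.
# The report format and port are assumptions, not the ACme wire format.
import json, socket, sqlite3, time

db = sqlite3.connect("acme_readings.db")
db.execute("""CREATE TABLE IF NOT EXISTS readings
              (ts REAL, node TEXT, real_w REAL, reactive_var REAL, apparent_va REAL)""")

sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
sock.bind(("::", 7001))  # hypothetical report port

while True:
    data, addr = sock.recvfrom(1024)
    report = json.loads(data)  # e.g. {"node": "...", "real_w": ..., "reactive_var": ..., "apparent_va": ...}
    db.execute("INSERT INTO readings VALUES (?, ?, ?, ?, ?)",
               (time.time(), report["node"], report["real_w"],
                report["reactive_var"], report["apparent_va"]))
    db.commit()
```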

320 citations


Proceedings ArticleDOI
17 Jun 2009
TL;DR: This work discusses the traditional integration solutions and proposes and implements an alternative architecture in which sensor nodes are accessible according to REST principles; the nodes thus become part of a "Web of Things", and interacting with them, as well as composing their services with existing ones, becomes almost as easy as browsing the web.
Abstract: Wireless Sensor Networks (WSNs) have promising industrial applications, since they reduce the gap between traditional enterprise systems and the real world. However, every particular application requires complex integration work, and therefore technical expertise, effort and time, which prevents users from creating small tactical, ad-hoc applications using sensor networks. Following the success of Web 2.0 "mashups", we propose a similar lightweight approach for combining enterprise services (e.g. ERPs) with WSNs. Specifically, we discuss the traditional integration solutions, and propose and implement an alternative architecture in which sensor nodes are accessible according to the REST principles. With this approach, the nodes become part of a "Web of Things", and interacting with them, as well as composing their services with existing ones, becomes almost as easy as browsing the web.
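
When a node exposes its sensors and actuators as RESTful resources, a mashup can treat it like any other web resource. The sketch below reads a measurement and drives an actuator with plain HTTP; the node host name and resource paths are invented for illustration and are not those of the paper's prototype.

```python
# Minimal sketch of interacting with a RESTful sensor node ("Web of Things"):
# resources are plain URLs, so standard HTTP verbs read sensors and drive
# actuators. The host name and paths below are hypothetical.
import requests

NODE = "http://sensor-node-42.example.org"

# GET a sensed value as a JSON representation of the resource.
temperature = requests.get(f"{NODE}/sensors/temperature", timeout=5).json()
print("temperature:", temperature)

# PUT a new state to an actuator resource (e.g., switch a relay on).
requests.put(f"{NODE}/actuators/relay", json={"state": "on"}, timeout=5)
```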

234 citations


Journal ArticleDOI
TL;DR: At the core of cloud computing is a simple concept: software as a service, or SaaS; just a step away from that core, however, a complex concoction of paradigms, concepts, and technologies envelops cloud computing.
Abstract: At the core of cloud computing is a simple concept: software as a service, or SaaS. Whether the underlying software is an application, application component, platform, framework, environment, or some other soft infrastructure for composing applications to be delivered as a service on the Web, it's all software in the end. But the simplicity ends there. Just a step away from that core, a complex concoction of paradigms, concepts, and technologies envelops cloud computing.

229 citations


Patent
17 Apr 2009
TL;DR: In this article, an annotation module is provided on a client machine as a plugin for a web browser application, which allows the user to interact with the browser application to annotate a document displayed on the screen.
Abstract: A variety of technologies can be used to annotate electronic documents. In one embodiment, an annotation module is provided on a client machine as a plugin for a web browser application. The annotation module provides a user interface which allows the user to interact with the web browser application to annotate a document displayed using the browser application. Other embodiments are described.

207 citations


Proceedings ArticleDOI
18 May 2009
TL;DR: This paper proposes using explicit variability models to systematically derive customization and deployment information for individual SaaS tenants and demonstrates how variability models could be used to systematically consider information about already deployed SaaS applications for efficiently deploying SaaS applications for new tenants.
Abstract: More and more companies are offering their software by following the Software as a Service (SaaS) model. The promise of the SaaS model is to exploit economies of scale on the provider side by hosting multiple customers (or tenants) on the same hardware and software infrastructure. However, to attract a significant number of tenants, SaaS applications have to be customizable to fulfill the varying functional and quality requirements of individual tenants. In this paper, we describe how variability modeling techniques from software product line engineering can support SaaS providers in managing the variability of SaaS applications and their requirements. Specifically, we propose using explicit variability models to systematically derive customization and deployment information for individual SaaS tenants. We also demonstrate how variability models could be used to systematically consider information about already deployed SaaS applications for efficiently deploying SaaS applications for new tenants. We illustrate our approach by a running example for a meeting planning application.
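
The core idea, deriving a tenant-specific configuration from an explicit variability model, can be illustrated with a toy feature model. The sketch below is not the paper's notation; it simply encodes mandatory, optional, and alternative features for a meeting-planning example and checks a tenant's selection against them.

```python
# Toy sketch of deriving a tenant configuration from an explicit variability
# model (software product line style). Features and constraints are invented
# for a meeting-planning example; this is not the paper's notation.
VARIABILITY_MODEL = {
    "mandatory": {"scheduling", "user_management"},
    "optional": {"video_conferencing", "room_booking", "sms_reminders"},
    "alternatives": {"payment": {"invoice", "credit_card"}},  # pick exactly one
}

def derive_configuration(selection: set, alternative_choices: dict) -> set:
    """Validate a tenant's feature selection and return the deployable feature set."""
    unknown = selection - VARIABILITY_MODEL["optional"]
    if unknown:
        raise ValueError(f"unknown optional features: {unknown}")
    for group, options in VARIABILITY_MODEL["alternatives"].items():
        if alternative_choices.get(group) not in options:
            raise ValueError(f"exactly one of {options} must be chosen for '{group}'")
    return (VARIABILITY_MODEL["mandatory"] | selection |
            {alternative_choices[g] for g in VARIABILITY_MODEL["alternatives"]})

# Example tenant: wants room booking and SMS reminders, pays by invoice.
print(derive_configuration({"room_booking", "sms_reminders"}, {"payment": "invoice"}))
```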

204 citations


Patent
14 Aug 2009
TL;DR: A self-evolving cyber robot is provided that lets a user create and evolve a personal cyber robot into various artificial-intelligence robots; a robot management server synchronizes information in real time, provides various robots, stores and manages robot information, and guides the information of a connected user.
Abstract: A self-evolving cyber robot is provided that lets a user create and evolve a personal cyber robot into various artificial-intelligence robots. A user terminal (100) performs the creation and growth of a personal cyber robot having the knowledge of a user. An application software server (310) is connected with the user terminal through the cyber space (300) to perform the provision and management of all software. A robot management server (330) synchronizes information in real time, provides various robots, stores and manages the robot information, and guides the information of a connected user.

158 citations


Proceedings ArticleDOI
23 May 2009
TL;DR: An input-adaptive optimization framework, namely G-ADAPT, is developed to address the influence of program inputs by constructing cross-input predictive models for automatically predicting the (near-)optimal configurations for an arbitrary input to a GPU program.
Abstract: Recent years have seen a trend in using graphic processing units (GPU) as accelerators for general-purpose computing. The inexpensive, single-chip, massively parallel architecture of GPU has evidentially brought factors of speedup to many numerical applications. However, the development of a high-quality GPU application is challenging, due to the large optimization space and complex unpredictable effects of optimizations on GPU program performance. Recently, several studies have attempted to use empirical search to help the optimization. Although those studies have shown promising results, one important factor—program inputs—in the optimization has remained unexplored. In this work, we initiate the exploration in this new dimension. By conducting a series of measurements, we find that the ability to adapt to program inputs is important for some applications to achieve their best performance on GPU. In light of the findings, we develop an input-adaptive optimization framework, namely G-ADAPT, to address the influence of program inputs by constructing cross-input predictive models for automatically predicting the (near-)optimal configurations for an arbitrary input to a GPU program. The results demonstrate the promise of the framework in serving as a tool to alleviate the productivity bottleneck in GPU programming.
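
A cross-input predictive model of this kind can be approximated by any learner that maps input characteristics to the best-performing configuration found during an offline empirical search. The sketch below uses a scikit-learn decision tree on made-up training data (input size to best thread-block size); it illustrates the idea only and is not G-ADAPT's actual model or feature set.

```python
# Sketch of a cross-input predictive model: learn, from offline measurements,
# which GPU configuration (here, thread-block size) was best for a given input,
# then predict a configuration for unseen inputs. All data are made up.
from sklearn.tree import DecisionTreeClassifier

# (input size, best block size found by empirical search) -- illustrative only
training = [(1_000, 64), (10_000, 128), (50_000, 128),
            (200_000, 256), (1_000_000, 256), (5_000_000, 512)]
X = [[size] for size, _ in training]
y = [block for _, block in training]

model = DecisionTreeClassifier(max_depth=3).fit(X, y)

# At run time, predict a (near-)optimal configuration for an arbitrary input.
for size in (3_000, 120_000, 2_000_000):
    print(size, "->", model.predict([[size]])[0])
```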

150 citations


Journal ArticleDOI
TL;DR: This paper presents process-level redundancy, a software technique for transient fault tolerance that leverages multiple cores for low overhead and shifts the focus from ensuring correct hardware execution to ensuring correct software execution.
Abstract: Transient faults are emerging as a critical concern in the reliability of general-purpose microprocessors. As architectural trends point toward multicore designs, there is substantial interest in adapting such parallel hardware resources for transient fault tolerance. This paper presents process-level redundancy (PLR), a software technique for transient fault tolerance, which leverages multiple cores for low overhead. PLR creates a set of redundant processes per application process and systematically compares the processes to guarantee correct execution. Redundancy at the process level allows the operating system to freely schedule the processes across all available hardware resources. PLR uses a software-centric approach to transient fault tolerance, which shifts the focus from ensuring correct hardware execution to ensuring correct software execution. As a result, many benign faults that do not propagate to affect program correctness can be safely ignored. A real prototype is presented that is designed to be transparent to the application and can run on general-purpose single-threaded programs without modifications to the program, operating system, or underlying hardware. The system is evaluated for fault coverage and performance on a four-way SMP machine and provides improved performance over existing software transient fault tolerance techniques with a 16.9 percent overhead for fault detection on a set of optimized SPEC2000 binaries.
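
Process-level redundancy can be sketched at user level: run redundant copies of a computation as separate OS processes and compare their outputs, flagging a possible transient fault when they disagree. The snippet below uses Python's multiprocessing for a three-way redundant run with majority voting; it is a toy illustration, not the transparent prototype described in the paper.

```python
# Toy sketch of process-level redundancy: execute redundant copies of the
# same computation in separate processes and majority-vote on the results.
# This is an illustration only, not the paper's transparent PLR prototype.
from collections import Counter
from multiprocessing import Pool

def computation(x: int) -> int:
    """Stand-in for an application; a transient fault could corrupt its result."""
    return sum(i * i for i in range(x))

def redundant_run(x: int, copies: int = 3) -> int:
    with Pool(processes=copies) as pool:
        results = pool.map(computation, [x] * copies)
    value, votes = Counter(results).most_common(1)[0]
    if votes < copies:
        print("disagreement detected -- possible transient fault:", results)
    return value

if __name__ == "__main__":
    print(redundant_run(100_000))
```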

Patent
30 Dec 2009
TL;DR: In this paper, the authors present a mobile application ecosystem comprising a mobile app development kit and store, both of which are implemented as web-based services such that creation, testing, and distribution of mobile applications, as well as discovery, investigation, and delivery of same, can all be performed using a standard web browser.
Abstract: The present invention provides a mobile application ecosystem comprising a mobile application development kit and store, both of which are implemented as web-based services such that creation, testing, and distribution of mobile applications, as well as discovery, investigation, and delivery of same, can all be performed using a standard web browser. The mobile application development kit offers common capabilities across all target mobile device brand and brand groups, allowing the same application construct to work unmodified on all, while building the application in a manner that is native to each, thereby avoiding any requirement to embed a separate common runtime or virtual machine on every mobile device.

Proceedings ArticleDOI
23 May 2009
TL;DR: This paper considers the benefits of running on multiple (parallel) GPUs to provide further orders of performance speedup and develops a methodology that allows developers to accurately predict execution time for GPU applications while varying the number and configuration of the GPUs, and the size of the input data set.
Abstract: Graphics Processing Units (GPUs) have been growing in popularity due to their impressive processing capabilities, and with general purpose programming languages such as NVIDIA's CUDA interface, are becoming the platform of choice in the scientific computing community. Previous studies that used GPUs focused on obtaining significant performance gains from execution on a single GPU. These studies employed low-level, architecture-specific tuning in order to achieve sizeable benefits over multicore CPU execution. In this paper, we consider the benefits of running on multiple (parallel) GPUs to provide further orders of performance speedup. Our methodology allows developers to accurately predict execution time for GPU applications while varying the number and configuration of the GPUs, and the size of the input data set. This is a natural next step in GPU computing because it allows researchers to determine the most appropriate GPU configuration for an application without having to purchase hardware, or write the code for a multiple-GPU implementation. When used to predict performance on six scientific applications, our framework produces accurate performance estimates (11% difference on average and 40% maximum difference in a single case) for a range of short and long running scientific programs.

Journal ArticleDOI
01 Nov 2009
TL;DR: A comprehensive empirical study examining two different methodologies, data sampling and boosting, for improving the performance of decision-tree models designed to identify fp software modules shows that while data-sampling techniques are very effective in improving the performance of such models, boosting almost always outperforms even the best data-sampling techniques.
Abstract: Software-quality data sets tend to fall victim to the class-imbalance problem that plagues so many other application domains. The majority of faults in a software system, particularly high-assurance systems, usually lie in a very small percentage of the software modules. This imbalance between the number of fault-prone (fp) and non-fp (nfp) modules can have a severely negative impact on a data-mining technique's ability to differentiate between the two. This paper addresses the class-imbalance problem as it pertains to the domain of software-quality prediction. We present a comprehensive empirical study examining two different methodologies, data sampling and boosting, for improving the performance of decision-tree models designed to identify fp software modules. This paper applies five data-sampling techniques and boosting to 15 software-quality data sets of different sizes and levels of imbalance. Nearly 50 000 models were built for the experiments contained in this paper. Our results show that while data-sampling techniques are very effective in improving the performance of such models, boosting almost always outperforms even the best data-sampling techniques. This significant result, which, to our knowledge, has not been previously reported, has important consequences for practitioners developing software-quality classification models.
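
The comparison reported here, data sampling versus boosting on imbalanced data, can be reproduced in miniature with off-the-shelf libraries. The sketch below contrasts random undersampling plus a decision tree with AdaBoost on a synthetic imbalanced dataset; the libraries, data, and metric are illustrative choices, not the 15 software-quality data sets used in the study.

```python
# Miniature sketch of the comparison: data sampling vs. boosting for an
# imbalanced classification task. Synthetic data and library choices are
# illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from imblearn.under_sampling import RandomUnderSampler

X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Strategy 1: rebalance the training data, then fit a plain decision tree.
X_bal, y_bal = RandomUnderSampler(random_state=0).fit_resample(X_tr, y_tr)
tree = DecisionTreeClassifier(random_state=0).fit(X_bal, y_bal)

# Strategy 2: boost decision trees on the original, imbalanced training data.
boost = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

for name, clf in [("sampling + tree", tree), ("boosting", boost)]:
    print(name, roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```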

Proceedings ArticleDOI
17 May 2009
TL;DR: This work proposes CLAMP, an architecture for preventing data leaks even in the presence of web server compromises or SQL injection attacks, and arrives at an architecture that allows developers to use familiar operating systems, servers, and scripting languages, while making relatively few changes to application code.
Abstract: Providing online access to sensitive data makes web servers lucrative targets for attackers. A compromise of any of the web server's scripts, applications, or operating system can leak the sensitive data of millions of customers. Unfortunately, many systems for stopping data leaks require considerable effort from application developers, hindering their adoption. In this work, we investigate how such leaks can be prevented with minimal developer effort. We propose CLAMP, an architecture for preventing data leaks even in the presence of web server compromises or SQL injection attacks. CLAMP protects sensitive data by enforcing strong access control on user data and by isolating code running on behalf of different users. By focusing on minimizing developer effort, we arrive at an architecture that allows developers to use familiar operating systems, servers, and scripting languages, while making relatively few changes to application code -- less than 50 lines in our applications.

Proceedings ArticleDOI
11 May 2009
TL;DR: The focus of this paper is on the integration of radio frequency identification (RFID) and wireless sensor network (WSN) in smart homes and applications of this system such as identifying a caregiver who enters the home.
Abstract: With the aging population and increased need to care for the elderly there are fewer of the younger generation to administer the necessary care and supervision. This condition is one of the reasons many researchers devote their time in evolving smart homes. These homes offer the occupant(s) a level of convenience not seen in traditional homes by using technology to create an environment that is aware of the activities taking place within it. The focus of this paper is on the integration of radio frequency identification (RFID) and wireless sensor network (WSN) in smart homes and applications of this system such as identifying a caregiver who enters the home. In the following work we present an architecture consisting of RFID, a WSN to identify motion within an environment and who is moving as well as several useful applications which take advantage of this information.

Journal ArticleDOI
TL;DR: A new method of analysis, based on a list of criteria that indicate a disruptive innovation and on trajectory maps of the technologies' performance attributes, indicates a small likelihood that web applications will pose a disruptive threat to Microsoft and, by extension, to incumbents in the software industry.

Patent
24 Apr 2009
TL;DR: In this paper, a two-way isolation of the distributed resources of an application (e.g., the executing application, the application user interface on the user's computer, and server-and client-side stored resources) from other applications may be desirable.
Abstract: An application host (such as a web application server) may execute a set of applications on behalf of a set of users. Such applications may not be fully trusted, and a two-way isolation of the distributed resources of an application (e.g., the executing application, the application user interface on the user's computer, and server- and client-side stored resources) from other applications may be desirable. This isolation may be promoted utilizing the cross-domain restriction policies of each user's computer by allocating a distinct subdomain of the application host for each application. The routing of network requests to a large number of distinct subdomains may be economized by mapping all distinct subdomains to the address of the domain of the application host. Moreover, the application user interfaces may be embedded in an isolation construct (e.g., an IFRAME HTML element) to promote two-way isolation among application user interfaces and client-side application resources.

Journal ArticleDOI
TL;DR: A prototype application called Open Smart Classroom is built on a software infrastructure based on the multiagent system architecture using Web Service technology in Smart Space, and an experiment connecting two classrooms demonstrates the influence of these new features on the educational effect.
Abstract: Real-time interactive virtual classroom with teleeducation experience is an important approach in distance learning. However, most current systems fail to meet new challenges in extensibility and scalability, which mainly lie with three issues. First, an open system architecture is required to better support the integration of increasing human-computer interfaces and personal mobile devices in the classroom. Second, the learning system should facilitate opening its interfaces, which will help easy deployment that copes with different circumstances and allows other learning systems to talk to each other. Third, problems emerge on binding existing systems of classrooms together in different places or even different countries such as tackling systems intercommunication and distant intercultural learning in different languages. To address these issues, we build a prototype application called Open Smart Classroom built on our software infrastructure based on the multiagent system architecture using Web Service technology in Smart Space. Besides the evaluation of the extensibility and scalability of the system, an experiment connecting two Open Smart Classrooms deployed in different countries is also undertaken, which demonstrates the influence of these new features on the educational effect. Interesting and optimistic results obtained show a significant research prospect for developing future distant learning systems.

Journal ArticleDOI
TL;DR: In this paper, the authors propose to achieve fault tolerance by employing redundancy at the core level instead of at the microarchitecture level, and present reconfiguration solutions that not only maximize the performance of the on-chip communication scheme but also provide a unified topology to the operating system and application software running on the processor.
Abstract: Homogeneous manycore systems are emerging for tera-scale computation and typically utilize Network-on-Chip (NoC) as the communication scheme between embedded cores. Effective defect tolerance techniques are essential to improve the yield of such complex integrated circuits. We propose to achieve fault tolerance by employing redundancy at the core-level instead of at the microarchitecture level. When faulty cores exist on-chip in this architecture, however, the physical topologies of various manufactured chips can be significantly different. How to reconfigure the system with the most effective NoC topology is a relevant research problem. In this paper, we first show that this problem is an instance of a well known NP-complete problem. We then present novel solutions for the above problem, which not only maximize the performance of the on-chip communication scheme, but also provide a unified topology to Operating System and application software running on the processor. Experimental results show the effectiveness of the proposed techniques.

Proceedings ArticleDOI
19 Jul 2009
TL;DR: The paper addresses how TSP can be used to safely integrate applications of different criticality and security classifications, and how incremental validation is supported to control the impact of software modifications to the system.
Abstract: This paper will describe the benefits of incorporating software Time and Space Partitioning (TSP), based upon the aeronautic IMA concept, into the spacecraft avionics architecture to manage the growth of mission functions implemented in the on-board software. The paper addresses how TSP can be used to safely integrate applications of different criticality and security classifications, and how incremental validation is supported to control the impact of software modifications to the system.

Proceedings ArticleDOI
01 Dec 2009
TL;DR: This paper formulates a computing cloud as a kind of graph, a computing resource such as services or intellectual property access rights as an attribute of a graph node, and the use of a resource as a predicate on an edge of the graph.
Abstract: What is a cloud application precisely? In this paper, we formulate a computing cloud as a kind of graph, a computing resource such as services or intellectual property access rights as an attribute of a graph node, and the use of a resource as a predicate on an edge of the graph. We also propose to model cloud computation semantically as a set of paths in a subgraph of the cloud such that every edge contains a predicate that is evaluated to be true. Finally, we present algorithms to compose cloud computations and a family of model-based testing criteria to support the testing of cloud applications.
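
The graph formulation can be made concrete with a small sketch: nodes carry resource attributes, edges carry predicates, and a computation is a path whose edge predicates all evaluate to true. The encoding below (adjacency lists of (target, predicate) pairs) is one possible reading of the model, not the paper's formal definition.

```python
# Sketch of the "cloud as a graph" model: node attributes describe resources,
# edge predicates describe permitted uses, and a cloud computation is a path
# on which every edge predicate holds. The encoding is one illustrative reading.
nodes = {
    "storage": {"service": "blob-store", "capacity_gb": 500},
    "compute": {"service": "batch", "cores": 64},
    "license": {"ip_right": "codec-X"},
}

# edges: source -> list of (target, predicate over the target's attributes)
edges = {
    "storage": [("compute", lambda attrs: attrs["cores"] >= 32)],
    "compute": [("license", lambda attrs: attrs.get("ip_right") == "codec-X")],
    "license": [],
}

def valid_path(path):
    """A computation is valid if every edge predicate along the path is true."""
    for src, dst in zip(path, path[1:]):
        if not any(t == dst and pred(nodes[dst]) for t, pred in edges[src]):
            return False
    return True

print(valid_path(["storage", "compute", "license"]))   # True
print(valid_path(["storage", "license"]))               # False: no such edge
```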

Proceedings ArticleDOI
09 Mar 2009
TL;DR: In this article, a high-performance remote computing platform is introduced to enable thin clients to access the applications in remote servers, where all the data redundancies are exploited in the compression viewport.
Abstract: The pervasive computing environment and the wide network bandwidth provide users more opportunities to utilize remote computing resources. In this paper, we introduce a high-performance remote computing platform to enable thin clients to access the applications in remote servers. Our system employs a compression-friendly model, where all the data redundancies are exploited in the compression viewport. Within this model, we also propose a unified coding scheme to efficiently compress the vast-spectral screens. Several mechanisms are further proposed to improve interactivity and reduce the bandwidth consumption in the system, including adaptive transmission for heterogeneous display devices. Simulations with real applications demonstrate that our system significantly outperforms previous thin-client systems. Especially for video playback under moderate- and low-bandwidth network conditions, our system can achieve the frame update rate 5 times as high as the previous counterparts.

Proceedings ArticleDOI
24 Aug 2009
TL;DR: This work develops compositional techniques for automated scheduling of partitions in a distributed real-time avionics system and proposes a principled approach for scheduling ARINC-653 partitions that should facilitate system integration.
Abstract: ARINC specification 653-2 describes the interface between application software and underlying middleware in a distributed real-time avionics system. The real-time workload in this system comprises partitions, where each partition consists of one or more processes. Processes incur blocking and preemption overheads and can communicate with other processes in the system. In this work we develop compositional techniques for automated scheduling of such partitions and processes. At present, system designers manually schedule partitions based on interactions they have with the partition vendors. This approach is not only time consuming, but can also result in under utilization of resources. In contrast, the technique proposed in this paper is a principled approach for scheduling ARINC-653 partitions and therefore should facilitate system integration.
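
At the level of a single module, an ARINC-653 schedule is a repeating major time frame divided into fixed partition windows. The sketch below builds such a cyclic table from per-partition periods and budgets using a greedy, rate-monotonic-like ordering; it is a deliberately simplified illustration and not the compositional analysis developed in the paper.

```python
# Simplified sketch of building an ARINC-653 style cyclic schedule: the major
# frame is the hyperperiod, and each partition receives a window of its budget
# in every one of its periods. Partition names, periods, and budgets are made up.
from functools import reduce
from math import gcd

partitions = {"NAV": (25, 5), "FMS": (50, 10), "DISPLAY": (100, 20)}  # (period_ms, budget_ms)

hyperperiod = reduce(lambda a, b: a * b // gcd(a, b), (p for p, _ in partitions.values()))

windows, t = [], 0
for release in range(0, hyperperiod, min(p for p, _ in partitions.values())):
    # at each minor-frame boundary, place windows for partitions released there,
    # shortest period (highest rate) first
    for name, (period, budget) in sorted(partitions.items(), key=lambda kv: kv[1][0]):
        if release % period == 0:
            start = max(t, release)
            windows.append((start, name, budget))
            t = start + budget

for start, name, budget in windows:
    print(f"{start:3d} ms  {name:8s} {budget} ms")
```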

Journal ArticleDOI
TL;DR: This work proposes an abstraction of "virtual collocation" and its realization by the software infrastructure of middleware, and describes the implementation as well as some experimental results over a traffic control testbed.
Abstract: We focus on the mechanism half of the policy-mechanism divide for networked control systems, and address the issue of what are the appropriate abstractions and architecture to facilitate their development and deployment. We propose an abstraction of "virtual collocation" and its realization by the software infrastructure of middleware. Control applications are to be developed as a collection of software components that communicate with each other through the middleware, called Etherware. The middleware handles the complexities of network operation, such as addressing, start-up, configuration and interfaces, by encapsulating application components in "Shells" which mediate component interactions with the rest of the system. The middleware also provides mechanisms to alleviate the effects of uncertain delays and packet losses over wireless channels, component failures, and distributed clocks. This is done through externalization of component state, with primitives to capture and reuse it for component restarts, upgrades, and migration, and through services such as clock synchronization. We further propose an accompanying use of local temporal autonomy for reliability, and describe the implementation as well as some experimental results over a traffic control testbed.

Journal ArticleDOI
TL;DR: In this paper, the authors propose a generic architecture to implement intrusion-tolerant Web servers based on redundancy and diversification principles in order to increase the system resilience to attacks.
Abstract: Nowadays, more and more information systems are connected to the Internet and offer Web interfaces to the general public or to a restricted set of users. Such openness makes them likely targets for intruders, and conventional protection techniques have been shown insufficient to prevent all intrusions in such open systems. This paper proposes a generic architecture to implement intrusion-tolerant Web servers. This architecture is based on redundancy and diversification principles in order to increase the system resilience to attacks: usually, an attack targets a particular software, running on a particular platform, and fails on others. The architecture is composed of redundant proxies that mediate client requests to a redundant bank of diversified application servers. The redundancy is deployed here to increase system availability and integrity. To improve performance, adaptive redundancy is applied: the redundancy level is selected according to the current alert level. The architecture can be used for static servers, that is, for Web distribution of stable information (updated offline) and for fully dynamic systems where information updates are executed immediately on an online database. The feasibility of this architecture has been demonstrated by implementing an example of a travel agency Web server, and the first performance tests are satisfactory, both for request execution times and recovery after incidents.
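
The adaptive redundancy idea can be sketched as a proxy that fans a request out to a number of diversified replicas chosen from the current alert level and accepts the majority answer. The snippet below is a schematic illustration with placeholder replica URLs; it omits the redundant-proxy management, database, and recovery machinery of the architecture.

```python
# Schematic sketch of adaptive redundancy in an intrusion-tolerant proxy:
# the current alert level selects how many diversified replicas serve each
# request, and the majority response is returned. Replica URLs are placeholders.
from collections import Counter
import requests

REPLICAS = [  # diversified application servers (different OS / web server)
    "http://replica-linux-apache.example.org",
    "http://replica-bsd-nginx.example.org",
    "http://replica-windows-iis.example.org",
]
REDUNDANCY_BY_ALERT = {"low": 1, "medium": 2, "high": 3}

def mediate(path: str, alert_level: str = "medium") -> str:
    n = REDUNDANCY_BY_ALERT[alert_level]
    answers = [requests.get(base + path, timeout=5).text for base in REPLICAS[:n]]
    majority, votes = Counter(answers).most_common(1)[0]
    if votes <= n // 2:
        raise RuntimeError("replicas disagree -- possible intrusion, raise alert level")
    return majority

# Example use (placeholder request against the hypothetical replicas):
# print(mediate("/flights?from=PAR&to=NYC", alert_level="high"))
```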

Journal ArticleDOI
TL;DR: A resource-efficient agent platform was developed, which relies on established concepts of agent platforms, but modifies and supplements them accordingly, and is implemented in Java and in several C++ variants.
Abstract: Recently, distributed agents are increasingly adopted in automation control systems, where they are used for monitoring, data collection, fault diagnosis and control. However, existing agent platforms do not always fulfill the requirements of practical automation applications in respect of real-time properties and resource usage. Often, they offer a lot of functionality that is not necessary in automation and leads to significant overhead in respect of design effort and runtime resources. To meet the specific requirements of the automation domain, a resource-efficient agent platform was developed, which relies on established concepts of agent platforms, but modifies and supplements them accordingly. This platform is implemented in Java and in several C++ variants. This paper describes the architecture of the platform and discusses several performance issues. Results of various performance tests are presented in comparison to the established agent platform JADE. Finally, a practical use case is presented, where the platform is utilized to drive a hardware-in-the-loop emulation and testing environment.

Proceedings ArticleDOI
23 May 2009
TL;DR: The initial experiences adapting OpenMP to enable it to serve as a programming model for high performance embedded systems and the needs of embedded application developers are discussed.
Abstract: In this paper we discuss our initial experiences adapting OpenMP to enable it to serve as a programming model for high performance embedded systems. A high-level programming model such as OpenMP has the potential to increase programmer productivity, reducing the design/development costs and time to market for such systems. However, OpenMP needs to be extended if it is to meet the needs of embedded application developers, who require the ability to express multiple levels of parallelism, real-time and resource constraints, and to provide additional information in support of optimization. It must also be capable of supporting the mapping of different software tasks, or components, to the devices configured in a given architecture.

Proceedings ArticleDOI
26 Apr 2009
TL;DR: This paper uses machine learning techniques first to generate performance models for all tasks and then applies those models to perform automatic performance prediction across program executions, and extends an existing scheduling algorithm to use generated task cost estimates for online task partitioning and scheduling.
Abstract: With the emerging many-core paradigm, parallel programming must extend beyond its traditional realm of scientific applications. Converting existing sequential applications as well as developing next-generation software requires assistance from hardware, compilers and runtime systems to exploit parallelism transparently within applications. These systems must decompose applications into tasks that can be executed in parallel and then schedule those tasks to minimize load imbalance. However, many systems lack a priori knowledge about the execution time of all tasks to perform effective load balancing with low scheduling overhead. In this paper, we approach this fundamental problem using machine learning techniques first to generate performance models for all tasks and then applying those models to perform automatic performance prediction across program executions. We also extend an existing scheduling algorithm to use generated task cost estimates for online task partitioning and scheduling. We implement the above techniques in the pR framework, which transparently parallelizes scripts in the popular R language, and evaluate their performance and overhead with both a real-world application and a large number of synthetic representative test scripts. Our experimental results show that our proposed approach significantly improves task partitioning and scheduling, with maximum improvements of 21.8%, 40.3% and 22.1% and average improvements of 15.9%, 16.9% and 4.2% for LMM (a real R application) and synthetic test cases with independent and dependent tasks, respectively.
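
The combination described, a learned per-task cost model feeding a scheduler, can be shown in miniature: train a regressor on features of previously executed tasks, predict costs for new tasks, and partition them across workers longest-processing-time-first. The features, model, and data below are invented for illustration and are not the pR framework's.

```python
# Miniature sketch: learn task execution times from past runs, then use the
# predictions for load-balanced partitioning (longest-processing-time-first).
# Features, data, and model choice are illustrative, not the pR framework's.
import heapq
from sklearn.ensemble import RandomForestRegressor

# Past runs: (input rows, input columns) -> measured seconds (made-up data).
history_X = [[1e3, 10], [1e4, 10], [1e5, 10], [1e4, 100], [1e5, 100], [1e6, 10]]
history_y = [0.2, 1.1, 9.8, 4.3, 41.0, 95.0]
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(history_X, history_y)

# New tasks to schedule, described by the same features.
tasks = {"t1": [5e4, 10], "t2": [2e5, 100], "t3": [1e4, 10], "t4": [8e5, 10]}
costs = dict(zip(tasks, model.predict(list(tasks.values()))))

# LPT: assign the most expensive remaining task to the least-loaded worker.
workers = [(0.0, i, []) for i in range(2)]          # (load, id, assigned tasks)
heapq.heapify(workers)
for name in sorted(costs, key=costs.get, reverse=True):
    load, wid, assigned = heapq.heappop(workers)
    assigned.append(name)
    heapq.heappush(workers, (load + costs[name], wid, assigned))
print(sorted(workers))
```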

Proceedings ArticleDOI
24 Mar 2009
TL;DR: In this paper, an architecture-centric approach for the second step of automatic failure diagnosis is presented, where component anomaly scores are correlated based on architectural dependency graphs of the software system and a rule set to address error propagation.
Abstract: Manual failure diagnosis in large-scale software systems is time-consuming and error-prone. Automatic failure diagnosis support mechanisms can potentially narrow down, or even localize, faults within a very short time, both of which help to preserve system availability. A large class of automatic failure diagnosis approaches consists of two steps: 1) computation of component anomaly scores; 2) global correlation of the anomaly scores for fault localization. In this paper, we present an architecture-centric approach for the second step. In our approach, component anomaly scores are correlated based on architectural dependency graphs of the software system and a rule set to address error propagation. Moreover, the results are graphically visualized in order to support fault localization and to enhance maintainability. The visualization combines architectural diagrams automatically derived from monitoring data with failure diagnosis results. In a case study, the approach is applied to a distributed sample Web application which is subject to fault injection.
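
The correlation step can be sketched as a propagation rule over the architectural dependency graph: a component whose own anomaly score is high but whose dependencies are also anomalous is less likely to be the root cause than one whose callees look healthy. The rule and the numbers below are invented for illustration and do not reproduce the paper's rule set.

```python
# Sketch of correlating component anomaly scores over a dependency graph to
# localize a fault: discount a component's score by the anomaly of the
# components it depends on (errors propagate from callees up to callers).
# The rule and the scores are illustrative, not the paper's rule set.
anomaly = {"frontend": 0.7, "catalog": 0.8, "payment": 0.2, "database": 0.9}
depends_on = {"frontend": ["catalog", "payment"],
              "catalog": ["database"],
              "payment": ["database"],
              "database": []}

def root_cause_scores(anomaly, depends_on):
    scores = {}
    for comp, own in anomaly.items():
        callee_anomaly = max((anomaly[d] for d in depends_on[comp]), default=0.0)
        # high own anomaly with healthy callees -> likely root cause
        scores[comp] = own * (1.0 - callee_anomaly)
    return scores

ranking = sorted(root_cause_scores(anomaly, depends_on).items(),
                 key=lambda kv: kv[1], reverse=True)
print(ranking)   # "database" ranks first with these numbers
```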