
Showing papers presented at "Parallel and Distributed Computing: Applications and Technologies in 2011"


Proceedings ArticleDOI
20 Oct 2011
TL;DR: Simulation results show that WiMAX outperforms UMTS by a significant margin and is the better technology for supporting VoIP applications.
Abstract: Next Generation Wireless Networks (NGWNs) focus on the convergence of different Radio Access Technologies (RATs) providing good Quality of Service (QoS) for applications such as Voice over IP (VoIP) traffic and video streaming. Voice applications over IP networks are growing rapidly due to their increasing popularity and cost advantages. To meet the demand of providing high-quality VoIP at any time and from anywhere, it is imperative to design a suitable QoS model. In this paper we conduct a simulation study to evaluate the QoS performance of WiMAX and UMTS for supporting VoIP. We designed simulation modules in OPNET for WiMAX and UMTS, and carried out extensive simulations to evaluate and analyze several important performance metrics such as Mean Opinion Score (MOS), end-to-end delay, jitter, and packet delay variation. Simulation results show that WiMAX outperforms UMTS by a significant margin and is the better technology for supporting VoIP applications.

62 citations


Proceedings ArticleDOI
20 Oct 2011
TL;DR: This paper designs an algorithm and protocol for cost-based cloud resource scheduling and implements and evaluates this scheduling paradigm in JavaCloudware, a pure-Java private cloud platform.
Abstract: Infrastructure as a Service (IaaS) is the most common and fundamental type of cloud computing. It provides on-demand rental of, and access to, virtual machines and storage. In this type of cloud computing, infrastructure service suppliers and users form a resource market. In this paper, we present a cost-based resource scheduling paradigm in cloud computing that leverages market theory to schedule compute resources to meet users' requirements. The set of computing resources with the lowest price is assigned to the user according to the current suppliers' resource availability and prices. We design an algorithm and protocol for cost-based cloud resource scheduling. This scheduling paradigm is implemented and evaluated in JavaCloudware, a pure-Java private cloud platform.
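The selection rule the abstract describes — assign the lowest-priced set of resources that satisfies the user's request — can be sketched as follows (the supplier model, field names, and prices are illustrative assumptions, not the paper's):

```python
# Hypothetical sketch of cost-based resource selection: among suppliers
# with enough available VMs, pick the one offering the lowest total price.
def schedule(request_vms, suppliers):
    """suppliers: list of dicts with 'name', 'available', 'price_per_vm'."""
    feasible = [s for s in suppliers if s["available"] >= request_vms]
    if not feasible:
        return None  # no supplier can satisfy the request right now
    best = min(feasible, key=lambda s: s["price_per_vm"] * request_vms)
    best["available"] -= request_vms  # reserve the resources
    return best["name"]

suppliers = [
    {"name": "A", "available": 10, "price_per_vm": 0.12},
    {"name": "B", "available": 4,  "price_per_vm": 0.08},
    {"name": "C", "available": 8,  "price_per_vm": 0.10},
]
print(schedule(6, suppliers))  # B lacks capacity, so C wins on price
```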

37 citations


Proceedings ArticleDOI
20 Oct 2011
TL;DR: This paper presents an experimental comparative study of seven widely adopted memory allocators using real-world multithreaded applications, considering the applications' response time, memory consumption, and memory fragmentation in order to compare the performance of the investigated memory allocators on a multicore machine.
Abstract: Memory allocation is one of the most ubiquitous operations in computer programs. The performance of memory allocation operations is a very important aspect of software design, yet it is frequently neglected. This paper presents an experimental comparative study of seven widely adopted memory allocators. Unlike other related works, we assess the selected memory allocators using real-world multithreaded applications. We consider the applications' response time, memory consumption, and memory fragmentation in order to compare the performance of the investigated memory allocators running on a multicore machine. All test results are evaluated for statistical significance using the ANOVA method.
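The abstract does not give the statistical details; a minimal pure-Python sketch of the one-way ANOVA F-statistic used to judge significance (the response times below are made up) could be:

```python
# Illustrative one-way ANOVA F-statistic, as used in the paper to test
# whether allocator response-time differences are statistically significant.
def one_way_anova_f(groups):
    k = len(groups)                      # number of allocators (groups)
    n = sum(len(g) for g in groups)      # total observations
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Response times (ms) from three hypothetical allocators:
f = one_way_anova_f([[10.1, 10.3, 9.9], [12.0, 12.2, 11.8], [10.0, 10.2, 9.8]])
print(round(f, 1))
```

A large F value (compared against the F-distribution's critical value) indicates the allocators' mean response times genuinely differ.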

32 citations


Proceedings ArticleDOI
20 Oct 2011
TL;DR: This paper proposes Biprominer, a tool that can automatically extract binary protocol message formats of an application from its real-world network trace and presents a transition probability model for a better description of the protocol.
Abstract: Application-level protocol specifications are helpful for network security management, including intrusion detection and intrusion prevention, which rely on monitoring technologies such as deep packet inspection. Moreover, detailed knowledge of protocol specifications is also an effective way of detecting malicious code. However, current methods for obtaining unknown and proprietary protocol message formats (i.e., with no publicly available protocol specification), especially binary protocols, rely highly on manual operations such as reverse engineering, which is time-consuming and laborious. In this paper, we propose Biprominer, a tool that can automatically extract the binary protocol message formats of an application from its real-world network trace. In addition, we present a transition probability model for a better description of the protocol. The chief feature of Biprominer is that it does not need any a priori knowledge of protocol formats, because Biprominer is based on the statistical nature of the protocol format. We evaluate the efficacy of Biprominer on three binary protocols, achieving an average precision above 99% and a recall above 96.7%.
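The abstract's transition probability model can be illustrated, in a much simplified form, by counting how often one byte value follows another across observed messages (the toy messages are my own, not from the paper):

```python
# Sketch of a transition probability model over message bytes: count how
# often byte b follows byte a, then normalize the counts per predecessor.
from collections import Counter, defaultdict

def transition_probs(messages):
    counts = defaultdict(Counter)
    for msg in messages:
        for a, b in zip(msg, msg[1:]):
            counts[a][b] += 1
    return {a: {b: c / sum(ctr.values()) for b, c in ctr.items()}
            for a, ctr in counts.items()}

# Two toy "binary protocol" messages sharing a fixed header byte 0x01:
probs = transition_probs([b"\x01\x02\x03", b"\x01\x02\x04"])
print(probs[0x01][0x02])  # 0x02 always follows the 0x01 header -> 1.0
```

High-probability transitions like this are the statistical signal that exposes fixed format fields without any prior protocol knowledge.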

31 citations


Proceedings ArticleDOI
20 Oct 2011
TL;DR: This paper gives an overview of forward/inverse Prestack Kirchhoff Time Migration algorithm, as one of the well-known seismic imaging algorithms, and proposes an approach to fit this algorithm for running on Google's MapReduce framework.
Abstract: The oil and gas industries have long been great consumers of parallel and distributed computing systems, frequently running technical applications that intensively process terabytes of data. With the emergence of cloud computing, which offers the opportunity to hire high-throughput computing resources at lower operational cost, such industries have started to adapt their technical applications to execute on such high-performance commodity systems. In this paper, we first give an overview of the forward/inverse Prestack Kirchhoff Time Migration (PKTM) algorithm, one of the well-known seismic imaging algorithms. Then we explain our proposed approach to fit this algorithm for running on Google's MapReduce framework. Finally, we analyse the relation between MapReduce-based PKTM completion time and the number of mappers/reducers in pseudo-distributed MapReduce mode.

22 citations


Proceedings ArticleDOI
20 Oct 2011
TL;DR: An efficient and robust implementation of the estimation of GMM statistics used in the EM algorithm on GPU using NVIDIA's Compute Unified Device Architecture (CUDA) and an augmentation of the standard CPU version is proposed utilizing SSE instructions.
Abstract: Gaussian Mixture Models (GMMs) are widely used among scientists, e.g. in statistics toolkits and data mining procedures. In order to estimate the parameters of a GMM, Maximum Likelihood (ML) training is often utilized, more precisely the Expectation-Maximization (EM) algorithm. Nowadays, many tasks work with huge datasets, which makes the estimation process time-consuming (mainly for complex mixture models containing hundreds of components). The paper presents an efficient and robust implementation of the estimation of the GMM statistics used in the EM algorithm on a GPU using NVIDIA's Compute Unified Device Architecture (CUDA). An augmentation of the standard CPU version utilizing SSE instructions is also proposed. The running times of the presented methods are measured on a large dataset of real speech data from the NIST Speaker Recognition Evaluation (SRE) 2008. Estimation on the GPU proves to be more than 400 times faster than the standard CPU version and 130 times faster than the SSE version; thus a huge speed-up was achieved without any approximations in the estimation formulas. The proposed implementation was also compared to implementations developed by other groups around the world and proved to be the fastest (at least 5 times faster than the best recently published implementation).
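The statistics accumulated in the E-step — zeroth-, first-, and second-order sums weighted by posterior responsibilities — can be sketched for a 1-D GMM in pure Python (the data and parameters are illustrative; the paper computes these on a GPU for high-dimensional speech features):

```python
# E-step statistic accumulation for a 1-D GMM: for each data point, compute
# each component's posterior responsibility, then accumulate N, F, S.
import math

def estep_stats(data, weights, means, variances):
    K = len(weights)
    N = [0.0] * K; F = [0.0] * K; S = [0.0] * K
    for x in data:
        # component likelihoods weighted by mixture weights
        lik = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
               for w, m, v in zip(weights, means, variances)]
        total = sum(lik)
        for k in range(K):
            g = lik[k] / total  # posterior responsibility of component k
            N[k] += g           # zeroth-order statistic
            F[k] += g * x       # first-order statistic
            S[k] += g * x * x   # second-order statistic
    return N, F, S

N, F, S = estep_stats([0.1, -0.2, 5.0, 4.8], [0.5, 0.5], [0.0, 5.0], [1.0, 1.0])
print([round(n, 2) for n in N])
```

These per-component sums are what the M-step divides to re-estimate weights, means, and variances, which is why they parallelize so well on a GPU.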

20 citations


Proceedings ArticleDOI
20 Oct 2011
TL;DR: A new method called AESC (Automatic Energy Status Controlling) is presented that automatically gears the CPU frequency, in the best case using 20% less energy while increasing execution time by only 1%.
Abstract: Recently, more and more attention has been focused on the energy efficiency of HPC (High Performance Computing) centers. The power consumption of a supercomputer in the top 10 of the Top500 list is usually more than 1 MW. Power-aware techniques have been widely used in data centres and supercomputer centres to reduce power consumption. With frequency-scalable CPUs being adopted in HPC centers, it is practical to reduce power consumption through frequency control. Because load imbalance is a common problem in high performance computing, we can gear the frequency of a CPU down when it is idle so that power consumption can be reduced with little performance loss. In this paper, we present a new method called AESC (Automatic Energy Status Controlling) which can control the energy status of the CPU automatically. First, we classify applications into different kinds of power-consumption behavior by their load balance and communication time. Second, we give a control policy for the energy status based on these results. By using this policy, we can automatically gear the CPU frequency, in the best case using 20% less energy while increasing execution time by only 1%.
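The abstract does not give the AESC control policy; a hypothetical sketch of the underlying idea — gearing the frequency down as the fraction of idle/waiting time grows — might look like this (the thresholds and frequency steps are invented):

```python
# Hypothetical frequency-gearing policy: map the fraction of time a process
# spends idle (e.g. waiting for communication in a load-imbalanced phase)
# to a CPU frequency step, trading little performance for energy savings.
def pick_frequency(idle_ratio, freqs=(2.4, 1.8, 1.2, 0.8)):
    """Return a CPU frequency in GHz for the observed idle ratio."""
    if idle_ratio < 0.1:
        return freqs[0]      # busy: run at full speed
    if idle_ratio < 0.3:
        return freqs[1]
    if idle_ratio < 0.6:
        return freqs[2]
    return freqs[3]          # mostly waiting: lowest energy state

print(pick_frequency(0.05), pick_frequency(0.7))  # 2.4 0.8
```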

18 citations


Proceedings ArticleDOI
20 Oct 2011
TL;DR: A new representative measure is proposed to compress the original data sets and maintain a set of representative points by continuously updating the eigen-system with the incidence vector to generate instant cluster labels as new data points arrive.
Abstract: Spectral clustering is an emerging research topic with numerous applications, such as data dimension reduction and image segmentation. In spectral clustering, as new data points are added continuously, dynamic data sets are processed in an on-line way to avoid costly re-computation. In this paper, we propose a new representative measure to compress the original data sets and maintain a set of representative points by continuously updating the eigen-system with the incidence vector. From these extracted points we generate instant cluster labels as new data points arrive. Our method is effective and able to process large data sets due to its low time complexity. Experimental results over various real evolving data sets show that our method provides fast and relatively accurate results.

18 citations


Proceedings ArticleDOI
20 Oct 2011
TL;DR: This paper analyses the performance of the two standardised handover schemes, namely the Mobile IP and the ASN-based Network Mobility (ABNM), in mobile WiMAX using simulation and results clearly indicate that ABNM is more efficient for handover in terms of handover delay and throughput.
Abstract: Worldwide Interoperability for Microwave Access (WiMAX) deployment is growing at a rapid pace. Since Mobile WiMAX has the key advantage of serving large coverage areas per base station, it has become a popular emerging technology for handling mobile clients. However, serving a large number of Mobile Stations (MS) in practice requires an efficient handover scheme. Currently, mobile WiMAX has a long handover delay that contributes to the overall end-to-end communication delay. Recent research is focusing on increasing the efficiency of handover schemes. In this paper, we analyse the performance of the two standardised handover schemes, namely Mobile IP and ASN-based Network Mobility (ABNM), in mobile WiMAX using simulation. Our results clearly indicate that ABNM is more efficient for handover in terms of handover delay and throughput.

14 citations


Proceedings ArticleDOI
20 Oct 2011
TL;DR: This work presents a novel strategy called Security Proxy, which can be easily implemented and deployed on existing DNS server without modification of DNS server itself, and which has obvious advantage over the original transaction ID, the source port randomizing and 0x20 techniques.
Abstract: DNS has been suffering from cache poisoning attacks for a long time. The attacker sends a camouflaged DNS response to trick the domain name server and inserts malicious resource records into the cached database. Because the original DNS protocol depends only on a 16-bit transaction ID to verify the response packet, it is prone to being guessed by the attacker. Although many strategies such as transaction ID randomizing, source port randomizing and the 0x20 technique have been applied to improve the resistance of DNS, the attacker still has a chance to poison a DNS server in an acceptable time. More complicated strategies such as DNSSEC, which provides a stricter prevention mechanism, are not easy to deploy and are not widely adopted yet. To address the problem, we present a novel strategy called Security Proxy. The architecture can be easily implemented and deployed on an existing DNS server without modification of the DNS server itself. The two embedded schemes, Selective Re-Query and Security Label Communication, cooperate to effectively prevent the cache poisoning attack. We analyze our strategy in terms of both capability and efficiency, and find that our Security Proxy has a clear advantage over the original transaction ID, source port randomizing and 0x20 techniques.
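A back-of-the-envelope sketch of why the 16-bit transaction ID alone is weak, and what source-port randomization adds (the count of usable source ports is my assumption, not the paper's figure):

```python
# Expected spoofing effort: number of forged responses needed for a 50%
# chance of matching the resolver's pending query, given the search space.
import math

def attempts_for_half(search_space):
    # P(at least one hit in n tries) = 1 - (1 - 1/space)^n >= 0.5
    return math.ceil(math.log(0.5) / math.log(1 - 1 / search_space))

txid_only = attempts_for_half(2 ** 16)            # transaction ID alone
with_ports = attempts_for_half(2 ** 16 * 64000)   # plus ~64000 random ports
print(txid_only, with_ports)
```

The tens of thousands of guesses needed against the bare transaction ID are feasible online, which is why the paper argues additional verification is still required even with port randomization.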

14 citations


Proceedings ArticleDOI
20 Oct 2011
TL;DR: A framework for automatic reference metadata extraction from scientific papers is described that can extract the title, author, journal, volume, year, and page from scientific papers in PDF.
Abstract: Bibliographical information of scientific papers has been of great value since the Science Citation Index was introduced to measure research impact. Most scientific documents available on the web are unstructured or semi-structured, so automatic reference metadata extraction becomes an important task. This paper describes a framework for automatic reference metadata extraction from scientific papers. Our system can extract the title, author, journal, volume, year, and page from scientific papers in PDF. We utilize a document metadata knowledge base to guide the reference metadata extraction process. The experimental results show that our system achieves high accuracy.
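The paper's knowledge-base-guided approach is not reproduced here; as a toy illustration of reference metadata extraction, a single regular expression for one common citation layout (the pattern and sample string are my assumptions, not the paper's):

```python
# Toy reference-metadata extractor for one fixed citation layout:
# "Author. Title. Journal, Volume (Year), Pages"
import re

PATTERN = re.compile(
    r"(?P<author>[^.]+)\.\s+(?P<title>[^.]+)\.\s+(?P<journal>[^,]+),\s+"
    r"(?P<volume>\d+)\s*\((?P<year>\d{4})\),\s+(?P<pages>\d+-\d+)"
)

ref = "Smith, John. A study of citation parsing. Journal of Document Engineering, 12 (2009), 101-110"
m = PATTERN.match(ref)
print(m.group("year"), m.group("pages"))  # 2009 101-110
```

Real reference strings vary far more than one pattern can cover, which is exactly why the paper backs the extraction with a metadata knowledge base.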

Proceedings ArticleDOI
20 Oct 2011
TL;DR: A translation of synchronous systems to data-flow process networks is presented, thereby bridging the gap between synchronous and asynchronous MoCs.
Abstract: The synchronous model of computation (MoC) has been successfully used for the design of embedded systems having local control, like hardware circuits and single-threaded software, while its application to distributed parallel embedded systems is still a challenge. In contrast, other MoCs such as data-flow process networks (DPNs) directly match these architectures. In this paper, we therefore present a translation of synchronous systems to data-flow process networks, thereby bridging the gap between synchronous and asynchronous MoCs. We use the resulting DPNs to generate CAL code for the OpenDF package, which offers important features for embedded system design.

Proceedings ArticleDOI
20 Oct 2011
TL;DR: This work analytically establishes that the proposed algorithm provides assurance on the bounded probability of missing (m,k)-firm constraints and confirms the effectiveness and superiority of the proposed scheme in achieving the scheduling objective.
Abstract: We present a guaranteed real-time scheduling algorithm for multiple real-time tasks subject to (m,k)-firm deadlines on homogeneous multiprocessors. The scheduling objective of the proposed algorithm is to provide guaranteed performance by bounding the probability of missing (m,k)-firm deadline constraints while improving the probability of deadline satisfactions as much as possible. This goal is established to satisfy the minimum requirements expressed by (m,k)-firm deadlines and simultaneously provide the best possible quality of service. We analytically establish that the proposed algorithm provides assurance on the bounded probability of missing (m,k)-firm constraints. Experimental studies validate our analytical results and confirm the effectiveness and superiority of the proposed scheme in achieving our scheduling objective.
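The (m,k)-firm constraint the paper targets requires that at least m of any k consecutive jobs of a task meet their deadlines; a small checker sketch (the job history is illustrative) is:

```python
# An (m,k)-firm constraint holds if every window of k consecutive jobs
# contains at least m deadline hits.
def satisfies_mk_firm(outcomes, m, k):
    """outcomes: list of booleans, True = job met its deadline."""
    return all(sum(outcomes[i:i + k]) >= m
               for i in range(len(outcomes) - k + 1))

history = [True, True, False, True, True, False, True]
print(satisfies_mk_firm(history, 2, 3))  # every 3-job window has >= 2 hits
```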

Proceedings ArticleDOI
20 Oct 2011
TL;DR: An energy-efficient scheduling policy called green scheduling is proposed which relaxes fairness slightly to create as many opportunities as possible for overlapping resource-complementary tasks, and is shown to save between 7% and 9% of the fair scheduler's energy consumption.
Abstract: The energy efficiency of data centers has drawn great attention because the cost of power consumption increases dramatically as the size of a data center grows. Nowadays, MapReduce is a framework widely used for processing large data sets in data centers, so its energy efficiency directly affects that of the data center. MapReduce's energy efficiency is closely tied to its scheduler. We find that the fair scheduler outperforms the FIFO scheduler in energy efficiency when a CPU-intensive job and an IO-intensive job run simultaneously on the cluster, because the fair scheduler achieves better resource utilization by overlapping resource-complementary tasks on slaves. However, this behavior is occasional, because the fair scheduler has no information about tasks' resource requirements. This occasional behavior lets us identify an area where the energy efficiency of the fair scheduler can be improved. We propose an energy-efficient scheduling policy called green scheduling, which relaxes fairness slightly to create as many opportunities as possible for overlapping resource-complementary tasks. The results show that green scheduling can save between 7% and 9% of the fair scheduler's energy consumption.
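The overlap idea can be sketched as deliberately pairing CPU-intensive and IO-intensive tasks on each slave (the task labels and round-robin placement are my simplification, not the paper's policy):

```python
# Sketch of "green" task placement: co-locate one CPU-intensive task with
# one IO-intensive task on each slave so the resources complement each other.
def green_assign(tasks, num_slaves):
    cpu = [t for t in tasks if t[1] == "cpu"]
    io = [t for t in tasks if t[1] == "io"]
    slaves = [[] for _ in range(num_slaves)]
    for i, pair in enumerate(zip(cpu, io)):      # overlap complementary tasks
        slaves[i % num_slaves].extend(pair)
    leftovers = cpu[len(io):] + io[len(cpu):]
    for i, t in enumerate(leftovers):            # place any unmatched tasks
        slaves[i % num_slaves].append(t)
    return slaves

tasks = [("t1", "cpu"), ("t2", "io"), ("t3", "cpu"), ("t4", "io"), ("t5", "cpu")]
for s in green_assign(tasks, 2):
    print(s)
```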

Proceedings ArticleDOI
20 Oct 2011
TL;DR: A threat assessment approach to estimate the impact of attacks on a network that uses the Common Vulnerability Scoring System to quantitatively assess network threats and further correlates alerts with contextual information to improve the accuracy of the assessment.
Abstract: In the face of the overwhelming alerts produced by firewalls and intrusion detection devices, it is difficult to assess the network threats we face. In this paper, we propose a threat assessment approach to estimate the impact of attacks on a network. The approach employs the Common Vulnerability Scoring System to quantitatively assess network threats and further correlates alerts with contextual information to improve the accuracy of the assessment. In the case studies, we demonstrate how the approach is applied in real networks. The experimental results show that the approach can make an accurate assessment of network threats.

Proceedings ArticleDOI
20 Oct 2011
TL;DR: It is shown that the diffusion wavelet can perform an efficient multi-resolution analysis on the traffic matrix (TM), and that the original TM can be reconstructed by choosing the diffused traffic at a particular level.
Abstract: A traffic matrix (TM) describes the traffic volumes traversing the network from the input nodes to the exit nodes over a measured period. Such a matrix is very hard, if not intractable, to obtain for a large network. In this paper, we apply a new technique, the diffusion wavelet, to analyze the traffic matrix. It is shown that the diffusion wavelet can perform an efficient multi-resolution analysis on the TM. The original TM can be reconstructed by choosing the diffused traffic at a particular level. This paper also shows that diffusion wavelet-based analysis of the traffic matrix has many potential applications.

Proceedings ArticleDOI
20 Oct 2011
TL;DR: In this article, the authors consider all possible tasks of a sensor in its embedded network and propose an energy management model, which can be used to guide the design of energy efficient wireless sensor networks through network parameterization and optimization.
Abstract: Sensors have limited resources, so it is important to manage these resources efficiently to maximize their use. A sensor's battery is a crucial resource, as it alone determines the lifetime of sensor network applications. Since these devices are useful only when they are able to communicate with the world, the radio transceiver, as a costly I/O unit, plays a key role in a sensor's lifetime. This component often consumes a large portion of the sensor's energy, as it must be active most of the time to announce the sensor's existence in the network. As such, the radio component has to deal with its embedding sensor network, whose parameters and operations have significant effects on the sensor's lifetime. Existing energy models consider the hardware, but the environment and the network's parameters have not received adequate attention. Energy consumption components of traditional network architectures are often considered individually and separately, and their influences on each other have not been considered in these approaches. In this paper, we consider all possible tasks of a sensor in its embedding network and propose an energy management model. We categorize these tasks into five energy-consuming constituents. The sensor's Energy Consumption (EC) is modeled on its energy-consuming constituents and their input parameters and tasks. The sensor's EC can thus be reduced by managing and executing the tasks of its constituents efficiently. The proposed approach can be effective for power management, and it can also be used to guide the design of energy-efficient wireless sensor networks through network parameterization and optimization.

Proceedings ArticleDOI
20 Oct 2011
TL;DR: An XML placement strategy, the Query Workload Estimation based XML Placement strategy (QWEXP), for efficient distributed XML storage and parallel querying; experimental results show that QWEXP greatly improves the speedup and scale-up properties of a parallel XML system.
Abstract: Since a significant amount of XML documents has been generated in various application domains, efficient XML management has become an important problem. Distributed XML storage and parallel querying based on MapReduce can be an effective solution to this problem. As the XML data placement strategy is a key factor in parallel system performance, in this paper we present an XML placement strategy, the Query Workload Estimation based XML Placement strategy (QWEXP), for efficient distributed XML storage and parallel querying. To achieve query workload balance, it partitions XML based on a query workload estimation calculated from the XML structure without knowledge of user queries, considering that in common application scenarios user queries are unknown in advance. The partitioned XML segments are sized around an XML storage unit W0 to support the scalability of the parallel XML database. Finally, the segments are distributed evenly to each processing node to ensure workload balance during parallel query execution. Experimental results have shown that QWEXP greatly improves the speedup and scale-up properties of a parallel XML system.

Proceedings ArticleDOI
20 Oct 2011
TL;DR: An optimization is presented to reduce the number of synchronization messages from 3N to 2√N; the performance of All-to-all communication is shown to improve by an almost constant ratio, which is mainly determined by message size and independent of system scale.
Abstract: MPI All-to-all communication is widely used in many high performance computing (HPC) applications. In All-to-all communication, each process sends a distinct message to every other participating process. In multicore clusters, processes within a node simultaneously contend for the node's network resources during All-to-all communication. Many small synchronization messages are required in All-to-all communication of large messages, and under contention their latency is orders of magnitude larger than without contention. As a result, the synchronization overhead is significantly increased and accounts for a large proportion of the whole latency of All-to-all communication. In this paper, we analyse the considerable overhead of synchronization messages. Based on this analysis, an optimization is presented to reduce the number of synchronization messages from 3N to 2√N. Evaluations on a 240-core cluster show that the performance is improved by an almost constant ratio, which is mainly determined by message size and independent of system scale. The performance of All-to-all communication is improved by 25% for 32K and 64K byte messages. For an FFT application, performance is improved by 20%.
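Only the message counts are sketched here, not the optimization itself; assuming the reduction applies per run with N a perfect square:

```python
# Counting the abstract's claimed reduction in synchronization messages:
# the baseline needs 3N, the optimized scheme 2*sqrt(N).
import math

def sync_messages(n_procs):
    baseline = 3 * n_procs
    optimized = 2 * math.isqrt(n_procs)  # assumes N is a perfect square
    return baseline, optimized

base, opt = sync_messages(256)
print(base, opt)  # 768 vs 32
```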

Proceedings ArticleDOI
20 Oct 2011
TL;DR: A parallel immune algorithm for the detection of lung cancer in chest X-ray images based on an object shared space; the template matching method is combined with the algorithm, JavaSpaces is used as the object shared space, and the approach is shown to be efficient.
Abstract: This paper discusses a parallel immune algorithm (IA) for the detection of lung cancer in chest X-ray images based on an object shared space. The template matching method is combined with the algorithm, and JavaSpaces is used as the object shared space. The experimental results show that the algorithm is efficient at detecting suspected lung cancer in chest X-ray images.

Proceedings ArticleDOI
20 Oct 2011
TL;DR: This paper provides a heuristic-based solution for the problem of gathering data in a wireless sensor network using a single Mobile Element and evaluates the performance of the algorithm by comparing it to the optimal solution as well as the best-known algorithms from the literature.
Abstract: In this paper we investigate the problem of gathering data in a wireless sensor network using a single Mobile Element. In particular, we consider the case where the data are produced by measurements and need to be delivered to a predefined sink within a given time interval from the time the measurement takes place. A mobile element travels the network along predefined paths, collects the data from the nodes, and delivers them to the sink by a single long-distance transmission. In this problem, the length of the mobile element's path is bounded by a predetermined length. This path visits a subset of the nodes. These selected nodes work as caching points and aggregate the other nodes' data. The caching point nodes are selected with the aim of reducing the energy expenditure due to multi-hop forwarding. We provide a heuristic-based solution for this problem. We evaluate the performance of our algorithm by comparing it to the optimal solution as well as the best-known algorithms from the literature.

Proceedings ArticleDOI
20 Oct 2011
TL;DR: This paper proposes an algorithmic solution that outperforms the best-known comparable scheme for this problem by an average of 40% and plans the tours of the mobile elements that minimize the total travelling time.
Abstract: In this paper we consider the problem of data gathering in a wireless sensor network using Mobile Elements. In particular, we consider situations where the data produced by the nodes must be delivered to the sink within a pre-defined time interval. Mobile elements travel the network, collect the data of the nodes, and deliver them to the sink. Each node must be visited by a mobile element, which then must deliver this node's data to the sink within the given time interval. The goal is to plan the tours of the mobile elements that minimize the total travelling time. Several variations of this problem have been investigated in the literature. We propose an algorithmic solution that outperforms the best-known comparable scheme for this problem by an average of 40%.

Proceedings ArticleDOI
20 Oct 2011
TL;DR: An integer linear programming (ILP) based energy-optimal solution designed for heterogeneous system is proposed and the experimental results demonstrate that the proposed method could exploit the heterogeneity in different processors and achieve improved energy efficiency.
Abstract: Heterogeneous parallel systems have become popular in general purpose computing and even high performance computing fields. There are many studies focused on harnessing heterogeneous parallel processing for better performance. However, energy optimization for heterogeneous systems has not been well studied. Owing to the differences in performance and energy consumption, energy optimization techniques for heterogeneous systems differ from the existing methods designed for homogeneous systems. Besides the typical voltage scaling method, reasonable task partitioning is also an essential method for optimizing energy consumption on heterogeneous systems. By partitioning a data-parallel task and mapping sub-tasks onto several processors, one can achieve better performance and reduced energy consumption. As the computation cost is reduced by specific accelerators, the communication overhead becomes more prominent. Therefore, task partition optimization should holistically consider the computation improvement and the communication overhead to achieve higher energy efficiency. Typically, task partitioning and voltage scaling are not orthogonal and influence each other's effect on the energy optimization problem. In order to harness both knobs efficiently, this paper proposes an integer linear programming (ILP) based energy-optimal solution designed for heterogeneous systems. We present a case study of optimizing the MGRID benchmark on a typical CPU-GPU heterogeneous system. The experimental results demonstrate that the proposed method can exploit the heterogeneity of different processors and achieve improved energy efficiency.

Proceedings ArticleDOI
20 Oct 2011
TL;DR: JMigBSP was designed to work on grid computing environments and offers an interface that follows the BSP (Bulk Synchronous Parallel) style and is shown as a competitive library when comparing performance with a C-based library called BSPlib.
Abstract: This paper describes the rationale for developing jMigBSP, a Java programming library that offers object rescheduling. It was designed to work on grid computing environments and offers an interface that follows the BSP (Bulk Synchronous Parallel) style. jMigBSP's main contribution focuses on the rescheduling facility in two different ways: (i) by using migration directives in the application code directly, and (ii) through automatic load balancing at the middleware level. The second idea, especially, is feasible thanks to Java's inheritance feature, which transforms a simple jMigBSP application into a migratable one by changing a single line of code. In addition, the presented library makes object interaction easier by providing one-sided message passing directives and hides network latency through asynchronous communication. Finally, a BSP-based FFT application was developed, and its execution shows that jMigBSP is competitive in performance with a C-based library called BSPlib. Besides its user-friendly Java interface, jMigBSP's strengths also show in the migration tests, where it outperforms BSPlib.

Proceedings ArticleDOI
20 Oct 2011
TL;DR: This paper focuses on SE-botnet's infection and defense, and presents a propagation model for it, taking full account of social networks' characteristics and human dynamics, and abstract the general process of social engineering attacks used by SE- botnet.
Abstract: With the rapid development of social networking services and the diversification of social engineering attacks, a new high-infection botnet (which we call SE-botnet), which exploits social engineering attacks to spread bots in social networks, has become an underlying threat. Predicting the threat of an SE-botnet can help defenders mitigate it effectively. In this paper, we focus on SE-botnet infection and defense, presenting a propagation model for it. We take full account of social networks' characteristics and human dynamics, and abstract the general process of the social engineering attacks used by an SE-botnet. Our preliminary simulation results demonstrate that an SE-botnet can capture tens of thousands of bots in one day with a great infection capacity. Our propagation model can accurately predict this process with less than 5% deviation.

Proceedings ArticleDOI
20 Oct 2011
TL;DR: This work proposes a model of email worm propagation based on a novel modeling method, Stochastic Game Nets (SGN), and uses this model to analyze the rampant propagation of email worms.
Abstract: In this paper, we propose a model of email worm propagation based on a novel modeling method, Stochastic Game Nets (SGN), and use this model to analyze the rampant propagation of email worms. Combining the conceptual framework of SGN with the practical problem, we derive some remarks based on the definitions of SGN in order to precisely describe the details of email worm propagation. An algorithm for solving the equilibrium strategy is presented to compute the SGN model. Finally, we analyze our results with several figures, covering metrics such as infection rate and average propagation time. The results of our work can also serve as a reference to help email users defend against email worms.
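The paper's equilibrium computation operates on full Stochastic Game Nets; as a much smaller stand-in, the mixed equilibrium of a 2x2 zero-sum worm-versus-defender game can be solved in closed form (valid only when neither player has a dominant pure strategy):

```python
# Closed-form mixed equilibrium of a 2x2 zero-sum game -- a toy
# stand-in for the SGN equilibrium algorithm, not the paper's method.
# a[i][j] is the worm's payoff when it plays row i and the defender
# plays column j; assumes an interior equilibrium (no saddle point).
def zero_sum_2x2_equilibrium(a):
    denom = a[0][0] - a[0][1] - a[1][0] + a[1][1]
    p = (a[1][1] - a[1][0]) / denom    # worm's probability of row 0
    q = (a[1][1] - a[0][1]) / denom    # defender's probability of column 0
    value = (a[0][0] * a[1][1] - a[0][1] * a[1][0]) / denom
    return p, q, value
```

For matching pennies, both players mix 50/50 and the game value is zero, as expected.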

Proceedings ArticleDOI
20 Oct 2011
TL;DR: A segmented approach is introduced which schedules tasks based on the nature of the task set in terms of computation cost and their precedence constraints, and the experimental results show the better performance of the proposed heuristic.
Abstract: Task scheduling optimization is crucial in order to extract maximum advantage from available resources with diverse characteristics. In a heterogeneous environment, scheduling a set of dependent tasks involves two-dimensional considerations: tasks must be assigned to the best-suited machines while avoiding the extra overhead of communication cost, which should ultimately enhance performance, mostly in terms of minimizing the completion time of a job. Extensive research has addressed this problem domain, and a number of well-known heuristics have been proposed. In this paper a new heuristic is proposed which assigns priorities to the set of dependent tasks based on three different parameters: average computation cost, average communication cost, and the mean of both. A segmented approach is introduced which schedules tasks based on the nature of the task set in terms of computation cost and their precedence constraints. The experimental results show the better performance of the proposed heuristic.
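The three priority parameters named in the abstract can be computed directly from a task graph's cost tables. The sketch below is a plausible reading of them (the data layout and names are our assumptions, not the paper's):

```python
# Hypothetical illustration of the three priority parameters: average
# computation cost across machines, average outgoing communication
# cost, and the mean of the two (layout is our assumption).
def priorities(comp_cost, comm_cost):
    # comp_cost[t]: execution time of task t on each machine
    # comm_cost[t]: dict mapping successor task -> data-transfer cost
    avg_comp = {t: sum(c) / len(c) for t, c in comp_cost.items()}
    avg_comm = {t: (sum(comm_cost[t].values()) / len(comm_cost[t])
                    if comm_cost.get(t) else 0.0)
                for t in comp_cost}
    mean_both = {t: (avg_comp[t] + avg_comm[t]) / 2 for t in comp_cost}
    return avg_comp, avg_comm, mean_both
```

Each of the three dictionaries induces one candidate priority ordering of the tasks, subject to the DAG's precedence constraints.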

Proceedings ArticleDOI
20 Oct 2011
TL;DR: This work presents, as its original contribution, a multithreaded optimization of a phonetics-based algorithm for detecting duplicate tuples in databases, which requires no trained data and is independent of the language to be supported.
Abstract: Aiming to ensure greater reliability and consistency of data stored in the database, the data cleaning stage is set early in the process of Knowledge Discovery in Databases (KDD) and is responsible for eliminating problems and adjusting the data for the later stages, especially for the data mining stage. Such problems occur at the instance and schema levels: missing values, null values, duplicate tuples, out-of-domain values, among others. Several algorithms have been developed to perform the cleaning step in databases, some of them designed specifically to work with the phonetics of words, since a word can be written in different ways. Within this perspective, this work presents as its original contribution a multithreaded optimization of a phonetics-based algorithm for detecting duplicate tuples in databases, which requires no trained data and is independent of the language to be supported.
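As a concrete, simplified stand-in for this approach (not the paper's algorithm), the sketch below encodes names with a simplified Soundex variant in a thread pool and flags tuples that share a phonetic code; like the paper's method, it needs no trained data.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Simplified Soundex variant (classic Soundex additionally treats
# H and W specially); equal codes mark candidate duplicate tuples.
_CODES = {c: d for d, letters in
          {'1': 'BFPV', '2': 'CGJKQSXZ', '3': 'DT',
           '4': 'L', '5': 'MN', '6': 'R'}.items() for c in letters}

def soundex(word):
    word = word.upper()
    first, prev = word[0], _CODES.get(word[0], '')
    out = []
    for c in word[1:]:
        d = _CODES.get(c, '')          # vowels and H/W/Y reset prev
        if d and d != prev:
            out.append(d)
        prev = d
    return (first + ''.join(out) + '000')[:4]

def duplicate_groups(names, workers=4):
    # Encode in parallel threads, then group by phonetic code.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = list(pool.map(soundex, names))
    groups = defaultdict(list)
    for name, code in zip(names, codes):
        groups[code].append(name)
    return [g for g in groups.values() if len(g) > 1]
```

Here "Smith" and "Smyth" both encode to S530 and are flagged as a candidate duplicate pair, while "Jones" is not.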

Proceedings ArticleDOI
20 Oct 2011
TL;DR: It is shown that the success of malicious peers under Bubble Trust depends mainly on the amount of benefit they provide to the network, and that the analyzed malicious strategies cannot threaten the contribution provided by the P2P network.
Abstract: Malicious collectives represent one of the biggest threats to secured P2P applications. In our previous work, we proposed a trust management system (TMS) called Bubble Trust targeting this problem. In this paper, we investigate the most common malicious strategies as well as Bubble Trust's resistance against them. We created a simulation framework suitable for testing other TMSs in the same scenarios. Our analysis shows that the success of malicious peers under Bubble Trust depends mainly on the amount of benefit they provide to the network, and that the analyzed malicious strategies cannot threaten the contribution provided by the P2P network.
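Bubble Trust's actual metric is not reproduced here, but the general idea of discounting a malicious collective's mutual praise can be sketched with a generic fixed-point aggregation, in which each rating is weighted by the rater's own trust:

```python
# Generic trust aggregation sketch (not Bubble Trust's algorithm):
# a peer's trust is the average of the ratings it receives, each
# weighted by the rater's own trust, iterated to a fixed point so
# that praise from distrusted peers carries little weight.
def trust_scores(ratings, iters=50):
    # ratings[rater][target] is a rating in [0, 1]
    peers = list(ratings)
    trust = {p: 1.0 for p in peers}
    for _ in range(iters):
        new = {}
        for target in peers:
            num = sum(trust[r] * ratings[r].get(target, 0.0)
                      for r in peers if r != target)
            den = sum(trust[r] for r in peers if r != target)
            new[target] = num / den if den else 0.0
        trust = new
    return trust
```

With two honest peers rating each other highly and a badmouthing malicious peer, the iteration drives the malicious peer's trust toward zero while the honest peers keep theirs.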

Proceedings ArticleDOI
20 Oct 2011
TL;DR: The approach presented in this work extends an application-level checkpointing framework to proactively migrate MPI processes away from processors for which impending failures have been notified, without having to restart the entire application.
Abstract: The running times of large-scale computational science and engineering parallel applications are usually longer than the mean time between failures (MTBF). Hardware failures must be tolerated by the parallel applications to ensure that not all computation done is lost on machine failures. Checkpointing and rollback recovery is a very useful technique to implement fault-tolerant applications. However, when a failure occurs, most checkpointing mechanisms require a complete restart of the parallel application from the last checkpoint. This affects the efficiency of the solution, leading to unnecessary overhead that can be avoided by migrating a single process in case of failure. Although research has been carried out in this field, the solutions proposed in the literature are commonly tied to specific implementations of the parallel communication APIs or to specific runtime environments. The approach presented in this work extends an application-level checkpointing framework to proactively migrate MPI processes away from processors for which impending failures have been notified, without having to restart the entire application. The main features of the proposed solution are transparency for the user, achieved through the use of a compiler tool and a runtime library, and portability, since it is not locked into a particular MPI implementation.
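The mechanism can be sketched (with illustrative names, not the framework's API) as a compute loop that snapshots its state periodically and, on a failure warning, serializes only its own state so a replacement process can resume it without a full restart:

```python
import pickle

# Sketch of application-level checkpointing with proactive migration
# (names are illustrative, not the framework's API).  The loop
# snapshots its state every few steps; on a failure warning it ships
# its serialized state away instead of restarting everything.
def run(state, total_steps, checkpoint_every, failure_warning_at=None):
    snapshots = []                          # would go to stable storage
    for step in range(state['step'], total_steps):
        state['sum'] += step                # the "computation"
        state['step'] = step + 1
        if state['step'] % checkpoint_every == 0:
            snapshots.append(pickle.dumps(state))       # checkpoint
        if failure_warning_at is not None and state['step'] == failure_warning_at:
            return 'migrated', pickle.dumps(state)      # ship state away
    return 'done', state

# A replacement process deserializes the state and continues:
def resume(blob, total_steps, checkpoint_every):
    return run(pickle.loads(blob), total_steps, checkpoint_every)
```

Only the warned process pays the serialization cost; its peers keep running, which is the efficiency argument the abstract makes against whole-application rollback.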