
Showing papers on "Server" published in 2010


Proceedings ArticleDOI
03 May 2010
TL;DR: The architecture of HDFS is described and experience using HDFS to manage 25 petabytes of enterprise data at Yahoo! is reported on.
Abstract: The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably, and to stream those data sets at high bandwidth to user applications. In a large cluster, thousands of servers both host directly attached storage and execute user application tasks. By distributing storage and computation across many servers, the resource can grow with demand while remaining economical at every size. We describe the architecture of HDFS and report on experience using HDFS to manage 25 petabytes of enterprise data at Yahoo!.

5,005 citations
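
As a concrete flavor of the block-placement idea the abstract describes, here is a minimal Python sketch of an HDFS-style replica placement rule (first copy on the writer's node, the other two on a single remote rack). The rule and all node/rack names are illustrative simplifications, not the actual Hadoop code.

```python
import random

def place_replicas(writer_node, racks):
    """HDFS-style placement sketch: replica 1 on the writer's node,
    replicas 2 and 3 on two different nodes of one remote rack."""
    local_rack = next(r for r, nodes in racks.items() if writer_node in nodes)
    remote_rack = random.choice([r for r in racks if r != local_rack])
    return [writer_node] + random.sample(racks[remote_rack], 2)

racks = {"rack1": ["n1", "n2", "n3"], "rack2": ["n4", "n5", "n6"]}
print(place_replicas("n2", racks))  # e.g. ['n2', 'n6', 'n4']
```

Spreading the second and third replicas across a remote rack lets the cluster survive a whole-rack failure while keeping one copy local for write bandwidth.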


Journal ArticleDOI
TL;DR: Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of failure.
Abstract: Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of failure. Cassandra aims to run on top of an infrastructure of hundreds of nodes (possibly spread across different data centers). At this scale, small and large components fail continuously. The way Cassandra manages the persistent state in the face of these failures drives the reliability and scalability of the software systems relying on this service. While in many ways Cassandra resembles a database and shares many design and implementation strategies therewith, Cassandra does not support a full relational data model; instead, it provides clients with a simple data model that supports dynamic control over data layout and format. The Cassandra system was designed to run on cheap commodity hardware and handle high write throughput while not sacrificing read efficiency.

2,870 citations
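
To illustrate how a system like Cassandra spreads structured data over many commodity servers with no single point of failure, here is a toy consistent-hashing ring in Python. The node addresses and replication factor are made up for the example, and real Cassandra adds virtual nodes, gossip, and tunable consistency on top.

```python
import bisect, hashlib

def _hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    """Toy consistent-hash ring: a key maps to the first node clockwise
    from its hash, plus the next N-1 distinct nodes as replicas."""
    def __init__(self, nodes, replication=3):
        self.replication = replication
        self.tokens = sorted((_hash(n), n) for n in nodes)
    def replicas(self, key):
        i = bisect.bisect(self.tokens, (_hash(key),))
        picked = []
        for j in range(len(self.tokens)):
            node = self.tokens[(i + j) % len(self.tokens)][1]
            if node not in picked:
                picked.append(node)
            if len(picked) == self.replication:
                break
        return picked

ring = Ring(["10.0.0.%d" % i for i in range(1, 7)])
print(ring.replicas("user:42"))  # the three nodes holding this key
```

Because only neighboring ranges move when a node joins or leaves, failures stay local instead of reshuffling the whole data set.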


Proceedings Article
23 Jun 2010
TL;DR: ZooKeeper provides a per client guarantee of FIFO execution of requests and linearizability for all requests that change the ZooKeeper state to enable the implementation of a high performance processing pipeline with read requests being satisfied by local servers.
Abstract: In this paper, we describe ZooKeeper, a service for coordinating processes of distributed applications. Since ZooKeeper is part of critical infrastructure, ZooKeeper aims to provide a simple and high performance kernel for building more complex coordination primitives at the client. It incorporates elements from group messaging, shared registers, and distributed lock services in a replicated, centralized service. The interface exposed by ZooKeeper has the wait-free aspects of shared registers with an event-driven mechanism similar to cache invalidations of distributed file systems to provide a simple, yet powerful coordination service. The ZooKeeper interface enables a high-performance service implementation. In addition to the wait-free property, ZooKeeper provides a per client guarantee of FIFO execution of requests and linearizability for all requests that change the ZooKeeper state. These design decisions enable the implementation of a high performance processing pipeline with read requests being satisfied by local servers. We show for the target workloads, 2:1 to 100:1 read to write ratio, that ZooKeeper can handle tens to hundreds of thousands of transactions per second. This performance allows ZooKeeper to be used extensively by client applications.

1,637 citations
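
The coordination kernel described above is consumed through client libraries that build higher-level primitives on ZooKeeper's ordered, linearizable writes. As a sketch, the following uses kazoo, a third-party Python client (not part of the paper), to take a distributed lock; the connection string and paths are placeholders.

```python
from kazoo.client import KazooClient

# Connect to a ZooKeeper ensemble (address is a placeholder).
zk = KazooClient(hosts="zk1.example.com:2181")
zk.start()

# A lock recipe built on sequential ephemeral znodes: ZooKeeper's FIFO
# client ordering and linearizable writes are what make recipes like
# this safe to compose at the client.
lock = zk.Lock("/app/leader", identifier="worker-1")
with lock:  # blocks until this client holds the lowest sequence node
    print("entered critical section")

zk.stop()
```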


Proceedings ArticleDOI
14 Mar 2010
TL;DR: This paper utilizes and uniquely combines the public-key-based homomorphic authenticator with random masking to achieve a privacy-preserving public cloud data auditing system that meets all of the stated requirements.
Abstract: Cloud Computing is the long dreamed vision of computing as a utility, where users can remotely store their data into the cloud so as to enjoy the on-demand high quality applications and services from a shared pool of configurable computing resources. By data outsourcing, users can be relieved from the burden of local data storage and maintenance. However, the fact that users no longer have physical possession of the possibly large size of outsourced data makes the data integrity protection in Cloud Computing a very challenging and potentially formidable task, especially for users with constrained computing resources and capabilities. Thus, enabling public auditability for cloud data storage security is of critical importance so that users can resort to an external audit party to check the integrity of outsourced data when needed. To securely introduce an effective third party auditor (TPA), the following two fundamental requirements have to be met: 1) the TPA should be able to efficiently audit the cloud data storage without demanding a local copy of the data, and introduce no additional on-line burden to the cloud user; 2) the third party auditing process should bring in no new vulnerabilities towards user data privacy. In this paper, we utilize and uniquely combine the public-key-based homomorphic authenticator with random masking to achieve a privacy-preserving public cloud data auditing system that meets all of the above requirements. To support efficient handling of multiple auditing tasks, we further explore the technique of bilinear aggregate signatures to extend our main result into a multi-user setting, where the TPA can perform multiple auditing tasks simultaneously. Extensive security and performance analysis shows the proposed schemes are provably secure and highly efficient.

1,408 citations
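
The paper's construction is public-key (BLS-style homomorphic authenticators plus random masking), but the core trick, aggregating many block tags into one constant-size audit response, can be shown with a toy privately verifiable homomorphic MAC over a prime field. Everything below (field size, PRF stand-in, block count) is illustrative only and not secure.

```python
import random

p = (1 << 127) - 1                       # toy prime field (illustrative)
alpha = random.randrange(1, p)           # owner's secret
prf = {}                                 # stands in for a keyed PRF f_k(i)
def f(i):
    return prf.setdefault(i, random.randrange(p))

blocks = [random.randrange(p) for _ in range(8)]               # outsourced data
tags = [(alpha * m + f(i)) % p for i, m in enumerate(blocks)]  # stored alongside

# Audit: the verifier challenges a random subset with coefficients nu_i.
chal = [(i, random.randrange(1, p)) for i in random.sample(range(8), 4)]

# Prover (cloud) aggregates the challenged blocks and tags.
mu = sum(nu * blocks[i] for i, nu in chal) % p
sigma = sum(nu * tags[i] for i, nu in chal) % p

# Verifier checks one aggregated equation instead of fetching the blocks.
assert sigma == (alpha * mu + sum(nu * f(i) for i, nu in chal)) % p
print("audit passed")
```

Because the tags are linear in the data, one equation replaces downloading the challenged blocks, which is precisely what makes third-party auditing cheap; the paper's random masking additionally keeps mu from leaking the data to the TPA.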


Patent
18 Jun 2010
TL;DR: An interactive television program guide system that provides users with an opportunity to select programs for recording on a remote media server and to designate gift recipients for whom programs may be recorded.
Abstract: An interactive television program guide system is provided. An interactive television program guide provides users with an opportunity to select programs for recording on a remote media server. Programs may also be recorded on a local media server. The program guide provides users with VCR-like control over programs that are played back from the media servers and over real-time cached copies of the programs. The program guide also provides users with an opportunity to designate gift recipients for whom programs may be recorded.

1,316 citations


Proceedings ArticleDOI
30 Aug 2010
TL;DR: This work presents Helios, a hybrid electrical/optical switch architecture that can deliver significant reductions in the number of switching elements, cabling, cost, and power consumption relative to recently proposed data center network architectures.
Abstract: The basic building block of ever larger data centers has shifted from a rack to a modular container with hundreds or even thousands of servers. Delivering scalable bandwidth among such containers is a challenge. A number of recent efforts promise full bisection bandwidth between all servers, though with significant cost, complexity, and power consumption. We present Helios, a hybrid electrical/optical switch architecture that can deliver significant reductions in the number of switching elements, cabling, cost, and power consumption relative to recently proposed data center network architectures. We explore architectural trade-offs and challenges associated with realizing these benefits through the evaluation of a fully functional Helios prototype.

1,045 citations
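
At the heart of a hybrid switch like Helios is deciding which container pairs get the scarce optical circuits. One natural formulation, assigning circuits to the heaviest traffic pairs via maximum-weight matching, can be sketched with SciPy; the demand matrix below is invented, and the real system re-estimates demand and reconfigures circuits continuously.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Estimated inter-container traffic (Gb/s); rows = sources, cols = destinations.
demand = np.array([[0, 9, 1, 2],
                   [8, 0, 2, 1],
                   [1, 1, 0, 7],
                   [3, 1, 6, 0]])

# Give each container one optical circuit; choose the assignment that
# carries the most traffic optically (maximize => negate for the solver).
rows, cols = linear_sum_assignment(-demand)
for s, d in zip(rows, cols):
    print(f"optical circuit {s} -> {d} carries {demand[s, d]} Gb/s")
```

Traffic not covered by a circuit falls back to the electrical packet-switched core, which is what lets the hybrid design shed so much switching hardware.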


Patent
19 Nov 2010
TL;DR: A block-based interface to a dispersed data storage network is disclosed; it accepts read and write commands from a file system resident on a user's computer and generates network commands that are forwarded to slice servers.
Abstract: A block-based interface to a dispersed data storage network is disclosed. The disclosed interface accepts read and write commands from a file system resident on a user's computer and generates network commands that are forwarded to slice servers that form the storage component of the dispersed data storage network. The slice servers then fulfill the read and write commands.

929 citations
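
Real dispersed storage uses an information dispersal algorithm (Reed-Solomon style, where any k of n slices rebuild the block); as a minimal stand-in, this sketch splits a block into data slices plus one XOR parity slice, so the loss of any single slice server can be tolerated.

```python
def slice_block(block: bytes, n_data: int = 4):
    """Toy dispersal: n_data slices plus one XOR parity slice."""
    block = block.ljust(-(-len(block) // n_data) * n_data, b"\0")  # pad
    size = len(block) // n_data
    slices = [bytearray(block[i*size:(i+1)*size]) for i in range(n_data)]
    parity = bytearray(size)
    for s in slices:
        for i, b in enumerate(s):
            parity[i] ^= b
    return slices + [parity]            # one slice per slice server

def rebuild(slices, missing: int):
    """XOR of the surviving slices recovers the one that was lost."""
    size = len(next(s for s in slices if s is not None))
    out = bytearray(size)
    for j, s in enumerate(slices):
        if j != missing:
            for i, b in enumerate(s):
                out[i] ^= b
    return out

sl = slice_block(b"hello dispersed storage")
lost = sl[2]; sl[2] = None              # a slice server goes away
assert rebuild(sl, 2) == lost
print("rebuilt lost slice")
```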


Patent
05 Apr 2010
TL;DR: This patent describes an architecture that allows a Content Provider to replicate and serve its most popular content at an unlimited number of points throughout the world, while maintaining control over the content by serving the base HTML document from the Content Provider's site.
Abstract: Network architecture supports hosting and content distribution on a global scale. The architecture allows a Content Provider to replicate and serve its most popular content at an unlimited number of points throughout the world. The inventive framework comprises a set of servers operating in a distributed manner. The actual content to be served is preferably supported on a set of hosting servers (sometimes referred to as ghost servers). This content comprises HTML page objects that, conventionally, are served from a Content Provider site. A base HTML document portion of a Web page is served from the Content Provider's site while one or more embedded objects for the page are served from the hosting servers, preferably, those hosting servers near the client machine. By serving the base HTML document from the Content Provider's site, the Content Provider maintains control over the content.

808 citations
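
The mechanism in the claims, serving the base HTML from the Content Provider but embedded objects from hosting ("ghost") servers, amounts to rewriting object URLs in the page. A toy rewrite in Python; the hostnames are hypothetical.

```python
import re

GHOST_HOST = "a123.ghost-cdn.example.net"   # hypothetical hosting-server name

def rewrite_embedded(html: str) -> str:
    """Leave the base HTML on the Content Provider's site, but point
    embedded objects (images here) at the hosting servers."""
    return re.sub(r'src="https?://www\.provider\.example/([^"]+)"',
                  rf'src="http://{GHOST_HOST}/www.provider.example/\1"',
                  html)

page = '<img src="http://www.provider.example/img/logo.gif">'
print(rewrite_embedded(page))
```

Since browsers fetch each rewritten object from a nearby hosting server, the heavy traffic moves to the edge while the provider's origin keeps serving, and controlling, the base document.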


Journal ArticleDOI
TL;DR: An overview of the components and capabilities of the Akamai platform is given, and some insight into its architecture, design principles, operation, and management is offered.
Abstract: Comprising more than 61,000 servers located across nearly 1,000 networks in 70 countries worldwide, the Akamai platform delivers hundreds of billions of Internet interactions daily, helping thousands of enterprises boost the performance and reliability of their Internet applications. In this paper, we give an overview of the components and capabilities of this large-scale distributed computing platform, and offer some insight into its architecture, design principles, operation, and management.

769 citations


Proceedings ArticleDOI
04 Oct 2010
TL;DR: This work characterizes the availability properties of cloud storage systems based on an extensive one-year study of Google's main storage infrastructure and presents statistical models that enable further insight into the impact of multiple design choices, such as data placement and replication strategies.
Abstract: Highly available cloud storage is often implemented with complex, multi-tiered distributed systems built on top of clusters of commodity servers and disk drives. Sophisticated management, load balancing and recovery techniques are needed to achieve high performance and availability amidst an abundance of failure sources that include software, hardware, network connectivity, and power issues. While there is a relative wealth of failure studies of individual components of storage systems, such as disk drives, relatively little has been reported so far on the overall availability behavior of large cloud-based storage services. We characterize the availability properties of cloud storage systems based on an extensive one-year study of Google's main storage infrastructure and present statistical models that enable further insight into the impact of multiple design choices, such as data placement and replication strategies. With these models we compare data availability under a variety of system parameters given the real patterns of failures observed in our fleet.

672 citations
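
The study's raw ingredients are per-component unavailability periods, from which base metrics like availability, MTTR, and MTBF are derived before building the richer correlated-failure and stripe-level models. A minimal computation on invented event data:

```python
# Unavailability events as (start, end) hours for one storage node over a year.
events = [(100.0, 100.25), (2500.0, 2503.0), (7000.0, 7000.5)]
period = 365 * 24.0

downtime = sum(end - start for start, end in events)
availability = 1 - downtime / period
mttr = downtime / len(events)                  # mean time to repair
mtbf = (period - downtime) / len(events)       # mean time between failures
print(f"availability={availability:.6f}  MTTR={mttr:.2f}h  MTBF={mtbf:.1f}h")
```

The paper's point is that such per-node numbers are not enough: correlated failures (e.g. a rack power event) dominate what replication and placement can actually deliver.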


Proceedings ArticleDOI
30 Nov 2010
TL;DR: This paper proposes virtual data center (VDC) as the unit of resource allocation for multiple tenants in the cloud and introduces a centralized VDC allocation algorithm for bandwidth guaranteed virtual to physical mapping.
Abstract: In this paper, we propose virtual data center (VDC) as the unit of resource allocation for multiple tenants in the cloud. VDCs are more desirable than physical data centers because the resources allocated to VDCs can be rapidly adjusted as tenants' needs change. To enable the VDC abstraction, we design a data center network virtualization architecture called SecondNet. SecondNet achieves scalability by distributing all the virtual-to-physical mapping, routing, and bandwidth reservation state in server hypervisors. Its port-switching based source routing (PSSR) further makes SecondNet applicable to arbitrary network topologies using commodity servers and switches. SecondNet introduces a centralized VDC allocation algorithm for bandwidth guaranteed virtual to physical mapping. Simulations demonstrate that our VDC allocation achieves high network utilization and low time complexity. Our implementation and experiments show that we can build SecondNet on top of various network topologies, and SecondNet provides bandwidth guarantee and elasticity, as designed.
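As a flavor of the allocation problem, here is a greedy sketch that places a VDC's VMs on the servers with the most spare egress bandwidth. Capacities and demands are invented, and the paper's actual algorithm additionally handles pairwise VM-to-VM bandwidth guarantees using clustering and min-cost flow techniques.

```python
# Hypothetical cluster: server -> [free VM slots, free egress bandwidth Gb/s].
servers = {"s1": [4, 10.0], "s2": [2, 4.0], "s3": [8, 20.0]}

def allocate_vdc(vm_bandwidths):
    """Greedy sketch of bandwidth-guaranteed placement: put each VM on
    the server with the most free bandwidth that can still hold it."""
    placement = {}
    for vm, bw in sorted(vm_bandwidths.items(), key=lambda kv: -kv[1]):
        for name in sorted(servers, key=lambda n: -servers[n][1]):
            slots, free_bw = servers[name]
            if slots >= 1 and free_bw >= bw:
                servers[name][0] -= 1
                servers[name][1] -= bw
                placement[vm] = name
                break
        else:
            raise RuntimeError(f"cannot place {vm}: VDC request rejected")
    return placement

print(allocate_vdc({"vm1": 6.0, "vm2": 3.0, "vm3": 3.0}))
```

Rejecting a request outright when guarantees cannot be met, rather than over-committing, is what makes the VDC abstraction meaningful to tenants.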

Proceedings ArticleDOI
13 Apr 2010
TL;DR: Q-Clouds, a QoS-aware control framework that tunes resource allocations to mitigate performance interference effects, is developed, which uses online feedback to build a multi-input multi-output (MIMO) model that captures performance interference interactions, and uses it to perform closed loop resource management.
Abstract: Cloud computing offers users the ability to access large pools of computational and storage resources on demand. Multiple commercial clouds already allow businesses to replace, or supplement, privately owned IT assets, alleviating them from the burden of managing and maintaining these facilities. However, there are issues that must be addressed before this vision of utility computing can be fully realized. In existing systems, customers are charged based upon the amount of resources used or reserved, but no guarantees are made regarding the application level performance or quality-of-service (QoS) that the given resources will provide. As cloud providers continue to utilize virtualization technologies in their systems, this can become problematic. In particular, the consolidation of multiple customer applications onto multicore servers introduces performance interference between collocated workloads, significantly impacting application QoS. To address this challenge, we advocate that the cloud should transparently provision additional resources as necessary to achieve the performance that customers would have realized if they were running in isolation. Accordingly, we have developed Q-Clouds, a QoS-aware control framework that tunes resource allocations to mitigate performance interference effects. Q-Clouds uses online feedback to build a multi-input multi-output (MIMO) model that captures performance interference interactions, and uses it to perform closed loop resource management. In addition, we utilize this functionality to allow applications to specify multiple levels of QoS as application Q-states. For such applications, Q-Clouds dynamically provisions underutilized resources to enable elevated QoS levels, thereby improving system efficiency. Experimental evaluations of our solution using benchmark applications illustrate the benefits: performance interference is mitigated completely when feasible, and system utilization is improved by up to 35% using Q-states.
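The essence of Q-Clouds is fitting a MIMO interference model online and then closing the loop on it. The sketch below is linear (the real model need not be) and every matrix in it is made up; it only shows the fit-then-control structure.

```python
import numpy as np

rng = np.random.default_rng(0)
A_true = np.array([[1.0, -0.3, -0.1],    # hidden plant: off-diagonal terms
                   [-0.2, 1.0, -0.2],    # are performance interference
                   [-0.1, -0.4, 1.0]])   # between three co-located VMs

def measure(u):                          # perf of each VM given allocations u
    return A_true @ u + rng.normal(0, 0.01, 3)

# 1) Fit the MIMO model y = A u from online (allocation, performance) pairs.
U = rng.uniform(0.2, 1.0, (30, 3))
Y = np.array([measure(u) for u in U])
A_hat = np.linalg.lstsq(U, Y, rcond=None)[0].T

# 2) Closed loop: adjust allocations until every VM meets its target,
#    compensating for interference via the fitted model.
target = np.array([0.6, 0.5, 0.7])
u = np.full(3, 0.5)
for _ in range(20):
    u = u + 0.5 * np.linalg.solve(A_hat, target - measure(u))
print("achieved:", measure(u).round(3), "targets:", target)
```

The controller gives interfering VMs more resources than an isolated-VM model would, which is exactly the "provision for the performance you would have had in isolation" idea.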

Proceedings ArticleDOI
20 Apr 2010
TL;DR: CloudAnalyst was developed to simulate large-scale Cloud applications in order to study their behavior under various deployment configurations, giving developers insight into how to distribute applications among Cloud infrastructures and how value-added services such as Service Brokers can optimize application performance and provider income.
Abstract: Advances in Cloud computing open up many new possibilities for Internet application developers. Previously, a main concern of Internet application developers was deployment and hosting of applications, because doing so required acquisition of a server with a fixed capacity able to handle the expected application peak demand, plus the installation and maintenance of the whole software infrastructure of the platform supporting the application. Furthermore, the server was often underutilized because peak traffic happens only at specific times. With the advent of the Cloud, deployment and hosting became cheaper and easier with the use of pay-per-use, flexible, elastic infrastructure services offered by Cloud providers. Because several Cloud providers are available, each one offering different pricing models and located in different geographic regions, a new concern of application developers is selecting providers and data center locations for applications. However, there is a lack of tools that enable developers to evaluate requirements of large-scale Cloud applications in terms of geographic distribution of both computing servers and user workloads. To fill this gap in tools for evaluation and modeling of Cloud environments and applications, we propose CloudAnalyst. It was developed to simulate large-scale Cloud applications with the purpose of studying the behavior of such applications under various deployment configurations. CloudAnalyst gives developers insight into how to distribute applications among Cloud infrastructures, and into value-added services such as optimizing application performance and provider income with the use of Service Brokers.

Proceedings ArticleDOI
10 Jun 2010
TL;DR: Joulemeter builds power models to infer power consumption from resource usage at runtime and identifies the challenges that arise when applying such models for VM power metering, and shows how existing instrumentation in server hardware and hypervisors can be used to build the required power models on real platforms with low error.
Abstract: Virtualization is often used in cloud computing platforms for its several advantages in efficiently managing resources. However, virtualization raises certain additional challenges, and one of them is lack of power metering for virtual machines (VMs). Power management requirements in modern data centers have led to most new servers providing power usage measurement in hardware, and alternate solutions exist for older servers using circuit and outlet level measurements. However, VM power cannot be measured purely in hardware. We present a solution for VM power metering, named Joulemeter. We build power models to infer power consumption from resource usage at runtime and identify the challenges that arise when applying such models for VM power metering. We show how existing instrumentation in server hardware and hypervisors can be used to build the required power models on real platforms with low error. Our approach is designed to operate with extremely low runtime overhead while providing practically useful accuracy. We illustrate the use of the proposed metering capability for VM power capping, a technique to reduce power provisioning costs in data centers. Experiments are performed on server traces from several thousand production servers, hosting Microsoft's real-world applications such as Windows Live Messenger. The results show that VM power metering not only allows virtualized data centers to achieve the same savings that non-virtualized data centers achieved through physical server power capping, but also enables further savings in provisioning costs with virtualization.
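Joulemeter's power models are of the "infer watts from resource counters" kind; a linear model fit by least squares captures the idea. The counters, readings, and VM shares below are invented, and the paper's actual models are more refined.

```python
import numpy as np

# Per-interval resource counters (CPU util, disk I/O rate) and the
# full-server power readings they coincided with; values are illustrative.
cpu   = np.array([0.10, 0.35, 0.60, 0.80, 0.95])
disk  = np.array([0.05, 0.10, 0.40, 0.20, 0.60])
watts = np.array([165, 190, 225, 238, 270])

# Fit P ~ P_idle + a*cpu + b*disk, then attribute dynamic power to one
# VM from that VM's own counters (which the hypervisor can observe).
X = np.column_stack([np.ones_like(cpu), cpu, disk])
p_idle, a, b = np.linalg.lstsq(X, watts, rcond=None)[0]

vm_cpu, vm_disk = 0.25, 0.10
vm_power = a * vm_cpu + b * vm_disk
print(f"P = {p_idle:.1f} + {a:.1f}*cpu + {b:.1f}*disk; VM ~ {vm_power:.1f} W")
```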

Proceedings ArticleDOI
13 Dec 2010
TL;DR: This paper presents Open Data Kit, an extensible, open-source suite of tools designed to build information services for developing regions and describes four deployments that demonstrate how the decisions made in the system architecture of ODK enable services that can both push and pull information in developing regions.
Abstract: This paper presents Open Data Kit (ODK), an extensible, open-source suite of tools designed to build information services for developing regions. ODK currently provides four tools to this end: Collect, Aggregate, Voice, and Build. Collect is a mobile platform that renders application logic and supports the manipulation of data. Aggregate provides a "click-to-deploy" server that supports data storage and transfer in the "cloud" or on local servers. Voice renders application logic using phone prompts that users respond to with keypad presses. Finally, Build is an application designer that generates the logic used by the tools. Designed to be used together or independently, the ODK core tools build on existing open standards and are supported by an open-source community that has contributed additional tools. We describe four deployments that demonstrate how the decisions made in the system architecture of ODK enable services that can both push and pull information in developing regions.

Journal ArticleDOI
TL;DR: This paper argues for a new approach to datacenter storage called RAMCloud, where information is kept entirely in DRAM and large-scale systems are created by aggregating the main memories of thousands of commodity servers.
Abstract: Disk-oriented approaches to online storage are becoming increasingly problematic: they do not scale gracefully to meet the needs of large-scale Web applications, and improvements in disk capacity have far outstripped improvements in access latency and bandwidth. This paper argues for a new approach to datacenter storage called RAMCloud, where information is kept entirely in DRAM and large-scale systems are created by aggregating the main memories of thousands of commodity servers. We believe that RAMClouds can provide durable and available storage with 100-1000x the throughput of disk-based systems and 100-1000x lower access latency. The combination of low latency and large scale will enable a new breed of data-intensive applications.
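At its simplest, the aggregation idea, one key-value namespace over the DRAM of thousands of servers, reduces to hash partitioning. A toy sketch below; RAMCloud itself adds durability (logging and backup replicas) and very low-latency RPC, none of which is shown.

```python
import hashlib

class RamCloudSketch:
    """Toy aggregation of many servers' DRAM into one key-value space:
    each key hashes to the server whose memory holds it."""
    def __init__(self, servers):
        self.mem = {s: {} for s in servers}  # each dict stands for one host's DRAM
        self.servers = sorted(servers)
    def _owner(self, key):
        h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
        return self.servers[h % len(self.servers)]
    def write(self, key, value):
        self.mem[self._owner(key)][key] = value
    def read(self, key):
        return self.mem[self._owner(key)][key]

rc = RamCloudSketch([f"server{i}" for i in range(1000)])  # thousands of hosts
rc.write("profile:alice", b"...")
print(rc.read("profile:alice"))
```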

Proceedings ArticleDOI
Howard S. David, Eugene Gorbatov, Ulf R. Hanebutte, Rahul Khanna, Christian Le
18 Aug 2010
TL;DR: This paper proposes a new approach for measuring memory power and demonstrates its applicability to a novel power limiting algorithm, achieving up to 40% lower performance impact compared to the state-of-the-art baseline across the power limiting range.
Abstract: The drive for higher performance and energy efficiency in data-centers has influenced trends toward increased power and cooling requirements in the facilities. Since enterprise servers rarely operate at their peak capacity, efficient power capping is deemed as a critical component of modern enterprise computing environments. In this paper we propose a new power measurement and power limiting architecture for main memory. Specifically, we describe a new approach for measuring memory power and demonstrate its applicability to a novel power limiting algorithm. We implement and evaluate our approach on modern servers and show that we achieve up to 40% lower performance impact when compared to the state-of-the-art baseline across the power limiting range.
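Power limiting of this kind is a feedback loop: read a power estimate, throttle memory bandwidth when over budget. The sketch below uses hypothetical measure/throttle hooks and a toy linear plant purely to show the loop structure; it is not the paper's algorithm.

```python
# Sketch of a memory power-capping control loop. measure_power_w and
# set_bw_limit stand in for platform counters and throttling knobs.
def run_capping_loop(measure_power_w, set_bw_limit, budget_w,
                     steps=100, gain=0.05):
    bw = 1.0                                # fraction of full memory bandwidth
    for _ in range(steps):
        err = budget_w - measure_power_w()  # positive => headroom under budget
        bw = min(1.0, max(0.1, bw + gain * err))
        set_bw_limit(bw)

# Toy plant: memory power roughly proportional to allowed bandwidth.
state = {"bw": 1.0}
run_capping_loop(measure_power_w=lambda: 10 + 30 * state["bw"],
                 set_bw_limit=lambda b: state.update(bw=b),
                 budget_w=25.0)
print(f"settled bandwidth cap: {state['bw']:.2f}")  # converges near 0.50
```

The paper's contribution is making the measurement half of this loop cheap and accurate enough for main memory, where direct sensing is normally absent.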

Proceedings ArticleDOI
18 Dec 2010
TL;DR: A two-level control system to manage the mappings of workloads to VMs and VMs to physical resources and an improved genetic algorithm with fuzzy multi-objective evaluation is proposed for efficiently searching the large solution space and conveniently combining possibly conflicting objectives.
Abstract: Server consolidation using virtualization technology has become increasingly important for improving data center efficiency. It enables one physical server to host multiple independent virtual machines (VMs), and the transparent movement of workloads from one server to another. Fine-grained virtual machine resource allocation and reallocation are possible in order to meet the performance targets of applications running on virtual machines. On the other hand, these capabilities create demands on system management, especially for large-scale data centers. In this paper, a two-level control system is proposed to manage the mappings of workloads to VMs and VMs to physical resources. The focus is on the VM placement problem, which is posed as a multi-objective optimization problem of simultaneously minimizing total resource wastage, power consumption and thermal dissipation costs. An improved genetic algorithm with fuzzy multi-objective evaluation is proposed for efficiently searching the large solution space and conveniently combining possibly conflicting objectives. The simulation-based evaluation, using power-consumption and thermal-dissipation models based on profiling of a Blade Center, demonstrates the good performance, scalability and robustness of our proposed approach. Compared with four well-known bin-packing algorithms and two single-objective approaches, the solutions obtained from our approach seek good balance among the conflicting objectives while others cannot.
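To make the GA concrete, here is a compact sketch: a chromosome assigns each VM to a server, and fitness combines power and resource wastage. The paper's fuzzy multi-objective evaluation is replaced by a plain weighted sum, and all demands, capacities, and power numbers are invented.

```python
import random

random.seed(1)
VM_CPU = [0.3, 0.2, 0.5, 0.4, 0.1, 0.6, 0.2, 0.3]   # VM demands (fractions)
N_SERVERS, CAP, P_IDLE, P_FULL = 4, 1.0, 150.0, 300.0

def cost(placement):
    """Weighted sum of power and wastage; a simplification of the
    paper's fuzzy multi-objective evaluation."""
    load = [0.0] * N_SERVERS
    for vm, srv in enumerate(placement):
        load[srv] += VM_CPU[vm]
    if any(l > CAP for l in load):
        return float("inf")                           # infeasible chromosome
    power = sum(P_IDLE + (P_FULL - P_IDLE) * l for l in load if l > 0)
    waste = sum(CAP - l for l in load if l > 0)
    return power + 100.0 * waste

def evolve(pop=40, gens=200, pmut=0.2):
    popn = [[random.randrange(N_SERVERS) for _ in VM_CPU] for _ in range(pop)]
    for _ in range(gens):
        popn.sort(key=cost)
        parents = popn[: pop // 2]                    # elitist selection
        children = []
        while len(children) < pop - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(VM_CPU))
            child = a[:cut] + b[cut:]                 # one-point crossover
            if random.random() < pmut:                # mutation: move one VM
                child[random.randrange(len(child))] = random.randrange(N_SERVERS)
            children.append(child)
        popn = parents + children
    return min(popn, key=cost)

best = evolve()
print("placement:", best, "cost:", round(cost(best), 1))
```

Turning empty servers into zero-power terms in the fitness is what pushes the search toward consolidation rather than spreading load evenly.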

Proceedings ArticleDOI
10 Jun 2010
TL;DR: This paper is the first attempt to study server failures and hardware repairs for large datacenters and presents a detailed analysis of failure characteristics as well as a preliminary analysis on failure predictors.
Abstract: Modern day datacenters host hundreds of thousands of servers that coordinate tasks in order to deliver highly available cloud computing services. These servers consist of multiple hard disks, memory modules, network cards, processors etc., each of which, while carefully engineered, is capable of failing. While the probability of seeing any such failure in the lifetime (typically 3-5 years in industry) of a server can be somewhat small, these numbers get magnified across all devices hosted in a datacenter. At such a large scale, hardware component failure is the norm rather than an exception. Hardware failure can lead to a degradation in performance to end-users and can result in losses to the business. A sound understanding of the numbers as well as the causes behind these failures helps improve operational experience by not only allowing us to be better equipped to tolerate failures but also to bring down the hardware cost through engineering, directly leading to a saving for the company. To the best of our knowledge, this paper is the first attempt to study server failures and hardware repairs for large datacenters. We present a detailed analysis of failure characteristics as well as a preliminary analysis on failure predictors. We hope that the results presented in this paper will serve as motivation to foster further research in this area.

Proceedings ArticleDOI
20 Apr 2010
TL;DR: This paper investigates three possible distributed solutions proposed for load balancing: approaches inspired by Honeybee Foraging Behaviour, Biased Random Sampling and Active Clustering.
Abstract: The anticipated uptake of Cloud computing, built on well-established research in Web Services, networks, utility computing, distributed computing and virtualisation, will bring many advantages in cost, flexibility and availability for service users. These benefits are expected to further drive the demand for Cloud services, increasing both the Cloud's customer base and the scale of Cloud installations. This has implications for many technical issues in Service Oriented Architectures and Internet of Services (IoS)-type applications; including fault tolerance, high availability and scalability. Central to these issues is the establishment of effective load balancing techniques. It is clear the scale and complexity of these systems makes centralized assignment of jobs to specific servers infeasible; requiring an effective distributed solution. This paper investigates three possible distributed solutions proposed for load balancing: approaches inspired by Honeybee Foraging Behaviour, Biased Random Sampling and Active Clustering.
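Of the three schemes surveyed, Biased Random Sampling is the easiest to sketch: servers appear in a virtual graph with in-degree proportional to their free capacity, and a short random walk lands jobs on lightly loaded servers with high probability. The loose sketch below collapses the walk to weighted sampling at each hop; all capacities are invented.

```python
import random

random.seed(0)
# Virtual graph: a server's in-degree tracks its free capacity, so
# walks end on lightly loaded servers more often.
free_slots = {"s1": 8, "s2": 3, "s3": 1, "s4": 6}

def pick_server(walk_length=3):
    """Each hop lands on a server with probability proportional to its
    in-degree (free capacity); the last hop receives the job."""
    nodes = list(free_slots)
    node = random.choice(nodes)
    for _ in range(walk_length):
        node = random.choices(nodes, weights=[free_slots[n] for n in nodes])[0]
    return node

for job in range(5):
    s = pick_server()
    free_slots[s] = max(0, free_slots[s] - 1)  # allocation consumes an in-edge
    print(f"job {job} -> {s}")
```

No node ever needs a global view of the load, which is exactly the property that makes these schemes attractive at Cloud scale.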

Proceedings ArticleDOI
01 Apr 2010
TL;DR: This paper introduces the Graphite open-source distributed parallel multicore simulator infrastructure and demonstrates that Graphite can simulate target architectures containing over 1000 cores on ten 8-core servers with near linear speedup.
Abstract: This paper introduces the Graphite open-source distributed parallel multicore simulator infrastructure. Graphite is designed from the ground up for exploration of future multi-core processors containing dozens, hundreds, or even thousands of cores. It provides high performance for fast design space exploration and software development. Several techniques are used to achieve this including: direct execution, seamless multicore and multi-machine distribution, and lax synchronization. Graphite is capable of accelerating simulations by distributing them across multiple commodity Linux machines. When using multiple machines, it provides the illusion of a single process with a single, shared address space, allowing it to run off-the-shelf pthread applications with no source code modification. Our results demonstrate that Graphite can simulate target architectures containing over 1000 cores on ten 8-core servers. Performance scales well as more machines are added with near linear speedup in many cases. Simulation slowdown is as low as 41× versus native execution.

Proceedings ArticleDOI
Dennis Abts, Michael R. Marty, Philip M. Wells, Peter Michael Klausler, Hong Liu
19 Jun 2010
TL;DR: It is demonstrated that energy proportional datacenter communication is indeed possible and that there is a significant power advantage to having independent control of each unidirectional channel comprising a network link.
Abstract: Numerous studies have shown that datacenter computers rarely operate at full utilization, leading to a number of proposals for creating servers that are energy proportional with respect to the computation that they are performing. In this paper, we show that as servers themselves become more energy proportional, the datacenter network can become a significant fraction (up to 50%) of cluster power. In this paper we propose several ways to design a high-performance datacenter network whose power consumption is more proportional to the amount of traffic it is moving -- that is, we propose energy proportional datacenter networks. We first show that a flattened butterfly topology itself is inherently more power efficient than the other commonly proposed topology for high-performance datacenter networks. We then exploit the characteristics of modern plesiochronous links to adjust their power and performance envelopes dynamically. Using a network simulator, driven by both synthetic workloads and production datacenter traces, we characterize and understand design tradeoffs, and demonstrate an 85% reduction in power --- which approaches the ideal energy-proportionality of the network. Our results also demonstrate two challenges for the designers of future network switches: 1) We show that there is a significant power advantage to having independent control of each unidirectional channel comprising a network link, since many traffic patterns show very asymmetric use, and 2) system designers should work to optimize the high-speed channel designs to be more energy efficient by choosing optimal data rate and equalization technology. Given these assumptions, we demonstrate that energy proportional datacenter communication is indeed possible.
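The paper's first design challenge, independent control of each unidirectional channel, pays off because traffic is often asymmetric. A sketch of per-direction link-rate tuning; the rate/power pairs are invented numbers, not measurements from the paper.

```python
# Dynamic rate adaptation for plesiochronous links: pick the lowest
# rate that covers offered load, independently per direction.
RATE_POWER = [(2.5, 0.3), (5.0, 0.5), (10.0, 0.8), (20.0, 1.4), (40.0, 2.5)]
# (Gb/s, W per channel) -- illustrative only.

def tune_channel(offered_gbps):
    for rate, power in RATE_POWER:
        if rate >= offered_gbps:
            return rate, power
    return RATE_POWER[-1]           # saturate at the top rate

# Asymmetric traffic on one link: each unidirectional channel tuned alone.
for direction, load in [("east->west", 1.2), ("west->east", 18.0)]:
    rate, power = tune_channel(load)
    print(f"{direction}: {load} Gb/s -> run at {rate} Gb/s, {power} W")
```

Coupled-direction links would have to run both channels at 20 Gb/s here; per-channel control lets the quiet direction idle near the bottom of the power range.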

Proceedings ArticleDOI
01 Dec 2010
TL;DR: The simulation results obtained for two-tier, three-tier, and three-tier high-speed data center architectures demonstrate the effectiveness of the simulator in utilizing power management schemes, such as voltage scaling, frequency scaling, and dynamic shutdown, that are applied to the computing and networking components.
Abstract: Cloud computing data centers are becoming increasingly popular for the provisioning of computing resources. The cost and operating expenses of data centers have skyrocketed with the increase in computing capacity. Several governmental, industrial, and academic surveys indicate that the energy utilized by computing and communication units within a data center contributes to a considerable slice of the data center operational costs. In this paper, we present a simulation environment for energy-aware cloud computing data centers. Along with the workload distribution, the simulator is designed to capture details of the energy consumed by data center components (servers, switches, and links) as well as packet-level communication patterns in realistic setups. The simulation results obtained for two-tier, three-tier, and three-tier high-speed data center architectures demonstrate the effectiveness of the simulator in utilizing power management schemes, such as voltage scaling, frequency scaling, and dynamic shutdown, that are applied to the computing and networking components.
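Voltage/frequency scaling is worth simulating because dynamic CPU power falls roughly with the cube of frequency while fixed power does not, so the energy-optimal frequency for a fixed job is often below the maximum. A toy model with illustrative constants:

```python
# DVFS sketch: dynamic power ~ f^3 (voltage scales with frequency),
# fixed power is unaffected. All constants are illustrative.
P_FIXED = 80.0        # W: fans, disks, chipset (not affected by DVFS)
P_CPU_MAX = 120.0     # W: CPU dynamic power at full frequency

def energy_joules(cycles, f_ghz, f_max=3.0):
    power = P_FIXED + P_CPU_MAX * (f_ghz / f_max) ** 3
    seconds = cycles / (f_ghz * 1e9)    # slower clock => longer runtime
    return power * seconds

job = 3e12  # cycles of work
for f in (3.0, 2.0, 1.5):
    print(f"{f} GHz: {energy_joules(job, f):.0f} J")
```

With these numbers the job costs about 200 kJ at 3.0 GHz but only about 173 kJ at 2.0 GHz, while slowing further to 1.5 GHz loses again because the fixed power draw dominates the longer runtime.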

Proceedings ArticleDOI
29 Nov 2010
TL;DR: This work proposes a novel technique for dynamic consolidation of VMs based on adaptive utilization thresholds, which ensures a high level of adherence to Service Level Agreements (SLAs), and validates the high efficiency of the proposed technique across different kinds of workloads.
Abstract: The rapid growth in demand for computational power driven by modern service applications combined with the shift to the Cloud computing model have led to the establishment of large-scale virtualized data centers. Such data centers consume enormous amounts of electrical energy resulting in high operating costs and carbon dioxide emissions. Dynamic consolidation of virtual machines (VMs) and switching idle nodes off allow Cloud providers to optimize resource usage and reduce energy consumption. However, the obligation of providing high quality of service to customers leads to the necessity of dealing with the energy-performance trade-off. We propose a novel technique for dynamic consolidation of VMs based on adaptive utilization thresholds, which ensures a high level of adherence to the Service Level Agreements (SLAs). We validate the high efficiency of the proposed technique across different kinds of workloads using workload traces from more than a thousand PlanetLab servers.
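The adaptive-threshold idea is that a fixed utilization cutoff ignores workload volatility. One plausible rule in the paper's spirit (the MAD-based formula and constants here are illustrative, not the authors' exact definition): leave more headroom when recent utilization is erratic.

```python
import statistics

def upper_threshold(history, s=2.5):
    """Adaptive upper utilization threshold: the more volatile the
    recent CPU history, the lower the threshold (more headroom)."""
    med = statistics.median(history)
    mad = statistics.median(abs(u - med) for u in history)
    return max(0.5, 1.0 - s * mad)

def overloaded(history):
    return history[-1] > upper_threshold(history)

steady  = [0.61, 0.60, 0.62, 0.61, 0.63, 0.62, 0.85]
erratic = [0.20, 0.75, 0.30, 0.90, 0.25, 0.80, 0.85]
print(overloaded(steady),  round(upper_threshold(steady), 3))   # False, ~0.975
print(overloaded(erratic), round(upper_threshold(erratic), 3))  # True,  ~0.625
```

The same 0.85 reading triggers a migration only on the erratic host, which is how adaptive thresholds cut SLA violations without freezing consolidation on stable workloads.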

Proceedings ArticleDOI
06 Dec 2010
TL;DR: This work implemented a prototype of this security model for Android phones, and shows that it is both practical and scalable: it is able to support more than a hundred replicas running on a single server.
Abstract: Smartphone usage has been continuously increasing in recent years. Moreover, smartphones are often used for privacy-sensitive tasks, becoming highly valuable targets for attackers. They are also quite different from PCs, so that PC-oriented solutions are not always applicable, or do not offer comprehensive security. We propose an alternative solution, where security checks are applied on remote security servers that host exact replicas of the phones in virtual environments. The servers are not subject to the same constraints, allowing us to apply multiple detection techniques simultaneously. We implemented a prototype of this security model for Android phones, and show that it is both practical and scalable: we generate no more than 2 KiB/s and 64 B/s of trace data for high load and idle operation respectively, and are able to support more than a hundred replicas running on a single server.

Proceedings ArticleDOI
01 Nov 2010
TL;DR: A detailed study of 130,000 measurement sessions that the service has recorded since it was made publicly available in June 2009 is presented, along with a description of Netalyzr's architecture and system implementation.
Abstract: In this paper we present Netalyzr, a network measurement and debugging service that evaluates the functionality provided by people's Internet connectivity. The design aims to prove both comprehensive in terms of the properties we measure and easy to employ and understand for users with little technical background. We structure Netalyzr as a signed Java applet (which users access via their Web browser) that communicates with a suite of measurement-specific servers. Traffic between the two then probes for a diverse set of network properties, including outbound port filtering, hidden in-network HTTP caches, DNS manipulations, NAT behavior, path MTU issues, IPv6 support, and access-modem buffer capacity. In addition to reporting results to the user, Netalyzr also forms the foundation for an extensive measurement of edge-network properties. To this end, along with describing Netalyzr's architecture and system implementation, we present a detailed study of 130,000 measurement sessions that the service has recorded since we made it publicly available in June 2009.
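One of the measured properties, outbound port filtering, reduces to attempting TCP connections to a measurement server on a list of commonly filtered ports. A sketch in Python; the target host is a placeholder for the suite of measurement-specific servers the applet actually talks to.

```python
import socket

TARGET = "measurement.example.net"   # hypothetical measurement server
PORTS = [25, 80, 135, 443, 445, 587]  # ports ISPs commonly filter

for port in PORTS:
    try:
        # A successful connect means the path does not filter this port.
        with socket.create_connection((TARGET, port), timeout=3):
            print(f"port {port:5d}: outbound connection OK")
    except OSError as exc:
        print(f"port {port:5d}: blocked or unreachable ({exc})")
```

Running the probe from inside the user's browser session is what lets the service see the connectivity the user actually gets, NATs, caches, and filters included.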

Journal ArticleDOI
TL;DR: The UCL Bioinformatics Group web portal offers a fully automated 3D modelling pipeline: BioSerf, which performed well in CASP8 and uses a fragment-assembly approach which placed it in the top five servers in the de novo modelling category.
Abstract: The UCL Bioinformatics Group web portal offers several high quality protein structure prediction and function annotation algorithms including PSIPRED, pGenTHREADER, pDomTHREADER, MEMSAT, MetSite, DISOPRED2, DomPred and FFPred for the prediction of secondary structure, protein fold, protein structural domain, transmembrane helix topology, metal binding sites, regions of protein disorder, protein domain boundaries and protein function, respectively. We also now offer a fully automated 3D modelling pipeline: BioSerf, which performed well in CASP8 and uses a fragment-assembly approach which placed it in the top five servers in the de novo modelling category. The servers are available via the group web site at http://bioinf.cs.ucl.ac.uk/.

Journal ArticleDOI
TL;DR: This paper presents decision models to optimally allocate source servers to physical target servers while considering real-world constraints and presents a heuristic to address large-scale server consolidation projects.
Abstract: Today's data centers offer IT services mostly hosted on dedicated physical servers. Server virtualization provides a technical means for server consolidation. Thus, multiple virtual servers can be hosted on a single server. Server consolidation describes the process of combining the workloads of several different servers on a set of target servers. We focus on server consolidation with dozens or hundreds of servers, which can be regularly found in enterprise data centers. Cost saving is among the key drivers for such projects. This paper presents decision models to optimally allocate source servers to physical target servers while considering real-world constraints. Our central model is proven to be an NP-hard problem. Therefore, besides an exact solution method, a heuristic is presented to address large-scale server consolidation projects. In addition, a preprocessing method for server load data is introduced allowing for the consideration of quality-of-service levels. Extensive experiments were conducted based on a large set of server load data from a data center provider focusing on managerial concerns over what types of problems can be solved. Results show that, on average, server savings of 31 percent can be achieved only by taking cycles in the server workload into account.
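The allocation problem is a bin-packing variant (hence NP-hard), and first-fit decreasing is the classic heuristic baseline for it. A minimal sketch with invented workloads; the paper's decision models add real-world constraints and QoS-aware preprocessing of the measured load traces.

```python
# First-fit-decreasing sketch of the consolidation step: pack measured
# source-server workloads onto as few identical targets as possible.
TARGET_CAPACITY = 100.0                       # e.g. normalized CPU units

def consolidate(workloads):
    targets = []                              # each entry: remaining capacity
    mapping = []
    for name, load in sorted(workloads.items(), key=lambda kv: -kv[1]):
        for t, free in enumerate(targets):
            if free >= load:                  # first target it fits on
                targets[t] -= load
                mapping.append((name, t))
                break
        else:                                 # no fit: open a new target
            targets.append(TARGET_CAPACITY - load)
            mapping.append((name, len(targets) - 1))
    return mapping, len(targets)

mapping, n = consolidate({"web1": 35, "web2": 30, "db1": 60,
                          "mail": 20, "batch": 45})
print(f"{n} target servers:", mapping)
```

The paper's observation that workloads have daily cycles means loads that peak at different hours can share a target, which is where much of the reported 31 percent saving comes from.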

Journal ArticleDOI
TL;DR: How the biomedical informatics (BMI) community, especially consortia that share data and applications, can take advantage of a new resource called "cloud computing" is examined.

Book ChapterDOI
07 Sep 2010
TL;DR: In this chapter, the authors propose fine-grained access control for online personal health record (PHR) data in a multi-owner setting, where each owner encrypts her PHR files using a different set of cryptographic keys.
Abstract: Online personal health records (PHRs) enable patients to manage their own medical records in a centralized way, which greatly facilitates the storage, access and sharing of personal health data. With the emergence of cloud computing, it is attractive for PHR service providers to shift their PHR applications and storage into the cloud, in order to enjoy the elastic resources and reduce the operational cost. However, by storing PHRs in the cloud, patients lose physical control over their personal health data, which makes it necessary for each patient to encrypt her PHR data before uploading it to the cloud servers. Under encryption, it is challenging to achieve fine-grained access control to PHR data in a scalable and efficient way. For each patient, the PHR data should be encrypted so that it is scalable with the number of users having access. Also, since there are multiple owners (patients) in a PHR system and every owner would encrypt her PHR files using a different set of cryptographic keys, it is important to reduce the key distribution complexity in such multi-owner settings. Existing cryptographically enforced access control schemes are mostly designed for the single-owner scenario.
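The key-management pain the chapter motivates shows up already in the naive baseline: one symmetric key per PHR file, wrapped separately for every authorized user. A toy version using the Python cryptography package (all names invented); schemes like attribute-based encryption exist precisely to avoid this per-user, per-owner key explosion.

```python
from cryptography.fernet import Fernet

class PhrOwner:
    """Naive owner-managed encryption: one key per file, and a wrapped
    copy of that key handed to each authorized user."""
    def __init__(self):
        self.file_keys = {}
    def upload(self, cloud, name, plaintext):
        key = Fernet.generate_key()
        self.file_keys[name] = key
        cloud[name] = Fernet(key).encrypt(plaintext)  # cloud sees only ciphertext
    def grant(self, user_key, name):
        # Hand over the file key, itself encrypted under the user's key.
        return Fernet(user_key).encrypt(self.file_keys[name])

cloud = {}
owner = PhrOwner()
owner.upload(cloud, "labs-2010.pdf", b"cholesterol: 180 mg/dL")

doctor_key = Fernet.generate_key()
wrapped = owner.grant(doctor_key, "labs-2010.pdf")
file_key = Fernet(doctor_key).decrypt(wrapped)
print(Fernet(file_key).decrypt(cloud["labs-2010.pdf"]))
```

With many owners each managing such grants for many users, key distribution grows multiplicatively, which is the scalability problem the chapter sets out to solve.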