
Showing papers on "Web service published in 2018"


Journal ArticleDOI
TL;DR: The Content-based Journals & Conferences Recommender System on computer science, as well as its web service, is presented, which recommends suitable journals or conferences with a priority order based on the abstract of a manuscript.
Abstract: As computer science and information technology are making broad and deep impacts on our daily lives, more and more papers are being submitted to computer science journals and conferences. To help authors decide where they should submit their manuscripts, we present the Content-based Journals & Conferences Recommender System on computer science, as well as its web service at http://www.keaml.cn/prs/. This system recommends suitable journals or conferences in a priority order based on the abstract of a manuscript. To follow the fast development of computer science and technology, a web crawler is employed to continuously update the training set and the learning model. To achieve interactive online response, we propose an efficient hybrid model based on chi-square feature selection and softmax regression. Our test results show that the system can achieve an accuracy of 61.37% and suggest the best journals or conferences in about 5 seconds on average.
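As a rough illustration of the hybrid model described above (chi-square feature selection feeding a softmax classifier), here is a minimal scikit-learn sketch; the abstracts, venue labels, and the value of k are illustrative placeholders, not the system's data or settings:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy training data: manuscript abstracts with their target venues.
abstracts = [
    "We propose a convolutional network for image classification.",
    "A new index structure accelerates SQL query processing.",
    "We study routing protocols for wireless sensor networks.",
]
venues = ["CVPR", "VLDB", "INFOCOM"]  # hypothetical venue labels

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),          # abstract text -> sparse features
    ("chi2", SelectKBest(chi2, k=10)),     # keep the most informative terms
    ("softmax", LogisticRegression(max_iter=1000)),  # multinomial by default
])
pipeline.fit(abstracts, venues)
print(pipeline.predict(["Deep learning for object detection in images."]))
```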

287 citations


Journal ArticleDOI
TL;DR: A peer-to-peer (P2P) approach for service-oriented FBDE is proposed, which revolutionizes the traditional centralized, neutral-file-based approach for CBDM.
Abstract: With the rapid development of service-oriented computing (SOC)/service-oriented architecture (SOA), cloud computing and web services, cloud-based design and manufacture (CBDM) is emerging as a state-of-the-art technology and methodology to enable collaborative product development (CPD). CBDM-enabled CPD can provide cost-effective, flexible and scalable solutions to collaborative partners by sharing resources in design and manufacturing applications. Feature-based data exchange (FBDE) has been one of the key issues in the history of CPD and should be adapted to the latest CBDM-enabled CPD. Firstly, this paper presents a service-oriented architecture for data exchange in CBDM. Within this architecture, FBDE is registered as a service, and FBDE users in the CBDM environment can acquire a set of FBDE services to replace the traditional FBDE functions among heterogeneous CAD systems. Secondly, in order to put the philosophy of FBDE-as-a-Service into practice for CBDM, this paper proposes a peer-to-peer (P2P) approach for service-oriented FBDE, which revolutionizes the traditional centralized and neutral-file-based approach. Thirdly, technical issues of FBDE-as-a-Service in a P2P architecture are discussed in detail, including the constitution of the P2P FBDE service, the procedure of service-oriented P2P FBDE, the pre-P2P FBDE service, topological entity matching between the pre- and post-P2P services, and the post-P2P FBDE service. Finally, a case study of data exchange is presented to demonstrate the proposed idea of service-oriented FBDE for CBDM.

170 citations


Journal ArticleDOI
TL;DR: This Anniversary update gives a 20-year perspective on the RSAT (Regulatory Sequence Analysis Tools) software suite, including updated programs to analyse regulatory variants and tools to extract sequences from a list of coordinates.
Abstract: RSAT (Regulatory Sequence Analysis Tools) is a suite of modular tools for the detection and the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, including from genome-wide datasets like ChIP-seq/ATAC-seq, (ii) motif scanning, (iii) motif analysis (quality assessment, comparisons and clustering), (iv) analysis of regulatory variations, (v) comparative genomics. Six public servers jointly support 10 000 genomes from all kingdoms. Six novel or refactored programs have been added since the 2015 NAR Web Software Issue, including updated programs to analyse regulatory variants (retrieve-variation-seq, variation-scan, convert-variations), along with tools to extract sequences from a list of coordinates (retrieve-seq-bed), to select motifs from motif collections (retrieve-matrix), and to extract orthologs based on Ensembl Compara (get-orthologs-compara). Three use cases illustrate the integration of new and refactored tools to the suite. This Anniversary update gives a 20-year perspective on the software suite. RSAT is well-documented and available through Web sites, SOAP/WSDL (Simple Object Access Protocol/Web Services Description Language) web services, virtual machines and stand-alone programs at http://www.rsat.eu/.

163 citations


Journal ArticleDOI
TL;DR: A review of state-of-the-art search methods for the IoT is presented, classifying them by design principle and search approach into IoT data-based and IoT object-based techniques.
Abstract: The Internet of Things (IoT) paradigm links physical objects in the real world to the cyber world and enables the creation of smart environments and applications. A physical object is the fundamental building block of the IoT, known as a Smart Device, that can monitor the environment. These devices can communicate with each other and have data processing abilities. When deployed, smart devices collect real-time data and publish the gathered data on the Web. The functionality of smart devices can be abstracted as a service, and an IoT application can be built by combining the smart devices with these services to help address the challenges of day-to-day activities. The IoT comprises billions of these intelligent communicating devices that generate enormous amounts of data, and hence performing analysis on this data is a significant task. Using search techniques, the size and extent of the data can be reduced and limited, so that an application can choose just the most important and valuable data items according to its needs. It is, however, a tedious task to effectively seek and select a proper device and/or its data among a large number of available devices for a specific application. Search techniques are fundamental to the IoT and pose various challenges, such as the large number of devices, dynamic availability, restrictions on resource utilization, real-time data in various types and formats, and past and historical monitoring. In the recent past, various methods and techniques have been developed by the research community to address these issues. In this paper, we present a review of the state-of-the-art search methods for the IoT, classifying them according to their design principle and search approach as IoT data-based and IoT object-based techniques. Under each classification, we describe the method adopted and its advantages and disadvantages. Finally, we identify and discuss key challenges and future research directions that will allow the next generation of search techniques to recognize and respond to user queries and satisfy the information needs of users.

145 citations


Journal ArticleDOI
TL;DR: The purpose of this paper is to develop a cloud broker architecture for cloud service selection by finding a pattern of the changing priorities of User Preferences (UPs), and it is shown that the method outperforms the Analytic Hierarchy Process (AHP).
Abstract: Due to the increasing number of cloud services, service selection has become a challenging decision for many organisations. It is even more complicated when cloud users change their preferences based on the requirements and the level of satisfaction with the experienced service. The purpose of this paper is to overcome this drawback and develop a cloud broker architecture for cloud service selection by finding a pattern in the changing priorities of User Preferences (UPs). To do that, a Markov chain is employed to find the pattern. The pattern is then connected to the Quality of Service (QoS) of the available services. A recently proposed Multi Criteria Decision Making (MCDM) method, the Best Worst Method (BWM), is used to rank the services. We show that the method outperforms the Analytic Hierarchy Process (AHP). The proposed methodology provides a prioritized list of the services based on the pattern of changing UPs. The methodology is validated through a case study using real QoS performance data of Amazon Elastic Compute Cloud (Amazon EC2) services.
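The broker's first step, as described, is to learn a pattern of changing UPs with a Markov chain. A minimal sketch of that step, assuming preferences are discretized into states (the state names and observed history below are invented for illustration):

```python
import numpy as np

# Hypothetical preference states: which QoS attribute the user currently
# weights highest, observed over several selection rounds.
states = ["cost", "availability", "reliability"]
observed = ["cost", "cost", "availability", "reliability",
            "availability", "cost", "availability"]  # toy history

# Maximum-likelihood estimate of the transition matrix from the sequence.
idx = {s: i for i, s in enumerate(states)}
counts = np.zeros((len(states), len(states)))
for a, b in zip(observed, observed[1:]):
    counts[idx[a], idx[b]] += 1
transition = counts / counts.sum(axis=1, keepdims=True)

# The stationary distribution approximates the long-run preference
# pattern, which would then feed the BWM ranking step.
eigvals, eigvecs = np.linalg.eig(transition.T)
stationary = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
stationary /= stationary.sum()
print(dict(zip(states, stationary.round(3))))
```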

120 citations


Journal ArticleDOI
TL;DR: The Ocean Gene Atlas allows users to query protein or nucleotide sequences against global ocean reference gene catalogs and provides quantitative and contextualized information on genes of interest in the global ocean ecosystem.
Abstract: The Ocean Gene Atlas is a web service to explore the biogeography of genes from marine planktonic organisms. It allows users to query protein or nucleotide sequences against global ocean reference gene catalogs. With just one click, the abundance and location of target sequences are visualized on world maps, as well as their taxonomic distribution. Interactive results panels allow for adjusting cutoffs for alignment quality and displaying the abundances of genes in the context of environmental features (temperature, nutrients, etc.) measured at the time of sampling. The ease of use enables non-bioinformaticians to explore quantitative and contextualized information on genes of interest in the global ocean ecosystem. Currently, the Ocean Gene Atlas is deployed with (i) the Ocean Microbial Reference Gene Catalog (OM-RGC) comprising 40 million non-redundant, mostly prokaryotic gene sequences associated with both Tara Oceans and Global Ocean Sampling (GOS) gene abundances and (ii) the Marine Atlas of Tara Ocean Unigenes (MATOU) composed of >116 million eukaryote unigenes. Additional datasets will be added upon availability of further marine environmental datasets that provide the required complement of sequence assemblies, raw reads and contextual environmental parameters. The Ocean Gene Atlas is a freely available web service at: http://tara-oceans.mio.osupytheas.fr/ocean-gene-atlas/.

114 citations


Journal ArticleDOI
TL;DR: The BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), the BioProject library, the BioSample library, and Science Wikis.
Abstract: The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn.

114 citations


Journal ArticleDOI
TL;DR: This study analytically and statistically categorizes and analyzes current research techniques on service composition in the IoT (published between 2012 and 2017) and presents a technical taxonomy based on the content of the existing studies selected with the SLR method.

112 citations


Journal ArticleDOI
TL;DR: A novel deep learning based hybrid approach for Web service recommendation by combining collaborative filtering and textual content is proposed, which can achieve better recommendation performance than several state-of-the-art methods.
Abstract: With the rapid development of service-oriented computing and cloud computing, an increasing number of Web services have been published on the Internet, which makes it difficult to select relevant Web services manually to satisfy complex user requirements. Many machine learning methods, especially matrix factorization based collaborative filtering models, have been widely employed in Web service recommendation. However, as a linear model of latent factors, matrix factorization struggles to capture the complex interactions between Web applications (or mashups) and their component services within an extremely sparse interaction matrix, which results in poor service recommendation performance. To address this problem, in this paper we propose a novel deep learning based hybrid approach for Web service recommendation that combines collaborative filtering and textual content. The invocation interactions between mashups and services, as well as their functionalities, are seamlessly integrated into a deep neural network, which can be used to characterize the complex relations between mashups and services. Experiments conducted on a real-world Web service dataset demonstrate that our approach achieves better recommendation performance than several state-of-the-art methods, which indicates the effectiveness of our proposed approach in service recommendation.
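The abstract leaves the network unspecified; one common way to realize such a hybrid is to concatenate ID embeddings (the collaborative signal) with a projection of textual features and score the pair with an MLP. A minimal PyTorch sketch under that assumption (all sizes and the bag-of-words text encoding are illustrative, not the authors' architecture):

```python
import torch
import torch.nn as nn

class HybridRecommender(nn.Module):
    def __init__(self, n_mashups, n_services, vocab_size, dim=32):
        super().__init__()
        self.mashup_emb = nn.Embedding(n_mashups, dim)    # collaborative signal
        self.service_emb = nn.Embedding(n_services, dim)
        self.text_proj = nn.Linear(vocab_size, dim)       # textual content
        self.mlp = nn.Sequential(
            nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, mashup_id, service_id, service_text_bow):
        x = torch.cat([
            self.mashup_emb(mashup_id),
            self.service_emb(service_id),
            self.text_proj(service_text_bow)], dim=-1)
        return torch.sigmoid(self.mlp(x)).squeeze(-1)     # invocation score

model = HybridRecommender(n_mashups=100, n_services=500, vocab_size=1000)
score = model(torch.tensor([3]), torch.tensor([42]), torch.rand(1, 1000))
print(score)  # probability-like score for one mashup-service pair
```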

108 citations


Journal ArticleDOI
TL;DR: A lightweight authorization stack for smart-home IoT applications is proposed, where a Cloud-connected device relays input commands to a user's smartphone for authorization; the approach is user-device centric and addresses security issues in the context of an untrusted Cloud platform.

106 citations


Journal ArticleDOI
TL;DR: A novel service workflow reconfiguration architecture is designed to provide guidance, ranging from monitoring to recommendations for project implementation; experiments are conducted to demonstrate the effectiveness and efficiency of the proposed method.

Journal ArticleDOI
TL;DR: A heterogeneous SMR network for movie recommendation is developed that exploits the textual description and movie-poster image of each movie, as well as user ratings and social relationships; it is evaluated on a large-scale dataset from a real-world SMR Web site.
Abstract: With the rapid development of the Internet movie industry, social-aware movie recommendation systems (SMRs) have become a popular online web service that provides relevant movie recommendations to users. In this effort, many existing movie recommendation approaches learn a user ranking model from user feedback with respect to the movie's content. Unfortunately, this approach suffers from the sparsity problem inherent in SMR data. In the present work, we address the sparsity problem by learning a multimodal network representation for ranking movie recommendations. We develop a heterogeneous SMR network for movie recommendation that exploits the textual description and movie-poster image of each movie, as well as user ratings and social relationships. With this multimodal data, we then present a heterogeneous information network learning framework called SMR-multimodal network representation learning (MNRL) for movie recommendation. To learn a ranking metric from the heterogeneous information network, we also developed a multimodal neural network model. We evaluated this model on a large-scale dataset from a real-world SMR Web site, and we find that SMR-MNRL achieves better performance than other state-of-the-art solutions to the problem.

Journal ArticleDOI
TL;DR: A detection model based on Deep Belief Networks (DBN) is presented, and it is shown that the detection model can achieve an approximately 90% true positive rate and a 0.6% false positive rate.
Abstract: Web service is one of the key communications software services for the Internet. Web phishing is one of many security threats to web services on the Internet. Web phishing aims to steal private information, such as usernames, passwords, and credit card details, by impersonating a legitimate entity. It leads to information disclosure and property damage. This paper focuses on applying a deep learning framework to detect phishing websites. The paper first designs two types of features for web phishing: original features and interaction features. A detection model based on Deep Belief Networks (DBN) is then presented. Testing on real IP flows from an ISP (Internet Service Provider) shows that the detection model based on DBN can achieve an approximately 90% true positive rate and a 0.6% false positive rate.
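Scikit-learn has no full DBN, but a rough stand-in for this kind of classifier is a pipeline of stacked RBM feature learners topped by a logistic layer; the binary URL flags below are hypothetical examples of "original" features, not the paper's feature set:

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
# Toy binary features per URL, e.g. has_ip, has_at_sign, long_url,
# many_subdomains, https_missing, ... (illustrative flags only).
X = rng.integers(0, 2, size=(200, 12)).astype(float)
y = rng.integers(0, 2, size=200)   # 1 = phishing, 0 = legitimate

dbn_like = Pipeline([
    ("rbm1", BernoulliRBM(n_components=16, learning_rate=0.05, n_iter=20)),
    ("rbm2", BernoulliRBM(n_components=8, learning_rate=0.05, n_iter=20)),
    ("clf", LogisticRegression(max_iter=500)),
])
dbn_like.fit(X, y)
print("train accuracy:", dbn_like.score(X, y))
```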

Journal ArticleDOI
TL;DR: A three-fold solution is proposed that includes: a trust establishment framework resilient to the collusion attacks that occur to mislead trust results; a bootstrapping mechanism that capitalizes on the endorsement concept in online social networks to assign initial trust values; and a trust-based hedonic coalitional game that enables services to distributively form trustworthy multi-cloud communities.
Abstract: The prominence of cloud computing has led to an unprecedented proliferation in the number of web services deployed in cloud data centers. In parallel, service communities have recently gained increasing interest due to their ability to facilitate discovery, composition, and resource scaling in large-scale services' markets. The problem is that traditional community formation models may work well when all services reside in a single cloud but cannot support a multi-cloud environment. In particular, these models overlook malicious services that misbehave to illegally maximize their benefits, a risk that arises from grouping together services owned by different providers. Besides, they rely on a centralized architecture whereby a central entity regulates the community formation, which contradicts the distributed nature of cloud-based services. In this paper, we propose a three-fold solution that includes: a trust establishment framework that is resilient to collusion attacks that occur to mislead trust results; a bootstrapping mechanism that capitalizes on the endorsement concept in online social networks to assign initial trust values; and a trust-based hedonic coalitional game that enables services to distributively form trustworthy multi-cloud communities. Experiments conducted on a real-life dataset demonstrate that our model minimizes the number of malicious services compared to three state-of-the-art cloud federation and service community models.
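For intuition about the hedonic-game step, here is a toy best-response loop under assumed random pairwise trust scores; the utility function and stability notion are our simplifications, not the paper's game:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8                                       # number of services
trust = rng.uniform(0, 1, (n, n))           # trust[i, j]: i's trust in j
np.fill_diagonal(trust, 0)
community = list(range(n))                  # start from singleton communities

def utility(s, c):
    """Average trust that community c's members place in service s."""
    members = [j for j in range(n) if community[j] == c and j != s]
    return float(np.mean([trust[j, s] for j in members])) if members else 0.0

for _ in range(100):                        # cap iterations; a sketch only
    moved = False
    for s in range(n):
        best = max(set(community), key=lambda c: utility(s, c))
        if utility(s, best) > utility(s, community[s]):
            community[s] = best             # greedy best-response move
            moved = True
    if not moved:                           # no service wants to deviate
        break
print("community assignment:", community)
```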

Book
29 Oct 2018
TL;DR: In this paper, the authors describe warehouse-scale computers (WSCs), the computing platforms that power cloud computing and all the great web services we use every day, and discuss how these new systems treat the datacenter itself as one massive computer designed at warehouse scale.
Abstract: This book describes warehouse-scale computers (WSCs), the computing platforms that power cloud computing and all the great web services we use every day. It discusses how these new systems treat the datacenter itself as one massive computer designed at warehouse scale, with hardware and software working in concert to deliver good levels of internet service performance. The book details the architecture of WSCs and covers the main factors influencing their design, operation, and cost structure, and the characteristics of their software base. Each chapter contains multiple real-world examples, including detailed case studies and previously unpublished details of the infrastructure used to power Google's online services. Targeted at the architects and programmers of today's WSCs, this book provides a great foundation for those looking to innovate in this fascinating and important area, but the material will also be broadly interesting to those who just want to understand the infrastructure powering the internet. The third edition reflects four years of advancements since the previous edition and nearly doubles the number of pictures and figures. New topics range from additional workloads like video streaming, machine learning, and public cloud to specialized silicon accelerators, storage and network building blocks, and a revised discussion of data center power and cooling, and uptime. Further discussions of emerging trends and opportunities ensure that this revised edition will remain an essential resource for educators and professionals working on the next generation of WSCs.

Journal ArticleDOI
01 Dec 2018
TL;DR: The experimental results indicate that CSA-WSC, compared to the genetic search skyline network (GS-S-Net) and the genetic particle swarm optimization algorithm (GAPSO-WSC), reduces cost by 7% and response time by 6%, two major factors in improving the quality of service.
Abstract: In recent years, service-based applications have been deemed one of the new solutions to build enterprise application systems. In order to quickly meet the most demanding needs or adapt to changed services, service composition is currently used to exploit multi-service capabilities in Information Technology organizations. Since web services, which have been independently developed, may not always be compatible with each other, the selection of optimal services and the composition of these services are seen as a challenging issue. In this paper, we present a cuckoo search algorithm for the web service composition problem, called 'CSA-WSC', which provides web service composition to improve the quality of service (QoS) in a distributed cloud environment. The experimental results indicate that CSA-WSC, compared to the genetic search skyline network (GS-S-Net) and the genetic particle swarm optimization algorithm (GAPSO-WSC), reduces cost by 7% and response time by 6%, two major factors in improving the quality of service. It also increases provider availability by up to 7.25% and reliability by 5.5%, two important QoS criteria.
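A generic, minimal cuckoo-search sketch for QoS-aware composition (not the paper's CSA-WSC implementation; the cost and response-time matrices, the step rule, and the abandonment fraction are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n_tasks, n_candidates, n_nests = 5, 10, 15
cost = rng.uniform(1, 10, (n_tasks, n_candidates))     # hypothetical QoS data
rtime = rng.uniform(0.1, 2.0, (n_tasks, n_candidates))

def fitness(sol):
    """Lower is better: summed cost plus response time of chosen services."""
    rows = np.arange(n_tasks)
    return cost[rows, sol].sum() + rtime[rows, sol].sum()

# Each nest is a composition: one candidate service index per task.
nests = rng.integers(0, n_candidates, (n_nests, n_tasks))
for _ in range(200):
    for i in range(n_nests):
        new = nests[i].copy()               # Levy-flight-like local step:
        k = rng.integers(0, n_tasks)        # re-pick the service of one task
        new[k] = rng.integers(0, n_candidates)
        if fitness(new) < fitness(nests[i]):
            nests[i] = new
    # Abandon a fraction of the worst nests (discovery probability pa).
    worst = np.argsort([fitness(s) for s in nests])[-3:]
    nests[worst] = rng.integers(0, n_candidates, (3, n_tasks))

best = min(nests, key=fitness)
print("best composition:", best, "fitness:", round(fitness(best), 2))
```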

Journal ArticleDOI
TL;DR: An original method for usability analysis based on users' preferences is presented; the method is compared with other website usability methods and verified on exemplary websites.
Abstract: This article closely examines users' preferences in the author's method of assessing the usability of websites. In particular, it takes a closer look at how users evaluate websites. It sets rules for the accuracy of users' preferences on the basis of the scoring method. For the considered problem of assessing website usability, decision support methods, logs, and user preferences based on the scoring method were used. It should be noted that websites and user preferences change over time and usually diverge, during the design stage, from the pages already available on the network. Website aging forces companies to conduct new studies on website usability. This article presents an original method for usability analysis based on users' preferences. The proposed method is compared with other website usability methods and verified on exemplary websites.

Journal ArticleDOI
TL;DR: A support vector machine (SVM) based collaborative filtering (CF) service recommendation approach, namely SVMCF4SR, is proposed, which achieves comparatively higher recommendation efficiency and quality.

Journal ArticleDOI
TL;DR: Ms2lda.org is a web application that allows users to upload their data, run MS2LDA analyses and explore the results through interactive visualizations, and the user can also decompose a data set onto predefined Mass2Motifs.
Abstract: Motivation: We recently published MS2LDA, a method for the decomposition of sets of molecular fragment data derived from large metabolomics experiments. To make the method more widely available to the community, here we present ms2lda.org, a web application that allows users to upload their data, run MS2LDA analyses and explore the results through interactive visualizations. Results: Ms2lda.org takes tandem mass spectrometry data in many standard formats and allows the user to infer the sets of fragment and neutral loss features that co-occur together (Mass2Motifs). As an alternative workflow, the user can also decompose a data set onto predefined Mass2Motifs. This is accomplished through the web interface or programmatically from our web service. Availability and implementation: The website can be found at http://ms2lda.org, while the source code is available at https://github.com/sdrogers/ms2ldaviz under the MIT license. Supplementary information: Supplementary data are available at Bioinformatics online.

Journal ArticleDOI
TL;DR: A hybrid service discovery approach is developed by integrating goal-based matching with two practical approaches, keyword-based and topic-model-based matching; experiments show the effectiveness of this approach on a real-world dataset.

Journal ArticleDOI
TL;DR: A low-cost unmanned surveillance system is developed, consisting of remote measuring stations and a monitoring center, with the video cameras, water level analyzers, and wireless communication routers necessary to display real-time water level measurements of rivers and reservoirs on a Web platform.
Abstract: Traditional surveillance systems for observing water levels are often complex, costly, and time-consuming. In this paper, we developed a low-cost unmanned surveillance system consisting of remote measuring stations and a monitoring center. The system uses a map-based Web service, as well as video cameras, water level analyzers, and wireless communication routers necessary to display real-time water level measurements of rivers and reservoirs on a Web platform. With the aid of a wireless communication router, the water level information is transmitted to a server connected to the Internet via a cellular network. By combining complex water level information of different river basins, the proposed system can be used to forecast and prevent flood disasters. In order to evaluate the proposed system, we conduct experiments using three feasible methods, including the difference method, dictionary learning, and deep learning. The experimental results show that the deep learning-based method performs best in terms of accuracy and stability.
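Of the three methods evaluated, the difference method is the simplest; a toy sketch of the idea (synthetic arrays stand in for the reference and live camera frames, and the threshold is arbitrary):

```python
import numpy as np

# Reference frame of an empty gauge vs. a live frame where water fills
# the bottom rows; real inputs would be grayscale camera frames.
reference = np.zeros((100, 60))
frame = np.zeros((100, 60))
frame[70:, :] = 0.8                        # synthetic water region

row_diff = np.abs(frame - reference).mean(axis=1)   # per-row change
water_row = int(np.argmax(row_diff > 0.3))          # first changed row
print("water surface at pixel row:", water_row)     # -> 70
```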

Journal ArticleDOI
TL;DR: The present paper provides an update on PUG-REST, a Representational State Transfer-like web service interface to PubChem, which includes access to new kinds of data, full implementation of synchronous fast structure search, and implementation of dynamic traffic control through throttling.
Abstract: PubChem (https://pubchem.ncbi.nlm.nih.gov) is one of the largest open chemical information resources available. It currently receives millions of unique users per month on average, serving as a key resource for many research fields such as cheminformatics, chemical biology, medicinal chemistry, and drug discovery. PubChem provides multiple programmatic access routes to its data and services. One of them is PUG-REST, a Representational State Transfer (REST)-like web service interface to PubChem. On average, PUG-REST receives more than a million requests per day from tens of thousands of unique users. The present paper provides an update on PUG-REST since our previous paper published in 2015. This includes access to new kinds of data (e.g. concise bioactivity data, table of contents headings, etc.), full implementation of synchronous fast structure search, support for assay data retrieval using accession identifiers in response to the deprecation of NCBI's GI numbers, data exchange between PUG-REST and NCBI's E-Utilities through the List Gateway, implementation of dynamic traffic control through throttling, and enhanced usage policies. In addition, example Perl scripts are provided, which the user can easily modify, run, or translate into another scripting language.
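The paper ships example Perl scripts; as a language-neutral illustration, the same kind of request can be made from Python using PUG-REST's documented URL pattern (the compound and property list here are just examples):

```python
import requests

# PUG-REST pattern: /compound/name/<name>/property/<props>/JSON
url = ("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/"
       "aspirin/property/MolecularFormula,MolecularWeight/JSON")
resp = requests.get(url, timeout=30)
resp.raise_for_status()
props = resp.json()["PropertyTable"]["Properties"][0]
print(props)  # e.g. {'CID': 2244, 'MolecularFormula': 'C9H8O4', ...}

# Note: PUG-REST applies dynamic traffic throttling, so keep request
# rates modest and respect the stated usage policies.
```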

Journal ArticleDOI
TL;DR: A novel MapReduce-based evolutionary algorithm with guided mutation is proposed that leads to efficient composition of big services with better performance and execution time; the proposed method is observed to outperform other methods.

Proceedings ArticleDOI
27 May 2018
TL;DR: A metamorphic testing approach for the automated detection of faults in RESTful Web APIs (henceforth also referred to as simply Web APIs) is presented, and the concept of metamorphic relation output patterns is introduced.
Abstract: Web Application Programming Interfaces (APIs) specify how to access services and data over the network, typically using Web services. Web APIs are rapidly proliferating as a key element to foster reusability, integration, and innovation, enabling new consumption models such as mobile or smart TV apps. Companies such as Facebook, Twitter, Google, eBay or Netflix receive billions of API calls every day from thousands of different third-party applications and devices, which constitutes more than half of their total traffic. As Web APIs are progressively becoming the cornerstone of software integration, their validation is getting more critical. In this context, the fast detection of bugs is of utmost importance to increase the quality of internal products and third-party applications. However, testing Web APIs is challenging mainly due to the difficulty to assess whether the output of an API call is correct, i.e., the oracle problem. For instance, consider the Web API of the popular music streaming service Spotify. Suppose a search for albums with the query "redhouse" returning 21 total matches: Is this output correct? Do all the albums in the result set contain the keyword? Are there any albums containing the keyword not included in the result set? Answering these questions is difficult, even with small result sets, and often infeasible when the results are counted by thousands or millions. Metamorphic testing alleviates the oracle problem by providing an alternative when the expected output of a test execution is complex or unknown. Rather than checking the output of an individual program execution, metamorphic testing checks whether multiple executions of the program under test fulfil certain necessary properties called metamorphic relations. For instance, consider the following metamorphic relation in Spotify: two searches for albums with the same query should return the same number of total results regardless of the size of pagination. Suppose that a new Spotify search is performed using the exact same query as before and increasing the maximum number of results per page from 20 (default value) to 50: This search returns 27 total albums (6 more matches than in the previous search), which reveals a bug. This is an example of a real and reproducible fault detected using the approach presented in this paper and reported to Spotify. According to Spotify developers, it was a regression fault caused by a fix with undesired side effects. In this paper [1], we present a metamorphic testing approach for the automated detection of faults in RESTful Web APIs (henceforth also referred to as simply Web APIs). We introduce the concept of metamorphic relation output patterns. A Metamorphic Relation Output Pattern (MROP) defines an abstract output relation typically identified in Web APIs, regardless of their application domain. Each MROP is defined in terms of set operations among test outputs such as equality, union, subset, or intersection. MROPs provide a helpful guide for the identification of metamorphic relations, broadening the scope of our work beyond a particular Web API. Based on the notion of MROP, a methodology is proposed for the application of the approach to any Web API following the REST architectural pattern. The approach was evaluated in several steps. First, we used the proposed methodology to identify 33 metamorphic relations in four Web APIs developed by undergraduate students. All the relations are instances of the proposed MROPs. 
Then, we assessed the effectiveness of the identified relations at revealing 317 automatically seeded faults (i.e., mutants) in the APIs under test. As a result, 302 seeded faults were detected, achieving a mutation score of 95.3%. Second, we evaluated the approach using real Web APIs and faults. In particular, we identified 20 metamorphic relations in the Web API of Spotify and 40 metamorphic relations in the Web API of YouTube. Each metamorphic relation was implemented and automatically executed using both random and manual test data. In total, 469K metamorphic tests were generated. As a result, 21 metamorphic relations were violated, and 11 issues revealed and reported (3 issues in Spotify and 8 issues in YouTube). To date, 10 of the reported issues have been either confirmed by the API developers or reproduced by other users supporting the effectiveness of our approach.
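The Spotify pagination relation described above translates almost directly into an automated check. A minimal sketch against a Spotify-like search endpoint (the bearer token is a placeholder, and the endpoint details are assumptions to adapt to the API under test):

```python
import requests

TOKEN = "..."  # OAuth bearer token (placeholder)

def total_albums(query, limit):
    """Number of total matches reported for an album search."""
    resp = requests.get(
        "https://api.spotify.com/v1/search",
        params={"q": query, "type": "album", "limit": limit},
        headers={"Authorization": f"Bearer {TOKEN}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["albums"]["total"]

# Metamorphic relation: the total number of matches must not depend on
# the page size used to retrieve them.
t_default = total_albums("redhouse", 20)
t_large = total_albums("redhouse", 50)
assert t_default == t_large, f"MR violated: {t_default} != {t_large}"
```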

Proceedings ArticleDOI
31 Oct 2018
TL;DR: This paper presents a large scale, longitudinal study of data center network reliability based on operational data collected from the production network infrastructure at Facebook, one of the largest web service providers in the world.
Abstract: The ability to tolerate, remediate, and recover from network incidents (caused by device failures and fiber cuts, for example) is critical for building and operating highly-available web services. Achieving fault tolerance and failure preparedness requires system architects, software developers, and site operators to have a deep understanding of network reliability at scale, along with its implications on the software systems that run in data centers. Unfortunately, little has been reported on the reliability characteristics of large scale data center network infrastructure, let alone its impact on the availability of services powered by software running on that network infrastructure. This paper fills the gap by presenting a large scale, longitudinal study of data center network reliability based on operational data collected from the production network infrastructure at Facebook, one of the largest web service providers in the world. Our study covers reliability characteristics of both intra and inter data center networks. For intra data center networks, we study seven years of operation data comprising thousands of network incidents across two different data center network designs, a cluster network design and a state-of-the-art fabric network design. For inter data center networks, we study eighteen months of recent repair tickets from the field to understand reliability of Wide Area Network (WAN) backbones. In contrast to prior work, we study the effects of network reliability on software systems, and how these reliability characteristics evolve over time. We discuss the implications of network reliability on the design, implementation, and operation of large scale data center systems and how it affects highly-available web services. We hope our study forms a foundation for understanding the reliability of large scale network infrastructure, and inspires new reliability solutions to network incidents.

Journal ArticleDOI
TL;DR: A decentralized Web service discovery approach based on two complementary mechanisms, trust and domain-specific expertise, gives an RMSE value lower than other trust-aware recommender systems such as TidalTrust, MoleTrust and TrustWalker.

Journal ArticleDOI
TL;DR: A solid starting ground and comprehensive overview of this area is presented to help readers quickly understand state-of-the-art technologies and research progress; desirable properties of fault tolerance and scalability are analyzed to illuminate the design principles of distributed systems.
Abstract: Data centers are widely used for big data analytics, which often involves data-parallel jobs, including queries and web services. Meanwhile, cluster frameworks are rapidly being developed for data-intensive applications in data center networks (DCNs). To improve the performance of these frameworks, much effort has been devoted to improving scheduling strategies and resource allocation algorithms. With the deployment of geo-distributed data centers and data-intensive applications, optimization in DCNs has regained pervasive attention in both industry and academia. Many solutions, such as coflow-aware scheduling and speculative execution, have been proposed to meet various requirements. Therefore, we present a solid starting ground and comprehensive overview of this area to help readers quickly understand state-of-the-art technologies and research progress. We observe that algorithms in cluster frameworks are implemented with different guidelines and can be classified according to scheduling granularity, controller management, and prior-knowledge requirements. In addition, mechanisms for conquering crucial challenges in DCNs are discussed, including providing low latency and minimizing job completion time. Moreover, we analyze desirable properties of fault tolerance and scalability to illuminate the design principles of distributed systems. We hope that this paper will shed light on this promising land and serve as a guide for further research.

Journal ArticleDOI
TL;DR: This paper proposes a novel way of dynamically reconstructing objective service profiles based on mashup descriptions, which carry historical information about how services are used in mashups, and proposes rules for dominant-word discovery, employing them to further refine the algorithm.
Abstract: Web services are self-contained software components that support business process automation over the Internet, and mashup is a popular technique that creates value-added service compositions to fulfill complicated business requirements. For mashup developers, looking for desired component services from a sea of service candidates is often challenging. Therefore, web service recommendation has become a highly demanded technique. Traditional approaches, however, mostly rely on static and potentially subjectively described texts offered by service providers. In this paper, we propose a novel way of dynamically reconstructing objective service profiles based on mashup descriptions, which carry historical information about how services are used in mashups. Our key idea is to leverage mashup descriptions and structures to discover important word features of services and bridge the vocabulary gap between mashup developers and service providers. Specifically, we jointly model mashup descriptions and component services using an author topic model in order to reconstruct service profiles. Exploiting word features derived from the reconstructed service profiles, a new service recommendation algorithm is developed. Experiments over a real-world data set from ProgrammableWeb.com demonstrate that our proposed service recommendation algorithm is effective and outperforms the state-of-the-art methods. Note to Practitioners: Service recommendation accuracy for mashup creation is often limited due to the poor quality of service descriptions. Mashup descriptions contain valuable information about the functions and features of their component services, which can be leveraged to enhance the descriptive quality of original service profiles. Based on this assumption, this paper proposes a novel two-phase service recommendation framework to facilitate mashup creation. Specifically, our approach reconstructs service profiles by extracting appropriate words from historical mashup descriptions. Then, a novel service recommendation algorithm is developed by exploiting popularity and relevance measures hidden in the reconstructed profiles. Moreover, we propose rules for dominant-word discovery and employ them to further refine our algorithm.
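The author-topic-model step can be sketched with gensim's AuthorTopicModel, treating each component service as an "author" of the mashup descriptions it appears in; the descriptions and the service-to-mashup mapping below are toy placeholders, not ProgrammableWeb data:

```python
from gensim.corpora import Dictionary
from gensim.models import AuthorTopicModel

# Toy mashup descriptions (tokenized) and which mashups each service
# participates in; services play the role of authors.
mashup_docs = [
    "map photo sharing with geotagged images".split(),
    "weather map overlay for travel planning".split(),
    "photo slideshow with music streaming".split(),
]
service2mashups = {
    "GoogleMaps": [0, 1],
    "Flickr": [0, 2],
    "LastFM": [2],
}

dictionary = Dictionary(mashup_docs)
corpus = [dictionary.doc2bow(doc) for doc in mashup_docs]
model = AuthorTopicModel(corpus=corpus, num_topics=2, id2word=dictionary,
                         author2doc=service2mashups, passes=10)

# A service's reconstructed "profile" is its topic distribution, from
# which dominant words can be read off the topic-word tables.
print(model.get_author_topics("GoogleMaps"))
```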

Journal ArticleDOI
TL;DR: An independent RESTful web service named DNIARS is presented in a layered approach to detect NoSQL injection attacks in web applications; it compares patterns generated from the NoSQL statement structure in the static code state and the dynamic state, and responds to the web application with the possibility of a NoSQL injection.
Abstract: Despite extensive research on using web services for security purposes, finding a comprehensive solution to NoSQL injection attacks remains a major challenge. This paper presents an independent RESTful web service in a layered approach to detect NoSQL injection attacks in web applications. The proposed method is named DNIARS. DNIARS depends on comparing the patterns generated from the NoSQL statement structure in the static code state and the dynamic state. Accordingly, DNIARS can respond to the web application with the possibility of a NoSQL injection attack. The proposed DNIARS was implemented in plain PHP code and can be considered an independent framework able to respond to different request formats such as JSON and XML. To evaluate its performance, DNIARS was tested using the most common testing tools for RESTful web services. According to the results, DNIARS can work in real environments, with an error rate that did not exceed 1%.
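As we read the abstract, the core comparison is between the structural "pattern" of a query in static code and the one observed at runtime. A minimal sketch of that idea for a MongoDB-style query (the pattern encoding and the example queries are our assumptions, not DNIARS's implementation):

```python
def pattern(query):
    """Abstract a query to its nested key/operator shape; literals are
    replaced by a placeholder so only the structure is compared."""
    if isinstance(query, dict):
        return {k: pattern(v) for k, v in sorted(query.items())}
    if isinstance(query, list):
        return [pattern(v) for v in query]
    return "<value>"

# Pattern derived from the static code: plain equality on two fields.
static_query = {"username": "<input>", "password": "<input>"}
# Pattern observed at runtime: an injected $ne operator bypasses the check.
runtime_query = {"username": "admin", "password": {"$ne": ""}}

if pattern(static_query) != pattern(runtime_query):
    print("possible NoSQL injection detected")
```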