
Showing papers on "Web service published in 2017"


Proceedings ArticleDOI
14 Jun 2017
TL;DR: The motivation, design and formal semantics of WebAssembly are described, some preliminary experience with implementations is provided, and it is explained how WebAssembly is an abstraction over modern hardware, making it language-, hardware-, and platform-independent, with use cases beyond just the Web.
Abstract: The maturation of the Web platform has given rise to sophisticated and demanding Web applications such as interactive 3D visualization, audio and video software, and games. With that, efficiency and security of code on the Web has become more important than ever. Yet JavaScript as the only built-in language of the Web is not well-equipped to meet these requirements, especially as a compilation target. Engineers from the four major browser vendors have risen to the challenge and collaboratively designed a portable low-level bytecode called WebAssembly. It offers compact representation, efficient validation and compilation, and safe low to no-overhead execution. Rather than committing to a specific programming model, WebAssembly is an abstraction over modern hardware, making it language-, hardware-, and platform-independent, with use cases beyond just the Web. WebAssembly has been designed with a formal semantics from the start. We describe the motivation, design and formal semantics of WebAssembly and provide some preliminary experience with implementations.

388 citations


Journal ArticleDOI
TL;DR: Five multilingual web services for speech science, operational since 2012, are described, and the benefits and drawbacks of the new paradigm, as well as experiences with user acceptance and implementation problems, are discussed.

272 citations


Journal ArticleDOI
TL;DR: The Pathview Web server is developed to make pathway visualization and data integration accessible to all scientists, including those without special computing skills or resources, and presents a comprehensive workflow for both regular and integrated pathway analysis of multiple omics data.
Abstract: Pathway analysis is widely used in omics studies. Pathway-based data integration and visualization is a critical component of the analysis. To address this need, we recently developed a novel R package called Pathview. Pathview maps, integrates and renders a large variety of biological data onto molecular pathway graphs. Here we developed the Pathview Web server to make pathway visualization and data integration accessible to all scientists, including those without special computing skills or resources. Pathview Web features an intuitive graphical web interface and a user-centered design. The server not only expands the core functions of Pathview, but also provides many useful features not available in the offline R package. Importantly, the server presents a comprehensive workflow for both regular and integrated pathway analysis of multiple omics data. In addition, the server provides a RESTful API for programmatic access and convenient integration into third-party software or workflows. Pathview Web is openly and freely accessible at https://pathview.uncc.edu/.

272 citations


Journal ArticleDOI
TL;DR: An update is presented describing the latest enhancements to the Job Dispatcher APIs, as well as the governance underpinning them; programmatic access to these tools is increasingly important as more high-throughput data is generated.
Abstract: Since 2009, EMBL-EBI has provided free and unrestricted access to several bioinformatics tools via the user's browser as well as programmatically via Web Services APIs. Programmatic access to these tools, which is fundamental to bioinformatics, is increasingly important as more high-throughput data is generated, e.g. from proteomics and metagenomic experiments. Access is available using both the SOAP and RESTful approaches, and their usage is reviewed regularly in order to ensure that the best, supported tools are available to all users. We present here an update describing the latest enhancements to the Job Dispatcher APIs, as well as the governance underpinning them.

265 citations


Journal ArticleDOI
TL;DR: The development of ePlant is described and several examples illustrating its integrative features for hypothesis generation are presented, including the process of deploying ePlant as an “app” on Araport.
Abstract: A big challenge in current systems biology research arises when different types of data must be accessed from separate sources and visualized using separate tools. The high cognitive load required to navigate such a workflow is detrimental to hypothesis generation. Accordingly, there is a need for a robust research platform that incorporates all data and provides integrated search, analysis, and visualization features through a single portal. Here, we present ePlant (http://bar.utoronto.ca/eplant), a visual analytic tool for exploring multiple levels of Arabidopsis thaliana data through a zoomable user interface. ePlant connects to several publicly available web services to download genome, proteome, interactome, transcriptome, and 3D molecular structure data for one or more genes or gene products of interest. Data are displayed with a set of visualization tools that are presented using a conceptual hierarchy from big to small, and many of the tools combine information from more than one data type. We describe the development of ePlant in this article and present several examples illustrating its integrative features for hypothesis generation. We also describe the process of deploying ePlant as an “app” on Araport. Building on readily available web services, the code for ePlant is freely available for research on any other biological species.

247 citations


Journal ArticleDOI
Ruhi Sarikaya
TL;DR: An overview of personal digital assistants (PDAs) is given; the system architecture, key components, and technology behind them are described; and their future potential to fully redefine human-computer interaction is discussed.
Abstract: We have long envisioned that one day computers will understand natural language and anticipate what we need, when and where we need it, and proactively complete tasks on our behalf. As computers get smaller and more pervasive, how humans interact with them is becoming a crucial issue. Despite numerous attempts over the past 30 years to make language understanding (LU) an effective and robust natural user interface for computer interaction, success has been limited and scoped to applications that were not particularly central to everyday use. However, speech recognition and machine learning have continued to be refined, and structured data served by applications and content providers has emerged. These advances, along with increased computational power, have broadened the application of natural LU to a wide spectrum of everyday tasks that are central to a user's productivity. We believe that as computers become smaller and more ubiquitous [e.g., wearables and Internet of Things (IoT)], and the number of applications increases, both system-initiated and user-initiated task completion across various applications and web services will become indispensable for personal life management and work productivity. In this article, we give an overview of personal digital assistants (PDAs); describe the system architecture, key components, and technology behind them; and discuss their future potential to fully redefine human-computer interaction.

180 citations


Journal ArticleDOI
TL;DR: The results of this study confirm that new meta-heuristic algorithms have not yet been applied to solving QoS-aware web service composition; future research directions in this area are also described.
Abstract: Web service composition concerns the building of new value-added services by integrating sets of existing web services. Due to the seamless proliferation of web services, it becomes difficult to find a suitable web service that satisfies the requirements of users during web service composition. This paper systematically reviews existing research on QoS-aware web service composition using computational intelligence techniques (published between 2005 and 2015). This paper develops a classification of research approaches on computational intelligence based QoS-aware web service composition and describes future research directions in this area. In particular, the results of this study confirm that new meta-heuristic algorithms have not yet been applied to solving QoS-aware web service composition.

168 citations


Journal ArticleDOI
TL;DR: This paper develops a novel multi-cloud IoT service composition algorithm called E2C2 that aims at creating an energy-aware composition plan by searching for and integrating the least possible number of IoT services, in order to fulfil user requirements.

162 citations


Journal ArticleDOI
01 Jan 2017-Database
TL;DR: A workflow to automatically generate, test and deploy API clients for rapid response to API changes is developed, and an R client to the Broad Institute’s RESTful Firehose Pipeline is provided as a working example, built by means of the presented workflow.
Abstract: With its Firebrowse service (http://firebrowse.org/) the Broad Institute is making large-scale multi-platform omics data analysis results publicly available through a Representational State Transfer (REST) Application Programmable Interface (API). Querying this database through an API client from an arbitrary programming environment is an essential task, allowing other developers and researchers to focus on their analysis and avoid data wrangling. Hence, as a first result, we developed a workflow to automatically generate, test and deploy such clients for rapid response to API changes. Its underlying infrastructure, a combination of free and publicly available web services, facilitates the development of API clients. It decouples changes in server software from the client software by reacting to changes in the RESTful service and removing direct dependencies on a specific implementation of an API. As a second result, FirebrowseR, an R client to the Broad Institute’s RESTful Firehose Pipeline, is provided as a working example, built by means of the presented workflow. The package’s features are demonstrated by an example analysis of cancer gene expression data. Database URL: https://github.com/mariodeng/
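As a rough illustration of what such a generated REST client does under the hood, the sketch below composes query URLs from an endpoint name and keyword filters. The base path, endpoint, and parameter names here are invented for illustration and do not reflect the real Firebrowse API:

```python
from urllib.parse import urlencode

def build_query_url(base, endpoint, **params):
    """Compose a RESTful query URL from an endpoint name and keyword filters."""
    query = urlencode(sorted(params.items()))  # sort for a stable, cache-friendly URL
    return f"{base}/{endpoint}?{query}" if query else f"{base}/{endpoint}"

# Hypothetical endpoint and parameters, for illustration only:
url = build_query_url(
    "http://firebrowse.org/api/v1",
    "Samples/mRNASeq",
    gene="TP53",
    cohort="BRCA",
    format="json",
)
```

A generated client would derive such functions automatically from the service's API description, which is what makes regeneration on API change cheap.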

129 citations


Posted Content
TL;DR: Two novel statistical techniques for automatically detecting anomalies in cloud infrastructure data are developed; they employ statistical learning to detect anomalies in both application and system metrics.
Abstract: Performance and high availability have become increasingly important drivers of user retention in the context of web services such as social networks and web search. Exogenic and/or endogenic factors often give rise to anomalies, making it very challenging to maintain high availability while also delivering high performance. Given that service-oriented architectures (SOA) typically have a large number of services, with each service having a large set of metrics, automatic detection of anomalies is non-trivial. Although there exists a large body of prior research in anomaly detection, existing techniques are not applicable in the context of social network data, owing to the inherent seasonal and trend components in the time series data. To this end, we developed two novel statistical techniques for automatically detecting anomalies in cloud infrastructure data. Specifically, the techniques employ statistical learning to detect anomalies in both application and system metrics. Seasonal decomposition is employed to filter the trend and seasonal components of the time series, followed by the use of robust statistical metrics -- median and median absolute deviation (MAD) -- to accurately detect anomalies, even in the presence of seasonal spikes. We demonstrate the efficacy of the proposed techniques from three different perspectives, viz., capacity planning, user behavior, and supervised learning. In particular, we used production data for evaluation, and we report Precision, Recall, and F-measure in each case.
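The robust-statistics step described above (median plus MAD in place of mean plus standard deviation) can be sketched in a few lines. This is a minimal illustration only: the seasonal-decomposition stage the paper applies first is omitted, so the detector here runs directly on the raw series:

```python
import statistics

def mad_anomalies(series, threshold=3.5):
    """Flag points whose modified z-score, computed from the median and the
    median absolute deviation (MAD), exceeds the threshold. Median/MAD are
    robust: a single huge spike barely shifts them, unlike mean/stddev."""
    med = statistics.median(series)
    mad = statistics.median(abs(x - med) for x in series)
    if mad == 0:  # constant (or near-constant) series: nothing to flag
        return []
    # 0.6745 makes the MAD consistent with the standard deviation
    # under a normal distribution (the usual modified z-score scaling).
    return [i for i, x in enumerate(series)
            if abs(0.6745 * (x - med) / mad) > threshold]
```

On a detrended, deseasonalized residual series, the same function detects anomalies without being fooled by seasonal spikes, which is the point the abstract makes.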

129 citations


Journal ArticleDOI
TL;DR: This paper proposes an application prototype for precision farming using a wireless sensor network with an IoT cloud, which provides platforms that allow the creation of web services suitable for objects integrated on the Internet.

Journal ArticleDOI
TL;DR: A mobile service provisioning architecture named mobile service sharing community is proposed, along with a service composition approach utilizing the Krill-Herd algorithm, which can obtain superior solutions compared with current standard composition methods in mobile environments.
Abstract: The advances in mobile technologies enable mobile devices to perform tasks that are traditionally run by personal computers, as well as provide services to others. Mobile users can form a service sharing community within an area by using their mobile devices. This paper highlights several challenges involved in building such service compositions in mobile communities when both service requesters and providers are mobile. To address these challenges, we first propose a mobile service provisioning architecture named a mobile service sharing community and then propose a service composition approach utilizing the Krill-Herd algorithm. To evaluate the effectiveness and efficiency of our approach, we build a simulation tool. The experimental results demonstrate that our approach can obtain superior solutions as compared with current standard composition methods in mobile environments. It can yield near-optimal solutions and has a nearly linear complexity with respect to problem size.

Journal ArticleDOI
TL;DR: A novel privacy-preserving and scalable service recommendation approach based on SimHash is proposed in this paper and validated through a set of experiments deployed on a real distributed service quality dataset, WS-DREAM.
Abstract: With the increasing volume of web services in the cloud environment, Collaborative Filtering- (CF-) based service recommendation has become one of the most effective techniques to alleviate the heavy burden on the service selection decisions of a target user. However, the service recommendation bases, that is, historical service usage data, are often distributed in different cloud platforms. Two challenges are present in such a cross-cloud service recommendation scenario. First, a cloud platform is often not willing to share its data with other cloud platforms due to privacy concerns, which decreases the feasibility of cross-cloud service recommendation severely. Second, the historical service usage data recorded in each cloud platform may update over time, which reduces the recommendation scalability significantly. In view of these two challenges, a novel privacy-preserving and scalable service recommendation approach based on SimHash is proposed in this paper. Finally, through a set of experiments deployed on a real distributed service quality dataset, WS-DREAM, we validate the feasibility of our proposal in terms of recommendation accuracy and efficiency while guaranteeing privacy preservation.
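SimHash is the privacy-preserving ingredient here: each platform can publish short fingerprints of its usage profiles instead of the raw data, and similar profiles yield fingerprints with small Hamming distance. The sketch below is a generic SimHash implementation, not the authors' code; the choice of MD5 and 64 bits is an assumption:

```python
import hashlib

def simhash(tokens, bits=64):
    """Compute a SimHash fingerprint of a token set. Each token votes +1/-1
    on every bit position via its hash; the sign of the tally fixes the bit.
    Similar token sets therefore produce nearby fingerprints."""
    weights = [0] * bits
    for tok in tokens:
        h = int(hashlib.md5(tok.encode()).hexdigest(), 16) & ((1 << bits) - 1)
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i, w in enumerate(weights) if w > 0)

def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")
```

Because the tally is a sum over tokens, the fingerprint is independent of token order, and only the compact integer ever needs to leave a platform.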

Journal ArticleDOI
TL;DR: The idea is that several detection methods are combined and executed in parallel during an optimization process to find a consensus regarding the identification of web service antipatterns using a cooperative parallel evolutionary algorithm (P-EA).
Abstract: Service Oriented Architecture (SOA) is widely used in industry and is regarded as one of the preferred architectural design technologies. As with any other software system, service-based systems (SBSs) may suffer from poor design, i.e., antipatterns, for many reasons such as poorly planned changes, time pressure or bad design choices. Consequently, this may lead to an SBS product that is difficult to evolve and that exhibits poor quality of service (QoS). Detecting web service antipatterns is a manual, time-consuming and error-prone process for software developers. In this paper, we propose an automated approach for detection of web service antipatterns using a cooperative parallel evolutionary algorithm (P-EA). The idea is that several detection methods are combined and executed in parallel during an optimization process to find a consensus regarding the identification of web service antipatterns. We report the results of an empirical study using eight types of common web service antipatterns. We compare the implementation of our cooperative P-EA approach with random search, two single population-based approaches and one state-of-the-art detection technique not based on heuristic search. Statistical analysis of the obtained results demonstrates that our approach is efficient in antipattern detection, with a precision score of 89 percent and a recall score of 93 percent.

Journal ArticleDOI
TL;DR: The updated web user interfaces, together with the RESTful web services and the backend relational database that support them, are outlined; the connectivity of the wwPDB/RDF data has been enhanced by incorporating various external data resources.
Abstract: The Protein Data Bank Japan (PDBj, http://pdbj.org), a member of the worldwide Protein Data Bank (wwPDB), accepts and processes the deposited data of experimentally determined macromolecular structures. While maintaining the archive in collaboration with other wwPDB partners, PDBj also provides a wide range of services and tools for analyzing structures and functions of proteins. We herein outline the updated web user interfaces together with the RESTful web services and the backend relational database that support them. To enhance the interoperability of the PDB data, we have previously developed PDB/RDF, PDB data in the Resource Description Framework (RDF) format, which is now a wwPDB standard called wwPDB/RDF. We have enhanced the connectivity of the wwPDB/RDF data by incorporating various external data resources. Services for searching, comparing and analyzing the ever-increasing large structures determined by hybrid methods are also described.

Journal ArticleDOI
06 Mar 2017
TL;DR: This survey reviews the existing literature on the methods used by web services to track users online, as well as their purposes, implications, and possible user defenses, and presents five main groups of methods used for user tracking.
Abstract: Privacy seems to be the Achilles’ heel of today’s web. Most web services make continuous efforts to track their users and to obtain as much personal information as they can from the things they search, the sites they visit, the people they contact, and the products they buy. This information is mostly used for commercial purposes, which go far beyond targeted advertising. Although many users are already aware of the privacy risks involved in the use of internet services, the particular methods and technologies used for tracking them are much less known. In this survey, we review the existing literature on the methods used by web services to track users online, as well as their purposes, implications, and possible user defenses. We present five main groups of methods used for user tracking, which are based on sessions, client storage, client cache, fingerprinting, and other approaches. A special focus is placed on mechanisms that use web caches, operational caches, and fingerprinting, as they are usually very rich in terms of using various creative methodologies. We also show how users can be identified on the web and associated with their real names, e-mail addresses, phone numbers, or even street addresses. We show why tracking is being used and its possible implications for the users. For each of the tracking methods, we present possible defenses. Some of them are specific to a particular tracking approach, while others are more universal (blocking more than one threat). Finally, we present the future trends in user tracking and show that they can potentially pose significant threats to the users’ privacy.

Journal ArticleDOI
TL;DR: The results show clearly that MCDM ranking methods are more likely to diverge in selecting the best elements as the number of web services increases, which argues for the pertinence of computing the Borda solution in the context of MCDM-based web service selection.
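The Borda solution mentioned here is a classical rank-aggregation rule: when several MCDM methods disagree, each position in each ranking awards points, and the point totals define a consensus order. A minimal sketch (the service names are purely illustrative):

```python
from collections import defaultdict

def borda(rankings):
    """Aggregate several rankings (best first) into a Borda consensus:
    in a ranking of n items, position 0 earns n-1 points, position 1
    earns n-2, and so on. Ties in total score break alphabetically."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for pos, item in enumerate(ranking):
            scores[item] += n - 1 - pos
    return sorted(scores, key=lambda s: (-scores[s], s))

# Three MCDM methods disagree on the ordering of three candidate services:
consensus = borda([
    ["s1", "s2", "s3"],
    ["s2", "s1", "s3"],
    ["s1", "s3", "s2"],
])
```

Here s1 collects 5 points, s2 collects 3, and s3 collects 1, so the consensus ranks s1 first even though the individual methods diverge.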

Journal ArticleDOI
TL;DR: The Dockstore is a project that brings together Docker images with standardized, machine-readable ways of describing and running the tools contained within the container that greatly improves the sharing and reuse of genomics tools and promotes interoperability with similar projects through emerging web service standards developed by the Global Alliance for Genomics and Health (GA4GH).
Abstract: As genomic datasets continue to grow, the feasibility of downloading data to a local organization and running analysis on a traditional compute environment is becoming increasingly problematic. Current large-scale projects, such as the ICGC PanCancer Analysis of Whole Genomes (PCAWG), the Data Platform for the U.S. Precision Medicine Initiative, and the NIH Big Data to Knowledge Center for Translational Genomics, are using cloud-based infrastructure to both host and perform analysis across large data sets. In PCAWG, over 5,800 whole human genomes were aligned and variant called across 14 cloud and HPC environments; the processed data was then made available on the cloud for further analysis and sharing. If run locally, an operation at this scale would have monopolized a typical academic data centre for many months, and would have presented major challenges for data storage and distribution. However, this scale is increasingly typical for genomics projects and necessitates a rethink of how analytical tools are packaged and moved to the data. For PCAWG, we embraced the use of highly portable Docker images for encapsulating and sharing complex alignment and variant calling workflows across highly variable environments. While successful, this endeavor revealed a limitation in Docker containers, namely the lack of a standardized way to describe and execute the tools encapsulated inside the container. As a result, we created the Dockstore ( https://dockstore.org), a project that brings together Docker images with standardized, machine-readable ways of describing and running the tools contained within. This service greatly improves the sharing and reuse of genomics tools and promotes interoperability with similar projects through emerging web service standards developed by the Global Alliance for Genomics and Health (GA4GH).

Journal ArticleDOI
TL;DR: This work clusters users and calculates their reputations from the clustering information via a beta reputation system; it then identifies a set of similar services by clustering the services, and makes predictions for active users by combining the QoS data of trustworthy similar users and similar services.
Abstract: With the rapid development of service-oriented computing, cloud computing and big data, a large number of functionally equivalent web services are available on the Internet. Quality of Service (QoS) becomes a differentiating point of services to attract customers. Since the QoS of services varies widely among users due to the unpredicted network, physical location and other objective factors, many Collaborative Filtering based approaches are recently proposed to predict the unknown QoS by employing the historical user-contributed QoS data. However, most existing approaches ignore the data credibility problem and are thus vulnerable to the unreliable QoS data contributed by dishonest users. To address this problem, we propose a trust-aware approach TAP for reliable personalized QoS prediction. Firstly, we cluster the users and calculate the reputation of users based on the clustering information by a beta reputation system. Secondly, a set of trustworthy similar users is identified according to the calculated user reputation and similarity. Finally, we identify a set of similar services by clustering the services and make predictions for active users by combining the QoS data of the trustworthy similar users and similar services. Comprehensive real-world experiments are conducted to demonstrate the effectiveness and robustness of our approach compared with other state-of-the-art approaches.
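The beta reputation system in the first step has a standard closed form: with p positive and n negative feedback observations, the reputation is the mean of the posterior Beta(p+1, n+1) distribution. The sketch below is a generic illustration, not TAP's exact formulation; the feedback counts and the 0.7 threshold are assumptions:

```python
def beta_reputation(positive, negative):
    """Beta reputation score: the expected value of Beta(p+1, n+1),
    i.e. the posterior probability that a user's reports are honest.
    With no evidence at all, the score is a neutral 0.5."""
    return (positive + 1) / (positive + negative + 2)

def trustworthy(users, threshold=0.7):
    """Keep users whose feedback history clears the reputation threshold.
    `users` maps user id -> (positive, negative) feedback counts."""
    return {u for u, (p, n) in users.items() if beta_reputation(p, n) >= threshold}
```

Filtering similar users through such a score before the CF step is what shields the prediction from QoS data contributed by dishonest users.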

Proceedings ArticleDOI
TL;DR: A full proof-of-concept implementation of ML predictive analytics, and deployment of the resultant web service, that accurately predicts and prevents SQL injection attacks (SQLIA), with empirical evaluations presented as a Confusion Matrix (CM) and Receiver Operating Characteristic (ROC) curve.

Journal ArticleDOI
TL;DR: AMF is inspired from the widely-used collaborative filtering techniques in recommender systems, but significantly extends the conventional matrix factorization model with new techniques of data transformation, online learning, and adaptive weights to enable optimal runtime service adaptation.
Abstract: Cloud applications built on service-oriented architectures generally integrate a number of component services to fulfill certain application logic. The changing cloud environment highlights the need for these applications to keep resilient against QoS variations of their component services so that end-to-end quality-of-service (QoS) can be guaranteed. Runtime service adaptation is a key technique to achieve this goal. To support timely and accurate adaptation decisions, effective and efficient QoS prediction is needed to obtain real-time QoS information of component services. However, current research has focused mostly on QoS prediction of working services that are being used by a cloud application, but little on predicting QoS values of candidate services that are equally important in determining optimal adaptation actions. In this paper, we propose an adaptive matrix factorization (namely AMF) approach to perform online QoS prediction for candidate services. AMF is inspired from the widely-used collaborative filtering techniques in recommender systems, but significantly extends the conventional matrix factorization model with new techniques of data transformation, online learning, and adaptive weights. Comprehensive experiments, as well as a case study, have been conducted based on a real-world QoS dataset of Web services (with over 40 million QoS records). The evaluation results demonstrate AMF’s superiority in achieving accuracy, efficiency, and robustness, which are essential to enable optimal runtime service adaptation.
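AMF extends conventional matrix factorization with data transformation, online learning, and adaptive weights; the sketch below shows only the conventional online-SGD core that such an approach builds on, with hypothetical QoS triples and hyperparameters. It is an illustration of the underlying technique, not the authors' AMF algorithm:

```python
import random

def online_mf(ratings, n_users, n_items, k=4, lr=0.05, reg=0.05, epochs=1000):
    """Plain matrix factorization trained by online SGD: each observed
    (user, service, qos) triple updates the two latent vectors immediately,
    which is what lets new QoS records be folded in at runtime."""
    rng = random.Random(0)  # fixed seed for reproducibility
    U = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, q in ratings:
            pred = sum(U[u][f] * V[i][f] for f in range(k))
            err = q - pred
            for f in range(k):  # L2-regularized gradient step on both factors
                uf, vf = U[u][f], V[i][f]
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V
```

After training, the dot product of a user vector and a service vector predicts the unobserved QoS value, including for candidate services the application is not currently invoking.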

Proceedings ArticleDOI
11 Dec 2017
TL;DR: Swayam is a fully distributed autoscaling framework that exploits characteristics of production ML inference workloads to deliver on the dual challenge of resource efficiency and SLA compliance.
Abstract: Developers use Machine Learning (ML) platforms to train ML models and then deploy these ML models as web services for inference (prediction). A key challenge for platform providers is to guarantee response-time Service Level Agreements (SLAs) for inference workloads while maximizing resource efficiency. Swayam is a fully distributed autoscaling framework that exploits characteristics of production ML inference workloads to deliver on the dual challenge of resource efficiency and SLA compliance. Our key contributions are (1) model-based autoscaling that takes into account SLAs and ML inference workload characteristics, (2) a distributed protocol that uses partial load information and prediction at frontends to provision new service instances, and (3) a backend self-decommissioning protocol for service instances. We evaluate Swayam on 15 popular services that were hosted on a production ML-as-a-service platform, for the following service-specific SLAs: for each service, at least 99% of requests must complete within the response-time threshold. Compared to a clairvoyant autoscaler that always satisfies the SLAs (i.e., even if there is a burst in the request rates), Swayam decreases resource utilization by up to 27%, while meeting the service-specific SLAs over 96% of the time during a three hour window. Microsoft Azure's Swayam-based framework was deployed in 2016 and has hosted over 100,000 services.
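Swayam's model-based autoscaling is far richer than this, but the basic provisioning arithmetic, offered load divided by a target per-instance utilization, can be illustrated as follows. The utilization target is an assumed parameter, not a value from the paper:

```python
import math

def instances_needed(request_rate, mean_service_time, target_util=0.6):
    """Back-of-the-envelope provisioning: offered load (in Erlangs, i.e.
    request rate x mean service time) divided by the utilization each
    instance may run at, rounded up. Headroom below 100% utilization is
    what keeps queueing delay, and thus response-time SLAs, in check."""
    offered_load = request_rate * mean_service_time
    return max(1, math.ceil(offered_load / target_util))
```

A real SLA-aware autoscaler such as the one described would replace the fixed utilization target with a model of response-time percentiles under the measured ML-inference workload.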

Journal ArticleDOI
TL;DR: This paper proposes a ratio-based method to compute similarity, and a new method to predict an unknown value by comparing the values of a similar service and the current service that are invoked by common users.
Abstract: Recently, collaborative filtering-based methods are widely used for service recommendation. QoS attribute value-based collaborative filtering service recommendation mainly includes two important steps. One is the similarity computation, and the other is the prediction of a QoS attribute value that the user has not experienced. In previous studies, the performance of some methods needs to be improved. In this paper, we propose a ratio-based method to calculate the similarity. We can get the similarity between users or between items by comparing the attribute values directly. Based on our similarity computation method, we propose a new method to predict the unknown value. By comparing the values of a similar service and the current service that are invoked by common users, we can obtain the final prediction result. The performance of the proposed method is evaluated through a large data set of real web services. Experimental results show that our method obtains better prediction precision, lower mean absolute error (MAE) and faster computation time than the various reference schemes considered.
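The abstract does not spell out the ratio formula, so the sketch below is one plausible reading of "comparing the attribute values directly": average the min/max ratio of each QoS pair over co-invoked services, assuming strictly positive values such as response times. It is a hypothetical illustration, not the paper's method:

```python
def ratio_similarity(qos_a, qos_b):
    """Similarity from direct value ratios over co-invoked services:
    for each service both users observed, min/max of the two QoS values
    is 1.0 for identical observations and shrinks as they diverge.
    Assumes strictly positive QoS values (e.g. response times)."""
    common = set(qos_a) & set(qos_b)
    if not common:
        return 0.0  # no co-invoked services, nothing to compare
    return sum(min(qos_a[s], qos_b[s]) / max(qos_a[s], qos_b[s])
               for s in common) / len(common)
```

Unlike Pearson correlation, such a ratio needs no mean-centering and stays meaningful even when only one or two co-invoked services exist, which may be why a direct comparison is attractive here.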


Journal ArticleDOI
TL;DR: The analysis of results indicates that the ifcJSON4 schema developed in this paper is a valid JSON schema that can guide the creation of valid ifcJSON documents to be used for web-based data transfer and to improve the interoperability of Cloud-based BIM applications.

Proceedings ArticleDOI
06 Apr 2017
TL;DR: It is argued that a microservice approach to building IoT systems can combine in a mutually reinforcing way with patterns for microservices, API gateways, distribution of services, uniform service discovery, containers, and access control.
Abstract: The Internet of Things (IoT) has connected an incredible diversity of devices in novel ways, which has enabled exciting new services and opportunities. Unfortunately, IoT systems also present several important challenges to developers. This paper proposes a vision for how we may build IoT systems in the future by reconceiving IoT's fundamental unit of construction not as a "thing", but rather as a widely and finely distributed "microservice" already familiar to web service engineering circles. Since IoT systems are quite different from more established uses of microservice architectures, success of the approach depends on adaptations that enable them to meet the key challenges that IoT systems present. We argue that a microservice approach to building IoT systems can combine in a mutually reinforcing way with patterns for microservices, API gateways, distribution of services, uniform service discovery, containers, and access control. The approach is illustrated using two case studies of IoT systems in personal health management and connected autonomous vehicles. Our hope is that the vision of a microservices approach will help focus research that can fill in current gaps preventing more effective, interoperable, and secure IoT services and solutions in a wide variety of contexts.

Journal ArticleDOI
Shuiguang Deng, Hongyue Wu, Wei Tan, Zhengzhe Xiang, Zhaohui Wu
TL;DR: This paper formally models the problem of mobile service selection for composition in terms of energy consumption, constructs energy-consumption computation models, and adopts a genetic algorithm to resolve the problem.
Abstract: Due to the limited battery capacity of mobile devices, how to select cloud services to invoke in order to reduce energy consumption in mobile environments is becoming a critical issue. This paper addresses the problem of mobile service selection for composition in terms of energy consumption. It formally models the problem and constructs energy consumption computation models. Energy consumption aggregation rules for composite services with different structures are presented, and a genetic algorithm is adopted to resolve the problem. A replanning mechanism is also proposed to deal with changing conditions and user behavior. A series of experiments are conducted to evaluate the performance of our method. The results show that our service selection method significantly outperforms traditional methods. Even when conditions or user behavior change, the method remains effective at recommending services. Moreover, the service selection method exhibits good scalability as the experimental scale increases.
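As a rough illustration of the genetic-algorithm step, the sketch below evolves a service-selection plan that minimizes the summed energy of a sequential composition. The candidate pool, energy values, and GA parameters (population size, mutation rate) are illustrative assumptions, not the paper's models.

```python
import random

random.seed(7)

# Hypothetical candidate pool: 5 abstract tasks, each with 4 concrete
# services whose invocation energies (joules) differ. For a sequential
# composition, total energy aggregates as the sum over selected services.
ENERGY = [[random.uniform(1.0, 5.0) for _ in range(4)] for _ in range(5)]

def fitness(plan):
    # Lower total energy -> higher fitness.
    return -sum(ENERGY[task][svc] for task, svc in enumerate(plan))

def evolve(pop_size=20, generations=40, mutation_rate=0.1):
    pop = [[random.randrange(4) for _ in range(5)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, 5)          # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < mutation_rate:   # point mutation
                child[random.randrange(5)] = random.randrange(4)
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best_plan = evolve()
```

The paper's replanning mechanism would amount to re-running `evolve` with an updated `ENERGY` table whenever conditions or user behavior change; for parallel or conditional composition structures, `fitness` would use the corresponding aggregation rule instead of a plain sum.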

Journal ArticleDOI
TL;DR: The semantic Web service discovery method based on the proposed similarity measure outperforms existing state-of-the-art discovery methods in terms of precision, recall and F-measure.
Abstract: Proposing a new similarity measure integrating multiple conceptual relationships. Utilizing is-a, has-a and antonymy conceptual relationships. Weighted combination of interface similarity and description similarity. Comparing with state-of-the-art similarity-based Web service discovery methods. The process of Web service discovery identifies the most relevant services to requesters' service queries. We propose a new measure of semantic similarity integrating multiple conceptual relationships (SIMCR) for Web service discovery. The new measure enables more accurate service-request comparison by treating different conceptual relationships in ontologies, such as is-a, has-a and antonymy, differently. Each service or request is represented by vectors of terms (or words) that characterize both the interface signature and the textual description. The overall semantic similarity is computed as a weighted aggregation of interface similarity and description similarity. The experimental results confirm the effectiveness of the proposed semantic similarity measure. As demonstrated in this study, the semantic Web service discovery method based on the proposed similarity measure outperforms existing state-of-the-art discovery methods in terms of precision, recall and F-measure. The proposed semantic similarity measure has wider applications, such as improving document classification or clustering, and more accurately representing and applying knowledge in expert and intelligent systems.

Proceedings ArticleDOI
01 Dec 2017
TL;DR: In this article, a resource-aware placement scheme is proposed to boost the system performance in a heterogeneous cluster of Docker containers, where the heterogeneity lies in the fact that different nodes in the cluster may have various configurations, concerning resource types and availabilities, etc., and the demands generated by services are varied.
Abstract: Virtualization is a promising technology that has helped cloud computing become the next wave of the Internet revolution. Adopted by data centers, millions of applications powered by various virtual machines improve the quality of services. Although virtual machines are well isolated from each other, they suffer from redundant boot volumes and slow provisioning times. To address these limitations, containers were introduced to deploy and run distributed applications without launching entire virtual machines. As a dominant player, Docker is an open-source implementation of container technology. When managing a cluster of Docker containers, the management tool Swarmkit does not take the heterogeneity of physical nodes and virtualized containers into consideration. The heterogeneity lies in the fact that different nodes in the cluster may have various configurations concerning resource types, availabilities, etc., and that the demands generated by services vary, ranging from CPU-intensive (e.g., clustering services) to memory-intensive (e.g., Web services). In this paper, we investigate the Docker container cluster and develop DRAPS, a resource-aware placement scheme that boosts system performance in a heterogeneous cluster.
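In the spirit of DRAPS, the sketch below scores feasible nodes by the slack on a service's dominant resource, so memory-intensive services land on memory-rich nodes. The node specs and the scoring rule are illustrative assumptions, not the paper's algorithm.

```python
# Resource-aware placement in a heterogeneous cluster: match a service's
# dominant demand (CPU- vs memory-intensive) to per-node availability.

nodes = {
    "node-a": {"cpu": 8.0, "mem": 4.0},    # CPU-rich configuration
    "node-b": {"cpu": 2.0, "mem": 16.0},   # memory-rich configuration
}

def place(service_demand, nodes):
    """Return the feasible node with most slack on the dominant resource."""
    dominant = max(service_demand, key=service_demand.get)
    best, best_score = None, float("-inf")
    for name, avail in nodes.items():
        # Reject nodes that cannot fit the demand on every resource.
        if any(avail[r] < service_demand[r] for r in service_demand):
            continue
        score = avail[dominant] - service_demand[dominant]
        if score > best_score:
            best, best_score = name, score
    return best

# A memory-intensive web service lands on the memory-rich node.
chosen = place({"cpu": 1.0, "mem": 8.0}, nodes)
```

A default Swarmkit-style spread scheduler would ignore which resource type the service stresses; the point of a resource-aware scheme is exactly this coupling of service demand profile to node configuration.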

Proceedings ArticleDOI
01 Jun 2017
TL;DR: An augmented LDA model is proposed which leverages the high-quality word vectors obtained by Word2vec to improve the performance of Web services clustering, achieving an average improvement of 5.3% in clustering accuracy across various metrics.
Abstract: Due to the rapid growth in both the number and diversity of Web services on the web, it is becoming increasingly difficult to find desired and appropriate Web services. Clustering Web services according to their functionalities is an efficient way to facilitate both Web service discovery and service management. Existing methods for Web service clustering mostly focus on directly utilizing key features from WSDL documents, e.g., input/output parameters and keywords from description text. The probabilistic topic model Latent Dirichlet Allocation (LDA), which extracts latent topic features of WSDL documents to represent Web services, has also been adopted to improve the accuracy of Web service clustering. However, the power of the basic LDA model for clustering is limited to some extent, and auxiliary features can be exploited to enhance it. Since the word vectors obtained by Word2vec are of higher quality than those obtained by the LDA model, we propose in this paper an augmented LDA model (named WE-LDA) which leverages these high-quality word vectors to improve the performance of Web service clustering. In WE-LDA, the word vectors obtained by Word2vec are grouped into word clusters by the K-means++ algorithm, and these word clusters are incorporated to semi-supervise the LDA training process, which elicits better distributed representations of Web services. A comprehensive experiment is conducted to validate the performance of the proposed method on a ground-truth dataset crawled from ProgrammableWeb. Compared with the state of the art, our approach achieves an average improvement of 5.3% in clustering accuracy across various metrics.
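The word-clustering step of WE-LDA can be sketched as k-means over word vectors. For determinism, this sketch seeds with a greedy farthest-point variant of K-means++ rather than the randomized original; the toy 2-D vectors stand in for real Word2vec output, and the downstream semi-supervised LDA training is omitted.

```python
# Toy "word vectors": two obvious groups (web-protocol vs. music terms).
word_vecs = {
    "http": [0.90, 0.10], "rest": [0.80, 0.20], "soap": [0.85, 0.15],
    "song": [0.10, 0.90], "album": [0.20, 0.80], "music": [0.15, 0.85],
}

def dist2(a, b):
    # Squared Euclidean distance between two vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def seed_centers(vecs, k):
    # Greedy farthest-point seeding: a deterministic stand-in for the
    # randomized K-means++ seeding the paper uses.
    centers = [vecs[0]]
    while len(centers) < k:
        centers.append(max(vecs, key=lambda v: min(dist2(v, c) for c in centers)))
    return centers

def kmeans(vecs, k, iters=10):
    centers = seed_centers(vecs, k)
    for _ in range(iters):                        # Lloyd iterations
        groups = [[] for _ in range(k)]
        for v in vecs:
            groups[min(range(k), key=lambda i: dist2(v, centers[i]))].append(v)
        centers = [[sum(dim) / len(g) for dim in zip(*g)] if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers

centers = kmeans(list(word_vecs.values()), k=2)
cluster_of = {w: min(range(2), key=lambda i: dist2(v, centers[i]))
              for w, v in word_vecs.items()}
```

In WE-LDA these word clusters would then act as soft constraints during LDA training, nudging topically related words toward the same topic; that semi-supervision step is beyond this sketch.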