
Showing papers presented at "Color Imaging Conference in 2015"


Proceedings ArticleDOI
27 Oct 2015
TL;DR: This paper proposes a novel and efficient solution to privacy-preserving outsourced distributed clustering (PPODC) for multiple users based on the k-means clustering algorithm.
Abstract: Many techniques for privacy-preserving data mining (PPDM) have been investigated over the past decade. Such techniques, however, usually incur heavy computational and communication cost on the participating parties and thus entities with limited resources may have to refrain from participating in the PPDM process. To address this issue, one promising solution is to outsource the tasks to the cloud environment. In this paper, we propose a novel and efficient solution to privacy-preserving outsourced distributed clustering (PPODC) for multiple users based on the k-means clustering algorithm. The main novelty of our solution lies in avoiding the secure division operations required in computing cluster centers through efficient transformation techniques. In addition, we discuss two strategies, namely offline computation and pipelined execution that aim to boost the performance of our protocol. We implement our protocol on a cluster of 16 nodes and demonstrate how our two strategies combined with parallelism can significantly improve the performance of our protocol through extensive experiments using a real dataset.
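The division-free comparison of cluster centers that the abstract alludes to can be illustrated in the clear (the function name and the exact transformation below are our illustrative assumptions, not necessarily the paper's construction): keep each cluster as a (coordinate-sum, count) pair and cross-multiply scaled squared distances instead of ever dividing.

```python
def closest_cluster(x, clusters):
    """clusters: list of (S, c) pairs, where S is the coordinate-sum vector of
    a cluster and c its point count, so the center is S/c. Returns the index of
    the nearest center without dividing: dist_i^2 = |c_i*x - S_i|^2 / c_i^2, so
    comparing dist_i^2 vs dist_j^2 is the same as comparing
    c_j^2 * |c_i*x - S_i|^2 vs c_i^2 * |c_j*x - S_j|^2."""
    best = 0
    for j in range(1, len(clusters)):
        Si, ci = clusters[best]
        Sj, cj = clusters[j]
        di = sum((ci * xk - sk) ** 2 for xk, sk in zip(x, Si))
        dj = sum((cj * xk - sk) ** 2 for xk, sk in zip(x, Sj))
        if cj * cj * di > ci * ci * dj:  # current best is farther than cluster j
            best = j
    return best
```

With integer inputs the comparison stays in exact integer arithmetic, which is also what makes it friendly to secure multiparty protocols.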

40 citations


Proceedings ArticleDOI
27 Oct 2015
TL;DR: This paper presents a vision of a future IoT system architecture that is driven by service discovery across every layer of IoT, including on demand discovery and integration of devices, cloud storage and computing resources, as well as existing data analysis, visualisation and application integration services that can be dynamically selected and orchestrated as needed to create IoT applications.
Abstract: The Internet of Things (IoT) ecosystem is growing at a staggering pace. Each day, we are witnessing the emergence of new devices, smart phones, cameras and sensors that are connected to the internet. It is envisioned that IoT will discover, integrate and exploit such devices and their data in the development of new services and products that can change and positively impact our lives. However, the core IoT functionality (such as discovery and integration) required to develop IoT services and products needs to be developed to better support IoT application development. In this paper, we present a vision of a future IoT system architecture that is driven by service discovery across every layer of IoT. This includes on-demand discovery and integration of devices, cloud storage and computing resources, as well as existing data analysis, visualisation and application integration services that can be dynamically selected and orchestrated as needed to create IoT applications. We provide descriptions of specific solutions that we are investigating at each IoT layer, providing core functionalities for service-based discovery and integration.

26 citations


Proceedings ArticleDOI
27 Oct 2015
TL;DR: This work proposes an access control model for customers who use AWS platform as their infrastructure platform to securely share cyber attack information, which enables secure cyber information sharing and collaborations in public cloud environment on a community basis.
Abstract: A public cloud provides enterprises and organizations with a secure and efficient environment to deploy their systems. While organizations and companies benefit from moving to cloud platform, it is likely that similar cyber attacks will happen to organizations which share the same cloud platform. One way to mitigate this risk is to share cyber security information among these organizations. Unfortunately, popular public cloud platform AWS is lacking an accepted access control model for cyber security information sharing. We propose an access control model for customers who use AWS platform as their infrastructure platform to securely share cyber attack information. Our model enables secure cyber information sharing and collaborations in public cloud environment on a community basis.

24 citations


Proceedings ArticleDOI
27 Oct 2015
TL;DR: This paper first formalizes Attribute Based Access Control (ABAC) and proposes a new access control model, called Attribute-Rule ABAC (AR-ABAC), for cloud computing to meet critical access control requirements in clouds.
Abstract: One of the most important challenges that have threatened cloud computing and caused its slow adoption is security. Since clouds have diverse groups of users with different sets of security requirements, restricting the users' accesses and protecting information from unauthorized accesses have become the most difficult tasks. To address these critical challenges, in this paper we first formalize Attribute Based Access Control (ABAC) and propose a new access control model, called Attribute-Rule ABAC (AR-ABAC), for cloud computing to meet critical access control requirements in clouds. Our model supports attribute-rules that deal with the association between users and objects, as well as the capability for accessing objects based on their sensitivity levels. The attribute-rules specify an agreement that determines what kind of attributes should be used and the number of attributes considered for making access decisions. In addition, our model ensures secure resource sharing among potentially untrusted tenants and supports different access permissions for the same user within the same session.

21 citations


Proceedings ArticleDOI
27 Oct 2015
TL;DR: A Genetic Algorithm with improved crossover and mutation operator is proposed for QoS-aware service composition which allows users to select the optimized composition solution according to their preference and experiment results show that this algorithm can improve the solution optimality and accelerate convergence speed.
Abstract: Cloud computing, as a widely used computing platform, can provide a number of services for customers in a pay-as-you-go fashion. To meet the growing and increasingly complex needs of users, services of different independent cloud providers should be composed to deliver uniform Quality of Service (QoS) for a single request. An open and valid question is how to select services as a partner chain and optimize service compositions in order to satisfy both functional and non-functional requirements across multiple cloud services. This is an NP-hard problem that involves trade-offs among various QoS criteria. In this paper, a service composition model is presented for the geo-distributed Multi-Cloud environment. Furthermore, a Genetic Algorithm (GA) with improved crossover and mutation operators is proposed for QoS-aware service composition, which allows users to select the optimized composition solution according to their preference. Experimental results show that this algorithm can improve solution optimality and accelerate convergence.
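The chromosome encoding typically used in GA-based service composition can be sketched as follows (gene i is the index of the candidate service chosen for abstract task i; the operators below are textbook single-point crossover and uniform mutation, not the paper's improved variants):

```python
import random

def crossover(parent_a, parent_b, point=None):
    """Single-point crossover over two composition chromosomes."""
    if point is None:
        point = random.randrange(1, len(parent_a))
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])

def mutate(chromosome, candidates_per_task, rate=0.1):
    """With probability `rate` per gene, replace the gene with another
    candidate-service index valid for that task."""
    return [random.randrange(candidates_per_task[i]) if random.random() < rate else g
            for i, g in enumerate(chromosome)]
```

A fitness function would then aggregate the QoS attributes of the selected services and the GA would keep the users' preferred trade-off.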

21 citations


Proceedings ArticleDOI
27 Oct 2015
TL;DR: An extended deep learning approach that incorporates instance selection and bootstrapping techniques for imbalanced data classification is proposed that shows the effectiveness of the framework in classifying 54 TRECVID concepts with different imbalanced levels by comparing with other state-of-the-art methods.
Abstract: In this paper, we propose an extended deep learning approach that incorporates instance selection and bootstrapping techniques for imbalanced data classification. In supervised learning, classification performance often deteriorates when the training set is imbalanced, i.e., when at least one of the classes has substantially fewer instances than the others. We propose to use the adaptive synthetic sampling approach (ADASYN) to generate synthetic instances for the minority class. A data pruning process based on multiple correspondence analysis (MCA) is then performed to identify a subset of synthetic instances that are most suitable to supplement the existing minority instances. This results in a relatively more balanced training dataset which is then bootstrapped and fed into convolutional neural networks (CNNs) for classification. Furthermore, we propose to use low-level features pre-processed by principal component analysis (PCA), instead of the commonly used raw signal data, as the input to CNNs to reduce the computational time. The experimental results show the effectiveness of our framework in classifying 54 TRECVID concepts with different imbalance levels, compared with other state-of-the-art methods.
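A much-simplified version of the ADASYN idea (density-weighted synthetic oversampling of the minority class) might look like the following sketch; the helper is illustrative only and omits the paper's MCA-based pruning step:

```python
import random

def adasyn_sketch(minority, majority, n_new, k=3):
    """Simplified ADASYN: minority points whose k nearest neighbours contain
    more majority samples get more synthetic offspring, each created by linear
    interpolation towards a random minority point."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    # Fraction of majority samples among each minority point's k neighbours.
    ratios = []
    for p in minority:
        neigh = sorted(minority + majority, key=lambda q: d2(p, q))[1:k + 1]
        ratios.append(sum(q in majority for q in neigh) / k)
    total = sum(ratios) or 1.0
    synthetic = []
    for p, r in zip(minority, ratios):
        for _ in range(round(n_new * r / total)):
            q = random.choice(minority)
            lam = random.random()
            synthetic.append([a + lam * (b - a) for a, b in zip(p, q)])
    return synthetic
```

Every synthetic point lies on a segment between two minority points, so the sketch never fabricates samples outside the minority region.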

14 citations


Proceedings ArticleDOI
27 Oct 2015
TL;DR: The proposed batch authentication protocol meets most requirements, such as tag information privacy, forward/backward security, and resistance against replay, tracking, and DoS attacks, and the run time of the whole authentication process in the scheme is decreased by at least 20% compared with other existing schemes.
Abstract: Tag authentication is an essential issue in RFID systems, which are widely applied in many areas. Compared to per-tag authentication, batch-mode authentication has better performance for complex applications. However, many existing batch authentication protocols suffer from security and privacy threats, low efficiency, or high communication and computation cost. In order to solve these problems, we propose a new RFID batch authentication protocol. In this protocol, tags are grouped and each tag in one group shares the same group key. The connection between the group key and the tag's own key is fully utilized to construct our batch authentication protocol. Meanwhile, based on the proposed batch authentication protocol, we propose a group tag ownership transfer protocol which also supports tag authorisation recovery. Compared with previous schemes, our scheme achieves stronger security and higher efficiency. On the side of security and privacy, our scheme meets most requirements, such as tag information privacy, forward/backward security, and resistance against replay, tracking, and DoS attacks. We then carry out theoretical analysis and simulation experiments, both of which indicate that our scheme is more efficient than other authentication schemes. In particular, the simulation results show that the run time of the whole authentication process in our scheme is decreased by at least 20% compared with existing schemes.
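One plausible way to link a group key to per-tag keys, in the spirit of the protocol described, is to derive each tag key from the group key; the construction below is our illustrative assumption, not the paper's actual scheme:

```python
import hashlib
import hmac

def derive_tag_key(group_key: bytes, tag_id: bytes) -> bytes:
    """Hypothetical linkage: each tag's key is derived from its group key,
    so a reader holding one group key can verify a whole batch of tags."""
    return hmac.new(group_key, tag_id, hashlib.sha256).digest()

def authenticate_batch(group_key, responses, challenge):
    """responses: {tag_id: MAC}. A tag passes if its MAC equals
    HMAC(derived tag key, challenge). Returns the list of accepted tag ids."""
    ok = []
    for tag_id, mac in responses.items():
        expected = hmac.new(derive_tag_key(group_key, tag_id),
                            challenge, hashlib.sha256).digest()
        if hmac.compare_digest(mac, expected):
            ok.append(tag_id)
    return ok
```

A fresh challenge per round is what gives replay resistance in this kind of design; `hmac.compare_digest` avoids timing side channels in the comparison.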

12 citations


Proceedings ArticleDOI
27 Oct 2015
TL;DR: This paper adopts an evolutionary computation technique called Particle Swarm Optimization (PSO) to derive approximate answers to the problem of finding scheduling solutions that minimize the workflow makespan under the constraint of the user's budget.
Abstract: Nowadays, many scientific workflows are deployed in the cloud, and how to schedule the tasks according to users' QoS (Quality of Service) requirements, such as the makespan and the monetary cost, has emerged as the main challenge. In this paper, we aim to solve the problem of finding scheduling solutions that minimize the workflow makespan under the constraint of the user's budget. Considering that it is very time consuming to find the optimal solution, we instead adopt an evolutionary computation technique called Particle Swarm Optimization (PSO) to derive approximate answers. The proposed method is evaluated with real scientific workflows of different structures and sizes. Compared with the latest method, the experiment results show that our proposed approach can achieve better performance by increasing the number of particles and iterations.
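The canonical PSO update the authors build on can be sketched as follows (the standard velocity and position equations with inertia weight w and acceleration coefficients c1, c2; parameter values here are common defaults, not the paper's tuning):

```python
import random

def pso_step(positions, velocities, pbest, gbest, w=0.7, c1=1.5, c2=1.5):
    """One canonical PSO iteration:
    v <- w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x);  x <- x + v."""
    new_v, new_x = [], []
    for x, v, pb in zip(positions, velocities, pbest):
        r1, r2 = random.random(), random.random()
        nv = [w * vi + c1 * r1 * (pbi - xi) + c2 * r2 * (gbi - xi)
              for xi, vi, pbi, gbi in zip(x, v, pb, gbest)]
        new_v.append(nv)
        new_x.append([xi + vi for xi, vi in zip(x, nv)])
    return new_x, new_v
```

In a scheduling setting each position dimension is typically decoded to a task-to-resource assignment, and the fitness is the makespan subject to the budget.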

11 citations


Proceedings ArticleDOI
27 Oct 2015
TL;DR: The We Feel system is able to capture and process up to 45,000 tweets per minute, and show their emotional content live in an interactive visualisation, which can be used to explore how people respond to certain events and support mental health research.
Abstract: We have seen an explosion of communication on social media in the past few years. In particular, people use Twitter to share information and experiences, express their opinions, and say how they feel. This wealth of data of how people feel and experience various events can provide valuable information to support mental health research. In this paper, we show how we can explore the emotional state of a population by mining the vast amount of available public social media data in real time. The We Feel system is able to capture and process up to 45,000 tweets per minute, and show their emotional content live in an interactive visualisation. The data can be used to explore how people respond to certain events and support mental health research.

11 citations


Proceedings ArticleDOI
27 Oct 2015
TL;DR: This work suggests a system for collaborative a-posteriori access control to data dissemination in decentralized online social networks based on reporting and auditing and demonstrates the usability of the suggested model using a real OSN graph.
Abstract: Accountability and transparency have been commonly accepted to deter bad acts and to encourage compliance with rules. For this, auditing has been largely, and since ancient times, adopted to ensure the well running of systems and businesses within which duties are governed by set rules. Recently, an a-posteriori approach to data access control has been investigated for information systems across a number of critical domains (e.g., healthcare systems). Besides, privacy advocates have started calling for accountability and transparency in managing users' privacy in today's connected and ever-proliferating web data. Following this line of thought, we suggest a system for collaborative a-posteriori access control to data dissemination in decentralized online social networks based on reporting and auditing. We demonstrate the usability of our suggested model using a real OSN graph.

10 citations


Proceedings ArticleDOI
27 Oct 2015
TL;DR: A novel recommendation algorithm called STPMF, based on the neighborhood model and the matrix factorization model, in which the complementary roles of similarity relationships and trust relationships in the user model are considered simultaneously, by means of a weight w, to alleviate the data sparsity and cold-start problems.
Abstract: Traditional collaborative filtering approaches are often confronted with two major problems: data sparsity and cold-start. Fortunately, along with the rise of social media, social networks are producing a large and rich set of social data (such as labels, trust, etc.), which provides a new way to address the problems of collaborative filtering: we can make use of social data to enhance recommendation accuracy. However, traditional recommendation algorithms may consider only the influence of either similarity relationships or trust relationships on the user model, and thus fail to take full advantage of the implications of social data. In this paper, we propose a novel recommendation algorithm called STPMF, based on the neighborhood model and the matrix factorization model, in which the complementary roles of similarity relationships and trust relationships in the user model are considered simultaneously by means of a weight w. Furthermore, we propagate similarity relationships and trust relationships one or two steps to alleviate the data sparsity and cold-start problems. We have conducted experiments on two real world data sets, from Last.fm and Delicious. Compared with existing recommendation algorithms, our method effectively alleviates the problems of collaborative filtering and enhances recommendation accuracy.
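The one- or two-step propagation of similarity/trust relationships can be sketched as a max-product closure over the direct-relationship matrix (an illustrative reading; the paper's exact propagation rule may differ):

```python
def propagate(rel, steps=2):
    """rel[u][v] is the direct similarity/trust from user u to v in [0, 1].
    Returns a matrix where u is related to v if a chain of at most `steps`
    direct links connects them, with strength the max product along a chain."""
    n = len(rel)
    reach = [row[:] for row in rel]   # best strength found so far
    cur = [row[:] for row in rel]     # strengths of exactly-k-hop chains
    for _ in range(steps - 1):
        nxt = [[max(cur[u][k] * rel[k][v] for k in range(n)) for v in range(n)]
               for u in range(n)]
        reach = [[max(reach[u][v], nxt[u][v]) for v in range(n)] for u in range(n)]
        cur = nxt
    return reach
```

This densifies the relationship matrix, which is exactly what helps cold-start users who have few direct links.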

Proceedings ArticleDOI
27 Oct 2015
TL;DR: This article examines VANET dynamics from a spatial viewpoint in order to identify where and why in the underlying road network particular network phenomena occur, and discusses whether the emerging network phenomena can indeed support (and to what extent) these paradigms.
Abstract: Vehicular Ad hoc Networks (VANETs) have emerged as a platform to support Intelligent Transportation Applications. A key to the development of protocols and services for IVC lies in the knowledge of the topological characteristics of the VANET communication graph. This article aims to provide "higher order" knowledge of the time-evolving dynamics of vehicular networks in large-scale urban environments through complex network analysis. In addition, we examine VANET dynamics from a spatial viewpoint in order to identify where and why in the underlying road network particular network phenomena occur. We reference and correlate our findings with the main communication paradigms employed in VANETs, and discuss whether the emerging network phenomena can indeed support (and to what extent) these paradigms.

Proceedings ArticleDOI
27 Oct 2015
TL;DR: A collaborative approach by multiple operators to share spectrum can be of mutual benefit and the possibility of virtualization to enable every operator to obtain a higher throughput in an LTE network compared to operators working separately is shown.
Abstract: Wireless virtualization has significant potential for improving the efficiency of resource utilization through spectrum sharing. However, several challenges exist in enabling/implementing virtualization over the current static cellular infrastructure. In the meantime, virtualization necessitates spectrum sharing and a collaborative network management entity that needs to provide efficient isolation strategies protecting multiple operators from suffering interference in shared frequency bands. Toward enabling virtualization, we develop a framework based on 3GPP LTE, in which spectrum is reused temporally and spatially by more than one operator. The usage of spectrum is adjusted by network parameters, and network parameters are configured based on power and radio propagation characteristics. In the framework, we minimally modify the radio resource manager (RRM) of LTE into a virtual resource manager. Further, the popular LTE enhanced inter-cell interference coordination (eICIC) technique of time domain muting is adopted/modified as an isolation technique in a virtual setting. Extensive simulations are conducted that show the possibility of virtualization to enable every operator to obtain a higher throughput in an LTE network compared to operators working separately. However, the network configuration driven by tradeoffs will have to be carefully considered to ensure isolation. In summary, a collaborative approach by multiple operators to share spectrum can be of mutual benefit.

Journal ArticleDOI
01 May 2015
TL;DR: A sketch based bidirectional reflectance distribution function design interface allows simple concept art to be used to style new metallic car colors and is assessed by using it in industrial and educational design studios.
Abstract: A computer aided design system for determining the color appearance of metallic automotive coatings has been developed. A sketch based bidirectional reflectance distribution function design interface allows simple concept art to be used to style new metallic car colors. The final design is specified using industrial measurement standards for metallic color appearance, and paint formulations are determined by employing an automotive refinish system. A virtual collection of existing automotive paints, specified using the measurement standard, is provided, and tools for searching this database, for both design and manufacturing purposes, are described. The system is assessed by using it in industrial and educational design studios. © 2015 Society for Imaging Science and Technology. (DOI: 10.2352/J.ImagingSci.Technol.2015.59.3.030403)

Proceedings ArticleDOI
27 Oct 2015
TL;DR: This paper presents the performance of the CoAP protocol using BLE 4.1 on Android Lollipop and focuses on the use of the IoT protocol CoAP as an application layer protocol.
Abstract: As the number of mobile devices per user increases, the need to connect/combine them grows. Current approaches focus on the use of cloud-hosted backend services which allow file and app-state synchronization but fail to provide true resource sharing among mobile devices. To enable true resource/service sharing, the mobile devices of a single user should be combined into a cloud of cooperating mobile devices. Instead of accessing the resources/services of an individual device, a user should be able to seamlessly access the combined resources/services of his/her device cloud. Enabling seamless access to the resources/services hosted on different mobile devices is therefore a key challenge. Exposing the resources/services of each mobile device within the user's device cloud via RESTful micro-services is one possible approach. This paper focuses on the use of the IoT protocol CoAP as an application layer protocol. To minimize the energy costs of communication, it was necessary to replace CoAP's standard transport protocol (UDP) with BLE 4.1. This paper presents the performance of the CoAP protocol using BLE 4.1 on Android Lollipop.

Proceedings ArticleDOI
27 Oct 2015
TL;DR: A preliminary evaluation of a cloud partitioning approach to distribute task execution requests in the volunteer cloud, validated through a simulation-based statistical analysis using the Google workload data trace.
Abstract: The growing demand for computational resources has shifted users towards the adoption of cloud computing technologies. The cloud allows users to transparently access remote computing capabilities as a utility. The volunteer computing paradigm, another ICT trend of recent years, can be considered a companion force to enhance the cloud in fulfilling specific domain requirements, such as computationally intensive requests. By combining the spare resources provided by volunteer nodes with a few data centers, it is possible to obtain a robust and scalable cloud platform. The price for such benefits lies in the increased challenges of designing and managing a dynamic complex system composed of heterogeneous nodes. Task execution requests submitted in the volunteer cloud are usually associated with Quality of Service requirements, e.g., specified through an execution deadline. In this paper, we present a preliminary evaluation of a cloud partitioning approach to distribute task execution requests in the volunteer cloud, which has been validated through a simulation-based statistical analysis using the Google workload data trace.
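A toy version of deadline-aware task dispatch across partitions might look like the following (purely illustrative; each partition's state is reduced to a single queue-length number, which is far simpler than the paper's simulated model):

```python
def dispatch(task_len, deadline, partitions):
    """partitions[i] is the current backlog (time units) of partition i.
    Send the task to the partition that would finish it earliest, but only
    if that still meets the deadline; return the chosen index or None."""
    best = min(range(len(partitions)), key=lambda i: partitions[i] + task_len)
    if partitions[best] + task_len <= deadline:
        partitions[best] += task_len  # task accepted: grow that backlog
        return best
    return None  # no partition can meet the QoS deadline
```

Rejecting infeasible requests up front is one simple way a volunteer cloud can honour execution-deadline QoS.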

Proceedings ArticleDOI
27 Oct 2015
TL;DR: This paper presents collaborative computing for optimization of the integrated problem consisting of vehicle routing and docking scheduling, and proposes collaborative service rules for handling the uncertainty.
Abstract: Inspired by successful cross-docking applications in prevailing supply chain management (SCM) businesses such as Wal-Mart, FedEx, and Home Depot, we propose a new collaborative vehicle routing and scheduling model with cross-dock under uncertainty. Conventional cross-docking models treat the vehicle routing and docking scheduling components as independent subproblems and solve them separately. The obtained result may overlook the potential benefits of solving them as a whole. Moreover, little literature has addressed uncertainty scenarios such as vehicle failure, traffic conditions, and changing demand. This paper presents collaborative computing for optimization of the integrated problem consisting of vehicle routing and docking scheduling. We further propose collaborative service rules for handling the uncertainty. An illustrative solution with a 50-vertex problem is used to manifest the superiority of our model over existing ones. Worst-case statistical analysis is conducted to show the worst performance that could be obtained by our method after a specific number of repetitive runs.
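As a baseline for the routing component only (not the paper's integrated collaborative model), a nearest-neighbour route construction can be sketched as:

```python
def nearest_neighbour_route(depot, stops):
    """Greedy route construction: starting from the depot, repeatedly visit
    the closest unvisited stop. Points are (x, y) tuples."""
    def d2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    route, cur, todo = [], depot, list(stops)
    while todo:
        nxt = min(todo, key=lambda s: d2(cur, s))
        route.append(nxt)
        todo.remove(nxt)
        cur = nxt
    return route
```

Integrated models like the paper's improve on such baselines precisely by co-optimizing the routes with the dock schedule instead of fixing one first.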

Proceedings ArticleDOI
27 Oct 2015
TL;DR: This paper first describes the overall platform's architecture and functionality and then presents concrete programming model elements -- Collective-based Tasks (CBTs) and Collectives, describe their properties and show how they meet the hybridity and collectiveness requirements.
Abstract: Hybrid Diversity-aware Collective Adaptive Systems (HDA-CAS) is a new generation of socio-technical systems where both humans and machine peers complement each other and operate collectively to achieve their goals. These systems are characterized by the fundamental properties of hybridity and collectiveness, hiding from users the complexities associated with managing the collaboration and coordination of hybrid human/machine teams. In this paper we present the key programming elements of the Smart Society HDA-CAS platform. We first describe the overall platform's architecture and functionality and then present concrete programming model elements -- Collective-based Tasks (CBTs) and Collectives, describe their properties and show how they meet the hybridity and collectiveness requirements. We also describe the associated Java language constructs, and show how concrete use-cases can be encoded with the introduced constructs.

Proceedings ArticleDOI
27 Oct 2015
TL;DR: A K-medoids based clustering scheme, Clustering based on Similar Volume and Shape (CSVS), is proposed on a newly designed Scaling Aware Shifting Invariant (SASI) distance measure to uncover the different types of sharing patterns of news articles in social media.
Abstract: What types of information are popularly shared on social media sites such as Facebook, Twitter, LinkedIn and Google+? Does each social network care about different topics? Can we uncover the different patterns behind the sharing behavior of social network users? This paper addresses the above questions by analyzing public information spreading data on different online social networks, identifying the spreading characteristics, and modeling the spreading patterns. In order to conduct statistical studies and build models to achieve our goals, we first extract data from the "share" buttons in news articles published by mainstream news websites. Such buttons are important to initiate the propagation of the news in social media. Through statistical findings, we demonstrate that both the share counts and topics of news vary a lot across different social networks. Additionally, based on the time series data displaying how news articles accumulate their share counts, we propose a K-medoids based clustering scheme, Clustering based on Similar Volume and Shape (CSVS), on a newly designed Scaling Aware Shifting Invariant (SASI) distance measure to uncover the different types of sharing patterns of news articles in social media. Through experiments on the collected dataset, we demonstrate that the proposed CSVS is able to cluster news with similar sharing patterns into the same cluster, taking into consideration both their share counts and shapes.
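A scaling-aware distance in the spirit of SASI, together with the medoid-assignment step of K-medoids, can be sketched as follows (this normalises each series by its own peak so that only shape is compared; the paper's actual SASI measure also handles shifting, which is omitted here):

```python
def shape_distance(a, b):
    """Scale-invariant Euclidean distance between two share-count series:
    divide each series by its own peak, then compare pointwise."""
    na = [x / max(a) for x in a]
    nb = [x / max(b) for x in b]
    return sum((x - y) ** 2 for x, y in zip(na, nb)) ** 0.5

def assign_to_medoids(series_list, medoids):
    """K-medoids assignment step: each series joins its nearest medoid."""
    return [min(range(len(medoids)), key=lambda m: shape_distance(s, medoids[m]))
            for s in series_list]
```

Under this distance a series and a 10x-scaled copy of it are identical, which is what lets clusters capture sharing *patterns* rather than raw popularity.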

Proceedings ArticleDOI
27 Oct 2015
TL;DR: This work draws inspiration that only a small portion of reviewers can generate useful information, and proposes a sparse overlapping user lasso model to tackle these challenges, and demonstrates that the method consistently outperforms other state-of-the-art models in sentiment classification tasks, meanwhile generating accurate results on keywords discovering and opinion leader identification task.
Abstract: Social review sites offer a wealth of information beyond sentiment polarity. For instance, on IMDb users leave valuable reviews on different aspects of a movie (e.g., actors, visual effects). This inspires researchers to fully discover information from social review texts for the sake of modelling user behavior. Previous studies have spent a large amount of effort on identifying sentiment scores from reviews. Yet questions like "What are the key plot points in this movie?" and "Who is the valuable user I should follow?", the answers to which comprehensively support the user decision making process, cannot be answered in those works. To jointly learn from sentiment, text and user in social reviews, we draw on the insight that only a small portion of reviewers generate useful information, and propose a sparse overlapping user lasso model to tackle these challenges. In addition, we show how to efficiently solve the resulting optimization challenges using the alternating direction method of multipliers (ADMM), a framework which divides our objective into sub-tasks that are easy to fulfill. Through experiments on 3 real world social review datasets, we demonstrate that our method consistently outperforms other state-of-the-art models in sentiment classification tasks, while generating accurate results on keyword discovery and opinion leader identification tasks.
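The ADMM splitting the authors rely on can be illustrated on the smallest possible lasso instance, min 0.5*(x-b)^2 + lam*|x|, whose exact solution is the soft-threshold of b (a didactic sketch only, not the paper's overlapping-group solver):

```python
def admm_lasso_1d(b, lam, rho=1.0, iters=200):
    """Tiny ADMM for min 0.5*(x-b)^2 + lam*|x|: split x = z and alternate
    an x-update (ridge step), a z-update (soft-threshold, the l1 prox),
    and a dual update. Converges to soft_threshold(b, lam)."""
    x = z = u = 0.0
    for _ in range(iters):
        x = (b + rho * (z - u)) / (1 + rho)      # x-update: smooth part
        xu = x + u
        if xu >= 0:                               # z-update: prox of lam*|.|/rho
            z = max(xu - lam / rho, 0.0)
        else:
            z = min(xu + lam / rho, 0.0)
        u += x - z                                # scaled dual update
    return z
```

The same three-step pattern carries over to the full model: each sub-task is easy even though the combined objective is not.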

Proceedings ArticleDOI
27 Oct 2015
TL;DR: This paper presents a model-defined approach to the development of a cloud-of-clouds management (abbreviated as CCMan) system and shows that the efforts needed to both develop the CCMan system and operate its services are significantly reduced with negligible performance loss.
Abstract: With the growth in the number of Cloud Service Providers, many enterprises and organizations are now able to use multiple cloud platforms in order to achieve improved overall Quality-of-Service (QoS), reliability and cost efficiency. However, due to the diversity in architecture and functionalities among different cloud platforms, it is difficult to build a system that simultaneously manages multiple clouds, i.e., a cloud-of-clouds. This paper presents a model-defined approach to the development of a cloud-of-clouds management (abbreviated as CCMan) system. The runtime model of a CCMan system that meets custom management requirements is constructed through model construction, model merging and model transformation. Each step of the approach is presented in detail in terms of an example. Evaluation of the approach from several perspectives shows that the efforts needed to both develop the CCMan system and operate its services are significantly reduced with negligible performance loss.

Proceedings ArticleDOI
27 Oct 2015
TL;DR: An improved broadcast authentication scheme named bidirectional broadcasting authentication scheme based on the Merkle Hash tree and μTESLA protocol is proposed, which can reduce the transmission overhead and realize the two-way secure communication between central node and leaf node.
Abstract: Broadcast authentication is an important prerequisite for secure data diffusion in wireless sensor networks (WSNs). It allows a base station to send commands and requests to multiple sensor nodes in a dynamic and authentic manner. During the past few years, several public key based multi-user broadcast authentication schemes have been proposed, such as the µTESLA scheme and its variants. These schemes require large buffers due to the delayed authentication of messages, which can incur a series of problems such as high energy consumption and long verification delay. In this contribution, we propose an improved broadcast authentication scheme, named the bidirectional broadcast authentication scheme, based on the Merkle hash tree and the µTESLA protocol. The main idea of our scheme is to add a verify node to the Merkle hash tree broadcast authentication, which is responsible for storing the entire hash tree. When a node receives a packet from another node, it sends an authentication request to the verify node; the verify node then sends the corresponding authentication data to the requesting node, which uses it to authenticate the packet. As a result, our scheme can reduce the transmission overhead and realize two-way secure communication between the central node and leaf nodes.
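The Merkle hash tree machinery underlying the scheme works as follows (a generic sketch; the hash function, leaf encoding, and power-of-two leaf count are our assumptions, not details from the paper):

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Root of a Merkle hash tree over the given leaves (power-of-two count)."""
    level = [h(x) for x in leaves]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, idx):
    """Sibling hashes from leaf `idx` up to the root, each tagged with
    whether the sibling is the left child."""
    level = [h(x) for x in leaves]
    proof = []
    while len(level) > 1:
        sib = idx ^ 1
        proof.append((level[sib], sib < idx))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        idx //= 2
    return proof

def merkle_verify(leaf, proof, root):
    """Recompute the root from a leaf and its proof; O(log n) hashes."""
    node = h(leaf)
    for sib, is_left in proof:
        node = h(sib + node) if is_left else h(node + sib)
    return node == root
```

A verify node storing the whole tree can hand any requester its O(log n) proof, which is exactly what keeps per-packet transmission overhead small.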

Proceedings ArticleDOI
Yang Cuixia1, Zuo Chaoshun1, Guo Shanqing1, Hu Chengyu1, Cui Lizhen1 
27 Oct 2015
TL;DR: A UI modeling method for Android applications, based on attribute graphs and built with reverse engineering and program analysis, that helps develop more comprehensive tools for detecting repackaged applications, including malicious ones.
Abstract: Android applications are user-interaction-intensive programs, which makes the UI an indispensable part of mobile applications. The UI also reflects information such as an application's functions, which makes the study of Android UIs significant. We design a UI modeling method for Android applications based on attribute graphs, using reverse engineering and program analysis. The proposed method is also applied to repackaged-malware detection and to evaluating application family resemblance. The method depicts the widgets contained in each UI and the relationships between UIs, based on the assumptions that 1) repackaged applications have similar UIs and 2) functions and appearances are highly similar among members of an application family. Our method achieves a 94.74% success rate in UI modeling and detects 2231 (26.13%) repackaged applications, discovering that around 50.0% of them have exactly the same UI. The results show that the UI modeling method helps develop more comprehensive tools for detecting repackaged applications, including malicious ones.
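The paper's attribute-graph comparison is not reproduced here, but the intuition — repackaged apps have near-identical widget sets and UI transitions — can be sketched with a Jaccard-based similarity over a hypothetical app representation (the representation and the equal weighting of the two components are assumptions, not the authors' metric):

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; defined as 1.0 for two empty sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def ui_similarity(app_a, app_b) -> float:
    """app_* = (widget_attrs, transitions): a set of (ui, widget_type)
    pairs and a set of (ui, ui) transition edges. Returns the average
    Jaccard similarity of the two components; values near 1.0 suggest
    a possible repackaging relationship."""
    widgets_a, edges_a = app_a
    widgets_b, edges_b = app_b
    return 0.5 * (jaccard(widgets_a, widgets_b) + jaccard(edges_a, edges_b))
```

Under the paper's assumptions, comparing a candidate against a corpus of originals and flagging pairs above a similarity threshold would surface repackaging suspects.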


Proceedings ArticleDOI
27 Oct 2015
TL;DR: The research explores the various opportunities, challenges and approaches in using social media for environmental monitoring, including the use of generic terms to track environmental events.
Abstract: In recent years, social media has revolutionized citizen science activities. Given their popularity among people and communities, social media services could be used effectively for environmental surveillance. However, in social media people use different terms to refer to the same event: for example, Blue-Green Algae, Cyanobacteria, Algae Bloom and Red Tide all refer to the same event, but some are very technical terms and others are more generic. The technical terms are normally known only to field experts or domain scientists, which inherently implies more reliable information on social media, whereas the more generic terms are used by people of various backgrounds, which calls the trustworthiness of a post into question. Moreover, the user base and the number of posts for the more technical terms are relatively small compared to the generic terms. The dichotomy is that the more common the term, the noisier the data. One might conclude that using generic terms to track environmental events would be more effective. However, social media data is in constant flux, so a train-once, classify-forever machine learning model will fail to classify many relevant events, as shown in the paper. Our research seeks to explore the various opportunities, challenges and approaches in using social media for environmental monitoring.
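The technical-vs-generic trade-off can be illustrated with a naive term-matching tracker (the vocabularies below are assumed examples, not the paper's actual keyword lists). The generic stream catches more posts but also figurative uses, such as a "red tide" of sports fans — exactly the noise problem the abstract describes:

```python
TECHNICAL_TERMS = {"cyanobacteria", "blue green algae"}  # expert vocabulary
GENERIC_TERMS = {"algae bloom", "red tide"}              # lay vocabulary

def match_terms(post: str, vocabulary: set) -> bool:
    """True if the post mentions any term from the vocabulary."""
    text = post.lower()
    return any(term in text for term in vocabulary)

def split_stream(posts):
    """Route posts into technical and generic buckets; a post may land
    in both. The generic bucket gives higher recall at lower precision."""
    technical = [p for p in posts if match_terms(p, TECHNICAL_TERMS)]
    generic = [p for p in posts if match_terms(p, GENERIC_TERMS)]
    return technical, generic
```

A static keyword list like this is the "train once" extreme; the abstract's point is that the vocabulary and its usage drift over time, so any such filter or classifier needs periodic retraining.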

Proceedings ArticleDOI
27 Oct 2015
TL;DR: This paper presents models and a framework for analyzing the reliability of hybrid compute units (HCUs), which represent on-demand collectives of collaborating humans supported by machines (hardware and software units) for performing tasks.
Abstract: The modern development of computing systems caters to the collaboration of human-based resources together with machine-based resources as active compute units. These units can be dynamically provisioned on demand to solve complex tasks, as observed in collaborative applications, crowdsourced applications, and human task workflows. Such collaborations involve very diverse compute units with different capabilities and reliability. While reliability analysis for machine-based compute units has been widely developed, reliability analysis for hybrid human-machine collaborations has not been extensively studied. In this paper we present models and a framework for analyzing the reliability of hybrid compute units (HCUs), which represent on-demand collectives of collaborating humans supported by machines (hardware and software units) for performing tasks. We present the implementation of our models and study the reliability of HCUs in a simulated system for infrastructure maintenance scenarios. Our evaluation shows that the proposed framework is effective for measuring the reliability of collaboration collectives, and beneficial for obtaining insights for improvements.
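The abstract does not specify the reliability model itself. As a hedged illustration of how a collective's reliability might be aggregated from unit reliabilities, here are two standard reliability-engineering formulas — a series collective where every unit must succeed, and a redundant k-out-of-n pool of identical units — not the paper's actual HCU model:

```python
from math import comb

def series_reliability(unit_reliabilities):
    """All units must work: the product of individual reliabilities.
    Models a collective with no redundancy."""
    r = 1.0
    for u in unit_reliabilities:
        r *= u
    return r

def k_out_of_n_reliability(r: float, k: int, n: int) -> float:
    """At least k of n identical units (reliability r each) must work:
    the binomial tail. Models a redundant worker pool."""
    return sum(comb(n, i) * r**i * (1 - r)**(n - i) for i in range(k, n + 1))
```

The contrast is instructive: two 0.9-reliable units in series yield 0.81, while needing only one of the two yields 0.99 — redundancy is how a collective of fallible human and machine units can still be dependable.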

Proceedings ArticleDOI
27 Oct 2015
TL;DR: The results show that DPR-Tree indexing coupled with precomputed-distance-based query processing enables the Hike system to significantly reduce the overall cost of kNN search, making it much faster than existing representative methods.
Abstract: The Internet continues to serve as the dominant platform for sharing and delivering multimedia content, and kNN queries are an enabling functionality for many multimedia applications. However, scaling kNN queries over large high-dimensional multimedia datasets remains challenging. In this paper we present Hike, a high-performance multimedia kNN query processing system. It integrates a novel Distance-Precomputation based R-tree (DPR-Tree) index structure with fast distance-based pruning methods. This unique coupling improves the computational performance of kNN search and consequently reduces I/O cost. By design, the DPR-Tree generates two types of query-independent precomputed distances during index construction, and Hike intelligently utilizes these precomputed distances in a suite of computationally inexpensive pruning techniques that filter out irrelevant index nodes and data objects and minimize duplicate computations among different kNN queries. We conduct an extensive experimental evaluation using real-life web multimedia datasets. Our results show that DPR-Tree indexing coupled with precomputed-distance-based query processing significantly reduces the overall cost of kNN search and is much faster than existing representative methods.


Proceedings ArticleDOI
27 Oct 2015
TL;DR: A QoS-based service ranking and selection approach is proposed to help developers select, from a set of services that already satisfy their functionality requirements, the service that best satisfies their QoS requirements in mobile cloud computing.
Abstract: With the prevalence of mobile computing and its convergence with cloud computing, there is an increasing trend of composing existing cloud services for rapid development of cloud-based mobile applications. It is vital for developers to find services that not only satisfy their functionality requirements, but also meet their requirements on non-functional quality of service (QoS). These QoS requirements, such as throughput, delay, reliability and security, are critical for the success of cloud-based mobile applications. In this paper, a QoS-based service ranking and selection approach is proposed to help developers select, from a set of services that already satisfy their functionality requirements, the service that best satisfies their QoS requirements in mobile cloud computing. Compared with state-of-the-art service ranking and selection techniques, our approach has the following advantages: 1) it uses intervals instead of fixed values to represent the QoS of services, which is more flexible and practical in mobile cloud computing; 2) it enables developers to specify their QoS requirements in a simpler way; and 3) it employs hybrid weights that incorporate the entropy-based weighting technique to overcome the weakness of subjective weights, which ignore knowledge of different services' performance in different QoS aspects. Experiments validate the effectiveness of the proposed approach.
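The entropy-based weighting the authors incorporate is a standard multi-criteria technique. A minimal sketch follows, using fixed QoS values rather than the paper's intervals (for intervals one might, for example, take midpoints — an assumption, not the authors' procedure):

```python
import math

def entropy_weights(matrix):
    """Entropy weight method: rows = candidate services, columns = QoS
    attributes (all values positive). Attributes whose values vary more
    across services carry more discriminating information and receive
    larger weights; a constant column gets weight ~0."""
    m = len(matrix)
    n = len(matrix[0])
    k = 1.0 / math.log(m)                     # normalizes entropy to [0, 1]
    entropies = []
    for j in range(n):
        col_sum = sum(row[j] for row in matrix)
        p = [row[j] / col_sum for row in matrix]
        entropies.append(-k * sum(v * math.log(v) for v in p if v > 0))
    divergence = [1.0 - e for e in entropies]  # information content per column
    total = sum(divergence)
    return [d / total for d in divergence]
```

This is the objective half of a hybrid scheme; the paper combines such data-driven weights with developer-supplied subjective weights so that neither preference nor observed performance dominates alone.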

Proceedings ArticleDOI
27 Oct 2015
TL;DR: This paper explores a Workload-Aware Column Order solution, WACO, to boost the scan operator over wide tables, and proposes a linear-programming-based solution to efficiently obtain an effective column order for the internal physical placement.
Abstract: Entering the big data era, wide tables, which contain thousands of columns, are being widely adopted in cloud systems because they benefit critical fields such as warehouse/log analysis systems, scientific applications and RDF storage. Although wide tables avoid expensive joins in distributed environments, they put an urgent demand on fast scans during data processing. However, existing data placement structures, represented by horizontal row stores, vertical column stores and hybrid columnar stores, waste massive computing resources on disk seeks because they ignore column order in physical data placement. To make matters worse, some otherwise superior placement methods require additional adjustments to the query processing engine or violate the data replication strategy, and thus become error-prone and incur additional disk seeks. In this paper, we explore a Workload-Aware Column Order solution, WACO, to boost the scan operator over wide tables. WACO maximizes sequential disk access and is transparent to the underlying cloud system, so it does not exhibit any of the above-mentioned shortcomings. In particular, to acquire the query workload, we extract the recent access patterns on the wide table and their frequencies from query logs. Given the access patterns and their frequencies, we investigate the column placement strategy and prove that finding the optimal column layout is NP-hard. Furthermore, we propose a linear-programming-based solution to efficiently obtain an effective column order for the internal physical placement. To make our solution robust and practical, we implement the scan-optimized data placement strategy as a library, so it integrates seamlessly with the underlying system and does not require any adjustment to the existing system.
We conduct extensive experiments on the real-world TPC-H benchmark and the SDSS dataset to simulate wide tables and demonstrate the superiority of our solution. The experimental results show that our approach is 2x faster than the state-of-the-art.
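The paper's linear-programming formulation is not given in the abstract. As an illustrative stand-in, the workload-aware idea can be sketched with a simple greedy heuristic plus a seek-cost estimate that charges one seek per non-contiguous run of a pattern's columns (both are assumptions for illustration, not WACO's actual algorithm):

```python
from collections import defaultdict

def greedy_column_order(patterns):
    """patterns: list of (frozenset_of_columns, frequency) pairs mined
    from query logs. Heuristic: place columns in descending order of
    the total frequency of the access patterns touching them, so
    frequently co-accessed 'hot' columns cluster together on disk."""
    score = defaultdict(float)
    for cols, freq in patterns:
        for c in cols:
            score[c] += freq
    return sorted(score, key=lambda c: (-score[c], c))

def seek_cost(order, patterns):
    """Frequency-weighted count of contiguous runs each pattern must
    read under a given physical column order; each extra run = a seek."""
    pos = {c: i for i, c in enumerate(order)}
    total = 0
    for cols, freq in patterns:
        idx = sorted(pos[c] for c in cols)
        runs = 1 + sum(1 for a, b in zip(idx, idx[1:]) if b - a > 1)
        total += freq * runs
    return total
```

Since the paper proves optimal layout is NP-hard, any practical method (their LP relaxation, or a heuristic like this) trades optimality for tractability; the cost function above is how one would compare candidate orders against a logged workload.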