
Showing papers in "Information Processing and Management in 2021"


Journal ArticleDOI
TL;DR: It is highlighted that blockchain's structure and modern cloud- and edge-computing paradigms are crucial in enabling the widespread adoption and development of blockchain technologies for new players in today's unprecedentedly vibrant global market.
Abstract: Blockchain technologies have grown in prominence in recent years, with many experts citing the potential applications of the technology with regard to different aspects of any industry, market, agency, or governmental organization. In the brief history of blockchain, an incredible number of achievements have been made regarding how blockchain can be utilized and the impacts it might have on several industries. The sheer number and complexity of these aspects can make it difficult to address blockchain's potential and complexities, especially when trying to address its purpose and fitness for a specific task. In this survey, we provide a comprehensive review of applying blockchain as a service for applications within today's information systems. The survey gives the reader a deeper perspective on how blockchain helps to secure and manage today's information systems. The survey contains comprehensive reporting on different instances of blockchain studies and applications proposed by the research community and their respective impacts on blockchain and its use across other applications or scenarios. Some of the most important findings this survey highlights include the fact that blockchain's structure and modern cloud- and edge-computing paradigms are crucial in enabling the widespread adoption and development of blockchain technologies for new players in today's unprecedentedly vibrant global market. Ensuring that blockchain is widely available through public and open-source code libraries and tools will help to ensure that the full potential of the technology is reached and that further developments can be made concerning the long-term goals of blockchain enthusiasts.

291 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed a solution for distributed management of identity and authorization policies by leveraging blockchain technology to hold a global view of the security policies within the system, and integrating it in the FIWARE platform.
Abstract: The platforms supporting smart city applications are rarely implemented from scratch by a municipality and/or totally owned by a single company; they are more typically realized by integrating existing ICT infrastructures thanks to a supporting platform, such as the well-known FIWARE platform. Such a multi-tenant deployment model is required to lower the initial investment costs of implementing large-scale solutions for smart cities, but it also imposes some key security obstacles. In fact, smart cities support critical applications that demand the protection of data and functionalities from malicious and unauthorized uses. Equipping the supporting platforms with proper means for access control is demanding, but these means are typically implemented according to a centralized approach, where a single server stores and makes available a set of identity attributes and authorization policies. Having a single root of trust is not suitable in the distributed and cooperating scenario of large-scale smart cities, due to their multi-tenant deployment. In fact, each of the integrated systems has its own set of security policies, and the other systems need to be aware of these policies in order to allow a seamless use of the same credentials across the overall infrastructure (realizing what is known as single sign-on). This imposes the problem of consistent and secure data replicas within a distributed system, which can be properly approached by using blockchain technology. Therefore, this work proposes a novel solution for distributed management of identity and authorization policies that leverages blockchain technology to hold a global view of the security policies within the system, and integrates it in the FIWARE platform. A detailed assessment is provided to evaluate the quality of the proposed approach and to compare it with existing solutions.

228 citations


Journal ArticleDOI
TL;DR: A model to understand the effect of information seeking, information sources, and information overload (Stimuli) on information anxiety (psychological organism), and consequent behavioral response, information avoidance during the global health crisis (COVID-19) is proposed.
Abstract: Individuals seek information for informed decision-making, and they nowadays consult a variety of information sources. However, studies show that information from multiple sources can lead to information overload, which then creates negative psychological and behavioral responses. Drawing on the Stimulus-Organism-Response (S-O-R) framework, we propose a model to understand the effect of information seeking, information sources, and information overload (stimuli) on information anxiety (psychological organism), and the consequent behavioral response, information avoidance, during the global health crisis (COVID-19). The proposed model was tested using partial least squares structural equation modeling (PLS-SEM), for which data were collected from 321 Finnish adults using an online survey. People were found to seek information from traditional sources such as mass media and print media, and from online sources such as official websites, newspaper websites, and forums. Social media and personal networks were not the preferred sources. On the other hand, among the different information sources, social media exposure has a significant relationship with information overload as well as information anxiety. Besides, information overload also predicted information anxiety, which further resulted in information avoidance.

204 citations


Journal ArticleDOI
TL;DR: The BDR-CNN-GCN showed improved performance compared to five proposed neural network models and 15 state-of-the-art breast cancer detection approaches, proving to be an effective method for data augmentation and improved detection of malignant breast masses.
Abstract: Aim: In a pilot study to improve detection of malignant lesions in breast mammograms, we aimed to develop a new method called BDR-CNN-GCN, combining two advanced neural networks: (i) graph convolutional network (GCN); and (ii) convolutional neural network (CNN). Method: We utilised a standard 8-layer CNN, then integrated two improvement techniques: (i) batch normalization (BN) and (ii) dropout (DO). Finally, we utilized rank-based stochastic pooling (RSP) to substitute the traditional max pooling. This resulted in BDR-CNN, which is a combination of CNN, BN, DO, and RSP. This BDR-CNN was hybridized with a two-layer GCN, yielding our BDR-CNN-GCN model, which was then combined with 14-way data augmentation and utilized for analysis of breast mammograms. Results: As proof of concept, we ran our BDR-CNN-GCN algorithm 10 times on the breast mini-MIAS dataset (containing 322 mammographic images), achieving a sensitivity of 96.20±2.90%, a specificity of 96.00±2.31% and an accuracy of 96.10±1.60%. Conclusion: Our BDR-CNN-GCN showed improved performance compared to five proposed neural network models and 15 state-of-the-art breast cancer detection approaches, proving to be an effective method for data augmentation and improved detection of malignant breast masses.
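
A minimal PyTorch sketch of the general idea described above, with hypothetical layer sizes and ordinary max pooling standing in for the paper's rank-based stochastic pooling (RSP); it is an illustrative approximation, not the authors' implementation.

# Illustrative CNN-with-BN/dropout backbone plus a simple two-layer graph convolution.
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    def __init__(self, c_in, c_out, p_drop=0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
            nn.BatchNorm2d(c_out),          # BN
            nn.ReLU(),
            nn.Dropout2d(p_drop),           # DO
            nn.MaxPool2d(2),                # placeholder for RSP
        )

    def forward(self, x):
        return self.net(x)

class SimpleGCNLayer(nn.Module):
    """One dense graph convolution: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out)

    def forward(self, h, a_hat):
        return torch.relu(a_hat @ self.lin(h))

class ToyBDRCNNGCN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.cnn = nn.Sequential(ConvBlock(1, 16), ConvBlock(16, 32), ConvBlock(32, 64))
        self.gcn1 = SimpleGCNLayer(64, 64)
        self.gcn2 = SimpleGCNLayer(64, 32)
        self.head = nn.Linear(32, n_classes)

    def forward(self, x, a_hat):
        # x: (batch, 1, H, W) mammogram patches; a_hat: (batch, batch) normalized adjacency
        h = self.cnn(x).mean(dim=(2, 3))    # global average pool -> (batch, 64)
        h = self.gcn2(self.gcn1(h, a_hat), a_hat)
        return self.head(h)

model = ToyBDRCNNGCN()
imgs = torch.randn(8, 1, 64, 64)
a_hat = torch.eye(8)                        # trivial graph, just for the sketch
print(model(imgs, a_hat).shape)             # torch.Size([8, 2])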

189 citations


Journal ArticleDOI
TL;DR: An exhaustive review of stance detection techniques on social media, including the task definition, the different types of targets in stance detection, the feature sets used, and the various machine learning approaches applied.
Abstract: Stance detection on social media is an emerging opinion mining paradigm for various social and political applications in which sentiment analysis may be sub-optimal. There has been growing research interest in developing effective stance detection methods across multiple communities, including natural language processing, web science, and social computing, each of which has modeled stance detection in different ways. In this paper, we survey the work on stance detection across those communities and present an exhaustive review of stance detection techniques on social media, including the task definition, the different types of targets in stance detection, the feature sets used, and the various machine learning approaches applied. Our survey reports state-of-the-art results on the existing benchmark datasets for stance detection and discusses the most effective approaches. In addition, we explore the emerging trends and different applications of stance detection on social media, including opinion mining and prediction and, more recently, fake news detection. The study concludes by discussing the gaps in the current existing research and highlighting possible future directions for stance detection on social media.

121 citations


Journal ArticleDOI
TL;DR: A hybrid approach combining two deep learning architectures, a Convolutional Neural Network (CNN) and a Long Short Term Memory (LSTM) network (an RNN with memory), is suggested for sentiment classification of consumer reviews posted on social media across diverse domains.
Abstract: Analysis of consumer reviews posted on social media is found to be essential for several business applications. Consumer reviews posted on social media are increasing at an exponential rate in terms of both number and relevance, which leads to big data. In this paper, a hybrid approach combining two deep learning architectures, namely a Convolutional Neural Network (CNN) and Long Short Term Memory (LSTM) (an RNN with memory), is suggested for sentiment classification of reviews posted in diverse domains. Deep convolutional networks have been highly effective in local feature selection, while recurrent networks (LSTM) often yield good results in the sequential analysis of long text. The proposed Co-LSTM model is mainly aimed at two objectives in sentiment analysis. First, it is highly adaptable for examining big social data, keeping scalability in mind; secondly, unlike conventional machine learning approaches, it is not tied to any particular domain. The experiment has been carried out on four review datasets from diverse domains to train a model that can handle all kinds of dependencies that usually arise in a post. The experimental results show that the proposed ensemble model outperforms other machine learning approaches in terms of accuracy and other parameters.
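
A minimal PyTorch sketch of a convolution-then-LSTM text classifier of the kind described: the convolution extracts local n-gram features and the LSTM models their sequence. The vocabulary size, embedding dimension, and other hyperparameters are hypothetical placeholders, not the authors' Co-LSTM configuration.

import torch
import torch.nn as nn

class CNNLSTMSentiment(nn.Module):
    """Convolution extracts local n-gram features; the LSTM models their sequence."""
    def __init__(self, vocab_size=20000, emb_dim=128, n_filters=64, hidden=64, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(n_filters, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = self.emb(token_ids).transpose(1, 2)        # (batch, emb_dim, seq_len)
        x = torch.relu(self.conv(x)).transpose(1, 2)   # (batch, seq_len, n_filters)
        _, (h_n, _) = self.lstm(x)                     # final hidden state
        return self.fc(h_n[-1])                        # (batch, n_classes)

model = CNNLSTMSentiment()
print(model(torch.randint(1, 20000, (4, 50))).shape)   # torch.Size([4, 2])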

120 citations


Journal ArticleDOI
TL;DR: A novel theoretical model is provided to calculate the transaction latency under various network configurations such as block size, block interval, etc., to help assess the effectiveness of the Fabric blockchain.
Abstract: Blockchain has been one of the most attractive technologies for many modern and even future applications. Fabric, an open-source framework for implementing permissioned enterprise-grade blockchains, is receiving increasing attention from innovators. Latency performance is crucial to the Fabric blockchain in assessing its effectiveness. Many empirical studies have been conducted to analyze this performance on different hardware platforms. These experimental results are not comparable, as they are highly dependent on the underlying networks. Moreover, theoretical analysis of the latency of the Fabric blockchain still receives much less attention. This paper provides a novel theoretical model to calculate the transaction latency under various network configurations such as block size, block interval, etc. Subsequently, we validate the proposed latency model with experiments, and the results show that the difference between analytical and experimental results is as low as 6.1%. We also identify some performance bottlenecks and give insights from the developer's perspective.
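
The paper's actual analytical model is not reproduced in the abstract; the snippet below is only a toy illustration of how block size and block interval interact to bound transaction latency, under the simplifying assumption that a transaction waits for its block to be cut (whichever of the block-size or block-timeout condition fires first) plus fixed validation and commit delays.

def approx_tx_latency(arrival_rate_tps, block_size_txs, block_interval_s,
                      validation_s=0.05, commit_s=0.1):
    """Toy estimate of average ordering-to-commit latency in a Fabric-like pipeline.

    A block is cut either when block_size_txs transactions have arrived or when
    block_interval_s elapses, whichever comes first; an average transaction then
    waits roughly half of that cutting time before validation and commit.
    """
    time_to_fill = block_size_txs / arrival_rate_tps       # time to gather a full block
    block_cut_time = min(time_to_fill, block_interval_s)   # earliest cut condition
    avg_wait_in_block = block_cut_time / 2.0                # average position in the block
    return avg_wait_in_block + validation_s + commit_s

# Example: 200 tx/s, blocks of 100 transactions or a 2 s timeout
print(round(approx_tx_latency(200, 100, 2.0), 3))  # ~0.4 s under these toy assumptions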

119 citations


Journal ArticleDOI
TL;DR: In this paper, a model for exploring the effects of external stimuli (perceived threat and perceived information overload) related to COVID-19 on consumers' internal states (sadness, anxiety, and cognitive dissonance) and their subsequent behavioral intentions to avoid health information and engage in preventive behaviors was proposed.
Abstract: This study investigated consumers’ information-avoidance behavior in the context of a public health emergency—the COVID-19 pandemic in China. Guided by the stimulus-organism-response paradigm, it proposes a model for exploring the effects of external stimuli (perceived threat and perceived information overload) related to COVID-19 on consumers’ internal states (sadness, anxiety, and cognitive dissonance) and their subsequent behavioral intentions to avoid health information and engage in preventive behaviors. With a survey sample (N = 721), we empirically examined the proposed model and tested the hypotheses. The results indicate that sadness, anxiety, and cognitive dissonance, which were a result of perceived threat and perceived information overload, had heterogeneous effects on information avoidance. Anxiety and cognitive dissonance increased information avoidance intention, while sadness decreased information avoidance intention. Moreover, information avoidance predicted a reluctance on the part of consumers to engage in preventive behaviors during the COVID-19 pandemic. These findings not only contribute to the information behavior literature and extend the concept of information avoidance to a public health emergency context, but also yield practical insights for global pandemic control.

113 citations


Journal ArticleDOI
TL;DR: This work proposes an owner-centric decentralized sharing model for Digital Twin data, and shows how to overcome the numerous implementation challenges associated with fully decentralized data sharing, enabling management of Digital Twin components and their associated information.
Abstract: Digital Twins are complex digital representations of assets that are used by a variety of organizations across the Industry 4.0 value chain. As the digitization of industrial processes advances, Digital Twins will become widespread. As a result, there is a need to develop new secure data sharing models for a complex ecosystem of interacting Digital Twins and lifecycle parties. Decentralized Applications are uniquely suited to address these sharing challenges while ensuring availability, integrity and confidentiality. They rely on distributed ledgers and decentralized databases for data storage and processing, avoiding single points of trust. To tackle the need for decentralized sharing of Digital Twin data, this work proposes an owner-centric decentralized sharing model. A formal access control model addresses integrity and confidentiality aspects based on Digital Twin components and lifecycle requirements. With our prototypical implementation EtherTwin we show how to overcome the numerous implementation challenges associated with fully decentralized data sharing, enabling management of Digital Twin components and their associated information. For validation, the prototype is evaluated based on an industry use case and semi-structured expert interviews.

97 citations


Journal ArticleDOI
TL;DR: This paper collected over 10,000 smart contracts from Ethereum, focused on the data behavior generated by smart contracts and users, and proposed a transaction-based classification and detection approach for Ethereum smart contracts to address these issues.
Abstract: Blockchain technology brings innovation to various industries. Ethereum is currently the second-largest blockchain platform by market capitalization and the largest smart contract blockchain platform. Smart contracts can simplify and accelerate the development of various applications, but they also bring some problems. For example, smart contracts are used to commit fraud, vulnerable contracts are deliberately developed to undermine fairness, and there are numerous duplicative contracts that waste performance while serving no actual purpose. In this paper, we propose a transaction-based classification and detection approach for Ethereum smart contracts to address these issues. We collected over 10,000 smart contracts from Ethereum and focused on the data behavior generated by smart contracts and users. We identified four behavior patterns from the transactions by manual analysis, which can be used to distinguish between different types of contracts. From these, 14 basic features of a smart contract are constructed. To construct the experimental dataset, we propose a data slicing algorithm for slicing the collected smart contracts. After that, we use an LSTM network to train and test our datasets. The extensive experimental results show that our approach can distinguish different types of contracts and can be applied to anomaly detection and malicious contract identification with satisfactory precision, recall, and F1-score.
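
A minimal PyTorch sketch of classifying a contract from a sliced sequence of transaction feature vectors with an LSTM; the 14-dimensional feature size mirrors the abstract, while the slicing, class count, and hyperparameters are hypothetical stand-ins for the paper's setup.

import torch
import torch.nn as nn

class ContractLSTMClassifier(nn.Module):
    """Classifies a contract from a sequence of per-slice transaction feature vectors."""
    def __init__(self, n_features=14, hidden=64, n_classes=4):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):             # x: (batch, n_slices, n_features)
        _, (h_n, _) = self.lstm(x)
        return self.fc(h_n[-1])

model = ContractLSTMClassifier()
batch = torch.randn(8, 30, 14)        # 8 contracts, 30 transaction slices each
print(model(batch).shape)             # torch.Size([8, 4])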

95 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed a blockchain based framework for secure vehicular networks (B-FERL), which uses permissioned blockchain technology to tailor information access to restricted entities in the connected vehicle ecosystem, and uses a challenge-response data exchange between the vehicles and roadside units to monitor the internal state of the vehicle to identify cases of in-vehicle network compromise.
Abstract: The ubiquity of connecting technologies in smart vehicles and the incremental automation of their functionalities promise significant benefits, including a significant decline in congestion and road fatalities. However, increasing automation and connectedness broaden the attack surface and heighten the likelihood of a malicious entity successfully executing an attack. In this paper, we propose a Blockchain based Framework for sEcuring smaRt vehicLes (B-FERL). B-FERL uses permissioned blockchain technology to tailor information access to restricted entities in the connected vehicle ecosystem. It also uses a challenge–response data exchange between the vehicles and roadside units to monitor the internal state of the vehicle and identify cases of in-vehicle network compromise. In order to enable authentic and valid communication in the vehicular network, only vehicles with a verifiable record in the blockchain can exchange messages. Through qualitative arguments, we show that B-FERL is resilient to identified attacks. Also, quantitative evaluations in an emulated scenario show that B-FERL ensures a suitable response time and a required storage size compatible with realistic scenarios. Finally, we demonstrate how B-FERL achieves various functions important to the automotive ecosystem, such as trust management, vehicular forensics and secure vehicular networks.
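
A minimal sketch of a generic HMAC-based challenge–response exchange between a roadside unit and a vehicle, assuming a pre-shared key; it illustrates the general pattern of binding a fresh challenge to the vehicle's internal state, and is not B-FERL's actual protocol or message format.

import hmac, hashlib, secrets

def issue_challenge() -> bytes:
    """Roadside unit: generate a fresh random nonce."""
    return secrets.token_bytes(32)

def vehicle_response(shared_key: bytes, challenge: bytes, firmware_state: bytes) -> bytes:
    """Vehicle: bind its current internal state to the challenge with an HMAC."""
    return hmac.new(shared_key, challenge + firmware_state, hashlib.sha256).digest()

def verify(shared_key: bytes, challenge: bytes, expected_state: bytes, response: bytes) -> bool:
    """Roadside unit: recompute the tag from the expected (known-good) state and compare."""
    expected = hmac.new(shared_key, challenge + expected_state, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

key = secrets.token_bytes(32)
good_state = b"ecu-firmware-hash-v1"
nonce = issue_challenge()
resp = vehicle_response(key, nonce, good_state)
print(verify(key, nonce, good_state, resp))            # True
print(verify(key, nonce, b"tampered-firmware", resp))  # False -> possible in-vehicle compromise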

Journal ArticleDOI
TL;DR: This paper presents a new authentication and encryption protocol based on quantum-inspired quantum walks (QIQW) that can defend against message and impersonation attacks, thus ensuring secure transmission of data among IoT devices.
Abstract: Blockchain plays a vital role in cybersecurity. With the ongoing efforts to realise large-scale quantum computers, most current cryptographic mechanisms may be hacked. Accordingly, we need a quantum tool for designing blockchain frameworks that can be executed on digital computers and can resist probable attacks from both digital and quantum computers. Quantum walks may be utilised as a quantum-inspired model for designing new cryptographic algorithms. In this paper, we present a new authentication and encryption protocol based on quantum-inspired quantum walks (QIQW). The proposed protocol is utilized to build a blockchain framework for secure data transmission among IoT devices. Instead of using classical cryptographic hash functions, quantum hash functions based on QIQW are employed for linking the blocks of the chain. The main advantages of the presented framework are helping IoT nodes to effectively share their data with other nodes and to retain full control of their records. Security analysis demonstrates that our proposed protocol can defend against message and impersonation attacks, thus ensuring secure transmission of data among IoT devices.

Journal ArticleDOI
TL;DR: A novel health misinformation detection model was proposed which incorporated the central-level features and the peripheral-level features (including linguistic features, sentiment features, and user behavioral features) and correctly detected about 85% of the health misinformation.
Abstract: Curbing the diffusion of health misinformation on social media has long been a public concern since the spread of such misinformation can have adverse effects on public health. Previous studies mainly relied on linguistic features and textual features to detect online health-related misinformation. Based on the Elaboration Likelihood Model (ELM), this study proposed that the features of online health misinformation can be classified into two levels: central-level and peripheral-level. In this study, a novel health misinformation detection model was proposed which incorporated the central-level features (including topic features) and the peripheral-level features (including linguistic features, sentiment features, and user behavioral features). In addition, the following behavioral features were introduced to reflect the interaction characteristics of users: Discussion initiation, Interaction engagement, Influential scope, Relational mediation, and Informational independence. Due to the lack of a labeled dataset, we collected the dataset from a real online health community in order to provide a real scenario for data analysis. Four types of misinformation were identified through the coding analysis. The proposed model and its individual features were validated on the real-world dataset. The model correctly detected about 85% of the health misinformation. The results also suggested that behavioral features were more informative than linguistic features in detecting misinformation. The findings not only demonstrated the efficacy of behavioral features in health misinformation detection but also offered both methodological and theoretical contributions to misinformation detection from the perspective of integrating the features of messages as well as the features of message creators.
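
A minimal scikit-learn sketch of the feature-combination idea: central-level (topic) and peripheral-level (linguistic, sentiment) features are concatenated with user behavioral features and fed to one classifier. The feature groups here are random placeholders and the random-forest choice is illustrative, not the authors' model.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_posts = 500

# Hypothetical feature groups per post (stand-ins for the real extracted features)
topic_feats = rng.random((n_posts, 10))       # central-level: topic distribution
linguistic_feats = rng.random((n_posts, 8))   # peripheral-level: e.g. word counts, readability
sentiment_feats = rng.random((n_posts, 3))    # peripheral-level: sentiment scores
behavioral_feats = rng.random((n_posts, 5))   # e.g. discussion initiation, interaction engagement
labels = rng.integers(0, 2, n_posts)          # 1 = misinformation, 0 = not

X = np.hstack([topic_feats, linguistic_feats, sentiment_feats, behavioral_feats])
clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, labels, cv=5, scoring="f1").mean())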

Journal ArticleDOI
TL;DR: A multimodal fake news detection framework based on Crossmodal Attention Residual and Multichannel convolutional neural Networks (CARMN) is proposed and it is demonstrated that the proposed model outperforms the state-of-the-art methods and learns more discriminable feature representations.
Abstract: In recent years, social media has increasingly become one of the popular ways for people to consume news. As the proliferation of fake news on social media has negative impacts on individuals and society, automatic fake news detection has been explored by different research communities to combat fake news. With the development of multimedia technology, a phenomenon that cannot be ignored is that more and more social media news contains information in different modalities, e.g., text, pictures and videos. The multiple information modalities show more evidence of the happening of news events and present new opportunities to detect features in fake news. First, for the multimodal fake news detection task, it is a challenge to keep the unique properties of each modality while fusing the relevant information between different modalities. Second, for some news, the information fusion between different modalities may produce noise that affects the model's performance. Unfortunately, existing methods fail to handle these challenges. To address these problems, we propose a multimodal fake news detection framework based on Crossmodal Attention Residual and Multichannel convolutional neural Networks (CARMN). The Crossmodal Attention Residual Network (CARN) can selectively extract the information related to a target modality from another source modality while maintaining the unique information of the target modality. The Multichannel Convolutional neural Network (MCN) can mitigate the influence of noise that may be generated by the crossmodal fusion component by extracting textual feature representations from the original and fused textual information simultaneously. We conduct extensive experiments on four real-world datasets and demonstrate that the proposed model outperforms the state-of-the-art methods and learns more discriminable feature representations.
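
A minimal PyTorch sketch of a crossmodal attention block with a residual connection, in which textual features attend over image features while the skip connection preserves the text-only signal; dimensions are hypothetical and this is not the CARMN implementation.

import torch
import torch.nn as nn

class CrossmodalAttentionResidual(nn.Module):
    """Text queries attend over image keys/values; a residual keeps the text-only signal."""
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, text_feats, image_feats):
        # text_feats: (batch, n_tokens, d_model); image_feats: (batch, n_regions, d_model)
        fused, _ = self.attn(query=text_feats, key=image_feats, value=image_feats)
        return self.norm(text_feats + fused)    # residual preserves unique text information

block = CrossmodalAttentionResidual()
text = torch.randn(2, 30, 256)
image = torch.randn(2, 49, 256)
print(block(text, image).shape)                 # torch.Size([2, 30, 256])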

Journal ArticleDOI
TL;DR: An explainable natural language processing model based on DistilBERT and SHAP (SHapley Additive exPlanations) is proposed to combat misinformation about COVID-19, chosen for their efficiency and effectiveness, and to boost public trust in model prediction.
Abstract: Misinformation about COVID-19 is prevalent on social media as the pandemic unfolds, and the associated risks are extremely high. Thus, it is critical to detect and combat such misinformation. Recently, deep learning models using natural language processing techniques, such as BERT (Bidirectional Encoder Representations from Transformers), have achieved great success in detecting misinformation. In this paper, we propose an explainable natural language processing model based on DistilBERT and SHAP (SHapley Additive exPlanations), chosen for their efficiency and effectiveness, to combat misinformation about COVID-19. First, we collected a dataset of 984 claims about COVID-19 with fact-checking. By augmenting the data using back-translation, we doubled the sample size of the dataset, and the DistilBERT model was able to obtain good performance (accuracy: 0.972; area under the curve: 0.993) in detecting misinformation about COVID-19. Our model was also tested on a larger dataset from the AAAI2021 COVID-19 Fake News Detection Shared Task and obtained good performance (accuracy: 0.938; area under the curve: 0.985). The performance on both datasets was better than traditional machine learning models. Second, in order to boost public trust in model prediction, we employed SHAP to improve model explainability, which was further evaluated using a between-subjects experiment with three conditions, i.e., text (T), text+SHAP explanation (TSE), and text+SHAP explanation+source and evidence (TSESE). The participants were significantly more likely to trust and share information related to COVID-19 in the TSE and TSESE conditions than in the T condition. Our results provide good implications for detecting misinformation about COVID-19 and improving public trust.
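
A minimal Hugging Face Transformers sketch of scoring a claim with a DistilBERT sequence classifier; the checkpoint shown is the generic base model with an untrained classification head, so it illustrates the plumbing only, not the authors' fine-tuned COVID-19 model, and the SHAP explanation step is omitted.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Generic base checkpoint; the classification head is randomly initialized until fine-tuned.
name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

claim = "Drinking hot water cures COVID-19."
inputs = tokenizer(claim, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)
print(probs)   # [p(real), p(misinformation)] once the head has been fine-tuned on labeled claims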

Journal ArticleDOI
TL;DR: This paper formalizes two novel metrics that quantify how much a recommender system equally treats items along the popularity tail, and proposes an in-processing approach aimed at minimizing the biased correlation between user-item relevance and item popularity.
Abstract: Recommender systems learn from historical users’ feedback that is often non-uniformly distributed across items. As a consequence, these systems may end up suggesting popular items more than niche items progressively, even when the latter would be of interest for users. This can hamper several core qualities of the recommended lists (e.g., novelty, coverage, diversity), impacting on the future success of the underlying platform itself. In this paper, we formalize two novel metrics that quantify how much a recommender system equally treats items along the popularity tail. The first one encourages equal probability of being recommended across items, while the second one encourages true positive rates for items to be equal. We characterize the recommendations of representative algorithms by means of the proposed metrics, and we show that the item probability of being recommended and the item true positive rate are biased against the item popularity. To promote a more equal treatment of items along the popularity tail, we propose an in-processing approach aimed at minimizing the biased correlation between user-item relevance and item popularity. Extensive experiments show that, with small losses in accuracy, our popularity-mitigation approach leads to important gains in beyond-accuracy recommendation quality.
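
A small NumPy sketch of the two intuitions described in the abstract, computed here as the variability of (i) each item's probability of being recommended and (ii) each item's true-positive rate; the exact metric definitions in the paper may differ, so treat this as an illustrative approximation.

import numpy as np

def recommendation_probability(rec_lists, n_items):
    """Fraction of users to whom each item is recommended."""
    counts = np.zeros(n_items)
    for items in rec_lists:
        counts[items] += 1
    return counts / len(rec_lists)

def item_true_positive_rate(rec_lists, relevant_sets, n_items):
    """Per item: recommended-and-relevant occurrences over relevant occurrences."""
    hits, rel = np.zeros(n_items), np.zeros(n_items)
    for items, relevant in zip(rec_lists, relevant_sets):
        for i in relevant:
            rel[i] += 1
            if i in items:
                hits[i] += 1
    return np.divide(hits, rel, out=np.zeros(n_items), where=rel > 0)

# Toy example: 3 users, 5 items, top-2 recommendations each
recs = [np.array([0, 1]), np.array([0, 2]), np.array([0, 1])]
relevant = [{0, 3}, {2, 4}, {1, 3}]
p_rec = recommendation_probability(recs, 5)
tpr = item_true_positive_rate(recs, relevant, 5)
print(p_rec, p_rec.std())   # unequal exposure across items signals popularity bias
print(tpr, tpr.std())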

Journal ArticleDOI
TL;DR: A deep learning framework for a binary classification task that classifies chest X-ray images into normal and pneumonia based on the proposed CGNet, which achieved an accuracy of 0.9872, a sensitivity of 1, and a specificity of 0.9795 on a public pneumonia dataset.
Abstract: Pneumonia is a global disease that causes high child mortality. The situation has been worsened by the outbreak of the new coronavirus named COVID-19, which has killed more than 983,907 people so far. People infected by the virus show symptoms such as fever and coughing, as well as pneumonia as the infection progresses. There is public consensus that timely detection would benefit treatment and therefore help contain the spread of COVID-19. X-ray, an expedient imaging technique, has been widely used for the detection of pneumonia caused by COVID-19 and other viruses. To facilitate the diagnosis of pneumonia, we developed a deep learning framework for a binary classification task that classifies chest X-ray images into normal and pneumonia based on our proposed CGNet. CGNet has three components: feature extraction, graph-based feature reconstruction, and classification. We first use the transfer learning technique to train state-of-the-art convolutional neural networks (CNNs) for binary classification, and the trained CNNs are used to produce features for the following two components. Then, graph-based feature reconstruction is deployed to combine features through the graph and reconstruct them. Finally, a shallow neural network named GNet, a one-layer graph neural network, takes the combined features as input and classifies chest X-ray images into normal and pneumonia. Our model achieved the best accuracy of 0.9872, a sensitivity of 1, and a specificity of 0.9795 on a public pneumonia dataset that includes 5,856 chest X-ray images. To evaluate the performance of our proposed method on the detection of pneumonia caused by COVID-19, we also tested it on a public COVID-19 CT dataset, where we achieved the highest performance with an accuracy of 0.99, a specificity of 1, and a sensitivity of 0.98.
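
A minimal PyTorch/torchvision sketch of the transfer-learning feature-extraction step: penultimate-layer features are pulled from a pretrained CNN and passed to a shallow head. The backbone, feature size, and head are hypothetical stand-ins for CGNet's components, and the graph-based feature reconstruction is omitted.

import torch
import torch.nn as nn
from torchvision import models

# Pretrained backbone used purely as a feature extractor (transfer learning step)
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Identity()           # expose 512-dim penultimate features
backbone.eval()

shallow_head = nn.Linear(512, 2)      # stand-in for the paper's shallow graph network (GNet)

xray_batch = torch.randn(4, 3, 224, 224)    # 4 chest X-rays resized to 224x224, 3 channels
with torch.no_grad():
    feats = backbone(xray_batch)             # (4, 512)
logits = shallow_head(feats)                  # normal vs. pneumonia scores (untrained here)
print(feats.shape, logits.shape)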

Journal ArticleDOI
TL;DR: The empirical results show that health caution and advice misinformation, help-seeking misinformation, and emotional support significantly increase the dissemination of misinformation, indicating both dark and bright sides of misinformation ambiguity and richness.
Abstract: The dissemination of misinformation in health emergencies poses serious threats to public health and increases health anxiety. To understand the underlying mechanism of the dissemination of misinformation regarding health emergencies, this study creatively draws on social support theory and text mining. It explores the roles of different types of misinformation, including health caution and advice misinformation and health help-seeking misinformation, and of emotional support, in affecting individuals' misinformation dissemination behavior on social media, and whether such relationships are contingent on misinformation ambiguity and richness. The theoretical model is tested using 12,101 textual posts about COVID-19 collected from Sina Weibo, a leading social media platform in China. The empirical results show that health caution and advice misinformation, health help-seeking misinformation, and emotional support significantly increase the dissemination of misinformation. Furthermore, when the level of ambiguity and richness regarding misinformation is high, the effect of health caution and advice misinformation is strengthened, whereas the effects of health help-seeking misinformation and emotional support are weakened, indicating both dark and bright sides of misinformation ambiguity and richness. This study contributes to the literature on misinformation dissemination behavior on social media during health emergencies and to social support theory, and provides implications for practice.

Journal ArticleDOI
TL;DR: An Ant Colony Optimization (ACO) algorithm in a Fog-enabled Blockchain-assisted scheduling model, namely PF-BTS is proposed, which allows the fog to process, manage, and perform the tasks to enhance latency measures and shows high privacy awareness and noticeable enhancement in execution time and network load.
Abstract: In recent years, the deployment of Cloud Computing (CC) has become more popular in both research and industry applications, arising from various fields including e-health, manufacturing, logistics and social networking. This is due to the easiness of service deployment and data management, and the unlimited provision of virtual resources (VR). In simple scenarios, users/applications send computational or storage tasks to be executed in the cloud, by manually assigning those tasks to the available computational resources. In complex scenarios, such as smart city applications, where there is a large number of tasks, VRs, or both, task scheduling is exposed as an NP-hard problem. Consequently, it is preferable and more efficient, in terms of time and effort, to use a task scheduling automation technique. As there are many automated scheduling solutions proposed, new possibilities arise with the advent of Fog Computing (FC) and Blockchain (BC) technologies. Accordingly, such automation techniques may help the quick, secure and efficient assignment of tasks to the available VRs. In this paper, we propose an Ant Colony Optimization (ACO) algorithm in a Fog-enabled Blockchain-assisted scheduling model, namely PF-BTS. The protocol and algorithms of PF-BTS exploit BC miners for generating efficient assignments of tasks to be performed in the cloud's VRs using ACO, and reward miner nodes for their contribution in generating the best schedule. In our proposal, PF-BTS further allows the fog to process, manage, and perform tasks to enhance latency measures. While this processing and managing takes place, the fog is required to respect the privacy of system components and assure that data, location, identity, and usage information are not exposed. We evaluate and compare the performance of PF-BTS with a recently proposed blockchain-based task scheduling protocol in a simulated environment. Our evaluation and experiments show the high privacy awareness of PF-BTS, along with a noticeable enhancement in execution time and network load.
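
A compact, self-contained sketch of ant colony optimization for assigning tasks to virtual resources while minimizing makespan; the pheromone update rule, parameters, and cost model are generic textbook choices, not the PF-BTS algorithms, and the blockchain/fog layers are omitted.

import random

def aco_schedule(exec_time, n_ants=20, n_iters=50, alpha=1.0, beta=2.0, rho=0.5, seed=0):
    """exec_time[t][v] = runtime of task t on virtual resource v; returns (assignment, makespan)."""
    rng = random.Random(seed)
    n_tasks, n_vrs = len(exec_time), len(exec_time[0])
    tau = [[1.0] * n_vrs for _ in range(n_tasks)]            # pheromone trails
    best_assign, best_makespan = None, float("inf")

    for _ in range(n_iters):
        for _ in range(n_ants):
            loads, assign = [0.0] * n_vrs, []
            for t in range(n_tasks):
                # desirability = pheromone^alpha * (1 / expected finish time)^beta
                weights = [tau[t][v] ** alpha * (1.0 / (loads[v] + exec_time[t][v])) ** beta
                           for v in range(n_vrs)]
                v = rng.choices(range(n_vrs), weights=weights)[0]
                assign.append(v)
                loads[v] += exec_time[t][v]
            makespan = max(loads)
            if makespan < best_makespan:
                best_assign, best_makespan = assign, makespan
        # evaporate all trails, then deposit pheromone along the best schedule found so far
        tau = [[(1.0 - rho) * p for p in row] for row in tau]
        for t, v in enumerate(best_assign):
            tau[t][v] += 1.0 / best_makespan

    return best_assign, best_makespan

gen = random.Random(42)
exec_times = [[gen.uniform(1.0, 10.0) for _ in range(3)] for _ in range(8)]   # 8 tasks, 3 VRs
print(aco_schedule(exec_times))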

Journal ArticleDOI
TL;DR: In this article, the authors used a super-efficiency SBM model to construct the relative effective frontier, and then machine learning algorithms to construct a regression model and establish the absolute effective frontier.
Abstract: The traditional data envelopment analysis (DEA) method used for performance evaluation has inherent problems, such as being easily affected by statistical noise in the data. Furthermore, when new evaluation units are added, the performance of all the original units must be re-measured, which restricts evaluation efficiency. In this study, machine learning algorithms were applied to make up for the shortcomings of the data envelopment analysis method. First, a super-efficiency SBM model was used to construct the relative effective frontier, and then machine learning algorithms were used to construct a regression model and establish the absolute effective frontier. After 15 machine learning algorithms were compared, BPNN demonstrated the best performance, and a SuperSBM-DEA-BPNN model was eventually established. The new model has the following advantages: first, compared with the traditional data envelopment analysis method, the absolute effective frontier provides better evaluation; second, compared with the data envelopment analysis and neural network fusion outlined in the previous literature, the new model better overcomes the problems associated with data envelopment analysis, thereby improving the fusion efficiency. Taking the innovation efficiency evaluation of China's regional rural commercial banks as an example, the new model is shown to be more applicable and to offer more effective management tools for improving efficiency. On the whole, the new model not only provides a stable performance evaluation tool but also facilitates comparison, which is of practical significance for organizations.
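
A minimal scikit-learn sketch of the second stage described: fitting a back-propagation neural network (an MLP regressor) that maps input/output indicators to efficiency scores previously produced by a super-efficiency SBM-DEA model; the DEA scores here are random placeholders because the DEA step itself is not shown.

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n_units = 200
indicators = rng.random((n_units, 6))     # inputs/outputs of each decision-making unit
dea_scores = rng.random(n_units)          # placeholder for super-efficiency SBM scores

X_tr, X_te, y_tr, y_te = train_test_split(indicators, dea_scores, test_size=0.25, random_state=0)
bpnn = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0)
bpnn.fit(X_tr, y_tr)
print(r2_score(y_te, bpnn.predict(X_te)))

# New units can now be scored directly, without re-running DEA on the whole sample
new_unit = rng.random((1, 6))
print(bpnn.predict(new_unit))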

Journal ArticleDOI
TL;DR: A notion of fairness based on the performance gap of a RS between the users with different demographics is defined, and a variety of collaborative filtering algorithms are evaluated in terms of accuracy and beyond-accuracy metrics to explore the fairness in the RS results toward a specific gender group.
Abstract: Although recommender systems (RSs) play a crucial role in our society, previous studies have revealed that the performance of RSs may considerably differ between groups of individuals with different characteristics or from different demographics. In this case, a RS is considered to be unfair when it does not perform equally well for different groups of users. Considering the importance of RSs in the distribution and consumption of musical content worldwide, a careful evaluation of fairness in the context of music RSs is crucial. To this end, we first introduce LFM-2b, a novel large-scale real-world dataset of music listening records, comprising a subset to investigate bias of RSs regarding users' demographics. We then define a notion of fairness based on the performance gap of a RS between the users with different demographics, and evaluate a variety of collaborative filtering algorithms in terms of accuracy and beyond-accuracy metrics to explore the fairness in the RS results toward a specific gender group. We observe the existence of significant discrepancies (unfairness) between the performance of algorithms across male and female user groups. Based on these discrepancies, we explore to what extent recommender algorithms lead to intensifying the underlying population bias in the final results. We also study the effect of a resampling strategy, commonly used as debiasing method, which yields slight improvements in the fairness measures of various algorithms while maintaining their accuracy and beyond-accuracy performance.
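
A small NumPy sketch of the fairness notion described, the gap in a recommendation quality metric (NDCG@k here) between user groups; the group labels and relevance data are synthetic, and the choice of NDCG is illustrative rather than the paper's exact metric set.

import numpy as np

def ndcg_at_k(relevances, k=10):
    """relevances: binary relevance of the top-k recommended items, in rank order."""
    rel = np.asarray(relevances, dtype=float)[:k]
    gains = rel / np.log2(np.arange(2, rel.size + 2))
    ideal = np.sort(rel)[::-1] / np.log2(np.arange(2, rel.size + 2))
    return gains.sum() / ideal.sum() if ideal.sum() > 0 else 0.0

def group_performance_gap(per_user_ndcg, groups):
    per_user_ndcg, groups = np.asarray(per_user_ndcg), np.asarray(groups)
    means = {g: per_user_ndcg[groups == g].mean() for g in np.unique(groups)}
    gap = max(means.values()) - min(means.values())
    return means, gap

# Synthetic example: 6 users, binary relevance of their top-5 lists
lists = [[1, 0, 1, 0, 0], [1, 1, 0, 0, 0], [0, 1, 0, 0, 1],
         [0, 0, 1, 0, 0], [0, 0, 0, 1, 0], [1, 0, 0, 0, 0]]
genders = ["m", "m", "m", "f", "f", "f"]
scores = [ndcg_at_k(l, k=5) for l in lists]
print(group_performance_gap(scores, genders))   # a large gap indicates unfairness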

Journal ArticleDOI
TL;DR: A comprehensive comparative study of the most effective approaches used for Arabic sentiment analysis, which re-implements most of the existing approaches and tests their effectiveness on three of the most popular benchmark datasets for Arabic SA.
Abstract: Sentiment analysis (SA) is a natural language processing (NLP) application that aims to analyse and identify sentiment within a piece of text. Arabic SA started to receive more attention in the last decade, with many approaches showing some effectiveness for detecting sentiment on multiple datasets. While there have been some surveys summarising some of the approaches for Arabic SA in the literature, most of these approaches are reported on different datasets, which makes it difficult to identify the most effective among them. In addition, those approaches do not cover the recent advances in NLP that use transformers. This paper presents a comprehensive comparative study of the most effective approaches used for Arabic sentiment analysis. We re-implement most of the existing approaches for Arabic SA and test their effectiveness on three of the most popular benchmark datasets for Arabic SA. Further, we examine the use of transformer-based language models for Arabic SA and show their superior performance compared to the existing approaches, where the best model achieves F-scores of 0.69, 0.76, and 0.92 on the SemEval, ASTD, and ArSAS benchmark datasets. We also provide an extensive analysis of the possible reasons for failures, which shows the limitations of the existing annotated Arabic SA datasets and the challenge of sarcasm, which is prominent in Arabic dialects. Finally, we highlight the main gaps in Arabic sentiment analysis research and suggest the most in-need future research directions in this area.

Journal ArticleDOI
TL;DR: In this article, the authors proposed a new optimized Machine Learning (ML) algorithm called the Local Search Improvised Bat Algorithm based Elman Neural Network (LSIBA-ENN) for the sentiment analysis of online product reviews.
Abstract: Recently, online shopping has become a mainstream way for users to purchase and consume with the rapid development of Internet technology. User satisfaction can be improved effectively by performing Sentiment Analysis (SA) on the large quantity of user reviews on e-commerce platforms. It is still challenging to predict the accurate sentiment polarities of user reviews because of variations in sequence length, textual order, and complicated logic. This paper proposes a new optimized Machine Learning (ML) algorithm called the Local Search Improvised Bat Algorithm based Elman Neural Network (LSIBA-ENN) for the SA of online product reviews. The proposed SA work encompasses four major steps: i) data collection, ii) preprocessing, iii) feature extraction or term weighting together with feature selection, and iv) polarity or sentiment classification. Initially, a web scraping tool is utilized to extract customer reviews of products from e-commerce websites. Next, preprocessing is carried out on the scraped data. The preprocessed data then undergo term weighting and feature selection by means of Log Term Frequency-based Modified Inverse Class Frequency (LTF-MICF) and the Hybrid Mutation based Earth Worm Algorithm (HMEWA). Lastly, the HMEWA output is fed to the LSIBA-ENN, which classifies the sentiment of customer reviews as positive, negative, or neutral. For the performance analysis of the proposed and prevailing classifiers, two benchmark datasets are used. The outcomes show that the LSIBA-ENN attains the best performance in sentiment classification when weighed against existing top-notch algorithms. The observations of the reviewer are exact. The existing ENN achieves a recall of 87.79 when utilizing the proposed LTF-MICF scheme, whereas it achieves recall of only 83.55, 84.03, 85.48, and 86.04 when utilizing the W2V, TF, TF-IDF, and TF-DFS schemes, respectively.

Journal ArticleDOI
TL;DR: In this article, a modular hybrid privacy-preserving framework leveraging off-chain and on-chain blockchain system design is applied to three different reference models that illustrate how blockchain can enhance healthcare information management.
Abstract: In the context of blockchain technology, "off-chain" refers to computation or data that is structurally external to the blockchain network. Off-Chain Blockchain Systems (OCBS) enable this information processing and management through distributed software architecture where the blockchain network interacts with off-chain resources. Hence, OCBS are a critical data governance component in the design of enterprise blockchain solutions, resulting in extensive research and development exploring the interplay between on-chain and off-chain storage and computation and efforts to evaluate their performance relative to other information management systems. Key features of OCBS are their ability to improve scalability, reduce data storage requirements, and enhance data privacy, all extremely critical issues for enabling broader blockchain adoption. These OCBS features map well to the needs of the healthcare industry, particularly due to the need to manage various types of medical, consumer, and other health-related data. However, different types of health data are also subject to stringent regulatory, security and legal requirements, a key factor limiting blockchain adoption in the sector. In response, there is a critical need to better align OCBS design features to different types of healthcare data management and their respective governance and privacy regimes. This article first reviews the characteristics of different constructs of OCBS. It then proposes a modular hybrid privacy-preserving framework leveraging off-chain and on-chain blockchain system design, applied to three different reference models that illustrate how blockchain can enhance healthcare information management. Through this privacy-preserving framework we hope to liberate healthcare data by enabling sharing, sovereignty and enhanced trust.

Journal ArticleDOI
TL;DR: Propagation2Vec is proposed, a novel fake news early detection technique which assigns varying levels of importance to the nodes and cascades in propagation networks and reconstructs the knowledge of complete propagation networks based on their partial propagation networks at an early detection deadline.
Abstract: Many recent studies have demonstrated that the propagation patterns of news on social media can facilitate the detection of fake news. Most of these studies rely on complete propagation networks to build their models, which are not fully available in the early stages and may take a long time to complete. Hence, relying on the complete propagation network is not ideal for fake news early detection. However, detecting fake news as early as possible is important due to its fast-spreading nature and the significant harm it can cause. In addition, most existing propagation network-based fake news detection techniques are not explicitly designed to jointly emphasise informative cascades and nodes in the propagation networks to detect fake news. To bridge these research gaps, this work proposes Propagation2Vec, a novel fake news early detection technique which assigns varying levels of importance to the nodes and cascades in propagation networks and reconstructs the knowledge of complete propagation networks based on their partial propagation networks at an early detection deadline. Our experiments show that our model can achieve state-of-the-art performance while only having access to the early-stage propagation networks. Furthermore, we devise general explanations for the underlying logic of Propagation2Vec based on the attention weights it assigns to different nodes and cascades, which improves the applicability of our approach and facilitates future research on propagation network-based fake news detection.

Journal ArticleDOI
TL;DR: The LGBMRegressor has the best goodness-of-fit among the compared regression models, and decision making suggestions are proposed for rumor refutation platforms on how to organize rumor refutation microblogs under different situations such as rumor category, author's influence and heat of topics.
Abstract: Motivated by the practical need to enhance social media rumor refutation effectiveness, this paper is dedicated to developing a proper rumor refutation effectiveness index (REI), identifying key factors influencing REI, and proposing decision making suggestions for rumor refutation platforms. 298,118 comments and 185,209 records of reposters' verification status from 248 rumor refutation microblogs on Sina Weibo (the Chinese equivalent of Twitter) were collected during a 1-year period using a web crawler. To extract the text characteristics and analyze the sentiment of the rumor refutation microblogs, Natural Language Processing (NLP) approaches are applied. To explore the relationship between REI and the content and contextual factors of the rumor refutation microblogs, four regression models based on the collected data are established, namely a linear regression model, a Support Vector Regression model (SVR), an Extreme Gradient Boosting regression model (XGBoostRegressor) and a Light Gradient Boosting Machine regression model (LGBMRegressor). The LGBMRegressor has the best goodness-of-fit among the compared regression models. Then, SHapley Additive exPlanations (SHAP) is employed to visualize and explain the LGBMRegressor results. Decision making suggestions are proposed for rumor refutation platforms on how to organize rumor refutation microblogs under different situations such as rumor category, author's influence and heat of topics.
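
A minimal sketch of the modeling step described, regressing a refutation-effectiveness score on content and contextual features with LightGBM and explaining it with SHAP's tree explainer; the features and data here are synthetic placeholders, not the Weibo dataset or the paper's feature set.

import numpy as np
import lightgbm as lgb
import shap

rng = np.random.default_rng(0)
n_posts = 400
feature_names = ["text_length", "sentiment", "author_followers", "topic_heat"]
X = rng.random((n_posts, len(feature_names)))
rei = 0.6 * X[:, 2] + 0.3 * X[:, 3] + 0.1 * rng.random(n_posts)   # synthetic effectiveness index

model = lgb.LGBMRegressor(n_estimators=300, learning_rate=0.05, random_state=0)
model.fit(X, rei, feature_name=feature_names)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
print(dict(zip(feature_names, np.abs(shap_values).mean(axis=0))))  # global feature importance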

Journal ArticleDOI
TL;DR: In this article, a systematic literature search following the PRISMA guideline covering 12 scholarly databases was conducted to retrieve various types of peer-reviewed articles that reported causes, impacts, or countermeasures of the COVID-19 infodemic.
Abstract: An unprecedented infodemic has caused massive damage to human society. However, it has not been thoroughly investigated. This systematic review aims to (1) synthesize the existing literature on the causes and impacts of the COVID-19 infodemic; (2) summarize the proposed strategies for fighting the COVID-19 infodemic; and (3) identify directions for future research. A systematic literature search following the PRISMA guideline, covering 12 scholarly databases, was conducted to retrieve various types of peer-reviewed articles that reported causes, impacts, or countermeasures of the infodemic. Empirical studies were assessed for risk of bias using the Mixed Methods Appraisal Tool. A coding theme was iteratively developed to categorize the causes, impacts, and countermeasures found in the included studies. Social media usage, low levels of health/eHealth literacy, and fast publication processes and preprint services are identified as major causes of the infodemic. Besides, a vicious circle of human rumor-spreading behavior and psychological issues among the public (e.g., anxiety, distress, fear) emerges as a characteristic of the infodemic. Comprehensive lists of countermeasures are summarized from different perspectives, among which risk communication and consumer health information needs/seeking are of particular importance. Theoretical and practical implications are discussed and future research directions are suggested.

Journal ArticleDOI
TL;DR: Convolutional Neural Networks (CNN) with margin loss and different embedding models are proposed for detecting fake news, and the proposed architectures are evaluated on two recent well-known datasets in the field, namely ISOT and LIAR.
Abstract: The advent of online news platforms such as social media, news blogs, and online newspapers in recent years, and features such as swift information flow, easy access, and low cost, encourage people to seek and acquire information by consuming the news these platforms provide. Furthermore, these platforms increase the opportunities for deceptive parties to influence public opinion and awareness by producing fake news, i.e., news that consists of false and deceptive information and is published to achieve specific political and economic gains. Since it is very difficult for individuals to discern fake news from its content alone, an automatic fake news detection approach is essential for preventing the spread of such false information. In this paper, Convolutional Neural Networks (CNN) with margin loss and different embedding models are proposed for detecting fake news. We compare static word embeddings with non-static embeddings, which offer the possibility of incrementally up-training and updating the word embeddings during the training phase. Our proposed architectures are evaluated on two recent well-known datasets in the field, namely ISOT and LIAR. Our results on the best architecture show encouraging performance, outperforming the state-of-the-art methods by 7.9% on ISOT and 2.1% on the test set of the LIAR dataset.
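
A minimal PyTorch sketch of a text CNN trained with a multi-class margin (hinge) loss rather than cross-entropy; the architecture and hyperparameters are hypothetical, and the embedding here is trained from scratch, standing in for the static or non-static embeddings compared in the paper.

import torch
import torch.nn as nn

class TextCNN(nn.Module):
    def __init__(self, vocab_size=20000, emb_dim=100, n_filters=64, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, n_filters, k) for k in (3, 4, 5)])
        self.fc = nn.Linear(3 * n_filters, n_classes)

    def forward(self, token_ids):                        # (batch, seq_len)
        x = self.emb(token_ids).transpose(1, 2)           # (batch, emb_dim, seq_len)
        pooled = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.fc(torch.cat(pooled, dim=1))

model = TextCNN()
margin_loss = nn.MultiMarginLoss(margin=1.0)              # hinge-style margin loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

tokens = torch.randint(1, 20000, (8, 60))                 # 8 articles, 60 tokens each
labels = torch.randint(0, 2, (8,))                         # 1 = fake, 0 = real
loss = margin_loss(model(tokens), labels)
loss.backward()
optimizer.step()
print(float(loss))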

Journal ArticleDOI
TL;DR: A model that learns joint convolutional representations from a nearest neighbor and a furthest neighbor graph is developed to establish a novel accuracy-diversity trade-off for recommender systems, showing diversity gains of up to seven times by trading as little as 1% in accuracy.
Abstract: Graph convolutions, in both their linear and neural network forms, have reached state-of-the-art accuracy on recommender system (RecSys) benchmarks. However, recommendation accuracy is tied with diversity in a delicate trade-off and the potential of graph convolutions to improve the latter is unexplored. Here, we develop a model that learns joint convolutional representations from a nearest neighbor and a furthest neighbor graph to establish a novel accuracy-diversity trade-off for recommender systems. The nearest neighbor graph connects entities (users or items) based on their similarities and is responsible for improving accuracy, while the furthest neighbor graph connects entities based on their dissimilarities and is responsible for diversifying recommendations. The information between the two convolutional modules is balanced already in the training phase through a regularizer inspired by multi-kernel learning. We evaluate the joint convolutional model on three benchmark datasets with different degrees of sparsity. The proposed method can either trade accuracy to improve substantially the catalog coverage or the diversity within the list; or improve both by a lesser amount. Compared with accuracy-oriented graph convolutional approaches, the proposed model shows diversity gains up to seven times by trading as little as 1% in accuracy. Compared with alternative accuracy-diversity trade-off solutions, the joint graph convolutional model retains the highest accuracy while offering a handle to increase diversity. To our knowledge, this is the first work proposing an accuracy-diversity trade-off with graph convolutions and opens the doors to learning over graphs approaches for improving such trade-off.
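
A small NumPy sketch of the core idea: propagating user-item scores over a nearest-neighbor item graph (accuracy-oriented) and a furthest-neighbor item graph (diversity-oriented) and blending the two. The graph construction, normalization, and blending weight are simplified placeholders, not the paper's trained model.

import numpy as np

def knn_graphs(item_sim, k=2):
    """Return row-normalized nearest- and furthest-neighbor adjacency matrices."""
    n = item_sim.shape[0]
    near, far = np.zeros((n, n)), np.zeros((n, n))
    for i in range(n):
        order = np.argsort(item_sim[i])
        far_idx = [j for j in order if j != i][:k]          # least similar items
        near_idx = [j for j in order[::-1] if j != i][:k]   # most similar items
        near[i, near_idx] = 1.0
        far[i, far_idx] = 1.0
    return near / near.sum(1, keepdims=True), far / far.sum(1, keepdims=True)

rng = np.random.default_rng(0)
ratings = rng.integers(0, 2, size=(5, 8)).astype(float)     # 5 users x 8 items
norms = np.linalg.norm(ratings, axis=0, keepdims=True) + 1e-9
item_sim = (ratings.T @ ratings) / (norms.T @ norms)         # simple cosine item similarity
A_near, A_far = knn_graphs(item_sim, k=2)

lam = 0.7                                                     # accuracy vs. diversity balance
scores = lam * ratings @ A_near.T + (1 - lam) * ratings @ A_far.T
print(np.argsort(-scores, axis=1)[:, :3])                     # top-3 blended recommendations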

Journal ArticleDOI
TL;DR: A novel ensemble embedding method is developed to generate semantic and contextual representations of the words in review sentences; the resulting representations are then used in a long short-term memory (LSTM) model to identify sentences that contain innovation ideas in online reviews.
Abstract: The importance of online customer reviews to product innovation has been well recognized in prior literature. Mining online reviews has received extensive attention and effort. Most existing research on mining online reviews focuses on issues such as the impact of reviews on sales, the helpfulness of reviews, and customers' participation in reviews. Few research studies, however, seek to identify and extract innovation ideas for products from online reviews. This type of information is particularly important for product functionality improvement and new feature development from a manufacturer's perspective. Mining product innovation ideas allows a manufacturer to proactively review customer opinion and unlock insights about new functionality and features that the market expects, in order to gain a competitive advantage. In this paper, we propose a deep learning-based approach to identify sentences that contain innovation ideas in online reviews. Specifically, we develop a novel ensemble embedding method to generate semantic and contextual representations of the words in review sentences. The resultant representations in each sentence are then used in a long short-term memory (LSTM) model for innovation-sentence identification. Moreover, we adopt a focal loss function in our model to address the class imbalance problem. We validate our approach with a dataset of 10,000 customer reviews from Amazon. Our model achieves an AUC score of 0.91 and an F1 score of 0.89, outperforming a set of state-of-the-art baseline models in the comparison. Our approach can be extended and applied to many other information extraction tasks.