
Showing papers in "Communications in Computer and Information Science" in 2023


Book ChapterDOI
TL;DR: In this article, an efficient and scalable vulnerability detection method based on a deep neural network model, Long Short-Term Memory (LSTM), and its quantum counterpart, Quantum Long Short-Term Memory (QLSTM), is presented.
Abstract: One of the most important challenges in the field of software code audit is the presence of vulnerabilities in software source code. Every year, more and more software flaws are found, either internally in proprietary code or revealed publicly. These flaws are highly likely to be exploited and can lead to system compromise, data leakage, or denial of service. We use open-source C and C++ code to create a large-scale classical and quantum machine-learning system for function-level vulnerability identification. We assembled a sizable dataset of millions of open-source functions that point to potential exploits. We created an efficient and scalable vulnerability detection method based on a deep neural network model, Long Short-Term Memory (LSTM), and its quantum machine-learning counterpart, Quantum Long Short-Term Memory (QLSTM), which can learn features extracted from the source code. The source code is first converted into a minimal intermediate representation to remove irrelevant components and shorten dependencies. Previous studies did not analyze the semantic and syntactic features of the source code, which prevents models from recognizing flaws in real-life examples. We therefore preserve semantic and syntactic information using state-of-the-art word embedding algorithms such as GloVe and fastText. The embedded vectors are subsequently fed into the classical and quantum convolutional neural networks to classify the possible vulnerabilities. To measure performance, we used evaluation metrics such as F1 score, precision, recall, accuracy, and total execution time. We compared the results of the classical LSTM and the quantum LSTM using basic feature representation as well as semantic and syntactic representation. We found that the QLSTM with semantic and syntactic features detects vulnerabilities significantly more accurately and runs faster than its classical counterpart.
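As a concrete illustration of the classification stage described above, the following is a minimal sketch (not the paper's exact architecture) of an LSTM classifier over token sequences of source code, with a pre-trained embedding matrix standing in for GloVe or fastText vectors; `vocab_size`, `max_len`, and the random `embedding_matrix` are hypothetical placeholders.

```python
import numpy as np
from tensorflow.keras import layers, models, metrics

vocab_size, embed_dim, max_len = 10000, 100, 300
embedding_matrix = np.random.rand(vocab_size, embed_dim)  # stand-in for GloVe/fastText vectors

model = models.Sequential([
    # frozen pre-trained embeddings preserve semantic/syntactic information
    layers.Embedding(vocab_size, embed_dim, weights=[embedding_matrix],
                     trainable=False, input_length=max_len),
    layers.LSTM(128),                       # sequence encoder over code tokens
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # vulnerable vs. non-vulnerable
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy", metrics.Precision(), metrics.Recall()])
```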

3 citations


Book ChapterDOI
TL;DR: The authors explored the ability of large language models, specifically GPT-3, to write explanations for middle-school mathematics problems, with the goal of eventually using this process to rapidly generate explanations for the mathematics problems of new curricula as they emerge.
Abstract: Large language models have recently been able to perform well in a wide variety of circumstances. In this work, we explore the ability of large language models, specifically GPT-3, to write explanations for middle-school mathematics problems, with the goal of eventually using this process to rapidly generate explanations for the mathematics problems of new curricula as they emerge, shortening the time to integrate new curricula into online learning platforms. Two approaches were taken to generate explanations. The first attempted to summarize the salient advice in tutoring chat logs between students and live tutors. The second attempted to generate explanations using few-shot learning from explanations written by teachers for similar mathematics problems. After the explanations were generated, a survey was used to compare their quality to that of explanations written by teachers. Ultimately, the synthetic explanations were unable to outperform teacher-written explanations. In the future, more powerful large language models may be employed, and GPT-3 may still be effective as a tool to augment teachers' process for writing explanations rather than as a tool to replace them. The explanations, survey results, analysis code, and a dataset of tutoring chat logs are all available at https://osf.io/wh5n9/ .
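For reference, few-shot prompting of the kind described can be sketched against the legacy GPT-3 Completions API (openai<1.0); the example problems, explanations, and engine name below are illustrative assumptions, not the authors' actual prompts.

```python
import openai

openai.api_key = "sk-..."  # your API key

# One worked example conditions the model; the final problem is left open.
prompt = (
    "Problem: What is 3/4 + 1/8?\n"
    "Explanation: Rewrite 3/4 as 6/8 so both fractions share a denominator; "
    "then 6/8 + 1/8 = 7/8.\n\n"
    "Problem: Solve 2x + 6 = 14.\n"
    "Explanation:"
)

response = openai.Completion.create(
    engine="text-davinci-002",  # a GPT-3-era engine
    prompt=prompt,
    max_tokens=150,
    temperature=0.7,
)
print(response["choices"][0]["text"].strip())
```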

2 citations


Book ChapterDOI
TL;DR: In this paper, an AI-based, machine-learning method for autograding online tutor lessons is presented, using a subset of learnersourced, human-graded tutor responses from the lessons and a surrogate model built with the recently released AI chatbot, ChatGPT.
Abstract: Machine learning and artificial intelligence (AI) are ubiquitous, although their accessibility and application are often misunderstood and obscure. Automatic short answer grading (ASAG), leveraging natural language processing (NLP) and machine learning, has received notable attention as a method of providing instantaneous, corrective feedback to learners without the time and energy demands placed on human graders. However, ASAG systems are only as valid as the reference answers, or training sets, they are compared against. We introduce an AI-based, machine-learning method of autograding online tutor lessons that is easily accessible and user-friendly. We present two methods of training-set creation: using a subset of learnersourced, human-graded tutor responses from the lessons, and using a surrogate model built with the recently released AI chatbot, ChatGPT. Findings indicate that human-created training sets perform considerably better than AI-generated training sets (F1 = 0.84 and 0.67, respectively). Our straightforward approach, although not accurate enough for wide use, demonstrates the application of readily available machine-learning-based NLP methods and highlights a constructive use of ChatGPT for pedagogical purposes that is not without limitations.
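A minimal ASAG baseline in this spirit (an illustrative sketch, not the authors' exact pipeline) pairs TF-IDF features with logistic regression and reports F1; the `responses` and `labels` below are synthetic stand-ins for graded tutor responses.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# synthetic stand-ins for human- or ChatGPT-graded tutor responses
responses = ["Good work, you explained each step of your reasoning.",
             "Wrong."] * 50
labels = [1, 0] * 50  # 1 = acceptable response, 0 = not

X_train, X_test, y_train, y_test = train_test_split(
    responses, labels, stratify=labels, random_state=0)

grader = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                       LogisticRegression(max_iter=1000))
grader.fit(X_train, y_train)
print("F1:", f1_score(y_test, grader.predict(X_test)))
```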

2 citations


Journal ArticleDOI
TL;DR: In this article, hate speech detection from multimodal Bengali memes and texts is presented. The authors find that the text modality is most useful for hate speech detection, while memes are moderately useful.
Abstract: Numerous machine learning (ML) and deep learning (DL) based approaches have been proposed to utilize textual data from social media for anti-social behavior analysis such as cyberbullying, fake news detection, and identification of hate speech, mainly for highly-resourced languages such as English. However, despite having a lot of diversity and millions of native speakers, some languages like Bengali are under-resourced owing to a lack of computational resources for natural language processing (NLP). Like other languages, Bengali social media content also includes images along with texts (e.g., multimodal memes are posted by embedding short texts into images on Facebook). Therefore, the textual data alone is not enough to judge them, since images might give the extra context needed to make a proper judgement. This paper addresses hate speech detection from multimodal Bengali memes and texts. We prepared the first multimodal hate speech dataset of its kind for Bengali, which we use to train state-of-the-art neural architectures (e.g., Bi-LSTM/Conv-LSTM with word embeddings, ConvNets + pre-trained language models such as monolingual Bangla BERT, multilingual BERT-cased/uncased, and XLM-RoBERTa) to jointly analyze textual and visual information for hate speech detection. The Conv-LSTM and XLM-RoBERTa models performed best for texts, yielding F1 scores of 0.78 and 0.82, respectively. For memes, the ResNet-152 and DenseNet-161 models yielded F1 scores of 0.78 and 0.79, respectively. For multimodal fusion, XLM-RoBERTa + DenseNet-161 performed best, yielding an F1 score of 0.83. Our study suggests that the text modality is most useful for hate speech detection, while memes are moderately useful.
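The multimodal fusion step can be pictured as a late-fusion classifier over precomputed embeddings; the sketch below is a hedged illustration (the feature dimensions, e.g. 768 for XLM-RoBERTa and 2208 for DenseNet-161, are typical values, not the paper's exact configuration).

```python
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    def __init__(self, text_dim=768, image_dim=2208, hidden=256, n_classes=2):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(text_dim + image_dim, hidden),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, text_feat, image_feat):
        fused = torch.cat([text_feat, image_feat], dim=-1)  # late fusion
        return self.head(fused)

model = FusionClassifier()
logits = model(torch.randn(4, 768), torch.randn(4, 2208))  # dummy batch
```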

2 citations


Journal ArticleDOI
TL;DR: In this paper, the authors make use of the existing theory of partially observable Markov decision processes (POMDPs) to define what it means for a system to have beliefs and goals.
Abstract: Under what circumstances can a system be said to have beliefs and goals, and how do such agency-related features relate to its physical state? Recent work has proposed a notion of interpretation map, a function that maps the state of a system to a probability distribution representing its beliefs about an external world. Such a map is not completely arbitrary, as the beliefs it attributes to the system must evolve over time in a manner that is consistent with Bayes’ theorem, and consequently the dynamics of a system constrain its possible interpretations. Here we build on this approach, proposing a notion of interpretation not just in terms of beliefs but in terms of goals and actions. To do this we make use of the existing theory of partially observable Markov decision processes (POMDPs): we say that a system can be interpreted as a solution to a POMDP if it not only admits an interpretation map describing its beliefs about the hidden state of a POMDP but also takes actions that are optimal according to its belief state. An agent is then a system together with an interpretation of this system as a POMDP solution. Although POMDPs are not the only possible formulation of what it means to have a goal, this nevertheless represents a step towards a more general formal definition of what it means for a system to be an agent.
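For readers unfamiliar with the formalism, a POMDP is the tuple (S, A, O, T, Ω, R) of states, actions, observations, transition and observation models, and reward; the Bayes-consistent belief update that an interpretation map must respect is (standard notation, not verbatim from the paper):

```latex
b'(s') = \frac{\Omega(o \mid s', a) \sum_{s \in S} T(s' \mid s, a)\, b(s)}{\Pr(o \mid b, a)},
\qquad
\Pr(o \mid b, a) = \sum_{s' \in S} \Omega(o \mid s', a) \sum_{s \in S} T(s' \mid s, a)\, b(s).
```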

2 citations


Book ChapterDOI
TL;DR: In this article, the authors conducted a questionnaire survey of employees at major Japanese IT companies based on Herzberg's theory of motivation, and analyzed the correlation between the work environment and productivity awareness before and after the COVID-19 pandemic.
Abstract: The purpose of this paper is to focus on major Japanese IT companies, whose usage of telework has increased rapidly due to the COVID-19 disaster, and to consider issues and measures for improving productivity and performance through telework. We conducted a questionnaire survey of employees at major IT companies based on Herzberg's theory of motivation, and we analyzed the correlation between the work environment and productivity awareness before and after the COVID-19 pandemic. As a result, we confirmed that major Japanese IT companies have not been able to actively utilize telework or adapt the working environment and corporate culture [18] in ways that contribute to productivity improvement. In addition, the correlation analysis confirmed that the working environment for telework in companies has not changed explicitly before and after the corona crisis, and that there are large individual differences. However, the absolute number of samples in this questionnaire survey is small, and it will be necessary to increase the number of samples and conduct a detailed examination in the future. Under these circumstances, major Japanese IT companies introduced telework relatively quickly compared to many other companies in Japan and abroad, demonstrating the resilience of their operations. However, to continuously improve the productivity and performance of employees using telework in the future, it will be necessary to reshape the working environment surrounding telework.

1 citation


Journal ArticleDOI
TL;DR: In this article, a lightweight BGP anomaly detection method using a binary neural network (BNN) is proposed, achieving results similar to full-precision neural networks with reduced computation and storage.
Abstract: Border Gateway Protocol (BGP) is responsible for managing connectivity and reachability information between autonomous systems, and it plays a critical role in the overall efficiency and reliability of the Internet. Anomalies caused by misconfiguration, hijacking, and similar events can have a significant impact on the Internet. Neural network-based detection methods provide high accuracy, but their complex structure increases network latency and storage overhead. In addition, small-scale anomalies such as prefix hijacking and path hijacking are difficult to detect due to their small propagation range. In this paper, we implement lightweight BGP anomaly detection using a binary neural network (BNN), achieving results similar to full-precision neural networks with reduced computation and storage. In addition, we use a mixture of BGP attribute features and graph features to detect small-scale anomalous events, and the results show that the detection accuracy of our proposed method is significantly improved compared to using BGP attribute features alone.
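The core BNN building block can be sketched as a binarized linear layer with a straight-through estimator for gradients; this is the standard construction, offered as an illustration rather than the authors' exact model.

```python
import torch
import torch.nn as nn

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)  # weights constrained to {-1, +1}

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        # straight-through estimator: pass gradients where |w| <= 1
        return grad_out * (w.abs() <= 1).float()

class BinaryLinear(nn.Linear):
    def forward(self, x):
        return nn.functional.linear(x, BinarizeSTE.apply(self.weight), self.bias)

layer = BinaryLinear(32, 2)       # e.g., BGP feature vector -> anomaly logits
out = layer(torch.randn(8, 32))
```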

1 citation


Journal ArticleDOI
TL;DR: In this paper, the authors propose a full-reference quality assessment method for screen content images named Genetic Programming based Screen Content Image Quality (GP-SCIQ), which operates via a symbolic regression technique using genetic programming.
Abstract: Screen Content Image Quality Assessment (SCI-QA) has recently attracted attention due to its excellent potential for guiding and optimizing various processing systems. In this paper, we propose a full-reference quality assessment method for screen content images named Genetic Programming based Screen Content Image Quality (GP-SCIQ). The proposed method operates via a symbolic regression technique using Genetic Programming (GP): to predict the subjective scores of images in the datasets, we combine the objective scores of a set of Image Quality Metrics (IQMs). The two largest publicly available image databases (namely SICAD and SCID) are used for training and testing the predictive models, according to a k-fold cross-validation strategy. The performance of the proposed approach is evaluated in several experiments carried out using four performance indices (SRCC, PCC, KROCC, and RMSE). The results show performance superior to state-of-the-art methods in predicting the perceptual quality of SCIs.
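The GP-based fusion can be sketched with gplearn's symbolic regressor; the `iqm_scores` and `mos` arrays below are hypothetical placeholders for the objective metric outputs and subjective scores.

```python
import numpy as np
from gplearn.genetic import SymbolicRegressor

iqm_scores = np.random.rand(200, 5)  # e.g., 5 objective IQM scores per image
mos = np.random.rand(200)            # subjective quality scores to predict

gp = SymbolicRegressor(population_size=1000, generations=20,
                       function_set=("add", "sub", "mul", "div", "sqrt", "log"),
                       metric="rmse", random_state=0)
gp.fit(iqm_scores, mos)
print(gp._program)                   # the evolved symbolic fusion formula
```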

1 citation


Book ChapterDOI
TL;DR: In this article, the authors present a full-day workshop that aims to identify and document next steps towards establishing ethical and legal requirements for Artificial Intelligence and Education, involving all educational levels and stakeholders to guarantee trustworthy and ethical AI usage.
Abstract: How can we develop and implement legal and organisational requirements for ethical AI introduction and usage? How can we safeguard and maybe even strengthen human rights, democracy, digital equity and rules of law through AI-supported education and learning opportunities? And how can we involve all educational levels (micro, meso, macro) and stakeholders to guarantee a trustworthy and ethical AI usage? These open and urgent questions will be discussed during a full-day workshop that aims to identify and document next steps towards establishing ethical and legal requirements for Artificial Intelligence and Education.

1 citation


Book ChapterDOI
TL;DR: In this paper, a variant of the scaled dot-product attention mechanism is proposed to explore relations between the generated energy and a set of multiple-location weather forecasts/measurements; the experimental evaluation on a dataset consisting of hourly generated wind energy in Greece along with hourly weather forecasts for 18 different locations demonstrated that the proposed approach outperforms competitive methods.
Abstract: In recent years, electricity generated from renewable energy sources has become a significant contributor to power supply systems around the world. Wind is one of the most important renewable energy sources, so accurate wind energy prediction is a vital component of the management and operation of electric grids. This paper proposes a novel method for wind energy forecasting, which relies on a novel variant of the scaled dot-product attention mechanism for exploring relations between the generated energy and a set of multiple-location weather forecasts/measurements. The experimental evaluation, conducted on a dataset consisting of the hourly generated wind energy in Greece along with hourly weather forecasts for 18 different locations, demonstrated that the proposed approach outperforms competitive methods.
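For context, the standard scaled dot-product attention that the proposed variant builds on (Vaswani et al., 2017) is, with queries Q, keys K, values V, and key dimension d_k:

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right) V
```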

1 citation


Journal ArticleDOI
TL;DR: In this article, a decentralized, blockchain and Internet of Things (IoT) based methodology for a voting system is devised and presented; efficient smart contracts and a consensus mechanism are applied in the proposed system to enhance security.
Abstract: These days, many people are not satisfied with the final results of voting systems. This is because the current voting system is centralized and fully controlled by the election commission, so there is a chance that the central body can be compromised or hacked and the final result tampered with. In this direction, a decentralized, blockchain and Internet of Things (IoT) based methodology for a voting system is devised and presented in this paper. Efficient smart contracts and a consensus mechanism are applied in the proposed system to enhance security. Blockchain is a transparent, secure, and immutable technology because it uses concepts such as encryption, decryption, hash functions, consensus, and Merkle trees, which make blockchain technology an appropriate platform for storing and sharing data in a secure and anonymous manner. IoT contributes biometric sensors through which people can cast their votes not only in physical mode but also in digital mode. In response, a message is sent to the voter confirming the cast vote, ensuring authentication. In this way, we can make the present voting system more secure and trustworthy using the properties of both blockchain and IoT, and thereby give more value to the election process in democratic countries. The proposed method ensures security and reduces computational time compared to existing approaches.
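The tamper-evidence property that motivates the design can be demonstrated with a toy hash-chained ledger (an illustration of the principle only, not the paper's system): each block commits to its predecessor's hash, so altering any stored vote invalidates every later block.

```python
import hashlib, json, time

def make_block(vote, prev_hash):
    block = {"vote": vote, "time": time.time(), "prev": prev_hash}
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()).hexdigest()
    return block

chain = [make_block("genesis", "0" * 64)]
for vote in ["candidate-A", "candidate-B", "candidate-A"]:
    chain.append(make_block(vote, chain[-1]["hash"]))

def verify(chain):
    for prev, cur in zip(chain, chain[1:]):
        body = {k: cur[k] for k in ("vote", "time", "prev")}
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if cur["prev"] != prev["hash"] or cur["hash"] != recomputed:
            return False
    return True

print(verify(chain))           # True
chain[1]["vote"] = "tampered"  # any edit breaks the chain
print(verify(chain))           # False
```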

Journal ArticleDOI
TL;DR: In this article, the authors introduce the notions of learning context, performance map, and high performance function, and apply them to a variety of learning contexts to show how the methodology can be applied.
Abstract: The paper describes a novel methodology to compare learning algorithms by exploiting their performance maps. A performance map enhances the comparison of a learner across learning contexts, and it also provides insights into the distribution of a learner's performance across its parameter space. Some initial empirical findings are also discussed. To explain the novel comparison methodology, this study introduces the notions of learning context, performance map, and high performance function. These concepts are then applied to a variety of learning contexts to show how the methodology can be applied. Finally, we use meta-optimization as an instrument to improve the efficiency of the parameter space search with respect to its complete enumeration. Note, however, that meta-optimization is neither an essential part of our methodology nor the focus of our study.

Journal ArticleDOI
TL;DR: In this article, an optimized software pipeline is presented that interleaves parallel computation of LSTM or GRU recurrent blocks, featuring vectorized 8-bit integer (INT8) and 16-bit floating-point (FP16) compute units, with manually-managed memory transfers of model parameters.
Abstract: This paper presents an optimized methodology to design and deploy Speech Enhancement (SE) algorithms based on Recurrent Neural Networks (RNNs) on a state-of-the-art MicroController Unit (MCU) with 1+8 general-purpose RISC-V cores. To achieve low-latency execution, we propose an optimized software pipeline interleaving parallel computation of LSTM or GRU recurrent blocks, featuring vectorized 8-bit integer (INT8) and 16-bit floating-point (FP16) compute units, with manually-managed memory transfers of model parameters. To ensure minimal accuracy degradation with respect to the full-precision models, we propose a novel FP16-INT8 Mixed-Precision Post-Training Quantization (PTQ) scheme that compresses the recurrent layers to 8 bits while the bit precision of the remaining layers is kept at FP16. Experiments are conducted on multiple LSTM and GRU based SE models trained on the Valentini dataset, featuring up to 1.24M parameters. Thanks to the proposed approaches, we speed up the computation by up to 4× with respect to the lossless FP16 baselines. Unlike a uniform 8-bit quantization, which degrades the PESQ score by 0.3 on average, the Mixed-Precision PTQ scheme leads to a low degradation of only 0.06 while achieving a 1.4–1.7× memory saving. Thanks to this compression, we cut the power cost of the external memory by fitting the large models on the limited on-chip non-volatile memory, and we gain an MCU power saving of up to 2.5× by reducing the supply voltage from 0.8 V to 0.65 V while still matching the real-time constraints. Our design is >10× more energy efficient than state-of-the-art SE solutions deployed on single-core MCUs that use smaller models and quantization-aware training.
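The INT8 step of such a mixed-precision PTQ scheme can be pictured as symmetric per-tensor quantization of the recurrent weight matrices (a minimal numpy sketch under that assumption; the paper's actual scheme targets vectorized MCU kernels).

```python
import numpy as np

def quantize_int8(w_fp32):
    scale = np.abs(w_fp32).max() / 127.0       # symmetric per-tensor scale
    w_int8 = np.clip(np.round(w_fp32 / scale), -128, 127).astype(np.int8)
    return w_int8, scale

def dequantize(w_int8, scale):
    return w_int8.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)  # e.g., an LSTM weight matrix
w_q, s = quantize_int8(w)
print("max abs error:", np.abs(w - dequantize(w_q, s)).max())
```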

Journal ArticleDOI
TL;DR: In this article, an object-based active inference (OBAI) model is proposed, which represents distinct objects with separate variational beliefs and uses selective attention to route inputs to their corresponding object slots.
Abstract: The world consists of objects: distinct entities possessing independent properties and dynamics. For agents to interact with the world intelligently, they must translate sensory inputs into the bound-together features that describe each object. These object-based representations form a natural basis for planning behavior. Active inference (AIF) is an influential unifying account of perception and action, but existing AIF models have not leveraged this important inductive bias. To remedy this, we introduce ‘object-based active inference’ (OBAI), marrying AIF with recent deep object-based neural networks. OBAI represents distinct objects with separate variational beliefs, and uses selective attention to route inputs to their corresponding object slots. Object representations are endowed with independent action-based dynamics. The dynamics and generative model are learned from experience with a simple environment (active multi-dSprites). We show that OBAI learns to correctly segment the action-perturbed objects from video input, and to manipulate these objects towards arbitrary goals.
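As background (general active inference, not OBAI-specific), the variational beliefs q(s) over hidden states s are updated by minimizing the variational free energy given observations o:

```latex
F = \mathbb{E}_{q(s)}\!\left[\ln q(s) - \ln p(o, s)\right]
  = D_{\mathrm{KL}}\!\left[q(s) \,\|\, p(s \mid o)\right] - \ln p(o)
```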

Book ChapterDOI
TL;DR: In this paper, a RESidual Error Learning for Forecasting (RESELF) method is proposed to improve the performance of a deep learning model on the electricity demand forecasting task: a first model is trained on the actual load values, and a second model is trained on the resulting residual errors.
Abstract: Electricity demand forecasting is the challenging task of predicting electricity demand from historical load data. In this paper, we propose a novel method, named RESidual Error Learning for Forecasting (RESELF), for improving the performance of a deep learning model on the electricity demand forecasting task. The method first trains a model on the actual load values and computes the residual errors. Subsequently, RESELF trains a second model using the computed residual errors as targets. Finally, the prediction of the proposed methodology is defined as the sum of the first and second models' predictions. We argue that if the errors are systematic, the proposed method will provide improved results. The experimental evaluation on four datasets validates the effectiveness of the proposed method in improving forecasting performance.
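The two-stage scheme is simple enough to sketch directly (our reading of the abstract, shown with plain sklearn regressors for brevity; the paper uses deep models):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.random((500, 8))                             # e.g., lagged load features
y = X @ rng.random(8) + 0.1 * rng.standard_normal(500)

m1 = GradientBoostingRegressor().fit(X, y)           # stage 1: fit the load
residuals = y - m1.predict(X)
m2 = GradientBoostingRegressor().fit(X, residuals)   # stage 2: fit the residual errors

y_hat = m1.predict(X) + m2.predict(X)                # final forecast = sum of both
```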

Journal ArticleDOI
TL;DR: In this article, the authors propose six constructive heuristics to solve the K-independent total traveling salesperson problem (KITTSP), a variant of the famous TSP.
Abstract: This paper is concerned with the K-independent total traveling salesperson problem (KITTSP), a variant of the famous traveling salesperson problem (TSP). KITTSP seeks K mutually independent Hamiltonian tours such that the total cost of these K tours is minimized. KITTSP is NP-hard since it is a generalisation of TSP. KITTSP is a recently introduced problem, and so far no solution approach exists in the literature for it. We propose six constructive heuristics to solve KITTSP, which are the first approaches for this problem. We evaluate the performance of these heuristics on an extensive range of TSPLIB instances and present a detailed comparative study of their performance.
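To make the notion of K mutually independent (edge-disjoint) tours concrete, here is an illustrative greedy construction, not one of the paper's six heuristics: build each tour by nearest neighbour while forbidding edges already used by earlier tours.

```python
import math

def k_disjoint_tours(points, K):
    n = len(points)
    dist = lambda i, j: math.dist(points[i], points[j])
    used, tours = set(), []
    for _ in range(K):
        tour, cur, unvisited = [0], 0, set(range(1, n))
        while unvisited:
            nxt = min((j for j in unvisited if frozenset((cur, j)) not in used),
                      key=lambda j: dist(cur, j), default=None)
            if nxt is None:
                return tours          # no edge-disjoint completion from here
            used.add(frozenset((cur, nxt)))
            tour.append(nxt)
            unvisited.remove(nxt)
            cur = nxt
        used.add(frozenset((cur, 0)))  # closing edge (disjointness not rechecked, for brevity)
        tour.append(0)
        tours.append(tour)
    return tours

print(k_disjoint_tours([(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)], K=2))
```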

Journal ArticleDOI
TL;DR: In this paper, the authors evaluated the effect of technological capabilities in operations and information technology capabilities in business support systems on the company's digital transformation, and found that the development of technological skills, both in operations and in support, serves as an inducer of digital business models.
Abstract: The evolution of markets has encouraged the use of technologies in different areas. In business and industry, their use has increased since these tools and technologies allow efficient management of resources and generate better products and services. In this sense, technological capabilities allow the correct use of different technological elements and resources to carry out productive, commercial, and management processes, as well as the use of tools such as support systems. This research evaluates the effect of technological capabilities in operations and information technology capabilities in business support systems on a company's digital transformation. Four categorical regression models are estimated, controlling for economic activity, on a sample of companies from different industries. The findings show an influence of capabilities on the propensity for digital transformation and a moderation effect due to economic activity. Information technology capabilities influence service companies, while the digital transformation of manufacturing organizations depends more strongly on technological capabilities in operations. It is concluded that the development of technological skills, both in operations and in support, serves as an inducer of digital business models. However, their adoption depends on the nature of the firm: changes in technological support capabilities further increase the probability of greater digital transformation in service firms, while the opposite holds for firms producing tangible goods.

Journal ArticleDOI
TL;DR: In this paper, an ML-based model is proposed for the detection of schizophrenia on a structural MRI dataset of 146 subjects; the results showed that the SVM achieved high accuracy when the dataset was validated using a stratified 10-fold cross-validation technique.
Abstract: The reproducibility of Computer Aided Diagnosis (CAD) in detecting schizophrenia using neuroimaging modalities can provide early diagnosis of the disease. Schizophrenia is a psychiatric disorder that can lead to structural abnormalities in the brain, causing delusions and hallucinations. A neuroimaging modality such as structural Magnetic Resonance Imaging (sMRI) can capture these structural abnormalities in the brain. Utilizing Machine Learning (ML) as a potential diagnostic tool for detecting classification biomarkers can aid clinical measures and help recognize the factors underlying schizophrenia. This paper proposes an ML-based model for the detection of schizophrenia on a structural MRI dataset of 146 subjects. We sought to classify schizophrenia and healthy controls using five ML classifiers: Support Vector Machine, Logistic Regression, Decision Tree, k-Nearest Neighbor, and Random Forest. The raw structural MRI scans were pre-processed using techniques such as image selection, image conversion, gray-scaling of MRI images, and image flattening. Further, we tested the performance of the model using hold-out cross-validation and stratified 10-fold cross-validation techniques. The results showed that the SVM achieved high accuracy when the dataset was validated using the stratified 10-fold cross-validation technique. On the other hand, k-Nearest Neighbor performed better when the hold-out validation method was used to evaluate the classifiers.
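The validation scheme is easy to reproduce with scikit-learn; the arrays below are synthetic stand-ins for the flattened sMRI features, not the study's data.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X = np.random.rand(146, 4096)      # 146 subjects, flattened image features
y = np.random.randint(0, 2, 146)   # schizophrenia vs. healthy control

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```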

Journal ArticleDOI
TL;DR: In this paper, a different way of obtaining images of solar cells using Artificial Intelligence techniques such as Generative Adversarial Networks (GANs) is presented, which will improve the maintenance of photovoltaic systems in settings such as smart cities.
Abstract: This article presents a different way of obtaining images of solar cells using Artificial Intelligence techniques, namely Generative Adversarial Networks (GANs). This will improve the maintenance of photovoltaic systems in settings such as smart cities. The original data was obtained manually and preprocessed to create better images. The GAN architecture used is known as Deep Convolutional GAN (DCGAN), since it performs better than other GANs. The synthetic images were labeled and analyzed to ensure their quality.
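A DCGAN generator of the kind used for such synthesis looks roughly as follows (a minimal PyTorch sketch producing 64x64 single-channel images; layer sizes are illustrative, not the paper's).

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, z_dim=100, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, ch * 8, 4, 1, 0, bias=False),  # -> 4x4
            nn.BatchNorm2d(ch * 8), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 8, ch * 4, 4, 2, 1, bias=False), # -> 8x8
            nn.BatchNorm2d(ch * 4), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 4, ch * 2, 4, 2, 1, bias=False), # -> 16x16
            nn.BatchNorm2d(ch * 2), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 2, ch, 4, 2, 1, bias=False),     # -> 32x32
            nn.BatchNorm2d(ch), nn.ReLU(True),
            nn.ConvTranspose2d(ch, 1, 4, 2, 1, bias=False),          # -> 64x64
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1))

fake = Generator()(torch.randn(16, 100))  # a batch of synthetic cell images
```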

Book ChapterDOI
TL;DR: In this article, a prototype of an innovative vehicle interior concept to support social interaction during automated driving is evaluated in a field-experimental research setting, with the goal of identifying design suggestions for its further development; questionnaires are used to assess user experience, system trust, and subjective road safety.
Abstract: The goal of the research project RUMBA, funded by the German Federal Ministry for Economic Affairs and Climate Action, is to redesign the user experience for occupants during a highly automated drive (SAE Level 4 [1]) by developing innovative interior and interaction concepts. As part of the second iteration of the user-centered, iterative development process, a field study is conducted. It aims to evaluate a prototype of an innovative vehicle interior concept to support social interaction during automated driving and to identify design suggestions for its further development. The vehicle interior concept to be evaluated is compared to a classic vehicle interior in a field-experimental research setting. The participants experience each vehicle interior for 30 min during a simulated automated drive in real traffic (Wizard of Oz). During the drive, participants play a digital board game together: in one condition, the participants are seated in the driving direction and play on their respective tablets; in the other condition, they sit facing each other and play on a shared tablet. Among other things, user behavior is observed, and questionnaires are used to assess user experience, system trust, and subjective road safety. The data collection for the study is completed by the end of April 2023. Therefore, this short paper focuses on the method; results are not yet included. The conference poster in July depicts both the methodology and the results of the evaluation study.


Journal ArticleDOI
TL;DR: In this article, two techniques for avatar representation in VR, i.e., no avatar (VR Kit only) and full-body reconstruction (blending of inverse kinematics and animations), are compared in the context of emergency training.
Abstract: Virtual Reality (VR) technology is playing an increasingly important role in the field of training. The emergency domain, in particular, can benefit from various advantages of VR with respect to traditional training approaches. One of the most promising features of VR-based training is the possibility to share the virtual experience with other users. In multi-user training scenarios, the trainees have to be provided with a proper representation of both the other peers and themselves, with the aim of fostering mutual awareness, communication and cooperation. Various techniques for representing avatars in VR have been proposed in the scientific literature and employed in commercial applications. However, the impact of these techniques when deployed to multi-user scenarios for emergency training has not been extensively explored yet. In this work, two techniques for avatar representation in VR, i.e., no avatar (VR Kit only) and Full-Body reconstruction (blending of inverse kinematics and animations), are compared in the context of emergency training. Experiments were carried out in a training scenario simulating a road tunnel fire. The participants were requested to collaborate with a partner (controlled by an experimenter) to cope with the emergency, and aspects concerning perceived embodiment, immersion, and social presence were investigated.

Journal ArticleDOI
TL;DR: In this article, a generative neural conversation system using a deep LSTM sequence-to-sequence model with an attention mechanism is proposed to build an open-domain chatbot that can have a meaningful conversation with humans.
Abstract: Conversational modeling is an important task in natural language understanding and machine intelligence. It makes sense for natural language to become the primary way in which we interact with devices, because that is how humans communicate with each other. Thus, the possibility of having conversations with machines would make our interaction much smoother and more human-like. Natural language techniques need to evolve to match the level of power and sophistication that users expect from virtual assistants. Although previous approaches exist, they are often restricted to specific domains and require handcrafted rules. The obvious problem lies in their inability to answer questions for which the rules were not written. To overcome this problem, we build a generative neural conversation system using a deep LSTM sequence-to-sequence model with an attention mechanism. Our main emphasis is to build an open-domain generative chatbot that can have a meaningful conversation with humans. We use Reddit conversation datasets to train the model and apply the Turing test to it. The proposed chatbot model is compared with Cleverbot and the results are presented.
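The overall architecture can be sketched as an LSTM encoder-decoder with dot-product attention over encoder states (a compact PyTorch skeleton under assumed dimensions, not the authors' exact configuration):

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab=8000, emb=256, hid=512):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.encoder = nn.LSTM(emb, hid, batch_first=True)
        self.decoder = nn.LSTM(emb, hid, batch_first=True)
        self.out = nn.Linear(hid * 2, vocab)

    def forward(self, src, tgt):
        enc_out, state = self.encoder(self.embed(src))
        dec_out, _ = self.decoder(self.embed(tgt), state)
        # dot-product attention of decoder states over encoder states
        scores = torch.bmm(dec_out, enc_out.transpose(1, 2))
        context = torch.bmm(torch.softmax(scores, dim=-1), enc_out)
        return self.out(torch.cat([dec_out, context], dim=-1))

model = Seq2Seq()
logits = model(torch.randint(0, 8000, (2, 12)),   # source utterance tokens
               torch.randint(0, 8000, (2, 9)))    # shifted reply tokens
```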

Book ChapterDOI
TL;DR: The SLAC Neural Network Library (SNL) leverages Xilinx's HLS framework, presenting an API modeled after the open-source Keras interface to the TensorFlow library.
Abstract: The LCLS2 Free Electron Laser (FEL) will deliver x-ray pulses to beamline experiments at up to 1 MHz. These experiments will require new ultra-high rate (UHR) detectors that can operate at rates above 100 kHz and generate data throughputs upwards of 1 TB/s, a data velocity that requires prohibitively large investments in storage infrastructure. Machine learning has demonstrated the potential to digest large datasets and extract relevant insights; however, current implementations show latencies that are too high for real-time data reduction objectives. SLAC has undertaken the creation of a software framework that translates ML structures for deployment on Field Programmable Gate Arrays (FPGAs) placed at the edge of the data chain, close to the instrumentation. This framework leverages Xilinx's HLS framework, presenting an API modeled after the open-source Keras interface to the TensorFlow library. The SLAC Neural Network Library (SNL) framework is designed with a streaming data approach, optimizing the data flow between layers while minimizing data buffering requirements. The goal is to ensure the highest possible framerate while keeping the maximum latency constrained to the needs of the experiment. The framework is designed to ensure that the RTL implementation of the network layers supports full re-deployment of weights and biases without requiring re-synthesis after training. The ability to reduce the precision of the implemented networks through quantization is necessary to optimize the use of both DSP and memory resources in the FPGA. We currently have a preliminary version of the toolset and are experimenting with both general-purpose example networks and networks being designed for specific LCLS2 experiments.
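For orientation, the kind of model definition such a Keras-modeled API mirrors is shown below in plain TensorFlow/Keras (this is standard Keras, not SNL's actual API; the layer sizes are illustrative).

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),        # e.g., one detector frame
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.Flatten(),
    layers.Dense(8, activation="softmax"),  # reduced quantity of interest
])
model.summary()
```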

Book ChapterDOI
TL;DR: In this paper, a group-personalized federated learning (GP-FL) solution is proposed to handle the heterogeneity of data across user devices in smart applications such as personalized health care.
Abstract: Human Activity Recognition (HAR) has played a significant role in recent years due to its applications in various fields, including health care and well-being. Traditional centralized methods reach very high recognition rates, but they incur privacy and scalability issues. Federated learning (FL) is a leading distributed machine learning (ML) paradigm for training a global model collaboratively on distributed data in a privacy-preserving manner. However, for HAR scenarios, existing activity recognition systems mainly focus on a unified model, i.e., they do not provide users with personalized recognition of activities. Furthermore, the heterogeneity of data across user devices can degrade the performance of traditional FL models in smart applications such as personalized health care. To this end, we propose a novel federated learning model that copes with a statistically heterogeneous federated learning environment by introducing a group-personalized FL (GP-FL) solution. The proposed GP-FL algorithm builds several global ML models, each one trained iteratively on a dynamic group of clients with homogeneous class probability estimations. The performance of the proposed FL scheme is studied and evaluated on real-world HAR data. The evaluation results demonstrate that our approach has advantages in terms of model performance and convergence speed with respect to two baseline FL algorithms used for comparison.
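Group-wise aggregation can be sketched as FedAvg run separately per client group (our assumed reading of GP-FL; names such as `gp_fl_round` are hypothetical):

```python
import copy
import torch

def fedavg(state_dicts):
    """Element-wise average of a list of model state_dicts."""
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(0)
    return avg

def gp_fl_round(group_models, client_states, client_groups):
    # client_groups[i]: group id of client i, e.g. from clustering clients
    # with homogeneous class probability estimations; one model per group.
    for g in group_models:
        members = [client_states[i]
                   for i, grp in enumerate(client_groups) if grp == g]
        if members:
            group_models[g].load_state_dict(fedavg(members))
    return group_models
```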

Journal ArticleDOI
TL;DR: This paper studies the interpretability of attention in the context of set machine learning, where each data point is composed of an unordered collection of instances with a global label, and finds that attention distributions are indeed often reflective of the relative importance of individual instances, but that silent failures happen where a model has high classification performance yet attention patterns that do not align with expectations.
Abstract: The debate around the interpretability of attention mechanisms is centered on whether attention scores can be used as a proxy for the relative amounts of signal carried by sub-components of data. We propose to study the interpretability of attention in the context of set machine learning, where each data point is composed of an unordered collection of instances with a global label. For classical multiple-instance-learning problems and simple extensions, there is a well-defined “importance” ground truth that can be leveraged to cast interpretation as a binary classification problem, which we can quantitatively evaluate. By building synthetic datasets over several data modalities, we perform a systematic assessment of attention-based interpretations. We find that attention distributions are indeed often reflective of the relative importance of individual instances, but that silent failures happen where a model will have high classification performance but attention patterns that do not align with expectations. Based on these observations, we propose to use ensembling to minimize the risk of misleading attention-based explanations.
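The attention-pooling setup under study can be sketched as follows (standard multiple-instance attention pooling in the style of Ilse et al., 2018, not the authors' exact models): the per-instance attention weights double as the candidate explanation being evaluated.

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, in_dim=128, att_dim=64, n_classes=2):
        super().__init__()
        self.att = nn.Sequential(nn.Linear(in_dim, att_dim), nn.Tanh(),
                                 nn.Linear(att_dim, 1))
        self.cls = nn.Linear(in_dim, n_classes)

    def forward(self, bag):                      # bag: (n_instances, in_dim)
        a = torch.softmax(self.att(bag), dim=0)  # instance attention weights
        z = (a * bag).sum(0)                     # weighted bag embedding
        return self.cls(z), a.squeeze(-1)        # logits + candidate explanation

logits, weights = AttentionMIL()(torch.randn(20, 128))
```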

Journal ArticleDOI
TL;DR: In this paper, an exploratory data analysis (EDA) was carried out to understand the behavior of the meteorological variables precipitation, temperature, relative humidity, and wind speed at the Muisne station (M153).
Abstract: The study of climate is important since it allows forecasting the behavior of climatic variables, which requires climatological records of at least 30 years. The community of Bunche has faced deficiencies in the supply of drinking water to its homes. Given that water is an increasingly scarce resource in the world, it was necessary to build a fog collector system to capture water and meet the different needs of the population. To this end, an exploratory data analysis (EDA) was carried out to understand the behavior of the meteorological variables precipitation, temperature, relative humidity, and wind speed at the Muisne station (M153). The information was collected from the National Institute of Meteorology and Hydrology and the TERRACLIMATE remote sensor for the period 1990 to 2020. Recurrence maps were then generated, and a multivariate principal component analysis (PCA) was performed to filter the variables with the greatest correlation. Subsequently, climate forecasts were made by applying the ARIMA model, coded in the RStudio software. The results allowed predicting accumulated monthly rainfall ranging between 100–200 mm and average monthly temperatures between 25–26 ℃ for 2022, ensuring a high amount of collected water, which was suitable for human consumption and agricultural use according to the water quality analyses carried out, evaluated against the standards established in Ministerial Agreement 097-A of Ecuador.
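The ARIMA step translates directly to statsmodels (the original analysis was done in R; the series and model order below are illustrative stand-ins for the monthly station data):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

idx = pd.date_range("1990-01", "2020-12", freq="MS")
rainfall = pd.Series(                         # synthetic seasonal stand-in (mm)
    100 + 50 * np.sin(np.arange(len(idx)) * 2 * np.pi / 12)
    + np.random.randn(len(idx)) * 10, index=idx)

fit = ARIMA(rainfall, order=(1, 0, 1), seasonal_order=(1, 0, 1, 12)).fit()
print(fit.forecast(steps=24))                 # monthly forecasts through 2022
```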

Journal ArticleDOI
TL;DR: In this article, the Transformer architecture is used to encode, learn and enforce the underlying syntax of expressions, creating latent representations that are correctly decoded to the exact mathematical expression tree, providing robustness to ablated inputs and unseen glyphs.
Abstract: The Transformer architecture is shown to provide a powerful framework as an end-to-end model for building expression trees from online handwritten gestures corresponding to glyph strokes. In particular, the attention mechanism was successfully used to encode, learn and enforce the underlying syntax of expressions creating latent representations that are correctly decoded to the exact mathematical expression tree, providing robustness to ablated inputs and unseen glyphs. For the first time, the encoder is fed with spatio-temporal data tokens potentially forming an infinitely large vocabulary, which finds applications beyond that of online gesture recognition. A new supervised dataset of online handwriting gestures is provided for training models on generic handwriting recognition tasks and a new metric is proposed for the evaluation of the syntactic correctness of the output expression trees. A small Transformer model suitable for edge inference was successfully trained to an average normalised Levenshtein accuracy of 94%, resulting in valid postfix RPN tree representation for 94% of predictions.
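One common definition of normalised Levenshtein accuracy, which may be what is meant here (the paper's exact metric may differ), is, with edit distance d, predicted sequence ŷ, and reference y:

```latex
\mathrm{acc}(\hat{y}, y) = 1 - \frac{d(\hat{y}, y)}{\max\left(|\hat{y}|, |y|\right)}
```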

BookDOI
TL;DR: This volume presents the proceedings of the ECML PKDD 2022 Workshops, covering automating data science, machine learning, artificial intelligence, knowledge discovery, and data mining.
Abstract: The ECML PKDD 2022 Workshops proceedings cover automating data science, machine learning and artificial intelligence, knowledge discovery, data mining, and related topics.

Journal ArticleDOI
TL;DR: In this paper, the authors propose an optimal methodology called DTAN (Detecting Authorized Nodes) in which authorization components such as integration, data, and social networking are utilized to generate the unique characteristics of authorization in an amalgamated MANET.
Abstract: A key objective in a MANET is to identify whether routing messages are established by authorized nodes. To address this issue, existing systems provide a route among the authorized nodes within a particular environment. The existing systems also describe authorization based on a soft security methodology to remove security-related problems, with each node using an authorization threshold value. In this context, our research work proposes an optimal methodology called DTAN (Detecting Authorized Nodes), in which authorization components such as integration, data, and social networking are utilized to generate the unique characteristics of authorization in an amalgamated MANET. Through these components, the proposed work describes the authorization capacity, information integrity, and non-static social characteristics of a node. After the detection of authorized nodes, a secure route is deployed to the outer network through the gateway node using the secure-route attainment value. The proposed approach employs a confidential distribution methodology (CDM) to provide secure communication in the MANET. The parameters considered by the proposed system are network performance, packet delivery ratio, routing load, and end-to-end delay. The experimental analysis, carried out for both the proposed and existing systems, shows that the proposed system increases the performance of the network, reduces the routing load, and minimizes the end-to-end delay compared to the existing systems. At the same time, the proposed system ensures secure data communication in the amalgamated internet MANET.