scispace - formally typeset
Topic

Tuple

About: Tuple is a research topic. Over the lifetime, 6513 publications have been published within this topic receiving 146057 citations. The topic is also known as: tuple & ordered tuplet.
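In programming-language terms, a tuple is an ordered, fixed-length collection whose fields may hold different types. A minimal Python illustration (the field values are invented for the example):

```python
# A tuple is an ordered, fixed-length collection; unlike a list it is
# immutable, and its fields typically hold heterogeneous types.
record = ("2019-03-14", "cardiology", 7)   # (date, unit, length of stay)

# Fields are read by position or unpacked; the tuple itself cannot be modified.
date, unit, days = record
```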


Papers
Proceedings Article
15 Oct 2013
TL;DR: This research work provides a new and efficient extension of FCA to deal with complex data, which can serve as an alternative approach to the analysis of sequential datasets.
Abstract: In this paper, we are interested in the analysis of sequential data and we propose an original framework based on FCA. For that, we introduce sequential pattern structures, an original specification of pattern structures for dealing with sequential data. Sequential pattern structures are given by a subsumption operation between sets of sequences, based on subsequence matching. To avoid a huge number of resulting concepts, domain knowledge projections can be applied. The original definition of projections is revised in order to operate on sequential pattern structures in a meaningful way. Based on the introduced definition, several projections of sequential pattern structures involving domain or expert knowledge are defined and discussed. These projections are evaluated on a real dataset of care trajectories where every hospitalization is described by a heterogeneous tuple with different fields. The evaluation reveals interesting concepts and justifies the usage of the introduced projections of sequential pattern structures. This research work provides a new and efficient extension of FCA to deal with complex data, which can serve as an alternative approach to the analysis of sequential datasets.
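The subsumption between sequential pattern structures rests on subsequence matching: one sequence is subsumed when its elements occur in the other sequence in the same order, not necessarily contiguously. A minimal sketch of that test (the function name is ours, not the paper's):

```python
def is_subsequence(sub, seq):
    """Return True if every element of `sub` appears in `seq`
    in the same order (not necessarily contiguously)."""
    it = iter(seq)
    # Each element of `sub` must be found in the remaining part of `seq`.
    return all(any(x == y for y in it) for x in sub)
```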

31 citations

Posted Content
TL;DR: RPT, a denoising autoencoder for tuple-to-X models, is presented; it adopts a Transformer-based neural translation architecture consisting of a bidirectional encoder and a left-to-right autoregressive decoder, leading to a generalization of both BERT and GPT.
Abstract: Can AI help automate human-easy but computer-hard data preparation tasks that burden data scientists, practitioners, and crowd workers? We answer this question by presenting RPT, a denoising auto-encoder for tuple-to-X models (X could be tuple, token, label, JSON, and so on). RPT is pre-trained for a tuple-to-tuple model by corrupting the input tuple and then learning a model to reconstruct the original tuple. It adopts a Transformer-based neural translation architecture that consists of a bidirectional encoder (similar to BERT) and a left-to-right autoregressive decoder (similar to GPT), leading to a generalization of both BERT and GPT. The pre-trained RPT can already support several common data preparation tasks such as data cleaning, auto-completion and schema matching. Better still, RPT can be fine-tuned on a wide range of data preparation tasks, such as value normalization, data transformation, data annotation, etc. To complement RPT, we also discuss several appealing techniques such as collaborative training and few-shot learning for entity resolution, and few-shot learning and NLP question-answering for information extraction. In addition, we identify a series of research opportunities to advance the field of data preparation.
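RPT is pre-trained by corrupting an input tuple and learning to reconstruct the original. A toy sketch of the corruption step only (the value-masking scheme and mask token are our assumption; the paper's actual corruption operators may differ):

```python
import random

def corrupt_tuple(tup, mask_prob=0.3, mask_token="[MASK]", rng=None):
    """Randomly replace attribute values with a mask token; the
    denoising model is then trained to reconstruct the original tuple."""
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    return tuple(mask_token if rng.random() < mask_prob else v for v in tup)
```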

31 citations

Patent
19 Sep 2008
TL;DR: A system whose processor accesses a document database indexing keywords and instances of entities with entity types, aggregates a score for each entity tuple across the matching documents, and normalizes the aggregated scores to rank the tuples as answers to an input query.
Abstract: A system has a processor coupled to access a document database that indexes keywords and instances of entities having entity types in a plurality of documents. The processor is programmed to receive an input query including one or more keywords and one or more entity types, and search the database for documents having the keywords and entities with the entity types of the input query. The processor is programmed for aggregating a respective score for each of a plurality of entity tuples across the plurality of documents. The aggregated scores are normalized. Each respective normalized score provides a ranking of a respective entity tuple, relative to other entity tuples, as an answer to the input query. The processor has an interface to a storage or display device or network for outputting a list including a subset of the entity tuples having the highest normalized scores among the plurality of entity tuples.
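The core ranking step — aggregate each entity tuple's score across matching documents, normalize, and rank — can be sketched as follows (summation as the aggregate and sum-to-one normalization are our simplifications; the patent does not commit to these exact formulas):

```python
from collections import defaultdict

def rank_entity_tuples(doc_scores):
    """doc_scores: iterable of (entity_tuple, score) pairs, one per
    document match.  Aggregate scores per tuple, normalize so they
    sum to 1, and return tuples sorted by normalized score."""
    totals = defaultdict(float)
    for tup, score in doc_scores:
        totals[tup] += score                      # aggregate across documents
    norm = sum(totals.values()) or 1.0            # avoid division by zero
    return sorted(((t, s / norm) for t, s in totals.items()),
                  key=lambda ts: ts[1], reverse=True)
```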

31 citations

01 Jan 1984
TL;DR: In this paper, an external semantic query simplifier implemented in Prolog is proposed to reduce the number of tuple variables and terms in a relational calculus query using integrity constraints enforced in a database system.
Abstract: Semantic query simplification utilizes integrity constraints enforced in a database system for reducing the number of tuple variables and terms in a relational calculus query. To a large degree, this can be done by a system that is external to the DBMS. The paper advocates the application of database theory in such a system and describes a working prototype of an external semantic query simplifier implemented in Prolog. The system employs a graph-theoretic approach to integrate tableau techniques and algorithms for the syntactic simplification of queries containing inequality conditions. The use of integrity constraints is shown not only to improve efficiency but also to permit more meaningful error messages to be generated, particularly in the case of an empty query result. The paper concludes by outlining an extension to the multi-user case. Center for Digital Economy Research, Stern School of Business, Working Paper IS-84-51
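The basic idea — dropping query terms that the enforced integrity constraints already guarantee — can be sketched for simple lower-bound inequality conditions (representing conditions as (attribute, lower-bound) pairs is our simplification, not the prototype's Prolog encoding):

```python
def prune_implied(conds, constraints):
    """conds, constraints: lists of (attr, low) meaning attr > low.
    A query condition attr > v is redundant when an enforced integrity
    constraint already guarantees attr > c with c >= v."""
    return [(a, v) for a, v in conds
            if not any(ca == a and c >= v for ca, c in constraints)]
```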

31 citations

Journal Article
TL;DR: In this paper, the authors propose an extension to the package jFuzzyLogic and to the corresponding script language FCL to deal with unbalanced linguistic term sets, in order to offer a simpler and more faithful partition.
Abstract: In the domain of Computing with Words (CW), fuzzy linguistic approaches are known to be relevant in many decision-making problems. Indeed, they allow us to model human reasoning by replacing words, assessments, preferences, choices, wishes, etc. with ad hoc variables, such as fuzzy sets or more sophisticated variables. This paper focuses on a particular model: Herrera and Martinez' 2-tuple linguistic model and their approach to dealing with unbalanced linguistic term sets. It is interesting since the computations are accomplished without loss of information while the results of the decision-making processes always refer to the initial linguistic term set. They propose a fuzzy partition which distributes data on the axis by using linguistic hierarchies to manage the non-uniformity. However, the required input (especially the density around the terms) taken by their fuzzy partition algorithm may be considered too demanding in a real-world application, since density is not always easy to determine. Moreover, in some limit cases (especially when two terms are semantically very close to each other), the partition does not fit the data themselves and is far from reality. Therefore we propose to modify the required input, in order to offer a simpler and more faithful partition. We have added an extension to the package jFuzzyLogic and to the corresponding script language FCL. This extension supports both 2-tuple models: Herrera and Martinez' and ours. In addition to the partition algorithm, we present two aggregation algorithms: the arithmetic mean and the addition. We also discuss these kinds of 2-tuple models.
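The 2-tuple model represents an aggregation result β ∈ [0, g] as a pair (s_i, α): the nearest term index i and a symbolic translation α ∈ [-0.5, 0.5), so computations lose no information. A minimal sketch (plain integer indices stand in for the linguistic terms):

```python
def to_2tuple(beta):
    """Convert a value beta in [0, g] to (i, alpha): the nearest
    term index i and the symbolic translation alpha in [-0.5, 0.5)."""
    i = int(beta + 0.5)   # round half up so alpha stays in [-0.5, 0.5)
    return i, beta - i

def from_2tuple(i, alpha):
    """Inverse transformation: recover the underlying numeric value."""
    return i + alpha

def mean_2tuple(betas):
    """Arithmetic mean without loss of information: average the
    numeric values, then convert back to a 2-tuple."""
    return to_2tuple(sum(betas) / len(betas))
```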

31 citations


Network Information
Related Topics (5)
Graph (abstract data type)
69.9K papers, 1.2M citations
86% related
Time complexity
36K papers, 879.5K citations
85% related
Server
79.5K papers, 1.4M citations
83% related
Scalability
50.9K papers, 931.6K citations
83% related
Polynomial
52.6K papers, 853.1K citations
81% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    203
2022    459
2021    210
2020    285
2019    306
2018    266