Patent
Efficient duplicate detection for machine learning data sets
TL;DR: A machine learning service determines that an analysis is to be performed to detect whether at least a portion of the contents of one or more observation records of a first data set is duplicated in a second set of observation records.
Abstract:
At a machine learning service, a determination is made that an analysis to detect whether at least a portion of contents of one or more observation records of a first data set are duplicated in a second set of observation records is to be performed. A duplication metric is obtained, indicative of a non-zero probability that one or more observation records of the second set are duplicates of respective observation records of the first set. In response to determining that the duplication metric meets a threshold criterion, one or more responsive actions are initiated, such as the transmission of a notification to a client of the service.
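The flow in the abstract (obtain a duplication metric, compare it with a threshold, then trigger a responsive action) can be sketched as below. This is a minimal illustration under stated assumptions, not the patented method: the abstract does not say how the metric is computed, so the sketch assumes exact matching on SHA-256 hashes of selected fields and takes the metric to be the fraction of second-set records that also appear in the first set. The names `record_key`, `duplication_metric`, `check_duplicates`, the `fields` parameter, and the default threshold of 0.1 are all hypothetical.

```python
import hashlib
from typing import Callable, Iterable, Optional, Sequence

Record = Sequence[str]  # assumption: a record is a sequence of string fields


def record_key(record: Record, fields: Optional[Sequence[int]] = None) -> str:
    """Hash the selected fields of a record (all fields if none are given)."""
    values = record if fields is None else [record[i] for i in fields]
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    joined = "\x1f".join(values)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def duplication_metric(
    first: Iterable[Record],
    second: Iterable[Record],
    fields: Optional[Sequence[int]] = None,
) -> float:
    """Fraction of records in `second` whose key also occurs in `first`."""
    first_keys = {record_key(r, fields) for r in first}
    second_list = list(second)
    if not second_list:
        return 0.0
    duplicates = sum(1 for r in second_list if record_key(r, fields) in first_keys)
    return duplicates / len(second_list)


def check_duplicates(
    first: Iterable[Record],
    second: Iterable[Record],
    threshold: float = 0.1,  # hypothetical default, not taken from the source
    notify: Callable[[str], None] = print,
    fields: Optional[Sequence[int]] = None,
) -> float:
    """Compute the metric and call `notify` when it meets the threshold."""
    metric = duplication_metric(first, second, fields)
    if metric >= threshold:
        notify(f"Duplication metric {metric:.2f} meets threshold {threshold:.2f}")
    return metric


if __name__ == "__main__":
    training = [("1", "a"), ("2", "b"), ("3", "c"), ("4", "d")]
    test = [("3", "c"), ("9", "z")]
    check_duplicates(training, test, threshold=0.25)
```

Passing `fields` restricts the comparison to part of each record, which loosely mirrors the abstract's "at least a portion of contents". A probabilistic structure such as a Bloom filter could replace the exact key set for large data sets, at the cost of occasional false positives.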
Citations
Patent
Interactive interfaces for machine learning model evaluations
Polly Po Yee Lee,Nicolle M. Correa,Leo Parker Dirac,Aleksandr Mikhaylovich Ingerman,Sriram Krishnan,Jin Li,Sudhakar Rao Puvvadi,Saman Zarandioon,Charles Eric Dannaker,Rakesh Ramakrishnan,Tianming Zheng,Donghui Zhuo,Tarun Agarwal,Robert Matthias Steele,Jun Qian,Michael Brueckner,Ralf Herbrich,Daniel Blick +17 more
TL;DR: In this article, a first data set corresponding to an evaluation run of a model is generated at a machine learning service for display via an interactive interface, which includes a prediction quality metric.
Patent
Optimized training of linear machine learning models
Michael Brueckner,Daniel Blick +1 more
TL;DR: In this article, a linear prediction model is used to generate predictions using respective parameters assigned to a plurality of features derived from observation records of the data source, and the parameter values are stored in a parameter vector.
Patent
Data processing systems and methods for performing privacy assessments and monitoring of new versions of computer code for privacy compliance
TL;DR: In this article, the authors present a system that collects and/or uses personal data, and then automatically analyzes the computer code to identify one or more privacy-related attributes that may impact privacy assessment standards.
Patent
Data processing systems and methods for efficiently assessing the risk of privacy campaigns
TL;DR: In this paper, the authors provide a centralized repository of templates of privacy-related question/answer pairings for various vendors, products (e.g., software products), and services.
Patent
Data processing systems and methods for operationalizing privacy compliance and assessing the risk of various respective privacy campaigns
TL;DR: In this paper, the authors present a system to assess and display a relative risk associated with each campaign and automatically set, monitor, and facilitate the timely completion of an audit schedule for each campaign.
References
Posted Content
A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning
TL;DR: Bayesian optimization, as described in this paper, sets a prior over the objective function and combines it with evidence to obtain a posterior function. This permits a utility-based selection of the next observation to make on the objective function, which must take into account both exploration (sampling from areas of high uncertainty) and exploitation (sampling areas likely to offer improvement over the current best observation).
Journal ArticleDOI
Context-Aware Recommender Systems
TL;DR: An overview of the multifaceted notion of context is provided, several approaches for incorporating contextual information into the recommendation process are discussed, and the use of such approaches in several application areas where different types of contexts are exploited is illustrated.
Posted Content
Practical Bayesian Optimization of Machine Learning Algorithms
TL;DR: In this paper, a learning algorithm's generalization performance is modeled as a sample from a Gaussian process (GP), and the tractable posterior distribution induced by the GP leads to efficient use of the information gathered by previous experiments, enabling optimal choices about what parameters to try next.
Patent
Automatic software production system
José Iborra,Oscar Pastor +1 more
TL;DR: In this article, an automated software production system is provided, in which system requirements are captured, converted into a formal specification, and validated for correctness and completeness, and a translator is provided to automatically generate a complete, robust software application based on the validated formal specification.
Journal ArticleDOI
A comparative analysis of methods for pruning decision trees
TL;DR: A comparative study of six well-known pruning methods with the aim of understanding their theoretical foundations, their computational complexity, and the strengths and weaknesses of their formulation, and an objective evaluation of the tendency to overprune/underprune observed in each method is made.