
Probabilistic Programming for Advanced Machine Learning (PPAML) Discriminative Learning for Generative Tasks (DILIGENT)

About: The article was published on 2017-11-29 and is open access. It has received 4 citations to date. The article focuses on the topic: artificial neural networks.


Citations
01 Jan 2016
Abstract: No abstract is indexed for this work; the available text is CUNY Academic Works repository metadata. Follow this and additional works at: http://academicworks.cuny.edu/hc_sas_etds. Part of the Artificial Intelligence and Robotics, Cognition and Perception, History of Science, Technology, and Medicine, Information Security, Intellectual History, Metaphysics, Military History, Other Psychology, Philosophy of Science, and United States History Commons.

7 citations

Proceedings ArticleDOI
01 Aug 2019
TL;DR: This paper discusses probabilistic reasoning and probabilistic programming with respect to big data, and investigates the potential of probabilistic programming in big data by searching the literature for available big data solutions that use it.
Abstract: The advent of the Internet in the late 1990s led to an increased flood of data, termed big data. Deriving meaningful value from big data requires specialized tools and techniques, which are categorized under data management and data analysis. Under data analysis, predictive tools and techniques exist; they use statistical models and machine learning algorithms to predict future events. Models are developed using probability theories such as Bayesian networks. However, developing probabilistic models requires deep technical expertise, and modeling complex real-life situations is a difficult task; hence the emergence of probabilistic programming. The idea of probabilistic programming is new, and its potential for AI and big data processing is significant. This paper presents discussions of probabilistic reasoning and probabilistic programming with respect to big data. An investigation of the potential of probabilistic programming in big data is also presented, conducted as a literature search for available big data solutions that use probabilistic programming. The search found one solution, InferSpark, built on top of Apache Spark to process big data, an indication that more big data applications using probabilistic programming remain to be built.
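To make the modeling burden the abstract describes concrete, here is a minimal sketch of the kind of Bayesian update a probabilistic program automates: inferring a coin's bias from observed flips by grid approximation. It is a hypothetical illustration in plain Python/NumPy, not InferSpark or any system surveyed in the paper.

```python
import numpy as np

# Core of probabilistic programming, done by hand for illustration:
# prior beliefs + observed data -> posterior distribution.
rng = np.random.default_rng(0)
true_bias = 0.7
flips = rng.random(50) < true_bias            # 50 simulated coin flips
heads = int(flips.sum())

grid = np.linspace(0.0, 1.0, 1001)            # candidate bias values
prior = np.ones_like(grid)                    # uniform prior over the bias
# Likelihood of the observed flips for each candidate bias.
likelihood = grid**heads * (1.0 - grid)**(len(flips) - heads)
posterior = prior * likelihood
posterior /= posterior.sum()                  # normalize to a distribution

mean = float((grid * posterior).sum())
print(f"observed {heads}/{len(flips)} heads, posterior mean bias = {mean:.3f}")
```

A PPL generates this kind of machinery automatically from a declarative model description, which is exactly the expertise gap the paper argues probabilistic programming closes.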

3 citations


Cites background from "Probabilistic programming for advan..."

  • ...This is usually an arduous task that requires extreme technical expertise [38], [39]....


Journal ArticleDOI
TL;DR: A model of structured objects in a grayscale or color image, described by means of optimal piecewise constant image approximations, which achieve the minimum possible approximation error (the total squared error) for a given number of pixel clusters.
Abstract: The paper presents a model of structured objects in a grayscale or color image, described by means of optimal piecewise constant image approximations, which are characterized by the minimum possible approximation errors for a given number of pixel clusters, where the approximation error is the total squared error. An ambiguous image is described as a non-hierarchical structure but is represented as an ordered superposition of object hierarchies, each containing at least one optimal approximation in g0 = 1, 2, ..., etc., colors. For the selected hierarchy of pixel clusters, the objects of interest are detected as the pixel clusters of optimal approximations, or as their parts or unions. The paper develops the known idea in cluster analysis of jointly applying Ward's and K-means methods. At the same time, it proposes to modernize each of these methods and to supplement them with a third method of splitting/merging pixel clusters. This is useful for cluster analysis of big data described by a convex dependence of the optimal approximation error on the cluster number, and also for adjustable object detection in digital image processing using optimal hierarchical pixel clustering, which is treated as an alternative to the modern, informally defined "semantic" segmentation.
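For intuition, the sketch below computes a piecewise constant approximation of a toy grayscale image by plain K-means on pixel intensities and reports the total squared error for several cluster counts g. It is a simplification under stated assumptions: the paper's method combines modified Ward's and K-means steps with a third split/merge step, which this sketch omits.

```python
import numpy as np

def piecewise_constant_approx(image, g, iters=50, seed=0):
    """Approximate a grayscale image by g constant-intensity pixel clusters
    via plain K-means on intensities (a sketch, not the paper's hybrid)."""
    rng = np.random.default_rng(seed)
    pixels = image.reshape(-1).astype(float)
    centers = rng.choice(pixels, size=g, replace=False)
    for _ in range(iters):
        labels = np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)
        new = np.array([pixels[labels == k].mean() if np.any(labels == k)
                        else centers[k] for k in range(g)])
        if np.allclose(new, centers):
            break
        centers = new
    approx = centers[labels].reshape(image.shape)
    return approx, float(((image - approx) ** 2).sum())  # total squared error

# Toy image: two flat regions plus noise; the error drops sharply at g = 2.
rng = np.random.default_rng(1)
img = np.vstack([np.full((8, 16), 50.0), np.full((8, 16), 200.0)])
img += rng.normal(0, 5, img.shape)
for g in (1, 2, 3):
    _, err = piecewise_constant_approx(img, g)
    print(f"g={g}: total squared error = {err:.1f}")
```

The error-versus-g curve this prints is the convex dependence the abstract mentions; the paper's hierarchy selection builds on exactly that quantity.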

1 citation

Dissertation
18 Jul 2016
TL;DR: Probabilistic programming is a new approach that makes probabilistic reasoning systems easier to build and more widely applicable, and it has seen recent interest from the artificial intelligence, programming languages, cognitive science, and natural language communities.
Abstract: Probabilistic programming is a way to create systems that help us make decisions in the face of uncertainty. Many everyday decisions involve judgment in determining relevant factors that we do not directly observe. Historically, one way to make decisions under uncertainty has been to use a probabilistic reasoning system, which combines our knowledge of a situation with the laws of probability to determine the unobserved factors that are critical to the decision. Typically, observations are combined using Bayesian statistics, in which existing knowledge (priors) is combined with observations to gather evidence toward competing hypotheses.

Compared to other machine learning methods (such as random forests, neural networks, or linear regression), which take homogeneous data as input (requiring the user to separate their domain into different models), probabilistic programming leverages the data's original structure. It also provides full probability distributions over both the predictions and the parameters of the model, whereas other ML methods can only give the user a certain degree of confidence in the predictions.

Until recently, probabilistic reasoning systems were limited in scope and hard to apply to many real-world situations. Models were communicated using a mix of natural language, pseudocode, and mathematical formulae, and solved using special-purpose, one-off inference methods. Rather than precise specifications suitable for automatic inference, graphical models typically serve as coarse, high-level descriptions, eliding critical aspects such as fine-grained independence, abstraction, and recursion.

Probabilistic programming is a new approach that makes probabilistic reasoning systems easier to build and more widely applicable. A probabilistic programming language (PPL) is a programming language designed to describe probabilistic models, in such a way that the program itself is the model, and then to perform inference in those models. PPLs have seen recent interest from the artificial intelligence, programming languages, cognitive science, and natural language communities. By empowering users with a common dialect in the form of a programming language, rather than requiring each of them to undertake the non-trivial and error-prone task of writing their own models and hand-tailored inference algorithms for the problem at hand, probabilistic programming encourages exploration, since different models take less time to set up and evaluate, and enables sharing knowledge in the form of best practices, patterns, and tools such as optimized compilers or interpreters, debuggers, IDEs, optimizers, and profilers.

PPLs are closely related to graphical models and Bayesian networks, but are more expressive and flexible. One can easily see this by looking at the reusable components PPLs offer, one of them being the inference engine, which can be plugged into different models. For instance, it is easy to replace the traditional exact inference of Bayesian networks, which requires time exponential in the number of variables, with approximation algorithms such as Markov chain Monte Carlo (MCMC) or Variational Message Passing (VMP), which make it possible to compute large hierarchical models by resorting to sampling and approximation.

PPLs often extend from a basic language (i.e., they are embedded in a host language like R, Java, or Scala), although some PPLs, such as WinBUGS and Stan, offer a self-contained language with no obvious origin in another language.

There have been successful applications of visual programming in several domains, including education (MIT's Scratch and Microsoft's VPL), general-purpose programming (NoFlo), 3D modeling (Blender), and data science (RapidMiner and Weka Knowledge Flow). The latter, being popular products, have shown that there is added value in providing a graphical representation for working with data. However, as of today no tool provides a graphical representation for a PPL.

DARPA, the main funder of PPL research, considers one of the key goals of its Probabilistic Programming for Advancing Machine Learning program to be making models easier to write (reducing development time, encouraging experimentation, and lowering the level of expertise required to develop such models). Visual programming suits this kind of objective, so, building upon the enormous flexibility of PPLs and the advantages of probabilistic models, we want to exploit the graphical intuition of the data visualizations that data scientists are now accustomed to, and attempt to provide model and algorithm visualization by rethinking how to capture the (usually textual) programmatic formalisms in a graphical manner.

The goal of this dissertation is thus to explore graphical representations of a probabilistic programming language through node-based programming. The hypothesis under consideration is that graphical representations (not to be confused with Bayesian graphical models) are more intuitive and easier to learn than full-blown PPLs. We intend to validate this hypothesis by ensuring that classical problems solved in the literature by PPLs are also supported by our graphical representation, and then by measuring how quickly a group of people trained in statistics can produce a viable model in each alternative.
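The "program itself is the model" idea, and the pluggable inference engine the abstract highlights, can be sketched in a few lines of plain Python. All names here are illustrative, not drawn from any particular PPL; the inference engine is naive rejection sampling, where MCMC or VMP would slot in the same way.

```python
import random

def model(trace):
    """Generative program: did it rain, given that the lawn is wet?
    The program is the model; 'trace' is how a PPL would intercept
    random choices (trivial here)."""
    rain = trace("rain", random.random() < 0.2)
    sprinkler = trace("sprinkler", random.random() < 0.1)
    wet = rain or sprinkler or random.random() < 0.05
    return {"rain": rain, "wet": wet}

def rejection_inference(model, observation, n=100_000):
    """Reusable inference engine: run the program many times and keep
    only the runs matching the observation. Swapping in MCMC would
    replace this function, not the model."""
    record = lambda name, value: value
    kept = [s for s in (model(record) for _ in range(n))
            if s["wet"] == observation["wet"]]
    return sum(s["rain"] for s in kept) / len(kept)

random.seed(0)
print("P(rain | lawn is wet) ~", round(rejection_inference(model, {"wet": True}), 3))
```

Separating the model from the engine in this way is precisely the reusability argument the abstract makes for PPLs over one-off inference code.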
References
01 Jan 2007

17,341 citations

Book ChapterDOI
21 Jun 2000
TL;DR: Some previous studies comparing ensemble methods are reviewed, and some new experiments are presented to uncover the reasons that Adaboost does not overfit rapidly.
Abstract: Ensemble methods are learning algorithms that construct a set of classifiers and then classify new data points by taking a (weighted) vote of their predictions. The original ensemble method is Bayesian averaging, but more recent algorithms include error-correcting output coding, Bagging, and boosting. This paper reviews these methods and explains why ensembles can often perform better than any single classifier. Some previous studies comparing ensemble methods are reviewed, and some new experiments are presented to uncover the reasons that Adaboost does not overfit rapidly.
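As a minimal illustration of the (weighted) voting idea, the sketch below bags decision trees and compares the ensemble to a single tree. The dataset and the scikit-learn calls are assumptions of this sketch, not the paper's experimental setup.

```python
# Bagging: train many trees on bootstrap resamples, then majority-vote.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

single = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
bagged = BaggingClassifier(DecisionTreeClassifier(max_depth=3),
                           n_estimators=100, random_state=0).fit(X_tr, y_tr)

print("single tree accuracy:", round(single.score(X_te, y_te), 3))
print("bagged ensemble acc.:", round(bagged.score(X_te, y_te), 3))
```

The ensemble typically beats the single tree here for the reason the abstract gives: averaging many imperfect, decorrelated classifiers reduces variance.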

5,679 citations

Book
John Platt
08 Feb 1999
TL;DR: In this article, the authors propose a new algorithm for training Support Vector Machines (SVMs), called SMO (Sequential Minimal Optimization), which breaks the large quadratic programming (QP) problem arising in SVM training into a series of smallest possible QP problems.
Abstract: This chapter describes a new algorithm for training Support Vector Machines: Sequential Minimal Optimization, or SMO. Training a Support Vector Machine (SVM) requires the solution of a very large quadratic programming (QP) optimization problem. SMO breaks this large QP problem into a series of smallest possible QP problems. These small QP problems are solved analytically, which avoids using a time-consuming numerical QP optimization as an inner loop. The amount of memory required for SMO is linear in the training set size, which allows SMO to handle very large training sets. Because large matrix computation is avoided, SMO scales somewhere between linear and quadratic in the training set size for various test problems, while a standard projected conjugate gradient (PCG) chunking algorithm scales somewhere between linear and cubic in the training set size. SMO's computation time is dominated by SVM evaluation, hence SMO is fastest for linear SVMs and sparse data sets. For the MNIST database, SMO is as fast as PCG chunking, while for the UCI Adult database and linear SVMs, SMO can be more than 1000 times faster than the PCG chunking algorithm.
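A heavily reduced sketch of the two-multiplier analytic update at SMO's core is given below, following the commonly taught "simplified SMO" (random choice of the second multiplier instead of Platt's working-set heuristics, linear kernel only); it is illustrative, not Platt's full algorithm.

```python
import numpy as np

def simplified_smo(X, y, C=1.0, tol=1e-3, max_passes=10, seed=0):
    """Pick pairs of Lagrange multipliers, solve each 2-variable QP
    analytically, and keep the bias consistent -- the smallest-possible
    QP subproblems the abstract describes."""
    rng = np.random.default_rng(seed)
    n = len(y)
    K = X @ X.T                                   # linear kernel matrix
    alpha, b = np.zeros(n), 0.0
    f = lambda i: (alpha * y) @ K[:, i] + b       # decision value at point i
    passes = 0
    while passes < max_passes:
        changed = 0
        for i in range(n):
            E_i = f(i) - y[i]
            if (y[i] * E_i < -tol and alpha[i] < C) or (y[i] * E_i > tol and alpha[i] > 0):
                j = int(rng.integers(n - 1))
                j += j >= i                       # random j != i
                E_j = f(j) - y[j]
                ai, aj = alpha[i], alpha[j]
                if y[i] == y[j]:
                    L, H = max(0.0, ai + aj - C), min(C, ai + aj)
                else:
                    L, H = max(0.0, aj - ai), min(C, C + aj - ai)
                eta = 2 * K[i, j] - K[i, i] - K[j, j]
                if L == H or eta >= 0:
                    continue
                alpha[j] = np.clip(aj - y[j] * (E_i - E_j) / eta, L, H)
                if abs(alpha[j] - aj) < 1e-5:
                    continue
                alpha[i] += y[i] * y[j] * (aj - alpha[j])
                # Re-solve for the bias so the updated pair satisfies KKT.
                b1 = b - E_i - y[i] * (alpha[i] - ai) * K[i, i] - y[j] * (alpha[j] - aj) * K[i, j]
                b2 = b - E_j - y[i] * (alpha[i] - ai) * K[i, j] - y[j] * (alpha[j] - aj) * K[j, j]
                b = b1 if 0 < alpha[i] < C else b2 if 0 < alpha[j] < C else (b1 + b2) / 2
                changed += 1
        passes = passes + 1 if changed == 0 else 0
    return (alpha * y) @ X, b                     # weight vector and bias

# Toy linearly separable data, labels in {-1, +1}.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, (40, 2)), rng.normal(2, 1, (40, 2))])
y = np.array([-1.0] * 40 + [1.0] * 40)
w, b = simplified_smo(X, y)
print("training accuracy:", np.mean(np.sign(X @ w + b) == y))
```

Each step touches only two multipliers at a time, which is why no numerical QP solver or large matrix computation ever appears.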

5,019 citations

Journal ArticleDOI
TL;DR: An improved algorithm that theoretically converges and avoids numerical difficulties is proposed for Platt’s probabilistic outputs for Support Vector Machines.
Abstract: Platt's probabilistic outputs for Support Vector Machines (Platt, J. in Smola, A., et al. (eds.) Advances in large margin classifiers. Cambridge, 2000) has been popular for applications that require posterior class probabilities. In this note, we propose an improved algorithm that theoretically converges and avoids numerical difficulties. A simple and ready-to-use pseudo code is included.
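For intuition, here is a plain gradient-descent sketch of the Platt sigmoid fit this note improves on: map decision values f to P(y=1|f) = 1/(1+exp(A*f+B)) by minimizing cross-entropy against Platt's smoothed targets. The cited note's contribution, a provably convergent and numerically careful Newton-type procedure, is not reproduced here.

```python
import numpy as np

def platt_scaling(scores, labels, lr=0.01, steps=20000):
    """Fit A, B in P(y=1|f) = 1/(1+exp(A*f + B)) by gradient descent
    on the cross-entropy (a sketch; the cited note gives a robust
    Newton-type method with ready-to-use pseudo code)."""
    labels = np.asarray(labels)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    # Platt's smoothed targets instead of hard 0/1 labels.
    t = np.where(pos, (n_pos + 1.0) / (n_pos + 2.0), 1.0 / (n_neg + 2.0))
    A, B = 0.0, float(np.log((n_neg + 1.0) / (n_pos + 1.0)))
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(A * scores + B))
        g = t - p                     # shared gradient factor for A and B
        A -= lr * float(np.mean(g * scores))
        B -= lr * float(np.mean(g))
    return A, B

# Toy decision values: positives score high, negatives score low.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(1.5, 1, 100), rng.normal(-1.5, 1, 100)])
labels = np.array([1] * 100 + [0] * 100)
A, B = platt_scaling(scores, labels)
prob = lambda f: 1.0 / (1.0 + np.exp(A * f + B))
print(f"P(y=1 | f=2) = {prob(2.0):.3f}, P(y=1 | f=-2) = {prob(-2.0):.3f}")
```

The numerical difficulties the note addresses arise in exactly this objective (overflow in the exp and log terms), which its reformulation avoids.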

926 citations

Proceedings Article
01 Jan 2009
TL;DR: Details of the new paradigm and corresponding algorithms are discussed, some new algorithms are introduced, several specific forms of privileged information are considered, and the superiority of the new learning paradigm over the classical learning paradigm when solving practical problems is demonstrated.
Abstract: In the Afterword to the second edition of the book "Estimation of Dependences Based on Empirical Data" by V. Vapnik, an advanced learning paradigm called Learning Using Hidden Information (LUHI) was introduced. This Afterword also suggested an extension of the SVM method (the so-called SVM γ+ method) to implement algorithms which address the LUHI paradigm (Vapnik, 1982-2006, Sections 2.4.2 and 2.5.3 of the Afterword). See also (Vapnik, Vashist, & Pavlovitch, 2008, 2009) for further development of the algorithms. In contrast to the existing machine learning paradigm, where a teacher does not play an important role, the advanced learning paradigm considers some elements of human teaching. In the new paradigm, along with examples, a teacher can provide students with hidden information that exists in explanations, comments, comparisons, and so on. This paper discusses details of the new paradigm and corresponding algorithms, introduces some new algorithms, considers several specific forms of privileged information, demonstrates the superiority of the new learning paradigm over the classical learning paradigm when solving practical problems, and discusses general questions related to the new ideas.

525 citations