Other affiliations: University of Florida, Max Planck Society, Boston Children's Hospital
Bio: Włodzisław Duch is an academic researcher at Nicolaus Copernicus University in Toruń. The author has contributed to research on artificial neural networks and deep learning, has an h-index of 36, and has co-authored 258 publications receiving 4941 citations. Previous affiliations of Włodzisław Duch include the University of Florida and the Max Planck Society.
29 Jun 2007
TL;DR: A shared task involving the assignment of ICD-9-CM codes to radiology reports resulted in the first freely distributable corpus of fully anonymized clinical text, suggesting that human-like performance on this task is within the reach of currently available technologies.
Abstract: This paper reports on a shared task involving the assignment of ICD-9-CM codes to radiology reports. Two features distinguished this task from previous shared tasks in the biomedical domain. One is that it resulted in the first freely distributable corpus of fully anonymized clinical text. This resource is permanently available and will (we hope) facilitate future research. The other key feature of the task is that it required categorization with respect to a large and commercially significant set of labels. The number of participants was larger than in any previous biomedical challenge task. We describe the data production process and the evaluation measures, and give a preliminary analysis of the results. Many systems performed at levels approaching the inter-coder agreement, suggesting that human-like performance on this task is within the reach of currently available technologies.
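The abstract mentions evaluation measures without naming them. As a hedged illustration only, the sketch below computes micro-averaged F1, a measure commonly used for multi-label tasks such as code assignment. The function name `micro_f1` and the example codes are assumptions for illustration, not details taken from the paper.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over documents, each described by a set of labels.

    Counts are pooled over all documents before precision and recall
    are computed, so frequent labels weigh more than rare ones.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)   # labels assigned correctly
        fp += len(p - g)   # labels assigned but not in the gold set
        fn += len(g - p)   # gold labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: two reports with illustrative code sets.
gold = [{"786.2"}, {"780.6", "486"}]
predicted = [{"786.2"}, {"486"}]
score = micro_f1(gold, predicted)
```

Here the system finds two of three gold codes with no false positives, so precision is 1 and recall is 2/3.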
TL;DR: Several neural and machine learning methods of logical rule extraction generating initial rules are described, based on constrained multilayer perceptron, networks with localized transfer functions or on separability criteria for determination of linguistic variables.
Abstract: A new methodology of extraction, optimization, and application of sets of logical rules is described. Neural networks are used for initial rule extraction, local or global minimization procedures for optimization, and Gaussian uncertainties of measurements are assumed during application of logical rules. Algorithms for extraction of logical rules from data with real-valued features require determination of linguistic variables or membership functions. Context-dependent membership functions for crisp and fuzzy linguistic variables are introduced and methods of their determination described. Several neural and machine learning methods of logical rule extraction generating initial rules are described, based on constrained multilayer perceptron, networks with localized transfer functions or on separability criteria for determination of linguistic variables. A tradeoff between accuracy and simplicity is explored at the rule extraction stage and between rejection and error level at the optimization stage. Gaussian uncertainties of measurements are assumed during application of crisp logical rules, leading to "soft trapezoidal" membership functions and allowing the linguistic variables to be optimized using gradient procedures. Numerous applications of this methodology to benchmark and real-life problems are reported and very simple crisp logical rules for many datasets provided.
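The link between Gaussian measurement uncertainty and soft membership functions can be sketched as follows. A crisp condition "x in [a, b]" becomes a smooth function of the measured value once the true value is assumed to be normally distributed around the measurement. This is a minimal sketch using the exact Gaussian CDF; the function names and the parameter `s` (standard deviation of the measurement) are assumptions for illustration, not the authors' implementation.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def soft_interval_membership(x: float, a: float, b: float, s: float) -> float:
    """Probability that the true value lies in [a, b], assuming the
    measurement x carries Gaussian uncertainty with standard deviation s.

    For small s this approaches the crisp rule (1 inside, 0 outside);
    for larger s the edges of the interval become smooth slopes.
    """
    return normal_cdf((b - x) / s) - normal_cdf((a - x) / s)


# Hypothetical rule "0 <= x <= 1" evaluated at a few measured values.
inside = soft_interval_membership(0.5, 0.0, 1.0, 0.01)   # close to 1
on_edge = soft_interval_membership(0.0, 0.0, 1.0, 0.01)  # close to 0.5
outside = soft_interval_membership(2.0, 0.0, 1.0, 0.01)  # close to 0
```

Because the resulting function is differentiable in `a` and `b`, the interval boundaries can be tuned with gradient-based procedures, which is the property the abstract highlights.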
19 Apr 2004
TL;DR: All aspects of rule generation, optimization, and application are described, including the problem of finding good symbolic descriptors for continuous data, tradeoffs between accuracy and simplicity at the rule-extraction stage, and tradeoffs between rejection and error level at the rule optimization stage.
Abstract: In many applications, black-box prediction is not satisfactory, and understanding the data is of critical importance. Typically, approaches useful for understanding of data involve logical rules, evaluate similarity to prototypes, or are based on visualization or graphical methods. This paper is focused on the extraction and use of logical rules for data understanding. All aspects of rule generation, optimization, and application are described, including the problem of finding good symbolic descriptors for continuous data, tradeoffs between accuracy and simplicity at the rule-extraction stage, and tradeoffs between rejection and error level at the rule optimization stage. Stability of rule-based description, calculation of probabilities from rules, and other related issues are also discussed. Major approaches to extraction of logical rules based on neural networks, decision trees, machine learning, and statistical methods are introduced. Optimization and application issues for sets of logical rules are described. Applications of such methods to benchmark and real-life problems are reported and illustrated with simple logical rules for many datasets. Challenges and new directions for research are outlined.
01 Jan 2007
TL;DR: An algorithm for filtering information based on the Pearson χ2 test approach has been implemented and tested on feature selection, and empirical comparisons with four other state-of-the-art feature selection algorithms are very encouraging.
Abstract: An algorithm for filtering information based on the Pearson χ2 test approach has been implemented and tested on feature selection. This test is frequently used in biomedical data analysis and should be used only for nominal (discretized) features. The algorithm has only one parameter: the statistical confidence level that two distributions are identical. Empirical comparisons with four other state-of-the-art feature selection algorithms (FCBF, CorrSF, ReliefF and ConnSF) are very encouraging.
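The statistic underlying such a filter can be sketched in a few lines. The example below computes the Pearson χ2 statistic between two nominal variables from their contingency table and ranks features by their statistic against the class labels. It is a simplified illustration, not the authors' algorithm: the function names are hypothetical, and the confidence-level threshold described in the abstract is omitted.

```python
from collections import Counter


def chi2_statistic(feature, labels):
    """Pearson chi-squared statistic for two nominal variables.

    Compares the observed count of every (feature value, label) pair
    with the count expected if the two variables were independent.
    """
    n = len(feature)
    joint = Counter(zip(feature, labels))
    f_counts = Counter(feature)
    c_counts = Counter(labels)
    stat = 0.0
    for f, nf in f_counts.items():
        for c, nc in c_counts.items():
            expected = nf * nc / n
            observed = joint.get((f, c), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def rank_features(columns, labels):
    """Return feature names sorted by decreasing chi-squared statistic."""
    scores = {name: chi2_statistic(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data: one informative and one uninformative feature.
labels = [0, 0, 1, 1]
columns = {
    "informative": ["a", "a", "b", "b"],
    "noise": ["a", "b", "a", "b"],
}
ranking = rank_features(columns, labels)
```

The same `chi2_statistic` function can be applied to a pair of features rather than a feature and the class, which is how a test of whether two distributions are identical would be set up.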
20 Jun 2008
TL;DR: A critical survey of the state of the art in cognitive architectures is presented, providing insight into possible frameworks for general intelligence and outlining grand challenges and promising future directions.
Abstract: Cognitive architectures play a vital role in providing blueprints for building future intelligent systems supporting a broad range of capabilities similar to those of humans. How useful are existing architectures for creating artificial general intelligence? A critical survey of the state of the art in cognitive architectures is presented providing a useful insight into the possible frameworks for general intelligence. Grand challenges and an outline of the most promising future directions are described.
TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
Abstract: We show how to use "complementary priors" to eliminate the explaining-away effects that make inference difficult in densely connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modeled by long ravines in the free-energy landscape of the top-level associative memory, and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).
TL;DR: The editors have done a masterful job of weaving together the biologic, the behavioral, and the clinical sciences into a single tapestry in which everyone from the molecular biologist to the practicing psychiatrist can find and appreciate his or her own research.
Abstract: I have developed "tennis elbow" from lugging this book around the past four weeks, but it is worth the pain, the effort, and the aspirin. It is also worth the (relatively speaking) bargain price. Including appendixes, this book contains 894 pages of text. The entire panorama of the neural sciences is surveyed and examined, and it is comprehensive in its scope, from genomes to social behaviors. The editors explicitly state that the book is designed as "an introductory text for students of biology, behavior, and medicine," but it is hard to imagine any audience, interested in any fragment of neuroscience at any level of sophistication, that would not enjoy this book. The editors have done a masterful job of weaving together the biologic, the behavioral, and the clinical sciences into a single tapestry in which everyone from the molecular biologist to the practicing psychiatrist can find and appreciate his or
TL;DR: In this article, a new internally contracted direct multiconfiguration-reference configuration interaction (MRCI) method is described which allows the use of much larger reference spaces than any previous MRCI method.
Abstract: A new internally contracted direct multiconfiguration–reference configuration interaction (MRCI) method is described which allows the use of much larger reference spaces than any previous MRCI method. The configurations with two electrons in the external orbital space are generated by applying pair excitation operators to the reference wave function as a whole, while the singly external and internal configurations are standard uncontracted spin eigenfunctions. A new efficient and simple method for the calculation of the coupling coefficients is used, which is well suited for vector machines, and allows the recalculation of all coupling coefficients each time they are needed. The vector H⋅c is computed partly in a nonorthogonal configuration basis. In order to test the accuracy of the internally contracted wave functions, benchmark calculations have been performed for F−, H2O, NH2, CH2, CH3, OH, NO, N2, and O2 at various geometries. The deviations of the energies obtained with internally contracted and uncontracted MRCI wave functions are mostly smaller than 1 mH and typically 3–5 times smaller than the deviations between the uncontracted MRCI and the full CI. Dipole moments, electric dipole polarizabilities, and electronic dipole transition moments calculated with uncontracted and contracted MRCI wave functions also are found to be in close agreement. The efficiency of the method is demonstrated in large scale calculations for the CN, NH3, CO2, and Cr2 molecules. In these calculations up to 3088 reference configurations and up to 154 orbitals were employed. The biggest calculation is equivalent to an uncontracted MRCI with more than 78 million configurations.