
Showing papers by "Ran El-Yaniv published in 2021"


Proceedings Article
03 May 2021
TL;DR: Net-DNF, as discussed by the authors, is a generic architecture whose inductive bias elicits models structured as logical Boolean formulas in disjunctive normal form (DNF) over affine soft-threshold decision terms.
Abstract: A challenging open question in deep learning is how to handle tabular data. Unlike domains such as image and natural language processing, where deep architectures prevail, there is still no widely accepted neural architecture that dominates tabular data. As a step toward bridging this gap, we present Net-DNF, a novel generic architecture whose inductive bias elicits models whose structure corresponds to logical Boolean formulas in disjunctive normal form (DNF) over affine soft-threshold decision terms. Net-DNFs also promote localized decisions that are taken over small subsets of the features. We present extensive experiments showing that Net-DNFs significantly and consistently outperform fully connected networks over tabular data. With relatively few hyperparameters, Net-DNFs open the door to practical end-to-end handling of tabular data using neural networks. We present ablation studies, which justify the design choices of Net-DNF, including its inductive bias elements, namely, Boolean formulation, locality, and feature selection.

4 citations
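
A minimal sketch of the core idea, assuming tanh-based soft literals with hand-set gate offsets; the layer sizes, the random literal-to-conjunction mask, and the AND/OR gate constants below are illustrative assumptions rather than the exact Net-DNF formulation:

```python
import torch
import torch.nn as nn

class SoftDNFBlock(nn.Module):
    """Sketch of a DNF-style block: affine soft-threshold 'literals' are
    combined by a soft AND into conjunctions, then by a soft OR."""

    def __init__(self, in_features, n_literals=64, n_conjunctions=16):
        super().__init__()
        # affine decision terms, squashed into (-1, 1) by tanh in forward()
        self.literals = nn.Linear(in_features, n_literals)
        # fixed random mask assigning a small subset of literals to each
        # conjunction, mimicking the locality/feature-selection bias
        mask = (torch.rand(n_conjunctions, n_literals) < 0.25).float()
        self.register_buffer("mask", mask)

    def forward(self, x):
        lits = torch.tanh(self.literals(x))        # soft literals in (-1, 1)
        d = self.mask.sum(dim=1)                   # literals per conjunction
        # soft AND: near +1 only when all selected literals are near +1
        conj = torch.tanh(lits @ self.mask.t() - d + 1.5)
        # soft OR: near +1 when at least one conjunction is near +1
        k = conj.size(1)
        return torch.tanh(conj.sum(dim=1, keepdim=True) + k - 1.5)
```

A tabular classifier in this spirit would stack several such blocks in parallel and feed their outputs to a small linear head.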


Proceedings ArticleDOI
01 Jan 2021
TL;DR: In this article, the authors propose to transduce auxiliary text so as to enable recognition of relationships absent in the visual training data; the resulting model-agnostic module can be used as a plug-in in existing VRD and SGG recognition systems to improve their performance and extend their capabilities.
Abstract: An important challenge in visual scene understanding is the recognition of interactions between objects in an image. This task, often called visual relationship detection (VRD), must be solved to enable higher understanding of the semantic content in images. VRD can become particularly hard where there is severe statistical sparsity of some potentially involved objects, and where many relationships have only a limited number of examples in standard training sets. In this paper we show how to transduce auxiliary text so as to enable recognition of relationships absent in the visual training data. This transduction is performed by learning a shared relationship representation for both the textual and visual information. The proposed approach is model-agnostic and can be used as a plug-in module in existing VRD and scene graph generation (SGG) recognition systems to improve their performance and extend their capabilities. We consider the application of our technique using three widely accepted SGG models [20], [24], [16], and different auxiliary text sources: image captions, text generated by a deep text generation model (GPT-2), and ebooks from the Gutenberg Project. We conduct an extensive empirical study of both the VRD and SGG tasks over large-scale benchmark datasets. Our method is the first to enable recognition of visual relationships missing in the visual training data and appearing only in the auxiliary text. We conclusively show that text ingestion enables recognition of unseen visual relationships and, moreover, advances the state of the art in all SGG tasks.

2 citations
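
The shared representation mentioned above can be pictured with a small sketch: two encoders project visual and textual relationship features into one space where paired triplets are pulled together. The dimensions, encoder shapes, and contrastive loss below are hypothetical stand-ins, not the paper's actual model:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedRelationSpace(nn.Module):
    """Sketch: embed visual and textual relationship features into one
    shared space, so relationships seen only in auxiliary text can still
    be matched against visual inputs at inference time."""

    def __init__(self, vis_dim=1024, txt_dim=300, shared_dim=256):
        super().__init__()
        self.vis_proj = nn.Sequential(nn.Linear(vis_dim, shared_dim),
                                      nn.ReLU(),
                                      nn.Linear(shared_dim, shared_dim))
        self.txt_proj = nn.Sequential(nn.Linear(txt_dim, shared_dim),
                                      nn.ReLU(),
                                      nn.Linear(shared_dim, shared_dim))

    def forward(self, vis_feats, txt_feats):
        v = F.normalize(self.vis_proj(vis_feats), dim=-1)
        t = F.normalize(self.txt_proj(txt_feats), dim=-1)
        return v, t

def alignment_loss(v, t, temperature=0.1):
    # symmetric contrastive loss pulling paired (visual, textual)
    # relationship embeddings together and pushing mismatched pairs apart
    logits = v @ t.t() / temperature
    targets = torch.arange(v.size(0), device=v.device)
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2
```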


Posted Content
TL;DR: The authors propose a novel mechanism that, during training, dynamically augments the set of seen classes with additional fictitious classes, diminishing the model's tendency to fixate on attribute correlations that appear in the training set but will not appear in newly exposed classes.
Abstract: Focusing on discriminative zero-shot learning, in this work we introduce a novel mechanism that dynamically augments during training the set of seen classes to produce additional fictitious classes. These fictitious classes diminish the model's tendency to fixate during training on attribute correlations that appear in the training set but will not appear in newly exposed classes. The proposed model is tested within the two formulations of the zero-shot learning framework; namely, generalized zero-shot learning (GZSL) and classical zero-shot learning (CZSL). Our model improves the state-of-the-art performance on the CUB dataset and reaches comparable results on the other common datasets, AWA2 and SUN. We investigate the strengths and weaknesses of our method, including the effects of catastrophic forgetting when training an end-to-end zero-shot model.
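
One plausible way to generate such fictitious classes, assuming each seen class is described by an attribute vector, is to swap random attribute coordinates between two seen classes, which breaks spurious attribute co-occurrences; the paper's actual mixing mechanism may differ:

```python
import torch

def make_fictitious_classes(class_attrs, n_new, swap_frac=0.5):
    """Sketch: fabricate fictitious class attribute vectors by swapping a
    random subset of attributes between two randomly chosen seen classes.

    class_attrs: (n_seen, n_attrs) tensor of per-class attribute vectors.
    Returns a (n_new, n_attrs) tensor of fictitious class vectors.
    """
    n_seen, n_attrs = class_attrs.shape
    a = class_attrs[torch.randint(n_seen, (n_new,))]
    b = class_attrs[torch.randint(n_seen, (n_new,))]
    # coordinates marked True are taken from b, the rest from a
    swap = torch.rand(n_new, n_attrs) < swap_frac
    return torch.where(swap, b, a)
```

Regenerating these vectors every few training steps would keep the augmented class set dynamic, so the classifier cannot rely on any fixed attribute correlation.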

Posted Content
TL;DR: In this paper, the authors treat the board as a graph and embed a graph neural network architecture inside the AlphaZero framework, along with other innovative improvements, so that the model learns to play incrementally on small boards and advances to play on large ones.
Abstract: Playing board games is considered a major challenge for both humans and AI researchers. Because some complicated board games are quite hard to learn, humans usually begin by playing on smaller boards and incrementally advance to master larger board strategies. Most neural network frameworks that are currently tasked with playing board games neither perform such incremental learning nor possess capabilities to automatically scale up. In this work, we treat the board as a graph and combine a graph neural network architecture with the AlphaZero framework, along with some other innovative improvements. Our ScalableAlphaZero is capable of learning to play incrementally on small boards and of advancing to play on large ones. Our model can be trained quickly to play different challenging board games on multiple board sizes, without using any domain knowledge. We demonstrate the effectiveness of ScalableAlphaZero and show, for example, that by training it for only three days on small Othello boards, it can defeat, on a large board, an AlphaZero model that was trained on that board for 30 days.
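
The board-as-graph view is what makes the architecture size-agnostic: the same GNN operates on any number of nodes. A minimal sketch, assuming an n x n board with one-hot piece features per cell and 8-neighbor connectivity (the paper's exact features and edges may differ):

```python
import torch

def board_to_graph(board):
    """Sketch: encode an n x n board (values -1, 0, +1 per cell) as a
    graph with one-hot node features and edges to the 8 neighbors.
    Works for any n, so one GNN can play small and large boards alike.

    board: (n, n) integer tensor.
    Returns node features (n*n, 3) and edge index (2, n_edges).
    """
    n = board.size(0)
    flat = board.flatten()
    feats = torch.stack([flat == -1, flat == 0, flat == 1], dim=1).float()
    edges = []
    for r in range(n):
        for c in range(n):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < n and 0 <= cc < n:
                        edges.append((r * n + c, rr * n + cc))
    edge_index = torch.tensor(edges, dtype=torch.long).t()
    return feats, edge_index
```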

Posted Content
TL;DR: In this article, the authors present a novel and simple attack which, unlike adversarial attacks, does not cause incorrect predictions but instead cripples the network's capacity for uncertainty estimation, so that the DNN becomes more confident of its incorrect predictions than of its correct ones without having its accuracy reduced.
Abstract: Deep neural networks (DNNs) have proven to be powerful predictors and are widely used for various tasks. Credible uncertainty estimation of their predictions, however, is crucial for their deployment in many risk-sensitive applications. In this paper we present a novel and simple attack which, unlike adversarial attacks, does not cause incorrect predictions but instead cripples the network's capacity for uncertainty estimation. The result is that after the attack, the DNN is more confident of its incorrect predictions than of its correct ones, without having its accuracy reduced. We present two versions of the attack: the first targets a black-box regime (where the attacker has no knowledge of the target network), and the second targets a white-box setting. The attack requires only perturbations of minuscule magnitude to cause severe damage to uncertainty estimation, with larger magnitudes rendering the uncertainty estimates completely unusable. We demonstrate successful attacks on three of the most popular uncertainty estimation methods: the vanilla softmax score, Deep Ensembles, and MC-Dropout. Additionally, we show an attack on SelectiveNet, the selective classification architecture. We test the proposed attack on several contemporary architectures, such as MobileNetV2 and EfficientNetB0, all trained to classify ImageNet.
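
The attack's core idea admits a compact sketch: push confidence up where the model is wrong and down where it is right, rejecting any step that would flip a prediction. The PGD-style update, step sizes, and image-shaped inputs below are illustrative assumptions, not the paper's exact recipe:

```python
import torch
import torch.nn.functional as F

def confidence_attack(model, x, y, eps=1e-3, steps=10):
    """Sketch: perturb inputs so the model grows MORE confident on its
    incorrect predictions and LESS confident on its correct ones, while
    keeping every predicted label (and hence accuracy) unchanged.
    Assumes image-shaped inputs of shape (B, C, H, W)."""
    x_adv = x.clone().detach()
    alpha = eps / steps
    for _ in range(steps):
        x_adv.requires_grad_(True)
        conf, pred = F.softmax(model(x_adv), dim=1).max(dim=1)
        # +1 where the model is wrong (raise confidence), -1 where right
        sign = torch.where(pred != y,
                           torch.ones_like(conf), -torch.ones_like(conf))
        grad = torch.autograd.grad((sign * conf).sum(), x_adv)[0]
        with torch.no_grad():
            x_new = x_adv + alpha * grad.sign()
            # accept a sample's step only if its prediction is unchanged
            keep = model(x_new).argmax(dim=1) == pred
            x_adv = torch.where(keep.view(-1, 1, 1, 1), x_new, x_adv)
        x_adv = x_adv.detach()
    return x_adv
```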

Journal ArticleDOI
TL;DR: In this paper, the authors describe computerized visual acuity (VA) testing software and compare it to the Early Treatment Diabetic Retinopathy Study (ETDRS) chart.
Abstract: PURPOSE: To describe a method of computerized visual acuity (VA) testing software and compare it to the Early Treatment Diabetic Retinopathy Study (ETDRS) chart. METHODS: Setting: single tertiary institution. STUDY POPULATION: Prospective study including the right eyes of volunteers (N = 109) and patients (N = 126). INTERVENTION: Subjects were tested in random order, twice with the ETDRS chart and twice with the VA software. For ETDRS, we calculated the final VA separately for each run using four different test-termination criteria (1-miss in a row, 2-miss in a row, 50% miss, and per-letter). For the software testing, we calculated the final VA for varying numbers of letters presented. MAIN OUTCOME MEASURES: The main outcome measures were reproducibility and the number of letters required to exceed ETDRS reproducibility. RESULTS: For ETDRS, the average number of letters presented was 55.1 ± 9, 54.3 ± 10, 53.1 ± 10, and 70 for the 1-miss, 2-miss, 50%, and per-letter criteria, respectively. The test-retest variability (TRV) of ETDRS was 0.29, 0.42, 0.17, and 0.141 for the 1-miss in a row, 2-miss in a row, 50%, and per-letter termination criteria, respectively. For the software VA test, TRV was 0.202, 0.138, and 0.112 after presenting 6, 11, and 20 letters, respectively. The reproducibility of the software equaled that of the ETDRS chart at 11 letters and surpassed it thereafter. Similar results were achieved in the patient group. CONCLUSIONS: This study demonstrates that, by utilizing VA testing software based on advanced threshold testing algorithms, we were able to duplicate and then surpass the reproducibility of the ETDRS chart while presenting far fewer letters.
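
The abstract does not spell out the threshold algorithm behind the software; for intuition, here is a minimal sketch of the classic 1-up/1-down staircase family that computerized acuity tests typically build on. All parameters are illustrative assumptions:

```python
def staircase_va(respond, start_logmar=1.0, step=0.1, max_letters=20):
    """Sketch of a simple 1-up/1-down staircase for visual acuity:
    present one optotype per trial, move to smaller (harder) letters
    after a correct response and larger ones after a miss, then estimate
    the threshold from the reversal points.

    respond(logmar) -> bool: True if the subject reads the letter.
    """
    level, last_correct, reversals = start_logmar, None, []
    for _ in range(max_letters):
        correct = respond(level)
        if last_correct is not None and correct != last_correct:
            reversals.append(level)           # direction change: a reversal
        level += -step if correct else step   # smaller logMAR = harder
        last_correct = correct
    # average reversal levels as the threshold estimate
    return sum(reversals) / len(reversals) if reversals else level
```

Adaptive schemes in this family converge with far fewer presentations than a fixed chart, which is consistent with the per-letter reproducibility advantage reported above.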