Author

Shlomi Hod

Bio: Shlomi Hod is an academic researcher from Boston University. The author has contributed to research in topics: Artificial neural network & Modularity (networks). The author has an h-index of 4 and has co-authored 5 publications receiving 31 citations.

Papers
Posted Content
TL;DR: This work generalizes the results of Perdomo et al. (2020), who investigated "performative prediction" in a stateless setting, to the case where the response of the population to the deployed classifier depends both on the classifier and the previous distribution of the population.
Abstract: Deployed supervised machine learning models make predictions that interact with and influence the world. This phenomenon is called "performative prediction" by Perdomo et al. (2020), who investigated it in a stateless setting. We generalize their results to the case where the response of the population to the deployed classifier depends both on the classifier and the previous distribution of the population. We also demonstrate such a setting empirically, for the scenario of strategic manipulation.

41 citations
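
To make the stateful setting above concrete, here is a minimal, hypothetical sketch of repeated retraining when the population's response depends on both the deployed model and the previous distribution. The response map, its parameters (gamma, eps), and the least-squares learner are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

def respond(theta, prev_mean, eps=0.5, gamma=0.7):
    # Stateful response: the new feature distribution depends on the
    # deployed model theta AND on the previous distribution (its mean),
    # unlike the stateless setting of Perdomo et al. (2020). Agents
    # shift features in the direction favored by theta (strategic
    # manipulation), anchored to the previous mean with weight gamma.
    return gamma * prev_mean + (1 - gamma) * (prev_mean + eps * theta)

d, n = 2, 500
true_theta = np.array([1.0, -1.0])
mean, theta = np.zeros(d), rng.normal(size=d)

for t in range(15):
    mean = respond(theta, mean)                       # population reacts
    X = rng.normal(loc=mean, size=(n, d))             # sample new data
    y = X @ true_theta + 0.1 * rng.normal(size=n)
    theta_new = np.linalg.lstsq(X, y, rcond=None)[0]  # retrain
    print(f"round {t:2d}  parameter change = {np.linalg.norm(theta_new - theta):.4f}")
    theta = theta_new
```

The printed parameter change shrinking across rounds indicates convergence to a performatively stable point under this toy dynamic.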

Posted Content
10 Mar 2020
TL;DR: A measurable notion of modularity is introduced for multi-layer perceptrons (MLPs) and it is found that MLPs that undergo training and weight pruning are often significantly more modular than random networks with the same distribution of weights.
Abstract: The learned weights of a neural network are often considered devoid of scrutable internal structure. To discern structure in these weights, we introduce a measurable notion of modularity for multi-layer perceptrons (MLPs), and investigate the modular structure of MLPs trained on datasets of small images. Our notion of modularity comes from the graph clustering literature: a "module" is a set of neurons with strong internal connectivity but weak external connectivity. We find that training and weight pruning produces MLPs that are more modular than randomly initialized ones, and often significantly more modular than random MLPs with the same (sparse) distribution of weights. Interestingly, they are much more modular when trained with dropout. We also present exploratory analyses of the importance of different modules for performance and how modules depend on each other. Understanding the modular structure of neural networks, when such structure exists, will hopefully render their inner workings more interpretable to engineers. Note that this paper has been superseded by "Clusterability in Neural Networks", arXiv:2103.03386!

9 citations
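
To illustrate the graph-clustering notion of modularity, the sketch below builds a weighted graph from an MLP's (pruned) weight matrices and scores a partition with networkx's modularity measure. The greedy community detection and the magnitude-threshold pruning are stand-ins for illustration, not the paper's exact pipeline.

```python
import numpy as np
import networkx as nx
from networkx.algorithms import community

def mlp_weight_graph(weight_mats):
    # One node per neuron; undirected edges between consecutive layers
    # weighted by |w_ij|, so a "module" is a set of neurons with strong
    # internal and weak external connectivity.
    G, offset = nx.Graph(), 0
    for W in weight_mats:
        n_in, n_out = W.shape
        for i in range(n_in):
            for j in range(n_out):
                if W[i, j] != 0:
                    G.add_edge(offset + i, offset + n_in + j, weight=abs(W[i, j]))
        offset += n_in
    return G

rng = np.random.default_rng(0)
# hypothetical trained weights for a small 20-16-16-10 MLP
weights = [rng.normal(size=(20, 16)), rng.normal(size=(16, 16)), rng.normal(size=(16, 10))]
# crude magnitude pruning, standing in for the paper's sparse networks
weights = [np.where(np.abs(W) > 1.0, W, 0.0) for W in weights]

G = mlp_weight_graph(weights)
modules = community.greedy_modularity_communities(G, weight="weight")
Q = community.modularity(G, modules, weight="weight")
print(f"{len(modules)} modules found, modularity Q = {Q:.3f}")
```

Comparing Q for a trained network against Q for random networks with the same sparse weight distribution is the paper's central measurement idea.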

Posted Content
TL;DR: A measurable notion of modularity for multi-layer perceptrons (MLPs) is introduced, and it is found that training and weight pruning produces MLPs that are more modular than randomly initialized ones, and often significantly more modular than random MLPs with the same (sparse) distribution of weights.
Abstract: The learned weights of a neural network are often considered devoid of scrutable internal structure. To discern structure in these weights, we introduce a measurable notion of modularity for multi-layer perceptrons (MLPs), and investigate the modular structure of MLPs trained on datasets of small images. Our notion of modularity comes from the graph clustering literature: a "module" is a set of neurons with strong internal connectivity but weak external connectivity. We find that training and weight pruning produces MLPs that are more modular than randomly initialized ones, and often significantly more modular than random MLPs with the same (sparse) distribution of weights. Interestingly, they are much more modular when trained with dropout. We also present exploratory analyses of the importance of different modules for performance and how modules depend on each other. Understanding the modular structure of neural networks, when such structure exists, will hopefully render their inner workings more interpretable to engineers. Note that this paper has been superseded by "Clusterability in Neural Networks", arXiv:2103.03386!

6 citations

Posted Content
TL;DR: In this article, the authors look for structure in the form of clusterability: how well a network can be divided into groups of neurons with strong internal connectivity but weak external connectivity and find that a trained neural network is typically more clusterable than randomly initialized networks, and often clusterable relative to random networks with the same distribution of weights.
Abstract: The learned weights of a neural network have often been considered devoid of scrutable internal structure. In this paper, however, we look for structure in the form of clusterability: how well a network can be divided into groups of neurons with strong internal connectivity but weak external connectivity. We find that a trained neural network is typically more clusterable than randomly initialized networks, and often clusterable relative to random networks with the same distribution of weights. We also exhibit novel methods to promote clusterability in neural network training, and find that in multi-layer perceptrons they lead to more clusterable networks with little reduction in accuracy. Understanding and controlling the clusterability of neural networks will hopefully render their inner workings more interpretable to engineers by facilitating partitioning into meaningful clusters.

5 citations
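
A crude sketch of the clusterability idea: compare the normalized cut (n-cut) of a random split against a split found by spectral bisection of the same weight-derived adjacency matrix. The adjacency below is random noise standing in for a real network's symmetrized |weights|; in a genuinely clusterable network the spectral split's n-cut would be far lower.

```python
import numpy as np

def normalized_cut(A, labels):
    # n-cut: for each cluster, the weight of edges leaving it divided by
    # its total degree; lower values mean a more clusterable network.
    total = 0.0
    for c in np.unique(labels):
        in_c = labels == c
        total += A[in_c][:, ~in_c].sum() / A[in_c].sum()
    return total

rng = np.random.default_rng(1)
# stand-in adjacency (in practice: symmetrized |weights| of a network)
A = np.abs(rng.normal(size=(30, 30)))
A = (A + A.T) / 2
np.fill_diagonal(A, 0.0)

# spectral bisection via the normalized Laplacian's second eigenvector
d = A.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_norm = np.eye(30) - D_inv_sqrt @ A @ D_inv_sqrt
_, vecs = np.linalg.eigh(L_norm)
labels_spec = (vecs[:, 1] > 0).astype(int)
labels_rand = rng.integers(0, 2, size=30)

print("n-cut, random split  :", round(normalized_cut(A, labels_rand), 3))
print("n-cut, spectral split:", round(normalized_cut(A, labels_spec), 3))
```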

Journal ArticleDOI
TL;DR: This article presents an approach to learning Responsible AI together.
Abstract: Learning Responsible AI together.

5 citations


Cited by
Proceedings Article
18 Jul 2021
TL;DR: A functional modular probing method is used to analyze deep model structures in the OOD setting, demonstrating that even in biased models (which focus on spurious correlation) there still exist unbiased functional subnetworks.
Abstract: Can models with particular structure avoid being biased towards spurious correlation in out-of-distribution (OOD) generalization? Peters et al. (2016) provide a positive answer for linear cases. In this paper, we use a functional modular probing method to analyze deep model structures under the OOD setting. We demonstrate that even in biased models (which focus on spurious correlation) there still exist unbiased functional subnetworks. Furthermore, we articulate and demonstrate the functional lottery ticket hypothesis: the full network contains a subnetwork that can achieve better OOD performance. We then propose Modular Risk Minimization to solve the subnetwork selection problem. Our algorithm learns the subnetwork structure from a given dataset, and can be combined with any other OOD regularization methods. Experiments on various OOD generalization tasks corroborate the effectiveness of our method.

45 citations
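
The paper's functional modular probing is not spelled out here, so the sketch below substitutes a deliberately crude stand-in: greedily ablate hidden units of a frozen, hand-built "biased" network and keep a unit only if removing it does not hurt accuracy on decorrelated (OOD-style) data. It illustrates probing for an unbiased subnetwork, not the authors' algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hand-built stand-in for a biased trained net: half the hidden units
# read the core feature x0 (predictive everywhere), half read the
# spurious feature x1 (predictive only in-distribution).
W1 = np.zeros((2, 16)); W1[0, :8] = 1.0; W1[1, 8:] = 1.0
W2 = np.ones((16, 1))

def predict(X, unit_mask):
    h = np.maximum(X @ W1, 0) * unit_mask   # ablate units via the mask
    return (h @ W2).ravel() > 2.0

def ood_accuracy(unit_mask, n=2000):
    # OOD data: the spurious feature is decorrelated from the label.
    y = rng.integers(0, 2, n)
    X = np.stack([y + 0.2 * rng.normal(size=n),   # core feature tracks y
                  rng.normal(size=n)], axis=1)    # spurious feature: noise
    return np.mean(predict(X, unit_mask) == y)

# Greedy probe: drop each unit if ablating it does not hurt OOD accuracy.
mask = np.ones(16)
for u in range(16):
    trial = mask.copy()
    trial[u] = 0.0
    if ood_accuracy(trial) >= ood_accuracy(mask):
        mask = trial
print("units kept:", np.flatnonzero(mask), " OOD acc:", round(ood_accuracy(mask), 3))
```

On this toy network the probe reliably discards the spurious units, leaving an unbiased subnetwork inside the biased model.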

Posted Content
TL;DR: This paper identifies a natural set of properties of the loss function and model-induced distribution shift under which the performative risk is convex, a property which does not follow from convexity of the loss alone.
Abstract: In performative prediction, predictions guide decision-making and hence can influence the distribution of future data. To date, work on performative prediction has focused on finding performatively stable models, which are the fixed points of repeated retraining. However, stable solutions can be far from optimal when evaluated in terms of the performative risk, the loss experienced by the decision maker when deploying a model. In this paper, we shift attention beyond performative stability and focus on optimizing the performative risk directly. We identify a natural set of properties of the loss function and model-induced distribution shift under which the performative risk is convex, a property which does not follow from convexity of the loss alone. Furthermore, we develop algorithms that leverage our structural assumptions to optimize the performative risk with better sample efficiency than generic methods for derivative-free convex optimization.

42 citations
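
Since the paper develops derivative-free methods for the performative risk, a minimal sketch of the generic baseline may help: two-point zeroth-order gradient descent on a toy performative objective. The response map (mean shifted by eps * theta) and all constants are illustrative assumptions, not the paper's setting.

```python
import numpy as np

rng = np.random.default_rng(0)

def performative_risk(theta, eps=0.5, n=2000):
    # Toy performative objective: deploying theta shifts the data mean
    # by eps * theta, then the risk is squared error on the shifted data.
    X = rng.normal(loc=eps * theta, size=(n, theta.size))
    y = X.sum(axis=1)                      # toy target
    return np.mean((X @ theta - y) ** 2)

def two_point_grad(f, theta, delta=0.05):
    # classic two-point zeroth-order gradient estimator
    u = rng.normal(size=theta.size)
    u /= np.linalg.norm(u)
    return theta.size * (f(theta + delta * u) - f(theta - delta * u)) / (2 * delta) * u

theta = np.zeros(3)
for _ in range(300):
    theta -= 0.01 * two_point_grad(performative_risk, theta)
print("theta:", np.round(theta, 2), " risk:", round(performative_risk(theta), 4))
```

The paper's contribution is showing when the performative risk is convex, so that such derivative-free schemes (and more sample-efficient ones exploiting the structure) provably find the performative optimum rather than merely a stable point.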

Posted Content
TL;DR: A novel method based on learning binary weight masks to identify individual weights and subnets responsible for specific functions in NNs is presented, demonstrating how common NNs fail to reuse submodules and offering new insights into the related issue of systematic generalization on language tasks.
Abstract: Neural networks (NNs) whose subnetworks implement reusable functions are expected to offer numerous advantages, including compositionality through efficient recombination of functional building blocks, interpretability, preventing catastrophic interference, etc. Understanding if and how NNs are modular could provide insights into how to improve them. Current inspection methods, however, fail to link modules to their functionality. In this paper, we present a novel method based on learning binary weight masks to identify individual weights and subnets responsible for specific functions. Using this powerful tool, we contribute an extensive study of emerging modularity in NNs that covers several standard architectures and datasets. We demonstrate how common NNs fail to reuse submodules and offer new insights into the related issue of systematic generalization on language tasks.

32 citations
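
A sketch of the core mechanism, learning a binary weight mask with a straight-through estimator over a frozen network, is below. The architecture, the toy "function" (copying one input), and the sparsity penalty are illustrative assumptions; the paper's actual training setup and datasets differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

class MaskedLinear(nn.Module):
    # Frozen random weights; only per-weight mask logits are trainable.
    def __init__(self, in_f, out_f):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_f, in_f) * 0.3, requires_grad=False)
        self.logits = nn.Parameter(torch.full((out_f, in_f), 2.0))  # start unmasked

    def forward(self, x):
        p = torch.sigmoid(self.logits)
        mask = (p > 0.5).float() + p - p.detach()  # straight-through estimator
        return F.linear(x, self.weight * mask)

net = nn.Sequential(MaskedLinear(8, 32), nn.ReLU(), MaskedLinear(32, 1))
masks = [m for m in net if isinstance(m, MaskedLinear)]
opt = torch.optim.Adam([m.logits for m in masks], lr=0.05)

x = torch.randn(512, 8)
y = x[:, :1]                    # the toy "function": copy the first input

for step in range(300):
    sparsity = sum(torch.sigmoid(m.logits).mean() for m in masks)
    loss = F.mse_loss(net(x), y) + 1e-2 * sparsity  # keep the subnet small
    opt.zero_grad()
    loss.backward()
    opt.step()

kept = sum(int((torch.sigmoid(m.logits) > 0.5).sum()) for m in masks)
print("weights kept:", kept, "of", sum(m.weight.numel() for m in masks))
```

The surviving weights form the subnet "responsible" for the probed function, which is the kind of module the paper's masks are designed to isolate.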

Posted Content
TL;DR: This paper tests whether 17 unsupervised, weakly supervised, and fully supervised representation learning approaches correctly infer the generative factors of variation in simple datasets, and observes that all of them struggle to learn the underlying mechanism regardless of supervision signal and architectural bias.
Abstract: An important component for generalization in machine learning is to uncover underlying latent factors of variation as well as the mechanism through which each factor acts in the world. In this paper, we test whether 17 unsupervised, weakly supervised, and fully supervised representation learning approaches correctly infer the generative factors of variation in simple datasets (dSprites, Shapes3D, MPI3D). In contrast to prior robustness work that introduces novel factors of variation during test time, such as blur or other (un)structured noise, we here recompose, interpolate, or extrapolate only existing factors of variation from the training data set (e.g., small and medium-sized objects during training and large objects during testing). Models that learn the correct mechanism should be able to generalize to this benchmark. In total, we train and test 2000+ models and observe that all of them struggle to learn the underlying mechanism regardless of supervision signal and architectural bias. Moreover, the generalization capabilities of all tested models drop significantly as we move from artificial datasets towards more realistic real-world datasets. Despite their inability to identify the correct mechanism, the models are quite modular as their ability to infer other in-distribution factors remains fairly stable, provided only a single factor is out-of-distribution. These results point to an important yet understudied problem of learning mechanistic models of observations that can facilitate generalization.

28 citations
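
The benchmark's extrapolation protocol can be illustrated with a tiny factor table in the style of dSprites; the factor ranges and the 0.8 scale cutoff below are made-up stand-ins for the paper's actual splits.

```python
import numpy as np

# Hypothetical dSprites-style factor table: (shape, scale, x, y)
factors = np.stack(np.meshgrid(
    np.arange(3),              # shape
    np.linspace(0.5, 1.0, 6),  # scale
    np.linspace(0, 1, 8),      # x position
    np.linspace(0, 1, 8),      # y position
    indexing="ij",
), axis=-1).reshape(-1, 4)

# Extrapolation split on a single factor: small/medium scales during
# training, the largest scales held out for test time only.
scale = factors[:, 1]
train = factors[scale <= 0.8]
test = factors[scale > 0.8]
print(len(train), "train combos,", len(test), "held-out extrapolation combos")
```

A model that learned the true generative mechanism should handle the held-out combinations, which is exactly what the paper finds the 2000+ tested models fail to do.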

10 Jan 2022
TL;DR: A new game theoretic framework for this phenomenon, called multi-player performative prediction, is formulated, and it is shown that under mild assumptions the performatively stable equilibria can be found efficiently by a variety of algorithms, including repeated retraining and repeated (stochastic) gradient play.
Abstract: Learning problems commonly exhibit an interesting feedback mechanism wherein the population data reacts to competing decision makers' actions. This paper formulates a new game theoretic framework for this phenomenon, called "multi-player performative prediction". We focus on two distinct solution concepts, namely (i) performatively stable equilibria and (ii) Nash equilibria of the game. The latter equilibria are arguably more informative, but can be found efficiently only when the game is monotone. We show that under mild assumptions, the performatively stable equilibria can be found efficiently by a variety of algorithms, including repeated retraining and the repeated (stochastic) gradient method. We then establish transparent sufficient conditions for strong monotonicity of the game and use them to develop algorithms for finding Nash equilibria. We investigate derivative free methods and adaptive gradient algorithms wherein each player alternates between learning a parametric description of their distribution and gradient steps on the empirical risk. Synthetic and semi-synthetic numerical experiments illustrate the results.

15 citations
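
To ground the idea of repeated (stochastic) gradient play converging to a performatively stable equilibrium, here is a toy two-player simulation; the linear response map and squared losses are illustrative assumptions chosen so the game is a contraction.

```python
import numpy as np

rng = np.random.default_rng(0)

def population_mean(theta):
    # Toy response map: the data distribution reacts to BOTH players'
    # deployed decisions (coefficients chosen so play converges).
    return 1.0 + 0.3 * theta[0] - 0.2 * theta[1]

def stochastic_grad(i, theta, n=500):
    # Player i's loss is E[(theta_i - z)^2] over the reacted population.
    z = rng.normal(loc=population_mean(theta), size=n)
    return 2.0 * np.mean(theta[i] - z)

theta = np.zeros(2)
for _ in range(500):
    g = np.array([stochastic_grad(0, theta), stochastic_grad(1, theta)])
    theta -= 0.05 * g            # simultaneous repeated gradient play
print("approximate performatively stable point:", np.round(theta, 3))
```

At the stable point each player's decision is a fixed point of retraining against the distribution induced by all players, the first of the paper's two solution concepts.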