Author

Matteo Manica

Other affiliations: ETH Zurich

Bio: Matteo Manica is a researcher at IBM. The author has contributed to research in computer science and medicine, has an h-index of 12, and has co-authored 46 publications that have received 549 citations. Previous affiliations of Matteo Manica include ETH Zurich.

Papers published on a yearly basis: 2017 to 2023 (chart not reproduced)

Papers

Journal Article

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao, Angela Fan, Christopher Akiki, Elizabeth-Jane Pavlick +383 more

09 Nov 2022-arXiv.org

TL;DR: BLOOM is a 176B-parameter, decoder-only Transformer language model trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total).

Abstract: Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

407 citations

Journal Article

Mixed-precision in-memory computing

Manuel Le Gallo¹,², Abu Sebastian¹, Roland Mathis¹, Matteo Manica¹,², Heiner Giefers¹, Tomas Tuma¹, Costas Bekas¹, Alessandro Curioni¹, Evangelos Eleftheriou¹

Institutions: IBM¹, ETH Zurich²

01 Apr 2018

TL;DR: A hybrid system that combines a von Neumann machine with a computational memory unit can offer both the high precision of digital computing and the energy/areal efficiency of in-memory computing, which is illustrated by accurately solving a system of 5,000 equations using 998,752 phase-change memory devices.

Abstract: As complementary metal–oxide–semiconductor (CMOS) scaling reaches its technological limits, a radical departure from traditional von Neumann systems, which involve separate processing and memory units, is needed in order to extend the performance of today’s computers substantially. In-memory computing is a promising approach in which nanoscale resistive memory devices, organized in a computational memory unit, are used for both processing and memory. However, to reach the numerical accuracy typically required for data analytics and scientific computing, limitations arising from device variability and non-ideal device characteristics need to be addressed. Here we introduce the concept of mixed-precision in-memory computing, which combines a von Neumann machine with a computational memory unit. In this hybrid system, the computational memory unit performs the bulk of a computational task, while the von Neumann machine implements a backward method to iteratively improve the accuracy of the solution. The system therefore benefits from both the high precision of digital computing and the energy/areal efficiency of in-memory computing. We experimentally demonstrate the efficacy of the approach by accurately solving systems of linear equations, in particular, a system of 5,000 equations using 998,752 phase-change memory devices.

280 citations
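
The mixed-precision scheme described in the abstract above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the noisy matrix-vector product stands in for the computational memory unit, the 5% multiplicative noise model is hypothetical, and the Jacobi inner solver and all function names are choices made here for illustration. The outer loop plays the role of the von Neumann machine: it computes the residual in full precision and applies the inexact correction until the residual is small.

```python
import random

def matvec(A, x):
    """Exact (high-precision) matrix-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def noisy_matvec(A, x, rng, rel_noise=0.05):
    """Stand-in for the computational memory unit (assumed noise model):
    every product term is perturbed by up to rel_noise multiplicative noise."""
    return [
        sum(a * xj * (1.0 + rng.uniform(-rel_noise, rel_noise))
            for a, xj in zip(row, x))
        for row in A
    ]

def inexact_solve(A, r, rng, sweeps=5):
    """Low-precision inner solver: a few Jacobi sweeps for A z = r,
    using only the noisy matrix-vector product."""
    n = len(r)
    z = [0.0] * n
    for _ in range(sweeps):
        Az = noisy_matvec(A, z, rng)
        z = [z[i] + (r[i] - Az[i]) / A[i][i] for i in range(n)]
    return z

def refine(A, b, tol=1e-10, max_outer=200, seed=0):
    """Outer refinement loop: exact residual, inexact correction."""
    rng = random.Random(seed)
    x = [0.0] * len(b)
    for it in range(max_outer):
        # Residual computed in full precision (the "digital" part).
        r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
        if max(abs(v) for v in r) < tol:
            return x, it
        # Correction computed inexactly (the "in-memory" part).
        z = inexact_solve(A, r, rng)
        x = [xi + zi for xi, zi in zip(x, z)]
    return x, max_outer

# Small diagonally dominant system whose exact solution is [1, 2, 3].
A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
b = [6.0, 12.0, 14.0]
x, outer_iterations = refine(A, b)
print(x, outer_iterations)
```

Even though each inner solve is only accurate to within a noticeable relative error, the error of the outer iterate shrinks by a constant factor per pass because the residual is always evaluated exactly. That is the sense in which the hybrid system keeps digital precision while delegating most of the arithmetic to the imprecise unit.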

Journal Article

Mixed-Precision In-Memory Computing

Manuel Le Gallo¹,², Abu Sebastian¹, Roland Mathis¹, Matteo Manica¹,², Heiner Giefers¹, Tomas Tuma¹, Costas Bekas¹, Alessandro Curioni¹, Evangelos Eleftheriou¹

Institutions: IBM¹, ETH Zurich²

16 Jan 2017-arXiv: Emerging Technologies

TL;DR: A mixed-precision in-memory computing system is proposed that combines a von Neumann machine with a computational memory unit, addressing the limitations that arise from device variability and non-ideal device characteristics.

Abstract: As CMOS scaling reaches its technological limits, a radical departure from traditional von Neumann systems, which involve separate processing and memory units, is needed in order to significantly extend the performance of today's computers. In-memory computing is a promising approach in which nanoscale resistive memory devices, organized in a computational memory unit, are used for both processing and memory. However, to reach the numerical accuracy typically required for data analytics and scientific computing, limitations arising from device variability and non-ideal device characteristics need to be addressed. Here we introduce the concept of mixed-precision in-memory computing, which combines a von Neumann machine with a computational memory unit. In this hybrid system, the computational memory unit performs the bulk of a computational task, while the von Neumann machine implements a backward method to iteratively improve the accuracy of the solution. The system therefore benefits from both the high precision of digital computing and the energy/areal efficiency of in-memory computing. We experimentally demonstrate the efficacy of the approach by accurately solving systems of linear equations, in particular, a system of 5,000 equations using 998,752 phase-change memory devices.

101 citations

Journal Article

Toward Explainable Anticancer Compound Sensitivity Prediction via Multimodal Attention-Based Convolutional Encoders.

Matteo Manica¹, Ali Oskooei¹, Jannis Born¹,²,³, Vigneshwari Subramanian⁴, Julio Saez-Rodriguez⁵, María Rodríguez Martínez¹

Institutions: IBM¹, ETH Zurich², University of Zurich³, RWTH Aachen University⁴, Heidelberg University⁵

31 Oct 2019-Molecular Pharmaceutics

TL;DR: A multimodal attention-based convolutional encoder is proposed for interpretable prediction of anticancer compound sensitivity, combining SMILES representations of compounds, gene expression profiles of tumors, and prior knowledge from protein-protein interaction networks.

Abstract: In line with recent advances in neural drug design and sensitivity prediction, we propose a novel architecture for interpretable prediction of anticancer compound sensitivity using a multimodal attention-based convolutional encoder. Our model is based on the three key pillars of drug sensitivity: compounds' structure in the form of a SMILES sequence, gene expression profiles of tumors, and prior knowledge on intracellular interactions from protein-protein interaction networks. We demonstrate that our multiscale convolutional attention-based encoder significantly outperforms a baseline model trained on Morgan fingerprints and a selection of encoders based on SMILES, as well as the previously reported state-of-the-art for multimodal drug sensitivity prediction (R² = 0.86 and RMSE = 0.89). Moreover, the explainability of our approach is demonstrated by a thorough analysis of the attention weights. We show that the attended genes significantly enrich apoptotic processes and that the drug attention is strongly correlated with a standard chemical structure similarity index. Finally, we report a case study of two receptor tyrosine kinase (RTK) inhibitors acting on a leukemia cell line, showcasing the ability of the model to focus on informative genes and submolecular regions of the two compounds. The demonstrated generalizability and the interpretability of our model testify to its potential for in silico prediction of anticancer compound efficacy on unseen cancer cells, positioning it as a valid solution for the development of personalized therapies as well as for the evaluation of candidate compounds in de novo drug design.

83 citations

Proceedings Article

CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models

Vijil Chenthamarakshan¹, Payel Das¹, Samuel C. Hoffman¹, Hendrik Strobelt², Inkit Padhi¹, Kar Wai Lim¹, Benjamin Hoover², Matteo Manica¹, Jannis Born¹,³, Teodoro Laino¹, Aleksandra Mojsilovic¹

Institutions: IBM¹, Massachusetts Institute of Technology², ETH Zurich³

02 Apr 2020

TL;DR: A deep learning based generative modeling framework is presented for designing drug candidates that are specific to a given target protein sequence and have high off-target selectivity; it is augmented with an in silico screening process that accounts for toxicity, to lower the failure rate of the generated drug candidates in later stages of the drug development pipeline.

Abstract: The novel nature of SARS-CoV-2 calls for the development of efficient de novo drug design approaches. In this study, we propose an end-to-end framework, named CogMol (Controlled Generation of Molecules), for designing new drug-like small molecules targeting novel viral proteins with high affinity and off-target selectivity. CogMol combines adaptive pre-training of a molecular SMILES Variational Autoencoder (VAE) and an efficient multi-attribute controlled sampling scheme that uses guidance from attribute predictors trained on latent features. To generate novel and optimal drug-like molecules for unseen viral targets, CogMol leverages a protein-molecule binding affinity predictor that is trained using SMILES VAE embeddings and protein sequence embeddings learned unsupervised from a large corpus. The CogMol framework is applied to three SARS-CoV-2 target proteins: main protease, receptor-binding domain of the spike protein, and non-structural protein 9 replicase. The generated candidates are novel at both molecular and chemical scaffold levels when compared to the training data. CogMol also includes in silico screening for assessing toxicity of parent molecules and their metabolites with a multi-task toxicity classifier, synthetic feasibility with a chemical retrosynthesis predictor, and target structure binding with docking simulations. Docking reveals favorable binding of generated molecules to the target protein structure, where 87-95% of high affinity molecules showed docking free energy < -6 kcal/mol. When compared to approved drugs, the majority of designed compounds show low parent molecule and metabolite toxicity and high synthetic feasibility. In summary, CogMol handles multi-constraint design of synthesizable, low-toxic, drug-like molecules with high target specificity and selectivity, and does not need target-dependent fine-tuning of the framework or target structure information.

48 citations

Cited by

Understand It in 5 Minutes!? Skimming Famous Papers: Jacob Devlin et al.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

柴田知秀

15 Feb 2020

1,595 citations

Machine learning with Python

Pedro Ferreira, Christopher L. Simons

25 Apr 2017

TL;DR: This presentation is a case study taken from the travel and holiday industry and describes the effectiveness of various techniques as well as the performance of Python-based libraries such as Python Data Analysis Library (Pandas), and Scikit-learn (built on NumPy, SciPy and matplotlib).

Abstract: This presentation is a case study taken from the travel and holiday industry. Paxport/Multicom, based in the UK and Sweden, have recently adopted a recommendation system for holiday accommodation bookings. Machine learning techniques such as Collaborative Filtering have been applied using Python (3.5.1), with Jupyter (4.0.6) as the main framework. Data scale and sparsity present significant challenges in the case study, and so the effectiveness of various techniques is described, as well as the performance of Python-based libraries such as Python Data Analysis Library (Pandas) and Scikit-learn (built on NumPy, SciPy and matplotlib). The presentation is suitable for all levels of programmers.

1,338 citations

Journal Article

Towards spike-based machine intelligence with neuromorphic computing.

Kaushik Roy¹, Akhilesh Jaiswal¹, Priyadarshini Panda¹

Institutions: Purdue University¹

27 Nov 2019-Nature

TL;DR: An overview of the developments in neuromorphic computing for both algorithms and hardware is provided and the fundamentals of learning and hardware frameworks are highlighted, with emphasis on algorithm–hardware codesign.

Abstract: Guided by brain-like ‘spiking’ computational frameworks, neuromorphic computing—brain-inspired computing for machine intelligence—promises to realize artificial intelligence while reducing the energy requirements of computing platforms. This interdisciplinary field began with the implementation of silicon circuits for biological neural routines, but has evolved to encompass the hardware implementation of algorithms with spike-based encoding and event-driven representations. Here we provide an overview of the developments in neuromorphic computing for both algorithms and hardware and highlight the fundamentals of learning and hardware frameworks. We discuss the main challenges and the future prospects of neuromorphic computing, with emphasis on algorithm–hardware codesign. The authors review the advantages and future prospects of neuromorphic computing, a multidisciplinary engineering concept for energy-efficient artificial intelligence with brain-inspired functionality.

877 citations

Journal Article

Memory devices and applications for in-memory computing

Abu Sebastian¹, Manuel Le Gallo¹, Riduan Khaddam-Aljameh¹, Evangelos Eleftheriou¹

Institutions: IBM¹

30 Mar 2020-Nature Nanotechnology

TL;DR: This Review provides an overview of memory devices and the key computational primitives enabled by these memory devices as well as their applications spanning scientific computing, signal processing, optimization, machine learning, deep learning and stochastic computing.

Abstract: Traditional von Neumann computing systems involve separate processing and memory units. However, data movement is costly in terms of time and energy and this problem is aggravated by the recent explosive growth in highly data-centric applications related to artificial intelligence. This calls for a radical departure from the traditional systems and one such non-von Neumann computational approach is in-memory computing. Hereby certain computational tasks are performed in place in the memory itself by exploiting the physical attributes of the memory devices. Both charge-based and resistance-based memory devices are being explored for in-memory computing. In this Review, we provide a broad overview of the key computational primitives enabled by these memory devices as well as their applications spanning scientific computing, signal processing, optimization, machine learning, deep learning and stochastic computing. This Review provides an overview of memory devices and the key computational primitives for in-memory computing, and examines the possibilities of applying this computing approach to a wide range of applications.

841 citations

Journal Article

LLaMA: Open and Efficient Foundation Language Models

Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Roziere, Naman Goyal, Eric Hambro, Faisal Azhar, Aurélien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample

27 Feb 2023-arXiv.org

TL;DR: LLaMA is a collection of foundation language models ranging from 7B to 65B parameters, trained on trillions of tokens; the authors show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets.

Abstract: We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.

809 citations
