Author

Björn Ommer

Bio: Björn Ommer is an academic researcher at Heidelberg University. He has contributed to research in topics including Object detection and Object (computer science). He has an h-index of 31 and has co-authored 138 publications that have received 2,933 citations. His previous affiliations include the University of Bonn and ETH Zurich.


Papers
Posted Content
TL;DR: It is demonstrated how combining the effectiveness of the inductive bias of CNNs with the expressivity of transformers enables them to model and thereby synthesize high-resolution images.
Abstract: Designed to learn long-range interactions on sequential data, transformers continue to show state-of-the-art results on a wide variety of tasks. In contrast to CNNs, they contain no inductive bias that prioritizes local interactions. This makes them expressive, but also computationally infeasible for long sequences, such as high-resolution images. We demonstrate how combining the effectiveness of the inductive bias of CNNs with the expressivity of transformers enables them to model and thereby synthesize high-resolution images. We show how to (i) use CNNs to learn a context-rich vocabulary of image constituents, and in turn (ii) utilize transformers to efficiently model their composition within high-resolution images. Our approach is readily applied to conditional synthesis tasks, where both non-spatial information, such as object classes, and spatial information, such as segmentations, can control the generated image. In particular, we present the first results on semantically-guided synthesis of megapixel images with transformers and obtain the state of the art among autoregressive models on class-conditional ImageNet. Code and pretrained models can be found at this https URL .
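The first stage described in the abstract — using a CNN to learn a context-rich, discrete vocabulary of image constituents — can be illustrated with a minimal vector-quantization sketch. The codebook size, feature dimension, and grid shape below are illustrative choices, not the paper's actual hyperparameters; in the real model the features come from a learned CNN encoder, and a transformer then models the resulting index sequence autoregressively.

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codebook entry.

    features: (N, D) array of local features (e.g. CNN encoder outputs).
    codebook: (K, D) array of learned code vectors.
    Returns an (N,) array of integer indices in [0, K).
    """
    # Squared Euclidean distance between every feature and every code.
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 8))       # K=16 codes, D=8 dimensions
features = rng.normal(size=(4 * 4, 8))    # a 4x4 grid of local features
indices = quantize(features, codebook)    # short discrete sequence for a transformer
print(indices.shape)                      # (16,)
```

The payoff of this stage is that a high-resolution image becomes a short sequence of discrete tokens, which keeps autoregressive transformer modeling computationally feasible.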

744 citations

Journal ArticleDOI
13 Jun 2014-Science
TL;DR: Nearly full recovery of skilled forelimb functions in rats with large strokes is shown when a growth-promoting immunotherapy against a neurite growth-inhibitory protein is applied to boost the sprouting of new fibers before the newly formed circuits are stabilized by intensive training.
Abstract: The brain exhibits limited capacity for spontaneous restoration of lost motor functions after stroke. Rehabilitation is the prevailing clinical approach to augment functional recovery, but the scientific basis is poorly understood. Here, we show nearly full recovery of skilled forelimb functions in rats with large strokes when a growth-promoting immunotherapy against a neurite growth-inhibitory protein was applied to boost the sprouting of new fibers, before stabilizing the newly formed circuits by intensive training. In contrast, early high-intensity training during the growth phase destroyed the effect and led to aberrant fiber patterns. Pharmacogenetic experiments identified a subset of corticospinal fibers originating in the intact half of the forebrain, side-switching in the spinal cord to newly innervate the impaired limb and restore skilled motor function.

284 citations

Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this paper, the authors demonstrate how combining the effectiveness of the inductive bias of CNNs with the expressivity of transformers enables them to model and thereby synthesize high-resolution images.
Abstract: Designed to learn long-range interactions on sequential data, transformers continue to show state-of-the-art results on a wide variety of tasks. In contrast to CNNs, they contain no inductive bias that prioritizes local interactions. This makes them expressive, but also computationally infeasible for long sequences, such as high-resolution images. We demonstrate how combining the effectiveness of the inductive bias of CNNs with the expressivity of transformers enables them to model and thereby synthesize high-resolution images. We show how to (i) use CNNs to learn a context-rich vocabulary of image constituents, and in turn (ii) utilize transformers to efficiently model their composition within high-resolution images. Our approach is readily applied to conditional synthesis tasks, where both non-spatial information, such as object classes, and spatial information, such as segmentations, can control the generated image. In particular, we present the first results on semantically-guided synthesis of megapixel images with transformers. Project page at https://git.io/JLlvY.

273 citations

Posted Content
TL;DR: A conditional U-Net is presented for shape-guided image generation, conditioned on the output of a variational autoencoder for appearance, trained end-to-end on images, without requiring samples of the same object with varying pose or appearance.
Abstract: Deep generative models have demonstrated great performance in image synthesis. However, results deteriorate in case of spatial deformations, since they generate images of objects directly, rather than modeling the intricate interplay of their inherent shape and appearance. We present a conditional U-Net for shape-guided image generation, conditioned on the output of a variational autoencoder for appearance. The approach is trained end-to-end on images, without requiring samples of the same object with varying pose or appearance. Experiments show that the model enables conditional image generation and transfer. Therefore, either shape or appearance can be retained from a query image, while freely altering the other. Moreover, appearance can be sampled due to its stochastic latent representation, while preserving shape. In quantitative and qualitative experiments on COCO, DeepFashion, shoes, Market-1501 and handbags, the approach demonstrates significant improvements over the state-of-the-art.
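The shape/appearance factorization described above can be sketched in a few lines: an image is generated from a shape estimate plus a stochastic appearance latent, so either factor can be held fixed while the other varies. The linear "decoder" and all array shapes below are illustrative stand-ins for the paper's conditional U-Net and VAE, which this toy does not implement.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_appearance(mu, logvar, rng):
    """VAE-style reparameterized sample: z = mu + sigma * eps."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def generate(shape_map, z, W):
    """Stand-in decoder: combines shape conditioning with appearance z."""
    return shape_map + z @ W   # broadcast the appearance code over the shape map

shape_map = rng.normal(size=(8, 8))    # shape conditioning (e.g. an edge/pose map)
mu, logvar = np.zeros(8), np.zeros(8)  # appearance posterior parameters
W = rng.normal(size=(8, 8))

z1 = sample_appearance(mu, logvar, rng)
z2 = sample_appearance(mu, logvar, rng)
img1 = generate(shape_map, z1, W)      # same shape, sampled appearance 1
img2 = generate(shape_map, z2, W)      # same shape, sampled appearance 2
```

Because the appearance latent is stochastic, resampling it yields different outputs for the same shape conditioning, which mirrors the transfer behavior the abstract describes.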

196 citations

Proceedings ArticleDOI
06 Nov 2011
TL;DR: A probabilistic model is presented that localizes abnormalities using statistical inference; it outperforms the state of the art, achieving a frame-based abnormality classification performance of 91% and improving localization performance by 32%, to 76%.
Abstract: Detecting abnormalities in video is a challenging problem since the class of all irregular objects and behaviors is infinite and thus no (or far too few) abnormal training samples are available. Consequently, a standard setting is to find abnormalities without actually knowing what they are, because no abnormal examples are shown during training. However, although the training data does not define what an abnormality looks like, the main paradigm in this field is to directly search for individual abnormal local patches or image regions independently of one another. To address this problem we parse video frames by establishing a set of hypotheses that jointly explain all the foreground while, at the same time, trying to find normal training samples that explain the hypotheses. Consequently, we can avoid a direct detection of abnormalities. They are discovered indirectly as those hypotheses which are needed for covering the foreground but for which no explanation by normal samples can be found. We present a probabilistic model that localizes abnormalities using statistical inference. On the challenging dataset of [15] it outperforms the state-of-the-art by 7% to achieve a frame-based abnormality classification performance of 91%, and the localization performance improves by 32% to 76%.
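The core idea — marking a foreground region abnormal only when no normal training sample explains it — can be reduced to a minimal nearest-neighbor sketch. The distance threshold and feature dimensions below are illustrative; the paper's actual model performs joint probabilistic inference over hypotheses rather than per-patch thresholding.

```python
import numpy as np

def unexplained(patches, normal_bank, thresh):
    """Boolean mask: True where no normal sample explains the patch,
    i.e. no training patch lies within squared distance `thresh`."""
    d = ((patches[:, None, :] - normal_bank[None, :, :]) ** 2).sum(-1)
    return d.min(axis=1) > thresh

rng = np.random.default_rng(2)
normal_bank = rng.normal(size=(50, 4))         # normal training patches
patches = np.vstack([normal_bank[:3],          # 3 patches seen during training
                     normal_bank[3] + 10.0])   # 1 far-off (abnormal) patch
mask = unexplained(patches, normal_bank, thresh=1e-6)
print(mask)   # [False False False  True]
```

Only the shifted patch fails to find a normal explanation, so it is the one flagged as abnormal.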

163 citations


Cited by
01 Jan 2004
TL;DR: Comprehensive and up-to-date, this book covers essential topics that either reflect practical significance or are of theoretical importance, and describes numerous important application areas such as image-based rendering and digital libraries.
Abstract: From the Publisher: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to be able to build useful applications. Users learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image based rendering and digital libraries. Many important algorithms broken down and illustrated in pseudo code. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.

3,627 citations

01 Jan 2006

3,012 citations

Proceedings ArticleDOI
07 Dec 2015
TL;DR: In this paper, the spatial context is used as a source of free and plentiful supervisory signal for training a rich visual representation, and the feature representation learned using this within-image context captures visual similarity across images.
Abstract: This work explores the use of spatial context as a source of free and plentiful supervisory signal for training a rich visual representation. Given only a large, unlabeled image collection, we extract random pairs of patches from each image and train a convolutional neural net to predict the position of the second patch relative to the first. We argue that doing well on this task requires the model to learn to recognize objects and their parts. We demonstrate that the feature representation learned using this within-image context indeed captures visual similarity across images. For example, this representation allows us to perform unsupervised visual discovery of objects like cats, people, and even birds from the Pascal VOC 2011 detection dataset. Furthermore, we show that the learned ConvNet can be used in the R-CNN framework [19] and provides a significant boost over a randomly-initialized ConvNet, resulting in state-of-the-art performance among algorithms which use only Pascal-provided training set annotations.
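The pretext task described above can be sketched as a data-sampling routine: draw a patch and one of its eight neighbors, and use the neighbor's relative position as a free supervisory label. The patch size, image size, and offset indexing below are illustrative; in the real setup both patches are fed to a ConvNet that predicts the label.

```python
import numpy as np

# The 8 neighbor offsets, indexed 0..7 (this index order is arbitrary).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
           ( 0, -1),          ( 0, 1),
           ( 1, -1), ( 1, 0), ( 1, 1)]

def sample_pair(image, p, rng):
    """Return (center_patch, neighbor_patch, label) from one unlabeled image."""
    h, w = image.shape
    # Choose a center position so that all eight p-sized neighbors fit inside.
    r = rng.integers(p, h - 2 * p)
    c = rng.integers(p, w - 2 * p)
    label = int(rng.integers(len(OFFSETS)))   # free supervision: which neighbor
    dr, dc = OFFSETS[label]
    center = image[r:r + p, c:c + p]
    neighbor = image[r + dr * p:r + dr * p + p, c + dc * p:c + dc * p + p]
    return center, neighbor, label

rng = np.random.default_rng(3)
image = rng.normal(size=(32, 32))             # any unlabeled image
center, neighbor, label = sample_pair(image, p=8, rng=rng)
```

No annotation is ever needed: the label is generated by the sampling process itself, which is what makes the supervisory signal "free and plentiful".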

2,154 citations