Author

Lamberto Ballan

Other affiliations: University of Pavia, University of Florence, Stanford University

Bio: Lamberto Ballan is an academic researcher at the University of Padua. The author has contributed to research on image retrieval and automatic image annotation, has an h-index of 25, and has co-authored 76 publications that have received 2,786 citations. Previous affiliations of Lamberto Ballan include the University of Pavia and the University of Florence.

Papers published on a yearly basis (chart covering the years 2007 to 2021)

Papers

Journal Article

A SIFT-Based Forensic Method for Copy–Move Attack Detection and Transformation Recovery

Irene Amerini¹, Lamberto Ballan¹, Roberto Caldelli¹, A. Del Bimbo¹, Giuseppe Serra¹

University of Florence¹

01 Sep 2011, IEEE Transactions on Information Forensics and Security

TL;DR: The problem of detecting if an image has been forged is investigated; in particular, attention has been paid to the case in which an area of an image is copied and then pasted onto another zone to create a duplication or to cancel something that was awkward.

Abstract: One of the principal problems in image forensics is determining if a particular image is authentic or not. This can be a crucial task when images are used as basic evidence to influence judgment like, for example, in a court of law. To carry out such forensic analysis, various technological instruments have been developed in the literature. In this paper, the problem of detecting if an image has been forged is investigated; in particular, attention has been paid to the case in which an area of an image is copied and then pasted onto another zone to create a duplication or to cancel something that was awkward. Generally, to adapt the image patch to the new context a geometric transformation is needed. To detect such modifications, a novel methodology based on scale invariant features transform (SIFT) is proposed. Such a method allows us to both understand if a copy-move attack has occurred and, furthermore, to recover the geometric transformation used to perform cloning. Extensive experimental results are presented to confirm that the technique is able to precisely individuate the altered area and, in addition, to estimate the geometric transformation parameters with high reliability. The method also deals with multiple cloning.

868 citations

Journal Article

Copy-move forgery detection and localization by means of robust clustering with J-Linkage

Irene Amerini¹, Lamberto Ballan¹, Roberto Caldelli¹, Alberto Del Bimbo¹, Luca Del Tongo¹, Giuseppe Serra¹,²

University of Florence¹, University of Modena and Reggio Emilia²

01 Jul 2013, Signal Processing: Image Communication

TL;DR: A novel approach is presented for copy-move forgery detection and localization based on the J-Linkage algorithm, which performs a robust clustering in the space of the geometric transformation; the method is reported to outperform other similar state-of-the-art techniques.

Abstract: Understanding whether a digital image is authentic is a key purpose of image forensics. There are several different tampering attacks, but one of the most common and immediate is copy-move. A recent and effective approach for detecting copy-move forgeries is to use local visual features such as SIFT. In methods of this kind, SIFT matching is often followed by a clustering procedure to group keypoints that are spatially close. This procedure can be unsatisfactory, in particular when the copied patch contains pixels that are spatially very distant from one another, and when the pasted area is near the original source. In such cases, a better estimation of the cloned area is necessary in order to obtain an accurate forgery localization. In this paper a novel approach is presented for copy-move forgery detection and localization based on the J-Linkage algorithm, which performs a robust clustering in the space of the geometric transformation. Experimental results, carried out on different datasets, show that the proposed method outperforms other similar state-of-the-art techniques both in terms of copy-move forgery detection reliability and of precision in the manipulated patch localization.

242 citations

Journal Article

Event detection and recognition for semantic annotation of video

Lamberto Ballan¹, Marco Bertini¹, Alberto Del Bimbo¹, Lorenzo Seidenari¹, Giuseppe Serra¹

University of Florence¹

01 Jan 2011, Multimedia Tools and Applications

TL;DR: This paper surveys the field of event recognition, from interest point detectors and descriptors, to event modelling techniques and knowledge management technologies, and provides an overview of the methods, categorising them according to video production methods and video domains.

Abstract: Research on methods for detection and recognition of events and actions in videos is receiving increasing attention from the scientific community because of its relevance for many applications, from semantic video indexing to intelligent video surveillance systems and advanced human-computer interaction interfaces. Event detection and recognition requires considering the temporal aspect of video, either at the low level with appropriate features, or at a higher level with models and classifiers that can represent time. In this paper we survey the field of event recognition, from interest point detectors and descriptors to event modelling techniques and knowledge management technologies. We provide an overview of the methods, categorising them according to video production methods and video domains, and according to the types of events and actions that are typical of these domains.

162 citations

Proceedings Article

Context-Aware Trajectory Prediction

Federico Bartoli¹, Giuseppe Lisanti², Lamberto Ballan², Alberto Del Bimbo¹

University of Florence¹, University of Pavia²

01 Aug 2018

TL;DR: In this article, a context-aware recurrent neural network LSTM model is proposed to predict human motion in crowded spaces such as a sidewalk, a museum or a shopping mall.

Abstract: Human motion and behaviour in crowded spaces is influenced by several factors, such as the dynamics of other moving agents in the scene, as well as the static elements that might be perceived as points of attraction or obstacles. In this work, we present a new model for human trajectory prediction which is able to take advantage of both human-human and human-space interactions. The future trajectories of humans are generated by observing their past positions and interactions with the surroundings. To this end, we propose a “context-aware” recurrent neural network LSTM model, which can learn and predict human motion in crowded spaces such as a sidewalk, a museum or a shopping mall. We evaluate our model on public pedestrian datasets, and we contribute a new challenging dataset that collects videos of humans that navigate in a (real) crowded space such as a big museum. Results show that our approach can predict human trajectories better when compared to previous state-of-the-art forecasting models.

141 citations

Journal Article

Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement, and Retrieval

Xirong Li¹, Tiberio Uricchio², Lamberto Ballan², Marco Bertini², Cees G. M. Snoek³, Alberto Del Bimbo²

Renmin University of China¹, University of Florence², University of Amsterdam³

06 Jun 2016, ACM Computing Surveys

TL;DR: This survey considers what people tag about an image, covering image tag assignment, refinement, and tag-based image retrieval, and examines how such information can be exploited to construct a tag relevance function. A two-dimensional taxonomy is presented to structure the growing literature, understand the ingredients of the main works, clarify their connections and differences, and recognize their merits and limitations.

Abstract: Where previous reviews on content-based image retrieval emphasize what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image. A comprehensive treatise of three closely linked problems (i.e., image tag assignment, refinement, and tag-based image retrieval) is presented. While existing works vary in terms of their targeted tasks and methodology, they rely on the key functionality of tag relevance, that is, estimating the relevance of a specific tag with respect to the visual content of a given image and its social context. By analyzing what information a specific method exploits to construct its tag relevance function and how such information is exploited, this article introduces a two-dimensional taxonomy to structure the growing literature, understand the ingredients of the main works, clarify their connections and difference, and recognize their merits and limitations. For a head-to-head comparison with the state of the art, a new experimental protocol is presented, with training sets containing 10,000, 100,000, and 1 million images, and an evaluation on three test sets, contributed by various research groups. Eleven representative works are implemented and evaluated. Putting all this together, the survey aims to provide an overview of the past and foster progress for the near future.

134 citations

Cited by

The PASCAL Visual Object Classes Challenge

Jianguo Zhang

01 Jan 2006

3,012 citations

Proceedings Article

Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks

Agrim Gupta¹, Justin Johnson¹, Li Fei-Fei¹, Silvio Savarese¹, Alexandre Alahi²

Stanford University¹, École Polytechnique Fédérale de Lausanne²

29 Mar 2018

TL;DR: A recurrent sequence-to-sequence model observes motion histories and predicts future behavior, using a novel pooling mechanism to aggregate information across people, and outperforms prior work in terms of accuracy, variety, collision avoidance, and computational complexity.

Abstract: Understanding human motion behavior is critical for autonomous moving platforms (like self-driving cars and social robots) if they are to navigate human-centric environments. This is challenging because human motion is inherently multimodal: given a history of human motion paths, there are many socially plausible ways that people could move in the future. We tackle this problem by combining tools from sequence prediction and generative adversarial networks: a recurrent sequence-to-sequence model observes motion histories and predicts future behavior, using a novel pooling mechanism to aggregate information across people. We predict socially plausible futures by training adversarially against a recurrent discriminator, and encourage diverse predictions with a novel variety loss. Through experiments on several datasets we demonstrate that our approach outperforms prior work in terms of accuracy, variety, collision avoidance, and computational complexity.

1,461 citations

Proceedings Article

ESC: Dataset for Environmental Sound Classification

Karol J. Piczak¹

Warsaw University of Technology¹

13 Oct 2015

TL;DR: A new annotated collection of 2000 short clips comprising 50 classes of various common sound events, and an abundant unified compilation of 250000 unlabeled auditory excerpts extracted from recordings available through the Freesound project are presented.

Abstract: One of the obstacles in research activities concentrating on environmental sound classification is the scarcity of suitable and publicly available datasets. This paper tries to address that issue by presenting a new annotated collection of 2000 short clips comprising 50 classes of various common sound events, and an abundant unified compilation of 250000 unlabeled auditory excerpts extracted from recordings available through the Freesound project. The paper also provides an evaluation of human accuracy in classifying environmental sounds and compares it to the performance of selected baseline classifiers using features derived from mel-frequency cepstral coefficients and zero-crossing rate.

978 citations

Proceedings Article

FaceForensics++: Learning to Detect Manipulated Facial Images

Andreas Rössler¹, Davide Cozzolino, Luisa Verdoliva, Christian Riess², Justus Thies¹, Matthias Niessner¹

Technische Universität München¹, University of Erlangen-Nuremberg²

25 Jan 2019

TL;DR: In this paper, the realism of state-of-the-art image manipulations, and how difficult it is to detect them, either automatically or by humans, is examined.

Abstract: The rapid progress in synthetic image generation and manipulation has now come to a point where it raises significant concerns for the implications towards society. At best, this leads to a loss of trust in digital content, but could potentially cause further harm by spreading false information or fake news. This paper examines the realism of state-of-the-art image manipulations, and how difficult it is to detect them, either automatically or by humans. To standardize the evaluation of detection methods, we propose an automated benchmark for facial manipulation detection. In particular, the benchmark is based on Deep-Fakes, Face2Face, FaceSwap and NeuralTextures as prominent representatives for facial manipulations at random compression level and size. The benchmark is publicly available and contains a hidden test set as well as a database of over 1.8 million manipulated images. This dataset is over an order of magnitude larger than comparable, publicly available, forgery datasets. Based on this data, we performed a thorough analysis of data-driven forgery detectors. We show that the use of additional domain-specific knowledge improves forgery detection to unprecedented accuracy, even in the presence of strong compression, and clearly outperforms human observers.

917 citations

Proceedings Article

DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents

Namhoon Lee¹, Wongun Choi, Paul Vernaza, Christopher Choy², Philip H. S. Torr¹, Manmohan Chandraker³

University of Oxford¹, Stanford University², University of California, San Diego³

14 Apr 2017

TL;DR: The proposed Deep Stochastic IOC RNN Encoder-decoder framework, DESIRE, for the task of future predictions of multiple interacting agents in dynamic scenes significantly improves the prediction accuracy compared to other baseline methods.

Abstract: We introduce a Deep Stochastic IOC RNN Encoder-decoder framework, DESIRE, for the task of future predictions of multiple interacting agents in dynamic scenes. DESIRE effectively predicts future locations of objects in multiple scenes by 1) accounting for the multi-modal nature of the future prediction (i.e., given the same context, the future may vary), 2) foreseeing the potential future outcomes and making a strategic prediction based on them, and 3) reasoning not only from the past motion history, but also from the scene context as well as the interactions among the agents. DESIRE achieves these in a single end-to-end trainable neural network model, while being computationally efficient. The model first obtains a diverse set of hypothetical future prediction samples employing a conditional variational auto-encoder, which are ranked and refined by the following RNN scoring-regression module. Samples are scored by accounting for accumulated future rewards, which enables better long-term strategic decisions similar to IOC frameworks. An RNN scene context fusion module jointly captures past motion histories, the semantic scene context and interactions among multiple agents. A feedback mechanism iterates over the ranking and refinement to further boost the prediction accuracy. We evaluate our model on two publicly available datasets: KITTI and Stanford Drone Dataset. Our experiments show that the proposed model significantly improves the prediction accuracy compared to other baseline methods.

874 citations
