scispace - formally typeset
Author

Fernando Pereira

Bio: Fernando Pereira is an academic researcher from Instituto Superior Técnico. The author has contributed to research topics including Encoder and Point cloud, has an h-index of 32, and has co-authored 80 publications receiving 5,282 citations.


Papers
Journal ArticleDOI
TL;DR: This paper provides an overview of the new tools, features and complexity of H.264/AVC.
Abstract: H.264/AVC, the result of the collaboration between the ISO/IEC Moving Picture Experts Group and the ITU-T Video Coding Experts Group, is the latest standard for video coding. The goals of this standardization effort were enhanced compression efficiency and a network-friendly video representation for interactive (video telephony) and non-interactive (broadcast, streaming, storage, video-on-demand) applications. H.264/AVC provides gains in compression efficiency of up to 50% over a wide range of bit rates and video resolutions compared to previous standards. The decoder complexity is about four times that of MPEG-2 and twice that of the MPEG-4 Visual Simple Profile. This paper provides an overview of the new tools, features and complexity of H.264/AVC.

1,013 citations

01 Jan 2005
TL;DR: Besides forward and bidirectional motion estimation, a spatial motion smoothing algorithm to eliminate motion outliers is proposed, allowing significant improvements in the rate-distortion (RD) performance without increasing the encoder complexity.
Abstract: Distributed video coding (DVC) is a new compression paradigm based on two key information theory results: the Slepian-Wolf and Wyner-Ziv theorems. A particular case of DVC deals with lossy source coding with side information at the decoder (Wyner-Ziv) and enables shifting the coding complexity from the encoder to the decoder. The solution described here is based on a very lightweight encoder, leaving the time-consuming motion estimation/compensation task to the decoder. In this paper, the performance of the pixel-domain distributed video codec is improved by using better side information derived by motion-compensated frame interpolation algorithms at the decoder. Besides forward and bidirectional motion estimation, a spatial motion smoothing algorithm to eliminate motion outliers is proposed. This allows significant improvements in the rate-distortion (RD) performance without increasing the encoder complexity.

433 citations

Book
20 Jul 2002
TL;DR: A comprehensive, targeted guide to the MPEG-4 standard and its use in cutting-edge applications, in which Fernando Pereira and Touradj Ebrahimi demonstrate how MPEG-4 addresses tomorrow's multimedia applications more successfully than any previous standard.
Abstract: From the Publisher: The most complete, focused, up-to-the-minute guide to MPEG-4, the breakthrough standard for interactive multimedia, with practical solutions for next-generation multimedia applications. It offers in-depth coverage of natural and synthetic audiovisual object coding, description, composition and synchronization; binary and textual scene description; transport and storage of MPEG-4 content; and MPEG-4 profiles, levels, and verification tests. MPEG-4 represents a breakthrough in multimedia, delivering not just outstanding compression but also a fully interactive user experience. In The MPEG-4 Book, two leaders of the MPEG-4 standards community offer a comprehensive, targeted guide to the MPEG-4 standard and its use in cutting-edge applications. Fernando Pereira and Touradj Ebrahimi, together with a unique collection of key MPEG experts, demonstrate how MPEG-4 addresses tomorrow's multimedia applications more successfully than any previous standard. They review every element of the standard, covering: synthetic and natural audio and video object coding, description and synchronization; BIFS, the MPEG-4 language for scene description and interaction; the extensible MPEG-4 textual format XMT; transport and delivery of MPEG-4 content; MPEG-J, the use of Java classes within MPEG-4 content; a complete overview of MPEG-4 profiles and levels; and verification tests. The authors also walk through the MPEG-4 Systems Reference Software, offering powerful real-world insights for every product developer, software professional, engineer, and researcher involved with MPEG-4 and state-of-the-art multimedia delivery. Part of the IMSC Press Series from the Integrated Media Systems Center at the University of Southern California, a federally funded center specializing in cutting-edge multimedia research.

363 citations

Journal ArticleDOI
TL;DR: The article provides a comprehensive overview of MPEG-7's motivation, objectives, scope, and components.
Abstract: The recently completed ISO/IEC, International Standard 15938, formally called the Multimedia Content Description Interface (but better known as MPEG-7), provides a rich set of tools for completely describing multimedia content. The standard wasn't just designed from a content management viewpoint (classical archival information). It includes an innovative description of the media's content, which we can extract via content analysis and processing. MPEG-7 also isn't aimed at any one application; rather, the elements that MPEG-7 standardizes support as broad a range of applications as possible. This is one of the key differences between MPEG-7 and other metadata standards; it aims to be generic, not targeted to a specific application or application domain. The article provides a comprehensive overview of MPEG-7's motivation, objectives, scope, and components.

308 citations

Journal ArticleDOI
TL;DR: The higher the estimation granularity, the better the rate-distortion performance, since the decoding process adapts more closely to the video's statistical characteristics; accordingly, the pixel and coefficient levels perform best for the PDWZ and TDWZ solutions, respectively.
Abstract: In recent years, practical Wyner-Ziv (WZ) video coding solutions have been proposed with promising results. Most of the solutions available in the literature model the correlation noise (CN) between the original frame and its estimation made at the decoder, the so-called side information (SI), by a given distribution whose relevant parameters are estimated using an offline process, assuming that the SI is available at the encoder or the originals are available at the decoder. The major goal of this paper is to propose a more realistic WZ video coding approach by performing online estimation of the CN model parameters at the decoder, for pixel- and transform-domain WZ video codecs. In this context, several new techniques are proposed based on metrics which explore the temporal correlation between frames with different levels of granularity. For pixel-domain WZ (PDWZ) video coding, three levels of granularity are proposed: frame, block, and pixel levels. For transform-domain WZ (TDWZ) video coding, DCT bands and coefficients are the two granularity levels proposed. The higher the estimation granularity, the better the rate-distortion performance, since the decoding process adapts more closely to the video's statistical characteristics; accordingly, the pixel and coefficient levels perform best for the PDWZ and TDWZ solutions, respectively.

241 citations
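In WZ codecs of this kind, the correlation noise is commonly modeled as a zero-mean Laplacian whose parameter α is estimated from the residual between a frame and its side information. As a rough illustration of the frame-level granularity case only, here is a minimal NumPy sketch (the function name and toy data are illustrative, not the paper's implementation):

```python
import numpy as np

def estimate_laplacian_alpha(frame, side_info):
    """Frame-level estimate of the Laplacian correlation-noise parameter.

    For a zero-mean Laplacian with variance sigma^2, alpha = sqrt(2 / sigma^2),
    where sigma^2 is estimated from the residual between the frame and
    its side information.
    """
    residual = frame.astype(np.float64) - side_info.astype(np.float64)
    variance = np.mean(residual ** 2)  # zero-mean assumption
    return np.sqrt(2.0 / variance)

# Toy example: side information deviating from the frame by Laplacian
# noise with scale b = 2, i.e. variance 2*b^2 = 8, so alpha ~ 0.5.
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
side_info = frame + rng.laplace(0.0, 2.0, size=frame.shape)
alpha = estimate_laplacian_alpha(frame, side_info)
```

The paper's point is that finer granularities (block- and pixel-level statistics rather than one α per frame) let the decoder track local variations in side-information quality.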


Cited by
Book
19 Dec 2003
TL;DR: In this book, the MPEG-4 and H.264 standards are discussed and an overview of the technologies involved is presented, from video formats and coding concepts to codec design, performance, and applications.
Abstract: About the Author. Foreword. Preface. Glossary. 1. Introduction. 2. Video Formats and Quality. 3. Video Coding Concepts. 4. The MPEG-4 and H.264 Standards. 5. MPEG-4 Visual. 6. H.264/MPEG-4 Part 10. 7. Design and Performance. 8. Applications and Directions. Bibliography. Index.

2,491 citations

Journal ArticleDOI
20 Nov 2017
TL;DR: In this paper, the authors provide a comprehensive tutorial and survey about the recent advances toward the goal of enabling efficient processing of DNNs, discuss various hardware platforms and architectures that support DNNs, and highlight key trends in reducing the computation cost of deep neural networks either solely via hardware design changes or via joint hardware and DNN algorithm changes.
Abstract: Deep neural networks (DNNs) are currently widely used for many artificial intelligence (AI) applications including computer vision, speech recognition, and robotics. While DNNs deliver state-of-the-art accuracy on many AI tasks, it comes at the cost of high computational complexity. Accordingly, techniques that enable efficient processing of DNNs to improve energy efficiency and throughput without sacrificing application accuracy or increasing hardware cost are critical to the wide deployment of DNNs in AI systems. This article aims to provide a comprehensive tutorial and survey about the recent advances toward the goal of enabling efficient processing of DNNs. Specifically, it will provide an overview of DNNs, discuss various hardware platforms and architectures that support DNNs, and highlight key trends in reducing the computation cost of DNNs either solely via hardware design changes or via joint hardware design and DNN algorithm changes. It will also summarize various development resources that enable researchers and practitioners to quickly get started in this field, and highlight important benchmarking metrics and design considerations that should be used for evaluating the rapidly growing number of DNN hardware designs, optionally including algorithmic codesigns, being proposed in academia and industry. The reader will take away the following concepts from this article: understand the key design considerations for DNNs; be able to evaluate different DNN hardware implementations with benchmarks and comparison metrics; understand the tradeoffs between various hardware architectures and platforms; be able to evaluate the utility of various DNN design techniques for efficient processing; and understand recent implementation trends and opportunities.

2,391 citations
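As a flavor of the benchmarking metrics such a survey covers, the computational cost of a standard convolutional layer is commonly approximated to first order by its multiply-accumulate (MAC) count. A minimal sketch (the layer dimensions below are illustrative, not taken from the article):

```python
def conv2d_macs(h_out, w_out, c_in, c_out, k):
    """MAC count for a standard 2D convolution: each of the
    h_out * w_out * c_out output values needs k * k * c_in
    multiply-accumulates."""
    return h_out * w_out * c_out * k * k * c_in

# Example: a 3x3 convolution mapping 64 to 128 channels
# on a 56x56 output feature map.
macs = conv2d_macs(56, 56, 64, 128, 3)  # ~231 million MACs
```

Counts like this are what make the hardware/algorithm co-design trade-offs in the survey concrete: techniques such as depthwise separable convolutions or pruning attack exactly this product of terms.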

MonographDOI
02 Sep 2003
TL;DR: This monograph presents an overview of the MPEG-4 and H.264 standards, covering video formats and quality, video coding concepts, and the design and performance of codecs based on these standards.
Abstract: About the Author. Foreword. Preface. Glossary. 1. Introduction. 2. Video Formats and Quality. 3. Video Coding Concepts. 4. The MPEG-4 and H.264 Standards. 5. MPEG-4 Visual. 6. H.264/MPEG-4 Part 10. 7. Design and Performance. 8. Applications and Directions. Bibliography. Index.

1,520 citations

Proceedings Article
01 Jan 2016
TL;DR: This work trains a convolutional network to generate future frames given an input sequence and proposes three different and complementary feature learning strategies: a multi-scale architecture, an adversarial training method, and an image gradient difference loss function.
Abstract: Learning to predict future images from a video sequence involves constructing an internal representation that models the image evolution accurately and therefore, to some degree, its content and dynamics. This is why pixel-space video prediction may be viewed as a promising avenue for unsupervised feature learning. In addition, while optical flow has long been a well-studied problem in computer vision, future frame prediction is rarely approached. Still, many vision applications could benefit from knowledge of the next frames of a video, which does not require the complexity of tracking every pixel trajectory. In this work, we train a convolutional network to generate future frames given an input sequence. To deal with the inherently blurry predictions obtained from the standard mean squared error (MSE) loss function, we propose three different and complementary feature learning strategies: a multi-scale architecture, an adversarial training method, and an image gradient difference loss function. We compare our predictions to different published results based on recurrent neural networks on the UCF101 dataset.

1,369 citations
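The image gradient difference loss mentioned in the abstract penalizes mismatches between the spatial gradients of predicted and target frames, sharpening the blurry output of a plain MSE loss. A minimal NumPy sketch of that idea (the exponent α and the toy frames are illustrative, not the authors' implementation):

```python
import numpy as np

def gradient_difference_loss(pred, target, alpha=1.0):
    """Sum of |.|^alpha differences between the horizontal and vertical
    finite-difference gradients of the predicted and target frames."""
    def grads(img):
        gx = np.abs(img[:, 1:] - img[:, :-1])  # horizontal gradients
        gy = np.abs(img[1:, :] - img[:-1, :])  # vertical gradients
        return gx, gy

    pgx, pgy = grads(pred)
    tgx, tgy = grads(target)
    return np.sum(np.abs(tgx - pgx) ** alpha) + np.sum(np.abs(tgy - pgy) ** alpha)

# A blurred prediction loses the target's edge, so its gradients vanish
# and the loss is positive; a perfect prediction gives zero loss.
target = np.zeros((4, 4)); target[:, 2:] = 1.0  # sharp vertical edge
blurry = np.full((4, 4), 0.5)                   # edge washed out
loss_blurry = gradient_difference_loss(blurry, target)
loss_sharp = gradient_difference_loss(target, target)
```

Unlike MSE, which the constant 0.5 frame partially satisfies, this loss is driven entirely by edge structure, which is why it complements the adversarial term in combating blur.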

Posted Content
TL;DR: In this paper, a multi-scale architecture, an adversarial training method, and an image gradient difference loss function are proposed to predict future frames from a video sequence, countering the blurry predictions produced by the standard mean squared error loss.
Abstract: Learning to predict future images from a video sequence involves constructing an internal representation that models the image evolution accurately and therefore, to some degree, its content and dynamics. This is why pixel-space video prediction may be viewed as a promising avenue for unsupervised feature learning. In addition, while optical flow has long been a well-studied problem in computer vision, future frame prediction is rarely approached. Still, many vision applications could benefit from knowledge of the next frames of a video, which does not require the complexity of tracking every pixel trajectory. In this work, we train a convolutional network to generate future frames given an input sequence. To deal with the inherently blurry predictions obtained from the standard mean squared error (MSE) loss function, we propose three different and complementary feature learning strategies: a multi-scale architecture, an adversarial training method, and an image gradient difference loss function. We compare our predictions to different published results based on recurrent neural networks on the UCF101 dataset.

1,175 citations