Author

Urs-Viktor Marti

Bio: Urs-Viktor Marti is a researcher at Swisscom. The author has contributed to research on topics including terrestrial television and pixel. The author has an h-index of 6 and has co-authored 11 publications receiving 95 citations.

Papers

Proceedings Article

SROBB: Targeted Perceptual Loss for Single Image Super-Resolution

Mohammad Saeed Rad¹, Behzad Bozorgtabar¹, Urs-Viktor Marti², Max Basler², Hazim Kemal Ekenel¹, Jean-Philippe Thiran¹

École Polytechnique Fédérale de Lausanne¹, Swisscom²

01 Oct 2019

TL;DR: In this paper, the authors optimize a deep network-based decoder with a targeted objective function that penalizes images at different semantic levels using the corresponding terms, which results in more realistic textures and sharper edges.

Abstract: By benefiting from perceptual losses, recent studies have improved significantly the performance of the super-resolution task, where a high-resolution image is resolved from its low-resolution counterpart. Although such objective functions generate near-photorealistic results, their capability is limited, since they estimate the reconstruction error for an entire image in the same way, without considering any semantic information. In this paper, we propose a novel method to benefit from perceptual loss in a more objective way. We optimize a deep network-based decoder with a targeted objective function that penalizes images at different semantic levels using the corresponding terms. In particular, the proposed method leverages our proposed OBB (Object, Background and Boundary) labels, generated from segmentation labels, to estimate a suitable perceptual loss for boundaries, while considering texture similarity for backgrounds. We show that our proposed approach results in more realistic textures and sharper edges, and outperforms other state-of-the-art algorithms in terms of both qualitative results on standard benchmarks and results of extensive user studies.

113 citations

Journal Article

Benefiting from Multitask Learning to Improve Single Image Super-Resolution

Mohammad Saeed Rad¹, Behzad Bozorgtabar¹, Claudiu Musat², Urs-Viktor Marti², Max Basler², Hazim Kemal Ekenel¹,³, Jean-Philippe Thiran¹

École Polytechnique Fédérale de Lausanne¹, Swisscom², Istanbul Technical University³

20 Jul 2020-Neurocomputing

TL;DR: The authors propose an encoder architecture that extracts and uses semantic information to super-resolve a given image, trained with multitask learning simultaneously for image super-resolution and semantic segmentation.

13 citations

Patent

Method used in a speech-enabled automatic directory system

Urs-Viktor Marti¹, Robert Van Kommer¹

Swisscom¹

11 Jun 2003

TL;DR: A method for determining a fallback threshold in a speech-enabled automatic directory system is described. Speech data corresponding to names uttered by users is collected, a speech recognition system is run over this data, the false acceptance rate is determined for various thresholds, and an adequate fallback threshold is chosen based on that rate.

Abstract: A method used in a speech-enabled automatic directory system for determining a fallback threshold, wherein a fallback decision is taken by said directory system when a metric delivered by a speech recognition system is lower than said threshold, said method comprising: collecting speech data corresponding to names uttered by users, running a speech recognition system over this speech data, determining the false acceptance rate for various thresholds of a metric delivered by said speech recognition system, determining an adequate fallback threshold based on said false acceptance rate.
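The steps listed in this abstract can be illustrated with a minimal sketch. It is not the patented procedure: the function names, the sample data, the definition of the false acceptance rate (accepted but misrecognized utterances divided by all utterances), and the rule "pick the smallest candidate threshold whose rate stays within a target" are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def false_acceptance_rate(
    scores: Sequence[float], correct: Sequence[bool], threshold: float
) -> float:
    """Share of all utterances that are accepted (score >= threshold)
    even though the recognizer returned the wrong name.
    This definition is an assumption, not taken from the patent."""
    if not scores:
        raise ValueError("no speech data collected")
    false_accepts = sum(
        1 for s, ok in zip(scores, correct) if s >= threshold and not ok
    )
    return false_accepts / len(scores)


def choose_fallback_threshold(
    scores: Sequence[float],
    correct: Sequence[bool],
    candidates: Sequence[float],
    max_far: float,
) -> Optional[float]:
    """Return the smallest candidate threshold whose false acceptance
    rate does not exceed max_far, or None if no candidate qualifies."""
    for t in sorted(candidates):
        if false_acceptance_rate(scores, correct, t) <= max_far:
            return t
    return None


# Hypothetical data: one recognizer metric per collected utterance, plus
# whether the recognized name matched what the user actually said.
scores = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40, 0.30, 0.20]
correct = [True, True, True, False, True, False, False, False]

threshold = choose_fallback_threshold(
    scores, correct, candidates=[0.1, 0.3, 0.5, 0.7, 0.9], max_far=0.125
)
print(threshold)  # the system would fall back whenever the metric is below this
```

Choosing the smallest qualifying threshold keeps fallbacks as rare as the target rate allows; a real system would pick the target from operational requirements.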

10 citations

Patent

System for recording and playback of television signals from a plurality of television channels

Daniel Ledermann¹, Stefan Trittibach¹, Urs-Viktor Marti¹, Olivier Thyes¹

Swisscom¹

04 Mar 2004

TL;DR: A system for recording and playback of television signals from a plurality of television channels is proposed, comprising a computer-based controlling central unit, connectible to a telecommunication network, and a plurality of television receivers, connected to the controlling central unit, for receiving the television signals.

Abstract: Proposed is a system for recording and playback of television signals from a plurality of television channels, which comprises a computer-based controlling central unit, connectible to a telecommunication network, and a plurality of television receivers, connected to the controlling central unit, for receiving the television signals in each case on one of the television channels via cable television networks and/or television antennas for terrestrial television broadcasting or satellite television transmission. The system further comprises coding modules, connected to the television receivers, for coding the received television signals in a digital format. The controlling central unit is set up to receive recording instructions from users via the telecommunication network and to store the television signals, coded in digital format, which have been received on the television channel specified by the stored recording instructions, at a time specified by the stored recording instructions. The system further comprises a playback module for transmitting the television signals, stored in digital format, via the telecommunication network, in each case for playback on a terminal of a respective user. The system enables users to have television signals from a plurality of television channels recorded at the same time without it being necessary for them to have a video recorder at their disposal or to operate a video recorder.

8 citations

Patent

Graphical user interface for browsing a list of visual elements

Urs-Viktor Marti¹, Alexander Schnepp¹, Alexander Schradt¹

Swisscom¹

09 Sep 2014

TL;DR: A graphical user interface may be configured to display consecutive visual elements, with one visual element, selected as a focus element, shown in a focus area of the graphical user interface, and one or more of the consecutive visual elements preceding or following the focus element in the list shown as non-focus elements outside the focus area, at particular display positions.

Abstract: Methods and systems are provided for configuring a graphical user interface that is used for browsing a list of visual elements. The graphical user interface may be configured to display consecutive visual elements, displaying one visual element, selected as a focus element, in a focus area of the graphical user interface, and displaying one or more of the consecutive visual elements, preceding or following the focus element in the list, as non-focus elements outside the focus area, at particular display positions. When focus is moved, a different visual element may be displayed in the focus area and the remaining visual elements may be displayed at rearranged display positions. The display positions may be arranged along two or more presentation lines running through the focus area, with the position of each visual element being representative of a visual element's position in the list with respect to the focus element.

6 citations

Cited by

Patent

Method and system for considering information about an expected response when performing speech recognition

Keith Braho, Amro El-Jaroudi, Jeffrey Pike

02 Feb 2006

TL;DR: A speech recognition system receives and analyzes speech input from a user in order to recognize and accept a response from the user. Under certain conditions, information about the response expected from the user may be available and is used to modify the behavior of the system.

Abstract: A speech recognition system receives and analyzes speech input from a user in order to recognize and accept a response from the user. Under certain conditions, information about the response expected from the user may be available. In these situations, the available information about the expected response is used to modify the behavior of the speech recognition system by taking this information into account. The modified behavior of the speech recognition system comprises adjusting the rejection threshold when speech input matches the predetermined expected response.

517 citations

Patent

Systems and methods for dynamically improving user intelligibility of synthesized speech in a work environment

James Hendrickson, Debra Drylie Scott, Duane Littleton, John Pecorari, Arkadiusz Slusarczyk

18 May 2012

TL;DR: A method and apparatus are disclosed that dynamically adjust operational parameters of a text-to-speech engine in a speech-based system, in response to one or more environmental conditions, to increase the intelligibility of synthesized speech.

Abstract: A method and apparatus that dynamically adjust operational parameters of a text-to-speech engine in a speech-based system are disclosed. A voice engine or other application of a device provides a mechanism to alter the adjustable operational parameters of the text-to-speech engine. In response to one or more environmental conditions, the adjustable operational parameters of the text-to-speech engine are modified to increase the intelligibility of synthesized speech.

407 citations

Patent

Methods and systems for identifying errors in a speech recognition system

Keith Braho, Jeffrey Pike, Lori Pike

17 Oct 2014

TL;DR: A method is described for identifying possible errors made by a speech recognition system without using a transcript of words input to the system.

Abstract: Methods are disclosed for identifying possible errors made by a speech recognition system without using a transcript of words input to the system. A method for model adaptation for a speech recognition system includes determining an error rate, corresponding to either recognition of instances of a word or recognition of instances of various words, without using a transcript of words input to the system. The method may further include adjusting an adaptation, of the model for the word or various models for the various words, based on the error rate. Apparatus are disclosed for identifying possible errors made by a speech recognition system without using a transcript of words input to the system. An apparatus for model adaptation for a speech recognition system includes a processor adapted to estimate an error rate, corresponding to either recognition of instances of a word or recognition of instances of various words, without using a transcript of words input to the system. The apparatus may further include a controller adapted to adjust an adaptation of the model for the word or various models for the various words, based on the error rate.

306 citations

Patent

Method and system for mitigating delay in receiving audio stream during production of sound from audio stream

Keith Braho, Russell Barr, Josh Karabin

15 Mar 2013

TL;DR: A communication component modifies production of an audio waveform at determined modification segments to mitigate the effects of a delay in processing and/or receiving a subsequent audio waveform.

Abstract: A communication component modifies production of an audio waveform at determined modification segments to thereby mitigate the effects of a delay in processing and/or receiving a subsequent audio waveform. The audio waveform and/or data associated with the audio waveform are analyzed to identify the modification segments based on characteristics of the audio waveform and/or data associated therewith. The modification segments show where the production of the audio waveform may be modified without substantially affecting the clarity of the sound or audio. In one embodiment, the invention modifies the sound production at the identified modification segments to extend production time and thereby mitigate the effects of delay in receiving and/or processing a subsequent audio waveform for production.

302 citations

Patent

PVR channel and PVR IPG information

Samuel H. Russ, Michael A. Gaul, Dariusz S. Kaminski¹

Scientific Atlanta¹

19 Sep 2003

TL;DR: A system that maps media content information to an interactive program guide displayed on a screen is described. It includes, among other things, a memory with logic and a processor configured with the logic to display at least one personal video recording display channel in the interactive program guide.

Abstract: A system (16) that maps media content information (352) to an interactive program guide (400) displayed on a screen (341) includes, among other things, a memory with logic (351), and a processor (344) configured with the logic to display at least one personal video recording display channel in the interactive program guide (400). The processor (344) is further preferably configured with the logic to display media content instance listings (460) in the personal video recording display channel (480) for corresponding media content instance recordings (410).

175 citations
