
Hedi Ben-younes

Researcher at Valeo

Publications -  18
Citations -  1488

Hedi Ben-younes is an academic researcher from Valeo. The author has contributed to research in topics including Computer science and Question answering. The author has an h-index of 10 and has co-authored 16 publications receiving 997 citations. Previous affiliations of Hedi Ben-younes include University of Paris.

Papers
Proceedings ArticleDOI

MUTAN: Multimodal Tucker Fusion for Visual Question Answering

TL;DR: In this article, a multimodal tensor-based Tucker decomposition is proposed to efficiently parametrize bilinear interactions between visual and textual representations, together with a low-rank matrix-based decomposition that explicitly constrains the interaction rank.
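The Tucker-style fusion described in the TL;DR can be sketched as follows: project the question and image vectors into small spaces, combine them bilinearly through a core tensor, then project to the output. This is a minimal illustration, not the paper's implementation; all dimensions and weight names are hypothetical, and the weights are random rather than learned.

```python
import numpy as np

rng = np.random.default_rng(0)

d_q, d_v, d_out = 310, 620, 100   # hypothetical input/output dims
t_q, t_v, t_o = 16, 16, 32        # small Tucker core dims

Wq = rng.standard_normal((d_q, t_q))      # question projection
Wv = rng.standard_normal((d_v, t_v))      # visual projection
Tc = rng.standard_normal((t_q, t_v, t_o)) # Tucker core tensor
Wo = rng.standard_normal((t_o, d_out))    # output projection

def tucker_fusion(q, v):
    """Tucker-style bilinear fusion: project both inputs into small
    spaces, contract them with the core tensor, project to the output."""
    q_ = q @ Wq                               # (t_q,)
    v_ = v @ Wv                               # (t_v,)
    z = np.einsum('i,j,ijk->k', q_, v_, Tc)   # bilinear interaction
    return z @ Wo                             # (d_out,)

q = rng.standard_normal(d_q)
v = rng.standard_normal(d_v)
print(tucker_fusion(q, v).shape)  # (100,)
```

The point of the decomposition is parameter count: a full bilinear map would need a d_q × d_v × d_out tensor, while the factored form only needs the three projections plus a small core.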
Proceedings ArticleDOI

MUREL: Multimodal Relational Reasoning for Visual Question Answering

TL;DR: This paper proposes MuRel, a multimodal relational network learned end-to-end to reason over real images. It introduces the MuRel cell, an atomic reasoning primitive that represents interactions between the question and image regions with a rich vectorial representation and models region relations with pairwise combinations.
Proceedings Article

RUBi: Reducing Unimodal Biases for Visual Question Answering

TL;DR: RUBi, a new learning strategy to reduce unimodal biases in any VQA model, is proposed; it reduces the importance of the most biased examples, i.e. examples that can be correctly classified without looking at the image.
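The bias-reduction idea in the TL;DR can be sketched as follows: during training, a question-only branch's predictions mask the main model's logits, so examples the question-only branch already answers confidently produce a smaller loss for the main model. This is a simplified illustration of the masking mechanism with made-up numbers, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def rubi_style_loss(main_logits, question_only_logits, target):
    """RUBi-style masking: modulate the main logits by the sigmoid of
    the question-only logits, then take cross-entropy. Biased examples
    (confidently answered from the question alone) yield a smaller
    loss, so they contribute less to the main model's gradients."""
    masked = main_logits * sigmoid(question_only_logits)
    return -np.log(softmax(masked)[target])

# Hypothetical 3-answer example: when the question-only branch is
# confident in the correct answer 0 (a biased example), the loss is
# smaller than with a neutral question-only branch.
main = np.array([2.0, 0.5, -1.0])
q_only_biased = np.array([5.0, -5.0, -5.0])
q_only_neutral = np.array([0.0, 0.0, 0.0])
print(rubi_style_loss(main, q_only_biased, 0)
      < rubi_style_loss(main, q_only_neutral, 0))  # True
```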
Journal ArticleDOI

BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection

TL;DR: This paper proposes a block-superdiagonal tensor decomposition based on block-term ranks to optimize the trade-off between the expressiveness and the complexity of the fusion model.
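A block-superdiagonal fusion of this kind can be sketched as a sum of small Tucker blocks, where each block only mixes its own chunk of the projected inputs. Again a minimal, hedged illustration with hypothetical dimensions and random (unlearned) weights, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

d_x, d_y, d_out = 200, 200, 50
R, c = 4, 8   # number of blocks and per-block chunk size (hypothetical)

Wx = rng.standard_normal((d_x, R * c))
Wy = rng.standard_normal((d_y, R * c))
cores = rng.standard_normal((R, c, c, c))  # one small core per block
Wo = rng.standard_normal((R * c, d_out))

def block_fusion(x, y):
    """Block-superdiagonal bilinear fusion: the projected inputs are
    split into R chunks, and block r only interacts chunk r of x with
    chunk r of y through its own small core tensor."""
    x_ = (x @ Wx).reshape(R, c)
    y_ = (y @ Wy).reshape(R, c)
    z = np.einsum('ri,rj,rijk->rk', x_, y_, cores).reshape(R * c)
    return z @ Wo

out = block_fusion(rng.standard_normal(d_x), rng.standard_normal(d_y))
print(out.shape)  # (50,)
```

Varying R and c moves the model along the expressiveness/complexity trade-off the TL;DR mentions: larger blocks capture richer interactions at the cost of more core parameters.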
Posted Content

MUREL: Multimodal Relational Reasoning for Visual Question Answering

TL;DR: MuRel introduces an atomic reasoning primitive that represents interactions between the question and image regions with a rich vectorial representation and models region relations with pairwise combinations; it progressively refines visual and question interactions and can be used to define visualization schemes finer than mere attention maps.