
Showing papers by "Sophia Bano" published in 2018


Journal ArticleDOI
TL;DR: An evaluation protocol is described, including framewise, extended framewise, and event-based measures, and empirical evidence is provided that fusing visual face track scores with audio voice activity scores is an effective combination.
Abstract: Continuous detection of social interactions from wearable sensor data streams has a range of potential applications in domains including health and social care, security, and assistive technology. We contribute an annotated, multimodal data set capturing such interactions using video, audio, GPS, and inertial sensing. We present methods for automatic detection and temporal segmentation of focused interactions using support vector machines and recurrent neural networks with features extracted from both audio and video streams. A focused interaction occurs when co-present individuals, having a mutual focus of attention, interact by first establishing face-to-face engagement and direct conversation. We describe an evaluation protocol, including framewise, extended framewise, and event-based measures, and provide empirical evidence that the fusion of visual face track scores with audio voice activity scores provides an effective combination. The methods, contributed data set, and protocol together provide a benchmark for future research on this problem. The data set is available at https://doi.org/10.15132/10000134.

16 citations
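
The fusion step described in this abstract, combining per-frame visual face track scores with audio voice activity scores before classification, can be illustrated with a minimal Python sketch. The synthetic scores, the feature-level stacking, and the RBF-kernel SVM below are illustrative assumptions, not the authors' reported configuration or data.

# Minimal sketch: fusing per-frame visual face track scores with audio voice
# activity scores for framewise focused-interaction detection.
# The synthetic data, equal weighting, and SVM choice are illustrative only.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins for frame-aligned modality scores (one value per frame).
face_track_scores = rng.random(1000)      # visual: confidence a tracked face is engaged
voice_activity_scores = rng.random(1000)  # audio: voice activity detection score

# Synthetic framewise labels (1 = focused interaction), for illustration only.
labels = (0.5 * face_track_scores + 0.5 * voice_activity_scores > 0.5).astype(int)

# Feature-level fusion: stack the two modality scores per frame.
X = np.column_stack([face_track_scores, voice_activity_scores])

# Train a classifier on the fused scores and predict framewise labels.
clf = SVC(kernel="rbf").fit(X, labels)
framewise_predictions = clf.predict(X)
print(framewise_predictions[:10])

In the paper, such framewise outputs are then evaluated with framewise, extended framewise, and event-based measures; the classifier and weighting used here stand in for whichever fusion rule is actually adopted.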


Book ChapterDOI
16 Sep 2018
TL;DR: XmoNet is proposed, a deep-learning architecture based on fully convolutional networks that enables cross-modality MR image inference; a preliminary analysis illustrates its utility in learning the mapping between heterogeneous T1- and T2-weighted MRI scans for accurate and realistic image synthesis.
Abstract: Magnetic resonance imaging (MRI) can generate multimodal scans with complementary contrast information, capturing various anatomical or functional properties of organs of interest. However, whilst the acquisition of multiple modalities is favourable in clinical and research settings, it is hindered by a range of practical factors that include cost and imaging artefacts. We propose XmoNet, a deep-learning architecture based on fully convolutional networks (FCNs) that enables cross-modality MR image inference. This multi-branch architecture operates on various levels of image spatial resolution, encoding rich feature hierarchies suited to this image generation task. We illustrate the utility of XmoNet in learning the mapping between heterogeneous T1- and T2-weighted MRI scans for accurate and realistic image synthesis in a preliminary analysis. Our findings support scaling the work to include larger samples and additional modalities.

3 citations
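
As a rough illustration of the multi-branch, multi-resolution FCN idea described in this abstract, the following Python (PyTorch) sketch maps a T1-weighted slice to a T2-weighted estimate using a full-resolution branch and a half-resolution branch whose outputs are fused. The layer sizes, branch count, and use of PyTorch are assumptions for illustration, not the published XmoNet specification.

# Minimal sketch of a multi-branch fully convolutional network for T1 -> T2
# cross-modality image inference. All architectural details are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiBranchFCN(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        # Full-resolution branch.
        self.branch_full = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        # Half-resolution branch (applied to a downsampled input).
        self.branch_half = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        # Fuse both branches into a single synthesized image.
        self.head = nn.Conv2d(2 * channels, 1, 1)

    def forward(self, t1):
        full = self.branch_full(t1)
        # Downsample, process, then upsample back to the input resolution.
        half = F.interpolate(
            self.branch_half(F.avg_pool2d(t1, 2)),
            size=t1.shape[-2:], mode="bilinear", align_corners=False,
        )
        return self.head(torch.cat([full, half], dim=1))

# Usage: synthesize a T2-weighted slice from a T1-weighted slice.
model = MultiBranchFCN()
t1_slice = torch.randn(1, 1, 128, 128)
t2_estimate = model(t1_slice)
print(t2_estimate.shape)  # torch.Size([1, 1, 128, 128])

Operating branches at different spatial resolutions is one way to encode the multi-scale feature hierarchies the abstract refers to; the published architecture may differ in depth, fusion strategy, and training objective.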