Showing papers by "Michihiko Minoh published in 2012"

Book Chapter

Set based discriminative ranking for recognition

Yang Wu¹, Michihiko Minoh¹, Masayuki Mukunoki¹, Shihong Lao²

07 Oct 2012

TL;DR: Proposes a set-based discriminative ranking model (SBDR) that iterates between set-to-set distance finding and discriminative feature space projection to optimize the two simultaneously.

Abstract: Recently both face recognition and body-based person re-identification have been extended from single-image based scenarios to video-based or even more generally image-set based problems. Set-based recognition brings new research and application opportunities while at the same time raises great modeling and optimization challenges. How to make the best use of the available multiple samples for each individual while at the same time not be disturbed by the great within-set variations is considered by us to be the major issue. Due to the difficulty of designing a global optimal learning model, most existing solutions are still based on unsupervised matching, which can be further categorized into three groups: a) set-based signature generation, b) direct set-to-set matching, and c) between-set distance finding. The first two count on good feature representation while the third explores data set structure and set-based distance measurement. The main shortage of them is the lack of learning-based discrimination ability. In this paper, we propose a set-based discriminative ranking model (SBDR), which iterates between set-to-set distance finding and discriminative feature space projection to achieve simultaneous optimization of these two. Extensive experiments on widely-used face recognition and person re-identification datasets not only demonstrate the superiority of our approach, but also shed some light on its properties and application domain.

57 citations

Proceedings Article

Common-near-neighbor analysis for person re-identification

Wei Li¹, Yang Wu¹, Masayuki Mukunoki¹, Michihiko Minoh¹

Kyoto University¹

01 Sep 2012

TL;DR: A new method called Common-Near-Neighbor Analysis is presented, which analyzes the commonness of the near neighbors of each pair of samples in a learned metric space, measured by a novel rank-order based dissimilarity.

Abstract: Person re-identification tackles the problem whether an observed person of interest reappears in a network of cameras. The difficulty primarily originates from few samples per class but large amounts of intra-class variations in real scenarios: illumination, pose and viewpoint changes across cameras. So far, proposals in the literature have treated this either as a matching problem focusing on feature representation or as a classification/ranking problem relying on metric optimization. This paper presents a new way called Common-Near-Neighbor Analysis, which to some extent combines the strengths of these two methodologies. It analyzes the commonness of the near neighbors of each pair of samples in a learned metric space, measured by a novel rank-order based dissimilarity. Our method, using only color cue, has been tested on widely-used benchmark datasets, showing significant performance improvement over the state-of-the-art.
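
The idea of comparing two samples through the ranks of their near neighbors can be illustrated with a small sketch. This is not the paper's formula: the Euclidean distance stands in for the learned metric, and the function names, the parameter `k`, and the symmetric averaging are assumptions made for illustration only.

```python
import math

def neighbor_order(index, points):
    """Indices of all points, sorted by distance from points[index]."""
    return sorted(
        range(len(points)),
        key=lambda j: (math.dist(points[index], points[j]), j),
    )

def rank_order_dissimilarity(a, b, points, k=3):
    """Average rank, in the other sample's neighbor list, of each sample's k nearest neighbors.

    Small values mean the two samples share their near neighbors.
    Euclidean distance is a placeholder for a learned metric (assumption).
    """
    order_a = neighbor_order(a, points)
    order_b = neighbor_order(b, points)
    rank_in_a = {j: r for r, j in enumerate(order_a)}
    rank_in_b = {j: r for r, j in enumerate(order_b)}
    # How far down b's list do a's k nearest neighbors appear, and vice versa.
    d_ab = sum(rank_in_b[j] for j in order_a[:k])
    d_ba = sum(rank_in_a[j] for j in order_b[:k])
    return (d_ab + d_ba) / (2 * k)

# Two tight clusters: samples 0-2 and samples 3-5.
points = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
same_cluster = rank_order_dissimilarity(0, 1, points)
cross_cluster = rank_order_dissimilarity(0, 3, points)
```

In this toy data, samples from the same cluster share their near neighbors and therefore receive a smaller dissimilarity than samples from different clusters.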

48 citations

Proceedings Article

Collaborative Sparse Approximation for Multiple-Shot Across-Camera Person Re-identification

Yang Wu¹, Michihiko Minoh¹, Masayuki Mukunoki¹, Wei Li¹, Shihong Lao²

Kyoto University¹, Omron²

18 Sep 2012

TL;DR: A collaborative representation over all the gallery images of known person individuals is built to best approximate the query images (containing an unknown person) via affine combinations to reveal the identity of the querying person.

Abstract: In this paper we propose a simple and effective solution to the important and challenging problem of across-camera person re-identification. We focus on the common case in video surveillance where multiple images or video frames are available for each person. Instead of exploring new features, the proposed approach aims at making a better use of such images/frames. It builds a collaborative representation over all the gallery images (of known person individuals) to best approximate the query images (containing an unknown person) via affine combinations. The approximation is measured by the nearest point distance between the two affine hulls constructed by the query images and gallery images, respectively. By enforcing the sparsity of the samples used for approximating the two nearest points, the relative importance of the gallery images belonging to different persons has the ability to reveal the identity of the querying person. Extensive experiments on public benchmark datasets demonstrate that the proposed approach greatly outperforms the state-of-the-art methods.
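
One way to write down the sparsity-regularized nearest-point distance between the two affine hulls is sketched below. The symbols are assumptions introduced here for illustration: the columns of $Q$ are query image features, the columns of $G$ are features of all gallery images, $\alpha$ and $\beta$ are combination coefficients, and $\lambda$ weights the sparsity penalty. The exact objective and constraints used in the paper may differ.

```latex
\min_{\alpha,\,\beta}\;
  \lVert Q\alpha - G\beta \rVert_2^2
  + \lambda \left( \lVert \alpha \rVert_1 + \lVert \beta \rVert_1 \right)
\quad \text{s.t.} \quad
  \mathbf{1}^\top \alpha = 1, \;\; \mathbf{1}^\top \beta = 1
```

Under this reading, the sum-to-one constraints make $Q\alpha$ and $G\beta$ points of the two affine hulls, and the entries of $\beta$, grouped by person, indicate which gallery identity contributes most to the approximation.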

30 citations

Proceedings Article

Robust object recognition via third-party collaborative representation

Yang Wu¹, Michihiko Minoh¹, Masayuki Mukunoki¹, Shihong Lao²

Kyoto University¹, Omron²

01 Nov 2012

TL;DR: The proposed method is applicable to various real-world object recognition tasks instead of handling only the well-controlled face recognition problem, and enables using an existing dictionary for testing new data without time-consuming data annotation and model re-training.

Abstract: A simple and effective method is proposed for object recognition via collaborative representation with ridge regression. Different from existing sparse representation and collaborative representation based approaches, the proposal does not need extensive training samples for each testing class and it is robust to localization errors and large within-class variations, thus being applicable to various real-world object recognition tasks instead of handling only the well-controlled face recognition problem. Its discriminative power is explored from a third-party dataset which can be different from the training and testing datasets; therefore, it enables using an existing dictionary for testing new data without time-consuming data annotation and model re-training. As an example, the proposal is extensively tested on the representative and very challenging task of person re-identification, defining novel state-of-the-art results on widely adopted benchmark datasets using only simple and common features.
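
For reference, collaborative representation with ridge regression is commonly written as the regularized least-squares problem below, which has a closed-form solution. The symbols are assumptions for illustration: $y$ is the feature vector of a test sample, the columns of $D$ are dictionary samples, and $\lambda > 0$ is the regularization weight. How the paper builds $D$ from a third-party dataset is not specified in the abstract and is not modeled here.

```latex
\hat{\alpha}
  = \arg\min_{\alpha}\; \lVert y - D\alpha \rVert_2^2 + \lambda \lVert \alpha \rVert_2^2
  = \left( D^\top D + \lambda I \right)^{-1} D^\top y
```

Because $\left( D^\top D + \lambda I \right)^{-1} D^\top$ does not depend on $y$, it can be computed once for a fixed dictionary and reused for every test sample.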

15 citations

Proceedings Article

Recognizing ingredients at cutting process by integrating multimodal features

Atsushi Hashimoto¹, Jin Inoue¹, Kazuaki Nakamura², Takuya Funatomi¹, Mayumi Ueda³, Yoko Yamakata¹, Michihiko Minoh¹

Kyoto University¹, Osaka University², University of Marketing and Distribution Sciences³

02 Nov 2012

TL;DR: A method that involves some physical signals obtained in a cutting process by attaching load and sound sensors to the chopping board to facilitate more precise recognition of ingredients in food preparing activity is proposed.

Abstract: We propose a method for recognizing ingredients in food preparing activity. The research for object recognition mainly focuses on only visual information; however, ingredients are difficult to recognize only by visual information because of their limited color variations and larger within-class difference than inter-class difference in shapes. In this paper, we propose a method that involves some physical signals obtained in a cutting process by attaching load and sound sensors to the chopping board. The load may depend on an ingredient's hardness. The sound produced when a knife passes through an ingredient reflects the structure of the ingredient. Hence, these signals are expected to facilitate more precise recognition. We confirmed the effectiveness of the integration of the three modalities (visual, auditory, and load) through experiments in which the developed method was applied to 23 classes of ingredients.

13 citations

Journal Article

Learning to Estimate Slide Comprehension in Classrooms with Support Vector Machines

Nimit Pattanasri¹, Masayuki Mukunoki¹, Michihiko Minoh¹

Kyoto University¹

01 Jan 2012 - IEEE Transactions on Learning Technologies

TL;DR: It is argued that students should report their own comprehension explicitly in a classroom with students' comprehension made available at the slide level, and a machine learning technique is applied to classify presentation slides according to comprehension levels.

Abstract: Comprehension assessment is an essential tool in classroom learning. However, the judgment often relies on experience of an instructor who makes observation of students' behavior during the lessons. We argue that students should report their own comprehension explicitly in a classroom. With students' comprehension made available at the slide level, we apply a machine learning technique to classify presentation slides according to comprehension levels. Our experimental result suggests that presentation-based features are as predictive as bag-of-words feature vector which is proved successful in text classification tasks. Our analysis on presentation-based features reveals possible causes of poor lecture comprehension.

11 citations

Book Chapter

Investigation of a Method to Estimate Learners’ Interest Level for Agent-Based Conversational e-Learning

Kazuaki Nakamura¹, Koh Kakusho², Tetsuo Shoji³, Michihiko Minoh¹

Kyoto University¹, Kwansei Gakuin University², Nara University³

09 Jul 2012

TL;DR: This paper focuses on the learners’ interest level as an example of the important affective state, and investigates a method for estimating it from their nonverbal behaviors.

Abstract: A method for recognizing or estimating learners’ affective state plays a key role for realizing agent-based conversational e-Learning. In this paper, we focus on the learners’ interest level as an example of the important affective state, and investigate a method for estimating it from their nonverbal behaviors. In conversational situations, the sense of the nonverbal behaviors will vary depending on the contexts of the conversations. Therefore we do not use the nonverbal behaviors themselves but use the occurrence frequencies of the nonverbal behaviors as inputs for estimation mechanism. In the result of our experiment, the proposed method could estimate whether the learners’ interest level is “High” or “Low” with the accuracy of more than 70%.
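
Turning observed nonverbal behaviors into occurrence frequencies can be sketched as follows. The behavior labels, the function name, and the choice of relative frequency within one observation window are hypothetical; the abstract does not specify which behaviors are tracked or how the frequencies are normalized.

```python
from collections import Counter

# Hypothetical label set; the paper's actual behavior categories are not given here.
BEHAVIORS = ("nod", "smile", "gaze_away", "lean_forward")

def frequency_features(observed, behaviors=BEHAVIORS):
    """Relative occurrence frequency of each behavior within one observation window."""
    counts = Counter(observed)
    total = sum(counts[b] for b in behaviors)
    if total == 0:
        # No tracked behavior occurred in this window.
        return [0.0] * len(behaviors)
    return [counts[b] / total for b in behaviors]

window = ["nod", "nod", "smile", "gaze_away"]
features = frequency_features(window)
```

A vector like `features` could then be passed to any binary classifier that separates "High" from "Low" interest.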

8 citations

Journal Article

False Alert Rejection for Detecting Objects on a Table by Touch Reasoning

Atsushi Hashimoto, Takuya Funatomi, Kazuaki Nakamura, Michihiko Minoh

01 Dec 2012 - IEICE Transactions on Information and Systems

3 citations

Proceedings Article

Pinhole-to-Projection Pyramid Subtraction for Reconstructing Non-rigid Objects from Range Images

Takuya Funatomi¹, Haruna Akuzawa¹, Masaaki Iiyama¹, Michihiko Minoh¹

Kyoto University¹

13 Oct 2012

TL;DR: A novel method reconstructs the shape model of a non-rigid object represented as the union of rigid components, using the Pinhole-to-Projection Pyramid obtained from each range image to solve the assignment task non-iteratively.

Abstract: In this paper, we propose a novel method for reconstructing the shape model of a non-rigid object. We represent the non-rigid object as the union of rigid components, and acquire range images of the object and motion of each component while the object varies its shape. We acquire the range images using one-shot scanning, and we use marker-based motion capture for motion acquisition. Based on them, our method performs registration of the range images and assigns a shape to each component. We propose the use of the Pinhole-to-Projection Pyramid obtained from each range image to non-iteratively solve the assignment task. The effectiveness of our method is demonstrated by applying it to reconstruct the shape of a human hand.

3 citations

Book Chapter

Students’ Posture Sequence Estimation Using Spatio-temporal Constraints

Masayuki Mukunoki¹, Kota Yoshitsugu¹, Michihiko Minoh¹

Kyoto University¹

09 Jul 2012

TL;DR: A method is proposed for automatically estimating students’ posture sequences in a classroom from video footage by introducing spatio-temporal constraints, in which the belief of postures is propagated through a given time interval while considering the confidence of each observation.

Abstract: We propose a method for estimating the students’ posture sequence in classroom from video footage by computer automatically. A posture sequence is a time-series of student’s postures during a lecture and a posture of a student is described by a set of his head, body trunk (torso) and hands/arms states, which we call the body part states. The detection of body parts from video footage has many errors. To cope with the errors, we introduce spatio-temporal constraints, in which we propagate the belief of postures through a given time interval with considering the confidence of observation. Through this propagation, we can revise the erroneous detection results and estimate an appropriate posture sequence. In the experiment, we apply our proposed method to a real lecture, and show that our method can improve the accuracy of posture sequence estimation.
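
The temporal part of this idea, revising low-confidence detections using neighboring frames, can be illustrated with a chain-structured stand-in. The sketch below uses Viterbi decoding (max-product inference on a chain), which is not necessarily the propagation scheme used in the paper, and it ignores the spatial constraints between body parts. The posture labels, `p_stay`, and the emission model built from the detection confidence are assumptions.

```python
import math

def smooth_postures(observations, states, p_stay=0.9):
    """Most likely posture sequence given per-frame (detected_state, confidence) pairs.

    Requires at least two states and confidences strictly between 0 and 1.
    """
    n = len(states)

    def log_emit(state, detected, conf):
        # A detection supports its own state with weight conf; the rest is shared equally.
        p = conf if state == detected else (1.0 - conf) / (n - 1)
        return math.log(p)

    def log_trans(prev, cur):
        # Postures tend to persist from one frame to the next.
        p = p_stay if prev == cur else (1.0 - p_stay) / (n - 1)
        return math.log(p)

    if not observations:
        return []
    detected, conf = observations[0]
    score = {s: log_emit(s, detected, conf) for s in states}
    back = []
    for detected, conf in observations[1:]:
        new_score, pointers = {}, {}
        for cur in states:
            best_prev = max(states, key=lambda prev: score[prev] + log_trans(prev, cur))
            new_score[cur] = (score[best_prev] + log_trans(best_prev, cur)
                              + log_emit(cur, detected, conf))
            pointers[cur] = best_prev
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

states = ["upright", "writing", "head_down"]  # hypothetical posture labels
frames = [("upright", 0.9), ("upright", 0.9), ("writing", 0.55),
          ("upright", 0.9), ("upright", 0.9)]
smoothed = smooth_postures(frames, states)
```

In this example the single low-confidence "writing" detection is surrounded by confident "upright" detections, so the decoded sequence keeps "upright" throughout.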

1 citation

Journal Article

Pre-roll Control of Live Video Streaming Based on Separateness

Yoshitaka Morimura¹, Koh Kakusho², Satoshi Nishiguchi³, Keisuke Yagi, Michihiko Minoh¹

Kyoto University¹, Kwansei Gakuin University², Osaka Institute of Technology³

15 Sep 2012

TL;DR: This paper proposes a method that applies pre-roll sequentially to video segments divided at points where the continuity of the video content breaks, and shows that it can meet spatial, temporal, and continuous quality with little degradation of real-time quality.

Abstract: This paper focuses on live video distribution by video streaming technology. In general, it is an important challenge for video streaming to preserve spatial, temporal, continuous, and real-time quality of video. Pre-roll streaming, which buffers a certain amount of data before video playback, can retain spatial and temporal quality without degradation of continuous quality, while the problem about real-time quality remains because it is difficult to know data amount of video filmed in future on live video distribution. Hence, we propose the method to apply pre-roll sequentially for segmented video data divided by points where continuity of video content breaks, and show it can meet spatial, temporal and continuous quality with few degradation of real-time quality. In the experiment, we show the proposed method met other three qualities with few degradation of real-time quality for lecture video streaming.
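
The amount of pre-roll a segment needs can be illustrated with a small calculation. This is a minimal sketch under simplifying assumptions that are not taken from the paper: the frame sizes of a segment are known, the channel has a constant bandwidth, transmission of each segment starts at time zero of that segment, and frames are played at a fixed rate. The function and parameter names are hypothetical.

```python
def preroll_delay(frame_bits, bandwidth_bps, fps):
    """Smallest start-up delay (seconds) so that no frame of the segment arrives late."""
    delay = 0.0
    received_bits = 0
    for i, bits in enumerate(frame_bits):
        received_bits += bits
        arrival = received_bits / bandwidth_bps  # time at which frame i is fully received
        playback_offset = i / fps                # playback time of frame i after start-up
        delay = max(delay, arrival - playback_offset)
    return delay

def per_segment_delays(segments, bandwidth_bps, fps):
    """Apply the pre-roll calculation to each segment independently."""
    return [preroll_delay(segment, bandwidth_bps, fps) for segment in segments]

# Two segments, e.g. split where the video content changes (sizes in bits).
segments = [[400, 100, 100], [100, 100]]
delays = per_segment_delays(segments, bandwidth_bps=200, fps=1)
```

Computing the delay per segment rather than for the whole stream means that a large frame in one segment does not force extra buffering on the others.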

Book Chapter

Privacy-Aware Database System for Retrieving Facial Images

Fujita Tomohiko¹, Takuya Funatomi¹, Yoshitaka Morimura¹, Michihiko Minoh¹

Kyoto University¹

09 Jul 2012

TL;DR: The proposed method, which generates a key by quantizing facial features based on entropy, was applied to a public facial image database, and the system performance and integrity were evaluated.

Abstract: To achieve privacy protection on facial image retrieval systems, we propose a method of encrypting facial images with a key produced from facial features. Because facial features vary even for the same person, it is not recommended to use facial features as the cryptographic key. Therefore, we propose a method for generating a key by quantizing the facial features based on entropy. In our experiment, we applied the proposed method to a public facial image database, and evaluated the system performance and integrity by calculating the false acceptance rate and the false rejection rate.
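
Why quantization helps, and how the two error rates are computed, can be sketched as follows. The uniform quantization step used here is a stand-in for the paper's entropy-based quantization, which the abstract does not detail; the function names, the `step` parameter, and the use of SHA-256 to derive the key are assumptions for illustration.

```python
import hashlib
import math

def key_from_features(features, step):
    """Derive a key from quantized features so that small variations map to the same key.

    Uniform bins of width `step` are an assumption, not the paper's entropy-based scheme.
    """
    bins = [math.floor(x / step) for x in features]
    encoded = ",".join(str(b) for b in bins).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def far_frr(trials):
    """False acceptance and false rejection rates from (is_genuine, accepted) pairs."""
    genuine = [accepted for is_genuine, accepted in trials if is_genuine]
    impostor = [accepted for is_genuine, accepted in trials if not is_genuine]
    far = sum(impostor) / len(impostor) if impostor else 0.0
    frr = sum(not accepted for accepted in genuine) / len(genuine) if genuine else 0.0
    return far, frr

# Two slightly different measurements of the same face fall into the same bins.
key_a = key_from_features([0.12, 0.48], step=0.25)
key_b = key_from_features([0.20, 0.30], step=0.25)
key_other = key_from_features([0.30, 0.30], step=0.25)

trials = [(True, True), (True, False), (True, True), (True, True),
          (False, False), (False, True)]
far, frr = far_frr(trials)
```

Features that lie close to a bin boundary can still fall into different bins across captures, which is one source of false rejections in a scheme like this.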
