
Showing papers presented at "Robot and Human Interactive Communication in 1995"


Proceedings ArticleDOI
05 Jul 1995
TL;DR: A taxonomy for classifying human-mediated control of remote manipulation systems is proposed, based on three dimensions: degree of machine autonomy, level of structure of the remote environment, and extent of knowledge, or modellability, of the remote world.
Abstract: A taxonomy for classifying human-mediated control of remote manipulation systems is proposed, based on three dimensions: degree of machine autonomy, level of structure of the remote environment, and extent of knowledge, or modellability, of the remote world. For certain unstructured, and thus difficult to model, environments, a case is made for remote manipulation by means of director/agent control rather than telepresence. The ARGOS augmented reality toolkit is presented as a means for gathering quantitative spatial information about (i.e., for interactively creating a partial model of) a remotely viewed 3D worksite. This information is used for off-line local programming of the remote manipulator (i.e., virtual telerobotic control), and when ready the final commands are transmitted to the manipulator control system for execution.

114 citations


Proceedings ArticleDOI
05 Jul 1995
TL;DR: Results indicate that redundant presentation of haptic information greatly increases performance and reduces error rates compared to the open-loop case (no force feedback), and that the best results were obtained when both direct haptic and redundant sound feedback were provided.
Abstract: This paper deals with one of the key problems in interacting with virtual environments (VE): finger force feedback during manipulation of virtual objects. In the real world, control of haptic interactions with objects is achieved using kinesthetic/tactile information. In a VE, the presentation of such information requires a dextrous hand master that provides force feedback to individual fingers. The presence of friction and/or time delays and the lack of tactile information in such devices reduce operator performance. In those cases, other sensory channels such as vision and audition can be used to replace (sensory substitution) or to supplement (information redundancy) the haptic channel. In this paper, we present the results of an experimental study investigating human performance during interactions with virtual objects through different sensory channels. The study was performed using the Rutgers VR distributed system, which enables users to receive force information through the haptic, visual, or auditory channel. The experiment was performed using both partially immersive monoscopic and stereoscopic visual displays. Results indicate that redundant presentation of haptic information greatly increases performance and reduces error rates compared to the open-loop case (no force feedback). The best results were obtained when both direct haptic and redundant sound feedback were provided. The increase in task completion time when using visual force feedback suggests an overload of the visual channel.

51 citations


Proceedings ArticleDOI
05 Jul 1995
TL;DR: Experimental results show that the hand path of the hander is straight and that the velocity pattern of the hand is bell-shaped, with its peak located slightly before the middle of the motion, similar to human positioning motions.
Abstract: This paper experimentally investigates the motions of handing an object over in the plane. The hander and receiver generate smooth handing-over motions by adjusting their kinematic features to each other. This adjustment is important and needs to be analyzed. The relative movement of the two hands is also analyzed, with a view to applying the results to robot motions. Finally, the effect of the movement direction, which changes the trajectory of the motions, is analyzed. Experimental results show that the hand path of the hander is straight and that the velocity pattern of the hand is bell-shaped, with its peak located slightly before the middle of the motion, which is similar to human positioning motions. From the viewpoint of the velocity pattern, two representative trajectory patterns are observed in the motion of the receiver.

48 citations


Proceedings ArticleDOI
Fumio Ando, Amane Nakajima, F. Younosuke
05 Jul 1995
TL;DR: A new platform for personal-computer-based self-service terminals with multimedia conferencing functions, providing networked multimedia capability and a face-to-face interface; the major new features include asymmetric operation and user interface of a shared chalkboard, dynamic line connection/disconnection control, and dynamic media (video/audio/data) control over ISDN communication.
Abstract: We developed a new platform for personal-computer-based self-service terminals with multimedia conferencing functions, in order to provide networked multimedia capability and a face-to-face interface. The platform is based on extending our desktop conferencing system, ConverStation/2. It provides ITU-T H.320 motion video and audio communication and a shared chalkboard function between a self-service terminal and an operator terminal. It also provides scanned-image transmission for browsing and co-editing of images between a customer and an operator. We cleanly separate the platform from the application modules by providing a string command interface between them, and we enhanced our conferencing features to meet the requirements of self-service applications. The major new features include: 1) asymmetric operation and user interface of a shared chalkboard; 2) dynamic line connection/disconnection control; and 3) dynamic media (video/audio/data) control over ISDN communication. With these conferencing functions, more complex services such as consultation can be provided through self-service terminals.

44 citations


Proceedings ArticleDOI
05 Jul 1995
TL;DR: This work addresses the development of force and tactile feedback systems that generate contact force and tactile stimuli, with the aim of allowing the human operator to achieve realistic control of the operation.
Abstract: The analysis of the behaviour of the human operator during interaction with virtual environments requires the availability of adequate interface systems. In particular, when control of manipulative and explorative procedures is required, all movements of the hand should be recorded, and tactile as well as contact force stimuli should be replicated at the level of the hand. We address the development of force and tactile feedback systems devoted to generating such stimuli, with the aim of allowing the human operator to achieve realistic control of the operation. The particular roles of force and tactile feedback systems are presented with direct reference to grasping and explorative tasks. The general performance of a force feedback system is highlighted, together with a description of the Hand Force Feedback system developed at the Scuola Superiore Sant'Anna. Tactile feedback is presented by considering the modelling of both thermal and indentation stimuli.

43 citations


Proceedings ArticleDOI
05 Jul 1995
TL;DR: This paper deals with the real-time recognition of six basic facial expressions using a layer-type neural network, and finds that the correct recognition ratio reached 85%.
Abstract: This paper deals with the real-time recognition of six basic facial expressions. In order to obtain the center position of both pupils, we capture brightness values with a CCD camera along a vertical line crossing the pupil and eyebrow as base data, and calculate the cross-correlation between the base data and the corresponding data in a given image. The positions of the right and left pupils are extracted separately. As facial information for recognizing facial expressions, we use brightness data from 13 vertical lines, determined empirically and covering the areas of the eyes, eyebrows, and mouth. We then acquire this facial information for the basic facial expressions of 30 subjects whose face images have already been obtained. Since we use a layer-type neural network for recognizing facial expressions, the facial information from some of the 30 subjects is used to train the network. We found that when 15 subjects were used for network training, the correct recognition ratio reached 85%, and the total time for detecting the right and left pupil positions plus recognizing the facial expression was about 60 ms per recognition cycle.
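The pupil-localization step (sliding stored "base data" along a vertical scan line and keeping the position of highest cross-correlation) can be sketched as follows. This is an illustrative reconstruction, not the paper's code; all names and brightness values are our own assumptions.

```python
# Hedged sketch: slide a stored 1-D brightness profile ("base data") along a
# scan line and return the offset with the highest normalized cross-correlation.
import math

def normalized_cross_correlation(a, b):
    """NCC of two equal-length brightness sequences."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0

def best_match_offset(base, scan_line):
    """Offset in scan_line where base correlates best."""
    n = len(base)
    scores = [(normalized_cross_correlation(base, scan_line[i:i + n]), i)
              for i in range(len(scan_line) - n + 1)]
    return max(scores)[1]

# Toy example: a dark pupil dip embedded at offset 4 of a brighter scan line.
base = [200, 80, 40, 80, 200]          # stored profile around the pupil
line = [210, 205, 200, 195, 200, 80, 40, 80, 200, 210]
print(best_match_offset(base, line))   # -> 4
```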

35 citations


Proceedings ArticleDOI
05 Jul 1995
TL;DR: It is shown that the multimodal drawing tool with speech recognition reduces the average operation time to 82%, the number of command inputs to 89%, and the movement of the mouse pointer to 53% in an evaluation with inexperienced users.
Abstract: This paper focuses on the utility of speech input. We propose principles of human-computer interaction, consisting of a basic principle and interface-organization principles required for comfortable input systems. Applying these principles, we discuss the desired organization of interfaces using speech, mouse, and keyboard, and design a multimodal drawing tool, S-tgif. We also verify the benefits of speech input using the tool. It is shown that the multimodal drawing tool with speech recognition reduces the average operation time to 82%, the number of command inputs to 89%, and the movement of the mouse pointer to 53% in an evaluation with inexperienced users.

34 citations


Proceedings ArticleDOI
05 Jul 1995
TL;DR: A new control method is proposed which, after dividing an operated load into a gravity load and a dynamic load, selects the power assist ratio for the dynamic load by considering the actuator torque remaining after the ratio for the gravity load has been determined based on the operator's capability.
Abstract: This paper proposes a control method for a power assist system which attenuates the load force. In such a system, the question of how to select a power assist ratio is important. This ratio must be selected with consideration of the maximum torque of each actuator used in the system; otherwise actuator saturation may occur, causing loss of manoeuvrability and instability. To avoid such saturation problems, we propose a new control method which, after dividing an operated load into a gravity load and a dynamic load, selects the power assist ratio for the dynamic load by considering the actuator torque remaining after the ratio for the gravity load has been determined based on the operator's capability. The control law is formulated for a single-axis power assist system, and the effectiveness of the method is confirmed by experiments.
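The ratio-selection idea above can be illustrated numerically: fix the gravity-load assist ratio from the operator's capability first, then cap the dynamic-load ratio so the total demand fits within the actuator's maximum torque. This is a minimal sketch of our own formulation, not the paper's control law; all names and values are illustrative assumptions.

```python
# Hedged sketch of budget-based assist-ratio selection (our own formulation).

def assist_ratios(tau_max, tau_gravity, tau_dynamic_peak, operator_capacity):
    """Return (alpha_g, alpha_d): assist ratios for gravity and dynamic load."""
    # Gravity ratio: assist whatever fraction the operator cannot hold statically.
    alpha_g = max(0.0, min(1.0, 1.0 - operator_capacity / tau_gravity))
    remaining = tau_max - alpha_g * tau_gravity     # torque budget left over
    # Dynamic ratio: cap it so the peak dynamic demand fits in the remainder,
    # avoiding actuator saturation.
    alpha_d = max(0.0, min(1.0, remaining / tau_dynamic_peak))
    return alpha_g, alpha_d

# Example: 50 Nm actuator, 40 Nm gravity load, 30 Nm peak dynamic load,
# operator comfortably holds 10 Nm unassisted.
a_g, a_d = assist_ratios(50.0, 40.0, 30.0, 10.0)
print(a_g, a_d)   # gravity ratio 0.75; dynamic ratio capped below 1
```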

29 citations


Journal ArticleDOI
05 Jul 1995
TL;DR: A machine-maintenance training system in a virtual environment that can be used easily by novice users is being developed, and a method for representing assembly and disassembly procedures using Petri nets is proposed.
Abstract: The periodical inspection of nuclear power plants is indispensable to their operation. However, it requires a large workforce with a high degree of technical skill in assembling and disassembling various sorts of machines in a hazardous environment. In this study the authors are developing a machine-maintenance training system in a virtual environment that can be used easily by novice users. This paper outlines the system configuration and proposes a method for representing assembly and disassembly procedures using Petri nets.

24 citations


Proceedings ArticleDOI
05 Jul 1995
TL;DR: A virtual space teleconferencing system that connects three different sites via a 1.5 Mbps ISDN in commercial use as an example of a human-oriented telecommunications system that involves a new paradigm called "Communication with realistic sensations".
Abstract: This paper describes a virtual space teleconferencing system as an example of a human-oriented telecommunications system. Virtual space teleconferencing at ATR involves a new paradigm called "Communication with realistic sensations". By using this system, participants at different sites can engage in a conference with the sensation of sharing the same space. More specifically, our system connects three different sites via a 1.5 Mbps ISDN in commercial use. The system has two large screens with which real-time reproductions of 3-D whole body images of humans are achieved. Participants at the three different sites are able to feel as if they are all at one site. They can also cooperatively work through a virtual common space using a multimodal interface.

24 citations


Proceedings ArticleDOI
05 Jul 1995
TL;DR: The "optical flow technique" adopted here can distinguish moving objects from the background; an example of detecting a human being is demonstrated.
Abstract: This paper proposes a method for detecting human beings with an autonomous mobile guard robot. The "optical flow technique" adopted here can distinguish moving objects from the background. This paper describes the technique and demonstrates an example of detecting a human being.
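The abstract gives no algorithmic detail, so the following is only a loose 1-D illustration of the underlying principle: estimate per-block motion between two frames by block matching, then flag blocks whose motion differs from the static background. All block sizes, search ranges, and pixel values are our own assumptions.

```python
# Hedged 1-D sketch of motion segmentation via block matching (not the paper's
# optical-flow method): a block is "moving" if its best-match shift is nonzero.

def block_motion(prev, curr, block, start, max_shift):
    """Shift minimizing sum of absolute differences for one block.

    Ties are broken toward zero shift so flat background stays static."""
    ref = prev[start:start + block]
    def sad(shift):
        seg = curr[start + shift:start + shift + block]
        return sum(abs(a - b) for a, b in zip(ref, seg))
    return min(range(-max_shift, max_shift + 1), key=lambda s: (sad(s), abs(s)))

def moving_blocks(prev, curr, block=4, max_shift=2):
    """Indices of blocks whose estimated motion is nonzero.

    Block 0 is skipped so start+shift never goes negative."""
    n = (len(prev) - max_shift - block) // block
    return [i for i in range(1, n)
            if block_motion(prev, curr, block, i * block, max_shift) != 0]

# Toy "frames": a bright blob moves right by 2 pixels; the background is static.
prev = [0] * 8 + [9, 9, 9, 9] + [0] * 8
curr = [0] * 10 + [9, 9, 9, 9] + [0] * 6
print(moving_blocks(prev, curr))   # -> [2]
```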

Proceedings ArticleDOI
05 Jul 1995
TL;DR: The present work seeks to model human motions in a manner amenable to learning and recognition, employing hidden Markov models (HMMs) to model semantically meaningful human movements.
Abstract: Efforts to understand human motion have been increasing in number and complexity, and will most likely prove to be a key component of human-computer interfaces. One key feature of motion in general, and human motion in particular, is its dynamic nature. The present work seeks to model human motions in a manner amenable to learning and recognition. For this application, hidden Markov models (HMMs) are employed to model semantically meaningful human movements. The data used for modeling the human motions is an approximate pose derived from a sequence of camera images. An HMM is learned for each motion class and employed as a maximum-likelihood recognizer. Experiments show promising results for a set of six sport actions.
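The per-class HMM recognition scheme described above can be sketched in a few lines: score a new observation sequence under each class's HMM with the forward algorithm and pick the class of highest likelihood. The tiny hand-set 2-state discrete HMMs below are illustrative assumptions, not the paper's learned models.

```python
# Hedged sketch: maximum-likelihood classification with one HMM per class.

def forward_likelihood(pi, A, B, obs):
    """P(obs | HMM) via the forward algorithm (discrete emissions).

    pi: initial state probs; A: state transition matrix; B: emission matrix."""
    alpha = [pi[s] * B[s][obs[0]] for s in range(len(pi))]
    for o in obs[1:]:
        alpha = [sum(alpha[s] * A[s][t] for s in range(len(pi))) * B[t][o]
                 for t in range(len(pi))]
    return sum(alpha)

def classify(models, obs):
    """Class label whose HMM gives the observation sequence highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(*models[name], obs))

# Two toy "motion classes" over a 2-symbol pose alphabet (0 = low, 1 = high).
models = {
    "wave":  ([0.5, 0.5], [[0.7, 0.3], [0.3, 0.7]], [[0.9, 0.1], [0.1, 0.9]]),
    "throw": ([0.5, 0.5], [[0.7, 0.3], [0.3, 0.7]], [[0.2, 0.8], [0.2, 0.8]]),
}
print(classify(models, [0, 0, 1, 0, 0]))  # mostly-low sequence -> "wave"
```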

Proceedings ArticleDOI
05 Jul 1995
TL;DR: A vision-based hand pose recognition system that expresses a hand pose by a plane model that consists of hand's center of gravity (COG) and fingertip points and solves the detection problem by using hand skeleton images detected by a multi-camera system.
Abstract: We propose a vision-based hand pose recognition system. The system expresses a hand pose by a plane model consisting of the hand's center of gravity (COG) and fingertip points. These reference points are relatively more stable and more easily detected than other points (e.g., finger base points). However, since the COG detection process assumes that the hand region in an image is separate from other regions, detection becomes unstable when, for instance, the hand region is connected to an arm region. Moreover, finger occlusion, which occurs at specific palm orientations, makes angle detection unstable. The technique we present here solves the former problem by using hand skeleton images detected by a multi-camera system: we pick many COG candidates and select one according to its attributes. The multi-camera system also solves the latter problem. Results of a series of experiments on the former problem are also presented.

Proceedings ArticleDOI
05 Jul 1995
TL;DR: The algorithm of automated face area segmentation and facial feature extraction from input images with free backgrounds is described, which can be applied not only to the media conversion but also to human-machine interface.
Abstract: In this paper, we describe an algorithm for automated face-area segmentation and facial feature extraction from input images with arbitrary backgrounds. The extracted feature points around the eyes, mouth, nose, and facial contours are used for modifying facial images. The modified images are stored in frame memory, and a human speaking scene is generated by continually changing the frames according to input text or speech sound. When a speaking voice is input, the vowels are recognised and the corresponding frames are recalled. This system can be applied not only to media conversion but also to human-machine interfaces.

Proceedings ArticleDOI
05 Jul 1995
TL;DR: The EMT-HMD presents a wider view angle and higher resolution than conventional HMDs, even though conventional display devices are used; a structure that tracks eye movement in two dimensions is discussed.
Abstract: This paper describes a head mounted display (HMD) which can present visual images with high reality by exploiting the characteristics of the human eye. The HMD has received considerable attention as vision presentation equipment for virtual reality and telerobotics. However, conventional HMDs suffer from a narrow view angle and low resolution, because their displays have a fixed number of scanning lines. To improve on this, an eye movement tracking HMD (EMT-HMD) was proposed. The EMT-HMD can present a wider view angle and higher resolution than conventional HMDs, even though conventional display devices are used. The principle is that a small high-resolution image is superimposed on a wide low-resolution image using two conventional display devices. The small, high-resolution image is presented in a small area around the view point according to the output of the eye movement detector, while the wide, low-resolution image is presented in the surrounding peripheral area. In this paper, a structure that can track eye movement in two dimensions is discussed, and the structure of the trial production system is explained. Some evaluation experiments are also introduced to confirm the effectiveness of the proposed idea.

Proceedings ArticleDOI
05 Jul 1995
TL;DR: This paper presents an attempt to create a framework in which the Urge Theory of M. Toda can be investigated, showing improvements in the implementational efficiency and theoretical accuracy of action selection dynamics (ASD).
Abstract: Recent implementations of action selection dynamics (ASD) with learning, in situated/embodied form, have increased the potential for more timely, dynamic, and vigorous interactions between the autonomous agent and its environment than Maes' (1989) simulation of ASD demonstrated and implied. The most recent implementation of ASD is an attempt to create a framework in which the Urge Theory of M. Toda can be investigated. It produced improvements in the implementational efficiency and theoretical accuracy of ASD. The ASD network receives input from several different sensors (including vision) and supports learning to change inter-agent network relationships. Emotional states such as fear, curiosity, affection-seeking, hunger, joy, irritation, and anger are supported as emergent phenomena. The robot's on-board voice synthesis unit announces its internal states.

Proceedings ArticleDOI
05 Jul 1995
TL;DR: This paper proposes an approach to designing the behavior, and the corresponding subjective world, of a small robot that behaves like an animal, employing a hierarchical model of the relation between consciousness and behavior.
Abstract: This paper proposes an approach to designing the behavior, and the corresponding subjective world, of a small robot that behaves like an animal. This approach employs a hierarchical model of the relation between consciousness and behavior. The basic idea of this model is that a consciousness appears at a level of the hierarchical structure when an action on the immediately lower level is inhibited by internal or external causes, and that the appearing consciousness drives a chosen higher action. A computer simulation on a Mac shows the behavior of an artificial animal, from reflex actions to catching food. The instantaneous consciousness that appears due to inhibited behavior is visualized on screen together with the behavior, using colors corresponding to the animal's emotions.

Proceedings ArticleDOI
05 Jul 1995
TL;DR: In this paper, subjective, multidimensional responses and objective, facial responses to a set of 20 different phenomena or stimuli were investigated, including caricatures and anthropomorphic forms typically used to represent agents in human interfaces.
Abstract: There is an increasing need for a robust emotion state model in human-computer interaction, especially when naturalistic input, such as facial expression, is used. Existing cognitive models of emotion provide a starting point, but they are dependent upon phenomena which occur in the physical environment; virtual environments present the human with qualitatively different phenomena. We investigated subjective, multidimensional responses and objective, facial responses to a set of 20 different phenomena or stimuli. This set of stimuli included caricatures and anthropomorphic forms typically used to represent agents in human interfaces. The stimuli were presented in both static and dynamic forms. Subjects rated the anthropomorphic forms as having a higher degree of agency and intelligence. A variety of other interesting results were found relating to complexity, anthropomorphism, movement, and the culture of the subject. These findings indicate that a substantially different emotion state model will have to be developed for human-computer interaction. The findings provide practical heuristics for the design of "social" or agent-based interfaces.

Proceedings ArticleDOI
05 Jul 1995
TL;DR: The present article deals with work carried out in the framework of TIDE project No. 150, MARCUS, on the development of a new three-fingered polyarticulated myoelectric prosthesis, equipped with position, force, and slip sensors; sensor-based control allows it to maintain a stable grasp of an object without demanding the user's attention.
Abstract: The present article deals with work carried out in the framework of TIDE project No. 150, MARCUS, on the development of a new three-fingered polyarticulated myoelectric prosthesis. The prosthetic hand is equipped with position, force, and slip sensors, while sensor-based control allows it to maintain a stable grasp of an object without demanding the user's attention. A general description of the whole system is given, emphasizing the mechanical solutions used for the three fingers. Force sensors at the fingertips as well as palm sensors have been integrated into the structure. Results on sensor performance are also shown.

Proceedings ArticleDOI
05 Jul 1995
TL;DR: This work exemplifies its methods for the game of Kendama, executed by the SARCOS Dextrous Slave Arm, which has exactly the same kinematic structure as a human arm.
Abstract: A general theory of movement pattern perception based on dynamic optimization theory can be used for motion capture and learning by watching in robotics. We exemplify our methods for the game of Kendama, executed by the SARCOS Dextrous Slave Arm, which has exactly the same kinematic structure as a human arm. Three ingredients had to be integrated for the successful execution of this task: (1) extracting via-points from a human movement trajectory using a forward-inverse relaxation model, (2) treating via-points as control variables while reconstructing the desired trajectory from all the via-points, and (3) modifying the via-points for successful execution.

Proceedings ArticleDOI
05 Jul 1995
TL;DR: Two parallel algorithms for detecting collisions among 3D objects in real time are proposed for shared-memory MIMD multiprocessors; one uses a static and the other a dynamic method for load balancing.
Abstract: We propose parallel algorithms for detecting collisions among 3D objects in real time. First, a basic serial algorithm is described. It can detect potential collisions among multiple objects with arbitrary motion (translation and rotation) in 3D space, and can be used without modification for both convex and concave objects represented as polyhedra. The algorithm is efficient, simple to implement, and does not require any memory-intensive auxiliary data structure to be precomputed and updated. Then, two parallel algorithms are proposed for shared-memory MIMD multiprocessors; one uses a static and the other a dynamic method for load balancing. Experimental results demonstrate the performance of the proposed collision detection methods.
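The paper's serial algorithm is not reproduced in the abstract, so as a stand-in, here is a common simplification of the same problem: an axis-aligned bounding-box (AABB) broad phase that reports which object pairs *may* collide and therefore need an exact narrow-phase test. The object representation and names are our own assumptions.

```python
# Hedged sketch: AABB broad-phase collision candidates (a stand-in technique,
# not the paper's algorithm). Each object is (min_corner, max_corner) in 3D.
from itertools import combinations

def aabb_overlap(a, b):
    """True if two AABBs overlap on every axis."""
    (amin, amax), (bmin, bmax) = a, b
    return all(amin[i] <= bmax[i] and bmin[i] <= amax[i] for i in range(3))

def potential_collisions(boxes):
    """All index pairs whose bounding boxes overlap (narrow-phase candidates)."""
    return [(i, j) for i, j in combinations(range(len(boxes)), 2)
            if aabb_overlap(boxes[i], boxes[j])]

boxes = [
    ((0, 0, 0), (1, 1, 1)),        # object 0
    ((0.5, 0.5, 0.5), (2, 2, 2)),  # object 1: overlaps object 0
    ((5, 5, 5), (6, 6, 6)),        # object 2: far away
]
print(potential_collisions(boxes))  # -> [(0, 1)]
```

The pairwise loop is what the paper parallelizes: the pairs can be partitioned across processors statically (fixed chunks) or dynamically (a shared work queue) for load balancing.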

Journal ArticleDOI
01 Jan 1995
TL;DR: A study of the relations between the philosophy of nature and spiritualism in Cousin; the author also examines the place of the notion of the absolute as it appears in the Cours of 1818.
Abstract: A study of the relations between the philosophy of nature and spiritualism in Cousin. The author also examines the place of the notion of the absolute as it appears in the Cours of 1818.

Proceedings ArticleDOI
05 Jul 1995
TL;DR: It is confirmed that coordinated motions of the eye and head system are possible, with both motion and velocity equivalent to those of a human.
Abstract: In the Humanoid Project at Waseda University, we are developing a campus information assistant, Hadaly, which provides campus information services. This paper describes the anthropomorphic head-eye system, comprising an eye and head mechanism, which is a subsystem of Hadaly. The head-eye system consists of an eyeball mechanism and a head. Since the camera drive mechanism needs to be lightweight and immune to backlash, the eyeball part uses a tendon-driven gimbal mechanism. Experiments were performed in which the system looked at a target placed to the side within the visual field of the CCD camera, and pursued a moving target within the visual field. From these experiments, we confirm that coordinated motions of the eye and head system are possible, with both motion and velocity equivalent to those of a human.

Proceedings ArticleDOI
05 Jul 1995
TL;DR: The purpose of this study is to propose an algorithm for a system, including the mechanism, for avoiding a person; an experimental system for avoidance motions and experimental results are reported.
Abstract: The purpose of this study was to propose an algorithm for a system, including the mechanism, that avoids a person. As a basic study, this paper reports on experiments on human avoidance motion, an experimental system for avoidance motions, and experimental results. Human avoidance behavior occurs when people pass each other. Many passings between experimenter and subject were recorded on VTR, and avoidance motion data were obtained from their loci by analyzing the recordings. The experimental system for avoidance motions has three DC servo motors with encoders: one is used on the human side, and two are used on the robot side to realize avoidance along the X-Y axes. A personal computer controls all motors and realizes avoidance motion using the human behavior data.

Proceedings ArticleDOI
05 Jul 1995
TL;DR: By introducing multi-level synchronization checking and context analysis, the action pattern of the robot can be regulated so that the robot performs well in a complicated environment with plural speakers.
Abstract: The purpose of this study is to realize a multimedia sensing system for a robot. Using both image and sound processing, the system makes the robot track the person who is speaking. The sound direction is calculated from the phase difference between the sounds arriving at the right and left microphones (ears) of the robot. Then, by detecting the synchronization between the sound and image changes, the system identifies the speaker. Furthermore, by introducing multi-level synchronization checking and context analysis, the action pattern of the robot can be regulated so that the robot performs well in a complicated environment with plural speakers. All processing is performed in real time. The proposed system is implemented in the information assistant robot "Hadaly".
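The two-microphone bearing computation mentioned above has a standard closed form: under a far-field assumption, an inter-microphone arrival delay Δt gives the source bearing as θ = arcsin(c·Δt / d), where c is the speed of sound and d the microphone spacing. The sketch below is a hedged illustration of that formula; the constants and names are our own assumptions, not values from the paper.

```python
# Hedged sketch: source bearing from inter-microphone time delay (far-field).
import math

SPEED_OF_SOUND = 343.0   # m/s, approximate value at room temperature

def bearing_from_delay(delay_s, mic_spacing_m):
    """Source bearing in degrees (0 = straight ahead) from arrival delay."""
    s = SPEED_OF_SOUND * delay_s / mic_spacing_m
    s = max(-1.0, min(1.0, s))          # clamp against measurement noise
    return math.degrees(math.asin(s))

# A source 30 degrees off-axis from 0.2 m spaced "ears" delays the far ear by
# d*sin(30 deg)/c, roughly 0.29 ms.
delay = 0.2 * math.sin(math.radians(30.0)) / SPEED_OF_SOUND
print(round(bearing_from_delay(delay, 0.2), 1))  # -> 30.0
```

In practice the delay itself would come from the inter-channel phase difference (e.g., via cross-correlation of the two microphone signals), which is the step the abstract refers to.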

Proceedings ArticleDOI
K. Watanabe, H. Kosaka
05 Jul 1995
TL;DR: The results of factor analysis of the questionnaire data obtained in the sensory test show that the most substantial factor is the feeling expressed by the words "Clear", "Smooth", "Stiff", and "Clicking".
Abstract: Comfortable man-machine interfaces are required for current advanced systems. This paper investigates the relation between the reaction force of keyboard switches and the human feeling of switch operation. We investigate, via a sensory test, how the reaction force of each switch affects the touch feeling of keyboard switches. The results of factor analysis of the questionnaire data obtained in the sensory test show that the most substantial factor is the feeling expressed by the words "Clear", "Smooth", "Stiff", and "Clicking". Finally, each feeling described by each word is directly evaluated against reaction force, and vice versa, on dual scales. The dual scale of reaction force and degree of feeling can be used directly for designing a comfortable keyboard switch.

Proceedings ArticleDOI
05 Jul 1995
TL;DR: This research proposes a method that merges information from prosody, macro-level facial expression, and head movement, and classifies these into categories of feelings; by using both visual and speech information, human-feeling recognition accuracy of up to 75% was achieved.
Abstract: As all sorts of systems have been computerized, the importance of interactive communication between humans and computers has risen. In this research we propose a method that merges information from prosody, macro-level facial expression, and head movement, and classifies these into categories of feelings. By using both visual and speech information, human-feeling recognition accuracy of up to 75% was achieved.

Proceedings ArticleDOI
05 Jul 1995
TL;DR: A basic model of violinist's playing information flow is described and the following rule has been derived: when the Kansei information changes, both the bow force and sounding point also change, but the bow speed does not change.
Abstract: The purpose of this study is to derive evaluation functions for planning human motion involving Kansei effects. Clarifying the effects of Kansei on human motion contributes to the design of human-machine interfaces and to robot planning. This study deals with violin playing as an example of human motion affected by Kansei. In this paper, a basic model of a violinist's playing information flow is described, together with the results of experiments carried out based on this model. In the experiments, scales were played by three professional violinists to estimate the effects of Kansei information. As a result, the following rule has been derived: when the Kansei information changes, both the bow force and the sounding point change, but the bow speed does not.

Proceedings ArticleDOI
Fumio Kawakami, M. Okura, Hiroshi Yamada, H. Harashima, Shigeo Morishima
05 Jul 1995
TL;DR: A subjective evaluation of facial expressions synthesized from a 3-D emotion space, toward realizing a very natural human-machine communication system by giving a facial expression to a computer.
Abstract: The goal of this research is to realize a very natural human-machine communication system by giving a facial expression to a computer. The previously proposed 3-D emotion space can compactly represent the emotional conditions, of both human and computer, that appear on the face. This 3-D emotion space realizes both the mapping from facial expression into the space and its inverse. This paper mainly concerns a subjective evaluation using facial expressions synthesized from the 3-D emotion space.

Proceedings ArticleDOI
05 Jul 1995
TL;DR: The emotions of feeling flexible, pleasant, and human-like are explained by a linear regression model of the impedance values, while the emotion of feeling reassured is classified into 4 groups from the viewpoint of contact force and response speed.
Abstract: This paper examines the appropriate virtual impedance values of robots coexisting with humans from the viewpoint of human emotions. The values are investigated experimentally using the rating scale method, which is generally used for evaluating various stimuli subjectively. In the experiments, 11 subjects apply force or impact to the robot and evaluate its reaction using the rating scale method. The evaluations cover 48 combinations of virtual mass, viscous coefficient, and stiffness. As a result, the emotions of feeling flexible, pleasant, and human-like are explained by a linear regression model of the impedance values, while the emotion of feeling reassured is classified into 4 groups from the viewpoint of contact force and response speed.