
Showing papers in "IEICE Transactions on Information and Systems in 1999"


Journal Article
TL;DR: This paper surveys current topics in document image understanding from a technical point of view, covering the methods and approaches proposed for the recognition of various kinds of documents.
Abstract: The aim of document image understanding is to extract and classify individual data meaningfully from paper-based documents. To date, many methods and approaches have been proposed for the recognition of various kinds of documents, for the technical problems involved in extending OCR, and for the requirements of practical use. Although the technical research issues of the early stage were regarded as complementary attacks on the limitations of traditional OCR, which depends on character recognition techniques, the range of applications and related issues has since been widely investigated and continues to be established progressively. This paper surveys current topics in document image understanding from a technical point of view. key words: document model, top-down, bottom-up, layout structure, logical structure, document types, layout recognition

222 citations


Journal Article
TL;DR: In this paper, a reconfigurable processor architecture called REMARC (Reconfigurable Multimedia Array Coprocessor) is proposed to accelerate multimedia applications, such as video compression, decompression, and image processing.
Abstract: This paper describes a new reconfigurable processor architecture called REMARC (Reconfigurable Multimedia Array Coprocessor). REMARC is a reconfigurable coprocessor that is tightly coupled to a main RISC processor and consists of a global control unit and 64 programmable logic blocks called nano processors. REMARC is designed to accelerate multimedia applications, such as video compression, decompression, and image processing. These applications typically use 8-bit or 16-bit data; therefore, each nano processor has a 16-bit datapath that is much wider than those of other reconfigurable coprocessors. We have developed a programming environment for REMARC and several realistic application programs: DES encryption, MPEG-2 decoding, and MPEG-2 encoding. REMARC achieves speedups ranging from a factor of 2.3 to 21.2 on these applications.

200 citations


Journal Article
TL;DR: The goal of this paper is to present a critical survey of the existing literature on omnidirectional sensing and its applications in autonomous robot navigation, telepresence, remote surveillance and virtual reality.
Abstract: The goal of this paper is to present a critical survey of the existing literature on omnidirectional sensing. The range of vision applications such as autonomous robot navigation, telepresence and virtual reality is expanding through the use of cameras with a wide angle of view. In particular, a real-time omnidirectional camera with a single center of projection is suitable for analysis and monitoring, because any desired image projected onto any designated image plane, such as a pure perspective image or a panoramic image, can easily be generated from the omnidirectional input image. In this paper, I review the designs and principles of existing omnidirectional cameras, which can acquire an omnidirectional (360-degree) field of view, and their applications in autonomous robot navigation, telepresence, remote surveillance and virtual reality. key words: omnidirectional camera, multiple sensing camera, panoramic view, omnidirectional view, computer vision

181 citations


Journal Article
TL;DR: In this article, the authors analyze the expected number of rounds and optimal values to minimize communication costs in a multi-round sealed-bid auction, where the winners from an auction round participate in a subsequent tie-breaking second auction round.
Abstract: Auctions are a critical element of the electronic commerce infrastructure, but for real-time applications they are a potential problem: they can cause significant time delays. Thus, for most real-time applications, sealed-bid auctions are recommended. But how do we handle tie-breaking in sealed-bid auctions? This paper analyzes the use of multi-round auctions in which the winners of one round participate in a subsequent tie-breaking round. We perform this analysis over the classical first-price sealed-bid auction, modified to provide full anonymity, and analyze the expected number of rounds and the optimal parameter values to minimize communication costs.
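The expected number of tie-breaking rounds can be estimated by simulation. The sketch below is a minimal illustration under the simplifying assumption of uniformly random discrete bids; the function name and parameters are illustrative, not taken from the paper's model:

```python
import random

def sealed_bid_rounds(n_bidders, n_prices, rng):
    """Simulate multi-round tie-breaking in a first-price sealed-bid
    auction with bids drawn uniformly from n_prices discrete values.
    Returns the number of rounds until a single winner remains."""
    contenders = n_bidders
    rounds = 0
    while contenders > 1:
        bids = [rng.randrange(n_prices) for _ in range(contenders)]
        top = max(bids)
        contenders = bids.count(top)   # only the tied top bidders re-bid
        rounds += 1
    return rounds

rng = random.Random(0)
trials = [sealed_bid_rounds(10, 100, rng) for _ in range(2000)]
avg = sum(trials) / len(trials)
print(round(avg, 2))  # with a fine price grid, ties are rare and the average stays near 1
```

With fewer distinct prices the tie probability rises and the expected round count grows, which is exactly the communication-cost trade-off the paper analyzes.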

146 citations


Journal Article
TL;DR: The scheme uses the image fusion technique to automatically recognize and remove contamination by clouds and their shadows, and to integrate complementary information from multitemporal images into the composite image.
Abstract: In this paper, a scheme to remove clouds and their shadows from remotely sensed Landsat TM images over land is proposed. The scheme uses image fusion to automatically recognize and remove contamination by clouds and their shadows, and to integrate complementary information from multitemporal images into the composite image. Cloud regions can be detected on the basis of their reflectance differences from other regions. Based on the fact that shadows smooth the brightness changes of the ground, shadow regions can be detected successfully by means of the wavelet transform. Further, an area-based detection rule is developed, and the multispectral characteristics of Landsat TM images are used to alleviate the computational load. Because the wavelet transform is adopted for the image fusion, artifacts are invisible in the fused images. Finally, the performance of the proposed scheme is demonstrated experimentally. key words: remote sensing, image fusion, wavelet transform, automated detection and removal, Landsat TM images
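A toy version of the wavelet-domain fusion step can be sketched as follows: a single-level Haar transform with a simple reflectance threshold standing in for the paper's detection rules. The names `haar2d`, `fuse`, and the `cloud_thresh` value are assumptions for illustration only:

```python
import numpy as np

def haar2d(x):
    # one-level 2-D Haar transform: approximation + 3 detail bands
    a = (x[0::2, 0::2] + x[0::2, 1::2] + x[1::2, 0::2] + x[1::2, 1::2]) / 4
    h = (x[0::2, 0::2] - x[0::2, 1::2] + x[1::2, 0::2] - x[1::2, 1::2]) / 4
    v = (x[0::2, 0::2] + x[0::2, 1::2] - x[1::2, 0::2] - x[1::2, 1::2]) / 4
    d = (x[0::2, 0::2] - x[0::2, 1::2] - x[1::2, 0::2] + x[1::2, 1::2]) / 4
    return a, h, v, d

def ihaar2d(a, h, v, d):
    # exact inverse of haar2d
    x = np.empty((2 * a.shape[0], 2 * a.shape[1]))
    x[0::2, 0::2] = a + h + v + d
    x[0::2, 1::2] = a - h + v - d
    x[1::2, 0::2] = a + h - v - d
    x[1::2, 1::2] = a - h - v + d
    return x

def fuse(img_t1, img_t2, cloud_thresh=0.8):
    """Replace cloud-contaminated coefficients of img_t1 with those of a
    second acquisition img_t2 in the Haar wavelet domain (toy version of
    the paper's fusion; cloud_thresh is an assumed reflectance cutoff)."""
    c1, c2 = haar2d(img_t1), haar2d(img_t2)
    cloudy = c1[0] > cloud_thresh          # bright approximation = cloud
    fused = [np.where(cloudy, b2, b1) for b1, b2 in zip(c1, c2)]
    return ihaar2d(*fused)
```

Fusing in the transform domain, rather than pasting pixels, is what keeps seams between the two acquisitions from showing up as artifacts.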

104 citations


Journal Article
TL;DR: In this paper, the branch and bound technique is applied to find the true network structure with the minimum value of a minimum description length (MDL) formula; the resulting algorithm considerably reduces computation while successfully finding the optimal network structure.
Abstract: In this paper, the computational issue in the problem of learning Bayesian belief networks (BBNs) based on the minimum description length (MDL) principle is addressed. Based on an asymptotic formula of description length, we apply the branch and bound technique to finding true network structures. The resulting algorithm considerably reduces computation yet successfully finds the network structure with the minimum value of the formula. Thus far, there has been no search algorithm that finds, for problems of practical size, the optimal solution over a set of network structures in the sense of maximum posterior probability, and heuristic searches such as K2 and K3 become trapped in local optima due to their greedy nature, even when the sample size is large. Since the proposed algorithm minimizes the description length, it eventually selects the true network structure as the sample size goes to infinity. key words: Bayesian belief networks, minimum description length (MDL) principle, branch and bound technique, Cooper and Herskovits procedure, MDL-based procedure, K2, K3
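The pruning idea can be illustrated on the simpler subproblem of selecting a parent set for a single node: because the MDL penalty term grows with the number of parameters, it lower-bounds the total score and lets whole branches be cut without evaluation. The scoring details below are a toy stand-in for the paper's asymptotic formula, not its actual algorithm:

```python
import math
from itertools import combinations

def mdl_score(data, child, parents):
    """Toy MDL for one node of a binary BBN: negative log-likelihood in
    bits plus 0.5 * (number of parameters) * log2(N).
    `data` is a list of 0/1 tuples, one per sample."""
    n = len(data)
    counts = {}
    for row in data:
        key = tuple(row[p] for p in parents)
        c = counts.setdefault(key, [0, 0])
        c[row[child]] += 1
    ll = 0.0
    for c0, c1 in counts.values():
        tot = c0 + c1
        for c in (c0, c1):
            if c:
                ll += c * math.log2(c / tot)
    params = 2 ** len(parents)         # one Bernoulli per parent config
    return -ll + 0.5 * params * math.log2(n)

def best_parents(data, child, candidates):
    """Branch and bound over parent sets: the penalty term alone lower-
    bounds the score, so once it exceeds the incumbent, all larger
    parent sets can be pruned."""
    best = (mdl_score(data, child, ()), ())
    for k in range(1, len(candidates) + 1):
        penalty = 0.5 * (2 ** k) * math.log2(len(data))
        if penalty >= best[0]:         # prune: every k-parent set loses
            break
        for ps in combinations(candidates, k):
            s = mdl_score(data, child, ps)
            if s < best[0]:
                best = (s, ps)
    return best
```

The bound is crude but shows the mechanism: fit can only improve with more parents, while the penalty provably worsens, so the penalty alone certifies that a branch cannot contain the optimum.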

94 citations


Journal Article
TL;DR: Positive decision trees are built to recover the positivity of the data, which the original data had but which is lost when data sets are decomposed by methods such as ID3; positive decision trees exhibit higher accuracy and tend to choose the correct attributes on which the hidden positive Boolean function is defined.

48 citations


Journal Article
TL;DR: This paper proposes genetic algorithms (GAs) for path planning and trajectory planning of an autonomous mobile robot; the GA-based approach has the advantage of adaptivity, working even when the environment is time-varying or unknown.
Abstract: This paper proposes genetic algorithms (GAs) for path planning and trajectory planning of an autonomous mobile robot. Our GA-based approach has the advantage of adaptivity: the GAs work even if the environment is time-varying or unknown. Therefore, it is suitable for both off-line and on-line motion planning. We first present a GA for path planning in a 2D terrain, with simulation results on the performance and adaptivity of the GA on randomly generated terrains. Then, we discuss an extension of the GA for solving both path planning and trajectory planning simultaneously. key words: genetic algorithms, adaptivity, autonomous mobile robots, path planning, trajectory planning
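The GA formulation can be sketched for a simple grid world with a fixed-length chromosome of intermediate waypoints. The representation, operators, and parameters below are illustrative assumptions, not the paper's encoding:

```python
import random

def fitness(path, obstacles):
    """Lower is better: Manhattan path length plus a heavy penalty for
    every waypoint that lands on an obstacle cell.
    A path is a list of grid cells from start to goal."""
    length = sum(abs(a[0] - b[0]) + abs(a[1] - b[1])
                 for a, b in zip(path, path[1:]))
    hits = sum(p in obstacles for p in path)
    return length + 100 * hits

def evolve(start, goal, obstacles, size=10, pop=40, gens=60, seed=0):
    rng = random.Random(seed)
    mid = lambda: (rng.randrange(size), rng.randrange(size))
    # chromosome: two intermediate waypoints between fixed start and goal
    population = [[start, mid(), mid(), goal] for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=lambda p: fitness(p, obstacles))
        parents = population[:pop // 2]          # elitist selection
        children = []
        while len(children) < pop - len(parents):
            a, b = rng.sample(parents, 2)
            child = [start, a[1], b[2], goal]    # one-point crossover
            if rng.random() < 0.3:               # waypoint mutation
                child[rng.randrange(1, 3)] = mid()
            children.append(child)
        population = parents + children
    return min(population, key=lambda p: fitness(p, obstacles))
```

Because fitness is re-evaluated every generation, moving an obstacle between generations simply changes the selection pressure, which is the adaptivity property the abstract emphasizes.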

40 citations


Journal Article
TL;DR: A deflector for a cable transport system, wherein the cable is provided with a plurality of members each supporting a load, has a wheel rotatable about an upright axis and provided around its periphery with a plurality of angularly equispaced guides.
Abstract: A deflector for a cable transport system, wherein the cable is provided with a plurality of members each supporting a load, has a wheel rotatable about an upright axis and provided around its periphery with a plurality of angularly equispaced guides. Each such guide is carried on one end of a short arm, itself pivoted on the periphery of the wheel, so that the guides can be pushed to the side by the load-carrying members as they pass. The guides are small synthetic-resin sheaves which are rotatable about respective axes on their respective arms. The arms normally project radially inwardly of the wheel. In the case of a system wherein the arc of deflection is flatter than the wheel curvature, a cam is provided to array those guides in contact with the cable along the desired arc.

32 citations


Journal Article
TL;DR: The methods proposed here enable evaluators to analyze the log files of multiple users together by detecting, with a repeating pattern detection algorithm, interaction patterns that commonly appear in the log files.
Abstract: In this paper, we propose methods for testing the usability of graphical user interface (GUI) applications based on log files of user interactions. Log analysis by existing methods is not efficient because evaluators analyze a single log file, or the log files of the same user, and then manually compare the results. The methods proposed here solve this problem: they enable evaluators to analyze the log files of multiple users together by detecting interaction patterns that commonly appear in the log files. To achieve this, we first clarify the usability attributes that can be evaluated by a log-based usability testing method and the user interaction patterns that have to be detected for the evaluation. Based on an investigation of the information that can be obtained from the log files, we extract the attributes of clarity, safety, simplicity, and continuity. For the evaluation of clarity and safety, the interaction patterns that have to be detected include those arising from user errors. We then propose our methods for detecting interaction patterns from the log files of multiple users. Patterns that commonly appear in the log files are detected by utilizing a repeating pattern detection algorithm: by regarding an operation sequence recorded in a log file as a string and concatenating the strings, common patterns can be detected as repeating patterns in the concatenated string. We next describe the implementation of the methods in a computer tool for log-based usability testing. The tool, GUITESTER, records user-application interactions into log files, generates usability analysis data from the log files by applying the proposed methods, and visualizes the generated usability analysis data. To show the effectiveness of GUITESTER in finding usability problems, we report an example of a usability test in which evaluators found 14 problems in a tested GUI application.
We finally discuss the log analysis efficiency of the proposed methods by comparing the analysis/sequence time (AT/ST) ratio of GUITESTER with those of other methods and tools. The ratio of GUITESTER is found to be smaller, indicating that the methods make log analysis more efficient. key words: usability, graphical user interfaces, human-computer interaction, log files, interaction patterns
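The string-based detection idea can be sketched as follows. This toy version finds maximal substrings shared by several users' logs by brute-force enumeration rather than the repeating pattern detection algorithm used in the paper:

```python
from collections import defaultdict

def common_patterns(logs, min_len=2, min_users=2):
    """Find contiguous operation subsequences occurring in the logs of at
    least `min_users` different users.  Each log is a string whose
    characters are operation codes, mirroring the paper's idea of
    treating an operation sequence as a string."""
    seen = defaultdict(set)                 # pattern -> set of user ids
    for user, log in enumerate(logs):
        for n in range(min_len, len(log) + 1):
            for i in range(len(log) - n + 1):
                seen[log[i:i + n]].add(user)
    hits = {p for p, users in seen.items() if len(users) >= min_users}
    # keep only maximal patterns (not contained in a longer common one)
    return {p for p in hits
            if not any(p != q and p in q for q in hits)}

print(common_patterns(["abcxx", "zabcy", "abcq"]))  # → {'abc'}
```

Enumerating all substrings is quadratic per log; for real logs a suffix-based repeating-pattern algorithm, as in the paper, does the same job efficiently.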

28 citations


Journal Article
TL;DR: General views of computer vision and image processing based on optimization are presented and the use of a genetic algorithm (GA) as a method of optimization is introduced.
Abstract: In this paper, the authors present general views of computer vision and image processing based on optimization. Relaxation and regularization, in both broad and narrow senses, are used in various fields and problems of computer vision and image processing, and they are currently being combined with general-purpose optimization algorithms. The principles and case examples of relaxation and regularization are discussed; the application of optimization to shape description, a particularly important problem in the field, is described; and the use of a genetic algorithm (GA) as a method of optimization is introduced. key words: computer vision, image processing, optimization, relaxation, regularization, snakes, genetic algorithm

Journal Article
TL;DR: A more general platform for 3-D image representation is introduced, aiming to outgrow the framework of 3-D “image” communication and to open up a novel field of technology, which should be called “spatial” communication.
Abstract: This paper surveys the results of various studies on 3-D image coding, focusing on efficient compression and display-independent representation of 3-D images. Most work on 3-D image coding has concentrated on compression methods tuned to each of the 3-D image formats (stereo pairs, multi-view images, volumetric images, holograms and so on). For the compression of stereo images, several techniques based on the concept of disparity compensation have been developed. For the compression of multi-view images, the concepts of disparity compensation and the epipolar plane image (EPI) are efficient ways of exploiting redundancies between multiple views. These techniques, however, depend heavily on limited camera configurations. In order to cover many other multi-view configurations and other types of 3-D images comprehensively, a more general platform for 3-D image representation is introduced, aiming to outgrow the framework of 3-D “image” communication and to open up a novel field of technology, which should be called “spatial” communication. In particular, the light-ray-based method has a wide range of applications, including efficient transmission of the physical world as well as integration of the virtual and physical worlds. key words: 3-D image coding, stereo images, multi-view images, panoramic images, volumetric images, holograms, display-independent representation, light rays, spatial communication

Journal Article
TL;DR: In this paper, the vector median-rational hybrid filter (VMRHF) is introduced for multispectral image processing; it exploits the features of the vector median filter and the vector rational operator (VRF).
Abstract: In this paper, a novel filter structure is introduced for multispectral image processing: the vector median-rational hybrid filter (VMRHF), which constitutes an extension of the nonlinear rational-type hybrid filters called median-rational hybrid filters (MRHFs) recently introduced for 1-D and 2-D signal processing. The VMRHF is a two-stage filter that effectively exploits the features of the vector median filter (VM) and those of the vector rational operator (VRF). Experimental results show that the new VMRHF significantly outperforms widely known nonlinear filters for multispectral image processing, such as the vector median filter and the class of directional distance (DD) filters, for all criteria used. key words: color image processing, vector rational filters, vector median filters, vector median-rational hybrid filters
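The first (vector median) stage can be sketched directly from its definition: the output is the sample vector minimizing the sum of distances to all other vectors in the window. This is a sketch of the VM stage only, not the full VMRHF:

```python
import numpy as np

def vector_median(window):
    """Vector median of a set of color vectors: the sample minimizing
    the sum of L2 distances to all samples in the window (the VM stage
    of the hybrid filter)."""
    pts = np.asarray(window, dtype=float)
    # pairwise distance matrix via broadcasting
    dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    return pts[dists.sum(axis=1).argmin()]

# an impulsive color outlier in a window of near-gray pixels is rejected
window = [[100, 100, 100]] * 8 + [[255, 0, 0]]
print(vector_median(window))  # → [100. 100. 100.]
```

Because the output is always one of the input vectors, the filter never invents colors, which is why vector (rather than per-channel) medians are preferred for multispectral data.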


Journal Article
TL;DR: A new dynamic programming (DP) based algorithm for monotonic and continuous two-dimensional warping (2DW) is presented, which searches for the optimal pixel-to-pixel mapping between a pair of images subject to monotonicity and continuity constraints with far less time complexity.
Abstract: A new dynamic programming (DP) based algorithm for monotonic and continuous two-dimensional warping (2DW) is presented. This algorithm searches for the optimal pixel-to-pixel mapping between a pair of images subject to monotonicity and continuity constraints with far less time complexity than the algorithm previously reported by the authors. This complexity reduction results from a refinement of the multi-stage decision process representing the 2DW problem. As an implementation technique, a polynomial-order approximation algorithm incorporating beam search is also presented. Theoretical and experimental comparisons show that the present approximation algorithm yields better performance than the previous one. key words: two-dimensional warping, image matching, dynamic programming, correspondence optimization, Markovian process formulation
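The one-dimensional analogue of this DP formulation is classic dynamic time warping, where the allowed DP moves enforce exactly the monotonicity and continuity constraints mentioned above. A minimal sketch (not the authors' 2-D algorithm, which is far more involved):

```python
def dtw(a, b):
    """1-D analogue of DP-based warping: optimal monotonic, continuous
    alignment cost between sequences a and b (classic dynamic time
    warping with unit moves)."""
    INF = float("inf")
    n, m = len(a), len(b)
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # moves (i-1,j), (i,j-1), (i-1,j-1) keep the mapping monotonic
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]

print(dtw([1, 2, 3], [1, 2, 2, 3]))  # → 0.0
```

In two dimensions the "state" becomes an entire column mapping rather than a single index, which is why the naive extension explodes combinatorially and the paper's refined decision process matters.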

Journal Article
TL;DR: This paper surveys recent research trends in the processing of face images by computer and their typical applications, considering the various characteristics of faces.
Abstract: Human faces convey various information, including information specific to each individual person and information that forms part of mutual communication among persons. The information exhibited by a face is what is called “non-verbal information,” which verbal media usually cannot describe appropriately. Recently, detailed studies on the processing of face images by computer have been carried out in the engineering field for applications to communication media and human-computer interaction as well as automatic identification of human faces. The two main technical topics are the recognition of human faces and the synthesis of face images. The objective of the former is to enable a computer to detect and identify users and further to recognize their facial expressions, while that of the latter is to provide a natural and impressive user interface on a computer in the form of a “face.” These studies have also been found to be useful in various non-engineering fields related to the face, such as psychology, anthropology, cosmetology and dentistry. Most of the studies in these different fields have been carried out independently up to now, although all of them deal with a “face.” Thanks to progress in the above engineering technologies, common study tools and databases for facial information have now become available. Against this background, this paper surveys recent research trends in the processing of face images by computer and their typical applications. Firstly, the various characteristics of faces are considered. Secondly, recent research activities in the recognition and synthesis of face images are outlined. Thirdly, the applications of digital processing methods for facial information are discussed from several standpoints: intelligent image coding, media handling, human-computer interaction, caricature, facial impression, and psychological and medical applications.
The common tools and databases used in studies of the processing of facial information, and some related topics, are also described. key words: face, facial information, recognition of face images, synthesis of face images, applications, tools and databases

Journal Article
TL;DR: This survey discusses aspects of this research field and reviews some recent advances including video-rate range imaging sensors as well as emerging themes and applications.
Abstract: Acquisition of three-dimensional information of a real-world scene from two-dimensional images has been one of the most important issues in computer vision and image understanding in the last two decades. Noncontact range acquisition techniques can be essentially classified into two classes: Passive and active. This paper concentrates on passive depth extraction techniques which have the advantage that 3-D information can be obtained without affecting the scene. Passive range sensing techniques are often referred to as shape-from-x, where x is one of visual cues such as shading, texture, contour, focus, stereo, and motion. These techniques produce 2.5-D representations of visible surfaces. This survey discusses aspects of this research field and reviews some recent advances including video-rate range imaging sensors as well as emerging themes and applications.

Journal Article
TL;DR: The final result uses the a priori correlation information on the original function ensemble to devise an efficient sampling scheme which, when used in conjunction with the learning scheme described here, is shown to result in optimal generalization.
Abstract: In this paper, we discuss the problem of active training data selection for improving the generalization capability of a neural network. We look at the learning problem from a function approximation perspective and formalize it as an inverse problem. Based on this framework, we analytically derive a method of choosing a training data set optimized with respect to the Wiener optimization criterion. The final result uses the a priori correlation information on the original function ensemble to devise an efficient sampling scheme which, when used in conjunction with the learning scheme described here, is shown to result in optimal generalization. This result is substantiated through a simulated example and a learning problem in a high-dimensional function space. key words: active learning, Wiener optimization criterion, generalization, inverse problem, training data selection
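The flavor of the approach can be illustrated with a greedy numerical stand-in: given a prior covariance over function values at candidate inputs, pick the training points whose observation most reduces total posterior variance. The paper's Wiener-criterion derivation is analytic; the greedy scheme and names below are purely an assumption-laden illustration:

```python
import numpy as np

def select_points(K, n_pick, noise=1e-3):
    """Greedy training-point selection: given prior covariance K over
    candidate input points, repeatedly pick the point whose (noisy)
    observation most reduces total posterior variance."""
    K = K.copy()
    chosen = []
    for _ in range(n_pick):
        # variance reduction from observing j: sum_i K[i,j]^2 / (K[j,j]+noise)
        gain = (K ** 2).sum(axis=0) / (np.diag(K) + noise)
        gain[chosen] = -np.inf                 # never re-pick a point
        j = int(gain.argmax())
        chosen.append(j)
        # rank-one posterior covariance update after observing point j
        K = K - np.outer(K[:, j], K[:, j]) / (K[j, j] + noise)
    return chosen
```

With two nearly coincident candidates and one isolated candidate, the scheme picks one representative of the pair and then the isolated point, i.e. it exploits the prior correlation to avoid redundant samples, which is the intuition behind correlation-aware sampling.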


Journal Article
TL;DR: The generalized learning vector quantization (GLVQ) algorithm is applied to design a handwritten Chinese character recognition system that is designed to extract discriminative features to enhance the recognition performance.
Abstract: In this paper, the generalized learning vector quantization (GLVQ) algorithm is applied to design a handwritten Chinese character recognition system. The system proposed herein consists of two modules: feature transformation and recognizer. The feature transformation module is designed to extract discriminative features to enhance the recognition performance. The initial feature transformation matrix is obtained by using Fisher’s linear discriminant (FLD) function. A template matching recognizer with a minimum distance criterion is used, and each character is represented by one reference template. These reference templates and the elements of the feature transformation matrix are trained by using the generalized learning vector quantization algorithm. In the experiments, 540,100 (5,401 × 100) handwritten Chinese character samples are used to build the recognition system, and another 540,100 samples are used for the open test. A good performance of 92.18% accuracy is achieved by the proposed system. key words: handwritten Chinese character recognition, generalized learning vector quantization, Fisher’s linear discriminant, feature transformation
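The core GLVQ update can be sketched as follows: the nearest correct reference template is pulled toward the sample and the nearest incorrect one pushed away, weighted by factors from the relative distance mu = (d1 - d2) / (d1 + d2). This is a simplified sketch (the sigmoid scaling of the original formulation is omitted), not the paper's full training procedure:

```python
import numpy as np

def glvq_step(x, label, protos, proto_labels, lr=0.1):
    """One simplified generalized LVQ update on the reference templates.
    protos: (n_protos, dim) array; proto_labels: (n_protos,) array."""
    d = np.linalg.norm(protos - x, axis=1)
    same = np.where(proto_labels == label)[0]
    diff = np.where(proto_labels != label)[0]
    w1 = same[d[same].argmin()]             # nearest correct template
    w2 = diff[d[diff].argmin()]             # nearest incorrect template
    d1, d2 = d[w1], d[w2]
    s = d1 + d2
    # gradient factors of mu with respect to each prototype
    protos[w1] += lr * (d2 / s**2) * (x - protos[w1])   # pull in
    protos[w2] -= lr * (d1 / s**2) * (x - protos[w2])   # push away
    return protos
```

In the paper this same rule also propagates back through the feature transformation matrix, so the FLD-initialized features and the templates are optimized jointly.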



Journal Article
TL;DR: Two models of valuation, interval and paired probabilities, are proposed, and it is shown that the valuation corresponding to the set operations ∩ (intersection), ∪ (union) and ∼ (complement) can be described by the truth-functional ∧ (AND), ∨ (OR) and ∼ (negation) operations in both models.
Abstract: When the degree of the intersection A∩B of events A and B is unknown, a problem arises: how to evaluate the probabilities P(A ∩ B) and P(A ∪ B) from P(A) and P(B). To treat related problems, two models of valuation, interval and paired probabilities, are proposed. It is shown that the valuation corresponding to the set operations ∩ (intersection), ∪ (union) and ∼ (complement) can be described by the truth-functional ∧ (AND), ∨ (OR) and ∼ (negation) operations in both models. The probabilistic AND and OR operations are represented by combinations of Kleene and Łukasiewicz operations, and satisfy the axioms of MV (multiple-valued logic) algebra except the complementary laws. key words: interval probability
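With no information about the joint distribution, the tightest bounds on these probabilities are the classical Fréchet bounds, whose endpoints are exactly the Kleene (min/max) and Łukasiewicz (bounded sum/difference) operations; a minimal sketch of the interval model:

```python
def and_interval(pa, pb):
    """Frechet bounds on P(A and B) given only P(A), P(B): the lower
    endpoint is the Lukasiewicz t-norm, the upper the Kleene min."""
    return max(0.0, pa + pb - 1.0), min(pa, pb)

def or_interval(pa, pb):
    """Dual bounds on P(A or B): Kleene max below, bounded sum above."""
    return max(pa, pb), min(1.0, pa + pb)

def not_point(pa):
    """Complement is a single value: P(not A) = 1 - P(A)."""
    return 1.0 - pa

print(and_interval(0.75, 0.5))  # → (0.25, 0.5)
print(or_interval(0.75, 0.5))   # → (0.75, 1.0)
```

The interval endpoints are achieved by maximally negatively and maximally positively correlated events, which is why no truth-functional single value, only an interval, can be assigned.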

Journal Article
TL;DR: A neuron-MOS transistor (νMOS) is applied to current-mode multi-valued logic (MVL) circuits and a threshold detector and a quaternary T-gate using the proposed νMOS current mirrors are proposed.
Abstract: A neuron-MOS transistor (νMOS) is applied to current-mode multi-valued logic (MVL) circuits. First, a novel low-voltage and low-power νMOS current mirror is presented. Then, a threshold detector and a quaternary T-gate using the proposed νMOS current mirrors are proposed. The minimum output voltage of the νMOS current mirror is decreased by VT (threshold voltage), compared with the conventional double cascode current mirror. The νMOS threshold detector is built on a νMOS current comparator originally composed of νMOS current mirrors. It has a high output swing and sharp transfer characteristics. The gradient of the proposed comparator output in the transfer region can be increased 6.3-fold compared with that in the conventional comparator. Along with improved operation of the novel current comparator, the discriminative ability of the proposed νMOS threshold detector is also increased. The performances of the proposed circuits are validated by HSPICE with Motorola 1.5 μm CMOS device parameters. Furthermore, the operation of a νMOS current mirror is also confirmed through experiments on test chips fabricated by VDEC. The active area of the proposed νMOS current mirror is 63 μm × 51 μm. key words: neuron-MOS transistor, multi-valued logic, current-mode circuit, current mirror, current comparator, threshold detector, T-gate, integrated circuit



Journal Article
TL;DR: Different applications of image processing to vehicles are examined, including design and inspection; image sensing for traffic control and vehicle control; vehicle detection; speed detection; distance measurement; obstacle detection; and lane detection.
Abstract: This paper examines different applications of image processing to vehicles. The applications discussed include design and inspection; image sensing for traffic control and vehicle control; vehicle detection; speed detection; distance measurement; obstacle detection; and lane detection.



Journal Article
TL;DR: A clustering-based method is proposed for automatically constructing a multi-input Takagi-Sugeno (TS) fuzzy model when only the input-output data of the identified system are available.
Abstract: In this paper, a clustering-based method is proposed for automatically constructing a multi-input Takagi-Sugeno (TS) fuzzy model where only the input-output data of the identified system are available. The TS fuzzy model is automatically generated by the processes of structure identification and parameter identification. In the structure identification step, a clustering method is proposed to provide a systematic procedure for partitioning the input space, so that the number of fuzzy rules and the shapes of the fuzzy sets in the premise part are determined from the given input-output data. In the parameter identification step, the recursive least-squares algorithm is applied to choose the parameter values in the consequent part from the given input-output data. Finally, two examples are used to illustrate the effectiveness of the proposed method. key words: fuzzy modeling, data clustering, recursive least-squares algorithm
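The parameter identification step can be sketched with a standard recursive least-squares update, which fits the linear consequent coefficients from streaming input-output pairs. This is a generic RLS sketch; the rule-membership weighting of the actual TS consequent estimation is omitted:

```python
import numpy as np

def rls_update(theta, P, x, y, lam=1.0):
    """One recursive least-squares step for linear consequent parameters:
    theta are the coefficients, P the inverse correlation matrix,
    (x, y) the new regressor/output pair, lam a forgetting factor."""
    x = x.reshape(-1, 1)
    k = P @ x / (lam + x.T @ P @ x)                       # gain vector
    theta = theta + (k * (y - x.T @ theta.reshape(-1, 1))).ravel()
    P = (P - k @ x.T @ P) / lam
    return theta, P

# fit y = 2*x1 + 3*x2 from streaming noise-free data
rng = np.random.default_rng(0)
theta = np.zeros(2)
P = np.eye(2) * 1e6            # large initial P = weak prior on theta
for _ in range(200):
    x = rng.standard_normal(2)
    y = 2 * x[0] + 3 * x[1]
    theta, P = rls_update(theta, P, x, y)
print(np.round(theta, 3))      # theta converges to [2, 3]
```

Because each update is O(dim^2) with no matrix inversion, the consequent parameters can be refined on-line as new input-output data arrive, which fits the identification setting of the paper.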