
Showing papers in "Chinese Journal of Computers in 2003"


Journal Article
TL;DR: This paper makes a deep study of the reasons for the algorithms' inefficiency, analyzes the properties of the indiscernibility relation, proposes and proves an equivalent and efficient method for computing the positive region, and designs a complete algorithm for attribute reduction.
Abstract: This paper makes a deep study of the reasons for the algorithms' inefficiency, focusing on two important concepts: the indiscernibility relation and the positive region. It analyzes the properties of the indiscernibility relation and proposes and proves an equivalent, efficient method for computing the positive region. On this basis, some efficient basic algorithms for rough set methods are introduced, with a detailed analysis of their time complexity and a comparison with existing algorithms. Furthermore, the paper studies the incremental computation of the positive region. Based on the above results, a complete algorithm for attribute reduction is designed, its completeness is proved, and its time and space complexity are analyzed in detail. To test the efficiency of the algorithm, experiments are conducted on data sets from the UCI machine learning repository. Theoretical analysis and experimental results show that the reduction algorithm is more efficient than the existing algorithms.
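To make the positive-region concept concrete, here is a minimal, hedged sketch in Python (not the paper's equivalent computation method): it groups objects by their condition-attribute values and keeps the objects whose equivalence class carries a single decision value. The decision-table layout and attribute names are hypothetical.

```python
from collections import defaultdict

def positive_region(table, condition_attrs, decision_attr):
    """Return the indices of objects in POS_C(D): objects whose
    condition-attribute equivalence class has a single decision value."""
    # Group object indices by their tuple of condition-attribute values.
    classes = defaultdict(list)
    for i, obj in enumerate(table):
        key = tuple(obj[a] for a in condition_attrs)
        classes[key].append(i)
    # Keep only consistent classes (one decision value inside the class).
    pos = []
    for members in classes.values():
        decisions = {table[i][decision_attr] for i in members}
        if len(decisions) == 1:
            pos.extend(members)
    return sorted(pos)

# Toy decision table (hypothetical data).
table = [
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 1, "d": "no"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 0, "b": 1, "d": "no"},   # conflicts with the previous object
]
print(positive_region(table, ["a", "b"], "d"))  # -> [0, 1, 2]
```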

98 citations


Journal Article
TL;DR: The main objective of this paper is to introduce a new concept of knowledge reduction in inconsistent systems, referred to as maximum distribution reduction, which preserves all maximum decision rules.
Abstract: Knowledge reduction is one of the most important problems in rough set theory, and many types of knowledge reduction have been studied in the rough set area, most of which require the classification to be consistent. However, most information systems are not consistent, owing to factors such as noise in the data, compact representation, and prediction capability. To acquire brief decision rules from inconsistent systems, suitable knowledge reductions are needed. The main objective of this paper is to introduce a new concept of knowledge reduction for inconsistent systems, referred to as maximum distribution reduction, which preserves all maximum decision rules. The maximum distribution reduction relaxes the harsh requirements of the distribution reduction and overcomes the drawback of the possible reduction, namely that the derived decision rules may be incompatible with those derived from the original system. The relationships among distribution reduction, maximum distribution reduction, approximate reduction and assignment reduction are examined. Judgment theorems and discernibility matrices with respect to these reductions are obtained, which provide new approaches to knowledge reduction in inconsistent information systems.

81 citations


Journal Article
TL;DR: The problem of calculating the core attributes of a decision table is studied; definitions of core attributes in the algebra and information views are examined and the difference between these two views is identified.
Abstract: The problem of calculating the core attributes of a decision table is studied. Some errors and limitations in some former results by Hu and Ye are analyzed. Definitions of core attributes in the algebra and information views are studied and the difference between these two views is discovered. An algorithm for calculating the core attributes of a decision table is also presented.

80 citations


Journal Article
TL;DR: A family of interconnection networks called bijection connected (BC) graphs is proposed that properly contains the crossed cubes and the Möbius cubes and possesses the same logarithmic diameters and node degrees, and the same (highest) connectivity and diagnosability, as the hypercubes.
Abstract: This paper proposes a family of interconnection networks called bijection connected (BC) graphs, which properly contains the crossed cubes and the Möbius cubes and possesses the same logarithmic diameters and node degrees, and the same (highest) connectivity (fault tolerance) and diagnosability, as the hypercubes. This unifies the study of some properties of the hypercube and of a great many interconnection networks similar to it in structure. In addition, the paper proves that the BC interconnection network family contains a class of Hamilton-connected graphs, and offers a conjecture about the diameters of the graphs in the family.

78 citations


Journal Article
TL;DR: A hierarchical self-organizing neural network model and its application to learning trajectory distribution patterns for event recognition are presented; both local and global anomaly detection as well as object behavior prediction are considered.
Abstract: This paper presents a hierarchical self-organizing neural network model and its application to learning trajectory distribution patterns for event recognition. By linking the side neurons, lines are formed, and each line becomes an internal net of the hierarchical self-organizing neural network. Corresponding to this model, the authors define two neighborhoods, namely the neuron neighborhood and the internal-net neighborhood. The neurons in both neighborhoods update their weights to different extents; in this way, the trajectory distribution patterns can be learned. Using the learned patterns, the authors consider both local and global anomaly detection as well as object behavior prediction. Experimental results demonstrate the effectiveness of the approach to trajectory analysis.
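The paper's hierarchical model with internal nets is not reproduced here; as a hedged illustration of the self-organizing update it builds on, the sketch below performs one plain SOM step on a 1-D chain of neurons: the best-matching neuron and its neighbors are pulled toward the input, with a neighborhood kernel controlling how far the update spreads. All sizes and learning parameters are made up.

```python
import numpy as np

def som_update(weights, x, lr=0.1, sigma=1.0):
    """One plain SOM step on a 1-D chain of neurons: move the winner and
    its neighbors toward input x, weighted by a Gaussian neighborhood."""
    dists = np.linalg.norm(weights - x, axis=1)
    winner = int(np.argmin(dists))                         # best-matching neuron
    idx = np.arange(len(weights))
    h = np.exp(-((idx - winner) ** 2) / (2 * sigma ** 2))  # neighborhood kernel
    weights += lr * h[:, None] * (x - weights)             # pull toward the input
    return winner

# Learn a coarse 1-D representation of 2-D trajectory points (toy data).
rng = np.random.default_rng(0)
points = rng.normal(size=(500, 2))
weights = rng.normal(size=(10, 2))
for x in points:
    som_update(weights, x)
print(weights.round(2))
```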

34 citations


Journal Article
TL;DR: This paper focuses on the motivation behind and theoretical foundations of principal curves, surveys alternative approaches and corresponding algorithms, and explores applications of principal curves in various fields.
Abstract: Principal curves are nonlinear generalizations of the first linear principal component, which can be thought of as an 'optimal' linear one-dimensional summarization of the data. They are 'self-consistent' smooth one-dimensional curves that pass through the middle of a multidimensional data set, and their theoretical foundation is the search for lower-dimensional non-Euclidean manifolds embedded in a multidimensional data space. This paper focuses on the motivation behind and theoretical foundations of principal curves, surveys alternative approaches and corresponding algorithms, and explores applications of principal curves in various fields. Finally, we analyze open problems concerning principal curves.

28 citations


Journal Article
TL;DR: A definition of a migrating workflow model that adopts the mobile computing paradigm is given, and then a framework for a migrating workflow system based on the dock, which consists of an anchorage with a local network, is proposed.
Abstract: The migrating workflow is a new direction within the workflow management area, but it has so far lacked an explicit definition and a mechanism for managing the migrating area. For this reason, this paper first gives a definition of a migrating workflow model that adopts the mobile computing paradigm, and then proposes a framework for a migrating workflow system based on the dock, which consists of an anchorage with a local network. Some key techniques of the migrating workflow system based on this framework, such as the construction of anchorages and migrating instances, the organization and management of the migrating area, the path searching method, and the lifetime of a migrating instance, are also discussed. Finally, we give an example to demonstrate that the definition is sound and that the system is easy to construct and manage.

28 citations


Journal Article
TL;DR: In this article, a new fisheye lens distortion correction method using a spherical perspective projection (SPP) constraint is proposed, where a space line must be projected onto a great circle on the unit sphere under the SPP model.
Abstract: Fisheye lenses are often used to enlarge the field of view of a camera, but images taken with a fisheye camera often have severe distortions. This paper proposes a new fisheye lens distortion correction method using a spherical perspective projection (SPP) constraint. Space lines are generally projected into curves on a fisheye image. The SPP constraint means that a space line must be projected onto a great circle on the unit sphere under the SPP model. The SPP model is used because it can handle view angles larger than 180 degrees, which is exactly the fisheye case. The proposed distortion correction process consists of two main steps: first, polynomial models for radial and tangential distortions with free parameters are selected; second, these parameters are estimated based on the fact that the projected image curves of space lines should be mapped to great circles on the unit sphere, i.e., the SPP constraint. The parameters are obtained by minimizing the sum of squared spherical distances from the mapped points to their corresponding best-fit great circles. The minimization also consists of two steps: first, an initial estimate is obtained by linearizing the two distortion models; then the Levenberg-Marquardt algorithm is employed for the final non-linear optimization. Finally, experimental results with synthetic data under different noise levels as well as with real fisheye images are reported, and the results appear satisfactory.
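As a rough, hedged illustration of the SPP constraint step only (not the paper's distortion models or their two-step estimation), the sketch below fits the best great circle to points on the unit sphere by minimizing their spherical distances to the circle with scipy's least_squares; the pole parameterization and the synthetic data are assumptions.

```python
import numpy as np
from scipy.optimize import least_squares

def great_circle_residuals(angles, pts):
    """Spherical distances from unit-sphere points to the great circle
    whose pole is the unit vector given by (theta, phi)."""
    theta, phi = angles
    n = np.array([np.sin(theta) * np.cos(phi),
                  np.sin(theta) * np.sin(phi),
                  np.cos(theta)])                  # pole of the great circle
    return np.arcsin(np.clip(pts @ n, -1.0, 1.0))  # 0 when a point lies on the circle

# Toy data: noisy points near the great circle z = 0 (pole = +z axis).
rng = np.random.default_rng(1)
t = rng.uniform(0, 2 * np.pi, 100)
pts = np.column_stack([np.cos(t), np.sin(t), 0.02 * rng.normal(size=100)])
pts /= np.linalg.norm(pts, axis=1, keepdims=True)  # project back to the sphere

fit = least_squares(great_circle_residuals, x0=[0.3, 0.3], args=(pts,))
theta, phi = fit.x
pole = [np.sin(theta) * np.cos(phi), np.sin(theta) * np.sin(phi), np.cos(theta)]
print(np.round(pole, 3))   # roughly [0, 0, 1] (up to sign)
```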

23 citations


Journal Article
TL;DR: It is shown that each subset is often expected to have a different prototype (or cluster center) from the others when the data set is clustered into c (c > 1) subsets in general cases, and it is proved that the optimal choice of the fuzziness index m depends on the data set itself.
Abstract: The fuzzy c-means algorithm (FCM) is a widely used clustering algorithm. It is well known that the fuzziness index m has a significant impact on the performance of the FCM. However, how to select an appropriate fuzziness index m in theory when implementing the FCM remains an open problem. In this paper, we point out that in general, when the data set is clustered into c (c > 1) subsets, each subset is expected to have a different prototype (or cluster center) from the others. But the FCM has a trivial solution: the mass center of the data set. Under the above assumption, the mass center of the data set is not expected to be stable. We derive a simpler criterion to judge whether the trivial solution of the FCM is stable or not. As this criterion is related to the fuzziness index m, we also prove that the optimal choice of the fuzziness index m depends on the data set itself. Therefore, a theoretical approach to choosing an appropriate fuzziness index m is obtained. Finally, we carry out numerical experiments to verify whether our method is effective. The experimental results show that the derived rules are effective.
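A minimal FCM sketch makes the role of the fuzziness index m concrete (this is the standard alternating update, not the paper's stability criterion); the membership and center formulas below are the usual ones, and the data are synthetic.

```python
import numpy as np

def fcm(X, c, m=2.0, iters=100, seed=0):
    """Minimal fuzzy c-means: returns cluster centers and memberships."""
    rng = np.random.default_rng(seed)
    n = len(X)
    U = rng.random((c, n))
    U /= U.sum(axis=0)                      # memberships sum to 1 per sample
    for _ in range(iters):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)        # center update
        d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2.0 / (m - 1)))
        U /= U.sum(axis=0)                  # standard FCM membership update
    return V, U

# Two well-separated blobs; the centers should not collapse to the mass center.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
V, U = fcm(X, c=2, m=2.0)
print(V.round(2))               # roughly [0, 0] and [3, 3]
print(X.mean(axis=0).round(2))  # the trivial solution (mass center) would be ~[1.5, 1.5]
```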

21 citations


Journal Article
TL;DR: The main techniques and theoretical results for Pareto-optimal-based evolutionary approaches are introduced, mainly focusing on preference-based individual ordering, fitness assignment, fitness sharing and niche size setting.
Abstract: Multi-objective optimization (MOO) has become an important research area of evolutionary computation in recent years, and current research focuses on Pareto-optimal-based evolutionary MOO approaches. Evolutionary MOO techniques are used to find the non-dominated set of solutions and to distribute them uniformly along the Pareto front. After comparing and analyzing the development of evolutionary MOO techniques, this paper takes the multi-objective genetic algorithm as an example and introduces the main techniques and theoretical results for Pareto-optimal-based evolutionary approaches, mainly focusing on preference-based individual ordering, fitness assignment, fitness sharing and niche size setting. In addition, some problems that deserve further study are addressed.
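Since these approaches all revolve around the non-dominated set, a small hedged sketch of the underlying Pareto dominance check may help; it is the naive O(n^2) test over objective vectors (minimization assumed), not any particular algorithm from the survey.

```python
import numpy as np

def non_dominated(F):
    """Return a boolean mask of the non-dominated rows of F, where each row is
    an objective vector and every objective is to be minimized."""
    n = len(F)
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            # j dominates i: no worse in all objectives, strictly better in one.
            if i != j and np.all(F[j] <= F[i]) and np.any(F[j] < F[i]):
                mask[i] = False
                break
    return mask

F = np.array([[1.0, 4.0], [2.0, 2.0], [3.0, 1.0], [3.0, 3.0]])
print(non_dominated(F))   # [ True  True  True False ]: (3,3) is dominated by (2,2)
```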

21 citations


Journal Article
TL;DR: A three-dimensional workflow model comprising three sub-models, i.e., an organization model, a data model and a process model, each of which describes some attributes of workflows from a different perspective, is proposed.
Abstract: Workflow is the automation of a business process, in whole or in part. To achieve workflow management, the business process must be abstracted from the real world and described by a formal method; the result is a workflow model. This paper mainly discusses workflow models and their formal descriptions. Based on an analysis of business processes in the real world, the paper proposes a three-dimensional workflow model comprising three sub-models, i.e., an organization model, a data model and a process model, each of which describes some attributes of workflows from a different perspective. These sub-models and their relationships are addressed in detail. Furthermore, a formal description of the proposed three-dimensional workflow model is presented.

Journal Article
TL;DR: A novel K-means clustering algorithm based on immune programming is proposed that not only avoids local optima and is robust to initialization, but also increases the convergence speed.
Abstract: This paper proposes a novel K-means clustering algorithm based on immune programming, after analyzing the advantages and disadvantages of the classical K-means clustering algorithm. Theoretical analysis and experimental results show that the algorithm not only avoids local optima and is robust to initialization, but also increases the convergence speed.

Journal Article
TL;DR: A novel method for tolerating up to two disk failures in disk arrays is presented by representing a check group consisting of data and parity units as a graph.
Abstract: A novel method for tolerating up to two disk failures in disk arrays is presented. By representing a check group consisting of data and parity units as a graph, the conditions for tolerating two disk failures in disk arrays reduce to conditions on partitions of the check group, and thus to decompositions of its graph. A necessary and sufficient condition for the partition of a check group is proved; the existence of the partition is established; the condition for optimizing the performance of the placement scheme is discussed; and the steps for placing data and parity units in a disk array are shown. This yields an efficient method for constructing performance-optimized placement schemes that tolerate two disk failures in disk arrays.

Journal Article
TL;DR: This paper proposes a simple and efficient motion-based gait recognition algorithm using spatio-temporal silhouette analysis that can implicitly capture the structural and transitional characteristics of gait, especially biometric shape cues.
Abstract: This paper proposes a simple and efficient motion-based gait recognition algorithm using spatio-temporal silhouette analysis. For each image sequence, an improved background subtraction algorithm and a simple correspondence procedure are first used to segment and track the moving silhouette of a walking figure against the background. Then, an eigenspace transformation based on traditional principal component analysis (PCA) is applied to the time-varying distance signals derived from the sequence of silhouette images to reduce the dimensionality of the input feature space. Supervised pattern classification techniques are finally applied in the lower-dimensional eigenspace for recognition. This method can implicitly capture the structural and transitional characteristics of gait, especially biometric shape cues. Extensive experimental results on outdoor image sequences demonstrate that the proposed algorithm achieves encouraging recognition performance with relatively low computational cost.
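The silhouette extraction itself is not reproduced; as a hedged sketch of the eigenspace-plus-classifier part of the pipeline, the snippet below projects synthetic stand-in "distance signals" with PCA and classifies them in the reduced space using scikit-learn. The data shapes and the choice of a k-nearest-neighbor classifier are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical data: each row is an unwrapped silhouette distance signal
# (centroid-to-boundary distances), with a subject label.
rng = np.random.default_rng(3)
n_subjects, n_samples, signal_len = 4, 30, 360
X = np.vstack([rng.normal(loc=s, scale=0.5, size=(n_samples, signal_len))
               for s in range(n_subjects)])
y = np.repeat(np.arange(n_subjects), n_samples)

# Eigenspace transformation: project the high-dimensional signals onto the
# leading principal components, then classify in the reduced space.
pca = PCA(n_components=8).fit(X)
Z = pca.transform(X)
clf = KNeighborsClassifier(n_neighbors=3).fit(Z, y)
print("training accuracy:", clf.score(Z, y))
```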

Journal Article
TL;DR: Evolutionary computing can quickly generate a group of DES S-boxes that are more secure than the original ones, and based on this group of increasingly secure S-boxes an evolutionary cryptosystem of the DES type can be constructed.
Abstract: This paper proposes the concept of evolutionary cryptosystems and an evolutionary method for designing cryptosystems. With evolutionary computing we can quickly generate a group of DES S-boxes that are more secure than the original ones. Furthermore, based on this group of increasingly secure S-boxes, we can construct an evolutionary cryptosystem of the DES type. In general, evolutionary cryptosystems offer stronger security than the usual ones.

Journal Article
TL;DR: The process of chunking can be regarded as a classification problem trained from a corpus with chunk tags and POS tags, and the focus of the ME model is how to select useful features.
Abstract: This paper proposes to use a Maximum Entropy (ME) model for Chinese chunk parsing. First, we define Chinese chunks and list all chunk categories and tags used in the model. The process of chunking can thus be regarded as a classification problem trained from a corpus with chunk tags and POS tags. The focus of the ME model is how to select useful features, so the procedure and algorithms for feature selection are then introduced. Finally, we test the model and report experimental results.
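As a hedged sketch of the classification view of chunking: a maximum entropy classifier is equivalent to multinomial logistic regression, so the snippet below trains one on per-token feature dictionaries (current, previous and next POS tag) with BIO-style chunk tags. The feature templates, tag set and toy sentences are invented, not the paper's.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy feature template: the current POS tag plus its neighbours.
def features(pos_tags, i):
    return {"pos": pos_tags[i],
            "prev": pos_tags[i - 1] if i > 0 else "<s>",
            "next": pos_tags[i + 1] if i + 1 < len(pos_tags) else "</s>"}

sentences = [(["n", "v", "n", "n"], ["B-NP", "O", "B-NP", "I-NP"]),
             (["n", "n", "v", "n"], ["B-NP", "I-NP", "O", "B-NP"])]
X = [features(pos, i) for pos, tags in sentences for i in range(len(pos))]
y = [t for _, tags in sentences for t in tags]

# Multinomial logistic regression is the standard maximum-entropy classifier.
model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
model.fit(X, y)
print(model.predict([features(["n", "n", "v"], 1)]))   # likely ['I-NP']
```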

Journal Article
TL;DR: An improved SVM, NN-SVM, is presented: it first prunes the training set, reserving or deleting each sample according to whether its nearest neighbor has the same class label as itself, and then trains an SVM on the new set to obtain a classifier.
Abstract: A support vector machine constructs an optimal hyperplane from a small set of samples near the boundary. This makes it sensitive to these specific samples and tends to result in machines that are either too complex, with poor generalization ability, or too imprecise, with high training error, depending on the kernel parameters. An SVM focuses on the samples near the boundary during training, and samples intermixed with another class are usually of no help in improving the classifier's performance; instead, they may greatly increase the computational burden, and their presence may lead to overlearning and decrease the generalization ability. To improve the generalization ability, we present an improved SVM: NN-SVM. It first prunes the training set, reserving or deleting each sample according to whether its nearest neighbor has the same class label as itself, and then trains an SVM on the new set to obtain a classifier. Experimental results show that NN-SVM is better than SVM in both speed and classification accuracy.
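The pruning rule is simple enough to sketch directly (using scikit-learn and synthetic two-moons data, which are assumptions): drop every training sample whose nearest neighbor carries a different label, then train a standard SVM on what remains.

```python
from sklearn.datasets import make_moons
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

# Pruning step: keep only samples whose nearest neighbour (other than the
# sample itself) carries the same class label, then train a standard SVM.
X, y = make_moons(n_samples=400, noise=0.3, random_state=0)
nn = NearestNeighbors(n_neighbors=2).fit(X)   # neighbour 0 is the point itself
_, idx = nn.kneighbors(X)
keep = y[idx[:, 1]] == y                      # nearest neighbour agrees?
X_pruned, y_pruned = X[keep], y[keep]

svm_full = SVC(kernel="rbf").fit(X, y)
svm_nn = SVC(kernel="rbf").fit(X_pruned, y_pruned)
print("kept samples:", keep.sum(), "of", len(y))
print("full SVM SVs:", len(svm_full.support_), "NN-SVM SVs:", len(svm_nn.support_))
```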

Journal Article
TL;DR: A User Access Matrix based preferred browsing path algorithm is proposed; experiments show it is accurate and scalable, and suitable for E-business applications such as optimizing web sites or designing personalized services.
Abstract: Web logs contain a lot of user browsing information, and how to mine user browsing interest patterns is an important research topic. Based on an analysis of existing algorithms for mining user browsing patterns, user browsing interest and intention can be represented accurately by comparing the relative access ratio and the average relative access ratio, and a support-preference measure can be used for mining preferred user browsing paths. Following this idea, we propose a User Access Matrix based preferred browsing path algorithm. First, a URL-URL matrix is built from the web logs according to the web site's browsing paths, with referring URLs as rows, target URLs as columns, and path browsing frequencies as matrix elements. This URL-URL matrix is sparse and can be represented by a list of 3-tuples. Then, preferred browsing sub-paths are discovered from computations on this matrix. Finally, all the sub-paths are combined. Experiments show that the algorithm is accurate and scalable. It is suitable for E-business applications, such as optimizing a web site or designing personalized services.
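A hedged sketch of the matrix construction step only: the snippet builds the sparse URL-URL frequency matrix from (referrer, target) pairs and keeps the transitions whose frequency exceeds the row average, as a crude stand-in for the paper's support-preference measure; the log data and the threshold rule are assumptions.

```python
from collections import defaultdict

# Hypothetical click log: (referring URL, target URL) pairs.
log = [("/home", "/a"), ("/home", "/a"), ("/home", "/b"),
       ("/a", "/c"), ("/a", "/c"), ("/a", "/d"), ("/b", "/c")]

# Sparse URL-URL matrix: rows are referrers, columns are targets,
# entries are browsing frequencies (stored as a dict of dicts).
matrix = defaultdict(lambda: defaultdict(int))
for ref, tgt in log:
    matrix[ref][tgt] += 1

# Keep transitions whose frequency exceeds the row average,
# a crude stand-in for the paper's support-preference measure.
preferred = []
for ref, row in matrix.items():
    total = sum(row.values())
    avg = total / len(row)
    for tgt, freq in row.items():
        if freq > avg:
            preferred.append((ref, tgt, round(freq / total, 2)))

print(preferred)   # [('/home', '/a', 0.67), ('/a', '/c', 0.67)]
```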

Journal Article
TL;DR: A classifier combination algorithm based on a multi-agent system is presented at the abstract level; its space complexity is lower than that of the Behavior Knowledge Space (BKS) method, and experiments show that the algorithm is convergent.
Abstract: In this paper, a classifier combination algorithm based on a multi-agent system is presented at the abstract level. The problem is modeled as people tracing back to their homeland: once upon a time, people from a region left their homeland and settled in different places; decades later, their offspring set out to find the homeland according to widespread legends of their ancestors' origin. In this combination problem, the class label of a testing sample serves as the homeland, the decisions made by classifiers serve as the offspring's places of residence, and the legends are the class credibilities of the classifiers acquired from the combination training set. Messengers sent by the offspring try to trace back to the homeland according to the legends. They act as agents and exchange information with one another, so that the confidences of different places being the original place change gradually. After congruence among the messengers is achieved, the combination decision is made. The co-decision matrix is used for information exchange between agents, so the relativity between classifiers is utilized, which is rarely considered in the Bayesian rule. According to experiments on a standard database, when the number of classifiers used in combination is small, this algorithm leads to fewer errors than other methods, and its space complexity is lower than that of the Behavior Knowledge Space (BKS) method. Experiments also show that the algorithm is convergent.

Journal Article
TL;DR: A novel classification method for data mining, the Organizational CoEvolutionary algorithm for Classification (OCEC), is proposed; it differs from the available GA-based classification methods and can achieve higher prediction accuracy with a smaller number of rules.
Abstract: A novel classification method for data mining, the Organizational CoEvolutionary algorithm for Classification (OCEC), is proposed in this paper; it is different from the available GA-based classification methods. The evolutionary operations of OCEC do not act on rules but directly on the given data, and rules are extracted from the final evolutionary results, which avoids generating meaningless rules during the evolutionary process. Three evolutionary operators (the add-and-subtract operator, the exchange operator and the unite operator) and a selection mechanism are developed for organizations. The fitness of an organization is then defined based on the importance of attributes, which is determined during evolution. OCEC is compared with other GA-based and non-GA-based classification algorithms on benchmark data sets from the UCI machine learning repository. Results show that the proposed algorithm can achieve higher prediction accuracy with a smaller number of rules. In addition, its performance is more stable, in that its prediction accuracy fluctuates within a very small range in experiments with the k-fold cross-validation method.

Journal Article
TL;DR: In this paper, the authors use a new approach, different from their predecessors', to study the P3P problem, and find that when the three control points form an isosceles triangle, there exist spatial regions such that when the camera lies in these regions, the formed P3P problem has a unique real solution.
Abstract: The multi-solution phenomenon of the P3P problem limits its applications in practice, and the results of predecessors are not effective in guiding the placement of the control points and the camera. We use a new approach, different from our predecessors', to study the P3P problem, and find that when the three control points form an isosceles triangle, there exist spatial regions such that when the camera lies in these regions, the real solution of the formed P3P problem can be obtained uniquely. The results we obtain are also very useful for placing the control points and the camera in practical applications.

Journal Article
TL;DR: A customer behavior analysis algorithm based on swarm intelligence is proposed that meets the demands of customer clustering and classification in customer relationship management and shows the advantages of visualization, self-organization and clusters with distinct characteristics.
Abstract: A customer behavior analysis algorithm based on swarm intelligence is proposed. First, customer consumption patterns are randomly projected onto a plane. Then, clustering analysis is performed by a swarm-intelligence-based clustering method with different swarm similarity coefficients. Finally, the customer clusters with various consumption characteristics are collected from the plane by a recursive algorithm. A parallel strategy is also proposed, which improves the scalability of the algorithm. Telecom mobile customer consumption data are used in the experiments, and the results are compared with those obtained by other clustering methods such as the k-means algorithm and self-organizing maps. The comparison shows that this swarm-intelligence-based customer behavior analysis algorithm meets the demands of customer clustering and classification in customer relationship management. In particular, for major customer analysis and one-to-one selling analysis, the algorithm shows the advantages of visualization, self-organization and clusters with distinct characteristics.

Journal Article
TL;DR: This paper proposes to narrow the image domain and use machine learning methods to automatically construct models for image classes, thus providing users with a concept-based way to query images.
Abstract: In the traditional approach to content-based image retrieval, the wide image domain results in a wide semantic gap between low-level features and high-level concepts. We propose to narrow the image domain and use machine learning methods to automatically construct models for image classes, thus providing users with a concept-based way to query images. In this paper, support vector machines are trained for natural image classification. The resulting image class models are incorporated into an image retrieval system, so that users can search natural images by class. The experimental results are promising.

Journal Article
TL;DR: A family of rough analysis methods for decision support, called the Rough Decision Support Method (RDSM), is proposed in this paper; it makes the best use of the decision support abilities of a decision table and provides powerful decision support.
Abstract: The traditional rough analysis method produces a set of reduced decision rules from a decision table by attribute reduction and value reduction. These rules can provide decision support to some extent. However, attribute reduction and value reduction come at the cost of much of the decision support ability, so the obtained rules retain only part of the decision support ability of the original decision table. In many cases, the set of reduced rules is unable to offer decision support that the original decision table could offer. A family of rough analysis methods for decision support, called the Rough Decision Support Method (RDSM), is proposed in this paper. RDSM makes the best use of the decision support abilities of the decision table and provides powerful decision support. RDSM is essentially an error-tolerant method. RDSM and the traditional method can be combined into a hybrid decision support model, which offers powerful decision support at high speed.

Journal Article
TL;DR: This paper proposes a minimal test suite generation method that first partitions the set of all applicable test cases on the basis of the interrelations among the testing requirements, and then generates a test suite from the partition.
Abstract: The cost and effectiveness of software testing are determined by the quantity and quality of the test suite. This paper proposes a minimal test suite generation method. It first partitions the set of all applicable test cases on the basis of the interrelations among the testing requirements, then generates a test suite from the partition, and finally obtains a minimal test suite by reduction using integer programming, a heuristic algorithm or a greedy algorithm. Compared with existing methods, our method has the better property of generating a minimal test suite that covers all the testing requirements sufficiently.
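Of the three reduction options mentioned, the greedy one is easy to sketch: given a (hypothetical) map from candidate test cases to the requirements they cover, repeatedly pick the case covering the most still-uncovered requirements. This is the classic greedy set-cover heuristic, not the paper's integer-programming formulation.

```python
def greedy_test_suite(coverage):
    """coverage maps each candidate test case to the set of testing
    requirements it exercises; greedily pick cases until all are covered."""
    uncovered = set().union(*coverage.values())
    suite = []
    while uncovered:
        # Pick the test case covering the most still-uncovered requirements.
        best = max(coverage, key=lambda t: len(coverage[t] & uncovered))
        if not coverage[best] & uncovered:
            break                      # remaining requirements are uncoverable
        suite.append(best)
        uncovered -= coverage[best]
    return suite

coverage = {
    "t1": {"r1", "r2"},
    "t2": {"r2", "r3", "r4"},
    "t3": {"r4"},
    "t4": {"r1", "r5"},
}
print(greedy_test_suite(coverage))   # ['t2', 't4'] covers r1..r5
```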

Journal Article
TL;DR: Under the DRAM(h) model, it is found that different implementation forms of the same algorithm can have different memory complexity even though they have almost the same time and space complexity under the traditional RAM model.
Abstract: In this paper, a new parallel computation model, DRAM(h), which has h levels of memory hierarchy, is proposed. With this new model, we perform memory complexity analysis on different implementation forms of two classical parallel numerical linear algebra algorithms, i.e., four forms of the parallel lower triangular solver (PTRS) and six forms of parallel LU factorization without column pivoting (PLU). Under the DRAM(h) model, we find that different implementation forms of the same algorithm can have different memory complexity even though they have almost the same time and space complexity under the traditional RAM model. Finally, we validate our analytical results with experimental results on three parallel computing platforms, i.e., the HITACHI SR2201, the DAWNING3000 and a 128-node LSEC Linux cluster. In most cases, our model's analytical results match the experimental results well, which indicates the effectiveness of the new model in clarifying the different memory access patterns of various forms of the same algorithm. Some mismatches can be well explained by slightly modifying the model's analysis assumptions according to the memory hierarchy features of the platform.

Journal Article
TL;DR: The components and the system description are represented as Boolean variables, and a Boolean algebra algorithm is presented to compute hitting sets for model-based diagnosis problems.
Abstract: In model-based diagnosis, the system is described by Boolean logic, but the hitting sets are computed with an HS-tree or a DAG, so the data structures of the diagnosis system are composed of several different representations. In this paper, both the components and the system description are represented as Boolean variables, and a Boolean algebra algorithm is presented to compute the hitting sets. The efficiency of this algorithm is better than that of previous work, the data structures are simpler, and no correct answers are lost. The algorithm can therefore be used more widely.
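For reference, here is a brute-force sketch of what a hitting-set computation produces (not the paper's Boolean-algebra encoding): every returned set intersects each conflict set, and no proper subset of it does. The component names are hypothetical.

```python
from itertools import chain, combinations

def minimal_hitting_sets(conflicts):
    """Brute-force the minimal hitting sets of a family of conflict sets:
    each returned set intersects every conflict set, and no proper subset does."""
    elements = sorted(set(chain.from_iterable(conflicts)))
    hitting = []
    for r in range(1, len(elements) + 1):
        for cand in combinations(elements, r):
            s = set(cand)
            if all(s & c for c in conflicts) and \
               not any(set(h) <= s for h in hitting):
                hitting.append(cand)
    return hitting

# Conflict sets from a hypothetical diagnosis problem.
conflicts = [{"c1", "c2"}, {"c2", "c3"}, {"c1", "c3"}]
print(minimal_hitting_sets(conflicts))
# [('c1', 'c2'), ('c1', 'c3'), ('c2', 'c3')] -- the candidate diagnoses
```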

Journal Article
TL;DR: A new (t,n) threshold signature scheme based on the difficulty of solving the discrete logarithm problem is discussed, and a new scheme using the Joint Secret Sharing technique to protect the secret keys of group members is presented.
Abstract: This paper discusses (t,n) threshold signature schemes based on the difficulty of solving the discrete logarithm problem. All up-to-date solutions for threshold signatures can be classified into two categories: (1) solutions with the assistance of a trusted party, and (2) solutions without the assistance of a trusted party. Generally speaking, since an authority trusted by all members does not exist, a threshold signature scheme without a trusted party appears more attractive. However, the Secret Sharing technique used in previous schemes may allow some colluding members of the group to obtain the secret keys of others. To solve this problem, the authors present a new scheme that uses the Joint Secret Sharing technique to protect the secret keys of group members.
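The scheme itself is not reproduced here; as a hedged sketch of the secret-sharing primitive that joint secret sharing builds on, the snippet below implements plain Shamir (t,n) sharing over a prime field (in a joint scheme, each member deals shares of its own random value and the shares are summed). The prime and the integer arithmetic are simplifying assumptions; the real scheme works in a discrete-logarithm group.

```python
import random

P = 2**127 - 1   # a large prime field; chosen only for illustration

def share(secret, t, n):
    """Split `secret` into n Shamir shares, any t of which reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    return [(x, sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at 0 over GF(P)."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

shares = share(123456789, t=3, n=5)
print(reconstruct(shares[:3]) == 123456789)   # any 3 of the 5 shares suffice
```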

Journal Article
TL;DR: An incremental updating algorithm based on the FP-tree is presented for mining association rules when transactions are inserted into the database or the support threshold is modified; it makes use of the previous mining result to cut down the cost of finding new rules in the updated database.
Abstract: The discovery of interesting association rules among huge amounts of business transaction records can help in many business decision-making processes, such as catalog design, cross marketing, and loss-leader analysis. Many algorithms have been proposed for the efficient discovery of association rules in large databases, but little work has been done on the maintenance of discovered association rules. This paper presents an incremental updating algorithm based on the FP-tree for mining association rules in the cases of inserting transactions into the database and modifying the support threshold. The proposed algorithm makes use of the previous mining result to cut down the cost of finding new rules in the updated database. The authors also report experiments comparing the new algorithm with the FUP algorithm, showing that the new algorithm is more efficient.
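The FP-tree itself is not sketched here; as a deliberately naive, hedged illustration of the incremental idea (reusing previous results rather than rescanning the whole database), the snippet keeps the itemset counts from the previous run, adds counts from only the newly inserted transactions, and then re-applies the possibly modified support threshold.

```python
from itertools import combinations
from collections import Counter

def count_itemsets(transactions, max_len=2):
    """Count all itemsets up to max_len in a list of transactions."""
    counts = Counter()
    for t in transactions:
        for r in range(1, max_len + 1):
            counts.update(combinations(sorted(t), r))
    return counts

# Previously mined counts over the old database (kept from the last run).
old_db = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
old_counts = count_itemsets(old_db)

# Incremental step: scan only the newly inserted transactions and add.
new_transactions = [{"b", "c"}, {"a", "b"}]
updated = old_counts + count_itemsets(new_transactions)

# Re-apply the (possibly modified) support threshold on the merged counts.
min_support = 3
frequent = {iset: c for iset, c in updated.items() if c >= min_support}
print(frequent)
```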

Journal Article
TL;DR: The existing resource locating methods are analyzed and their shortcomings are pointed out; a new algorithm based on a Routing-Transfer mechanism is proposed, and the results show that the RT algorithm is effective in Grid environments.
Abstract: In a Grid, which is a distributed and heterogeneous environment, quickly locating the needed resources is very important for the performance of Grid computing. This article analyzes the existing resource locating methods, points out their shortcomings and proposes a new algorithm based on a Routing-Transfer mechanism. We also analyze the time cost and space cost of all these algorithms; the results show that the RT algorithm costs the least time and a tolerable amount of space. In conclusion, the RT algorithm is an effective algorithm for Grid environments.