
Showing papers in "Computer Engineering in 2007"


Journal Article
TL;DR: The vector space model used in Lucene is discussed, the structure of its index files and its ranking algorithm are analyzed, and the compression algorithm in Lucene is described.
Abstract: As a high-performance, scalable information retrieval library written in Java, Lucene makes it easy to add searching and indexing capabilities to applications. This paper discusses the vector space model used in Lucene, analyzes the structure of its index files and its ranking algorithm, and describes the compression algorithm in Lucene. An experiment is carried out to test Lucene's indexing process.
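A minimal sketch of the vector space model the abstract refers to, with TF-IDF weighting and cosine ranking. This is only an illustration of the idea, not Lucene's actual scoring code; the tokenized documents, the smoothed IDF formula and the query handling below are assumptions made for the example.

```python
import math
from collections import Counter

def build_index(docs):
    """Document frequencies and TF-IDF vectors for tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}      # +1 keeps very common terms non-zero
    vecs = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return idf, vecs

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [["lucene", "index", "search"], ["vector", "space", "model"], ["index", "file", "compression"]]
idf, doc_vecs = build_index(docs)
query_vec = {t: c * idf.get(t, 0.0) for t, c in Counter(["index", "search"]).items()}
for i in sorted(range(len(docs)), key=lambda i: cosine(query_vec, doc_vecs[i]), reverse=True):
    print(i, docs[i], round(cosine(query_vec, doc_vecs[i]), 3))
```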

67 citations


Journal Article
Song Xin
TL;DR: Experiments on the standard UCI data sets show that the proposed method produces high-purity clustering results and eliminates the sensitivity to the initial cluster centers.
Abstract: The traditional k-means algorithm is sensitive to the initial cluster centers. To solve this problem, a new method is proposed to find the initial centers. It first computes the density of the area each data object belongs to, then finds k data objects that all lie in high-density areas and are as far away from each other as possible, and uses these k objects as the initial centers. Experiments on the standard UCI data sets show that the proposed method produces high-purity clustering results and eliminates the sensitivity to the initial centers.
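A hedged sketch of the seeding idea described above: estimate each point's local density, keep points in high-density regions, then greedily pick k of them that are far apart. The neighbourhood radius and the median density threshold are illustrative choices, not parameters taken from the paper.

```python
import numpy as np

def density_seeds(X, k, radius=None):
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)   # pairwise distances
    if radius is None:
        radius = np.median(dist)                                    # heuristic neighbourhood size
    density = (dist < radius).sum(axis=1)                           # neighbours within the radius
    candidates = np.where(density >= np.median(density))[0]         # high-density points only
    seeds = [candidates[np.argmax(density[candidates])]]            # start from the densest point
    while len(seeds) < k:
        # next seed: the high-density candidate farthest from all chosen seeds
        d_to_seeds = dist[np.ix_(candidates, seeds)].min(axis=1)
        seeds.append(candidates[np.argmax(d_to_seeds)])
    return X[seeds]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.3, size=(50, 2)) for c in ([0, 0], [3, 3], [0, 3])])
print(density_seeds(X, k=3))    # three seeds, one near each dense blob
```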

34 citations


Journal Article
TL;DR: The new method defines a similarity computation formula among HowNet's sememes according to information theory, and overcomes the difficulty that out-of-vocabulary (OOV) words cannot participate in semantic computation by applying concept segmentation and automatic semantic production to OOV words.
Abstract: Similarity computation of Chinese words is a key problem in Chinese information processing. This paper proposes a new method for similarity computation that is based on HowNet, semantically oriented and extensible. The new method defines a similarity computation formula among HowNet's sememes according to information theory, overcomes the difficulty that out-of-vocabulary (OOV) words cannot participate in semantic computation by applying concept segmentation and automatic semantic production to OOV words, and finally realizes similarity computation at the semantic level between arbitrary words. Experimental results on CILIN indicate that the accuracy of the new method is nearly 15% higher than that of existing methods.

34 citations


Journal Article
TL;DR: A crawling strategy based on a "board-topic correlation judgment" algorithm is presented; compared with the breadth-first strategy, it performs remarkably better in both precision and recall.
Abstract: Web forums have become one of the dominant channels for information release and exchange on the Internet. Crawling is the groundwork of searching and mining information from Web forums. However, traditional crawlers, which usually adopt a breadth-first strategy, cannot fetch information from Web forums effectively. Exploiting the inner structural features of forums, this paper presents a crawling strategy based on a "board-topic correlation judgment" algorithm. Compared with the breadth-first strategy, this solution performs remarkably better in both precision and recall. In practice, the algorithm has been applied to over 12 000 different Web forums and achieves good results.

17 citations


Journal Article
TL;DR: A real-time multitask system for space aircraft docking simulation is presented; engineering and modularization methods are used, and the resulting system performance is good.
Abstract: A real-time multitask system for space aircraft docking simulation is presented, and the system composition, functional requirements and system structure are introduced. Based on finite state machine (FSM) and Petri net methods, single-task and multitask models are built respectively. According to these models, the system design is accomplished. By applying the fork and resource-sharing prototypes, the synchronization and mutual-exclusion functions are realized. In actual application, engineering and modularization methods are used, and the system performance is good. Experimental results show that the methods for system analysis and design are rational and feasible.

16 citations


Journal Article
TL;DR: Experiments show that, compared with the traditional means-based method, the modified data clustering algorithm can improve the efficiency of data clustering.
Abstract: The data clustering method directly influences the clustering result. The k-means algorithm is discussed, and its shortcomings are demonstrated: it cannot handle symbolic data and it is sensitive to outliers and noise. A modified k-means clustering algorithm based on weights is put forward, which overcomes these shortcomings of k-means. Its complexity is analyzed theoretically. Experiments show that, compared with the traditional means-based method, the modified algorithm can improve the efficiency of data clustering.

15 citations


Journal Article
TL;DR: This paper proposes semi-supervised affinity propagation, where cluster validity indices are embedded into the iteration process of the algorithm to supervise and guide it toward an optimal clustering solution.
Abstract: Affinity propagation clustering is an efficient and fast clustering algorithm, especially for large data sets, but its clustering quality is low when it is applied to a data set with loose cluster structures. This paper proposes semi-supervised affinity propagation, where cluster validity indices are embedded into the iteration process of the algorithm to supervise and guide it toward an optimal clustering solution. The experimental results show that the algorithm gives accurate clustering results for data sets with both compact and loose cluster structures.
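A hedged sketch of the general idea of steering affinity propagation with a validity index. The paper embeds the index inside the AP iterations; the simpler proxy below only scans the preference parameter and keeps the run that maximises the silhouette index, using scikit-learn. The data, preference range and choice of silhouette are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.5, size=(60, 2)) for c in ([0, 0], [4, 4], [0, 5])])

best = None
for pref in np.linspace(-200, -10, 20):                        # candidate preference values
    labels = AffinityPropagation(preference=pref, random_state=0).fit_predict(X)
    if len(set(labels)) < 2:                                    # silhouette needs >= 2 clusters
        continue
    score = silhouette_score(X, labels)
    if best is None or score > best[0]:
        best = (score, pref, len(set(labels)))

print("best silhouette %.3f at preference %.1f with %d clusters" % best)
```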

14 citations


Journal Article
TL;DR: This paper introduces an intrusion detection model based on clustering analysis and realizes a K-means algorithm which can build an intrusion detection database and classify security levels; the technique demonstrates strong applicability and self-adaptability.
Abstract: This paper introduces an intrusion detection model based on clustering analysis and realizes a K-means algorithm which can build an intrusion detection database and classify security levels. Prior experience data are not required to set up the detection system, which is capable of automatically re-classifying intrusion behaviors from the related data. Simulation experiments show that the technique possesses strong applicability and self-adaptability.
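A hedged sketch in the spirit of the abstract: cluster unlabeled connection records with K-means and treat records falling in small, sparsely populated clusters as suspicious. The synthetic feature vectors, the number of clusters and the 5% cluster-size threshold are illustrative assumptions, not values from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
normal = rng.normal(0.0, 1.0, size=(500, 4))       # bulk of "normal" traffic features
attacks = rng.normal(6.0, 0.5, size=(15, 4))       # a small group of outlying records
records = np.vstack([normal, attacks])

k = 5
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(records)
sizes = np.bincount(labels, minlength=k)
suspicious_clusters = np.where(sizes < 0.05 * len(records))[0]   # unusually small clusters
suspicious = np.isin(labels, suspicious_clusters)
print("flagged %d of %d records as suspicious" % (suspicious.sum(), len(records)))
```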

14 citations


Journal Article
TL;DR: Through cryptanalysis of an E-cash system with multiple banks based on proxy signature proposed by Zhou et al., it is found that the central bank or the trusted third party can issue valid E-cash by impersonating other legal banks.
Abstract: Through cryptanalysis of the E-cash system with multiple banks based on proxy signature proposed by Zhou et al., it is found that the central bank or the trusted third party can issue valid E-cash by impersonating other legal banks. An off-line E-cash system with multiple banks based on elliptic curves is proposed, using the technique of proxy-protected proxy signature. In this system the E-cash's denomination is marked with the authorization, and analysis shows that it overcomes the disadvantages of existing multi-bank E-cash systems and has higher efficiency.

14 citations


Journal Article
Zhu Wei-xing
TL;DR: A partial recursive algorithm for threshold selection and segmentation is put forward, which is based on the Otsu threshold selection method; it not only reduces the running time, but also has better self-adaptability.
Abstract: In image segmentation, threshold selection is very important. A partial recursive algorithm for threshold selection and segmentation is put forward, which is based on the Otsu threshold selection method. Based on the entropy information of image pixels, a partial recursive algorithm is used to search for the optimal threshold. It not only reduces the running time, but also has better self-adaptability. With this algorithm, an image can be segmented effectively even if its histogram is uneven and neither unimodal nor bimodal. The segmentation result preserves more details, which benefits feature extraction. An experiment with the Lena image is made and a good result is obtained.
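For reference, a sketch of the standard Otsu criterion the method above builds on: choose the threshold that maximises the between-class variance of the gray-level histogram. The paper's partial recursive variant narrows the search range iteratively; this sketch only shows the basic exhaustive version it starts from, on a synthetic bimodal image.

```python
import numpy as np

def otsu_threshold(image):
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()                          # gray-level probabilities
    omega = np.cumsum(p)                           # class-0 probability up to threshold t
    mu = np.cumsum(p * np.arange(256))             # cumulative mean up to t
    mu_total = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / (omega * (1.0 - omega))  # between-class variance
    return int(np.argmax(np.nan_to_num(sigma_b)))

rng = np.random.default_rng(0)
img = np.concatenate([rng.normal(60, 10, 5000), rng.normal(180, 15, 5000)])
img = np.clip(img, 0, 255).astype(np.uint8).reshape(100, 100)
print("Otsu threshold:", otsu_threshold(img))      # roughly between the two modes
```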

14 citations


Journal Article
XU Zhanqi
TL;DR: In this paper, a thorough analysis of the approximate point-in-triangulation test (APIT) algorithm is made and an improvement is proposed; the improved algorithm expands the coverage of anchors and reduces the probabilities of In-To-Out and Out-To-In errors.
Abstract: As a novel technology for acquiring and processing information, wireless sensor networks (WSNs) can be used in many application fields to realize complicated detection and tracking tasks. The localization of sensor nodes, which is the foundation of the other applications, is to determine a node's own position using a certain localization scheme, according to the anchor nodes. In this paper, a thorough analysis of the approximate point-in-triangulation test (APIT) algorithm is made, and an improvement is proposed. The analysis shows that, compared with the original algorithm, the improved algorithm expands the coverage of anchors and reduces the probabilities of In-To-Out and Out-To-In errors.
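For context, a sketch of the geometric primitive at the heart of APIT: deciding whether an unknown node lies inside the triangle formed by three anchors. APIT itself only approximates this test from neighbour signal-strength comparisons, and the paper's improvement targets the In-To-Out and Out-To-In decision errors of that approximation; the exact signed-area test and the coordinates below are illustrative.

```python
def point_in_triangle(p, a, b, c):
    """True if point p lies inside (or on) triangle abc, via signed areas."""
    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)      # all signs agree -> inside

anchors = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
print(point_in_triangle((5.0, 3.0), *anchors))   # True: node inside the anchor triangle
print(point_in_triangle((9.0, 7.0), *anchors))   # False: node outside
```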

Journal Article
TL;DR: Results on four benchmark functions show that several adjustment methods for the inertia weight are feasible and can improve global convergence and convergence speed.
Abstract: The inertia weight is a crucial parameter of particle swarm optimization (PSO). It balances global search and local search to improve PSO's convergence. This paper analyzes the effect of the inertia weight on PSO's performance. To enhance global optimality, several adjustment methods for the inertia weight are put forward. Results on four benchmark functions show that these methods are feasible and can improve global convergence and convergence speed.
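A sketch of one common inertia-weight schedule (a linear decrease from 0.9 to 0.4) applied to basic PSO on the sphere function. The paper proposes several adjustment methods; this example only illustrates how the weight enters the velocity update and why shrinking it shifts the search from exploration to exploitation. All parameter values and the test function are assumptions for the demo.

```python
import numpy as np

def pso(f, dim=10, n_particles=30, iters=200, w_max=0.9, w_min=0.4, c1=2.0, c2=2.0):
    rng = np.random.default_rng(0)
    x = rng.uniform(-5, 5, (n_particles, dim))
    v = np.zeros_like(x)
    pbest, pbest_val = x.copy(), np.apply_along_axis(f, 1, x)
    gbest = pbest[np.argmin(pbest_val)]
    for t in range(iters):
        w = w_max - (w_max - w_min) * t / iters          # linearly decreasing inertia weight
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        vals = np.apply_along_axis(f, 1, x)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)]
    return gbest, pbest_val.min()

best, best_val = pso(lambda z: np.sum(z ** 2))           # sphere benchmark
print("best value found:", best_val)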

Journal Article
TL;DR: This paper focuses on the design and implementation of the CrossBit dynamic binary translation infrastructure, including its system architecture and essential design philosophies, and introduces the principles of binary translation.
Abstract: Binary translation is a technique that translates a binary program from one machine platform to another, enabling binary code to migrate among heterogeneous machine platforms. This paper introduces the principles of binary translation and focuses on the design and implementation of the CrossBit dynamic binary translation infrastructure, including its system architecture and essential design philosophies. Experimental data are provided to demonstrate the performance advantage of binary translation.

Journal Article
TL;DR: Based on the presented method, the paper solves two typical combinatorial optimization problems (the 0/1 knapsack problem and the TSP), and simulation results show the validity of the algorithm.
Abstract: The chaos optimization algorithm (COA) was put forward to solve numerical optimization problems and can find good solutions efficiently. However, it is difficult for COA to solve combinatorial optimization problems. To address this, this paper presents a method for solving combinatorial optimization problems based on COA. An initial solution is produced, and new solutions are produced by chaos variables or by disturbing the current solution based on chaos variables. Based on this method, the paper solves two typical combinatorial optimization problems (the 0/1 knapsack problem and the TSP), and simulation results show the validity of the algorithm.
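A hedged sketch of the idea in the abstract: drive a combinatorial search with chaos variables generated by the logistic map, here thresholded to flip bits of a 0/1 knapsack solution. The specific mapping from chaos variables to perturbations, the flip threshold and the toy instance are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

values  = np.array([60, 100, 120, 80, 30])
weights = np.array([10, 20, 30, 15, 5])
capacity = 50

def fitness(x):
    return np.dot(values, x) if np.dot(weights, x) <= capacity else 0   # infeasible -> worthless

rng = np.random.default_rng(0)
z = rng.uniform(0.01, 0.99, size=len(values))          # initial chaos variables
best_x = (z > 0.5).astype(int)                          # initial solution derived from them
best_f = fitness(best_x)

for _ in range(500):
    z = 4.0 * z * (1.0 - z)                             # logistic map in its chaotic regime
    cand = best_x.copy()
    cand[z > 0.8] ^= 1                                  # chaos-driven bit flips
    if fitness(cand) > best_f:
        best_x, best_f = cand, fitness(cand)

print("best packing:", best_x, "value:", best_f)
```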

Journal Article
TL;DR: A heuristic attribute reduction algorithm based on the concept of condition distinguishability is put forward; analysis indicates that this algorithm can obtain the minimum reduction of an information system in most situations.
Abstract: Attribute reduction is one of the key topics of rough set theory. The search for a minimum reduction has been proved to be an NP-hard problem. The definition of attribute condition distinguishability in an information system is given. On this basis, a heuristic attribute reduction algorithm based on the concept of condition distinguishability is put forward. Analysis of an example indicates that this algorithm can obtain the minimum reduction of an information system in most situations.

Journal Article
TL;DR: An edge detection and projection feature based algorithm for locating the license plate (LP) and a vertical projection and template matching algorithm for segmenting the characters are proposed; experiments show that the proposed algorithms have high accuracy and robustness.
Abstract: An edge detection and projection feature based algorithm for locating the license plate (LP) and a vertical projection and template matching algorithm for segmenting the characters are proposed. Edges are detected in a gray-level vehicle image; experimental results show that license plate detection is fast and the obtained contour is very legible. The LP region is located by the projection method, and the tilt angle of the LP is corrected by the Hough transform. Characters are segmented by the LP segmentation algorithm, and several problems under complex scenes are solved effectively. To demonstrate the effectiveness of the proposed algorithms, extensive experiments are conducted on a large number of real-world vehicle license plates. The results show that the proposed algorithms have high accuracy and robustness.
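A sketch of the vertical-projection step used for character segmentation: after the plate region is binarised, the column sums of foreground pixels form a profile whose near-zero valleys mark the gaps between characters. The toy "plate" array and the gap threshold below are illustrative only.

```python
import numpy as np

def segment_by_projection(binary_plate, gap_thresh=1):
    profile = binary_plate.sum(axis=0)                 # vertical projection (ink per column)
    in_char, start, segments = False, 0, []
    for col, count in enumerate(profile):
        if count > gap_thresh and not in_char:
            in_char, start = True, col                 # entering a character region
        elif count <= gap_thresh and in_char:
            in_char = False
            segments.append((start, col))              # leaving a character region
    if in_char:
        segments.append((start, len(profile)))
    return segments

# toy binary plate: three "characters" of ink separated by blank columns
plate = np.zeros((20, 30), dtype=int)
plate[5:15, 2:7] = 1
plate[5:15, 11:16] = 1
plate[5:15, 21:27] = 1
print(segment_by_projection(plate))    # -> [(2, 7), (11, 16), (21, 27)]
```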

Journal Article
TL;DR: An algorithm combining auto-correlation and cross-correlation is presented to estimate the time delay of stationary signals in different noises, based on an analysis of several generalized correlation methods.
Abstract: An algorithm combining auto-correlation and cross-correlation is presented to estimate the time delay of stationary signals in different noises, based on an analysis of several generalized correlation methods. Theoretical analysis and simulation prove that the second correlation can obtain higher precision than the generalized correlation at lower SNR.
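A sketch of the basic building block, delay estimation by cross-correlation: the lag that maximises the correlation between two noisy observations estimates the time delay. The paper's second-correlation method correlates correlation functions for robustness at low SNR; that refinement is not reproduced here, and the signal lengths, delay and noise level are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, true_delay = 1024, 37
s = rng.normal(size=n)                                      # stationary source signal
x1 = s + 0.5 * rng.normal(size=n)                           # sensor 1: signal + noise
x2 = np.roll(s, true_delay) + 0.5 * rng.normal(size=n)      # sensor 2: delayed signal + noise

corr = np.correlate(x2 - x2.mean(), x1 - x1.mean(), mode="full")
lags = np.arange(-n + 1, n)                                 # lag axis for the "full" output
estimated_delay = lags[np.argmax(corr)]
print("true delay:", true_delay, "estimated:", estimated_delay)
```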

Journal Article
TL;DR: The results show that the algorithm keeps a good balance between exploration and exploitation and has a better capability of jumping out of local optima than the basic particle swarm optimization algorithm.
Abstract: This paper proposes an adaptive particle swarm optimization algorithm, and two benchmarks are used to test it. The results show that the algorithm keeps a good balance between exploration and exploitation. When solving multi-modal function optimization problems, the algorithm has a better capability of jumping out of local optima than the basic particle swarm optimization algorithm, and its execution efficiency does not decrease noticeably.

Journal Article
TL;DR: This paper analyses the principle and implementation of LEPS routing in TinyOS, and an experiment on a practical sensor network is also presented.
Abstract: LEPS is a multihop routing protocol in TinyOS. It forms a tree-like topology in the network with a shortest-path-first algorithm. In this topology, each node delivers its sensed data to its parent node, and the parent node forwards it to the sink node along the optimized path. LEPS routing also takes link quality into account when choosing the parent node, to improve reliability. This paper analyses the principle and implementation of LEPS routing in TinyOS, and an experiment on a practical sensor network is also presented. LEPS routing can form and maintain the data gathering tree in the network, but the links among sensor nodes are not stable and the network topology changes frequently.

Journal Article
TL;DR: This paper presents a method based on minimum error thresholding for detecting ship targets in coastal regions; applied to more than one hundred optical remote sensing images, it proves to be fast, accurate and robust.
Abstract: Optical remote sensing images have become more and more useful in our life. Focusing on medium and high resolution optical remote sensing images, this paper presents a method based on minimum error thresholding for detecting ship targets in coastal regions. In this method, an adaptive oriented orthogonal projective decomposition is used to obtain a more accurate approximation of the histogram, and candidate ship targets are detected by an improved minimum error threshold algorithm based on the Shannon entropy function. A multi-criteria target detection idea is used to detect ships precisely. The method has been applied to more than one hundred optical remote sensing images and proves to be fast, accurate and robust.

Journal Article
TL;DR: Experiments indicate that the dynamic learning algorithm can capture and record the user's latest interests in a timely manner, so that the information the user requires can be truly recommended.
Abstract: This paper proposes a user profile representation based on the vector space model together with its dynamic learning algorithm, and studies feature selection in user modeling. A new feature selection method combining term frequency (TF) and TF-IDF according to part-of-speech tagging is proposed. Experiments indicate that the dynamic learning algorithm can capture and record the user's latest interests in a timely manner, so that the information the user requires can be truly recommended. Experiments also show that the combining method based on part-of-speech tagging performs better than using TF or TF-IDF separately.
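A hedged sketch of the feature-selection idea in the abstract: weight a term by a TF-IDF backbone and adjust the weight according to its part-of-speech tag (here, boosting nouns). The POS-dependent combination rule, the boost factor and the toy corpus are illustrative assumptions; the paper's exact weighting scheme is not reproduced.

```python
import math
from collections import Counter

def profile_vector(background_docs, tagged_doc, noun_boost=1.5):
    """User-profile weights from one POS-tagged document, given a background corpus."""
    n = len(background_docs)
    df = Counter(t for d in background_docs for t in set(d))
    tf = Counter(term for term, _ in tagged_doc)
    weights = {}
    for term, tag in tagged_doc:
        idf = math.log((n + 1) / (df.get(term, 0) + 1)) + 1.0   # smoothed IDF
        w = tf[term] * idf                                      # TF-IDF backbone
        if tag == "NOUN":
            w *= noun_boost                                     # nouns carry more of the interest
        weights[term] = w
    return weights

background = [["football", "match", "score"], ["stock", "market", "price"], ["football", "league"]]
browsed = [("football", "NOUN"), ("great", "ADJ"), ("match", "NOUN"), ("watch", "VERB")]
print(profile_vector(background, browsed))
```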

Journal Article
TL;DR: Support vector regression is used to forecast the network attack situation evaluation indicator as a time series; the frameworks of the learning module and the forecasting module of the algorithm are presented.
Abstract: As one of the most important techniques of active defense, forecasting of the network attack situation becomes increasingly crucial. An indicator for network attack situation evaluation is defined. Support vector regression is used to forecast this indicator as a time series. The frameworks of the learning module and the forecasting module of the algorithm are presented. Experimental results confirm the validity of the method.
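A sketch of support vector regression forecasting on a time series, as in the abstract: past values in a sliding window are the features and the next value is the target. The synthetic stand-in series, the window length and the SVR parameters are illustrative assumptions, not the paper's indicator definition or settings.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
t = np.arange(300)
series = np.sin(0.1 * t) + 0.1 * rng.normal(size=t.size)   # stand-in "attack situation" indicator

window = 10
X = np.array([series[i:i + window] for i in range(len(series) - window)])   # sliding windows
y = series[window:]                                                          # next value to predict

split = 250
model = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X[:split], y[:split])
pred = model.predict(X[split:])
rmse = np.sqrt(np.mean((pred - y[split:]) ** 2))
print("one-step-ahead RMSE on held-out points: %.3f" % rmse)
```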

Journal Article
LU Hua-pu
TL;DR: Integrating several WebGIS-based traffic management sub-systems, this system provides traffic managers with intelligent management, command and dispatch; the system architecture is also proposed.
Abstract: This paper presents a new urban intelligent traffic management command and dispatch system based on WebGIS. Integrating several WebGIS-based traffic management sub-systems, the system provides traffic managers with intelligent management, command and dispatch. The concept, characteristics and advantages of WebGIS applied to urban traffic management are introduced, and the architecture of the system is proposed. The XML-based data exchange and the database are introduced. The paper also analyses the system's main functions, technical characteristics and implementation method.

Journal Article
TL;DR: A security problem in a forward-secure proxy signature scheme is pointed out, and an evolution of the proxy signer's key is proposed; under the strong RSA assumption, the new scheme is truly forward secure.
Abstract: A security problem in a forward-secure proxy signature scheme is pointed out: the proxy signer's key does not satisfy forward security, so the original scheme is insecure once an adversary obtains the proxy signer's key. The improved scheme introduces an evolution of the proxy signer's key; under the strong RSA assumption, the new scheme is truly forward secure.

Journal Article
TL;DR: An algorithm for optimizing measure parameters based on genetic algorithms and biological immunology is designed together with a hybrid intrusion detection engine; the scheme precludes the security problems by utilizing the useful metaphors of biological immunity and the prominent characteristics of genetic algorithms.
Abstract: This paper analyzes the security threats of immune IDS schemes and the useful metaphors of the biological immune system, considering their application in the study of IDS. In light of the system flaws arising from transferring the disease-causing mechanisms of the biological immune system into IDS, an algorithm for optimizing measure parameters based on genetic algorithms and biological immunology is designed together with a hybrid intrusion detection engine. The scheme precludes these security problems by utilizing the useful metaphors of biological immunity and the prominent characteristics of genetic algorithms. It is characterized by parallel operation, stability, adaptability and robustness. The paper justifies its brevity, security and high efficiency. Keywords: biological immunology; intrusion detection; transfer of pathological mechanism; hybrid detection engine.

Journal Article
TL;DR: This paper reviews the characteristics and deficiencies of modern development processes, such as dynamic development teams and uncertain workflows, and proposes a PCR-based flexible process model in which the instantiation of the workflow model depends on the conditions arising during product development.
Abstract: This paper reviews the characteristics and deficiencies of modern development processes, such as dynamic development teams and uncertain workflows. Complex relations among tasks are analysed. A process control rule (PCR) in NCPD is proposed, which includes route conditions and task constraints. A PCR-based flexible process model is proposed, in which the instantiation of the workflow model depends on the conditions arising during product development. An inspection approach for process integrality is given through two algorithms. Through task decomposition, the task flow model reduces the complexity of the development process. An instance of a development process is introduced.

Journal Article
TL;DR: A new algorithm is proposed which adopts the hierarchy structure and boundary of attributes as the heuristic function to choose essential attributes; it can select the important attributes that reflect the characteristics of the system while the universe shrinks.
Abstract: Through analysing attribute reduction algorithms for consistent decision tables, the reasons for their inefficiency are found. A new algorithm is proposed which adopts the hierarchy structure and boundary of attributes as the heuristic function to choose essential attributes. It can select the important attributes that reflect the characteristics of the system while the universe shrinks. Theoretical analysis and experimental results show that, with classification precision unchanged, the algorithm can obtain an optimal or near-optimal attribute reduction set.

Journal Article
TL;DR: Experiments show that general pattern matching cannot obtain acceptable results on the Chinese task, but the new method presented in this paper is better suited to Chinese entity relation extraction.
Abstract: This paper studies Chinese entity relation extraction in information extraction. To extract Chinese entity relations, word semantic matching is incorporated into pattern matching. The performance of general pattern matching and semantic pattern matching is compared. Experiments show that general pattern matching cannot obtain acceptable results on the Chinese task, but the new method presented in this paper is better suited to the task of Chinese entity relation extraction.

Journal Article
TL;DR: In this paper, FMP (framework for multi-policy) studies are divided systematically into three classes: FMP based on policy languages, FMP based on security attributes, and FMP based on a unified security model.
Abstract: How to support multiple policies in secure information systems has been a research hotspot in recent years. In this paper, FMP (framework for multi-policy) studies are divided systematically into three classes: FMP based on policy languages, FMP based on security attributes, and FMP based on a unified security model. Typical FMPs of each class are analyzed and compared, and a research direction is pointed out. A practical case of enforcing an FMP in a secure OS is described.

Journal Article
HU Chunchun
TL;DR: The experimental results indicate that the index is effective and efficient for evaluating the results of fuzzy clustering, can correctly identify the optimal clustering number, and is not sensitive to the weighting exponent.
Abstract: A cluster validity index is used to evaluate the validity of clustering. A new cluster validity index is proposed to identify the optimal fuzzy partition according to the basic properties of clustering. The index exploits two important evaluation factors: a measure of the fuzzy partition and information entropy. The first factor evaluates the compactness within a cluster and the separation between clusters, and the second measures the uncertainty of the partition result. The experimental results indicate that the index is effective and efficient for evaluating the results of fuzzy clustering. Especially for spatial data, the index can correctly identify the optimal clustering number and is not sensitive to the weighting exponent.
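For illustration, a sketch of the two kinds of evaluation factor the abstract mentions, computed from a fuzzy membership matrix U (clusters x points): a partition-based factor and an entropy-based factor. These are the classical partition coefficient and partition entropy, shown only to illustrate the ingredients; they are not the paper's exact index.

```python
import numpy as np

def partition_coefficient(U):
    """Closer to 1 = crisper (more compact) partition."""
    return np.mean(np.sum(U ** 2, axis=0))

def partition_entropy(U):
    """Closer to 0 = less uncertainty in the partition."""
    return -np.mean(np.sum(U * np.log(U + 1e-12), axis=0))

# toy membership matrices for 6 points and 2 clusters
crisp = np.array([[0.95, 0.9, 0.9, 0.1, 0.05, 0.1],
                  [0.05, 0.1, 0.1, 0.9, 0.95, 0.9]])
fuzzy = np.full((2, 6), 0.5)

for name, U in (("crisp", crisp), ("fuzzy", fuzzy)):
    print(name, "PC = %.3f" % partition_coefficient(U), "PE = %.3f" % partition_entropy(U))
```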