
Showing papers on "Centroid published in 2007"


Proceedings ArticleDOI
01 Oct 2007
TL;DR: Weighted centroid localization (WCL) provides a fast and easy algorithm to locate devices in wireless sensor networks that is derived from a centroid determination which calculates the position of devices by averaging the coordinates of known reference points.
Abstract: Localization in wireless sensor networks is becoming increasingly important, because many applications need to locate the source of incoming measurements as precisely as possible. Weighted centroid localization (WCL) provides a fast and easy algorithm to locate devices in wireless sensor networks. The algorithm is derived from a centroid determination which calculates the position of devices by averaging the coordinates of known reference points. To improve the calculated position in real implementations, WCL uses weights to attract the estimated position to close reference points, provided that coarse distances are available. Because ZigBee provides the link quality indication (LQI) as a quality indicator of a received packet, the LQI can also be used to estimate the distance from a node to the reference points.
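The averaging step behind WCL can be sketched in a few lines. The inverse-power weight w_i = 1 / d_i**g and the degree g below are illustrative assumptions, not the exact weighting from the paper:

```python
def weighted_centroid(anchors, distances, g=1.0):
    """Sketch of weighted centroid localization (WCL).

    anchors: list of (x, y) positions of known reference points.
    distances: coarse distance estimates to each anchor (e.g. derived
    from LQI). Closer anchors receive larger weights and so attract
    the estimated position toward them.
    """
    weights = [1.0 / d ** g for d in distances]
    total = sum(weights)
    x = sum(w * ax for w, (ax, ay) in zip(weights, anchors)) / total
    y = sum(w * ay for w, (ax, ay) in zip(weights, anchors)) / total
    return x, y
```

With equal distances this reduces to the plain (unweighted) centroid of the reference points.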

491 citations


Journal ArticleDOI
TL;DR: An integrated local surface descriptor for surface representation and object recognition is introduced and, in order to speed up the search process and deal with a large set of objects, model local surface patches are indexed into a hash table.

456 citations


Journal ArticleDOI
TL;DR: This paper proves that the Karnik-Mendel iterative algorithms converge monotonically and super-exponentially fast, properties that are highly desirable for iterative algorithms.
Abstract: Computing the centroid of an interval T2 FS is an important operation in a type-2 fuzzy logic system (where it is called type-reduction), but it is also a potentially time-consuming operation. The Karnik-Mendel (KM) iterative algorithms are widely used for doing this. In this paper, we prove that these algorithms converge monotonically and super-exponentially fast. Both properties are highly desirable for iterative algorithms and explain why, in practice, the KM algorithms have been observed to converge very fast, thereby making them very practical to use.
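A minimal sketch of the KM iterations on a sampled domain may help fix ideas; the discretization, the stopping tolerance, and the switch-point test via a simple comparison are illustrative simplifications:

```python
def km_centroid(x, lmf, umf, tol=1e-9):
    """Karnik-Mendel style iterations for the centroid [c_l, c_r] of an
    interval type-2 fuzzy set sampled at ascending points x, with lower
    and upper membership grades lmf and umf."""
    def endpoint(upper_on_left):
        # initialize with the midpoints of the membership intervals
        theta = [(l + u) / 2 for l, u in zip(lmf, umf)]
        c = sum(xi * t for xi, t in zip(x, theta)) / sum(theta)
        while True:
            # points on one side of c use upper grades, the other side lower
            w = [umf[i] if (x[i] <= c) == upper_on_left else lmf[i]
                 for i in range(len(x))]
            c_new = sum(xi * wi for xi, wi in zip(x, w)) / sum(w)
            if abs(c_new - c) < tol:
                return c_new
            c = c_new
    # left endpoint: upper grades left of the switch point; right: reversed
    return endpoint(True), endpoint(False)
```

When lmf equals umf (a type-1 set), both endpoints coincide with the ordinary centroid, which gives a quick sanity check.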

208 citations


Patent
Ashok Kulkarni1, Brian Duffy1, Kais Maayah1, Gordon Rouse1, Eugene Shifrin1 
07 Jun 2007
TL;DR: In this paper, a computer-implemented method for determining a centroid of an alignment target formed on a wafer using an image of the alignment target acquired by imaging the wafer is presented.
Abstract: Various methods and systems for determining a position of inspection data in design data space are provided. One computer-implemented method includes determining a centroid of an alignment target formed on a wafer using an image of the alignment target acquired by imaging the wafer. The method also includes aligning the centroid to a centroid of a geometrical shape describing the alignment target. In addition, the method includes assigning a design data space position of the centroid of the alignment target as a position of the centroid of the geometrical shape in the design data space. The method further includes determining a position of inspection data acquired for the wafer in the design data space based on the design data space position of the centroid of the alignment target.

167 citations


Journal ArticleDOI
TL;DR: The centroid of an interval type-2 fuzzy set (IT2 FS) provides a measure of the uncertainty of such a FS, and its calculation is very widely used in interval type-2 fuzzy logic systems as mentioned in this paper.

153 citations


Patent
20 Jun 2007
TL;DR: In this article, a common centroid layout design system is proposed that automatically defines the complete layout for an integrated circuit (IC) object. The system selects an algorithm for tiling the common centroid unit based on the size of that unit such that, upon completion of the tiling process, all of the devices share a common centroid.
Abstract: An exemplary common centroid layout design system receives various inputs about an integrated circuit (IC) design. Based on such inputs, the system calculates a common centroid unit, which represents an array of segments of each device in the IC design. The number of segments for each device within the common centroid unit is selected based on the respective sizes of the devices. The common centroid unit is then tiled to automatically define the complete layout for the IC object. The system selects an algorithm for tiling the common centroid unit based on the size of such unit such that, upon completion of the tiling process, all of the devices have a common centroid. Using the common centroid layout design, the IC object can be manufactured to be more immune to linear process gradients and more resistant to non-linear gradients.
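As an illustration of the property such a tiling must guarantee, mirroring a half-pattern about the array center gives every device the same centroid. The half-pattern below is an arbitrary example, not the patent's tiling algorithm:

```python
def mirror_tile(half):
    """Tile a 1D half-pattern of device labels by mirroring it, so each
    device's segments are symmetric about the array midpoint."""
    return half + half[::-1]

def device_centroid(pattern, device):
    """Mean index of the segments belonging to one device."""
    idx = [i for i, d in enumerate(pattern) if d == device]
    return sum(idx) / len(idx)
```

For example, mirroring ['A', 'B', 'B', 'A'] yields the classic ABBAABBA arrangement, in which devices A and B both have their centroid at the array midpoint, making the matched pair first-order immune to linear gradients.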

125 citations


Proceedings ArticleDOI
24 Jun 2007
TL;DR: Experimental evidence shows that the recursive approach to compute the generalized centroid of an interval type-2 fuzzy set is computationally faster than the Karnik-Mendel method without losing numerical precision.
Abstract: This article presents a recursive algorithm to compute the generalized centroid of an interval type-2 fuzzy set. First, a re-expression of the upper and lower limits of the generalized centroid is introduced. Then, the re-expressed formulas are solved by using a mixed approach of exhaustive search and recursive computations. This method is compared with the Karnik-Mendel iterative algorithm under the same computational principles. Experimental evidence shows that the recursive approach is computationally faster than the Karnik-Mendel method without losing numerical precision.

111 citations


Journal ArticleDOI
TL;DR: The Gibbs Centroid Sampler reports a centroid alignment, i.e. an alignment that has the minimum total distance to the set of samples chosen from the a posteriori probability distribution of transcription factor binding-site alignments.
Abstract: The Gibbs Centroid Sampler is a software package designed for locating conserved elements in biopolymer sequences. The Gibbs Centroid Sampler reports a centroid alignment, i.e. an alignment that has the minimum total distance to the set of samples chosen from the a posteriori probability distribution of transcription factor binding-site alignments. In so doing, it garners information from the full ensemble of solutions, rather than only the single most probable point that is the target of many motif-finding algorithms, including its predecessor, the Gibbs Recursive Sampler. Centroid estimators have been shown to yield substantial improvements, in both sensitivity and positive predictive values, to the prediction of RNA secondary structure and motif finding. The Gibbs Centroid Sampler, along with interactive tutorials, an online user manual, and information on downloading the software, is available at: http://bayesweb.wadsworth.org/gibbs/gibbs.html.

64 citations


Journal ArticleDOI
TL;DR: The approach enables the rapid development of autonomous star-camera systems without the extensive characterizations required to derive polynomic fitting coefficients employed by traditional centroid algorithms.

62 citations


Proceedings ArticleDOI
06 Jun 2007
TL;DR: It is proved that if the body is a polytope given as an intersection of half-spaces, then computing the centroid exactly is #P-hard, even for order polytopes, a special case of 0-1 polytopes.
Abstract: Consider the problem of computing the centroid of a convex body in n-dimensional Euclidean space. We prove that if the body is a polytope given as an intersection of half-spaces, then computing the centroid exactly is #P-hard, even for order polytopes, a special case of 0-1 polytopes. We also prove that if the body is given by a membership oracle, then for any deterministic algorithm that makes a polynomial number of queries there exists a body satisfying a roundedness condition such that the output of the algorithm is outside a ball of radius sigma/100 around the centroid, where sigma^2 is the minimum eigenvalue of the inertia matrix of the body.

54 citations


Proceedings ArticleDOI
05 Nov 2007
TL;DR: To address the same problem with an additional constraint that devices must be placed uniformly to average out parasitic errors, a grid-based approach is proposed that is fast, promising, and scalable enough to handle even large data sets effectively.
Abstract: In order to reduce parasitic mismatch in analog circuits, some groups of devices are required to share a common centroid while being placed. Devices are split into smaller ones and placed with a common center point. We address the problem of handling the common centroid constraint in placement. A new representation called center-based corner block list (C-CBL) is proposed, which is a natural extension of corner block list (CBL) [1] to represent a common centroid placement of a set of device pairs. C-CBL is complete and non-redundant in representing any common centroid mosaic packing with pairs of blocks to be matched. To address the same problem with the additional constraint that devices must be placed uniformly to average out parasitic errors, a grid-based approach is proposed. Experimental results show that both approaches are fast and promising, and scale well enough that even large data sets can be handled effectively.

01 Sep 2007
TL;DR: This paper proposes an enhanced centroid localization method using edge weights of adjacent nodes based on TSK fuzzy modeling, and develops the fuzzy membership function for the edge weights using genetic algorithms based on received signal strength indicator (RSSI) information.
Abstract: Localization is one of the fundamental problems in wireless sensor networks (WSNs) and forms the basis for many location-aware applications. Localization in WSNs determines the position of a node based on the known positions of several other nodes. Most previous localization methods use triangulation or multilateration based on angle of arrival (AOA) or distance measurements. In this paper, we propose an enhanced centroid localization method using edge weights of adjacent nodes based on TSK fuzzy modeling. In the proposed method, we first find the adjacent reference nodes connected to the node to be localized, and then develop the fuzzy membership function for the edge weights using genetic algorithms (GAs) based on received signal strength indicator (RSSI) information. After calculating the edge weights, we employ the weighted centroid method to localize the node. Finally, we simulate the proposed method to demonstrate its performance.

Patent
David Charles Smart1
25 Oct 2007
TL;DR: In this paper, the authors propose a method for calibrating a touch screen, which comprises determining an area for a region suitable for receiving, excluding activations determined to be located within a defined distance from the outer edge of that area, accumulating a pattern of activations for a centroid within the region, and tuning a calibration factor to reposition the centroid within the center of the region.
Abstract: A method (200 in FIG. 2) for calibrating a touch screen. The method may comprise determining an area for a region suitable for receiving, excluding activations determined to be located within a defined distance from the outer edge of the area determined for the region (206), accumulating a pattern of activations for a centroid within the region (204), and tuning a calibration factor to reposition the centroid within the center of the region (208).

Patent
30 Apr 2007
TL;DR: In this paper, the Euclidean distance between actual color samples under measurement and each cluster centroid is measured, and the spectra are then reconstructed using only the training samples from the cluster corresponding to the shortest distance.
Abstract: To determine spectra, integrated multiple illuminant measurements from a non-fully illuminant populated color sensor may be converted into a fully populated spectral curve using a reference database. The reference database is partitioned into a plurality of clusters, and an appropriate centroid is determined for each cluster by, for example, vector quantization. Training samples that form the reference database may be assigned to the clusters by comparing the Euclidean distance between the centroids and the sample under consideration, and assigning each sample to the cluster having the centroid with the shortest Euclidean distance. When all training samples have been assigned, the resulting structure is stored as the reference database. When reconstructing the spectra for new measurements from the sensor, the Euclidean distances between actual color samples under measurement and each cluster centroid are measured. The spectra are then reconstructed using only the training samples from the cluster corresponding to the shortest Euclidean distance, resulting in improved speed and accuracy.
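The cluster-selection step described above reduces to picking the centroid with the shortest Euclidean distance to the measurement; a minimal sketch (the reconstruction itself would then use only that cluster's training samples):

```python
import math

def nearest_cluster(measurement, centroids):
    """Return the index of the cluster whose centroid is closest to the
    measurement vector under Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return min(range(len(centroids)),
               key=lambda k: dist(measurement, centroids[k]))
```

Restricting the reconstruction to one cluster's samples is what yields the speed gain: each query only touches a fraction of the reference database.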

Patent
21 Nov 2007
TL;DR: In this article, the centroid decomposition technique is used to detect an attacker by removing a leaf node from the shortest path tree and generating a centroid tree whose node of each level is the detected centroid node.
Abstract: There are provided a system and method for tracing back an attacker by using centroid decomposition technique, the system including: a log data input module collecting log data of an intrusion alarm from an intrusion detection system; a centroid node detection module generating a shortest path tree by applying a shortest path algorithm to network router connection information collected by a network administration server, detecting a centroid node by applying centroid decomposition technique removing a leaf-node to the shortest path tree, and generating a centroid tree whose node of each level is the detected centroid node; and a traceback processing module requesting log data of a router matched with the node of each level of the centroid tree, and tracing back a router identical to the log data of the collected intrusion alarm as a router connected to a source of an attacker by comparing the log data of the router with the log data of the collected intrusion alarm. According to the system and method, an attacker causing a security intrusion event may be quickly detected, a load on the system is reduced, and a passage host exposed to a danger or having weaknesses may be easily recognized, thereby easily coping with an attack.

Proceedings ArticleDOI
11 Mar 2007
TL;DR: This paper studies the effect of using unlabeled data in conjunction with a small portion of labeled data on the accuracy of a centroid-based classifier used to perform single-label text categorization, and proposes the combination of Expectation-Maximization with a centroid-based method to incorporate information about the unlabeled data during the training phase.
Abstract: In this paper we study the effect of using unlabeled data in conjunction with a small portion of labeled data on the accuracy of a centroid-based classifier used to perform single-label text categorization. We chose to use centroid-based methods because they are very fast when compared with other classification methods, but still present an accuracy close to that of the state-of-the-art methods. Efficiency is particularly important for very large domains, like regular news feeds, or the web. We propose the combination of Expectation-Maximization with a centroid-based method to incorporate information about the unlabeled data during the training phase. We also propose an alternative to EM, based on the incremental update of a centroid-based method with the unlabeled documents during the training phase. We show that these approaches can greatly improve accuracy relative to a simple centroid-based method, in particular when there are very small amounts of labeled data available (as few as one single document per class). Using one synthetic and three real-world datasets, we show that, if the initial model of the data is sufficiently precise, using unlabeled data improves performance. On the other hand, using unlabeled data degrades performance if the initial model is not precise enough.

Journal ArticleDOI
03 Oct 2007-PLOS ONE
TL;DR: This work introduces a new feature selection approach for high-dimensional nearest centroid classifiers that is based on the theoretically optimal choice of a given number of features, and applies it to clinical classification based on gene-expression microarrays, demonstrating that the proposed method can outperform existing nearest centroid classifiers.
Abstract: Nearest-centroid classifiers have recently been successfully employed in high-dimensional applications, such as in genomics. A necessary step when building a classifier for high-dimensional data is feature selection. Feature selection is frequently carried out by computing univariate scores for each feature individually, without consideration for how a subset of features performs as a whole. We introduce a new feature selection approach for high-dimensional nearest centroid classifiers that instead is based on the theoretically optimal choice of a given number of features, which we determine directly here. This allows us to develop a new greedy algorithm to estimate this optimal nearest-centroid classifier with a given number of features. In addition, whereas the centroids are usually formed from maximum likelihood estimates, we investigate the applicability of high-dimensional shrinkage estimates of centroids. We apply the proposed method to clinical classification based on gene-expression microarrays, demonstrating that the proposed method can outperform existing nearest centroid classifiers.
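For reference, the baseline such methods build on is the plain nearest-centroid classifier with maximum likelihood (mean) centroids and no feature selection or shrinkage; a minimal sketch:

```python
def fit_centroids(X, y):
    """Compute the per-class mean vector (maximum likelihood centroid)."""
    cents = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        cents[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents

def predict(x, cents):
    """Assign x to the class whose centroid is nearest (squared Euclidean)."""
    def d2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(cents, key=lambda c: d2(x, cents[c]))
```

The paper's contributions then modify this baseline: selecting a feature subset jointly rather than by univariate scores, and replacing the mean centroids with shrinkage estimates.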

Proceedings ArticleDOI
09 Jul 2007
TL;DR: The work in this paper addresses the task of control design for coordinated autonomous vehicles operating in 3D by designing a velocity matching controller for each individual so that the collective centroid is asymptotically stabilized to a moving target vehicle.
Abstract: The work in this paper addresses the task of control design for coordinated autonomous vehicles operating in 3D. A group of N unit-speed individuals are modeled using natural Frenet frames, in which each individual has a total of two inputs that act to steer the vehicle gyroscopically. Extending previous results for planar target tracking based on oscillator models, a velocity matching controller is first designed for each individual so that the collective centroid is asymptotically stabilized to a moving target vehicle. Simultaneously, an additive spacing control term induces helical motion in order to keep pursuer vehicles near the centroid. Simulation examples are presented to support analytical results.

Patent
18 Dec 2007
TL;DR: In this article, a method and system for quantifying the quality of search results from a search engine based on cohesion is presented, where each document in the set of search engine search results is represented by a vector where each cell of the vector represents a stemmed word.
Abstract: A method and system for quantifying the quality of search results from a search engine based on cohesion. The method and system include modeling a set of search engine search results as a cluster and measuring the cohesion of the cluster. In an embodiment, the cohesion of the cluster is the average similarity between the cluster elements to a centroid vector. The centroid vector is the average of the weights of the vectors of the cluster. The similarity between the centroid vector and the cluster's elements is the cosine similarity measure. Each document in the set of search results is represented by a vector where each cell of the vector represents a stemmed word. Each cell has a cell value which is the frequency of the corresponding stemmed word in a document multiplied by a weight that takes into account the location of the stemmed word within the document.
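The cohesion measure described above (average cosine similarity of each element to the cluster centroid) can be sketched directly; term weighting of the document vectors is omitted here for brevity:

```python
import math

def cohesion(vectors):
    """Average cosine similarity between each vector and the centroid
    (the component-wise mean of the vectors)."""
    n = len(vectors)
    centroid = [sum(col) / n for col in zip(*vectors)]
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)
    return sum(cos(v, centroid) for v in vectors) / n
```

Identical vectors give a cohesion of 1.0; the more the results scatter in direction, the lower the score, which is what makes it usable as a search-quality proxy.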

Proceedings ArticleDOI
17 Dec 2007
TL;DR: A template method that uses the sunlight image model to determine the centroid of the sunlight image is suggested, and its performance has been compared and analyzed.
Abstract: The digital sun sensor calculates the incident sunlight angle using the sunlight image registered on a CMOS image sensor. In order to accomplish this, an exact center of the sunlight image has to be determined. Therefore, an accurate estimate of the centroid is the most important factor in digital sun sensor development. The most general method for determining the centroid is the thresholding method, which is also the simplest and easiest to implement. Another centering algorithm often used is the image filtering method, which utilizes image processing. The sun sensor accuracy of these methods, however, is quite susceptible to noise in the detected sunlight intensity. This is especially true of the thresholding method, where the accuracy changes according to the threshold level. In this paper, a template method that uses the sunlight image model to determine the centroid of the sunlight image is suggested, and its performance has been compared and analyzed. The suggested template method, unlike the thresholding and image filtering methods, has comparatively higher accuracy. In addition, it has the advantage of a consistent level of accuracy regardless of the noise level, which results in higher reliability.
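For context, the baseline thresholding method mentioned above amounts to an intensity-weighted centroid over pixels that clear the threshold; a minimal sketch (the threshold value is an arbitrary example):

```python
def threshold_centroid(image, threshold):
    """Intensity-weighted centroid of pixels at or above the threshold.

    image: 2D list of intensities; returns (row, col) centroid.
    """
    total = cr = cc = 0.0
    for r, row in enumerate(image):
        for c, v in enumerate(row):
            if v >= threshold:
                total += v
                cr += r * v
                cc += c * v
    return cr / total, cc / total
```

The sensitivity the abstract points out is visible here: changing the threshold changes which noisy pixels contribute weight, and hence shifts the centroid.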

Journal ArticleDOI
TL;DR: A system to inspect the metal stencils used to print solder paste on the pads of surface-mounted devices on printed circuit boards is presented; it is composed of a moderately precise X-Y robot and a vision system.
Abstract: In this paper, the authors present a system to inspect the metal stencil that is used to print solder paste on the pads of surface-mounted devices on printed circuit boards. The developed inspection system is composed of a moderately precise X-Y robot and a vision system. To correct a position error caused by the X-Y robot, the authors define a position error vector and apply a modified Hough transform to determine the dominant position error vector. Using this extracted dominant position error vector, the reference image is modified. This transformed reference image is compared with the camera image. Fuzzy logic is utilized to judge the correctness of the holes on the stencil. The input variables are the ratio of the overlapped area of two holes and the distance between their centroids. The output variable is the grade of the identity of the hole. These methods are verified by a simulation and applied to the inspection system.

Book ChapterDOI
29 Oct 2007
TL;DR: This work takes one mixture density onto another by deforming the component centroids via a thin-plate spline (TPS) and also minimizing the distance with respect to the variance parameters and validate the approach on synthetic and 3D arterial tree data and evaluate it on 3D hippocampal shapes.
Abstract: There exists a large body of literature on shape matching and registration in medical image analysis. However, most of the previous work is focused on matching particular sets of features such as point-sets, lines, curves and surfaces. In this work, we forsake specific geometric shape representations and instead seek probabilistic representations of shapes, specifically Gaussian mixture models. We evaluate a closed-form distance between two probabilistic shape representations for the general case where the mixture models differ in variance and the number of components. We then cast non-rigid registration as a deformable density matching problem. In our approach, we map one mixture density onto another by deforming the component centroids via a thin-plate spline (TPS) and also minimizing the distance with respect to the variance parameters. We validate our approach on synthetic and 3D arterial tree data and evaluate it on 3D hippocampal shapes.

Posted Content
TL;DR: This paper gives closed-form solutions for the sided centroids, which are generalized means, and designs a provably fast and efficient approximation algorithm for the symmetrized centroid based on its exact geometric characterization, which requires only walking along the geodesic linking the two sided centroids.
Abstract: In this paper, we generalize the notions of centroids and barycenters to the broad class of information-theoretic distortion measures called Bregman divergences. Bregman divergences are versatile, and unify quadratic geometric distances with various statistical entropic measures. Because Bregman divergences are typically asymmetric, we consider both the left-sided and right-sided centroids and the symmetrized centroids, and prove that all three are unique. We give closed-form solutions for the sided centroids, which are generalized means, and design a provably fast and efficient approximation algorithm for the symmetrized centroid based on its exact geometric characterization, which requires only walking along the geodesic linking the two sided centroids. We report on our generic implementation for computing entropic centers of image clusters and entropic centers of multivariate normals, and compare our results with former ad-hoc methods.
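The closed-form sided centroids can be illustrated concretely: for any Bregman divergence with generator F, the right-sided centroid is the arithmetic mean, while the left-sided centroid is the generalized mean (F')^{-1}(mean of F'(p_i)). Sketched coordinate-wise for the extended KL generator F(x) = x log x - x, whose left-sided centroid is the geometric mean:

```python
import math

def right_centroid(points):
    """Right-sided Bregman centroid: the arithmetic mean, for any generator."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def left_centroid_kl(points):
    """Left-sided centroid for the KL generator F(x) = x*log(x) - x:
    (F')^{-1}(mean of F'(p_i)) with F'(x) = log(x), i.e. the geometric mean."""
    n = len(points)
    return [math.exp(sum(math.log(v) for v in col) / n)
            for col in zip(*points)]
```

The symmetrized centroid then lies on the geodesic between these two sided centers, which is what the paper's approximation algorithm walks along.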

Journal ArticleDOI
TL;DR: A new shape descriptor is presented, which is based on the centroid–radii model and the Haar wavelet transform, which can achieve more accurate shape feature and implement more efficient retrieval under multi-resolution.

01 Jan 2007
TL;DR: In this article, a shape codebook entry consists of two components: a shape codeword and a group of associated vectors that specify the object centroids, which can be easily extracted from most object categories.
Abstract: This paper presents a method for detecting categories of objects in real-world images. Given training images of an object category, our goal is to recognize and localize instances of those objects in a candidate image. The main contribution of this work is a novel structure of the shape codebook for object detection. A shape codebook entry consists of two components: a shape codeword and a group of associated vectors that specify the object centroids. Like their counterpart in language, the shape codewords are simple and generic such that they can be easily extracted from most object categories. The associated vectors store the geometrical relationships between the shape codewords, which specify the characteristics of a particular object category. Thus they can be considered as the “grammar” of the shape codebook. In this paper, we use Triple-Adjacent-Segments (TAS) extracted from image edges as the shape codewords. Object detection is performed in a probabilistic voting framework. Experimental results on public datasets show performance similar to the state of the art, yet our method has significantly lower complexity and requires considerably less supervision in the training (we only need bounding boxes for a few training samples, do not need figure/ground segmentation and do not need a validation dataset).

Book ChapterDOI
03 Sep 2007
TL;DR: This paper compares three different approaches for finding geometric algorithms for centroid detection which are appropriate for a fine-grained parallel hardware architecture in an embedded vision chip.
Abstract: Current industrial applications require fast and robust image processing in systems with low size and power dissipation. One of the main tasks in industrial vision is fast detection of centroids of objects. This paper compares three different approaches for finding geometric algorithms for centroid detection which are appropriate for a fine-grained parallel hardware architecture in an embedded vision chip. The algorithms shall comprise emergent capabilities and high problem-specific functionality without requiring large amounts of states or memory. For that problem, we consider uniform and non-uniform cellular automata (CA) as well as Genetic Programming. Due to the inherent complexity of the problem, an evolutionary approach is applied. The appropriateness of these approaches for centroid detection is discussed.

Book ChapterDOI
20 Oct 2007
TL;DR: This paper presents a novel tool for body-part segmentation and tracking in the context of multiple camera systems that takes advantage of temporal correlation to consistently segment body-parts over time.
Abstract: In this paper we present a novel tool for body-part segmentation and tracking in the context of multiple camera systems. Our goal is to produce robust motion cues over time sequences, as required by human motion analysis applications. Given time sequences of 3D body shapes, body-parts are consistently identified over time without any supervision or a priori knowledge. The approach first maps shape representations of a moving body to an embedding space using locally linear embedding. While this map is updated at each time step, the shape of the embedded body remains stable. Robust clustering of body parts can then be performed in the embedding space by k-wise clustering, and temporal consistency is achieved by propagation of cluster centroids. The contribution with respect to methods proposed in the literature is a totally unsupervised spectral approach that takes advantage of temporal correlation to consistently segment body-parts over time. Comparisons on real data are run with direct segmentation in 3D by EM clustering and ISOMAP-based clustering: the way different approaches cope with topology transitions is discussed.

Proceedings ArticleDOI
17 Apr 2007
TL;DR: Evaluating the performance of the centroid-based classification algorithm and comparing it to nearest mean and nearest neighbor algorithms on 9 data sets, this paper finds that when Euclidean distance is turned into a similarity measure using division as opposed to exponentiation, it can perform almost as well as cosine similarity.
Abstract: k-nearest neighbor and centroid-based classification algorithms are frequently used in text classification due to their simplicity and performance. While the k-nearest neighbor algorithm usually performs well in terms of accuracy, it is slow in the recognition phase, because the distances/similarities between the new data point to be recognized and all the training data need to be computed. On the other hand, centroid-based classification algorithms are very fast, because only as many distance/similarity computations as the number of centroids (i.e. classes) need to be done. In this paper, we evaluate the performance of the centroid-based classification algorithm and compare it to nearest mean and nearest neighbor algorithms on 9 data sets. We propose and evaluate an improvement on the centroid-based classification algorithm. The proposed algorithm starts from the centroids of each class and increases the weight of misclassified training data points in the centroid computation until the validation error starts increasing. The weight increase is done based on the training confusion matrix entries for misclassified points. The proposed algorithm results in a smaller test error than the centroid-based classification algorithm on 7 out of 9 data sets. It is also better than the 10-nearest neighbor algorithm on 8 out of 9 data sets. We also evaluate different similarity metrics together with the centroid and nearest neighbor algorithms. We find that, when Euclidean distance is turned into a similarity measure using division as opposed to exponentiation, Euclidean-based similarity can perform almost as well as cosine similarity.
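The two distance-to-similarity conversions being compared can be sketched as follows; the exact constants and scaling the authors used are not given here, so these are the simplest representative forms:

```python
import math

def sim_div(d):
    """Division-based conversion: similarity = 1 / (1 + distance)."""
    return 1.0 / (1.0 + d)

def sim_exp(d):
    """Exponentiation-based conversion: similarity = exp(-distance)."""
    return math.exp(-d)
```

Both map a zero distance to a similarity of 1 and decrease monotonically; they differ in how quickly similarity decays with distance, which is what drives the accuracy difference reported above.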

Proceedings ArticleDOI
01 Apr 2007
TL;DR: Formulas for computing the cardinality, fuzziness, variance and skewness of an IT2 FS are derived and these new formulas have closed-form expressions, so they can be computed very fast.
Abstract: Centroid, cardinality, fuzziness, variance and skewness are all important concepts for an interval type-2 fuzzy set (IT2 FS) because they are all measures of uncertainty, i.e. each of them is an interval, and the length of the interval is an indicator of the uncertainty. The centroid of an IT2 FS has been defined by Karnik and Mendel. In this paper, the other four concepts are defined. All definitions use the Mendel-John representation theorem for IT2 FSs. Formulas for computing the cardinality, fuzziness, variance and skewness of an IT2 FS are derived. Unlike the formulas for the centroid of an IT2 FS, which must be computed by iterative Karnik-Mendel algorithms, these new formulas have closed-form expressions, so they can be computed very fast. These definitions are useful not only for measuring the uncertainties of an IT2 FS, but also for measuring the similarity between two IT2 FSs.

Journal ArticleDOI
TL;DR: A generalized DragPushing strategy for the centroid classifier, called "Large Margin DragPushing" (LMDP), is proposed, and it is shown that LMDP achieved about a one percent improvement over the performance of DragPushing and delivered top performance nearly as good as state-of-the-art SVM without incurring significant computational costs.
Abstract: Among conventional methods for text categorization, the centroid classifier is a simple and efficient method. However, it often suffers from inductive bias (or model misfit) incurred by its assumption. DragPushing is a very simple and yet efficient method to address this so-called inductive bias problem. However, DragPushing employs only one criterion, i.e., training-set error, as its objective function, which cannot guarantee generalization capability. In this paper, we propose a generalized DragPushing strategy for the centroid classifier, which we call "Large Margin DragPushing" (LMDP). The experiments conducted on three benchmark evaluation collections show that LMDP achieved about a one percent improvement over the performance of DragPushing and delivered top performance nearly as good as state-of-the-art SVM without incurring significant computational costs.
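The drag/push idea behind such strategies can be illustrated with a single update step: for a misclassified training vector, its true-class centroid is dragged toward it while the wrongly predicted centroid is pushed away. This is only an illustrative sketch of the general idea, not the paper's exact update rule, and the step size eta is a hypothetical parameter:

```python
def drag_push(centroids, x, true_c, pred_c, eta=0.1):
    """One drag/push step on a dict of class -> centroid vector for a
    training vector x misclassified as pred_c instead of true_c."""
    centroids[true_c] = [c + eta * xi
                         for c, xi in zip(centroids[true_c], x)]  # drag
    centroids[pred_c] = [c - eta * xi
                         for c, xi in zip(centroids[pred_c], x)]  # push
    return centroids
```

Iterating such steps over misclassified documents is what corrects the inductive bias of the plain centroid classifier; LMDP additionally controls the updates with a margin criterion rather than training-set error alone.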