
Showing papers on "Sorting" published in 2006


Proceedings ArticleDOI
27 Jun 2006
TL;DR: Overall, the results indicate that using a GPU as a co-processor can significantly improve the performance of sorting algorithms on large databases.
Abstract: We present a novel external sorting algorithm using graphics processors (GPUs) on large databases composed of billions of records and wide keys. Our algorithm uses the data parallelism within a GPU along with task parallelism by scheduling some of the memory-intensive and compute-intensive threads on the GPU. Our new sorting architecture provides multiple memory interfaces on the same PC -- a fast and dedicated memory interface on the GPU along with the main memory interface for CPU computations. As a result, we achieve higher memory bandwidth as compared to CPU-based algorithms running on commodity PCs. Our approach takes into account the limited communication bandwidth between the CPU and the GPU, and reduces the data communication between the two processors. Our algorithm also improves the performance of disk transfers and achieves close to peak I/O performance. We have tested the performance of our algorithm on the SortBenchmark and applied it to large databases composed of a few hundred Gigabytes of data. Our results on a 3 GHz Pentium IV PC with $300 NVIDIA 7800 GT GPU indicate a significant performance improvement over optimized CPU-based algorithms on high-end PCs with 3.6 GHz Dual Xeon processors. Our implementation is able to outperform the current high-end PennySort benchmark and results in a higher performance to price ratio. Overall, our results indicate that using a GPU as a co-processor can significantly improve the performance of sorting algorithms on large databases.
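The pipeline the paper accelerates is built on classic two-phase external merge sort. Below is a minimal CPU-only sketch of that skeleton, with Python's built-in sort standing in for the GPU run-sorting phase; the file handling, the `chunk_size` parameter, and the assumption of newline-terminated records are illustrative, not details from the paper.

```python
# Two-phase external merge sort skeleton. In the paper, phase 1 (run
# sorting) is offloaded to the GPU and overlapped with disk I/O.
import heapq
import os
import tempfile

def sort_run(records):
    # Phase 1: sort one memory-sized chunk (the GPU's job in the paper).
    return sorted(records)

def external_sort(input_path, output_path, chunk_size=1_000_000):
    run_paths = []
    # Phase 1: read chunks, sort each, spill sorted runs to disk.
    with open(input_path) as f:
        while True:
            chunk = [line for _, line in zip(range(chunk_size), f)]
            if not chunk:
                break
            fd, path = tempfile.mkstemp(suffix=".run")
            with os.fdopen(fd, "w") as run:
                run.writelines(sort_run(chunk))
            run_paths.append(path)
    # Phase 2: k-way merge of all sorted runs (I/O-limited on the CPU).
    files = [open(p) for p in run_paths]
    with open(output_path, "w") as out:
        out.writelines(heapq.merge(*files))
    for f, p in zip(files, run_paths):
        f.close()
        os.remove(p)
```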

493 citations


Journal ArticleDOI
TL;DR: The authors find that companies funded by more experienced VCs are more likely to go public, and that sorting is almost twice as important as influence for the difference in IPO rates. Sorting creates an endogeneity problem, but a structural model based on a two-sided matching model can exploit the characteristics of the other agents in the market to separately identify and estimate influence and sorting.
Abstract: I find that companies funded by more experienced VCs are more likely to go public. This follows both from the direct influence of more experienced VCs and from sorting in the market, which leads experienced VCs to invest in better companies. Sorting creates an endogeneity problem, but a structural model based on a Two-Sided Matching model is able to exploit the characteristics of the other agents in the market to separately identify and estimate influence and sorting. Both effects are found to be significant, but sorting is almost twice as important as influence for the difference in IPO rates.

394 citations


Journal ArticleDOI
01 Oct 2006
TL;DR: A 1.375-approximation algorithm for sorting by transpositions is provided, based on a new upper bound on the diameter of 3-permutations, and some new results regarding the transposition diameter are presented.
Abstract: Sorting permutations by transpositions is an important problem in genome rearrangements. A transposition is a rearrangement operation in which a segment is cut out of the permutation and pasted in a different location. The complexity of this problem is still open and it has been a 10-year-old open problem to improve the best known 1.5-approximation algorithm. In this paper, we provide a 1.375-approximation algorithm for sorting by transpositions. The algorithm is based on a new upper bound on the diameter of 3-permutations. In addition, we present some new results regarding the transposition diameter: We improve the lower bound for the transposition diameter of the symmetric group and determine the exact transposition diameter of simple permutations.
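For concreteness, the rearrangement operation itself is easy to state in code. The following hypothetical helper (names and indexing conventions are mine, not the paper's) cuts a segment out of a permutation and pastes it at another position:

```python
# A transposition cuts the segment pi[i:j] out of the permutation and
# pastes it before former position k (k outside [i, j]).
def apply_transposition(pi, i, j, k):
    """Move segment pi[i:j] so it starts at former index k (k <= i or k >= j)."""
    assert 0 <= i < j <= len(pi) and (k <= i or k >= j)
    segment = pi[i:j]
    rest = pi[:i] + pi[j:]
    insert_at = k if k <= i else k - (j - i)
    return rest[:insert_at] + segment + rest[insert_at:]

# Example: one transposition sorts (0, 2, 3, 1):
# cut segment (2, 3) at [1:3] and paste it before index 4.
print(apply_transposition((0, 2, 3, 1), 1, 3, 4))  # (0, 1, 2, 3)
```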

217 citations


Journal ArticleDOI
TL;DR: This work examines the performance of different versions of Gillespie's stochastic simulation algorithm when applied to several biochemical models and proposes a new algorithm called the sorting direct method that maintains a loosely sorted order of the reactions as the simulation executes.
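A hedged sketch of the idea behind the sorting direct method named above: run Gillespie's direct method, but keep the reaction list loosely sorted by bubbling each fired reaction one slot toward the front, so the linear search for the next reaction tends to terminate early. Helper names and data-structure choices below are illustrative, not the authors' implementation.

```python
import random

def sorting_direct_step(reactions, propensity, state, t):
    """One SSA step; `reactions` is reordered in place (loose sort)."""
    a = [propensity(r, state) for r in reactions]
    a0 = sum(a)
    if a0 == 0.0:
        return None  # no reaction can fire
    t += random.expovariate(a0)          # time to next reaction
    target = random.uniform(0.0, a0)     # pick a reaction by linear search
    acc = 0.0
    for idx, ai in enumerate(a):
        acc += ai
        if acc >= target:
            break
    fired = reactions[idx]
    # Loose sort: bubble the fired reaction one slot toward the front,
    # so frequently firing reactions drift to where the search starts.
    if idx > 0:
        reactions[idx - 1], reactions[idx] = reactions[idx], reactions[idx - 1]
    return fired, t
```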

211 citations


Journal ArticleDOI
TL;DR: Overall, amplitude sorting performed better than phase angle sorting for 33 of the 35 patients and equally well for two patients who were immobilized with a stereotactic body frame and an abdominal compression plate, suggesting a stronger relationship between internal motion and amplitude.
Abstract: Respiratory motion can cause significant dose delivery errors in conformal radiation therapy for thoracic and upper abdominal tumors. Four-dimensional computed tomography (4D CT) has been proposed to provide the image data necessary to model tumor motion and consequently reduce these errors. The purpose of this work was to compare 4D CT reconstruction methods using amplitude sorting and phase angle sorting. A 16-slice CT scanner was operated in cine mode to acquire 25 scans consecutively at each couch position through the thorax. The patient underwent synchronized external respiratory measurements. The scans were sorted into 12 phases based, respectively, on the amplitude and direction (inhalation or exhalation) or on the phase angle (0-360 degrees) of the external respiratory signal. With the assumption that lung motion is largely proportional to the measured respiratory amplitude, the variation in amplitude corresponds to the variation in motion for each phase. A smaller variation in amplitude would associate with an improved reconstructed image. Air content, defined as the amount of air within the lungs, bronchi, and trachea in a 16-slice CT segment and used by our group as a surrogate for internal motion, was correlated to the respiratory amplitude and phase angle throughout the lungs. For the 35 patients who underwent quiet breathing, images (similar to those used for treatment planning) and animations (used to display respiratory motion) generated using amplitude sorting displayed fewer reconstruction artifacts than those generated using phase angle sorting. The variations in respiratory amplitude were significantly smaller (P < 0.001) with amplitude sorting than those with phase angle sorting. The subdivision of the breathing cycle into more (finer) phases improved the consistency in respiratory amplitude for amplitude sorting, but not for phase angle sorting. For 33 of the 35 patients, the air content showed significantly improved (P < 0.001) correlation with the respiratory amplitude than with the phase angle, suggesting a stronger relationship between internal motion and amplitude. Overall, amplitude sorting performed better than phase angle sorting for 33 of the 35 patients and equally well for two patients who were immobilized with a stereotactic body frame and an abdominal compression plate.
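The two sorting rules being compared reduce to a few lines each: phase sorting bins every scan by the phase angle of the external signal, while amplitude sorting bins by the signal amplitude together with breathing direction. The sketch below is illustrative only; the bin count, the gradient-based direction detection, and the helper names are assumptions, not the authors' implementation.

```python
import numpy as np

def sort_by_phase(phase_deg, n_bins=12):
    """Bin scans by respiratory phase angle (0-360 degrees)."""
    return ((np.asarray(phase_deg) % 360.0) / (360.0 / n_bins)).astype(int)

def sort_by_amplitude(amplitude, n_bins=12):
    """Bin scans by amplitude plus breathing direction (inhale/exhale).

    Half the bins cover inhalation, half exhalation (n_bins must be even).
    """
    amp = np.asarray(amplitude, dtype=float)
    half = n_bins // 2
    lo, hi = amp.min(), amp.max()
    level = np.clip(((amp - lo) / (hi - lo + 1e-12) * half).astype(int),
                    0, half - 1)
    inhaling = np.gradient(amp) >= 0  # direction of the external signal
    return np.where(inhaling, level, n_bins - 1 - level)
```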

205 citations


Proceedings ArticleDOI
11 Nov 2006
TL;DR: A memory model is presented to analyze and improve the performance of scientific algorithms on graphics processing units (GPUs); it incorporates many characteristics of GPU architectures, including smaller cache sizes and 2D block representations, and uses the 3C's model to analyze cache misses.
Abstract: We present a memory model to analyze and improve the performance of scientific algorithms on graphics processing units (GPUs). Our memory model is based on texturing hardware, which uses a 2D block-based array representation to perform the underlying computations. We incorporate many characteristics of GPU architectures, including smaller cache sizes and 2D block representations, and use the 3C's model to analyze the cache misses. Moreover, we present techniques to improve the performance of nested loops on GPUs. In order to demonstrate the effectiveness of our model, we highlight its performance on three memory-intensive scientific applications - sorting, fast Fourier transform and dense matrix-multiplication. In practice, our cache-efficient algorithms for these applications are able to achieve memory throughput of 30-50 GB/s on an NVIDIA 7900 GTX GPU. We also compare our results with prior GPU-based and CPU-based implementations on high-end processors. In practice, we are able to achieve a 2-5x performance improvement.
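A small sketch of the 2D block-based array layout that GPU texturing hardware uses and that the paper's memory model analyzes. The block dimensions below are illustrative; real cache-block sizes are hardware specific.

```python
def blocked_index(x, y, width, block_w=8, block_h=8):
    """Linear memory offset of element (x, y) in a 2D block layout.

    Elements inside a block_w x block_h block are contiguous; blocks
    are laid out in row-major order. Assumes width % block_w == 0.
    """
    blocks_per_row = width // block_w
    block_id = (y // block_h) * blocks_per_row + (x // block_w)
    within = (y % block_h) * block_w + (x % block_w)
    return block_id * (block_w * block_h) + within

# Neighboring (x, y) and (x+1, y) usually land in the same block, which
# is why 2D-local access patterns get good cache behavior on GPUs.
```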

203 citations


Journal ArticleDOI
TL;DR: An integrated detection and separation approach streamlines microfluidic cell sorting and minimizes the optical and feedback complexity commonly associated with extant platforms.
Abstract: Effective methods for manipulating, isolating and sorting cells and particles are essential for the development of microfluidic-based life science research and diagnostic platforms. We demonstrate an integrated optical platform for cell and particle sorting in microfluidic structures. Fluorescent-dyed particles are excited using an integrated optical waveguide network within micro-channels. A diode-bar optical trapping scheme guides the particles across the waveguide/micro-channel structures and selectively sorts particles based upon their fluorescent signature. This integrated detection and separation approach streamlines microfluidic cell sorting and minimizes the optical and feedback complexity commonly associated with extant platforms.

195 citations


Journal ArticleDOI
TL;DR: The results showed that sorting combined with verbalisation led to meaningful and consistent product sensory mapping, whatever the panelist's level of training.

174 citations


Journal ArticleDOI
TL;DR: This review highlights important contributions where flow cytometric cell sorting was used for physiological research, protein engineering, and cell engineering, specifically emphasizing selection of overproducing cell lines, and draws conclusions concerning the impact of cell sorting on inverse metabolic engineering and systems biology.
Abstract: Due to its unique capability to analyze a large number of single cells for several parameters simultaneously, flow cytometry has changed our understanding of the behavior of cells in culture and of the population dynamics even of clonal populations. The potential of this method for biotechnological research, which is based on populations of living cells, was soon appreciated. Sorting applications, however, are still less frequent than one would expect with regard to their potential. This review highlights important contributions where flow cytometric cell sorting was used for physiological research, protein engineering, and cell engineering, specifically emphasizing selection of overproducing cell lines. Finally, conclusions are drawn concerning the impact of cell sorting on inverse metabolic engineering and systems biology.

161 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigated continuous separation and size sorting of particles and blood cells suspended in a microchannel flow due to an acoustic force, both numerically and experimentally, and found good agreement between the measured particle trajectories and those obtained by numerical simulations, up to a fitting parameter.

Abstract: Continuous separation and size sorting of particles and blood cells suspended in a microchannel flow due to an acoustic force are investigated both numerically and experimentally. The measured particle trajectories in a microchannel flow subjected to the acoustic force agree well with those obtained by the numerical simulations, up to a fitting parameter. High separation efficiency for particles and blood cells, particularly in a three-stage microdevice (up to 99.975%), leads us to believe that the device can be developed into a commercially useful set-up. The novel particle size sorting microdevice provides an opportunity to replace rather expensive existing devices based on specific chemical bonding with an ultrasound cell size sorter that can be considerably improved by adding many stages for multistage size sorting.

150 citations


Journal ArticleDOI
Goetz Graefe1
TL;DR: This survey collects many of the sorting techniques that are publicly known but not readily available in the research literature, for easy reference by students, researchers, and product developers.
Abstract: Most commercial database systems do (or should) exploit many sorting techniques that are publicly known, but not readily available in the research literature. These techniques improve both sort performance on modern computer systems and the ability to adapt gracefully to resource fluctuations in multiuser operations. This survey collects many of these techniques for easy reference by students, researchers, and product developers. It covers in-memory sorting, disk-based external sorting, and considerations that apply specifically to sorting in database systems.

Journal ArticleDOI
TL;DR: A new retrospective gating technique with sorting based on the amplitude of the motion trace is presented and significant improvement using the amplitude‐sorting technique was observed, particularly when testing nonperiodic motion functions.
Abstract: Image quality of CT scans suffers when objects undergo motion. Respiratory motion causes artifacts that prevent adequate visualization of anatomy. 4D-CT is a method in which image reconstruction of moving objects is retrospectively gated according to the recorded phase information of the monitored motion pattern. Although several groups have investigated the use of 4D-CT in radiotherapy, little has been detailed with regard to the sorting method. We present a new retrospective gating technique with sorting based on the amplitude of the motion trace. This method is compared to previously developed methods that sort based on phase. A 16-slice CT scanner (Sensation 16, Siemens Medical Solutions, Erlangen, Germany) was used to acquire images of two phantoms on a motion platform moving in two dimensions. The motion was monitored using a strain gauge inserted inside an adjustable belt. 180° interpolation was used for reconstruction after gating. Significant improvement using the amplitude sorting technique was observed, particularly when testing non-periodic motion functions.

Proceedings ArticleDOI
01 Sep 2006
TL;DR: It is demonstrated that high-quality SAH based acceleration structures can be constructed quickly enough to make them a viable option for interactive ray tracing of dynamic scenes, and the resulting trees are almost as good as those produced by a sorting-based SAH builder as measured by ray tracing time.
Abstract: Construction of effective acceleration structures for ray tracing is a well studied problem. The highest quality acceleration structures are generally agreed to be those built using greedy cost optimization based on a surface area heuristic (SAH). This technique is most often applied to the construction of kd-trees, as in this work, but is equally applicable to the construction of other hierarchical acceleration structures. Unfortunately, SAH-optimized data structure construction has previously been too slow to allow per-frame rebuilding for interactive ray tracing of dynamic scenes, leading to the use of lower-quality acceleration structures for this application. The goal of this paper is to demonstrate that high-quality SAH based acceleration structures can be constructed quickly enough to make them a viable option for interactive ray tracing of dynamic scenes. We present a scanning-based algorithm for choosing kd-tree split planes that are close to optimal with respect to the SAH criteria. Our approach approximates the SAH cost function across the spatial domain with a piecewise quadratic function with bounded error and picks minima from this approximation. This algorithm takes full advantage of SIMD operations (e.g., SSE) and has favorable memory access patterns. In practice this algorithm is faster than sorting-based SAH build algorithms with the same asymptotic time complexity, and is competitive with non-SAH build algorithms which produce lower-quality trees. The resulting trees are almost as good as those produced by a sorting-based SAH builder as measured by ray tracing time. For a test scene with 180k polygons our system builds a high-quality kd-tree in 0.26 seconds that only degrades ray tracing time by 3.6% compared to a full quality tree.
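The SAH cost being optimized has a compact closed form. Below is a direct (exact but slow) evaluation of the cost of one candidate split, the quantity the paper's scanning builder approximates with piecewise quadratics; the traversal and intersection constants are illustrative assumptions.

```python
def sah_cost(split, bounds_min, bounds_max, axis, n_left, n_right,
             c_traverse=1.0, c_intersect=1.5):
    """Expected cost of splitting an axis-aligned node at `split` on `axis`."""
    def surface_area(mn, mx):
        dx, dy, dz = (mx[i] - mn[i] for i in range(3))
        return 2.0 * (dx * dy + dy * dz + dz * dx)

    left_max = list(bounds_max);  left_max[axis] = split
    right_min = list(bounds_min); right_min[axis] = split
    sa = surface_area(bounds_min, bounds_max)
    # Probability a ray hitting the node also hits each child is
    # proportional to the child's surface area.
    p_left = surface_area(bounds_min, left_max) / sa
    p_right = surface_area(right_min, bounds_max) / sa
    return c_traverse + c_intersect * (p_left * n_left + p_right * n_right)
```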

Proceedings ArticleDOI
25 Apr 2006
TL;DR: This paper presents a novel approach for parallel sorting on stream processing architectures based on adaptive bitonic sorting that achieves the optimal time complexity O((n log n)/p) and presents an implementation on modern programmable graphics hardware (GPUs).
Abstract: In this paper, we present a novel approach for parallel sorting on stream processing architectures. It is based on adaptive bitonic sorting. For sorting n values utilizing p stream processor units, this approach achieves the optimal time complexity O((n log n)/p). While this makes our approach competitive with common sequential sorting algorithms not only from a theoretical viewpoint, it is also very fast from a practical viewpoint. This is achieved by using efficient linear stream memory accesses (and by combining the optimal time approach with algorithms optimized for small input sequences). We present an implementation on modern programmable graphics hardware (GPUs). On GPUs, our optimal parallel sorting approach has shown to be remarkably faster than sequential sorting on the CPU, and it is also faster than previous non-optimal sorting approaches on the GPU for sufficiently large input sequences. Because of the excellent scalability of our algorithm with the number of stream processor units p (up to n/log² n or even n/log n units, depending on the stream architecture), our approach profits heavily from the trend of increasing number of fragment processor units on GPUs, so that we can expect further speed improvement with upcoming GPU generations.
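Adaptive bitonic sorting builds on the classic bitonic network. The serial sketch below shows the plain (non-adaptive) network to make the data-parallel structure visible: within each inner pass, every compare-exchange over i is independent, which is what maps onto stream processor units. It assumes a power-of-two input length.

```python
def bitonic_sort(a):
    n = len(a)
    assert n & (n - 1) == 0, "length must be a power of two"
    k = 2
    while k <= n:            # size of bitonic sequences being merged
        j = k // 2
        while j >= 1:        # compare-exchange distance
            for i in range(n):      # every iteration here is independent
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a

print(bitonic_sort([7, 3, 1, 8, 6, 2, 5, 4]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```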

Journal ArticleDOI
TL;DR: In this paper, a new dynamic and efficient bounding volume hierarchy for breakable objects undergoing structured and/or unstructured motion is proposed, which leads to significant advantages in terms of execution speed.

Journal ArticleDOI
TL;DR: In this paper, the authors present a theory of entrepreneurial entry and exit decisions, showing that each entrant in a large market is more efficient than any entrepreneur in a smaller market because competition is endogenously more intense in larger markets.
Abstract: We present a theory of entrepreneurial entry (and exit) decisions. Knowing their own managerial talent, entrepreneurs decide which market to enter, where markets differ in size. We obtain a striking sorting result: Each entrant in a large market is more efficient than any entrepreneur in a smaller market because competition is endogenously more intense in larger markets. This result continues to hold when entrepreneurs can export their output to other markets, thereby incurring a unit transport cost or tariff. The sorting and price competition effects imply that the number of entrants (and hence product variety) may actually be smaller in larger markets. In the stochastic dynamic extension of the model, we show that the churning rate of entrepreneurs is higher in larger markets. (JEL: L11, L13, M13, F12)

Proceedings ArticleDOI
20 Aug 2006
TL;DR: A simple sampling algorithm to efficiently detect distance-based outliers in domains where each and every distance computation is very expensive.
Abstract: An effective approach to detecting anomalous points in a data set is distance-based outlier detection. This paper describes a simple sampling algorithm to efficiently detect distance-based outliers in domains where each and every distance computation is very expensive. Unlike any existing algorithms, the sampling algorithm requires a fixed number of distance computations and can return good results with accuracy guarantees. The most computationally expensive aspect of estimating the accuracy of the result is sorting all of the distances computed by the sampling algorithm. The experimental study on two expensive domains as well as ten additional real-life datasets demonstrates both the efficiency and effectiveness of the sampling algorithm in comparison with the state-of-the-art algorithm, and the reliability of the accuracy guarantees.
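A hedged sketch of the fixed-budget sampling idea described above: score each point by its distance to its k-th nearest neighbor within a small random sample rather than within the whole data set, so exactly n × sample_size expensive distance computations are performed. Parameter names are assumptions, and the paper's accuracy guarantees are not reproduced here.

```python
import heapq
import random

def sampled_outliers(points, dist, sample_size=20, k=5, top_n=10):
    """Return the top_n points with the largest sampled k-NN distance."""
    sample = random.sample(points, sample_size)
    def score(p):
        # Distance to the k-th nearest sampled point: the only place
        # the expensive dist() is called, sample_size times per point.
        return heapq.nsmallest(k, (dist(p, s) for s in sample))[-1]
    return heapq.nlargest(top_n, points, key=score)
```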

Journal ArticleDOI
TL;DR: In this article, the authors extend Mousseau et al. (2003) to incorporate information about the confidence attached to each assignment example, hence providing inconsistency resolutions that the DMs are most likely to accept.
Abstract: Sorting models consist of assigning alternatives evaluated on several criteria to ordered categories. To implement such models it is necessary to set the values of the preference parameters used in the model. Rather than fixing the values of these parameters directly, a usual approach is to infer these values from assignment examples provided by the decision maker (DM), i.e., alternatives for which (s)he specifies a required category. However, assignment examples provided by DMs can be inconsistent, i.e., may not match the sorting model. In such situations, it is necessary to support the DMs in the resolution of this inconsistency. In this paper, we extend algorithms from Mousseau et al. (2003) that calculate different ways to remove assignment examples so that the information can be represented in the sorting model. The extension concerns the possibility to relax (rather than to delete) assignment examples. These algorithms incorporate information about the confidence attached to each assignment example, hence providing inconsistency resolutions that the DMs are most likely to accept.

Patent
03 Aug 2006
TL;DR: In this paper, a metal sorting device including an X-ray tube, a dual energy detector array, a microprocessor, and an air ejector array is described. It detects the presence of samples in the X-ray sensing region and initiates identifying and sorting the samples; at a specific time, the device activates an array of air ejectors located at specific positions in order to place the sample in the proper collection bin.
Abstract: Disclosed herein is a metal sorting device including an X-ray tube, a dual energy detector array, a microprocessor, and an air ejector array. The device senses the presence of samples in the X-ray sensing region and initiates identifying and sorting the samples. After identifying and classifying the category of a sample, the device activates, at a specific time, an array of air ejectors located at specific positions in order to place the sample in the proper collection bin.

Journal ArticleDOI
TL;DR: Functional analysis shows that when both modes of Rsp5 association with Hse1 are altered, sorting of cargo that requires efficient ubiquitination for entry into the MVB is blocked, whereas sorting of Cargo containing an in-frame addition of ubiquitin is normal.
Abstract: Ubiquitinated integral membrane proteins are delivered to the interior of the lysosome/vacuole for degradation. This process relies on specific ubiquitination of potential cargo and recognition of that Ub-cargo by sorting receptors at multiple compartments. We show that the endosomal Hse1-Vps27 sorting receptor binds to ubiquitin peptidases and the ubiquitin ligase Rsp5. Hse1 is linked to Rsp5 directly via a PY element within its C-terminus and through a novel protein Hua1, which recruits a complex of Rsp5, Rup1, and Ubp2. The SH3 domain of Hse1 also binds to the deubiquitinating protein Ubp7. Functional analysis shows that when both modes of Rsp5 association with Hse1 are altered, sorting of cargo that requires efficient ubiquitination for entry into the MVB is blocked, whereas sorting of cargo containing an in-frame addition of ubiquitin is normal. Further deletion of Ubp7 restores sorting of cargo when the Rsp5:Hse1 interaction is compromised suggesting that both ubiquitin ligases and peptidases associate with the Hse1-Vps27 sorting complex to control the ubiquitination status and sorting efficiency of cargo proteins. Additionally, we find that disruption of UBP2 and RUP1 inhibits MVB sorting of some cargos suggesting that Rsp5 requires association with Ubp2 to properly ubiquitinate cargo for efficient MVB sorting.

Book ChapterDOI
TL;DR: In this article, the authors assess the impact of errors in sorting and identifying macroinvertebrate samples collected and analysed using different protocols (e.g. STAR-AQEM, RIVPACS).
Abstract: This study assesses the impact of errors in sorting and identifying macroinvertebrate samples collected and analysed using different protocols (e.g. STAR-AQEM, RIVPACS). The study is based on the auditing scheme implemented in the EU-funded project STAR and presents the first attempt at analysing the audit data. Data from 10 participating countries are analysed with regard to the impact of sorting and identification errors. These differences are measured in the form of gains and losses at each level of audit for 120 samples. Based on gains and losses to the primary results, qualitative binary taxa lists were deducted for each level of audit for a subset of 72 data sets. Between these taxa lists the taxonomic similarity and the impact of differences on selected metrics common to stream assessment were analysed. The results of our study indicate that in all methods used, a considerable amount of sorting and identification error could be detected. This total impact is reflected in most functional metrics. In some metrics indicative of taxonomic richness, the total impact of differences is not directly reflected in differences in metric scores. The results stress the importance of implementing quality control mechanisms in macroinvertebrate assessment schemes.

Posted Content
TL;DR: The authors focus primarily on the sorting of parents and children into schools and classrooms; the equilibrium level of sorting (along parental income and child peer quality dimensions) then depends on both the specifics of how education production works and the overall characteristics of the general equilibrium environment within which schools operate.
Abstract: Any system of primary and secondary schools involves explicit or implicit mechanisms that ration not only financial but also nonfinancial inputs into education production. This chapter focuses primarily on such mechanisms as they relate to the sorting of parents and children into schools and classrooms. Three primary mechanisms are reviewed: (1) sorting that emerges through residential location choices within housing markets that are linked to schools; (2) sorting that arises from parental choices to send children to private rather than public schools; and (3) sorting within schools that results from explicit tracking policies. The equilibrium level of sorting (along parental income and child peer quality dimensions) then depends on both the specifics of how education production works and the overall characteristics of the general equilibrium environment within which schools operate. We review the theoretical as well as the related simulation-based literature in this area and suggest that much potential exists for increasing empirical relevance of the emerging models for policy analysis, particularly as a related empirical literature comes to better terms with the nature of peer effects in education production.

Journal ArticleDOI
TL;DR: All algorithms for sorting linear permutations by transpositions can be used to sort circular permutations, and a new O(n^{3/2}√(log n)) 1.5-approximation algorithm is presented, which is considerably simpler than previously reported.
Abstract: An important problem in genome rearrangements is sorting permutations by transpositions. The complexity of the problem is still open, and two rather complicated 1.5-approximation algorithms for sorting linear permutations are known (Bafna and Pevzner, 98 and Christie, 99). The fastest known algorithm is the quadratic algorithm of Bafna and Pevzner. In this paper, we observe that the problem of sorting circular permutations by transpositions is equivalent to the problem of sorting linear permutations by transpositions. Hence, all algorithms for sorting linear permutations by transpositions can be used to sort circular permutations. Our main result is a new O(n^{3/2}√(log n)) 1.5-approximation algorithm, which is considerably simpler than the previous ones, and whose analysis is significantly less involved.
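The linear/circular equivalence can be made concrete: a circular permutation can be rotated so a distinguished element comes first, after which sorting the resulting linear permutation by transpositions also sorts the circular one. A hypothetical illustration of the reduction:

```python
def circular_to_linear(pi):
    """Rotate the circular permutation so it starts at its minimum."""
    i = pi.index(min(pi))
    return pi[i:] + pi[:i]

# Different rotations of the same circular permutation reduce to the
# same linear sorting instance:
print(circular_to_linear([2, 0, 1, 3]))  # [0, 1, 3, 2]
print(circular_to_linear([3, 2, 0, 1]))  # [0, 1, 3, 2]
```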

Proceedings ArticleDOI
07 Jun 2006
TL;DR: An efficient algorithm for detecting duplicate regions is proposed and a set of colour-based morphological operations are used to remove isolated mismatches, as well as to fill in missing matches.
Abstract: An efficient algorithm for detecting duplicate regions is proposed in this paper. The basic idea is to segment the input image into blocks and search for blocks with similar intensity patterns using matching techniques. To improve the efficiency, the blocks are sorted based on the concept of k-dimensional tree. The sorting process groups blocks with similar patterns and hence the number of matching operations required for finding the duplicated blocks can be significantly reduced. The matching block detection results are encoded as a color image. This makes it possible to use a set of colour-based morphological operations to remove isolated mismatches, as well as to fill in missing matches. The experiments conducted show the effectiveness of the proposed algorithm.
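The grouping-by-sorting core is easy to sketch. The toy version below slides a window over the image, sorts the flattened blocks so identical intensity patterns become adjacent, and reports exact repeats; the paper's k-dimensional-tree ordering, similarity matching, and colour-based morphological clean-up are omitted, and the block size is an assumption.

```python
import numpy as np

def find_duplicate_blocks(img, b=8):
    """Return pairs of top-left positions whose b-by-b blocks are identical."""
    blocks = []
    for y in range(img.shape[0] - b + 1):
        for x in range(img.shape[1] - b + 1):
            blocks.append((img[y:y+b, x:x+b].ravel().tobytes(), (y, x)))
    blocks.sort(key=lambda t: t[0])   # grouping step: similar blocks adjacent
    return [(p1, p2)
            for (v1, p1), (v2, p2) in zip(blocks, blocks[1:]) if v1 == v2]
```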

Proceedings ArticleDOI
01 Jan 2006
TL;DR: This work presents and applies infinite Gaussian mixture modeling, a non-parametric Bayesian method, to the problem of spike sorting, and compares this approach to using penalized log likelihood to select the best from multiple finite mixture models trained by expectation maximization.
Abstract: In this work we present and apply infinite Gaussian mixture modeling, a non-parametric Bayesian method, to the problem of spike sorting. As this approach is Bayesian, it allows us to integrate prior knowledge about the problem in a principled way. Because it is non-parametric we are able to avoid model selection, a difficult problem that most current spike sorting methods do not address. We compare this approach to using penalized log likelihood to select the best from multiple finite mixture models trained by expectation maximization. We show favorable offline sorting results on real data and discuss ways to extend our model to online applications. Index Terms — Spike sorting, mixture modeling, infinite mixture model, non-parametric Bayesian modeling, Chinese restaurant process, Bayesian inference, Markov chain Monte Carlo, expectation maximization, Gibbs sampling.
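The Chinese restaurant process listed in the index terms is what gives the infinite mixture its prior over cluster assignments: a spike joins an existing cluster with probability proportional to the cluster's size, or opens a new one with probability proportional to a concentration parameter alpha. A prior-only sketch (the full sorter also needs the Gaussian likelihoods and Gibbs updates); all names are illustrative.

```python
import random

def crp_assignments(n_spikes, alpha=1.0):
    """Sample cluster labels for n_spikes under a CRP(alpha) prior."""
    counts = []                      # spikes per cluster
    assignments = []
    for _ in range(n_spikes):
        total = sum(counts) + alpha
        r = random.uniform(0.0, total)
        acc = 0.0
        for c, size in enumerate(counts):
            acc += size
            if r < acc:              # join existing cluster c
                counts[c] += 1
                assignments.append(c)
                break
        else:
            counts.append(1)         # open a new cluster
            assignments.append(len(counts) - 1)
    return assignments
```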

Book ChapterDOI
10 Apr 2006
TL;DR: This work presents efficient methods to compute the exact rank vector even for large-scale web graphs in only a few minutes and iteration steps, with intrinsic support for incremental web crawling, and without the need for page sorting/reordering or for sharing global rank information.
Abstract: PageRank inherently is massively parallelizable and distributable, as a result of the web's strict host-based link locality. We show that the Gauss-Seidel iterative method can actually be applied in such a parallel ranking scenario in order to improve convergence. By introducing a two-dimensional web model and by adapting the PageRank to this environment, we present efficient methods to compute the exact rank vector even for large-scale web graphs in only a few minutes and iteration steps, with intrinsic support for incremental web crawling, and without the need for page sorting/reordering or for sharing global rank information.
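The convergence benefit comes from Gauss-Seidel's in-place updates: within one sweep, each page's freshly computed rank immediately feeds into later pages' updates, unlike the Jacobi-style power iteration. A minimal sketch under simplifying assumptions (no dangling-node handling; `in_links` and `out_degree` are hypothetical inputs, not the paper's API):

```python
def pagerank_gauss_seidel(in_links, out_degree, n, d=0.85, sweeps=50):
    """Gauss-Seidel PageRank: in_links maps page -> iterable of in-neighbors."""
    rank = {p: 1.0 / n for p in range(n)}
    for _ in range(sweeps):
        for p in range(n):
            # In-place update: pages later in the sweep see this new value.
            rank[p] = (1.0 - d) / n + d * sum(
                rank[q] / out_degree[q] for q in in_links.get(p, ())
            )
    return rank
```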

Patent
25 Jul 2006
TL;DR: In this article, a method, system, and computer program product are disclosed for automatically matching the profile of unstructured electronic documents to objective sets of criteria, which is accomplished by evaluating text in the documents, comparing it to a set of weighted keyword criteria, generating a rating based on adherence to the criteria, rating and categorizing the results, sorting and viewing the results based on user defined criteria.
Abstract: A method, system, and computer program product are disclosed for automatically matching the profile of unstructured electronic documents to objective sets of criteria. The is accomplished by evaluating text in the documents, comparing it to a set of weighted keyword criteria, generating a rating based on adherence to the criteria, rating and categorizing the results, sorting and viewing the results based on user defined criteria.

Journal ArticleDOI
10 Jul 2006
TL;DR: This framework considers the problems of sorting and searching in optimal time while tolerating the largest possible number of memory faults, and designs an O(n log n) time sorting algorithm that can optimally tolerate up to O(√(n log n)) memory faults.
Abstract: We investigate the problem of reliable computation in the presence of faults that may arbitrarily corrupt memory locations. In this framework, we consider the problems of sorting and searching in optimal time while tolerating the largest possible number of memory faults. In particular, we design an O(n log n) time sorting algorithm that can optimally tolerate up to O(√(n log n)) memory faults. In the special case of integer sorting, we present an algorithm with linear expected running time that can tolerate O(√n) faults. We also present a randomized searching algorithm that can optimally tolerate up to O(log n) memory faults in O(log n) expected time, and an almost optimal deterministic searching algorithm that can tolerate O((log n)^{1−ε}) faults, for any small positive constant ε, in O(log n) worst-case time. All these results improve over previous bounds.
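For intuition about the fault model, the baseline resilience primitive is majority replication: a value kept in 2δ+1 copies survives up to δ arbitrary corruptions. The paper's algorithms do far better than replicating every key; the sketch below only illustrates the model it works in.

```python
from collections import Counter

def reliable_write(value, delta):
    # Copies live in unreliable memory; at most delta may be corrupted.
    return [value] * (2 * delta + 1)

def reliable_read(copies):
    # With at most delta corruptions, the true value holds the majority.
    return Counter(copies).most_common(1)[0][0]

copies = reliable_write(42, delta=2)
copies[0] = 7    # adversarial corruptions, at most delta of them
copies[3] = 99
print(reliable_read(copies))  # 42
```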

Patent
19 Jun 2006
TL;DR: In this article, a radio frequency reader device or tag can be associated with each container used to hold a plurality of mail items destined for a common delivery location; the reader devices can be configured to automatically alert a sorting operator as to whether a mail item belongs in a selected container.
Abstract: Systems and methods are disclosed for sorting and tracking mail items that are sent via a mail system. Such systems improve the efficiency and accuracy of mail systems by utilizing radio frequency identification (RFID) technology to communicate sorting instructions to a sorting operator. In one embodiment, a radio frequency reader device or tag can be associated with each container used to hold a plurality of mail items destined for a common delivery location. By reading information stored on RFID tags associated with mail items to be sorted, the radio frequency reader devices can be configured to automatically alert a sorting operator as to whether a mail item belongs in a selected container. The reader devices can also maintain a record of what mail items have been placed in each such container. This information can be stored in association with tracking information that is generated during the transport of the containers.