
Showing papers on "Sorting" published in 2006


Proceedings ArticleDOI
27 Jun 2006
TL;DR: Overall, the results indicate that using a GPU as a co-processor can significantly improve the performance of sorting algorithms on large databases.
Abstract: We present a novel external sorting algorithm using graphics processors (GPUs) on large databases composed of billions of records and wide keys. Our algorithm uses the data parallelism within a GPU along with task parallelism by scheduling some of the memory-intensive and compute-intensive threads on the GPU. Our new sorting architecture provides multiple memory interfaces on the same PC -- a fast and dedicated memory interface on the GPU along with the main memory interface for CPU computations. As a result, we achieve higher memory bandwidth as compared to CPU-based algorithms running on commodity PCs. Our approach takes into account the limited communication bandwidth between the CPU and the GPU, and reduces the data communication between the two processors. Our algorithm also improves the performance of disk transfers and achieves close to peak I/O performance. We have tested the performance of our algorithm on the SortBenchmark and applied it to large databases composed of a few hundred Gigabytes of data. Our results on a 3 GHz Pentium IV PC with $300 NVIDIA 7800 GT GPU indicate a significant performance improvement over optimized CPU-based algorithms on high-end PCs with 3.6 GHz Dual Xeon processors. Our implementation is able to outperform the current high-end PennySort benchmark and results in a higher performance to price ratio. Overall, our results indicate that using a GPU as a co-processor can significantly improve the performance of sorting algorithms on large databases.
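The pipeline the paper accelerates is built on classic two-phase external merge sort. Below is a minimal CPU-only sketch of that skeleton, with Python's built-in sort standing in for the GPU run-sorting phase; the file handling, the `chunk_size` parameter, and the assumption of newline-terminated records are illustrative, not details from the paper.

```python
# Two-phase external merge sort skeleton. In the paper, phase 1 (run
# sorting) is offloaded to the GPU and overlapped with disk I/O.
import heapq
import os
import tempfile

def sort_run(records):
    # Phase 1: sort one memory-sized chunk (the GPU's job in the paper).
    return sorted(records)

def external_sort(input_path, output_path, chunk_size=1_000_000):
    run_paths = []
    # Phase 1: read chunks, sort each, spill sorted runs to disk.
    with open(input_path) as f:
        while True:
            chunk = [line for _, line in zip(range(chunk_size), f)]
            if not chunk:
                break
            fd, path = tempfile.mkstemp(suffix=".run")
            with os.fdopen(fd, "w") as run:
                run.writelines(sort_run(chunk))
            run_paths.append(path)
    # Phase 2: k-way merge of all sorted runs (I/O-limited on the CPU).
    files = [open(p) for p in run_paths]
    with open(output_path, "w") as out:
        out.writelines(heapq.merge(*files))
    for f, p in zip(files, run_paths):
        f.close()
        os.remove(p)
```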

493 citations


Journal ArticleDOI
TL;DR: The authors find that companies funded by more experienced VCs are more likely to go public, and that sorting is almost twice as important as influence for the difference in IPO rates. Sorting creates an endogeneity problem, but a structural model based on a two-sided matching model can exploit the characteristics of the other agents in the market to separately identify and estimate influence and sorting.
Abstract: I find that companies funded by more experienced VCs are more likely to go public. This follows both from the direct influence of more experienced VCs and from sorting in the market, which leads experienced VCs to invest in better companies. Sorting creates an endogeneity problem, but a structural model based on a Two-Sided Matching model is able to exploit the characteristics of the other agents in the market to separately identify and estimate influence and sorting. Both effects are found to be significant, but sorting is almost twice as important as influence for the difference in IPO rates.

394 citations


Journal ArticleDOI
01 Oct 2006
TL;DR: A 1.375-approximation algorithm for sorting by transpositions is provided, based on a new upper bound on the diameter of 3-permutations, and some new results regarding the transposition diameter are presented.
Abstract: Sorting permutations by transpositions is an important problem in genome rearrangements. A transposition is a rearrangement operation in which a segment is cut out of the permutation and pasted in a different location. The complexity of this problem is still open and it has been a 10-year-old open problem to improve the best known 1.5-approximation algorithm. In this paper, we provide a 1.375-approximation algorithm for sorting by transpositions. The algorithm is based on a new upper bound on the diameter of 3-permutations. In addition, we present some new results regarding the transposition diameter: We improve the lower bound for the transposition diameter of the symmetric group and determine the exact transposition diameter of simple permutations.
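For concreteness, the rearrangement operation itself is easy to state in code. The following hypothetical helper (names and indexing conventions are mine, not the paper's) cuts a segment out of a permutation and pastes it at another position:

```python
# A transposition cuts the segment pi[i:j] out of the permutation and
# pastes it before former position k (k outside [i, j]).
def apply_transposition(pi, i, j, k):
    """Move segment pi[i:j] so it starts at former index k (k <= i or k >= j)."""
    assert 0 <= i < j <= len(pi) and (k <= i or k >= j)
    segment = pi[i:j]
    rest = pi[:i] + pi[j:]
    insert_at = k if k <= i else k - (j - i)
    return rest[:insert_at] + segment + rest[insert_at:]

# Example: one transposition sorts (0, 2, 3, 1):
# cut segment (2, 3) at [1:3] and paste it before index 4.
print(apply_transposition((0, 2, 3, 1), 1, 3, 4))  # (0, 1, 2, 3)
```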

217 citations


Journal ArticleDOI
TL;DR: This work examines the performance of different versions of Gillespie's stochastic simulation algorithm when applied to several biochemical models and proposes a new algorithm called the sorting direct method that maintains a loosely sorted order of the reactions as the simulation executes.
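A hedged sketch of the idea behind the sorting direct method named above: run Gillespie's direct method, but keep the reaction list loosely sorted by bubbling each fired reaction one slot toward the front, so the linear search for the next reaction tends to terminate early. Helper names and data-structure choices below are illustrative, not the authors' implementation.

```python
import random

def sorting_direct_step(reactions, propensity, state, t):
    """One SSA step; `reactions` is reordered in place (loose sort)."""
    a = [propensity(r, state) for r in reactions]
    a0 = sum(a)
    if a0 == 0.0:
        return None  # no reaction can fire
    t += random.expovariate(a0)          # time to next reaction
    target = random.uniform(0.0, a0)     # pick a reaction by linear search
    acc = 0.0
    for idx, ai in enumerate(a):
        acc += ai
        if acc >= target:
            break
    fired = reactions[idx]
    # Loose sort: bubble the fired reaction one slot toward the front,
    # so frequently firing reactions drift to where the search starts.
    if idx > 0:
        reactions[idx - 1], reactions[idx] = reactions[idx], reactions[idx - 1]
    return fired, t
```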

211 citations


Journal ArticleDOI
TL;DR: Overall, amplitude sorting performed better than phase angle sorting for 33 of the 35 patients and equally well for two patients who were immobilized with a stereotactic body frame and an abdominal compression plate, suggesting a stronger relationship between internal motion and amplitude.
Abstract: Respiratory motion can cause significant dose delivery errors in conformal radiation therapy for thoracic and upper abdominal tumors. Four-dimensional computed tomography (4D CT) has been proposed to provide the image data necessary to model tumor motion and consequently reduce these errors. The purpose of this work was to compare 4D CT reconstruction methods using amplitude sorting and phase angle sorting. A 16-slice CT scanner was operated in cine mode to acquire 25 scans consecutively at each couch position through the thorax. The patient underwent synchronized external respiratory measurements. The scans were sorted into 12 phases based, respectively, on the amplitude and direction (inhalation or exhalation) or on the phase angle (0-360 degrees) of the external respiratory signal. With the assumption that lung motion is largely proportional to the measured respiratory amplitude, the variation in amplitude corresponds to the variation in motion for each phase. A smaller variation in amplitude would associate with an improved reconstructed image. Air content, defined as the amount of air within the lungs, bronchi, and trachea in a 16-slice CT segment and used by our group as a surrogate for internal motion, was correlated to the respiratory amplitude and phase angle throughout the lungs. For the 35 patients who underwent quiet breathing, images (similar to those used for treatment planning) and animations (used to display respiratory motion) generated using amplitude sorting displayed fewer reconstruction artifacts than those generated using phase angle sorting. The variations in respiratory amplitude were significantly smaller (P < 0.001) with amplitude sorting than those with phase angle sorting. The subdivision of the breathing cycle into more (finer) phases improved the consistency in respiratory amplitude for amplitude sorting, but not for phase angle sorting. For 33 of the 35 patients, the air content showed significantly improved (P < 0.001) correlation with the respiratory amplitude than with the phase angle, suggesting a stronger relationship between internal motion and amplitude. Overall, amplitude sorting performed better than phase angle sorting for 33 of the 35 patients and equally well for two patients who were immobilized with a stereotactic body frame and an abdominal compression plate.
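The two sorting rules being compared reduce to a few lines each: phase sorting bins every scan by the phase angle of the external signal, while amplitude sorting bins by the signal amplitude together with breathing direction. The sketch below is illustrative only; the bin count, the gradient-based direction detection, and the helper names are assumptions, not the authors' implementation.

```python
import numpy as np

def sort_by_phase(phase_deg, n_bins=12):
    """Bin scans by respiratory phase angle (0-360 degrees)."""
    return ((np.asarray(phase_deg) % 360.0) / (360.0 / n_bins)).astype(int)

def sort_by_amplitude(amplitude, n_bins=12):
    """Bin scans by amplitude plus breathing direction (inhale/exhale).

    Half the bins cover inhalation, half exhalation (n_bins must be even).
    """
    amp = np.asarray(amplitude, dtype=float)
    half = n_bins // 2
    lo, hi = amp.min(), amp.max()
    level = np.clip(((amp - lo) / (hi - lo + 1e-12) * half).astype(int),
                    0, half - 1)
    inhaling = np.gradient(amp) >= 0  # direction of the external signal
    return np.where(inhaling, level, n_bins - 1 - level)
```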

205 citations


Proceedings ArticleDOI
11 Nov 2006
TL;DR: A memory model is presented to analyze and improve the performance of scientific algorithms on graphics processing units (GPUs); it incorporates many characteristics of GPU architectures, including smaller cache sizes and 2D block representations, and uses the 3C's model to analyze cache misses.
Abstract: We present a memory model to analyze and improve the performance of scientific algorithms on graphics processing units (GPUs). Our memory model is based on texturing hardware, which uses a 2D block-based array representation to perform the underlying computations. We incorporate many characteristics of GPU architectures, including smaller cache sizes and 2D block representations, and use the 3C's model to analyze the cache misses. Moreover, we present techniques to improve the performance of nested loops on GPUs. In order to demonstrate the effectiveness of our model, we highlight its performance on three memory-intensive scientific applications - sorting, fast Fourier transform and dense matrix-multiplication. In practice, our cache-efficient algorithms for these applications are able to achieve memory throughput of 30-50 GB/s on an NVIDIA 7900 GTX GPU. We also compare our results with prior GPU-based and CPU-based implementations on high-end processors. In practice, we are able to achieve a 2-5x performance improvement.
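A small sketch of the 2D block-based array layout that GPU texturing hardware uses and that the paper's memory model analyzes. The block dimensions below are illustrative; real cache-block sizes are hardware specific.

```python
def blocked_index(x, y, width, block_w=8, block_h=8):
    """Linear memory offset of element (x, y) in a 2D block layout.

    Elements inside a block_w x block_h block are contiguous; blocks
    are laid out in row-major order. Assumes width % block_w == 0.
    """
    blocks_per_row = width // block_w
    block_id = (y // block_h) * blocks_per_row + (x // block_w)
    within = (y % block_h) * block_w + (x % block_w)
    return block_id * (block_w * block_h) + within

# Neighboring (x, y) and (x+1, y) usually land in the same block, which
# is why 2D-local access patterns get good cache behavior on GPUs.
```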

203 citations


Journal ArticleDOI
TL;DR: An integrated detection and separation approach streamlines microfluidic cell sorting and minimizes the optical and feedback complexity commonly associated with extant platforms.
Abstract: Effective methods for manipulating, isolating and sorting cells and particles are essential for the development of microfluidic-based life science research and diagnostic platforms. We demonstrate an integrated optical platform for cell and particle sorting in microfluidic structures. Fluorescent-dyed particles are excited using an integrated optical waveguide network within micro-channels. A diode-bar optical trapping scheme guides the particles across the waveguide/micro-channel structures and selectively sorts particles based upon their fluorescent signature. This integrated detection and separation approach streamlines microfluidic cell sorting and minimizes the optical and feedback complexity commonly associated with extant platforms.

195 citations


Journal ArticleDOI
TL;DR: The results showed that sorting combined with verbalisation led to meaningful and consistent product sensory mapping, whatever the panelist's level of training.

174 citations


Journal ArticleDOI
TL;DR: This review highlights important contributions where flow cytometric cell sorting was used for physiological research, protein engineering, and cell engineering, specifically emphasizing selection of overproducing cell lines, and draws conclusions concerning the impact of cell sorting on inverse metabolic engineering and systems biology.
Abstract: Due to its unique capability to analyze a large number of single cells for several parameters simultaneously, flow cytometry has changed our understanding of the behavior of cells in culture and of the population dynamics even of clonal populations. The potential of this method for biotechnological research, which is based on populations of living cells, was soon appreciated. Sorting applications, however, are still less frequent than one would expect with regard to their potential. This review highlights important contributions where flow cytometric cell sorting was used for physiological research, protein engineering, and cell engineering, specifically emphasizing selection of overproducing cell lines. Finally, conclusions are drawn concerning the impact of cell sorting on inverse metabolic engineering and systems biology.

161 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigated continuous separation and size sorting of particles and blood cells suspended in a microchannel flow due to an acoustic force, both numerically and experimentally, and found good agreement between the measured particle trajectories and those obtained by numerical simulations, up to a fitting parameter.

Abstract: Continuous separation and size sorting of particles and blood cells suspended in a microchannel flow due to an acoustic force are investigated both numerically and experimentally. The measured particle trajectories in a microchannel flow subjected to the acoustic force agree well with those obtained by the numerical simulations, up to a fitting parameter. High separation efficiency for particles and blood cells, particularly in a three-stage microdevice (up to 99.975%), leads us to believe that the device can be developed into a commercially useful set-up. The novel particle size sorting microdevice provides an opportunity to replace rather expensive existing devices based on specific chemical bonding with an ultrasound cell size sorter that can be considerably improved by adding many stages for multistage size sorting.

150 citations


Journal ArticleDOI
Goetz Graefe1
TL;DR: This survey collects many of the sorting techniques that are publicly known but not readily available in the research literature, for easy reference by students, researchers, and product developers.
Abstract: Most commercial database systems do (or should) exploit many sorting techniques that are publicly known, but not readily available in the research literature. These techniques improve both sort performance on modern computer systems and the ability to adapt gracefully to resource fluctuations in multiuser operations. This survey collects many of these techniques for easy reference by students, researchers, and product developers. It covers in-memory sorting, disk-based external sorting, and considerations that apply specifically to sorting in database systems.

Journal ArticleDOI
TL;DR: A new retrospective gating technique with sorting based on the amplitude of the motion trace is presented and significant improvement using the amplitude‐sorting technique was observed, particularly when testing nonperiodic motion functions.
Abstract: Image quality of CT scans suffers when objects undergo motion. Respiratory motion causes artifacts that prevent adequate visualization of anatomy. 4D-CT is a method in which image reconstruction of moving objects is retrospectively gated according to the recorded phase information of the monitored motion pattern. Although several groups have investigated the use of 4D-CT in radiotherapy, little has been detailed with regard to the sorting method. We present a new retrospective gating technique with sorting based on the amplitude of the motion trace. This method is compared to previously developed methods that sort based on phase. A 16-slice CT scanner (Sensation 16, Siemens Medical Solutions, Erlangen, Germany) was used to acquire images of two phantoms on a motion platform moving in two dimensions. The motion was monitored using a strain gauge inserted inside an adjustable belt. 180° interpolation was used for reconstruction after gating. Significant improvement using the amplitude sorting technique was observed, particularly when testing non-periodic motion functions.

Proceedings ArticleDOI
01 Sep 2006
TL;DR: It is demonstrated that high-quality SAH based acceleration structures can be constructed quickly enough to make them a viable option for interactive ray tracing of dynamic scenes, and the resulting trees are almost as good as those produced by a sorting-based SAH builder as measured by ray tracing time.
Abstract: Construction of effective acceleration structures for ray tracing is a well studied problem. The highest quality acceleration structures are generally agreed to be those built using greedy cost optimization based on a surface area heuristic (SAH). This technique is most often applied to the construction of kd-trees, as in this work, but is equally applicable to the construction of other hierarchical acceleration structures. Unfortunately, SAH-optimized data structure construction has previously been too slow to allow per-frame rebuilding for interactive ray tracing of dynamic scenes, leading to the use of lower-quality acceleration structures for this application. The goal of this paper is to demonstrate that high-quality SAH based acceleration structures can be constructed quickly enough to make them a viable option for interactive ray tracing of dynamic scenes. We present a scanning-based algorithm for choosing kd-tree split planes that are close to optimal with respect to the SAH criteria. Our approach approximates the SAH cost function across the spatial domain with a piecewise quadratic function with bounded error and picks minima from this approximation. This algorithm takes full advantage of SIMD operations (e.g., SSE) and has favorable memory access patterns. In practice this algorithm is faster than sorting-based SAH build algorithms with the same asymptotic time complexity, and is competitive with non-SAH build algorithms which produce lower-quality trees. The resulting trees are almost as good as those produced by a sorting-based SAH builder as measured by ray tracing time. For a test scene with 180k polygons our system builds a high-quality kd-tree in 0.26 seconds that only degrades ray tracing time by 3.6% compared to a full quality tree.
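The SAH cost being optimized has a compact closed form. Below is a direct (exact but slow) evaluation of the cost of one candidate split, the quantity the paper's scanning builder approximates with piecewise quadratics; the traversal and intersection constants are illustrative assumptions.

```python
def sah_cost(split, bounds_min, bounds_max, axis, n_left, n_right,
             c_traverse=1.0, c_intersect=1.5):
    """Expected cost of splitting an axis-aligned node at `split` on `axis`."""
    def surface_area(mn, mx):
        dx, dy, dz = (mx[i] - mn[i] for i in range(3))
        return 2.0 * (dx * dy + dy * dz + dz * dx)

    left_max = list(bounds_max);  left_max[axis] = split
    right_min = list(bounds_min); right_min[axis] = split
    sa = surface_area(bounds_min, bounds_max)
    # Probability a ray hitting the node also hits each child is
    # proportional to the child's surface area.
    p_left = surface_area(bounds_min, left_max) / sa
    p_right = surface_area(right_min, bounds_max) / sa
    return c_traverse + c_intersect * (p_left * n_left + p_right * n_right)
```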

Proceedings ArticleDOI
25 Apr 2006
TL;DR: This paper presents a novel approach for parallel sorting on stream processing architectures based on adaptive bitonic sorting that achieves the optimal time complexity O((n log n)/p) and presents an implementation on modern programmable graphics hardware (GPUs).
Abstract: In this paper, we present a novel approach for parallel sorting on stream processing architectures. It is based on adaptive bitonic sorting. For sorting n values utilizing p stream processor units, this approach achieves the optimal time complexity O((n log n)/p). While this makes our approach competitive with common sequential sorting algorithms not only from a theoretical viewpoint, it is also very fast from a practical viewpoint. This is achieved by using efficient linear stream memory accesses (and by combining the optimal time approach with algorithms optimized for small input sequences). We present an implementation on modern programmable graphics hardware (GPUs). On GPUs, our optimal parallel sorting approach has shown to be remarkably faster than sequential sorting on the CPU, and it is also faster than previous non-optimal sorting approaches on the GPU for sufficiently large input sequences. Because of the excellent scalability of our algorithm with the number of stream processor units p (up to n/log² n or even n/log n units, depending on the stream architecture), our approach profits heavily from the trend of increasing number of fragment processor units on GPUs, so that we can expect further speed improvement with upcoming GPU generations.
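Adaptive bitonic sorting builds on the classic bitonic network. The serial sketch below shows the plain (non-adaptive) network to make the data-parallel structure visible: within each inner pass, every compare-exchange over i is independent, which is what maps onto stream processor units. It assumes a power-of-two input length.

```python
def bitonic_sort(a):
    n = len(a)
    assert n & (n - 1) == 0, "length must be a power of two"
    k = 2
    while k <= n:            # size of bitonic sequences being merged
        j = k // 2
        while j >= 1:        # compare-exchange distance
            for i in range(n):      # every iteration here is independent
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a

print(bitonic_sort([7, 3, 1, 8, 6, 2, 5, 4]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```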

Journal ArticleDOI
TL;DR: In this paper, a new dynamic and efficient bounding volume hierarchy for breakable objects undergoing structured and/or unstructured motion is proposed, which leads to significant advantages in terms of execution speed.

Journal ArticleDOI
TL;DR: In this paper, the authors present a theory of entrepreneurial entry and exit decisions, showing that each entrant in a large market is more efficient than any entrepreneur in a smaller market because competition is endogenously more intense in larger markets.
Abstract: We present a theory of entrepreneurial entry (and exit) decisions. Knowing their own managerial talent, entrepreneurs decide which market to enter, where markets differ in size. We obtain a striking sorting result: Each entrant in a large market is more efficient than any entrepreneur in a smaller market because competition is endogenously more intense in larger markets. This result continues to hold when entrepreneurs can export their output to other markets, thereby incurring a unit transport cost or tariff. The sorting and price competition effects imply that the number of entrants (and hence product variety) may actually be smaller in larger markets. In the stochastic dynamic extension of the model, we show that the churning rate of entrepreneurs is higher in larger markets. (JEL: L11, L13, M13, F12)

Proceedings ArticleDOI
20 Aug 2006
TL;DR: A simple sampling algorithm to efficiently detect distance-based outliers in domains where each and every distance computation is very expensive.
Abstract: An effective approach to detecting anomalous points in a data set is distance-based outlier detection. This paper describes a simple sampling algorithm to efficiently detect distance-based outliers in domains where each and every distance computation is very expensive. Unlike any existing algorithms, the sampling algorithm requires a fixed number of distance computations and can return good results with accuracy guarantees. The most computationally expensive aspect of estimating the accuracy of the result is sorting all of the distances computed by the sampling algorithm. The experimental study on two expensive domains as well as ten additional real-life datasets demonstrates both the efficiency and effectiveness of the sampling algorithm in comparison with the state-of-the-art algorithm, and the reliability of the accuracy guarantees.
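A hedged sketch of the fixed-budget sampling idea described above: score each point by its distance to its k-th nearest neighbor within a small random sample rather than within the whole data set, so exactly n × sample_size expensive distance computations are performed. Parameter names are assumptions, and the paper's accuracy guarantees are not reproduced here.

```python
import heapq
import random

def sampled_outliers(points, dist, sample_size=20, k=5, top_n=10):
    """Return the top_n points with the largest sampled k-NN distance."""
    sample = random.sample(points, sample_size)
    def score(p):
        # Distance to the k-th nearest sampled point: the only place
        # the expensive dist() is called, sample_size times per point.
        return heapq.nsmallest(k, (dist(p, s) for s in sample))[-1]
    return heapq.nlargest(top_n, points, key=score)
```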

Journal ArticleDOI
TL;DR: In this article, the authors extend Mousseau et al. (2003) to incorporate information about the confidence attached to each assignment example, hence providing inconsistency resolutions that the DMs are most likely to accept.
Abstract: Sorting models consist of assigning alternatives evaluated on several criteria to ordered categories. To implement such models it is necessary to set the values of the preference parameters used in the model. Rather than fixing the values of these parameters directly, a usual approach is to infer these values from assignment examples provided by the decision maker (DM), i.e., alternatives for which (s)he specifies a required category. However, assignment examples provided by DMs can be inconsistent, i.e., may not match the sorting model. In such situations, it is necessary to support the DMs in the resolution of this inconsistency. In this paper, we extend algorithms from Mousseau et al. (2003) that calculate different ways to remove assignment examples so that the information can be represented in the sorting model. The extension concerns the possibility to relax (rather than to delete) assignment examples. These algorithms incorporate information about the confidence attached to each assignment example, hence providing inconsistency resolutions that the DMs are most likely to accept.

Patent
03 Aug 2006
TL;DR: In this paper, a metal sorting device including an X-ray tube, a dual energy detector array, a microprocessor, and an air ejector array is described. It detects the presence of samples in the X-ray sensing region and initiates identifying and sorting the samples; at a specific time, the device activates an array of air ejectors located at specific positions in order to place the sample in the proper collection bin.
Abstract: Disclosed herein is a metal sorting device including an X-ray tube, a dual energy detector array, a microprocessor, and an air ejector array. The device senses the presence of samples in the X-ray sensing region and initiates identifying and sorting the samples. After identifying and classifying the category of a sample, the device activates, at a specific time, an array of air ejectors located at specific positions in order to place the sample in the proper collection bin.

Journal ArticleDOI
TL;DR: Functional analysis shows that when both modes of Rsp5 association with Hse1 are altered, sorting of cargo that requires efficient ubiquitination for entry into the MVB is blocked, whereas sorting of Cargo containing an in-frame addition of ubiquitin is normal.
Abstract: Ubiquitinated integral membrane proteins are delivered to the interior of the lysosome/vacuole for degradation. This process relies on specific ubiquitination of potential cargo and recognition of that Ub-cargo by sorting receptors at multiple compartments. We show that the endosomal Hse1-Vps27 sorting receptor binds to ubiquitin peptidases and the ubiquitin ligase Rsp5. Hse1 is linked to Rsp5 directly via a PY element within its C-terminus and through a novel protein Hua1, which recruits a complex of Rsp5, Rup1, and Ubp2. The SH3 domain of Hse1 also binds to the deubiquitinating protein Ubp7. Functional analysis shows that when both modes of Rsp5 association with Hse1 are altered, sorting of cargo that requires efficient ubiquitination for entry into the MVB is blocked, whereas sorting of cargo containing an in-frame addition of ubiquitin is normal. Further deletion of Ubp7 restores sorting of cargo when the Rsp5:Hse1 interaction is compromised suggesting that both ubiquitin ligases and peptidases associate with the Hse1-Vps27 sorting complex to control the ubiquitination status and sorting efficiency of cargo proteins. Additionally, we find that disruption of UBP2 and RUP1 inhibits MVB sorting of some cargos suggesting that Rsp5 requires association with Ubp2 to properly ubiquitinate cargo for efficient MVB sorting.

Book ChapterDOI
TL;DR: In this article, the authors assess the impact of errors in sorting and identifying macroinvertebrate samples collected and analysed using different protocols (e.g. STAR-AQEM, RIVPACS).
Abstract: This study assesses the impact of errors in sorting and identifying macroinvertebrate samples collected and analysed using different protocols (e.g. STAR-AQEM, RIVPACS). The study is based on the auditing scheme implemented in the EU-funded project STAR and presents the first attempt at analysing the audit data. Data from 10 participating countries are analysed with regard to the impact of sorting and identification errors. These differences are measured in the form of gains and losses at each level of audit for 120 samples. Based on gains and losses to the primary results, qualitative binary taxa lists were deducted for each level of audit for a subset of 72 data sets. Between these taxa lists the taxonomic similarity and the impact of differences on selected metrics common to stream assessment were analysed. The results of our study indicate that in all methods used, a considerable amount of sorting and identification error could be detected. This total impact is reflected in most functional metrics. In some metrics indicative of taxonomic richness, the total impact of differences is not directly reflected in differences in metric scores. The results stress the importance of implementing quality control mechanisms in macroinvertebrate assessment schemes.

Posted Content
TL;DR: The authors focus primarily on the sorting of parents and children into schools and classrooms; the equilibrium level of sorting (along parental income and child peer quality dimensions) then depends on both the specifics of how education production works and the overall characteristics of the general equilibrium environment within which schools operate.
Abstract: Any system of primary and secondary schools involves explicit or implicit mechanisms that ration not only financial but also nonfinancial inputs into education production. This chapter focuses primarily on such mechanisms as they relate to the sorting of parents and children into schools and classrooms. Three primary mechanisms are reviewed: (1) sorting that emerges through residential location choices within housing markets that are linked to schools; (2) sorting that arises from parental choices to send children to private rather than public schools; and (3) sorting within schools that results from explicit tracking policies. The equilibrium level of sorting (along parental income and child peer quality dimensions) then depends on both the specifics of how education production works and the overall characteristics of the general equilibrium environment within which schools operate. We review the theoretical as well as the related simulation-based literature in this area and suggest that much potential exists for increasing empirical relevance of the emerging models for policy analysis, particularly as a related empirical literature comes to better terms with the nature of peer effects in education production.

Journal ArticleDOI
TL;DR: All algorithms for sorting linear permutations by transpositions can be used to sort circular permutations, and a new O(n^{3/2}√(log n)) 1.5-approximation algorithm is presented, which is considerably simpler than previously reported.
Abstract: An important problem in genome rearrangements is sorting permutations by transpositions. The complexity of the problem is still open, and two rather complicated 1.5-approximation algorithms for sorting linear permutations are known (Bafna and Pevzner, 98 and Christie, 99). The fastest known algorithm is the quadratic algorithm of Bafna and Pevzner. In this paper, we observe that the problem of sorting circular permutations by transpositions is equivalent to the problem of sorting linear permutations by transpositions. Hence, all algorithms for sorting linear permutations by transpositions can be used to sort circular permutations. Our main result is a new O(n^{3/2}√(log n)) 1.5-approximation algorithm, which is considerably simpler than the previous ones, and whose analysis is significantly less involved.
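The linear/circular equivalence can be made concrete: a circular permutation can be rotated so a distinguished element comes first, after which sorting the resulting linear permutation by transpositions also sorts the circular one. A hypothetical illustration of the reduction:

```python
def circular_to_linear(pi):
    """Rotate the circular permutation so it starts at its minimum."""
    i = pi.index(min(pi))
    return pi[i:] + pi[:i]

# Different rotations of the same circular permutation reduce to the
# same linear sorting instance:
print(circular_to_linear([2, 0, 1, 3]))  # [0, 1, 3, 2]
print(circular_to_linear([3, 2, 0, 1]))  # [0, 1, 3, 2]
```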

Proceedings ArticleDOI
07 Jun 2006
TL;DR: An efficient algorithm for detecting duplicate regions is proposed and a set of colour-based morphological operations are used to remove isolated mismatches, as well as to fill in missing matches.
Abstract: An efficient algorithm for detecting duplicate regions is proposed in this paper. The basic idea is to segment the input image into blocks and search for blocks with similar intensity patterns using matching techniques. To improve the efficiency, the blocks are sorted based on the concept of k-dimensional tree. The sorting process groups blocks with similar patterns and hence the number of matching operations required for finding the duplicated blocks can be significantly reduced. The matching block detection results are encoded as a color image. This makes it possible to use a set of colour-based morphological operations to remove isolated mismatches, as well as to fill in missing matches. The experiments conducted show the effectiveness of the proposed algorithm.
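The grouping-by-sorting core is easy to sketch. The toy version below slides a window over the image, sorts the flattened blocks so identical intensity patterns become adjacent, and reports exact repeats; the paper's k-dimensional-tree ordering, similarity matching, and colour-based morphological clean-up are omitted, and the block size is an assumption.

```python
import numpy as np

def find_duplicate_blocks(img, b=8):
    """Return pairs of top-left positions whose b-by-b blocks are identical."""
    blocks = []
    for y in range(img.shape[0] - b + 1):
        for x in range(img.shape[1] - b + 1):
            blocks.append((img[y:y+b, x:x+b].ravel().tobytes(), (y, x)))
    blocks.sort(key=lambda t: t[0])   # grouping step: similar blocks adjacent
    return [(p1, p2)
            for (v1, p1), (v2, p2) in zip(blocks, blocks[1:]) if v1 == v2]
```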

Proceedings ArticleDOI
01 Jan 2006
TL;DR: This work presents and applies infinite Gaussian mixture modeling, a non-parametric Bayesian method, to the problem of spike sorting, and compares this approach to using penalized log likelihood to select the best from multiple finite mixture models trained by expectation maximization.
Abstract: In this work we present and apply infinite Gaussian mixture modeling, a non-parametric Bayesian method, to the problem of spike sorting. As this approach is Bayesian, it allows us to integrate prior knowledge about the problem in a principled way. Because it is non-parametric we are able to avoid model selection, a difficult problem that most current spike sorting methods do not address. We compare this approach to using penalized log likelihood to select the best from multiple finite mixture models trained by expectation maximization. We show favorable offline sorting results on real data and discuss ways to extend our model to online applications. Index Terms — Spike sorting, mixture modeling, infinite mixture model, non-parametric Bayesian modeling, Chinese restaurant process, Bayesian inference, Markov chain Monte Carlo, expectation maximization, Gibbs sampling.
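The Chinese restaurant process listed in the index terms is what gives the infinite mixture its prior over cluster assignments: a spike joins an existing cluster with probability proportional to the cluster's size, or opens a new one with probability proportional to a concentration parameter alpha. A prior-only sketch (the full sorter also needs the Gaussian likelihoods and Gibbs updates); all names are illustrative.

```python
import random

def crp_assignments(n_spikes, alpha=1.0):
    """Sample cluster labels for n_spikes under a CRP(alpha) prior."""
    counts = []                      # spikes per cluster
    assignments = []
    for _ in range(n_spikes):
        total = sum(counts) + alpha
        r = random.uniform(0.0, total)
        acc = 0.0
        for c, size in enumerate(counts):
            acc += size
            if r < acc:              # join existing cluster c
                counts[c] += 1
                assignments.append(c)
                break
        else:
            counts.append(1)         # open a new cluster
            assignments.append(len(counts) - 1)
    return assignments
```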

Book ChapterDOI
10 Apr 2006
TL;DR: This work presents efficient methods to compute the exact rank vector even for large-scale web graphs in only a few minutes and iteration steps, with intrinsic support for incremental web crawling, and without the need for page sorting/reordering or for sharing global rank information.
Abstract: PageRank inherently is massively parallelizable and distributable, as a result of the web's strict host-based link locality. We show that the Gauss-Seidel iterative method can actually be applied in such a parallel ranking scenario in order to improve convergence. By introducing a two-dimensional web model and by adapting the PageRank to this environment, we present efficient methods to compute the exact rank vector even for large-scale web graphs in only a few minutes and iteration steps, with intrinsic support for incremental web crawling, and without the need for page sorting/reordering or for sharing global rank information.
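The convergence benefit comes from Gauss-Seidel's in-place updates: within one sweep, each page's freshly computed rank immediately feeds into later pages' updates, unlike the Jacobi-style power iteration. A minimal sketch under simplifying assumptions (no dangling-node handling; `in_links` and `out_degree` are hypothetical inputs, not the paper's API):

```python
def pagerank_gauss_seidel(in_links, out_degree, n, d=0.85, sweeps=50):
    """Gauss-Seidel PageRank: in_links maps page -> iterable of in-neighbors."""
    rank = {p: 1.0 / n for p in range(n)}
    for _ in range(sweeps):
        for p in range(n):
            # In-place update: pages later in the sweep see this new value.
            rank[p] = (1.0 - d) / n + d * sum(
                rank[q] / out_degree[q] for q in in_links.get(p, ())
            )
    return rank
```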

Patent
25 Jul 2006
TL;DR: In this article, a method, system, and computer program product are disclosed for automatically matching the profile of unstructured electronic documents to objective sets of criteria, which is accomplished by evaluating text in the documents, comparing it to a set of weighted keyword criteria, generating a rating based on adherence to the criteria, rating and categorizing the results, sorting and viewing the results based on user defined criteria.
Abstract: A method, system, and computer program product are disclosed for automatically matching the profile of unstructured electronic documents to objective sets of criteria. The is accomplished by evaluating text in the documents, comparing it to a set of weighted keyword criteria, generating a rating based on adherence to the criteria, rating and categorizing the results, sorting and viewing the results based on user defined criteria.

Journal ArticleDOI
10 Jul 2006
TL;DR: This framework considers the problems of sorting and searching in optimal time while tolerating the largest possible number of memory faults, and designs an O(n log n) time sorting algorithm that can optimally tolerate up to O(√(n log n)) memory faults.
Abstract: We investigate the problem of reliable computation in the presence of faults that may arbitrarily corrupt memory locations. In this framework, we consider the problems of sorting and searching in optimal time while tolerating the largest possible number of memory faults. In particular, we design an O(n log n) time sorting algorithm that can optimally tolerate up to O(√(n log n)) memory faults. In the special case of integer sorting, we present an algorithm with linear expected running time that can tolerate O(√n) faults. We also present a randomized searching algorithm that can optimally tolerate up to O(log n) memory faults in O(log n) expected time, and an almost optimal deterministic searching algorithm that can tolerate O((log n)^{1−ε}) faults, for any small positive constant ε, in O(log n) worst-case time. All these results improve over previous bounds.
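For intuition about the fault model, the baseline resilience primitive is majority replication: a value kept in 2δ+1 copies survives up to δ arbitrary corruptions. The paper's algorithms do far better than replicating every key; the sketch below only illustrates the model it works in.

```python
from collections import Counter

def reliable_write(value, delta):
    # Copies live in unreliable memory; at most delta may be corrupted.
    return [value] * (2 * delta + 1)

def reliable_read(copies):
    # With at most delta corruptions, the true value holds the majority.
    return Counter(copies).most_common(1)[0][0]

copies = reliable_write(42, delta=2)
copies[0] = 7    # adversarial corruptions, at most delta of them
copies[3] = 99
print(reliable_read(copies))  # 42
```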

Patent
19 Jun 2006
TL;DR: In this article, a radio frequency reader device or tag can be associated with each container used to hold a plurality of mail items destined for a common delivery location; the reader devices can be configured to automatically alert a sorting operator as to whether a mail item belongs in a selected container.
Abstract: Systems and methods are disclosed for sorting and tracking mail items that are sent via a mail system. Such systems improve the efficiency and accuracy of mail systems by utilizing radio frequency identification (RFID) technology to communicate sorting instructions to a sorting operator. In one embodiment, a radio frequency reader device or tag can be associated with each container used to hold a plurality of mail items destined for a common delivery location. By reading information stored on RFID tags associated with mail items to be sorted, the radio frequency reader devices can be configured to automatically alert a sorting operator as to whether a mail item belongs in a selected container. The reader devices can also maintain a record of what mail items have been placed in each such container. This information can be stored in association with tracking information that is generated during the transport of the containers.