
Showing papers on "Sorting" published in 2001


Journal ArticleDOI
TL;DR: The authors present fascinating history and insights into the development of various classification systems and identify issues that arise during the creation of any classification system, such as the need to compromise between providing granular classifications that satisfy needs specific to a time and place.
Abstract: Bowker GC and Star SL. 389 pages. Cambridge, MA, and London: MIT Pr; 1999. $29.95. ISBN 0262024616. Order phone 800-356-0343. Field of medicine: Public health and medical informatics. Format: Hardcover book (softcover also available). Audience: Physicians and nonphysicians involved in developing or setting policy for classification systems, nomenclatures, or vocabularies. Purpose: To discuss the idea that classifications and standardizations have direct impact on social and political aspects of human interaction. Content: The authors organize their presentation into an introductory chapter that frames the issues, followed by three sections (classification and large-scale infrastructures, classification and biography, and classification and work practice) providing specific examples, and a conclusion section. The authors use the International Classification of Diseases, 9th revision, race classification under apartheid in South Africa, and the Nursing Intervention Classification as primary examples. An extensive bibliography of more than 300 references, a name index, and a subject index follow the text. Highlights: The authors present fascinating history and insights into the development of various classification systems. In addition, they identify issues that arise during the creation of any classification system, such as the need to compromise between providing granular classifications that satisfy needs specific to a time and place. Finally, the authors draw attention to the implications of choices made in the development of some important classification systems. These implications bear on moral judgments, financial effects, and political gains or losses. Limitations: The authors' writing style hinders the reader's ability to access the interesting information and to understand the implications of choices made in developing classification systems. While the overall organization of the book is clear, the themes and ideas do not flow well.
Sentences require repeated readings, and a dictionary at your side would be helpful, given the authors' frequent use of unfamiliar words. These failings obscure interesting and valuable facts and viewpoints. Related readings: Svenonius' The Intellectual Foundation of Information Organization (MIT Pr; 2000) and Aitchison and colleagues' Thesaurus Construction and Use: A Practical Manual (Fitzroy Dearborn; 2000). Reviewers: J. Marc Overhage, MD, PhD, and Jeffery G. Suico, MD, Regenstrief Institute for Health Care and Indiana University School of Medicine, Indianapolis, IN.

2,314 citations



Journal ArticleDOI
TL;DR: The state of the art in the design and analysis of external memory algorithms and data structures, where the goal is to exploit locality in order to reduce the I/O costs is surveyed.
Abstract: Data sets in large applications are often too massive to fit completely inside the computer's internal memory. The resulting input/output communication (or I/O) between fast internal memory and slower external memory (such as disks) can be a major performance bottleneck. In this article we survey the state of the art in the design and analysis of external memory (or EM) algorithms and data structures, where the goal is to exploit locality in order to reduce the I/O costs. We consider a variety of EM paradigms for solving batched and online problems efficiently in external memory. For the batched problem of sorting and related problems such as permuting and fast Fourier transform, the key paradigms include distribution and merging. The paradigm of disk striping offers an elegant way to use multiple disks in parallel. For sorting, however, disk striping can be nonoptimal with respect to I/O, so to gain further improvements we discuss distribution and merging techniques for using the disks independently. We also consider useful techniques for batched EM problems involving matrices (such as matrix multiplication and transposition), geometric data (such as finding intersections and constructing convex hulls), and graphs (such as list ranking, connected components, topological sorting, and shortest paths). In the online domain, canonical EM applications include dictionary lookup and range searching. The two important classes of indexed data structures are based upon extendible hashing and B-trees. The paradigms of filtering and bootstrapping provide a convenient means in online data structures to make effective use of the data accessed from disk. We also reexamine some of the above EM problems in slightly different settings, such as when the data items are moving, when the data items are variable-length (e.g., text strings), or when the allocated amount of internal memory can change dynamically.
Programming tools and environments are available for simplifying the EM programming task. During the course of the survey, we report on some experiments in the domain of spatial databases using the TPIE system (transparent parallel I/O programming environment). The newly developed EM algorithms and data structures that incorporate the paradigms we discuss are significantly faster than methods currently used in practice.
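Example (illustrative, not taken from the survey or from TPIE): the merging paradigm named in the abstract can be sketched as a two-phase external merge sort. The sketch assumes the input is a text file with one integer per line; the function name `external_sort` and the parameter `run_size` (a stand-in for how many items fit in internal memory) are hypothetical.

```python
import heapq
import itertools
import os
import tempfile


def external_sort(input_path, output_path, run_size=1000):
    """Sort a text file of integers (one per line) with bounded memory.

    run_size is an assumed stand-in for internal memory capacity.
    """
    run_paths = []
    # Phase 1 (run formation): read a chunk that fits in memory,
    # sort it, and write it out as a sorted run on disk.
    with open(input_path) as src:
        while True:
            chunk = [int(line) for line in itertools.islice(src, run_size)]
            if not chunk:
                break
            chunk.sort()
            fd, path = tempfile.mkstemp(suffix=".run", text=True)
            with os.fdopen(fd, "w") as run:
                run.writelines(f"{x}\n" for x in chunk)
            run_paths.append(path)

    # Phase 2 (merging): stream all runs through a k-way merge, so only
    # one item per run needs to be held in memory at a time.
    files = [open(p) for p in run_paths]
    try:
        streams = [(int(line) for line in f) for f in files]
        with open(output_path, "w") as out:
            out.writelines(f"{x}\n" for x in heapq.merge(*streams))
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)
```

This sketch merges all runs in a single pass; with very many runs, a practical implementation would merge in several passes, limited by how many run buffers fit in memory.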

751 citations


Patent
13 Dec 2001
TL;DR: A microfabricated device and methods of using the device for analyzing and sorting polynucleotide molecules by size are described.
Abstract: The invention relates to a microfabricated device and methods of using the device for analyzing and sorting polynucleotide molecules by size.

251 citations


Journal ArticleDOI
TL;DR: The authors construct a dynamic model of intergenerational education acquisition, fertility, and marital sorting and parameterize the steady state to match several basic empirical findings: a negative correlation between fertility and education, a decreasing marginal effect of parental education on children's years of education, wages that are sensitive to the relative supply of skilled workers, and borrowing constraints that affect educational attainment for some low-income households.
Abstract: Many social commentators have raised concerns over the possibility that increased sorting in society may lead to greater inequality. To investigate this, we construct a dynamic model of intergenerational education acquisition, fertility, and marital sorting and parameterize the steady state to match several basic empirical findings. We find that increased sorting will significantly increase income inequality. Four factors are important to our findings: a negative correlation between fertility and education, a decreasing marginal effect of parental education on children’s years of education, wages that are sensitive to the relative supply of skilled workers, and borrowing constraints that affect educational attainment for some low-income households.

222 citations


Journal ArticleDOI
TL;DR: A counterpropagating dual-beam optical-trapping configuration is shown theoretically and experimentally to be preferred due to a greater ability to manipulate cells in three dimensions and to be suitable for automated single-cell sorting.
Abstract: We provide a basis for automated single-cell sorting based on optical trapping and manipulation using human peripheral blood as a model system. A counterpropagating dual-beam optical-trapping configuration is shown theoretically and experimentally to be preferred due to a greater ability to manipulate cells in three dimensions. Theoretical analysis performed by simulating the propagation of rays through the region containing an erythrocyte (red blood cell) divided into numerous elements confirms experimental results showing that a trapped erythrocyte orients with its longest axis in the direction of propagation of the beam. The single-cell sorting system includes an image-processing system using thresholding, background subtraction, and edge-enhancement algorithms, which allows for the identification of single cells. Erythrocytes have been identified and manipulated into designated volumes using the automated dual-beam trap. Potential applications of automated single-cell sorting, including the incorporation of molecular biology techniques, are discussed.

160 citations


Journal Article
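TL;DR: By exploiting the polynomial time algorithm of Hannenhalli and Pevzner for sorting signed permutations by reversals, and by developing a new approximation algorithm for maximum cycle decomposition of breakpoint graphs, a new 1.375-approximation algorithm is designed for MIN-SBR, improving on the previous best performance ratio of 1.5.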
Abstract: Analysis of genomes evolving by inversions leads to a general combinatorial problem of Sorting by Reversals, MIN-SBR, the problem of sorting a permutation by a minimum number of reversals. Following a series of preliminary results, Hannenhalli and Pevzner developed the first exact polynomial time algorithm for the problem of sorting signed permutations by reversals, and a polynomial time algorithm for a special case of unsigned permutations. The best known approximation algorithm for MIN-SBR, due to Christie, gives a performance ratio of 1.5. In this paper, by exploiting the polynomial time algorithm for sorting signed permutations and by developing a new approximation algorithm for maximum cycle decomposition of breakpoint graphs, we design a new 1.375-algorithm for the MIN-SBR problem.
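Example (illustrative only; this is not the 1.375-approximation algorithm from the paper): the sketch below shows what a reversal does to an unsigned permutation and computes the standard breakpoint lower bound on the number of reversals needed. The function names are hypothetical.

```python
def breakpoints(perm):
    """Count breakpoints of an unsigned permutation of 1..n.

    The permutation is framed by 0 and n + 1; a breakpoint is a pair of
    neighbours whose values are not consecutive.
    """
    ext = [0] + list(perm) + [len(perm) + 1]
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)


def reverse(perm, i, j):
    """Return a copy of perm with the segment perm[i..j] (inclusive) reversed."""
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]


def reversal_lower_bound(perm):
    """Lower bound on the reversal distance to the identity.

    A reversal changes only the two adjacencies at its ends, so it can
    remove at most two breakpoints.
    """
    return (breakpoints(perm) + 1) // 2


perm = [3, 1, 2]
step1 = reverse(perm, 0, 1)   # [1, 3, 2]
step2 = reverse(step1, 1, 2)  # [1, 2, 3]
```

For `[3, 1, 2]` the bound is 2, and the two reversals shown reach the identity, so the bound is tight in this small case. In general it is not tight.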

130 citations


Patent
31 Jan 2001
TL;DR: A method for sorting music files in which a sorting parameter is selected, either by the player or by the user, a random number is generated, and weights are assigned to the parameter and the random number to calculate the sorting criteria used to generate a playlist.
Abstract: A method for sorting music files. A parameter to be used in sorting is selected, either by the player or by the user. A random number is generated and weights are assigned to the parameter and the random number. These values are then used to calculate sorting criteria for each file. The files are then sorted by their sorting criteria, generating a playlist.
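Example (a minimal sketch under stated assumptions, not the patented method): the abstract does not say how the weights and values are combined, so the sketch assumes a weighted sum and assumes parameter values are normalized to the range 0 to 1. The names `build_playlist`, `param_weight`, and `random_weight` are hypothetical.

```python
import random


def build_playlist(files, parameter, param_weight=0.7, random_weight=0.3, rng=None):
    """Order files by a weighted mix of one parameter and a random number.

    files: list of dicts; each must contain `parameter` with a value in [0, 1]
    (an assumption of this sketch).
    """
    rng = rng or random.Random()
    scored = []
    for f in files:
        # Sorting criterion: assumed here to be a weighted sum.
        criterion = param_weight * f[parameter] + random_weight * rng.random()
        scored.append((criterion, f))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [f for _, f in scored]


library = [
    {"title": "a", "rating": 0.2},
    {"title": "b", "rating": 0.9},
    {"title": "c", "rating": 0.5},
]
playlist = build_playlist(library, "rating", rng=random.Random(1))
```

With `random_weight=0` the order is fully determined by the parameter; raising it makes the playlist more shuffled.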

117 citations


Patent
22 Jan 2001
TL;DR: Sorting algorithms are applied to multidimensional datasets generated by digital imaging spectroscopy; in one embodiment, sorting by spectral criteria is combined with sorting by temporal criteria to identify microcolonies of recombinant organisms harboring mutated genes encoding enzymes having desirable kinetic attributes and substrate specificity.
Abstract: Complex multidimensional datasets generated by digital imaging spectroscopy can be organized and analyzed by applying software and computer-based methods comprising sorting algorithms. Applying combinations of these algorithms to images and graphical data allows pixels or features to be rapidly and efficiently classified into meaningful groups according to defined criteria. Multiple rounds of pixel or feature selection may be performed based on independent sorting criteria. In one embodiment sorting by spectral criteria (e.g., intensity at a given wavelength) is combined with sorting by temporal criteria (e.g., absorbance at a given time) to identify microcolonies of recombinant organisms harboring mutated genes encoding enzymes having desirable kinetic attributes and substrate specificity. Restriction of the set of pixels analyzed in a subsequent sort based on criteria applied in an earlier sort (“sort and lock” analyses) minimizes computational and storage resources. User-defined criteria can also be incorporated into the sorting process by means of a graphical user interface that comprises visualization tools, including a contour plot, a sorting bar and a grouping bar, an image window, and a plot window, that allow run-time interactive identification of pixels or features meeting one or more criteria and display of their associated spectral or kinetic data. These methods are useful for extracting information from imaging data in applications ranging from biology and medicine to remote sensing.
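Example (a hedged sketch of the "sort and lock" idea as described in the abstract, not the patented software): each round sorts only the pixels that survived the previous round and keeps the top few. The record fields `intensity` and `absorbance`, the function name, and the per-round cutoffs are assumptions made for illustration.

```python
def sort_and_lock(pixels, rounds):
    """Apply successive sorts, each restricted to the survivors of the last.

    rounds: list of (key_function, keep) pairs; after sorting by
    key_function in descending order, only the first `keep` pixels are
    locked in for the next round.
    """
    locked = list(pixels)
    for key, keep in rounds:
        locked = sorted(locked, key=key, reverse=True)[:keep]
    return locked


pixels = [
    {"id": 0, "intensity": 0.9, "absorbance": 0.1},
    {"id": 1, "intensity": 0.8, "absorbance": 0.7},
    {"id": 2, "intensity": 0.2, "absorbance": 0.9},
    {"id": 3, "intensity": 0.7, "absorbance": 0.5},
]
selected = sort_and_lock(
    pixels,
    [
        (lambda p: p["intensity"], 3),   # spectral criterion first
        (lambda p: p["absorbance"], 1),  # temporal criterion on survivors only
    ],
)
```

Pixel 2 has the highest absorbance but is never considered in the second round because the first round excluded it, which is how later sorts touch less data.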

114 citations


Journal ArticleDOI
TL;DR: A complete answer to the optimal sorting problem of a permutation of length n in at most 3n/4 moves is given, namely [(n + 1)/2].

104 citations


Patent
19 Dec 2001
TL;DR: A system and method for processing a plurality of call detail records (CDRs), each indicative of a call transaction on a telecommunications network, in which the CDRs are received at a first controller, grouped by a first sorting field (in one embodiment, by carrier), and reported on as a function of at least one additional sorting field.
Abstract: A system and method for processing a plurality of call detail records (CDRs) each indicative of a call transaction on a telecommunications network. The method includes receiving the plurality of CDRs at a first controller, wherein each of the CDRs includes a data structure including a plurality of fields each containing at least one character. The method then selects a first sorting field from the plurality of fields and groups the plurality of CDRs as a function of data within the first sorting field. In one embodiment, the first sorting field is used to group the CDRs according to different carriers. The method then analyzes at least one additional sorting field within each of the CDRs which were previously grouped according to the first sorting field. A report is then generated for each of the grouped CDRs as a function of data within the additional sorting field. In this way, periodic, customized reports can be generated from information contained within CDRs with user-selectable sorting or analysis fields.
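Example (a minimal sketch, not the patented system): the two-stage flow in the abstract, grouping by a first sorting field and then reporting on an additional field, can be illustrated with dictionaries. The field names `carrier` and `call_type`, and the choice of a per-value count as the "report", are assumptions.

```python
from collections import Counter, defaultdict


def group_cdrs(cdrs, first_field):
    """Group call detail records by the value of the first sorting field."""
    groups = defaultdict(list)
    for cdr in cdrs:
        groups[cdr[first_field]].append(cdr)
    return groups


def build_reports(cdrs, first_field, additional_field):
    """For each group, count records per value of the additional field."""
    reports = {}
    for key, members in group_cdrs(cdrs, first_field).items():
        reports[key] = dict(Counter(m[additional_field] for m in members))
    return reports


cdrs = [
    {"carrier": "A", "call_type": "local"},
    {"carrier": "B", "call_type": "intl"},
    {"carrier": "A", "call_type": "local"},
    {"carrier": "A", "call_type": "intl"},
]
reports = build_reports(cdrs, "carrier", "call_type")
```

Because both field names are passed as arguments, the same code supports the user-selectable sorting and analysis fields the abstract mentions.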

Patent
Louis Amadio, Chris J. Guzak, Todd Ouzts, Philip P. Fortier, Suzan M. Andrew
11 Apr 2001
TL;DR: A new way of providing pertinent information about an item (e.g., a text file, a picture file, a music file, a video file, or any other similar file) is provided.
Abstract: A new way of providing pertinent information about an item (e.g., a text file, a picture file, a music file, video file, or any other similar file) is provided. The invention provides graphical information about the item along with user-selectable properties that are specific to that item. The invention further provides a way of sorting the items by the user-selectable properties and communicating the sort order to the user. The invention thereby provides the user with a way of quickly finding pertinent information about the item.

Patent
Jussi Myllymaki
26 Jan 2001
TL;DR: A system and method for sorting information that has particular significance at a specific location only to those individuals who are at or near that geo-spatial location.
Abstract: A system and method for sorting information that has particular significance at a specific location only to those individuals that are at or near that geo-spatial location. The system includes a GPS client wireless component that can be a personal wireless communication device (such as Palm Pilot, cellular digital phones, etc.) or personal computer configured for use within a global position satellite network.

Patent
26 Jun 2001
TL;DR: The sorting and packaging system (100) comprises an induction and scanning system (104), a single pass sorting and packaging system (110), and a control unit (112).
Abstract: The sorting and packaging system (100) comprises an induction and scanning system (104), a single pass sorting and packaging system (110) for automatically sorting and packaging a plurality of mailpieces based on a single scan by the induction and scanning system (104), and a control unit (112) connected to and controlling the induction and scanning system (104) and the single pass sorting and packaging system (110). The single pass sorting and packaging system (110) comprises at least one cell rack (302), at least one packaging system (304), and at least one delivery system (308). The cell rack (302) is connected to the induction and scanning system (104) by a transport sorting system (408). The cell rack (302) comprises a plurality of cells (402) and a purging system (416). The packaging system (304) is connected to the cell rack (302) and comprises a transport packaging system (410) and a packaging unit (426). The delivery system (308) is connected to the packaging system (304).

Journal ArticleDOI
TL;DR: Upper and lower bounds for reversal and transposition distance are obtained and it is shown that the problem of finding reversal distance between binary strings, and therefore between strings over an arbitrary fixed-size alphabet, is NP-hard.
Abstract: The problems of sorting by reversals and sorting by transpositions have been studied because of their applications to genome comparison. Prior studies of both problems have assumed that the sequences to be compared (or sorted) contain no duplicates, but there is a natural generalization in which the sequences are allowed to contain repeated characters. In this paper we study primarily the versions of these problems in which the strings to be compared are drawn from a binary alphabet. We obtain upper and lower bounds for reversal and transposition distance and show that the problem of finding reversal distance between binary strings, and therefore between strings over an arbitrary fixed-size alphabet, is NP-hard.
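Example (illustrative only, not an algorithm from the paper): for very short strings, the reversal distance between two binary strings can be found by breadth-first search over all strings reachable by reversals. The search space grows exponentially with string length, which is consistent with the NP-hardness result; the function name is hypothetical.

```python
from collections import deque


def reversal_distance(source, target):
    """Minimum number of substring reversals turning source into target.

    Returns None when the strings do not contain the same characters.
    Brute force: only practical for short strings.
    """
    if sorted(source) != sorted(target):
        return None
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        current, dist = queue.popleft()
        if current == target:
            return dist
        n = len(current)
        for i in range(n):
            for j in range(i + 1, n):
                # Reverse the substring current[i..j] (inclusive).
                nxt = current[:i] + current[i:j + 1][::-1] + current[j + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    return None
```

For instance, `reversal_distance("0101", "0011")` needs a single reversal of the middle two characters.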

Patent
22 Feb 2001
TL;DR: Information such as the delivery request registration number and the address and name of a receiver, registered to the RFID label affixed on a delivery article, is read via an RFID reader/radio communications apparatus (400) attached to an arm of a sorting worker (31).
Abstract: Information such as the delivery request registration number and the address and the name of a receiver registered to the RFID label affixed on a delivery article is read via RFID reader/radio communications apparatus (400) attached to an arm of a sorting worker (31). The information is sent via radio communications apparatus, and a guidance instruction on the carrying palette for sorting where the delivery article is to be put away is given on radio incoming information display apparatus (500) attached to a carrying palette for sorting (700) via blinking of a guidance lamp (600).


Patent
Dale W. Malik
25 Jun 2001
TL;DR: A system for intelligently sorting e-mail comprises a client which downloads e-mails from a server; prior to presenting the e-mails to the user, the client sorts them into classifications based upon whether each e-mail is from a personal contact, from an approved commercial vendor, or from an unknown source.
Abstract: A system for intelligently sorting e-mail comprises a client which downloads e-mails from a server. Prior to presenting the e-mails to the user, the client sorts the e-mail into classifications based upon whether the e-mail is from a personal contact, i.e. someone that the user knows, whether the e-mail is from a commercial vendor from whom the user has indicated that he or she wishes to accept commercial e-mail, or whether the e-mail is from an unknown source. The client presents the e-mails to the user in these classifications.
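Example (a hedged sketch, not the patented client): the three-way classification in the abstract can be illustrated by looking up the sender address in two sets. The set names, the `from` field, and the exact-match rule on lowercased addresses are assumptions.

```python
def classify_sender(sender, personal_contacts, approved_vendors):
    """Return 'personal', 'commercial', or 'unknown' for a sender address.

    Both sets are assumed to hold lowercase addresses.
    """
    address = sender.strip().lower()
    if address in personal_contacts:
        return "personal"
    if address in approved_vendors:
        return "commercial"
    return "unknown"


def sort_emails(emails, personal_contacts, approved_vendors):
    """Place each downloaded e-mail into one of the three classifications."""
    folders = {"personal": [], "commercial": [], "unknown": []}
    for email in emails:
        label = classify_sender(email["from"], personal_contacts, approved_vendors)
        folders[label].append(email)
    return folders


contacts = {"ann@example.org"}
vendors = {"deals@shop.example"}
inbox = [
    {"from": "Ann@example.org", "subject": "Lunch?"},
    {"from": "deals@shop.example", "subject": "Sale"},
    {"from": "stranger@elsewhere.example", "subject": "Hello"},
]
folders = sort_emails(inbox, contacts, vendors)
```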


Patent
Henry Esmond Butterworth
04 Sep 2001
TL;DR: A method for data sorting in a log-structured information storage system is described, in which units of data are sorted into streams according to the expected time until the next rewrite of the unit of data.
Abstract: A method for data sorting in an information storage system and an information storage system (104) are described. The information storage system (104) is a log structured system having storage devices (106) in which information segments (202, 204) are located. Units of data are sorted into streams (136, 138) according to the expected time until the next rewrite of the unit of data. Sorting data into streams (136, 138) improves the efficiency of free space collection in the storage devices (106). Separate streams (136, 138) are provided for rewritten data units and units of data being relocated due to free space collections in the storage devices (106). The streams (136, 138) can have fixed or dynamic boundaries.
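Example (a minimal sketch under stated assumptions, not the patented method): the abstract does not say how the expected time until the next rewrite is estimated, so the sketch takes that estimate as an input and only shows the routing step with fixed stream boundaries. The function name and the boundary values are hypothetical.

```python
import bisect


def choose_stream(expected_time, boundaries, relocated=False):
    """Pick a stream for a unit of data.

    expected_time: estimated time until the unit is next rewritten
        (how it is estimated is outside this sketch).
    boundaries: ascending fixed boundaries between streams.
    relocated: True for units moved by free space collection, which get
        their own family of streams, separate from rewritten units.
    """
    index = bisect.bisect_right(boundaries, expected_time)
    family = "relocated" if relocated else "rewritten"
    return (family, index)


boundaries = [10, 100]  # assumed boundaries, in arbitrary time units
hot = choose_stream(5, boundaries)
cold = choose_stream(500, boundaries, relocated=True)
```

Grouping units with similar expected lifetimes means a segment's contents tend to become invalid together, which is the efficiency gain for free space collection that the abstract describes.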

Patent
Aviad Zlotnick
31 Jul 2001
TL;DR: A method for data entry that includes receiving a plurality of images and sorting the images into an order responsive to a measure of similarity between the images, so as to group similar images together in the order.
Abstract: A method for data entry includes receiving a plurality of images and sorting the images into an order responsive to a measure of similarity between the images, so as to group similar images together in the order. A first image among the images in the order is presented to an operator, and an input is received from the operator specifying a code to be assigned to the first image. A second image, subsequent to the first image among the images in the order, is then presented to the operator, along with the code specified by the operator for assignment to the first image. The code is assigned to the second image responsive to a single input action by the operator, indicating that the second image is to be assigned the same code as the first image.

Journal ArticleDOI
TL;DR: New evidence now suggests that: (1) two proteins with structurally similar sorting signals can use different sorting mechanisms; (2) one protein with multiple sorting signals can be sorted differently in different cell types; and (3) one cell type can recognize different sorting signals and use different sorting mechanisms.

Patent
27 Nov 2001
TL;DR: An object sorting system (100) includes a first sorting matrix (300), a second sorting matrix (200) crossing the first sorting matrix (300), and a control system (120) for directing the sorting of objects (110) to one of a plurality of sort destinations (501).
Abstract: An object sorting system (100) includes a first sorting matrix (300), a second sorting matrix (200) crossing said first sorting matrix (300), and a control system (120) for directing the sorting of objects (110) to one of a plurality of sort destinations (501).

Patent
29 May 2001
TL;DR: An image forming device stores the current state of its main body and of its sheet sorting device in two nonvolatile memories just before power is cut, so that the image forming and sheet sorting operations can continue when power is restored.
Abstract: PROBLEM TO BE SOLVED: To solve the problem that, once a fault occurs and the power source is cut, the work required of the user after the power source is turned on again is troublesome. SOLUTION: This image forming device is provided with a first nonvolatile memory means in which a control system stores the current state of a body 1 just before the power source is cut, and a second nonvolatile memory means in which the control system stores the current state of a sheet sorting device just before the power source is cut. When the power source is turned on again after having been cut before the image forming operation and the sheet sorting operation were completed, the control system communicates the stored contents of one of the first and second nonvolatile memory means to the other, so that the body 1 and the sheet sorting device 2 continue the image forming operation and the sheet sorting operation.

Posted Content
01 Jan 2001
TL;DR: Preference disaggregation analysis provides the framework for developing sorting models through the analysis of the global judgment of the decision-maker using mathematical programming techniques. This paper uses an extensive Monte Carlo simulation of the UTADIS method to examine how the parameters involved in the model development process affect the performance and stability of the developed models.
Abstract: Within the field of multicriteria decision aid (MCDA), sorting refers to the assignment of a set of alternatives into predefined homogeneous groups defined in an ordinal way. The real-world applications of this type of problem extend to a wide range of decision-making fields. Preference disaggregation analysis provides the framework for developing sorting models through the analysis of the global judgment of the decision-maker using mathematical programming techniques. However, the automatic elicitation of preferential information through the preference disaggregation analysis raises several issues regarding the impact of the parameters involved in the model development process on the performance and the stability of the developed models. The objective of this paper is to shed light on this issue. For this purpose the UTADIS preference disaggregation sorting method (UTilités Additives DIScriminantes) is considered. The conducted analysis is based on an extensive Monte Carlo simulation and useful findings are obtained on the aforementioned issues.
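Example (a hedged sketch of the classification step only): in additive-utility sorting methods such as UTADIS, an alternative's global utility is the sum of marginal utilities over the criteria, and the class is found by comparing that utility with cut-off thresholds. The sketch assumes the marginal utilities and thresholds are already known; estimating them by mathematical programming, which is the subject of the paper, is not shown. All names, the linear marginal utilities, and the threshold values are hypothetical.

```python
import bisect


def global_utility(alternative, marginal_utilities):
    """Additive utility: sum of marginal utilities over all criteria."""
    return sum(u(alternative[c]) for c, u in marginal_utilities.items())


def assign_class(utility, thresholds):
    """Return the ordinal class index for a global utility.

    thresholds: ascending cut-offs; class 0 is the least preferred group
    and class len(thresholds) the most preferred.
    """
    return bisect.bisect_right(thresholds, utility)


# Assumed marginal utilities (linear here purely for illustration).
marginal = {
    "profitability": lambda x: 0.6 * x,
    "liquidity": lambda x: 0.4 * x,
}
thresholds = [0.3, 0.7]

firm = {"profitability": 1.0, "liquidity": 0.5}
firm_class = assign_class(global_utility(firm, marginal), thresholds)
```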

01 Jan 2001
TL;DR: A general model of conjoint measurement is proposed that, using the basic concepts of the rough set approach, is able to represent the inconsistencies often encountered in preferential information (due to hesitation of decision makers, unstable character of their preferences, imprecise or incomplete information and the like) by a specific discriminant function, and also in a meaningful way by "if...,then..." decision rules induced from rough approximations.
Abstract: We consider a multicriteria sorting problem consisting in assignment of some actions to some predefined and preference-ordered decision classes. The actions are described by a finite set of criteria. The sorting task is usually performed using one of three preference models: discriminant function (as in scoring methods, discriminant analysis, UTADIS), outranking relation (as in ELECTRE TRI) or decision rules. A challenging problem in multicriteria sorting is the aggregation of ordinal criteria. To handle this problem some max-min aggregation operators have been considered, with the most general one - the fuzzy integral of Sugeno (1974). We show that the decision rule model has some advantages over the integral of Sugeno. More generally, we consider the multicriteria sorting problem in terms of conjoint measurement and prove a representation theorem stating an equivalence of a very simple cancellation property, a general discriminant function and a specific outranking relation, on the one hand, and a decision rule model on the other hand. Moreover, we consider a more general decision rule model based on the rough sets theory being one of emerging methodologies for extraction of knowledge from data. The advantage of the rough sets approach in comparison to competitive methodologies is the possibility of handling inconsistent data that are often encountered in preferential information, due to hesitation of decision makers, unstable character of their preferences, imprecise or incomplete information and the like. Therefore, we propose a general model of conjoint measurement that, using the basic concepts of the rough set approach (lower and upper approximation), is able to represent these inconsistencies by a specific discriminant function. We show that these inconsistencies can also be represented in a meaningful way by "if...,then..." decision rules induced from rough approximations.

Patent
06 Aug 2001
TL;DR: An apparatus and method for sorting used batteries, comprising an intake station, an outlet station, and a pre-sorting station between them that connects to two re-sorting stations arranged for manual examination and removal of undesirable batteries and further objects.
Abstract: Apparatus and method for sorting used batteries, comprising an intake station, an outlet station and a pre-sorting station disposed between the intake station and the outlet station, which pre-sorting station connects to a first re-sorting station and a second re-sorting station. The two re-sorting stations are arranged for manual examination and removal of undesirable batteries and further objects, as well as for manual sorting of batteries and other objects that land in the re-sorting stations during operation of the apparatus.

Patent
18 Jun 2001
TL;DR: An electronic commerce system (10) includes a server (40) that communicates a search query and sorting parameters to one or more seller databases (32), receives sorted local search results from them, and merges and sorts those results before communicating them to a user.
Abstract: An electronic commerce system (10) includes a server (40) operating on one or more computers that communicates a search query for one or more products to one or more seller databases (32) that contain product data. Each seller database (32) generates local search results that are responsive to the search query. The server (40) also communicates one or more sorting parameters to the seller databases (32). The sorting parameters direct each seller database (32) to sort local search results generated at each seller database (32) according to the sorting parameters in response to the search query. In addition, the server (40) receives sorted local search results from one or more of the seller databases (32) and merges the sorted local search results received from the seller databases (32) to generate merged search results. Furthermore, the server (40) sorts the merged search results according to the sorting parameters and communicates the sorted merged search results to a user.
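Example (a minimal sketch, not the patented system): because each seller database returns results already sorted by the same parameter, the server-side step can be illustrated as a k-way merge. The sketch uses Python's standard `heapq.merge`; the field name `price` and the function name are assumptions.

```python
import heapq


def merge_sorted_results(local_results, sort_key, descending=False):
    """Merge per-seller result lists that are each already sorted.

    local_results: list of lists, each sorted by sort_key in the same
        direction as `descending`.
    Returns one combined list in that order.
    """
    return list(heapq.merge(*local_results, key=sort_key, reverse=descending))


seller_a = [{"seller": "A", "price": 5}, {"seller": "A", "price": 20}]
seller_b = [{"seller": "B", "price": 7}, {"seller": "B", "price": 9}]
merged = merge_sorted_results([seller_a, seller_b], sort_key=lambda r: r["price"])
```

Pushing the sort to the sellers means the server only interleaves the lists rather than sorting all results from scratch.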

Proceedings ArticleDOI
Anupam Gupta, Amit Kumar
14 Oct 2001
TL;DR: It is shown that much better results are possible in the structured cost model than in the case where no assumptions are made about comparison costs; the authors argue that most practical applications will have some structured cost property.
Abstract: The study of the effect of priced information on basic algorithmic problems was initiated by M. Charikar et al. (2000). The authors continue the study of sorting and selection in the priced comparison model, i.e., when each comparison has an associated cost, and answer some of the open problems suggested by Charikar et al. If the comparison costs are allowed to be arbitrary, we show that one cannot get good approximation ratios. A different way to assign costs is based on the idea that one can distill out an intrinsic value for each item being compared, such that the cost of comparing two elements is some "well-behaved" or "structured" function of their values. We feel that most practical applications will have some structured cost property. The authors study the problems of sorting and selection (which includes finding the maximum and the median) in the structured cost model. We get a variety of approximation results for these problems, depending on the restrictions we put on the structured costs. We show that it is possible to get much improved results with the structured cost model than the case when we do not have any assumptions on comparison costs.

Posted Content
TL;DR: This paper examines the education literature through the lens of sorting and argues that how individuals sort across neighborhoods, schools, and households (spouses) can have important consequences for the acquisition of human capital and inequality.
Abstract: This paper examines the education literature through the lens of sorting. It argues that how individuals sort across neighborhoods, schools, and households (spouses) can have important consequences for the acquisition of human capital and inequality. It discusses the implications of different education finance systems for sorting and analyses the efficiency and welfare properties of these in static and dynamic frameworks.