Journal ArticleDOI

Self-organizing lists and independent references: a statistical synergy

01 Dec 1991-Journal of Algorithms (Academic Press, Inc.)-Vol. 12, Iss: 4, pp 533-555
TL;DR: The paper points out that the counter scheme (CS) interacts with the access model to produce some remarkable synergistic effects that make it possible to use very effective "truncated" versions of the CS, which have very modest space requirements.
About: This article was published in Journal of Algorithms on 1991-12-01. It has received 13 citations to date. The article focuses on the topic: Probability vector.
Citations
Journal ArticleDOI
TL;DR: A standard combinatorial problem is to estimate the number of coupons, drawn at random, needed to complete a collection of all possible m types, and two computational paradigms are shown that are well suited for this type of problem.
Abstract: A standard combinatorial problem is to estimate the number (T) of coupons, drawn at random, needed to complete a collection of all possible m types. Generalizations of this problem have found many engineering applications. The usefulness of the model is hampered by the difficulties in obtaining numerical results for moments or distributions. We show two computational paradigms that are well suited for this type of problem: one, following Flajolet et al. [21], is the calculus of generating functions over regular languages. We use it to provide relatively efficient answers to several questions about the sampling process – we show it is possible to compute arbitrarily accurate approximations for quantities such as E[T], in a time which is linear in m for any type distribution, while an exponential time is required for exact calculation. It also leads to a proof of a long-standing folk-theorem, concerning the extremality of uniform reference probabilities. The second method is a generalization of the Poisson...
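To make the "exponential time for exact calculation" remark concrete, here is a minimal Python sketch. It does not use the generating-function method of the abstract; it uses the standard inclusion-exclusion identity E[T] = Σ over nonempty subsets S of (−1)^(|S|+1) / Σ_{i∈S} p_i, which visits all 2^m − 1 subsets. The function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def expected_draws_exact(p):
    """Exact E[T] for type probabilities p, by inclusion-exclusion.

    Sums (-1)^(|S|+1) / sum(p_i for i in S) over every nonempty subset S,
    so the running time grows exponentially with len(p).
    """
    m = len(p)
    total = 0.0
    for k in range(1, m + 1):
        sign = 1.0 if k % 2 == 1 else -1.0
        for subset in combinations(p, k):
            total += sign / sum(subset)
    return total


def expected_draws_uniform(m):
    """Closed form for the uniform case: m * H_m (H_m = m-th harmonic number)."""
    return m * sum(1.0 / j for j in range(1, m + 1))


if __name__ == "__main__":
    print(expected_draws_exact([1 / 3, 1 / 3, 1 / 3]))  # uniform, m = 3
    print(expected_draws_exact([0.9, 0.1]))              # skewed, m = 2
```

For two types, the uniform vector gives E[T] = 3 while the skewed vector (0.9, 0.1) gives a larger value, which is consistent with the extremality of uniform probabilities mentioned in the abstract.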

106 citations

01 Jan 1989
TL;DR: The calculus of generating functions over regular languages is applied to the problem to answer numerous questions about the sampling process with demonstrated numerical efficiency, and a proof of a long-standing folk-theorem is presented.
Abstract: A standard combinatorial problem calls to estimate the expected number of purchases of coupons needed to complete the collection of all possible m different types. Generalizing this problem, by letting the coupons be obtained with an arbitrary probability distribution, and considering other related processes, the problem has been found to model many practical situations. The usefulness of this model has been seriously hampered by the computational difficulties in obtaining any numerical results concerning moments or distributions. We show, following Flajolet et al. [15], that the calculus of generating functions over regular languages may be applied to the problem, answer numerous questions about the sampling process and demonstrate their numerical efficiency. We also present a proof of a long-standing folk-theorem concerning the extremality of uniform reference probabilities. The paper concludes with a discussion of estimation problems related to the engineering applications of this problem.

22 citations


Cites background from "Self-organizing lists and independe..."

  • ...Good estimates, especially for the smaller probabilities, require an inordinately long sampling time, typically much longer than E[T(p)] (see [21])....

    [...]

Journal ArticleDOI
TL;DR: In this paper, the authors model the sequence of required keys as a Markov chain with transition kernel P and study the class of kernels for which move-to-front is optimal among on-line rules with respect to the stationary search cost.
Abstract: In papers about self-organizing data structures, it is often mentioned that the assumption of independence of successive requests of keys should be relaxed and that the dependence should assume the form of a locality phenomenon. In this setting, the move-to-front rule is considered to be of interest, but no optimality result concerning this rule has yet appeared. In this paper we assume that the sequence of required keys is a Markov chain with a transition kernel $P$ and we consider the class $\mathscr{F}^\ast$ of stochastic matrices $P$ such that move-to-front is optimal among on-line rules, with respect to the stationary search cost. We give properties of $\mathscr{F}^\ast$ that bear out the usual explanation of optimality of move-to-front by a locality phenomenon exhibited by the sequence of required keys. We explicitly produce a large subclass of $\mathscr{F}^\ast$, while showing that in some cases move-to-front is optimal with respect to the speed of convergence toward stationary search cost.

13 citations

Book ChapterDOI
01 Jan 1999
TL;DR: Tight bounds on the decay rate of tail probabilities of sums of independent random variables are usually obtained from the Chernoff or Hoeffding inequality; this chapter draws attention to the stronger Bennett inequality, which also takes the variances into account.
Abstract: In Computer Science and Statistics it is often desirable to obtain tight bounds on the decay rate of probabilities of the type \(\Pr \left\{ {{S_n} - E\left[ {{S_n}} \right] \geqslant na} \right\}\), where \(S_n\) is a sum of independent random variables \(\left\{ {{X_i}} \right\}_{i = 1}^n\). This is usually done by means of the Chernoff inequality, or the more general Hoeffding inequality. The latter inequality is asymptotically optimal as far as the expectations of the \(X_i\) go, but ceases to be so when the variances are also given. The variances are taken into account in the stronger Bennett inequality, which despite its potential usefulness is virtually unknown in the CS community.
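The contrast drawn in the abstract can be illustrated numerically. The sketch below evaluates the textbook forms of the two bounds for a sum of n independent variables; it is an illustration under stated assumptions, not the chapter's own derivation. Assumptions: every X_i lies in an interval of length `width`, satisfies X_i − E[X_i] ≤ `max_dev`, and has variance at most `variance`. The function and parameter names are hypothetical.

```python
import math


def hoeffding_bound(n, a, width):
    """Hoeffding: Pr{S_n - E[S_n] >= n*a} <= exp(-2*n*a^2 / width^2).

    Uses only the range of the X_i, not their variances.
    """
    return math.exp(-2.0 * n * a * a / (width * width))


def bennett_bound(n, a, variance, max_dev):
    """Bennett: Pr{S_n - E[S_n] >= n*a} <= exp(-(n*v / M^2) * h(M*a / v)),

    with v = per-variable variance bound, M = max_dev,
    and h(u) = (1 + u) * log(1 + u) - u.
    """
    u = max_dev * a / variance
    h = (1.0 + u) * math.log1p(u) - u
    return math.exp(-n * variance / (max_dev * max_dev) * h)


if __name__ == "__main__":
    n, a = 100, 0.1
    print(hoeffding_bound(n, a, width=1.0))
    print(bennett_bound(n, a, variance=0.01, max_dev=1.0))
```

With variables in [0, 1] and a small variance such as 0.01, the Bennett bound is many orders of magnitude smaller than the Hoeffding bound. When the variance is close to its maximum of 0.25, the Hoeffding bound can be the tighter of the two.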

11 citations


Cites background or methods from "Self-organizing lists and independe..."

  • ...The Counter Scheme (CS), which maintains a reference count for each element, and rearranges the list in decreasing order of the counters, can be shown to converge to the optimal ordering [7]....

    [...]

  • ...Hofri and Shachnai present in [7] a stopping point for this reorganization process in the case in which the vector of access probabilities p̄ = (p1, ....

    [...]

  • ...In [7] it is shown using the additivity of expectation that Cm(CS|p̄) = C(OPT | p̄) + ∑...

    [...]

  • ..., [8, 7]), in which a set of n items held as a linear list is accessed randomly, according to some fixed probability distribution....

    [...]

Journal ArticleDOI
TL;DR: The counter scheme is defined, a policy that keeps the records sorted by their access frequencies, and it is proved that among all deterministic policies it produces the least expected cost of access, at any time.
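As an illustration of the policy described here, the following is a minimal Python sketch of a counter-scheme list: each record carries an access counter and the list is kept in non-increasing counter order. The class name `CounterList` and the tie-breaking rule (an accessed record moves ahead only of records with strictly smaller counts) are assumptions made for the example and are not taken from the paper.

```python
class CounterList:
    """Linear list kept sorted by access counts (hypothetical sketch)."""

    def __init__(self, items):
        self.items = list(items)
        self.counts = {x: 0 for x in self.items}

    def access(self, key):
        """Search for key, update its counter, restore the order.

        Returns the search cost, i.e. the 1-based position of key
        before any rearrangement.
        """
        pos = self.items.index(key)
        self.counts[key] += 1
        i = pos
        # Assumed tie-breaking: pass only records with strictly smaller counts.
        while i > 0 and self.counts[self.items[i - 1]] < self.counts[key]:
            self.items[i - 1], self.items[i] = self.items[i], self.items[i - 1]
            i -= 1
        return pos + 1


if __name__ == "__main__":
    lst = CounterList("abc")
    costs = [lst.access(k) for k in "ccb"]
    print(costs, lst.items)  # [3, 1, 3] ['c', 'b', 'a']
```

Because a single access raises one counter by one, a single pass toward the front is enough to restore the sorted order.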

8 citations

References
Journal ArticleDOI
TL;DR: This article shows that move-to-front is within a constant factor of optimum among a wide class of list maintenance rules, and analyzes the amortized complexity of LRU, showing that its efficiency differs from that of the off-line paging rule by a factor that depends on the size of fast memory.
Abstract: In this article we study the amortized efficiency of the “move-to-front” and similar rules for dynamically maintaining a linear list. Under the assumption that accessing the ith element from the front of the list takes t(i) time, we show that move-to-front is within a constant factor of optimum among a wide class of list maintenance rules. Other natural heuristics, such as the transpose and frequency count rules, do not share this property. We generalize our results to show that move-to-front is within a constant factor of optimum as long as the access cost is a convex function. We also study paging, a setting in which the access cost is not convex. The paging rule corresponding to move-to-front is the “least recently used” (LRU) replacement rule. We analyze the amortized complexity of LRU, showing that its efficiency differs from that of the off-line paging rule (Belady's MIN algorithm) by a factor that depends on the size of fast memory. No on-line paging algorithm has better amortized performance.

2,378 citations


"Self-organizing lists and independe..." refers background in this paper

  • ...Recently, there has appeared some work [Sleator and Tarjan (1985), Bentley and McGeoch (1985)] that considers not the expected cost of MTF, but rather the highest possible cost (worst case), when averaged (or – as called in that context – amortized) over a long reference sequence....

    [...]

Book
01 Jan 1977
TL;DR: This book covers statistical procedures for selecting and ordering populations, from the underlying philosophy and the selection of the one best population under normal, binomial, and multinomial models to subset selection and complete ordering.
Abstract:
1. The Philosophy of Selecting and Ordering Populations
2. Selecting the One Best Population for Normal Distributions with Common Known Variance
3. Selecting the One Best Population for Other Normal Distribution Models
4. Selecting the One Best Population for Binomial (or Bernoulli) Distributions
5. Selecting the One Normal Population with the Smallest Variance
6. Selecting the One Best Category for the Multinomial Distribution
7. Nonparametric Selection Procedures
8. Selection Procedures for a Design with Paired Comparisons
9. Selecting the Normal Population with the Best Regression Value
10. Selecting Normal Populations Better than a Control
11. Selecting the t Best Out of k Populations
12. Complete Ordering of k Populations
13. Subset Selection (or Elimination) Procedures
14. Selecting the Best Gamma Population
15. Selection Procedures for Multivariate Normal Distributions
Appendix A. Tables for Normal Means Selection Problems
Appendix B. Figures for Normal Means Selection Problems
Appendix C. Table of the Cumulative Standard Normal Distribution F(z)
Appendix D. Table of Critical Values for the Chi-Square Distribution
Appendix E. Tables for Binomial Selection Problems
Appendix F. Figures for Binomial Selection Problems
Appendix G. Tables for Normal Variances Selection Problems
Appendix H. Tables for Multinomial Selection Problems
Appendix I. Curtailment Tables for the Multinomial Selection Problem
Appendix J. Tables of the Incomplete Beta Function
Appendix K. Tables for Nonparametric Selection Problems
Appendix L. Tables for Paired-Comparison Selection Problems
Appendix M. Tables for Selecting from k Normal Populations Those Better Than a Control
Appendix N. Tables for Selecting the t Best Normal Populations
Appendix O. Table of Critical Values of Fisher's F Distribution
Appendix P. Tables for Complete Ordering Problems
Appendix Q. Tables for Subset Selection Problems
Appendix R. Tables for Gamma Distribution Problems
Appendix S. Tables for Multivariate Selection Problems
Appendix T. Excerpt of Table of Random Numbers
Appendix U. Table of Squares and Square Roots
Bibliography
References for Applications
Index for Data and Examples
Name Index
Subject Index

357 citations


"Self-organizing lists and independe..." refers background in this paper

  • ...We conclude with a discussion of the assumptions of the model and point out that there are many questions concerning it, and its immediate extensions, that are as yet unanswered....

    [...]

  • ...There is a substantial statistical literature on discriminating multinomial probabilities, under various requirements; a comprehensive account is Gibbons et al. (1977)....

    [...]

  • ...…the other hand, when p_i − p_j > 0 is very small, even though the number of references required to order them correctly with high probability is huge [Gibbons et al. (1977)], the corresponding penalty of incorrect order is minute (even when p_i and p_j proper are not small – since the added cost is…...

    [...]

Journal ArticleDOI
TL;DR: Empirical evidence suggests that transposition is in fact optimal for any distribution of search probabilities, and the "move to front" and "transposition" heuristics are shown to be optimal to within a constant factor.
Abstract: This paper examines a class of heuristics for maintaining a sequential list in approximately optimal order with respect to the average time required to search for a specified element, assuming that each element is searched for with a fixed probability independent of previous searches performed. The “move to front” and “transposition” heuristics are shown to be optimal to within a constant factor, and the transposition rule is shown to be the more efficient of the two. Empirical evidence suggests that transposition is in fact optimal for any distribution of search probabilities.

244 citations


"Self-organizing lists and independe..." refers background or methods in this paper

  • ...We conclude with a discussion of the assumptions of the model and point out that there are many questions concerning it, and its immediate extensions, that are as yet unanswered....

    [...]

  • ...This rule was found to be more efficient asymptotically than the MTF scheme [Hendricks (1976), Rivest (1976)]....

    [...]

  • ...Its asymptotic cost per access under various rpv’s has been discussed by Bitner (1979), Burville and Kingman (1973), Hendricks (1976), Knuth (1973), McCabe (1965) and Rivest (1976)....

    [...]

  • ...This holds true independently of the list order and the history of past accesses....

    [...]

Journal ArticleDOI
John McCabe
TL;DR: The theory of regular Markov chains is used to demonstrate the existence of EX and to show that the law of large numbers holds, where EX is the limiting average position of a queried record.
Abstract: A serial file is considered in which, after a query is processed, the order in the file is changed by moving the record to which the query referred into the first place in the file. The theory of regular Markov chains is used to demonstrate the existence of EX and to show that the law of large numbers holds, where EX is the limiting average position of a queried record. A closed-form expression for EX is determined. A second method of relocation is proposed, in which the queried record exchanges positions with the record immediately before it in the file. It is conjectured that this method of relocation is at least as good as the first method. It is pointed out that, whichever method of relocation is used, if one only relocated after every Mth query, the limiting average position of a queried record is the same as it is if we relocate after every query.
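The two relocation methods described in this abstract can be sketched in a few lines of Python. The function names are hypothetical; each function returns the 1-based position at which the queried record was found and then rearranges the list in place.

```python
def move_to_front(records, key):
    """First method: move the queried record into the first place."""
    pos = records.index(key)
    records.insert(0, records.pop(pos))
    return pos + 1


def transpose(records, key):
    """Second method: swap the queried record with the one just before it."""
    pos = records.index(key)
    if pos > 0:
        records[pos - 1], records[pos] = records[pos], records[pos - 1]
    return pos + 1


if __name__ == "__main__":
    a = list("abcd")
    print(move_to_front(a, "c"), a)  # 3 ['c', 'a', 'b', 'd']
    b = list("abcd")
    print(transpose(b, "c"), b)      # 3 ['a', 'c', 'b', 'd']
```

Averaging the returned positions over a long run of independent queries gives an empirical estimate of the limiting average position that the abstract calls EX.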

159 citations


"Self-organizing lists and independe..." refers background in this paper

  • ...Its asymptotic cost per access under various rpv’s has been discussed by Bitner (1979), Burville and Kingman (1973), Hendricks (1976), Knuth (1973), McCabe (1965) and Rivest (1976)....

    [...]

  • ...This holds true independently of the list order and the history of past accesses....

    [...]

Journal ArticleDOI
TL;DR: Experiments show that the behavior of the heuristics on real data is more closely described by the amortized analyses than by the probabilistic analyses.
Abstract: The performance of sequential search can be enhanced by the use of heuristics that move elements closer to the front of the list as they are found. Previous analyses have characterized the performance of such heuristics probabilistically. In this article, we use amortization to analyze the heuristics in a worst-case sense; the relative merit of the heuristics in this analysis differs from that in the probabilistic analyses. Experiments show that the behavior of the heuristics on real data is more closely described by the amortized analyses than by the probabilistic analyses.

148 citations