Topic

Trie

About: Trie is a research topic. Over its lifetime, 2,245 publications have been published on this topic, receiving 40,445 citations. The topic is also known as: digital tree & radix tree.


Papers
Proceedings Article
30 Jul 2011
TL;DR: KenLM is a library that implements two data structures for efficient language model queries, reducing both time and memory costs, and is integrated into the Moses, cdec, and Joshua translation systems.
Abstract: We present KenLM, a library that implements two data structures for efficient language model queries, reducing both time and memory costs. The Probing data structure uses linear probing hash tables and is designed for speed. Compared with the widely-used SRILM, our Probing model is 2.4 times as fast while using 57% of the memory. The Trie data structure is a trie with bit-level packing, sorted records, interpolation search, and optional quantization aimed at lower memory consumption. Trie simultaneously uses less memory than the smallest lossless baseline and less CPU than the fastest baseline. Our code is open-source, thread-safe, and integrated into the Moses, cdec, and Joshua translation systems. This paper describes the several performance techniques used and presents benchmarks against alternative implementations.
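The Trie structure above locates child records with interpolation search over sorted word identifiers. KenLM itself is a C++ library; the Python sketch below only illustrates interpolation search on a sorted list of integer keys, and none of the names come from KenLM's API.

```python
def interpolation_search(keys, target):
    """Locate target in a sorted list of integer keys.

    Interpolation search guesses the position from the key's value relative
    to the endpoints, which beats binary search when keys are roughly
    uniformly distributed (as hashed word IDs tend to be)."""
    lo, hi = 0, len(keys) - 1
    while lo <= hi and keys[lo] <= target <= keys[hi]:
        if keys[hi] == keys[lo]:
            pos = lo
        else:
            # Estimate the position by linear interpolation between endpoints.
            pos = lo + (target - keys[lo]) * (hi - lo) // (keys[hi] - keys[lo])
        if keys[pos] == target:
            return pos
        if keys[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1  # not found


# Example: find the record index for word ID 42 in a sorted child list.
interpolation_search([3, 8, 15, 42, 77, 101], 42)   # -> 3
```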

1,297 citations

Journal Article
Edward Fredkin
TL;DR: In this paper, several paradigms of trie memory are described and compared with other memory paradigms, their advantages and disadvantages are examined in detail, and applications are discussed.
Abstract: Trie memory is a way of storing and retrieving information. It is applicable to information that consists of function-argument (or item-term) pairs--information conventionally stored in unordered lists, ordered lists, or pigeonholes. The main advantages of trie memory over the other memory plans just mentioned are shorter access time, greater ease of addition or up-dating, greater convenience in handling arguments of diverse lengths, and the ability to take advantage of redundancies in the information stored. The main disadvantage is relative inefficiency in using storage space, but this inefficiency is not great when the store is large. In this paper several paradigms of trie memory are described and compared with other memory paradigms, their advantages and disadvantages are examined in detail, and applications are discussed. Many essential features of trie memory were mentioned by de la Briandais [1] in a paper presented to the Western Joint Computer Conference in 1959. The present development is essentially independent of his, having been described in memorandum form in January 1959 [2], and it is fuller in that it considers additional paradigms (finite-dimensional trie memories) and includes experimental results bearing on the efficiency of utilization of storage space.
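As an illustration of the storage-and-retrieval scheme the abstract describes, here is a minimal trie for function-argument (key-value) pairs. It is a Python sketch that uses dictionaries for child links; Fredkin's paper discusses register-array paradigms rather than hash maps, so the layout here is only an assumption for readability.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # next symbol -> TrieNode
        self.value = None    # payload stored at the end of a key, if any


class Trie:
    """Store and retrieve (argument, value) pairs keyed by strings of symbols."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, value):
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.value = value

    def lookup(self, key):
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.value


# Example usage: store and retrieve one (argument, value) pair.
t = Trie()
t.insert("trie", "a kind of digital tree")
t.lookup("trie")   # -> "a kind of digital tree"
t.lookup("tree")   # -> None
```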

1,144 citations

Journal Article
TL;DR: An algorithm for combinatorial optimization where an explicit check for the repetition of configurations is added to the basic scheme of Tabu search, and it is shown that the Hashing or Digital Tree techniques can be used in order to search for repetitions in a time that is approximately constant.
Abstract: We propose an algorithm for combinatorial optimization where an explicit check for the repetition of configurations is added to the basic scheme of Tabu search. In our Tabu scheme the appropriate size of the list is learned in an automated way by reacting to the occurrence of cycles. In addition, if the search appears to be repeating an excessive number of solutions excessively often, then the search is diversified by making a number of random moves proportional to a moving average of the cycle length. The reactive scheme is compared to a "strict" Tabu scheme that forbids the repetition of configurations and to schemes with a fixed or randomly varying list size. From the implementation point of view we show that the Hashing or Digital Tree techniques can be used in order to search for repetitions in a time that is approximately constant. We present the results obtained for a series of computational tests on a benchmark function, on the 0-1 Knapsack Problem, and on the Quadratic Assignment Problem.
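A minimal sketch of the repetition check described above, assuming configurations are 0-1 solution vectors: a hash table maps each visited configuration to the iteration at which it was last seen, giving roughly constant average lookup time. The paper's alternative digital-tree (trie) index is not shown, and the helper names are illustrative.

```python
def make_repetition_checker():
    """Hash-table check for repeated configurations, in the spirit of the
    repetition test the abstract describes (the alternative is a digital
    tree / trie index). Average lookup cost is roughly constant."""
    last_seen = {}  # configuration (as a tuple) -> iteration of last visit

    def check(config, iteration):
        # Return the iteration this configuration was last seen at
        # (or None if it is new), then record the current visit.
        key = tuple(config)
        prev = last_seen.get(key)
        last_seen[key] = iteration
        return prev

    return check


# Example: detect that a 0-1 knapsack configuration repeats at iteration 7.
check = make_repetition_checker()
check([0, 1, 1, 0], 3)   # -> None (first visit)
check([0, 1, 1, 0], 7)   # -> 3 (seen before; cycle length 4)
```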

865 citations

Journal Article
TL;DR: This paper proposes three novel tree structures for efficient incremental and interactive HUP mining; the first can capture incremental data without any restructuring operation, and performance analyses show that the tree structures are efficient and scalable.
Abstract: Recently, high utility pattern (HUP) mining is one of the most important research issues in data mining due to its ability to consider the nonbinary frequency values of items in transactions and different profit values for every item. On the other hand, incremental and interactive data mining provide the ability to use previous data structures and mining results in order to reduce unnecessary calculations when a database is updated, or when the minimum threshold is changed. In this paper, we propose three novel tree structures to efficiently perform incremental and interactive HUP mining. The first tree structure, Incremental HUP Lexicographic Tree (IHUPL-Tree), is arranged according to an item's lexicographic order. It can capture the incremental data without any restructuring operation. The second tree structure is the IHUP transaction frequency tree (IHUPTF-Tree), which obtains a compact size by arranging items according to their transaction frequency (descending order). To reduce the mining time, the third tree, IHUP-transaction-weighted utilization tree (IHUPTWU-Tree) is designed based on the TWU value of items in descending order. Extensive performance analyses show that our tree structures are very efficient and scalable for incremental and interactive HUP mining.
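To make the fixed-ordering idea concrete, the sketch below inserts transactions into a prefix tree whose items are kept in lexicographic order, mirroring the arrangement the abstract attributes to IHUPL-Tree. The utility bookkeeping and the frequency/TWU orderings of the other two trees are omitted, and the class and function names are illustrative, not from the paper.

```python
class ItemNode:
    def __init__(self, item):
        self.item = item
        self.count = 0          # transactions passing through this node
        self.children = {}      # item -> ItemNode


def insert_transaction(root, transaction):
    """Insert one transaction with items in lexicographic order.

    Because the order never depends on the data, appending new transactions
    never triggers a restructuring pass over the tree."""
    node = root
    for item in sorted(transaction):
        node = node.children.setdefault(item, ItemNode(item))
        node.count += 1


# Example: two transactions share the "bread" -> "egg" prefix path.
root = ItemNode(None)
insert_transaction(root, ["milk", "bread", "egg"])
insert_transaction(root, ["egg", "bread"])
```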

555 citations

Journal Article
TL;DR: The main technique, controlled prefix expansion, transforms a set of prefixes into an equivalent set with fewer prefix lengths; optimization techniques based on dynamic programming and local transformations of data structures are used to improve cache behavior.
Abstract: Internet (IP) address lookup is a major bottleneck in high-performance routers. IP address lookup is challenging because it requires a longest matching prefix lookup. It is compounded by increasing routing table sizes, increased traffic, higher-speed links, and the migration to 128-bit IPv6 addresses. We describe how IP lookups and updates can be made faster using a set of transformation techniques. Our main technique, controlled prefix expansion, transforms a set of prefixes into an equivalent set with fewer prefix lengths. In addition, we use optimization techniques based on dynamic programming, and local transformations of data structures to improve cache behavior. When applied to trie search, our techniques provide a range of algorithms (Expanded Tries) whose performance can be tuned. For example, using a processor with 1MB of L2 cache, search of the MaeEast database containing 38000 prefixes can be done in 3 L2 cache accesses. On a 300MHz Pentium II which takes 4 cycles for accessing the first word of the L2 cacheline, this algorithm has a worst-case search time of 180 nsec., a worst-case insert/delete time of 2.5 msec., and an average insert/delete time of 4 usec. Expanded tries provide faster search and faster insert/delete times than earlier lookup algorithms. When applied to Binary Search on Levels, our techniques improve worst-case search times by nearly a factor of 2 (using twice as much storage) for the MaeEast database. Our approach to algorithm design is based on measurements using the VTune tool on a Pentium to obtain dynamic clock cycle counts. Our techniques also apply to similar address lookup problems in other network protocols.
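The core transformation, controlled prefix expansion, can be sketched in a few lines: each prefix is expanded to the nearest allowed length, and longer original prefixes are applied last so they overwrite any expansion that would shadow them, preserving longest-match semantics. The function below is an illustrative Python sketch over bit-string prefixes, not the authors' implementation, and it omits the dynamic-programming choice of the allowed length set.

```python
def controlled_prefix_expansion(prefixes, allowed_lengths):
    """Expand a longest-prefix-match table so every prefix has an allowed length.

    Shorter prefixes are processed first, so longer original prefixes overwrite
    the expansions that would shadow them. Prefixes are bit strings such as
    "1011"; assumes max(allowed_lengths) covers the longest prefix."""
    allowed_lengths = sorted(allowed_lengths)
    table = {}
    for prefix, next_hop in sorted(prefixes.items(), key=lambda kv: len(kv[0])):
        # Smallest allowed length that can hold this prefix.
        length = next(l for l in allowed_lengths if l >= len(prefix))
        pad = length - len(prefix)
        for i in range(1 << pad):
            suffix = format(i, "b").zfill(pad) if pad else ""
            table[prefix + suffix] = next_hop
    return table


# Example: collapse prefix lengths {1, 3} down to the single length 3.
# controlled_prefix_expansion({"0": "A", "011": "B"}, [3])
# -> {"000": "A", "001": "A", "010": "A", "011": "B"}
```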

514 citations


Network Information
Related Topics (5)
Graph (abstract data type): 69.9K papers, 1.2M citations, 87% related
Scalability: 50.9K papers, 931.6K citations, 86% related
Server: 79.5K papers, 1.4M citations, 86% related
Cache: 59.1K papers, 976.6K citations, 84% related
Web page: 50.3K papers, 975.1K citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    36
2022    65
2021    56
2020    79
2019    110
2018    87