
Showing papers by "AT&T Labs" published in 1999


Proceedings Article
29 Nov 1999
TL;DR: This paper proves for the first time that a version of policy iteration with arbitrary differentiable function approximation is convergent to a locally optimal policy.
Abstract: Function approximation is essential to reinforcement learning, but the standard approach of approximating a value function and determining a policy from it has so far proven theoretically intractable. In this paper we explore an alternative approach in which the policy is explicitly represented by its own function approximator, independent of the value function, and is updated according to the gradient of expected reward with respect to the policy parameters. Williams's REINFORCE method and actor-critic methods are examples of this approach. Our main new result is to show that the gradient can be written in a form suitable for estimation from experience aided by an approximate action-value or advantage function. Using this result, we prove for the first time that a version of policy iteration with arbitrary differentiable function approximation is convergent to a locally optimal policy.
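As a concrete illustration of the policy-gradient idea the abstract describes, here is a minimal REINFORCE-style sketch: a softmax policy is updated along the sampled gradient of expected reward. The toy bandit, step sizes, and baseline are illustrative assumptions, not details from the paper.

```python
# Minimal REINFORCE-style sketch (hedged): update softmax policy
# parameters along a sampled gradient of expected reward.
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.8])   # hypothetical action rewards
theta = np.zeros(3)                       # policy parameters
alpha = 0.1                               # step size

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

baseline = 0.0
for t in range(2000):
    pi = softmax(theta)
    a = rng.choice(3, p=pi)
    r = rng.normal(true_means[a], 0.1)    # sampled reward
    grad_log_pi = -pi                     # grad of log pi(a) for softmax:
    grad_log_pi[a] += 1.0                 # one-hot(a) - pi
    theta += alpha * (r - baseline) * grad_log_pi
    baseline += 0.01 * (r - baseline)     # running-average baseline

print("learned policy:", softmax(theta)) # mass concentrates on the best arm
```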

5,492 citations


Journal ArticleDOI
Vladimir Vapnik
TL;DR: Demonstrates how abstract learning theory established conditions for generalization that are more general than those discussed in classical statistical paradigms, and how the understanding of these conditions inspired new algorithmic approaches to function estimation problems.
Abstract: Statistical learning theory was introduced in the late 1960's. Until the 1990's it was a purely theoretical analysis of the problem of function estimation from a given collection of data. In the middle of the 1990's new types of learning algorithms (called support vector machines) based on the developed theory were proposed. This made statistical learning theory not only a tool for the theoretical analysis but also a tool for creating practical algorithms for estimating multidimensional functions. This article presents a very general overview of statistical learning theory including both theoretical and algorithmic aspects of the theory. The goal of this overview is to demonstrate how the abstract learning theory established conditions for generalization which are more general than those discussed in classical statistical paradigms and how the understanding of these conditions inspired new algorithmic approaches to function estimation problems.

5,370 citations


Proceedings ArticleDOI
27 Sep 1999
TL;DR: Discusses some of the research challenges in understanding context and in developing context-aware applications, which are increasingly important in the fields of handheld and ubiquitous computing, where the user's context changes rapidly.
Abstract: When humans talk with humans, they are able to use implicit situational information, or context, to increase the conversational bandwidth. Unfortunately, this ability to convey ideas does not transfer well to humans interacting with computers. In traditional interactive computing, users have an impoverished mechanism for providing input to computers. By improving the computer's access to context, we increase the richness of communication in human-computer interaction and make it possible to produce more useful computational services. The use of context is increasingly important in the fields of handheld and ubiquitous computing, where the user's context is changing rapidly. In this panel, we want to discuss some of the research challenges in understanding context and in developing context-aware applications.

4,842 citations


Journal ArticleDOI
TL;DR: This paper presents a tutorial introduction to the use of variational methods for inference and learning in graphical models (Bayesian networks and Markov random fields), and describes a general framework for generating variational transformations based on convex duality.
Abstract: This paper presents a tutorial introduction to the use of variational methods for inference and learning in graphical models (Bayesian networks and Markov random fields). We present a number of examples of graphical models, including the QMR-DT database, the sigmoid belief network, the Boltzmann machine, and several variants of hidden Markov models, in which it is infeasible to run exact inference algorithms. We then introduce variational methods, which exploit laws of large numbers to transform the original graphical model into a simplified graphical model in which inference is efficient. Inference in the simplified model provides bounds on probabilities of interest in the original model. We describe a general framework for generating variational transformations based on convex duality. Finally we return to the examples and demonstrate how variational algorithms can be formulated in each case.
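To make the variational idea concrete, here is a minimal sketch, assuming a tiny Boltzmann machine small enough that the exact log partition function can be enumerated: mean-field updates produce a factorized approximation whose lower bound can be checked against the exact value. Sizes and parameters are invented for illustration.

```python
# Hedged mean-field sketch: compare the exact log partition function of
# a tiny Boltzmann machine with the mean-field variational lower bound.
import itertools
import numpy as np

rng = np.random.default_rng(1)
n = 4
W = rng.normal(0, 0.5, (n, n)); W = (W + W.T) / 2; np.fill_diagonal(W, 0)
b = rng.normal(0, 0.5, n)

def energy(x):
    return -(0.5 * x @ W @ x + b @ x)

# Exact log Z by brute force over all 2^n binary states.
states = np.array(list(itertools.product([0, 1], repeat=n)), dtype=float)
logZ = np.log(np.sum(np.exp([-energy(s) for s in states])))

# Mean-field fixed point: mu_i = sigmoid(b_i + sum_j W_ij mu_j).
mu = np.full(n, 0.5)
for _ in range(200):
    mu = 1.0 / (1.0 + np.exp(-(b + W @ mu)))

entropy = -np.sum(mu * np.log(mu) + (1 - mu) * np.log(1 - mu))
bound = 0.5 * mu @ W @ mu + b @ mu + entropy   # E_q[-energy] + H(q)
print(f"exact log Z = {logZ:.4f}, mean-field lower bound = {bound:.4f}")
```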

4,093 citations


Journal ArticleDOI
TL;DR: It is shown that options enable temporally abstract knowledge and action to be included in the reinforcement learning framework in a natural and general way, and may be used interchangeably with primitive actions in planning methods such as dynamic programming and in learning methods such as Q-learning.
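A minimal sketch of how an option can be used interchangeably with primitive actions in Q-learning, assuming a toy corridor environment and a single hand-coded "go right" option; the SMDP-style backup discounts by gamma**k for a k-step option. All names and parameters here are illustrative, not from the paper.

```python
# Hedged SMDP Q-learning sketch: options and primitive actions share one
# value table; a k-step option backs up with discount gamma**k.
import numpy as np

N, GOAL, gamma, alpha = 8, 7, 0.95, 0.1
rng = np.random.default_rng(2)

def step(s, a):                      # primitive actions: 0=left, 1=right
    s2 = max(0, s - 1) if a == 0 else min(N - 1, s + 1)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

def run_option(s):                   # option: keep moving right until goal
    total, k, done = 0.0, 0, False
    while not done:
        s, r, done = step(s, 1)
        total += (gamma ** k) * r    # accumulate discounted reward
        k += 1
    return s, total, k, done

Q = np.zeros((N, 3))                 # columns: left, right, "go-right" option
for episode in range(500):
    s, done = 0, False
    while not done:
        a = rng.integers(3) if rng.random() < 0.2 else int(Q[s].argmax())
        if a < 2:
            s2, r, done = step(s, a); k = 1
        else:
            s2, r, k, done = run_option(s)
        target = r + (0.0 if done else (gamma ** k) * Q[s2].max())
        Q[s, a] += alpha * (target - Q[s, a])
        s = s2
print(Q.argmax(axis=1))              # greedy choice per corridor state
```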

3,233 citations


Book
01 Jan 1999
TL;DR: It is shown that using multiple transmit antennas and space-time block coding provides remarkable performance at the expense of almost no extra processing.
Abstract: We document the performance of space-time block codes, which provide a new paradigm for transmission over Rayleigh fading channels using multiple transmit antennas. Data is encoded using a space-time block code, and the encoded data is split into n streams which are simultaneously transmitted using n transmit antennas. The received signal at each receive antenna is a linear superposition of the n transmitted signals perturbed by noise. Maximum likelihood decoding is achieved in a simple way through decoupling of the signals transmitted from different antennas rather than joint detection. This uses the orthogonal structure of the space-time block code and gives a maximum likelihood decoding algorithm which is based only on linear processing at the receiver. We review the encoding and decoding algorithms for various codes and provide simulation results demonstrating their performance. It is shown that using multiple transmit antennas and space-time block coding provides remarkable performance at the expense of almost no extra processing.
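For intuition, here is a hedged sketch of the two-antenna orthogonal design (the Alamouti scheme) commonly used to illustrate this family of codes: the orthogonal structure lets the receiver recover each symbol by linear combining rather than joint detection. Channel, constellation, and noise values are illustrative.

```python
# Hedged sketch of 2-antenna space-time block coding with linear
# maximum-likelihood decoding via the code's orthogonal structure.
import numpy as np

rng = np.random.default_rng(3)
qpsk = np.array([1+1j, 1-1j, -1+1j, -1-1j]) / np.sqrt(2)
s1, s2 = rng.choice(qpsk), rng.choice(qpsk)

h1, h2 = (rng.normal(size=2) + 1j * rng.normal(size=2)) / np.sqrt(2)  # flat Rayleigh
sigma = 0.1
noise = sigma * (rng.normal(size=2) + 1j * rng.normal(size=2))

# Two symbol periods: antennas send (s1, s2), then (-conj(s2), conj(s1)).
r1 = h1 * s1 + h2 * s2 + noise[0]
r2 = -h1 * np.conj(s2) + h2 * np.conj(s1) + noise[1]

# Linear combining decouples the symbols (orthogonality cancels cross terms).
y1 = np.conj(h1) * r1 + h2 * np.conj(r2)   # = (|h1|^2+|h2|^2) s1 + noise
y2 = np.conj(h2) * r1 - h1 * np.conj(r2)   # = (|h1|^2+|h2|^2) s2 + noise

dec = lambda y: qpsk[np.argmin(np.abs(qpsk - y / (abs(h1)**2 + abs(h2)**2)))]
print("sent:", s1, s2, "decoded:", dec(y1), dec(y2))
```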

1,958 citations


Journal ArticleDOI
TL;DR: The use of support vector machines in classifying e-mail as spam or nonspam is studied by comparing it to three other classification algorithms: Ripper, Rocchio, and boosting decision trees; SVMs performed best when using binary features.
Abstract: We study the use of support vector machines (SVM) in classifying e-mail as spam or nonspam by comparing it to three other classification algorithms: Ripper, Rocchio, and boosting decision trees. These four algorithms were tested on two different data sets: one data set where the number of features was constrained to the 1000 best features and another data set where the dimensionality was over 7000. SVM performed best when using binary features. For both data sets, boosting trees and SVM had acceptable test performance in terms of accuracy and speed. However, SVM had significantly less training time.
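A minimal sketch in the spirit of the study's best-performing configuration, a linear SVM over binary word-presence features; the tiny corpus is invented and scikit-learn stands in for the original implementation.

```python
# Hedged sketch: linear SVM spam filter with binary (presence/absence)
# bag-of-words features, on an invented toy corpus.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

mails = ["win cash now", "cheap pills offer", "meeting at noon",
         "project status report", "free cash offer", "lunch tomorrow?"]
labels = [1, 1, 0, 0, 1, 0]          # 1 = spam, 0 = nonspam

vec = CountVectorizer(binary=True)   # binary word features, as in the study
X = vec.fit_transform(mails)
clf = LinearSVC(C=1.0).fit(X, labels)

test = ["free pills now", "status meeting tomorrow"]
print(clf.predict(vec.transform(test)))   # expect [1, 0]
```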

1,536 citations


Journal ArticleDOI
TL;DR: It is observed that a simple remapping of the input x_i → x_i^a improves the performance of linear SVMs to such an extent that it makes them, for this problem, a valid alternative to RBF kernels.
Abstract: Traditional classification approaches generalize poorly on image classification tasks, because of the high dimensionality of the feature space. This paper shows that support vector machines (SVM) can generalize well on difficult image classification problems where the only features are high dimensional histograms. Heavy-tailed RBF kernels of the form K(x, y) = exp(-ρ Σ_i |x_i^a − y_i^a|^b) with a ≤ 1 and b ≤ 2 are evaluated on the classification of images extracted from the Corel stock photo collection and shown to far outperform traditional polynomial or Gaussian radial basis function (RBF) kernels. Moreover, we observed that a simple remapping of the input x_i → x_i^a improves the performance of linear SVMs to such an extent that it makes them, for this problem, a valid alternative to RBF kernels.
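A small sketch of the heavy-tailed kernel as reconstructed above, plus the input remapping applied before a linear kernel; rho, a, b, and the toy histograms are illustrative choices.

```python
# Hedged sketch of the heavy-tailed RBF kernel
# K(x, y) = exp(-rho * sum_i |x_i**a - y_i**a|**b) and the x_i -> x_i**a remap.
import numpy as np

def heavy_tailed_rbf(x, y, rho=1.0, a=0.5, b=1.0):
    # a <= 1 and b <= 2 give the heavy-tailed, non-Gaussian shape
    return np.exp(-rho * np.sum(np.abs(x**a - y**a) ** b))

# Histograms are nonnegative, so the power remapping is well defined.
x = np.array([0.1, 0.4, 0.0, 0.5])   # toy normalized histograms
y = np.array([0.2, 0.3, 0.1, 0.4])
print(heavy_tailed_rbf(x, y))

lin = lambda u, v: u @ v
print(lin(x**0.5, y**0.5))           # linear kernel after the remapping
```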

1,510 citations


Proceedings ArticleDOI
01 Aug 1999
TL;DR: Describes a sensor-driven, or sentient, platform for context-aware computing that enables applications to follow mobile users as they move around a building, constructing a dynamic model of the environment and presenting it in a form suitable for application programmers.
Abstract: We describe a sensor-driven, or sentient, platform for context-aware computing that enables applications to follow mobile users as they move around a building. The platform is particularly suitable for richly equipped, networked environments. The only item a user is required to carry is a small sensor tag, which identifies them to the system and locates them accurately in three dimensions. The platform builds a dynamic model of the environment using these location sensors and resource information gathered by telemetry software, and presents it in a form suitable for application programmers. Use of the platform is illustrated through a practical example, which allows a user's current working desktop to follow them as they move around the environment.

1,479 citations


Book ChapterDOI
19 Apr 1999
TL;DR: Presents the resurrecting duckling security policy model, which describes secure transient association of a device with multiple serialised owners communicating over a short-range wireless channel.
Abstract: In the near future, many personal electronic devices will be able to communicate with each other over a short range wireless channel. We investigate the principal security issues for such an environment. Our discussion is based on the concrete example of a thermometer that makes its readings available to other nodes over the air. Some lessons learned from this example appear to be quite general to ad-hoc networks, and rather different from what we have come to expect in more conventional systems: denial of service, the goals of authentication, and the problems of naming all need re-examination. We present the resurrecting duckling security policy model, which describes secure transient association of a device with multiple serialised owners.

1,355 citations


Proceedings Article
Robert E. Schapire
31 Jul 1999
TL;DR: The boosting algorithm AdaBoost is introduced, and the underlying theory of boosting is explained, including an explanation of why boosting often does not suffer from overfitting.
Abstract: Boosting is a general method for improving the accuracy of any given learning algorithm. This short paper introduces the boosting algorithm AdaBoost, and explains the underlying theory of boosting, including an explanation of why boosting often does not suffer from overfitting. Some examples of recent applications of boosting are also described.
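To accompany the description, a compact AdaBoost sketch with threshold stumps as the weak learner: each round reweights examples toward mistakes and weights the stump's vote by its accuracy. The 1-D dataset and round count are invented.

```python
# Hedged AdaBoost sketch: reweight examples each round so the next weak
# learner focuses on mistakes, then combine stumps by weighted vote.
import numpy as np

X = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9])
y = np.array([1, 1, -1, 1, -1, -1, 1, -1])
w = np.ones(len(X)) / len(X)          # uniform initial weights
stumps = []

for t in range(10):
    best = None
    for thr in X:                     # weak learner: best threshold stump
        for sign in (1, -1):
            pred = np.where(X < thr, sign, -sign)
            err = w[pred != y].sum()
            if best is None or err < best[0]:
                best = (err, thr, sign, pred)
    err, thr, sign, pred = best
    err = min(max(err, 1e-10), 1 - 1e-10)
    alpha = 0.5 * np.log((1 - err) / err)  # weak learner's vote weight
    w *= np.exp(-alpha * y * pred)         # upweight the mistakes
    w /= w.sum()
    stumps.append((alpha, thr, sign))

H = sum(a * np.where(X < thr, s, -s) for a, thr, s in stumps)
print("training accuracy:", np.mean(np.sign(H) == y))
```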

Book
21 Oct 1999
TL;DR: A book on semistructured data and XML, covering a syntax for data, query languages, typing semistructured data, the Lore and Strudel systems, and database products supporting XML.
Abstract:
1. Introduction
2. A Syntax for Data
3. XML
4. Query Languages
5. Query Languages for XML
6. Interpretation and advanced features
7. Typing semistructured data
8. Query Processing
9. The Lore system
10. Strudel
11. Database products supporting XML

Journal ArticleDOI
TL;DR: With the proposed channel estimator, combining OFDM with transmitter diversity using space-time coding is a promising technique for highly efficient data transmission over mobile wireless channels.
Abstract: Transmitter diversity is an effective technique to improve wireless communication performance. In this paper, we investigate transmitter diversity using space-time coding for orthogonal frequency division multiplexing (OFDM) systems in high-speed wireless data applications. We develop channel parameter estimation approaches, which are crucial for the decoding of the space-time codes, and we derive the MSE bounds of the estimators. The overall receiver performance using such a transmitter diversity scheme is demonstrated by extensive computer simulations. For an OFDM system with two transmitter antennas and two receiver antennas with transmission efficiency as high as 1.475 bits/s/Hz, the required signal-to-noise ratio is only about 7 dB for a 1% bit error rate and 9 dB for a 10% word error rate assuming channels with two-ray, typical urban, and hilly terrain delay profiles, and a 40-Hz Doppler frequency. In summary, with the proposed channel estimator, combining OFDM with transmitter diversity using space-time coding is a promising technique for highly efficient data transmission over mobile wireless channels.

Proceedings Article
23 Aug 1999
TL;DR: This work proposes and evaluates new graphical password schemes that exploit features of graphical input displays to achieve better security than text-based passwords, and describes the prototype implementation of one of the schemes on a personal digital assistant (PDA), namely the Palm Pilot™.
Abstract: In this paper we propose and evaluate new graphical password schemes that exploit features of graphical input displays to achieve better security than text-based passwords. Graphical input devices enable the user to decouple the position of inputs from the temporal order in which those inputs occur, and we show that this decoupling can be used to generate password schemes with substantially larger (memorable) password spaces. In order to evaluate the security of one of our schemes, we devise a novel way to capture a subset of the "memorable" passwords that, we believe, is itself a contribution. In this work we are primarily motivated by devices such as personal digital assistants (PDAs) that offer graphical input capabilities via a stylus, and we describe our prototype implementation of one of our password schemes on such a PDA, namely the Palm Pilot™.
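A back-of-the-envelope sketch of the "larger memorable password space" argument: even a coarse grid supports far more cell sequences than there are common dictionary words. Grid size, length bound, and the dictionary estimate are assumptions; the paper's own memorability modeling is more careful than this count.

```python
# Hedged counting sketch: raw cell-sequence space on a coarse grid vs. a
# rough dictionary-word space for memorable text passwords.
grid_cells = 5 * 5                   # assumed 5x5 input grid
max_len = 8                          # assumed maximum stroke length

graphical = sum(grid_cells ** L for L in range(1, max_len + 1))
dictionary_words = 25_000            # rough size of a common-word dictionary

print(f"grid-cell sequences up to length {max_len}: {graphical:.3e}")
print(f"dictionary-word passwords:               {dictionary_words:.3e}")
```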

Journal ArticleDOI
TL;DR: It is believed users' confidence in online transactions will increase when they are presented with meaningful information and choices about Web site privacy practices; P3P is not a silver bullet, and is complemented by other technologies as well as regulatory and self-regulatory approaches to privacy.
Abstract: Internet users are concerned about the privacy of information they supply to Web sites, not only in terms of personal data, but information that Web sites may derive by tracking their online activities [7]. Many online privacy concerns arise because it is difficult for users to obtain information about actual Web site information practices. Few Web sites post privacy policies, and even when they are posted, users do not always find them trustworthy or understandable. Thus, there is often a one-way mirror effect: Web sites ask users to provide personal information, but users have little knowledge about how their information will be used. Understandably, this lack of knowledge leads to confusion and mistrust. The World Wide Web Consortium (W3C)'s Platform for Privacy Preferences Project (P3P) provides a framework for informed online interactions. The goal of P3P is to enable users to exercise preferences over Web site privacy practices at the Web sites. P3P applications will allow users to be informed about Web site practices, delegate decisions to their computer agent when they wish, and tailor relationships with specific sites. We believe users' confidence in online transactions will increase when they are presented with meaningful information and choices about Web site privacy practices. P3P is not a silver bullet; it is complemented by other technologies as well as regulatory and self-regulatory approaches to privacy. Some technologies have the ability to technically preclude practices that may be unacceptable to a user. For example, digital cash, anonymizers, and encryption limit the information the recipient or eavesdroppers can collect during an interaction. Laws and industry guidelines codify and enforce expectations regarding information practices as the default or baseline for interactions. A compelling feature of P3P is that localized decision making enables flexibility in a medium that encompasses diverse preferences, cultural norms, and regulatory jurisdictions. However, for P3P to be effective, users must be willing and able to make meaningful decisions when presented with disclosures. This requires the existence of easy-to-use tools that allow P3P users to delegate much of the information processing and decision making to their computer agents when they wish, as well as a framework promoting the use …

Journal ArticleDOI
TL;DR: An on-line algorithm for learning preference functions that is based on Freund and Schapire's "Hedge" algorithm is considered, and it is shown that the problem of finding the ordering that agrees best with a learned preference function is NP-complete.
Abstract: There are many applications in which it is desirable to order rather than classify instances. Here we consider the problem of learning how to order instances given feedback in the form of preference judgments, i.e., statements to the effect that one instance should be ranked ahead of another. We outline a two-stage approach in which one first learns by conventional means a binary preference function indicating whether it is advisable to rank one instance before another. Here we consider an on-line algorithm for learning preference functions that is based on Freund and Schapire's "Hedge" algorithm. In the second stage, new instances are ordered so as to maximize agreement with the learned preference function. We show that the problem of finding the ordering that agrees best with a learned preference function is NP-complete. Nevertheless, we describe simple greedy algorithms that are guaranteed to find a good approximation. Finally, we show how metasearch can be formulated as an ordering problem, and present experimental results on learning a combination of "search experts," each of which is a domain-specific query expansion strategy for a web search engine.
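A minimal sketch of the second stage: given a learned pairwise preference function, greedily emit the item with the largest net outgoing preference, in the style of the greedy approximation the abstract mentions. The preference values are invented.

```python
# Hedged sketch: greedy ordering from a learned pairwise preference
# function PREF(u, v) in [0, 1] (here a toy stand-in).
def greedy_order(items, pref):
    # potential(v) = sum_u pref(v, u) - pref(u, v): how strongly v
    # "beats" the rest; emit the max, remove it, repeat.
    remaining, ordering = set(items), []
    while remaining:
        best = max(remaining, key=lambda v: sum(
            pref(v, u) - pref(u, v) for u in remaining if u != v))
        ordering.append(best)
        remaining.remove(best)
    return ordering

scores = {"a": 0.9, "b": 0.5, "c": 0.1}          # hypothetical quality
pref = lambda u, v: 1.0 if scores[u] > scores[v] else 0.0
print(greedy_order(["b", "c", "a"], pref))       # -> ['a', 'b', 'c']
```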

Journal ArticleDOI
TL;DR: The results confirm that the reduction in magnitude and within-subject variability of both temporal and spectral acoustic parameters with age is a major trend associated with speech development in normal children, and support the hypothesis of uniform axial growth of the vocal tract for male speakers.
Abstract: Changes in magnitude and variability of duration, fundamental frequency, formant frequencies, and spectral envelope of children’s speech are investigated as a function of age and gender using data obtained from 436 children, ages 5 to 17 years, and 56 adults. The results confirm that the reduction in magnitude and within-subject variability of both temporal and spectral acoustic parameters with age is a major trend associated with speech development in normal children. Between ages 9 and 12, both magnitude and variability of segmental durations decrease significantly and rapidly, converging to adult levels around age 12. Within-subject fundamental frequency and formant-frequency variability, however, may reach adult range about 2 or 3 years later. Differentiation of male and female fundamental frequency and formant frequency patterns begins at around age 11, becoming fully established around age 15. During that time period, changes in vowel formant frequencies of male speakers are approximately linear with...

Journal ArticleDOI
Xiaoxin Qiu, K. Chawla
TL;DR: The results show that using adaptive modulation even without any power control provides a significant throughput advantage over using signal-to-interference-plus-noise ratio (SINR) balancing power control, and that combining adaptive modulation with a suitable power control scheme leads to a significantly higher throughput as compared to no power control or SINR-balancing power control.
Abstract: Adaptive modulation techniques have the potential to substantially increase the spectrum efficiency and to provide different levels of service to users, both of which are considered important for third-generation cellular systems. In this work, we propose a general framework to quantify the potential gains of such techniques. Specifically, we study the throughput performance gain that may be achieved by combining adaptive modulation and power control. Our results show that: (1) using adaptive modulation even without any power control provides a significant throughput advantage over using signal-to-interference-plus-noise ratio (SINR) balancing power control and (2) combining adaptive modulation and a suitable power control scheme leads to a significantly higher throughput as compared to no power control or using SINR-balancing power control. The first observation is especially important from an implementation point of view. Adjusting the modulation level without changing the transmission power requires far fewer measurements and feedback as compared to the SINR-balancing power control or the optimal power control. Hence, it is significantly easier to implement. Although presented in the context of adaptive modulation, the results also apply to other variable rate transmission techniques, e.g., rate adaptive coding schemes, coded modulation schemes, etc. This work provides valuable insight into the performance of variable rate transmission techniques in multi-user environments.
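For intuition, a hedged sketch of the basic adaptive-modulation decision: at fixed transmit power, choose the highest-rate constellation whose SINR requirement is met. The thresholds below are placeholder values, not the paper's operating points.

```python
# Hedged sketch of threshold-based adaptive modulation: pick the
# highest-rate constellation whose SINR requirement is satisfied.
RATES = [            # (bits/symbol, required SINR in dB) - assumed values
    (1, 7.0),        # BPSK
    (2, 10.0),       # QPSK
    (4, 17.0),       # 16-QAM
    (6, 23.0),       # 64-QAM
]

def select_modulation(sinr_db):
    feasible = [bits for bits, need in RATES if sinr_db >= need]
    return max(feasible) if feasible else 0    # 0 = defer transmission

for sinr in (5.0, 12.0, 25.0):
    print(f"SINR {sinr:4.1f} dB -> {select_modulation(sinr)} bits/symbol")
```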

Journal ArticleDOI
William DuMouchel
TL;DR: Here, a baseline or null hypothesis expected frequency is constructed for each cell, and screening criteria for ranking the cell deviations of observed from expected count are suggested and compared.
Abstract: A common data mining task is the search for associations in large databases. Here we consider the search for “interestingly large” counts in a large frequency table, having millions of cells, most of which have an observed frequency of 0 or 1. We first construct a baseline or null hypothesis expected frequency for each cell, and then suggest and compare screening criteria for ranking the cell deviations of observed from expected count. A criterion based on the results of fitting an empirical Bayes model to the cell counts is recommended. An example compares these criteria for searching the FDA Spontaneous Reporting System database maintained by the Division of Pharmacovigilance and Epidemiology. In the example, each cell count is the number of reports combining one of 1,398 drugs with one of 952 adverse events (total of cell counts = 4.9 million), and the problem is to screen the drug-event combinations for possible further investigation.
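A simplified sketch of the screening setup, using only a smoothed observed/expected ratio against an independence baseline rather than the full empirical Bayes model the paper recommends; the toy table and planted signal are invented.

```python
# Hedged, simplified screening sketch: independence baseline plus a
# smoothed observed/expected ratio (NOT the paper's empirical Bayes model).
import numpy as np

rng = np.random.default_rng(4)
counts = rng.poisson(0.05, size=(50, 40))     # toy drug-by-event table
counts[3, 7] = 25                             # one planted "signal"

row = counts.sum(axis=1, keepdims=True)
col = counts.sum(axis=0, keepdims=True)
expected = row * col / counts.sum()           # row-column independence baseline

ratio = (counts + 0.5) / (expected + 0.5)     # smoothed O/E ratio
i, j = np.unravel_index(ratio.argmax(), ratio.shape)
print(f"top cell ({i},{j}): observed={counts[i,j]}, "
      f"expected={expected[i,j]:.2f}")
```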

Book ChapterDOI
10 Jan 1999
TL;DR: The semistructured data model consists of an edge-labeled graph, in which nodes correspond to objects and edges to attributes or values; the term semistructured data refers to data that does not conform to traditional data models.
Abstract: In recent years there has been an increased interest in managing data that does not conform to traditional data models, like the relational or object-oriented model. The reasons for this non-conformance are diverse. On the one hand, data may not conform to such models at the physical level: it may be stored in data exchange formats, fetched from the Web, or stored as structured files. On the other hand, it may not conform at the logical level: data may have missing attributes, some attributes may be of different types in different data items, there may be heterogeneous collections, or the schema may be too complex or change too often. The term semistructured data has been used to refer to such data. The semistructured data model consists of an edge-labeled graph, in which nodes correspond to objects and edges to attributes or values. Figure 1 illustrates a semistructured database providing information about a city.
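A minimal rendering of the edge-labeled graph model in ordinary data structures: nodes as objects, edge labels as attribute names, and leaves as values, tolerating missing attributes, mixed types, and heterogeneous collections. The city data is invented to echo the chapter's example in spirit.

```python
# Hedged sketch of the edge-labeled graph model for semistructured data.
city = {
    "name": "Springfield",                      # atomic value
    "restaurant": [                             # heterogeneous collection
        {"name": "Chez Paul", "cuisine": "French", "phone": "555-0100"},
        {"name": "Noodle Bar"},                 # missing attributes are fine
    ],
    "population": 30_000,                       # types may vary per object
}

def edges(node, path=""):
    """Enumerate label-paths, treating nested dicts/lists as subobjects."""
    if isinstance(node, dict):
        for label, child in node.items():
            yield from edges(child, f"{path}.{label}")
    elif isinstance(node, list):
        for child in node:
            yield from edges(child, path)
    else:
        yield path.lstrip("."), node

for p, v in edges(city):
    print(p, "=", v)
```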

Journal ArticleDOI
17 May 1999
TL;DR: This work presents a query language for XML, called XML-QL, which is argued to be suitable for data extraction, conversion, transformation, and integration, and which can extract data from existing XML documents and construct new XML documents.
Abstract: An important application of XML is the interchange of electronic data (EDI) between multiple data sources on the Web. As XML data proliferates on the Web, applications will need to integrate and aggregate data from multiple sources and clean and transform data to facilitate exchange. Data extraction, conversion, transformation, and integration are all well-understood database problems, and their solutions rely on a query language. We present a query language for XML, called XML-QL, which we argue is suitable for performing the above tasks. XML-QL is a declarative, 'relational complete' query language and is simple enough that it can be optimized. XML-QL can extract data from existing XML documents and construct new XML documents.

Journal ArticleDOI
TL;DR: An unextendible product basis (UPB) is an incomplete orthogonal product basis whose complementary subspace contains no product state; it is shown that the uniform mixed state over the subspace complementary to any UPB is a bound entangled state.
Abstract: An unextendible product basis (UPB) for a multipartite quantum system is an incomplete orthogonal product basis whose complementary subspace contains no product state. We give examples of UPBs, and show that the uniform mixed state over the subspace complementary to any UPB is a bound entangled state. We exhibit a tripartite 2 ⊗ 2 ⊗ 2 UPB whose complementary mixed state has tripartite entanglement but no bipartite entanglement, i.e., all three corresponding 2 ⊗ 4 bipartite mixed states are unentangled. We show that members of a UPB are not perfectly distinguishable by local positive operator valued measurements and classical communication.

Journal ArticleDOI
TL;DR: A variant of the game-playing algorithm is proved to be optimal in a very strong sense, yielding a new, simple proof of the min–max theorem, as well as a provable method of approximately solving a game.
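A hedged sketch of the multiplicative-weights ("Hedge"-style) game player behind this result: playing it against a best-responding opponent drives the average strategy toward a minimax solution. The rock-paper-scissors loss matrix and learning rate are illustrative.

```python
# Hedged sketch: multiplicative-weights play approximately solves a
# zero-sum game; the average strategy approaches a minimax strategy.
import numpy as np

M = np.array([[0.5, 1.0, 0.0],   # rock-paper-scissors losses for the
              [0.0, 0.5, 1.0],   # row player (0 = win, 1 = loss)
              [1.0, 0.0, 0.5]])
eta, T = 0.1, 5000
w = np.ones(3)
avg_p = np.zeros(3)

for t in range(T):
    p = w / w.sum()
    col = int((p @ M).argmax())   # adversary best-responds: worst column
    w *= np.exp(-eta * M[:, col]) # Hedge update on the observed losses
    avg_p += p

avg_p /= T
print(np.round(avg_p, 3))         # converges toward uniform (1/3 each)
```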

Proceedings ArticleDOI
30 Aug 1999
TL;DR: A new service interface, termed a hose, is proposed to provide the appropriate performance abstraction for managing network resources in the face of increased uncertainty; statistical multiplexing and resizing techniques deal effectively with uncertainties about the traffic.
Abstract: As IP technologies providing both tremendous capacity and the ability to establish dynamic secure associations between endpoints emerge, Virtual Private Networks (VPNs) are going through dramatic growth. The number of endpoints per VPN is growing and the communication pattern between endpoints is becoming increasingly hard to forecast. Consequently, users are demanding dependable, dynamic connectivity between endpoints, with the network expected to accommodate any traffic matrix, as long as the traffic to the endpoints does not overwhelm the rates of the respective ingress and egress links. We propose a new service interface, termed a hose, to provide the appropriate performance abstraction. A hose is characterized by the aggregate traffic to and from one endpoint in the VPN to the set of other endpoints in the VPN, and by an associated performance guarantee. Hoses provide important advantages to a VPN customer: (i) flexibility to send traffic to a set of endpoints without having to specify the detailed traffic matrix, and (ii) reduction in the size of access links through multiplexing gains obtained from the natural aggregation of the flows between endpoints. As compared with the conventional point to point (or customer-pipe) model for managing QoS, hoses provide reduction in the state information a customer must maintain. On the other hand, hoses would appear to increase the complexity of the already difficult problem of resource management to support QoS. To manage network resources in the face of this increased uncertainty, we consider both conventional statistical multiplexing techniques, and a new resizing technique based on online measurements. To study these performance issues, we run trace driven simulations, using traffic derived from AT&T's voice network, and from a large corporate data network. From the customer's perspective, we find that aggregation of traffic at the hose level provides significant multiplexing gains. From the provider's perspective, we find that the statistical multiplexing and resizing techniques deal effectively with uncertainties about the traffic, providing significant gains over the conventional alternative of a mesh of statically sized customer-pipes between endpoints.
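A toy calculation of the multiplexing gain the hose interface exploits: a customer-pipe must be sized for each pairwise peak, while a hose is sized once for the endpoint's peak aggregate, which can never be larger. Traffic samples are invented.

```python
# Hedged sketch: compare total capacity for per-pair customer-pipes vs.
# per-endpoint hoses on bursty synthetic traffic.
import numpy as np

rng = np.random.default_rng(5)
E, T = 4, 1000
# traffic[t, i, j]: demand from endpoint i to j at time t (bursty on/off)
traffic = rng.exponential(1.0, (T, E, E)) * (rng.random((T, E, E)) < 0.1)
idx = np.arange(E)
traffic[:, idx, idx] = 0.0                    # no self-traffic

pipes = traffic.max(axis=0).sum()             # per-pair peaks, summed
hoses = traffic.sum(axis=2).max(axis=0).sum() # per-endpoint peak egress, summed

print(f"total customer-pipe capacity: {pipes:.1f}")
print(f"total hose (egress) capacity: {hoses:.1f}")  # <= pipes: multiplexing gain
```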

Journal ArticleDOI
TL;DR: This paper dramatically reduces encoding and decoding complexity by partitioning antennas at the transmitter into small groups, and using individual space-time codes, called the component codes, to transmit information from each group of antennas.
Abstract: The information capacity of wireless communication systems may be increased dramatically by employing multiple transmit and receive antennas. The goal of system design is to exploit this capacity in a practical way. An effective approach to increasing data rate over wireless channels is to employ space-time coding techniques appropriate to multiple transmit antennas. These space-time codes introduce temporal and spatial correlation into signals transmitted from different antennas, so as to provide diversity at the receiver, and coding gain over an uncoded system. For a large number of transmit antennas and at high bandwidth efficiencies, the receiver may become too complex whenever correlation across transmit antennas is introduced. This paper dramatically reduces encoding and decoding complexity by partitioning antennas at the transmitter into small groups, and using individual space-time codes, called the component codes, to transmit information from each group of antennas. At the receiver, an individual space-time code is decoded by a novel linear processing technique that suppresses signals transmitted by other groups of antennas by treating them as interference. A simple receiver structure is derived that provides diversity and coding gain over uncoded systems. This combination of array processing at the receiver and coding techniques for multiple transmit antennas can provide reliable and very high data rate communication over narrowband wireless channels. A refinement of this basic structure gives rise to a multilayered space-time architecture that both generalizes and improves upon the layered space-time architecture proposed by Foschini (see Bell Labs Tech. J., vol.1, no.2, 1996).

Proceedings ArticleDOI
01 Nov 1999
TL;DR: There is a need to know more about the range of user concerns and preferences about privacy in order to build usable and effective interface mechanisms for P3P and other privacy technologies.
Abstract: Privacy is a necessary concern in electronic commerce. It is difficult, if not impossible, to complete a transaction without revealing some personal data ‐ a shipping address, billing information, or product preference. Users may be unwilling to provide this necessary information or even to browse online if they believe their privacy is invaded or threatened. Fortunately, there are technologies to help users protect their privacy. P3P (Platform for Privacy Preferences Project) from the World Wide Web Consortium is one such technology. However, there is a need to know more about the range of user concerns and preferences about privacy in order to build usable and effective interface mechanisms for P3P and other privacy technologies. Accordingly, we conducted a survey of 381 U.S. Net users, detailing a range of commerce scenarios and examining the participants' concerns and preferences about privacy. This paper presents both the findings from that study as well as their design implications.

Journal ArticleDOI
01 Jun 1999
TL;DR: Shows how a document-type-descriptor (DTD), when present, can be exploited to further improve performance when applying STORED to XML data, which is an instance of semistructured data.
Abstract: Systems for managing and querying semistructured-data sources often store data in proprietary object repositories or in a tagged-text format. We describe a technique that can use relational database management systems to store and manage semistructured data. Our technique relies on a mapping between the semistructured data model and the relational data model, expressed in a query language called STORED. When a semistructured data instance is given, a STORED mapping can be generated automatically using data-mining techniques. We are interested in applying STORED to XML data, which is an instance of semistructured data. We show how a document-type-descriptor (DTD), when present, can be exploited to further improve performance.

Proceedings Article
31 Jul 1999
TL;DR: It is shown that STRIPS problems can be directly translated into SAT and efficiently solved using new randomized systematic solvers, and that polynomial-time SAT simplification algorithms applied to the encoded problem instances are a powerful complement to the "mutex" propagation algorithm that works directly on the plan graph.
Abstract: The Blackbox planning system unifies the planning as satisfiability framework (Kautz and Selman 1992, 1996) with the plan graph approach to STRIPS planning (Blum and Furst 1995). We show that STRIPS problems can be directly translated into SAT and efficiently solved using new randomized systematic solvers. For certain computationally challenging benchmark problems this unified approach outperforms both SATPLAN and Graphplan alone. We also demonstrate that polynomial-time SAT simplification algorithms applied to the encoded problem instances are a powerful complement to the "mutex" propagation algorithm that works directly on the plan graph.

Proceedings Article
31 Jul 1999
TL;DR: In this paper, the authors present an algorithm that, given only a generative model (simulator) for an arbitrary MDP, performs near-optimal planning with a running time that has no dependence on the number of states.
Abstract: An issue that is critical for the application of Markov decision processes (MDPs) to realistic problems is how the complexity of planning scales with the size of the MDP. In stochastic environments with very large or even infinite state spaces, traditional planning and reinforcement learning algorithms are often inapplicable, since their running time typically scales linearly with the state space size. In this paper we present a new algorithm that, given only a generative model (simulator) for an arbitrary MDP, performs near-optimal planning with a running time that has no dependence on the number of states. Although the running time is exponential in the horizon time (which depends only on the discount factor γ and the desired degree of approximation to the optimal policy), our results establish for the first time that there are no theoretical barriers to computing near-optimal policies in arbitrarily large, unstructured MDPs. Our algorithm is based on the idea of sparse sampling. We prove that a randomly sampled look-ahead tree that covers only a vanishing fraction of the full look-ahead tree nevertheless suffices to compute near-optimal actions from any state of an MDP. Practical implementations of the algorithm are discussed, and we draw ties to our related recent results on finding a near-best strategy from a given class of strategies in very large partially observable MDPs [KMN99].
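A compact sketch of the sparse-sampling idea: estimate action values by drawing C next-state samples per action from the generative model and recursing to horizon H, so the cost depends on C and H but never on the number of states. The toy simulator and constants are illustrative.

```python
# Hedged sparse-sampling sketch: planning cost depends on the sample
# width C and horizon H, not on the size of the state space.
import random

gamma, C, H = 0.9, 8, 3
ACTIONS = (0, 1)

def simulator(s, a):
    """Toy generative model: returns (next_state, reward)."""
    s2 = s + (1 if a == 1 else -1) + random.choice((-1, 0, 1))
    return s2, (1.0 if s2 > s else 0.0)

def V(s, h):
    if h == 0:
        return 0.0
    return max(Q(s, a, h) for a in ACTIONS)

def Q(s, a, h):
    samples = [simulator(s, a) for _ in range(C)]   # C samples per action
    return sum(r + gamma * V(s2, h - 1) for s2, r in samples) / C

s0 = 0
best = max(ACTIONS, key=lambda a: Q(s0, a, H))
print("near-optimal action at s0:", best)   # expect action 1 (move right)
```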

Journal ArticleDOI
Mikkel Thorup
TL;DR: A deterministic linear time and linear space algorithm is presented for the undirected single source shortest paths problem with positive integer weights, which avoids the sorting bottleneck by building a hierarchical bucketing structure.
Abstract: The single-source shortest paths problem (SSSP) is one of the classic problems in algorithmic graph theory: given a positively weighted graph G with a source vertex s, find the shortest path from s to all other vertices in the graph. Since 1959, all theoretical developments in SSSP for general directed and undirected graphs have been based on Dijkstra's algorithm, visiting the vertices in order of increasing distance from s. Thus, any implementation of Dijkstra's algorithm sorts the vertices according to their distances from s. However, we do not know how to sort in linear time. Here, a deterministic linear time and linear space algorithm is presented for the undirected single source shortest paths problem with positive integer weights. The algorithm avoids the sorting bottleneck by building a hierarchical bucketing structure, identifying vertex pairs that may be visited in any order.
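For contrast, a standard heap-based Dijkstra sketch: the priority queue settles vertices in sorted distance order, which is exactly the implicit sorting this paper's bucketing hierarchy avoids. The example graph is invented.

```python
# Hedged baseline sketch: heap-based Dijkstra; the priority queue is the
# "sorting bottleneck" that Thorup's hierarchy of buckets sidesteps.
import heapq

def dijkstra(adj, s):
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)           # vertices settle in sorted order
        if d > dist.get(u, float("inf")):
            continue                          # stale entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

adj = {                                       # undirected, positive weights
    "s": [("a", 2), ("b", 7)],
    "a": [("s", 2), ("b", 3)],
    "b": [("s", 7), ("a", 3)],
}
print(dijkstra(adj, "s"))                     # {'s': 0, 'a': 2, 'b': 5}
```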