
Showing papers by "Indian Institute of Technology Bombay" published in 1999


Journal ArticleDOI
17 May 1999
TL;DR: A new hypertext resource discovery system called a Focused Crawler that is robust against large perturbations in the starting set of URLs, and capable of exploring out and discovering valuable resources that are dozens of links away from the start set, while carefully pruning the millions of pages that may lie within this same radius.
Abstract: The rapid growth of the World-Wide Web poses unprecedented scaling challenges for general-purpose crawlers and search engines. In this paper we describe a new hypertext resource discovery system called a Focused Crawler. The goal of a focused crawler is to selectively seek out pages that are relevant to a pre-defined set of topics. The topics are specified not using keywords, but using exemplary documents. Rather than collecting and indexing all accessible Web documents to be able to answer all possible ad-hoc queries, a focused crawler analyzes its crawl boundary to find the links that are likely to be most relevant for the crawl, and avoids irrelevant regions of the Web. This leads to significant savings in hardware and network resources, and helps keep the crawl more up-to-date. To achieve such goal-directed crawling, we designed two hypertext mining programs that guide our crawler: a classifier that evaluates the relevance of a hypertext document with respect to the focus topics, and a distiller that identifies hypertext nodes that are great access points to many relevant pages within a few links. We report on extensive focused-crawling experiments using several topics at different levels of specificity. Focused crawling acquires relevant pages steadily while standard crawling quickly loses its way, even though they are started from the same root set. Focused crawling is robust against large perturbations in the starting set of URLs. It discovers largely overlapping sets of resources in spite of these perturbations. It is also capable of exploring out and discovering valuable resources that are dozens of links away from the start set, while carefully pruning the millions of pages that may lie within this same radius. Our anecdotes suggest that focused crawling is very effective for building high-quality collections of Web documents on specific topics, using modest desktop hardware. © 1999 Published by Elsevier Science B.V. All rights reserved.
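The crawl loop described above can be sketched in a few lines. The sketch below is illustrative only: `relevance` stands in for the paper's classifier, `fetch_links` for page fetching and link extraction, and the threshold and budget are hypothetical parameters, not details of the authors' implementation.

```python
import heapq

def focused_crawl(seed_urls, relevance, fetch_links, threshold=0.5, budget=1000):
    """Best-first crawl: expand only pages that the (hypothetical)
    classifier `relevance` scores as on-topic, pruning the rest."""
    frontier = [(-1.0, url) for url in seed_urls]  # max-heap via negated scores
    heapq.heapify(frontier)
    visited, harvested = set(), []
    while frontier and len(visited) < budget:
        neg_score, url = heapq.heappop(frontier)
        if url in visited:
            continue
        visited.add(url)
        if -neg_score < threshold:
            continue  # irrelevant region of the Web: do not expand
        harvested.append(url)
        for link in fetch_links(url):  # fetch page, extract out-links
            if link not in visited:
                # priority is the classifier's relevance estimate for the
                # linked page (the paper's distiller would additionally
                # favor links found on good hub pages)
                heapq.heappush(frontier, (-relevance(link), link))
    return harvested
```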

1,700 citations


Journal ArticleDOI
TL;DR: Clever is a search engine that analyzes hyperlinks to uncover two types of pages: authorities, which provide the best source of information on a given topic; and hubs, which provide collections of links to authorities.
Abstract: The Web is a hypertext body of approximately 300 million pages that continues to grow at roughly a million pages per day. Page variation is more prodigious than the data's raw scale: taken as a whole, the set of Web pages lacks a unifying structure and shows far more authoring style and content variation than that seen in traditional text document collections. This level of complexity makes an "off-the-shelf" database management and information retrieval solution impossible. To date, index based search engines for the Web have been the primary tool by which users search for information. Such engines can build giant indices that let you quickly retrieve the set of all Web pages containing a given word or string. Experienced users can make effective use of such engines for tasks that can be solved by searching for tightly constrained key words and phrases. These search engines are, however, unsuited for a wide range of equally important tasks. In particular, a topic of any breadth will typically contain several thousand or million relevant Web pages. How then, from this sea of pages, should a search engine select the correct ones-those of most value to the user? Clever is a search engine that analyzes hyperlinks to uncover two types of pages: authorities, which provide the best source of information on a given topic; and hubs, which provide collections of links to authorities. We outline the thinking that went into Clever's design, report briefly on a study that compared Clever's performance to that of Yahoo and AltaVista, and examine how our system is being extended and updated.
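Clever builds on the hubs-and-authorities (HITS) computation, a standard power iteration over the link graph. A minimal sketch follows, assuming the graph fits in memory as an adjacency dict; the iteration count and normalization are conventional choices, not details from the article.

```python
def hits(graph, iterations=50):
    """graph: dict mapping page -> iterable of pages it links to.
    Returns (hub, authority) score dicts after power iteration."""
    pages = set(graph) | {q for links in graph.values() for q in links}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # authority of p: sum of hub scores of pages linking to p
        auth = {p: 0.0 for p in pages}
        for p, links in graph.items():
            for q in links:
                auth[q] += hub[p]
        # hub of p: sum of authority scores of the pages p links to
        hub = {p: sum(auth[q] for q in graph.get(p, ())) for p in pages}
        # normalize both vectors to keep the iteration bounded
        for d in (hub, auth):
            norm = sum(v * v for v in d.values()) ** 0.5 or 1.0
            for k in d:
                d[k] /= norm
    return hub, auth
```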

559 citations


Journal ArticleDOI
TL;DR: In this paper, an in situ method of probing the structure of living epithelial cells, based on light scattering spectroscopy with polarized light, was proposed, which makes it possible to distinguish between single backscattering from the uppermost epithelial cells and multiply scattered light.
Abstract: We report an in situ method of probing the structure of living epithelial cells, based on light scattering spectroscopy with polarized light. The method makes it possible to distinguish between single backscattering from uppermost epithelial cells and multiply scattered light. The spectrum of the single backscattering component can be further analyzed to provide histological information about the epithelial cells such as the size distribution of the cell nuclei and their refractive index. These are valuable quantities to detect and diagnose precancerous changes in human tissues.

450 citations


Journal ArticleDOI
TL;DR: The person model is augmented by a simple constant-velocity motion model for all DOFs, which is used in the prediction step of the IEKF; in the update step, both region and edge information are used.

262 citations


Journal ArticleDOI
01 Jun 1999
TL;DR: A notion of quasi-succinctness is introduced, which allows a quasi-succinct 2-var constraint to be reduced to two succinct 1-var constraints for pruning, and a query optimizer is proposed that is ccc-optimal, i.e., minimizing the effort incurred w.r.t. constraint checking and support counting.
Abstract: Currently, there is tremendous interest in providing ad-hoc mining capabilities in database management systems. As a first step towards this goal, in [15] we proposed an architecture for supporting constraint-based, human-centered, exploratory mining of various kinds of rules including associations, introduced the notion of constrained frequent set queries (CFQs), and developed effective pruning optimizations for CFQs with 1-variable (1-var) constraints.While 1-var constraints are useful for constraining the antecedent and consequent separately, many natural examples of CFQs illustrate the need for constraining the antecedent and consequent jointly, for which 2-variable (2-var) constraints are indispensable. Developing pruning optimizations for CFQs with 2-var constraints is the subject of this paper. But this is a difficult problem because: (i) in 2-var constraints, both variables keep changing and, unlike 1-var constraints, there is no fixed target for pruning; (ii) as we show, “conventional” monotonicity-based optimization techniques do not apply effectively to 2-var constraints.The contributions are as follows. (1) We introduce a notion of quasi-succinctness, which allows a quasi-succinct 2-var constraint to be reduced to two succinct 1-var constraints for pruning. (2) We characterize the class of 2-var constraints that are quasi-succinct. (3) We develop heuristic techniques for non-quasi-succinct constraints. Experimental results show the effectiveness of all our techniques. (4) We propose a query optimizer for CFQs and show that for a large class of constraints, the computation strategy generated by the optimizer is ccc-optimal, i.e., minimizing the effort incurred w.r.t. constraint checking and support counting.
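The reduction at the heart of quasi-succinctness can be illustrated informally. The toy sketch below is only one plausible reading of the idea: for a 2-var constraint such as max(S.price) <= min(T.price), a fixed pivot value v induces two 1-var constraints that prune antecedent and consequent candidates independently; the paper's actual characterization is more general than this.

```python
# Illustrative only: a 2-var constraint C(S, T): max(price(S)) <= min(price(T))
# linking antecedent S and consequent T. For a fixed pivot v, the 1-var
# constraints "max(price(S)) <= v" and "min(price(T)) >= v" jointly imply C,
# so each can prune its own candidate space separately.

def satisfies_2var(S, T, price):
    return max(price[i] for i in S) <= min(price[j] for j in T)

def prune_with_pivot(cands_S, cands_T, price, v):
    """Hypothetical helper: apply the two induced 1-var constraints."""
    S_ok = [S for S in cands_S if max(price[i] for i in S) <= v]
    T_ok = [T for T in cands_T if min(price[j] for j in T) >= v]
    return S_ok, T_ok

price = {"a": 2, "b": 5, "c": 9}
S_ok, T_ok = prune_with_pivot([{"a"}, {"a", "b"}, {"c"}],
                              [{"c"}, {"b", "c"}], price, v=5)
# every surviving (S, T) pair satisfies the original 2-var constraint
assert all(satisfies_2var(S, T, price) for S in S_ok for T in T_ok)
```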

187 citations


Journal ArticleDOI
TL;DR: The results indicate that model-based tracking of rigid objects in monocular image sequences may have to be reappraised more thoroughly than anticipated during the recent past.
Abstract: A model-based vehicle tracking system for the evaluation of inner-city traffic video sequences has been systematically tested on about 15 minutes of real world video data. Methodological improvements during preparatory test phases affected, among other changes, the combination of edge element and optical flow estimates in the measurement process and a more consequent exploitation of background knowledge. The explication of this knowledge in the form of models facilitates the evaluation of video data for different scenes by exchanging the scene-dependent models. An extensive series of experiments with a large test sample demonstrates that the current version of our system appears to have reached a relative optimum: further interactive tuning of tracking parameters no longer promises to improve the overall system performance significantly. Even the incorporation of further knowledge regarding vehicle and scene geometry or illumination has to cope with an increasing level of interaction between different knowledge sources and system parameters. Our results indicate that model-based tracking of rigid objects in monocular image sequences may have to be reappraised more thoroughly than anticipated during the recent past.

181 citations


Journal ArticleDOI
TL;DR: In this paper, an improved formulation based upon a multi-objective integer programming approach is presented to arrive at the optimal configuration of RHWMS components, addressing important practical issues like unique characteristics of the hazardous wastes reflecting on waste-waste and waste-technology compatibility.

173 citations


Proceedings Article
01 Jan 1999
TL;DR: Lower bounds of Ω((E/V)·Sort(V)) are shown for the I/O-complexity of graph-theoretic problems like connected components, biconnected components, and minimum spanning trees, where E and V are the number of edges and vertices in the input graph, respectively.
Abstract: We show lower bounds of Ω((E/V)·Sort(V)) for the I/O-complexity of graph-theoretic problems like connected components, biconnected components, and minimum spanning trees, where E and V are the number of edges and vertices in the input graph, respectively. We also present a deterministic O((E/V)·Sort(V)·max(1, log log(VDB/E))) algorithm for the problem of graph connectivity, where B and D denote respectively the block size and number of disks. Our algorithm includes a breadth first search; this may be of independent interest.
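For reference, Sort(N) above is the standard sorting bound of the parallel disk model (internal memory M, block size B, D independent disks) in which such I/O results are usually stated; the notation below is the conventional one, assumed here rather than quoted from the paper.

```latex
% Sorting bound in the parallel disk model:
% N records, internal memory M, block size B, D independent disks.
\mathrm{Sort}(N) = \Theta\!\left( \frac{N}{DB} \, \log_{M/B} \frac{N}{B} \right)
```

With this notation, the lower bound reads Ω((E/V)·Sort(V)) I/Os and the connectivity algorithm uses O((E/V)·Sort(V)·max(1, log log(VDB/E))) I/Os.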

170 citations


Journal ArticleDOI
TL;DR: A digraph-based approach is proposed for the problem of sensor location for identification of faults and various graph algorithms that use the developed digraph in deciding the location of sensors based on the concepts of observability and resolution are discussed.
Abstract: Fault diagnosis is an important task for the safe and optimal operation of chemical processes. Hence, this area has attracted considerable attention from researchers in the past few years. A variety of approaches have been proposed for solving this problem. All approaches for fault detection and diagnosis in some sense involve the comparison of the observed behavior of the process to a reference model. The process behavior is inferred using sensors measuring the important variables in the process. Hence, the efficiency of the diagnostic approach depends critically on the location of sensors monitoring the process variables. The emphasis of most of the work on fault diagnosis has been more on procedures to perform diagnosis given a set of sensors and less on the actual location of sensors for efficient identification of faults. A digraph-based approach is proposed for the problem of sensor location for identification of faults. Various graph algorithms that use the developed digraph in deciding the location of sensors based on the concepts of observability and resolution are discussed. Simple examples are provided to explain the algorithms, and a complex FCCU case study is also discussed to underscore the utility of the algorithm for large flow sheets. The significance and scope of the proposed algorithms are highlighted.
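The observability notion used here can be phrased as reachability on the cause-effect digraph: a fault is observable by a sensor set if at least one sensed variable is reachable from the fault node. A minimal sketch under that assumption; the graph encoding and function names are illustrative, and the paper's algorithms additionally address resolution between faults.

```python
from collections import deque

def reachable(digraph, start):
    """All nodes reachable from `start` in a cause-effect digraph
    (dict: node -> list of nodes it affects)."""
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in digraph.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

def observable_faults(digraph, faults, sensors):
    """A fault is observable iff it can propagate to some sensed variable."""
    sensors = set(sensors)
    return {f for f in faults if reachable(digraph, f) & sensors}

# Toy example: fault f1 propagates to measured variable m1, f2 to nothing.
g = {"f1": ["x"], "x": ["m1"], "f2": []}
print(observable_faults(g, ["f1", "f2"], ["m1"]))  # {'f1'}
```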

158 citations


Journal ArticleDOI
TL;DR: In this article, a novel control method for a reactive volt-ampere compensator and harmonic suppressor system is proposed, which operates in cycle-by-cycle reference-current-controlled mode to achieve the instantaneous compensating feature.
Abstract: A novel control method for a reactive volt-ampere compensator and harmonic suppressor system is proposed. It operates without sensing the reactive volt-ampere demand and nonlinearities present in the load. The compensation process is instantaneous, which is achieved without employing any complicated and involved control logic. The compensator is operated in cycle-by-cycle reference-current-controlled mode to achieve the instantaneous compensating feature. A mathematical model of the scheme is developed. Detailed analysis and simulation results are presented. A laboratory prototype of the compensator is developed to validate the results.

156 citations


Journal ArticleDOI
TL;DR: In this paper, the optimal parameters of multiple tuned mass dampers (MTMD) for an undamped system subjected to harmonic base excitation were investigated using a numerical searching technique, and explicit formulae for the optimum parameters of the MTMD (i.e. damping ratio, bandwidth and tuning frequency) were then derived using a curve-fitting scheme that can readily be used for engineering applications.
Abstract: Optimum parameters of Multiple Tuned Mass Dampers (MTMD) for an undamped system subjected to harmonic base excitation are investigated using a numerical searching technique. The criterion selected for optimality is the minimization of the steady-state displacement response of the main system. The explicit formulae for the optimum parameters of MTMD (i.e. damping ratio, bandwidth and tuning frequency) are then derived using a curve-fitting scheme that can readily be used for engineering applications. The error in the proposed explicit expressions is investigated and found to be quite negligible. The optimum parameters of the MTMD system are obtained for different mass ratios and number of dampers. Copyright © 1999 John Wiley & Sons, Ltd.
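The explicit MTMD formulae themselves are not reproduced in this abstract. For orientation only, the classical Den Hartog optimum for a single TMD attached to an undamped main system under harmonic force excitation, which results of this kind generalize, is:

```latex
% Classical single-TMD optimum (Den Hartog), harmonic force excitation,
% undamped main system; \mu = damper mass / main-system mass.
f_{\mathrm{opt}} = \frac{1}{1+\mu}, \qquad
\zeta_{\mathrm{opt}} = \sqrt{\frac{3\mu}{8\,(1+\mu)^{3}}}
```

For base excitation, as studied here, the optimum tuning and damping differ slightly, and the MTMD case adds bandwidth and the number of dampers as design variables.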

Journal ArticleDOI
TL;DR: In this paper, a new model called the "Lamel model" is proposed as a further development of the Pancake model, which treats a stack of two lamella-shaped grains at a time.
Abstract: Rolling textures of low-carbon steel predicted by full constraints and relaxed constraints Taylor models, as well as by a self-consistent model, are quantitatively compared to experimental results. It appears that none of these models really performs well, the best results being obtained by the Pancake model. A new model ("Lamel model") is then proposed as a further development of the Pancake model. It treats a stack of two lamella-shaped grains at a time. The new model is described in detail, after which the results obtained for rolling of low-carbon steel are discussed. The prediction of the overall texture now is quantitatively correct. However, the γ-fibre components are better predicted than the α-fibre ones. Finally it is concluded that further work is necessary, as the same kind of success is not guaranteed for other cases, such as rolling of f.c.c. materials.

Journal ArticleDOI
01 Jun 1999
TL;DR: This work proposes the use of a weaker correctness criterion called update consistency and outlines mechanisms based on this criterion that ensure (1) the mutual consistency of data maintained by the server and read by clients, and (2) the currency of data read by clients.
Abstract: A crucial consideration in environments where data is broadcast to clients is the low bandwidth available for clients to communicate with servers. Advanced applications in such environments do need to read data that is mutually consistent as well as current. However, given the asymmetric communication capabilities and the needs of clients in mobile environments, traditional serializability-based approaches are too restrictive, unnecessary, and impractical. We thus propose the use of a weaker correctness criterion called update consistency and outline mechanisms based on this criterion that ensure (1) the mutual consistency of data maintained by the server and read by clients, and (2) the currency of data read by clients. Using these mechanisms, clients can obtain data that is current and mutually consistent “off the air”, i.e., without contacting the server to, say, obtain locks. Experimental results show a substantial reduction in response times as compared to existing (serializability-based) approaches. A further attractive feature of the approach is that if caching is possible at a client, weaker forms of currency can be obtained while still satisfying the mutual consistency of data.

Journal ArticleDOI
TL;DR: In this paper, a predictive model was developed for simultaneous saccharification and fermentation (SSF) of starch to lactic acid using Lactobacillus delbrueckii.

Journal ArticleDOI
TL;DR: The fabrication of microtubular biosensors and sensor arrays based on polyaniline with superior transducing ability are described, resulting in a microtubule array that can analyze a sample containing a mixture of glucose, urea, and triglycerides in a single measurement.
Abstract: This paper describes the fabrication of microtubular biosensors and sensor arrays based on polyaniline with superior transducing ability. These sensors have been tested for the estimation of glucose, urea, and triglycerides. As compared to that of a macro sensor, the response of the microtubular sensor for glucose is higher by a factor of more than 10³. Isoporous polycarbonate membranes have been used to fabricate inexpensive devices by simple thermal evaporation of gold using appropriate machined masks. Polyaniline deposition and enzyme immobilization have been done electrochemically. Electrochemical potential control has been used to direct enzyme immobilization to the chosen membrane device and avoid cross talk with adjacent devices. This has enabled the immobilization of a set of three different enzymes on three closely spaced devices, resulting in a microtubule array that can analyze a sample containing a mixture of glucose, urea, and triglycerides in a single measurement. This, in essence, is an "el...

Journal ArticleDOI
TL;DR: This paper presents a new method for incorporating imperfect FC (fault coverage) into a combinatorial model, SEA, which applies to any system for which the FC probabilities are constant and state-independent; the hazard rates are state- independent; and an FC failure leads to immediate system failure.
Abstract: This paper presents a new method for incorporating imperfect FC (fault coverage) into a combinatorial model. Imperfect FC, the probability that a single malicious fault can thwart automatic recovery mechanisms, is important to accurate reliability assessment of fault-tolerant computer systems. Until recently, it was thought that the consideration of this probability necessitated a Markov model rather than the simpler (and usually faster) combinatorial model. SEA, the new approach, separates the modeling of FC failures into two terms that are multiplied to compute the system reliability. The first term, a simple product, represents the probability that no uncovered fault occurs. The second term comes from a combinatorial model which includes the covered faults that can lead to system failure. This second term can be computed from any common approach (e.g. fault tree, block diagram, digraph) which ignores the FC concept by slightly altering the component-failure probabilities. The result of this work is that reliability engineers can use their favorite software package (which ignores the FC concept) for computing reliability, and then adjust the input and output of that program slightly to produce a result which includes FC. This method applies to any system for which: the FC probabilities are constant and state-independent; the hazard rates are state-independent; and an FC failure leads to immediate system failure.
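A minimal sketch of the two-term structure described above, under the stated assumptions (constant, state-independent coverage). The conditioning used to adjust the component probabilities is one natural reading of "slightly altering the component-failure probabilities"; the component data and the 2-out-of-3 structure function are illustrative, not the paper's notation.

```python
def sea_reliability(components, structural_reliability):
    """components: list of (q, c) pairs: failure probability q, coverage c.
    structural_reliability: any coverage-ignoring combinatorial model
    (fault tree, block diagram, ...) taking adjusted failure probabilities."""
    # Term 1: probability that no uncovered (unrecoverable) fault occurs
    p_no_uncovered = 1.0
    for q, c in components:
        p_no_uncovered *= 1.0 - (1.0 - c) * q
    # Term 2: the combinatorial model, with each failure probability
    # conditioned on "no uncovered fault" (an assumed reading of the
    # paper's adjustment step)
    adjusted = [c * q / (1.0 - (1.0 - c) * q) for q, c in components]
    return p_no_uncovered * structural_reliability(adjusted)

# Illustrative use: a 2-out-of-3 system of identical components
def two_of_three(qs):
    q = qs[0]
    return (1 - q) ** 3 + 3 * q * (1 - q) ** 2

print(sea_reliability([(0.01, 0.99)] * 3, two_of_three))
```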

Journal ArticleDOI
TL;DR: In this article, a vehicle-track model was developed to describe the short-term system dynamics of an ICE-1 carriage while running on an elastic track; disturbances by wheel radius deformations are assumed.

Journal ArticleDOI
TL;DR: Segregated structures, obtained experimentally, display organization in the presence of disorder and are captured by a continuum flow model incorporating collisional diffusion and density-driven segregation.
Abstract: An important industrial problem that provides fascinating puzzles in pattern formation is the tendency for granular mixtures to de-mix or segregate. Small differences in either size or density lead to flow-induced segregation. Similar to fluids, noncohesive granular materials can display chaotic advection; when this happens chaos and segregation compete with each other, giving rise to a wealth of experimental outcomes. Segregated structures, obtained experimentally, display organization in the presence of disorder and are captured by a continuum flow model incorporating collisional diffusion and density-driven segregation. Under certain conditions, structures never settle into a steady shape. This may be the simplest experimental example of a system displaying competition between chaos and order.

Journal ArticleDOI
02 Sep 1999-Chaos
TL;DR: Density and size segregation in a chute flow of cohesionless spherical particles is analyzed by means of computations and theory based on the transport equations for a mixture of nearly elastic particles.
Abstract: Mixing of granular solids is invariably accompanied by segregation, however, the fundamentals of the process are not well understood. We analyze density and size segregation in a chute flow of cohesionless spherical particles by means of computations and theory based on the transport equations for a mixture of nearly elastic particles. Computations for elastic particles (Monte Carlo simulations), nearly elastic particles, and inelastic, frictional particles (particle dynamics simulations) are carried out. General expressions for the segregation fluxes due to pressure gradients and temperature gradients are derived. Simplified equations are obtained for the limiting cases of low volume fractions (ideal gas limit) and equal sized particles. Theoretical predictions of equilibrium number density profiles are in good agreement with computations for mixtures of equal sized particles with different density for all solids volume fractions, and for mixtures of different sized particles at low volume fractions (ν<0.2), when the particles are elastic or nearly elastic. In the case of inelastic, frictional particles the theory gives reasonable predictions if an appropriate effective granular temperature is assumed. The relative importance of pressure diffusion and temperature diffusion for the cases considered is discussed.

Journal ArticleDOI
TL;DR: The performance of the proposed MAP-Markov random field based scheme for recovering the depth and the focused image of a scene from two defocused images is found to be better than that of the existing window-based depth from defocus technique.
Abstract: In this paper, we propose a MAP-Markov random field (MRF) based scheme for recovering the depth and the focused image of a scene from two defocused images. The space-variant blur parameter and the focused image of the scene are both modeled as MRFs and their MAP estimates are obtained using simulated annealing. The scheme is amenable to the incorporation of smoothness constraints on the spatial variations of the blur parameter as well as the scene intensity. It also allows for inclusion of line fields to preserve discontinuities. The performance of the proposed scheme is tested on synthetic as well as real data and the estimates of the depth are found to be better than that of the existing window-based depth from defocus technique. The quality of the space-variant restored image of the scene is quite good even under severe space-varying blurring conditions.
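The estimation procedure lends itself to a compact sketch: simulated annealing over a space-variant blur field with an MRF smoothness prior. Everything below is illustrative; `data_cost` stands in for the paper's defocus observation model, and the quadratic smoothness term omits the line fields mentioned above.

```python
import math
import random

def anneal_blur_field(init_field, data_cost, smoothness=1.0,
                      t0=1.0, cooling=0.95, sweeps=100, step=0.1):
    """MAP estimate of a space-variant blur field (list of lists) by
    simulated annealing with an MRF smoothness prior. data_cost(i, j, v)
    is a hypothetical per-site likelihood term."""
    field = [row[:] for row in init_field]
    H, W = len(field), len(field[0])

    def local_energy(i, j, v):
        e = data_cost(i, j, v)
        for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < H and 0 <= nj < W:
                e += smoothness * (v - field[ni][nj]) ** 2  # MRF prior
        return e

    t = t0
    for _ in range(sweeps):
        for i in range(H):
            for j in range(W):
                old = field[i][j]
                new = old + random.uniform(-step, step)
                delta = local_energy(i, j, new) - local_energy(i, j, old)
                if delta < 0 or random.random() < math.exp(-delta / t):
                    field[i][j] = new  # Metropolis acceptance
        t *= cooling  # geometric cooling schedule
    return field
```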

Journal ArticleDOI
TL;DR: In this article, the site occupation of the cations in non-stoichiometric Ni-Zr substituted barium ferrite BaFe(12−2x)Ni(0.63x)Zr(0.50x)O(19−δ) (0 ⩽ x ⩽ 0.5) has been investigated using Mossbauer and FT-IR spectroscopy to examine its influence on magnetic properties (σ_s, T_c, K_1, etc.).

Journal ArticleDOI
TL;DR: Christensen's stochastic theory of hydrodynamic lubrication of rough surfaces is used to study the effect of surface roughness in an infinitely long porous journal bearing operating under steady conditions as mentioned in this paper.
Abstract: Christensen's stochastic theory of hydrodynamic lubrication of rough surfaces is used to study the effect of surface roughness in an infinitely long porous journal bearing operating under steady conditions. It is shown that the surface roughness considerably influences the bearing performance; the direction of the influence depends on the roughness type.

Journal ArticleDOI
TL;DR: Modeling congestion has provided a quantitative basis for understanding the contribution of different vehicle types in overall congestion, and it is useful for evolving the policy for congestion mitigation.
Abstract: A unified methodology has been proposed for the quantification of congestion, incorporating the volume and operational characteristics of traffic movement. The level of congestion has been modeled to relate to the causal influences of traffic movement. Modeling congestion has provided a quantitative basis for understanding the contribution of different vehicle types in overall congestion, and it is useful for evolving the policy for congestion mitigation. Quantified congestion level has been used as a logical and improved measure of effectiveness to account for the conceptual definition of level of service in a quantitative manner. Based on the congestion level, 10 levels of service have been proposed, with 9 in a stable flow zone (presently designated as A–E), and 1 representing an unstable operation (presently designated as F). The philosophy has been demonstrated by developing congestion models and assessing the effect of roadway width on congestion levels and service volumes. While it is possible to assess the realized benefits from an increase in roadway width, the required number of traffic lanes for a desired level of service can also be estimated.

Journal ArticleDOI
TL;DR: In this paper, a rotational spring was used to represent the cracked section, and the Frobenius method was employed to enable possible detection of the crack location based on the measurement of natural frequencies.

Journal ArticleDOI
01 Jun 1999
TL;DR: The core of a formal data model for network directories is developed, and a sequence of efficiently computable query languages with increasing expressive power are proposed, which share the flexibility and utility of the recent proposals for semi-structured data models.
Abstract: Hierarchically structured directories have recently proliferated with the growth of the Internet, and are being used to store not only address books and contact information for people, but also personal profiles, network resource information, and network and service policies. These systems provide a means for managing scale and heterogeneity, while allowing for conceptual unity and autonomy across multiple directory servers in the network, in a way far superior to what conventional relational or object-oriented databases offer. Yet, in deployed systems today, much of the data is modeled in an ad hoc manner, and many of the more sophisticated "queries" involve navigational access. In this paper, we develop the core of a formal data model for network directories, and propose a sequence of efficiently computable query languages with increasing expressive power. The directory data model can naturally represent rich forms of heterogeneity exhibited in the real world. Answers to queries expressible in our query languages can exhibit the same kinds of heterogeneity. We present external memory algorithms for the evaluation of queries posed in our directory query languages, and prove the efficiency of each algorithm in terms of its I/O complexity. Our data model and query languages share the flexibility and utility of the recent proposals for semi-structured data models, while at the same time effectively addressing the specific needs of network directory applications, which we demonstrate by means of a representative real-life example.
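A toy illustration of the kind of data and queries involved: a directory as a tree of attribute-bearing entries, with a recursive selection over it. The encoding and query form are illustrative stand-ins, not the paper's formal data model or query languages.

```python
# A network directory as a tree of entries, each with attributes and children.
directory = {
    "attrs": {"type": "org", "name": "example"},
    "children": [
        {"attrs": {"type": "person", "name": "alice", "role": "admin"},
         "children": []},
        {"attrs": {"type": "policy", "name": "qos", "priority": "high"},
         "children": []},
    ],
}

def select(node, predicate, path=()):
    """Yield (path, attrs) for every entry whose attributes satisfy
    the predicate, anywhere in the subtree."""
    here = path + (node["attrs"].get("name", "?"),)
    if predicate(node["attrs"]):
        yield here, node["attrs"]
    for child in node["children"]:
        yield from select(child, predicate, here)

# All policy entries anywhere below the root:
for path, attrs in select(directory, lambda a: a.get("type") == "policy"):
    print("/".join(path), attrs)
```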

Journal ArticleDOI
TL;DR: In this article, a group theoretic method is used to establish the entire class of self-similar solutions to the problem of shock wave propagation through a dusty gas and necessary conditions for the existence of similarity solutions for shocks of arbitrary strength as well as for strong shocks are obtained.
Abstract: A group theoretic method is used to establish the entire class of self-similar solutions to the problem of shock wave propagation through a dusty gas. Necessary conditions for the existence of similarity solutions for shocks of arbitrary strength as well as for strong shocks are obtained. It is found that the problem admits a self-similar solution only when the ambient medium ahead of the shock is of uniform density. Collapse of imploding cylindrical and spherical shocks is worked out in detail to investigate how the shock evolution is influenced by the mass concentration of solid particles in the medium, the ratio of the density of solid particles to that of initial density of the medium, the relative specific heat and the amplification mechanism of the flow convergence.

Journal ArticleDOI
TL;DR: A shorter average lifetime of tryptophans in the membrane-bound alpha-toxin as compared to the native toxin supported the conclusions based on iodide quenching of the membrane-bound toxin.

Journal ArticleDOI
TL;DR: In this paper, the authors studied the equilibrium and kinetic aspects of cobalt biosorption onto a new biosorbent, PFB1, which had a specific surface area of 256.8 m² g⁻¹ and showed high equilibrium capacities for cobalt uptake, the highest being 190 mg g⁻¹.

Journal ArticleDOI
TL;DR: A mixed culture biofilm was developed with a sulfur oxidising, heterotrophic bacterium Thiosphaera pantotropha, autotrophic nitrifiers and other heterotrophs in a three stage rotating biological contactor (RBC).

Journal ArticleDOI
17 May 1999
TL;DR: This work investigates the effects of discovering `backlinks' from Web resources, namely links pointing to the resource, and describes tools for backlink navigation on both the client and server side, using an applet for the client and a module for the Apache Web server.
Abstract: ``Life can only be understood backwards, but it must be lived forwards.'' Soren Kierkegaard From a user's perspective, hypertext links on the Web form a directed graph between distinct information sources. We investigate the effects of discovering `backlinks' from Web resources, namely links pointing to the resource. We describe tools for backlink navigation on both the client and server side, using an applet for the client and a module for the Apache Web server. We also discuss possible extensions to the HTTP protocol to facilitate the collection and navigation of backlink information in the World Wide Web.
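Given a crawled forward link graph, a backlink index is simply its transpose. A minimal sketch of that inversion (storage and serving details, e.g. in the client applet or the Apache module, are beyond this illustration):

```python
from collections import defaultdict

def backlink_index(forward_links):
    """forward_links: dict mapping URL -> iterable of URLs it links to.
    Returns dict mapping URL -> sorted list of URLs that point to it."""
    back = defaultdict(set)
    for src, targets in forward_links.items():
        for dst in targets:
            back[dst].add(src)  # reverse each edge
    return {url: sorted(srcs) for url, srcs in back.items()}

links = {"a.html": ["b.html"], "c.html": ["b.html", "a.html"]}
print(backlink_index(links)["b.html"])  # ['a.html', 'c.html']
```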