
Showing papers by "YMCA University of Science and Technology published in 2010"


Journal ArticleDOI
TL;DR: The frameworks and mechanisms of existing web crawlers are taxonomically classified into four steps and analyzed to identify their limitations in searching the deep web.
Abstract: Web crawlers specialize in downloading, analyzing, and indexing web content from the surface web, which consists of interlinked HTML pages. They run into limitations, however, when data lies behind a query interface, where the response depends on the querying party's context and the crawler must engage in a dialogue to negotiate for the information. In this article, the authors discuss deep web searching techniques. A survey of the technical literature on deep web searching contributes to the development of a general framework. The frameworks and mechanisms of existing web crawlers are taxonomically classified into four steps and analyzed to identify their limitations in searching the deep web.
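
To make the surveyed pipeline concrete, here is a minimal sketch of a deep web probing loop: locating a query interface on a surface page, submitting a probe term, and harvesting result links. The four-step split, the helper name, and the GET-only form handling are illustrative assumptions, not the framework classified in the paper.

```python
# Hypothetical deep web probe; assumes a simple GET-based search form.
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def probe_deep_web(page_url: str, query_term: str) -> list[str]:
    # Step 1: fetch the surface page hosting the query interface.
    html = requests.get(page_url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    # Step 2: locate a search form and its text input field.
    form = soup.find("form")
    if form is None:
        return []
    text_input = form.find("input", attrs={"type": "text"})
    if text_input is None or not text_input.get("name"):
        return []
    # Step 3: submit the probe query through the form's action URL.
    action = urljoin(page_url, form.get("action") or page_url)
    response = requests.get(action, params={text_input["name"]: query_term},
                            timeout=10)
    # Step 4: extract result links from the response page for indexing.
    result_soup = BeautifulSoup(response.text, "html.parser")
    return [urljoin(action, a["href"])
            for a in result_soup.find_all("a", href=True)]
```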

26 citations


Proceedings ArticleDOI
01 Dec 2010
TL;DR: A novel result optimization technique based on learning from historical query logs is proposed; it predicts users' information needs and reduces their navigation time within the result list.
Abstract: Modern Information Retrieval Systems match the terms of a user query against the documents in their index and return a large number of Web pages, generally as a ranked list. Examining every returned document is almost impractical for the user, so some means of result optimization is needed. In this paper, a novel result optimization technique based on learning from historical query logs is proposed, which predicts users' information needs and reduces their navigation time within the result list. The method first clusters the queries in the query logs using a novel similarity function and then captures the sequential patterns of clicked web pages in each cluster using a sequential pattern mining algorithm. Finally, the search result list is re-ranked by updating the existing PageRank values of pages using the discovered sequential patterns. The proposed work reduces the search space, as user-intended pages tend to move up the result list.
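
As an illustration of the re-ranking idea, the sketch below groups logged queries by a Jaccard similarity over their terms, counts the pages clicked for similar past queries, and boosts those pages' PageRank values. The similarity function, the mining step, and all thresholds are simplified stand-ins for the paper's algorithms.

```python
from collections import Counter

def jaccard(a: set[str], b: set[str]) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def rerank(query: str, query_log, pagerank: dict[str, float],
           sim_threshold: float = 0.5, boost: float = 0.1) -> list[str]:
    # query_log: list of (past_query_terms, clicked_pages) pairs.
    terms = set(query.lower().split())
    clicks: Counter = Counter()
    for past_terms, clicked_pages in query_log:
        if jaccard(terms, set(past_terms)) >= sim_threshold:
            clicks.update(clicked_pages)
    # Pages frequently clicked for similar queries move up the list.
    adjusted = {p: pr + boost * clicks[p] for p, pr in pagerank.items()}
    return sorted(adjusted, key=adjusted.get, reverse=True)
```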

14 citations


Book ChapterDOI
09 Aug 2010
TL;DR: An architecture is proposed that introduces a technique to continuously update/refresh the Hidden Web repository based on the update probability of each web page.
Abstract: The Hidden Web's broad and relevant coverage of dynamic, high-quality content, coupled with the high change frequency of web pages, poses a challenge for maintaining and fetching up-to-date information. This requires verifying whether a web page has changed, which is itself a challenge. A mechanism is therefore needed for adjusting the time period between two successive revisits based on the update probability of the page. In this paper, an architecture is proposed that introduces a technique to continuously update/refresh the Hidden Web repository.
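
A minimal sketch of the revisit-adjustment idea follows: detect change with a content digest, then shrink or grow the interval between successive crawls accordingly. The halving/backoff rule and the bounds are illustrative assumptions; the paper's probability-based model may differ.

```python
import hashlib

def page_changed(old_digest: str, new_content: bytes) -> tuple[bool, str]:
    # Compare a stored digest against the freshly downloaded page.
    new_digest = hashlib.sha1(new_content).hexdigest()
    return new_digest != old_digest, new_digest

def next_interval(hours: float, changed: bool,
                  min_h: float = 1.0, max_h: float = 720.0) -> float:
    # Revisit sooner after an observed change; back off when unchanged.
    return max(min_h, hours / 2) if changed else min(max_h, hours * 1.5)
```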

9 citations


Posted Content
TL;DR: The main focus of this survey is on the recent solutions available for overcoming the identified issues.
Abstract: The Semantic Web (SW) deals with transforming the information-oriented web into a knowledge-oriented web. The SW makes product and service information more abundant and improves search mechanisms, resulting in greater user satisfaction, but the lack of standard ontologies and communication interfaces for a domain creates many hurdles for its implementation. Considerable research has focused on overcoming these deficiencies through more efficient agent-oriented interactive algorithms and ontology-based system design. The main focus of this survey is on the recent solutions available for overcoming the identified issues.

4 citations


Journal ArticleDOI
TL;DR: An ontology-driven agent-based focused crawler (O-ABFC) that emphasizes the use of ontology and contextual information in crawling, and uniquely contributes to the improvement of existing web crawling techniques.
Abstract: Existing focused crawlers (FCs) are based on a fixed model of the web and are thus deficient in using the available information. The premise of this paper is that ontology can play an important role in enhancing the efficiency of existing agent-based focused crawlers. The paper proposes an ontology-driven agent-based focused crawler (O-ABFC) that emphasizes the use of ontology and contextual information in crawling. Ontology is emerging as a promising tool that improves on simple keyword-based crawling by introducing the semantics, or context, in which a keyword is being searched. The major benefit of the proposed O-ABFC is that it bridges the gap between the underlying concept and the interpretation of the data, and it uniquely contributes to the improvement of existing web crawling techniques.
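
A rough sketch of how ontology can steer crawl ordering: score each fetched page against ontology concepts and their related terms, and keep the frontier as a priority queue keyed by that score. The flat dictionary ontology and the weights are toy assumptions; O-ABFC's agent machinery is not modeled here.

```python
import heapq

def relevance(text: str, ontology: dict[str, list[str]]) -> float:
    # Higher score when the page mentions a concept or its related terms.
    words = text.lower().split()
    score = sum(2.0 * words.count(c) + sum(words.count(r) for r in related)
                for c, related in ontology.items())
    return score / max(len(words), 1)

# Frontier ordered by relevance of the page linking to each URL;
# heapq is a min-heap, so scores are negated.
ontology = {"energy": ["solar", "panel", "grid"]}
frontier: list[tuple[float, str]] = []
score = relevance("solar panel cost and grid energy storage", ontology)
heapq.heappush(frontier, (-score, "http://example.org/next-page"))
```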

4 citations


Journal ArticleDOI
TL;DR: This work contributes a unique strategy for deploying mobile sensors in the subsurface to obtain real-time information that is otherwise unavailable, and the proposed algorithm provides efficient coverage and connectivity metrics.
Abstract: The demand for oil is growing steadily in emerging and developing economies, while oil field discoveries continue to decline; the gap between demand and supply will therefore widen over time. Subsurface exploration deals with extracting valuable hydrocarbons from oil wells. Because of its hazardous nature, it is one of the most difficult fields in which to carry out experiments. The uncertainties associated with this field result from various factors, such as a lack of information regarding the location, size, and spread of the natural resource. To handle these factors, mobile wireless sensors appear to be a promising paradigm for increasing productivity and throughput by serving as intelligent investigators. At the time of writing, no researchers have proposed the deployment of mobile wireless sensors in oil fields. This work therefore contributes a unique strategy for deploying mobile sensors in the subsurface to obtain real-time information that is otherwise unavailable. Moreover, the proposed algorithm provides efficient coverage and connectivity metrics. A mathematical model is also presented and compared with existing node placement strategies from related fields, and the proposed algorithm is found to provide better coverage and connectivity.
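
For concreteness, here is a toy computation of the two metrics the paper evaluates: grid-sampled area coverage for a given sensing radius, and connectivity of the communication graph over the deployed nodes. Positions, radii, and grid step are illustrative; this is not the paper's deployment algorithm or mathematical model.

```python
from math import hypot

def coverage(sensors, sense_r, width, height, step=1.0):
    # Fraction of grid sample points within sensing range of some node.
    points = [(x * step, y * step)
              for y in range(int(height / step) + 1)
              for x in range(int(width / step) + 1)]
    covered = sum(1 for px, py in points
                  if any(hypot(px - sx, py - sy) <= sense_r
                         for sx, sy in sensors))
    return covered / len(points)

def is_connected(sensors, comm_r):
    # Depth-first search over the communication-range graph.
    if not sensors:
        return False
    seen, stack = {0}, [0]
    while stack:
        i = stack.pop()
        for j, (xj, yj) in enumerate(sensors):
            if j not in seen and hypot(sensors[i][0] - xj,
                                       sensors[i][1] - yj) <= comm_r:
                seen.add(j)
                stack.append(j)
    return len(seen) == len(sensors)
```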

4 citations


Proceedings ArticleDOI
01 Dec 2010
TL;DR: This paper proposes an architecture for relevance-oriented searching of web documents using data mining techniques such as clustering and association rules, which, together with context and ontology, extract potentially useful documents from the database.
Abstract: The size of the publicly indexable World Wide Web (WWW) has probably surpassed 14.3 billion documents, and growth shows no sign of leveling off. Search engines encounter the problem of ambiguity in words; they therefore use ontology to find pages with words that are syntactically different but semantically similar. The knowledge provided by an ontology is extremely useful in defining the structure and scope for mining web content. A context-ontology is a shared vocabulary for sharing context information in a pervasive computing domain; it includes machine-interpretable definitions of the basic concepts in the domain and the relations among them. This paper proposes an architecture for relevant searching of web documents using data mining techniques such as clustering and association rules. Together with context and ontology, these techniques extract potentially useful documents from the database. An algorithm is also devised that shows the working as a sequence of steps. Finally, the results are compared with prevailing approaches, and an example shows that CODT performs better in terms of relevancy.
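
The sketch below mimics the described flow in miniature: mine simple term co-occurrence rules from the document collection, expand the query with them (a crude stand-in for the context/ontology component), and rank documents by overlap with the expanded query. The rule miner, thresholds, and set-of-terms document model are illustrative simplifications of the proposed architecture.

```python
from collections import defaultdict
from itertools import combinations

def mine_rules(docs: list[set[str]], min_support: int = 2) -> dict[str, set[str]]:
    # term -> terms co-occurring with it in at least min_support documents.
    pair_count: dict[tuple[str, str], int] = defaultdict(int)
    for terms in docs:
        for a, b in combinations(sorted(terms), 2):
            pair_count[(a, b)] += 1
    rules: dict[str, set[str]] = defaultdict(set)
    for (a, b), n in pair_count.items():
        if n >= min_support:
            rules[a].add(b)
            rules[b].add(a)
    return rules

def retrieve(query: set[str], docs: list[set[str]]) -> list[int]:
    rules = mine_rules(docs)
    expanded = query | {t for q in query for t in rules.get(q, set())}
    # Rank documents by overlap with the context-expanded query.
    scored = [(len(expanded & d), i) for i, d in enumerate(docs)]
    return [i for s, i in sorted(scored, reverse=True) if s > 0]
```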

2 citations


Journal ArticleDOI
TL;DR: The results of experiments indicated that the addition of garden trimmings (GT) up to 30% with cow dung (CD) produced good-quality vermicompost in terms of increased NPK status as well as decreased C:N and C:P ratios.
Abstract: Garden residues and trimmings were vermicomposted in order to mitigate the environmental problems they cause. The results of the experiments indicated that adding Garden Trimmings (GT) at up to 30% with cow dung (CD) produced good-quality vermicompost in terms of increased NPK status as well as decreased C:N and C:P ratios. A comparison of the decomposition of GT + CD in vermicomposting and composting (without worms) experiments indicated that vermicomposting with E. fetida is more efficient in maximising the extent of organic matter degradation. The growth of worms was less pronounced when they were fed higher concentrations of GT in the feed mixtures.

2 citations


Proceedings ArticleDOI
01 Dec 2010
TL;DR: An improved architecture for minimizing handover latency in MIPv6 is proposed; it suggests the removal of the DAD procedure and the configuration of a new CoA for the MN by the NAR.
Abstract: With the advent of mobile devices, the internet has undergone a huge and unexpected explosion of growth. The wireless mobile internet gives users access to internet services while they are on the move. This mobility is supported through the Internet Protocol extension known as Mobile IP. Mobile IP allows users with mobile devices to have continuous network connectivity to the internet without changing their IP addresses when moving from one network to another. While on the move, the Mobile Node (MN) undergoes a handover process, in which the MN disconnects from one network and tries to connect to another, i.e., a New Access Router (NAR). The handover process involves various time-consuming procedures: Neighbor Discovery, Care-of-Address (CoA) configuration, Duplicate Address Detection (DAD), Binding Registration, and Route Optimization. Much research has been devoted to accelerating movement detection and registration; however, one time-consuming operation, DAD, has been overlooked by most studies. An improved architecture for minimizing handover latency in MIPv6 is proposed in this paper. The improved architecture suggests removing the DAD procedure and having the NAR configure a new CoA for the MN. The result is evaluated on the basis of numerical analysis.
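
A back-of-the-envelope version of that numerical analysis: model total handover latency as a sum of per-phase delays and compare it with and without DAD. The timing values are illustrative placeholders (the 1 s DAD wait reflects the default RetransTimer of IPv6 Neighbor Discovery), not the paper's measured figures.

```python
# Illustrative phase delays in milliseconds; not measured values.
PHASES_MS = {
    "movement_detection": 100,
    "coa_configuration": 10,
    "dad": 1000,  # one Neighbor Solicitation + default 1 s RetransTimer wait
    "binding_registration": 150,
}

standard = sum(PHASES_MS.values())
proposed = sum(ms for phase, ms in PHASES_MS.items() if phase != "dad")
print(f"standard MIPv6 handover:        {standard} ms")
print(f"without DAD (NAR-assigned CoA): {proposed} ms")
```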

2 citations


Proceedings ArticleDOI
01 Oct 2010
TL;DR: A clustering algorithm that partitions the set of documents into ordered clusters so that documents within the same cluster are similar and are assigned close document identifiers; the average difference between successive document identifiers is thereby minimized, and storage space is saved.
Abstract: Granting efficient and fast access to the index is a key performance issue for Web Search Engines (WSEs). To enhance memory utilization and favor fast query resolution, WSEs use Inverted File (IF) indexes, which consist of an array of posting lists, where each posting list is associated with a term and contains the identifiers of the documents containing that term. Since the document identifiers are stored in sorted order, they can be stored as the differences between successive identifiers, reducing the size of the index. This paper describes a clustering algorithm that partitions the set of documents into ordered clusters so that documents within the same cluster are similar and are assigned close document identifiers. The average difference between successive document identifiers is thereby minimized, and storage space is saved. The paper further extends this clustering algorithm to hierarchical clustering, in which similar clusters are merged to form a mega cluster and similar mega clusters are then combined to form a super cluster. These levels of clustering optimize the search process by directing the search along a specific path from the higher levels to the lower ones, i.e., from super clusters to mega clusters, then to clusters, and finally to the individual documents, so that the user gets the best possible matching results in the minimum possible time.
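
A small demonstration of why identifier reassignment shrinks the index: the gaps between successive sorted document identifiers in a posting list (which is what the index actually encodes) become much smaller when similar documents receive nearby identifiers. The posting lists here are made up.

```python
def d_gaps(posting: list[int]) -> list[int]:
    # First identifier, then differences between successive identifiers.
    posting = sorted(posting)
    return [posting[0]] + [b - a for a, b in zip(posting, posting[1:])]

# Same term, same four documents, before and after cluster-based renumbering.
before = [3, 407, 896, 1201]   # scattered identifiers
after = [3, 4, 5, 6]           # documents placed in one cluster
print(d_gaps(before))  # [3, 404, 489, 305] -> large gaps, long codes
print(d_gaps(after))   # [3, 1, 1, 1]       -> small gaps, short codes
```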

2 citations