Open Access · Proceedings Article · DOI

Analyzing web application log files to find hit count through the utilization of Hadoop MapReduce in cloud computing environment

TLDR
The Hadoop MapReduce programming model is applied to analyzing web log files so that the authors could obtain the hit count of a specific web application; results are evaluated using the Map and Reduce functions.
Abstract
MapReduce has been widely applied in various fields of data- and compute-intensive applications, and it is also an important programming model for cloud computing. Hadoop is an open-source implementation of MapReduce which operates on terabytes of data using commodity hardware. We have applied this Hadoop MapReduce programming model to analyzing web log files so that we could obtain the hit count of a specific web application. This system uses the Hadoop file system to store the log file, and results are evaluated using the Map and Reduce functions. Experimental results show the hit count for each field in the log file. Due to MapReduce's runtime parallelization, response time is also reduced.
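The hit-count approach the abstract describes can be sketched as a pair of map and reduce functions. The sketch below is a minimal, self-contained illustration, not the authors' actual code: it assumes Apache-style access log lines (the log format and field names are assumptions), and a small local driver stands in for the Hadoop shuffle-and-sort phase that would group map output by key on a real cluster.

```python
import re
from collections import defaultdict

# Assumed Apache-style log line: IP, identd, user, [timestamp], "request", status, bytes
LOG_RE = re.compile(r'(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|POST|HEAD) (\S+) [^"]*" (\d{3}) \d+')

def map_fn(line):
    """Map phase: emit (URL, 1) for every successfully parsed request line."""
    m = LOG_RE.match(line)
    if m:
        url = m.group(2)
        yield (url, 1)

def reduce_fn(key, values):
    """Reduce phase: sum the counts for one URL, yielding its total hit count."""
    return (key, sum(values))

def run_mapreduce(lines):
    """Local stand-in for Hadoop's shuffle: group map output by key, then reduce."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_fn(line):
            groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in sorted(groups.items()))

logs = [
    '10.0.0.1 - - [01/Jan/2015:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2015:10:00:05 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.3 - - [01/Jan/2015:10:00:09 +0000] "GET /about.html HTTP/1.1" 404 128',
]
print(run_mapreduce(logs))  # {'/about.html': 1, '/index.html': 2}
```

On an actual cluster the same two functions would be submitted (for example via the Hadoop Streaming API), with HDFS holding the log file and the framework parallelizing the map tasks across nodes; counting by a different log field (client IP, status code) only requires changing the key the mapper emits.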


Citations
Proceedings ArticleDOI

Big data analysis in e-commerce system using Hadoop MapReduce

TL;DR: This work proposes a predictive prefetching system based on preprocessing of web logs using Hadoop MapReduce, which provides accurate results with minimum response time for e-commerce business activities.
Proceedings ArticleDOI

Spark-based log data analysis for reconstruction of cybercrime events in cloud environment

TL;DR: The results show that Spark can be used as a fast platform for handling diverse, large-size log data and for extracting useful information that can assist digital investigators in analyzing the immense amount of generated cloud log data in a given frame of time.
Proceedings ArticleDOI

Analysing Log Files For Web Intrusion Investigation Using Hadoop

TL;DR: The results of this experimental simulation indicate that the Hadoop application is able to produce analysis results from large web log files to assist web intrusion investigation.
Journal ArticleDOI

EDAWS: A distributed framework with efficient data analytics workspace towards discriminative services for critical infrastructures

TL;DR: The general solution of EDAWS is proposed; a case study of a smart residence prototype offering discriminative services in terms of information retrieval, personalized information push, and hot topic discovery is thoroughly discussed. Experimental results indicate that data processing on the computing nodes scales well with data size and node count, and that the prototype successfully moves from raw data to discriminative services.
Book ChapterDOI

Abnormal User Pattern Detection Using Semi-structured Server Log File Analysis

TL;DR: The objective of this paper is to find abnormal activity patterns of users in a huge amount of semi-structured server log data by using the open-source framework Hadoop; the output plots help differentiate between normal users and intruders in a particular network.
References
Journal ArticleDOI

MapReduce: simplified data processing on large clusters

TL;DR: This paper presents MapReduce, a programming model and associated implementation for processing and generating large data sets; it runs on large clusters of commodity machines and is highly scalable.
Journal ArticleDOI

MapReduce: simplified data processing on large clusters

TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
Proceedings ArticleDOI

The Hadoop Distributed File System

TL;DR: Describes the architecture of HDFS and reports on experience using HDFS to manage 25 petabytes of enterprise data at Yahoo!.
Book

Hadoop: The Definitive Guide

Tom White
TL;DR: This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters.
Proceedings ArticleDOI

Pig latin: a not-so-foreign language for data processing

TL;DR: Describes a new language called Pig Latin, designed to fit in a sweet spot between the declarative style of SQL and the low-level, procedural style of MapReduce; Pig is an open-source Apache Incubator project available for general use.