Preprocessing: A Prerequisite for Discovering Patterns in Web Usage Mining Process

Authors

  • Ramya C., Shreedhara K. S., and Kavitha G.

Abstract

Web log data is usually diverse and voluminous. It must be assembled
into a consistent, integrated, and comprehensive view before it can be
used for pattern discovery. Without properly cleaning, transforming,
and structuring the data prior to analysis, one cannot expect to find
meaningful patterns. As in most data mining applications, data
preprocessing involves removing redundant and irrelevant data,
filtering out noise, transforming the data, and resolving any
inconsistencies. This paper proposes a complete preprocessing
methodology comprising merging, data cleaning, user/session
identification, and data formatting and summarization, which improves
the quality of the data while reducing its quantity. To validate the
efficiency of the proposed methodology, several experiments were
conducted. The results show that it reduces the size of Web access log
files down to 73-82% of the initial size and offers richer logs that
are structured for the further stages of Web Usage Mining (WUM).
Preprocessing of raw data in the WUM process is thus the central theme
of this paper.
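
The stages named in the abstract (merging, data cleaning, and user/session identification) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify a log format, cleaning rules, or session heuristics, so everything concrete here is an assumption made for illustration: the NCSA Combined Log Format, the list of ignored file suffixes, the crude "bot" check on the user agent, the (IP address, user agent) pair as user key, and the 30-minute session timeout. All function names (parse_line, merge_logs, clean, sessionize) are hypothetical. The formatting and summarization stage is omitted.

```python
import re
from datetime import datetime, timedelta

# Assumption: entries follow the NCSA Combined Log Format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

# Assumption: requests for embedded resources are irrelevant to usage analysis.
IGNORED_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")

# Assumption: a gap longer than 30 minutes starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    record["agent"] = record["agent"] or ""
    return record


def merge_logs(*logs):
    """Merging: combine entries from several log files, ordered by timestamp."""
    records = []
    for log in logs:
        for line in log:
            record = parse_line(line)
            if record is not None:
                records.append(record)
    return sorted(records, key=lambda r: r["time"])


def clean(records):
    """Data cleaning: keep successful GET page requests made by non-robots."""
    kept = []
    for record in records:
        path = record["url"].split("?", 1)[0].lower()
        if path.endswith(IGNORED_SUFFIXES) or path == "/robots.txt":
            continue
        if record["method"] != "GET" or not 200 <= record["status"] < 300:
            continue
        if "bot" in record["agent"].lower():
            continue
        kept.append(record)
    return kept


def sessionize(records):
    """User/session identification on time-ordered records.

    A user is approximated by the (IP address, user agent) pair.
    Returns a list of (user, [records]) tuples, one per session.
    """
    sessions = []
    latest = {}  # user key -> that user's most recent session (list of records)
    for record in records:
        user = (record["ip"], record["agent"])
        current = latest.get(user)
        if current is None or record["time"] - current[-1]["time"] > SESSION_TIMEOUT:
            current = []
            latest[user] = current
            sessions.append((user, current))
        current.append(record)
    return sessions
```

A typical call chain would be sessionize(clean(merge_logs(log_a, log_b))), where log_a and log_b are iterables of raw log lines. Comparing the number of records before and after clean gives a rough measure of the size reduction achieved by the cleaning stage alone.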

Published

20.03.2013

How to Cite

Preprocessing: A Prerequisite for Discovering Patterns in Web Usage Mining Process. (2013). International Journal of Information and Electronics Engineering, 3(2), 196-199. https://ijiee.org/index.php/ijiee/article/view/656