Abstract— Unlike traditional one-way K-means clustering, co-clustering simultaneously clusters both the data points and the features of a two-dimensional data matrix. It is a powerful data analysis technique that can discover latent patterns hidden within particular rows and columns. Accordingly, co-clustering has been successfully applied to varied domains, including, but not limited to, text clustering, microarray analysis, speech and video analysis, and natural language processing. Assuming that the whole data matrix is available, the usual co-clustering algorithm updates all row and column assignments in batch mode. In this paper, we develop an online incremental co-clustering algorithm that updates both row and column clustering statistics on the fly, using only each data point as it becomes available; thus, the proposed algorithm can handle streaming data collected from sensor networks or handheld devices. The characteristics of batch, mini-batch, and online clustering and co-clustering algorithms are compared, and directions for future work are provided.
Index Terms— Batch, mini-batch, incremental, online co-clustering.
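The contrast between batch and online updates mentioned in the abstract can be illustrated with a minimal sketch. Note that this is not the paper's co-clustering algorithm: it is a plain one-way online K-means update (a running mean per cluster), shown only to make the idea of updating clustering statistics one data point at a time concrete. The function names `nearest` and `online_update`, the seed centroids, and the toy stream are all assumptions made for illustration.

```python
from typing import List, Sequence


def nearest(centroids: List[List[float]], x: Sequence[float]) -> int:
    """Return the index of the centroid closest to x (squared Euclidean distance)."""
    best, best_dist = 0, float("inf")
    for k, c in enumerate(centroids):
        dist = sum((a - b) ** 2 for a, b in zip(c, x))
        if dist < best_dist:
            best, best_dist = k, dist
    return best


def online_update(centroids: List[List[float]], counts: List[int],
                  x: Sequence[float]) -> int:
    """Assign x to its nearest centroid and update that centroid in place.

    The running-mean step c += (x - c) / n keeps each centroid equal to the
    mean of the points assigned to it so far, without storing those points.
    """
    k = nearest(centroids, x)
    counts[k] += 1
    step = 1.0 / counts[k]
    centroids[k] = [c + step * (a - c) for c, a in zip(centroids[k], x)]
    return k


# Hypothetical seeds: each initial centroid counts as one observed point.
centroids = [[0.0, 0.0], [10.0, 10.0]]
counts = [1, 1]

# Toy stream; each point is processed once and then discarded.
stream = [[1.0, 1.0], [9.0, 9.0], [0.0, 2.0], [11.0, 9.0]]
labels = [online_update(centroids, counts, x) for x in stream]
```

A batch algorithm would instead revisit every point on each pass and recompute all assignments; a mini-batch variant would apply the same kind of update to small groups of points at a time.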
H. Cho and M. K. An are with Sam Houston State University, Huntsville, TX 77341, USA (e-mail: hyukcho@shsu.edu, an@shsu.edu).
Cite: Hyuk Cho and Min Kyung An, "Co-Clustering Algorithm: Batch, Mini-Batch, and Online," International Journal of Information and Electronics Engineering, vol. 4, no. 5, pp. 340-346, 2014.