Title:
Data clustering in C++ : an object-oriented approach
Personal Author:
Gan, Guojun
Series:
Chapman & Hall/CRC data mining and knowledge discovery series
Publication Information:
Boca Raton : Chapman & Hall/CRC, 2011.
Physical Description:
xxiv, 496 p. : ill. ; 24 cm. + 1 CD (12 cm.)
ISBN:
9781439862230
General Note:
Accompanied by CD-ROM : CP 030887

Available:

Item Barcode:
30000010303025
Call Number:
QA278 G36 2011
Material Type:
Open Access Book
Item Category 1:
Book


Summary

Data clustering is a highly interdisciplinary field, the goal of which is to divide a set of objects into homogeneous groups such that objects in the same group are similar and objects in different groups are quite distinct. Thousands of theoretical papers and a number of books on data clustering have been published over the past 50 years. However, few books exist to teach people how to implement data clustering algorithms. This book was written for anyone who wants to implement or improve their data clustering algorithms.

Using object-oriented design and programming techniques, Data Clustering in C++ exploits the commonalities of all data clustering algorithms to create a flexible set of reusable classes that simplifies the implementation of any data clustering algorithm. Readers can follow the development of the base data clustering classes and several popular data clustering algorithms. Additional topics such as data pre-processing, data visualization, cluster visualization, and cluster interpretation are briefly covered.
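The idea of a shared base class for clustering algorithms can be illustrated with a minimal sketch. This record does not show the book's actual class design, so every name below (`Point`, `squaredDistance`, `ClusteringAlgorithm`, `KMeans`) is a hypothetical stand-in, and the k-means variant shown (Lloyd's iteration seeded with the first k points) is chosen only because it is short, not because it is the book's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical names throughout; these are not the book's classes.
using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
inline double squaredDistance(const Point& a, const Point& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Common interface: every algorithm maps a data set to one label per point.
class ClusteringAlgorithm {
public:
    virtual ~ClusteringAlgorithm() = default;
    virtual std::vector<std::size_t> cluster(const std::vector<Point>& data) = 0;
};

// One concrete algorithm: k-means, seeded with the first k points.
class KMeans : public ClusteringAlgorithm {
public:
    KMeans(std::size_t k, std::size_t maxIterations)
        : k_(k), maxIterations_(maxIterations) {}

    std::vector<std::size_t> cluster(const std::vector<Point>& data) override {
        std::vector<std::size_t> labels(data.size(), 0);
        if (data.empty() || k_ == 0) return labels;

        const std::size_t k = std::min(k_, data.size());
        std::vector<Point> centers(
            data.begin(), data.begin() + static_cast<std::ptrdiff_t>(k));

        for (std::size_t iter = 0; iter < maxIterations_; ++iter) {
            // Assignment step: attach each point to its nearest center.
            bool changed = false;
            for (std::size_t i = 0; i < data.size(); ++i) {
                std::size_t best = 0;
                double bestDist = std::numeric_limits<double>::max();
                for (std::size_t c = 0; c < k; ++c) {
                    const double d = squaredDistance(data[i], centers[c]);
                    if (d < bestDist) { bestDist = d; best = c; }
                }
                if (labels[i] != best) { labels[i] = best; changed = true; }
            }
            if (!changed && iter > 0) break;  // assignments are stable

            // Update step: move each center to the mean of its members.
            std::vector<Point> sums(k, Point(data[0].size(), 0.0));
            std::vector<std::size_t> counts(k, 0);
            for (std::size_t i = 0; i < data.size(); ++i) {
                for (std::size_t j = 0; j < data[i].size(); ++j)
                    sums[labels[i]][j] += data[i][j];
                ++counts[labels[i]];
            }
            for (std::size_t c = 0; c < k; ++c) {
                if (counts[c] == 0) continue;  // keep an empty cluster's center
                for (std::size_t j = 0; j < sums[c].size(); ++j)
                    centers[c][j] = sums[c][j] / static_cast<double>(counts[c]);
            }
        }
        return labels;
    }

private:
    std::size_t k_;
    std::size_t maxIterations_;
};
```

Calling code would hold a `ClusteringAlgorithm&` and call `cluster(data)`, so another algorithm could be substituted without changing the caller. That substitution is the kind of reuse the summary describes, under the assumptions stated above.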

This book is divided into three parts:

Data Clustering and C++ Preliminaries: A review of basic concepts of data clustering, the unified modeling language, object-oriented programming in C++, and design patterns
A C++ Data Clustering Framework: The development of data clustering base classes
Data Clustering Algorithms: The implementation of several popular data clustering algorithms

A key to learning a clustering algorithm is to implement it and experiment with it. Complete listings of classes, examples, unit test cases, and GNU configuration files are included in the appendices of this book as well as on the accompanying CD-ROM. The only requirements to compile the code are a modern C++ compiler and the Boost C++ libraries.


Author Notes

Guojun Gan, Manulife Financial, Toronto, Canada


Reviews

Choice Review

Data clustering is a process central to the field of data mining, with the aim of grouping objects so that those in the same cluster are more similar to one another than to those in other clusters. Here, Gan (Manulife Financial, Canada) not only explores this popular area and a wide variety of algorithms used therein, but also demonstrates step by step how to implement these algorithms in C++. The book is well structured, with an overview of the area, and a discussion of object-oriented programming concepts and software engineering techniques. This content precedes the reader's exposure to an extended array of C++ clustering implementations, leaving as the only prerequisite to fully comprehend the text a working knowledge of C++ programming. Gan is an expert in data clustering, and this book serves as a natural extension of his thesis work. The target audience seems to be somewhat limited, since only clustering algorithm implementers will maximally benefit from its contents; Gan's previous work coauthored with C. Ma and J. Wu, Data Clustering: Theory, Algorithms, and Applications (2007), will serve general readers better. Overall, a well-written, informative book that will be greatly appreciated by C++ data mining programmers. Summing Up: Recommended. Upper-division undergraduates through professionals/practitioners.

D. Papamichail, University of Miami