Clustering with Constraints Web Site

The purpose of this web site is to gather information relevant to the emerging area of clustering with background knowledge in the form of constraints. Right now it contains mostly my (Ian Davidson’s) work, but it will be greatly expanded in the near future.

Software

Constraint generator; implementations of COP-k-Means, CVQE and LCVQE (read the README file).
Implementations of PKM, MKM, PKMKM

People

Sugato Basu
Ian Davidson
Kiri Wagstaff

Tutorials

IEEE ICDM 2005: Clustering with Constraints – Theory and Practice (Sugato Basu (SRI) and Ian Davidson (SUNY))

ACM KDD 2006: Clustering with Constraints – Theory and Practice (Sugato Basu (SRI) and Ian Davidson (SUNY)). PDF of Slides

Books

Constrained Clustering: Algorithms, Applications and Theory, co-edited by Sugato Basu, Ian Davidson and Kiri Wagstaff. CRC Press. In preparation, due out 2008. List of Invited Chapters

The book Semi-Supervised Learning, edited by Olivier Chapelle, Bernhard Schölkopf and Alexander Zien, contains some chapters on semi-supervised clustering.

Bibliography

Here is a list of papers in the area (not quite up to date).

Papers Available On-Line By Area (some papers appear more than once)

Applications

Theory

Davidson I., Ravi S.S., The Complexity of Non-Hierarchical Clustering With Instance and Cluster Level Constraints, technical report version of a paper to appear in the Journal of Knowledge Discovery and Data Mining. PDF

Davidson I., Ravi S.S., Identifying and Generating Easy Sets of Constraints For Clustering, to appear at the 21st AAAI Conference, 2006 (acceptance rate 21%). PDF

Davidson I., Ravi S.S., Hierarchical Clustering with Constraints: Theory and Practice, 9th European Conference on Principles and Practice of KDD, PKDD 2005 (acceptance rate 11%). PDF. Extended technical report with all proofs: PDF. (Email me for the journal/TR version.)

Davidson I., Ravi S.S., Clustering under Constraints: Feasibility Issues and the K-Means Algorithm, 5th SIAM Data Mining Conference (acceptance rate 14%). PDF

Learning Distance Metrics

Clustering to Satisfy All Constraints

Davidson I., Ravi S.S., The Complexity of Non-Hierarchical Clustering With Instance and Cluster Level Constraints, technical report version of a paper to appear in the Journal of Knowledge Discovery and Data Mining. PDF

Davidson I., Wagstaff K., Basu S., Measuring Constraint-Set Utility for Partitional Clustering Algorithms, to appear in the Proceedings of ECML/PKDD 2006 (acceptance rate 9%). PDF

Davidson I., Ravi S.S., Identifying and Generating Easy Sets of Constraints For Clustering, to appear at the 21st AAAI Conference, 2006 (acceptance rate 21%). PDF

Davidson I., Ravi S.S., Hierarchical Clustering with Constraints: Theory and Practice, 9th European Conference on Principles and Practice of KDD, PKDD 2005 (acceptance rate 11%). PDF. Extended technical report with all proofs: PDF

Davidson I., Ravi S.S., Clustering under Constraints: Feasibility Issues and the K-Means Algorithm, 5th SIAM Data Mining Conference (acceptance rate 14%). PDF