"spkmeans" can also reduce the dimensionality of the the original word-document matrix through concept decomposition or QR decompostion of the concept vectors. This may be useful for classification and query retrieval.
You are welcome to use the code under the terms of the licence for research or commercial purposes; however, please acknowledge its use with a citation:
Dhillon, I. S. and Modha, D. S., "Concept Decompositions for Large Sparse Text Data using Clustering", Machine Learning, 42:1, pages 143-175, Jan, 2001.
Dhillon, I. S. and Fan, J. and Guan, Y., "Efficient Clustering of Very Large Document Collections", 2000, invited book chapter in Data Mining for Scientific and Engineering Applications, Kluwer Academic Publishers, 2001.
Here are the BibTeX entries:
@ARTICLE{dhillon:modha:mlj01,
AUTHOR = {Dhillon, I. S. and Modha, D. S.},
TITLE = {Concept decompositions for large sparse text data using clustering},
JOURNAL = {Machine Learning},
YEAR = {2001},
MONTH = {Jan},
VOLUME = {42},
NUMBER = {1},
PAGES = {143--175} }
@INCOLLECTION{dhillon:fan:guan00,
AUTHOR = {Dhillon, I. S. and Fan, J. and Guan, Y.},
TITLE = {Efficient Clustering of Very Large Document Collections},
BOOKTITLE = {Data Mining for Scientific and Engineering Applications},
PUBLISHER = {Kluwer Academic Publishers},
EDITOR = {R. Grossman and C. Kamath and V. Kumar and R. Namburu},
YEAR = {2001},
PAGES = {},
NOTE = {Invited book chapter}}
Bug reports and comments are appreciated!