To be Held in Conjunction with the
Fourth SIAM International Conference on Data Mining
(SDM 2004)
Applications in various domains, such as text/web mining and bioinformatics, often lead to very high-dimensional data, with dimensionality in the hundreds or thousands. In addition to being high-dimensional, these data sets are often sparse. Clustering such high-dimensional data sets is a contemporary challenge. Successful algorithms must avoid the curse of dimensionality while remaining computationally efficient.
A one-day workshop on Clustering High Dimensional Data and its Applications will be held in conjunction with SDM 2004 in Florida (April '04) to bring together researchers to present their current approaches and results in clustering high-dimensional data that arise in various applications. Particular applications of interest are bioinformatics, text mining, market-basket and web log analysis.
Clustering Algorithms and Models
- Probabilistic Models
- Information-Theoretic Formulations
- Vector-space Models
- Graph-based Models
- Software and Toolkits
Applications
- Bioinformatics
- Text Mining
- Web log analysis
- Factor Analysis
- Feature construction
Attendees are required to register for SDM 2004, but no separate registration is needed for this workshop.
Original papers on clustering high-dimensional data are solicited. For consideration, send an electronic submission (PostScript or PDF versions printable on 8.5 x 11 paper only) to Jacob Kogan: kogan@math.umbc.edu; phone: (410) 455-3297; fax: (410) 455-1066.
An email including the title, authors, and abstract of the paper should be sent separately in plain ASCII format (no HTML tags, please).
To guarantee consideration, manuscripts must be received by January 21, 2004, and must be no more than 10 pages excluding figures, tables, and references. Submission of work in progress is also encouraged.
All accepted papers whose camera-ready copies are received by the March 15, 2004 deadline (see below) will be distributed as photocopied proceedings available at the conference for purchase by attendees. Electronic copies will also be put on a SIAM web site.
Keynote Speaker: Charles Elkan, UC San Diego
Devasis Bassu, Telcordia Research
Pavel Berkhin, Yahoo!
Dan Boley, University of Minnesota
Paul Bradley, Bradley Data Consulting, LLC
Chris Ding, NERSC, Lawrence Berkeley Lab
Bob Funderlic, North Carolina State University
Efim Gendler, iBoogie.tv
Joydeep Ghosh, University of Texas, Austin
Jon Kettenring, Telcordia Research
Shailesh Kumar, Fair Isaac
Arie Leizarowitz, Technion, Israel
Dharmendra Modha, IBM Almaden Research Center
Marc Teboulle, Tel-Aviv University
Zeev (Vladimir) Volkovich, Ort Braude College, Israel
Shi Zhong, Florida Atlantic University
Organizers
Inderjit Dhillon
Department of Computer Science
University of Texas
Austin, TX 78712-1188
Phone: (512) 471-9725
Fax: (512) 471-8885
Jacob Kogan
Department of Mathematics and Statistics
Univ. of Maryland, Baltimore County
Baltimore, MD 21250
Phone: (410) 455-3297
Fax: (410) 455-1066
Last modified on March 13, 2004.