The 2006 IEEE International Conference on Data Mining
18 - 22 December 2006, Hong Kong
Accepted Paper List

Regular Papers

Paper ID  Paper Title
DM235 Stability Region based Expectation Maximization for Model-based Clustering
Chandan Reddy, Hsiao-Dong Chiang, and Bala Rajaratnam
DM269 Anytime Classification Using the Nearest Neighbor Algorithm with Applications to Stream Mining
Ken Ueno, Xiaopeng Xi, Eamonn Keogh, and Dah-Jye Lee
DM278 Discovering partial orders in binary data
Deepak Rajan and Philip Yu
DM296 Bayesian State Space Modeling Approach for Measuring the Effectiveness of Marketing Activities and Baseline Sales from POS Data
Tomohiro Ando
DM301 Large Scale Detection of Irregularities in Accounting Data
Stephen Bay, Krishna Kumaraswamy, Markus Anderle, Rohit Kumar, and David Steier
DM303 Subjectivity Categorization in Weblog Space using Part-Of-Speech based Smoothing
Shen Huang, Jian-Tao Sun, Xuanhui Wang, Hua-Jun Zeng, and Zheng Chen
DM323 Regularized Least Absolute Deviations Regression, an Efficient Algorithm for Parameter Tuning and its Application in Image Reconstruction
Li Wang, Ji Zhu, and Michael Gordon
DM330 Active Learning to Maximize Area Under the ROC Curve
Matt Culver, Deng Kun, and Stephen Scott
DM343 Turning Clusters into Patterns: Rectangle-based Discriminative Data Description
Byron Gao and Martin Ester
DM345 Relational Ensemble Classification
Christine Preisach and Lars Schmidt-Thieme
DM350 Latent Friend Mining from Blog Data
Dou Shen, Jian-Tao Sun, Qiang Yang, and Zheng Chen
DM352 Using an Ensemble of One-Class SVM Classifiers to Harden Payload-based Anomaly Detection Systems
Roberto Perdisci, Guofei Gu, and Wenke Lee
DM359 A Novel Scalable Algorithm for Supervised Subspace Learning
Jun Yan, Ning Liu, Benyu Zhang, Qiang Yang, and Zheng Chen
DM364 Dimension Reduction for Supervised Ordering
Toshihiro Kamishima and Shotaro Akaho
DM369 An Efficient Reference-based Approach to Outlier Detection in Large Dataset
Yaling Pei, Osmar Zaiane, and Yong Gao
DM371 What is the dimension of your binary data?
Nikolaj Tatti, Taneli Mielikainen, Aristides Gionis, and Heikki Mannila
DM376 Hierarchical Classification by Expected Utility Maximization
Korinna Bade, Eyke Hüllermeier, and Andreas Nürnberger
DM383 Finding Unusual Shapes
Li Wei and Eamonn Keogh
DM448 Incremental Mining of Frequent Query Patterns from XML Queries for Caching
Guoliang Li, Jianhua Feng, Jianyong Wang, Yong Zhang, and Lizhu Zhou
DM480 An information theoretic approach to detection of minority subsets in database
Shin Ando and Einoshin Suzuki
DM484 How Bayesians Debug
Chao Liu, Zeng Lian, and Jiawei Han
DM497 Integrating Features from Different Sources for Music Information Retrieval
Tao Li and Mitsunori Ogihara
DM512 Co-clustering documents and words using Bipartite Isoperimetric Graph Partitioning
Manjeet Rege, Ming Dong, and Farshad Fotouhi
DM517 Efficient Clustering of Uncertain Data
Wang Kay Ngai, Ben Kao, Chun Kit Chui, Reynold Cheng, Michael Chau, and Kevin Y Yip
DM539 On the Lower Bound of Local Optimums in K-Means Algorithm
Zhenjie Zhang, Bing Tian Dai, and Anthony K.H. Tung
DM558 Geometrically Inspired Itemset Mining
Florian Verhein and Sanjay Chawla
DM599 Latent Dirichlet Co-Clustering
Mahdi Shafiei and Evangelos Milios
DM625 Forecasting Skewed Biased Stochastic Ozone Days
Xiaojing Yuan, Kun Zhang, Wei Fan, Ian Davidson, and Xiangshang Li
DM628 Bregman Bubble Clustering: A Robust, Scalable Framework for Locating Multiple, Dense Regions in Data
Gunjan Gupta and Joydeep Ghosh
DM631 Local Correlation Tracking in Time Series
Spiros Papadimitriou, Jimeng Sun, and Philip Yu
DM632 Lazy Associative Classification
Adriano Veloso, Wagner Meira Jr., and Mohammed Zaki
DM633 Secure Distributed k-Anonymous Pattern Mining
Wei Jiang and Maurizio Atzori
DM637 Fast Random Walk with Restart and Its Applications
Hanghang Tong, Christos Faloutsos, and Jia-Yu Pan
DM638 Identifying Follow-Correlation Itemset-Pairs
Shichao Zhang
DM643 STAGGER: Periodicity Mining of Data Streams using Expanding Sliding Windows
Mohamed Elfeky, Walid Aref, and Ahmed Elmagarmid
DM644 A Novel Method for Detecting Outlying Subspaces in High-dimensional Databases Using Genetic Algorithm
Ji Zhang
DM658 Learning to Use a Learned Model: A Two-Stage Approach to Classification
Luiza Antonie, Osmar Zaiane, and Robert Holte
DM670 δ-Tolerance Closed Frequent Itemsets
James Cheng, Yiping Ke, and Wilfred Ng
DM696 The Relationships Among Various Nonnegative Matrix Factorization Methods for Clustering
Tao Li, Chris Ding, and Shenghuo Zhu
DM697 Mining for tree-query associations in a graph
Eveline Hoekx and Jan Van den Bussche
DM701 Improving Personalization Solutions Through Optimal Segmentation of Customer Bases
Tianyi Jiang and Alexander Tuzhilin
DM702 Discovering Unrevealed Properties of Probability Estimation Trees: On Algorithm Selection and Performance Explanation
Kun Zhang, Wei Fan, Bill Buckles, Xiaojing Yuan, and Zujia Xu
DM718 A Parameterized Probabilistic Model of Network Evolution for Supervised Link Prediction
Hisashi Kashima and Naoki Abe
DM726 Data Mining Approaches to Criminal Career Analysis
Tim Cocx, Jeroen de Bruin, Walter Kosters, Jeroen Laros, and Joost Kok
DM737 Personalization in Context: Does Context Matter When Building Personalized Customer Models?
Michele Gorgoglione, Cosimo Palmisano, and Alex Tuzhilin
DM756 Mixed-Drove Spatio-Temporal Co-occurrence Pattern Mining: A Summary of Results
Mete Celik, Shashi Shekhar, James Rogers, James Shine, and Jin Yoo
DM764 COALA: A Novel Approach for the Extraction of an Alternate Clustering of High Quality and High Dissimilarity
Eric Kyoo Han Bae and James Bailey
DM791 LOCI: Load Shedding through Class-Preserving Data Acquisition
Peng Wang, Haixun Wang, Wei Wang, Baile Shi, and Philip S. Yu
DM793 Applying Data Mining to Pseudo-Relevance Feedback for High Performance Text Retrieval
Xiangji (Jimmy) Huang, YanRui Huang, Miao Wen, Aijun An, Yang Liu, and Josiah Poon
DM796 Cluster Ranking with an Application to Mining Mailbox Networks
Ido Guy, Ziv Bar-Yossef, Ronny Lempel, Yoelle S. Maarek, and Vladimir Soroka
DM803 Finding 'Who is talking to whom' in VoIP Networks via Progressive Stream Clustering
Olivier Verscheure, Michail Vlachos, Aris Anagnostopoulos, Pascal Frossard, Eric Bouillet, and Philip S Yu
DM847 P3C: A Robust Projected Clustering Algorithm
Gabriela Moise, Jorg Sander, and Martin Ester
DM848 Rapid Identification of Column Heterogeneity
Bing Tian Dai, Nick Koudas, Beng Chin Ooi, Divesh Srivastava, and Suresh Venkatasubramanian
DM852 Optimal Segmentation using Tree Models
Robert Gwadera, Aristides Gionis, and Heikki Mannila
DM864 Parallel Graph Mining on CMP Architectures
Gregory Buehrer and Srinivasan Parthasarathy
DM866 Accelerating Newton Optimization for Log-Linear Models through Feature Redundancy
Arpit Mathur and Soumen Chakrabarti
DM868 Frequent Closed Itemset Mining Using Prefix Graphs with an Efficient Flow-Based Pruning Strategy
H.D.K. Moonesinghe, Samah Fodeh, and Pang-Ning Tan
DM883 Converting Output Scores from Outlier Detection Algorithms into Probability Estimates
Jing Gao and Pang-Ning Tan
DM889 On the Use of Structure and Sequence-based Features for Protein Classification and Retrieval
Keith Marsolo and Srinivasan Parthasarathy
DM892 Who thinks who knows who? Socio-cognitive analysis of email networks
Nishith Pathak, Sandeep Mane, and Jaideep Srivastava
DM894 Boosting Kernel Models for Regression
Ping Sun and Xin Yao
DM903 A Data Mining Approach for Capacity Building of Stakeholders in Integrated Flood Management
Peter Owotoki, Natasa Manojlovic, Friedrich Mayer-Lindenberg, and Erik Pasche
DM926 Boosting for Learning Multiple Classes with Imbalanced Class Distribution
Yanmin Sun and Yang Wang
DM934 Extracting Keyphrases using Semantic Networks Structure Analysis
Chong Huang, Yonghong Tian, Tiejun Huang, Charles Ling, and Zhi Zhou
DM939 Meta Clustering
Rich Caruana, Mohamed Elhawary, Nam Nguyen, and Casey Smith
DM942 An Interactive Semantic Video Mining and Retrieval Platform - Application in Transportation Surveillance Video for Incident Detection
Xin Chen and Chengcui Zhang
DM944 Comparison of Descriptor Spaces for Chemical Compound Retrieval and Classification
Nikil Wale and George Karypis
DM954 Biclustering Protein Complex Interactions with a Biclique Finding Algorithm
Chris Ding, Ya Zhang, and Stephen Holbrook
DM958 The PDD Framework for Detecting Categories of Peculiar Data
Mahesh Shrestha, Howard Hamilton, Y. Y. Yao, Ken Konkel, and Liqiang Geng
DM963 Global and Componentwise Extrapolation for Accelerating Data Mining from Large Incomplete Data Set with the EM Algorithm
Chun-Nan Hsu, Han-Shen Huang, and Bo-Hou Yang
DM972 Adaptive Blocking: Learning to Scale Up Record Linkage
Mikhail Bilenko, Beena Kamath, and Raymond J. Mooney
DM975 Entity Resolution with Markov Logic
Parag Singla and Pedro Domingos
DM978 Dirichlet Aspect Weighting: A Generalized EM algorithm for Integrating External Data Fields with Semantically Structured Queries by using Gradient Projection Method
Atulya Velivelli and Thomas Huang

Short Papers

Paper ID  Paper Title
DM224 GraphRank: Statistical Modeling and Mining of Significant Subgraphs in the Feature Space
Huahai He and Ambuj Singh
DM252 A Framework for Regional Association Rule Mining in Spatial Datasets
Wei Ding, Christoph F. Eick, Jing Wang, and XiaoJing Yuan
DM263 Mining Latent Associations of Objects Using a Typed Mixture Model: A Case Study on Expert/Expertise Mining
Shenghua Bao, Yunbo Cao, Hang Li, Bing Liu, and Yong Yu
DM273 Decision Trees for Functional Variables
Suhrid Balakrishnan and David Madigan
DM283 Discover Bayesian Networks from Incomplete Data Using a Hybrid Evolutionary Algorithm
Man Leung Wong and Yuan Yuan Guo
DM285 Star-Structured High-Order Heterogeneous Data Co-clustering based on Consistent Information Theory
Bin Gao, Tie-Yan Liu, and Wei-Ying Ma
DM289 Mining Complex Time-Series Data by Learning the Temporal Structure Using Bayesian Techniques and Markovian Models
Yi Wang and Lizhu Zhou
DM300 Boosting the Feature Space: Text Classification for Unstructured Data on the Web
Yang Song, Ding Zhou, Jian Huang, Isaac Councill, Hongyuan Zha, and C. Lee Giles
DM307 Improving Grouped-Entity Resolution using Quasi-Cliques
Byung-Won On, Ergin Elmacioglu, Dongwon Lee, Jaewoo Kang, and Jian Pei
DM308 Diverse Topic Phrase Extraction through Latent Semantic Analysis
Jilin Chen, Benyu Zhang, Jun Yan, and Qiang Yang
DM318 Temporal Data Mining in Dynamic Feature Spaces
Brent Wenerstrom and Christophe Giraud-Carrier
DM331 Constructing Ensembles for Better Ranking
Jin Huang and Charles Ling
DM351 AC-Close: Efficiently Mining Approximate Closed Itemsets by Core Pattern Recovery
Hong Cheng, Philip S. Yu, and Jiawei Han
DM354 Direct Marketing When There Are Voluntary Buyers
Yi-Ting Lai, Ke Wang, Daymond Ling, Hua Shi, and Jason Zhang
DM370 An Effective Algorithm for Mining Competitors from the Web
Rui Li, Shenghua Bao, Jin Wang, Yong Yu, and Yunbo Cao
DM386 Intelligent Icons: Integrating Lite-Weight Data Mining and Visualization into GUI Operating Systems
Eamonn Keogh, Li Wei, Xiaopeng Xi, Stefano Lonardi, Jin Shieh, and Scott Sirowy
DM389 Semantic Overall and Partial Similarity of Temporal Query Logs for Similar Query Suggestion
Ning Liu, Jun Yan, Benyu Zhang, Weiguo Fan, and Zheng Chen
DM397 High Quality, Efficient Hierarchical Document Clustering using Closed Interesting Itemsets
Hassan Malik and John Kender
DM400 Exploratory Under-Sampling for Class-Imbalance Learning
Xu-Ying Liu, Jianxin Wu, and Zhi-Hua Zhou
DM405 Semi-Supervised Kernel Regression
Meng Wang, Xian-Sheng Hua, Yan Song, Li-Rong Dai, and Hong-Jiang Zhang
DM409 Query-Sensitive Similarity Measure for Content-Based Image Retrieval
Zhi-Hua Zhou and Hong-Bin Dai
DM445 Adding Semantics to Email Clustering
Hua Li, Dou Shen, Benyu Zhang, Zheng Chen, and Qiang Yang
DM504 Deploying Approaches for Pattern Refinement in Text Mining
Sheng-Tang Wu, Yuefeng Li, and Yue Xu
DM538 Mining Maximal Quasi-Bicliques to Co-Cluster Stocks and Financial Ratios for Value Investment
Kelvin Sim, Jinyan Li, Vivekanand Gopalkrishnan, and Guimei Liu
DM540 Recommendation on Item Graphs
Fei Wang, Sheng Ma, and Tao Li
DM543 Cluster Based Core Vector Machine
Asharaf S, Narasimha Murty Musti, and Shirish Krishnaj Shevade
DM545 Adaptive Kernel Principal Component Analysis with Unsupervised Learning of Kernels
Daoqiang Zhang, Zhi-Hua Zhou, and Songcan Chen
DM549 Manifold Clustering of Shapes
Dragomir Yankov and Eamonn Keogh
DM556 Discovery of Collocation Episodes in Spatiotemporal Data
Huiping Cao, Nikos Mamoulis, and David W. Cheung
DM564 The Influence of Class Imbalance on Cost-Sensitive Learning: An Empirical Study
Xu-Ying Liu and Zhi-Hua Zhou
DM600 Entropy-based Concept Shift Detection
Peter Vorburger and Abraham Bernstein
DM609 Solution Path for Semi-Supervised Classification with Manifold Regularization
Gang Wang, Tao Chen, Dit-Yan Yeung, and Frederick H. Lochovsky
DM611 Pattern Mining in Frequent Dynamic Subgraphs
Karsten Borgwardt, Hans-Peter Kriegel, and Peter Wackersreuther
DM612 Corrective Classification: A Classifier Ensemble with Corrective and Diverse Base Learners
DM613 DSTree: A Tree Structure for Efficient Mining of Frequent Patterns from Data Streams
Carson Leung and Quamrul I. Khan
DM627 bitSPADE: A Lattice-Based Sequential Pattern Mining Algorithm Using Bitmap Representation
Sujeevan Aseervatham, Aomar Osmani, and Emmanuel Viennet
DM630 COSMIC: Conceptually Specified Multi-Instance Clusters
Matthias Schubert, Alexey Pryakhin, Arthur Zimek, and Hans-Peter Kriegel
DM634 Mining Generalized Graph Patterns based on User Examples
Pavel Dmitriev and Carl Lagoze
DM636 TOP-COP: Mining TOP-K Strongly Correlated Pairs in Large Databases
Hui Xiong, Mark Brodie, and Sheng Ma
DM646 Automatic Single-Organ Segmentation in Computed Tomography Images
Ruchaneewan Susomboon, Daniela Raicu, Jacob Furst, and David Channin
DM654 Searching for Pattern Rules
Guichong Li and Howard Hamilton
DM655 Enhancing Text Clustering using Concept-based Mining Model
Shady Shehata, Fakhri Karray, and Mohamed Kamel
DM657 Probabilistic segmentation and analysis of horizontal cells
Vebjorn Ljosa and Ambuj K. Singh
DM660 Semantic Smoothing for Model-based Document Clustering
Xiaodan Zhang, Xiaohua Zhou, and Xiaohua Hu
DM676 A Balanced Ensemble Approach to Weighting Classifiers for Text Classification
Gabriel Pui Cheong Fung, Jeffrey Yu, Haixun Wang, Huan Liu, and David W Cheung
DM685 Social Capital in Friendship-Event Networks
Louis Licamele and Lise Getoor
DM686 Cluster Analysis of Time-series Laboratory Test Data Based on the Trajectory Representation and Multiscale Comparison Techniques
Shoji Hirano and Shusaku Tsumoto
DM714 MARGIN: Maximal Frequent Subgraph Mining
Lini Thomas, Satyanarayana R Valluri, and Kamalakar Karlapalem
DM721 Distances and (Indefinite) Kernels for Sets of Objects
Adam Woznica, Alexandros Kalousis, and Melanie Hilario
DM733 NewsCATS: A News Categorization And Trading System
Marc-André Mittermayer and Gerhard Knolmayer
DM744 Comparisons of K-Anonymization and Randomization Schemes Under Linking Attacks
Zhouxuan Teng and Wenliang Du
DM745 Getting the Most Out of Ensemble Selection
Rich Caruana, Art Munson, and Alexandru Niculescu-Mizil
DM761 Multi-Tier Granule Mining for Representations of Multidimensional Association Rules
Yuefeng Li, Wanzhong Yang, and Yue Xu
DM768 Speedup Clustering with Hierarchical Ranking
Jianjun Zhou and Joerg Sander
DM773 Semantic Kernels for Text Classification based on Topological Measures of Feature Similarity
Stephan Bloehdorn, Roberto Basili, Marco Cammisa, and Alessandro Moschitti
DM784 Mining Maximal Generalized Frequent Geographic Patterns with Knowledge Constraints
Vania Bogorny, Joao Valiati, Sandro Camargo, Paulo Engel, and Luis Otavio Alvares
DM795 Window-based Tensor Analysis on High-dimensional and Multi-aspect Streams
Jimeng Sun, Spiros Papadimitriou, and Philip Yu
DM807 Object Identification with Constraints
Steffen Rendle and Lars Schmidt-Thieme
DM825 Minimum Enclosing Spheres Formulations for Support Vector Ordinal Regression
S K Shevade and Wei Chu
DM833 Mining Correlation between Motifs and Gene Expression
Yi Lu, Shiyong Lu, Adrian Platts, and Stephen Krawetz
DM857 TRIAS - An Algorithm for Mining Iceberg Tri-Lattices
Robert Jäschke, Andreas Hotho, Christoph Schmitz, Bernhard Ganter, and Gerd Stumme
DM870 Resource Management for Networked Classifiers in Distributed Stream Mining Systems
Deepak Turaga, Olivier Verscheure, Upendra Chaudhari, and Lisa Amini
DM880 An Experimental Investigation of Graph Kernels on two Collaborative Recommendation Tasks
Francois Fouss, Luh Yen, Alain Pirotte, and Marco Saerens
DM898 Opening the Black Box of Feature Extraction: Incorporating Visualization into High-Dimensional Data Mining Processes
Jianting Zhang and Le Gruenwald
DM905 Fast On-line Kernel Learning for Trees
Fabio Aiolli, Giovanni Da San Martino, Alessandro Sperduti, and Alessandro Moschitti
DM908 Rule-Based Platform for Web Service User Profiling
Jianping Zhang and Manu Shukla
DM911 Improving Nearest Neighbor Classifier using Tabu Search and Ensemble Distance Metrics
Muhammad Atif Tahir and James Smith
DM914 High-Performance Unsupervised Relation Extraction from Large Corpora
Benjamin Rosenfeld and Ronen Feldman
DM928 Detection of Interdomain Routing Anomalies Based on Higher-Order Path Analysis
Murat Ganiz, William Pottenger, Sudhan Kanitkar, and Mooi Chuah
DM929 On Trajectory Representation and Analysis for Scientific Data
Sameep Mehta, Raghu Machiraju, and Srinivasan Parthasarathy
DM933 Belief Propagation in Large, Highly Connected Graphs for 3D Part-Based Object Recognition
Frank DiMaio and Jude Shavlik
DM941 Fast Relevance Discovery in Time Series
Chang-shing Perng, Haixun Wang, and Sheng Ma
DM951 A Simple Yet Effective Data Clustering Algorithm
Soujanya Vadapalli, Satyanarayana Valluri, and Kamalakar Karlapalem
DM953 Plagiarism Detection in arXiv
Daria Sorokina, Johannes Gehrke, Simeon Warner, and Paul Ginsparg
DM959 Detecting Web Spam from Temporal Statistics of Websites
Guoyang Shen, Bin Gao, Tie-Yan Liu, Guang Feng, Shiji Song, and Hang Li
DM968 A Feature Selection and Evaluation Scheme for Computer Virus Detection
Olivier Henchiri and Nathalie Japkowicz
DM969 Probabilistic Enhanced Mapping with the Generative Tabular Model
Rodolphe Priam and Mohamed Nadif
DM970 Linear and Non-Linear Dimensional Reduction via Class Representatives for Text Classification
Dimitrios Zeimpekis and Efstratios Gallopoulos
DM979 Gradual Cube: Customize Profile on Mobile OLAP
Li Jun, Zhou Haofeng, and Wang Wei
