PUBLICATIONS
· 2009
- S. Datta, C. Giannella, H. Kargupta, "Approximate Distributed K-Means Clustering Over a Peer-to-Peer Network", IEEE Transactions on Knowledge and Data Engineering, to appear.
- K. Das, K. Bhaduri, S. Arora, W. Griffin, K. Borne, C. Giannella, H. Kargupta, "Scalable Distributed Change Detection From Astronomy Streams Using Local, Asynchronous Eigen Monitoring Algorithms", SIAM Conference on Data Mining (SDM), 2009.
· 2008
- K. Bhaduri, R. Wolff, C. Giannella, H. Kargupta, "Distributed Decision Tree Induction in Peer-to-Peer Systems", Statistical Analysis and Data Mining, 1(2), 2008.
- K. Liu, C. Giannella, H. Kargupta, "A Survey of Attack Techniques on Privacy-Preserving Data Perturbation Methods", Chapter in "Privacy-Preserving Data Mining: Models and Algorithms",
Series: Advances in Database Systems vol. 34, Springer, 2008.
· 2007
- H. Dutta, C. Giannella, K. Borne, H. Kargupta, "Distributed Top-k Outlier Detection from Astronomy Catalogs using the DEMAC System", Proceedings of the SIAM Conference on Data Mining (SDM'07), 2007.
· 2006
- S. Datta, K. Bhaduri, C. Giannella, R. Wolff, H. Kargupta, "Distributed Data Mining in Peer-to-Peer Networks", IEEE Internet Computing, pages 18-26, July/August 2006.
- K. Liu, C. Giannella, H. Kargupta, "An Attacker's View of Distance Preserving Maps For Privacy Preserving
Data Mining", Proceedings of the 10th European Conference on Principles and
Practice of Knowledge Discovery in Databases (PKDD), 2006. Lecture Notes
in Computer Science, volume 4213, pages 297-308.
- J. Branch, C. Giannella, B. Szymanski,
R. Wolff, H. Kargupta, "In-Network Outlier Detection in Wireless Sensor Networks", Proceedings of the 26th International Conference on
Distributed Computing Systems (ICDCS), 2006.
- C. Giannella, H. Dutta,
S. Mukherjee, and H. Kargupta, "Efficient Kernel Density Estimation Over
Distributed Data", Proceedings of 9th International Workshop on High
Performance and Distributed Mining, as part of the SIAM International
Conference on Data Mining (SDM), 2006.
- C. Giannella, H. Dutta,
K. Borne, R. Wolff, and H. Kargupta, "Distributed Data Mining for
Astronomy Catalogs", Proceedings of 9th Workshop on Mining Scientific and
Engineering Datasets, as part of the SIAM International Conference on
Data Mining (SDM), 2006.
- S. Datta, C. Giannella, and H. Kargupta, "K-Means Clustering over a Large, Dynamic Network",
Proceedings of the SIAM Conference on Data Mining (SDM'06), 2006.
- S. Bandyopadhyay,
C. Giannella, U. Maulik, H. Kargupta, K. Liu, and S. Datta, "Clustering
Distributed Data Streams in Peer-to-Peer Environments",
Information Sciences, 176(14), 1952-1985, 2006.
· 2005
- J. da Silva, C. Giannella, R. Bhargava, H. Kargupta, M. Klusch, "Distributed Data Mining and Agents",
Engineering Applications of Artificial Intelligence, 18, 791-807, 2005.
- C. Giannella, B. Sayrafi, “An
Information Theoretic Histogram for One-Dimensional Selectivity Estimation”,
Proceedings of the ACM Symposium on Applied Computing (ACM SAC 2005) DTTA
track (short paper). Extended
version: Technical Report 584, Computer
Science Department, Indiana
University, download: www.cs.indiana.edu/ftp/techreports/index.html.
· 2004
- C. Giannella, K. Liu, T.
Olsen, and H. Kargupta, "Communication Efficient Construction of
Decision Trees Over Heterogeneously Distributed Data", Proceedings
of the IEEE International Conference on Data Mining
(ICDM), 2004.
- C. Giannella, R. Bhargava,
and H. Kargupta, "Multi-Agent
Systems and Distributed Data Mining", Proceedings of 8th International
Workshop on Cooperative Information Agents (CIA 2004), Erfurt, Germany,
September 27-29, 2004. Lecture
Notes in Artificial Intelligence, vol. 3191, Copyright Springer-Verlag.
- R. Chen, C. Giannella, K.
Sivakumar, and H. Kargupta, "Distributed Data Mining for
Earth and Space Science Applications" (ps), Proceedings of the NASA
Earth Science Technology Conference, 2004.
- C. Giannella and E.
Robertson, "On Approximation Measures for
Functional Dependencies" (pdf), Information Systems 29(6),
2004.
· 2003
- C. Giannella, J. Han, E.
Robertson, C. Liu, "Mining Frequent Itemsets Over Arbitrary Time
Intervals in Data Streams". Technical Report 587, Computer Science Department, Indiana
University, Nov 2003 www.cs.indiana.edu/ftp/techreports/index.html.
An older version: C. Giannella, J. Han, J. Pei, X. Yan and P.S. Yu, "Mining Frequent Patterns in Data Streams at
Multiple Time Granularities" (pdf), in H. Kargupta, A. Joshi, K.
Sivakumar, and Y. Yesha (eds.), Data Mining: Next Generation Challenges
and Future Directions, AAAI/MIT Press, 2003.
- C. Giannella and E.
Robertson, A Note on Approximation Measures for Multi-valued Dependencies
in Relational Databases, Information
Processing Letters vol 85 issue 3, February 2003.
· 2002 and earlier
- C. Giannella, An Axiomatic Approach to Defining Approximation Measures
for Functional Dependencies © Springer-Verlag (postscript), Lecture Notes in
Computer Science vol 2435, pg. 37-51 (proceedings of the 6th
East-European Conference on Advances in Databases and Information
Systems), 2002 -- Best Student Paper Award.
- C. Giannella, M. Dalkilic, D.
Groth, and E. Robertson, "Using Horizontal-Vertical Decompositions to
Improve Query Evaluations", Technical Report 558, Computer Science Department, Indiana
University, Feb 2002 (this paper can be downloaded from www.cs.indiana.edu/ftp/techreports/index.html).
Short version in Lecture Notes in Computer Science vol 2405, 2002
(proceedings of the 19th British National Conference on Databases (BNCOD))
26-41.
- C. Giannella, D. Van Gucht,
"On Adding a Connectedness Operator to FO+poly (linear)", Acta
Informatica 38(9), pages 621-648, 2002. An earlier, less
polished, version appears as Indiana University Computer Science
Department Technical Report, TR530, 2000. This paper can be downloaded
from www.cs.indiana.edu/ftp/techreports/index.html.
- C. Giannella and E. Robertson,
"On an Information Theoretic Approximation Measure for Functional
Dependencies", Technical Report 555, Computer Science Department, Indiana
University, August 2001 (this paper can be downloaded from www.cs.indiana.edu/ftp/techreports/index.html).
- C. Wyss, C. Giannella, and E.
Robertson, "FastFDs: A Heuristic-Driven Depth-First Algorithm for
Mining Functional Dependencies from Relation Instances", Technical
Report 551, Computer Science
Department, Indiana
University, July 2001 (this paper can be downloaded from www.cs.indiana.edu/ftp/techreports/index.html).
Short version in Lecture Notes in Computer Science 2112 (proceedings of
the 3rd International Conference on Data Warehousing and Knowledge
Discovery (DaWaK 2001), Munich, Germany, September 2001).
- C. Rood, C. Giannella, "Finding Minimal Keys in a Relation Instance"
(postscript) Unpublished draft.
- J. C. Nieves Sanchez, M.
Osorio, and C. Giannella, "Useful
Transformations in Answer Set Programming" (postscript), Workshop
on Answer Set Programming as part of the AAAI 2001 Spring Symposium
Series, March 26-28, 2001, Stanford, CA, Technical Report SS-01-01, pg
146-152.
- C. Giannella, John Schlipf, An
Empirical Study of the 4-Valued Kripke Kleene Semantics and 4-Valued Well-Founded
Semantics in Random Propositional Logic Programs, Annals of Mathematics
and Artificial Intelligence 25 (1999) 3,4, pg 275-309, ed. J. Dix, J.
Lobo
- C. Giannella, John Schlipf,
An Empirical Study of the 3-Valued Kripke Kleene Semantics in Random
Propositional Logic Programs, Proceedings of the Logic Programming Track
of the 7th International Workshop on Non-Monotonic Reasoning 1998, pg.
41-50, ed. J. Dix, J. Lobo
- I wrote a Master's thesis
entitled "An Empirical Study of Non-Threshold Behavior of Iterative
3-SAT Algorithms" at the University
of Cincinnati
(advised by J. Franco). A copy of the document can be obtained on request
from the Department of Electrical and Computer Engineering & Computer
Science at the University
of Cincinnati. A
short version (15 pages, unpublished) is available on-line below.
On Extending Two Threshold Algorithms to
Non-Threshold Algorithms by Attaching the Unit Clause Rule (compressed
postscript -- use gunzip to decompress)
- F.S. Annexstein and C.
Giannella, An empirical study of "lazy"
protocols for routing information in dynamic networks (compressed
postscript) International Colloquium on Structural Information and
Communication Complexity (SIROCCO '97) Carleton University Press,
Proceedings in Informatics 1, ISBN, Danny Krizanc and Peter Widmayer,
editors.
SOFTWARE AND OTHER LINKS
· Market-basket synthetic data generator (from
IBM Quest), here.
· Research presentation on data mining over a large, dynamic network (power-point slides).
· Research presentation on privacy preserving data mining -- Euclidean Distance Preserving Data Perturbation (power-point slides).
· High Support Itemset Presentation (Short).