Chreston Miller

About (CV)

I received my Ph.D. in Computer Science and Applications from the Department of Computer Science at Virginia Tech (2013). I worked under Dr. Francis Quek in the Vislab Research group. My research interests include data analytics, behavior analytics, and pattern analysis.

I am currently the Chief Informaticist for the College of Engineering at Virginia Tech. I support engineering research groups (students and faculty) with all aspects of data management during a project. My main focus in this position is supporting the research groups with their data analytics.

Prior to the doctoral program, I received my Bachelor of Science degree in Computer Science with a double major in Mathematics from Virginia Tech in May 2006. I received my master's degree in Computer Science from Virginia Tech in December 2007. I have participated in an internship and co-op with the DoD at NSWC-DD (Naval Surface Warfare Center - Dahlgren Division).

Current Research

Here I give an overview of the major areas of my research. Further details can be seen on my dissertation project page, plus my publications below.

Pattern Analysis and Identification through Structure [1][2][3][4][5][6][7]

There is a multitude of annotated behavior corpora (manual and automatic annotations) available as research expands in multimodal analysis of human behavior. Normally, multimodal corpus analysis either separates researchers from their data (e.g., statistical analysis or machine learning) or only engages them after the fact in the tedious task of scrubbing through annotated/raw data. I focus on how to identify patterns based on the structural nature of human behavior. By structure I mean discrete events that hold ordered relations in time that may vary from one occurrence to another.

I approach the problem from a structural angle because several nuances of human behavior make its analysis challenging and require a focus on its structural nature. First, human behavior is variant. The idea represented by a behavioral interaction, e.g., a greeting between two individuals, may be exhibited in many different ways in the data, making identification difficult. Second, every observed behavior has the potential to be relevant to an expert depending on his/her analysis goal(s). Hence, there is no concept of noise but rather one of relevance. Third, a behavior's value to an expert may not be based on frequency or statistical significance but on subjective relevance. Lastly, for experts, there is no training data to build classifiers, as the behavior sought may vary greatly or the behavior(s) of interest may not be known yet. They leverage their knowledge to identify and discover what is relevant.

This has led me to a research area that explores different techniques to perform pattern analysis through the structural relationships of events with respect to order and time.

Comparison Metrics for Categorical Event Data [7]

Categorical data (e.g., descriptively labeled data such as 'John sat down' and 'Mary is running') is difficult to rank-order. When trying to choose among a collection of categorical data, the goal of the chooser, what the data means in the context of that goal, and the high-level conceptual meaning of the data dictate the final choice. I arrived at this challenge while working on my model growth algorithm for structural model learning described here. When faced with a number of possible paths through the data, which one should be taken?

I address this problem by applying n-gram processing, as seen in speech processing for the probabilistic modeling of phoneme and word sequences (R2). N-gram processing is a technique for probabilistic modeling of a sequence. N-grams are based on conditional probability, where history and context are taken into account, e.g., Pr(X | Y) = probability of X given Y. The approach applies n-gram processing to better understand how events of a categorical nature interact. Figure A (below) illustrates how n-grams are used, given a sequence of words, to predict the next word. Such predictions are based on bigrams (n = 2) and trigrams (n = 3). Given that k elements have occurred, where k >= n, the probability of the next element can be given; hence a sequence of elements can be built one at a time, as the previous elements give an idea of what is next. Figure B then illustrates the application of n-grams to a sequence of categorical data, where one might predict the next piece of data using bigrams and trigrams.

NgramUse Small.png

The application of n-gram processing provides a rank-ordering of temporal categorical data.
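As a rough illustration of the idea above, the following Python sketch ranks candidate next events by their conditional probability given the preceding events. It is not the implementation from [7]: the function names (train_ngrams, rank_next), the event labels, and the use of plain maximum-likelihood counts without smoothing are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_ngrams(sequences, n):
    """Count how often each event follows each history of n-1 events."""
    if n < 2:
        raise ValueError("this sketch expects n >= 2 (bigrams, trigrams, ...)")
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            history = tuple(seq[i:i + n - 1])
            next_event = seq[i + n - 1]
            counts[history][next_event] += 1
    return counts


def rank_next(counts, history, n):
    """Rank candidate next events by Pr(next | last n-1 events), highest first."""
    key = tuple(history[-(n - 1):])
    followers = counts.get(key, Counter())
    total = sum(followers.values())
    # An unseen history yields an empty ranking (no smoothing in this sketch).
    return [(event, count / total) for event, count in followers.most_common()]


# Hypothetical categorical event sequences (labels are illustrative only).
sequences = [
    ["enter", "greet", "sit", "talk"],
    ["enter", "greet", "talk"],
    ["enter", "sit", "talk"],
]

bigrams = train_ngrams(sequences, n=2)
trigrams = train_ngrams(sequences, n=3)

# Rank-order the possible continuations after "enter" (bigram history)
# and after "enter", "greet" (trigram history).
bigram_ranking = rank_next(bigrams, ["enter"], n=2)
trigram_ranking = rank_next(trigrams, ["enter", "greet"], n=3)
print(bigram_ranking)
print(trigram_ranking)
```

With these toy sequences, "greet" is ranked above "sit" as the continuation of "enter" because it follows "enter" in two of the three sequences. A practical system would likely need smoothing or back-off for histories that never occur in the data.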

Representations for Human Behavior

My work with structural learning led to a set of relationship principles to describe the relationships among events. A number of grammars and representations have been developed, e.g., Allen (R3), Freksa (R4), Temporal Reference Language, Description Logics, Mörchen and Fradkin (R5), Time Series Knowledge Representation, T-Patterns, and Frequent Episode Mining. All of these have strengths but do not cover all areas needed to represent and describe the models and patterns of human behavior. My representational needs cover the areas of order, temporal description, grouping, and semantic description. A colleague and I are currently developing a new representation that combines the strengths of current representations and provides extensions.

Prior Research

Petri Net Graphical User Interfaces [8]

  • Project Description: This work investigated the use of graphical aids in place of textual information for conveying Petri Net models, with respect to the mental workload placed on the user and the level of understanding the model representation supports.
  • Motivation: The modeling language provided by Petri Nets has proven to be a promising means for representation. The tools and environments that support Petri Nets provide a graphical representation of the model and use textual information to convey the detailed workings of the model. However, such use of textual information places a heavy mental workload on the user and impedes quick understanding of the models.
  • My Contribution: I was the sole person who worked on this project. This was published in a scholar series at Virginia Tech.

TanTab [9][10]

  • Project Description: TanTab is an interactive tabletop system that uses tangrams (a traditional puzzle with 7 geometric shapes) to mediate geometric learning while supporting group play among children, from PreK to Grade 3. The system affords physical and graphical interaction by supporting three modes: the physical, the virtual and the parametric.
  • Motivation: Previous studies have shown that using tangrams as physical manipulatives produces better geometric problem solving, while virtual manipulation on a screen supports better understanding of formal geometric parameters. However, no commercial system to date is able to bridge between fully intuitive physical manipulation of forms (tangrams) and explicit control of geometric parameters through the virtual. TanTab aims to serve that purpose.
  • My Contribution: Lead developer for infrastructure and integration of the computer vision system and tabletop display system.

MacVisSTA [R1]

  • Project Description: MacVisSTA is a tool to annotate and analyze multi-channel/multimodal data with synchronization support across multiple channels (e.g., audio and/or video).
  • Motivation: The analysis of multimodal data has become an increasingly important analysis approach. More systems, interfaces, and sensor arrays are becoming multimodal (i.e., using multiple modes of input). New breeds of software are needed to address the needs of analyzing this kind of data.
  • My Contribution: I worked on optimizations for the source code.

CardTable [11][12]

  • Project Description: CardTable is a notecard system that aids historians in formulating insight about historical facts. The facts are represented by virtual notecards on a large, high-resolution tabletop display system. The user interacts with the virtual notecards on the tabletop display through a handheld device (PDA).
  • Motivation: The inspiration comes from the affordance of physical notecards as used by researchers for years. Historians especially sift through stacks of facts and notes trying to make sense of historical events.
  • My Contribution: I developed the software for the PDA that interacted with the tabletop display.


  1. C. Miller, F. Quek, and L.P. Morency. Search Strategies for Pattern Identification in Multimodal Data: Three Case Studies. ICMR '14, Glasgow, UK. (AR: 39%). pdf
  2. C. Miller, F. Quek, and L.P. Morency. Interactive Relevance Search and Modeling: Support for Expert-Driven Analysis of Multimodal Data. ICMI ’13, Sydney, Australia. (AR: 38%) pdf
  3. C. Miller, L.P. Morency and F. Quek. Structural and Temporal Inference Search (STIS): Pattern Identification in Multimodal Data. ICMI 2012 (35.8%). pdf
  4. C. Miller and F. Quek. Interactive Data-Driven Discovery of Temporal Behavior Models From Events In Media Streams. ACM Multimedia, Oct. 29 - Nov. 2, 2012 (20.2%). pdf
  5. C. Miller. Interactive data-driven search and discovery of temporal behavior patterns from media streams. ACM Multimedia Doctoral Symposium, Oct. 29 - Nov. 2, 2012. pdf
  6. C. Miller and F. Quek. Toward Multimodal Situated Analysis. ICMI 2011 (47/120, 39%). Alicante, Spain. pdf
  7. C. Miller, F. Quek, and N. Ramakrishnan. Structuring ordered nominal data for event sequence discovery. In MM ’10: Proceedings of the eighteenth ACM international conference on Multimedia. ACM, 2010. (29/85, 34.12%). pdf
  8. C. Miller. The relationship between mental workload and interface design for petri net environments. In T. Smith-Jackson and T. Coalson, editors, ISE 5604: Human Information Processing Scholar Series 2009-4. TR# VT-ISE-ACE2009-4, pages 55–60. 2011. pdf
  9. F. Quek, C. Miller, A. Joshi, Y. Verdie, R. Ehrich, M. Evans, S. Chu Yew Yee, and P. Chakraborty (2010). TanTab, A Tangram Tabletop System. Disney Research Learning Challenge Finalist. SIGGRAPH ’10, Los Angeles, CA.
  10. Michael A. Evans, Jesse L.M. Wilkins, Yannick Verdie, Chreston Miller, Elisabeth Drechsel, Eric Woods and Berrin Dogusoy. Technology to Facilitate Collaborative, Co-Constructive Learning: A Multi-Touch, Tangible User Interface for PreK-2 Mathematics. ICLS 2010.
  11. C. Miller, A. Robinson, R. Wang, P. Chung, and F. Quek. Interaction techniques for the analysis of complex data on high-resolution displays. In ICMI ’08, pages 21–28, New York, NY, USA, 2008. ACM. (AR: 44%). pdf
  12. C. Andrews, T. Henry, C. Miller, and F. Quek. Cardtable: An embodied tool for analysis of historical information. In Tabletop 2007, 2007.


  • R1. R. T. Rose, F. Quek, and Y. Shi. Macvissta: a system for multimodal analysis. In ICMI ’04, pages 259–264. ACM, 2004.
  • R2. P. F. Brown, P. V. deSouza, R. L. Mercer, V. J. D. Pietra, and J. C. Lai. Class-based n-gram models of natural language. Comput. Linguist., 18(4):467–479, 1992.
  • R3. J. F. Allen. Maintaining knowledge about temporal intervals. Commun. ACM, 26(11):832–843, 1983.
  • R4. C. Freksa. Temporal reasoning based on semi-intervals. Artificial Intelligence, 54(1-2):199–227, 1992.
  • R5. F. Mörchen and D. Fradkin. Robust mining of time intervals with semi-interval partial order patterns. In SIAM Conference on Data Mining (SDM), 2010.