Abstract: Our goal is to develop a visual monitoring system that passively observes moving objects in a site and learns patterns of activity from those observations. For extended sites, the system will require multiple cameras. Thus, key elements of the system are motion tracking, camera coordination, activity classification, and event detection. In this paper, we focus on motion tracking and show how one can use observed motion to learn patterns of activity in a site. Motion segmentation is based on an adaptive background subtraction method that models each pixel as a mixture of Gaussians and uses an online approximation to update the model. The Gaussian distributions are then evaluated to determine which are most likely to result from a background process. This yields a stable, real-time outdoor tracker that reliably deals with lighting changes, repetitive motions from clutter, and long-term scene changes. While a tracking system is unaware of the identity of any object it tracks, the identity remains the same for the entire tracking sequence. Our system leverages this information by accumulating joint co-occurrences of the representations within a sequence. These joint co-occurrence statistics are then used to create a hierarchical binary-tree classification of the representations. This method is useful for classifying sequences, as well as individual instances of activities in a site.