Abstract
Recognizing multiple interleaved activities in a video requires implicitly partitioning
the detections for each activity. Furthermore, constraints between activities are
important in finding valid explanations for all detections. We use Attribute Multiset
Grammars (AMGs) as a formal representation of a domain’s knowledge to encode intra- and
inter-activity constraints. We show how AMGs can be used to parse all the observations
into ‘feasible’ global explanations. We also present an algorithm for building a
Bayesian network (BN) given an AMG and a set of detections. The set of labellings of
the BN corresponds to the set of all possible parse trees. Finding the best explanation
then amounts to finding the maximum a posteriori labelling of the BN. The technique
is successfully applied to two different problems, including the challenging problem of
associating pedestrians and carried objects entering and departing a building.
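
To make the idea of a maximum a posteriori labelling under inter-activity constraints concrete, the following is a minimal sketch and not the paper's algorithm. It replaces the AMG-derived Bayesian network with brute-force enumeration over a toy association problem. The detections, the likelihood values, the uniform prior, and the names `feasible` and `map_labelling` are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical detections (not taken from the paper).
people = ["p1", "p2"]
objects = ["bag_a", "bag_b"]

# Assumed likelihood of the observations given that object o is carried by
# person p, e.g. as might be derived from spatial proximity in the video.
likelihood = {
    ("bag_a", "p1"): 0.8, ("bag_a", "p2"): 0.3,
    ("bag_b", "p1"): 0.6, ("bag_b", "p2"): 0.5,
}

def feasible(labelling):
    """Assumed global constraint: no person carries more than one object."""
    owners = list(labelling.values())
    return len(owners) == len(set(owners))

def map_labelling():
    """Enumerate all labellings; return the feasible one with the highest score.

    With a uniform prior over feasible labellings (an assumption), the
    posterior is proportional to the product of the likelihood terms.
    """
    best, best_score = None, 0.0
    for owners in product(people, repeat=len(objects)):
        labelling = dict(zip(objects, owners))
        if not feasible(labelling):
            continue  # discard explanations that violate the constraint
        score = 1.0
        for obj, person in labelling.items():
            score *= likelihood[(obj, person)]
        if score > best_score:
            best, best_score = labelling, score
    return best, best_score

best, score = map_labelling()
print(best, round(score, 3))
```

In this toy setting, choosing the most likely owner for each object independently would assign both bags to `p1`, which the constraint forbids; only a search over global explanations yields a valid labelling. Exhaustive enumeration is used here purely for clarity and scales exponentially with the number of detections.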
Original language | English |
---|---|
Title of host publication | Proceedings of the British Machine Vision Conference 2009 |
DOIs | |
Publication status | Published - 2009 |