An outdoor video dataset annotated with action labels, collected from 29 participants wearing two head-mounted cameras (GoPro and SMI eye tracker) while assembling a camping tent. In total, this is over 7 hours of recordings. Tent assembly includes manual interactions with non-rigid objects such as spreading the tent, securing guylines, reading instructions, and opening a tent bag. An interesting aspect of the dataset is that it reflects participants' proficiency in completing or understanding the task. This leads to participant differences in action sequences and action durations. Our dataset also has several new types of annotations for two synchronised egocentric videos. These include task errors, self-rated uncertainty and gaze position, in addition to the task action labels.