In this paper, we focus on the simultaneous inference of transportation modes and human activities in daily life by modelling multivariate time series data streamed from off-the-shelf mobile sensors (e.g. those embedded in smartphones) in real-world dynamic environments. The transportation mode is inferred from the structured hierarchical contexts associated with human activities. Our mobile context recognition system thereby provides an accurate and robust means of inferring transportation mode, human activity and their associated contexts (e.g. whether the user is in a moving or stationary environment) simultaneously. Analysing and modelling human mobility patterns within urban areas is challenging because the environments of mobile users change constantly. For instance, a user could stay at a particular location and then travel to various destinations depending on the tasks they carry out within a day. In addition, location-based sensors (e.g. GPS) consume a significant amount of energy on smart devices, so intelligent mobile sensing (i.e. automatic inference of transportation mode, human activity and associated contexts) needs to reduce its reliance on them. Even on data streamed from low-energy sensors, our system can outperform the simplistic approach that classifies each context label set independently.
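To illustrate the distinction between independent classification of each label set and inference over structured hierarchical contexts, the following minimal sketch may help. It is not the system described in this paper: the label names, the set of consistent label combinations (`VALID`), the per-label scores and both decoding functions are hypothetical and chosen only for illustration. Independent decoding picks the best label in each set separately, whereas the hierarchical variant picks the best-scoring combination among those the assumed hierarchy allows.

```python
# Hypothetical hierarchy (illustrative only, not the paper's taxonomy):
# the (environment, activity, transportation mode) triples assumed consistent.
VALID = {
    ("stationary", "sitting", "none"),
    ("stationary", "standing", "none"),
    ("stationary", "walking", "on_foot"),
    ("moving", "sitting", "bus"),
    ("moving", "standing", "bus"),
}


def independent_decode(scores):
    """Pick the highest-scoring label in each label set separately."""
    return tuple(max(s, key=s.get) for s in scores)


def hierarchical_decode(scores):
    """Pick the highest-scoring triple among those the hierarchy allows.

    The joint score is taken as the product of the per-label scores,
    which is an assumption made purely for this sketch.
    """
    env_s, act_s, mode_s = scores
    return max(VALID, key=lambda t: env_s[t[0]] * act_s[t[1]] * mode_s[t[2]])


# Made-up per-label scores for one time window (not real results).
scores = (
    {"stationary": 0.4, "moving": 0.6},                 # environment
    {"sitting": 0.7, "standing": 0.2, "walking": 0.1},  # activity
    {"none": 0.5, "on_foot": 0.1, "bus": 0.4},          # transportation mode
)

print(independent_decode(scores))   # ('moving', 'sitting', 'none')
print(hierarchical_decode(scores))  # ('moving', 'sitting', 'bus')
```

In this toy case the independent result places the user in a moving environment with no transportation mode, a combination the assumed hierarchy rules out, while the hierarchical result stays consistent across all three label sets.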