Abstract
We address the problem of efficiently learning to recognize new objects on the fly. With CNNs, the typical approach to learning new objects is to fine-tune the model. However, this approach assumes that the original training set is available and requires high-end computational resources to train efficiently on the ever-growing dataset, which can be infeasible for robots with limited hardware. To overcome these limitations, we propose a new architecture that: 1) instead of predicting labels, learns to generate discriminative and separable embeddings of an object’s viewpoints using a Supervised Triplet Loss, which is easier to implement than current smart mining techniques and yields a trained model that can be applied to unseen objects; and 2) infers an object’s identity efficiently using a lightweight classifier in the feature embedding space, which keeps inference time on the order of milliseconds and can be retrained efficiently when new objects are learned. We evaluate our approach on four real-world image datasets used in Robotics and Computer Vision applications: Amazon Robotics Challenge 2017 by MIT-Princeton, T-LESS, ToyBoX, and CORe50.
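The two components described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: `triplet_margin_loss` is the standard triplet margin formulation (the exact form of the Supervised Triplet Loss is not given here), and `NearestCentroidClassifier` is a hypothetical stand-in for the lightweight classifier, whose type the abstract does not specify. The margin value and all names are assumptions made for illustration.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_margin_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss for one (anchor, positive, negative) triple.

    The loss is zero once the anchor is closer to the positive (same object)
    than to the negative (different object) by at least `margin`.
    The margin of 0.2 is an assumed value, not one taken from the paper.
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)


class NearestCentroidClassifier:
    """Hypothetical lightweight classifier operating in the embedding space."""

    def __init__(self):
        self.centroids = {}  # label -> mean embedding of that object's views

    def add_object(self, label, embeddings):
        # Learning a new object only stores its mean embedding;
        # the embedding network itself is left untouched.
        n = len(embeddings)
        self.centroids[label] = [sum(column) / n for column in zip(*embeddings)]

    def predict(self, embedding):
        # Return the label whose centroid is closest to the query embedding.
        return min(
            self.centroids,
            key=lambda label: squared_distance(embedding, self.centroids[label]),
        )


# Usage with made-up 2-D embeddings; in practice these would come from the trained CNN.
clf = NearestCentroidClassifier()
clf.add_object("mug", [[0.0, 1.0], [0.2, 0.8]])
clf.add_object("box", [[1.0, 0.0], [0.8, 0.2]])
clf.add_object("toy", [[-1.0, -1.0]])  # a new object added later, without retraining
print(clf.predict([0.1, 0.9]))
```

In this sketch, adding an object costs one pass over its embeddings and prediction costs one distance computation per known object, which is the kind of cheap update and inference the abstract attributes to the classifier stage.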
Original language | English |
---|---|
Publication status | Published - 24 May 2019 |
Event | 2019 IEEE International Conference on Robotics and Automation (ICRA 2019), Montreal, Canada. Duration: 20 May 2019 → 24 May 2019. https://www.icra2019.org/ |
Conference
Conference | 2019 IEEE International Conference on Robotics and Automation (ICRA 2019) |
---|---|
Abbreviated title | ICRA2019 |
Country/Territory | Canada |
City | Montreal |
Period | 20/05/19 → 24/05/19 |
Internet address | https://www.icra2019.org/ |