We address the problem of learning to recognize new objects on-the-fly efficiently. When using CNNs, a typical approach for learning new objects is to fine-tune the model. However, this approach relies on the assumption that the original training set is available, and it requires high-end computational resources to train on the ever-growing dataset efficiently, which can be unfeasible for robots with limited hardware. To overcome these limitations, we propose a new architecture that: 1) instead of predicting labels, learns to generate discriminative and separable embeddings of an object's viewpoints using a Supervised Triplet Loss, which is easier to implement than current smart-mining techniques, and the trained model can be applied to unseen objects; 2) infers an object's identity efficiently by utilizing a lightweight classifier in the embedding space, which keeps the inference time on the order of milliseconds and allows the classifier to be retrained efficiently when new objects are learned. We evaluate our approach on four real-world image datasets used for Robotics and Computer Vision applications: the Amazon Robotics Challenge 2017 by MIT-Princeton, T-LESS, ToyBoX, and CORe50 datasets.
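The two ideas in the abstract can be illustrated with a minimal NumPy sketch: a triplet loss that pulls embeddings of the same object's viewpoints together while pushing different objects apart by a margin, and a lightweight nearest-neighbour classifier operating in the embedding space. This is an illustrative sketch, not the paper's implementation; the function names, the margin value, and the choice of 1-NN as the "lightweight classifier" are assumptions made here for clarity.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Supervised triplet loss on embedding vectors: the distance to a
    # viewpoint of the same object (positive) should be smaller than the
    # distance to a different object (negative) by at least `margin`.
    # margin=0.2 is an illustrative choice, not the paper's value.
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(d_pos - d_neg + margin, 0.0)

def classify(query_emb, stored_embs, stored_labels):
    # Lightweight classifier in the embedding space (1-nearest neighbour
    # here, as an assumption): learning a new object only requires
    # appending its embeddings and labels, with no CNN retraining, so
    # inference stays cheap.
    dists = np.linalg.norm(stored_embs - query_emb, axis=1)
    return stored_labels[int(np.argmin(dists))]
```

For example, with a well-separated triplet the loss is zero, and a query embedding near the stored views of one object is assigned that object's label; adding an unseen object at run time is just a matter of extending `stored_embs` and `stored_labels`.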
Original language: English
Publication status: Published - 24 May 2019
Event: 2019 IEEE International Conference on Robotics and Automation (ICRA 2019) - Montreal, Canada
Duration: 20 May 2019 - 24 May 2019


Conference: 2019 IEEE International Conference on Robotics and Automation (ICRA 2019)
Abbreviated title: ICRA2019

