Prof. Jenq-Neng Hwang (黄正能): Open Long-Tailed Recognition (OLTR) in Deep Learning Tasks

December 21, 9:30, online

Posted by: 韦钰 | Date: 2020-12-08

Title: Open Long-Tailed Recognition (OLTR) in Deep Learning Tasks

Speaker: Prof. Jenq-Neng Hwang (黄正能)

Time: December 21, 9:30

Format: Online (Tencent Meeting: 329 808 289)


Abstract

Real-world data often follow a long-tailed, open-ended distribution. A practical recognition system must classify among majority and minority classes, generalize from a few known instances, and acknowledge novelty when presented with a never-seen instance. Open Long-Tailed Recognition (OLTR) with a deep learning architecture should be optimized for classification accuracy over a balanced test set that includes head, tail, and open classes; that is, OLTR must handle imbalanced classification, few-shot learning, and open-set recognition in one integrated algorithm, whereas existing classification approaches focus on only one aspect and deliver poor results over the entire class spectrum. We propose an OLTR deep learning platform that uses a re-balanced sampling strategy to improve the recognition accuracy of tail classes without degrading the accuracy of head classes. To simultaneously detect the open classes, the embedding features are metric-learned with an auto-encoder architecture, and the dimensionality-reduced features are screened by an innovative adaptive outlier factor (AOF) algorithm, an unsupervised anomaly detection method, to predict the open classes.
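
The paragraph above names two mechanisms: re-balanced sampling for tail classes and unsupervised outlier detection on learned embeddings for open classes. The Python sketch below illustrates both under stated assumptions: inverse-class-frequency weighting with PyTorch's WeightedRandomSampler stands in for the talk's re-balanced sampling strategy, and scikit-learn's LocalOutlierFactor stands in for the AOF algorithm, whose exact formulation is not given here; the toy dataset and all parameter choices are illustrative.

# Minimal sketch of two ideas from the abstract (illustrative choices only):
# (1) re-balanced sampling via inverse class frequency, and
# (2) unsupervised outlier scoring on embedding features to flag open classes.
# LocalOutlierFactor is a stand-in for the talk's AOF algorithm, not AOF itself.
from collections import Counter

import torch
from sklearn.neighbors import LocalOutlierFactor
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Toy long-tailed training set: class 0 is a head class, class 2 a tail class.
labels = torch.tensor([0] * 900 + [1] * 90 + [2] * 10)
features = torch.randn(len(labels), 16)
dataset = TensorDataset(features, labels)

# (1) Weight each sample by the inverse frequency of its class, so batches
# are drawn roughly uniformly over classes regardless of class size.
counts = Counter(labels.tolist())
weights = torch.tensor([1.0 / counts[int(y)] for y in labels], dtype=torch.double)
sampler = WeightedRandomSampler(weights, num_samples=len(labels), replacement=True)
loader = DataLoader(dataset, batch_size=64, sampler=sampler)
batch_x, batch_y = next(iter(loader))
print("per-class counts in one batch:", torch.bincount(batch_y).tolist())

# (2) Fit an unsupervised outlier detector on the training features; in the
# talk this would run on the dimensionality-reduced auto-encoder embeddings.
detector = LocalOutlierFactor(n_neighbors=20, novelty=True)
detector.fit(features.numpy())

# Samples far from the training distribution are predicted as -1 (open class).
open_candidates = torch.randn(5, 16) * 5.0  # shifted/scaled, likely outliers
print("open-class predictions:", detector.predict(open_candidates.numpy()))

In the actual platform, the outlier detector operates on metric-learned, dimensionality-reduced embeddings rather than raw features as above.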




Multiple object tracking (MOT) and video object segmentation (VOS) are crucial tasks in the computer vision community, and further gains can be achieved by effectively combining the two into a single task, i.e., multiple object tracking and segmentation (MOTS). However, most tracking-by-detection MOT methods, even with detected bounding boxes available, cannot effectively handle static, slow-moving, and fast-moving camera scenarios simultaneously due to ego-motion and frequent occlusion. In this work, we propose a novel tracking framework, called "instance-aware MOT" (IAMOT), that can track multiple objects with either static or moving cameras by jointly considering instance-level features and object motions. When evaluated on the MOTS20 and KITTI-MOTS datasets, our proposed method won first place in Track 3 of the BMTT Challenge at the IEEE CVPR 2020 workshop. When Lidar information is available, we further propose a multi-stage framework called "Lidar and monocular Image Fusion based multi-object Tracking and Segmentation" (LIFTS). This framework was also evaluated on the BMTT Challenge 2020 Track 2 (KITTI-MOTS) dataset and achieved second place in the competition.
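
As a rough illustration of "jointly considering instance-level features and object motions", the sketch below performs one generic tracking-by-detection association step, combining a box-overlap (motion) cost with an embedding-similarity (appearance) cost and solving the assignment with the Hungarian algorithm via SciPy. This is not the IAMOT or LIFTS implementation; the equal cost weighting and the toy inputs are assumptions.

# Generic motion + appearance association step (NOT the IAMOT implementation;
# the 50/50 cost weighting and the IoU/cosine terms are illustrative).
import numpy as np
from scipy.optimize import linear_sum_assignment


def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)


def associate(track_boxes, track_embs, det_boxes, det_embs, w_motion=0.5):
    """Match tracks to detections by a weighted motion + appearance cost."""
    n_t, n_d = len(track_boxes), len(det_boxes)
    cost = np.zeros((n_t, n_d))
    for i in range(n_t):
        for j in range(n_d):
            motion_sim = iou(track_boxes[i], det_boxes[j])
            app_sim = np.dot(track_embs[i], det_embs[j]) / (
                np.linalg.norm(track_embs[i]) * np.linalg.norm(det_embs[j]) + 1e-9
            )
            cost[i, j] = -(w_motion * motion_sim + (1 - w_motion) * app_sim)
    rows, cols = linear_sum_assignment(cost)  # Hungarian matching
    return list(zip(rows.tolist(), cols.tolist()))


# Toy example: two tracks, two detections with slightly shifted boxes.
tracks = np.array([[0, 0, 10, 10], [20, 20, 30, 30]], dtype=float)
dets = np.array([[21, 20, 31, 30], [1, 0, 11, 10]], dtype=float)
t_embs = np.array([[1.0, 0.0], [0.0, 1.0]])
d_embs = np.array([[0.1, 1.0], [1.0, 0.1]])
print(associate(tracks, t_embs, dets, d_embs))  # [(0, 1), (1, 0)]

In a full tracker this step would run per frame, with unmatched detections spawning new tracks and unmatched tracks aged out; motion would typically come from a predictive model rather than raw previous-frame boxes.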