IEEE IV 2023 Tutorial Proposal

Title: A holistic view of perception in Intelligent Vehicles – Data Collection, Interpretation, and Prediction

Duration: Full day (6 hours)

Presenters: Ghassan AlRegib and Mohit Prabhushankar

(Georgia Institute of Technology)

Tutorial Description

The goal of this tutorial is to introduce and expand on the challenges, and potential solutions, for machine learning-based perception algorithms in the field of intelligent vehicles. These challenges begin with the data collection process itself. Given the threat of datasets being repurposed for unintended applications (https://exposing.ai/duke_mtmc/), it is imperative to follow best practices that take data privacy and fairness into account. Moreover, as datasets grow in size, the logistics of labeling need to be considered. We introduce and tackle this challenge through an active learning setting in which labelers work in conjunction with models to label an optimal subset of the data. These topics are covered in Part 1 of the tutorial.

Part 2 of the tutorial deals with model training itself. We discuss state-of-the-art methods that provide perception solutions for object detection and segmentation. However, these methods are insufficient in the safety-critical applications of intelligent vehicles. Specifically, deep learning-based methods suffer from robustness, calibration, and adversarial attack issues that inhibit their deployment in all settings. We examine model training from the point of view of these inferential challenges and discuss recent potential solutions.

Part 3 of the tutorial deals with the deployment of models. Deployed models need to earn trust from a number of diverse stakeholders, including end users, government regulators, and insurers, among others. We discuss trust issues specific to perception and expand on explainability and behavioral prediction methods that quantify trust. We conclude with existing notions and technologies of safety and provide clues as to how they may expand in the future.
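The active learning setting mentioned above can be sketched in a few lines. The toy pool, the nearest-centroid "model", and the least-confidence acquisition function below are illustrative placeholders, not the methods covered in the tutorial; the point is only the loop structure, in which labelers annotate the small batch the model queries each round instead of the full pool.

```python
import numpy as np

rng = np.random.default_rng(0)

def centroid_probs(X_lab, y_lab, X_pool):
    # Toy "model": nearest-centroid classifier with a softmax over
    # negative distances, standing in for a deep perception network.
    classes = np.unique(y_lab)
    cents = np.stack([X_lab[y_lab == c].mean(axis=0) for c in classes])
    d = np.linalg.norm(X_pool[:, None, :] - cents[None, :, :], axis=2)
    e = np.exp(-d)
    return e / e.sum(axis=1, keepdims=True)

def select_batch(probs, k):
    # Least-confidence acquisition: query the k pool samples whose
    # top predicted probability is lowest (the model is most unsure).
    return np.argsort(probs.max(axis=1))[:k]

# Toy two-class pool; labelers annotate only the queried subset each round.
X_pool = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y_pool = np.array([0] * 50 + [1] * 50)
labeled = [0, 50]            # seed set: one labeled example per class
for _ in range(3):           # three labeling rounds with a budget of 5
    mask = np.setdiff1d(np.arange(len(X_pool)), labeled)
    probs = centroid_probs(X_pool[labeled], y_pool[labeled], X_pool[mask])
    labeled.extend(mask[select_batch(probs, 5)])
print(len(labeled))  # 17 examples labeled instead of all 100
```

Cost-aware variants, also covered in Part 1, additionally weight the acquisition score by the per-sample annotation cost.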

A succinct outline of the tutorial is below:

Part 1: Data Collection and Labeling

    • Introduction

    1. Data cycle in an ML pipeline

    2. Attacks on privacy, outputs, interpretability, and deployability

    • Types of data for intelligent vehicles

    1. Data based on sensor modality: RGB, RGB-D, LiDAR, ultrasound, etc.

    2. Data based on lens type: fisheye vs. ordinary wide-angle lenses

    3. Data based on sensor positioning: ego-vehicle vs. infrastructure data

    • Data collection

    1. Privacy, fairness, and transparency

    2. Best practices

    • Data labeling

    1. Labeling and quality assurance

    2. Sequence-wise labeling

    3. Active Learning

    4. Cost-aware Active Learning

    • Conclusions and take-away messages

Part 2: Model Training and Inference

    • Introduction

    1. Application-based model training

    • Challenges related to inference

    1. Robustness

    2. Anomaly detection

    3. Model regression

    4. Model calibration

    5. Model epistemic uncertainty

    • Potential solutions

    1. Data augmentations

    2. Other augmentations

    3. Positive-congruent training

    4. Contrastive Learning

    5. Introspective Learning

    • Conclusions and take-away messages
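As a concrete instance of the model calibration challenge listed above, a common diagnostic is the expected calibration error (ECE), which bins predictions by confidence and compares each bin's average confidence to its empirical accuracy. The function below is a minimal sketch of that standard metric, with hypothetical toy inputs; it is not code from the tutorial itself.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    # ECE: bin predictions by top-class confidence, then take the
    # bin-weighted average gap between mean confidence and accuracy.
    # A well-calibrated model has ECE close to 0.
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(confidences[in_bin].mean() - correct[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

# An overconfident toy model: 90% reported confidence, 50% accuracy.
conf = np.full(100, 0.9)
acc = np.array([1.0, 0.0] * 50)
print(round(expected_calibration_error(conf, acc), 2))  # 0.4
```

Overconfidence of this kind is precisely what inhibits deployment in safety-critical settings: the perception stack reports near-certainty while being frequently wrong.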

Part 3: Deployment, Trust, and Safety

    • Trust in deployed perception models

    1. Categorization based on users: Direct, Indirect, and Targeted trust

    • Applications of trust

    1. Explainability

    2. Predictive uncertainty estimation

    3. Behavioral prediction

    • Safety

    1. Regulatory approach to safety

    2. Existing technologies

      1. Driver Assistance Technologies

      2. Autonomous Vehicles

    3. Future research in safety

    • Conclusions and take-away messages
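One minimal illustration of the predictive uncertainty estimation listed above is the Shannon entropy of a model's softmax output: a peaked distribution yields low entropy, an ambiguous one yields high entropy. The snippet below is a hedged sketch of that baseline notion only; the tutorial covers richer estimators, and the class probabilities here are invented for illustration.

```python
import math

def predictive_entropy(probs):
    # Shannon entropy of the softmax output, in nats; higher values
    # indicate the model is less certain about its prediction.
    return -sum(p * math.log(p) for p in probs if p > 0)

# A confident detection vs. an ambiguous one (e.g. pedestrian vs. cyclist).
confident = [0.98, 0.01, 0.01]
ambiguous = [0.34, 0.33, 0.33]
print(predictive_entropy(confident) < predictive_entropy(ambiguous))  # True
```

Such per-prediction uncertainty scores are one ingredient a deployed system can expose to the stakeholders discussed above when deciding whether to trust a perception output.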

Tutorial Relevance

While recognition of centered objects in the ImageNet database has exceeded human performance, this capability does not transfer to more complicated settings such as driving in urban scenarios. The relevance of each part of the tutorial is described below:

  1. Part 1: Data collection is an integral part of building intelligent vehicles. Recently, however, the repurposing of such data for biometric surveillance and re-identification has created an active humanitarian crisis, leading to the retraction of some public-domain datasets (https://exposing.ai/duke_mtmc/). We describe the challenges associated with collecting such data and detail best practices.

  2. Part 2: Model training is traditionally viewed through the lens of an application. For perception, this application is usually object detection. However, for autonomous vehicles, it is essential to account for aberrant and anomalous situations during inference. Safety is paramount, and contingencies must be accounted for. Recent mishaps involving Tesla and Uber vehicles have demonstrated the need for model training from an inferential point of view.

  3. Part 3: Deployed models require interpretability for a range of stakeholders: end users, design engineers, insurers, and governmental regulators, among others. Hence, a single one-shot causal explanation is not practical. Recently, contextually relevant explanations in the form of counterfactual and contrastive explanations, as well as the prediction of pedestrian behavior, have gained prominence in the research community. This is in addition to the safety and privacy standards set by governmental agencies. We discuss these topics in Part 3 of the tutorial.

Expected Audience:

This tutorial is intended for PhD students, professors, researchers, and engineers working on topics related to intelligent vehicles.

Recent Relevant Publications

  1. G. AlRegib and M. Prabhushankar, "Explanatory Paradigms in Neural Networks: Towards Relevant and Contextual Explanations," in IEEE Signal Processing Magazine, Special Issue on Explainability in Data Science, Feb. 18 2022. [PDF][Code]

  2. M. Prabhushankar, K. Kokilepersaud*, Y. Logan*, S. Trejo Corona*, G. AlRegib, C. Wykoff, "OLIVES Dataset: Ophthalmic Labels for Investigating Visual Eye Semantics," in Advances in Neural Information Processing Systems (NeurIPS 2022) Track on Datasets and Benchmarks, New Orleans, LA, Nov. 29 - Dec. 1 2022. [PDF][Code]

  3. M. Prabhushankar and G. AlRegib, "Introspective Learning: A Two-Stage Approach for Inference in Neural Networks," in Advances in Neural Information Processing Systems (NeurIPS), New Orleans, LA, Nov. 29 - Dec. 1 2022. [PDF][Code]

  4. C. Zhou, G. AlRegib, A. Parchami, and K. Singh, “Learning Trajectory-Conditioned Relations to Predict Pedestrian Crossing Behavior,” in IEEE International Conference on Image Processing (ICIP), Bordeaux, France, Oct. 16-19 2022. [PDF][Code]

  5. R. Benkert, M. Prabhushankar, and G. AlRegib, “Forgetful Active Learning With Switch Events: Efficient Sampling for Out-of-Distribution Data,” in IEEE International Conference on Image Processing (ICIP), Bordeaux, France, Oct. 16-19 2022. [PDF]

  6. Y. Logan, R. Benkert, A. Mustafa, G. Kwon, G. AlRegib, "Patient Aware Active Learning for Fine-Grained OCT Classification," in IEEE International Conference on Image Processing (ICIP), Bordeaux, France, Oct. 16-19 2022. [PDF][Code]

  7. R. Benkert, M. Prabhushankar, G. AlRegib, A. Parchami, and E. Corona, "Gaussian Switch Sampling: A Second Order Approach to Active Learning," in IEEE Transactions on Artificial Intelligence (TAI), Feb. 05 2023. [PDF][Code]

  8. G. Kwon, M. Prabhushankar, D. Temel, and G. AlRegib, "Backpropagated Gradient Representations for Anomaly Detection," in Proceedings of the European Conference on Computer Vision (ECCV), SEC, Glasgow, Aug. 23-28 2020. [PDF][Code][Link]

  9. D. Temel, G. Kwon*, M. Prabhushankar*, and G. AlRegib, "CURE-TSR: Challenging Unreal and Real Environments for Traffic Sign Recognition," in Advances in Neural Information Processing Systems (NIPS) Workshop on Machine Learning for Intelligent Transportation Systems, Long Beach, CA, Dec. 2017 [PDF][Code]

  10. D. Temel, M-H. Chen, and G. AlRegib, "Traffic Sign Detection Under Challenging Conditions: A Deeper Look Into Performance Variations and Spectral Characteristics," in IEEE Transactions on Intelligent Transportation Systems, Jul. 2019. [PDF][Code]


Previous Edition: This will be the first time this tutorial is presented.

Presentation Materials: All materials will be shared with the event attendees.


Presenters’ contact information and short biography

Dr. Ghassan AlRegib

Ghassan AlRegib (alregib@gatech.edu) is currently the John and Marilu McCarty Chair Professor in the School of Electrical and Computer Engineering at the Georgia Institute of Technology. He received the ECE Outstanding Graduate Teaching Award in 2001, both the CSIP Research and CSIP Service Awards in 2003, the ECE Outstanding Junior Faculty Member Award in 2008, and the 2017 Denning Faculty Award for Global Engagement. His research group, the Omni Lab for Intelligent Visual Engineering and Science (OLIVES), works on research projects related to machine learning, image and video processing and understanding, seismic interpretation, machine learning for ophthalmology, robustness, large-scale dataset creation, and deployable ML. The group has created more than 11 large-scale datasets. He has participated in several service activities within the IEEE, including serving as Technical Program co-Chair for ICIP 2020 and GlobalSIP 2014.

Dr. Mohit Prabhushankar

Mohit Prabhushankar (mohit.p@gatech.edu) received his Ph.D. degree in electrical engineering from the Georgia Institute of Technology (Georgia Tech), Atlanta, Georgia, USA, in 2021. He is currently a Postdoctoral Fellow in the School of Electrical and Computer Engineering at Georgia Tech, in the Omni Lab for Intelligent Visual Engineering and Science (OLIVES). He works in the fields of image processing, machine learning, active learning, healthcare, and robust and explainable AI. He received the Best Paper Award at ICIP 2019 and the Top Viewed Special Session Paper Award at ICIP 2020. He is also the recipient of the ECE Outstanding Graduate Teaching Award, the CSIP Research Award, and the Roger P. Webb ECE Graduate Research Assistant Excellence Award, all in 2022. He has participated in creating five large-scale datasets.