Multi-Modal Learning Using Physicians Diagnostics for Optical Coherence Tomography Classification

Personnel: Yash-yee Logan, Kiran Kokilepersaud, Gukyeong Kwon, Ghassan AlRegib, Charles Wykoff, Hannah Yu

Goal: To incorporate expert diagnostics and insights into the analysis of Optical Coherence Tomography (OCT) using multi-modal learning.

Challenges: Existing approaches to OCT classification typically rely on transfer learning. Although transfer learning often outperforms medical experts, these models rarely transition from research into real clinical settings. This is because transfer learning models make decisions based on internal weights pretrained on unrelated, non-medical data and never directly optimized for OCT. This lack of medical intuition behind the decision-making process inspires a lack of trust from the medical community.

Our Work: In [1], we argue that injecting ophthalmological assessments as an additional source of supervision is essential for a learning framework to perform accurate and interpretable OCT classification. We demonstrate the proposed framework through comprehensive experiments comparing the effectiveness of combining diagnostic attribute features with latent visual representations, and show that this combination surpasses the state-of-the-art approach.
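As a concrete illustration of the fusion idea, the sketch below concatenates a vector of physician-graded diagnostic attributes with the latent representation of an OCT scan before classification. This is a minimal PyTorch sketch under assumed dimensions (a 128-d latent, 6 attributes, 3 disease classes), not the exact architecture from [1].

```python
# Minimal sketch of fusing physician diagnostics with latent visual features.
# All layer sizes, attribute counts, and class counts are illustrative assumptions.
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    def __init__(self, latent_dim=128, num_attributes=6, num_classes=3):
        super().__init__()
        # Convolutional encoder producing a latent visual representation.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, latent_dim),
        )
        # Classifier over the concatenated visual + diagnostic features.
        self.classifier = nn.Sequential(
            nn.Linear(latent_dim + num_attributes, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, oct_image, attributes):
        z = self.encoder(oct_image)                # latent visual features
        fused = torch.cat([z, attributes], dim=1)  # inject expert diagnostics
        return self.classifier(fused)

# Usage: a batch of 4 single-channel OCT B-scans with 6 attribute scores each.
model = FusionClassifier()
logits = model(torch.randn(4, 1, 224, 224), torch.rand(4, 6))
```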

Architectures for OCT classification: (A) Single-stream autoencoder; (B) Dual-stream autoencoder.
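The dual-stream variant can be pictured as two autoencoders, one per modality, whose latents are joined for classification. Below is a hedged sketch assuming fully connected encoders, a 224x224 input, and illustrative latent sizes; the actual depths, losses, and dimensions in [1] may differ.

```python
# Hedged sketch of a dual-stream autoencoder: separate encoder/decoder pairs
# for the OCT image and the diagnostic attribute vector, with both latents
# concatenated for classification. Everything here is an illustrative assumption.
import torch
import torch.nn as nn

class DualStreamAutoencoder(nn.Module):
    def __init__(self, img_latent=128, attr_dim=6, attr_latent=16, num_classes=3):
        super().__init__()
        # Image stream: encoder + decoder for image reconstruction.
        self.img_enc = nn.Sequential(
            nn.Flatten(), nn.Linear(224 * 224, 512), nn.ReLU(),
            nn.Linear(512, img_latent),
        )
        self.img_dec = nn.Sequential(
            nn.Linear(img_latent, 512), nn.ReLU(),
            nn.Linear(512, 224 * 224),
        )
        # Attribute stream: a small autoencoder over physician diagnostics.
        self.attr_enc = nn.Sequential(nn.Linear(attr_dim, attr_latent), nn.ReLU())
        self.attr_dec = nn.Linear(attr_latent, attr_dim)
        # Classifier over the concatenated latents of both streams.
        self.classifier = nn.Linear(img_latent + attr_latent, num_classes)

    def forward(self, image, attributes):
        z_img = self.img_enc(image)
        z_attr = self.attr_enc(attributes)
        recon_img = self.img_dec(z_img).view_as(image)
        recon_attr = self.attr_dec(z_attr)
        logits = self.classifier(torch.cat([z_img, z_attr], dim=1))
        return logits, recon_img, recon_attr
```

Training such a model would typically combine a classification loss with reconstruction losses on both streams, e.g. cross-entropy on the logits plus mean-squared error on the reconstructed image and attributes; the single-stream variant (A) keeps only the image autoencoder.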

References:

  1. Y. Logan, K. Kokilepersaud, G. Kwon, G. AlRegib, C. Wykoff, and H. Yu, "Multi-Modal Learning Using Physicians Diagnostics for Optical Coherence Tomography Classification," in IEEE International Symposium on Biomedical Imaging (ISBI), Kolkata, India, 2022.