Biomedical Applications of Contrastive Learning

Personnel: Kiran Kokilepersaud, Mohit Prabhushankar

Goal: Contrastive learning approaches allow unlabeled data to be incorporated into machine learning training setups that traditionally require full access to labels. They accomplish this by pulling the representations of similar data points together and pushing the representations of dissimilar data points apart. Our area of research is how to leverage domain-specific information that may be important within a medical setting.
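As an illustration of the pull-together, push-apart objective described above, the following is a minimal sketch of an InfoNCE-style loss for a single anchor. It is a generic textbook formulation, not the loss used in the referenced papers; the function names, the cosine similarity measure, and the temperature value are assumptions made for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor embedding (illustrative only).

    The loss is low when the anchor is close to its positive and far
    from every negative. The temperature of 0.1 is an assumed default.
    """
    pos_logit = cosine_similarity(anchor, positive) / temperature
    logits = [pos_logit] + [
        cosine_similarity(anchor, n) / temperature for n in negatives
    ]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    max_logit = max(logits)
    log_denominator = max_logit + math.log(
        sum(math.exp(l - max_logit) for l in logits)
    )
    # Equivalent to -log(exp(pos_logit) / sum(exp(logits))).
    return log_denominator - pos_logit
```

With an anchor of `[1.0, 0.0]`, a positive pointing in the same direction gives a much smaller loss than a positive pointing in the opposite direction, which is the behavior the objective is meant to reward.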

Challenges: Current ideas in contrastive learning are rooted in the natural image domain and do not consider the contexts that are relevant in medical settings. For example, medical data often comes with associated clinical information as well as a distribution of disease severity. However, current approaches rely on ad-hoc augmentations to create similar pairs of data, without using this potentially important medical context. The central challenge is how to exploit correlations in clinical data and in disease severity within a contrastive learning workflow.

Our Work:

[1] addresses a common medical setting in which clinical information obtained during a patient visit is available alongside the disease biomarkers that we want to train a model to detect. We describe a contrastive learning setup that makes use of this abundant clinical information to detect key biomarkers of disease. [2] describes a method that leverages the severity of disease in an optical coherence tomography (OCT) scan within a contrastive learning setup. The intuition is that if we can estimate severity, then we can group images by severity, which yields a novel positive pair selection strategy for contrastive learning. The OLIVES dataset, listed under Datasets Utilized, contains the data needed to run the experiments for [1] and [2].
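The positive pair selection idea shared by [1] and [2] can be sketched as follows: scans that share a clinical value, or that fall into the same severity group, are treated as positives for one another. This is only an illustration under stated assumptions. The equal-width binning, the function names, and all values below are hypothetical and are not taken from the papers, which describe their own ways of obtaining clinical labels and severity estimates.

```python
def severity_bins(scores, num_bins):
    """Assign each severity score to one of num_bins equal-width bins.

    Equal-width binning is an assumption made for this sketch; any
    grouping of the estimated severity scores could be substituted.
    """
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / num_bins
    if width == 0:
        # All scores are identical, so everything lands in bin 0.
        return [0 for _ in scores]
    # Clamp so the maximum score falls in the last bin.
    return [min(int((s - lo) / width), num_bins - 1) for s in scores]


def positive_pairs(group_ids):
    """Return index pairs (i, j) with i < j that share a group id."""
    n = len(group_ids)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if group_ids[i] == group_ids[j]
    ]


# Hypothetical clinical value recorded for each of four scans.
clinical_values = [20, 35, 20, 35]
clinical_positives = positive_pairs(clinical_values)

# Hypothetical estimated severity score for each of five scans.
severity_scores = [0.0, 0.1, 0.5, 0.9, 1.0]
severity_positives = positive_pairs(severity_bins(severity_scores, 2))
```

Here `clinical_positives` pairs scans 0 and 2 and scans 1 and 3, while `severity_positives` pairs the two low-severity scans with each other and the three high-severity scans with each other. Those pairs would then take the place of augmentation-based positives in a contrastive loss.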

References:

  1. K. Kokilepersaud, S. Trejo Corona, M. Prabhushankar, G. AlRegib, C. Wykoff, "Clinically Labeled Contrastive Learning for OCT Biomarker Classification," IEEE Journal of Biomedical and Health Informatics, submitted on Jun. 23 2022.

  2. K. Kokilepersaud, M. Prabhushankar, G. AlRegib, S. Trejo Corona, C. Wykoff, "Gradient Based Labeling for Biomarker Classification in OCT," in IEEE International Conference on Image Processing (ICIP), Bordeaux, France, Oct. 

Datasets Utilized:

  1. M. Prabhushankar, K. P. Kokilepersaud, Y. Y. Logan, S. Trejo Corona, G. AlRegib, C. Wykoff, "OLIVES Dataset: Ophthalmic Labels for Investigating Visual Eye Semantics," in Advances in Neural Information Processing Systems 35 (NeurIPS 2022), New Orleans, LA, Nov. 29 - Dec. 1 2022.