Contrastive Learning

Personnel: Kiran Kokilepersaud, Mohit Prabhushankar

Goal: Contrastive learning allows unlabeled data to be incorporated into training setups that traditionally require full access to labels. It accomplishes this by pulling the representations of similar data points together while pushing those of dissimilar data points apart. Our research focuses on how to leverage domain-specific information that is important to particular application areas.
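
As a concrete illustration, the following is a minimal sketch (assuming PyTorch; the function name and tensor shapes are illustrative only) of a standard NT-Xent-style contrastive loss, in which two augmented views of the same image form a positive pair and all other images in the batch act as negatives:

    # Minimal sketch of an NT-Xent contrastive loss (illustrative, not our released code).
    import torch
    import torch.nn.functional as F

    def nt_xent_loss(z1, z2, temperature=0.5):
        """z1, z2: (N, D) embeddings of two views; matching rows are positives."""
        z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
        z = torch.cat([z1, z2], dim=0)                  # (2N, D)
        sim = z @ z.T / temperature                     # cosine similarities
        n = z1.size(0)
        mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
        sim.masked_fill_(mask, float('-inf'))           # exclude self-similarity
        # the positive for row i is row i+N (and vice versa)
        targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
        return F.cross_entropy(sim, targets)

    # Example: embeddings of two augmented views of an 8-image batch
    loss = nt_xent_loss(torch.randn(8, 128), torch.randn(8, 128))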

Challenges: Current contrastive learning methods are rooted in the natural image domain and do not account for the contexts present in real-world application settings. For example, medical data often comes with associated clinical information, seismic data carries information about similar geophysical structures, and fisheye camera data has mathematical models that describe the nature of its distortion. Current approaches, however, rely on ad-hoc augmentations to create similar pairs of data, without utilizing context that may be important for these domains. The central question is how to incorporate these auxiliary types of information to better inform contrastive learning tasks.
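
To make the augmentation-based recipe concrete, the sketch below (assuming torchvision; the specific transforms are illustrative and not the exact pipelines from our papers) shows how current approaches typically form positive pairs by applying two independent augmentations to the same image, with no domain-specific context involved:

    # Standard "ad-hoc augmentation" positive-pair recipe (illustrative sketch).
    from torchvision import transforms

    augment = transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
        transforms.RandomGrayscale(p=0.2),
        transforms.ToTensor(),
    ])

    def two_views(image):
        # Two independently augmented views of the same image form a positive pair.
        return augment(image), augment(image)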

Our Work:

[1] describes a relevant medical setting in which clinical information collected during a patient visit is available alongside the disease biomarker data that we want to train a model to detect. We describe a contrastive learning setup that uses this abundant clinical information to detect key biomarkers of disease. [2] describes a method that leverages the severity of disease in an optical coherence tomography (OCT) scan for contrastive learning: if we can estimate severity, then we can cluster images by severity to form a novel positive-pair selection strategy. [3] describes a setting in the seismic domain where the volumetric positions of data are used to select similar instances of data. [4] is the dataset that provides the data needed for the experiments in [1] and [2].
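
The common thread in these works is replacing augmentation-only positives with positives chosen from auxiliary information. The following is a minimal, hypothetical sketch (assuming PyTorch; it is not the released code of [1]-[3]) of a supervised-contrastive-style loss in which samples sharing an auxiliary label, such as a binned clinical measurement or a position within a seismic volume, are treated as positives:

    # Sketch of auxiliary-label-driven positive selection in a SupCon-style loss.
    import torch
    import torch.nn.functional as F

    def aux_label_supcon_loss(z, aux_labels, temperature=0.1):
        """z: (N, D) embeddings; aux_labels: (N,) auxiliary group ids.
        Samples sharing an auxiliary label are treated as positives."""
        z = F.normalize(z, dim=1)
        sim = z @ z.T / temperature
        n = z.size(0)
        self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
        pos_mask = (aux_labels.unsqueeze(0) == aux_labels.unsqueeze(1)) & ~self_mask
        # log-probability of each candidate, excluding self-similarity from the denominator
        log_prob = sim - torch.logsumexp(sim.masked_fill(self_mask, float('-inf')),
                                         dim=1, keepdim=True)
        # average over each sample's positives, then over the batch
        pos_counts = pos_mask.sum(1).clamp(min=1)
        return -(log_prob * pos_mask).sum(1).div(pos_counts).mean()

    # Example: group OCT scans by a binned clinical value (hypothetical group ids)
    loss = aux_label_supcon_loss(torch.randn(8, 128), torch.randint(0, 3, (8,)))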

References:

  1. K. Kokilepersaud, S. Trejo Corona, M. Prabhushankar, G. AlRegib, C. Wykoff, "Clinically Labeled Contrastive Learning for OCT Biomarker Classification," IEEE Journal of Biomedical and Health Informatics, submitted on Jun. 23 2022.

  2. K. Kokilepersaud, M. Prabhushankar, G. AlRegib, S. Trejo Corona, C. Wykoff, "Gradient Based Labeling for Biomarker Classification in OCT," in IEEE International Conference on Image Processing (ICIP), Bordeaux, France, Oct. 16-19 2022.

  3. K. Kokilepersaud, M. Prabhushankar, and G. AlRegib, "Volumetric Supervised Contrastive Learning for Seismic Semantic Segmentation," in International Meeting for Applied Geoscience & Energy (IMAGE), Houston, TX, Aug. 28-Sept. 1 2022. [PDF][Code]

Datasets Utilized:

  4. M. Prabhushankar, K. Kokilepersaud, Y. Logan, S. Trejo Corona, G. AlRegib, and C. Wykoff, "OLIVES Dataset: Ophthalmic Labels for Investigating Visual Eye Semantics," in Advances in Neural Information Processing Systems 35 (NeurIPS 2022), New Orleans, LA, Nov. 29 - Dec. 1 2022. [PDF][Code]