Vision Language Model for Interpretable Medical Image Segmentation

  • Developed a novel approach that uses multi-modal vision-language models to extract semantic information from images and their text descriptions, enabling accurate segmentation of diverse medical images.
  • Conducted extensive evaluations of existing vision-language models on multiple datasets, assessing their applicability and transferability to the medical domain.
  • Explored how variations in image descriptions affect model performance, revealing how sensitive the model’s output is to different prompts.
Rabin Adhikari
Research Assistant

Currently researching semi-supervised and multi-modal learning, medical imaging, and natural language processing.