Hi, I am a computer engineering graduate from IOE, Pulchowk Campus, with a year of experience as a research assistant at NAAMII, a reputable research organization. My primary research interests lie in developing and applying machine learning techniques to solve practical problems, focusing on semi-supervised learning, multi-modal learning, and natural language processing.
At NAAMII, I gained valuable experience developing and implementing complex machine learning models for various applications, including natural language processing and medical image analysis. I have a strong foundation in algorithms, data structures, and mathematics, enabling me to create practical solutions to challenging problems. I am a self-motivated individual with strong self-management skills, and I am comfortable working independently or as part of a team. I am passionate about producing high-quality results that have a real impact on society.
As I seek to further my knowledge and skills in machine learning, I am eager to pursue a Ph.D. program that would allow me to delve deeper into these research areas. With my research experience and passion for creating practical solutions, I am confident I can make meaningful contributions as a research assistant.
Bachelor's in Computer Engineering, 2022
Tribhuvan University, Institute of Engineering, Pulchowk Campus
High School in Physical Sciences, 2017
SOS Hermann Gmeiner School Bharatpur, Bharatpur, Nepal
Supervisor: Bishesh Khanal, Ph.D.
Supervisor: Binod Bhattarai, Ph.D.
Medical image segmentation allows quantifying the size and shape of target structures, aiding in disease diagnosis, prognosis, surgical planning, and comprehension. Building upon recent advancements in foundation Vision-Language Models (VLMs) trained on natural image-text pairs, several studies have proposed adapting them into Vision-Language Segmentation Models (VLSMs) that accept language text as an additional input to segmentation models. Introducing auxiliary information via text with human-in-the-loop prompting during inference opens up unique opportunities, such as open-vocabulary segmentation and segmentation models that are potentially more robust to out-of-distribution data. Although transfer learning from natural to medical images has been explored for image-only segmentation models, the joint representation of vision and language in segmentation problems remains underexplored. This study introduces the first systematic study on transferring VLSMs to 2D medical images, using 11 carefully curated datasets encompassing diverse modalities, along with insightful language prompts and experiments. Our findings demonstrate that although VLSMs show performance competitive with image-only models after fine-tuning on limited medical image datasets, not all VLSMs utilize the additional information from language prompts, with image features playing a dominant role. While VLSMs exhibit enhanced performance in handling pooled datasets with diverse modalities and show potential robustness to domain shifts compared to conventional segmentation models, our results suggest that novel approaches are required to enable VLSMs to leverage the various auxiliary information available through language prompts. The code and datasets are available at https://github.com/naamiinepal/medvlsm.