EMNLP 2025

November 06, 2025

Suzhou, China


In clinical science, biomarkers are crucial indicators for early cancer detection, prognosis, and personalized treatment decisions. Despite their importance, extracting biomarkers and their levels from clinical text remains a complex and underexplored problem in natural language processing research. In this paper, we present BIOPSY, an end-to-end pipeline that integrates a domain-adapted biomarker entity recognition model, a relation extraction model that links biomarkers to their respective mutations, a biomarker-type classifier, and a tailored algorithm that captures biomarker expression levels. Evaluated on 5,000 real-world clinical texts, our system achieved an overall F1 score of 0.86 in the oncology domain and 0.87 in the neuroscience domain. These results indicate that the pipeline adapts across a range of clinical sources, including trial records, research papers, and medical notes, offering the first comprehensive solution for end-to-end, context-aware biomarker extraction and interpretation in clinical research.
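The four stages named in the abstract can be pictured with a minimal Python sketch. It is illustrative only: the lexicon, the type labels, the regular expressions, and every function name below are hypothetical stand-ins chosen for this example, not the BIOPSY models or the authors' level-extraction algorithm. The sketch also assumes each semicolon-separated clause describes a single biomarker, which real clinical text does not guarantee.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical toy lexicon mapping biomarker names to placeholder type labels.
# The actual system uses a trained NER model and a trained type classifier.
BIOMARKERS = {"EGFR": "genomic", "HER2": "protein", "PD-L1": "protein"}

# Toy patterns (assumptions): point mutations such as L858R, and levels
# written as percentages, IHC scores (1+ to 3+), or qualitative terms.
MUTATION_RE = re.compile(r"\b[A-Z]\d{2,4}[A-Z]\b")
LEVEL_RE = re.compile(
    r"\d+(?:\.\d+)?\s*%|\b[1-3]\+|\b(?:high|low|positive|negative)\b",
    re.IGNORECASE,
)


@dataclass
class BiomarkerRecord:
    name: str
    biomarker_type: str
    mutation: Optional[str]
    level: Optional[str]


def recognize_biomarkers(clause: str) -> List[str]:
    """Stage 1 stand-in: lexicon lookup instead of entity recognition."""
    found = []
    for name in BIOMARKERS:
        pattern = rf"(?<![\w-]){re.escape(name)}(?![\w-])"
        if re.search(pattern, clause):
            found.append(name)
    return found


def find_mutation(clause: str) -> Optional[str]:
    """Stage 2 stand-in: link the first mutation found in the same clause."""
    match = MUTATION_RE.search(clause)
    return match.group(0) if match else None


def find_level(clause: str) -> Optional[str]:
    """Stage 4 stand-in: capture the first expression level in the clause."""
    match = LEVEL_RE.search(clause)
    return match.group(0) if match else None


def extract(text: str) -> List[BiomarkerRecord]:
    """Run the toy stages clause by clause and collect structured records."""
    records = []
    for clause in re.split(r"[;\n]", text):
        for name in recognize_biomarkers(clause):
            records.append(
                BiomarkerRecord(
                    name=name,
                    biomarker_type=BIOMARKERS[name],  # Stage 3 stand-in
                    mutation=find_mutation(clause),
                    level=find_level(clause),
                )
            )
    return records


if __name__ == "__main__":
    note = "EGFR L858R mutation detected; PD-L1 expression 60%; HER2 3+"
    for record in extract(note):
        print(record)
```

The point of the sketch is the shape of the output: each mention becomes one record carrying the biomarker, its type, any linked mutation, and any expression level, which is the kind of structured result an end-to-end pipeline of this design would produce.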

