Quick Review Day 1. Accelerating Drug Repurposing with AI: The Role of Large Language Models in Hypothesis Validation

Accelerating Drug Repurposing with AI: The Role of Large Language Models in Hypothesis Validation

https://www.biorxiv.org/content/10.1101/2025.06.13.659527v1.full

1. Problem

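Drug repurposing can cut cost and development time, but its biggest limitation is how to reliably validate computationally proposed drug-disease associations. Experimental and clinical validation is too expensive and takes too long.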

2. Idea

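The idea is to use large language models (LLMs) to quickly check, based on the literature, whether a proposed drug-disease association is plausible.
In particular, the paper compares prompt engineering strategies (zero-shot, few-shot, chain-of-thought, and others) to evaluate which approach is most reliable.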

3. Data & Method

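Data:
- A sample of 30 cases drawn from the 21,968 drug-disease candidates produced by an existing pathway-based method (DREBIOP)
  - Premise: if two diseases share a specific biological pathway, a drug used for one disease may also be effective for the other. That is, if Disease 1 and Disease 2 share a pathway, a treatment for Disease 2 can be transferred as a candidate drug for Disease 1.
- 10 benchmark cases for comparison (repurposing cases established in the literature)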
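
The pathway-sharing premise above can be sketched in a few lines. This is a toy illustration only, not DREBIOP itself: the function name `propose_candidates`, the two input dictionaries, and all disease, drug, and pathway labels are hypothetical placeholders rather than curated data.

```python
from itertools import permutations


def propose_candidates(disease_pathways, disease_drugs):
    """Return (drug, target_disease, shared_pathways) tuples.

    For every ordered pair of diseases that share at least one pathway,
    each drug indicated for the source disease becomes a candidate for
    the target disease (unless it is already indicated for it).
    """
    candidates = []
    for source, target in permutations(disease_pathways, 2):
        shared = disease_pathways[source] & disease_pathways[target]
        if not shared:
            continue  # no common pathway, so nothing to transfer
        for drug in disease_drugs.get(source, ()):
            if drug in disease_drugs.get(target, ()):
                continue  # already a known treatment for the target
            candidates.append((drug, target, sorted(shared)))
    return candidates


# Placeholder inputs (assumptions for illustration only).
disease_pathways = {
    "disease_A": {"pathway_1", "pathway_2"},
    "disease_B": {"pathway_2", "pathway_3"},
    "disease_C": {"pathway_4"},
}
disease_drugs = {
    "disease_A": ["drug_x"],
    "disease_B": ["drug_y"],
    "disease_C": ["drug_z"],
}

candidates = propose_candidates(disease_pathways, disease_drugs)
for drug, disease, shared in candidates:
    print(f"{drug} -> {disease} via {shared}")
```

Candidates generated this way are only hypotheses, which is why the paper adds a separate validation step.
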
Models: GPT-4o, Claude-3, Gemini-2, DeepSeek
Method:
- Phase 1: 4 LLMs × 10 prompt strategies, evaluated on 10 cases
- Phase 2: only the best-performing prompts, run with GPT-4o and DeepSeek on the 30 sampled cases plus the 10 benchmark cases
Metrics: Accuracy, Precision, Recall, F1
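
For reference, the four metrics can be computed from the predicted labels as in the sketch below. It assumes 'Viable' is the positive class and that each case has a reference label. The function name `classification_metrics` and the example labels are made up for illustration and are not results from the paper.

```python
def classification_metrics(y_true, y_pred, positive="Viable"):
    """Compute accuracy, precision, recall and F1 for a binary labeling task."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 2 true positives, 1 false negative,
# 1 false positive, 1 true negative.
y_true = ["Viable", "Viable", "Viable", "Non-Viable", "Non-Viable"]
y_pred = ["Viable", "Viable", "Non-Viable", "Viable", "Non-Viable"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

A model that labels almost everything 'Viable' gets high recall but low precision, which is the pattern reported for the zero-shot prompts in the findings.
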
The 10 prompt strategies: https://github.com/iratxe-zunzunegui/drug-repurposing-validation-LLMs/blob/main/data/prompt_templates.md
1. Zero-shot prompting – ask the question directly, with no context
2. Biological pathway contextualization – state that the case is based on shared pathways
3. Explicit reasoning – require a logical justification
4. Few-shot learning – provide 2 labeled examples, then ask
5. Few-shot + Explicit reasoning – combine examples with a required justification
6. Chain-of-thought prompting – force a step-by-step reasoning process
  - This drug repurposing case was identified by analyzing shared biological pathways.
    To determine if minocycline can be repurposed for Parkinson’s disease, follow these steps:
    1. Identify the primary mechanism of action of minocycline.
    2. Examine whether this mechanism interacts with key pathological features of Parkinson’s disease.
    3. Consider existing biomedical literature or clinical trials supporting or refuting this repurposing case.
    Now, provide your answer beginning with ‘Viable’ or ‘Non-Viable,’ followed by a brief step-by-step explanation and references if available.
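7. Self-critique prompting – the model reviews and revises its own answer
8. Counterfactual reasoning – consider why the drug might be ineffective first, then classify
9. Direct literature summarization – summarize the relevant literature and cite evidence
10. Expert persona simulation – answer in the assigned role of a "biomedical researcher"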
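
As a rough sketch of how strategy 6 could be automated, the snippet below turns the chain-of-thought example above into a reusable template and extracts the leading 'Viable' / 'Non-Viable' label from a reply. The placeholders `{drug}` and `{disease}`, the helper names `build_cot_prompt` and `parse_label`, and the parsing rule are assumptions for illustration. The exact templates are in the linked repository, and the call to an actual LLM API is deliberately left out.

```python
import re

# Template adapted from the chain-of-thought example in this post;
# {drug} and {disease} are placeholders introduced here.
COT_TEMPLATE = (
    "This drug repurposing case was identified by analyzing shared "
    "biological pathways.\n"
    "To determine if {drug} can be repurposed for {disease}, "
    "follow these steps:\n"
    "1. Identify the primary mechanism of action of {drug}.\n"
    "2. Examine whether this mechanism interacts with key pathological "
    "features of {disease}.\n"
    "3. Consider existing biomedical literature or clinical trials "
    "supporting or refuting this repurposing case.\n"
    "Now, provide your answer beginning with 'Viable' or 'Non-Viable,' "
    "followed by a brief step-by-step explanation and references "
    "if available."
)

# Accept optional leading punctuation/markup (quotes, asterisks, spaces),
# then require the label as the first word of the reply.
_LABEL_RE = re.compile(r"^\W*(non[-\s]?viable|viable)\b", re.IGNORECASE)


def build_cot_prompt(drug, disease):
    """Fill the chain-of-thought template for one drug-disease pair."""
    return COT_TEMPLATE.format(drug=drug, disease=disease)


def parse_label(response):
    """Return 'Viable', 'Non-Viable', or None if the reply has no leading label."""
    match = _LABEL_RE.match(response)
    if match is None:
        return None
    return "Non-Viable" if match.group(1).lower().startswith("non") else "Viable"


prompt = build_cot_prompt("minocycline", "Parkinson's disease")
print(prompt)
print(parse_label("Non-Viable. Step 1: ..."))
```

Returning `None` for replies that do not start with a label keeps malformed answers out of the metrics instead of silently counting them as one class.
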

4. Evaluation & Findings

Phase 1:
- GPT-4o (F1 = 0.86) and DeepSeek (F1 = 0.85) were the most stable; Claude-3 (F1 = 0.71) and Gemini-2 (F1 = 0.74) scored lower
- P4 (few-shot), P5 (few-shot with explicit reasoning), and P6 (chain-of-thought) showed the highest accuracy and precision
- P1 (zero-shot prompting) and P3 (zero-shot prompting with explicit reasoning) did well on recall but had low precision, indicating a tendency to over-classify cases as viable
Phase 2:
- GPT-4o: higher precision (fewer false positives)
- DeepSeek: higher recall (captures more candidates)
- Trade-off: precision vs. sensitivity
Benchmark cases:
- Both models reached Accuracy ≈ 0.92 and F1 ≈ 0.92 (close to perfect)
- Performance was much more stable than on the pathway-based predictions

5. Takeaway

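LLM-based validation is useful for automating literature-grounded checks of repurposing hypotheses.
Prompt design is key: chain-of-thought and few-shot outperform zero-shot.
GPT-4o suits conservative, high-precision validation; DeepSeek suits exploratory candidate discovery.
However, uncertainty is high for novel cases, so human validation is still needed.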
