ACS Central Science · February 2026

From Prompt to Drug: Toward Pharmaceutical Superintelligence

Interactive visualization of Insilico Medicine & Eli Lilly's framework for fully autonomous, AI-orchestrated drug discovery — from natural language prompt to clinical candidate.

Insilico Medicine × Eli Lilly · DOI: 10.1021/acscentsci.5c01473

The Prompt-to-Drug Pipeline

A scientist enters a natural-language request; the AI reasoning controller decomposes it into tasks for the biology, chemistry, preclinical, and clinical modules, each autonomously orchestrated.

💬 USER PROMPT

"Design a drug for idiopathic pulmonary fibrosis targeting a novel mechanism."

Stage 01 · AI Reasoning Controller
Central orchestrator decomposes prompt, plans multi-step workflow, coordinates agents

Stage 02 · Biology Module
Target discovery, hypothesis generation, disease pathway analysis, validation

Stage 03 · Chemistry Module
Generative molecular design, docking, FEP, retrosynthesis, automated synthesis

Stage 04 · Preclinical Module
In vitro/in vivo validation, ADMET prediction, toxicology screening, PK/PD

Stage 05 · Clinical Module
Trial design, outcome prediction, patient stratification, regulatory strategy

21 days to hit discovery (GENTRL · DDR1 inhibitor)
18 months from target to Phase I (IPF · Rentosertib, ISM001-055)
22 PCC nominations (since 2021 · 30+ assets)
60–200 molecules per program (vs. 5,000–10,000 traditional)

System Architecture

The hierarchical architecture: a central reasoning controller orchestrates domain-specific AI agents, which interface with both computational tools and automated laboratory systems.

🧠
Reasoning Controller
Central Orchestrator

Advanced reasoning model (GPT-o1/Gemini-class) that decomposes high-level prompts into actionable tasks, plans multi-step workflows, delegates to domain agents, and revises strategies based on experimental readouts.

🤖
Domain Agents
Specialized AI Systems

Each pipeline stage has dedicated AI agents with domain-specific training: biology agents mine omics data, chemistry agents generate molecules, clinical agents model trial outcomes. Agents communicate through structured APIs.

🔬
Lab Automation
Physical Execution Layer

Microfluidic synthesis, high-throughput screening, automated assays. Humanoid-in-the-loop systems interact with legacy equipment. 24/7 continuous experimentation with minimal downtime between cycles.

📊
Data Feedback Loop
Closed-Loop Learning

Experimental results feed back into the controller. Failed hypotheses update priors. Successful results trigger next-stage planning. Each iteration refines the model's understanding of the target biology and chemical space.
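The interplay of the four components above can be sketched in Python. This is a minimal illustration under stated assumptions, not the framework's implementation: the names `Hypothesis`, `Task`, `decompose`, and `run_loop`, the fixed stage order, and the Beta-Bernoulli belief update are all hypothetical choices made here. The source does not specify how priors are represented or updated.

```python
from __future__ import annotations

from dataclasses import dataclass

# Stage names follow the pipeline overview; everything else is hypothetical.
STAGES = ["biology", "chemistry", "preclinical", "clinical"]


@dataclass
class Hypothesis:
    """A target hypothesis with a Beta(alpha, beta) belief that it is valid."""
    name: str
    alpha: float = 1.0  # pseudo-count of supporting readouts
    beta: float = 1.0   # pseudo-count of refuting readouts

    @property
    def belief(self) -> float:
        # Posterior mean of the Beta distribution.
        return self.alpha / (self.alpha + self.beta)

    def update(self, success: bool) -> None:
        # One experimental readout shifts the belief up or down.
        if success:
            self.alpha += 1
        else:
            self.beta += 1


@dataclass
class Task:
    """Structured message passed from the controller to a domain agent."""
    stage: str
    hypothesis: str
    prompt: str


def decompose(prompt: str, hypothesis: Hypothesis) -> list[Task]:
    # Stand-in for the reasoning model: one task per stage, in order.
    return [Task(stage, hypothesis.name, prompt) for stage in STAGES]


def run_loop(prompt, hypothesis, run_agent, max_cycles=3):
    """Run stages in order; a failed readout updates the belief and replans."""
    log = []
    for _ in range(max_cycles):
        for task in decompose(prompt, hypothesis):
            success = run_agent(task)  # agent returns True/False readout
            hypothesis.update(success)
            log.append((task.stage, success, round(hypothesis.belief, 3)))
            if not success:
                break  # failed readout: start a new cycle from stage one
        else:
            return True, log  # every stage succeeded in this cycle
    return False, log
```

Here `run_agent` is any callable that takes a `Task` and returns a boolean readout, so a stub can stand in for a real agent or laboratory system when experimenting with the control flow.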

AI Platforms & Tools

Existing Insilico Medicine platforms that serve as foundational components for the Prompt-to-Drug vision, plus the broader ecosystem of AI drug discovery tools.

PandaOmics · Target Discovery Engine (Biology Module)
Mines scientific literature, patents, grants, omics data, and clinical trial databases to identify and prioritize novel therapeutic targets. Uses NLP + knowledge graphs for hypothesis generation.

DORA · AI Research Assistant (Biology Module)
Agentic AI research tool that analyzes scientific literature, generates biological hypotheses, and proposes experimental designs. Powered by multimodal foundation models.

Chemistry42 · Generative Molecular Design (Chemistry Module)
Designs novel molecules from user prompts using 3D structural analysis. Incorporates ADMET prediction, molecular docking, and free-energy perturbation for lead optimization.

Chemistry42 Retrosynthesis · Synthesis Planning (Chemistry Module)
Plans synthetic routes for novel compounds using AI-driven retrosynthetic analysis. Identifies commercially available starting materials and optimizes reaction conditions.

InClinico · Clinical Trial Prediction (Clinical Module)
Forecasts clinical trial outcomes, predicts probability of success at each phase, identifies optimal patient populations, and recommends trial design parameters.

GENTRL · Tensorial RL for Drug Design (Generative AI)
Generative Tensorial Reinforcement Learning: discovered potent DDR1 kinase inhibitors in 21 days. Autoencoder-based model that learns molecular structure-property relationships.

nach0 · Multimodal Foundation Model (Foundation Model)
Natural and Chemical Languages foundation model that jointly processes molecular representations and natural language for multi-task drug discovery applications.

Pharma.AI · Unified Platform (Platform)
End-to-end generative AI-powered solution spanning biology, chemistry, and medicine development. Orchestration layer for all Insilico AI subsystems.

Timeline Comparison

Traditional drug discovery vs. AI-accelerated pipelines: a dramatic compression across every stage.

Traditional Drug Discovery

Total: 10–15 years · $2.6B average cost
Target Discovery: 3–5 years
Lead Optimization: 2–3 years
Preclinical: 1–2 years
Clinical Trials: 6–8 years

AI-Accelerated Pipeline

Target total: 3–5 years · ~$300M–500M est. cost
AI Target Discovery: 1–6 months
Generative Chemistry: 21 days–6 months
AI-Guided Preclinical: 6–12 months
AI-Optimized Clinical: 2–4 years
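
As a rough consistency check, the per-stage ranges in the two lists above can be summed. The sketch below assumes strictly sequential stages and a 30-day month; both are simplifications introduced here, not claims from the source. Any overlap between stages would shorten the totals.

```python
# Stage durations in months as (low, high), copied from the two lists above.
# "21 days" is converted using an assumed 30-day month.
TRADITIONAL = {
    "Target Discovery": (36, 60),
    "Lead Optimization": (24, 36),
    "Preclinical": (12, 24),
    "Clinical Trials": (72, 96),
}
AI_ACCELERATED = {
    "AI Target Discovery": (1, 6),
    "Generative Chemistry": (21 / 30, 6),
    "AI-Guided Preclinical": (6, 12),
    "AI-Optimized Clinical": (24, 48),
}


def total_years(stages):
    """Sum the low and high ends of every stage and convert months to years."""
    low = sum(lo for lo, _ in stages.values()) / 12
    high = sum(hi for _, hi in stages.values()) / 12
    return round(low, 1), round(high, 1)


print(total_years(TRADITIONAL))     # (12.0, 18.0)
print(total_years(AI_ACCELERATED))  # (2.6, 6.0)
```

Under these assumptions the stage ranges add up to 12–18 years and roughly 2.6–6 years, somewhat wider than the headline totals of 10–15 and 3–5 years. That gap suggests the headline figures allow for overlapping stages or draw the stage boundaries differently.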

Chart: Cost Reduction by Stage (estimated savings from AI integration)

Chart: AI Drug Discovery Evolution (key milestones and capability expansion)

Proof-of-Concept Case Studies

Individual pipeline stages that have already been automated and validated in real-world drug discovery programs.

Case Study 1 · Nature Biotechnology 2019

DDR1 Kinase Inhibitor — 21-Day Discovery

Using the GENTRL (Generative Tensorial Reinforcement Learning) model, Insilico discovered potent and selective DDR1 kinase inhibitors in just 21 days from concept to hit compound. Synthesis and validation took an additional 27 days, for a total of 48 days from start to validated hit.

Day 0: Project initiation · Day 12: GENTRL generates candidates · Day 21: Hit compound identified · Day 48: Synthesized & validated

Case Study 2 · Nature Biotechnology 2024 / Nature Medicine 2025

Rentosertib (ISM001-055) — IPF TNIK Inhibitor

First AI-discovered drug to advance to Phase IIa. Generative AI identified TNIK as a novel target for idiopathic pulmonary fibrosis. From target discovery to Phase I in 18 months — vs. 3–6 years typical. Phase IIa showed positive proof-of-concept results, validating AI-driven drug development in the clinical setting.

PandaOmics: TNIK target ID · Chemistry42: Lead optimization · 18 months → Phase I · Phase IIa: Positive PoC

Case Study 3 · JCIM 2022

CDK20 Inhibitor — 30-Day Design

Chemistry42 designed a novel CDK20 inhibitor within 30 days using structure-based generative design. The compound was synthesized and validated in cell-based assays with sub-micromolar potency. Demonstrates the chemistry module's ability to rapidly explore novel chemical space.

Day 0: Target structure input · Day 30: Lead compound designed · Validated: Sub-µM potency

Case Study 4 · Portfolio-Wide 2021–2024

22 Preclinical Candidates Across 30+ Programs

Powered by the Pharma.AI platform, Insilico has nominated 22 preclinical candidates since 2021, averaging 12–18 months per program (vs. 3–6 years traditional) with only 60–200 molecules synthesized per program (vs. 5,000–10,000 typical). Key therapeutic areas: fibrosis, oncology, immunology, CNS.

22 PCC nominations · 12–18 mo avg turnaround · 60–200 compounds/program · $120M Qilu Pharma collab
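
The efficiency figures quoted above imply a range of reduction factors rather than a single number. The short calculation below derives them. Pairing the extremes of each range to obtain bounds is an assumption made here for illustration, and `reduction_range` is a hypothetical helper.

```python
def reduction_range(ai, traditional):
    """Return (min, max) fold-reduction given two (low, high) ranges."""
    ai_low, ai_high = ai
    trad_low, trad_high = traditional
    # Smallest factor: best traditional case vs. worst AI case, and vice versa.
    return trad_low / ai_high, trad_high / ai_low


compounds = reduction_range((60, 200), (5_000, 10_000))
months = reduction_range((12, 18), (36, 72))  # 3 to 6 years, in months

print(f"Compounds: {compounds[0]:.0f}x to {compounds[1]:.0f}x fewer")
print(f"Time to PCC: {months[0]:.0f}x to {months[1]:.0f}x faster")
```

With the quoted ranges this prints a 25x to 167x reduction in compounds synthesized and a 2x to 6x reduction in time to preclinical candidate.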

Chart: Compounds Synthesized per Program (AI-driven efficiency vs. traditional screening)

Chart: Time to PCC Nomination (months from project start to preclinical candidate)

Safeguards & Risk Mitigation

The framework acknowledges significant risks in autonomous AI-driven drug discovery and proposes multi-layered safeguards.

| Risk | Severity | Description | Mitigation |
| --- | --- | --- | --- |
| Hallucinations | High | AI generates plausible but incorrect biological hypotheses or molecular structures | Multi-agent validation, experimental verification checkpoints |
| Error Propagation | High | Errors in early stages cascade through pipeline (wrong target → wasted chemistry) | Stage-gate reviews, human oversight at critical decision points |
| Data Bias | Medium | Training data biases lead to systematic blind spots in target or chemical space | Diverse training data, bias audits, adversarial testing |
| Overfitting to Metrics | Medium | AI optimizes computational scores that don't translate to biological activity | Wet-lab validation loops, multi-objective optimization |
| Regulatory Uncertainty | Medium | Autonomous AI decisions lack clear regulatory framework for approval | Auditability mechanisms, "AI arms" in clinical trials |
| Lab Safety | Low | Autonomous synthesis without adequate safety checks | Hazard prediction models, synthesis safety constraints |

👤
Human-in-the-Loop
Critical Decision Points

Human experts review and approve at high-stakes junctures: target validation, lead candidate selection, IND filing decisions, and clinical protocol design. The AI proposes; humans dispose.

🤖
Humanoid-in-the-Loop
Physical Lab Automation

Robotic systems (including humanoid robots) interact with legacy lab equipment, enabling 24/7 experimentation. Bridge between digital AI decisions and physical experimental execution.

🔍
Auditability
Decision Traceability

Every AI decision logged with reasoning chain, confidence scores, and evidence sources. Full traceability from prompt to candidate enables regulatory review and failure analysis.
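
One way to picture such a log entry is a small append-only record serialized as one JSON object per line. The `AuditRecord` class, its field names, and the `trace` helper below are hypothetical, chosen only to mirror the three elements named above (reasoning chain, confidence score, evidence sources). The source does not define a schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditRecord:
    """Hypothetical schema for one logged AI decision."""
    prompt_id: str
    stage: str
    decision: str
    reasoning_chain: list
    confidence: float
    evidence_sources: list
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def __post_init__(self):
        # Reject scores that cannot be read as a probability.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")

    def to_jsonl(self) -> str:
        # One JSON object per line keeps the log append-only and searchable.
        return json.dumps(asdict(self), sort_keys=True)


def trace(records, prompt_id):
    """Return every logged decision for one prompt, in logged order."""
    return [r for r in records if r.prompt_id == prompt_id]


# Illustrative values only; identifiers and evidence labels are made up.
record = AuditRecord(
    prompt_id="ipf-001",
    stage="biology",
    decision="prioritize TNIK",
    reasoning_chain=["omics signal", "literature support"],
    confidence=0.8,
    evidence_sources=["example-source-1"],
)
print(record.to_jsonl())
```

Filtering by `prompt_id` is what gives the prompt-to-candidate traceability described above: every decision taken on behalf of one request can be replayed in order.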

⚕️
AI Arms in Trials
Real-World Validation

Clinical trials include "AI arms" where AI-predicted outcomes are prospectively validated against real patient data. Builds evidence base for AI reliability in clinical decision-making.
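
A simple way to score prospectively registered predictions against observed outcomes is the Brier score, the mean squared difference between each predicted probability and the 0/1 outcome. The source does not say which metric an AI arm would use, so the choice of metric and the example numbers below are assumptions for illustration only.

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted probabilities and 0/1 outcomes.

    0.0 is a perfect score; always predicting 0.5 scores 0.25.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


# Made-up probabilities registered before unblinding, and made-up outcomes.
predicted = [0.9, 0.2, 0.7, 0.4]
observed = [1, 0, 1, 0]
print(round(brier_score(predicted, observed), 3))  # 0.075
```

Lower is better, so a score well below 0.25 on prospectively collected data would count as evidence that the predictions carry real information.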

Key References

Peer-reviewed publications underlying the Prompt-to-Drug framework.

  1. Zhavoronkov, A. et al. From Prompt to Drug: Toward Pharmaceutical Superintelligence. ACS Central Science (2026). DOI: 10.1021/acscentsci.5c01473
  2. Zhavoronkov, A. et al. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nature Biotechnology 37, 1038–1040 (2019).
  3. Ren, F. et al. A small-molecule TNIK inhibitor targets fibrosis in preclinical and clinical models. Nature Biotechnology (2024).
  4. Ren, F. et al. A generative AI-discovered TNIK inhibitor for idiopathic pulmonary fibrosis: a randomized phase 2a trial. Nature Medicine (2025).
  5. Ivanenkov, Y. A. et al. Chemistry42: An AI-Driven Platform for Molecular Design and Optimization. J. Chem. Inf. Model. (2023).
  6. Kamya, P. et al. PandaOmics: An AI-Driven Platform for Therapeutic Target and Biomarker Discovery. J. Chem. Inf. Model. (2024).
  7. Ozerov, I. V. et al. Prediction of Clinical Trials Outcomes Based on Target Choice and Clinical Trial Design with Multi-Modal Artificial Intelligence. Clin. Pharmacol. Ther. (2023).
  8. Livne, M. et al. nach0: multimodal natural and chemical languages foundation model. Chemical Science 15, 8380–8389 (2024).
  9. Zhavoronkov, A. et al. Hallmarks of aging-based dual-purpose disease and age-associated targets predicted using PandaOmics AI-powered discovery engine. Aging (2020).
  10. Subbiah, V. The next generation of evidence-based medicine. Nature Medicine 29, 49–58 (2023).