RUMI
Autonomous Scientific Cognition Framework
Scientific hypothesis generation is slow. Researchers manually sift through thousands of papers, extract findings, cross-reference contradictions, and form hypotheses over months. Many drug resistance mechanisms (e.g., resistance to KRAS G12C inhibitors) remain unexplained because no system can rapidly mine contradictions at scale across live literature.
- Literature Ingestion: Queries PubMed in real time, retrieving domain-specific oncology and pharmacology papers.
- Entity Extraction: Uses NER pipelines to identify genes, compounds, pathways, and mutation markers from raw paper text.
- Knowledge Graph: Constructs a dynamic semantic graph (59+ nodes, 40+ relationships) mapping all discovered biological interactions.
- Contradiction Mining Engine: Detects logical conflicts in the graph, e.g., Paper A says X activates Y while Paper B says X inhibits Y. These conflicts are the hypothesis seeds.
- Hypothesis Formulation: Synthesizes high-confidence testable hypotheses with validation plans (Western blot / qRT-PCR protocols).
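The contradiction mining step described above can be illustrated with a minimal sketch. It is not RUMI's actual implementation: the `Claim` type, the `OPPOSITES` table, and the `find_contradictions` function are hypothetical names introduced here, and the sketch assumes that extracted findings have already been normalized into (subject, relation, object) triples tagged with their source paper.

```python
from collections import defaultdict
from typing import NamedTuple


class Claim(NamedTuple):
    """One extracted finding (hypothetical schema, assumed for illustration)."""
    subject: str
    relation: str  # e.g. "activates" or "inhibits"
    obj: str
    source: str    # placeholder paper identifier


# Assumption: pairs of relations treated as mutually exclusive.
OPPOSITES = {frozenset({"activates", "inhibits"})}


def find_contradictions(claims):
    """Group claims by (subject, object) and return pairs with opposing relations."""
    by_edge = defaultdict(list)
    for claim in claims:
        by_edge[(claim.subject, claim.obj)].append(claim)

    conflicts = []
    for edge_claims in by_edge.values():
        # Compare every unordered pair of claims about the same edge.
        for i, a in enumerate(edge_claims):
            for b in edge_claims[i + 1:]:
                if frozenset({a.relation, b.relation}) in OPPOSITES:
                    conflicts.append((a, b))
    return conflicts


# Placeholder claims mirroring the "Paper A / Paper B" example in the text.
claims = [
    Claim("X", "activates", "Y", "Paper A"),
    Claim("X", "inhibits", "Y", "Paper B"),
    Claim("X", "activates", "Z", "Paper C"),
]

conflicts = find_contradictions(claims)
for a, b in conflicts:
    print(f"{a.source}: {a.subject} {a.relation} {a.obj}"
          f" <-> {b.source}: {b.subject} {b.relation} {b.obj}")
```

Each returned pair would serve as a hypothesis seed. A real pipeline would presumably also need entity normalization (gene aliases, compound synonyms) and experimental context (cell line, dose) before treating two claims as a true conflict.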
Reduces hypothesis formulation time from months to minutes. Demonstrated on KRAS G12C resistance pathways, where it generated 2 novel, testable hypotheses linking sotorasib resistance to RAC1/PAK1 reactivation and PI3Kγ-AKT bypass. Confidence scores are generated via the Groq inference provider.