Cell2Sentence: Understanding Single-Cell Biology With LLMs
Google and Yale University announce Cell2Sentence-Scale (C2S-Scale), a family of open-source large language models trained to understand biology at the single-cell level.
By bridging the gap between biology and artificial intelligence, C2S-Scale transforms intricate cellular data into easily understood "cell sentences".
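As a rough illustration of the idea (a minimal sketch only; the exact gene vocabulary, sentence length, and tokenization used by C2S-Scale are not shown here), a cell's expression profile can be turned into a "cell sentence" by ranking its genes by expression and writing out the top gene names in order:

```python
import numpy as np

def cell_to_sentence(expression, gene_names, top_k=5):
    """Convert one cell's expression vector into a 'cell sentence':
    the names of its most highly expressed genes, in descending order."""
    order = np.argsort(expression)[::-1]              # indices, highest expression first
    top = [gene_names[i] for i in order[:top_k]
           if expression[i] > 0]                      # drop unexpressed genes
    return " ".join(top)

# Toy example: six genes measured in one cell (made-up counts).
genes = ["MALAT1", "CD3E", "GAPDH", "ACTB", "IL7R", "CD8A"]
counts = np.array([120, 35, 80, 95, 0, 12])

print(cell_to_sentence(counts, genes))  # MALAT1 ACTB GAPDH CD3E CD8A
```

Once cells are text, a language model can consume and generate them like any other sentence, which is what lets C2S-Scale "read" and "write" single-cell data.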
Trillions of cells make up each human body, and each one has a specific purpose, such as building organs, fighting infections, or transporting oxygen.
In "Scaling Large Language Models for Next-Generation Single-Cell Analysis", Google introduces Cell2Sentence-Scale (C2S-Scale), a family of powerful, open-source LLMs that "read" and "write" biological data at the single-cell level.
Cell2Sentence-Scale can automatically produce biological summaries of scRNA-seq data at multiple levels of complexity, from annotating the cell types of individual cells to summarizing entire tissues or experiments.
A key finding of this work is that biological language models exhibit well-defined scaling laws: performance improves predictably as model size increases.
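To make the claim concrete, a scaling law of the form L(N) = a * N^(-b), where N is parameter count and L is loss, appears as a straight line in log-log space, so its exponent can be recovered with a simple linear fit. The numbers below are synthetic and purely illustrative, not actual C2S-Scale measurements:

```python
import numpy as np

# Illustrative only: made-up (model size, validation loss) pairs that
# follow a power law L(N) = a * N^(-b); NOT real C2S-Scale results.
params = np.array([1e8, 4e8, 1e9, 4e9, 1e10])  # model sizes N
loss = 5.0 * params ** -0.05                   # synthetic losses with b = 0.05

# In log-log space the power law is linear: log L = log a - b * log N,
# so a degree-1 polynomial fit recovers the exponent b from the slope.
slope, intercept = np.polyfit(np.log(params), np.log(loss), 1)
print(f"fitted exponent b = {-slope:.3f}")
```

Fitting an exponent like this is what makes scaling "predictable": it lets one extrapolate how much a larger model should improve before training it.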
One of the most exciting applications of Cell2Sentence-Scale is predicting a cell's response to a perturbation, such as a drug, a gene knockout, or cytokine exposure.
Just as reinforcement learning is used to fine-tune large language models like Gemini to follow instructions and respond in helpful, human-aligned ways, Google uses comparable strategies to refine Cell2Sentence-Scale models for biological reasoning.
Cell2Sentence materials and models are now available on GitHub and Hugging Face.