Xiong, H. Y. et al. The human splicing code reveals new insights into the genetic determinants of illness. Science (2015).
Van Nostrand, E. L. et al. Sturdy transcriptome-wide discovery of RNA-binding protein binding websites with enhanced CLIP (eCLIP). Nat. Strategies 13, 508–514 (2016).
Google Scholar
Brar, G. A. & Weissman, J. S. Ribosome profiling reveals the what, when, the place and the way of protein synthesis. Nat. Rev. Mol. Cell Biol. 16, 651–664 (2015).
Google Scholar
Herzog, V. A. et al. Thiol-linked alkylation of rna to evaluate expression dynamics. Nat. Strategies 14, 1198–1204 (2017).
Google Scholar
Jaganathan, Okay. et al. Predicting splicing from main sequence with deep studying. Cell 176, 535–548 (2019).
Google Scholar
Linder, J., Koplik, S. E., Kundaje, A. & Seelig, G. Deciphering the impression of genetic variation on human polyadenylation utilizing APARENT2. Genome Biol. 23, 232 (2022).
Google Scholar
Agarwal, V. & Kelley, D. R. The genetic and biochemical determinants of mRNA degradation charges in mammals. Genome Biol. 23, 245 (2022).
Google Scholar
Merico, D. et al. G p.Met645Arg causes Wilson illness by selling exon 6 skipping. NPJ Genom. Med. 5, 16 (2020).
Google Scholar
Richards, S. et al. Requirements and tips for the interpretation of sequence variants: a joint consensus suggestion of the American School of Medical Genetics and Genomics and the Affiliation for Molecular Pathology. Genet. Med. 17, 405–424 (2015).
Google Scholar
Celaj, A. et al. An RNA basis mannequin allows discovery of illness mechanisms and candidate therapeutics. Preprint at bioRxiv (2023).
Lotfollahi, M. et al. Predicting mobile responses to complicated perturbations in high-throughput screens. Mol. Syst. Biol. 19 (2023).
Ji, Y., Zhou, Z., Liu, H. & Davuluri, R. V. DNABERT: pre-trained bidirectional encoder representations from transformers mannequin for DNA-language in genome. Bioinformatics 37, 2112–2120 (2021).
Google Scholar
Chen, J. et al. Interpretable RNA basis mannequin from unannotated information for extremely correct RNA construction and performance predictions. Preprint at (2022).
Nguyen, E. et al. Hyenadna: long-range genomic sequence modeling at single nucleotide decision. In thirty seventh Convention on Neural Info Processing Techniques (NeurIPS 2023) (2023).
Penić, R. J., Vlašić, T., Huber, R. G., Wan, Y. & Šikić, M. Rinalmo: general-purpose RNA language fashions can generalize nicely on construction prediction duties. Nat. Commun. (2025).
Li, S. et al. mRNA-LM: full-length built-in SLM for mRNA evaluation. Nucleic Acids Res. (2025).
Nguyen, E. et al. Sequence modeling and design from molecular to genome scale with evo. Science (2024).
Yuan, Y., Chen, Q. & Pan, X. DgRNA: a long-context RNA basis mannequin with bidirectional consideration mamba2. Preprint at bioRxiv (2024).
Brixi, G. et al. Genome modelling and design throughout all domains of life with Evo 2. Nature (2026).
Devlin, J., Chang, M.-W., Lee, Okay. & Toutanova, Okay. Bert: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Convention of the North American Chapter of the Affiliation for Computational Linguistics 4171–4186 (2019).
Radford, A., Narasimhan, Okay., Salimans, T. & Sutskever, I. Bettering language understanding by generative pre-training. Preprint at OpenAI (2018).
Music, Y., Wang, T., Cai, P., Mondal, S. Okay. & Sahoo, J. P. A complete survey of few-shot studying: evolution, functions, challenges, and alternatives. ACM Comput. Surv. (2023).
Taliun, D. et al. Sequencing of 53,831 various genomes from the NHLBI TOPMed Program. Nature 590, 290–299 (2021).
Google Scholar
Gudmundsson, S. et al. Variant interpretation utilizing inhabitants databases: Classes from gnomAD. Hum. Mutat. 43, 1012–1030 (2021).
Google Scholar
Chen, S. et al. A genomic mutational constraint map utilizing variation in 76, 156 human genomes. Nature 625, 92–100 (2023).
Google Scholar
Lindblad-Toh, Okay. et al. A high-resolution map of human evolutionary constraint utilizing 29 mammals. Nature 478, 476–482 (2011).
Google Scholar
Sullivan, P. F. et al. Leveraging base-pair mammalian constraint to know genetic variation and human illness. Science (2023).
Christmas, M. J. et al. Evolutionary constraint and innovation throughout tons of of placental mammals. Science (2023).
Dalla-Torre, H. et al. Nucleotide transformer: constructing and evaluating strong basis fashions for human genomics. Nat. Strategies (2024).
Lu, A. X., Lu, A. X. & Moses, A. Evolution is all you want: phylogenetic augmentation for contrastive studying. Preprint at (2020).
Chen, T., Kornblith, S., Norouzi, M. & Hinton, G. A easy framework for contrastive studying of visible representations. In thirty seventh Worldwide Convention on Machine Studying (PMLR, 2020).
Kirilenko, B. M. et al. Integrating gene annotation with orthology inference at scale. Science (2023).
Gu, A. & Dao, T. Mamba: linear-time sequence modeling with selective state areas. In First Convention on Language Modeling (COLM) (2024).
Truthful, B. et al. World impression of unproductive splicing on human gene expression. Nat. Genet. 56, 1851–1861 (2024).
Google Scholar
Schertzer, M. D. et al. Cas13d-mediated isoform-specific rna knockdown with a unified computational and experimental toolbox. Nat. Commun. (2025).
Koonin, E. V. Orthologs, paralogs, and evolutionary genomics. Annu. Rev. Genet. 39, 309–338 (2005).
Google Scholar
Frankish, A. et al. GENCODE 2021. Nucleic Acids Res. 49, D916–D923 (2021).
Google Scholar
O’Leary, N. A. et al. Reference sequence (RefSeq) database at NCBI: present standing, taxonomic growth, and practical annotation. Nucleic Acids Res. 44, D733–745 (2016).
Google Scholar
Koonin, E. V. & Wolf, Y. I. Constraints and plasticity in genome and molecular-phenome evolution. Nat. Rev. Genet. 11, 487–498 (2010).
Google Scholar
Sayers, E. W. et al. Database assets of the Nationwide Middle for Biotechnology Info in 2023. Nucleic Acids Res. 51, D29–D38 (2023).
Google Scholar
Yeh, C.-H. et al. Decoupled contrastive studying. In European Convention on Laptop Imaginative and prescient 668–684 (2022).
Kelley, D. R. et al. Sequential regulatory exercise prediction throughout chromosomes with convolutional neural networks. Genome Res. 28, 739–750 (2018).
Google Scholar
Karollus, A. et al. Species-aware DNA language fashions seize regulatory components and their evolution. Genome Biol. (2024).
Abramson, J. et al. Correct construction prediction of biomolecular interactions with AlphaFold 3. Nature 630, 493–500 (2024).
Google Scholar
Schrödinger, L. & DeLano, W. Pymol. http://www.pymol.org/pymol
Warren, C. F. A., Wong-Brown, M. W. & Bowden, N. A. BCL-2 household isoforms in apoptosis and most cancers. Cell Loss of life Dis. 10, 177 (2019).
Google Scholar
Lavatory, L. S. W. et al. Bcl-xl/bcl2l1 is a essential anti-apoptotic protein that promotes the survival of differentiating pancreatic cells from human pluripotent stem cells. Cell Loss of life Dis. (2020).
Wickenhagen, A. et al. A prenylated dsRNA sensor protects towards extreme COVID-19. Science 374, eabj3624 (2021).
Google Scholar
Lee, N. Okay., Tang, Z., Toneyan, S. & Koo, P. Okay. EvoAug: enhancing generalization and interpretability of genomic deep neural networks with evolution-inspired information augmentations. Genome Biol. 24, 105 (2023).
Google Scholar
Lu, A. X., Zhang, H., Ghassemi, M. & Moses, A. Self-supervised contrastive studying of protein representations by mutual info maximization. Preprint at bioRxiv (2020).
Pertea, M. et al. CHESS: a brand new human gene catalog curated from 1000’s of large-scale RNA sequencing experiments reveals intensive transcriptional noise. Genome Biol. 19, 208 (2018).
Google Scholar
Von Kügelgen, J. et al. Self-supervised studying with information augmentations provably isolates content material from model. In thirty fifth Convention on Neural Info Processing Techniques (NeurIPS 2021) (2021).
Xiao, M.-S. et al. Genome-scale exon perturbation screens uncover exons essential for cell health. Mol. Cell 84, 2553–2572.e19 (2024).
Google Scholar
Spies, N., Burge, C. B. & Bartel, D. P. 3’ UTR-isoform alternative has restricted affect on the steadiness and translational effectivity of most mRNAs in mouse fibroblasts. Genome Res. 23, 2078–2090 (2013).
Google Scholar
Irimia, M. et al. A extremely conserved program of neuronal microexons is misregulated in autistic brains. Cell 159, 1511–1523 (2014).
Google Scholar
García-Pérez, R. et al. The panorama of expression and different splicing variation throughout human traits. Cell Genom. (2023).
Letunic, I. & Bork, P. Interactive tree of life (iTOL) v6: latest updates and new developments. Nucleic Acids Res. (2024).
Kissinger, R. & NIAID Visible & Medical Arts. Chosen illustrations: nonhuman primate, cow, deer, wooden mouse, human male, and DNABrush (BIOART-000388, 000100, 000110, 000020, 000238, 000127). NIAID NIH BioArt Supply (2024).
McInnes, L., Healy, J., Saul, N. & Großberger, L. Umap: uniform manifold approximation and projection. J. Open Supply Softw. (2018).
van den Oord, A., Li, Y. & Vinyals, O. Illustration studying with contrastive predictive coding. Preprint at (2018).
Vaswani, A. et al. Consideration is all you want. In thirty first Convention on Neural Info Processing Techniques (NIPS 2017) (2017).
Gu, A., Goel, Okay. & Ré, C. Effectively modeling lengthy sequences with structured state areas. In Worldwide Convention on Studying Representations (ICLR 2022) (2022)
Georgakopoulos-Soares, I. et al. Transcription issue binding web site orientation and order are main drivers of gene regulatory exercise. Nat. Commun. (2023).
Sohn, Okay. Improved deep metric studying with multi-class N-pair loss goal. In Proc. thirtieth Worldwide Convention on Neural Info Processing Techniques (NIPS’16) 1857–1865 (Curran, 2016).
Sugimoto, Y. & Ratcliffe, P. J. Isoform-resolved mRNA profiling of ribosome load defines interaction of HIF and mTOR dysregulation in kidney most cancers. Nat. Struct. Mol. Biol. 29, 871–880 (2022).
Google Scholar
Thul, P. J. et al. A subcellular map of the human proteome. Science 356 (2017).
Rodriguez, J. M. et al. APPRIS: choosing functionally essential isoforms. Nucleic Acids Res. 50, D54–D59 (2022).
Google Scholar
Consortium, T. G. O. et al. The Gene Ontology knowledgebase in 2023. Genetics 224, iyad031 (2023).
Google Scholar
Ashburner, M. et al. Gene Ontology: software for the unification of biology. Nat. Genet. 25, 25–29 (2000).
Google Scholar
Zhou, N. et al. The CAFA problem stories improved protein operate prediction and new practical annotations for tons of of genes by means of experimental screens. Genome Biol. 20, 244 (2019).
Google Scholar
Rodriguez, J. M. et al. APPRIS: annotation of principal and different splice isoforms. Nucleic Acids Res. 41, D110–D117 (2013).
Google Scholar
Shi, R. et al. mRNABench: A curated benchmark for mature mRNA property and performance prediction. Preprint at bioRxiv (2024).
Consortium, E. P. et al. An built-in encyclopedia of DNA components within the human genome. Nature 489, 57 (2012).
Google Scholar
Ietswaart, R. et al. Genome-wide quantification of RNA move throughout subcellular compartments reveals determinants of the mammalian transcript life cycle. Mol. Cell 84, 2765–2784 (2024).
Google Scholar
Fazal, F. M. et al. Atlas of subcellular RNA localization revealed by APEX-seq. Cell 178, 473–490 (2019).
Google Scholar
Jumper, J. et al. Extremely correct protein construction prediction with AlphaFold. Nature 596, 583–589 (2021).
Google Scholar
Martin, F. J. et al. Ensembl 2023. Nucleic Acids Res. 51, D933–D941 (2022).
Google Scholar
Zar, J. H. Biostatistical Evaluation, fifth edn (Pearson, 2009).
Fradkin, P. et al. Orthrus: in direction of evolutionary and practical RNA basis fashions. Zenodo (2024).
Agarwal, V. & Kelley, D. Saluki: predicting mRNA half-life from mRNA sequence. Zenodo (2022).
Fradkin, P. et al. Orthrus model 1.0.0 (pc software program). Zenodo (2025).



