**Decoding DNA’s Dynamic Dialogue: Cross-Strand Interactions in Sequence Language Models**

Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (eds Burstein, J. et al.) 4171–4186 (Association for Computational Linguistics, 2019).

Achiam, J. et al. GPT-4 technical report. Preprint at (2023).

Avsec, Ž. et al. Advancing regulatory variant effect prediction with AlphaGenome. Nature 649, 1206–1218 (2026).

Brixi, G. et al. Genome modelling and design across all domains of life with Evo 2. Nature 652, 1349–1361 (2026).

Luo, X., Kang, X. & Schönhuth, A. Predicting the prevalence of complex genetic diseases from individual genotype profiles using capsule networks. Nat. Mach. Intell. 5, 114–125 (2023).

Benegas, G., Albors, C., Aw, A. J., Ye, C. & Song, Y. S. A DNA language model based on multispecies alignment predicts the effects of genome-wide variants. Nat. Biotechnol. 43, 1960–1965 (2025).

Xu, A. et al. SNPBag: a foundation model for multitask genome-scale SNP analysis. Preprint at Research Square (2025).

Li, H. et al. BMFM-DNA: a SNP-aware DNA foundation model to capture variant effects. Preprint at (2025).

Gao, Z., Liu, Q., Zeng, W., Jiang, R. & Wong, W. H. EpiGePT: a pretrained transformer-based language model for context-specific human epigenomics. Genome Biol. 25, 310 (2024).

Zeng, W., Guo, H., Liu, Q. & Wong, W. H. Improving polygenic prediction from whole-genome sequencing data by leveraging predicted epigenomic features. Proc. Natl Acad. Sci. USA 122, e2419202122 (2025).

Zhou, H., Shrikumar, A. & Kundaje, A. Towards a better understanding of reverse-complement equivariance for deep learning models in genomics. In Proc. 16th Machine Learning in Computational Biology Meeting (eds Knowles, D. A. et al.) 1–33 (PMLR, 2022).

Ji, Y., Zhou, Z., Liu, H. & Davuluri, R. V. DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics 37, 2112–2120 (2021).

Zhou, Z. et al. DNABERT-2: efficient foundation model and benchmark for multi-species genomes. In Proc. International Conference on Learning Representations (OpenReview, 2024).

Mallet, V. & Vert, J.-P. Reverse-complement equivariant networks for DNA sequences. Adv. Neural Inf. Process. Syst. 34, 13511–13523 (2021).

Schiff, Y. et al. Caduceus: bi-directional equivariant long-range DNA sequence modeling. In Proc. 41st International Conference on Machine Learning (eds Salakhutdinov, R. et al.) Vol. 235, 43632–43657 (PMLR, 2024).

Dalla-Torre, H. et al. Nucleotide transformer: building and evaluating robust foundation models for human genomics. Nat. Methods 22, 287–297 (2025).

Nguyen, E. et al. HyenaDNA: long-range genomic sequence modeling at single nucleotide resolution. Adv. Neural Inf. Process. Syst. 36, 43177–43201 (2023).

Sanabria, M., Hirsch, J., Joubert, P. M. & Poetsch, A. R. DNA language model GROVER learns sequence context in the human genome. Nat. Mach. Intell. 6, 911–923 (2024).

Press, O., Smith, N. A. & Lewis, M. Train short, test long: attention with linear biases enables input length extrapolation. In Proc. International Conference on Learning Representations (OpenReview, 2022).

Dao, T. FlashAttention-2: faster attention with better parallelism and work partitioning. In Proc. International Conference on Learning Representations (OpenReview, 2024).

(Curran Associates, 2024).

Lee, N. K., Tang, Z., Toneyan, S. & Koo, P. K. EvoAug: improving generalization and interpretability of genomic deep neural networks with evolution-inspired data augmentations. Genome Biol. 24, 105 (2023).

Ma, M. Maintaining reverse-complement consistency in DNA language models. Preprint at (2025).

Duan, Q. et al. JanusDNA: a robust bi-directional hybrid DNA foundation model. Adv. Neural Inf. Process. Syst. 38, 68791–68818 (2026).

Shrikumar, A., Greenside, P. & Kundaje, A. Reverse-complement parameter sharing improves deep learning models for genomics. Preprint at bioRxiv (2017).

Choi, C. H. et al. DNA dynamically directs its own transcription initiation. Nucleic Acids Res. 32, 1584–1590 (2004).

Rohs, R. et al. The role of DNA shape in protein–DNA recognition. Nature 461, 1248–1253 (2009).

Kabir, A. et al. Integrating DNA breathing dynamics with deep learning foundational models to enhance genome-wide prediction of human transcription factor binding. Nucleic Acids Res. 52, e91 (2024).

Gordân, R. et al. Genomic regions adjacent to E-box binding sites affect DNA binding specificity of bHLH transcription factors via DNA shape. Cell Rep. 3, 1093–1104 (2013).

Chen, Y. et al. Structural insights into p53 binding to the BAX response element: DNA unwinding and compression enable base-pair insertion. Nucleic Acids Res. 41, 8368–8376 (2013).

Zhou, T. et al. Quantitative modeling of transcription factor binding specificities using DNA shape. Proc. Natl Acad. Sci. USA 112, 4654–4659 (2015).

Mitra, R. et al. Geometric deep learning of protein–DNA binding specificity. Nat. Methods 21, 1674–1683 (2024).

Adam, S., Klingel, V., Radde, N. E., Bashtrykov, P. & Jeltsch, A. Evaluating the accuracy of the epigenetic copy machine: a comprehensive specificity analysis of the DNMT1 DNA methyltransferase. Nucleic Acids Res. 51, 6622–6633 (2023).

Bashtrykov, P. et al. The specificity of Dnmt1 for hemimethylated CpG site methylation is governed by its catalytic domain. Chem. Biol. 19, 572–578 (2012).

Senadeera, D. C. et al. Dual-branch VideoMamba with gated class token fusion for detecting violent content. Preprint at (2025).

Yu, T., Cheng, L., Khalitov, R., Olsson, E. B. & Yang, Z. Self-distillation enhances self-supervised learning for DNA sequence inference tasks. Neural Netw. 183, 106978 (2025).

Yang, H. et al. HAD: hybrid architecture distillation surpasses teacher performance in genomic sequence modeling. Preprint at (2025).

Hu, J. et al. Comba: enhancing nonlinear RNNs through closed-loop control mechanisms. Preprint at (2025).

Grešová, K., Martinek, V., Čechák, D., Šimeček, P. & Alexiou, P. Genomic Benchmarks: a curated collection of datasets for classifying genomic sequences. BMC Genom. Data 24, 25 (2023).

Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning–based sequence model. Nat. Methods 12, 931–934 (2015).

Li, J., Pu, Y., Tang, J., Zou, Q. & Guo, F. DeepATT: a hybrid category attention neural network for determining functional effects of DNA sequences. Brief. Bioinform. 22, bbaa159 (2021).

Li, Z. et al. An innovative, interpretable deep learning framework for designing synthetic enhancers with wide-ranging activity across species. Nucleic Acids Res. 52, 13447–13468 (2024).

Cheng, W. et al. DNALONGBENCH: a comprehensive benchmark suite for evaluating long-range DNA prediction tasks. Nat. Commun. 16, 10108 (2025).

Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).

de Almeida, B. P., Reiter, F., Pagani, M. & Stark, A. DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers. Nat. Genet. 54, 613–624 (2022).

Gosai, S. J. et al. Machine-guided design of cell-type-targeting cis-regulatory elements. Nature 634, 1211–1220 (2024).

Fishman, V. et al. GENA-LM: a collection of open-source foundational DNA language models tailored for long sequences. Nucleic Acids Res. 53, gkae1310 (2025).

Linder, J., Srivastava, D., Yuan, H., Agarwal, V. & Kelley, D. R. Predicting RNA-seq coverage from DNA sequence as a unified model of gene regulation. Nat. Genet. 57, 949–961 (2025).

Ahmed, F. S., Aly, S. & Liu, X. EPI-Trans: a powerful transformer-based deep learning model for predicting enhancer–promoter interactions. BMC Bioinform. 25, 216 (2024).

Feng, H. et al. Evaluating DNA foundation models across genomic and genetic tasks. Nat. Commun. 16, 10780 (2025).

Van Der Harst, P. & Verweij, N. Discovery of 64 new genetic loci broadens our understanding of the genetic basis of coronary artery disease. Circ. Res. 122, 433–443 (2018).

Medina, I. et al. Deficiency in Hck/Fgr kinases decreases plaque growth and stability by reducing monocyte recruitment and movement within plaques. Circulation 132, 490–501 (2015).

Bobryshev, Y. V. Monocyte recruitment and the formation of foam cells in atherosclerosis. Micron 37, 208–222 (2006).

Li, J. et al. New perspectives: dynamic foam cells originating from macrophages in atherosclerosis. J. Cell. Physiol. 236, 6154–6167 (2021).

Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).

Beltagy, I., Peters, M. E. & Cohan, A. Longformer: the long-document transformer. Preprint at (2020).

Schneider, V. A. et al. Assessment of GRCh38 and de novo haploid genome assemblies confirms the lasting reliability of the reference assembly. Genome Res. 27, 849–864 (2017).

Nguyen, E. et al. Sequence modeling and design from molecular to genome scale with Evo. Science 386, eado9336 (2024).

Su, J. et al. RoFormer: enhanced transformer with rotary position embedding. Neurocomputing 568, 127063 (2024).

Mou, L. et al. Natural language inference by tree-based convolution and heuristic matching. In Proc. 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (eds Erk, K. et al.) 130–136 (Association for Computational Linguistics, 2016).

Upadhyaya, A., Nejdl, W. & Fisichella, M. Leveraging empathy and ethical considerations to detect relevance and categorize information in climate and COVID-19 tweets. In Proc. 33rd ACM International Conference on Information and Knowledge Management 4091–4095 (ACM, 2024).

Zhang, K. et al. EATN: a highly efficient adaptive transfer network for analyzing sentiment at the aspect level. IEEE Trans. Knowl. Data Eng. 35, 377–389 (2021).

Morales-Brotons, D., Vogels, T. & Hendrikx, H. Using exponential moving averages of weights in deep learning: behavior and advantages. Trans. Mach. Learn. Res. 2024, 1–27 (2024).

Yang, C. et al. CrossDNA: V1.1.0. Zenodo (2026).
