Apple Intelligence Foundation Language Models
Authors: Tom Gunter, Zirui Wang, Chong Wang, Ruoming Pang, Andy Narayanan, Aonan Zhang, Bowen Zhang, Chen Chen, Chung-Cheng Chiu, David Qiu, Deepak Gopinath, Dian Ang Yap, Dong Yin, Feng Nan, Floris Weers, Guoli Yin, Haoshuo Huang, Jianyu Wang, Jiarui Lu, John Peebles, Ke Ye, Mark Lee, Nan Du, Qibin Chen, Quentin Keunebroek, Sam Wiseman, Syd Evans, Tao Lei, Vivek Rathod, Xiang Kong, Xianzhi Du, Yanghao Li, Yongqiang Wang, Yuan Gao, Zaid Ahmed, Zhaoyang Xu, Zhiyun Lu, Al Rashid, Albin Madappally Jose, Alec Doane, Alfredo Bencomo, Allison Vanderby, Andrew Hansen, Ankur Jain, Anupama Mann Anupama, Areeba Kamal, Bugu Wu, Carolina Brum, Charlie Maalouf, Chinguun Erdenebileg, Chris Dulhanty, Daniel Parilla, Dominik Moritz, Doug Kang, Eduardo Jimenez, Evan Ladd, Fangping Shi, Felix Bai, Frank Chu, Fred Hohman, Hadas Kotek, Hannah Gillis Coleman, Jane Li, Jeffrey Bigham, Jeffery Cao, Jeff Lai, Jessica Cheung, Jiulong Shan, Joe Zhou, John Li, Jun Qin, Karanjeet Singh, Karla Vega, Kelvin Zou, Laura Heckman, Lauren Gardiner, Margit Bowler, Maria Cordell, Meng Cao, Nicole Hay, Nilesh Shahdadpuri, Otto Godwin, Pranay Dighe, Pushyami Rachapudi, Ramsey Tantawi, Roman Frigg, Sam Davarnia, Sanskruti Shah, Saptarshi Guha, Sasha Sirovica, Shen Ma, Shuang Ma, Simon Wang, Sulgi Kim, Suma Jayaram, Vaishaal Shankar, Varsha Paidi, Vivek Kumar, Xin Wang, Xin Zheng, Walker Cheng, Yael Shrager, Yang Ye, Yasu Tanaka, Yihao Guo, Yunsong Meng, Zhao Tang Luo, Zhi Ouyang, Alp Aygar, Alvin Wan, Andrew Walkingshaw, Andy Narayanan, Antonie Lin, Arsalan Farooq, Brent Ramerth, Colorado Reed, Chris Bartels, Chris Chaney, David Riazati, Eric Liang Yang, Erin Feldman, Gabriel Hochstrasser, Guillaume Seguin, Irina Belousova, Joris Pelemans, Karen Yang, Keivan Alizadeh Vahid, Liangliang Cao, Mahyar Najibi, Marco Zuliani, Max Horton, Minsik Cho, Nikhil Bhendawade, Patrick Dong, Piotr Maj, Pulkit Agrawal, Qi Shan, Qichen Fu, Regan Poston, Sam Xu, Shuangning Liu, Sushma Rao, Tashweena Heeramun, Thomas Merth, Uday Rayala, Victor Cui, Vivek Rangarajan Sridhar, Wencong Zhang, Wenqi Zhang, Wentao Wu, Xingyu Zhou, Xinwen Liu, Yang Zhao, Yin Xia, Zhile Ren, Zhongzheng Ren
Abstract:
We developed foundational language models to drive Apple Intelligence features. These include a compact ~3 billion parameter model optimized for on-device performance and a larger, server-based model built for Private Cloud Compute. Both are engineered to handle diverse tasks with high efficiency, precision, and responsibility. This report outlines the model architectures, training datasets, training procedures, inference optimization strategies, and evaluation findings. We also emphasize our commitment to Responsible AI and explain how these principles are integrated into every stage of model development.
Submission history
From: Daniel Parilla
[v1] Mon, 29 Jul 2024 18:38:49 UTC (19,292 KB)
[v2] Wed, 27 May 2026 04:08:47 UTC (19,292 KB)


