Algorithmic Compression via Pretrained Neural Networks

Marcus Hutter describes this new paper as “a short recap of the last ~15 publications over 5 years of the Universal Artificial Intelligence team” (at Google DeepMind): https://www.mdpi.com/1099-4300/28/6/596

Abstract:

The success of large neural networks trained for sequential prediction via log-loss minimization over massive and diverse datasets has sparked debate regarding the fundamental limits of this paradigm. While these models are not explicitly programmed to perform planning and search, their behavior increasingly resembles complex reasoning and adaptive problem-solving. This paper reviews a series of theoretical and empirical works, aiming to bridge the gap between the practical success of LLMs and formal theories of computation and intelligence—that is, algorithmic information theory and Universal Artificial Intelligence. Grounded in the framework of memory-based meta-learning, the main argument is that training sequence models to predict the next token across diverse tasks implicitly meta-trains them to perform algorithmic compression, thereby performing (amortized) Bayesian inference over the task in-context. Consequently, when pretrained on a sufficiently rich data distribution, the resulting neural networks behave as if compressing by inferring the generative algorithm producing the observed data. We discuss recent theoretical and empirical evidence demonstrating that this approach can approximate Solomonoff induction in the theoretical limit, match exact Bayesian inference on complex sources in practice, achieve strong compression on out-of-distribution data, and synthesize complex in-context algorithms like chessboard evaluations. As models become more capable and general, the theoretical understanding through the lens of algorithmic information theory, including hard theoretical limits and how far practical models are from them, becomes increasingly relevant. We thus conclude our paper by outlining a number of open research questions to further bridge the gap from well-understood theory to modern machine learning practice.
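As a toy illustration of the link the abstract draws between log-loss, compression, and Bayesian inference, here is a minimal sketch. It is not code from the paper. It assumes a tiny hypothesis class of five Bernoulli coins with a uniform prior, and the function name and example sequence are made up for illustration. The cumulative log-loss of the Bayesian mixture, measured in bits, is the code length an arithmetic coder driven by that predictor would approach. With a uniform prior over K hypotheses, that code length exceeds the best single hypothesis in hindsight by at most log2(K) bits.

```python
import math

def bayes_mixture_code_length(bits, biases):
    """Sequentially predict a binary sequence with a Bayesian mixture.

    Returns (mixture_bits, best_single_bits, posterior), where the first two
    are cumulative log-losses in bits (i.e. idealized code lengths).
    The hypothesis class (a handful of fixed coin biases) is an assumption
    of this sketch, not something taken from the paper.
    """
    k = len(biases)
    weights = [1.0 / k] * k            # uniform prior over hypotheses
    mixture_bits = 0.0
    single_bits = [0.0] * k
    for x in bits:
        # Probability each hypothesis assigns to the observed symbol.
        likes = [b if x == 1 else 1.0 - b for b in biases]
        # Mixture prediction = prior-weighted average of the hypotheses.
        p_mix = sum(w * l for w, l in zip(weights, likes))
        mixture_bits += -math.log2(p_mix)
        for i, l in enumerate(likes):
            single_bits[i] += -math.log2(l)
        # Bayes update: the posterior becomes the prior for the next symbol.
        weights = [w * l / p_mix for w, l in zip(weights, likes)]
    return mixture_bits, min(single_bits), weights

# Hypothetical data: a fixed pattern with 80% ones, 100 symbols long.
sequence = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1] * 10
biases = [0.1, 0.3, 0.5, 0.7, 0.9]
mix, best, posterior = bayes_mixture_code_length(sequence, biases)
print(f"mixture: {mix:.2f} bits, best single coin: {best:.2f} bits")
print(f"regret bound log2(K): {math.log2(len(biases)):.2f} bits")
```

The predictor is never told which coin generated the data; it infers this in-context from the symbols seen so far, and its compression rate improves as the posterior concentrates. The paper's argument concerns far richer hypothesis classes and neural networks that only approximate this update, so the sketch shows the principle rather than the method.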
