An exactly solvable model for emergence and scaling laws in the multitask sparse parity problem

Advances in Neural Information Processing Systems 37 (NeurIPS 2024) Curran Associates 37 (2024)

Authors:

Yoonsoo Nam, Nayara Fonseca, Sh Lee, Christopher Mingard, Ard A Louis

Abstract:

Deep learning models can exhibit what appears to be a sudden ability to solve a new problem as training time, training data, or model size increases, a phenomenon known as emergence. In this paper, we present a framework where each new ability (a skill) is represented as a basis function. We solve a simple multi-linear model in this skill-basis, finding analytic expressions for the emergence of new skills, as well as for scaling laws of the loss with training time, data size, model size, and optimal compute. We compare our detailed calculations to direct simulations of a two-layer neural network trained on multitask sparse parity, where the tasks in the dataset are distributed according to a power-law. Our simple model captures, using a single fit parameter, the sigmoidal emergence of multiple new skills as training time, data size or model size increases in the neural network.

Fluctuation Dissipation Relations for Active Field Theories

(2024)

Authors:

Martin Kjøllesdal Johnsrud, Ramin Golestanian

Bipartite Sachdev-Ye models with Read-Saleur symmetries

Physical Review B: Condensed Matter and Materials Physics American Physical Society 110:12 (2024) 125140

Authors:

Jonathan Classen-Howes, Paul Fendley, A Pandey, Siddharth Ashok Parameswaran

Abstract:

We introduce an $\mathrm{SU}(M)$-symmetric disordered bipartite spin model with unusual characteristics. Although superficially similar to the Sachdev-Ye (SY) model, it has several markedly different properties for $M \geq 3$. In particular, it has a large nontrivial nullspace whose dimension grows exponentially with system size. The states in this nullspace are frustration-free and are ground states when the interactions are ferromagnetic. The exponential growth of the nullspace leads to Hilbert-space fragmentation and a violation of the eigenstate thermalization hypothesis. We demonstrate that the commutant algebra responsible for this fragmentation is a nontrivial subalgebra of the Read-Saleur commutant algebra of certain nearest-neighbor models such as the spin-1 biquadratic spin chain. We also discuss the low-energy behavior of correlations for the disordered version of this model in the limit of a large number of spins and large $M$, using techniques similar to those applied to the SY model. We conclude by generalizing the Shiraishi-Mori embedding formalism to nonlocal models, and apply it to turn some of our nullspace states into quantum many-body scars.

Statistics of matrix elements of local operators in integrable models

Physical Review X American Physical Society 14:3 (2024) 031048

Authors:

Fabian Essler, Bart de Klerk

Abstract:

We study the statistics of matrix elements of local operators in the basis of energy eigenstates in a paradigmatic, integrable, many-particle quantum theory, the Lieb-Liniger model of bosons with repulsive delta-function interactions. Using methods of quantum integrability, we determine the scaling of matrix elements with system size. As a consequence of the extensive number of conservation laws, the structure of matrix elements is fundamentally different from, and much more intricate than, the predictions of the eigenstate thermalization hypothesis for generic models. We uncover an interesting connection between this structure for local operators in interacting integrable models and the one for local operators that are not local with respect to the elementary excitations in free theories. We find that typical off-diagonal matrix elements $\langle \boldsymbol{\mu} | \mathcal{O} | \boldsymbol{\lambda} \rangle$ in the same macrostate scale as $\exp\left(-c^{\mathcal{O}} L \ln(L) - L M^{\mathcal{O}}_{\boldsymbol{\mu},\boldsymbol{\lambda}}\right)$, where the probability distribution function for $M^{\mathcal{O}}_{\boldsymbol{\mu},\boldsymbol{\lambda}}$ is well described by Fréchet distributions and $c^{\mathcal{O}}$ depends only on macrostate information. In contrast, typical off-diagonal matrix elements between two different macrostates scale as $\exp\left(-d^{\mathcal{O}} L^{2}\right)$, where $d^{\mathcal{O}}$ depends only on macrostate information. Diagonal matrix elements depend only on macrostate information up to finite-size corrections.