I liked reading this article. It’s cool to really poke into the hidden complexity behind patterns of ‘thought’ in LLMs. They aren’t merely simple ‘autocomplete’.

The findings that Claude does math differently than it says it does, and that it can anticipate words ahead of generation time, are fascinating.