The higher the LLM layer, the more its attention concentrates on a few key tokens. Therefore, if after a few iterations of the low-level operations the computation can detach from those operations and run only the high-layer loop, the solver can be expressed as a question-specific LLM. This idea aligns closely with the ultimate computing architecture I envisioned in dreams. With the same computing power, at a higher level of abstraction, cognition could be improved by 3-6 orders of magnitude, equivalent to 40 years of Moore's Law. PyramidKV marks the beginning of vast potential for innovation in this field.
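The layer-wise concentration of attention can be made concrete with a toy sketch. This is not the PyramidKV implementation; the linear budget schedule, the function names, and the random scores standing in for per-token attention mass are all illustrative assumptions. It shows only the core idea: give lower layers a full KV cache and shrink the budget toward the top, keeping the tokens with the highest attention mass.

```python
# Toy sketch of a pyramid-shaped KV-cache budget (illustrative only,
# not the actual PyramidKV algorithm).
import numpy as np

def pyramid_budgets(num_layers: int, seq_len: int,
                    bottom_frac: float = 1.0, top_frac: float = 0.1):
    """Linearly shrink the per-layer KV-cache budget from bottom to top."""
    fracs = np.linspace(bottom_frac, top_frac, num_layers)
    return [max(1, int(seq_len * f)) for f in fracs]

def select_kv(attn_scores: np.ndarray, budget: int) -> np.ndarray:
    """Keep the indices of the `budget` tokens with the most attention mass."""
    order = np.argsort(attn_scores)[::-1]   # highest score first
    return np.sort(order[:budget])          # restore positional order

rng = np.random.default_rng(0)
seq_len, num_layers = 100, 8
budgets = pyramid_budgets(num_layers, seq_len)
for budget in budgets:
    scores = rng.random(seq_len)            # stand-in for attention mass
    kept = select_kv(scores, budget)
    assert len(kept) == budget
print(budgets[0], budgets[-1])              # full cache at the bottom, small at the top
```

The pyramid schedule is the point: memory and compute scale with the kept budget per layer, so concentrating the savings in the upper layers is where the claimed gains would come from.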
This computing architecture has appeared in my dreams many times, and I feel it is becoming real. It involves generating a general, question-specific graph and training a very small network in real time.
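A purely illustrative sketch of the "train a very small network in real time" part of this vision, assuming nothing about the author's actual design: distil the behaviour of a fixed, expensive solver on one question's inputs into a tiny linear model, then answer from the small model alone.

```python
# Toy distillation sketch (hypothetical; `big_solver` is a stand-in for
# the expensive high-layer loop, here just a fixed linear map).
import numpy as np

rng = np.random.default_rng(1)

def big_solver(x: np.ndarray) -> np.ndarray:
    """Stand-in for the expensive solver being distilled."""
    W = np.array([[2.0, -1.0], [0.5, 3.0]])
    return x @ W.T

# Collect a handful of (input, output) pairs for this specific question.
X = rng.normal(size=(32, 2))
Y = big_solver(X)

# "Tiny network": a least-squares linear fit, trained essentially instantly.
W_small, *_ = np.linalg.lstsq(X, Y, rcond=None)

# The small model now answers new inputs without invoking big_solver.
x_new = np.array([[1.0, 2.0]])
assert np.allclose(x_new @ W_small, big_solver(x_new), atol=1e-6)
```

Because the stand-in solver is exactly linear, the fit is exact; the real vision presumably involves a nonlinear small network and a genuinely expensive teacher, which this sketch does not attempt.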
Inspired by human cognitive processes, I believe an overlooked direction is to emphasize problem-solving at a conceptual level rather than repetitive, low-level processing.
doptime changed the title from "Congratulation on the greate work & hope for future versions" to "Congratulation on the great work & hope for future versions" on Jun 13, 2024
The way out for LLMs, from an information and statistics perspective
a) Our agency to transform the world comes from a kind of posterior insight, and that posterior must be grounded in everything in the present: the conditions both inside and outside the system. For a solver this means, first, that the conditions are massive in number and, second, that the reasoning is deep. Under these circumstances, the solver can never treat the posterior solver as prior knowledge. Deep reasoning practice can mitigate this, but near-zero reasoning is impossible.
b) The difficulty of optimizing such a posterior solver lies in the sparsity of the conditions, and even more in the enormous number of possible sparse combinations, as Herbert Simon's bounded rationality illustrates well. Sparsity forces cognition to always retain fresh possibilities. As in traditional engineering, optimization is never something that can be planned in advance, yet it always works like magic after the fact; how far asking "what else can we do?" pushes optimization often astonishes us. At the same time, the sparsity of relations means that detailed cognition and evaluation are usually possible.
Conclusion: if we pursue a very powerful artificial super intelligence, then bigger models are very likely not what is required; deeper reasoning training is. Deeper reasoning means introducing a far more efficient architecture, with much greater reasoning depth, that supports long thinking.