OpenMythos: An Open-Source Attempt to Reverse-Engineer Claude's Internal Structure Using Only Public Papers

→ This is a theoretical reconstruction project that rebuilds the Claude "Mythos" architecture from scratch using only publicly available research literature.

→ The core hypothesis is that Mythos is a Recurrent-Depth Transformer (Looped Transformer), which runs the same layers multiple times.

→ Unlike Chain-of-Thought, which emits intermediate tokens, the iterative reasoning happens silently in latent space within a single forward pass.

→ According to the author, looping provides depth, while MoE (Mixture of Experts) provides breadth across domains.

→ Alongside the PyTorch implementation, the repository also documents supporting ideas such as stability proofs, scaling laws, and loop index embeddings.
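
To make the MoE idea in the summary above concrete, here is a minimal top-k routing sketch in plain Python. It is an illustration only, not code from the repository: the function names, the toy scalar "experts", and the choice to pass router logits in directly (a real router would compute them from the input) are all assumptions.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [v / total for v in exps]

def moe_forward(x, router_logits, experts, top_k=2):
    """Send x to the top_k highest-scoring experts and mix their outputs.

    router_logits are supplied by the caller here (assumption); a real
    router would compute them from x with learned weights.
    """
    probs = softmax(router_logits)
    # Indices of the top_k experts, highest probability first.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    # Renormalize so the selected experts' weights sum to 1.
    norm = sum(probs[i] for i in chosen)
    output = sum(probs[i] / norm * experts[i](x) for i in chosen)
    return output, chosen

# Four toy "experts"; only two of them run for any given input.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
output, chosen = moe_forward(3.0, [0.0, 2.0, 2.0, -1.0], experts, top_k=2)
```

Only the selected experts are evaluated, which is how MoE adds capacity across domains without running every parameter on every token.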

How It Differs from Existing Transformers

Existing Transformers gain depth by stacking dozens to hundreds of distinct layers in series. The Looped Transformer reconstructed by OpenMythos instead splits the model into three blocks: Prelude (input encoding) → Recurrent Block (iterative execution) → Coda (output finalization). The middle Recurrent Block is run multiple times with the same weights. The harder the problem, the more loops can be run, which lets the model reason more deeply without adding layers.
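
The three-stage flow described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the repository's PyTorch code: `prelude`, `recurrent_block`, `coda`, and `looped_forward` are hypothetical names, and each stage is a trivial stand-in for a real network.

```python
from typing import List, Tuple

Vector = List[float]

def prelude(tokens: List[int], dim: int = 4) -> Vector:
    """Toy input encoder: bucket token ids into a fixed-size vector."""
    e = [0.0] * dim
    for pos, tok in enumerate(tokens):
        e[(tok + pos) % dim] += 1.0
    return e

def recurrent_block(h: Vector, e: Vector) -> Vector:
    """Toy shared block: the same function (same 'weights') runs every loop."""
    return [0.5 * hi + 0.5 * ei for hi, ei in zip(h, e)]

def coda(h: Vector) -> int:
    """Toy output head: index of the largest component."""
    return max(range(len(h)), key=lambda i: h[i])

def looped_forward(tokens: List[int], n_loops: int) -> Tuple[int, Vector]:
    e = prelude(tokens)            # encode the input once
    h = [0.0] * len(e)             # initial hidden state
    for _ in range(n_loops):       # reuse one block n_loops times
        h = recurrent_block(h, e)
    return coda(h), h              # decode once at the end

label_shallow, h_shallow = looped_forward([1, 2, 3], n_loops=1)
label_deep, h_deep = looped_forward([1, 2, 3], n_loops=4)
```

The point of the sketch is that `n_loops` is a runtime argument: the amount of computation changes while the set of weights stays fixed.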

Key Update Rule

In each loop, the hidden state is updated as h_{t+1} = A·h_t + B·e + Transformer(h_t, e), where h_t is the hidden state after t loops, e is the original (encoded) input, and A and B weight the previous state and the input respectively. The key point is that e is re-injected in every loop. Without this injection, the original signal would fade as the number of loops grows.
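
A scalar toy version of the update rule above shows why input injection matters. Everything here is an assumption made for illustration: A and B are plain numbers rather than matrices, their values (0.9 and 0.1) are arbitrary, and `block` is a small bounded nonlinearity standing in for Transformer(h_t, e).

```python
import math

def block(h: float, e: float) -> float:
    """Stand-in for Transformer(h_t, e): a small bounded nonlinearity (assumption)."""
    return 0.05 * math.tanh(h + e)

def run_loops(e: float, n_loops: int, A: float = 0.9, B: float = 0.1,
              inject: bool = True) -> float:
    """Iterate h_{t+1} = A*h_t + B*e + block(h_t, e), with or without injection."""
    h = e  # start from the encoded input
    for _ in range(n_loops):
        if inject:
            h = A * h + B * e + block(h, e)
        else:
            # No injection: after the first step the input is never seen again.
            h = A * h + block(h, 0.0)
    return h

# Two different inputs, many loops.
with_a, with_b = run_loops(1.0, 200), run_loops(-2.0, 200)
without_a = run_loops(1.0, 200, inject=False)
without_b = run_loops(-2.0, 200, inject=False)
```

With injection, the two inputs settle at clearly different states, so the hidden state still carries information about e. Without it, both runs decay toward zero and become indistinguishable. Because |A| < 1 and the nonlinearity is small in this toy, each step is a contraction, which is also why the loop does not blow up.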

Why Mythos Is Presumed to Have This Structure

The author gives four reasons. First, Looped Transformers pass tests of systematic generalization, meaning they handle combinations never seen during training. Second, they show depth extrapolation: a model trained on 5-hop reasoning can solve 10-hop problems when the number of loops is increased at inference time. Third, each loop corresponds to one CoT step in continuous latent space, a result formally proven by Saunshi et al. (2025). Fourth, running k layers L times yields quality similar to that of a kL-layer model, so depth can be gained without a parameter explosion.
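
The fourth point is easy to check with rough arithmetic. The sketch below assumes the common approximation of about 12·d² parameters per Transformer layer (4·d² for the attention projections plus 8·d² for a 4x-wide MLP, ignoring biases, norms, and embeddings). The model sizes are made-up examples, not figures from the repository.

```python
def layer_params(d_model: int) -> int:
    """Approximate parameters in one Transformer layer: 4*d^2 attention + 8*d^2 MLP."""
    return 12 * d_model * d_model

def compare(d_model: int, k: int, L: int):
    """Parameters for k layers looped L times vs. k*L distinct stacked layers."""
    looped = k * layer_params(d_model)        # weights are shared across loops
    stacked = k * L * layer_params(d_model)   # every layer has its own weights
    return looped, stacked

looped, stacked = compare(d_model=1024, k=4, L=8)
print(looped, stacked)      # 50331648 402653184
print(stacked // looped)    # 8
```

Both variants have an effective depth of 32 layers, so the compute per forward pass is comparable. What looping saves is parameters, here by a factor of L.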

Note

This repository is strictly a theoretical reconstruction based on public literature; whether Anthropic actually built Mythos this way has not been confirmed. The repository is MIT-licensed and includes PyTorch example code and API documentation. To run it, you need to choose an attention type (mla or gqa) and set up a MythosConfig.
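
As a rough illustration of the configuration step mentioned above, the sketch below uses a hypothetical `ToyMythosConfig` dataclass. It is not the repository's `MythosConfig`: apart from the two attention types named in this post (mla, gqa), every field name and default is an assumption, so check the repository's API documentation for the real options.

```python
from dataclasses import dataclass

ALLOWED_ATTN_TYPES = ("mla", "gqa")  # the two options named in the post

@dataclass(frozen=True)
class ToyMythosConfig:
    """Hypothetical stand-in; NOT the repository's MythosConfig."""
    attn_type: str = "gqa"     # assumed field name
    dim: int = 256             # assumed field name and default
    max_loop_iters: int = 4    # assumed field name and default

    def __post_init__(self):
        # Fail early on unsupported attention types or nonsensical sizes.
        if self.attn_type not in ALLOWED_ATTN_TYPES:
            raise ValueError(
                f"attn_type must be one of {ALLOWED_ATTN_TYPES}, got {self.attn_type!r}"
            )
        if self.dim <= 0 or self.max_loop_iters <= 0:
            raise ValueError("dim and max_loop_iters must be positive")

cfg = ToyMythosConfig(attn_type="mla", max_loop_iters=8)
```

Validating the attention type when the config is created surfaces a typo immediately instead of deep inside model construction.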

#LoopedTransformer #ClaudeMythos #MoE #AIArchitecture #OpenSource
