A developer reverse-engineered the Apple Neural Engine's private APIs, achieving neural-network training on the ANE for the first time.
According to ME News, on March 3 (UTC+8), developer Manjeet Singh (GitHub: maderix), working with Claude Opus, reverse engineered Apple's undocumented private APIs and achieved the first neural-network training implementation, including backpropagation, on the Apple Neural Engine (ANE) of the M4 chip.

The ANE is an accelerator Apple designed specifically for inference; Apple has never exposed training capabilities, and developers can normally reach its inference functionality only indirectly through the CoreML framework. This project bypasses CoreML entirely, mapping the complete software stack from more than 40 private classes such as `_ANEClient` and `_ANECompiler` down to the IOKit kernel driver. The author also discovered the `_ANEInMemoryModelDescriptor` interface, which allows models to be compiled directly in memory, a crucial capability for training, since every weight update requires recompilation.

Training of a single transformer layer (dim=768, seq=512) is currently implemented, with a step time of 9.3 ms on an M4 processor. ANE utilization is 11.2% (1.78 TFLOPS against a theoretical peak of 15.8 TFLOPS). The forward pass and the backward-pass input gradients are computed on the ANE, while the weight gradients and the Adam optimizer run on the CPU.

The project also found that the ANE's core computational primitive is convolution, not matrix multiplication. Expressing matrix multiplication as 1x1 convolution yields roughly a 3x throughput improvement, and bypassing CoreML with direct calls adds a further 2-4x gain. The author argues that Apple's official "38 TOPS" figure is misleading in this context.

The project is still at an early stage: it supports only single-layer training, uses synthetic data, and leaks roughly 119 resources per compilation, which currently forces a process restart to work around. Multi-layer training and real-data support are under development.
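The 1x1-convolution trick can be illustrated in plain NumPy. This is a general sketch of the technique, not the project's code; the dimensions mirror the article's transformer layer, and the mapping (matrix columns as spatial positions, rows as channels) is one standard way to do it:

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, N = 768, 768, 512                 # mirrors the article's dim=768, seq=512
A = rng.standard_normal((M, K))
B = rng.standard_normal((K, N))

# View B as a K-channel feature map of spatial size 1 x N,
# and A as M output filters of shape (K, 1, 1).
x = B.reshape(K, 1, N)                  # (C_in, H, W)
w = A.reshape(M, K, 1, 1)               # (C_out, C_in, kH, kW)

# A 1x1 convolution is exactly a per-pixel matmul over the channel dim:
out = np.einsum('okij,khw->ohw', w, x)  # (M, 1, N)

assert np.allclose(out.reshape(M, N), A @ B)
```

On hardware whose native primitive is convolution, this reshaping routes a matmul-heavy workload through the convolution engine; the article reports roughly 3x throughput from it on the ANE.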
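The CPU side of the split, applying Adam to the weight gradients, is the standard Adam update. A textbook sketch (not the project's implementation; the quadratic objective below is only a smoke test):

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for weights w given gradient g.
    m, v are running first/second moment estimates; t is the 1-based step."""
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)           # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Smoke test: minimize ||w||^2 for a dim=768 weight vector.
w = np.ones(768)
m, v = np.zeros_like(w), np.zeros_like(w)
for t in range(1, 101):
    g = 2 * w                           # gradient of ||w||^2
    w, m, v = adam_step(w, g, m, v, t, lr=1e-2)
```

Keeping this step on the CPU while the ANE produces the activations and input gradients matches the division of labor the article describes.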
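The reported utilization figure is easy to sanity-check from the article's own numbers (achieved versus theoretical-peak throughput):

```python
achieved_tflops = 1.78        # measured throughput from the article
peak_tflops = 15.8            # theoretical ANE peak cited in the article
step_time_s = 9.3e-3          # reported per-step time

utilization = achieved_tflops / peak_tflops            # ~0.113, i.e. the reported ~11.2%
flops_per_step = achieved_tflops * 1e12 * step_time_s  # ~16.6 GFLOP per training step
```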
The project is open-source under the MIT license and has received approximately 2800 stars in 5 days. (Source: ME)