Ethereum's Buterin Highlights Breakthroughs in AI-Powered Local Data Analysis, Citing Enhanced Anonymity Capabilities for the Network

Ethereum co-founder Vitalik Buterin has stated that advances in local AI, particularly DeepSeek V4, can strengthen Ethereum privacy tools. Buterin shared that the model’s 2-bit quantized version runs within 90 GB of VRAM. He noted performance differences across hardware, with Apple devices reaching about 35 tokens per second. He also drew a direct connection between local AI infrastructure and Ethereum’s privacy layer, calling for Ethereum-specific AI model development.

Buterin confirmed that DeepSeek V4 now has a 2-bit quantized version available for local use. The model runs within 90 GB of VRAM, making it accessible on consumer hardware. This marks a step forward for users who want AI tools that operate without third-party servers.

Performance, however, depends heavily on the hardware in use. Apple devices deliver around 35 tokens per second, while AMD hardware runs at roughly 7 tokens per second. Buterin noted this gap as a concern worth addressing for the broader local AI movement. He posted on X, writing: “IMO actually taking the effort to properly support more than one hardware manufacturer is a great example of the difference between mere ‘decentralized AI’ and genuine ‘CROPS AI’.”

The post, as embedded: "Updates since then: * Deepseek v4 is out. There *is* a 2-bit quant that can run within 90 GB ( https://t.co/X3AFAsiH02 ), and it works, however it's only fast on Apple hardware (I've head ~35 tok/s). On AMD, it's ~7 tok/s. IMO actually taking the effort to properly support more… https://t.co/zo04n5Cx0F" (vitalik.eth, @VitalikButerin, May 27, 2026)

Buterin also highlighted LuceBox Hub as a useful tool for running dense models more efficiently. On his RTX 5090 laptop, it produced roughly twice the tokens per second compared to llama.cpp. He described it as promising, though still in early development.

Buterin pointed out that CROPS AI and the CROPS Ethereum access layer share key technical ground.
Zero-knowledge proofs, for instance, could enable paid calls to remote large language models. That same ZK infrastructure, he noted, also supports private RPC reads on Ethereum. This overlap means progress in local AI development feeds directly into Ethereum privacy tooling. Rather than building these systems in isolation, Buterin sees them as naturally connected efforts. Shared infrastructure reduces duplication and accelerates both tracks simultaneously.

He also referenced Leanstral, a finetuned Mistral model built for writing Lean code. It fits within 70 GB and runs at around 38 tokens per second on AMD hardware. Buterin noted it performs competitively against much larger one-trillion-parameter models on that specific task.

From there, he made the case for Ethereum-specific finetuned models. Such models, he argued, would directly improve smart contract and protocol code security. He connected this point to his broader push for formal verification in Ethereum development, linking to a recent post on his personal site.
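
To put the throughput figures quoted in this article in perspective, the short Python sketch below converts tokens per second into approximate wait time for a single response. The three rates come from the article; the 700-token response length and the function names are illustrative assumptions, and the calculation ignores prompt processing and model load time, so real-world latency would be higher.

```python
# Throughput figures quoted in the article, in tokens per second.
QUOTED_RATES = {
    "DeepSeek V4 2-bit, Apple": 35.0,
    "DeepSeek V4 2-bit, AMD": 7.0,
    "Leanstral, AMD": 38.0,
}


def seconds_to_generate(num_tokens: int, tokens_per_second: float) -> float:
    """Return the seconds needed to emit num_tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


def compare(num_tokens: int) -> dict[str, float]:
    """Map each quoted configuration to its generation time in seconds."""
    return {
        name: seconds_to_generate(num_tokens, rate)
        for name, rate in QUOTED_RATES.items()
    }


if __name__ == "__main__":
    # 700 tokens is a hypothetical response length chosen for illustration.
    for name, secs in compare(700).items():
        print(f"{name}: {secs:.1f} s")
```

Under these assumptions, the same 700-token DeepSeek V4 response takes about 20 seconds at the Apple rate and about 100 seconds at the AMD rate, a fivefold difference that illustrates the hardware gap Buterin flagged.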