This OPEN-SOURCE Chip is Faster Than a GPU (And CHEAPER!) | Tenstorrent Chips Explained

Jim Keller's Revolutionary AI Chip

Introduction to Jim Keller and His Vision

  • Jim Keller, known for his work on Apple's iPhone chips and at AMD, claims, "Whatever Nvidia does, we'll do the opposite."
  • He has developed a chip that is said to outperform Nvidia's best inference system at about a fifth of the cost.

The Unique Architecture of Keller's Chip

  • Keller rejected the assumptions behind Nvidia's hardware and software architectures and started from scratch.
  • The chip is built on an open-source architecture, aiming to cut the server costs of a market dominated by Nvidia.

Insights into GPU Limitations

  • Modern GPUs carry significant overhead from hardware schedulers and memory management units; less than half of the chip is used for actual computation.
  • AI workloads are predictable, allowing for a design that eliminates unnecessary hardware components.

Innovative Data Management Approach

  • The compiler plans data movement ahead of time instead of relying on dedicated hardware, fundamentally changing processor design.
  • Each core in the chip has its own memory and instructions, so it does not sit idle waiting for other cores.
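
The idea of a compiler deciding all data movement in advance can be illustrated with a toy model. This is a minimal sketch, not Tenstorrent's compiler: the `Step` record, the round-robin assignment, and the three-step load/compute/send pattern are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    cycle: int  # when the step runs, fixed at compile time
    core: int   # which core executes it
    op: str     # "load", "compute", or "send"
    tile: str   # name of the data tile involved (hypothetical)


def compile_schedule(tiles, num_cores):
    """Assign tiles to cores round-robin and emit a fixed plan.

    Each tile gets load -> compute -> send on its core, and the cycle
    numbers are chosen here rather than by a runtime scheduler.
    """
    next_free = [0] * num_cores  # next free cycle on each core
    plan = []
    for i, tile in enumerate(tiles):
        core = i % num_cores
        for op in ("load", "compute", "send"):
            plan.append(Step(next_free[core], core, op, tile))
            next_free[core] += 1
    return sorted(plan, key=lambda s: (s.cycle, s.core))


def run(plan):
    """Replay the plan in order; nothing is decided at run time."""
    return [f"cycle {s.cycle}: core {s.core} {s.op} {s.tile}" for s in plan]


plan = compile_schedule(["A0", "A1", "A2", "A3"], num_cores=2)
log = run(plan)
```

Because each core only follows its own precomputed list, no core has to ask a central scheduler what to do next, which is the property the summary attributes to the design.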

Cost Efficiency Through Design Choices

Memory Strategy

  • Unlike Nvidia’s expensive HBM memory, Keller opted for standard GDDR6 memory found in gaming GPUs.
  • This choice reduces costs while leveraging software prefetching to optimize data access without needing high bandwidth.
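
Why prefetching can compensate for slower memory is easier to see with numbers. The sketch below is a simplified timing model with made-up load and compute times, not measured data: it compares loading each tile and then computing on it against fetching the next tile while the current one is being processed.

```python
def time_without_prefetch(load, compute):
    """Each tile is loaded, then computed; the core idles during loads."""
    return sum(l + c for l, c in zip(load, compute))


def time_with_prefetch(load, compute):
    """Fetch tile i+1 while computing tile i (double buffering).

    The first load cannot be hidden. After that, each stage lasts as long
    as the slower of: compute on the current tile, load of the next tile.
    """
    total = load[0]
    for i, c in enumerate(compute):
        next_load = load[i + 1] if i + 1 < len(load) else 0
        total += max(c, next_load)
    return total


# Hypothetical per-tile times in arbitrary units.
load = [4, 4, 4, 4]
compute = [6, 6, 6, 6]

serial = time_without_prefetch(load, compute)
overlapped = time_with_prefetch(load, compute)

# If loads are slower than compute, memory becomes the bottleneck again.
slow_memory = time_with_prefetch([8, 8, 8, 8], compute)
```

In this model the loads are fully hidden as long as each load is no longer than the compute it overlaps with. Once loads take longer than compute, total time is set by memory again, which is the bandwidth concern raised for larger models.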

Addressing Bandwidth Challenges

  • While low bandwidth can be an issue with larger models, the architecture was designed with scaling in mind from the outset.

Scaling Solutions for Data Centers

Networking Innovations

  • Traditional interconnects like NVLink add cost and complexity; Keller integrated 400 gigabit-per-second (Gb/s) Ethernet directly into each chip.
  • This allows multiple chips to function as a unified system rather than separate entities.
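
If every chip is both a processor and a router, routes between chips can be computed in software ahead of time. The sketch below assumes a 4 x 8 grid of 32 chips and simple x-then-y (dimension-order) routing; both the layout and the routing rule are illustrative assumptions, not a description of Tenstorrent's actual topology.

```python
def xy_route(src, dst):
    """Return the grid coordinates visited going from src to dst.

    Moves along x first, then along y. Every intermediate chip on the
    path simply forwards the data to its neighbour.
    """
    (x, y), (dx, dy) = src, dst
    path = [(x, y)]
    while x != dx:
        x += 1 if dx > x else -1
        path.append((x, y))
    while y != dy:
        y += 1 if dy > y else -1
        path.append((x, y))
    return path


def build_route_table(cols, rows):
    """Precompute a route for every ordered pair of distinct chips."""
    chips = [(x, y) for y in range(rows) for x in range(cols)]
    return {(s, d): xy_route(s, d) for s in chips for d in chips if s != d}


COLS, ROWS = 8, 4  # assumed layout for 32 chips
routes = build_route_table(COLS, ROWS)
longest_hops = max(len(p) - 1 for p in routes.values())
```

With the table built once up front, sending data from one chip to another becomes a lookup rather than a runtime negotiation with a separate switch.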

Performance Metrics

  • Community benchmarks on DeepSeek R1 671B reportedly reach 350 tokens per second, at a much lower cost than Nvidia ($6 vs. $30 per million tokens).
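
The quoted figures can be turned into a quick back-of-the-envelope calculation. The numbers below are the ones cited above; the helper names are made up for this sketch, and the 250,000-token workload is an arbitrary example.

```python
TOKENS_PER_SECOND = 350          # quoted throughput
USD_PER_MILLION_TENSTORRENT = 6.0
USD_PER_MILLION_NVIDIA = 30.0


def cost_usd(tokens, usd_per_million):
    """Cost of generating `tokens` tokens at a per-million-token price."""
    return tokens / 1_000_000 * usd_per_million


def seconds_to_generate(tokens, tokens_per_second):
    """Time to generate `tokens` tokens at a steady throughput."""
    return tokens / tokens_per_second


cost_ratio = USD_PER_MILLION_NVIDIA / USD_PER_MILLION_TENSTORRENT
minutes_per_million = seconds_to_generate(1_000_000, TOKENS_PER_SECOND) / 60

# Example workload (arbitrary size): 250,000 tokens on each platform.
example_tenstorrent = cost_usd(250_000, USD_PER_MILLION_TENSTORRENT)
example_nvidia = cost_usd(250_000, USD_PER_MILLION_NVIDIA)
```

At the quoted prices the ratio is 5x, and at 350 tokens per second a million tokens takes a little under 48 minutes on a single stream.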

Barriers to Adoption

Software Compatibility Concerns

  • Tenstorrent claims that 90% of existing Hugging Face models run out of the box, but enterprise clients require absolute certainty before making large investments.

Jim Keller's Track Record

  • Keller has a history of leaving projects just before they reach their peak success; this raises concerns about his long-term commitment to Tenstorrent.

Future Prospects

Open Source Momentum

  • The open-source nature of Tenstorrent's software fosters community-driven improvements faster than traditional proprietary systems could achieve.

Leadership Dynamics

  • For the first time, Jim Keller is not just an architect but also the CEO, which may signal a shift in his approach towards building lasting success.

Video description

The AI Chip Nvidia Hates: Jim Keller's Tenstorrent Masterpiece

Jim Keller has spent four years building an open-source AI chip that beats Nvidia's best inference system at a fifth of the cost. Starting from scratch, Keller completely rejected Nvidia's hardware and software architectures. With AI tools burning through billions in electricity and Nvidia controlling the market, Tenstorrent aims to fix this massive cost problem.

The Hardware Revolution

Modern GPUs are burdened by hidden overheads, like hardware schedulers and memory management units, meaning less than half the chip actually computes. Keller realized AI math is perfectly predictable, eliminating the need for dedicated hardware to manage unpredictability. Tenstorrent removed all traffic controllers and schedulers, shifting data management entirely to software. The open-source compiler maps the entire data journey before the chip turns on. The architecture uses independent RISC-V Tensix cores, each with local SRAM, so cores never wait for each other.

Memory & Networking Solutions

Instead of using expensive High Bandwidth Memory (HBM), Tenstorrent uses standard GDDR6 bulk memory. When scaling datacenters, standard chips spend too much time communicating, forcing reliance on expensive NVLink or complex InfiniBand switches. Tenstorrent solved this through unified memory and networking:

  • Without hardware schedulers, the freed-up space was filled with 200 megabytes of on-die SRAM.
  • The compiler pre-fetches exactly what each core needs from the GDDR6 pool to prevent bottlenecks.
  • Tenstorrent baked 400 gigabit-per-second Ethernet directly into every Blackhole chip, making each simultaneously a processor and a router.
  • Software pre-maps data movement across the entire network cluster.
  • 32 chips in a Galaxy server act as one unified brain, and 36 servers linked together act as a massive supercomputer without bottlenecks.

AI Performance & Cost

On complex models like DeepSeek R1 671 Billion, community benchmarks show this architecture easily pushes 350 tokens per second. Crucially, it costs just $6 per million tokens compared to Nvidia's $30, delivering the same performance at one fifth of the cost.

The Industry Challenge

If the hardware is this efficient, why isn't it in every datacenter?

The Software Gap: Tenstorrent claims 90% of Hugging Face models run out of the box, but enterprise buyers (like hospitals or banks) demand absolute certainty. Furthermore, Blackhole software is still catching up to the optimizations of the previous-generation Wormhole chips.

Jim Keller's Track Record: Keller laid the foundations for AMD's Zen architecture, Apple's A-series chips, and Tesla's Full Self-Driving silicon, but historically leaves before projects fully mature. At four years into Tenstorrent, enterprise buyers fear he might walk away.

However, because Tenstorrent is fully open-source, the global community is accelerating software fixes faster than isolated engineering teams. More importantly, Keller isn't just a hired architect this time; he is the CEO building his own future.

By the way, Nvidia recently paid twenty billion dollars to make another competitor disappear. Watch this video to find out what scared them!

SEO Tags & Keywords: Jim Keller, Tenstorrent, Nvidia GPU AI Chips, Open Source Architecture, RISC-V, S-RAM vs HBM, High Bandwidth Memory, GDDR6 GPU, NV-Link Alternative, InfiniBand Switches, Blackhole Chip, DeepSeek R1 671 Billion, Enterprise AI Servers, Hugging Face Models, Wormhole Chips, AMD Zen Architecture, Apple A-Series Chips, Tesla Full Self Driving Silicon.