SuperGemma-4 (26B) UNCENSORED + Hermes, OpenClaw, OpenCode: THIS IS SO CRAZY!!!
Introduction to Super Gemma 4
Overview of Super Gemma 4
- The video introduces Super Gemma 4, a community fine-tuned version of Google's Gemma 4, aimed at local power users.
- This specific release is the uncensored MLX 4-bit V2 by Jun Song on Hugging Face and is not an official Google product.
Features and Improvements
- Super Gemma 4 is designed for less restricted local model use, enhancing usability for agent workflows compared to the stock version.
- The original Gemma 4 has strong features like native system prompt support and function calling but lacks the openness desired by some users.
Performance Metrics
Benchmarks and Claims
- The creator claims that Super Gemma 4 offers improved performance with a benchmark score of 95.8 versus the original's 91.4.
- The creator also reports an average generation speed of 46.2 tokens per second, with claimed gains across tasks such as coding and logic.
Practical Usability
- Unlike many uncensored fine-tunes that sacrifice coherence, this version aims to stay practically useful while being more open.
Setup Instructions
Installation Process
- To use Super Gemma 4 on Apple silicon, install MLX-LM via pip and start the local server with specific commands provided in the video.
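Once the server is running, it can be queried like any OpenAI-compatible endpoint. The sketch below assumes mlx-lm's default local port (8080) and uses a hypothetical model identifier; the exact model path comes from the Hugging Face release and may differ:

```python
import json
import urllib.request

# Assumed local endpoint: mlx_lm.server defaults to port 8080,
# but adjust this if the server was started with a different --port.
BASE_URL = "http://localhost:8080/v1/chat/completions"

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completions payload for the local server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def ask(prompt: str) -> str:
    """Send the prompt to the local MLX server and return the reply text."""
    # Hypothetical model identifier; substitute the real one from Hugging Face.
    payload = build_chat_request("supergemma-4-26b-uncensored-mlx-4bit", prompt)
    req = urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the server speaks the standard chat-completions protocol, the same client code works unchanged against llama.cpp or LM Studio later.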
Important Notes
- Users are advised against manually forcing a chat template path during setup to avoid corrupting responses.
Integration with Tools
Using with Hermes Agent
- Once set up, any tool that speaks the OpenAI-compatible API can use Super Gemma 4; the Hermes agent is highlighted as a suitable option.
Configuration Steps
- Users can configure Hermes to point to their local MLX server using the custom OpenAI route for enhanced functionality.
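As a rough illustration of what such a provider entry might look like (the key names here are hypothetical; check Hermes's own documentation for the real schema, and note the base URL assumes mlx-lm's default port):

```json
{
  "provider": "openai-compatible",
  "base_url": "http://localhost:8080/v1",
  "api_key": "not-needed-locally",
  "model": "supergemma-4-26b-uncensored-mlx-4bit"
}
```

The essential idea is simply redirecting the tool's OpenAI base URL to the local server; most OpenAI-compatible clients accept a placeholder API key for local use.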
OpenClaw Compatibility
Alternative Use Cases
- OpenClaw can also leverage Super Gemma 4 through its custom provider path instead of relying on cloud APIs.
Memory Management
- Memory limits can be tuned in OpenClaw's settings if needed.
Gemma 4: Exploring the GGUF Version
Overview of GGUF Version for Non-Mac Users
- The GGUF version is introduced as an alternative for users not on Mac, specifically mentioning a "super Gemma 4 26B uncensored GGUF V2" available on Hugging Face.
- This version is designed for broader compatibility with tools like llama.cpp, LM Studio, Jan, and Open WebUI, making it suitable for Windows and Linux users.
Features and Improvements
- The GGUF variant ships a neutral embedded chat template to mitigate older prompt-formatting bugs that could trigger unintended coding modes or tool-call behaviors.
- It aims to enhance the chat experience by providing cleaner interactions when run through local servers compatible with OpenAI interfaces.
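Whichever local server runs the GGUF file (llama.cpp's server, LM Studio, Jan, or Open WebUI), the OpenAI-compatible response shape is the same, so extracting the reply is uniform. A small helper, assuming the standard chat-completions JSON layout:

```python
def extract_reply(response: dict) -> str:
    """Pull the assistant's text out of an OpenAI-style chat-completions response."""
    choices = response.get("choices", [])
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]

# Example response shape, abridged to the fields used above.
sample = {
    "choices": [
        {"message": {"role": "assistant", "content": "Hello from the local model."}}
    ]
}
```

This uniformity is what lets the same assistant stack swap between the MLX and GGUF builds without client-side changes.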
Target Audience and Use Cases
- Super Gemma 4 serves as an uncensored option for those seeking less filtered outputs while maintaining practical applications in agent workflows such as coding and logic tasks.
- The model is particularly suited to Mac users, for whom MLX runs easily, and to those using the Hermes agent or OpenClaw, allowing integration into existing assistant stacks.
Community Perspective
- The speaker expresses enthusiasm about this community-driven release, highlighting its balance between being uncensored yet practical. If successful in real-world use, it could become a preferred local variant of Gemma 4.