MCP profile

Github RightNow AI Forge Mcp Server

Turn PyTorch into fast CUDA/Triton kernels on real datacenter GPUs with up to 14x speedup.

Developer Tools · Package · JavaScript/TypeScript · Open Source · External

Last updated: March 16, 2026
Visibility: Public
By: Registry

About This MCP Server

Forge MCP Server

Swarm agents that turn slow PyTorch into fast CUDA/Triton kernels, from any AI coding agent.

Forge transforms PyTorch models into production-grade CUDA/Triton kernels through automated multi-agent optimization. Using 32 parallel AI agents with inference-time scaling, it achieves up to 14x faster inference than torch.compile(mode='max-autotune-no-cudagraphs') while maintaining 100% numerical correctness.

This MCP server connects any MCP-compatible AI coding agent to Forge. Your agent submits PyTorch code, Forge optimizes it with swarm agents on real datacenter GPUs, and returns the fastest kernel as a drop-in replacement.

1. Authenticate: The agent calls forge_auth, which opens your browser. You sign in once; tokens are stored locally at ~/.forge/tokens.json and refresh automatically.
2. Optimize: The agent sends your PyTorch code via forge_optimize. The MCP server POSTs to the Forge API and streams SSE events in real time.
3. Benchmark: 32 parallel Coder+Judge agents generate kernels, compile them, test correctness against the PyTorch reference, and profile performance on real datacenter GPUs.
4. Return: The MCP server collects all results and returns the optimized code, speedup metrics, and iteration history. The output is a drop-in replacement for your original code.
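The streaming in the Optimize step uses Server-Sent Events (SSE). Below is a minimal sketch of how a client could parse such a stream, assuming each event carries a JSON payload in its data field. The event names (progress, result) and payload fields (agent, speedup) are hypothetical placeholders, not the documented Forge API; only the SSE framing rules are standard.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE event from an iterable of text lines.

    Standard SSE framing: "field: value" lines, a blank line ends an event,
    and lines starting with ":" are comments (often used as keep-alives).
    """
    event_type, data_parts = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line: dispatch the event if it carried any data.
            if data_parts:
                yield {"event": event_type, "data": "\n".join(data_parts)}
            event_type, data_parts = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, ignored
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is stripped
        if field == "event":
            event_type = value
        elif field == "data":
            data_parts.append(value)


# Hypothetical stream; real event names and payloads may differ.
sample_stream = [
    ": keep-alive\n",
    "event: progress\n",
    'data: {"agent": 7, "speedup": 1.8}\n',
    "\n",
    "event: result\n",
    'data: {"speedup": 3.2}\n',
    "\n",
]

events = list(parse_sse(sample_stream))
best = max(json.loads(e["data"])["speedup"] for e in events)
print(len(events), best)
```

In practice the MCP server does this parsing for you; the sketch only illustrates what "streams SSE events" means on the wire.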

Each optimization costs 1 credit. Credits are only charged for successful runs (speedup >= 1.1x). Failed runs and cancelled jobs are not charged.
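The charging rule above can be written as a small function. This is an illustrative sketch only: the status strings ("completed", "failed", "cancelled") and the function name are assumptions, while the 1-credit cost and the 1.1x threshold come from the text.

```python
from typing import Optional

MIN_BILLABLE_SPEEDUP = 1.1  # a run counts as successful at speedup >= 1.1x
CREDITS_PER_RUN = 1         # each successful optimization costs 1 credit


def credits_charged(status: str, speedup: Optional[float]) -> int:
    """Return the credits charged for one run under the stated rule.

    `status` values are hypothetical labels for how a job ended.
    """
    if status != "completed" or speedup is None:
        return 0  # failed or cancelled jobs are not charged
    if speedup >= MIN_BILLABLE_SPEEDUP:
        return CREDITS_PER_RUN
    return 0  # finished, but below the success threshold


print(credits_charged("completed", 3.2))   # charged
print(credits_charged("completed", 1.05))  # below threshold, not charged
print(credits_charged("cancelled", None))  # not charged
```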

Capabilities

Optimize existing kernels - Submit PyTorch code, get back an optimized Triton/CUDA kernel benchmarked against torch.compile(max-autotune)
Generate new kernels - Describe an operation (e.g. "fused LayerNorm + GELU + Dropout"), get a production-ready optimized kernel
32 parallel swarm agents - Coder+Judge agent pairs compete to discover optimal kernels, exploring tensor core utilization, memory coalescing, shared memory tiling, and kernel fusion simultaneously
Real datacenter GPU benchmarking - Every kernel is compiled, tested for correctness, and profiled on actual datacenter hardware
250k tokens/sec inference - Results in minutes, not hours
Smart detection - The agent automatically recognizes when your code would benefit from GPU optimization
One-click auth - Browser-based OAuth sign-in. No API keys to manage.

Specifications

Status: live
Industry: Developer Tools

Hosting

Hosting Options: Package

API

Integrate this server into your application. Choose a connection method below.

Configure

Configuration (JSON):

{
  "mcpServers": {
    "forge": {
      "command": "npx",
      "args": ["-y", "@rightnow/forge-mcp-server"]
    }
  }
}
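If your MCP client already has a configuration with other servers, the forge entry needs to be merged in rather than pasted over the whole file. A minimal sketch of that merge follows, assuming the client uses the common mcpServers layout shown above. The helper name and the "other" server entry are hypothetical, and the location of the config file depends on your client.

```python
import copy
import json

# The entry from the configuration above.
FORGE_ENTRY = {"command": "npx", "args": ["-y", "@rightnow/forge-mcp-server"]}


def add_forge_server(config: dict) -> dict:
    """Return a copy of `config` with a "forge" entry under "mcpServers".

    Existing servers are preserved; the input dict is not modified.
    """
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers["forge"] = copy.deepcopy(FORGE_ENTRY)
    return merged


# Hypothetical existing client configuration with one other server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
updated = add_forge_server(existing)
print(json.dumps(updated, indent=2))
```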

Quick Reference

Name: Github RightNow AI Forge Mcp Server
Function: Turn PyTorch into fast CUDA/Triton kernels on real datacenter GPUs with up to 14x speedup.
Transport: Package
Language: JavaScript/TypeScript
Source: External (Registry)
License: Open Source

Developer Tools · Live · Verified

May 25, 2026 · External
