AI Technology • January 27, 2025

OpenAI Just Released GPT-OSS: Everything You Need to Know

OpenAI just dropped GPT-OSS-120B and GPT-OSS-20B—massive open-weight language models for local use. Here's everything you need to know, including specs, performance, downloads, and what this means for developers.

On August 5, 2025, OpenAI made one of its biggest announcements since GPT-4o: the release of gpt-oss-120b and gpt-oss-20b—two powerful open-weight language models that developers can download, run locally, and fully customize. This marks OpenAI's first open-weight LLM release since GPT-2 in 2019, and it's a significant step for local AI deployment, especially for developers and enterprises focused on speed, control, and privacy.

These new models offer high-level reasoning, tool use, and chain-of-thought capabilities, all under an Apache 2.0 license. Best of all, they can run without API calls or cloud dependencies.

What Are GPT-OSS-120B and GPT-OSS-20B?

GPT-OSS is OpenAI's new family of open-weight transformer-based models designed to deliver strong reasoning and task performance—without requiring cloud-based compute. Here's a quick breakdown:

Model        | Total Parameters | Active Params/Token (MoE) | Memory Required | Context Length
gpt-oss-120b | 117 billion      | 5.1 billion               | ~80 GB          | 128,000 tokens
gpt-oss-20b  | 21 billion       | 3.6 billion               | ~16 GB          | 128,000 tokens
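The memory figures line up with 4-bit quantized weights. OpenAI ships the MoE weights in MXFP4, which stores 4-bit values with a shared scale per 32-element block, so the effective cost is roughly 4.25 bits per weight. A back-of-envelope check (the bits-per-weight figure is an approximation, not an official spec):

```python
# Rough weight-memory estimate for 4-bit (MXFP4-style) quantization.
# MXFP4 blocks share one scale per 32 values, so assume ~4.25 bits/weight.

BITS_PER_WEIGHT = 4.25

def weight_memory_gb(total_params: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return total_params * BITS_PER_WEIGHT / 8 / 1e9

for name, params in [("gpt-oss-120b", 117e9), ("gpt-oss-20b", 21e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights")
```

That gives roughly 62 GB and 11 GB of raw weights; the rest of the ~80 GB and ~16 GB budgets goes to activations, KV cache, and runtime overhead.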

Both models support a configurable chain-of-thought (CoT) reasoning effort (low, medium, high), meaning developers can control reasoning depth with a simple prompt instruction, trading off latency against output quality.
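In practice the effort level rides along in the system prompt. A minimal sketch, assuming the "Reasoning: low|medium|high" directive from OpenAI's harmony chat format (check the model card for the canonical syntax):

```python
# Sketch: selecting reasoning depth via the system prompt.
# The "Reasoning: ..." directive follows the harmony chat format the
# gpt-oss models were trained on; verify against the model card.

def build_messages(question: str, effort: str = "medium") -> list[dict]:
    assert effort in {"low", "medium", "high"}
    return [
        {"role": "system", "content": f"Reasoning: {effort}"},
        {"role": "user", "content": question},
    ]

messages = build_messages("Prove that sqrt(2) is irrational.", effort="high")
print(messages[0]["content"])  # Reasoning: high
```

Low effort returns answers faster with a short (or no) visible reasoning trace; high effort spends more tokens thinking before answering.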

How Do These Models Perform?

Despite being open-weight, GPT-OSS models deliver near-proprietary performance. GPT-OSS-120B benchmarks close to OpenAI's o4-mini, while GPT-OSS-20B sits around o3-mini—a remarkable achievement for local-first models.

Performance Highlights:

  • Codeforces (Coding): GPT-OSS-120B nearly matches o4-mini in programming tasks.
  • HealthBench: GPT-OSS outperforms several proprietary models in realistic health queries.
  • AIME Math Exams: GPT-OSS models beat o3-mini and closely trail o4-mini.
  • Tool Use & CoT: Strong results in Tau-Bench (tool calling) and multi-step reasoning.

Can I Run GPT-OSS on My Machine?

Yes. That's part of the appeal.

  • GPT-OSS-20B is optimized for edge and consumer devices with 16GB RAM. It can run on modern laptops and desktops.
  • GPT-OSS-120B requires ~80GB of GPU memory, making it ideal for high-end GPUs like the NVIDIA H100 or multi-GPU server setups.
  • Thanks to mixture-of-experts (MoE), only a small fraction of the full parameters are active at any time, making inference much more efficient.
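How small a fraction? Dividing the active-per-token figures by the totals from the specs above makes the sparsity concrete:

```python
# Active-parameter fraction per token for each MoE model,
# using the figures from the specs table (total, active per token).

specs = {
    "gpt-oss-120b": (117e9, 5.1e9),
    "gpt-oss-20b": (21e9, 3.6e9),
}

for name, (total, active) in specs.items():
    print(f"{name}: {active / total:.1%} of parameters active per token")
```

Only about 4.4% of gpt-oss-120b's parameters (and about 17% of gpt-oss-20b's) run on any given token, which is why inference cost tracks the active count rather than the headline parameter count.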

Where to Download GPT-OSS

You can access and run the models right now:

  • Download the weights on Hugging Face
  • Browse source on GitHub
  • Try the OpenAI Model Playground

They're available in multiple formats (PyTorch, Metal, ONNX), and also integrated with deployment tools like:

  • Ollama
  • vLLM
  • LM Studio
  • Cloudflare
  • Vercel
  • AWS Bedrock
  • Microsoft Foundry Local
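Several of these tools (vLLM and Ollama among them) expose an OpenAI-compatible chat endpoint, so existing client code ports over with just a base-URL change. A minimal sketch using only the standard library; the port, path, and model tag below are assumptions to match to your own server's config:

```python
# Sketch: querying a locally served gpt-oss model through an
# OpenAI-compatible /v1/chat/completions endpoint. The URL and model
# name are placeholders -- adjust them for your vLLM/Ollama setup.
import json
import urllib.request

payload = {
    "model": "gpt-oss-20b",
    "messages": [{"role": "user", "content": "Summarize MoE in one sentence."}],
    "max_tokens": 128,
}

req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",  # vLLM's default port
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

# Uncomment once a local server is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the request shape matches the hosted OpenAI API, swapping between a local deployment and the cloud is mostly a matter of pointing the client at a different base URL.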

Why This Matters for Developers and AI Teams

Until now, OpenAI's models were only available via API. That meant depending on the cloud, paying usage fees, and exposing user data to external infrastructure.

With GPT-OSS, OpenAI gives developers full control. You can now:

  • Run models offline
  • Fine-tune for niche domains
  • Maintain data privacy
  • Build low-latency apps without hitting an API

For AI startups, privacy-first enterprises, and edge applications, this opens the door to much more flexible and efficient deployments.

How Safe Are These Models?

OpenAI conducted extensive alignment and safety testing—including "malicious fine-tuning" scenarios where they intentionally tried to make the models act badly. Even after adversarial training, the models failed to reach OpenAI's "high risk" threshold. Key safeguards include:

  • Deliberative alignment techniques
  • Instruction hierarchy to refuse unsafe prompts
  • CBRN data filtering during pre-training
  • External expert audits before release

Plus, OpenAI is launching a $500,000 Red Teaming Challenge to crowdsource potential safety risks across the open-source community.

Are GPT-OSS Models Open Source?

Sort of. These are open-weight models—not truly open-source in the traditional sense.

OpenAI is releasing:

  • The model weights
  • Inference code
  • Tokenizer (o200k_harmony)
  • Reference implementations in Python & Rust

But not:

  • The full training dataset
  • Training code or logs

This strikes a balance between transparency and safety, giving developers powerful tools without opening the door to harmful misuse.

What's Next for GPT-OSS?

This release positions OpenAI alongside Meta (Llama), Mistral, and DeepSeek in the open-weight arena. But it has two key advantages:

  • Better performance on reasoning and tool use
  • Integration across OpenAI's own ecosystem (APIs, playgrounds, and infra)

Future updates may include:

  • Native API integration
  • Multimodal versions
  • Smaller quantized models for mobile

Final Thoughts: Why GPT-OSS Changes the Game

OpenAI's GPT-OSS-120B and GPT-OSS-20B aren't just new models. They're a shift in how developers can build with powerful LLMs.

For the first time in years, developers can now:

  • Access frontier-level performance
  • Keep everything on-premises
  • Customize everything from reasoning level to fine-tuning

It's open-weight AI with real-world use cases in mind—perfect for startups, researchers, and enterprises looking to break free from black-box cloud models.


Want More LLM Deep Dives?

At Cassius AI, we specialize in making sense of the evolving AI landscape—from agents to open-weight models. Subscribe to our newsletter or explore how we help startups grow using agentic AI.