Using gpt-oss:20b in OpenCode with a larger context and high reasoning

Setting up a fully open and local agent workflow for private data

I have hundreds of private journal entries stored locally, many with incorrect date formats. The correct date is in the filename, so I wanted to use an LLM to batch update them. However, I do not want to share my private journal with any third-party service. Instead, I used Ollama with gpt-oss:20b and OpenCode to do this all locally.

The naive solution of just using gpt-oss:20b with Ollama’s defaults doesn’t work (e.g., tool calls don’t work). The main issue is that, by default, Ollama sets the context window to only 4096 tokens and the reasoning effort to medium. It took me too long to figure out how to get gpt-oss:20b working well in OpenCode (mostly because Ollama’s OpenAI-compatible API doesn’t allow setting the context window), so I’m writing the steps down (partly for my future self).

In the end I could simply run the following command to batch update my journal entries:

for file in ./*.md; do
  opencode -m ollama/gpt-oss-20b-high-32k run \
    "Use the date in the filename \"${file}\" and correct the 'date:' entry in the YAML front matter of this file"
done
Yes, of course I could have scripted this without an LLM, but where’s the fun in that? Also I might want to do more complex updates in the future on less structured data.
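For comparison, here is a rough sketch of what that non-LLM script could look like, assuming filenames of the form YYYY-MM-DD-title.md (the sample file and directory names are made up for illustration):

```shell
#!/bin/sh
# Set up a sample journal entry with a wrong 'date:' value in its front matter.
mkdir -p journal
printf -- '---\ndate: 01/05/2024\n---\nDear diary...\n' > journal/2024-05-01-entry.md

for file in journal/*.md; do
  # Extract the leading YYYY-MM-DD from the basename.
  base=$(basename "$file")
  date=$(printf '%s' "$base" | sed -E 's/^([0-9]{4}-[0-9]{2}-[0-9]{2}).*/\1/')
  # Rewrite any 'date:' line with the date taken from the filename.
  sed -i.bak -E "s/^date:.*/date: $date/" "$file" && rm "$file.bak"
done

grep '^date:' journal/2024-05-01-entry.md   # date: 2024-05-01
```

This only handles the rigid case where the date sits at the start of the filename; the LLM-based loop above also copes with messier filenames and front matter.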

This post describes how to:

  1. Create a long-context variant of gpt-oss:20b in Ollama.
  2. Configure OpenCode to use that model with high reasoning enabled by default.

1. Create a long-context model in Ollama #

Start an interactive session with gpt-oss:20b:

ollama run gpt-oss:20b

In the prompt:

>>> /set parameter num_ctx 32768
>>> /save gpt-oss-20b-32k
>>> /bye

This:

  • Sets the context window to 32k tokens (num_ctx 32768).
  • Saves a new model/tag named gpt-oss-20b-32k with that parameter stored.

Adjust num_ctx to fit your GPU memory (e.g., 16384 or 65536).
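If you prefer a reproducible, non-interactive setup, the same model can be created from a Modelfile; the parameter below mirrors the interactive session above:

```shell
# Write a Modelfile that layers a larger context window on top of gpt-oss:20b.
cat > Modelfile <<'EOF'
FROM gpt-oss:20b
PARAMETER num_ctx 32768
EOF

# Register it under the same name used in the rest of this post.
ollama create gpt-oss-20b-32k -f Modelfile

# Confirm the stored parameter.
ollama show gpt-oss-20b-32k
```

Either route produces the same saved model; the Modelfile version is just easier to keep in version control.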


2. Configure OpenCode to use high reasoning #

OpenCode talks to Ollama via the OpenAI-compatible API. In ~/.config/opencode/opencode.json, add a provider configuration similar to:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (local)",
      "options": {
        "baseURL": "http://localhost:11434/v1",
        "apiKey": "ollama"
      },
      "models": {
        "gpt-oss-20b-high-32k": {
          "id": "gpt-oss-20b-32k",
          "options": {
            "extraBody": {
              "think": "high"
            }
          }
        }
      }
    }
  },

  "model": "ollama/gpt-oss-20b-high-32k"
}

Notes:

  • "id": "gpt-oss-20b-32k" refers to the custom Ollama model with the larger context window.
  • "think": "high" enables high reasoning for gpt-oss:20b on each request.
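To sanity-check this outside OpenCode, you can send a request with the same shape directly to Ollama’s OpenAI-compatible endpoint (this assumes Ollama is running on its default port; passing "think" through the /v1 layer is exactly what the extraBody option relies on):

```shell
# Illustrative request mirroring what OpenCode sends with extraBody applied.
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gpt-oss-20b-32k",
        "messages": [{"role": "user", "content": "Say hi."}],
        "think": "high"
      }'
```

If the response streams back with reasoning visible in the output, the think level is being honored.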

3. Select the model in OpenCode #

Run OpenCode:

opencode

Inside OpenCode:

> /models

Then select:

ollama/gpt-oss-20b-high-32k

From this point, OpenCode uses gpt-oss:20b with:

  • A larger context window (configured in Ollama), and
  • High reasoning enabled by default (via think: "high").

Bas Nijholt
Staff Engineer