MIT License · IntelliJ 2024.1+ · v0.0.1.8 (early-stage) · Public benchmark

Local-first AI coding agent for JetBrains

Code with AI you can inspect, benchmark, and run locally.

RefIo brings agentic coding into IntelliJ without a WebView and without forcing your code into the cloud. Use Ollama or LM Studio locally, connect cloud models when you choose, and inspect every tool call, token, cost, and file write.

34 measured runs · 8 models compared · 0 lock-in by design

Example session in the RefIo tool window (tabs: Chat, Execution, Context, Agents, RAG, Debug, Logs, API):

  You: extract token validation into a separate service
  Assistant: Scanning the auth module to map token-related code. I'll propose a plan before making changes.
    grep_search validateToken
      12 matches across 6 files under auth/**
    read_file AuthService.kt
      class AuthService { fun validateToken(token: String) : Result… }
  Assistant: Proposed a new TokenValidator service with three entry points. Plan ready for your approval.

  Input: Type a message... (@context, /prompt, !subagent)
  Status: Plan mode · Ollama, Qwen 3.5 35B · Loop: 2 (PLAN) · 12% (15.2K / 128.0K) · Rq:1/3 · ↓ 0 / 15.2K

Proof, not vibes

The public benchmark for local and cloud coding models

RefIo has a companion benchmark viewer at benchmark.refio.dev. It shows how models perform on real coding tasks, including score, reliability, duration, token usage, cost, notes, and generated artifacts.

Open the live benchmark: benchmark.refio.dev

34 recorded attempts · 8 models compared · 5 human-scored criteria

Model                Env        Runs  Signal
Gamma4 31B           DGX local  6     Strong
Qwen 3.6 27B         DGX local  6     Reliable
GPT 5.1 codex mini   Cloud      6     Low cost

Real outputs, not just screenshots

Each result can include the HTML, image, or video artifact so visitors can inspect what the model actually produced.

Cost and speed visible

Cloud runs show average and total cost. Local runs show duration and tokens, so the trade-off is transparent.

Built for public trust

Filtering, leaderboard, task detail, Pareto charts, and metric help make every score easier to challenge and improve.
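
For readers new to Pareto charts: a model sits on the Pareto front when no other model is at least as cheap and at least as good, and strictly better on one of the two. The sketch below shows that rule in Python. The model labels and numbers are invented for illustration; they are not benchmark results, and this is not the benchmark viewer's code.

```python
from typing import NamedTuple


class Run(NamedTuple):
    model: str    # hypothetical label, not a real benchmark entry
    cost: float   # lower is better
    score: float  # higher is better


def pareto_front(runs: list[Run]) -> list[Run]:
    """Return the runs that no other run dominates."""

    def dominates(a: Run, b: Run) -> bool:
        # a dominates b if it is at least as good on both axes
        # and strictly better on at least one of them.
        return (
            a.cost <= b.cost
            and a.score >= b.score
            and (a.cost < b.cost or a.score > b.score)
        )

    return [r for r in runs if not any(dominates(o, r) for o in runs)]


# Illustrative data only.
runs = [
    Run("model-a", cost=0.10, score=70.0),
    Run("model-b", cost=0.40, score=90.0),
    Run("model-c", cost=0.50, score=80.0),  # costs more and scores less than model-b
]
front = pareto_front(runs)
```

Here model-c drops out because model-b beats it on both cost and score, while model-a and model-b each win on one axis and stay on the front.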

Why it exists

A coding agent for developers who want control

RefIo is designed for people who want the leverage of AI coding without giving up observability, privacy, model choice, or the native JetBrains experience.

Native JetBrains experience

Pure Swing UI, no WebView shell. It lives in the IDE where your project, diffs, files, and errors already are.

Three execution modes

Chat (talk, no tools), Plan (read-only, code-enforced), Agent (edit with file snapshots before every write).
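
To make "file snapshots before every write" concrete, here is a minimal Python sketch of the idea: copy the file aside, then write, and restore newest-first on rollback. `SnapshotStore` and its methods are hypothetical names for illustration; RefIo's actual Kotlin implementation may look quite different.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


class SnapshotStore:
    """Keeps a copy of each file as it was before a write (illustrative only)."""

    def __init__(self, snapshot_dir: Path) -> None:
        self.snapshot_dir = snapshot_dir
        self.snapshots: list[tuple[Path, Optional[Path]]] = []

    def write(self, target: Path, new_text: str) -> None:
        backup: Optional[Path] = None
        if target.exists():
            backup = self.snapshot_dir / f"{len(self.snapshots)}_{target.name}"
            shutil.copy2(target, backup)  # snapshot first
        self.snapshots.append((target, backup))
        target.write_text(new_text, encoding="utf-8")  # then write

    def rollback(self) -> None:
        # Undo writes newest-first so earlier snapshots win.
        while self.snapshots:
            target, backup = self.snapshots.pop()
            if backup is None:
                target.unlink(missing_ok=True)  # file did not exist before the write
            else:
                shutil.copy2(backup, target)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    snaps = root / "snapshots"
    snaps.mkdir()
    source = root / "AuthService.kt"
    source.write_text("original", encoding="utf-8")

    store = SnapshotStore(snaps)
    store.write(source, "edited by agent")
    edited = source.read_text(encoding="utf-8")
    store.rollback()
    restored = source.read_text(encoding="utf-8")
```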

Local-first models

Run Ollama or LM Studio fully offline with no-egress mode, or connect to OpenAI, Anthropic, or Gemini when you choose.
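
One way to picture a no-egress check is a guard that runs before any model request and refuses anything that is not a loopback address. The Python sketch below assumes "local" means localhost or a loopback IP; that definition, and the names `EgressBlocked`, `is_local_endpoint`, and `check_egress`, are assumptions for illustration rather than RefIo's actual rules.

```python
import ipaddress
from urllib.parse import urlparse


class EgressBlocked(Exception):
    """Raised when a request would leave the machine in no-egress mode."""


def is_local_endpoint(url: str) -> bool:
    host = urlparse(url).hostname or ""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname is treated as remote in this sketch.
        return False


def check_egress(url: str, no_egress: bool) -> str:
    """Return the URL if the call is allowed, otherwise raise EgressBlocked."""
    if no_egress and not is_local_endpoint(url):
        raise EgressBlocked(f"no-egress mode: refusing to call {url}")
    return url


# Ollama listens on port 11434 by default.
allowed = check_egress("http://localhost:11434/api/chat", no_egress=True)

try:
    check_egress("https://api.openai.com/v1/chat/completions", no_egress=True)
    blocked = False
except EgressBlocked:
    blocked = True
```

A model server on another machine in your network would fail this check, so a real policy would likely need an explicit allowlist.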

Visible tool calls

Every prompt is inspectable, every tool call is explicit, and token usage plus cost stay visible per session.

Safety layers

Path sandboxing, command policies, snapshots before writes, rollback direction, and no-egress mode.
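
Path sandboxing usually comes down to one check: resolve the requested path and confirm it still lives under the project root. A minimal Python sketch follows, with `resolve_in_sandbox` and the demo project location as hypothetical names; it shows the general technique, not RefIo's own code.

```python
import tempfile
from pathlib import Path


def resolve_in_sandbox(project_root: Path, requested: str) -> Path:
    """Resolve `requested` against the project root, or raise if it escapes."""
    root = project_root.resolve()
    candidate = (root / requested).resolve()  # collapses ".." and follows symlinks
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"{requested!r} is outside the project root")
    return candidate


# Hypothetical project location, used only for the demo.
project = Path(tempfile.gettempdir()) / "refio-demo" / "project"
inside = resolve_in_sandbox(project, "src/auth/AuthService.kt")

try:
    resolve_in_sandbox(project, "../../etc/passwd")
    escaped = True
except PermissionError:
    escaped = False
```

Resolving before comparing matters: a plain string prefix check would let `..` segments and symlinks slip through.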

Two interfaces, one core

Same :core Gradle module drives the IntelliJ plugin and a full-screen terminal TUI.

Execution

Think. Inspect. Execute.

Three modes, one workflow. Choose the level of control you need — from quick answers to full autonomous refactoring.

01

Chat — Talk

Ask questions about your code. Full project context via @mentions, RAG retrieval, code citations. No tools, no changes.

Conversation · No tools
02

Plan

Read-only and code-enforced. The agent can inspect the project and propose a plan, but cannot change files.

Read-only · Code-enforced
03

Agent — Edit

Full read/write with automatic file snapshots. Per-tool, per-mode permissions (ON / ASK / OFF). Iteration & cycle guardrails. Always reversible.

Full read/write · Snapshots + rollback
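
The per-tool, per-mode permissions can be pictured as a small lookup with two hard rules on top: Chat never runs tools, and Plan never writes. The Python sketch below is an illustration under assumptions. `grep_search` and `read_file` appear in the example session above, but `write_file`, the settings table, and `permission_for` are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    CHAT = "chat"
    PLAN = "plan"
    AGENT = "agent"


class Permission(Enum):
    ON = "on"    # run without asking
    ASK = "ask"  # pause for user approval
    OFF = "off"  # never run


# Hypothetical: tools that modify files.
WRITE_TOOLS = {"write_file"}

# Hypothetical user settings, one table per mode.
USER_SETTINGS = {
    Mode.PLAN: {"grep_search": Permission.ON, "read_file": Permission.ASK},
    Mode.AGENT: {
        "grep_search": Permission.ON,
        "read_file": Permission.ON,
        "write_file": Permission.ASK,
    },
}


def permission_for(mode: Mode, tool: str) -> Permission:
    if mode is Mode.CHAT:
        return Permission.OFF  # Chat: no tools at all
    if mode is Mode.PLAN and tool in WRITE_TOOLS:
        return Permission.OFF  # Plan: read-only, whatever the settings say
    # Tools without an explicit setting default to OFF.
    return USER_SETTINGS.get(mode, {}).get(tool, Permission.OFF)
```

For example, `permission_for(Mode.AGENT, "write_file")` returns `Permission.ASK`, while the same tool in Plan mode is always `Permission.OFF`.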

Local-first

Designed around local models

RefIo works out-of-the-box with Ollama and LM Studio. Token budgeting, compaction and RAG thresholds adapt to the active model's context window.

Local models, practical

  • Tested with Qwen 3.5 (qwen3.5:9b / :35b / :122b) via Ollama
  • Small context windows? Conversation compaction at ~85% usage
  • Tool result compression (FULL → DETAILED → SUMMARY)
  • Universal tool-calling protocol — works with models lacking native function-calling
  • No-egress mode blocks all cloud calls at the LLM client layer
  • Cloud adapters: OpenAI, Anthropic, Gemini, OpenRouter, Custom OpenAI, Z.AI
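
Two of the points above, compaction at roughly 85% usage and tiered tool result compression, can be sketched in a few lines of Python. The 0.85 threshold comes from the list; the 25% and 50% cut-offs for choosing a compression level are assumptions made up for this example, as are the function names.

```python
COMPACTION_THRESHOLD = 0.85  # the "~85% usage" figure from the list above


def needs_compaction(used_tokens: int, context_window: int) -> bool:
    """True once the conversation fills about 85% of the context window."""
    return used_tokens / context_window >= COMPACTION_THRESHOLD


def compression_level(result_tokens: int, remaining_tokens: int) -> str:
    """Pick FULL, DETAILED, or SUMMARY for a tool result.

    Assumed rule: keep the full result if it takes under 25% of the
    remaining budget, a detailed digest under 50%, otherwise a summary.
    """
    share = result_tokens / max(remaining_tokens, 1)
    if share < 0.25:
        return "FULL"
    if share < 0.50:
        return "DETAILED"
    return "SUMMARY"
```

With the figures from the status line in the example session (15.2K of 128.0K tokens used), `needs_compaction(15_200, 128_000)` is false; at 110K used it becomes true.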

Token budget

Scales to the active model's context window. Per-section allocation: system prompt, RAG, conversation, tools — you can tune the ratios.
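
As a rough illustration of per-section allocation, the Python sketch below splits a context window by integer weights. The section names follow the list above, but the weights (10 / 25 / 45 / 20) and the function `allocate_budget` are assumptions, not RefIo defaults.

```python
def allocate_budget(context_window: int, weights: dict[str, int]) -> dict[str, int]:
    """Split a context window across sections in proportion to their weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Integer division keeps the allocation at or under the window size.
    return {name: context_window * w // total for name, w in weights.items()}


# Hypothetical weights; tune them to taste.
weights = {"system_prompt": 10, "rag": 25, "conversation": 45, "tools": 20}

large = allocate_budget(128_000, weights)  # 128K window, as in the example session
small = allocate_budget(8_000, weights)    # the same ratios on a small local model
```

Because the split is proportional, switching to a model with a smaller window shrinks every section together instead of starving one of them.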

RAG pipeline

Semantic chunking, local embeddings, cosine similarity search. 5 language analyzers (Kotlin, Java, Python, TS, HTML). All stored in SQLite, fully offline.
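
The retrieval step can be sketched with nothing but Python's standard library: store chunk embeddings in SQLite, then rank them by cosine similarity against the query vector. The table schema, the JSON encoding of vectors, and the toy 3-dimensional embeddings are all assumptions for illustration; real embeddings would come from a local model such as nomic-embed-text.

```python
import json
import math
import sqlite3


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


db = sqlite3.connect(":memory:")  # a file path would persist the index
db.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, text TEXT, embedding TEXT)")

# Toy vectors standing in for real embeddings.
rows = [
    ("fun validateToken(token: String)", [0.9, 0.1, 0.0]),
    ("class UserRepository", [0.1, 0.9, 0.0]),
    ("fun renderFooter()", [0.0, 0.1, 0.9]),
]
db.executemany(
    "INSERT INTO chunks (text, embedding) VALUES (?, ?)",
    [(text, json.dumps(vec)) for text, vec in rows],
)


def search(query_vec: list[float], top_k: int = 2) -> list[str]:
    """Return the top_k chunk texts most similar to the query vector."""
    scored = [
        (cosine(query_vec, json.loads(emb)), text)
        for text, emb in db.execute("SELECT text, embedding FROM chunks")
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored[:top_k]]


hits = search([1.0, 0.0, 0.0])
```

A linear scan like this is fine for a sketch; a larger index would likely need a smarter lookup.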

Two front-ends

Same :core Kotlin module drives both the IntelliJ plugin (Swing) and a full-screen terminal TUI (Mordant + JLine).

Priorities

Built for the workflows where AI has to earn trust

RefIo is intentionally opinionated: local-first when possible, cloud when useful, transparent by default, and benchmarked in public.

Optimized for

  • Native JetBrains UX (Swing, no WebView)
  • Local-first workflows (Ollama, LM Studio, no-egress)
  • Visible tool calls and inspectable prompts
  • Public model benchmark at benchmark.refio.dev
  • Auditable Kotlin codebase, MIT licensed
  • Shared core between IntelliJ plugin and terminal TUI

Not trying to be

  • A VS Code / Cursor plugin (no plans)
  • An inline autocomplete replacement
  • An enterprise suite at v0.0.1.x
  • A mature multi-agent framework (see Roadmap)
  • A mass-market product (RefIo is early-stage and fast-changing)

Audience

Who RefIo is for

A narrow, honest target. Not for everyone.

JetBrains power users exploring local LLMs
Kotlin / Java / JVM developers
Open-source contributors looking for an early project
Developers who want to audit their AI tools

Get started

Up and running in 3 steps

Works with local models out of the box. No API keys required.

1

Install Ollama & pull models

ollama pull nomic-embed-text
ollama pull qwen3.5:9b   # or :35b / :122b for bigger hardware
2

Install RefIo plugin in IntelliJ

Settings → Plugins → Install from disk (or Marketplace when available)

3

Open RefIo tool window — pick a mode — go

View → Tool Windows → RefIo. Chat, Plan, or Agent.

Early-stage · actively developed

RefIo is v0.0.1.x — the foundation is in place, the depth is growing. Small commit history, breaking changes possible pre-1.0. See the Roadmap for known gaps and where the project is heading. Contributions, feedback, and bug reports welcome.

Bring agentic coding into JetBrains without losing control.

Start local with Ollama, connect cloud models when it makes sense, and use the public benchmark to decide which model deserves the next task.