SHIFT Guide
Everything you need to install, configure, and use SHIFT to optimize multimodal AI API requests.
Installation
Homebrew (recommended)
brew tap alohaninja/shift
brew install shift-ai
Quick install script
curl -fsSL https://raw.githubusercontent.com/alohaninja/shift/main/install.sh | sh
Installs to ~/.local/bin. Detects OS and architecture automatically (macOS x86/arm, Linux x86/arm).
From crates.io
cargo install shift-preflight-cli
Pre-built binaries
Download from GitHub Releases for macOS (x86/arm) and Linux (x86/arm).
From source
git clone https://github.com/alohaninja/shift.git && cd shift
cargo install --path shift-cli
Quick Start
# Transform an OpenAI request (stdin/stdout pipe)
cat request.json | shift-ai -p openai -m balanced > safe_request.json
# Transform an Anthropic request from a file
shift-ai request.json -p anthropic -m economy > safe_request.json
# See what would change without modifying anything
shift-ai request.json --dry-run -o report
# Compose with curl
shift-ai request.json -p openai | curl -s -X POST \
https://api.openai.com/v1/chat/completions \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d @-
Verify Installation
shift-ai --version
shift-ai --help
# Quick validation -- transform a sample payload
echo '{"model":"gpt-4o","messages":[{"role":"user","content":"hello"}]}' | shift-ai
# Check stats tracking
shift-ai gain
CLI Reference
shift-ai [OPTIONS] [FILE]
Arguments:
  [FILE]                       Input file (JSON request payload). Reads stdin if omitted.
Options:
  -p, --provider <PROVIDER>    Target provider [default: openai]
                               [openai, anthropic, claude]
  -m, --mode <MODE>            Drive mode [default: balanced]
                               [performance, balanced, economy]
      --svg-mode <MODE>        SVG handling [default: raster]
                               [raster, source, hybrid]
  -o, --output <FORMAT>        Output format [default: json]
                               [json, report, json-report, both]
      --dry-run                Show what would change without modifying
      --profile <FILE>         Custom provider profile JSON
      --model <MODEL>          Target model (overrides model in payload)
      --no-stats               Disable saving run statistics
  -v, --verbose                Verbose output
Commands:
  shift-ai gain                Show cumulative token savings
  shift-ai gain --daily        Day-by-day breakdown
  shift-ai gain --format json  Machine-readable output for dashboards
  shift-ai preflight [FILE]    Validate payload and preview optimizations
    -p, --provider <PROVIDER>  Target provider [default: openai]
    -m, --mode <MODE>          Drive mode [default: balanced]
        --model <MODEL>        Target model override
        --profile <FILE>       Custom provider profile JSON
    -v, --verbose              Verbose output
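The options above can also be driven from a script. The following is a minimal sketch, not part of SHIFT itself: the helper names build_shift_command and run_shift are hypothetical, and only flags listed in the CLI reference are used. Falling back to the unmodified payload when the binary is missing is an assumption made for this example.

```python
import shutil
import subprocess


def build_shift_command(provider="openai", mode="balanced", dry_run=False):
    """Build an argv list for shift-ai using only flags from the CLI reference."""
    cmd = ["shift-ai", "-p", provider, "-m", mode]
    if dry_run:
        # Mirrors the Quick Start example: report what would change, modify nothing.
        cmd += ["--dry-run", "-o", "report"]
    return cmd


def run_shift(payload_json, **kwargs):
    """Pipe a JSON payload through shift-ai via stdin/stdout.

    Assumption for this sketch: if shift-ai is not on PATH, the payload is
    returned unchanged instead of raising.
    """
    if shutil.which("shift-ai") is None:
        return payload_json
    result = subprocess.run(
        build_shift_command(**kwargs),
        input=payload_json,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Keeping the argv construction separate from the subprocess call makes the flag mapping easy to check without the binary installed.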
Drive Modes
Choose your optimization level per request with -m / --mode.
| Mode | Behavior |
|---|---|
| performance | Minimal transforms. Only enforce hard provider limits (max dimension, max file size). Preserve original fidelity. |
| balanced (default) | Moderate optimization. Resize oversized images, recompress bloated files. Remove obvious waste. |
| economy | Aggressive optimization. Downscale everything to 1024px, drop excess images beyond provider limits, minimize token usage. |
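To make the "downscale to 1024px" behavior concrete, here is a small sketch of aspect-preserving downscaling. It assumes the limit applies to the longest side, which the table does not state, and the function name fit_within is hypothetical rather than part of SHIFT.

```python
def fit_within(width, height, max_dim):
    """Return (width, height) scaled down so the longest side is at most max_dim.

    Images already within the limit are returned unchanged (no upscaling).
    """
    longest = max(width, height)
    if longest <= max_dim:
        return width, height
    scale = max_dim / longest
    # Clamp to 1 so extreme aspect ratios never collapse a side to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, a 4096x2048 image with a 1024px limit becomes 1024x512, while an 800x600 image is left alone.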
SVG Handling
Most AI model APIs reject SVG. SHIFT detects SVG inputs and handles them based on --svg-mode:
| Mode | Behavior |
|---|---|
| raster (default) | Rasterize SVG to PNG via resvg. Provider-safe, works everywhere. |
| source | Drop the SVG image from the payload (provider-incompatible). A warning is emitted. Future versions may inject SVG XML as a text content block. |
| hybrid | Rasterize to PNG and retain SVG source as text. Best of both worlds. |
Gain Dashboard
SHIFT automatically records run statistics to ~/.shift/stats.jsonl. View cumulative savings:
$ shift-ai gain
=== SHIFT Cumulative Savings ===
Runs: 42
Images: 156 processed, 89 modified
Bytes: 247.3 MB saved
Token Savings (estimated):
OpenAI: 52,400 -> 12,300 tokens (76.5% saved)
Anthropic: 84,200 -> 28,100 tokens (66.6% saved)
$ shift-ai gain --daily
=== SHIFT Daily Token Savings ===
Date Runs Images OpenAI saved Anthropic saved
----------------------------------------------------------
2026-04-20 8 24 3,200 5,400
2026-04-21 12 42 4,800 8,200
2026-04-22 22 90 12,100 18,500
Use shift-ai gain --format json for machine-readable output suitable for dashboards.
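The percentages in the sample output follow from the before and after token counts. A short sketch of that arithmetic (the function name percent_saved is hypothetical):

```python
def percent_saved(tokens_before, tokens_after):
    """Percentage of tokens saved, rounded to one decimal place."""
    if tokens_before <= 0:
        return 0.0
    return round((tokens_before - tokens_after) / tokens_before * 100, 1)
```

Applied to the sample figures, percent_saved(52400, 12300) gives 76.5 and percent_saved(84200, 28100) gives 66.6, matching the dashboard output above.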
Provider Profiles
Built-in constraints for the two major multimodal providers:
| Provider | Max images | Max dimension | Max file size | Megapixel limit |
|---|---|---|---|---|
| OpenAI | 10 | 2048 px | 20 MB | -- |
| Anthropic | 20 | 8000 px | 5 MB | 1.15 MP |
Profiles include per-model overrides (gpt-4o, gpt-4.1, claude-sonnet-4, etc.) and fall back to provider defaults for unknown models.
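As an illustration of how the provider defaults in the table might be checked against a single image, consider the sketch below. The names ProviderLimits, LIMITS, and violations are hypothetical, the numbers are copied from the table, and treating 1 MB as 1024 * 1024 bytes is an assumption. It does not model per-model overrides or the image-count limit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ProviderLimits:
    max_images: int
    max_dimension_px: int
    max_file_bytes: int
    max_megapixels: Optional[float] = None  # None means no megapixel limit


# Values taken from the provider table; MB is assumed to mean 1024 * 1024 bytes.
LIMITS = {
    "openai": ProviderLimits(10, 2048, 20 * 1024 * 1024),
    "anthropic": ProviderLimits(20, 8000, 5 * 1024 * 1024, 1.15),
}


def violations(provider, width, height, size_bytes) -> List[str]:
    """List which per-image limits an image exceeds for the given provider."""
    limits = LIMITS[provider]
    found = []
    if max(width, height) > limits.max_dimension_px:
        found.append("dimension")
    if size_bytes > limits.max_file_bytes:
        found.append("file_size")
    if limits.max_megapixels is not None:
        if width * height / 1_000_000 > limits.max_megapixels:
            found.append("megapixels")
    return found
```

A 4000x3000 image of about 1 MB trips the dimension limit for OpenAI, while for Anthropic it stays under 8000 px but exceeds the 1.15 MP limit.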
Supported Formats
Detected and processed
Raster images: PNG, JPEG, GIF, WebP, BMP, TIFF
Vector images: SVG (auto-rasterized to PNG)
Encodings: base64 data URIs, raw base64, URL references
BMP and TIFF are auto-converted to PNG. SVGs are rasterized. Everything else passes through if it meets provider constraints.
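Format detection of this kind is commonly done by inspecting the first bytes of the file. The sketch below shows that general technique using well-known magic numbers. It is not SHIFT's inspector, and the function name sniff_format is hypothetical.

```python
def sniff_format(data: bytes) -> str:
    """Guess an image format from leading bytes (magic numbers)."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    if data[:4] in (b"II*\x00", b"MM\x00*"):
        return "tiff"
    if data.startswith(b"BM"):
        return "bmp"
    # SVG is text, so look for an <svg tag near the start of the document.
    if b"<svg" in data[:1024].lower():
        return "svg"
    return "unknown"
```

The two-byte BMP signature is weak, so a production detector would validate more of the header before trusting it.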
Custom Profiles
Load a custom provider profile with --profile custom.json. The JSON structure matches the built-in profiles in profiles/.
Library Usage
SHIFT is split into two crates: shift-preflight (library) and shift-preflight-cli (binary, installs as shift-ai). The library can be used directly in Rust applications:
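# Cargo.toml
[dependencies]
shift-preflight = "0.4"
serde_json = "1"
// src/main.rs
use serde_json::json;
use shift_preflight::{pipeline, DriveMode, ShiftConfig};
fn main() {
    // Build an OpenAI-style chat request carrying one inline base64 image.
    let payload = json!({
        "model": "gpt-4o",
        "messages": [{
            "role": "user",
            "content": [{
                "type": "image_url",
                "image_url": {"url": "data:image/png;base64,..."}
            }]
        }]
    });
    // Pick the drive mode and provider; leave every other field at its default.
    let config = ShiftConfig {
        mode: DriveMode::Balanced,
        provider: "openai".to_string(),
        ..Default::default()
    };
    // Run the inspect -> policy -> transform pipeline on the payload.
    let (safe_payload, report) = pipeline::process(&payload, &config).unwrap();
    eprintln!("{}", report); // what changed and why
}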
Project Structure
shift/
├── shift-core/              Library crate: shift-preflight
│   └── src/
│       ├── inspector/       Format detection, metadata extraction
│       ├── policy/          Provider profiles, constraint evaluation
│       ├── transformer/     Image resize, recompress, SVG rasterize
│       ├── payload/         OpenAI + Anthropic message format parse
│       ├── pipeline.rs      Orchestrator: inspect -> policy -> transform
│       ├── cost.rs          Token estimation (OpenAI tile, Anthropic pixel)
│       ├── stats.rs         Persistent run statistics, gain summaries
│       ├── report.rs        Transformation report with token savings
│       └── mode.rs          DriveMode, SvgMode, ShiftConfig
├── shift-cli/               Binary crate → shift-ai
├── runtime/                 TypeScript: @shift-preflight/runtime
│   ├── src/core/            Optimizer, binary detection, data conversion
│   ├── src/middleware/      AI SDK middleware (transformParams)
│   ├── src/proxy/           HTTP proxy (Hono server + routes)
│   └── test/                Unit + integration tests
├── profiles/                Provider constraint JSON (embedded at compile)
├── tests/                   Test images and sample payloads
└── .github/workflows/       CI + release pipelines
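The cost.rs entry above mentions tile-based (OpenAI) and pixel-based (Anthropic) token estimation. The sketch below follows the estimation rules those providers have published for image inputs, as commonly described: 85 base tokens plus 170 per 512px tile for OpenAI high-detail images, and roughly width * height / 750 for Anthropic. It is an illustration under those assumptions, not a copy of SHIFT's implementation. The function names are hypothetical, and the OpenAI variant here only downscales.

```python
import math


def openai_tile_tokens(width, height):
    """Estimate tokens for a high-detail image using the tile rule."""
    # Step 1: fit within a 2048 x 2048 square, preserving aspect ratio.
    scale = min(1.0, 2048 / max(width, height))
    width, height = width * scale, height * scale
    # Step 2: shrink so the shortest side is at most 768 px.
    scale = min(1.0, 768 / min(width, height))
    width, height = width * scale, height * scale
    # Step 3: count 512 px tiles; 170 tokens per tile plus 85 base tokens.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 85 + 170 * tiles


def anthropic_pixel_tokens(width, height):
    """Estimate tokens as pixel count divided by 750."""
    return round(width * height / 750)
```

Under these rules a 1024x1024 image costs 765 tokens on the OpenAI estimate, which shows why resizing before sending can reduce token usage substantially.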
AI Agent Skill
If you use an AI coding agent (Claude Code, Cursor, Copilot, Windsurf, etc.), you can teach it to use SHIFT automatically:
npx skills add alohaninja/shift
This installs the shift-ai-preflight skill, which teaches your agent when and how to use SHIFT to optimize image payloads. The skill activates automatically when building or debugging multimodal API requests with inline base64 images.
Runtime: Middleware & Proxy
The @shift-preflight/runtime package provides two ways to integrate SHIFT into any AI agent or application without manually calling the CLI. Both use shift-ai under the hood and gracefully no-op if it's not installed.
npm install @shift-preflight/runtime
OpenCode Plugin
The @shift-preflight/opencode-plugin package auto-starts the SHIFT proxy on every OpenCode launch. No manual proxy management needed.
Add the plugin and provider base URL to your opencode.json (project-level or ~/.config/opencode/opencode.json):
{
  "plugin": ["@shift-preflight/opencode-plugin"],
  "provider": {
    "anthropic": {
      "options": {
        "baseURL": "http://localhost:8787/v1"
      }
    }
  }
}
Important: The baseURL must include /v1. OpenCode's Anthropic client appends /messages (not /v1/messages) to the base URL.
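A tiny sketch of why the /v1 suffix matters: if the client appends /messages to whatever base URL it is given, only a base URL ending in /v1 produces the /v1/messages path. The function name anthropic_messages_url is hypothetical and merely mimics the appending behavior described above.

```python
def anthropic_messages_url(base_url: str) -> str:
    """Mimic a client that appends /messages to the configured base URL."""
    return base_url.rstrip("/") + "/messages"
```

With "http://localhost:8787/v1" the result is "http://localhost:8787/v1/messages". With "http://localhost:8787" it is "http://localhost:8787/messages", which lacks the /v1 prefix.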
On every OpenCode launch the plugin:
- Checks prerequisites: verifies shift-ai is on PATH. Silently skips if not installed.
- Probes port 8787: if the proxy is already running, skips startup. Verifies identity via /health.
- Starts the proxy: spawns a detached background process that outlives the OpenCode session.
- Verifies startup: confirms the proxy is healthy before proceeding.
OpenCode auto-installs npm plugins at startup via Bun. No npm install needed. First launch takes ~6 seconds (npx download); subsequent launches detect the running proxy instantly.
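The first two startup checks can be approximated with the standard library. This is a sketch of the general idea, not the plugin's code: the function names are hypothetical, and the /health identity check is left out because its response format is not described here.

```python
import shutil
import socket


def shift_installed() -> bool:
    """True if a shift-ai executable is found on PATH."""
    return shutil.which("shift-ai") is not None


def is_port_open(host="127.0.0.1", port=8787, timeout=0.5) -> bool:
    """True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

An open port only shows that some process is listening, which is why a follow-up identity check such as the /health probe is still needed.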
Once the proxy is running, other agents can piggyback on it:
# Other agents reuse the same proxy
ANTHROPIC_BASE_URL=http://localhost:8787 claude # Claude Code
OPENAI_BASE_URL=http://localhost:8787 codex # Codex CLI
To disable the proxy temporarily, prefix the baseURL key with an underscore ("_baseURL") in your opencode.json.
AI SDK Middleware
For any Vercel AI SDK app (OpenCode, Next.js, custom agents), the middleware intercepts transformParams and optimizes all images in all messages before they reach the provider.
import { shiftMiddleware } from "@shift-preflight/runtime";
import { wrapLanguageModel, generateText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
const model = wrapLanguageModel({
model: anthropic("claude-sonnet-4-20250514"),
middleware: shiftMiddleware({ mode: "balanced" }),
});
// All images in all messages are now optimized automatically
const result = await generateText({ model, messages: [...] });
Configuration options:
| Option | Default | Description |
|---|---|---|
| mode | "balanced" | Drive mode: performance, balanced, or economy |
| minSize | 100_000 | Skip images smaller than this (bytes) |
| disabled | false | Kill switch: pass requests through unchanged |
| provider | auto | Override auto-detected provider |
| binary | auto | Path to shift-ai binary |
| onOptimize | -- | Callback with per-image metrics |
Already-optimized images are tagged with a skip marker and won't be re-processed on subsequent conversation turns.
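The minSize threshold can be pictured as a check on the decoded byte length of each inline image. The sketch below is an assumption about how such a filter might look. The function names are hypothetical, and whether SHIFT measures decoded or encoded size is not stated here.

```python
import base64


def decoded_size(image: str) -> int:
    """Byte length of a base64 image, given as a data URI or raw base64."""
    b64 = image.split(",", 1)[1] if image.startswith("data:") else image
    return len(base64.b64decode(b64))


def should_optimize(image: str, min_size: int = 100_000) -> bool:
    """Skip images smaller than min_size bytes, mirroring the minSize option."""
    return decoded_size(image) >= min_size
```

Small icons and thumbnails fall below the default threshold and pass through untouched, which avoids spending time on images with little to save.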
HTTP Proxy
For any agent in any language, the proxy sits between your agent and the AI provider. It intercepts requests, optimizes images via SHIFT, and forwards to the real API. Auth headers and SSE streams pass through unchanged.
npx @shift-preflight/runtime proxy --port 8787 --mode balanced --verbose
Then set the base URL for your agent:
# Claude Code
export ANTHROPIC_BASE_URL=http://localhost:8787
# Codex CLI
export OPENAI_BASE_URL=http://localhost:8787
# Gemini CLI
export GEMINI_API_BASE=http://localhost:8787
# Python
client = Anthropic(base_url="http://localhost:8787")
client = OpenAI(base_url="http://localhost:8787/v1")
Proxy routes:
| Route | Provider |
|---|---|
| POST /v1/messages | Anthropic |
| POST /v1/chat/completions | OpenAI |
| POST /v1beta/models/* | Google (passthrough, native support pending) |
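The route table amounts to a mapping from request method and path to a provider. A minimal sketch of that mapping follows. The function name route_provider is hypothetical, and the real proxy is a Hono server, not Python.

```python
from typing import Optional


def route_provider(method: str, path: str) -> Optional[str]:
    """Map an incoming request to a provider, following the route table."""
    if method.upper() != "POST":
        return None
    if path == "/v1/messages":
        return "anthropic"
    if path == "/v1/chat/completions":
        return "openai"
    # Google routes are listed as passthrough (no optimization yet).
    if path.startswith("/v1beta/models/"):
        return "google"
    return None
```

Requests that match no route return None in this sketch; how the actual proxy treats unmatched paths is not covered here.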
The proxy can also be started programmatically:
import { startProxy } from "@shift-preflight/runtime/proxy";
const server = await startProxy({
port: 8787,
mode: "balanced",
verbose: true,
});
Provider support:
| Provider | Middleware | Proxy | Constraints |
|---|---|---|---|
| Anthropic | Yes | Yes | 5 MB max, 1.15 MP, 8000 px |
| OpenAI | Yes | Yes | 20 MB max, 2048 px, tile tokens |
| Google | Yes | Passthrough* | 20 MB max, 3072 px |
* Google proxy optimization pending native SHIFT support for Gemini payload format.
Troubleshooting
shift-ai: command not found
If you installed via the curl script, make sure ~/.local/bin is in your PATH:
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.zshrc # or ~/.bashrc
source ~/.zshrc
400 errors from AI APIs
SHIFT is designed to prevent 400 errors caused by images that violate provider limits. Make sure you pipe your request through SHIFT before sending it to the API:
shift-ai request.json -p openai | curl -s -X POST \
https://api.openai.com/v1/chat/completions \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \
-d @-
SVG not rendering after rasterization
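SHIFT uses resvg for SVG rasterization, which supports most SVG features including gradients, text, and viewBox. If your SVG uses external resources (fonts, linked images), they may not render. In that case, try --svg-mode hybrid, which keeps the SVG source as text alongside the rasterized PNG. Note that --svg-mode source currently drops the SVG from the payload instead of passing it as text (see SVG Handling).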
Stats file location
Run statistics are stored at ~/.shift/stats.jsonl. Delete this file to reset your cumulative stats. Use --no-stats to disable tracking for a single run.
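For ad hoc inspection, the stats file can be read line by line. This sketch assumes only what the .jsonl extension implies, one JSON object per line. It makes no assumption about field names, and the function name load_stats is hypothetical.

```python
import json
from pathlib import Path


def load_stats(path="~/.shift/stats.jsonl"):
    """Return the list of JSON records in a JSONL file, or [] if it is missing."""
    stats_path = Path(path).expanduser()
    if not stats_path.exists():
        return []
    records = []
    with stats_path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines
                records.append(json.loads(line))
    return records
```

A missing file yields an empty list, which matches the state after deleting the file to reset cumulative stats.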
Roadmap (v2+)
- Video: frame sampling, keyframe extraction, resolution downscale
- Audio: compression, transcription to text
- Documents: chunking, summarization, text extraction
- Smart image selection: near-duplicate detection, keep most informative
- Caption fallback: replace low-value images with text descriptions
- Adaptive policies: dynamic adjustment based on request size and latency targets