
Ambient AI Assistant (InvisibleGPT)

Multi-modal input system with priority-based output routing for seamless AI interaction workflows.

2023-10-15
Python · OpenAI API · LaTeX

InvisibleGPT: Ambient AI Assistant

A minimalist tool for capturing prompts and delivering AI responses through various input/output methods, designed for zero context switching.

What I Built

Three Input Modes:

  • Clipboard: Automatically processes when you copy text
  • Keyboard: Cmd+. to capture, Cmd+Shift+. to submit
  • Terminal: Direct CLI input for scripting

Four Output Modes:

  • PDF: Two-column research paper format via pdflatex
  • Clipboard: Silent copy, paste anywhere
  • Terminal: Colored output for readability
  • File: Append to log file
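A dispatch table is a natural way to route between these modes. A minimal sketch, with assumptions: the handler names, the `cfg` dict shape, and the defaults are illustrative, not the project's actual code:

```python
def to_terminal(text, cfg):
    # ANSI cyan for readability, like the project's terminal mode
    print(f"\033[96m{text}\033[0m")

def to_file(text, cfg):
    # Append mode matches the "append to log file" behavior
    with open(cfg.get("file", "output.txt"), "a", encoding="utf-8") as f:
        f.write(text + "\n")

# Clipboard and PDF handlers would slot in the same way
OUTPUT_HANDLERS = {
    "terminal": to_terminal,
    "file": to_file,
}

def route_output(text, cfg):
    handler = OUTPUT_HANDLERS[cfg.get("method", "terminal")]
    handler(text, cfg)
```

Adding a new output mode is then one function plus one dictionary entry.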

Plus:

  • Conversation memory across interactions
  • Configurable model selection (o4-mini, gpt-4o, gpt-3.5-turbo, etc.)
  • TOML-based configuration with environment variable overrides
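Conversation memory can be as simple as a growing message list in the Chat Completions format. A sketch, assuming that representation; the class name and system prompt are illustrative, not taken from the project:

```python
class ConversationMemory:
    """Accumulates chat turns so each request carries prior context."""

    def __init__(self, system_prompt="You are a helpful assistant."):
        self.messages = [{"role": "system", "content": system_prompt}]

    def add_user(self, text):
        self.messages.append({"role": "user", "content": text})

    def add_assistant(self, text):
        self.messages.append({"role": "assistant", "content": text})

# Each API call would then pass memory.messages so the model sees
# earlier turns, e.g.:
#   client.chat.completions.create(model=model_name, messages=memory.messages)
```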

Technical Details

Configuration System

[model]
name = "o4-mini-2025-04-16"

[input]
method = "keyboard"

[output]
method = "clipboard"
file = "output.txt"

[pdf]
pdflatex_path = ""
output_file = "output.pdf"

Override via CLI:

python src/main.py --input clipboard --output pdf --model gpt-4o

LaTeX Character Escaping

Automatic handling of special characters (\, {, }, $, &, %, #, _, ~, ^) for PDF generation. Prevents most formatting errors.
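A character-by-character translation avoids the classic ordering bug where an inserted backslash gets escaped a second time. A sketch of how such an escaper could look (the function name and table are illustrative, not the project's code):

```python
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "{": r"\{",
    "}": r"\}",
    "$": r"\$",
    "&": r"\&",
    "%": r"\%",
    "#": r"\#",
    "_": r"\_",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}

def escape_latex(text: str) -> str:
    # Walking one input character at a time means replacement text
    # (which itself contains \, {, }) is never re-escaped.
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)
```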

Clipboard Polling

import time

import pyperclip  # cross-platform clipboard access

last_clipboard = ""
while True:
    current = pyperclip.paste()
    # should_process / process_prompt are defined elsewhere in the project
    if current != last_clipboard and should_process(current):
        process_prompt(current)
    last_clipboard = current  # remember it so one copy isn't reprocessed
    time.sleep(0.1)  # 100ms poll

PDF Pipeline

LaTeX source → pdflatex → cleanup aux files → output PDF. Two-column article format with proper typography.
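That pipeline could be sketched as below. Assumptions: the file names and helper names are illustrative; `-interaction=nonstopmode` is a standard pdflatex flag that keeps it from stopping at an interactive prompt on errors:

```python
import subprocess
from pathlib import Path

AUX_SUFFIXES = (".aux", ".log", ".out")

def cleanup_aux(tex_path: Path) -> None:
    # Remove the auxiliary files pdflatex leaves next to the source
    for suffix in AUX_SUFFIXES:
        aux = tex_path.with_suffix(suffix)
        if aux.exists():
            aux.unlink()

def compile_pdf(latex_source: str, workdir: str, pdflatex: str = "pdflatex") -> Path:
    tex_path = Path(workdir) / "output.tex"
    tex_path.write_text(latex_source, encoding="utf-8")
    subprocess.run(
        [pdflatex, "-interaction=nonstopmode", tex_path.name],
        cwd=workdir, check=True, capture_output=True,
    )
    cleanup_aux(tex_path)
    return tex_path.with_suffix(".pdf")
```

The empty `pdflatex_path` in the config above would fall back to whatever `pdflatex` is on `PATH`.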

Use Cases

Research Writing: Copy question → PDF appears. No app switching.

Coding: Hit shortcut → ask question → response in clipboard → paste into code.

Batch Processing: Script feeds prompts via terminal → all responses append to single file.

Performance

  • Clipboard detection: <100ms
  • API request: 1-5s depending on model
  • PDF generation: 500ms-1s
  • Memory: ~50MB
  • CPU (idle): <1%

Learnings

What Worked:

  • Zero context switching with clipboard/keyboard modes
  • TOML config is cleaner than JSON
  • LaTeX escaping prevented 90% of PDF failures
  • Different output modes for different workflows

What Could Be Better:

  • Clipboard polling wastes CPU; OS-level hooks would be better
  • No streaming (waits for full response)
  • Single conversation only (no session management)
  • LaTeX template is inflexible

Unexpected Use Cases:

  • Live lecture note-taking (clipboard mode)
  • Code review comments (keyboard → clipboard)
  • SQL generation from natural language (terminal scripting)

Future Work

  • Streaming responses for better UX
  • Multiple conversation sessions
  • Custom templates (markdown → PDF, HTML output)
  • OS integration (macOS Services, Windows context menu)
  • Smart context detection (code, equations, URLs)