June 5, 2026

Val Town: 100ms Live Code Deploys + AI Bug Catching


Tool of the Week

Val Town plugin deploys live code in 100ms

Coding agents can now write and deploy Val Town functions directly through an MCP server and platform Skills, so the path from draft to live code runs inside the agent instead of through manual steps.

Agents gain native access to Val Town's runtime, storage, and execution model without context switching. Reduces friction for agentic workflows that need persistent, deployable code artifacts.

Replaces manual Val Town dashboard interactions for agent-driven development. Requires Claude Code, Codex, or Cursor with plugin support. Ready now with a single-command install, though how well it works depends on the maturity of your agent framework.

  • write idiomatic vals and deploy live to Val Town in 100ms
  • list, read, create, edit, run, and deploy vals; view logs; query your SQLite and Blob storage; manage env vars, branches, and version history
  • npx plugins add val-town/plugins
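
For orientation, here is a minimal sketch of the kind of artifact an agent would write and deploy: an HTTP val, which is a default-exported function from a web-standard Request to a Response. The greeting logic and the `name` query parameter are illustrative assumptions, not part of the plugin.

```typescript
// Minimal HTTP handler in the shape of a Val Town HTTP val:
// a default export that takes a Request and returns a Response.
// The `name` parameter and the greeting are made up for illustration.
export default async function handler(req: Request): Promise<Response> {
  const url = new URL(req.url);
  const name = url.searchParams.get("name") ?? "world";
  return new Response(JSON.stringify({ message: `Hello, ${name}!` }), {
    headers: { "content-type": "application/json" },
  });
}
```

Requesting the deployed URL with `?name=Val` would return a JSON body whose `message` field greets that name.
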
mcp-server · coding-agents · deployment-automation · val-town · plugin-integration

Dev Signal

Get issues like this in your inbox — free, 3x a week.

Quick Signals

Voice sessions now bootstrap with user profile context

OpenClaw 2026.5.21 injects IDENTITY.md, USER.md, and SOUL.md into voice session instructions by default, eliminating mid-conversation context setup.

Voice agents no longer start stateless and confused about identity—they inherit persona and user context from bootstrap, reducing inference errors and improving reliability. Eliminates the need to re-establish agent identity during each session.

Replaces manual context injection patterns. Requires populated IDENTITY.md/USER.md/SOUL.md files and an upgrade to 2026.5.21 or later. Set bootstrapContextFiles to an empty array if you need anonymous sessions. Ready to use now with no additional config, since it ships on by default.

  • realtime voice sessions now include bounded IDENTITY.md, USER.md, and SOUL.md profile context in the session instructions by default
  • voice sessions bootstrap with your IDENTITY.md (who you are), USER.md (who you're helping), and SOUL.md (how you behave) as bounded context
  • This is a security-positive change. When voice sessions have bounded, explicit context, they don't have to guess or infer who they are.
  • No additional config needed — it's on by default
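
To make "bounded context" concrete, here is a hypothetical sketch of how profile files could be truncated and folded into session instructions. It is not OpenClaw's implementation or config schema: the function name, the per-file character bound, and the section format are all assumptions. Only the `bootstrapContextFiles` name and the three file names come from the release note.

```typescript
// Hypothetical sketch, NOT OpenClaw's code: illustrates bounded profile context.
type ProfileFiles = Record<string, string>; // file name -> file contents

function buildSessionInstructions(
  base: string,
  files: ProfileFiles,
  bootstrapContextFiles: string[], // name from the release note; semantics assumed
  maxCharsPerFile = 2000, // assumed bound, chosen for illustration
): string {
  const sections: string[] = [];
  for (const name of bootstrapContextFiles) {
    const text = files[name]?.trim();
    if (!text) continue; // skip missing or empty files
    sections.push(`## ${name}\n${text.slice(0, maxCharsPerFile)}`);
  }
  return [base, ...sections].join("\n\n");
}
```

Passing an empty array for `bootstrapContextFiles` returns only the base instructions, which mirrors the anonymous-session case described above.
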
voice-agents · openai-realtime · session-context · identity-management · configuration

getdebug 0.4.0 catches AI-app bugs Bandit misses

Regex prefilters catch prompt-injection and unbounded-stream patterns; Bandit and Semgrep generate false positives on safe allowlist-then-run patterns because they don't track data provenance.

Existing Python SAST tools (Bandit, Semgrep) have no AI-app-specific rules and flag safe patterns as vulnerable, forcing manual triage. getdebug fills the gap, with a reported 100% precision and recall on AI-specific fixtures and zero false positives on real code.

Complements rather than replaces Bandit and Semgrep. Run all three: `bandit -r .`, `semgrep --config auto .`, then `npx @getdebug/cli@0.4.0 analyze .`. Requires Node.js runtime for getdebug CLI. Worth trying now on Python LLM projects; optional Ollama integration for on-device LLM analysis.

  • pattern-based regex prefilters in JS/TS + Python (new in 0.4.0)
  • Bandit and Semgrep catch the unsafe-tool-output fixture only via their generic subprocess.run(shell=True) rules
  • getdebug's regex specifically requires the tool_call.input.X / block.input.X reference in the sink arg
  • Both tools miss the other four behavioural categories (pii-in-prompt, unsafe-role-merge, prompt-injection, unbounded-stream) entirely
  • getdebug 6 are all AI-app categorized
  • None of them subsume the others
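
The provenance point in the bullets above can be sketched as a regex prefilter that fires only when the sink argument references tool input. This is an illustration, not getdebug's actual rule: the pattern, the function name, and the single-line, argument-order assumptions are all hypothetical.

```typescript
// Illustrative only: this is not getdebug's actual regex.
// Flags Python lines where subprocess.run(..., shell=True) receives a
// tool_call.input.X or block.input.X reference before the shell=True keyword.
const UNSAFE_TOOL_OUTPUT =
  /subprocess\.run\([^)]*\b(?:tool_call|block)\.input\.\w+[^)]*shell\s*=\s*True[^)]*\)/;

function findUnsafeToolOutput(source: string): number[] {
  const hits: number[] = [];
  source.split("\n").forEach((line, i) => {
    if (UNSAFE_TOOL_OUTPUT.test(line)) hits.push(i + 1); // 1-based line numbers
  });
  return hits;
}

const sample = [
  "import subprocess",
  "subprocess.run(tool_call.input.command, shell=True)",
  "subprocess.run(ALLOWED[choice], shell=True)",
  'subprocess.run(f"ls {block.input.path}", shell=True)',
].join("\n");

console.log(findUnsafeToolOutput(sample)); // flags lines 2 and 4, not the allowlist lookup
```

A generic shell=True rule would flag all three calls, including the allowlist lookup on line 3. Requiring the tool-input reference is what keeps that line quiet.
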
python-security · llm-app-security · static-analysis · sast · open-source

LiteRT-LM ships native Gemma 4 multi-token prediction support

Speculative decoding with co-located GPU execution of drafter and primary model eliminates cross-IP data transfers, achieving up to 2.2x faster inference on mobile hardware.

On-device LLM inference latency is a hard constraint for mobile UX. Native MTP support with optimized KV cache management means you can ship faster agentic features without rebuilding inference pipelines.

Replaces hand-rolled speculative decoding or slower runtimes like llama.cpp for Gemma 4 on Android/iOS. Requires Swift/Kotlin/JavaScript adoption and GitHub source access. Worth trying now if you're shipping Gemma 4 mobile—benchmarks are Google-attributed and the API is production-ready.

  • up to 2.2x faster inference
  • the highest-performing runtime environment for Gemma models
  • optimizing the data interplay between the primary Gemma 4 model and the MTP drafter
  • 1.8x to 3.7x faster than competing frameworks like llama.cpp, MLX, Cactus, and ONNX
  • the ~2.58GB Gemma 4 E2B model taking just 607MB on Apple mobile CPUs
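
For readers new to the technique, here is a toy sketch of greedy speculative decoding: a cheap drafter proposes several tokens and the primary model verifies them. It is a generic illustration, not LiteRT-LM's implementation. The `NextToken` type and both function names are assumptions, and a real runtime verifies all drafted tokens in one batched pass rather than one call per token.

```typescript
// Toy greedy speculative decoding. A "model" here is just a function
// from a token context to the next token (hypothetical stand-in).
type NextToken = (context: readonly number[]) => number;

function speculativeStep(
  target: NextToken,
  drafter: NextToken,
  context: readonly number[],
  k: number,
): number[] {
  // 1. The drafter proposes k tokens autoregressively.
  const draft: number[] = [];
  for (let i = 0; i < k; i++) draft.push(drafter([...context, ...draft]));

  // 2. The target checks each position; its own token is always what gets kept.
  const accepted: number[] = [];
  for (let i = 0; i < k; i++) {
    const expected = target([...context, ...accepted]);
    accepted.push(expected);
    if (expected !== draft[i]) return accepted; // first mismatch ends the step
  }

  // 3. Every draft matched: verification also yields one bonus token.
  accepted.push(target([...context, ...accepted]));
  return accepted;
}

function generate(
  target: NextToken,
  drafter: NextToken,
  prompt: readonly number[],
  maxNewTokens: number,
  k = 4,
): number[] {
  const out = [...prompt];
  while (out.length - prompt.length < maxNewTokens) {
    out.push(...speculativeStep(target, drafter, out, k));
  }
  return out.slice(0, prompt.length + maxNewTokens);
}
```

Because every kept token comes from the target, the output matches plain greedy decoding with the target alone. The speedup comes from accepting several tokens per verification pass when the drafter guesses well.
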
gemma-4 · speculative-decoding · mobile-inference · litert · on-device-llm

Generative UI moves agents beyond text to UI

CopilotKit exposes app state as readables and actions, letting agents manipulate the UI directly instead of narrating decisions. There are three patterns to choose from: static tool calls, declarative JSON schema, or raw HTML generation.

Replaces text-only agent output with direct UI manipulation, enabling human-in-the-loop confirmation flows and richer interactive experiences without custom chat parsing. Reduces boilerplate for connecting LLM agents to React frontends.

Ready now for React/Next.js + LangGraph stacks. CopilotKit (MIT, 30k+ GitHub stars) handles serialization and chat UI out of the box; you define readables and actions as hooks next to components. Requires an agent backend (LangGraph, Agent Framework) and clear action/state contracts. Start with the static AG-UI pattern, then graduate to declarative A2UI if you need schema-driven rendering.

  • Generative UI is the paradigm shift that enables AI agents to dynamically generate the UI elements required to complete a task
  • three core patterns exist for building Generative UI: Static Generative UI (AG-UI), Declarative Generative UI (A2UI, Open-JSON-UI), and Open-ended Generative UI (MCP Apps)
  • CopilotKit: a comprehensive framework for building agent-driven interfaces
  • It is an open-source (MIT) framework with 30k+ stars on GitHub
  • The readable/action model is clean. You declare what the agent can see (useCopilotReadable) and do (useCopilotAction) right next to the React components that own the state
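
The readable/action split can be sketched without any framework. The code below is a hypothetical illustration of the idea only: `AgentSurface`, its methods, and the todo example are assumptions and are not CopilotKit's API, where the equivalent declarations are the useCopilotReadable and useCopilotAction hooks.

```typescript
// Hypothetical, framework-free sketch of the readable/action model.
type Action = {
  name: string;
  description: string;
  handler: (args: Record<string, unknown>) => string; // result reported back to the agent
};

class AgentSurface {
  private readables = new Map<string, () => unknown>();
  private actions = new Map<string, Action>();

  addReadable(description: string, read: () => unknown): void {
    this.readables.set(description, read);
  }

  addAction(action: Action): void {
    this.actions.set(action.name, action);
  }

  // What the agent gets to see as context on each turn.
  snapshot(): Record<string, unknown> {
    const out: Record<string, unknown> = {};
    this.readables.forEach((read, description) => {
      out[description] = read();
    });
    return out;
  }

  // Route a tool call from the agent to the matching declared action.
  dispatch(call: { name: string; args: Record<string, unknown> }): string {
    const action = this.actions.get(call.name);
    if (!action) return `unknown action: ${call.name}`;
    return action.handler(call.args);
  }
}

// Usage: state lives with its owner; the agent sees and changes it only via the surface.
const todos: string[] = ["write draft"];
const surface = new AgentSurface();
surface.addReadable("current todos", () => [...todos]);
surface.addAction({
  name: "addTodo",
  description: "Append a todo item",
  handler: (args) => {
    const title = String(args.title ?? "");
    todos.push(title);
    return `added ${title}`;
  },
});
```

The agent never parses chat text to change state. It emits a structured call such as `{ name: "addTodo", args: { title: "ship it" } }`, and the owning code decides what that call does.
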
generative-ui · copilotkit · langgraph · agent-tools · react

Stream LLM tokens to browser with fetch, not EventSource

Use fetch + response.body.getReader() to consume model streams and re-emit as SSE events, giving you POST support, custom headers, and AbortController cancellation that EventSource lacks.

Streaming tokens over a 15-40 second generation feels responsive where a frozen spinner does not. Proper cancellation prevents orphaned GPU jobs and duplicate billing. The anti-buffering headers (no-transform, X-Accel-Buffering: no) make proxies flush tokens immediately instead of batching them at the end.

Replaces naive response-waiting patterns and EventSource for LLM endpoints. Requires a Next.js 15 Route Handler, AbortController wiring through the streamModel generator, TextDecoder buffering to handle events and characters split across chunk boundaries, and maxDuration tuning on Vercel. Ready now: this is production code from spectr-ai's security report tool.

  • the server forwards hundreds of text fragments coming out of the model in real time
  • EventSource is the obvious tool for SSE, and it handles reconnection for free. But it only does GET requests
  • controller.enqueue(encoder.encode(`data: ${JSON.stringify(event)}\n\n`))
  • no-transform and X-Accel-Buffering: no. These are the anti-buffering headers. no-transform tells proxies not to gzip-buffer the body, and X-Accel-Buffering: no disables nginx's response buffer
  • When the browser aborts, Next.js aborts request.signal, which I pass into streamModel, which passes it to the model fetch
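
The two halves of this pattern can be sketched with web-standard APIs only. This is a minimal illustration under assumptions, not the article's production code: `sseResponse` and `readSSE` are hypothetical names, the event payloads are arbitrary JSON, and the `text/event-stream` content type is added as the standard SSE type alongside the two headers quoted above.

```typescript
// Anti-buffering headers quoted above, plus the standard SSE content type.
const SSE_HEADERS = {
  "Content-Type": "text/event-stream",
  "Cache-Control": "no-cache, no-transform",
  "X-Accel-Buffering": "no",
};

// Server side: frame each event from an async source as an SSE "data:" record.
function sseResponse(events: AsyncIterable<unknown>, signal?: AbortSignal): Response {
  const encoder = new TextEncoder();
  const stream = new ReadableStream<Uint8Array>({
    async start(controller) {
      for await (const event of events) {
        if (signal?.aborted) break; // stop forwarding once the client has gone away
        controller.enqueue(encoder.encode(`data: ${JSON.stringify(event)}\n\n`));
      }
      controller.close();
    },
  });
  return new Response(stream, { headers: SSE_HEADERS });
}

// Client side: read the body with getReader() instead of EventSource.
async function* readSSE(response: Response): AsyncGenerator<unknown> {
  if (!response.body) throw new Error("response has no body");
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters intact across chunk boundaries
    buffer += decoder.decode(value, { stream: true });
    let boundary = buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const frame = buffer.slice(0, boundary);
      buffer = buffer.slice(boundary + 2);
      if (frame.startsWith("data: ")) yield JSON.parse(frame.slice(6));
      boundary = buffer.indexOf("\n\n");
    }
  }
}
```

In a browser, the `Response` passed to `readSSE` would come from a `fetch` call made with `method: "POST"` and an AbortController's `signal`, which is exactly what EventSource cannot offer.
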
sse · streaming · llm · next-js · backpressure

Data Point

LLMs can't reliably execute basic security exploits

GPT-5.5 solves the SQL injection task 70% of the time; Claude Sonnet 4.6 hits budget limits before breaching. Guardrails work, but inconsistently.

Security teams need empirical data on LLM attack surface before deploying agents with data access. This benchmark reveals which models leak user data and which ones stop themselves.

Replaces hand-waving about LLM safety with actual exploit metrics. Requires building a vulnerable test app matching your threat model. Worth running now if you're shipping agents with sensitive context—the variance between models is stark.

  • GPT-5.5 performed the best, solving the task in seven out of 10 runs
  • DeepSeek-V4-Pro was the runner-up with only three successful runs
  • Claude Sonnet 4.6 was the most expensive model to run, and it only solved the task on two runs
  • Many models could not complete the task due to security guardrails
llm-security · prompt-injection · agent-safety · benchmark
