inferis-ml examples

Run AI models directly in the browser — no server, no API keys, data never leaves your device. Each example loads a real model via Web Workers so the page stays responsive.

GitHub Repository →
Web Workers · WebGPU / WASM · LLMs · Text-to-Image · Image-to-Image · Image-to-Video · Video-to-Video · @huggingface/transformers · @mlc-ai/web-llm · onnxruntime · Zero server cost


Example 1
LLM Chat
Chat with a language model running entirely in your browser via WebGPU. Multi-turn conversation, streaming tokens, no server required.
text-generation web-llm WebGPU stream()
Example 2
Semantic Similarity
Compare two texts with vector embeddings. Uses cosine similarity to measure how related the meanings are.
feature-extraction mxbai-embed-xsmall
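The comparison step itself is only a few lines. A minimal sketch of the cosine-similarity computation in TypeScript, assuming the two embedding vectors have already been produced by the feature-extraction pipeline (the hard-coded vectors below are placeholders, not real embeddings):

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
// 1 means identical direction (same meaning), 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Parallel vectors score 1, orthogonal vectors score 0.
console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

Because cosine similarity ignores vector magnitude, only the direction of the embedding matters, which is why it is the standard choice for comparing sentence embeddings.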
Example 3
Sentiment Analysis
Classify text as positive or negative with a confidence score. Classic NLP task in the browser.
text-classification distilbert-sst2
Example 4
Text Generation
Stream tokens in real time from a language model. Shows the streaming API and cancellation via AbortController.
text-generation stream() gpt2
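The stream-and-cancel pattern can be sketched without the model itself. Below, `fakeTokenStream` is a hypothetical stand-in for the real token stream returned by the text-generation pipeline; the AbortController wiring is the part the example demonstrates:

```typescript
// Stand-in for a model's token stream (hypothetical; the real example
// streams from a text-generation pipeline). Stops as soon as the signal fires.
async function* fakeTokenStream(tokens: string[], signal: AbortSignal) {
  for (const token of tokens) {
    if (signal.aborted) return; // stop generating once the user cancels
    yield token;
    await new Promise((r) => setTimeout(r, 1)); // simulate per-token latency
  }
}

async function main() {
  const controller = new AbortController();
  const received: string[] = [];
  const stream = fakeTokenStream(["Once", " upon", " a", " time"], controller.signal);
  for await (const token of stream) {
    received.push(token);
    if (received.length === 2) controller.abort(); // e.g. user hit "Stop"
  }
  return received;
}

main().then((tokens) => console.log(tokens.join(""))); // "Once upon"
```

The same controller can also be passed to other async work (fetches, timers), so one "Stop" button can tear down everything a generation started.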
Example 5
Image Classification
Drop an image and get the top-5 predicted labels with confidence scores. ViT model, runs fully in-browser.
image-classification vit-base-patch16
Example 6
Question Answering
Provide a context paragraph and ask a question — the model extracts the exact answer span from the text.
question-answering distilbert-distilled-squad
Example 7
Named Entity Recognition
Highlights persons, organizations, and locations inline in text using a token classification model.
token-classification bert-base-NER
Example 8
Priority Queue
Queue multiple inference requests simultaneously with different priorities — high-priority tasks run first.
priority queue sentiment-analysis
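The scheduling idea can be sketched in a few lines of TypeScript. This is an assumed shape for illustration, not the example's actual implementation: tasks carry a numeric priority, and a single drain loop (like one inference worker) always runs the highest-priority pending task next:

```typescript
// Sketch of a priority inference queue (assumed shape, for illustration).
class PriorityQueue {
  private tasks: { priority: number; run: () => Promise<void> }[] = [];
  private running = false;

  enqueue(priority: number, work: () => Promise<void>): Promise<void> {
    return new Promise((resolve) => {
      this.tasks.push({ priority, run: async () => { await work(); resolve(); } });
      void this.drain();
    });
  }

  private async drain(): Promise<void> {
    if (this.running) return;  // only one task runs at a time
    this.running = true;
    await Promise.resolve();   // let same-tick enqueues land before picking
    while (this.tasks.length > 0) {
      this.tasks.sort((a, b) => b.priority - a.priority); // highest first
      await this.tasks.shift()!.run();
    }
    this.running = false;
  }
}

// Three requests arrive together; the high-priority one runs first.
const queue = new PriorityQueue();
const order: number[] = [];
for (const p of [1, 3, 2]) {
  void queue.enqueue(p, async () => { order.push(p); });
}
setTimeout(() => console.log(order), 10); // [ 3, 2, 1 ]
```

Re-sorting on every pick keeps the sketch short; a real implementation would use a heap so that enqueue and dequeue stay O(log n).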