inferis-ml examples

Run AI models directly in the browser — no server, no API keys, data never leaves your device. Each example loads a real model via Web Workers so the page stays responsive.

GitHub Repository →
Web Workers · WebGPU / WASM · LLMs · Text-to-Image · Image-to-Image · Image-to-Video · Video-to-Video · @huggingface/transformers · @mlc-ai/web-llm · onnxruntime · Zero server cost


Example 1
LLM Chat
Chat with a language model running entirely in your browser via WebGPU. Multi-turn conversation, streaming tokens, no server required.
text-generation web-llm WebGPU stream()
Example 2
Semantic Similarity
Compare two texts with vector embeddings. Uses cosine similarity to measure how related the meanings are.
feature-extraction mxbai-embed-xsmall
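The comparison step itself is only a few lines. A minimal sketch of the cosine-similarity computation in TypeScript, assuming the two embedding vectors have already been produced by the feature-extraction pipeline (the hard-coded vectors below are placeholders, not real embeddings):

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|).
// 1 means identical direction (same meaning), 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Parallel vectors score 1, orthogonal vectors score 0.
console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

Because cosine similarity ignores vector magnitude, only the direction of the embedding matters, which is why it is the standard choice for comparing sentence embeddings.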
Example 3
Sentiment Analysis
Classify text as positive or negative with a confidence score. Classic NLP task in the browser.
text-classification distilbert-sst2
Example 4
Text Generation
Stream tokens in real time from a language model. Shows the streaming API and cancellation via AbortController.
text-generation stream() gpt2
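The stream-and-cancel pattern can be sketched without the model itself. Below, `fakeTokenStream` is a hypothetical stand-in for the real token stream returned by the text-generation pipeline; the AbortController wiring is the part the example demonstrates:

```typescript
// Stand-in for a model's token stream (hypothetical; the real example
// streams from a text-generation pipeline). Stops as soon as the signal fires.
async function* fakeTokenStream(tokens: string[], signal: AbortSignal) {
  for (const token of tokens) {
    if (signal.aborted) return; // stop generating once the user cancels
    yield token;
    await new Promise((r) => setTimeout(r, 1)); // simulate per-token latency
  }
}

async function main() {
  const controller = new AbortController();
  const received: string[] = [];
  const stream = fakeTokenStream(["Once", " upon", " a", " time"], controller.signal);
  for await (const token of stream) {
    received.push(token);
    if (received.length === 2) controller.abort(); // e.g. user hit "Stop"
  }
  return received;
}

main().then((tokens) => console.log(tokens.join(""))); // "Once upon"
```

The same controller can also be passed to other async work (fetches, timers), so one "Stop" button can tear down everything a generation started.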
Example 5
Image Classification
Drop an image and get the top-5 predicted labels with confidence scores. ViT model, runs fully in-browser.
image-classification vit-base-patch16
Example 6
Question Answering
Provide a context paragraph and ask a question — the model extracts the exact answer span from the text.
question-answering distilbert-distilled-squad
Example 7
Named Entity Recognition
Highlights persons, organizations, and locations inline in text using a token classification model.
token-classification bert-base-NER
Example 8
Priority Queue
Queue multiple inference requests simultaneously with different priorities — high-priority tasks run first.
priority queue sentiment-analysis
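The scheduling idea can be sketched in a few lines of TypeScript. This is an assumed shape for illustration, not the example's actual implementation: tasks carry a numeric priority, and a single drain loop (like one inference worker) always runs the highest-priority pending task next:

```typescript
// Sketch of a priority inference queue (assumed shape, for illustration).
class PriorityQueue {
  private tasks: { priority: number; run: () => Promise<void> }[] = [];
  private running = false;

  enqueue(priority: number, work: () => Promise<void>): Promise<void> {
    return new Promise((resolve) => {
      this.tasks.push({ priority, run: async () => { await work(); resolve(); } });
      void this.drain();
    });
  }

  private async drain(): Promise<void> {
    if (this.running) return;  // only one task runs at a time
    this.running = true;
    await Promise.resolve();   // let same-tick enqueues land before picking
    while (this.tasks.length > 0) {
      this.tasks.sort((a, b) => b.priority - a.priority); // highest first
      await this.tasks.shift()!.run();
    }
    this.running = false;
  }
}

// Three requests arrive together; the high-priority one runs first.
const queue = new PriorityQueue();
const order: number[] = [];
for (const p of [1, 3, 2]) {
  void queue.enqueue(p, async () => { order.push(p); });
}
setTimeout(() => console.log(order), 10); // [ 3, 2, 1 ]
```

Re-sorting on every pick keeps the sketch short; a real implementation would use a heap so that enqueue and dequeue stay O(log n).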