100% ON-DEVICE · ZERO CLOUD

Your documents, conversations, and knowledge — processed by intelligence that runs entirely on your device. No servers. No leaks. No compromises.

Network calls: 0
Data uploaded: 0 bytes
Third-party access: None
Encryption: AES-256-GCM
Inference: On-device
Model size: 2.8 GB
Parameters: 4 billion
Works offline: Always

See it in action

Intelligence that fits
in your pocket

Chat with your documents, get sourced answers, and process sensitive data — all without an internet connection.

Natural language chat with RAG-powered answers
PDF extraction, OCR scanning, document indexing
Camera capture for instant document processing
Demo conversation in UnQuest:

User: Summarize the key clauses in my NDA

UnQuest: Based on your NDA document, here are the key clauses:

Confidentiality Period: 3 years from disclosure date
Non-Compete: 12-month restriction within a 50-mile radius
IP Assignment: All work product belongs to the Company

Source: NDA_Agreement.pdf — Pages 2-4
Processed entirely on device

Capabilities

Everything runs on your device

No API keys. No subscriptions. No data leaving your phone. Just intelligence, right where your data lives.

Absolute Privacy

Every computation happens on your device's neural engine. Your data never touches a server — not even ours.

Real Intelligence

Powered by state-of-the-art 4B parameter models running natively on Apple Silicon. Not a toy — real LLM inference.

Document Processing

Drop in PDFs, scan receipts, photograph contracts. UnQuest extracts, indexes, and understands your documents instantly.

Instant RAG

Ask questions about your documents and get sourced answers. On-device vector search finds exactly what you need.
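The retrieval step can be sketched as a similarity search over chunk embeddings. This is a minimal brute-force version for illustration only: the shipping index uses HNSW, and `cosine`/`topK` are illustrative names, not UnQuest's actual API.

```swift
import Foundation

// Score document chunks against a query embedding by cosine similarity
// and return the identifiers of the k best matches.
func cosine(_ a: [Float], _ b: [Float]) -> Float {
    let dot = zip(a, b).reduce(0) { $0 + $1.0 * $1.1 }
    let normA = a.reduce(0) { $0 + $1 * $1 }.squareRoot()
    let normB = b.reduce(0) { $0 + $1 * $1 }.squareRoot()
    return dot / (normA * normB)
}

func topK(query: [Float], chunks: [(id: String, embedding: [Float])], k: Int) -> [String] {
    chunks
        .map { (id: $0.id, score: cosine(query, $0.embedding)) }
        .sorted { $0.score > $1.score }   // highest similarity first
        .prefix(k)
        .map(\.id)
}
```

An HNSW index replaces the linear scan with a graph walk, which is what makes retrieval fast enough on a phone as the document collection grows.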

AES-256 Encrypted

Everything stored on device is encrypted with military-grade AES-256-GCM. Keys live in the Secure Enclave.
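Conceptually, the storage encryption is an AES-GCM seal/open round trip, sketched here with Apple's CryptoKit. In the app the key would come from the Secure Enclave via the Keychain; generating one in memory is purely for illustration.

```swift
import CryptoKit
import Foundation

// Illustrative only: a real key would live in the Secure Enclave,
// not be generated in memory like this.
let key = SymmetricKey(size: .bits256)
let plaintext = Data("Confidential contract text".utf8)

// AES-256-GCM produces ciphertext plus a 16-byte authentication tag,
// so any tampering with stored data is detected at decryption time.
let sealedBox = try AES.GCM.seal(plaintext, using: key)
let decrypted = try AES.GCM.open(sealedBox, using: key)
```

The authentication tag is the "GCM" part of AES-256-GCM: decryption fails outright, rather than returning garbage, if the stored bytes were modified.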

Apple Intelligence

Automatically routes simple tasks to Apple's built-in AI when available, saving the heavy model for complex work.

Models

Two engines. One device.

UnQuest intelligently routes tasks to the right model. Complex reasoning goes to Qwen. Simple queries go to Apple Intelligence.

Qwen 3.5 4B (Primary)

Alibaba's latest small language model. Exceptional reasoning and instruction following at 4 billion parameters.

Parameters: 4B
Quantization: Q4_K_M
Size: 2.8 GB
Context: 4K tokens

Apple Intelligence (iOS 26+)

Apple's on-device foundation model. Zero download needed — built into the operating system. Used for lightweight tasks.

Download: None
Tasks: Chat, classify
Speed: Instant
Requires: iOS 26+

Smart Inference Routing

Simple chat & classification → Apple Intelligence
RAG, analysis & workflows → Qwen 3.5 4B
Document processing → Qwen 3.5 4B
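The routing policy can be sketched as a simple function. The `Engine` and `Task` names below are illustrative, not UnQuest's actual types.

```swift
import Foundation

enum Engine { case appleIntelligence, qwen }

enum Task {
    case simpleChat, classification, rag, analysis, workflow, documentProcessing
}

// Send lightweight tasks to the built-in OS model when it exists;
// everything heavy goes to the local Qwen model.
func route(_ task: Task, appleIntelligenceAvailable: Bool) -> Engine {
    switch task {
    case .simpleChat, .classification:
        // Pre-iOS 26 devices lack Apple Intelligence, so fall back to Qwen.
        return appleIntelligenceAvailable ? .appleIntelligence : .qwen
    case .rag, .analysis, .workflow, .documentProcessing:
        return .qwen
    }
}
```

The key design point is the fallback: on devices without Apple Intelligence, every task still runs locally on Qwen rather than going to a server.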

How it works

Three steps. Zero cloud.

01

Download once

Install UnQuest. A compact 2.8GB AI model downloads to your device. After that — zero internet needed.

02

Feed it your world

Drop PDFs, scan documents with your camera, import notes. UnQuest chunks, embeds, and indexes everything locally.
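The chunking step might look roughly like this: extracted text is split into overlapping word windows before embedding. The window and overlap sizes are placeholder values, not UnQuest's actual parameters.

```swift
import Foundation

// Split text into overlapping word-window chunks. Overlap keeps a clause
// that straddles a boundary retrievable from at least one chunk.
func chunk(_ text: String, windowSize: Int = 200, overlap: Int = 40) -> [String] {
    let words = text.split(separator: " ").map(String.init)
    guard words.count > windowSize else { return [words.joined(separator: " ")] }
    var chunks: [String] = []
    var start = 0
    let stride = windowSize - overlap   // advance by window minus overlap
    while start < words.count {
        let end = min(start + windowSize, words.count)
        chunks.append(words[start..<end].joined(separator: " "))
        if end == words.count { break }
        start += stride
    }
    return chunks
}
```

Each chunk is then embedded and written to the local vector index, so indexing cost is paid once per document, not per query.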

03

Ask anything

Chat naturally. Get sourced answers from your documents. Summarize contracts. Extract data. All in airplane mode.

Privacy First

They upload.
We don't.

Your data on their servers

Your data on your device

API calls for every query

Zero network calls. Ever.

They read your prompts

Nobody reads your prompts

Monthly subscriptions

One-time download. Done.

Data used for training

Your data trains nothing

Compatibility

Built for modern Apple Silicon

Running a 4-billion parameter model requires serious silicon. UnQuest targets devices with 8GB RAM and Metal GPU acceleration.

iOS: 17.0+
RAM: 8 GB minimum
Storage: ~3 GB for model
Chip: A17 Pro / A18+

Device | Chip | RAM | Status
iPhone 17 Pro / Pro Max | A19 Pro | 12 GB | Best experience
iPhone 17 / Air | A19 | 8 GB | Full support
iPhone 16 Pro / Pro Max | A18 Pro | 8 GB | Recommended
iPhone 16 / Plus / 16e | A18 | 8 GB | Full support
iPhone 15 Pro / Pro Max | A17 Pro | 8 GB | Full support
iPhone 15 / Plus & earlier | A16 / older | 6 GB | Insufficient RAM

FAQ

Questions answered

Does it really work offline?

Yes. Once the model is downloaded (a one-time ~2.8 GB download), every computation happens on your device's processor. You can use UnQuest in airplane mode, in a bunker, or on a desert island. Zero network calls are made for inference — ever.

How is this different from ChatGPT or Claude?

ChatGPT and Claude send your data to remote servers for processing. UnQuest runs the AI model directly on your device using llama.cpp and Metal GPU acceleration. Your prompts, documents, and responses never leave your device. The trade-off: the model is smaller (4B vs 100B+ parameters), so it's best for focused tasks like document Q&A, summarization, and extraction rather than open-ended creative writing.

Which iPhones are supported?

Any iPhone with 8 GB of RAM and Apple Silicon — that's the iPhone 15 Pro and Pro Max, all iPhone 16 models, and the entire iPhone 17 lineup. The standard iPhone 15 and older models don't have enough memory to run a 4B parameter model. iPad and Mac support is planned.

How much storage does it need?

The app itself is small. The AI model is a one-time ~2.8 GB download (Qwen 3.5 4B in Q4_K_M quantization). After that, no further downloads are needed. On iOS 26+ devices, Apple Intelligence tasks require zero additional downloads.

How is my stored data protected?

All stored data — documents, conversations, embeddings — is encrypted with AES-256-GCM. The encryption key is stored in the Secure Enclave via Apple's Keychain. Even if someone extracts your phone's storage, the data is unreadable without biometric authentication.

Can it read PDFs and scanned documents?

Yes. UnQuest uses Apple's PDFKit for text extraction and the Vision framework (VNRecognizeTextRequest) for OCR on scanned documents and camera captures. Documents are chunked, embedded, and indexed locally for instant search and retrieval.

How fast is it?

On an A18 Pro device, expect 15-25 tokens per second from the Qwen 3.5 4B model with Metal GPU acceleration. Responses start streaming immediately — you don't wait for the full answer. Apple Intelligence tasks (classification, simple chat) respond near-instantly.

Will my data ever leave my device?

No. The core promise is absolute privacy — we will never route your content through external servers. Future plans include encrypted iCloud sync (using your own iCloud private database) for multi-device access, but inference will always remain on-device.

Built with

llama.cpp: Inference engine
Metal: GPU acceleration
Apple Neural Engine: ML hardware
AES-256-GCM: Encryption
HNSW: Vector search
PDFKit + Vision: Document processing

Your AI should be
yours alone

UnQuest is coming soon. Join the waitlist for early access.

No spam. We'll email you once when it's ready.