Reference Guide

Meta AI

Llama models, the Meta AI assistant across Facebook/Instagram/WhatsApp, AI Studio, the Llama API, and the new closed-weight Muse Spark — what each one is and where it fits.

← Back to Reference Hub

Best for: Long-context document work, multimodal pipelines, and self-hosted production deployments where open weights matter.

  • Scout — 17B active / 109B total params, 16 experts, 10M-token context (largest open model at launch)
  • Maverick — 17B active / 400B total params, 128 experts, 1M-token context
  • Behemoth — ~288B active / ~2T total, used as a teacher model for codistillation; never publicly released
  • First Llama family with Mixture-of-Experts architecture and native multimodality (text + image input)
  • Available on Hugging Face, llama.com, Bedrock, Vertex, Azure; runs in vLLM, TGI, llama.cpp, Ollama
  • Released April 5, 2025 under the Llama 4 Community License

Limitations: Lukewarm reception vs Qwen3 and DeepSeek R1 on reasoning benchmarks. License caps commercial use at 700M monthly active users (effectively a hyperscaler clause). Behemoth never shipped, and the original AGI Foundations team that built Llama 4 was dissolved after release.

Open-Weight ModelsFree Weights

Best for: The mature, broadly-deployed workhorse for most production Llama usage in 2026 — cheap, fast, well-supported by every inference host.

  • Llama 3.1 — 8B, 70B, and 405B dense models with 128K context
  • Llama 3.2 — 1B/3B text + 11B/90B vision (first multimodal Llama, edge-friendly small sizes)
  • Llama 3.3 70B — dense model approaching 405B quality at a much smaller size
  • Lowest-cost tier: Llama 3.1 8B at ~$0.02 / $0.05 per million tokens on hosted providers
  • Supported on Bedrock, Vertex, Azure AI, Together, Fireworks, Groq, Replicate, and self-hosting
  • Strong open ecosystem of fine-tunes, derivatives, and quantizations

Limitations: Pre-MoE architecture, so larger sizes are heavier to serve than Llama 4. English-dominant training data. Reasoning lags behind closed frontier models and behind Qwen3 / DeepSeek R1 on many evals.

Open-Weight ModelsFree Weights

Best for: The first product of Meta Superintelligence Labs (launched April 8, 2026) and the new engine behind the consumer Meta AI assistant. Optimized for visual understanding and "personal superintelligence" use cases.

  • Strong multimodal/visual perception — designed to "see and understand what you're looking at"
  • Powers Meta AI in the standalone app, meta.ai web, and is rolling into Instagram, WhatsApp, Messenger, Facebook, and Ray-Ban / Oakley Meta glasses
  • First Meta flagship ever shipped without open weights — a sharp pivot from the Llama strategy
  • Available only via private API preview to select partners
  • Marks the operational debut of MSL under Alexandr Wang (Chief AI Officer) and Nat Friedman

Limitations: Closed-source with no published model card or parameter count. No public API at launch. No third-party benchmarks vs GPT-5 / Claude / Gemini. Open-weight versions of future MSL models are promised but unconfirmed.

Closed Frontier ModelPrivate Preview

Best for: A free, ubiquitous consumer assistant inside the apps people already use. Distribution is the moat — this assistant lives where billions of users already chat.

  • Chat, real-time web answers, and image understanding from photos
  • Standalone Meta AI app and meta.ai web for full-screen conversations
  • Embedded in Facebook, Instagram, WhatsApp, and Messenger search bars and DMs
  • Imagine image generation (free, in-chat or at imagine.meta.com)
  • Ray-Ban Meta and Oakley Meta glasses integration: visual Q&A, translation, navigation, photo-based nutrition
  • Reels, posts, and creator content woven into answers with attribution
  • Now powered by Muse Spark (rolling out April 2026)

Limitations: Country availability uneven — full features US-first, EU rollout slowed by DMA compliance. Image generation quality below Midjourney and DALL·E for stylized work. Cross-app memory still rolling out.

Consumer AssistantFree

Best for: No-code custom AI characters distributed natively into Instagram, Messenger, and WhatsApp — Meta's answer to OpenAI's custom GPTs, but living inside social apps instead of a chat UI.

  • Custom AI Characters — available to any user; build a persona with name, look, personality, and topic boundaries
  • Creator AI — for Instagram creators; auto-replies to DMs and story replies in the creator's voice
  • Templates for trivia hosts, cooking teachers, travel guides, fitness coaches, etc.
  • Configure entirely in plain text — no coding required
  • Distributes natively into Instagram chat, Messenger, and WhatsApp
  • Built on Llama under the hood

Limitations: US-only as of April 2026. Not all Instagram accounts have access yet. No real revenue-share model for creators. Some of Meta's own AI persona launches drew criticism in 2025, and content moderation on third-party characters is still a question mark.

Custom Agent BuilderFreeUS-Only

Best for: First-party hosted Llama inference from Meta itself — previewed at LlamaCon (April 29, 2025) as a direct competitor to OpenAI/Anthropic APIs and to third-party Llama hosts like Together, Fireworks, and Groq.

  • One-click API key creation and an interactive playground for Scout and Maverick
  • Python and TypeScript SDKs
  • OpenAI-SDK-compatible — drop-in for code already using openai client libraries
  • Hosted fine-tuning (LoRA and full) on Llama 3.3 8B with eval tooling included
  • Take-your-weights-out: tuned models are portable to any host — no lock-in
  • Meta does not train on prompts or responses

Limitations: Limited free preview throughout 2025-26; paid GA pricing not yet finalized. Smaller model selection than Together / Fireworks / Groq today. Fine-tuning catalog narrower than AWS Bedrock. For production traffic, third-party Llama hosts often still win on price, throughput, and region coverage.

Hosted InferencePreview

Best for: Meta's video generation research — not a product you can call directly today, but the underlying tech that's seeding video tools inside Instagram and Reels.

  • 30B-parameter text-to-video research model announced October 2024
  • Up to 16-second 1080p HD video with synchronized audio
  • Text-to-video, image-to-video, and personalized video (your face)
  • Video editing via plain-text instructions
  • Rolling into Instagram and Reels creator features through 2025-26

Limitations: Not publicly released as weights or as an API. Behind Veo 3 and Runway in market access (OpenAI discontinued Sora on April 26, 2026 — API ends September 24, 2026). No fixed launch timeline for direct access.

Video GenerationResearch Only

Best for: The infrastructure layer that the entire AI industry runs on — originally Meta-built, now governed by the independent PyTorch Foundation under the Linux Foundation. Meta remains the largest contributor.

  • PyTorch — the dominant deep learning framework; powers Llama, most Hugging Face models, and historically OpenAI training infrastructure
  • torchtune — native post-training and fine-tuning library
  • torchao — quantization and model optimization
  • ExecuTorch — on-device inference for mobile and edge
  • BSD-licensed, fully open-source

Limitations: When people argue about Meta's "open-source AI" strategy, PyTorch is the most consequential piece — far more so than the Llama license itself. It's the substrate for the entire ecosystem.

ML FrameworkOpen Source

Best for: Targeted, single-purpose open-weight models that complement the main Llama line.

  • Llama Guard — open-weight safety classifier for input/output moderation; widely used in production Llama deployments
  • Code Llama — Llama 2 code-specialized derivative from 2023; effectively superseded by general-purpose Llama 3.x and Llama 4 on code tasks; no recent updates
  • SeamlessM4T — open-weight speech-to-speech and speech-to-text translation model covering ~100 languages

Limitations: Code Llama is essentially deprecated in 2026 — reach for Llama 3.3 70B or Llama 4 Maverick instead. SeamlessM4T is still maintained but moves slower than the main Llama line.

Specialized ModelsFree Weights