Product update

NVIDIA Maxine is now called NVIDIA AI for Media. Your access to NIM microservices, SDKs, and early access programs continues without interruption.

NVIDIA AI for Media

NVIDIA AI for Media (formerly NVIDIA Maxine) is a collection of SDKs, NVIDIA NIM microservices, and Blueprints that enhance audio, video, and augmented reality effects for media and entertainment workflows. Built on the NVIDIA AI platform, AI for Media enables developers to deliver studio-quality audio and high-resolution video enhancement and effects for real-time and offline AI audio and video pipelines, from local to cloud. With many features optimized for ultra-low latency, NVIDIA AI for Media supports content creation, livestreaming, broadcast, and post-production pipelines and can be deployed on Holoscan for Media, in ISV applications, or in standalone applications.

With NVIDIA NIM™, part of NVIDIA AI Enterprise, developers can access AI for Media capabilities with easy-to-use microservices designed for secure, reliable, high-performance deployment across clouds, data centers, and workstations.

Try Now

Request 90-Day License

Benefits

Best-in-Class AI Capabilities

NVIDIA AI for Media offers world-class pretrained models for developers to deploy premium augmented reality, audio, and video quality features.

Real-Time AI Performance

AI for Media includes many real-time AI features for inference on NVIDIA GPUs, resulting in low-latency audio, video, and augmented reality (AR) effects with high network resilience.

Complete AI Pipeline

AI for Media offers a breadth of tools for complete audio and video enhancement pipelines with multiple low-latency effects that can be chained together.
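The idea of chaining effects can be illustrated with a minimal Python sketch. It does not use any AI for Media API: `chain`, `remove_dc_offset`, and `normalize_peak` are hypothetical names, and the two toy effects only stand in for real enhancement stages. The point is that each effect takes a frame and returns a frame, so stages compose in order.

```python
from functools import reduce
from typing import Callable

# A "frame" here is just a list of audio samples; real pipelines would
# pass GPU buffers or tensors instead (assumption for illustration).
Frame = list[float]
Effect = Callable[[Frame], Frame]


def chain(*effects: Effect) -> Effect:
    """Compose effects so that each one runs on the previous one's output."""
    def run(frame: Frame) -> Frame:
        return reduce(lambda data, effect: effect(data), effects, frame)
    return run


def remove_dc_offset(frame: Frame) -> Frame:
    """Toy effect: subtract the mean so the signal is centered on zero."""
    if not frame:
        return frame
    mean = sum(frame) / len(frame)
    return [sample - mean for sample in frame]


def normalize_peak(frame: Frame) -> Frame:
    """Toy effect: scale samples so the largest magnitude becomes 1.0."""
    peak = max((abs(sample) for sample in frame), default=0.0)
    if peak == 0.0:
        return frame
    return [sample / peak for sample in frame]


# Hypothetical two-stage pipeline; stages run left to right.
pipeline = chain(remove_dc_offset, normalize_peak)
```

Calling `pipeline([1.0, 3.0])` first centers the samples and then scales them to unit peak. Keeping every stage to the same frame-in, frame-out shape is what makes reordering or adding stages cheap.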

Multi-Cloud, Customizable Deployment

AI for Media's cloud-native microservices allow for flexible, fast deployment and updates.

Use Cases

Livestreaming

AI for Media delivers many features with ultra-low-latency AI processing that enhances audio and video quality in real time, even in dynamic and bandwidth-constrained environments. Streaming ISVs that leverage AI for Media enable their live creators and production teams to clean up audio, upscale and relight video, and apply real-time visual effects while maintaining consistent on-air quality. AI for Media supports interactive, high-throughput streaming workflows that scale across on-prem, cloud, and edge deployments, ensuring premium live experiences for global audiences.

Professional Broadcast

AI for Media brings real-time, AI-powered enhancement to broadcast and IP-based production. It improves audio and video quality with speech processing, visual enhancement, and speaker intelligence. AI for Media supports ST 2110, integrates with NVIDIA Holoscan for Media, and enables reliable, scalable AI deployment across modern software-defined infrastructures.

Content Creation

AI for Media enhances content creation workflows by improving audio, video, and visual effects with GPU-accelerated AI. It boosts speech clarity, removes noise, enhances video resolution, and adds AR capabilities, all without specialized equipment or complex post-production. ISVs that integrate NVIDIA AI for Media SDKs and microservices into their creator tools and platforms accelerate their users' production of high-quality content for social, marketing, and digital media channels.

What's New in AI for Media?

Easy-to-use microservices and SDKs designed for secure, reliable, high-performance deployment across clouds, data centers, RTX workstations, and RTX PCs:

Content Localization Blueprint

The Content Localization Blueprint is a modular, scalable, NIM-centric reference architecture for media producers to localize content for global audiences, unlocking new revenue. It supports audio and video post-production workflows by orchestrating NVIDIA and partner AI microservices for features like speech translation, active speaker detection, and AI-driven lip-sync.

Try It Now

Synthetic Video Detector

Synthetic Video Detector detects AI-generated video with high accuracy on uncompressed and compressed content, producing results in real time on NVIDIA GPUs. It is intentionally biased toward false positives over false negatives to prioritize safety.
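One common way a detector is biased toward false positives is by lowering its decision threshold. The sketch below shows that trade-off in plain Python. It is not the Synthetic Video Detector's interface or its actual mechanism: `flag_synthetic`, `count_errors`, the thresholds, and the scores in `TOY_SAMPLES` are all made up for illustration.

```python
def flag_synthetic(score: float, threshold: float) -> bool:
    """Flag a clip as synthetic when its score reaches the threshold."""
    return score >= threshold


def count_errors(samples, threshold: float):
    """Return (false_positives, false_negatives) for labeled samples.

    Each sample is (score, is_synthetic), where score is a hypothetical
    model confidence in [0, 1] that the clip is AI-generated.
    """
    false_pos = sum(
        1 for score, is_synth in samples
        if flag_synthetic(score, threshold) and not is_synth
    )
    false_neg = sum(
        1 for score, is_synth in samples
        if not flag_synthetic(score, threshold) and is_synth
    )
    return false_pos, false_neg


# Invented toy data, not measurements from any real detector.
TOY_SAMPLES = [
    (0.10, False), (0.35, False), (0.45, True),
    (0.55, False), (0.60, True), (0.90, True),
]

balanced = count_errors(TOY_SAMPLES, threshold=0.5)      # misses one synthetic clip
safety_first = count_errors(TOY_SAMPLES, threshold=0.3)  # misses none, flags more real clips
```

On this toy data, lowering the threshold from 0.5 to 0.3 removes the missed synthetic clip at the cost of one extra real clip being flagged, which is the kind of trade a safety-first setting accepts.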

Try It Now

LipSync

The Lip-Sync ST 2110 NIM synchronizes lip movements with speech in live, IP-based broadcast video pipelines. It is designed for real-time dubbing workflows in NVIDIA Holoscan for Media environments.

Try It Now

Apply for Private Access

Active Speaker Detection

ASD ST 2110 brings multi-speaker detection and identification to live broadcast workflows over IP video. It enables real-time speaker tagging within NVIDIA Holoscan for Media.

Try It Now

Background Noise Removal

Background Noise Removal removes a wide range of ambient noises from audio recordings while preserving expressive speech qualities.

Try It Now

Studio Voice NIM

Studio Voice ST 2110 brings studio-quality speech enhancement to live broadcast audio pipelines. It supports professional IP-based media workflows using standard input equipment.

Try It Now

Video Relighting

Relighting uses AI-generated HDRI to re-illuminate a person in live or recorded video to match target lighting conditions while preserving realism, texture quality, and camera look. It integrates a moving subject naturally into complex environments and is delivered as an NVIDIA AI for Media NIM.

Read the Documentation

RTX Video Super Resolution

RTX Video Super Resolution upscales 16:9 video from 480p to as high as 8K using AI, with user controls for sharpness, blur, denoising, and hallucination limits. The model can be fine-tuned to source content and runs within NVIDIA AI for Media. It is also available as a Python wheel.
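To put the 480p-to-8K range in numbers, the following sketch computes 16:9 frame sizes and the linear scale factor between them. It is plain arithmetic, not part of the RTX Video Super Resolution API. The helper names are hypothetical, and rounding the width up to an even pixel count (which gives the conventional 854x480) is an assumption, as is taking 8K to mean 7680x4320.

```python
from fractions import Fraction

ASPECT_16_9 = Fraction(16, 9)


def width_for_height(height: int) -> int:
    """Return the 16:9 width for a height, rounded up to an even pixel count."""
    exact = ASPECT_16_9 * height
    rounded_up = -(-exact.numerator // exact.denominator)  # ceiling division
    return rounded_up + (rounded_up % 2)


def scale_factor(src_height: int, dst_height: int) -> Fraction:
    """Linear (per-axis) upscale factor between two frame heights."""
    return Fraction(dst_height, src_height)


src = (width_for_height(480), 480)     # 480p source frame
dst = (width_for_height(4320), 4320)   # 8K target frame (assumed 7680x4320)
factor = scale_factor(480, 4320)       # per-axis factor
pixel_ratio = (dst[0] * dst[1]) / (src[0] * src[1])  # output pixels per input pixel
```

Going from 480p to 8K is a 9x increase per axis, so the model has to produce roughly 81 output pixels for every input pixel at the extreme end of the range.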

View on NGC

3D Body Pose

3D Body Pose is a single-camera, marker-less, and rig-free motion capture NIM that outputs full-body 3D animations using skeletal tracking. It enables realistic body motion capture without specialized hardware.

Coming Spring 2026

Audio Effects

The Audio Effects SDK enables real-time broadcast audio enhancements, including noise and room echo removal, audio super-resolution, and acoustic echo cancellation, improving speech clarity and overall sound quality in various recording environments.

View on NGC (Linux)

View on NGC (Windows)

Video Effects

The Video Effects SDK uses Tensor Cores on NVIDIA GPUs to accelerate video processing, offering filters such as AI Green Screen, Background Blur, Super Resolution, Upscale, Webcam Denoising, and Video Relighting for real-time video effects and quality improvements.

View on NGC

Augmented Reality

The Augmented Reality SDK enables real-time face and body tracking, landmark detection, eye contact adjustment, facial expression estimation, and LipSync. It is accelerated on NVIDIA GPUs and supports diverse AR, animation, and modeling applications.

View on NGC

Get Started With NVIDIA AI for Media

Experience in the API Catalog

For individuals looking to experience AI for Media NIM microservices, the API catalog is a good starting point: it offers a UI-based playground and free access to NVIDIA-managed API endpoints.

Experience Now

Limited Availability

AI for Media is part of NVIDIA AI Enterprise, providing enterprise-grade security, support, and stability for production-ready AI. Request a free evaluation license for a 90-day trial.

Apply today

Get Early Access to New Features

This program is available to a limited number of applicants based on use case and infrastructure fit.

Apply for Early Access

Private Access Program

To get access to the LipSync feature of the Content Localization Blueprint, request to join the Private Access Program.

Apply for Private Access

NVIDIA AI for Media Learning Library

Explore more AI for Media models to enhance your media pipeline.

Try Now