Apple Quietly Launches Core AI: A Native On-Device ML Framework for iOS and macOS

Apple has released Core AI, a native framework designed to give developers a clean, unified API for running machine learning inference on-device across iOS, macOS, and related platforms. The documentation is live on Apple's developer portal, and while the announcement has been low-key, the technical implications are significant for anyone building AI-powered apps in the Apple ecosystem.

The core value proposition is straightforward: instead of stitching together Core ML, BNNS, and Metal Performance Shaders yourself (each with different APIs and abstraction levels), Core AI aims to provide a single entry point for model execution that routes work intelligently to the right hardware (CPU, GPU, or Neural Engine) under the hood.

For builders, this matters because on-device inference means lower latency, no API costs, and no user data leaving the device. That combination is increasingly a product requirement, not just a nice-to-have, especially in health, productivity, and enterprise apps where privacy is non-negotiable.

The framework appears to complement rather than replace Core ML. Think of Core ML as the model packaging and conversion layer, and Core AI as the runtime execution layer with a more modern, flexible API surface. Developers already working with ONNX or custom model formats should pay attention to how Core AI handles model ingestion — that detail will determine how much conversion work is required.

Practical next step: pull up the official documentation at developer.apple.com/documentation/coreai and map it against your current ML stack. If you're already shipping Core ML models, migration will likely take little work. If you're calling external inference APIs for tasks that could run locally on modern Apple Silicon, benchmark this framework against your current setup now.
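To make "benchmarking against your current setup" concrete, here is a minimal, framework-agnostic timing harness in Python. It is a sketch under stated assumptions: it does not call Core AI or any Apple API, and `fake_local_inference` and `fake_remote_inference` are hypothetical stand-ins (simple sleeps) that you would replace with your real on-device call and your real network call. The helper names `measure_latency_ms` and `summarize` are also illustrative.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency_ms(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> List[float]:
    """Call fn repeatedly and return the per-call latency of each timed run in milliseconds."""
    for _ in range(warmup):
        fn()  # untimed calls so one-off setup cost does not skew the samples
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples: List[float]) -> Dict[str, float]:
    """Return median, 95th percentile (nearest-rank method), and max latency."""
    ordered = sorted(samples)
    # Nearest-rank percentile: rank = ceil(0.95 * n), computed with integers.
    rank = -(-95 * len(ordered) // 100)
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[max(rank, 1) - 1],
        "max_ms": ordered[-1],
    }


# Hypothetical stand-ins: replace these with your real inference calls.
def fake_local_inference() -> None:
    time.sleep(0.002)  # pretend on-device inference takes about 2 ms


def fake_remote_inference() -> None:
    time.sleep(0.010)  # pretend a network round trip takes about 10 ms


if __name__ == "__main__":
    for name, fn in [("local", fake_local_inference), ("remote", fake_remote_inference)]:
        stats = summarize(measure_latency_ms(fn))
        print(name, {key: round(value, 2) for key, value in stats.items()})
```

Reporting the median alongside the 95th percentile matters here because network-backed inference tends to have a long latency tail that an average would hide. Latency is only one axis: a fair comparison would also check output quality, memory use, and battery impact on the target devices.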