NazorKit is a library built on top of MLX Swift for easily integrating on-device vision language models into your iOS, macOS, and visionOS apps.

The name "Nazor" is inspired by the Persian word "نظر" ("nazar"), meaning vision, sight, or gaze.
## Installation

Swift Package Manager handles the distribution of Swift code and comes built into the Swift compiler.

To add NazorKit to your project, include it in your `Package.swift` file:

```swift
dependencies: [
    .package(url: "https://github.com/rryam/NazorKit.git", .upToNextMajor(from: "0.1.0"))
]
```

Or add NazorKit to your project through Xcode's package manager:

- In Xcode, go to File > Add Packages...
- Enter the package URL: `https://github.com/rryam/NazorKit`
- Select the version or branch you want to use (e.g. `main`)
- Click Add Package
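If you declare targets manually in `Package.swift`, you also need to list the library product in your target's dependencies. A minimal sketch — the target name `MyApp` is hypothetical, and the product name is assumed to match the package name:

```swift
// Hypothetical target declaration in Package.swift.
// The product name "NazorKit" is assumed to match the package name.
targets: [
    .executableTarget(
        name: "MyApp",
        dependencies: [
            .product(name: "NazorKit", package: "NazorKit")
        ]
    )
]
```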
## Quick Start

Get up and running with NazorKit in minutes. Here is an example of analyzing an image:

```swift
import NazorKit
import SwiftUI

struct ContentView: View {
    @VLMServiceProvider private var vlmService
    @State private var image: UIImage?
    @State private var generatedDescription: String = ""

    var body: some View {
        VStack {
            if let image {
                Image(uiImage: image)
                    .resizable()
                    .scaledToFit()
                    .analyzeMedia(
                        service: vlmService,
                        prompt: "Describe this image in detail",
                        image: image
                    ) { description in
                        generatedDescription = description
                    }

                Text(generatedDescription)
                    .padding()
            }
        }
    }
}
```

## Table of Contents

- Features
- Installation
- Quick Start
- Basic Usage
- Advanced Configuration
- Video Analysis
- Requirements
- Dependencies
- Contributing
- License
- Support
## Features

- SwiftUI-first API design
- Support for iOS 16.0+, macOS 14.0+, and visionOS 1.0+
- Image analysis capabilities
- Video analysis support
- Built on top of MLX for efficient model inference
- Customizable model configurations
- Easy-to-use property wrappers and view modifiers
## Basic Usage

Here's a simple example of how to analyze an image using NazorKit:

```swift
import NazorKit
import SwiftUI

struct ContentView: View {
    @VLMServiceProvider private var vlmService
    @State private var image: UIImage?
    @State private var generatedDescription: String = ""

    var body: some View {
        VStack {
            if let image {
                Image(uiImage: image)
                    .resizable()
                    .scaledToFit()
                    .analyzeMedia(
                        service: vlmService,
                        prompt: "Describe this image in detail",
                        image: image
                    ) { description in
                        generatedDescription = description
                    }

                Text(generatedDescription)
                    .padding()
            }
        }
    }
}
```

## Advanced Configuration

You can customize the VLM service with specific model configurations:
```swift
@VLMServiceProvider(
    configuration: .qwen2VL2BInstruct4Bit,
    generateParameters: .init(temperature: 0.8),
    maxTokens: 1000
) private var vlmService
```

You can fine-tune the generation process with custom parameters:
```swift
let generateParameters = GenerateParameters(
    temperature: 0.8, // Controls randomness (0.0-1.0)
    topP: 0.9         // Nucleus sampling parameter
)

@VLMServiceProvider(
    configuration: .qwen2VL2BInstruct4Bit,
    generateParameters: generateParameters,
    maxTokens: 1000
) private var vlmService
```

## Video Analysis

NazorKit also supports video analysis:
```swift
import AVKit
import NazorKit
import SwiftUI

struct VideoAnalysisView: View {
    @VLMServiceProvider private var vlmService
    @State private var analysis: String = ""

    let videoURL: URL

    var body: some View {
        VStack {
            VideoPlayer(player: AVPlayer(url: videoURL))
                .frame(height: 300)
                .analyzeMedia(
                    service: vlmService,
                    prompt: "What's happening in this video?",
                    video: videoURL
                ) { description in
                    analysis = description
                }

            Text(analysis)
                .padding()
        }
    }
}
```

## Contributing

I welcome contributions to NazorKit! Here is how you can help:
- Fork the repository and create a feature branch
- Make your changes following the existing code style
- Add tests for new functionality
- Update documentation as needed
- Submit a pull request with a clear description

To get started with development:

- Clone the repository
- Open `Package.swift` in Xcode, a VS Code fork, or your preferred editor
- Run the tests to ensure everything works
- Make your changes and test them
- Follow the SwiftLint rules (run `swiftlint lint`)
- Use Swift 6.0+ features where appropriate
## License

NazorKit is available under the MIT license. See LICENSE for more information.
## Acknowledgments

- Thanks to the MLX team for their excellent work on MLX and the MLX Swift framework!