Scanning text using SwiftUI and Vision on iOS

Apple’s Vision framework contains computer vision related functionality and with iOS 13 it can detect text on images as well. Moreover, Apple added a new framework VisionKit what makes it easy to integrate document scanning functionality. For demonstrating the usage of it, let’s build a simple UI what can present the scanner and display scanned text.

Cropping text area when scanning documents

Scanning text with Vision

VisionKit has VNDocumentCameraViewController and when presented, it allows scanning documents and cropping scanned documents. It uses delegate for publishing scanned documents via an instance of VNDocumentCameraScan. This object contains all the taken images (documents). Next, we can use VNImageRequestHandler in Vision for detecting text on those images.

Presenting document scanner with SwiftUI

As VNDocumentCameraViewController is UIKit view controller we can’t directly present it in SwiftUI. For making this work, we’ll need to use a separate value type conforming to UIViewControllerRepresentable protocol. UIViewControllerRepresentable is the glue between SwiftUI and UIKit and enables us to present UIKit views. This protocol requires us to define the class of the view controller and then implementing makeUIViewController(context:) and updateUIViewController(_:context:). In addition, we’ll also create coordinator what is going to be VNDocumentCameraViewController’s delegate. SwiftUI uses UIViewRepresentableContext for holding onto the coordinator and managing the view controller updates behind the scenes. Our case is pretty simple, we just use completion handler for passing back scanned text or nil when it was closed or error occurred. No need to update the view controller itself, only to pass data from it back to SwiftUI.

ContentView is the main SwiftUI view presenting the content for our very simple UI with static text, button and scanned text. When pressing on the button, we’ll set isShowingScannerSheet property to true. As it is @State property, then this change triggers SwiftUI update and sheet modifier will take care of presenting ScannerView with VNDocumentCameraViewController. When view controller finishes, completion handler is called and we will update the text property and set isShowingScannerSheet to false which triggers tearing down the modal during the next update.


With the new addition of VisionKit and text recognition APIs, it is extremely easy to add support of scanning text using camera.

If this was helpful, please let me know on Twitter @toomasvahter. Feel free to subscribe to RSS feed. Thank you for reading.


VisionKitExample (Xcode 11)

2 Replies to “Scanning text using SwiftUI and Vision on iOS”

  1. Hi,

    thank you very much for the code. I modified it a bit to comply with the new Xcode 11 GM.
    I replaced the Button with a NavigationLink in a NavigationView.

    Do you know how to return from the ScannerView to the original view when hitting the Save button?
    The Programm does not return to the original NavigationView.



    1. Hi, updated my sample app to Xcode 11 GM as well. Haven’t played around with NavigationLink much but seems like detail view can use @Environment(\.presentationMode) var presentationMode for dismissing itself.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s