Apple Vision framework – Text extraction from image - swift

I am using the Vision framework on iOS 11 to detect text in an image.
The text is detected successfully, but how can we get the detected text itself?

Recognizing text in an image
VNRecognizeTextRequest works starting from iOS 13.0 and macOS 10.15 and higher.
In Apple Vision you can easily extract text from an image using the VNRecognizeTextRequest class, which makes an image analysis request that finds and recognizes text in an image.
Here's a SwiftUI solution showing you how to do it (tested in Xcode 13.4, iOS 15.5):
import SwiftUI
import Vision
struct ContentView: View {
var body: some View {
ZStack {
Color.black.ignoresSafeArea()
Image("imageText").scaleEffect(0.5)
SomeText()
}
}
}
The logic is the following:
struct SomeText: UIViewRepresentable {
    let label = UITextView(frame: .zero)
    func makeUIView(context: Context) -> UITextView {
        label.backgroundColor = .clear
        label.textColor = .systemYellow
        label.textAlignment = .center
        label.font = .boldSystemFont(ofSize: 25)
        return label
    }
    func updateUIView(_ uiView: UITextView, context: Context) {
        // Avoid force-unwrapping: bail out if the image is missing from the bundle
        guard let url = Bundle.main.url(forResource: "imageText", withExtension: "png") else { return }
        let requestHandler = VNImageRequestHandler(url: url, options: [:])
        let request = VNRecognizeTextRequest { request, _ in
            guard let observations = request.results as? [VNRecognizedTextObservation] else { return }
            // Keep the best candidate of every observation; assigning inside a loop
            // would leave only the last recognized line in the text view
            let lines = observations.compactMap { $0.topCandidates(1).first?.string }
            uiView.text = lines.joined(separator: "\n")
        }
        // .accurate: slower but more precise text recognition
        request.recognitionLevel = .accurate
        // .fast: nearly real-time but less precise; uncomment to use it instead
        // request.recognitionLevel = .fast
        try? requestHandler.perform([request])
    }
}
For a list of the languages supported for recognition, please read this post.
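The supported languages can also be queried at run time. The following is a minimal sketch, assuming iOS 15 / macOS 12 or later, where `supportedRecognitionLanguages()` is available as an instance method; the helper name `supportedTextRecognitionLanguages` is hypothetical and not part of Vision.

```swift
import Vision

/// Returns the language identifiers the text recognizer supports
/// for the given recognition level, or an empty array on failure.
/// Hypothetical helper; assumes iOS 15 / macOS 12 or later.
func supportedTextRecognitionLanguages(level: VNRequestTextRecognitionLevel) -> [String] {
    let request = VNRecognizeTextRequest()
    // The result depends on the level (and revision) configured on the request
    request.recognitionLevel = level
    do {
        return try request.supportedRecognitionLanguages()
    } catch {
        print("Could not query supported languages: \(error)")
        return []
    }
}

print(supportedTextRecognitionLanguages(level: .accurate))
print(supportedTextRecognitionLanguages(level: .fast))
```

The two levels may report different lists, so it is worth querying with the same level you intend to recognize with.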

Not exactly a dupe but similar to: Converting a Vision VNTextObservation to a String
You need to either use CoreML or another library to perform OCR (SwiftOCR, etc.)

This returns an overlay image with rectangles drawn around the detected text.
Here is the full Xcode project:
https://github.com/cyruslok/iOS11-Vision-Framework-Demo
Hope it is helpful.
// Text detection
func textDetect(dectect_image: UIImage, display_image_view: UIImageView) -> UIImage {
    var result_img = UIImage()
    // Without a CGImage there is nothing to analyze
    guard let cgImage = dectect_image.cgImage else { return result_img }
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    let request = VNDetectTextRectanglesRequest { request, error in
        if error != nil {
            print("Got an error while running the text detection request")
        } else if let results = request.results as? [VNTextObservation] {
            result_img = self.drawRectangleForTextDectect(image: dectect_image, results: results)
        }
    }
    request.reportCharacterBoxes = true
    // perform(_:) is synchronous, so the completion handler has already run when it returns
    try? handler.perform([request])
    return result_img
}
func drawRectangleForTextDectect(image: UIImage, results: [VNTextObservation]) -> UIImage {
    let renderer = UIGraphicsImageRenderer(size: image.size)
    // Vision bounding boxes are normalized with a bottom-left origin:
    // scale them to the image size and flip them vertically for UIKit drawing
    var t = CGAffineTransform.identity
    t = t.scaledBy(x: image.size.width, y: -image.size.height)
    t = t.translatedBy(x: 0, y: -1)
    let img = renderer.image { ctx in
        for textObservation in results {
            // Green rectangle around each detected text region
            ctx.cgContext.setFillColor(UIColor.clear.cgColor)
            ctx.cgContext.setStrokeColor(UIColor.green.cgColor)
            ctx.cgContext.setLineWidth(1)
            ctx.cgContext.addRect(textObservation.boundingBox.applying(t))
            ctx.cgContext.drawPath(using: .fillStroke)
            // Red rectangle around each character; characterBoxes can be nil
            for characterBox in textObservation.characterBoxes ?? [] {
                ctx.cgContext.setFillColor(UIColor.clear.cgColor)
                ctx.cgContext.setStrokeColor(UIColor.red.cgColor)
                ctx.cgContext.setLineWidth(1)
                ctx.cgContext.addRect(characterBox.boundingBox.applying(t))
                ctx.cgContext.drawPath(using: .fillStroke)
            }
        }
    }
    return img
}

Related

Why does all formatting disappear from an NSTextView when using NSViewRepresentable and SwiftUI?

I am making a small program using SwiftUI that allows users to create rich text "notes" in an NSTextView. I have enabled all of the formatting features from NSTextView, including the ability to work with images. The program is only for macOS and not for iOS/iPadOS.
The problem I am facing is that whenever the user types anything in the NSTextView, the caret moves to the end and all formatting and images disappear.
Since I am just using the standard formatting options provided by Apple, I have not subclassed NSTextStorage or anything like that. My use-case should be pretty simple.
The program is tiny so far and the entire source code is on GitHub (https://github.com/eiskalteschatten/ScratchPad), but I'll post the relevant code here.
This is my NSViewRepresentable class for the NSTextView:
import SwiftUI
struct RichTextEditor: NSViewRepresentable {
@EnvironmentObject var noteModel: NoteModel
func makeNSView(context: Context) -> NSScrollView {
let scrollView = NSTextView.scrollableTextView()
guard let textView = scrollView.documentView as? NSTextView else {
return scrollView
}
textView.isRichText = true
textView.allowsUndo = true
textView.allowsImageEditing = true
textView.allowsDocumentBackgroundColorChange = true
textView.allowsCharacterPickerTouchBarItem = true
textView.isAutomaticLinkDetectionEnabled = true
textView.displaysLinkToolTips = true
textView.isAutomaticDataDetectionEnabled = true
textView.isAutomaticTextReplacementEnabled = true
textView.isAutomaticDashSubstitutionEnabled = true
textView.isAutomaticSpellingCorrectionEnabled = true
textView.isAutomaticQuoteSubstitutionEnabled = true
textView.isAutomaticTextCompletionEnabled = true
textView.isContinuousSpellCheckingEnabled = true
textView.usesAdaptiveColorMappingForDarkAppearance = true
textView.usesInspectorBar = true
textView.usesRuler = true
textView.usesFindBar = true
textView.usesFontPanel = true
textView.importsGraphics = true
textView.delegate = context.coordinator
context.coordinator.textView = textView
return scrollView
}
func updateNSView(_ nsView: NSScrollView, context: Context) {
context.coordinator.textView?.textStorage?.setAttributedString(noteModel.noteContents)
}
func makeCoordinator() -> Coordinator {
Coordinator(self)
}
class Coordinator: NSObject, NSTextViewDelegate {
var parent: RichTextEditor
var textView : NSTextView?
init(_ parent: RichTextEditor) {
self.parent = parent
}
func textDidChange(_ notification: Notification) {
guard let _textView = notification.object as? NSTextView else {
return
}
self.parent.noteModel.noteContents = _textView.attributedString()
}
}
}
On GitHub: https://github.com/eiskalteschatten/ScratchPad/blob/main/ScratchPad/Notes/RichTextEditor.swift
And this is my NoteModel class responsible for managing the NSTextView content:
import SwiftUI
import Combine
final class NoteModel: ObservableObject {
private var switchingPages = false
@Published var pageNumber = UserDefaults.standard.value(forKey: "pageNumber") as? Int ?? 1 {
didSet {
UserDefaults.standard.set(pageNumber, forKey: "pageNumber")
switchingPages = true
noteContents = NSAttributedString(string: "")
openNote()
switchingPages = false
}
}
@Published var noteContents = NSAttributedString(string: "") {
didSet {
if !switchingPages {
saveNote()
}
}
}
private var noteName: String {
return "\(NoteManager.NOTE_NAME_PREFIX)\(pageNumber).rtfd"
}
init() {
openNote()
}
private func openNote() {
// This is necessary, but macOS seems to recover the stale bookmark automatically, so don't handle it for now
var isStale = false
guard let bookmarkData = UserDefaults.standard.object(forKey: "storageLocationBookmarkData") as? Data,
let storageLocation = try? URL(resolvingBookmarkData: bookmarkData, options: .withSecurityScope, relativeTo: nil, bookmarkDataIsStale: &isStale)
else {
ErrorHandling.showErrorToUser("No storage location for your notes could be found!", informativeText: "Please try re-selecting your storage location in the settings.")
return
}
let fullURL = storageLocation.appendingPathComponent(noteName)
let options = [NSAttributedString.DocumentReadingOptionKey.documentType: NSAttributedString.DocumentType.rtfd]
do {
guard storageLocation.startAccessingSecurityScopedResource() else {
ErrorHandling.showErrorToUser("ScratchPad is not allowed to access the storage location for your notes!", informativeText: "Please try re-selecting your storage location in the settings.")
return
}
if let _ = try? fullURL.checkResourceIsReachable() {
let attributedString = try NSAttributedString(url: fullURL, options: options, documentAttributes: nil)
noteContents = attributedString
}
fullURL.stopAccessingSecurityScopedResource()
} catch {
print(error)
ErrorHandling.showErrorToUser(error.localizedDescription)
}
}
private func saveNote() {
// This is necessary, but macOS seems to recover the stale bookmark automatically, so don't handle it for now
var isStale = false
guard let bookmarkData = UserDefaults.standard.object(forKey: "storageLocationBookmarkData") as? Data,
let storageLocation = try? URL(resolvingBookmarkData: bookmarkData, options: .withSecurityScope, relativeTo: nil, bookmarkDataIsStale: &isStale)
else {
ErrorHandling.showErrorToUser("No storage location for your notes could be found!", informativeText: "Please try re-selecting your storage location in the settings.")
return
}
let fullURL = storageLocation.appendingPathComponent(noteName)
do {
guard storageLocation.startAccessingSecurityScopedResource() else {
ErrorHandling.showErrorToUser("ScratchPad is not allowed to access the storage location for your notes!", informativeText: "Please try re-selecting your storage location in the settings.")
return
}
let rtdf = noteContents.rtfdFileWrapper(from: .init(location: 0, length: noteContents.length))
try rtdf?.write(to: fullURL, options: .atomic, originalContentsURL: nil)
fullURL.stopAccessingSecurityScopedResource()
} catch {
print(error)
ErrorHandling.showErrorToUser(error.localizedDescription)
}
}
}
On GitHub: https://github.com/eiskalteschatten/ScratchPad/blob/main/ScratchPad/Notes/NoteModel.swift
Does anyone have any idea why this is happening and/or how to fix it?
I have found these similar issues, but they don't really help me much:
Replacing NSAttributedString in NSTextStorage Moves NSTextView Cursor - I don't have any custom syntax highlighting or anything like that.
Cursor always jumps to the end of the UIViewRepresentable TextView when a newline is started before the final line + after last character on the line - Only solves the caret issue and causes jerky scroll behavior in longer documents.
Edit: I forgot to mention that I'm using macOS Ventura, but am targeting 12.0 or higher.
Edit #2: I have significantly updated the question to reflect what I've found through more debugging.
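One pattern that is often suggested for this kind of wrapper is to avoid writing the model back into the text storage when the change came from the text view itself, and to store a snapshot of the text rather than the live storage. The sketch below illustrates only that pattern; it has not been verified against the project above, and `RichTextView` and its `contents` binding are hypothetical names standing in for the question's `RichTextEditor` and `NoteModel`.

```swift
import SwiftUI
import AppKit

// Hypothetical, simplified wrapper; not the question's RichTextEditor.
struct RichTextView: NSViewRepresentable {
    @Binding var contents: NSAttributedString

    func makeNSView(context: Context) -> NSScrollView {
        let scrollView = NSTextView.scrollableTextView()
        if let textView = scrollView.documentView as? NSTextView {
            textView.isRichText = true
            textView.delegate = context.coordinator
        }
        return scrollView
    }

    func updateNSView(_ nsView: NSScrollView, context: Context) {
        context.coordinator.parent = self
        // Only replace the storage when the model really differs from the view,
        // so ordinary typing does not reset the caret or the attributes
        guard let textView = nsView.documentView as? NSTextView,
              let storage = textView.textStorage,
              !storage.isEqual(to: contents) else { return }
        storage.setAttributedString(contents)
    }

    func makeCoordinator() -> Coordinator { Coordinator(self) }

    final class Coordinator: NSObject, NSTextViewDelegate {
        var parent: RichTextView
        init(_ parent: RichTextView) { self.parent = parent }

        func textDidChange(_ notification: Notification) {
            guard let textView = notification.object as? NSTextView else { return }
            // Store an immutable snapshot instead of the live text storage
            parent.contents = NSAttributedString(attributedString: textView.attributedString())
        }
    }
}
```

With this shape, a page switch still replaces the text because the new contents differ from what the view holds, while edits made in the view are skipped in `updateNSView`.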

Swift and JSON url Image, edited

This is the code that I am using to create my table view from scratch. My question is: how can I load an image when it is given as a URL string?
class ArticleCell : UITableViewCell {
var article: Article? {
didSet {
articleTitle.text = article?.title
//articleImage.image = article?.urlToImage
descriptionTitle.text = article?.description
}
}
private let articleTitle : UILabel = {
let lbl = UILabel()
lbl.textColor = .black
lbl.font = UIFont.boldSystemFont(ofSize: 20)
lbl.textAlignment = .left
return lbl
}()
private let descriptionTitle : UILabel = {
let desclbl = UILabel()
desclbl.textColor = .black
desclbl.font = UIFont.boldSystemFont(ofSize: 10)
desclbl.textAlignment = .left
return desclbl
}()
override init(style: UITableViewCell.CellStyle, reuseIdentifier: String?) {
super.init(style: style, reuseIdentifier: reuseIdentifier)
addSubview(articleTitle)
addSubview(descriptionTitle)
Because then what I would like to do is:
addSubview(articleImage)
I get an error as I am declaring an image but it is in a string format. Now, using storyboard is easy, but programmatically I have this issue.
Is it more understandable now? I am sorry if I caused any confusion.
To load an image from a URL (or string) we first need to do the following:
guard let urlToImage = article?.urlToImage, let urlContent = URL(string: urlToImage) else { return }
if let data = try? Data(contentsOf: urlContent) {
if let image = UIImage(data: data) {
articleImage.image = image
}
}
}
However, this creates a terrible experience for the user: loading an image from a URL synchronously blocks the main thread, and everything else queued on it is stuck waiting for the image to finish loading.
For the user's sake, let's set the title and description immediately and move the image loading to a background thread. It should look something like this:
var article: Article? {
didSet {
articleTitle.text = article?.title
descriptionTitle.text = article?.description
guard let urlToImage = article?.urlToImage, let urlContent = URL(string: urlToImage) else { return }
DispatchQueue.global().async {
if let data = try? Data(contentsOf: urlContent) {
if let image = UIImage(data: data) {
DispatchQueue.main.async { [weak self] in
self?.articleImage.image = image
self?.setNeedsLayout()
}
}
}
}
}
}
Once we have our image, we set it on the main thread again and call setNeedsLayout() to tell the cell that it needs to adjust the layout of its subviews.
NOTE: The above solution works if you are displaying very few cells. If you are displaying MANY cells in your table view and you want to scroll while they load, you will encounter the age-old problem of cell content (in this case the image) being loaded at the wrong index path, because cells are reused. I suggest reading up on how to asynchronously load images into table and collection views. Have fun!
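One common way to keep a late download from landing in a reused cell is to remember which URL the cell is currently waiting for and to cancel the request in prepareForReuse(). The following is a minimal sketch of that idea; `RemoteImageCell`, `remoteImageView`, and `loadImage(from:)` are hypothetical names and not part of the code above.

```swift
import UIKit

// Hypothetical cell; add remoteImageView to the content view and lay it out as needed.
final class RemoteImageCell: UITableViewCell {
    let remoteImageView = UIImageView()
    private var imageTask: URLSessionDataTask?
    private var currentURL: URL?

    func loadImage(from url: URL) {
        // Drop any download started for a previous row
        imageTask?.cancel()
        currentURL = url
        remoteImageView.image = nil
        let task = URLSession.shared.dataTask(with: url) { [weak self] data, _, _ in
            guard let data = data, let image = UIImage(data: data) else { return }
            DispatchQueue.main.async {
                // Ignore the result if the cell has since been reused for another URL
                guard let self = self, self.currentURL == url else { return }
                self.remoteImageView.image = image
                self.setNeedsLayout()
            }
        }
        imageTask = task
        task.resume()
    }

    override func prepareForReuse() {
        super.prepareForReuse()
        // Cancel the pending download and clear stale content before reuse
        imageTask?.cancel()
        imageTask = nil
        currentURL = nil
        remoteImageView.image = nil
    }
}
```

URLSession performs the request off the main thread, so no explicit DispatchQueue.global() call is needed here. Adding an image cache would be a natural next step but is left out to keep the sketch short.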

Convert HTML to NSAttributedString in Background?

I am working on an app that will retrieve posts from WordPress and allow the user to view each post individually, in detail. The WordPress API brings back the post content which is the HTML of the post. (Note: the img tags are referencing the WordPress URL of the uploaded image)
Originally, I was using a WebView and loading the retrieved content directly into it. This worked great; however, the images seemed to be loading on the main thread, as this caused lag and would sometimes freeze the UI until an image had finished downloading. It was suggested that I check out the Aztec Editor library from WordPress; however, I could not work out how to use it (I could not find much documentation).
My current route is to parse the HTML content and create a list of dictionaries (keyed by type, image or text, and holding the content). Once it is parsed, I build the post by dynamically adding labels and image views (which allows me to download images in the background). While this does seem overly complex and is probably the wrong route, it is working well (I would be open to any other solutions, though!). My only issue currently is the delay when converting an HTML string to an NSAttributedString. Before adding the text content to the dictionary, I convert it from a String to an NSAttributedString. I have noticed a delay of a few seconds, with the simulator's CPU usage rising to 50-60% for a few seconds and then dropping. Is there any way I could do this conversion on a background thread and display a loading animation during this time?
Please let me know if you have any ideas or suggestions for a better solution. Thank you very much!
Code:
let postCache = NSCache<NSString, AnyObject>()
var yPos = CGFloat(20)
let screenWidth = UIScreen.main.bounds.width
...
func parsePost() -> [[String:Any]]? {
if let postFromCache = postCache.object(forKey: postToView.id as NSString) as? [[String:Any]] {
return postFromCache
} else {
var content = [[String:Any]]()
do {
let doc: Document = try SwiftSoup.parse(postToView.postContent)
if let elements = try doc.body()?.children() {
for elem in elements {
if(elem.hasText()) {
do {
let html = try elem.html()
if let validHtmlString = html.htmlToAttributedString {
content.append(["text" : validHtmlString])
}
}
} else {
let imageElements = try elem.getElementsByTag("img")
if(imageElements.size() > 0) {
for image in imageElements {
var imageDictionary = [String:Any]()
let width = (image.getAttributes()?.get(key: "width"))!
let height = (image.getAttributes()?.get(key: "height"))!
let ratio = CGFloat(Float(height)!/Float(width)!)
imageDictionary["ratio"] = ratio
imageDictionary["image"] = (image.getAttributes()?.get(key: "src"))!
content.append(imageDictionary)
}
}
}
}
}
} catch {
print("error")
}
if(content.count > 0) {
postCache.setObject(content as AnyObject, forKey: postToView.id as NSString)
}
return content
}
}
func buildUi(content: [[String:Any]]) {
for dict in content {
if let attributedText = dict["text"] as? NSAttributedString {
let labelToAdd = UILabel()
labelToAdd.attributedText = attributedText
labelToAdd.numberOfLines = 0
labelToAdd.frame = CGRect(x:0, y:yPos, width: 200, height: 0)
labelToAdd.sizeToFit()
yPos += labelToAdd.frame.height + 5
self.scrollView.addSubview(labelToAdd)
} else if let imageName = dict["image"] as? String {
let ratio = dict["ratio"] as! CGFloat
let imageToAdd = UIImageView()
let url = URL(string: imageName)
Nuke.loadImage(with: url!, into: imageToAdd)
imageToAdd.frame = CGRect(x:0, y:yPos, width: screenWidth, height: screenWidth*ratio)
yPos += imageToAdd.frame.height + 5
self.scrollView.addSubview(imageToAdd)
}
}
self.scrollView.contentSize = CGSize(width: self.scrollView.contentSize.width, height: yPos)
}
extension String {
var htmlToAttributedString: NSAttributedString? {
guard let data = data(using: .utf8) else { return NSAttributedString() }
do {
return try NSAttributedString(data: data, options: [.documentType: NSAttributedString.DocumentType.html, .characterEncoding:String.Encoding.utf8.rawValue], documentAttributes: nil)
} catch {
return NSAttributedString()
}
}
var htmlToString: String {
return htmlToAttributedString?.string ?? ""
}
}
( Forgive me for the not-so-clean code! I am just wanting to make sure I can achieve a desirable outcome before I start refactoring. Thanks again! )
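Apple's NSAttributedString documentation notes that the HTML importer should not be called from a background thread, so moving the conversion itself off the main thread is not a safe option. One possible alternative, sketched below under that assumption, is to stay on the main queue but convert one fragment per run-loop turn so the UI can update between fragments. `convertHTMLFragments` and its parameter names are hypothetical.

```swift
import UIKit

/// Converts HTML fragments to attributed strings, one fragment per main-queue turn.
/// Hypothetical helper; call it from anywhere, the work and the completion run on the main queue.
func convertHTMLFragments(_ fragments: [String],
                          completion: @escaping ([NSAttributedString]) -> Void) {
    var remaining = fragments[...]              // fragments still to convert
    var converted: [NSAttributedString] = []

    func convertNext() {
        guard let html = remaining.popFirst() else {
            completion(converted)               // everything is converted
            return
        }
        let options: [NSAttributedString.DocumentReadingOptionKey: Any] = [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: String.Encoding.utf8.rawValue
        ]
        // Fragments that fail to convert are skipped
        if let data = html.data(using: .utf8),
           let text = try? NSAttributedString(data: data, options: options, documentAttributes: nil) {
            converted.append(text)
        }
        // Yield to the run loop before converting the next fragment
        DispatchQueue.main.async { convertNext() }
    }

    DispatchQueue.main.async { convertNext() }
}
```

A caller could start a loading indicator, pass in the HTML of each text element, and build the labels in the completion closure. Each individual conversion still blocks the main thread while it runs, so this only helps when the post is split into reasonably small fragments.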

Creating Gif Image Color Maps in iOS 11

I recently had an issue when creating a GIF: if it got too big, colors went missing. Thanks to help from SO, I found a workaround, which was to create my own color map.
Previous question: iOS Colors Incorrect When Saving Animated Gif
This worked great up until iOS 11. I can't find anything in the docs that changed and would make this no longer work. What I have found is that if I remove kCGImagePropertyGIFImageColorMap, a GIF is generated, but it has the original issue where colors go missing if the GIF gets too large, as in my previous question. This makes sense, as the color map was added to fix that issue.
Suspected Issue...
func createGifFromImage(_ image: UIImage) -> URL{
let fileProperties: [CFString: CFDictionary] = [kCGImagePropertyGIFDictionary: [
kCGImagePropertyGIFHasGlobalColorMap: false as NSNumber] as CFDictionary]
let documentsDirectoryPath = "file://\(NSTemporaryDirectory())"
if let documentsDirectoryURL = URL(string: documentsDirectoryPath){
let fileURL = documentsDirectoryURL.appendingPathComponent("test.gif")
let destination = CGImageDestinationCreateWithURL(fileURL as CFURL, kUTTypeGIF, 1, nil)!
CGImageDestinationSetProperties(destination, fileProperties as CFDictionary);
let colorMap = image.getColorMap()
print("ColorMap \(colorMap.exported as NSData)")
let frameProperties: [String: AnyObject] = [
String(kCGImagePropertyGIFImageColorMap): colorMap.exported as NSData
]
let properties: [String: AnyObject] = [
String(kCGImagePropertyGIFDictionary): frameProperties as AnyObject
]
CGImageDestinationAddImage(destination, image.cgImage!, properties as CFDictionary);
if (!CGImageDestinationFinalize(destination)) {
print("failed to finalize image destination")
}
return fileURL
}
//shouldn't get here
return URL(string: "")!
}
Here is a link to download a test project. Note if you run it on a 10.3 simulator it works great but if you run it on iOS 11 it is a white image.
https://www.dropbox.com/s/hdmkwyz47ondd52/gifTest2.zip?dl=0
Other code that is referenced...
extension UIImage{
//MARK: - Pixel
func getColorMap() -> ColorMap {
var colorMap = ColorMap()
let pixelData = self.cgImage!.dataProvider!.data
let data: UnsafePointer<UInt8> = CFDataGetBytePtr(pixelData)
var byteIndex = 0
for _ in 0 ..< Int(size.height){
for _ in 0 ..< Int(size.width){
let color = Color(red: data[byteIndex], green: data[byteIndex + 1], blue: data[byteIndex + 2])
colorMap.colors.insert(color)
byteIndex += 4
}
}
return colorMap
}
}
ColorMap
struct Color : Hashable {
let red: UInt8
let green: UInt8
let blue: UInt8
var hashValue: Int {
return Int(red) + Int(green) + Int(blue)
}
public static func == (lhs: Color, rhs: Color) -> Bool {
return [lhs.red, lhs.green, lhs.blue] == [rhs.red, rhs.green, rhs.blue]
}
}
struct ColorMap {
var colors = Set<Color>()
var exported: Data {
let data = Array(colors)
.map { [$0.red, $0.green, $0.blue] }
.joined()
return Data(bytes: Array(data))
}
}
This is a bug in iOS 11.0 and 11.1. Apple has fixed this in iOS 11.2+
I looked at the zipped project. It does not seem very "swifty". :)
Anyway, below you can find minimal code that works on iOS 11; we can start from this.
Questions:
1) What should happen for different sizes? We must either resize the GIF or add a black area around the original image in a new, larger image.
2) Can you give me more details about color maps?
3) What is all the matrix code for? It seems you want to fill with black? That is neither the smartest nor the quickest approach. I would use Apple's functions to scale up and fill the background.
I can help further once you have answered.
class ViewController: UIViewController {
override func viewDidLoad() {
super.viewDidLoad()
// Do any additional setup after loading the view, typically from a nib.
if let image = UIImage(named: "image1"){
createGIFFrom(image: image)
}
}
final func localPath()->URL{
let tempPath = NSTemporaryDirectory()
let url = URL.init(fileURLWithPath: tempPath)
return url.appendingPathComponent("test.gif")
}
private final func createGIFFrom(image: UIImage){
let fileURL = self.localPath()
let type: CFString = kUTTypeGIF// kUTTypePNG
// from apple code:
//https://developer.apple.com/library/content/technotes/tn2313/_index.html
guard let myImageDest = CGImageDestinationCreateWithURL(fileURL as CFURL, type, 1, nil) else{
return
}
guard let imageRef = image.cgImage else{
return
}
// Add an image to the image destination
CGImageDestinationAddImage(myImageDest, imageRef, nil)
CGImageDestinationFinalize(myImageDest)
print("open image at: \(fileURL)")
}
}

Function in Swift to Append a Pdf file to another Pdf

I created two different pdf files in two different views using following code:
private func toPDF(views: [UIView]) -> NSData? {
if views.isEmpty {return nil}
let pdfData = NSMutableData()
UIGraphicsBeginPDFContextToData(pdfData, CGRect(x: 0, y: 0, width: 1024, height: 1448), nil)
let context = UIGraphicsGetCurrentContext()
for view in views {
UIGraphicsBeginPDFPage()
view.layer.renderInContext(context!)
}
UIGraphicsEndPDFContext()
return pdfData
}
In the final view I call both files using:
let firstPDF = NSUserDefaults.standardUserDefaults().dataForKey("PDFone")
let secondPDF = NSUserDefaults.standardUserDefaults().dataForKey("PDFtwo")
My question is: can anyone suggest a function that appends the second file to the first one? (Both are in NSData format.)
Swift 4:
func merge(pdfs:Data...) -> Data
{
let out = NSMutableData()
UIGraphicsBeginPDFContextToData(out, .zero, nil)
guard let context = UIGraphicsGetCurrentContext() else {
return out as Data
}
for pdf in pdfs {
guard let dataProvider = CGDataProvider(data: pdf as CFData), let document = CGPDFDocument(dataProvider) else { continue }
for pageNumber in 1...document.numberOfPages {
guard let page = document.page(at: pageNumber) else { continue }
var mediaBox = page.getBoxRect(.mediaBox)
context.beginPage(mediaBox: &mediaBox)
context.drawPDFPage(page)
context.endPage()
}
}
context.closePDF()
UIGraphicsEndPDFContext()
return out as Data
}
This can be done quite easily with PDFKit and its PDFDocument.
I'm using this extension:
import PDFKit
extension PDFDocument {
func addPages(from document: PDFDocument) {
let pageCountAddition = document.pageCount
for pageIndex in 0..<pageCountAddition {
guard let addPage = document.page(at: pageIndex) else {
break
}
self.insert(addPage, at: self.pageCount) // insert(_:at:) takes the index the page will have after insertion, so passing pageCount appends it to the end
}
}
}
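Since the question holds both PDFs as data, the same PDFKit calls can be wrapped so that they take and return Data. This is a minimal sketch; `mergePDFData` is a hypothetical helper name, not part of PDFKit.

```swift
import PDFKit

/// Appends every page of `second` to `first` and returns the merged PDF data,
/// or nil if either input cannot be read as a PDF.
/// Hypothetical helper for illustration.
func mergePDFData(_ first: Data, _ second: Data) -> Data? {
    guard let merged = PDFDocument(data: first),
          let addition = PDFDocument(data: second) else { return nil }
    for index in 0..<addition.pageCount {
        guard let page = addition.page(at: index) else { continue }
        // An index equal to pageCount appends the page at the end
        merged.insert(page, at: merged.pageCount)
    }
    return merged.dataRepresentation()
}
```

With the two values from the question, this would be called as mergePDFData(firstPDF, secondPDF) after unwrapping both optionals.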
Swift 5:
Merge pdfs like this to keep links, etc...
See answer here