VNRecognizeTextRequest digital / seven-segment numbers - swift

I basically followed this great tutorial on VNRecognizeTextRequest and modified some things:
https://bendodson.com/weblog/2019/06/11/detecting-text-with-vnrecognizetextrequest-in-ios-13/
I am trying to recognise text from devices with seven-segment-style displays, which seems to be a bit tricky for this framework. It often works, but numbers with a comma are hard, and so are numbers with a gap between the digits. I'm wondering whether it is possible to "train" this recognition engine. Another possibility might be to tell it to look specifically for numbers, so that it can focus its processing on that instead of generically looking for text.
I use this modified code for the request:
ocrRequest = VNRecognizeTextRequest { (request, error) in
    guard let observations = request.results as? [VNRecognizedTextObservation] else { return }
    for observation in observations {
        guard let topCandidate = observation.topCandidates(1).first else { continue }
        let topCandidateText = topCandidate.string
        if let float = Float(topCandidateText), topCandidate.confidence > self.bestConfidence {
            self.bestCandidate = float
            self.bestConfidence = topCandidate.confidence
        }
    }
    if self.bestConfidence >= 0.5 {
        self.captureSession?.stopRunning()
        DispatchQueue.main.async {
            self.found(measurement: self.bestCandidate!)
        }
    }
}
ocrRequest.recognitionLevel = .accurate
ocrRequest.minimumTextHeight = 1/10
ocrRequest.recognitionLanguages = ["en-US", "en-GB"]
ocrRequest.usesLanguageCorrection = true
There are three properties in this class related to the text recognition:
private var ocrRequest = VNRecognizeTextRequest(completionHandler: nil)
private var bestConfidence: Float = 0
private var bestCandidate: Float?
Thanks in advance for your answers, even though this is not directly code-related, but more concept-related (i.e. "am I doing something wrong / did I overlook an important feature?" etc.).
Example image that works:
Example that half works:
(recognises 58)
Example that does not work:
(it has a very low confidence for "91" and often thinks it's just 9 or 9!)
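One thing that may be worth trying for the comma, gap and "9!" cases is to request several candidates per observation (topCandidates(_:) takes a maximum count) and clean each string before handing it to the number parser. Below is a minimal sketch of that clean-up step only, kept free of Vision so it can be run on its own. parseReading and the look-alike table are hypothetical helpers, not part of the framework, and the substitutions are assumptions that would need tuning for a real display.

```swift
import Foundation

/// Hypothetical helper: tries to turn raw OCR candidate strings into a number.
/// Candidates are expected best-first, e.g. the strings of topCandidates(3).
func parseReading(from candidates: [String]) -> Double? {
    // Assumed look-alike table for seven-segment glyphs; tune it for your display.
    let lookAlikes: [Character: Character] = [
        "O": "0", "o": "0", "l": "1", "I": "1", "!": "1", "S": "5", "B": "8"
    ]
    for candidate in candidates {
        var cleaned = ""
        for character in candidate {
            if character == " " { continue }           // close gaps between digits
            if character == "," {
                cleaned.append(".")                    // decimal comma -> decimal point
            } else {
                cleaned.append(lookAlikes[character] ?? character)
            }
        }
        // The first candidate that parses wins; later ones are only fallbacks.
        if let value = Double(cleaned) { return value }
    }
    return nil
}
```

It may also be worth testing with usesLanguageCorrection set to false, since language correction is aimed at words rather than digit strings; whether that helps here is something to measure rather than assume.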

Related

How can we get bounding box results from VNRecognizedTextObservation after we filter compactMap results

I have this code to get the observations for recognized text:
guard let observations =
    request.results as? [VNRecognizedTextObservation] else {
    return
}
To get the top candidates:
let recognized = observations.compactMap { observation in
    return observation.topCandidates(1).first
}
And to keep only the text recognized with high confidence:
let confidenceL = recognized.filter { $0.confidence > 0.3 }
Now I want to draw something around the higher-confidence text, but
I can only get coordinates from an observation, like this:
observations[k].topRight
The confidenceL array differs from observations, as you might guess, so how can I find which observation contains the confidenceL[k] text?
Well, the
let recognized = observations.compactMap { observation in
    return observation.topCandidates(1).first
}
and
guard let observations =
    request.results as? [VNRecognizedTextObservation] else {
    return
}
array sizes match (as long as every observation yields a top candidate, since compactMap drops the ones that do not), so I just gave up using
let confidenceL = recognized.filter { $0.confidence > 0.3 }
with tears and did something like this:
for k in 0..<recognized.count {
    if recognized[k].confidence > 0.3 {
        // Doing drawing things here
    }
}
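An alternative that does not rely on the two arrays staying the same length is to keep each observation next to its top candidate and filter the pairs. The sketch below uses stand-in Observation and Candidate structs (hypothetical, only mirroring the topCandidates(_:), confidence and boundingBox members used above) so that it runs without Vision; with the real types the compactMap/filter chain would be written the same way.

```swift
import Foundation

// Stand-ins for the Vision types, so the sketch runs without the framework.
struct Candidate {
    let string: String
    let confidence: Float
}

struct Observation {
    let boundingBox: CGRect
    let candidates: [Candidate]
    func topCandidates(_ maxCount: Int) -> [Candidate] {
        Array(candidates.prefix(maxCount))
    }
}

/// Pairs every observation with its best candidate and keeps the confident pairs.
func confidentPairs(in observations: [Observation],
                    threshold: Float) -> [(observation: Observation, text: Candidate)] {
    observations
        .compactMap { observation in
            // Observations without any candidate are dropped here.
            observation.topCandidates(1).first.map { (observation: observation, text: $0) }
        }
        .filter { $0.text.confidence > threshold }
}
```

Each element of the result then carries both the text and the observation whose coordinates are needed for drawing.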

Numpy's argmax() in Swift

I'm working on a project which converts a user's face to an emoji. I use Apple's ARKit for this purpose.
I need to get the most probable option. I wrote this code:
func renderer(for anchor: ARAnchor) {
    guard let faceAnchor = anchor as? ARFaceAnchor else {
        return
    }
    let shapes = faceAnchor.blendShapes
    let browDownLeft = shapes[.browDownLeft]!.doubleValue
    let browInnerUp = shapes[.browInnerUp]!.doubleValue
    let browOuterUpLeft = shapes[.browOuterUpLeft]!.doubleValue
    let leftBrowMax = max(browDownLeft, browInnerUp, browOuterUpLeft)
    switch leftBrowMax {
    case browDownLeft:
        userFace.leftBrow = .browDown
    case browInnerUp:
        userFace.leftBrow = .browInnerUp
    case browOuterUpLeft:
        userFace.leftBrow = .browOuterUp
    default:
        userFace.leftBrow = .any
    }
}
I need to duplicate the function's body six times (for the brows, eyes and mouth sides), so I want to write it in a more convenient way. Is there anything in Swift like numpy's argmax function? I also need to restrict the range of arguments, because the values for the mouth should not be compared with the values for the brows.
You can use something like this:
func maxBlendShape(for blendShapes: [ARFaceAnchor.BlendShapeLocation], in shape: [ARFaceAnchor.BlendShapeLocation: NSNumber]) -> Double? {
    blendShapes
        .compactMap { shape[$0] }
        .map(\.doubleValue)
        .max()
}
Usage would then be something like this:
maxBlendShape(for: [.browDownLeft, .browInnerUp, .browOuterUpLeft], in: faceAnchor.blendShapes)
Note: Nothing here is specific to ARKit, you just filter some keys from the dictionary and find their max value. A generic solution could look like this:
extension Dictionary where Value == NSNumber {
    func maxDouble(for keys: [Key]) -> Double? {
        keys
            .compactMap({self[$0]})
            .map(\.doubleValue)
            .max()
    }
}
faceAnchor.blendShapes.maxDouble(for: [.browInnerUp, .browDownLeft, .browOuterUpLeft])
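Note that the functions above return the largest value, whereas argmax returns where that value sits. A minimal sketch of a key-returning variant follows; the argmax(among:) name is made up for this example and is not a standard library or ARKit API.

```swift
extension Dictionary where Value: Comparable {
    /// Returns the key with the largest value among `candidates`,
    /// or nil if none of the candidates is present in the dictionary.
    func argmax(among candidates: [Key]) -> Key? {
        candidates
            .compactMap { key in self[key].map { (key: key, value: $0) } }
            .max(by: { $0.value < $1.value })?.key
    }
}
```

NSNumber is not Comparable, so for blend shapes the values would first need converting, for example with faceAnchor.blendShapes.mapValues(\.doubleValue), before calling argmax(among: [.browDownLeft, .browInnerUp, .browOuterUpLeft]). The returned BlendShapeLocation would then still have to be mapped to the app's own brow cases.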

Why is my lineString not converting to mapShape in geoSwift - (only happens with one specific search), could be external library bug?

I'm using the GEOSwift library: https://github.com/GEOSwift/GEOSwift
My best guess is that, if you look at the linked "String" image, the shape does not look like a proper circle, so maybe it is a bug in the library? But I am not at all sure about this!
I'm having an issue only when I enter one specific linestring.
My app takes an array of route coordinates and converts them into a WKT string (representing a line). It then creates a buffer around this line and converts the buffer into a mapShape.
It runs fine until I search for one specific route.
It fails here:
func bufferPolyline(routeCoords: [CLLocationCoordinate2D], completion: @escaping (_ bufferCoordsArray: [LatLng]) -> ()) {
    var wktString = ""
    var i = 0
    while i < routeCoords.count {
        let lat = routeCoords[i].latitude
        let lng = routeCoords[i].longitude
        if i == routeCoords.count-1 {
            let wktLast = " \(lng) \(lat)"
            wktString += "\(wktLast)"
            i += 1
        }
        if i >= 1 && i <= routeCoords.count-2 {
            let wktMid = " \(lng) \(lat),"
            wktString += "\(wktMid)"
            i += 1
        }
        if i == 0 {
            let wktFirst = "\(lng) \(lat),"
            wktString += "\(wktFirst)"
            i += 1
        }
    }
    let linestring = Geometry.create("LINESTRING(\(wktString))")!
    let string = linestring.buffer(width: 0.05)!
    guard let shapeLine = string.mapShape() as? MKPolygon else {
        preconditionFailure() // FAILURE HAPPENS HERE.
    }
}
Here are links to images to see how it looks:
LineString - https://imgur.com/a/7OLPZkM
String - https://imgur.com/a/KJRfpRX
The linestring and string values are still coming through even when shapeLine doesn't initialise, so I'm struggling to see where I'm going wrong. They also seem to be formatted the same way as in the searches that work.
I tried to google for a WKT string validator but didn't find one. I assume the string should be OK, as multiple other searches return with no issues (i.e. shapeLine gets a value).
My question is: does this look like a problem in my code, or a possible bug in the library? (I have little faith in my code!)
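As a side note on the WKT-building loop in the question, the same string can be produced with map and joined, which removes the index bookkeeping as a possible source of formatting slips. This is only a sketch: Coordinate is a stand-in for CLLocationCoordinate2D so the example runs without Core Location, and lineStringWKT is a hypothetical helper name. It does not by itself explain the failing cast.

```swift
// Stand-in for CLLocationCoordinate2D, so the sketch runs without Core Location.
struct Coordinate {
    let latitude: Double
    let longitude: Double
}

/// Builds "LINESTRING(lng lat, lng lat, ...)" from route coordinates.
func lineStringWKT(for coords: [Coordinate]) -> String? {
    // A line needs at least two points.
    guard coords.count >= 2 else { return nil }
    let points = coords
        .map { "\($0.longitude) \($0.latitude)" }   // WKT order is x (lng), then y (lat)
        .joined(separator: ", ")
    return "LINESTRING(\(points))"
}
```

When the cast fails, printing the buffered geometry and the value returned by mapShape() before the `as? MKPolygon` cast may show what type actually comes back for the problematic route.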

Can you send objects other than strings in URLQueryItems?

Ok, I am building an iMessage app, and to transfer data back and forth I have to use URLQueryItems. I am working with an SKScene and need to transfer Ints, CGPoints, images, etc. From Apple's documentation and my own attempts, it seems like you can only store strings in URLQueryItems.
As this is the only way to pass data back and forth, is there a (better) way to store other types of data? Currently I have been doing this:
func composeMessage(theScene: GameScene) {
    let conversation = activeConversation
    let session = conversation?.selectedMessage?.session ?? MSSession()
    let layout = MSMessageTemplateLayout()
    layout.caption = "Hello world!"
    let message = MSMessage(session: session)
    message.layout = layout
    message.summaryText = "Sent Hello World message"
    var components = URLComponents()
    let queryItem = URLQueryItem(name: "score",value: theScene.score.description)
    components.queryItems = [queryItem] //array of queryitems
    message.url = components.url!
    print("SENT:",message.url?.query)
    conversation?.insert(message, completionHandler: nil)
}
Then on the flip side I have to convert this string back to an Int again. Doing this with CGPoints will be inefficient. How would one pass something like a CGPoint in a URLQueryItem? Is there any other way than storing the x and y values as strings?
EDIT: This is how I have been receiving data from the other person and putting into their scene:
override func willBecomeActive(with conversation: MSConversation) {
    // Called when the extension is about to move from the inactive to active state.
    // This will happen when the extension is about to present UI.
    // Use this method to configure the extension and restore previously stored state.
    let val = conversation.selectedMessage?.url?.query?.description
    print("GOT IT ", val)
    if(val != nil)
    {
        scene.testTxt = val!
    }
}
As you discovered, to pass data via URLQueryItem, you do have to convert everything to Strings since the information is supposed to be represented as a URL after all :) For CGPoint information, you can break the x and y values apart and send them as two separate Ints converted to String. Or, you can send it as a single String value in the form of "10,5" where 10 is the x and 5 is the y value but at the other end you would need to split the value on a comma first and then convert the resulting values back to Ints, something like this (at the other end):
let arr = cgPointValue.components(separatedBy:",")
let x = Int(arr[0])
let y = Int(arr[1])
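The two-separate-items variant can be sketched as a small round trip. The item names "x" and "y" and both helper functions are assumptions made for this example, and the values are sent as Doubles rather than Ints so that fractional coordinates survive:

```swift
import Foundation

/// Encodes a point as two query items named "x" and "y" (assumed names).
func queryItems(for point: CGPoint) -> [URLQueryItem] {
    [URLQueryItem(name: "x", value: String(Double(point.x))),
     URLQueryItem(name: "y", value: String(Double(point.y)))]
}

/// Reads the point back from a URL; returns nil if either item is missing or not numeric.
func point(from url: URL) -> CGPoint? {
    guard let items = URLComponents(url: url, resolvingAgainstBaseURL: false)?.queryItems,
          let xText = items.first(where: { $0.name == "x" })?.value,
          let yText = items.first(where: { $0.name == "y" })?.value,
          let x = Double(xText),
          let y = Double(yText) else { return nil }
    return CGPoint(x: x, y: y)
}
```

On the sending side the items would be appended to components.queryItems next to the existing "score" item, and on the receiving side point(from:) would be called with conversation.selectedMessage?.url once it has been unwrapped.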
For other types of data, you'd have to follow a similar tactic where you convert the values to String in some fashion. For images, if you have the image in your resources, you should be able to get away with passing just the name or an identifying number. For external images, a URL (or part of one if the images all come from the same server) should work. Otherwise, you might have to look at base64 encoding the image data or something if you use URLQueryItem but if you come to that point, you might want to look at what you are trying to achieve and if perhaps there is a better way to do it since large images could result in a lot of data being sent and I'm not sure if iMessage apps even support that. So you might want to look into limitations in the iMessage app data passing as well.
Hope this helps :)
You can use iMessageDataKit library for storing key-value pairs in your MSMessage objects. It makes setting and getting data really easy and straightforward like:
let message: MSMessage = MSMessage()
message.md.set(value: 7, forKey: "moveCount")
message.md.set(value: "john", forKey: "username")
message.md.set(values: [15.2, 70.1], forKey: "startPoint")
message.md.set(values: [20, 20], forKey: "boxSize")
if let moveCount = message.md.integer(forKey: "moveCount") {
    print(moveCount)
}
if let username = message.md.string(forKey: "username") {
    print(username)
}
if let startPoint = message.md.values(forKey: "startPoint") {
    print("x: \(startPoint[0])")
    print("y: \(startPoint[1])")
}
if let boxSize = message.md.values(forKey: "boxSize") {
    let size = CGSize(width: CGFloat(boxSize[0] as? Float ?? 0),
                      height: CGFloat(boxSize[1] as? Float ?? 0))
    print("box size: \(size)")
}
(Disclaimer: I'm the author of iMessageDataKit)

Why in Swift are variables optional in a function but not in a playground

I am puzzled. I need to compare product date codes. They look like 12-34-56. I wrote some code to break the parts up and compare them. This code works fine in the playground, but when I make it a function in a view controller, values come up nil and I get a lot of "Optional("12-34-56")" values when they are printed to the log or viewed at a breakpoint. I tried unwrapping in many locations but nothing takes. Don't be confused by the variables named day and month: these are not real dates, and product codes can have 90 "days" and 90 "months" depending on the production machine used.
func compaireSerial(oldNumIn: NSString, newNumIn: String) -> Bool {
    // take the parts of the number and compare the pieces one at a time.
    // Set up the old Num in chunks
    let oldNum = NSString(string: oldNumIn)
    let oldMonth = Int(oldNum.substringToIndex(2))
    let oldDay = Int(oldNum.substringWithRange(NSRange(location: 3, length: 2)))
    let oldYear = Int(oldNum.substringFromIndex(6))
    print(oldMonth,oldDay, oldYear)
    // Set up the new Num in chunks
    let newNum = NSString(string: newNumIn)
    let newMonth = Int(newNum.substringToIndex(2))
    let newDay = Int(newNum.substringWithRange(NSRange(location: 3, length: 2)))
    let newYear = Int(newNum.substringFromIndex(6))
    print(newMonth, newDay, newYear)
    // LETS Do the IF comparison steps.
    if oldYear < newYear {
        return true
    } else if oldMonth < newMonth {
        return true
    } else if oldDay < newDay {
        return true
    } else {
        return false
    }
}
Many thanks to anyone who can help. I'm totally stumped.
All Int() initializers that take a String parameter always return an optional Int.
The real-time result column in a Playground doesn't show the Optional wrapper, but printing the value does.
let twentyTwo = Int("22") | 22
print(twentyTwo) | "Optional(22)\n"
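Because every piece comes back optional, one way to handle the date codes is to unwrap all three parts in one place and only then compare. The sketch below is written in current Swift rather than the older syntax of the question, and both function names are made up for the example. It also assumes a strict field-by-field ordering is wanted (third field first, then first, then second), which differs from the else-if chain in the question, where a smaller month returns true even when the year is larger.

```swift
/// Splits a code such as "12-34-56" into three integers, or returns nil if it is malformed.
func parts(of code: String) -> (first: Int, second: Int, third: Int)? {
    let pieces = code.split(separator: "-")
    let numbers = pieces.compactMap { Int($0) }   // Int(...) is optional; failures are dropped
    guard pieces.count == 3, numbers.count == 3 else { return nil }
    return (numbers[0], numbers[1], numbers[2])
}

/// Returns true if `old` sorts strictly before `new`, or nil if either code is malformed.
func isOlder(_ old: String, than new: String) -> Bool? {
    guard let o = parts(of: old), let n = parts(of: new) else { return nil }
    // Tuples compare element by element, so the third field ("year") decides first.
    return (o.third, o.first, o.second) < (n.third, n.first, n.second)
}
```

Returning an optional Bool keeps a malformed code from being silently treated as "not older".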
I don't see how I can delete my question, so I'll post this to let others know it is fixed. It turns out the function works okay, but the NSUserDefaults value coming in was optional, so I was feeding the optional in. After unwrapping the NSUserDefaults value, everything works.