How do I select text from a scanned photo?

I'm writing an app that lets you take a picture of some text; the text is then recognized and stored in a variable. I've done that with the firebase_ml_vision plugin and everything works.
The problem is that I want to choose which text is taken from the photo. For example, each word and number could automatically be given a frame, and the user then taps the words that should be transferred to the variable. Google Translate works this way (see screenshot), but I haven't found out how to do it yet. Do you know how it works?

Firebase ML Kit's text recognition API returns a frame as well as cornerPoints for each VisionTextBlock, VisionTextLine, and VisionTextElement:
https://firebase.google.com/docs/reference/swift/firebasemlvision/api/reference/Classes/VisionTextBlock
They should help you to select the words, lines, or text blocks.
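
To make the tap-to-select idea concrete, here is a minimal, language-neutral sketch in Python. It is not the plugin's API: `WordBox`, `word_at`, and `toggle_selection` are hypothetical names, and the frames are made-up values standing in for whatever the recognizer returns for each word.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordBox:
    """One recognized word and its frame (hypothetical type, image pixel coordinates)."""
    text: str
    left: float
    top: float
    width: float
    height: float

    def contains(self, x: float, y: float) -> bool:
        # A tap hits the word when it falls inside the frame rectangle.
        return (self.left <= x <= self.left + self.width
                and self.top <= y <= self.top + self.height)


def word_at(boxes, x, y):
    """Return the first word whose frame contains the tap, or None."""
    for box in boxes:
        if box.contains(x, y):
            return box
    return None


def toggle_selection(selected, box):
    """Add the tapped word to the selection, or remove it if already selected."""
    if box in selected:
        selected.remove(box)
    else:
        selected.append(box)


# Made-up recognition result for a one-line photo.
boxes = [
    WordBox("Total:", 10, 20, 60, 18),
    WordBox("42.50", 80, 20, 50, 18),
    WordBox("EUR", 140, 20, 35, 18),
]

selected = []
for tap in [(95, 30), (150, 25), (300, 300)]:  # the last tap misses every frame
    hit = word_at(boxes, *tap)
    if hit is not None:
        toggle_selection(selected, hit)

result = " ".join(box.text for box in selected)
print(result)
```

In a real app the tap arrives in screen coordinates, so it would first have to be scaled into the coordinate space of the image that was passed to the recognizer.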

Related

how to pass the text of my speech recognition to my <ion-searchbar> in Ionic

I hope you can help me with this. My speech recognition works well: what I say is displayed and stored correctly. The problem is that I don't know how to pass the recognized value to the searchbar so that it is used for the search.
The <ion-searchbar> handles its input through its (ionChange)="event($event)" binding.
This searchbar filters a list of items where I pass them through a pipe.

Manage image deletion in a WYSIWYG editor

When an HTML editor is used and images are added from the local computer, they are uploaded to a server and a link is obtained to put it in the image src attribute. What happens when the img element is removed from the editor? How would the image be deleted from the server? In this case I understand that the image deletion event could be detected and then call a service to delete it. But what happens if the user adds a new image and leaves the page? How would it be deleted in these cases?
In both cases, if the deletion of the images is not managed, it could happen that the server is filled with unused images. How do you usually solve this problem? How is the proper way to solve this?

That's a nice question. And yes, the server would certainly fill up with unused images at some point. I'm not an expert on this, but I'll try to suggest something so I can implement it in my own WYSIWYG editor too, haha. I suppose you have a custom modal for inserting the image. When the button is clicked, you could save the image link to an array. Then on save, on leaving the document editor, or on a popstate event, you could use a regex to check the editor's innerHTML for each specific src. If it is not found, you could send an AJAX request with the image name so the server can delete it. There are certainly more efficient and more complex ways to achieve this, such as creating text ranges and tracking elements on keydown for Backspace (8) and Delete (46).
Another way is to track the images that are in use. When the document is saved, extract the images in the document with a regex, push them to a database table, and periodically run a check on the back end to delete those that are not in use.
I don't know if my suggestions are helpful or not. I just saw an interesting subject so I jumped in. Cheers mate.
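
As a rough illustration of the "compare uploads against what is still in the document" idea, here is a small Python sketch. Note that it swaps the regex for the standard library's HTML parser, and that `unused_uploads`, the upload paths, and the sample HTML are all made up for the example.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.sources = set()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.add(value)


def unused_uploads(uploaded, editor_html):
    """Return uploaded image URLs that no longer appear in the editor HTML."""
    collector = ImgSrcCollector()
    collector.feed(editor_html)
    collector.close()
    return sorted(set(uploaded) - collector.sources)


# Hypothetical data: three uploads, one of which was removed from the editor.
uploaded = ["/uploads/a.png", "/uploads/b.png", "/uploads/c.png"]
editor_html = (
    '<p>Hello</p><img src="/uploads/a.png">'
    '<p>more</p><img alt="x" src="/uploads/c.png"/>'
)

to_delete = unused_uploads(uploaded, editor_html)
print(to_delete)
```

The same comparison could run on save in the browser or as the periodic back-end check; only the source of the `uploaded` list changes.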

iPhone sdk how to retrieve the Table of Contents from PDF file?

I am using VFR Reader to display my PDFs. I need to extract the table of contents on a button click and display it in a table view; tapping an entry should then lead to the respective page. I googled this and found these links:
Create a table of contents from a pdf file
http://mobile.tutsplus.com/tutorials/iphone/ios-sdk-adding-a-table-of-contents-to-an-ipad-reader/
I learned that to get the TOC we must use "CGPDFDocumentGetCatalog(pdf doc)". But in my reader, "CGPDFDocumentGetCatalog(pdf doc)" is never called. How can I extract the TOC from my PDF file? Please help me with this; I have been struggling with it for a week. Thanks in advance.

Unfortunately I think the two answers you refer to point to different implementation strategies, which are both possibly valid but are different.
The first question is what the PDF files you have and want to show in your app look like. There is no such thing as a predefined TOC object in a PDF file; there are simply different ways to emulate one. The two most common ways are:
A) Bookmarks, which are a way to add little pieces of text to a structured tree, where each piece of text points to a specific location in the PDF file. These bookmarks can be added in the design application or later (there are specific tools to do so) and they can implement whatever structure.
B) Your PDF file might contain something that looks like a classic TOC from a book, which is basically just text on the opening pages, optionally with hyperlinks to specific locations in the book.
The second link you refer to shows how to create user interface where you can show the TOC in. The remaining question then is to figure out what items you want to display in the TOC window. In this second link you point to, the solution presented is to provide hard-coded items specific to one specific book. Of course this approach is not very useful when you want to display just any book.
So the question you are left with is how to figure out what items to display and where they link to.
If you consider my possibility A) above: a PDF file with bookmarks, the answer could be relatively simple. Answer 1 you point to explains how to look at the different structures inside a PDF file - bookmarks are simply such a structure (Defined in section 12.3 of the PDF specification: http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf)
This means you could use the techniques shown there to walk the different objects in the PDF file, and find each bookmark. The bookmark will give you the text to display and the actual location in the PDF file that text should jump to when clicked.
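
To give an idea of what walking the bookmarks looks like, here is a small Python sketch. It assumes the outline has already been parsed into plain dictionaries: in the PDF format the outline root and every item link to their first child through a First entry and to the next sibling through a Next entry, and each item carries a Title. The `Dest` values below are simplified to page numbers for the example (real destinations are more involved), and `walk_outline` is a hypothetical helper, not part of any PDF library.

```python
def walk_outline(node, depth=0):
    """Yield (depth, title, dest) for every bookmark below node, in reading order."""
    item = node.get("First")
    while item is not None:
        yield depth, item["Title"], item.get("Dest")
        # Children hang off the item's own First entry.
        yield from walk_outline(item, depth + 1)
        # Siblings are chained through Next.
        item = item.get("Next")


# Made-up outline: two chapters, the first with one subsection.
section_1_1 = {"Title": "1.1 Scope", "Dest": 3}
chapter_1 = {"Title": "1 Introduction", "Dest": 2, "First": section_1_1}
chapter_2 = {"Title": "2 Usage", "Dest": 7}
chapter_1["Next"] = chapter_2
outline_root = {"First": chapter_1}

toc = list(walk_outline(outline_root))
for depth, title, dest in toc:
    print("  " * depth + f"{title} -> page {dest}")
```

The resulting list maps directly onto table view rows: the depth gives the indentation, the title the label, and the destination the page to jump to.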
If you consider my possibility B) above: a PDF file without bookmarks but with a classic TOC, this will be much harder to solve. Such a table of contents is simply text on one or more pages, optionally with hyperlinks. Of course you could try to find all text on these pages (if you can figure out on which page the TOC starts and ends), but you'd then also have to figure out where each item links to. If there are no hyperlinks involved, that would be a daunting task.
So your first question should be how generically you want to solve this problem. Do you know which PDF files you'll want to display? Can you devise a TOC for these files yourself (as in your solution 2)? If not, can you be sure all PDF files contain bookmarks? The answers to those questions will largely determine the rest of your strategy...

Text field re-enter word

Good afternoon. I would like to know if there is a way to make a text field "remember" what the user wrote previously, so that when the user types the first letter a list of suggested words appears!! :) Thanks!

It is possible to show the list of values in a table view. You can use autocomplete.
Read this article, which also comes with source code. It should help you.

It's not that easy; you will have to do some work yourself. For example, you will need to store all the entered values in an NSMutableArray, and then present a table when the user starts editing the UITextField. The table's contents will change as the user types.

As your app runs, the amount of text you need to remember will keep growing. You can store the text in a local database (SQLite); then, as the user types in the text field, run a search query and return the matching entries.
That is, unless you don't need to remember the entered text after the user quits your app.
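
The store-and-query idea can be sketched as follows. This is a Python illustration using the standard `sqlite3` module, not iOS code; the table name `history` and the helpers `remember` and `suggest` are made up for the example.

```python
import sqlite3


def open_history(path=":memory:"):
    """Open (or create) the store of previously entered texts."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS history (entry TEXT PRIMARY KEY)")
    return conn


def remember(conn, text):
    """Store an entry; repeated entries are kept only once."""
    conn.execute("INSERT OR IGNORE INTO history (entry) VALUES (?)", (text,))
    conn.commit()


def suggest(conn, prefix, limit=5):
    """Return stored entries that start with the typed prefix."""
    # Escape LIKE wildcards so the user's input is matched literally.
    escaped = (prefix.replace("\\", "\\\\")
                     .replace("%", "\\%")
                     .replace("_", "\\_"))
    rows = conn.execute(
        "SELECT entry FROM history WHERE entry LIKE ? ESCAPE '\\' "
        "ORDER BY entry LIMIT ?",
        (escaped + "%", limit),
    )
    return [row[0] for row in rows]


conn = open_history()
for word in ["apple", "apricot", "banana", "apple"]:
    remember(conn, word)

suggestions = suggest(conn, "ap")
print(suggestions)
```

Passing a file path instead of `:memory:` keeps the history across launches; the in-memory database matches the case where nothing needs to survive quitting the app.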

get PDF page title

Is it possible to get page title via iText?
The PdfTextExtractor returns all text from the page, but I don't know which line is the title. Also, the title may span more than one line.
I don't know the coordinates of the title, so I can't use RegionTextRenderFilter.
I could try to analyze the font size and take the line(s) with the biggest font, but TextRenderInfo doesn't provide public access to gs (private final GraphicsState gs).
Any other ideas?

Pages within a PDF don't have titles, they just have text that happens to be bold or in a large font and appears in an area you consider to be "more top" than other pieces of text. It sounds like you know this already, I just needed to be clear on this.
See my post here, which shows how to get font information by subclassing ITextExtractionStrategy. My sample targets iTextSharp, which is the .Net port of iText, but they match pretty much feature-to-feature. The biggest difference is that Java uses getXXX and setXXX whereas .Net just uses XXX for both. Otherwise everything should port just fine.
The moral of the story is that you are going to have to write some arbitrary rules defining what you think of as a "title" and then parse based on those rules.
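
As an example of such an arbitrary rule, here is a Python sketch of "the largest font in the upper half of the page". It is not iText code: it assumes the text lines, their font sizes, and their vertical positions have already been extracted by some custom strategy, and `TextLine`, `guess_title`, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TextLine:
    """One extracted line (hypothetical type)."""
    text: str
    font_size: float
    y: float  # distance from the bottom of the page, as in PDF coordinates


def guess_title(lines, page_height=792.0, top_fraction=0.5):
    """Join the largest-font lines found in the upper part of the page."""
    cutoff = page_height * (1 - top_fraction)
    candidates = [ln for ln in lines if ln.y >= cutoff]
    if not candidates:
        return None
    biggest = max(ln.font_size for ln in candidates)
    # A title may span several lines that share the largest size.
    title_lines = [ln for ln in candidates if abs(ln.font_size - biggest) < 0.01]
    # A higher y is nearer the top, so sort descending for reading order.
    title_lines.sort(key=lambda ln: ln.y, reverse=True)
    return " ".join(ln.text for ln in title_lines)


# Made-up page content.
lines = [
    TextLine("Annual Report", 24, 720),
    TextLine("of the Example Society", 24, 692),
    TextLine("Prepared by the committee", 12, 660),
    TextLine("FOOTNOTE IN BIG TYPE", 30, 40),
]

title = guess_title(lines)
print(title)
```

The thresholds are arbitrary on purpose; a different document set may need a rule based on boldness, position, or the first line only.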
