How do I search a large text file (book)? [closed] - swift

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I have searched books and many places on the net, all to no avail. I want to build an app that is basically a book, and add a search function to it; in an ideal world it would work like the iBooks search. The other thing I am not clear on is where to put the file (the book) that is to be searched. I hope this makes things a bit clearer.

There is nothing built into the Swift programming language to do it. You need to create your own index from the book text in order to search it efficiently.
To create the index, you first remove the stopwords: words such as "the" and "is" that are so frequent that they should not produce search results (you can find a sample list of stopwords here).
The next step is stemming. You can read more about it here. It essentially converts words to their stem so that they are found when a different derivation is searched for. For example, when someone searches for run, you show results for ran too.
After that you create an index, which could be a simple dictionary that maps each word to its occurrences in the text.
To create the index, you traverse the processed text (the stemmed text with no stopwords) and add every word to your index. If the word is already present in the index, you simply add the new occurrence to its entry; if it is not there, you add a new entry for it.
The above process does not necessarily need to be done in Swift. You might be able to find programs that do this for you, so that you simply add the resulting index to your iOS program.
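The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the stopword list is a tiny made-up sample, `stem` is a placeholder suffix stripper rather than a real stemming algorithm (it would not map "ran" to "run"), and the names `tokenize`, `buildIndex`, and `search` are hypothetical. Positions are word offsets in the original text.

```swift
import Foundation

// Hypothetical, deliberately tiny stopword list; a real app would load a fuller one.
let stopwords: Set<String> = ["the", "is", "a", "an", "and", "of", "to", "in"]

// Placeholder stemmer (assumption): strips a few common English suffixes.
// It does not handle irregular forms such as "ran"; use a real stemmer for that.
func stem(_ word: String) -> String {
    for suffix in ["ing", "ed", "s"] where word.count > suffix.count + 2 && word.hasSuffix(suffix) {
        return String(word.dropLast(suffix.count))
    }
    return word
}

// Splits text into lowercase words, dropping punctuation and whitespace.
func tokenize(_ text: String) -> [String] {
    return text.lowercased()
        .components(separatedBy: CharacterSet.alphanumerics.inverted)
        .filter { !$0.isEmpty }
}

// Builds the index: stemmed word -> word offsets at which it occurs.
func buildIndex(from text: String) -> [String: [Int]] {
    var index: [String: [Int]] = [:]
    for (position, word) in tokenize(text).enumerated() where !stopwords.contains(word) {
        // Appends to the existing entry, or creates a new one if the word is unseen.
        index[stem(word), default: []].append(position)
    }
    return index
}

// Looks up a query word after applying the same normalization as the index.
func search(_ query: String, in index: [String: [Int]]) -> [Int] {
    guard let word = tokenize(query).first else { return [] }
    return index[stem(word)] ?? []
}

let bookIndex = buildIndex(from: "The dog is running. The dogs walked in the park.")
let dogHits = search("dog", in: bookIndex)       // offsets of "dog" and "dogs"
let walkHits = search("walking", in: bookIndex)  // offset of "walked"
```

Because the query goes through the same `stem` function as the text, different derivations of a word land on the same dictionary key.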

Related

Firestore write data missing [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 1 year ago.
I have this code, which is supposed to write to Firestore. However, when the function runs, the new entry shows up highlighted in red in the back end and then disappears.
db.collection("jokes").document("Dad Jokes").setData(["\(dadJokeNum + 1)": Joke.text!])
Please help.
Calling setData without merge replaces the whole document, which deletes the existing values. If the document may already exist, you should always pass merge: true. Firestore also limits sustained writes to a single document to about one per second, so you should group related writes together; otherwise similar writes are likely to conflict and may resolve out of order.
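The difference between the two modes can be modelled with plain dictionaries. This is a sketch of the semantics only, not the Firestore SDK: `Fields`, `overwrite`, and `merge` are hypothetical names, and the joke texts are made up. The commented call at the end shows how the line from the question would look with merging enabled.

```swift
// Plain-Swift model of a document's fields; this is NOT the Firestore SDK.
typealias Fields = [String: String]

// Models setData(_:) without merge: the new data replaces the whole document.
func overwrite(_ document: Fields, with newData: Fields) -> Fields {
    return newData
}

// Models setData(_:merge: true): new fields are added and existing fields are
// kept, unless the new data contains the same key.
func merge(_ document: Fields, with newData: Fields) -> Fields {
    return document.merging(newData) { _, new in new }
}

let existing: Fields = ["1": "first joke", "2": "second joke"]  // hypothetical contents
let incoming: Fields = ["3": "third joke"]

let afterOverwrite = overwrite(existing, with: incoming)  // only field "3" remains
let afterMerge = merge(existing, with: incoming)          // fields "1", "2" and "3"

// With the real SDK, the call from the question would become:
// db.collection("jokes").document("Dad Jokes")
//     .setData(["\(dadJokeNum + 1)": Joke.text!], merge: true)
```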

Improving Search Algorithm using Regex in CoreData? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
I'm transitioning from a SQLite implementation to CoreData.
In SQLite, searches are fairly limited. In a typical search, for a string like "card", I would want to know if any members of a set of letters like [n,l,j,x], would be a valid part of a word, or a whole word, in a stored dictionary of strings.
So, in the example above, I would have to look for "nard","lard","jard","xard" and then repeat that process for each subsequent position in the string: "cnrd","clrd","cjrd","cxrd".
This is slightly controlled by the fact that I only need a single match per position in the target string to "qualify" it, so I can search for "nard","cnrd","cand","carn", and if I get a match at any point, I can mark that point in the target word as qualified, and focus on the other targets.
Thus, if I got a match at "nard" and no other matches, the next loop might check "clrd","cald","carl", and so on. If I got matches at "nard","cand", the next loop would be "clrd","carl" : you get the idea.
Does CoreData, which I know under the hood is just SQLite anyway, offer any more advanced features that would allow me to improve the default algorithms I've used, perhaps using regex? Can a pattern like \^{3}[nljx]\ be somehow used?
I'm not at the point where I'm writing the code to experiment in this direction, so anything people can point me to is great.
When you use a SQLite store with Core Data, predicates are translated into SQL and executed in SQLite. Predicates therefore have SQLite's limits on what's possible. Core Data can use other store types with different capabilities and limitations. For example, with some of them you can use a predicate that's an arbitrary block of code, but the entire persistent store is loaded into memory the whole time. Whether one of those would work for you depends on how much data you have.
Yes, you can use a predicate with NSMatchesPredicateOperatorType to do regex searching. SQLite doesn't support regexes directly, but Core Data registers a custom NSCoreDataMatches function to do the work without bringing everything into memory.
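As a rough illustration of how a character class collapses the per-letter loop from the question, the sketch below builds one pattern per position and checks it against an in-memory word list. The function name `substitutionPatterns` and the word list are hypothetical, and the code assumes the target word contains no regex metacharacters. It uses plain Foundation string matching rather than Core Data; the closing comment shows how such a pattern might be handed to a predicate, with the attribute name `text` being an assumption.

```swift
import Foundation

// Builds one anchored pattern per position of `target`, replacing the letter at
// that position with a character class of the candidate letters.
func substitutionPatterns(for target: String, candidates: [Character]) -> [String] {
    let letters = Array(target)
    let characterClass = "[" + String(candidates) + "]"
    return letters.indices.map { position in
        let prefix = String(letters[..<position])
        let suffix = String(letters[(position + 1)...])
        return "^" + prefix + characterClass + suffix + "$"
    }
}

// Hypothetical stand-in for the stored dictionary of strings.
let storedWords = ["lard", "cand", "carl", "card", "yard"]

// For each position, record whether any stored word matches that position's pattern.
let patterns = substitutionPatterns(for: "card", candidates: ["n", "l", "j", "x"])
let qualified = patterns.map { pattern in
    storedWords.contains { $0.range(of: pattern, options: .regularExpression) != nil }
}

// In a Core Data fetch, a pattern could be passed as (attribute name `text` assumed):
// NSPredicate(format: "text MATCHES %@", "[nljx]ard")
// MATCHES tests the whole string, so the ^ and $ anchors are not needed there.
```

With this approach each position needs one query instead of one query per candidate letter.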

RESTful API - Get last of an element [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
What's the best practice for getting the last added element (let's say we know which one it is because of a created_at field on the resource)?
Should it be a call to the get-all endpoint with the maximum number of results set to 1, like:
GET ../rest/v1/article?page=0&size=1&order=created_at,desc
and will return an array of one element
or maybe an "special" call like:
GET ../rest/v1/article/last
and will return an element.
I am looking for a best practice if there's one pattern for this.
Thanks!
I'm not a REST expert, but in my opinion the first solution seems the best.
The second is more practical, but routes are usually associated with resources, so adding a "last", especially after a "/", seems strange to me.
In addition, API users usually use the sorting parameters anyway, and what about users who need the last 10 elements?
If you add something after ../rest/v1/article, it should be the ID of one particular element, a sub-resource, or an action outside CRUD, like ../rest/v1/article/:id/subscription.
Both URLs are RESTful and identify a Resource. Of course the first would return a collection containing a single element while the second would return this element directly.
The first form will be automatically supported if you support paging and sorting at all.
You write
or maybe an "special" call
I don't see this as an 'or'; it should be an 'and'. The second form is optional, and it could be helpful. If you have a URL pattern like
GET ../rest/v1/article/{id}
it is easy to implement logic that can distinguish normal IDs like, for example, 123 or A567 from special IDs like 'last'.
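A minimal sketch of that distinction, assuming no real article can ever be given the ID "last". The types `ArticleSelector` and `Article`, the functions `parseSelector` and `resolve`, and the in-memory array standing in for the data store are all hypothetical.

```swift
// Hypothetical selector for the last path component of ../rest/v1/article/{id}.
enum ArticleSelector: Equatable {
    case last
    case id(String)
}

// Treats the reserved word "last" as a special ID; anything else is a normal ID.
// Assumption: no real article is ever assigned the ID "last".
func parseSelector(_ pathComponent: String) -> ArticleSelector? {
    if pathComponent.isEmpty { return nil }
    return pathComponent == "last" ? .last : .id(pathComponent)
}

struct Article: Equatable {
    let id: String
    let createdAt: Int  // hypothetical timestamp
}

// Resolves a selector against an in-memory collection standing in for the database.
func resolve(_ selector: ArticleSelector, in articles: [Article]) -> Article? {
    switch selector {
    case .last:
        return articles.max { $0.createdAt < $1.createdAt }
    case .id(let id):
        return articles.first { $0.id == id }
    }
}

let articles = [
    Article(id: "123", createdAt: 100),
    Article(id: "A567", createdAt: 300),
    Article(id: "200", createdAt: 200),
]
let newest = parseSelector("last").flatMap { resolve($0, in: articles) }
let byID = parseSelector("123").flatMap { resolve($0, in: articles) }
```

Both routes end up in the same `resolve` function, so the special form adds very little code on top of the normal ID lookup.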

Swift - Set vs Array [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
On one hand I want an ordered collection, on the other hand I want every item in the collection to appear only once.
I can either use an array, inserting an item only if it is not already in the array and sorting after every insertion,
or use a Set and sort it every time I query the data.
Does someone have a better solution?
There are several third-party libraries implementing an ordered set in Swift, so you could check them out.
Also, you could write your own implementation of an ordered set (you can base it on an existing one) if it is not an overkill for your task. The way you choose really depends on your app.
Finally, you could use one of the two approaches you proposed: a built-in array or a set. To choose between them, look at your app: which action will be performed more often? Accessing elements in order (then use an array) or adding and removing elements (then the set is probably the way to go).
This part was edited based on comments below
If you go for an array, note that the built-in contains for arrays does not know that the array is sorted, so it will probably be O(N), not O(log(N)). So you should either write a custom replacement for the contains method or (this is, once again, probably the better way) write a custom collection type that implements contains the right way. However, since contains is a protocol extension method of SequenceType (renamed Sequence in Swift 3), my knowledge of Swift is, I'm afraid, not good enough yet to tell you how to do it properly; maybe someone else will.
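One possible shape for such a custom type is sketched below. It is a minimal wrapper, not a full Collection conformance, and the name `SortedUniqueArray` is made up for illustration. It keeps an array sorted and free of duplicates, and uses binary search so that the membership test is O(log N).

```swift
// A minimal sorted, duplicate-free collection backed by an array.
struct SortedUniqueArray<Element: Comparable> {
    private(set) var elements: [Element] = []

    // Binary search: returns the index of `value` if present, otherwise the
    // index at which it must be inserted to keep the array sorted. O(log N).
    private func searchIndex(for value: Element) -> (index: Int, found: Bool) {
        var low = 0
        var high = elements.count
        while low < high {
            let mid = low + (high - low) / 2
            if elements[mid] < value {
                low = mid + 1
            } else {
                high = mid
            }
        }
        return (low, low < elements.count && elements[low] == value)
    }

    func contains(_ value: Element) -> Bool {
        return searchIndex(for: value).found
    }

    // Inserts `value` at its sorted position unless it is already present and
    // returns true if it was inserted. The lookup is O(log N); the array
    // insertion itself is still O(N) because later elements must shift.
    @discardableResult
    mutating func insert(_ value: Element) -> Bool {
        let (index, found) = searchIndex(for: value)
        if found { return false }
        elements.insert(value, at: index)
        return true
    }
}

var numbers = SortedUniqueArray<Int>()
for value in [5, 1, 3, 1, 4, 5] {
    numbers.insert(value)
}
```

Because the array is always sorted, no separate sort step is needed after an insertion or before a query.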
UPDATE (based on your comment to your question):
I believe that in your particular case (a chat app) an array is superior. You only have to sort old messages once, and you probably will not try to add very old messages again; you only have to make sure you don't add new messages twice (it is implementation-dependent though, so you know better, I'm just assuming). So you only have to check that the last messages in your old array do not overlap with the first messages in the array that you add. Sort of :)
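The overlap check could look something like this. It is a sketch under assumptions that may not hold for a real chat app: every message has an `id` that increases with time, both arrays are already sorted by it, and an incoming batch never contains messages older than the start of the history. `Message` and `appendNew` are hypothetical names.

```swift
struct Message: Equatable {
    let id: Int       // hypothetical; assumed to increase with time
    let text: String
}

// Appends an incoming batch to an already sorted history, skipping the leading
// messages of the batch that the history already contains.
func appendNew(_ incoming: [Message], to history: [Message]) -> [Message] {
    guard let lastKnown = history.last else { return incoming }
    let fresh = incoming.drop(while: { $0.id <= lastKnown.id })
    return history + fresh
}

let history = [Message(id: 1, text: "hi"), Message(id: 2, text: "hello"), Message(id: 3, text: "how are you?")]
let batch = [Message(id: 2, text: "hello"), Message(id: 3, text: "how are you?"), Message(id: 4, text: "fine"), Message(id: 5, text: "bye")]
let merged = appendNew(batch, to: history)
```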

What is the best reversible hash algorithm for a URL? (near-Zero collision!) [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Suppose I have one URL.
http://google.com ...I'd like to turn it into a hash. S3jvZLSDK.
Then take this hash and reverse it! into http://google.com.
To you geeks out there--what is the BEST method to do this for near-ZERO collision?
If you can reverse it, then by definition it isn't a hash. It's an encoding. Any reversible encoding has zero collisions (otherwise it could not be reversed accurately).
A common encoding for this purpose is base64.
The whole point of a hash is that it isn't reversible (short of brute-force, trying every possible input until the output matches).
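For illustration, a base64 round trip in Swift might look like the following; the function names `encode` and `decode` are made up. Note that the encoded form is longer than the input, and that standard base64 can contain the characters +, / and =, which need care inside URLs.

```swift
import Foundation

// Encodes a string as base64. The encoding is always reversible, so two
// different inputs can never produce the same output.
func encode(_ text: String) -> String {
    return Data(text.utf8).base64EncodedString()
}

// Decodes base64 back to the original string, or returns nil if the input
// is not valid base64 or not valid UTF-8.
func decode(_ encoded: String) -> String? {
    guard let data = Data(base64Encoded: encoded) else { return nil }
    return String(data: data, encoding: .utf8)
}

let token = encode("http://google.com")
let original = decode(token)
```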
Is this for a URL shortening service? The usual way of doing this is to store http://google.com in a database under a unique key, and when someone queries with that key (which could be ‘S3jvZLSDK’ if you really like random strings, but could just as easily be ‘1’) you spit the value you remembered back out again.
Are you trying to write something like a URL shortener? If so, just generate a random string, then use a big hash table, relational database (with indexes), etc. to relate keys (S3jvZLSDK) to URLs (google.com) and vice versa.
That will give you an easy solution for handling collisions (key already exists, URL already exists) and fast lookups.
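A minimal in-memory sketch of that idea, with two dictionaries standing in for the hash table or database. The type `URLShortener`, its method names, and the key length of 9 are assumptions made for illustration.

```swift
struct URLShortener {
    private var urlsByKey: [String: String] = [:]
    private var keysByURL: [String: String] = [:]
    private let alphabet = Array("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")

    // Generates a random key of the given length from the alphabet.
    private func randomKey(length: Int) -> String {
        return String((0..<length).map { _ in alphabet.randomElement()! })
    }

    // Returns the existing key for a URL, or stores the URL under a new random key.
    mutating func shorten(_ url: String, keyLength: Int = 9) -> String {
        if let existing = keysByURL[url] { return existing }  // URL already exists
        var key = randomKey(length: keyLength)
        while urlsByKey[key] != nil {                          // key already exists: retry
            key = randomKey(length: keyLength)
        }
        urlsByKey[key] = url
        keysByURL[url] = key
        return key
    }

    // Looks up the URL stored under a key, if any.
    func expand(_ key: String) -> String? {
        return urlsByKey[key]
    }
}

var shortener = URLShortener()
let shortKey = shortener.shorten("http://google.com")
let expanded = shortener.expand(shortKey)
```

Keeping both directions of the mapping makes the two collision cases (key already exists, URL already exists) a single dictionary lookup each.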
With a hash there is no way to guarantee zero collisions, but you can make collisions arbitrarily unlikely if you use a cryptographic hash with a large output size. The SHA-2 family contains a version with a 512-bit output (SHA-512); that should do you.