Improving Search Algorithm using Regex in CoreData? [closed] - swift

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
I'm transitioning from a SQLite implementation to CoreData.
In SQLite, searches are fairly limited. In a typical search for a string like "card", I want to know whether substituting any member of a set of letters like [n,l,j,x] into the string would produce a valid part of a word, or a whole word, in a stored dictionary of strings.
So, in the example above, I would have to look for "nard","lard","jard","xard" and then repeat that process for each subsequent position in the string: "cnrd","clrd","cjrd","cxrd".
This is slightly controlled by the fact that I only need a single match per position in the target string to "qualify" it, so I can search for "nard","cnrd","cand","carn", and if I get a match at any point, I can mark that point in the target word as qualified, and focus on the other targets.
Thus, if I got a match at "nard" and no other matches, the next loop might check "clrd","cald","carl", and so on. If I got matches at "nard","cand", the next loop would be "clrd","carl" : you get the idea.
Does CoreData, which I know under the hood is just SQLite anyway, offer any more advanced features that would allow me to improve the default algorithms I've used, perhaps using regex? Can a pattern like \^{3}[nljx]\ be somehow used?
I'm not at the point where I'm writing the code to experiment in this direction, so anything people can point me to is great.

When you use a SQLite store with Core Data, predicates are translated into SQLite code and executed in SQLite. Predicates therefore have SQLite's limits on what's possible. Core Data can use other store types with different capabilities and limitations. For example, with some of them you can use a predicate that's any arbitrary block of code, but the entire persistent store stays loaded in memory the whole time. Whether one of those would work for you depends on how much data you have.

Yes, you can use a predicate with NSMatchesPredicateOperatorType to do regex searching. SQLite doesn't support regexes directly, but Core Data registers a custom NSCoreDataMatches function to do the work without bringing everything into memory.
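As a rough sketch of how the substitution search from the question could be expressed as regex patterns: the helper below builds one pattern per position of the target word. The function name `positionPatterns` and the attribute name `text` in the comment are hypothetical, and the sketch assumes the letters are plain alphabetic characters that need no regex escaping.

```swift
/// Builds one regex pattern per position of `target`, replacing the
/// character at that position with a character class of the candidate letters.
/// Assumption: `letters` contains only plain letters (no regex metacharacters).
func positionPatterns(for target: String, letters: Set<Character>) -> [String] {
    let chars = Array(target)
    // Sort the letters so the generated pattern is deterministic.
    let characterClass = "[" + String(letters.sorted()) + "]"
    return chars.indices.map { i in
        String(chars[..<i]) + characterClass + String(chars[(i + 1)...])
    }
}

// Each pattern could then be handed to a MATCHES predicate, for example
// NSPredicate(format: "text MATCHES %@", pattern), where `text` is a
// hypothetical attribute name on the stored entity.
let patterns = positionPatterns(for: "card", letters: ["n", "l", "j", "x"])
```

Because MATCHES has to match the whole string, each pattern here qualifies exactly one position of the target word; a position is qualified as soon as its pattern returns at least one result.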

Related

RESTful API - Get last of an element [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
What's the best practice for getting the last added element (let's say we know that because of a created_at field on the resource).
Should it be a call to the get-all endpoint with max results set to 1, like:
GET ../rest/v1/article?page=0&size=1&order=created_at,desc
which returns an array of one element,
or maybe a "special" call like:
GET ../rest/v1/article/last
which returns a single element?
I am looking for a best practice if there's one pattern for this.
Thanks!
I'm not a RESTful expert, but in my opinion the first solution seems the best.
The second is more practical, but routes are usually associated with resources, so adding a "last", especially preceded by a "/", seems strange to me.
In addition, API users usually already use the sorting parameters, and what about users who need the last 10 elements?
If you add something after ../rest/v1/article, it must be an ID for one particular element, a sub-resource, or for actions that are outside the CRUD like ../rest/v1/article/:id/subscription.
Both URLs are RESTful and identify a Resource. Of course the first would return a collection containing a single element while the second would return this element directly.
The first form will be automatically supported if you support paging and sorting at all.
You write
or maybe a "special" call
I don't see this as an 'or'; it should be an 'and'. The second form is optional, and it could be helpful. If you have a URL pattern like
GET ../rest/v1/article/{id}
it is easy to implement logic that can distinguish normal IDs like, for example, 123 or A567 from special IDs like 'last'.
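A minimal sketch of that distinction, not tied to any particular web framework: the `Article` type, its integer `createdAt` field, and the function names are all assumptions made for illustration.

```swift
// Hypothetical resource type; `createdAt` is a plain integer timestamp here.
struct Article: Equatable {
    let id: String
    let createdAt: Int
}

// The {id} path segment is either the special value "last" or a normal ID.
enum ArticleSelector: Equatable {
    case last
    case id(String)
}

func parseSelector(_ segment: String) -> ArticleSelector {
    segment == "last" ? .last : .id(segment)
}

// Resolves a selector against an in-memory list standing in for the data store.
func find(_ selector: ArticleSelector, in articles: [Article]) -> Article? {
    switch selector {
    case .last:
        return articles.max { $0.createdAt < $1.createdAt }
    case .id(let id):
        return articles.first { $0.id == id }
    }
}
```

One consequence of this design is that "last" can never be used as a real article ID, so it only works if the ID format rules it out.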

Swift - Set vs Array [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
On one hand I want an ordered collection, on the other hand I want every item in the collection to appear only once.
I can either use an array and sort it every time I insert an item (inserting only if the item is not already in the array),
or use a Set data structure and sort it every time I query the data.
Does someone have a better solution?
There are several third-party libraries implementing an ordered set in Swift, so you could check them out.
Also, you could write your own implementation of an ordered set (you can base it on an existing one) if it is not an overkill for your task. The way you choose really depends on your app.
And in the end, you could use one of the two approaches you proposed: a built-in array or a set. To choose between them, look at your app: which action will be performed more often? Accessing elements in order (use an array then), or adding and deleting elements (then the set is probably the way to go)?
This part was edited based on comments below
If you go for an array, note that the built-in contains for arrays does not know that the array is sorted, so it will probably be O(N), not O(log(N)). So you should either write a custom replacement for the contains method or (once again, probably the better way) write a custom collection class that implements contains the right way. However, since contains is a protocol extension method of SequenceType, my knowledge of Swift, I'm afraid, is not good enough yet to tell you how to do it properly; maybe someone else will.
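One possible shape for such a custom collection, as a sketch only: a wrapper around a sorted array that uses binary search for lookups. The type name `SortedUniqueArray` is made up for this example, and it does not conform to any collection protocols.

```swift
/// A sorted array without duplicates. Lookups use binary search (O(log N));
/// insertion still shifts elements, so it remains O(N) in the worst case.
struct SortedUniqueArray<Element: Comparable> {
    private(set) var elements: [Element] = []

    init() {}

    /// Index of the first element that is not less than `element`.
    private func lowerBound(of element: Element) -> Int {
        var low = 0
        var high = elements.count
        while low < high {
            let mid = low + (high - low) / 2
            if elements[mid] < element {
                low = mid + 1
            } else {
                high = mid
            }
        }
        return low
    }

    func contains(_ element: Element) -> Bool {
        let i = lowerBound(of: element)
        return i < elements.count && elements[i] == element
    }

    /// Inserts `element` at its sorted position; returns false if already present.
    @discardableResult
    mutating func insert(_ element: Element) -> Bool {
        let i = lowerBound(of: element)
        if i < elements.count && elements[i] == element { return false }
        elements.insert(element, at: i)
        return true
    }
}
```

The array never needs re-sorting, because every insert places the new element directly at its sorted position.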
UPDATE (based on your comment to your question):
I believe that in your particular case (a chat app) an array is superior. You only have to sort old messages once, and you probably will not try to add very old messages again; you only have to make sure you don't add new messages twice (this is implementation-dependent though, so you know better, I'm just assuming). So you only have to check that the last messages in your old array do not overlap with the first messages in the array that you add. Sort of :)

How do I search a large text file (book)? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I have tried to find the information in books and many places on the net, all to no avail. What I want to make is an app which is basically a book. I want to add a search function to it which, in an ideal world, would work like the iBooks search. The other thing I am not clear on is where to put the file (the book) which is to be searched. I hope this makes things a bit more clear.
There is nothing built into the Swift programming language to do it. You need to create your own index from the book text in order to search it efficiently.
To create the index, you first remove the stopwords: words like "the", "is", etc. that are very frequent and are not supposed to produce search results (you can find a sample list of stopwords here).
The next step would be stemming. You can read more about it here. It essentially converts words to their stem so that they are found when different derivations of them are searched. For example, when someone searches for run, you show results for ran too.
After that you create an index, which could be a simple dictionary mapping each word to its occurrences in the text.
To create the index, you traverse the processed text (the stemmed text with no stopwords) and add every word to your index. If the word is already present in the index, you simply add the new occurrence to its entry; if it is not there, you add the word to the dictionary.
The above process does not necessarily need to be done in Swift; you might be able to find programs that do this for you, and then simply add the resulting index to your iOS program.
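If you do build it in Swift, the indexing steps described above could look roughly like this. It is a sketch under simplifying assumptions: stemming is skipped entirely, the stopword set is supplied by the caller, and an occurrence is recorded as the word's position in the text (counting every word, stopwords included). The function name `buildIndex` is made up for the example.

```swift
/// Builds a simple inverted index: word -> positions where it occurs.
/// Assumptions: no stemming, lowercase matching, words split on non-letters.
func buildIndex(from text: String, stopwords: Set<String>) -> [String: [Int]] {
    var index: [String: [Int]] = [:]
    let words = text.lowercased().split(whereSeparator: { !$0.isLetter })
    for (position, word) in words.enumerated() {
        let key = String(word)
        // Stopwords are skipped but still count towards the position.
        if stopwords.contains(key) { continue }
        // Creates the entry on first sight, appends on later occurrences.
        index[key, default: []].append(position)
    }
    return index
}

let bookIndex = buildIndex(
    from: "The cat is on the mat. The cat ran.",
    stopwords: ["the", "is", "on"]
)
```

Looking up a search term is then a single dictionary access, for example `bookIndex["cat"]`, which returns the recorded positions or nil if the word never occurs.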

Is there a rule of thumb for how granular an object should be in OO programming? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I'm in school learning OO programming, and for the next few months, every assignment involves dice games and word games like jumble and hangman. Each assignment has us creating a new class for these variables; HangmanWordArray, JumbleWordArray, etc. In the interest of reusability, I want to create a class (or series of classes) that can be re-used for my assignments. I'm having a hard time visualizing what that would look like, and hope my question makes sense...
Let's assume that the class(es) will contain properties with accessors and mutators, and methods to return the various objects...a word, a letter, a die roll. Is there a rule of thumb for how to organize these classes?
Is it best to keep one object per class? Or group all the objects in a single class because they're all related as "stuff I need for assignments?" Or group by data type, so all the numeric objects are in one class and all the strings in another?
I guess I'm grappling with how a programmer, in the real world, makes decisions about how to group objects within a class or series of classes, and some of the questions and thought processes I should be using to frame this type of design scenario.
Realistically, it varies from project to project. There is no guaranteed 'right way' to group anything; it all depends on your needs. What it comes down to is manageability, meaning how easily you can read and update old code. If you can contain all your games in a single 'games' class, then there's nothing wrong with doing it. However, if your games are very complicated with many subs and variables, perhaps moving them all to their own class would be easier to manage.
That being said, there are ways to logically group items. For instance if you have a lot of solo functions that are used for manipulation (char to string, string to int, html encode/decode, etc.), you may decide to create a 'helper functions' class to hold them all. Similarly, if your application uses a database connection, you may create a class to hold and manage a shared connection as well as have methods for getting query results and executing non-queries.
Some people try to break things down too much. For example, instead of having the database core mentioned above, they might create one class to create and manage the database connection, and then another class that uses the connection class to handle queries. Not that this method won't work, but it may become very difficult to manage when items are split up too small.
Without knowing exactly what you are doing, there's no way to tell you how to do it. If you reuse the same methods in each project, then perhaps you can place them somewhere that they can be shared. The best way I have found to figure out what works is just to try it out and see how it responds!
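As one illustration of placing shared pieces "somewhere that they can be shared", here is a sketch of two small single-purpose types that several dice and word games could reuse. The names `Die` and `WordBank` are made up, and this is only one of many reasonable ways to split things up.

```swift
/// A die with a configurable number of sides, reusable by any dice game.
struct Die {
    let sides: Int

    /// Rolls the die using the supplied random number generator.
    func roll<G: RandomNumberGenerator>(using generator: inout G) -> Int {
        Int.random(in: 1...sides, using: &generator)
    }
}

/// A list of words that hangman, jumble, and similar games can draw from.
struct WordBank {
    let words: [String]

    /// Returns a random word, or nil if the bank is empty.
    func randomWord<G: RandomNumberGenerator>(using generator: inout G) -> String? {
        words.randomElement(using: &generator)
    }
}

var generator = SystemRandomNumberGenerator()
let roll = Die(sides: 6).roll(using: &generator)
```

Each game would then hold the pieces it needs (a hangman game owns a `WordBank`, a dice game owns a few `Die` values) instead of defining its own copy of them.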
What I see people doing is breaking down their objects and methods until each method is just a handful of code; if any method exceeds a page of code, they will try to break down the object structure further in order to shorten things up.
I personally have no objection to long methods, as long as they are readable. I think a "one-page limit" tends to create too much granularity, and risks more confusion rather than less. But this seems to be the current fashion.
Just reporting what I'm seeing in the wild.

What is the best reversible hash algorithm for a URL? (near-Zero collision!) [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Suppose I have one URL.
http://google.com ...I'd like to turn it into a hash. S3jvZLSDK.
Then take this hash and reverse it! into http://google.com.
To you geeks out there--what is the BEST method to do this for near-ZERO collision?
If you can reverse it, then by definition it isn't a hash. It's an encoding. Any encoding will have zero collisions (otherwise it wouldn't be able to accurately reverse it).
A common encoding for this purpose is base64.
The whole point of a hash is that it isn't reversible (short of brute-force, trying every possible input until the output matches).
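To make the encoding idea concrete, here is a small sketch of a base64 round trip using Foundation. The function names `encodeURL` and `decodeURL` are made up for the example.

```swift
import Foundation

/// Encodes a URL string as base64. This is reversible, so it is an
/// encoding rather than a hash, and the result is longer than the input.
func encodeURL(_ url: String) -> String {
    Data(url.utf8).base64EncodedString()
}

/// Decodes a base64 token back to the original string, or nil if the
/// token is not valid base64 or does not contain UTF-8 text.
func decodeURL(_ token: String) -> String? {
    guard let data = Data(base64Encoded: token) else { return nil }
    return String(data: data, encoding: .utf8)
}

let token = encodeURL("http://google.com")
let original = decodeURL(token)
```

Note that the token grows with the URL, which is why an encoding alone does not give you short links.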
Is this for a URL shortening service? The usual way of doing this is to store http://google.com in a database under a unique key, and when someone queries with that key (which could be ‘S3jvZLSDK’ if you really like random strings, but could just as easily be ‘1’) you spit the value you remembered back out again.
Are you trying to write something like a URL shortener? If so, just generate a random string, then use a big hash table, relational database (with indexes), etc. to relate keys (S3jvZLSDK) to URLs (google.com) and vice versa.
That will give you an easy solution for handling collisions (key already exists, URL already exists) and fast lookups.
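A minimal in-memory sketch of that lookup-table approach, with two dictionaries standing in for the database. The type name `URLShortener` is made up, and it uses sequential keys instead of random strings to keep the example deterministic.

```swift
/// Maps short keys to URLs and back. Dictionaries stand in for a real
/// database table with indexes on both columns.
struct URLShortener {
    private var urlForKey: [String: String] = [:]
    private var keyForURL: [String: String] = [:]
    private var nextID = 1

    init() {}

    /// Returns the existing key for a known URL, otherwise assigns a new one.
    mutating func shorten(_ url: String) -> String {
        if let existing = keyForURL[url] { return existing }
        // Sequential IDs rendered in base 36 keep keys short and unique.
        let key = String(nextID, radix: 36)
        nextID += 1
        urlForKey[key] = url
        keyForURL[url] = key
        return key
    }

    /// Looks up the URL stored under `key`, if any.
    func resolve(_ key: String) -> String? {
        urlForKey[key]
    }
}
```

Because every key is handed out exactly once, collisions cannot happen; the cost is that the mapping has to be stored.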
There is no way to get zero collisions with a hash, but you can make collisions arbitrarily unlikely if you use a cryptographic hash with a large output size. The SHA-2 family contains a version with a 512-bit output (SHA-512); that should do you.