How to create and search in a language dictionary - iphone

I'm working on an Xcode project.
I need to add a dictionary of words, like English dictionary for example. Then, when required, I need to find some words in it.
I have been thinking about creating an array from a file to store all the words, sorting it somehow (I don't know how yet), and then doing a binary search or something similar to find the word I'm looking for.
Do you think this is a good idea?
Do you have any suggestions for how to sort the array?
Is binary search a good idea?

The best way would be to store the words in a database (e.g. SQL-based) or in Core Data, unless you need all the words in memory at once, which would be the case if you often listed all of them. Even then, lazy loading can solve the problem.
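For comparison, the in-memory approach described in the question (sort once, then binary search) can be sketched as follows. This is a language-agnostic illustration in Python rather than iPhone code; the function names and the sample words are made up for the example, and lowercasing the words is an assumption about how matching should work.

```python
import bisect

def load_sorted_words(lines):
    """Normalise, de-duplicate and sort the words once, up front."""
    return sorted({line.strip().lower() for line in lines if line.strip()})

def contains_word(sorted_words, word):
    """Binary search: O(log n) comparisons per lookup."""
    key = word.strip().lower()
    i = bisect.bisect_left(sorted_words, key)
    return i < len(sorted_words) and sorted_words[i] == key

# Hypothetical sample data standing in for the lines of a word file.
words = load_sorted_words(["pear", "Apple", "banana", "apple", ""])
has_banana = contains_word(words, "Banana")
has_cherry = contains_word(words, "cherry")
```

Any standard sort is enough here, because the list is sorted only once, at load time.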

Related

Creating a dictionary from another dictionary keeping the structure intact

I have a big dictionary with deep hierarchy in it... I want to read it and create another dictionary with same structure but with some modifications while I am reading the source dictionary.
Modifications are like if the keyName is "server" then remove that key, if the keyName is "notification" then alter its value.
What is the best way to do this while keeping the structure of the source dictionary intact?
Read the Deep Copies section of Collections Programming Topics. In fact, you should really read the entire document. You'll end up reading it all at some point anyway (or worse, having us repeatedly point you there), and it's only a few dozen pages.
I know this probably isn't the answer you were looking for, but the alternative is for someone here to code up a method that deep copies dictionaries for you. I'm not going to do that. If you get stuck on something specific, by all means, ask here.
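As a rough, language-agnostic sketch of the idea (in Python, not Cocoa), a recursive walk can build the copy and apply the modifications on the way down. The replacement rule for "notification" is hypothetical, since the question does not say how the value should change, and leaf values are assumed to be immutable scalars that are safe to share.

```python
def alter_notification(value):
    # Hypothetical rule: the question does not say how the value changes.
    return "disabled"

def transform(node):
    """Return a modified deep copy; the source structure is not mutated."""
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            if key == "server":
                continue  # drop this key at every level
            if key == "notification":
                result[key] = alter_notification(value)
            else:
                result[key] = transform(value)
        return result
    if isinstance(node, list):
        return [transform(item) for item in node]
    return node  # scalar leaf, assumed immutable

# Made-up nested data for illustration.
source = {"a": {"server": "x", "notification": "on", "b": [{"server": 1, "c": 2}]}}
copy = transform(source)
```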

Check if NSString contains a common first name on iPhone

I am wondering what the best approach would be to check whether or not a common first name is contained within an NSString on an iPhone app. I've got a sorted flat text file of ~5500 common American first names delimited by new lines. The NSString I am searching within for a name is not very long, most likely the size of a normal sentence.
My original plan was to load the sorted list into memory and then iterate over every word in the NSString performing a binary search of the list to determine whether or not that word was a common name.
Am I better off trying to put this name list into CoreData or a SQLite table and performing a query with that? My understanding is I would not have to load the entire list into memory if I went that route.
I am guessing this situation is a common problem with word dictionaries for word games, so I'm just wondering what the best practice is for fast lookups. Thanks!
SQLite sounds ideal for this in terms of both speed of lookup and minimising memory usage. It would also make it potentially possible to update the first name list over the internet if so desired.
Using Core Data (which is in effect an elaborate wrapper around SQLite) would be overkill in this instance, especially as you don't require its ORM-like capabilities.
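A minimal sketch of the SQLite approach, written in Python with the standard sqlite3 module rather than the iPhone's C API. The table and column names are assumptions made for the example, as is the choice of a case-insensitive collation.

```python
import sqlite3

def build_name_db(names, path=":memory:"):
    """Create a one-column table; the primary key gives an indexed lookup."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS names "
        "(name TEXT COLLATE NOCASE PRIMARY KEY)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO names (name) VALUES (?)",
        ((n.strip(),) for n in names if n.strip()),
    )
    conn.commit()
    return conn

def is_common_name(conn, word):
    """Query for a single row; the full list is not loaded by the caller."""
    row = conn.execute(
        "SELECT 1 FROM names WHERE name = ? LIMIT 1", (word,)
    ).fetchone()
    return row is not None

# Hypothetical sample names; a real app would ship a prebuilt database file.
conn = build_name_db(["Paul", "Mary", "John"])
paul_found = is_common_name(conn, "paul")
table_found = is_common_name(conn, "Table")
```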
An NSSet might be useful as well. Dave DeLong's answer for another question demonstrates that NSSets have constant look-up times, i.e. O(1).
Load your names into an NSMutableSet one by one. This will be the slowest part but will only need to be done once. If your file is a simple line-delimited file of names, it may be easier to use the standard C library for reading the file, since line-by-line input is not well-supported by Cocoa.
After that, simply use [nameSet containsObject:name] to check whether it is in the list.
A couple of drawbacks to this approach:
The name you want to test must be in the same case as the name in the set, that is “paul” and “Paul” are different strings. You can circumvent this by converting all names to lowercase before inserting them into the set, and then also converting the name you want to check into lowercase before checking it against the set.
It might be easier just to go with the already-accepted answer.
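The set-based lookup described above, including the lowercase normalisation, can be sketched like this. It is Python rather than Objective-C, and the punctuation stripping is an assumption about how the sentence would be split into words.

```python
def load_name_set(lines):
    """Build the set once; names are stored in lowercase."""
    return {line.strip().lower() for line in lines if line.strip()}

def find_names(name_set, sentence):
    """Return the words of the sentence that appear in the name set."""
    found = []
    for raw in sentence.split():
        word = raw.strip(".,;:!?\"'()")  # drop surrounding punctuation
        if word.lower() in name_set:     # average-case O(1) membership test
            found.append(word)
    return found

# Made-up lines standing in for the contents of the name file.
name_set = load_name_set(["Paul\n", "Mary\n", "\n"])
matches = find_names(name_set, "I saw PAUL and mary.")
```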

How can I index a bunch of files in Perl?

I'm trying to clean up a database by first finding unreferenced objects. I have extracted all the database objects into a list, and all the ddl code into files, I also have all the Java source code for the project.
Basically what I want to do (preferably in Perl as it's the scripting language that I'm most familiar with) is to somehow index the contents of all the extracted database ddl and Java files (to speed up the search), step through the database object list and then search through all the files (using the index) to see if those objects are referenced anywhere and create a report.
If you could point me in the right direction to find something that indexes all those files in a way that I can search them (preferably in Perl) I would greatly appreciate it.
The key here is to be able to do this programmatically, not manually (using something like Google desktop search).
Break the task down into its steps and start at the beginning. First, what does a record look like, and what information in it connects it to another record? Parse that record, store its unique identifier and a list of the things it references.
Once you have that list, invert it. For each reference, create a list of the objects referenced. Count them by their identifier. You should be able to get the ones whose count is zero.
That's a very general answer, but you asked a very general question. If you are having trouble, break it down into just one of those steps and ask a more specific question, supplying sample data and the code you've tried so far.
Good luck,
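The invert-and-count step can be sketched as follows, assuming the parsing stage has already produced a mapping from each file to the object names it mentions. The sketch is in Python rather than Perl, and the file and object names are invented for the example.

```python
from collections import Counter

def find_unreferenced(objects, references):
    """Count how many files mention each object; return those never mentioned."""
    counts = Counter()
    for source, targets in references.items():
        for target in set(targets):  # count each file at most once per object
            counts[target] += 1
    # A Counter reports 0 for keys it has never seen.
    return sorted(name for name in objects if counts[name] == 0)

# Hypothetical inputs: the database object list and the parsed references.
objects = ["ORDERS", "CUSTOMERS", "OLD_AUDIT"]
references = {
    "OrderDao.java": ["ORDERS", "CUSTOMERS"],
    "schema.ddl": ["ORDERS"],
}
unreferenced = find_unreferenced(objects, references)
```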
An interesting module you might use for this is KinoSearch, which provides the kind of indexing you said you were looking for. You can then go through the object identifiers and check whether there are any references to each one.

MongoDB: What's a good way to get a list of all unique tags?

What's the best way to keep track of unique tags for a collection of documents millions of items large? The normal way of doing tagging seems to be indexing multikeys. I will frequently need to get all the unique keys, though. I don't have access to mongodb's new "distinct" command, either, since my driver, erlmongo, doesn't seem to implement it, yet.
Even if your driver doesn't implement distinct, you can implement it yourself. In JavaScript (sorry, I don't know Erlang, but it should translate pretty directly) you can say:
result = db.$cmd.findOne({"distinct" : "collection_name", "key" : "tags"})
So, that is: you do a findOne on the "$cmd" collection of whatever database you're using. Pass it the collection name and the key you want to run distinct on.
If you ever need a command your driver doesn't provide a helper for, you can look at http://www.mongodb.org/display/DOCS/List+of+Database+Commands for a somewhat complete list of database commands.
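To make concrete what a distinct over a multikey (array-valued) field computes, here is a plain Python sketch that runs on in-memory documents. It is not driver code and does not talk to MongoDB; the sample documents are invented, and skipping documents without the key is a choice made for the sketch.

```python
def distinct(documents, key):
    """Collect the unique values of key, treating array elements individually."""
    seen = set()
    for doc in documents:
        value = doc.get(key)
        values = value if isinstance(value, list) else [value]
        for v in values:
            if v is not None:  # the sketch skips documents without the key
                seen.add(v)
    return sorted(seen)

# Made-up documents with a "tags" array.
docs = [
    {"title": "a", "tags": ["erlang", "mongodb"]},
    {"title": "b", "tags": ["mongodb", "tagging"]},
    {"title": "c"},
]
unique_tags = distinct(docs, "tags")
```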
I know this is an old question, but I had the same issue and could not find a real solution in PHP for it.
So I came up with this:
http://snipplr.com/view/59334/list-of-keys-used-in-mongodb-collection/
John, you may find it useful to use Variety, an open source tool for analyzing a collection's schema: https://github.com/jamescropcho/variety
Perhaps you could run Variety every N hours in the background, and query the newly-created varietyResults database to retrieve a listing of unique keys which begin with a given string (i.e. are descendants of a specific parent).
Let me know if you have any questions, or need additional advice.
Good luck!

Help with dictionaries, arrays and plists on iPhone

I would appreciate some help with something I'm working on. I haven't done this before and am having some problems, because I don't think I understand exactly how to do it. What I want to do is, I'm sure, simple to most of you, and will be to me as soon as I do it correctly the first time. Anyway, I have a table view that I need to populate with two things: a username and a count of items (the username could be a primary key). Currently I have a table view that is populated from, and editable through, an array; no problem, I know how to do that.
The two parts I need help understanding are how to:
Read a plist with these two values into a dictionary, and read them into two different arrays that I can use with my tables.
Save the arrays back to the dictionary and then back to a plist.
I think I'm getting most confused about how to store these two things as dictionary keys and values. I've looked that over but I'm just not "getting it".
I would appreciate some short code examples of how to do this or a better way to accomplish the same thing.
As always, thanks for your awesome help....
You can use the NSArray method writeToFile:atomically: to dump your data into a file, and then use initWithContentsOfFile: to retrieve the information from that file just as you dumped it previously. I believe that if you have dictionaries in your array, you should be able to get them back this way. You can also use Core Data for storage if the structures you need to store get complex and dumping them to a file and reading them back to recreate objects becomes messy.
The approach that would perhaps be the simplest is to store the data as an array of dictionaries. This has the issue that recreating the array from a plist with mutable leaves is convoluted at best.
But if you can tolerate the performance hit of replacing dictionaries when updating the list instead of modifying them, it may well be the simplest course of action.
This also has the added benefit that your data source only needs to deal with one array, and that the whole shebang would be key-value coding compliant, which might further simplify your code.
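The array-of-dictionaries layout can be illustrated with a small round trip through a plist file. This sketch uses Python's standard plistlib module instead of NSArray, and the key names "username" and "count" are assumptions chosen for the example.

```python
import os
import plistlib
import tempfile

def save_users(path, usernames, counts):
    """Zip the two arrays into one array of dictionaries and write a plist."""
    records = [{"username": u, "count": c} for u, c in zip(usernames, counts)]
    with open(path, "wb") as fh:
        plistlib.dump(records, fh)

def load_users(path):
    """Read the plist back and split it into the two parallel arrays."""
    with open(path, "rb") as fh:
        records = plistlib.load(fh)
    usernames = [r["username"] for r in records]
    counts = [r["count"] for r in records]
    return usernames, counts

# Round trip through a temporary file with made-up data.
path = os.path.join(tempfile.mkdtemp(), "users.plist")
save_users(path, ["alice", "bob"], [3, 7])
usernames, counts = load_users(path)
```

Keeping each username and its count in the same dictionary means the two values cannot drift out of step when rows are inserted or deleted.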