Word database for iOS custom keyboard - swift

I am writing a custom keyboard for an Asian language, and I have a word database with over half a million words. For now I use Realm to provide word suggestions: when the user types the first few letters, the keyboard searches the DB and returns words ranked by a priority value assigned to each word. This seems inefficient compared to other keyboards in the App Store, and I can't find any concrete approach to the problem. Can anyone point me in the right direction for making word search faster in a custom iOS keyboard?
I haven't tried Core Data, but Realm is generally considered faster than Core Data.

First, the type of storage: plist files or plain .txt files wouldn't be a bad choice, and saving the words sorted by their ASCII (character code) order would help a lot.
Second, you need an algorithm that can narrow the list down to a group of words very quickly. You can do this by comparing the character codes.
Here is an example of a binary search algorithm :
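A minimal sketch in Swift of the sorted-list idea above, not a definitive implementation. It assumes the word list is already loaded into memory and sorted by Unicode scalar value (the non-ASCII equivalent of sorting by ASCII code). WordEntry, scalarPrecedes, lowerBound and suggestions are hypothetical names, and priority stands in for the per-word priority value mentioned in the question.

```swift
// Hypothetical entry type: a word plus the priority value from the question.
struct WordEntry {
    let word: String
    let priority: Int
}

/// True if `a` sorts before `b` when compared scalar by scalar (by character code).
func scalarPrecedes(_ a: String, _ b: String) -> Bool {
    return a.unicodeScalars.lexicographicallyPrecedes(b.unicodeScalars)
}

/// Binary search: index of the first entry whose word is not less than `key`.
/// Assumes `entries` is sorted ascending with `scalarPrecedes`.
func lowerBound(_ entries: [WordEntry], _ key: String) -> Int {
    var low = 0
    var high = entries.count
    while low < high {
        let mid = low + (high - low) / 2
        if scalarPrecedes(entries[mid].word, key) {
            low = mid + 1
        } else {
            high = mid
        }
    }
    return low
}

/// Returns up to `limit` words starting with `prefix`, highest priority first.
func suggestions(for prefix: String, in entries: [WordEntry], limit: Int = 5) -> [String] {
    guard !prefix.isEmpty else { return [] }
    var matches: [WordEntry] = []
    var index = lowerBound(entries, prefix)
    // In sorted order, all words sharing a prefix sit next to each other.
    while index < entries.count,
          entries[index].word.unicodeScalars.starts(with: prefix.unicodeScalars) {
        matches.append(entries[index])
        index += 1
    }
    return matches.sorted { $0.priority > $1.priority }.prefix(limit).map { $0.word }
}
```

The binary search takes roughly 20 comparisons to find the start of a prefix range in half a million words; the cost after that depends on how many words share the prefix, so very short prefixes may need a cap or a precomputed top list.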
It is also worth reading up on other search algorithms.

Related

Recommendations for Light Summary Algorithm on ionic

I'm trying to upload big texts to my database. In this case they can't be stored as text files, because they are generated dynamically on each use of the app, and that method could create hundreds of files.
The main problem is that MySQL limits the number of characters allowed.
I saw that some people recommend MD5 or SHA to reduce a big text to a short string, but that only works one way: the original text can't be recovered from the hash.
Is there another way to do this, perhaps with some other light summary algorithm?

Huge dictionary with random selections: iPhone Dev

I am making an app that will have a very large dictionary of words that I choose (so that the words aren't too complicated), and I want it to pick words at random. I don't have a problem with the random selection itself, but what would be the best way to store all these words, and how? I feel like using an NSMutableArray would take up too much memory by creating thousands of objects, so what else can I use? Thanks for your help.
Core Data is your best option, or you can manage your own SQLite database.
Check a Core Data tutorial or a SQLite on iOS tutorial.
If all the application needs to do is access words at random (so no key-based queries or updates), an alternative to Core Data and SQLite would be to just fseek() to a random location in a flat text file of newline-delimited words and then read out the next complete word, possibly with fscanf(dict,"%s\n%s\n",partial_word,full_word).
Deal with EOF by retrying with a different random number, or limit the fseek() range so that it never hits the last word in the file.
An issue with the above outline is words won't be uniformly selected. There is a bias towards words following long words. Discarding strlen(partial_word) (or a larger random number) of words before keeping a word might help the distribution if it is a concern.
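The seek-and-read outline above could look roughly like this in Swift. This is a sketch under stated assumptions: it uses Foundation's FileHandle in place of fseek() and fscanf(), the function name randomWord and its parameters maxWordBytes and attempts are hypothetical, and every word is assumed to fit in maxWordBytes bytes of UTF-8. It does not correct the bias towards words that follow long words.

```swift
import Foundation

/// Returns a pseudo-random word from a newline-delimited word file at `path`,
/// or nil if the file cannot be opened or no word is found after `attempts` tries.
func randomWord(atPath path: String, maxWordBytes: Int = 64, attempts: Int = 20) -> String? {
    guard let handle = FileHandle(forReadingAtPath: path) else { return nil }
    defer { handle.closeFile() }

    let size = handle.seekToEndOfFile()
    guard size > 0 else { return nil }

    for _ in 0..<attempts {
        // Jump to a random byte, most likely in the middle of a word.
        handle.seek(toFileOffset: UInt64.random(in: 0..<size))
        // Read enough bytes to cover the rest of that word plus one full word.
        let chunk = handle.readData(ofLength: 2 * (maxWordBytes + 1))
        let lines = chunk.split(separator: UInt8(ascii: "\n"), omittingEmptySubsequences: false)
        // lines[0] is the partial word and is discarded, so the first word in the
        // file is never returned. lines[1] is complete only if a newline follows it.
        guard lines.count >= 3, !lines[1].isEmpty else { continue } // near EOF: retry
        if let word = String(data: lines[1], encoding: .utf8) { return word }
    }
    return nil
}
```

Only a couple of hundred bytes are read per call, so memory use stays flat no matter how large the word file is.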

Will this implementation affect the user experience?

I have been assigned the task of implementing a feature that shortens typed text.
For example, if I type "you" and then highlight it, it has to change to "u".
I will have a table that maps each longer word to its shortened replacement. Whenever a user types a word and highlights it, I want to query the DB for a match; if a match is found, I want to replace the word with the shortened version.
This is not my idea; I have just been assigned the implementation.
I think this functionality will slow down the app's responsiveness, and it also has some drawbacks for the user-friendliness of the application.
So I'd like to hear your opinions on what disadvantages it has and how I can implement it in a better manner. Or is it OK to have this kind of functionality? Won't it affect the app's speed?
It's hard to imagine that you'll see a noticeable decrease in performance. Even the iPhone 3G's processor runs at around 400MHz, and someone typing really fast on an iPhone might get four or five characters entered in a second. A simple implementation of the sort of thing you're talking about would involve a lookup in a data structure such as a dictionary, tree, or database, and you ought to be able to do that pretty quickly.
Why not try it? Implement the simplest thing you can think of and measure its performance. For the purpose of measuring, you might want to use a loop to repeatedly look up words from an array. Count the number of lookups you can do in, say, 20 seconds, and divide by 20 to get the average number per second.
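A minimal sketch of that measurement in Swift, assuming the replacement table is small enough to hold in an in-memory dictionary rather than a database. The table contents and the names replacements, shortened and lookupsPerSecond are hypothetical.

```swift
import Foundation

// Hypothetical replacement table: longer word -> shortened form.
let replacements: [String: String] = [
    "you": "u",
    "are": "r",
    "please": "pls",
    "thanks": "thx"
]

/// Returns the shortened form of `word`, or `word` unchanged when there is no match.
func shortened(_ word: String) -> String {
    return replacements[word.lowercased()] ?? word
}

/// Looks up `samples` over and over for about `seconds` seconds
/// and returns the average number of lookups per second.
func lookupsPerSecond(samples: [String], seconds: TimeInterval) -> Double {
    precondition(!samples.isEmpty && seconds > 0)
    let start = Date()
    var count = 0
    var elapsed: TimeInterval = 0
    repeat {
        for word in samples { _ = shortened(word) }
        count += samples.count
        elapsed = Date().timeIntervalSince(start)
    } while elapsed < seconds
    return Double(count) / elapsed
}
```

Calling lookupsPerSecond(samples: ["you", "hello", "thanks"], seconds: 20) mirrors the 20-second run suggested above; the result can then be compared with the four or five characters per second a fast typist produces.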
I don't think it will cost much performance. In any case, you can use the profiler to check how long each method takes. As for the functionality, I believe you should give the user the opportunity to "undo" and keep their own word (the same as Apple's autocorrection).

Is it possible to create a dictionary in PostgreSQL dynamically?

I'm new to full text search in PostgreSQL and discovered things like Dictionaries and stop words in it.
I have a table with a lot of words from many texts. I want to create my own dictionary and put the first 30 most frequent words as stop words.
Is it possible to do this at runtime?
Anything is possible. Not everything is feasible.
What you can do without too much difficulty is create a stored procedure in a language like pl/perlU which breaks up the words, analyzes them, and writes stop words to a file. You'd have to do a pg_ctl reload in order to ensure that the new stop words file was used. However I don't think you can dynamically determine stop words at search time because if you search through the strings to look for stop words, there isn't much point in then having full text searching.
The actual stop words file is just a new-line separated list of words. Also I think you'd need to start with a template for stemming purposes. Trying to dynamically discover stemming would be both difficult and error-prone.

Full Text Searching in Apple's Core Data Framework

I would like to implement a full text search in an iPhone application. I have data stored in an sqlite database that I access via the Core Data framework. Just using predicates and ORing a bunch of "contains[cd]" phrases for every search word and column does not work well at all.
What have you done that seems to work well?
We have FTS3 working very nicely on 150,000+ records. We are getting subsecond query times returning over 200 results on a single keyword query.
Presently the only way to get SQLite FTS3 working on the iPhone is to compile your own binary and link it to your project. To my knowledge, the binary included in your own project will not work with Core Data. Perhaps Apple will turn on the FTS3 compiler option in a future release?
You can still link in your own SQLite FTS3 binary and use it just for full text searches. This would be very similar to the way Sphinx or Lucene are used in web app environments. Note that you will still have to update the search index at some point to keep it in sync with the Core Data stores.
Good luck !!
I assume that by "does not work well" you mean 'performs badly'. Full-text search is always relatively slow, especially in memory or space constrained environments. You may be able to speed things up by making sure the attributes you're searching against are indexed and using BEGINSWITH[cd] instead of CONTAINS[cd]. My recollection (can't find the cocoa-dev post at this time) is that SQLite will use the index for prefix matching, but falls back to linear search for infix searches.
I use contains[cd] in my predicate and this works fine. Perhaps you could post your predicate and we could see if there's an obvious fault.
SQLite has its own full text indexing module: http://sqlite.org/fts3.html
You have to have full control of the SQL you send to the db (I don't know how Core Data works), but using the full text indexing module is key to speed of execution and simplicity in your SQL SELECT statements that do full text searching.
Using CONTAINS is fine if you don't need fast execution, but selects made with it can't make use of regular indexes so are destined to be slow, and the larger the database the slower it will be. Using real full text indexing allows same sort of searches as you can do with 'CONTAINS', but things are indexed for fast results even with large db's.
I've been working on this same problem and just got around to following up on my post about this from a few weeks ago. Instead of using CONTAINS, I created a separate entity with an instance for each canonicalized word. I added an index on the words (in the Xcode model builder) and can then use a BEGINSWITH operator to exploit the index. Nevertheless, as I just posted a few minutes ago, query time is still very slow even for small data sets.
There must be a better way! After all, we see this sort of full text search in lots of apps!