Application (search) that supports misspellings - iPhone

I'm creating a simple application, and it's my first time working with SQLite.
I want a regular search box that displays results as the user types. Is this possible?
Also, how can I make the search tolerate misspellings? If I type "Canda", it should find "Canada", "Candy", or whatever similar entries I have in the DB.
Any help or literature would be appreciated.

Concerning (1), you should take a look at UISearchDisplayController. In short, it lets you easily search and display results as you type.
For (2), my first thought would be to represent the "likely" misspellings in your data model. It would also be interesting to augment this list of misspellings over time. Apple's own SMS app does something like this, learning from your misspellings as you go along.

UISearchDisplayController will kick off searches as the user types, continuously narrowing down the results.
To the best of my knowledge, fuzzy matching is not directly supported in SQLite, even though the feature has been requested numerous times.
It requires a quite complex algorithm to do this effectively, as detailed in this Wikipedia article.
There is a module called fuzzystrmatch for PostgreSQL, but it does have its limitations, especially with multibyte strings (like UTF-8), which are the native string format for SQLite. This might give you a hint of a possible implementation.
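Since SQLite does not do the fuzzy matching itself, one option is to compute an edit distance in application code over the candidate strings. The following is a minimal sketch in Python, not the method of any particular library: it assumes the candidates fit in memory, uses plain Levenshtein distance, and the names `edit_distance`, `fuzzy_search` and the `max_distance` threshold are made up for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def fuzzy_search(query, candidates, max_distance=2):
    """Return candidates within max_distance edits of query, closest first."""
    q = query.lower()
    scored = [(edit_distance(q, c.lower()), c) for c in candidates]
    return [c for distance, c in sorted(scored) if distance <= max_distance]
```

With the example from the question, `fuzzy_search("Canda", ["Canada", "Candy", "Mexico"])` keeps "Canada" and "Candy" and drops "Mexico". Because Python strings are sequences of code points, this avoids the byte-level multibyte problem mentioned above, but it still scans every candidate, so it suits small tables best.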

Related

Autocomplete with Natural language

My project needs some natural language processing. I'm completely new to the field.
What I'm trying to achieve is this: when the user enters the description of a product, I look in my database for the nearest description and suggest its category, product group and sub-group (the tree of the product).
For this, I have 250 product titles extracted for each subgroup.
What is the specific term in NLP for doing this? I tried googling for a while, but had no luck since I don't know the term. Are there any good tutorials to start with? Are there any good libraries for this specific task?
Thank you.

From what I can tell, autocomplete or text prediction/predictive search isn't really a big research area in NLP. It wasn't even covered in any of my graduate-level classes, and I do research in this area. I think the reason is that existing solutions are good enough for the vast majority of real-world problems.
I'm not sure which language you work in, but the library you want is probably Lucene if you are dealing with Java, or perhaps a Solr instance if this is a general problem for you and you are dealing with a large number of ontologies.
You can find some reasonable tutorials/examples here on Stack Overflow, such as:
How to implements auto suggest using Lucene's new AnalyzingInfixSuggester API?

Creating a generic server-side validation function in Coldfusion

So, I've been trying to clean up my code and learn things that I should always do...well of course server-side validation is one of those things that I should always do. However, what happens when I have this huge form? I really would like to have a generic function that allows me to pass the data type and field name. Is there a secure way to do this in Coldfusion?
I've been looking into doing this for a while, but I've come to dead ends and can't find any info on doing something like this on the web. It seems like Coldfusion does not offer this ability.
However, I think it would be cool if there was a way to specify an attribute in your input tags that had the data type of the field. Then, it would be uber nice if Coldfusion stuck it into a struct for you with your field names.
Is there any way to accomplish this, or can someone elaborate on the most efficient way to do server-side validation?

That would be great if CF had something like that! Good news, it does, for years now! :)
What you're looking for is cfinput (and cfform) tag. This tag includes the validation specifics right in the tag like you're wanting (great minds think alike, right?). You can specify the validation, the error message, if it should validate client or server side - all kinds of neat tricks.
Check here for implementation - it's quite easy to use:
http://livedocs.adobe.com/coldfusion/8/htmldocs/help.html?content=Tags_i_07.html
Be warned that a lot of code divas hate cfform / cfinput. In reality, there is nothing wrong with them when implemented correctly. They can be abused and they won't fit every solution, but that is true of everything in the toolbox. By and large, they work great for most form input and validation situations.
If you hate that idea, another option is to use the built-in type attribute of cfparam and catch your errors.
For example, at the top of your form processing page, you can write:
<cfparam name="form.cardNumber" type="creditcard">
When this is reached, if the value in that variable is not of that type, it will throw an exception that you can catch. This keeps you from having to write the if() and pattern matching. Additionally, if there isn't a type built in, you can specify a regular expression for pattern matching.
Here's some more information and the types supported:
http://livedocs.adobe.com/coldfusion/8/htmldocs/help.html?content=Tags_p-q_01.html
Let me know what you think!

I would encourage you to look at using a ColdFusion framework like CFWheels (or ColdBox), which has a lot of this type of functionality already built in and makes development a TON easier. Using CFWheels has been one of the best decisions I've made as a developer, and my development skills have grown significantly over the past year. There are some great screencasts online to get started. http://cfwheels.org/screencasts

How to model multilingual database with Zend, i18n mysql?

I know this topic has been discussed a couple of times, but none of those discussions offers the ultimate solution for me.
Situation
I'm designing a relational MySQL database which later should hold multilingual content. You know this from Wikipedia or the Microsoft Tech Support pages. The content should be the same for every language, e.g. if translations are missing, the site offers you the same content automatically translated or in the languages in which the information is available. If some values are not set, it should fall back to the second or default browser language, or translate the content, e.g. through Google. The development environment is Zend.
My ideas so far for solving the problem:
Two Primary Keys: (ID, Language)
Advantage: Easy Database Access through database abstraction layers.
Problem: Foreign Keys, Relationships, Fallbacks
Columns with language suffix:
Advantage: DB Performance, No relational Problems.
Problem: Database abstraction layers cannot handle this?
Has any concept proven itself or is preferable over the other? Has anyone already created something like this and can share his experience with me? Does a modified Zend DB Controller exist for this situation? How do you link this information to a form?
Thank you for your help, hints and suggestions!
Kind regards,
Manuel

The second option would not be maintainable (this should be added on the minus side). To add another language you would need to modify the table and the abstraction layers. Sounds like a nightmare.
The first option seems much more promising, but unfortunately there is a lot to do to make it work. However, in my experience this is a rather typical solution, so I would not reinvent the wheel.
What I have to add is that language fallback should be done on the Zend side, because the database would be missing some information. You may think of some kind of index table that holds information such as the unique id of the content and the available languages. If you need to serve something, you would read such a record, compare it against the Accept-Language header, and ask the database again for valid content (using the most suitable language). The only problem is that you would need to create such an index table somehow (the best way I see would be a trigger on inserting content into your content table).
A lot of work but the problem is not too easy.
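The fallback step described above (compare the languages available for a piece of content against what the browser accepts) can be sketched as follows. This is an illustration in Python rather than Zend code, under stated assumptions: the list of available languages is assumed to come from the index table, the header parsing is simplified, and the names `parse_accept_language` and `pick_language` are hypothetical.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, most preferred first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without an explicit q value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Sort by descending quality, then by original position in the header.
        weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]


def pick_language(available, accept_header, default="en"):
    """Pick the best available language, falling back to default, else None."""
    available_set = {lang.lower() for lang in available}
    for tag in parse_accept_language(accept_header):
        if tag in available_set:
            return tag
        primary = tag.split("-")[0]  # e.g. "de-DE" falls back to "de"
        if primary in available_set:
            return primary
    return default if default in available_set else None
```

For content available in "en" and "fr", `pick_language(["en", "fr"], "de-DE,de;q=0.9,fr;q=0.8,en;q=0.7")` returns "fr". A result of `None` is the point where automatic translation or a "not available" notice would take over.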

I am working on the exact same problem right now.
Somehow it does not make sense to me to put everything into the same database. Let's say I want to go to the extreme and support some 50 languages: this would just bloat my DB. So I tend to keep my main DB in my main language and then introduce some Zend_Translate concept into it. Zend_Translate should give you the fallback solution you are looking for. The main navigation and core design are not much of a problem for my web site; my biggest concern right now is how to store all the main content and how to translate it, because these elements contain HTML among other things. For the main content I will probably use an alternate approach and a separate DB with tables for each language.

My platform will be a community-driven database, so I'm actually going to rely on humans translating it. You have to store the information anyway, so my first concern is not database size or performance, but ease of use. So far my idea is to implement a structure as described above; I'm not yet sure whether I'll do it in Doctrine or not.
Language decision:
At start, the application gets the user's preset language, secondary language, and English or the mother tongue of the article. When fetching the article from the database, I will check the following for every column: 1. Is the primary language available? 2. Is the secondary language available? 3. If neither, display the article in its mother tongue or English and offer the user the chance to translate it, with suggestions from the Google Translate API. I guess it's going to be quite a bit of coding and manipulating controllers, or building a business model that does this.
#tawfekov, is something like this or similar easily realizable with Doctrine?

Implementing full text search on iPhone?

I'm looking for suggestions on the best way to implement a full-text search on some static data on the iPhone.
Basically I have an app that contains the offline version of a web site, about 50MB of text, and I'd like users to be able to search for terms. I figure that I should somehow build a table of ("word", reference_to_file_containing_word) or something, put that into either Core Data or just SQLite, index the "word" column, then have the search facility look up each search term in the table and take the intersection of the result sets.
That wouldn't allow people to search for phrases but it would be pretty easy and probably not too slow.
I'd like to just use existing SDK features for this. Should I use Core Data or sqlite?
Does anyone have any other ideas on how this could be done?
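The word table described above can be sketched roughly like this. It is a Python illustration of the idea using the standard `sqlite3` module, not iPhone SDK code; the table name `word_index`, the function names, and the simple `\w+` tokenizer are assumptions made for the example.

```python
import re
import sqlite3


def build_index(documents):
    """Build an in-memory (word, doc) table from a {doc_name: text} mapping."""
    conn = sqlite3.connect(":memory:")
    # The composite primary key doubles as an index whose leading column is "word".
    conn.execute(
        "CREATE TABLE word_index ("
        "word TEXT NOT NULL, doc TEXT NOT NULL, PRIMARY KEY (word, doc))"
    )
    for doc, text in documents.items():
        words = set(re.findall(r"\w+", text.lower()))  # one row per distinct word
        conn.executemany(
            "INSERT INTO word_index (word, doc) VALUES (?, ?)",
            [(word, doc) for word in words],
        )
    conn.commit()
    return conn


def search(conn, query):
    """Return the set of documents that contain every term in the query."""
    terms = re.findall(r"\w+", query.lower())
    if not terms:
        return set()
    result = None
    for term in terms:
        rows = conn.execute("SELECT doc FROM word_index WHERE word = ?", (term,))
        docs = {row[0] for row in rows}
        result = docs if result is None else result & docs  # intersect per term
    return result
```

As noted in the question, this finds documents containing all the terms but cannot match phrases, since word positions are not stored.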

You want to place every word in the document in its own row in a database? That's going to take up even more space than the document itself.
I would recommend just searching through the text; regex is actually pretty fast. Otherwise, you could implement Boyer-Moore fairly easily.
[Edit] If you insist on creating an index of words, you can't beat a trie. It would be faster than using a database, and most likely take up less space than the documents themselves (unlike the database)
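For reference, a trie that maps words to the files containing them might look like the sketch below. It is a minimal Python illustration, not a tuned implementation: the class and method names are hypothetical, and the memory claims above are not measured here.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # next character -> TrieNode
        self.files = set()  # files containing the word that ends at this node


class WordTrie:
    def __init__(self):
        self.root = TrieNode()

    def add(self, word, filename):
        """Record that filename contains word."""
        node = self.root
        for ch in word.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.files.add(filename)

    def _walk(self, text):
        """Follow text from the root; return the node reached, or None."""
        node = self.root
        for ch in text.lower():
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def lookup(self, word):
        """Files containing exactly this word."""
        node = self._walk(word)
        return set(node.files) if node is not None else set()

    def lookup_prefix(self, prefix):
        """Files containing any word that starts with prefix."""
        start = self._walk(prefix)
        found = set()
        stack = [start] if start is not None else []
        while stack:
            node = stack.pop()
            found |= node.files
            stack.extend(node.children.values())
        return found
```

The prefix lookup is what makes a trie attractive for search-as-you-type: each extra keystroke only walks one level deeper.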

The answer is FTS3 for SQLite. Google it, there are many tutorials on how to get it working on iPhone.
And the easy way to use SQLite on iPhone is using FMDB.

Guidance needed in Writing Specifications [closed]

Closed 10 years ago.
I was asked (at a place I just began working) to create simple specs for some new functionality that is going to be added to an existing Registration system. I need a little help since I've never done this before.
Here are two diagrams that show the current workflow and the new workflow.
Current Workflow: http://img80.imageshack.us/img80/102/currentworkflow.png
New Workflow http://img245.imageshack.us/img245/6748/newworkflow.png
I know they might be a bit vague but here's what's basically happening.
We are adding a new import form to an existing windows application.
We are modifying an existing form by adding a search button which will search and populate data read by an OCR.
I'm a new developer and I'm pretty bad at writing documents in general, but I would like to improve on this. Maybe some examples of how to go about writing something like this would be helpful. I've googled for some examples, but most of the ones I've found are about creating a brand new system. I need something that shows how to write one for modifying an existing system.
Here's my attempt at a specification. Maybe someone can critique it. At least then I will know what I need to improve. http://cid-ddb3f6a92ec2b97e.skydrive.live.com/self.aspx/.Public/Specs.docx
Thanks

I love writing specs (I'm a rare one in my company).
Diagrams are a good way to go, but for the more literally minded I start with a full specification template that has a ton of headings in it. For a new system, you'd generally have something to say for every one. In your case you've specifically mentioned it's an existing app you're modifying, but the point is not to fill out all of the headings. The point is to think about them, and then delete them after due consideration. For example:
Business Requirements (short synopsis of the need, as explained to the business, non-technical users)
Use Cases (usually for bigger specs only)
Functional Requirements
Overview
Flowcharts etc.
Configuration
Error Reporting
Testing
Documentation
Training
Assumptions and Additional Constraints
Third-party Software Requirements
Internationalization
Expandability (e.g. for bits that might need to plug-in to others etc)
Customization
Questions (for questions that still need to be answered by someone to finish the spec)
Also, if it's really technical, you might need introduction sections for:
- Target Audience
- Terminology
- Examples
All of this is generally overkill for all but the largest of designs. But even for a modification, I'd go through every item and consider whether I need to write anything or not. I think this is where a lot of the value of writing a spec comes from: the process of creation. In other words, trying to be thorough and not miss too much. All the benefits that come afterwards, like being able to do estimates or explain the functionality to others, are nice side-effects. As long as the spec doesn't end up completely garbled, and suits your company okay, I think that's more important than its specific appearance, format or contents.
EDIT: Comments on your specification
I think you've done a reasonable job here. Most developers should be able to take the spec and produce something sensible, and most business analysts should be able to look at the spec and work out what it does and how it works. In my comments below, keep in mind that there's always a trade-off between how detailed you want the spec and how much time you have. I tend to believe the more detailed the spec the more time everyone saves, but that's not the case for everyone.
If you want this to be clearly understood by a business user (e.g. the customer), then the Objective section could maybe contain a sentence or two describing the problem it solves. In other words not what it will achieve, but why.
It's worth explicitly naming the intermediate staging table here. At the very least it means if someone comes back to the spec a year from now, they know exactly where to look in the database.
Minor point: in my experience screenshots that contain unrealistic data are harder to understand. Instead of showing "My Sample Form", "Name", "Address" etc, it'd perhaps be easier to understand with some sensible data. Can still be fake to protect the customer's data, like "123 Fake St" etc. Not a huge deal though.
It's not clear what will happen when something goes wrong. Are there to be any checks that the data in the staging table is in a valid format? If not, is the user given an error message, or otherwise logged somewhere? One error per row of invalid data, or one for the whole batch? The form consists of a single button - something I think we can agree isn't the world's greatest UI, but I understand sometimes these things happen - perhaps it could be enhanced with a logging window to show the results of the import. The answers might be straightforward, but the developer needs to know what they are.
Perhaps not an issue depending on how much data there is, but if there was a lot and it will take a while, it might be worth having a progress bar. Or, mention if the data will be imported in stages.
Would it be worth mentioning the definition of the permanent table to which data is moved? Are all fields moved from the staging table to the permanent table, or only some? If only some, can you show what maps to what? If the permanent table has different data lengths - for example if Address Street is a Varchar(30) - what would happen if the data won't fit? Again, perhaps simple answers, but ones that would be very usefully answered here.
Perhaps worth mentioning whether the data will be imported in a single transaction or not. If the data import fails partway through, is everything rolled back, or is half the imported data left imported?
If another developer will be doing this work, I think they're far more likely to get the work right if you mock up / draw the screens for them. Even if it's just a form with one button, and even though I can take a good guess at what your search pop-up form will look like, I would make no mistakes if I knew exactly what it's supposed to look like. Tools like Balsamiq Mockups (and see examples here) are wonderful for quick mocks, though the default "comic sans" look may not sit well with managers. I'd rather have a dirty mockup than none at all though. (Note: the free version of Balsamiq doesn't let you save images, but you can achieve the same with the export/import functionality. Also you can't save to an image file like PNG etc, but you can use a screen-capture program to take a picture of what you draw.)
Minor point: I try to avoid personal pronouns like "I", "we", "our", just to make it a little more professional and better for customers to read if necessary. I only noticed one "our", so you've mostly got it right in terms of tone here.
Minor point: are varchars enough or will there be non-standard characters in there that require unicode (i.e. nvarchar)?
It's less clear to me what's happening in the Voter Add/Update Form, but I don't have knowledge of your application - maybe everyone involved will say "oh right, I get it". For example I don't understand the relevance of "ImpRecord001" and "ImpRecord002" - would it be worth mentioning in the design what these batch codes actually mean in the real world?
Is the "Search Data" button the same as the "Search OCR" button?

For any document: first consider why you are writing it. Who will read it, and what do they need to know? How much detail is appropriate? Another couple of general ideas:
It may then be useful to think about the sources of information that go into what you are writing. One result of that might be that you make sure that what you write can be verified. If, for example, an information source is a person (for IT docs especially, it might be a non-IT person telling you stuff), then you may want to be quite careful about how you present some information so that the "source" can also understand what you are saying.
Also consider carefully what comes after the current document. For example might a test plan be written on the basis of what you write? This might lead you to present information in tables that quite naturally get expanded to test cases.
So to your specific question. What do you mean by "spec"? The workflow you give isn't enough for a user to look at and agree "Yes please, that's what I want". It's not enough for someone to write some code. I'm thinking you might need several documents.
1). Some kind of requirements doc. One format you might use is a storyboard. This focuses on what the user can see and do. Exactly what data is shown on each screen. If there are computations underlying what's displayed you may need to have appendices describing these. This doc is read by both users and developers. Powerpoint or Word could be used.
2). From that you might derive some explicit data models. Item-by-item, field-by-field: data types, sizes, validation etc. I might use a data modelling tool, or UML, or just a spreadsheet. The primary audience is the developer, but ideally a user (or a business analyst intermediary) could verify the details. [If you don't have a business analyst, you probably are the business analyst :-) ]
3). More technical, a spec for the developers referring to items 1 & 2. A decomposition of the implementation. Names of modules, packages, classes or whatever you are using. Definitions of transformations, algorithms and calculations. A more technical doc. I would use UML, but any precise form of capture would do. This is where we might really drill down into what some of the detailed boxes in your workflow mean.
As has been observed, in general we also need to make sure the developer understands the non-functional requirements, such as security and data volumes. In your situation this may be implicitly understood, so possibly you may not need it now ... in some other life you may, so it might be a good idea to at least have a one-liner in place to remind you for the future.

Those are an excellent start for a spec.
I would add to them by creating mock screen shots of what you want the windows application to look like.
On top of that you can add the details of each data field, and what the allowed values are.
Include details of any exceptions you can think of, and how you want errors reported.
You might also want to consider what sort of reporting, and security/auditing you need, as these will need to be included in the design.
Finally, it's worthwhile to sit down with the developer and talk them through the process, going through each step, as I'm sure further details will be needed.

Some of the steps down at the bottom are a bit wordy. Try splitting them up and make sure the word IF never appears. IFs should be designated by using a diamond and splitting out the flow paths based on the conditional.
