I want to develop a system for detecting and preventing fraudulent bank transactions using complex event processing. I've been researching, and it looks like Markov chains could help me with this.
What are the general steps and data flow for developing such a system?
I'm not looking for complete answers, just the generic steps, so I can research and find my own answers.
Given this scenario, what exactly can I predict with a Markov chain?
I know it's not a specific question, but I'm not looking for specific answers.
You need full insight into all possible parameters and their dependencies. Every parameter requires one dimension, so the calculations become rather large. There are other algorithms that are much simpler and more elegant.
Markov chains are better suited to problems with fewer parameters. For scenarios like this you should try other methods, such as support vector machines.
Also have a look into more elegant solutions such as Benford's Law.
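As a rough illustration of the Benford's Law suggestion, here is a minimal Python sketch. It compares the leading digits of transaction amounts with the Benford distribution P(d) = log10(1 + 1/d). The function names and the chi-square-style score are my own choices for illustration, not part of any particular fraud-detection library, and a high score is only a hint that amounts deserve a closer look.

```python
import math
from collections import Counter

def benford_expected():
    """Expected leading-digit probabilities under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(amount):
    """Leading non-zero digit of a non-zero amount."""
    # Scientific notation puts the leading digit in the first character.
    return int(f"{abs(amount):.15e}"[0])

def benford_deviation(amounts):
    """Chi-square-style distance between observed leading digits and Benford."""
    digits = [first_digit(a) for a in amounts if a != 0]
    counts = Counter(digits)
    n = len(digits)
    return sum((counts.get(d, 0) - n * p) ** 2 / (n * p)
               for d, p in benford_expected().items())

# Hypothetical data: powers of two follow Benford closely,
# while amounts that all start with 9 do not.
natural = [2 ** k for k in range(1, 200)]
suspicious = [900 + i for i in range(99)]
print(benford_deviation(natural), benford_deviation(suspicious))
```

Whether real transaction amounts follow Benford's Law at all depends on the data (amounts with fixed price points or hard limits often do not), so this would need checking against known-good history first.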
This is a question that possibly borders on the intersection of the general usage of MATLAB and/or signal processing. Thought I would first ask the question in a MATLAB forum before trying signal processing.
So our lecturer read out his notes/paper and said the equation
could be implemented as a filter.
At first the idea seemed difficult to follow, but it made a bit of sense once I realized that integration is the same as finding the area under the curve, which seems similar to applying a low-pass filter so that only the portion of the signal under the threshold is allowed to pass through. But how, that is, with which function, can I implement the above equation? Do I need three filters or can I use just one? How do I use the terms preceding the integrals in the filter?
Thanks in advance
I am developing an application that needs to analyze the incoming frequency from the built-in microphone on the iPhone/iPad. I know that I need to use an FFT, and I have found a framework that can help me with that. My only concern is whether there is code or a framework that includes band-pass filtering. Suggestions are welcome.
EDIT
Pardon my ignorance. I previously posted that I wanted to use just a band-pass equation, before I found out that a band-pass filter combines low-pass and high-pass filters. I still welcome suggestions.
You can always do this yourself using a biquad filter.
Here's a great document explaining how they work and what coefficients you need to plug in to create a bandpass filter: http://musicweb.ucsd.edu/~tre/biquad.pdf
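To make the biquad idea concrete, here is a minimal sketch in Python (chosen for readability; the arithmetic ports directly to C or Objective-C). It assumes the widely used Audio EQ Cookbook band-pass formulas with 0 dB peak gain, which may be written differently from the ones in the linked document. The function names are mine.

```python
import math

def bandpass_coefficients(center_hz, q, sample_rate):
    """Band-pass biquad coefficients (0 dB peak gain), normalized so a0 == 1."""
    w0 = 2 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2 * math.cos(w0) / a0, (1 - alpha) / a0)
    return b, a

def biquad_filter(samples, b, a):
    """Direct Form I:
    y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
    """
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x   # shift input history
        y2, y1 = y1, y   # shift output history
        out.append(y)
    return out

# Example: keep a band around 1 kHz at a 44.1 kHz sample rate.
b, a = bandpass_coefficients(1000.0, 5.0, 44100)
filtered = biquad_filter([0.0, 1.0, 0.0, -1.0] * 100, b, a)
```

A higher Q gives a narrower band. Only five multiplies per sample are needed, which is why this form is popular on mobile devices.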
On iOS 4.x, there is the built-in Accelerate vDSP framework for FFT and convolution. But unless you want to build on top of the FFT or convolution routines, there is nothing built-in for band-pass filtering. Fast convolution filtering using an FFT for overlap add/save can be very efficient, depending on your filter kernel requirements and the signal length.
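For reference, overlap-add fast convolution can be sketched as follows. This is plain Python with a small recursive radix-2 FFT so that it is self-contained; on a device you would substitute the platform's FFT routines. The function names and the default block size are illustrative assumptions only.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two. Unscaled."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def overlap_add(signal, kernel, block_size=64):
    """Filter `signal` with FIR `kernel` block by block in the frequency domain."""
    m = len(kernel)
    n_fft = 1
    while n_fft < block_size + m - 1:   # room for a full linear convolution
        n_fft *= 2
    kernel_f = fft(list(kernel) + [0.0] * (n_fft - m))
    out = [0.0] * (len(signal) + m - 1)
    for start in range(0, len(signal), block_size):
        block = list(signal[start:start + block_size])
        block_f = fft(block + [0.0] * (n_fft - len(block)))
        product = [p * q for p, q in zip(block_f, kernel_f)]
        y = fft(product, inverse=True)
        # Add this block's result; its tail overlaps the next block's head.
        for i in range(min(n_fft, len(out) - start)):
            out[start + i] += y[i].real / n_fft
    return out
```

The kernel is transformed once and reused for every block, which is where the savings over direct convolution come from when the kernel is long.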
I've noticed that Apple's accelerometer sample code uses both regular and adaptive low-pass filters. What is the difference?
They are both first-order IIR low-pass filters (simple, but laggy in responsiveness compared with other DSP techniques). The adaptive filter switches to a higher roll-off frequency (and thus smooths less but responds faster) for larger accelerations.
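The difference can be sketched in a few lines. This is a simplified one-axis Python illustration of the idea, not Apple's sample code; the `min_step` and `attenuation` parameters and their default values are assumptions made for the example.

```python
import math

class LowPassFilter:
    """First-order IIR low-pass: y += alpha * (x - y)."""
    def __init__(self, sample_rate, cutoff_hz):
        dt = 1.0 / sample_rate
        rc = 1.0 / (2 * math.pi * cutoff_hz)
        self.alpha = dt / (dt + rc)
        self.value = 0.0

    def update(self, x):
        self.value += self.alpha * (x - self.value)
        return self.value

class AdaptiveLowPassFilter(LowPassFilter):
    """Same filter, but alpha grows when the input jumps by a large amount."""
    def __init__(self, sample_rate, cutoff_hz, min_step=0.02, attenuation=3.0):
        super().__init__(sample_rate, cutoff_hz)
        self.min_step = min_step        # changes below this are treated as noise
        self.attenuation = attenuation  # extra smoothing applied to small changes

    def update(self, x):
        # d is 0 for small changes and ramps up to 1 for large ones.
        d = min(max(abs(x - self.value) / self.min_step - 1.0, 0.0), 1.0)
        alpha = (1.0 - d) * self.alpha / self.attenuation + d * self.alpha
        self.value += alpha * (x - self.value)
        return self.value
```

With these settings the adaptive filter smooths small jitter more heavily and follows large movements as quickly as the plain filter does.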
There are other higher quality (and more complicated) DSP filtering techniques for motion sensing often used by portable game developers.
In general, any adaptive filter will adjust itself based on the input signal. I'm not sure if you'll notice much difference in practice. Just try it out and see if one gives better response to what you're trying to do.
IIR Adaptive has a flatter response curve which means you have a higher fidelity output. Or, in other words, what you put in is what you get out.
I find this question a little tricky. Maybe someone knows an approach to answering it. Imagine that you have a dataset (training data) and you don't know what it is about. Which features of the training data would you look at in order to choose a classification algorithm for it? Can we say anything about whether we should use a linear or non-linear classification algorithm?
By the way, I am using WEKA to analyze the data.
Any suggestions?
Thank you.
This is in fact two questions in one ;-)
Feature selection
Linear or not
Add "algorithm selection" and you probably have the three most fundamental questions of classifier design.
As an aside, it's a good thing that you do not have any domain expertise that would have allowed you to guide the selection of features and/or to assert the linearity of the feature space. That's the fun of data mining: inferring such information without a priori expertise. (BTW, while domain expertise is good for double-checking the outcome of the classifier, too much a priori insight may make you miss good mining opportunities.) Without any such a priori knowledge you are forced to establish sound methodologies and apply careful scrutiny to the results.
It's hard to provide specific guidance, in part because many details are left out of the question, and also because I'm somewhat BS-ing my way through this ;-). Nevertheless, I hope the following generic advice will be helpful.
For each algorithm you try (or more precisely for each set of parameters for a given algorithm), you will need to run many tests. Theory can be very helpful, but there will remain a lot of "trial and error". You'll find Cross-Validation a valuable technique.
In a nutshell [and depending on the size of the available training data], you randomly split the training data into several parts, train the classifier on one [or several] of these parts, and then evaluate its performance on another [or several] parts. For each such run you measure various indicators of performance, such as the misclassification error (MCE). Aside from telling you how the classifier performs, these metrics, or rather their variability, provide hints as to the relevance of the features selected and/or their lack of scale or linearity.
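The cross-validation loop described above might look like this in plain Python. The nearest-centroid classifier is only a stand-in so that the example runs on its own; `train_fn` and `predict_fn` are hypothetical hooks for whatever classifier you are actually evaluating.

```python
import random
import statistics

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(X, y, train_fn, predict_fn, k=5, seed=0):
    """Return the misclassification error measured on each held-out fold."""
    folds = k_fold_indices(len(X), k, seed)
    errors = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = train_fn([X[j] for j in train_idx], [y[j] for j in train_idx])
        wrong = sum(predict_fn(model, X[j]) != y[j] for j in test_idx)
        errors.append(wrong / len(test_idx))
    return errors

def train_centroids(X, y):
    """Stand-in classifier: one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for d, v in enumerate(row):
            acc[d] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_centroid(model, row):
    """Predict the class whose centroid is closest (squared Euclidean distance)."""
    return min(model, key=lambda c: sum((p - q) ** 2 for p, q in zip(row, model[c])))

# Toy data: two well-separated groups.
X = [[i * 0.01, 0.0] for i in range(50)] + [[10 + i * 0.01, 0.0] for i in range(50)]
y = [0] * 50 + [1] * 50
errors = cross_validate(X, y, train_centroids, predict_centroid, k=5)
print(statistics.mean(errors), statistics.pstdev(errors))
```

Looking at the spread of the per-fold errors, not just their mean, is what gives the hints about feature relevance mentioned above.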
Independently of the linearity assumption, it is useful to normalize the values of numeric features. This helps with features which have an odd range etc.
Within each dimension, establish the range within, say, 2.5 standard deviations on either side of the median, and convert the feature values to a percentage on the basis of this range.
Convert nominal attributes to binary ones, creating as many dimensions as there are distinct values of the nominal attribute. (I think many algorithm optimizers will do this for you.)
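The two preprocessing steps above can be sketched as follows. This is a minimal Python illustration; clipping values that fall outside the range is my own assumption, since the text does not say what to do with outliers, and the function names are made up for the example.

```python
import statistics

def robust_scale(values, width=2.5):
    """Map values to 0..100 over median +/- width standard deviations.

    Values outside that range are clipped (an assumption, see above).
    """
    center = statistics.median(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return [50.0 for _ in values]   # constant feature: everything mid-range
    low = center - width * spread
    high = center + width * spread
    scaled = []
    for v in values:
        clipped = min(max(v, low), high)
        scaled.append(100.0 * (clipped - low) / (high - low))
    return scaled

def one_hot(values):
    """Turn a nominal attribute into one binary column per distinct value."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded

print(robust_scale([1.0, 2.0, 3.0, 4.0, 5.0]))
print(one_hot(["red", "green", "red", "blue"]))
```

Apply the same range and the same category list to the test data that you derived from the training data, otherwise the two sets end up on different scales.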
Once you have identified one or a few classifiers with a relatively decent performance (say 33% MCE), run the same test series with such a classifier, modifying only one parameter at a time. For example, remove some features and see if the resulting lower-dimensionality classifier improves or degrades.
The loss factor is a very sensitive parameter. Try to stick with one "reasonable" but possibly suboptimal value for the bulk of the tests, and fine-tune the loss at the end.
Learn to exploit the "dump" info provided by the SVM optimizers. These results provide very valuable info as to what the optimizer "thinks".
Remember that what worked very well with a given dataset in a given domain may perform very poorly with data from another domain...
Coffee's good, but not too much. When all fails, make it Irish ;-)
Wow, so you have some training data and you don't know whether you are looking at features representing words in a document or genes in a cell, and you need to tune a classifier. Well, since you don't have any semantic information, you are going to have to do this solely by looking at statistical properties of the data sets.
First, to formulate the problem, this is more than just linear vs non-linear. If you are really looking to classify this data, what you really need to do is select a kernel function for the classifier, which may be linear or non-linear (Gaussian, polynomial, hyperbolic, etc.). In addition, each kernel function may take one or more parameters that need to be set. Determining an optimal kernel function and parameter set for a given classification problem is not really a solved problem; there are only useful heuristics, and if you google 'selecting a kernel function' or 'choose kernel function', you will be treated to many research papers proposing and testing various approaches. While there are many approaches, one of the most basic and well-travelled is to do a gradient descent on the parameters: basically you try a kernel method and a parameter set, train on half your data points and see how you do. Then you try a different set of parameters and see how you do. You move the parameters in the direction of best improvement in accuracy until you get satisfactory results.
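The try-and-move procedure described above can be sketched as a simple hill climb. Strictly speaking this is a derivative-free search and not gradient descent, since accuracy has no usable gradient with respect to kernel parameters. In this Python sketch `score_fn` is a hypothetical callback that would train on half the data with the given parameters and return the accuracy on the other half; a toy function stands in for it here, and the parameter names `c` and `gamma` are placeholders.

```python
def hill_climb(score_fn, params, step=1.0, min_step=1e-3, max_iters=1000):
    """Nudge one parameter at a time, keep any change that raises the score."""
    best = dict(params)
    best_score = score_fn(best)
    for _ in range(max_iters):
        improved = False
        for name in list(best):
            for delta in (step, -step):
                candidate = dict(best)
                candidate[name] += delta
                score = score_fn(candidate)
                if score > best_score:
                    best, best_score, improved = candidate, score, True
        if not improved:
            step /= 2           # nothing helped: search more finely
            if step < min_step:
                break
    return best, best_score

# Toy stand-in for "train on half, measure accuracy on the other half".
def toy_score(p):
    return -(p["c"] - 3.0) ** 2 - (p["gamma"] - 0.5) ** 2

best, score = hill_climb(toy_score, {"c": 0.0, "gamma": 0.0})
print(best, score)
```

Real accuracy surfaces are noisy and can have several local optima, so restarting from a few different starting points is usually worth the extra runs.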
If you don't want to go through all this complexity to find a good kernel function and simply want an answer to "linear or non-linear", then the question mainly comes down to two things. Non-linear classifiers have a higher risk of overfitting (undergeneralizing) since they have more degrees of freedom. They can suffer from merely memorizing sets of good data points rather than coming up with a good generalization. On the other hand, a linear classifier has less freedom to fit and, in the case of data that is not linearly separable, will fail to find a good decision function and suffer from high error rates.
Unfortunately, I don't know a better mathematical way to answer the question "is this data linearly separable" than to just try the classifier itself and see how it performs. For that you are going to need a smarter answer than mine.
Edit: This research paper describes an algorithm which looks like it should be able to determine how close a given data set comes to being linearly separable.
http://www2.ift.ulaval.ca/~mmarchand/publications/wcnn93aa.pdf
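As a concrete version of "just try the classifier itself", a perceptron makes a cheap first test: it is guaranteed to stop making mistakes if the data is linearly separable. This Python sketch is not the algorithm from the linked paper, and a `False` result only means that no separating hyperplane was found within the epoch budget, which is an assumption-laden proxy for "not separable".

```python
def perceptron_separates(X, y, max_epochs=1000):
    """Try to find a separating hyperplane; labels in y must be -1 or +1.

    Returns (found, weights), where weights[0] is the bias term.
    """
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(max_epochs):
        mistakes = 0
        for row, label in zip(X, y):
            activation = w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))
            if label * activation <= 0:      # misclassified (or on the boundary)
                w[0] += label
                for i, xi in enumerate(row):
                    w[i + 1] += label * xi
                mistakes += 1
        if mistakes == 0:
            return True, w
    return False, w

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(perceptron_separates(points, [-1, -1, -1, 1])[0])   # AND labels
print(perceptron_separates(points, [-1, 1, 1, -1])[0])    # XOR labels
```

On noisy real data a perfect separation is rare, so the error rate of the final weights is usually more informative than the yes/no answer.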