What should be in a model - MVVM

I am struggling with what belongs in the Model in an MVVM architecture. Suppose I have an object called a Line, which is made up of x,y data-point pairs. I want to be able to add points to the line, and I want to be able to interpolate points: for instance, return a vector of y values when the user passes in a vector of x values. To do the interpolation, the points in the Line object need to be sorted in ascending x. Should the model just be the collection of Points? Should the model handle the sorting and interpolation, or should a service handle this?

What should be in a model?
Short answer: The data
Longer answer:
The model represents the part of the application dedicated to data storage, so database access belongs in the model. A good realization of that is the Repository pattern (which, in a stricter interpretation of MVVM, belongs in the service layer).
Your ViewModel holds the dynamic presentation state. This means selection, ordering, locales, and so on.
A very good introduction to MVVM (and MVVM in action) is this video by Jason Dolinger. The source code of the example can be found here.
It is only an hour long, but it covers the aspects of MVVM very well and explains why it has been so successful.
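To make the split concrete for the Line example in the question: the Model owns the points and the domain logic (keeping them sorted, interpolating), while the ViewModel would hold presentation state such as the current selection. Below is a minimal sketch in Python (MVVM is usually discussed in C#/WPF terms; the names and the choice of linear interpolation are illustrative assumptions, not a prescribed design):

```python
# Sketch: the Model owns the points AND the domain logic (sorting,
# interpolation); the ViewModel would hold presentation state instead.
# Names and the linear-interpolation choice are illustrative assumptions.
import bisect

class Line:
    def __init__(self):
        self._points = []  # list of (x, y), kept sorted by ascending x

    def add_point(self, x, y):
        # Insert in sorted position so interpolation never needs a re-sort.
        bisect.insort(self._points, (x, y))

    def interpolate(self, xs):
        """Return a y value for each requested x (linear interpolation)."""
        return [self._interpolate_one(x) for x in xs]

    def _interpolate_one(self, x):
        i = bisect.bisect_left(self._points, (x,))
        if i == 0:
            return self._points[0][1]      # clamp below the data range
        if i == len(self._points):
            return self._points[-1][1]     # clamp above the data range
        (x0, y0), (x1, y1) = self._points[i - 1], self._points[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

line = Line()
for x, y in [(2.0, 4.0), (0.0, 0.0), (1.0, 1.0)]:
    line.add_point(x, y)
print(line.interpolate([0.5, 1.5]))  # [0.5, 2.5]
```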

Multiple object tracking using radar data and extended kalman filter

I am new to the multiple-object-tracking field, and I have been working on this for a couple of days. I have developed a first version of a single-object tracker using an extended Kalman filter; I estimate position and velocity assuming a constant-acceleration model. My question is how to convert the existing model to track multiple objects. The main problem is that I am using radar data, so I have not been able to find references for developing the tracker. One good example, or the steps to achieve this, would help me understand the concept. Thanks in advance.
The answer to this question depends on a lot of things. For example, how much control and knowledge do you have over the whole system? If you know how many targets you need to track, you can add all of them to the Kalman filter state, and for every measurement you perform data association to find out which object that measurement belongs to. An easy association metric is nearest neighbor.
If you don't know how many targets there will be, you will want to implement track management, where each target you are tracking is represented by a track and you model birth and death probabilities of targets.
Multi-target tracking is a vast field, and if you want an in-depth mathematical introduction I would recommend the 2015 survey paper "Multitarget Tracking" by Ba-Ngu Vo et al. You should be able to find a preprint PDF online.
If you are looking for something more lightweight, it should be possible to find a tutorial or example code online to start from. As mentioned in the first paragraph, nearest-neighbor association for a fixed number of objects is a good first step; a sketch follows below.
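As a rough illustration of that first step, here is a minimal sketch of nearest-neighbor data association between predicted track positions and radar measurements, solved globally with the Hungarian algorithm; the 2-D positions and the gating threshold are illustrative assumptions, not part of the answer above:

```python
# Minimal sketch: global-nearest-neighbor data association for a fixed
# number of tracks. Assumes each track's EKF already produced a predicted
# 2-D position; the gate value and shapes are illustrative assumptions.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(predicted_positions, measurements, gate=10.0):
    """Return a list of (track_index, measurement_index) pairs."""
    # Cost matrix: Euclidean distance between every prediction/measurement pair.
    diff = predicted_positions[:, None, :] - measurements[None, :, :]
    cost = np.linalg.norm(diff, axis=2)
    # The Hungarian algorithm finds the globally optimal one-to-one assignment.
    rows, cols = linear_sum_assignment(cost)
    # Gating: discard pairings that are too far apart to be plausible.
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < gate]

# Example: three tracks, three radar returns (positions in metres).
tracks = np.array([[0.0, 0.0], [10.0, 5.0], [20.0, -3.0]])
meas = np.array([[9.5, 5.2], [0.4, -0.1], [19.8, -2.7]])
print(associate(tracks, meas))  # [(0, 1), (1, 0), (2, 2)]
```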

How to plot all the streamlines in ParaView?

I am simulating the lid-driven cavity case, and I am trying to get all the streamlines with ParaView's Stream Tracer, but I only get the ones that intersect the reference line, and because of that there are vortices that are not visible. How can I see all the streamlines in the domain?
Thanks a lot in advance.
To add a little bit to Mathieu's answer, if you really want streamlines everywhere, then you can create a Stream Tracer With Custom Source (as Mathieu suggested) and set your data to both the Input and the Seed Source. That will create a streamline originating from every point in your dataset, which is pretty much what you asked for.
However, while you can do this, you will probably not be happy with the results. First of all, unless your data is trivially small, this will take a long time to compute and create a large amount of data. Even worse, the result will be so dense that you won't be able to see anything. You will get all those interesting streamlines through vortices, but they will be completely hidden by all the boring streamlines around them.
Thus, you are better off deriving a data set that contains seed points likely to trace a stream through the vortices you are interested in. One thing you might want to try is to compute the vorticity of your vector field (Gradient Of Unstructured Data Set with the advanced option Compute Vorticity turned on), take its magnitude (Calculator), and then use the Threshold filter to pull out the cells with large vorticity. Then use that as your Seed Source; a scripted sketch of this pipeline follows.
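Here is a minimal pvpython sketch of that pipeline. It assumes ParaView 5.x (the Threshold property names changed in 5.10), a file named cavity.vtk, a point vector array named velocity, and a guessed threshold value; all of these are placeholders:

```python
# Sketch of the vorticity-seeded streamline pipeline described above.
# Assumes ParaView 5.x; 'cavity.vtk', array names, and the threshold
# value are placeholders.
from paraview.simple import *

flow = OpenDataFile('cavity.vtk')

# Compute the vorticity of the velocity field.
grad = GradientOfUnstructuredDataSet(Input=flow)
grad.ScalarArray = ['POINTS', 'velocity']
grad.ComputeVorticity = 1

# Magnitude of the vorticity vector.
calc = Calculator(Input=grad)
calc.ResultArrayName = 'vortmag'
calc.Function = 'mag(Vorticity)'

# Keep only the cells with large vorticity (threshold value is a guess).
thresh = Threshold(Input=calc)
thresh.Scalars = ['POINTS', 'vortmag']
thresh.ThresholdRange = [5.0, 1e30]

# Seed streamlines from those high-vorticity regions.
stream = StreamTracerWithCustomSource(Input=flow, SeedSource=thresh)
Show(stream)
Render()
```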
Another (probably better) option if your data is 2D or you can extract an interesting surface along the flow of your data is to use the Surface LIC plugin. Details can be found at https://www.paraview.org/Wiki/ParaView/Line_Integral_Convolution.
You have to choose a representative seed source for your streamlines.
You could use a "Sphere Source" in the Stream Tracer properties (see the sketch below).
If that fails, you can use a Stream Tracer With Custom Source with your own seed source, which you will have to create yourself first.
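For completeness, a sketch of the simpler sphere-of-seed-points variant; the seed type is called 'Point Source' in older ParaView releases ('Point Cloud' in newer ones), and the geometry values below are guesses, not taken from the question:

```python
# Sketch: seed the Stream Tracer from a spherical cloud of points.
# 'Point Source' is the seed-type name in older ParaView 5.x releases;
# the center, radius, and point count below are placeholders.
from paraview.simple import *

flow = OpenDataFile('cavity.vtk')
stream = StreamTracer(Input=flow, SeedType='Point Source')
stream.SeedType.Center = [0.05, 0.05, 0.005]   # middle of the cavity
stream.SeedType.Radius = 0.045                 # cover most of the domain
stream.SeedType.NumberOfPoints = 300
Show(stream)
Render()
```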

Is it possible to use graph-on-parent with [clone]d abstractions in some way?

I know you can open an abstraction with the vis message, but I haven't found a way to present my abstractions in the patch containing the [clone] object. Perhaps dynamic patching is the only way to achieve this? I have searched the Pd forum, mailing list and Facebook group without success.
Currently (as of Pd 0.48-1) there is no way of making [clone] display the GOP of its contents.
As a workaround, you can encapsulate the [clone] object in an abstraction that provides a GUI displaying information about the selected cloned instance.
For example, let's say you have an object called [HarmonicSeries] that, given a fundamental frequency, uses a [clone] object to create 8 instances of [Harmonic], each containing an [osc~] at the desired frequency, and you want to display the frequency of each harmonic. Instead of using GOP on [Harmonic], you would use GOP on [HarmonicSeries] and provide an interface to select the desired harmonic and collect its information.
The [Harmonic] abstraction expects two parameters:
The fundamental frequency
The index of the harmonic
It multiplies the two to get the harmonic's frequency and stores it in a [float]. When it receives a bang, it outputs that frequency to its left outlet.
Let's [clone] it and wrap it into the [HarmonicSeries] abstraction.
When the user clicks on the [hradio] to select the desired harmonic, it sends a bang to the corresponding harmonic, which in turn sends its stored frequency to its outlet. The interface then displays the harmonic's index and frequency in number boxes.
Here's an example of it working (in the [HarmonicSeries-help] object).
This is a simple example, but the principle is the same in more complex cases: you encapsulate [clone] into an abstraction that provides an interface for reading data from the cloned instances.

Unsupervised Anomaly Detection with Mixed Numeric and Categorical Data

I am working on a data-analysis project over the summer. The main goal is to use hospital access logs, recording users accessing patient information, and try to detect abnormal access behavior. Several attributes have been chosen to characterize a user (e.g. employee role, department, zip code) and a patient (e.g. age, sex, zip code). There are about 13-15 variables under consideration.
I was using R before and now I am using Python. I am able to use either depending on any suitable tools/libraries you guys suggest.
Before I ask any question, I do want to mention that a lot of the data fields have undergone an anonymization process when handed to me, as required in the healthcare industry for the protection of personal information. Specifically, a lot of VARCHAR values are turned into random integer values, only maintaining referential integrity across the dataset.
Questions:
An exact definition of an outlier was not given (it is defined based on the behavior of the majority of the data, if there is a general behavior), and there is no labeled training set telling me which rows of the dataset are considered abnormal. I believe the project belongs to the area of unsupervised learning, so I was looking into clustering.
Since the data is mixed (numeric and categorical), I am not sure how clustering would work with this type of data.
I've read that one could expand the categorical data, letting each category of a variable be either 0 or 1 in order to do the clustering, but then how would R/Python handle such high-dimensional data for me? (Simply expanding employee role would bring in ~100 more variables.)
How would the result of clustering be interpreted?
Using a clustering algorithm, wouldn't the potential "outliers" be grouped into clusters as well? And how am I supposed to detect them?
Also, with categorical data involved, I am not sure how "distance between points" is defined any more, or whether the proximity of data points still indicates similar behavior. Does expanding each category into a dummy column with true/false values help? What is the distance then?
Faced with the challenges of cluster analysis, I also started to try slicing the data up and looking at just two variables at a time. For example, I would look at the age range of patients accessed by a certain employee role, using the quartiles and the inter-quartile range to define outliers. For categorical variables, for instance employee role and the types of events being triggered, I would just look at the frequency of each event being triggered.
Can someone explain to me the problem with using quartiles on data that is not normally distributed? And what would be the remedy for this?
And in the end, which of the two approaches (or some other approaches) would you suggest? And what's the best way to use such an approach?
Thanks a lot.
You can decide upon a similarity measure for mixed data (e.g. Gower distance).
Then you can use any of the distance-based outlier detection methods; a sketch follows.
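Here is a minimal sketch of that combination, with a hand-rolled Gower distance (range-normalized numeric columns, simple mismatch for categorical ones) and the distance to the k-th nearest neighbor as the outlier score; the column split, toy data, and choice of k are assumptions:

```python
# Sketch: Gower distance over mixed columns + kNN-distance outlier score.
# Column layout, toy data, and k are illustrative assumptions.
import numpy as np

def gower_matrix(num, cat):
    """num: (n, p) float array; cat: (n, q) integer category codes."""
    # Range-normalize numeric columns so each contributes in [0, 1].
    ranges = num.max(axis=0) - num.min(axis=0)
    ranges[ranges == 0] = 1.0
    d_num = np.abs(num[:, None, :] - num[None, :, :]) / ranges   # (n, n, p)
    # Categorical contribution: 0 if equal, 1 if different.
    d_cat = (cat[:, None, :] != cat[None, :, :]).astype(float)   # (n, n, q)
    # Gower distance = mean contribution over all variables.
    return np.concatenate([d_num, d_cat], axis=2).mean(axis=2)

def knn_outlier_scores(dist, k=5):
    """Score each row by the distance to its k-th nearest neighbor."""
    return np.sort(dist, axis=1)[:, k]   # column 0 is the point itself

# Toy data: 2 numeric columns, 1 categorical column (integer codes).
rng = np.random.default_rng(0)
num = rng.normal(size=(100, 2))
cat = rng.integers(0, 3, size=(100, 1))
num[0] += 8                      # plant an obvious outlier at row 0
scores = knn_outlier_scores(gower_matrix(num, cat))
print(np.argsort(scores)[-5:])   # indices of the 5 most outlying rows
```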
You can use the k-prototypes algorithm for mixed numeric and categorical attributes.
Here you can find a Python implementation; a short sketch of its use follows.
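For reference, here is a short sketch using the kmodes package's KPrototypes class, one well-known Python implementation (whether it is the one linked above is an assumption); small clusters, or points far from their prototype, are then outlier candidates:

```python
# Sketch: k-prototypes on mixed data via the `kmodes` package
# (pip install kmodes). The toy data and cluster count are illustrative.
import numpy as np
from kmodes.kprototypes import KPrototypes

rng = np.random.default_rng(0)
# Columns 0-1 are numeric, column 2 is categorical.
X = np.column_stack([
    rng.normal(size=100),
    rng.normal(size=100),
    rng.integers(0, 3, size=100),
]).astype(object)

kproto = KPrototypes(n_clusters=3, init='Cao', random_state=0)
labels = kproto.fit_predict(X, categorical=[2])

# Inspect cluster sizes: unusually small clusters are worth a closer look.
print(np.bincount(labels))
```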

Running k-medoids algorithm in ELKI

I am trying to run ELKI to perform k-medoids clustering (with k=3) on a dataset in the form of an ARFF file (using ELKI's ARFFParser).
The dataset has 7 dimensions; however, the clustering results I obtain show clustering along only one dimension, and only for 3 of the attributes, ignoring the rest.
Could anyone help with how I can obtain a clustering visualization across all dimensions?
ELKI is mostly used with numerical data.
Currently, ELKI does not have a "mixed" data type, unfortunately.
The ARFF parser will split your data set into multiple relations:
a 1-dimensional numerical relation containing age
a LabelList relation storing sex and region
a 1-dimensional numerical relation containing salary
a LabelList relation storing married
a 1-dimensional numerical relation storing children
a LabelList relation storing car
Apparently it has messed up the relation labels, though. Other than that, this approach works perfectly well with ARFF data sets that consist of numerical data plus a class label, for example - the use case this parser was written for. It is well-defined and consistent behaviour, just not what you expected it to do.
The algorithm then ran on the first relation it could work with, i.e. age only.
So here is what you need to do:
Implement an efficient data type for storing mixed type data.
Modify the ARFF parser to produce a single relation of mixed type data.
Implement a distance function for this type, because the lack of a mixed type data representation means we do not have a distance to go with it either.
Choose this new distance function in k-Medoids.
Share the code, so others do not have to do this again. ;-)
Alternatively, you could write a script to encode your data as a purely numerical data set; then it will work fine. In my opinion, though, the results of one-hot encoding and similar transformations are usually not very convincing. A sketch of such a script follows.
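A minimal sketch of such an encoding script, using pandas to one-hot encode the categorical columns and write whitespace-separated numeric rows that ELKI's default NumberVectorLabelParser can read; the file and column names are placeholders taken loosely from the relations listed above:

```python
# Sketch: one-hot encode the categorical columns and write a plain
# numeric file for ELKI's default NumberVectorLabelParser.
# File and column names are placeholders; assumes all remaining
# columns are already numeric.
import pandas as pd

df = pd.read_csv('patients.csv')  # columns: age, sex, region, salary, ...

categorical = ['sex', 'region', 'married', 'car']
encoded = pd.get_dummies(df, columns=categorical)  # one column per category

encoded.astype(float).to_csv('patients_numeric.txt', sep=' ',
                             index=False, header=False)
```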