How to write a calculation to get 0 in Tableau - tableau-api

I have a column named Ratio whose values range from 0% to 100%.
I need to write a calculation that returns 0 if Ratio is 100%.
The calculation I used:
IF ABS(Ratio) < 1 THEN 1 ELSE 0 END
But I can see 1 showing for some values at 100 percent.
I used ABS because there is a chance the value will be negative.

Related

Crystal reports must compute a field only if it is greater than zero

I created a report that compares two amounts and shows the percentage increase or decrease.
The logic is:
compare amount1 to amount2, then show the % increase/decrease.
I have this field that computes the increase/decrease of the number.
The formula is:
(tonumber({tblReclass.Amount})/tonumber({tblReclass.AverageAmt}))*100-100
However, some data rows contain zero values, and division by zero throws an error, so I added an if statement. The code is now:
if {tblReclass.Amount} > 0 and {tblReclass.AverageAmt} > 0 then
(tonumber({tblReclass.Amount})/tonumber({tblReclass.AverageAmt}))*100-100
else
0
It now throws an error after the then keyword. It says:
a string is required here
What must be revised in the code?
The computation works fine if I remove the zero values,
so what I did temporarily was remove the rows with zero values, but the report now shows incomplete data. I want to show the zero values.
Try this version, which converts the field with tonumber before comparing it to zero:
if tonumber({tblReclass.AverageAmt}) > 0 then
(tonumber({tblReclass.Amount})/tonumber({tblReclass.AverageAmt}))*100-100
else 0
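The same guard can be illustrated outside Crystal Reports. The following is only a Python sketch of the idea, not Crystal syntax; the function name percent_change is hypothetical, and it assumes both fields arrive as numeric strings, as the tonumber calls in the formula suggest.

```python
def percent_change(amount, average_amt):
    """Percent increase/decrease of amount relative to average_amt.

    Both arguments are converted to numbers first (assumption: they are
    numeric strings). Returns 0 when the base is not positive, which
    avoids the division by zero.
    """
    amount = float(amount)
    average_amt = float(average_amt)
    if average_amt > 0:
        return amount / average_amt * 100 - 100
    return 0.0


if __name__ == "__main__":
    print(percent_change("150", "100"))  # base is positive: normal computation
    print(percent_change("10", "0"))     # base is zero: falls through to 0
```

Converting before comparing is the important part: the comparison and the division both operate on numbers, so the zero rows can stay in the report.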

System Dynamics simulation - Translating Stella into AnyLogic syntax

I modelled the following logic in Stella:
(IF "cause" > 0 THEN MONTECARLO("probabilityofconsequence") ELSE 0
But I'm not getting the correct syntax in AnyLogic:
(cause > 0) ? (uniform() < probabilityofconsequence) ? 1 : 0 : 0
Any ideas?
Disclaimer:
Stella's MONTECARLO function generates a series of zeros and ones from a Bernoulli distribution based on the probability provided. The probability is the percentage probability of an event happening per DT divided by DT (it is similar to, but not the same as, the percent probability of an event per unit time). The probability value can be either a variable or a constant, but should evaluate to a number between 0 and 100/DT (numbers outside the range will be set to 0 or 100/DT). The expected value of the stream of numbers generated, summed over a unit time, is equal to probability/100.
MONTECARLO is equivalent to the following logic:
IF (UNIFORM(0,100,<seed>) < probability*DT) THEN 1 ELSE 0
The equivalent in AnyLogic should be:
cause>0 && uniform(0,100) < probability*DT ? 1 : 0
You need to create a variable called DT that is equal to either the fixed time step you have chosen in your model configuration or another value you consider adequate.
Depending on how you run the model, AnyLogic doesn't treat the fixed time step as fixed, so you need to define DT yourself.
No matter what, your results will probably not exactly match Stella's, since the time steps are not necessarily the same, but they may be similar enough for your purposes.
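To check the logic independently of either tool, here is a minimal Python sketch of the MONTECARLO equivalence described above. The function names and the explicit rng argument are assumptions made for illustration; the clamping to the range 0 to 100/DT follows the description of the Stella function.

```python
import random


def montecarlo(probability, dt, rng):
    """Return 1 with probability (probability * dt) percent, else 0."""
    # Values outside [0, 100/dt] are clamped, as described for Stella.
    p = min(max(probability, 0.0), 100.0 / dt)
    return 1 if rng.uniform(0.0, 100.0) < p * dt else 0


def consequence(cause, probability, dt, rng):
    """Only draw from the Bernoulli stream when cause > 0."""
    return montecarlo(probability, dt, rng) if cause > 0 else 0


if __name__ == "__main__":
    rng = random.Random(42)   # fixed seed so runs are repeatable
    dt = 0.5
    draws = [consequence(1, 20.0, dt, rng) for _ in range(10000)]
    # With probability = 20 and dt = 0.5, each draw is 1 about 10% of the time.
    print(sum(draws) / len(draws))
```

Passing the random generator in explicitly plays the role of the seed argument of UNIFORM, which makes it easier to compare runs.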

MSE in neuralnet results and roc curve of the results

Hi, my question is a bit long; please bear with me and read it to the end.
I am working on a project with 30 participants. We have two data sets: the first has 30 rows and 160 columns, and the second has the same 30 rows and 200 columns, which are the outputs (y); these outputs are independent. I want to use the first data set to predict the outputs in the second. Because the first data set is rectangular and high-dimensional, I used factor analysis and now have 19 factors that cover up to 98% of the variance. I now want to use these 19 factors to predict the outputs of the second data set.
I am using neuralnet with backpropagation; everything goes well and my results are really close to the outputs.
My questions:
1- My inputs are the factors (between -1 and 1) and my outputs are integers between 4 and 10000. Should I still scale them before running the neural network?
2- I scaled the data (both inputs and outputs) and then predicted with neuralnet. When I checked the MSE it was very high, around 6000, even though my predictions and the real outputs are very close to each other. But if I rescale the predictions and outputs and then check the MSE, it is near zero. Is it unbiased to rescale and then check the MSE?
3- I read that it is better not to scale the outputs at all, but if I scale only the inputs, all my predictions are 1. Is it correct not to scale the outputs?
4- If I want to plot the ROC curve, how can I do it, given that my results are never exactly equal to the real outputs?
Thank you for reading my question.
[edit#1]: There is a publication on how to produce ROC curves using neural network results
http://www.lcc.uma.es/~jja/recidiva/048.pdf
1) You can scale your values (using min-max, for example). But fit the scaling only on your training data set. Save the parameters used in the scaling process (for min-max, these are the min and max values by which the data is scaled). Only then can you scale your test data set WITH the min and max values you got from the training data set. Remember, with the test data set you are trying to mimic the process of classifying unseen data. Unseen data is scaled with your scaling parameters from the training data set.
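As a rough illustration of point 1, the sketch below fits min-max parameters on the training values only and reuses them for the test values. It is plain Python rather than R, the function names are hypothetical, and the numbers are made up (they only borrow the 4 to 10000 output range mentioned in the question).

```python
def fit_minmax(train_values):
    """Learn the scaling parameters from the training data only."""
    return min(train_values), max(train_values)


def apply_minmax(values, lo, hi):
    """Scale values with previously saved parameters (lo, hi)."""
    span = hi - lo
    if span == 0:
        raise ValueError("training data has no range to scale by")
    return [(v - lo) / span for v in values]


def invert_minmax(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]


if __name__ == "__main__":
    train_y = [4.0, 2500.0, 10000.0]   # illustrative training outputs
    test_y = [5002.0]                  # illustrative unseen output

    lo, hi = fit_minmax(train_y)       # parameters come from training data
    print(apply_minmax(train_y, lo, hi))
    print(apply_minmax(test_y, lo, hi))  # test data reuses the same lo, hi
```

Keeping invert_minmax around also makes it possible to report errors in the original units, as long as it is stated which units an MSE was computed in.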
2) When talking about errors, do mention which data set the error was computed on. You can compute an error function (in fact, there are different error functions, one of them, the mean squared error, or MSE) on the training data set, and one for your test data set.
4) Think about this: Let's say you train a network with the training data set, and it has only 1 neuron in the output layer. Then you present it with the test data set. Depending on which transfer function (activation function) you use in the output layer, you will get a value for each exemplar. Let's assume you use a sigmoid transfer function, where the max and min values are 1 and 0. That means the predictions will be limited to values between 1 and 0.
Let's also say that your target labels ("truth") only contains discrete values of 0 and 1 (indicating which class the exemplar belongs to).
targetLabels=[0 1 0 0 0 1 0 ];
NNprediction=[0.2 0.8 0.1 0.3 0.4 0.7 0.2];
How do you interpret this?
You can apply a hard-limiting function such that the NNprediction vector only contains the discrete values 0 and 1. Let's say you use a threshold of 0.5:
NNprediction_thresh_0.5 = [0 1 0 0 0 1 0];
vs.
targetLabels =[0 1 0 0 0 1 0];
With this information you can compute your False Positives, FN, TP, and TN (and a bunch of additional derived metrics such as True Positive Rate = TP/(TP+FN) ).
If you had a ROC curve showing the False Positive Rate vs. True Positive Rate, this would be a single point in the plot. However, if you vary the threshold in the hard-limit function, you can get all the values you need for a complete curve.
Makes sense? See the dependencies of one process on the others?
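The threshold sweep described above can be sketched in a few lines of Python. This is only an illustration using the targetLabels and NNprediction values from the example; the function name roc_point and the particular list of thresholds are assumptions, and the code assumes both classes are present in the labels.

```python
def roc_point(labels, scores, threshold):
    """Return (false positive rate, true positive rate) at one threshold."""
    # Hard-limit the network outputs to 0 or 1.
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    return fp / (fp + tn), tp / (tp + fn)


if __name__ == "__main__":
    target_labels = [0, 1, 0, 0, 0, 1, 0]
    nn_prediction = [0.2, 0.8, 0.1, 0.3, 0.4, 0.7, 0.2]
    # Each threshold contributes one point of the ROC curve.
    for t in [0.0, 0.25, 0.5, 0.75, 1.0]:
        print(t, roc_point(target_labels, nn_prediction, t))
```

Plotting the resulting (FPR, TPR) pairs, with FPR on the horizontal axis, gives the curve.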

Output Value Of Neural Network Does Not Arrive To Desired Values

I made a neural network that uses backpropagation. It has 5 nodes in the input layer, 6 nodes in the hidden layer, and 1 node in the output layer, with random weights, and I use sigmoid as the activation function.
I have two sets of input data.
For example:
13.5 22.27 0 0 0 desired value=0.02
7 19 4 7 2 desired value=0.03
Now I train the network for 5000 iterations, or the iterations stop if the error
value (desired - calculated output value) is less than or equal to 0.001.
The output value of the first iteration for each input set is about 60, and it decreases with each iteration.
Now the problem is that the second set of inputs (which has a desired value of 0.03) causes the iterations to stop because its calculated output value is 3.001, but the first set of inputs has not reached its desired value (0.02); its output is about 0.03.
EDITED:
I used the LMS algorithm and changed the error threshold to 0.00001 to find the correct error value, but now the output value of the last iteration for both the 0.03 and 0.02 desired values is between 0.023 and 0.027, which is still incorrect.
For your stopping threshold, you should use the error over one epoch (the sum of the errors over your whole data set), not the error on a single member of your data set. You will have to increase the value of your error threshold, but this will force your neural network to do well on all your examples and not only on some of them.
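The difference between the two stopping rules can be shown with a small Python sketch. The function names are hypothetical and the output values are made up to resemble the situation in the question (one sample close to its target, the other not).

```python
def sample_errors(desired, outputs):
    """Absolute error (desired - calculated) for every sample."""
    return [abs(d - o) for d, o in zip(desired, outputs)]


def stop_on_single_sample(desired, outputs, threshold):
    """Rule from the question: stop once any one sample is close enough."""
    return any(e <= threshold for e in sample_errors(desired, outputs))


def stop_on_epoch_error(desired, outputs, threshold):
    """Suggested rule: stop only when the summed epoch error is small."""
    return sum(sample_errors(desired, outputs)) <= threshold


if __name__ == "__main__":
    desired = [0.02, 0.03]
    outputs = [0.03, 0.0301]   # illustrative: second sample fits, first does not
    print(stop_on_single_sample(desired, outputs, 0.001))  # stops too early
    print(stop_on_epoch_error(desired, outputs, 0.001))    # keeps training
```

Because the epoch error adds up the errors of all samples, its threshold usually needs to be larger than a per-sample threshold.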

Find groups with high cross correlation matrix in Matlab

Given a lower triangular matrix (100x100) containing cross-correlation
values, where entry 'ij' is the correlation value between signal 'i'
and 'j' and so a high value means that these two signals belong to
the same class of objects, and knowing there are at most four distinct
classes in the data set, does someone know of a fast and effective way
to classify the data and assign all the signals to the 4 different
classes, rather than search and cross check all the entries against
each other? The following 7x7 matrix may help illustrate
the point:
1 0 0 0 0 0 0
.2 1 0 0 0 0 0
.8 .15 1 0 0 0 0
.9 .17 .8 1 0 0 0
.23 .8 .15 .14 1 0 0
.7 .13 .77 .83 .11 1 0
.1 .21 .19 .11 .17 .16 1
there are three classes in this example:
class 1: rows <1 3 4 6>,
class 2: rows <2 5>,
class 3: rows <7>
This is a good problem for hierarchical clustering. Using complete linkage clustering you will get compact clusters; all you have to do is determine the cutoff distance at which two clusters should be considered different.
First, you need to convert the correlation matrix to a dissimilarity matrix. Since correlation is between 0 and 1, 1-correlation will work well: high correlations get a score close to 0, and low correlations get a score close to 1. Assume that the correlations are stored in an array corrMat:
%# remove diagonal elements
corrMat = corrMat - eye(size(corrMat));
%# and convert to a vector (as pdist)
dissimilarity = 1 - corrMat(find(corrMat))';
%# decide on a cutoff
%# remember that 0.4 corresponds to corr of 0.6!
cutoff = 0.5;
%# perform complete linkage clustering
Z = linkage(dissimilarity,'complete');
%# group the data into clusters
%# (cutoff is at a correlation of 0.5)
groups = cluster(Z,'cutoff',cutoff,'criterion','distance')
groups =
2
3
2
2
3
2
1
To confirm that everything is great, you can visualize the dendrogram
dendrogram(Z,0,'colorthreshold',cutoff)
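For readers without the Statistics Toolbox, the same idea can be followed step by step in a short, unoptimized Python sketch. It is not the Matlab linkage implementation, just a plain complete-linkage loop with a distance cutoff; the function name is hypothetical and the data is the 7x7 example from the question (signals are numbered from 0 here).

```python
def complete_linkage_groups(corr, cutoff):
    """Group signals by complete linkage on the dissimilarity 1 - corr.

    corr is a lower triangular matrix (list of rows); merging stops once
    the closest pair of clusters is farther apart than cutoff.
    """
    def dist(i, j):
        # Read the lower triangle regardless of argument order.
        a, b = max(i, j), min(i, j)
        return 1.0 - corr[a][b]

    clusters = [[i] for i in range(len(corr))]
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                # Complete linkage: distance of the farthest pair of members.
                d = max(dist(i, j) for i in clusters[x] for j in clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        if best[0] > cutoff:
            break
        _, x, y = best
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return clusters


if __name__ == "__main__":
    corr = [
        [1, 0, 0, 0, 0, 0, 0],
        [.2, 1, 0, 0, 0, 0, 0],
        [.8, .15, 1, 0, 0, 0, 0],
        [.9, .17, .8, 1, 0, 0, 0],
        [.23, .8, .15, .14, 1, 0, 0],
        [.7, .13, .77, .83, .11, 1, 0],
        [.1, .21, .19, .11, .17, .16, 1],
    ]
    groups = complete_linkage_groups(corr, cutoff=0.5)
    # Print with 1-based signal numbers to match the question.
    print([[i + 1 for i in g] for g in groups])
```

The nested loops make this far slower than a library routine, so for the real 100x100 matrix it is meant as an aid to understanding rather than a replacement.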
You can use the following method instead of creating the dissimilarity matrix.
Z = linkage(corrMat,'complete','correlation')
This allows Matlab to interpret your matrix as correlation distance and then, you can plot the dendrogram as follows:
dendrogram(Z);
One way to verify if your dendrogram is right or not is by checking its maximum height which should correspond to 1-min(corrMat). If the minimum value in corrMat is 0 then the maximum height of your tree should be 1. If the minimum value is -1 (negative correlation), the height should be 2.
Since it is given that there are going to be 4 groups, I'd start with a pretty simplistic two stage approach.
In the first stage you find the maximum correlation among any two elements, place those two elements in a group, then zero out their correlation in the matrix. Repeat, finding the next highest correlation among two elements and either adding those to an existing group or creating a new one until you have the correct number of groups.
Finally, check which elements aren't in a group, go to their column, and identify the highest correlation they have with any other element. If that element is in a group already, place them in that group as well; otherwise skip to the next element and come back to them later.
If there is interest or anything isn't clear I can add code later. Like I said, the approach is simplistic but if you don't need to verify the number of groups I think it should be effective.