Tableau: Calculated field to transform a series of odd numbers into a sequence of numbers

I need your help with the formula for a calculated field in Tableau (Tableau Prep, to be accurate).
I have a field called [Code Order] which contains only series of odd numbers (1,3,5,7,9,...) repeated multiple times, so it can look like (1,3,1,3,5,7,1,1,1,3,5,7,9,11).
What I need is to transform these into a normal sequence of numbers, so for my example above I need as a result: (1,2,1,2,3,4,1,1,1,2,3,4,5,6)
In other words, when [Code Order] contains:
1 = 1
3 = 2
5 = 3
7 = 4
9 = 5
11 = 6
13 = 7
15 = 8
...
365 = 183
For the moment my maximum is 365, which is position 183. I would like to avoid typing 182 IF formulas if possible. ;)
Thanks in advance for your help.
CYA
Plt.K

This might turn out to be more accurate in case your Code Order series is missing any values along the way.
[Screenshots not preserved: example series, alternate field, Tableau setup]

You want to use the index() calculated field. Create a new field called index. The calculation is just index().
Add [Code Order] to your row shelf and index to your label. You should see something like this.

The following calculation should do the trick
CEILING([Code Order] / 2)
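As a quick sanity check outside Tableau, the mapping behind that formula can be sketched in Python. The function name odd_to_position is made up for illustration; the sample list is the one from the question. It uses the integer form (code + 1) // 2 and compares it with the ceiling form from the answer.

```python
import math


def odd_to_position(code: int) -> int:
    """Map an odd code (1, 3, 5, ...) to its 1-based position (1, 2, 3, ...)."""
    return (code + 1) // 2


# Sample series taken from the question
codes = [1, 3, 1, 3, 5, 7, 1, 1, 1, 3, 5, 7, 9, 11]
positions = [odd_to_position(c) for c in codes]
print(positions)  # [1, 2, 1, 2, 3, 4, 1, 1, 1, 2, 3, 4, 5, 6]

# The ceiling form used in the answer gives the same values
ceiling_positions = [math.ceil(c / 2) for c in codes]
```

Both forms agree for every odd input, and 365 maps to 183 as stated in the question.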

Related

Select a number of random rows based on one column condition in MATLAB

I have a table 'X' like this:
name value score
joy 3 60
rony 8 50
macheis 20 20
joung 2 80
joy 8 3
joy 90 0
joung 4 78
machies 3 23
joy 7 99
I want to select 2 random rows (with name, value, score) where the name is 'joy'.
I tried something like this:
mnew = datasample(find(X.name=='joy'),2);
but it does not work and gives me the error: Undefined operator '==' for input arguments of type 'cell'.
The rows should be selected randomly (with all column values) where the name is joy.
Does anyone have another solution to this problem? How can I do it in MATLAB?
You have the right idea, but in order to check for the presence of a string within a cell array of strings, you need to use strcmp, ismember, or another method for comparing a string to a cell array.
You probably also want to specify that you don't want to use replacement when calling datasample so you don't get the same row twice.
subx = X(datasample(find(strcmp(X.name, 'joy')), 2, 'Replace', false),:);
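The same idea (filter the rows by name first, then sample without replacement) can be sketched in plain Python for comparison. The helper sample_rows and its seed parameter are assumptions made for this illustration; the rows are the table from the question.

```python
import random

# Rows from the question as (name, value, score)
rows = [
    ("joy", 3, 60), ("rony", 8, 50), ("macheis", 20, 20),
    ("joung", 2, 80), ("joy", 8, 3), ("joy", 90, 0),
    ("joung", 4, 78), ("machies", 3, 23), ("joy", 7, 99),
]


def sample_rows(rows, name, k, seed=None):
    """Return k distinct rows whose name matches, chosen at random."""
    rng = random.Random(seed)
    matches = [row for row in rows if row[0] == name]
    # random.sample draws without replacement, like 'Replace', false
    return rng.sample(matches, k)


picked = sample_rows(rows, "joy", 2, seed=0)
print(picked)
```

Sampling without replacement is what guarantees that the two returned rows are different.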

Reshaping and merging simulations in Stata

I have a dataset, which consists of 1000 simulations. The output of each simulation is saved as a row of data. There are variables alpha, beta and simulationid.
Here's a sample dataset:
simulationid beta alpha
1 0.025840106 20.59671241
2 0.019850549 18.72183088
3 0.022440886 21.02298228
4 0.018124857 20.38965861
5 0.024134726 22.08678021
6 0.023619479 20.67689981
7 0.016907209 17.69609466
8 0.020036455 24.6443037
9 0.017203175 24.32682682
10 0.020273349 19.1513272
I want to estimate a new value - let's call it new - which depends on alpha and beta as well as different levels of two other variables which we'll call risk and price. Values of risk range from 0 to 100, price from 0 to 500 in steps of 5.
What I want to achieve is a dataset that consists of values representing the probability that (across the simulations) new is greater than 0 for combinations of risk and price.
I can achieve this using the code below. However, the reshape process takes more hours than I'd like. And it seems to me to be something that could be completed a lot quicker.
So, my question is either:
i) is there an efficient way to generate multiple datasets from a single row of data without multiple reshape, or
ii) am I going about this in totally the wrong way?
set maxvar 15000
/* Input sample data */
input simulationid beta alpha
1 0.025840106 20.59671241
2 0.019850549 18.72183088
3 0.022440886 21.02298228
4 0.018124857 20.38965861
5 0.024134726 22.08678021
6 0.023619479 20.67689981
7 0.016907209 17.69609466
8 0.020036455 24.6443037
9 0.017203175 24.32682682
10 0.020273349 19.1513272
end
forvalues risk = 0(1)100 {
forvalues price = 0(5)500 {
gen new_r`risk'_p`price' = `price' * (`risk'/200)* beta - alpha
gen probnew_r`risk'_p`price' = 0
replace probnew_r`risk'_p`price' = 1 if new_r`risk'_p`price' > 0
sum probnew_r`risk'_p`price', mean
gen mnew_r`risk'_p`price' = r(mean)
drop new_r`risk'_p`price' probnew_r`risk'_p`price'
}
}
drop if simulationid > 1
save simresults.dta, replace
forvalues risk = 0(1)100 {
clear
use simresults.dta
reshape long mnew_r`risk'_p, i(simulationid) j(price)
keep simulation price mnew_r`risk'_p
rename mnew_r`risk'_p risk`risk'
save risk`risk'.dta, replace
}
clear
use risk0.dta
forvalues risk = 1(1)100 {
merge m:m price using risk`risk'.dta, nogen
save merged.dta, replace
}
Here's a start on your problem.
So far as I can see, you don't need more than one dataset.
The various reshapes and merges just rearrange what was first generated and that can be done within one dataset.
The code here is, in the first instance, for just one pair of values of alpha and beta. To simulate 1000 such pairs, you would need 1000 times more observations, i.e. about 10 million, which is not usually a problem, and you would need to loop over the alphas and betas. But the loop can be tacit. We'll get to that.
This code has been run and is legal. It's limited to one alpha, beta pair.
clear
input simulationid beta alpha
1 0.025840106 20.59671241
2 0.019850549 18.72183088
3 0.022440886 21.02298228
4 0.018124857 20.38965861
5 0.024134726 22.08678021
6 0.023619479 20.67689981
7 0.016907209 17.69609466
8 0.020036455 24.6443037
9 0.017203175 24.32682682
10 0.020273349 19.1513272
end
local N = 101 * 101
set obs `N'
egen risk = seq(), block(101)
replace risk = risk - 1
egen price = seq(), from(0) to(100)
replace price = 5 * price
gen result = (price * (risk/200)* beta[1] - alpha[1]) > 0
bysort price risk: gen mean = sum(result)
by price risk: replace mean = mean[_N]/_N
Assuming now that you first read in 1000 values, here is a sketch of how to get the whole thing. This code has not been tested. That is, your dataset starts with 1000 observations; you then enlarge it to 10 million or so, and get your results. The tricksy part is using an expression for the subscript to ensure that each block of results is for a distinct alpha, beta pair. That's not compulsory; you could do it in a loop, but then you would need to generate outside the loop and replace within it.
local N = 101 * 101 * 1000
set obs `N'
* to(100) makes risk cycle through 0..100; without it seq() keeps counting up to _N
egen risk = seq(), from(0) to(100) block(101)
egen price = seq(), from(0) to(100)
replace price = 5 * price
egen sim = seq(), block(10201)
gen result = (price * (risk/200)* beta[ceil(_n/10201)] - alpha[ceil(_n/10201)]) > 0
bysort sim price risk: gen mean = sum(result)
by sim price risk: replace mean = mean[_N]/_N
Other devices used: egen to set up in blocks; getting the mean without repeated calls to summarize; using a true-or-false expression directly.
NB: I haven't tried to understand what you are doing, but it seems to me that the price-risk-simulation conditions define single values, so calculating a mean looks redundant. But perhaps that is in the code because you wish to add further detail to the code once you have it working.
NB2: This seems a purely deterministic calculation. Not sure that you need this code at all.
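For comparison, the quantity the question asks for (the share of simulations in which new is greater than 0, for each risk and price pair) can be sketched directly in Python without any reshaping. This is only an illustration using the ten sample rows; the names simulations, prob_positive and prob are made up here.

```python
# (beta, alpha) pairs from the sample data in the question
simulations = [
    (0.025840106, 20.59671241), (0.019850549, 18.72183088),
    (0.022440886, 21.02298228), (0.018124857, 20.38965861),
    (0.024134726, 22.08678021), (0.023619479, 20.67689981),
    (0.016907209, 17.69609466), (0.020036455, 24.6443037),
    (0.017203175, 24.32682682), (0.020273349, 19.1513272),
]


def prob_positive(risk, price, sims):
    """Share of simulations where price * (risk/200) * beta - alpha > 0."""
    hits = sum(1 for beta, alpha in sims
               if price * (risk / 200) * beta - alpha > 0)
    return hits / len(sims)


# One probability per (risk, price) pair: risk 0..100, price 0..500 in steps of 5
prob = {
    (risk, price): prob_positive(risk, price, simulations)
    for risk in range(0, 101)
    for price in range(0, 501, 5)
}
```

With these ten rows every probability is 0: price * (risk/200) * beta peaks at 250 * 0.025840106, about 6.46, while the smallest alpha is about 17.70.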

How do I use the Arrhenius Equation in MATLAB?

I need to use the equation k=k0*e^(-Q/RT), where T takes 10 values between 90 and 500. To generate the 10 values I used T = linspace(90,500,10), with k0=1200, Q=8000, R=2. But when I type in k=k0*exp(-Q./(R.*T)) the numbers look kind of funky, with some showing as 0.0000. Am I doing something wrong? Thanks
T = linspace(90,500,10);
k0=1200;
Q=8000;
R=2;
k=k0*exp(-Q./(R.*T));
format longG
k =
Columns 1 through 3
5.98693127135402e-17 1.83626027549851e-10 3.07185900119097e-07
Columns 4 through 6
2.60112352637493e-05 0.000498552970093129 0.00409767412987074
Columns 7 through 9
0.0198590098887835 0.067709291730596 0.180526236997812
Column 10
0.402555153483014
results = [T', k'];   % 10-by-2: T in column 1, k in column 2
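Nothing is wrong with the calculation; only the display format is misleading, I reckon. The values span about 16 orders of magnitude, so the short format shows the smallest ones as 0.0000. For formatting options, see either the documentation on format, or this question.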
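To cross-check the numbers independently of MATLAB's display format, here is a minimal Python sketch of the same calculation using only the standard library. The list comprehension stands in for linspace(90,500,10); the constants are the ones from the question.

```python
import math

k0 = 1200
Q = 8000
R = 2

# Ten evenly spaced temperatures from 90 to 500, like linspace(90,500,10)
T = [90 + 410 * i / 9 for i in range(10)]

# Arrhenius equation k = k0 * exp(-Q / (R * T)), evaluated per temperature
k = [k0 * math.exp(-Q / (R * t)) for t in T]

for t, value in zip(T, k):
    # Scientific notation keeps the very small values readable
    print(f"{t:8.2f}  {value:.6e}")
```

The first and last values agree with the format longG output above (about 5.99e-17 at T = 90 and about 0.4026 at T = 500).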

Using SUM and UNIQUE to count occurrences of value within subset of a matrix

So, presume a matrix like so:
20 2
20 2
30 2
30 1
40 1
40 1
I want to count the number of times 1 occurs for each unique value of column 1. I could do this the long way by [sum(x(1:2,2)==1)] for each value, but I think this would be the perfect use for the UNIQUE function. How could I fix it so that I could get an output like this:
20 0
30 1
40 2
Sorry if the solution seems obvious, my grasp of loops is very poor.
Indeed unique is a good option:
u=unique(x(:,1))
res=arrayfun(@(y)length(x(x(:,1)==y & x(:,2)==1)),u)
Taking apart that last line:
arrayfun(fun,array) applies fun to each element in the array, and puts the results in a new array, which it returns.
Here fun is the anonymous function @(y)length(x(x(:,1)==y & x(:,2)==1)), which finds the length of the portion of x where the condition x(:,1)==y & x(:,2)==1 holds (this is called logical indexing). So for each of the unique elements, it counts the rows in x where the first column is the unique element and the second column is one.
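The same per-key counting can be sketched in Python to make the logic explicit. This is an illustrative equivalent, not a translation of MATLAB's API; the names x, keys and res simply mirror the answer.

```python
# The matrix from the question as (column 1, column 2) rows
x = [(20, 2), (20, 2), (30, 2), (30, 1), (40, 1), (40, 1)]

# Unique values of column 1, sorted as MATLAB's unique would return them
keys = sorted({first for first, _ in x})

# For each unique value, count the rows where column 2 equals 1
res = [
    (key, sum(1 for first, second in x if first == key and second == 1))
    for key in keys
]
print(res)  # [(20, 0), (30, 1), (40, 2)]
```

Because the count is computed for every unique key, values with no matching rows (20 here) still appear with a count of 0.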
Try this (as specified in this answer):
>> [c,~,d] = unique(a(a(:,2)==1))
c =
30
40
d =
1
2
2
>> counts = accumarray(d(:),1,[],@sum)
counts =
1
2
>> res = [c,counts]
Suppose you have an array of various integers in 'array'.
The tabulate function will sort the unique values and count the occurrences.
tbl = tabulate(array)
Look for your counts in column 2 of tbl.

Arrange data using loop in MATLAB

If I have:
t=(1:1:5)'
time=1:3:100
How do I arrange the data t in columns, starting from column 1 until the end, at an interval of 3? That is, the data t (1 to 5) should appear at columns 1, 4, 7 and so on.
I've tried:
t=[1:1:5];
nt=length(temp);
time=[1:1:100];
nti=length(time);
x=zeros(nt,nti);
temp=temp';
initiator=2;
monomer=3;
post=1:3:100;
for l=1:post
step=1;
maxstep=100;
while (step<maxstep)
step=step+3;
temp=(1:1:5)';
end
t(:,l)=t;
x=[t];
end
This only gives a result x with temp in column 1. I do not know how to arrange the data in the columns that I want.
Hope someone will help me. Thank you in advance.
How many dimensions does your data have? If you already have "temp" (temperature?) and "time" as your first two dimensions and you want "t" to be the third dimension, then create a three-dimension matrix.
To extract from indexes [1 4 7 10 13 16 ... ], use (1:3:end)
To extract from indexes [2 5 8 11 14 17 ... ], use (2:3:end)
In MATLAB's colon notation, the first value is the start. Second value is increment. Third value is the end value and is inclusive.
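As a rough illustration of the strided placement, here is a Python sketch that writes the vector t into every third column of a zero matrix. It assumes the goal is a 5-by-100 result with t in MATLAB columns 1, 4, 7, ..., 100; Python indexes from 0, so those are columns 0, 3, 6, ..., 99 here. The name n_cols is made up for the example.

```python
t = [1, 2, 3, 4, 5]
n_cols = 100

# Zero matrix with one row per element of t and n_cols columns
x = [[0] * n_cols for _ in t]

# range(0, n_cols, 3) plays the role of MATLAB's 1:3:100 (shifted to 0-based)
for col in range(0, n_cols, 3):
    for row, value in enumerate(t):
        x[row][col] = value

print([x[0][c] for c in range(7)])  # [1, 0, 0, 1, 0, 0, 1]
```

The columns in between stay zero, so the result shows t only at the requested positions.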