SPSS Modeler first not observation

How do we get the first not observation in SPSS Modeler? I have sorted my variable in ascending order by ID. Also, could you please let me know which function to use to get a not-equal value in SPSS Modeler? Thank you so much for your help.

Could you clarify what you need? As I understand it, you are trying to analyze the missing cases.
To filter missing cases, add a filter node (set to include or exclude, depending on the type of analysis) with the condition below:
var = undef or var = '$null$' or var =''

Error in makeClassifTask - columns to join must specify "on="

I am getting an error from makeClassifTask() in the mlr package.
task = makeClassifTask(data = data[,2:20441], target='Disease')
Running this gives the following error:
Provided data is not a pure data.frame but from class data.table, hence it will be converted.
Error in [.data.table(data, target) :
When i is a data.table (or character vector), the columns to join by must be specified using 'on=' argument (see ?data.table), by keying x (i.e. sorted, and, marked as sorted, see ?setkey), or by sharing column names between x and i (i.e., a natural join). Keyed joins might have further speed benefits on very large data due to x being sorted in RAM.
If someone could help me out, that would be great.
Given that you did not provide the data, I can only guess and suggest reading the documentation at https://mlr3book.mlr-org.com/tasks.html.
It looks like you left out the first column of your dataset, which might be your target. In that case makeClassifTask() cannot find your target column.
As @Shreyash Gputa correctly pointed out, converting the data.table object to a data.frame object solves the issue:
task = makeClassifTask(data = as.data.frame(data[,2:20441]), target='Disease')
This assumes, of course, that data[,2:20441] contains the target variable Disease.

How to do recursive calculation in SPSS Modeler

If I want to compute a value that relies on the previous one (a recursive function), how can I do it in SPSS? Example:
Q0 = 0
Qn = Q(n-1) + Constant
If by "... the previous one ..." you mean the value of the same field (or a different field) for the previous record, you can use the @OFFSET(FIELD, EXPR) function.
The function lets you access values from records other than the current one through a relative reference.
After much research, I couldn't find any way to write a recursive function in SPSS Modeler. The only workaround is to use an R Transform node within SPSS. HTH.
Depending on what you need to do, you can either chain many Derive nodes or, after sorting the records, refer to the previous value in a column.
I started by creating a simple CSV source file whose records hold a single field N (ranging from 1 to 100), just to bound the iterations of the stream in this example. Then I connected this data source to a Derive node that defines the field Q:
if not(@NULL(@OFFSET(N,1))) then @OFFSET(Q,1) + 2 else 0 endif
Here I used the value 2 for the Constant from the example above. I see this as a recursive function, and it relies on @OFFSET just as Kenneth suggested above.
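For readers who want to check the values such a Derive node should produce, the same recurrence can be sketched outside Modeler. The snippet below is Python, not CLEM, and only mirrors the logic described above (the first record gets 0, every later record gets the previous Q plus the constant). The function name running_q and the list n_field are hypothetical names introduced for illustration.

```python
def running_q(n_values, constant=2):
    """Return Q for each record: 0 for the first one, previous Q + constant after that."""
    q = []
    for i, _ in enumerate(n_values):
        if i == 0:
            # No previous record exists, so start the recurrence at Q0 = 0.
            q.append(0)
        else:
            # Qn = Q(n-1) + Constant, reading the value derived for the previous record.
            q.append(q[i - 1] + constant)
    return q


# Hypothetical stand-in for the CSV field N, which ranges from 1 to 100.
n_field = list(range(1, 101))
q_field = running_q(n_field)
print(q_field[:5])  # [0, 2, 4, 6, 8]
```

Because each value depends only on the one before it, a single pass over the sorted records is enough; no true recursion is needed.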

Matlab: data from cell matrix to struct. How can I organize my data with keys?

Suppose I have some data obtained from an SQL database query, as below (of course my real case is bigger: thousands of rows and many columns).
key_names  header1  header2  header3
-------------------------------------
key1       a        1        bar
key2       b        2        foo
key3       c        3        bla
My goal is to organize the data in Matlab (at work I must use it) in a smart and efficient way to get the following results:
Access data by key obtaining the whole row, like dataset(key, :)
Access data by key plus header getting back a single value dataset.header(key)
If possible, getting a whole column (for all keys).
First of all, I used the dataset class provided by the Statistics Toolbox because it has all these features, but I decided to move away from it because it is really slow (from what I gathered, it is basically a wrapper around cell arrays): the bottleneck of my code was getting the data rather than performing the computations. In fact, I read that it is better to avoid it as much as possible.
The newer table class looks more efficient, but still not by much: from what I have understood, it is the new version of dataset, as explained in the official documentation.
I also considered using containers.Map, but it does not seem to support access by both key and column.
Therefore, struct seems to be the best choice, as it is really fast and has all the features I'm looking for.
So here are my questions:
Has anyone faced the same problem? Which way of organizing the data is the best one?
Let me suppose struct is the best. How can I efficiently create and fill a structure like this: mystruct.key.header?
I'd like to get something like this:
mystruct.key1.header1
ans = a
Of course I could loop, but there must be a better way. I found the following good starting point, but the struct it creates is empty:
fn1 = {'a', 'b', 'c'}; %first level
fn2 = {'d', 'e', 'f'}; %second level
s2 = cell2struct(cell(size(fn2(:))),fn2(:));
s = cell2struct(repmat({s2},size(fn1(:))),fn1(:))
None of the examples in the cell2struct documentation rename all the levels. The deal function is a good way to fill in the data (depending on the Matlab version, since from 7.0 it was replaced by a new coding style), but I'm still missing how to combine creating the structure with filling it.
Any suggestion or code example is really appreciated.
If you think, or are sure, that structs are the best option for you, you can use table2struct. First, import all the data into Matlab as a table, and then convert it to a structure.
mystruct = table2struct(data);
To access your data, use the following syntax:
mystruct(key).header
If key is an array, you need to collect all the values into a list, using either a cell array:
values = {mystruct(key).header}
or separate variables:
[v1, v2, v3] = mystruct(key).header
but the latter option is problematic if you are not sure how many outputs to expect.
I'm not sure which will be more convenient for you, but you can also convert to a scalar structure by setting the 'ToScalar' argument to true.
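The key/header layout asked for in the question (mystruct.key.header) can be illustrated independently of Matlab. The sketch below is Python rather than Matlab and is only meant to show the shape of the data: one mapping per key, built in a single pass from the header names and the rows of the example table. The variable names are hypothetical, and it says nothing about Matlab performance.

```python
# Header names and rows copied from the small example table in the question.
headers = ["header1", "header2", "header3"]
rows = [
    ("key1", "a", 1, "bar"),
    ("key2", "b", 2, "foo"),
    ("key3", "c", 3, "bla"),
]

# Build key -> {header -> value} in one pass, without an explicit nested loop.
dataset = {key: dict(zip(headers, values)) for key, *values in rows}

whole_row = dataset["key1"]                 # every header for one key
single_value = dataset["key1"]["header1"]   # one value by key plus header
column = {key: fields["header2"] for key, fields in dataset.items()}  # one header, all keys

print(single_value)  # a
```

The three access patterns at the end correspond to the three goals listed in the question: a whole row by key, a single value by key plus header, and a whole column across all keys.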

Matlab changing variable type, checking not making sense

I might be doing something silly or just have a lack of understanding, but I am currently working with a table of data. When I change the variable type of a column using the table.column notation and then check it using the table(:,1) notation, the check returns 0.
For example, the first column of my table is Creditability (where table is the name of the table), so I changed the variable type using table.Creditability = logical(table.Creditability)
Then, when I use islogical(table.Creditability), it returns 1.
But when I use islogical(table(:,1)), it returns 0, yet when I type table(:,1) it displays a logical variable in true or false form.
I may just have a lack of understanding, as I am new to this, but any help would be appreciated.
Thanks
Of course it will return 0. This is because of a basic point you are missing.
Indexing a table with parentheses, as in table(:,1), returns another table (here a one-column table), not the data stored in that column.
While "Creditability" is a logical array, the one-column table that contains it is still a table, so islogical returns 0.
To extract the underlying array, use dot notation (table.Creditability) or curly braces (table{:,1}); islogical then returns 1.
I hope it is clear now.

Tableau rawsqlagg_real

Could somebody please give me some guidance on the rawsqlagg_real function in Tableau? What is the right syntax for it when it is used to get data from MySQL?
I used it according to my understanding, but I am getting the error "No such column [__measure__3]".
Code:
RAWSQLAGG_REAL("select count(Film Id) from flavia.TableforThe_top_10percent_of_the_user where count(distinct(User Id)) = %1",[it sucks])
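I see a few issues here:
Instead of WHERE, use HAVING, because the condition is on an aggregate.
You have column names containing spaces, like Film Id; in MySQL you should wrap them in backticks, as in `Film Id` (single quotes would turn them into string literals).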
Though I must say that it is better to do with LOD calculations as Tableau will be able to do better query optimizations that way. Plus it is less error prone and much easier to write.
I found another issue here, in addition to using HAVING instead of WHERE. The filter value should be numeric, or the operator should be LIKE rather than =:
where count(distinct(User Id)) = %1
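The WHERE versus HAVING point can be checked on a toy table. The sketch below uses Python's built-in sqlite3 module rather than MySQL or Tableau, so it only demonstrates the general SQL rule; the table name ratings and its rows are made up for illustration. Identifiers with spaces are double-quoted here, which is the SQLite convention, whereas MySQL normally uses backticks.

```python
import sqlite3

# In-memory toy table with hypothetical data: film 10 is rated by three distinct users.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE ratings ("User Id" INTEGER, "Film Id" INTEGER)')
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 10), (1, 20), (2, 30)],
)

# An aggregate inside WHERE is rejected, because WHERE is evaluated before grouping.
try:
    conn.execute('SELECT "Film Id" FROM ratings WHERE COUNT(DISTINCT "User Id") = 3')
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)

# The same condition in HAVING works, and the filter value is passed as a number.
having_rows = conn.execute(
    'SELECT "Film Id", COUNT(*) FROM ratings '
    'GROUP BY "Film Id" HAVING COUNT(DISTINCT "User Id") = ?',
    (3,),
).fetchall()

print(where_error is not None)  # True
print(having_rows)              # [(10, 3)]
```

The numeric parameter plays the role of %1 in the Tableau expression: comparing a count with = only makes sense when the supplied value is a number.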