SPSS aggregate on 2 variables - aggregate

I am trying to compute an N_BREAK that has to satisfy a condition. I have a variable, let's call it "HT", that is either 1 or 0. Every lopnr (ID) also appears on multiple rows, so the first 10 rows can be ID number 1, the next 20 can be ID number 2, and so on.
My question is: how do I create an N_BREAK with lopnr as the break variable that counts only rows with HT=1? I cannot select only the 1s on HT beforehand, since I need the 0s in the file.

A few simple ways to do this:
1 - USE FILTER
filter by HT.
aggregate ....
When you want the full dataset back, use:
filter off.
use all.
2 - COPY DATASET
dataset name orig.
dataset copy foragg.
dataset activate foragg.
select if HT.
aggregate....
3 - TEMPORARY SELECTION
temporary.
select if HT.
aggregate....
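
For readers who want to check the intended result outside SPSS, here is a minimal Python sketch of the quantity the question asks for: a count per lopnr of the rows where HT = 1, computed without dropping the HT = 0 rows. The sample rows and the names ht1_counts and with_n_break are made up for illustration and are not part of the original post.

```python
from collections import Counter

# Hypothetical sample rows: (lopnr, HT). Not taken from the original post.
rows = [(1, 1), (1, 0), (1, 1), (2, 0), (2, 0), (3, 1)]

# Count only the rows where HT == 1, grouped by lopnr (the break variable).
ht1_counts = Counter(lopnr for lopnr, ht in rows if ht == 1)

# Attach the count to every row, so the HT == 0 rows stay in the data.
# Counter returns 0 for an lopnr that has no HT == 1 rows.
with_n_break = [(lopnr, ht, ht1_counts[lopnr]) for lopnr, ht in rows]

for row in with_n_break:
    print(row)
```

Comparing these counts with the SPSS output for a handful of IDs is a quick way to confirm that the filter or selection was active when AGGREGATE ran.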

updating the cells in a partitioned table using .Q.ind[] in q

I have a partitioned table and can read it using a get command as such:
get `:hdb/2018.01.01/trade
and will give me:
sym size exchange
-----------------
0 100 2
1 200 2
1 300 2
I would like to change cell values, for example the size values 200 and 300 to 1000, given an index or list of rows. So I am using
.Q.ind[`:hdb/2018.01.01/trade; 1 2j]
to get the rows and then change the cells. But I am getting a 'rank error when running .Q.ind[].
The error occurs because the first parameter of .Q.ind must be the mapped table itself, not a symbol holding the table's name or location.
I'm not sure .Q.ind is going to help you here, though; it is more useful for retrieving data than for (re)writing it.
A couple of approaches you could take:
Pull in the whole date slice with select from table where date=X, modify it in memory, and then write it back down using `:hdb/2018.01.01/trade/ set delete date from modifiedTable. This assumes you are not modifying any enumerated/symbol columns. You would also have to be careful to keep the same schema, the same compression, and so on.
Use the dbmaint package to handle the changes: https://github.com/KxSystems/kdb/blob/master/utils/dbmaint.md
If you're careful enough, you could pull in only the column itself, modify it, and write it back down: p set @[get p:`:hdb/2018.01.01/trade/col1;1 2;:;1000]
You could also use an amend operation to update the values.
@[`:hdb/2018.01.01/trade;`size;@[;1 2;:;1000]]
This will edit your table on disk.
q)get`:hdb/2018.01.01/trade
sym size exchange
-----------------
0 100 2
1 200 2
1 300 2
q)@[`:hdb/2018.01.01/trade;`size;@[;1 2;:;1000]]
`:hdb/2018.01.01/trade
q)get `:hdb/2018.01.01/trade/
sym size exchange
-----------------
0 100 2
1 1000 2
1 1000 2
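
As a language-neutral illustration of what the amend does, the following Python sketch replaces the values at row indices 1 and 2 of one column and leaves the other columns alone. It works on an in-memory dict of lists standing in for the table and does not read or write a kdb+ database; the helper name amend_at is hypothetical.

```python
def amend_at(column, indices, value):
    """Return a copy of column with value written at each index in indices."""
    updated = list(column)
    for i in indices:
        updated[i] = value
    return updated

# In-memory stand-in for the trade table shown above (one list per column).
trade = {
    "sym": [0, 1, 1],
    "size": [100, 200, 300],
    "exchange": [2, 2, 2],
}

# Only the size column is rewritten; sym and exchange are untouched.
trade["size"] = amend_at(trade["size"], [1, 2], 1000)
print(trade["size"])  # [100, 1000, 1000]
```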

Select cases if value is greater than mean of group

Is there a way to include means of entire variables in Select Cases If syntax?
I have a dataset with three groups n=20 each (sorting variable grp with values 1, 2, or 3) and results of a pre and post evaluation (variable pre and post). I want to select for every group only the 10 cases where the pre value is higher than the mean of that value in the group.
In pseudocode:
select if pre-value > mean(grp)
So if the mean in group 1 is 15, that's what all values from group one cases should be compared to. But at the same time if group 2's mean is 20, that is what values from cases in group 2 should be compared to.
Right now I only see the MEAN(arg1,arg2,...) function in the Select Cases If window, but no possibility to get the mean of an entire variable, much less with an additional condition (like group).
Is there a way to do this with Select Cases If syntax, or otherwise?
You need to create a new variable that contains the mean of the group, so that every row in a group has the same value in this variable (the group mean). You can then compare each row to this value.
First I'll create some example data to demonstrate on:
data list list/grp pre_value .
begin data
1 3
1 6
1 8
2 1
2 4
2 9
3 55
3 43
3 76
end data.
Now you can calculate the group mean and select:
AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK=grp /GrpMean=MEAN(pre_value).
select if pre_value > GrpMean.
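
To check the expected result by hand, here is a small Python sketch of the same two steps on the same example data. It is only an illustration of the logic, not SPSS syntax; the names grp_mean and selected are chosen for this sketch.

```python
from collections import defaultdict

# Same example data as the DATA LIST block: (grp, pre_value).
data = [(1, 3), (1, 6), (1, 8), (2, 1), (2, 4), (2, 9), (3, 55), (3, 43), (3, 76)]

# Step 1: collect values per group and compute each group's mean
# (the role played by AGGREGATE ... /GrpMean=MEAN(pre_value)).
values_by_grp = defaultdict(list)
for grp, pre_value in data:
    values_by_grp[grp].append(pre_value)
grp_mean = {grp: sum(vals) / len(vals) for grp, vals in values_by_grp.items()}

# Step 2: keep only the rows above their own group's mean
# (the role played by SELECT IF pre_value > GrpMean).
selected = [(grp, pre) for grp, pre in data if pre > grp_mean[grp]]
print(selected)  # [(1, 6), (1, 8), (2, 9), (3, 76)]
```

Each row is compared with the mean of its own group, so 55 is dropped from group 3 (mean 58) even though it is larger than every value in groups 1 and 2.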

Issue when creating category charts in Tableau?

I have a table with different columns called "Bajo conocimiento...", "Exceso de...", etc. Each column has discrete values: 1, 2, 3 or empty (NULL). I need to count the number of occurrences of 1, 2 and 3 in each of these variables.
What I did first was to create bins for each variable. Thus I was able to create a graphic shown below.
However, the problem is that one of the variables is always picked automatically as a filter. In the example below this is "Bajo conocimiento..." (the blue one). So the values for "Bajo conocimiento..." are correct, but, for instance, the COUNT("Exceso de...") values for 1, 2 and 3 are the intersections of "Exceso de..." with "Bajo conocimiento...". As a result, the values of "Exceso de..." are lower than they should be.
How can I count the occurrences of 1, 2 and 3 for each variable independently and get the resulting chart?
You will have to create a calculated field for each of your columns.
Create a calculated field with the following formula (a FIXED level-of-detail expression) and call it e.g. [ExcesoCOUNT]:
{fixed [columnname]: COUNT([columnname])}
If you drag the newly created field into your view and change its aggregation from SUM() to AVG(), you will get the count of each value.
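
The goal, counting each column on its own so that a NULL in one column does not remove the row from another column's count, can be sketched in Python as follows. The rows and the short column names bajo and exceso are hypothetical stand-ins for the real data; this is not Tableau syntax.

```python
from collections import Counter

# Hypothetical rows; None stands for an empty (NULL) cell.
rows = [
    {"bajo": 1, "exceso": 2},
    {"bajo": None, "exceso": 2},
    {"bajo": 3, "exceso": None},
    {"bajo": 1, "exceso": 1},
]

def count_values(rows, column):
    """Count each non-null value of one column, ignoring all other columns."""
    return Counter(r[column] for r in rows if r[column] is not None)

bajo_counts = count_values(rows, "bajo")
exceso_counts = count_values(rows, "exceso")
print(dict(bajo_counts))
print(dict(exceso_counts))
```

The second row has no value for bajo but still contributes to the count of 2 for exceso, which is the independence the question asks for.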

JasperReports group by all - display row even if nothing returned

I am trying to get JasperReports to mimic the SQL GROUP BY ALL functionality. I am grouping by MY_BOOL, which can be either 0 or 1, and I am displaying the value and a count of the number of rows in my report. However, I want to display a row for each value, even if there are 0 rows for one of them. So for example, if my query returns ten rows, and MY_BOOL=0 for all ten, I would like to see:
MY_BOOL | COUNT
0 10
1 0
How can I accomplish this in JasperReports?
EDITED:
It sounds like you need 2 variables and no groups.
$V{MY_BOOL_0} counts all rows where $F{MY_BOOL} is 0. $V{MY_BOOL_1} counts all rows where $F{MY_BOOL} is 1. The initial value of each variable is 0, so it doesn't matter if there are zero rows with $F{MY_BOOL} = 0.
Display the 2 variables in the title (or summary or wherever) to provide your 2 "grouped" subtotals.
Note: These 2 variables don't use groups at all, but they would be compatible with groups. For example, you could calculate the MY_BOOL_0 and MY_BOOL_1 values for each month if you group on months.
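
The logic of the two variables can be sketched in Python as two counters that start at 0 and are incremented conditionally. This only illustrates the counting idea, not JasperReports configuration; the ten-row result set is the example from the question.

```python
# Example result set from the question: ten rows, all with MY_BOOL = 0.
my_bool_values = [0] * 10

# Both counters start at 0, mirroring the variables' initial value.
my_bool_0 = 0
my_bool_1 = 0
for value in my_bool_values:
    if value == 0:
        my_bool_0 += 1
    elif value == 1:
        my_bool_1 += 1

# A line is printed for each value, even when its count is 0.
print("MY_BOOL | COUNT")
print(f"0 | {my_bool_0}")
print(f"1 | {my_bool_1}")
```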

Get 2nd Value in Dataset in Reporting Services

This seems to be a very simple question, but I am trying to get the 2nd value in a dataset to display as a matrix's header value.
In this report, let's say that I have 2 datasets. In Dataset1, I have a query that pulls down 3 values for a parameter dropdown selection. In Dataset2, I return a result set and have bound it to my matrix.
Within the matrix, I have my repeating columns, and then 3 additional grouped columns to the right that have aggregate values that I want to display. On the header of those 3 columns, I want to display the 3 values displayed in my Parameters dataset. Within the context of the matrix (and its dataset), I can get the first and last values of a different dataset (Dataset1 in this case) by using:
=First(Fields!DateDisplay.Value, "Dataset1")
=Last(Fields!DateDisplay.Value, "Dataset1")
I need to get something like:
=Second(Fields!DateDisplay.Value, "Dataset1")
How do I pull this off without violating the scoping rules on aggregate columns?
For SSRS 2008 R2, you can do this if each row of your dataset has an identifier column by using the LookUp() function.
=LookUp(1,Fields!Row.Value,Fields!DateDisplay.Value,"Dataset1")
=LookUp(2,Fields!Row.Value,Fields!DateDisplay.Value,"Dataset1")
=LookUp(3,Fields!Row.Value,Fields!DateDisplay.Value,"Dataset1")
If you do not have an identifier column you can use ROW_NUMBER() to build one in.
Query:
SELECT ROW_NUMBER() OVER(ORDER BY DateDisplay) AS Row, DateDisplay
FROM Dates
Results:
Row DateDisplay
--- ---------
1 June 1st
2 March 12th
3 November 15th
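
The idea behind combining ROW_NUMBER() with a lookup can be sketched in Python: number the rows once, then fetch any row by its number. This illustrates the logic only and is not SSRS expression syntax; the helper name look_up is hypothetical.

```python
# DateDisplay values from the example above, in arbitrary order.
date_display = ["November 15th", "June 1st", "March 12th"]

# Number the rows 1..n after ordering by DateDisplay,
# like ROW_NUMBER() OVER (ORDER BY DateDisplay). Plain string sort here.
numbered = {row: value for row, value in enumerate(sorted(date_display), start=1)}

def look_up(row_number):
    """Return the value for a row number, or None when there is no match."""
    return numbered.get(row_number)

print(look_up(2))  # March 12th
```

Because the numbering follows the string ordering of DateDisplay, the "second" value is the second one alphabetically, not chronologically; a different ORDER BY would be needed if date order matters.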
Here is a link to a similar thread in MSDN Forums: Nth row element in a dataset SSRS
If you are using SSRS 2012 or 2014, use the expression below.
=LookUp(AnyRowNumber, Fields!RowNumber.Value,Fields!DisplayField.Value,"DatasetName")
I tried the approach above, but it did not work in my case.