Creating a list of all variables except some specific ones - macros

I have multiple SPSS files, each with a different number of variables (col1, col2, ... col150). I am trying to write common code that restructures each file using VARSTOCASES. I need to KEEP 3 variables (col1, col34, col66); these are common to all files, but the remaining variables differ. I know the usual way, which is to list all the remaining variables in the MAKE subcommand, as shown below:
VARSTOCASES
/MAKE VariableName1 FROM Col1 Col2 Col3 ....etc(except 3)
/INDEX=VariableName(VariableName1)
/KEEP=Col1 Col34 Col66
Instead of this, I want to build the variable list with the SPSSINC SELECT VARIABLES command. I have the idea but no examples for it. The selection should be short and should dynamically pick all variables except these 3 (Col1 Col34 Col66), because my SPSS files share these 3 variables but otherwise contain different variables, and a different number of them.
If I have a variable list (dynamically generated by excluding the 3), I can reference it in the MAKE subcommand. Can anyone please help?

One way to go about this could be to rename these specific columns and then select all other variables that start with "col":
rename variables (col1 col34 col66=var1 var34 var66).
spssinc select variables MACRONAME = "!allCOL"
/PROPERTIES PATTERN="Col*".
Now all variables with names starting with "Col" are in the list called "!allCOL" which you can use in your syntax, for example:
VARSTOCASES
/MAKE VariableName1 FROM !allCOL /INDEX=VariableName(VariableName1) .
EDIT: another solution
The solution above is valid only if all the variables you want on the list follow a constant naming pattern. If that is not the case, the following solution lets you name the variables that you don't want on the list and put all the rest on it.
* first we define a new attribute in which we mark the
variables we don't want on the list.
VARIABLE ATTRIBUTE VARIABLES=Car_Model_1 Car_Model_2
ATTRIBUTE=IncludeInMake ("no").
* now we create the list, leaving out the unwanted variables.
spssinc select variables MACRONAME = "!forMake"
/ATTRVALUES NAME=IncludeInMake VALUE="".
VARSTOCASES /MAKE Val FROM !forMake /INDEX=var(val) .
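The exclude-list idea can also be sketched outside SPSS. The following is plain Python, not SPSS syntax, and only illustrates the selection logic: it assumes the variable names are already available as a list in file order. The function name build_varstocases and the sample variable names are hypothetical.

```python
def build_varstocases(all_vars, keep, target="VariableName1", index="VariableName"):
    """Return VARSTOCASES syntax that stacks every variable not listed in keep."""
    # SPSS variable names are case-insensitive, so compare in lower case.
    keep_lower = {name.lower() for name in keep}
    make_vars = [v for v in all_vars if v.lower() not in keep_lower]
    if not make_vars:
        raise ValueError("nothing left to restructure after excluding keep variables")
    return "\n".join([
        "VARSTOCASES",
        f"  /MAKE {target} FROM {' '.join(make_vars)}",
        f"  /INDEX={index}({target})",
        f"  /KEEP={' '.join(keep)}.",
    ])


# Hypothetical file with five variables, three of which must be kept.
names = ["Col1", "Col2", "Col34", "Col50", "Col66"]
syntax = build_varstocases(names, ["Col1", "Col34", "Col66"])
print(syntax)
```

The generated text has the same shape as the hand-written command in the question; only the MAKE list changes from file to file.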

Related

Keeping original variable order with 'select variables' command

The following code was used to generate a list of numeric variables and their maxima and minima from a datafile containing >500 variables and >2000 cases:
OMS select tables
/if commands=["descriptives"]
subtypes=["descriptive statistics"]
/DESTINATION FORMAT = SAV
OUTFILE = "C:\statyMcStatFace.sav".
SPSSINC SELECT VARIABLES MACRONAME="!nums" /PROPERTIES TYPE= NUMERIC.
DESCRIPTIVES !nums /STATISTICS=MIN MAX.
omsend.
Sadly, the variables weren't listed in the output file in the same order as in the original file, nor in any other order I can discern. For example, if you run the given code on plantar_fascitiitis.csv at
kaggle.com/rameessahlu/plantar-fasciitis
you'll find that the order of the variables in the original table is age, sex, weight, etc., while the order in which the variables are listed in the macro is Status, TendernessOfFoot, Alignment, Burning, etc. Why does this happen, and is there a way for me to order the variables as they are in the original table?
When you are creating your numerical variables list using the select variables command, there is an option to keep the created list in the original order of the dataset. So all you have to do is use the command with this addition:
SPSSINC SELECT VARIABLES MACRONAME="!nums" /PROPERTIES TYPE= NUMERIC /OPTIONS ORDER=FILE.

how to use each to pass sets of variables to a function in q

I have a working function that performs a delete, taking a directory, a table, and a column as arguments:
Delete1[dir,t,c]
Another function, which also works, returns a set of directories:
Paths[dir]
Now I am trying to combine the two, using something like "each" to pass every directory returned by Paths[dir] to the Delete1 function. I am trying something like this:
Delete1 each (Paths[dir];t;c)
The syntax does not quite work.
You want to use projection. Supplying only the second and third arguments to the Delete1 function creates a new function with just one argument. You can then use each between the projection and Paths:
Delete1[;t;c] each Paths[dir]
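You could also use dot apply with each for this; you can read more about dot apply here https://code.kx.com/q/ref/unclassified/#apply. Each item handed to .' must be a complete argument list, so pair every path with t and c first (this assumes Paths[dir] returns a list of atoms such as symbols). It would look like the following:
Delete1 .' Paths[dir],\:(t;c)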
Note: if you're using this delete function to delete a column from a table in every partition, you only need to delete it from the .d file in the last partition (as in your previous question, "soft deleting a column from a table in q").
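For readers more familiar with other languages, projection followed by each behaves much like partial application followed by a map. The sketch below is plain Python, not q; delete1 and paths are hypothetical stand-ins for Delete1 and Paths, and the table and column names are made up.

```python
from functools import partial


def delete1(directory, t, c):
    """Hypothetical stand-in for Delete1: report what would be deleted."""
    return f"delete {c} from {t} in {directory}"


def paths(root):
    """Hypothetical stand-in for Paths: return the partition directories."""
    return [f"{root}/2020.01.01", f"{root}/2020.01.02"]


# Fix t and c and leave the directory open (the analogue of Delete1[;t;c]).
delete_in = partial(delete1, t="trade", c="price")

# Apply the one-argument function to every directory (the analogue of each).
results = [delete_in(p) for p in paths("/db")]
print(results)
```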

Conditional processing in SPSS

I would like to conditionally process blocks of syntax where the condition is based on the active data set.
Within an SPSS macro, you can conditionally process a block of syntax using the !IF/!IFEND macro command. However, as far as I can tell, the user is required to explicitly give a value to the flag by either using the !LET command (!LET !FLAG = 1), or by using a Macro input variable. This is wildly different from my experience with other languages, where I can write code that has branching logic based on the data I'm working with.
Say that there is a block of syntax that I only want to run if there are at least 2 records in the active data set. I can create a variable in the data set which is equal to the number of records using the AGGREGATE function, but I can't find a way to make a macro variable equal to that value in a way that is usable as a !IF condition. Below is a very simple version of what I'd like to do.
COMPUTE DUMMY=1.
AGGREGATE
/OUTFILE = * MODE = ADDVARIABLES
/BREAK DUMMY
/NUMBER_OF_CASES = N.
!LET !N_CASES = NUMBER_OF_CASES.
!IF (!N_CASES > 1) !THEN
MEANS TABLES = VAR1 VAR2 VAR3.
!IFEND
Is what I'm attempting possible? Thanks in advance for your time and consideration.
Following is a way to put a value from the dataset into a macro, which you can then use wherever you need - including in another macro.
First we'll make a little dataset to recreate your example:
data list free/var1 var2 var3.
begin data
1 1 1 2 2 2 3 3 3
end data.
* this will create the number of cases value:
AGGREGATE /OUTFILE = * MODE = ADDVARIABLES /BREAK /NUMBER_OF_CASES = N.
Now we can send the value into a macro - by writing a separate syntax file with the macro definition.
do if $casenum=1.
write out='SomePath\N_CASES.sps' /"define !N_CASES() ", NUMBER_OF_CASES, " !enddefine.".
end if.
exe.
insert file='SomePath\N_CASES.sps'.
The macro is now defined and you can use the value in calculations (e.g. if you want to use it for analysis of a different dataset, or later in your syntax when the current data is no longer available).
For example:
compute just_checking= !N_CASES .
You can also use it in your macro as in your example - you'll see that the new macro can't read the !N_CASES macro as is, that's why you need the !eval() function:
define !cond_means ()
!IF (!eval(!N_CASES) > 1) !THEN
MEANS TABLES = VAR1 VAR2 VAR3.
!IFEND
!enddefine.
Now running the macro will produce nothing if there is just one line in your data, and will run means if there was more than one line:
!cond_means.
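The write-then-insert step is a small piece of code generation: a value computed from the data is written into a file as a macro definition, and that file is then included. The sketch below shows the same idea in plain Python rather than SPSS syntax. The helper name write_macro_file is hypothetical, and the exact spacing SPSS's WRITE produces for the numeric value may differ from what is shown here.

```python
import os
import tempfile


def write_macro_file(path, macro_name, value):
    """Write a one-line SPSS macro definition that expands to value."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(f"define {macro_name}() {value} !enddefine.\n")


# The three cases from the example data above.
rows = [(1, 1, 1), (2, 2, 2), (3, 3, 3)]

with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "N_CASES.sps")
    write_macro_file(target, "!N_CASES", len(rows))
    with open(target, encoding="utf-8") as handle:
        generated = handle.read()

print(generated)
```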

Expression to look up certain character and store it in SSIS variable

In SSIS 2008 I have a variable called @[User::EANcode]. It contains a string with a product EAN code like '1234567891123'. The value is derived from a filename like '1234567891123.jpg' via a foreach loop.
However, sometimes the filenames contain an extra '_1', '_2' etc. at the end like '1234567891123_1.jpg' resulting in a value '1234567891123_1' in the EANcode variable.
This happens when there is more than one image for the same EANcode (product). The _N addition is always a number and it is always at the end of the name/string.
What is the expression to find/catch the '_1' (or _2 or _N etc.) so you can store it in another variable called @[User::Addition]?
If there is no addition, the variable stays empty which is fine.
The reason I need to get this _N addition into a separate variable is that I later on need it to rename the filename but paste the addition back at the end.
Thanks!
I think you're looking for CHARINDEX() in conjunction with SUBSTRING(). With that, you can split off that _# to another variable like this (copy, paste, and execute to see; play with the @temp1 variable to see the limitations of the code):
declare @temp1 varchar(20), @temp2 varchar(20)
set @temp1 = '1234567891123_12'
IF CHARINDEX('_', @temp1) > 1
set @temp2 = SUBSTRING(@temp1, CHARINDEX('_', @temp1), LEN(@temp1) - CHARINDEX('_', @temp1) + 1)
select @temp1, @temp2
Hope it helps!
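The same find-and-split logic, sketched in plain Python (not an SSIS expression and not T-SQL), may make the edge cases easier to see. The function name split_ean is hypothetical; it returns an empty addition when the name contains no underscore, matching the "stays empty" requirement in the question.

```python
def split_ean(name):
    """Split an EAN string into (code, addition) at the first underscore."""
    position = name.find("_")  # -1 when there is no underscore
    if position > 0:
        # Keep the underscore with the addition so it can be pasted back later.
        return name[:position], name[position:]
    return name, ""


print(split_ean("1234567891123_12"))
print(split_ean("1234567891123"))
```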

Matching Files in SPSS using Table or In

I keep running into an error when trying to add variables of one spss file to another. File 1 has 1.800.000 cases [payments], File 2 has 800.000 cases [recipients]. They both have an ID number to match cases on.
For every payment in File 1 I want to add the recipient, from File 2. The recipients should thus be able to match for multiple payments.
These are the two pieces of code I have tried, neither of which works:
code using IN
DATASET ACTIVATE DataSet1.
SORT CASES BY recipientid(A).
DATASET ACTIVATE DataSet2.
SORT CASES BY recipientid(A).
Match Files /File=DataSet1
/In=DataSet2
/BY globalrecipientid.
execute
When I use /In I don't get any errors, but the files don't match properly, since it doesn't add any variables.
code using TABLE
DATASET ACTIVATE DataSet1.
SORT CASES BY recipientid(A).
DATASET ACTIVATE DataSet2.
SORT CASES BY recipientid(A).
Match Files /File=DataSet1
/TABLE=DataSet2
/BY globalrecipientid.
execute
When I use /TABLE I get the following error:
Warning # 5132
Undefined error #5132 - Cannot open text file 'S:\Progra~1\spss\IBM\SPSS\STATIS~1\20\lang\en\spss.err": No such file or directory
I have run out of tricks, wouldn't dare try this in Ruby, and Excel sadly is too small to handle this. Any thoughts?
Your first solution is wrong because you are using the IN subcommand incorrectly. In other words, you are matching Dataset1 with nothing.
IN creates a new variable in the resulting file that indicates whether
a case came from the input file named on the preceding FILE
subcommand.
In your second solution, you are sorting the datasets by the variable recipientid, but the match is done by the variable globalrecipientid. Why do you sort by one variable but match by another? This could be a problem. And dataset names should be in quotes.
Solution 1:
DATASET ACTIVATE DataSet1.
SORT CASES BY recipientid (A).
DATASET ACTIVATE DataSet2.
SORT CASES BY recipientid (A).
Match Files
/File = "DataSet1"
/TABLE = "DataSet2"
/BY recipientid.
execute.
Solution 2. I never liked the implementation of datasets in SPSS and did not trust them. Another solution is to save the datasets as files and match the files.
get file="file1.sav".
SORT CASES BY recipientid (A).
save outfile="file1s.sav".
get file="file2.sav".
SORT CASES BY recipientid (A).
save outfile="file2s.sav".
Match Files
/File = "file1s.sav"
/TABLE = "file2s.sav"
/BY recipientid.
execute.
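What the question asks for is a many-to-one lookup: every payment keeps its own row and picks up the fields of its recipient. The sketch below models that behaviour in plain Python, not SPSS syntax. The field names paymentid, amount and name and the sample values are hypothetical; only recipientid comes from the question.

```python
payments = [
    {"paymentid": 1, "recipientid": 10, "amount": 50},
    {"paymentid": 2, "recipientid": 10, "amount": 75},
    {"paymentid": 3, "recipientid": 20, "amount": 20},
    {"paymentid": 4, "recipientid": 99, "amount": 5},  # no matching recipient
]
recipients = [
    {"recipientid": 10, "name": "Alpha"},
    {"recipientid": 20, "name": "Beta"},
]

# Index the lookup table by its key; each key may appear only once here.
lookup = {r["recipientid"]: r for r in recipients}

merged = []
for payment in payments:
    match = lookup.get(payment["recipientid"])
    row = dict(payment)
    # A payment without a recipient keeps its row and gets a missing value.
    row["name"] = match["name"] if match else None
    merged.append(row)

for row in merged:
    print(row)
```

The result has exactly one row per payment, and the same recipient can be attached to several payments.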
My syntax looks somewhat different:
DATASET ACTIVATE DatenSet1.
MATCH FILES /FILE=*
/FILE='DatenSet2'
/RENAME VarsToRename
/BY ID
/DROP= Vars.
EXECUTE.
Maybe this helps?