Stata and global variables - global

I am working with Stata.
I have a variable called graduate_secondary.
I define a global macro called outcome, because eventually I will use another outcome.
Now I want to replace the variable graduate if a condition involving the global is met, but I get an error.
My code is:
global outcome "graduate_secondary"
gen graduate=.
replace graduate=1 if graduate_primary==1 & `outcome'==1
But I receive the error "==1 invalid name".
Does anyone know why?

Something along those lines might work (using a reproducible example):
sysuse auto, clear
global outcome "rep78"
gen graduate=.
replace graduate=1 if mpg==22 & $outcome==3
(2 real changes made)
In your example, the following should work:
replace graduate=1 if graduate_primary==1 & $outcome==1

Another solution is to replace global outcome "graduate_secondary" with local outcome "graduate_secondary".
Stata has two types of macros: global, which are accessed with a $, and local, which are accessed with single quotes `' around the name -- as you did in your original code.
You get an error message because no local named outcome has been defined in your workspace. By design, referencing an undefined macro does not itself produce an error; the reference simply evaluates to an empty string. Your condition therefore becomes graduate_primary==1 & ==1, which is what Stata complains about. You can see what a macro reference evaluates to by using display as follows. You can also list all of the macros in your workspace with macro dir (the locals start with an underscore):
display "`outcome'"
display "$outcome"
Here is a blog post about using macros in Stata. In general, I only use global macros when I have to pass something between multiple routines, but this seems like a good use case for locals.
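To make the substitution step concrete, here is a minimal Python sketch that treats macro expansion as plain text replacement. It is a toy model, not Stata's actual parser: the function expand_macros and the two dictionaries are hypothetical names invented for illustration, and nested or more complex macro syntax is ignored.

```python
import re

def expand_macros(line, local_macros, global_macros):
    """Toy model of macro expansion as text substitution (not Stata's real parser).

    An undefined macro expands to an empty string instead of raising an error.
    """
    # `name' refers to a local macro
    line = re.sub(r"`(\w+)'", lambda m: local_macros.get(m.group(1), ""), line)
    # $name refers to a global macro
    line = re.sub(r"\$(\w+)", lambda m: global_macros.get(m.group(1), ""), line)
    return line

global_macros = {"outcome": "graduate_secondary"}
local_macros = {}  # no local named outcome was ever defined

cmd_local = "replace graduate=1 if graduate_primary==1 & `outcome'==1"
cmd_global = "replace graduate=1 if graduate_primary==1 & $outcome==1"

print(expand_macros(cmd_local, local_macros, global_macros))
# replace graduate=1 if graduate_primary==1 & ==1
print(expand_macros(cmd_global, local_macros, global_macros))
# replace graduate=1 if graduate_primary==1 & graduate_secondary==1
```

The first result shows where the dangling ==1 in the error message comes from: the local reference vanished and left the comparison without a left-hand side.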

Using variables defined in R environment within knitr

I'd like to use knitr to produce a document which includes numbers, tables, and figures defined separately in R. I have a separate .R script within which I define the variables of interest, and I have executed this script and verified the variables are in memory. Then, within a .Rmd file I have the R markup code, within which I attempt to display the variables I've defined in R. I keep getting error messages whenever I attempt to knit, stating something like:
Error in unique(c("AsIs", oldClass(x))) : object "lamdaQ6" not found...
Clearly the knitr process initiates a new environment which excludes already defined variables. I have rather extensive R code to define the variables I want to include in the document, which I want to keep separate from the R markup code (both for clarity, and because R markup is not a good development environment for R).
Is there some means of preserving awareness of existing variables in R memory within knitr? I have searched extensively and not found a solution, probably because I don't know the correct term.
The solution offered by Paul Roub worked. I thought it didn't, but this turned out to be due to my own typo.
A cleaner solution for my needs, rather than source the whole R file, was to use "save" to save just the objects I needed to create the document. I added a line to save the objects at the end of the .R file, then used "load" to load the objects at the top of the .Rmd script.
So, at the end of the .R file, to save objects XX and YY, I have:
save(XX,YY,file="Data4Rmd.RData")
Then, near the beginning of the .Rmd file, to import these objects, I have:
```{r, echo=FALSE}
setwd("c:/CurrentProjectDirectory")
load(file="Data4Rmd.RData")
```
This allows me to later use statements like:
Decay rate is lambda = `r XX`, and mean lifetime is `r YY`.
Yet another alternative might have been to save and load the whole workspace, but this seemed excessive.
Thanks for the assistance Paul, much appreciated :)

Stata: Edit each element of a global (which contains a list of variables)

This is probably just a short syntax question. I have:
clear all
macro drop _all
global variables var1 var2
and I want
global means m_var1 m_var2
which I have generated elsewhere. The goal is to use both globals in a Mundlak regression (like reg depvar $variables $means) without having to calculate/include the means by hand for different specifications. My idea was something along the lines of:
global means "m_`variables'"
but that simply ignores the variables global. Again, sorry for the R-think...
Edit: My strategy: I am trying to write a program which runs models (Mundlak/Chamberlain random effects logit, see Wooldridge's panel book, 2nd ed., p. 487) on several distinct lists of variables and returns graphs of regression results. This should be done such that I only have to change the globals/locals specifying these variables at the beginning. Thus, I need code that creates time averages of the variables in the globals and uses these and the original global in the logit specification.
I'm not convinced your general strategy is a good one, but I don't have information on the issue you face, so I won't comment much more.
I'll state that using locals is a better idea if you can spare the globals, and that you can redefine the contents of a macro using a loop:
clear all
set more off
local variables var1 var2
// original attempt: prefixes only the first name (m_var1 var2)
local means "m_`variables'"
// loop: prefix every name in the list
local means2
foreach v of local variables {
    local means2 `means2' m_`v'
}
display "`means'"
display "`means2'"
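The difference between the two results is easier to see when the list is treated as a string of space-separated tokens. The following Python sketch mirrors the idea only; it is not Stata code, and the names naive and means are chosen for illustration.

```python
variables = "var1 var2"

# Prefixing the whole string touches only the first token.
naive = "m_" + variables

# Prefixing token by token, as the foreach loop does.
means = " ".join("m_" + v for v in variables.split())

print(naive)  # m_var1 var2
print(means)  # m_var1 m_var2
```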

combine compute, loop and concat

I would like to run the following syntax on lots of variables. Thus, I'd like to loop over a bunch of variables.
The syntax is the following:
compute v3a_mit = v3a.
recode v3a_mit
(-9998=2) (sysmis=9).
exe.
In this case, however, the syntax only concerns the variable "v3a". I have some other variables (v3b, v3c, v3d...) for which I would like to execute this syntax.
So, the loop should look like this.
DO REPEAT X=v3a to v3z
compute concat(X,'_mit') = X.
recode concat(X,'_mit')
(-9998=2) (sysmis=9).
exe.
END REPEAT.
So, within the loop, new variables shall be created which get a new name depending on the variable which is executed in the loop. The "SHIFT VALUES VARIABLE" command would be ideal (with shift=0) but this command cannot be used within a loop. Unfortunately "compute concat(X,'_mit')" does not work either.
CONCAT is a function for manipulating the values of string variables. So you can't use it for defining a variable name.
However, you can make use of the !CONCAT function inside an SPSS macro.
You can use the following macro to recode a set of variables.
DEFINE !recodeVars (vars = !CMDEND)
* for every variable in the 'vars' list do the RECODE INTO command.
!DO !var !IN (!vars)
RECODE !var (-9998=2) (sysmis=9) INTO !CONCAT(!var, '_mit').
!DOEND
!ENDDEFINE.
* macro call.
!recodeVars vars = v3a v3b v3c v3d.
Here, I used the RECODE INTO command, instead of one COMPUTE and a following RECODE command. But of course the principle of how to use the !CONCAT command would be the same for the COMPUTE operation.
However, you can't call the macro like this: !recodeVars vars = v3a TO v3z. In that case the macro would try to perform the RECODE for the variables "v3a", "TO" and "v3z". You have to call this macro with the whole list of variables you want to recode.
This might be a lot of typing. As an easy way to avoid the typing, you could produce an SPSS command via the SPSS menu (for example Analyze -> Descriptive Statistics -> Frequencies). Then select the variables you want to recode (select the first variable, hold the SHIFT key and select the last variable) and press the Paste button. The FREQUENCIES command with the list of variables will be pasted into your syntax. You can now copy and paste the variable list into your macro call.
If you have the Python integration plugin installed, you could also use this small Python block to retrieve the varlist between two variables.
BEGIN PROGRAM.
import spss,spssaux
variables = 'v3a to v3z' #Specify variables
varlist = spssaux.VariableDict().expand(variables)
spss.SetMacroValue('!varlist', ' '.join(varlist))
END PROGRAM.
This creates a macro named "!varlist" which expands to the list of variables when called.
You can now call the "!recodeVars" macro the following way: !recodeVars vars = !varlist.
If you don't have the python plugin installed (and don't want to use manual typing or copy and pasting) you can get the full variable list with the use of the "!DefList" macro from Raynald's SPSS Tools.
By the way, you can also make use of a macro for the SHIFT VALUES command.
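If the variable names follow a regular pattern, the list (or the syntax itself) can also be generated as plain text outside SPSS and pasted in. The sketch below is ordinary Python and does not use the spss module. It assumes, purely for illustration, that the variables are named v3 followed by a single letter from a to z and that all of them exist in the data; recode_syntax is a hypothetical helper name.

```python
import string

def recode_syntax(variables, suffix="_mit"):
    """Build one RECODE ... INTO command per variable, as text to paste into SPSS."""
    lines = []
    for var in variables:
        lines.append(f"RECODE {var} (-9998=2) (sysmis=9) INTO {var}{suffix}.")
    lines.append("EXECUTE.")
    return "\n".join(lines)

# Assumed naming pattern: v3a, v3b, ..., v3z
variables = [f"v3{letter}" for letter in string.ascii_lowercase]

# The list for the macro call, and the generated syntax for the first four variables.
print("!recodeVars vars = " + " ".join(variables) + ".")
print(recode_syntax(variables[:4]))
```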

At which lines in my MATLAB code a variable is accessed?

I am defining a variable at the beginning of my source code in MATLAB. Now I would like to know which lines this variable affects. In other words, I would like to see all lines in which that variable is read. This includes not only all accesses in the current function, but also possible accesses in sub-functions that take this variable as an input argument. That way, I can quickly see where a change to this variable has an effect.
Is there any possibility to do so in MATLAB? A graphical marking of the corresponding lines would be nice but a command line output might be even more practical.
You can always use "Find Files" to search for a certain keyword or expression. In my R2012a/Windows version it is under Edit > Find Files..., with the keyboard shortcut [CTRL] + [SHIFT] + [F].
The result will be a list of lines where the searched string is found, in all the files found in the specified folder. Please check out the options in the search dialog for more details and flexibility.
Later edit: thanks to @zinjaai, I noticed that @tc88 wants the tool to track the effect of the variable inside the functions/subfunctions as well. I think that:
1. This is very difficult to achieve. Running through all the possible values and branching on every possible conditional expression is... well, hard. I think it is halting-problem hard.
2. In 90% of cases, the assumption that the output of a function is influenced by its input holds. The input and the output are part of the same statement (assigning the result of a function), so looking for where the variable is used as an argument should suffice to identify which output variables are affected.
3. There are perverse cases where functions alter arguments of handle type (because the argument is not copied, but referenced). This side effect breaks assumption 2 and is one of the main reasons for 1. Identifying when these side effects take place is, again, hard, and it is better to assume that all such arguments are modified.
4. Some other cases are inherently undecidable, because they depend not on the state of the computer but on the state of the "outside world". Example: suppose one calls uigetfile. The function returns a char when the user selects a file, and a double when the user chooses not to select a file. Obviously the two cases will be treated differently. How could you know which variables are created/modified before the user decides?
In conclusion: I think that human intuition, plus the MATLAB Debugger (for run time), and the Find Files (for quick search where a variable is used) and depfun (for quick identification of function dependence) is way cheaper. But I would like to be wrong. :-)
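For a command-line version of the same kind of search, a small script can list every line in a folder's .m files where the name appears as a whole word. This is only a text search, in the spirit of Find Files; it does no data-flow analysis and will also match the name inside comments and strings. The function name find_uses and the demo file are hypothetical.

```python
import re
import tempfile
from pathlib import Path

def find_uses(folder, name):
    """Return (file name, line number, line text) for each whole-word match of name."""
    pattern = re.compile(r"\b" + re.escape(name) + r"\b")
    hits = []
    for path in sorted(Path(folder).rglob("*.m")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path.name, lineno, line.strip()))
    return hits

# Demo on a throwaway folder containing one made-up MATLAB file.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "demo.m").write_text(
        "alpha = 0.5;\n"
        "alphabet = 'abc';\n"
        "y = helper(alpha, 3);\n",
        encoding="utf-8",
    )
    hits = find_uses(folder, "alpha")

for name, lineno, line in hits:
    print(f"{name}:{lineno}: {line}")
```

The word-boundary pattern keeps alphabet from being reported as a use of alpha, which a plain substring search would not.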

How does scoping in Matlab work?

I just discovered (to my surprise) that calling the following function
function foo()
    if false
        fprintf = 1;
    else
        % do nothing
    end
    fprintf('test')
gives an error: Undefined function or variable "fprintf". My conclusion is that the scope of variables is determined before runtime (in my limited understanding of how interpretation of computer languages, and specifically MATLAB, works). Can anyone give me some background information on this?
Edit
Another interesting thing I forgot to mention above is that
function foo()
    if false
        fprintf = 1;
    else
        % do nothing
    end
    clear('fprintf')
    fprintf('test')
produces Reference to a cleared variable fprintf.
MATLAB parses the function before it's ever run. It looks for variable names, for instance, regardless of the branching that activates (or doesn't activate) those variables. That is, scope is not determined at runtime.
ADDENDUM: I wouldn't recommend doing this, but I've seen a lot of people doing things with MATLAB that I wouldn't recommend. But... consider what would happen if someone were to define their own function called "false". The pre-runtime parser couldn't know what would happen if that function were called.
It seems that the first time the MATLAB JIT compiler parses the m-file, it identifies all variables declared in the function. It doesn't seem to care whether said variable is being declared in unreachable code. So your local fprintf variable immediately hides the builtin function fprintf. This means that, as far as this function is concerned, there is no builtin function named fprintf.
Of course, once that happens, every reference within the function to fprintf refers to the local variable, and since the variable never actually gets created, attempting to access it results in errors.
Clearing the variable simply clears the local variable if it exists; it does not bring the builtin function back into scope.
To call a builtin function explicitly, you can use the builtin function.
builtin( 'fprintf', 'test' );
The line above will always print the text at the MATLAB command line, irrespective of local variables that may shadow the fprintf function.
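For readers who know Python, a closely analogous rule may help build intuition. This is an analogy only and says nothing about MATLAB's internals. In Python, any name assigned anywhere in a function body is treated as local to the whole function, even if the assignment sits in a branch that never runs. The functions foo and bar below are hypothetical stand-ins for the example in the question.

```python
import builtins

def foo():
    if False:
        print = 1  # never executed, yet it makes `print` local to foo
    print("test")  # fails: the local name exists but has no value

try:
    foo()
    outcome = "no error"
except UnboundLocalError as err:
    outcome = type(err).__name__

builtins.print(outcome)  # UnboundLocalError

def bar():
    if False:
        print = 1
    # Reaching the built-in explicitly sidesteps the shadowing local name.
    builtins.print("test")

bar()  # prints: test
```

Here builtins.print plays the role that builtin('fprintf', ...) plays in the MATLAB answer.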
Interesting situation. I doubt if there is detailed information available about how the MATLAB interpreter works in regard to this strange case, but there are a couple of things to note in the documentation...
The function precedence order used by MATLAB places variables first:
Before assuming that a name matches a function, MATLAB checks for a variable with that name in the current workspace.
Of course, in your example the variable fprintf doesn't actually exist in the workspace, since that branch of the conditional statement is never entered. However, the documentation on variable naming says this:
Avoid creating variables with the same name as a function (such as i, j, mode, char, size, and path). In general, variable names take precedence over function names. If you create a variable that uses the name of a function, you sometimes get unexpected results.
This must be one of those "unexpected results", especially when the variable isn't actually created. The conclusion is that there must be some mechanism in MATLAB that parses a file at runtime to determine what possible variables could exist within a given scope, the net result of which is functions can still get shadowed by variables that appear in the m-file even if they don't ultimately appear in the workspace.
EDIT: Even more baffling is that functions like exist and which aren't even aware of the fact that the function appears to be shadowed. Adding these lines before the call to fprintf:
exist('fprintf')
which('fprintf')
Gives this output before the error occurs:
ans =
5
built-in (C:\Program Files\MATLAB\R2012a\toolbox\matlab\iofun\fprintf)
Indicating that they still see the built-in fprintf.
These may provide insight:
https://www.mathworks.com/help/matlab/matlab_prog/base-and-function-workspaces.html
https://www.mathworks.com/help/matlab/matlab_prog/share-data-between-workspaces.html
This can give you some info about what is shadowed:
which -all
(Below was confirmed as a bug)
One gotcha is that Workspace structs, and classes on the path, have particular scoping and type precedence that (if you are me) may catch you out.
E.g. in 2017b:
% In C.m, saved in the current directory
classdef C
    properties (Constant)
        x = 100;
    end
end
% In Command window
C.x = 1;
C.x % 100
C.x % 1 (Note the space)
C.x*C.x % 1
disp(C.x) % 1