Esttab: Append rtf files with page break?

I use a loop to append each regression table for various dependent variables into one file:
global all_var var1 var2 var3
foreach var of global all_var {
    capture noisily: eststo mod0: reg `var' i.female
    capture noisily: eststo mod1: reg `var' i.female
    capture noisily: eststo mod2: reg `var' i.female
    esttab mod0 mod1 mod2 using "file_name.rtf", append
}
However, in the final rtf file some tables stretch over two pages, which does not look good.
Is there any way to avoid that, e.g. by introducing some sort of page break?

The community-contributed package rtfutil provides a solution:
net describe rtfutil, from(http://fmwww.bc.edu/RePEc/bocode/r)
TITLE
'RTFUTIL': module to provide utilities for writing Rich Text Format (RTF) files
DESCRIPTION/AUTHOR(S)
The rtfutil package is a suite of file handling utilities for
producing Rich Text Format (RTF) files in Stata, possibly
containing plots and tables. These RTF files can then be opened
by Microsoft Word, and possibly by alternative free word
processors. The plots can be included by inserting, as linked
objects, graphics files that might be produced by the graph
export command in Stata. The tables can be included by using the
listtex command, downloadable from SSC, with the handle() option.
The exact syntax will depend on your specific use case, for which you do not provide any example data.

After installing rtfutil, you may use rtfappend. Suppose you want a page break between mod1 and mod2.
esttab mod0 mod1 using "file_name.rtf", replace
tempname handle
rtfappend `handle' using "file_name.rtf", replace
file write `handle' "\page" _n
rtfclose `handle'
esttab mod2 using "file_name.rtf", append
If you want a line break, just replace \page with \line.
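If you want every table produced by the loop in the question to start on a new page, the same idea can be folded into the loop. The following is an untested sketch that only reuses the commands shown above (the file name and models are from the question, and rtfutil must be installed):
global all_var var1 var2 var3
local first = 1
foreach var of global all_var {
    capture noisily: eststo mod0: reg `var' i.female
    capture noisily: eststo mod1: reg `var' i.female
    capture noisily: eststo mod2: reg `var' i.female
    if `first' {
        // first table starts the file
        esttab mod0 mod1 mod2 using "file_name.rtf", replace
        local first = 0
    }
    else {
        // insert a page break before appending the next table
        tempname handle
        rtfappend `handle' using "file_name.rtf", replace
        file write `handle' "\page" _n
        rtfclose `handle'
        esttab mod0 mod1 mod2 using "file_name.rtf", append
    }
}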

Compare JSON Files in Beyond Compare

How can I compare two minified JSON files in Beyond Compare? Is there a built-in file format for JSON? I'm looking to compare two pretty-printed representations of the underlying JSON objects.
In this thread a representative says:
While not in the box yet, we do have a JSON sorted format available for download in our Additional File Formats section:
With a link to Scooter Software Downloads
You can achieve this specialized diff functionality by defining a new file format conversion rule in Beyond Compare. This example was done on Windows.
Step 0: Create a Python conversion script to render the formatted JSON. Save the following Python script somewhere on your hard drive:
import json
import sys

sourceFile = sys.argv[1]
targetFile = sys.argv[2]

with open(sourceFile, 'r') as file_r:
    # Load json data
    data = json.load(file_r)
    # Write formatted json data
    with open(targetFile, 'w') as file_w:
        json.dump(data, file_w, indent=4)
Step 1: Navigate in the Beyond Compare menu to: Tools --> File Formats...
Step 2: Create new file format entry by clicking on the + button and select Text Format
Step 3: Enter *.json into the file format's Mask field, and any description that will help you recall the file format's purpose.
Step 4: Define the file format's conversion settings. Select the Conversion tab and select External program (unicode filenames) from the pull down.
In the Loading field, write the following shell command:
python C:\Source\jsonPrettyPrint.py "%s" "%t"
Step 5: Press the Save button and optionally rename the file format by right clicking it in the File Formats Name and Mask table.
Further specializations of the JSON dumping could be considered by looking at the Python documentation, e.g. sort_keys=True.
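For example, if you also want the keys sorted, so that two files with a different key order compare as equal, only the final line of the script above needs to change:
json.dump(data, file_w, indent=4, sort_keys=True)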

(SAS) Concatenate multiple files from different folders

I'm a relatively new SAS user, so please bear with me!
I have 63 folders that each contain a uniquely named xls file, all containing the same variables in the same order. I need to concatenate them into a single file. I would post some of the code I've tried but, trust me, it's all gone horribly awry and is totally useless. Below is the basic library structure in a libname statement, though:
libname JC 'W:\JCs\JC Analyses 2016-2017\JC Data 2016-2017\2 - Received from JCs\&jcname.\2016_&jcname..xls';
(there are 63 unique &jcname values)
Any ideas?
Thanks in advance!!!
This is a common requirement, but executing it well requires fairly uncommon knowledge of multiple SAS functions.
I like to approach this problem with a two step solution:
1. Get a list of filenames
2. Process each filename in a loop
While you can process each filename as you read it, it's a lot easier to debug and maintain code that separates these steps.
Step 1: Read filenames
I think the best way to get a list of filenames is to use dread() to read directory entries into a dataset as follows:
filename myfiles 'c:\myfolder';
data filenames (keep=filename);
    dir = dopen('myfiles');
    do file = 1 to dnum(dir);
        filename = dread(dir,file);
        output;
    end;
    rc = dclose(dir);
run;
After this step you can verify that the correct filenames have been read by printing the dataset. You could also modify the code to only output certain types of files. I leave this as an exercise for the reader.
Step 2: Use the files
Given a list of names in a dataset, I prefer to use call execute() inside a data step to process each file.
data _null_;
    set filenames;
    call execute('%import('||filename||')');
run;
I haven't included a macro to read in the Excel files and concatenate the dataset (partly because I don't have a suitable list of Excel files to test, but also because it's a situational problem). The stub macro below just outputs the filenames to the log, to verify that it's running:
%macro import(filename);
    /* This is a dummy macro. Here is where you would do something with the file */
    %put &filename;
%mend;
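For illustration only, a rough and untested sketch of such a macro might look like the following. It assumes the workbooks sit in the single folder used in Step 1 (c:\myfolder), are plain .xls files readable by PROC IMPORT with dbms=xls, and should all be stacked into a dataset named work.all_data:
%macro import(filename);
    /* Read one workbook into a temporary dataset.
       strip() guards against trailing blanks passed in from call execute. */
    proc import datafile="c:\myfolder\%sysfunc(strip(&filename.))"
        out=work._tmp dbms=xls replace;
    run;
    /* Stack it onto the combined dataset */
    proc append base=work.all_data data=work._tmp force;
    run;
%mend;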
Notes:
Arguably there are many examples of how to do this in multiple places on the web, e.g.:
this SAS knowledge base article (http://support.sas.com/kb/41/880.html)
or this paper from SUGI,
However, most of them rely on using pipe to run a dir or ls command, which I feel is the wrong approach because it is platform-dependent, and in many modern environments the ability to pipe shell commands will be disabled.
I based this on an answer by Daniel Santos on communities.sas.com but, given the superior functionality of Stack Overflow, I'd much rather see a good answer here.

How can I prevent MATLAB from automatically modifying .dat file variable names upon import using the dataset function?

So, I currently have a MATLAB script that does stuff with data and then, using a template .dat file, creates about 20 more .dat files with only a single column being changed (I've been using the dataset and export functions to read and write the files, respectively). The program that will use the .dat files, ExperimentBuilder, requires that the headers have names that start with dollar signs (for example: $image). However, when I use the dataset function in MATLAB to import the template file, I get this warning:
Warning: Variable names were modified to make them valid MATLAB identifiers.
It then replaces all the dollar signs in the variable names with x_ (for example, x_image), which would be fine if it would let me change them back to the $ format. But whenever I try to change them using set, it just gives me this warning again and reverts them back to x_, which is unreadable by ExperimentBuilder.
I know I could just do a quick copy and paste on each file with the original headings, but I would like to know if there's a way to fix this problem in the actual code.
Thanks!
The thing is, the MATLAB dataset class uses the header names to provide access to the columns by name, which is why the header names must be valid identifiers (isvarname() states that a name must start with a letter and contain only valid alphanumeric characters [a-zA-Z0-9_]).
The easiest solution would be to write the header line manually yourself (including the names starting with $), while separately exporting the data without the header row:
export(ds, ..., 'WriteVarNames',false)
(Note that dataset.export overwrites files by default, so you'll have to export first, then prepend the header line at the beginning of the file. Or, if you're comfortable modifying MATLAB's own functions, go edit dataset.export and change the fopen mode from overwrite 'wt' to append 'at'.)
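A minimal, untested sketch of stitching the two parts together could look like this; the file names, delimiter, and $-prefixed column names are made up for illustration:
% write the $-prefixed header line ExperimentBuilder expects
fid = fopen('trial_list.dat', 'wt');
fprintf(fid, '$image\t$condition\n');
fclose(fid);

% export the data only, without variable names, to a temporary file
export(ds, 'File', 'trial_body.dat', 'Delimiter', '\t', 'WriteVarNames', false);

% append the exported data after the header line
fid = fopen('trial_list.dat', 'at');
fwrite(fid, fileread('trial_body.dat'));
fclose(fid);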

Convert dataset of .mat format to .csv octave/matlab

There are datasets in .mat format on this site: http://www.cs.nyu.edu/~roweis/data.html
I want to change the format to .csv.
Can someone tell me how to change the format to create the .csv file?
Thanks!
Suppose that the .mat files from the site are available already. In the command window in Matlab, you may write, for example:
load('C:\Users\YourUserName\Downloads\mnist_all.mat');
to load the .mat file; the result should be a set of matrices test0, test1, ..., train0, train1, ... created in your workspace, which you want saved as CSV files. Because they're of different sizes, you need to save one CSV file per variable, e.g. (also in the command window):
csvwrite('C:\Users\YourUserName\Downloads\mnist_test0.csv', test0);
Repeat the command for each variable, and do not forget to change also the name of the output file to avoid overwriting.
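If there are many such variables, a small untested sketch along these lines could save the repetition (the who() pattern and output folder simply mirror the example above; adjust them to your own data):
% write one CSV per matrix loaded from mnist_all.mat
vars = who('test*', 'train*');
for k = 1:numel(vars)
    csvwrite(['C:\Users\YourUserName\Downloads\mnist_' vars{k} '.csv'], eval(vars{k}));
end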
Did you try the csvwrite function in Matlab?
Just load your .mat files with the load function and then write them with csvwrite!
I do not have a Matlab license, so I installed GNU Octave 4.2.1 (2017) on Windows 10 (thank you to John W. Eaton and others). I was not fully successful using csvwrite, so I used the following workaround. (BTW, I am totally incompetent in the Octave world; csvwrite worked for simple data structures.)
In the Command Window I used the following two commands
load myfile.mat
save("-text","myfile.txt","variablename")
When the "myfile.mat" is loaded, the variable names for the data vectors loaded are displayed in the workspace window. This is the name(s) to use in the save command. Some .mat files will load several data structures.
The "-text" option is the default, so you may not need to include this option in the command.
The output file lists the .mat file contents in text format as a single column (of potentially sequential variables). It should be easy to use your text editor to massage this data into the original matrix structure for use in whatever app you are comfortable with.
Had a similar issue. Needed to convert a series of .mat files that had two columns of numerical data into standard data files (ASCII text). Note that I don't really ever use CSV, but everything here could be adapted by using csvwrite instead of the standard save.
Using Octave 4.2.1 ....
load myfile.mat
LI = [L, I] ## L and I are column vectors representing my data
save myfile.txt LI
Note that L and I appear to be default variable names chosen by Octave for the two column vectors in my original data file. A script that iterated over all files with the .mat extension in my directory would be ideal, but this got the job done. It saves the data as two space-separated columns.
*** Update
The following script works on Octave 4.2.1 for a series of data files with the .mat extension that are in the same directory. It will iterate over them and write the data out to text files with the same name but with the extension .dat. Note that this is not efficient, so if you have a lot of files or if they are large, it can take a while to run. I would suggest running it from the command line using octave mat2dat.m so you can actually watch it go.
I make no guarantees that this will work for you, but it did for me. I also am NOT proficient in Octave or Matlab, so I'm sure a better solution exists.
# mat2dat.m
dirlist = glob("*.mat")
for i=1:length(dirlist)
    filename = dirlist{i,1}
    load(filename, "L", "I")
    LI = [L,I]
    tmpname = filename(1:length(filename)-3)
    txtname = strcat(tmpname, 'dat')
    save(txtname, "LI")
end

Writing a script for reading many .csv files with similar filenames

I have several .csv files with similar filenames except for a numeric month (e.g. 03_data.csv, 04_data.csv, 05_data.csv, etc.) that I'd like to read into R.
I have two questions:
1. Is there a function in R similar to MATLAB's varname and assignin that will let me create/declare a variable name within a function or loop that will allow me to read the respective .csv file - i.e. 03_data.csv into 03_data data.frame, etc.? I want to write a quick loop to do this because the filenames are similar.
2. As an alternative, is it better to create one dataframe with the first file and then append the rest using a for loop? How would I do that?
You could look at this related question. You can create the file names easily with a paste command:
file.names <- paste(sprintf("%02d",1:10), "_data.csv", sep="")
Once you have your file names (whether by creating them or by reading them from the directory as in the other question), you can import them quickly with an lapply:
import.list <- lapply(file.names, read.csv)
Lastly, to combine the list into one dataframe, the easiest approach is to use merge_recurse from the reshape package:
library(reshape)
data <- merge_recurse(import.list)
It is also very easy to read the contents of a directory, including using regular expressions to focus on certain names only, e.g.
filestoread <- list.files(someDir, pattern="\\.csv$", full.names=TRUE)
returns all (fully-formed, including full path) files in the given directory someDir that end in ".csv". You can get fancier with better regular expressions, which are documented in many places.
Once you have your list of files, it is straightforward to read them all using apply or lapply or a loop.
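Putting those pieces together, a minimal sketch (assuming all the files share the same columns, so base R's rbind is sufficient) could be:
filestoread <- list.files(someDir, pattern="\\.csv$", full.names=TRUE)
import.list <- lapply(filestoread, read.csv)
data <- do.call(rbind, import.list)  # stack all files into one data.frame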