I have a dataset sales, shown below, and I am trying to send it out via email as an attached .xlsx file.
My question is how to make the cells with a negative number (i.e. -2.10%, -0.17%) in column 'change' red:
data sales;
input unid $3. map change percent8.2;
format change percent8.2;
cards;
001 100 12.00%
002 509 -2.10%
003 2001 -0.17%
004 48 7.23%
;
run;
When using ODS EXCEL the style attribute tagattr can be used to inject Excel specific features into worksheet cells.
In this case, Excel cell formatting can be applied, such that positive values are formatted one way and negative numbers another way.
From "Review guidelines for customizing a number format" (Microsoft Support):
A number format can have up to four sections of code, separated by semicolons. These code sections define the format for positive numbers, negative numbers, zero values, and text, in that order.
<POSITIVE>;<NEGATIVE>;<ZERO>;<TEXT>
For example, you can use these code sections to create the following custom format:
[Blue]#,##0.00_);[Red](#,##0.00);0.00;"sales "#
Example:
ods excel file='sample.xlsx';
proc print noobs data=sales;
var unid map;
var change / style=[tagattr='format:#0.00%;[Red](#0.00%)'];
run;
ods excel close;
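If the negative values should keep a minus sign rather than parentheses, as in the question, a variant of the same idea might look like the sketch below. The file name sample_minus.xlsx is an assumption for illustration, and style(data) is used here so that the attribute is applied to the data cells only and not to the column header.

```sas
/* Sketch: same TAGATTR technique, negative section shown in red with a minus sign. */
/* Assumes the SALES data set from the question already exists.                     */
ods excel file='sample_minus.xlsx';   /* hypothetical output file name */

proc print noobs data=sales;
  var unid map;
  /* Excel format sections: <POSITIVE>;<NEGATIVE> */
  var change / style(data)=[tagattr='format:0.00%;[Red]-0.00%'];
run;

ods excel close;
```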
I have a series of formatted numeric variables and I would like to convert them all into character variables assigned the corresponding values found in the format labels. Here is an example of the format:
proc format;
value Group
1= 'Experimental 1'
2= 'Experimental 2'
3= 'Treatment as usual';
run;
My variable Group_num has values 1-3 and has this format applied. I want to create a new character variable called Group_char which has the values "Experimental 1", "Experimental 2", and "Treatment as usual".
The (long) way I would do this would be:
data out;
set in;
format Group_char $30.;
if Group_num=1 then Group_char="Experimental 1";
if Group_num=2 then Group_char="Experimental 2";
if Group_num=3 then Group_char="Treatment as usual";
run;
However, I need to do this to 13 different variables and I don't know what their variable values, format names, and format labels are without looking at the data more. Preferably, I would want to use whatever format is already applied to the variable to automatically translate it into a new character variable, without needing to know the format name/labels or original variable values. However, if I need to find out the format name to create a new character variable just by using the format name, that would be better than needing to also know the original variables values and format labels as well.
Alternatively, another way to solve my problem would be if you could tell me if there is a way of importing SPSS datasets using variable value labels only, and leaving the values themselves out of the picture entirely, such that numeric variables with value labels are imported as character variables.
Thank you
First off, it's usually best not to do this - most of the time that you need the character bits, you can get them off of the formats.
But, that said... you need to look at the vvalue function.
data want;
set have;
var_char = vvalue(var_num);
run;
vvalue returns the formatted value of the argument.
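Applied to the Group example from the question, this could look like the following minimal sketch. The data set names have and want and the three sample rows are assumptions for illustration; the LENGTH statement is there so that Group_char does not get the default length that vvalue would otherwise give it.

```sas
/* Format from the question */
proc format;
  value Group
    1 = 'Experimental 1'
    2 = 'Experimental 2'
    3 = 'Treatment as usual';
run;

/* Hypothetical sample data with the format attached to the variable */
data have;
  input Group_num;
  format Group_num Group.;
  datalines;
1
2
3
;
run;

data want;
  set have;
  length Group_char $30;           /* set the length before the assignment */
  Group_char = vvalue(Group_num);  /* uses whatever format is attached to Group_num */
run;
```

Because vvalue looks up the format attached to the variable, the format name and labels do not have to be known in advance.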
I tried to import text file in sas with the following code
PROC IMPORT DATAFILE= '/home/u44418748/MSc Biostatistics with SAS/Datasets/school.txt'
OUT= outdata
DBMS=dlm
REPLACE;
delimiter='09'x;
GETNAMES=YES;
RUN;
But the import is unsuccessful, I think because the text file has a period for missing data.
This is what I got in the SAS log:
NOTE: Invalid data for class_size in line 455 16-17.
455 CHAR 454.34.8.32.17.NA.23.125.12.188 31
ZONE 3330330303303304403323330332333
NUMR 454934989329179E1923E125912E188
sl_no=454 school=34 iq=8 test=32 ses=17 class_size=. meanses=23.125 meaniq=12.188 _ERROR_=1 _N_=454
How can I load this text file in SAS?
Did you create that text file from R? That package has a nasty habit of putting text values of NA for numeric values into text files. If you are the one that created the file then you might check if the system you are using has a way to not put the NA into the file to begin with. In a delimited file missing values are normally represented by having nothing for the field, so the delimiters are right next to each other. For SAS you can use a period to represent a missing value.
I wouldn't bother to use PROC IMPORT to read a delimited file. Just write a data step to read the file. Since it looks like your file only has eight variables and they are all numeric the code is trivial.
data outdata;
infile '/home/u44418748/MSc Biostatistics with SAS/Datasets/school.txt'
dsd dlm='09'x firstobs=2 truncover
;
input sl_no school iq test ses class_size meanses meaniq ;
run;
One way to deal with the NA text in the input file is to replace it with periods. Since all of the fields are numeric you can do that easily, because you don't have to worry about replacing real text that just happens to have the letter A after the letter N. Here is a trick using the _INFILE_ automatic variable that you can use to make the change on the fly while reading the file.
data outdata;
infile '/home/u44418748/MSc Biostatistics with SAS/Datasets/school.txt'
dsd dlm='09'x firstobs=2 truncover
;
input @;
_infile_=tranwrd(_infile_,'NA','.');
input sl_no school iq test ses class_size meanses meaniq ;
run;
You are getting the NOTE: because of the NA value in the class_size field.
What you presume are periods (.) are actually tabs (hex code 09). Look under the period to confirm, the ZONE is 0 and NUMR 9. 09 is the tab character.
Proc IMPORT guesses each field's data type by looking at the first few rows (the default is 20 rows) of a text file. Your file contained only numbers in the first 20 rows, so the procedure guessed class_size was numeric.
There are a couple of courses of action.
Do nothing. Read your log NOTES and know that in the places where NA occurred you will have a missing value in your data set.
or, Read the file as-is, but add a GUESSINGROWS=MAX; statement to your import code.
The mixed data type column class_size will be guessed as character and you might have to do another step to convert the values to numeric (a step in which the non-digit values get converted to missing values).
or, Edit the text file, replacing all the NA with a period (.). The dot marks a missing value during IMPORT. The IMPORT step will have no incongruities to LOG about.
Converting a field
PROC IMPORT DATAFILE= '/home/u44418748/MSc Biostatistics with SAS/Datasets/school.txt'
DBMS=dlm REPLACE OUT=work.outdata;
delimiter='09'x;
GETNAMES=YES;
GUESSINGROWS=MAX;
RUN;
data want;
set outdata (rename=(class_size=class_size_char));
class_size = input (class_size_char, ?? best12.);
drop class_size_char;
run;
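If the NA entries only ever occur in class_size and a missing value is the desired result, a data step with the ?? modifier is another option. This is a minimal sketch, assuming the same eight numeric columns and file path as above; ?? suppresses the invalid-data note and keeps _ERROR_ from being set for that variable.

```sas
data outdata;
  infile '/home/u44418748/MSc Biostatistics with SAS/Datasets/school.txt'
         dsd dlm='09'x firstobs=2 truncover;
  /* ?? after class_size: text such as NA becomes a missing value, */
  /* with no NOTE in the log and without setting _ERROR_           */
  input sl_no school iq test ses class_size ?? meanses meaniq;
run;
```

The trade-off is that any other unexpected text in class_size is silently turned into a missing value as well, so this only makes sense once the cause of the invalid data is understood.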
I have a .csv file with the first column containing dates, a snippet of which looks like the following:
date,values
03/11/2020,1
03/12/2020,2
3/14/20,3
3/15/20,4
3/16/20,5
04/01/2020,6
I would like to import this data into Matlab (I think the best way would probably be using the readtable() function, see here). My goal is to bring the dates into Matlab as a datetime array. As you can see above, the problem is that the dates in the original .csv file are not consistently formatted. Some of them are in the format mm/dd/yyyy and some of them are mm/dd/yy.
Simply calling data = readtable('myfile.csv') on the .csv file results in the following, which is not correct:
'03/11/2020' 1
'03/12/2020' 2
'03/14/0020' 3
'03/15/0020' 4
'03/16/0020' 5
'04/01/2020' 6
Does anyone know a way to automatically account for this type of data in the import?
Thank you!
My version: Matlab R2017a
EDIT ---------------------------------------
Following the suggestion of Max, I have tried specifying some of the input options for the read command using the following:
T = readtable('example.csv',...
'Format','%{dd/MM/yyyy}D %d',...
'Delimiter', ',',...
'HeaderLines', 0,...
'ReadVariableNames', true)
which results in:
date values
__________ ______
03/11/2020 1
03/12/2020 2
NaT 3
NaT 4
NaT 5
04/01/2020 6
and you can see that this is not working either.
If you are sure all the dates involved do not go back more than 100 years, you can easily apply the pivot method which was in use in the last century (before the Y2K bug warned the world of the danger of the method).
They used to code dates in 2 digits only, knowing that 87 actually meant 1987. A user (or a computer) would add the missing years automatically.
In your case, you can read the full table, parse the dates, then it is easy to detect which dates are inconsistent. Identify them, correct them, and you are good to go.
With your example:
a = readtable(tfile) ; % read the file (tfile holds the path to the csv file)
dates = datetime(a.date) ; % extract first column and convert to [datetime]
idx2change = dates.Year < 2000 ; % find which dates were in the short (two-digit year) format
dates.Year(idx2change) = dates.Year(idx2change) + 2000 ; % correct truncated years
a.date = dates % reinject corrected [datetime] array into the table
yields:
a =
date values
___________ ______
11-Mar-2020 1
12-Mar-2020 2
14-Mar-2020 3
15-Mar-2020 4
16-Mar-2020 5
01-Apr-2020 6
Instead of specifying the format explicitly (as I also suggested before), one should use import options, and in the case of a csv-file, use delimitedTextImportOptions:
opts = delimitedTextImportOptions('NumVariables',2,...% how many variables per row?
'VariableNamesLine',1,... % is there a header? If yes, in which line are the variable names?
'DataLines',2,... % in which line does the actual data starts?
'VariableTypes',{'datetime','double'})% as what data types should the variables be read
readtable('myfile.csv',opts)
because the neat little feature recognizes the format of the datetime automatically, as it knows that it must be a datetime-object =)
I need to import data from a csv-file. I'm able to read everything else but the date. The date is in d.m.yyyy format, for example: 6;Tiku;17.1.1967;M;191;
I'm guessing I need to specify an informat to read it in? I can't figure out which one to use because nothing I've tried works.
What I've managed so far:
data [insert name here];
infile [insert name here];
dlm=";" missover;
length Avain 8 Nimi $10 Syntymapaiva 8 Sukupuoli $1 Pituus 8 Paino 5;
input
Avain Nimi $ Syntymapaiva ddmmyyp.(=this doesnt work) Sukupuoli$ Pituus
Paino;
format Paino COMMA5.2 ;
label Syntymapaiva="Syntymäpäivä";
run;
And part of the actual file to read in:
6;Tiku;17.1.1967;M;191;
Thank you for helping this doofus out!
There is no informat named DDMMYYP.. Use the informat DDMMYY. instead.
Also make sure to use the : modifier before the informat specification included in the INPUT statement so that you are still using list mode input instead of formatted input. If you use formatted input instead of list mode input then SAS could read past the delimiter.
input Avain Nimi Syntymapaiva :ddmmyy. Sukupuoli Pituus Paino;
Perhaps you are confused because there is a format named DDMMYYP.
Formats are used to convert values to text. Informats are what you need to use when you want to convert text to values.
553 options nofmterr ;
554 data _null_;
555 str='17.1.1967';
556 ddmmyy = input(str,ddmmyy10.);
557 ddmmyyp = input(str,ddmmyyp10.);
----------
485
NOTE 485-185: Informat DDMMYYP was not found or could not be loaded.
558 put str= (dd:) (= yymmdd10.);
559 _error_=0;
560 run;
NOTE: Invalid argument to function INPUT at line 557 column 13.
str=17.1.1967 ddmmyy=1967-01-17 ddmmyyp=.
NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to
missing values.
Each place is given by: (Number of times) at (Line):(Column).
1 at 557:13
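To illustrate the two directions described above, here is a minimal sketch using the same date string as the log. The variable names d and txt are chosen for illustration only; the informat DDMMYY10. turns the text into a SAS date value, and the format DDMMYYP10. turns that value back into text with period separators.

```sas
data _null_;
  str = '17.1.1967';
  d   = input(str, ddmmyy10.);   /* informat: text -> numeric SAS date value */
  txt = put(d, ddmmyyp10.);      /* format: SAS date value -> text           */
  put str= d= txt=;              /* write all three values to the log        */
run;
```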
You could use the anydtdte informat, but (as @Tom points out) if your data is known to be fixed in this format, then ddmmyy. would be better. Also, Tom's advice about using the : modifier is correct, and is preferable to use in most (if not all) cases.
data want;
infile cards dlm=";" missover;
input Avain Nimi:$10. Syntymapaiva:ddmmyy. Sukupuoli:$1. Pituus Paino;
format Paino COMMA5.2 Syntymapaiva date9.;
label Syntymapaiva="Syntymäpäivä";
datalines4;
6;Tiku;17.1.1967;M;191;
;;;;
run;
I want to import a .txt file in SAS.
Here is what my data looks like:
annee manufacturier modele categorie cylindree cylindres transmission ville ...
2016 Ford Focus 1 1.8 5 Manual 10.1
2016 Toyota Tercel 3 1.4 3 Auto 7.1
Here is my code
data car;
infile "C:\Users\Mark\Desktop\sas\car.txt"
LRECL=10000000 DLM=" " firstobs=2 ;
input
annee manufacturier modele categorie cylindree cylindres transmission type ville route combine emissiond indice
;
run;
But when I run it, I get a lot of "Invalid data for ..." messages, and I end up with very few values in my SAS table and lots of missing ones.
Some variables are numbers and some are characters. I feel like the problem is there.
How can I import that type of file?
Thank you
A text file does not have any intrinsic data type. Everything is character, until you explicitly tell SAS the data type of your columns. Also, sometimes you need to tell SAS the input format, or informat, of your data.
Sometimes SAS is smart enough to guess correctly re: your data informat. For example, the below code generates the same results if you delete the informat statement. But, this would not be the case for say dates. In general, explicitly specifying the informat is best practice.
If your data was delimited, such as CSV, you could use PROC IMPORT to import your data. Using PROC IMPORT, SAS will make a best guess as to the data type based on the content of the columns (like Excel does when it imports text data).
The below code will import the data you specified:
filename temp temp;
data _null_;
infile datalines;
file temp;
input;
put _infile_;
datalines;
annee manufacturier modele categorie cylindree cylindres transmission ville
2016 Ford Focus 1 1.8 5 Manual 10.1
2016 Toyota Tercel 3 1.4 3 Auto 7.1
run;
data want;
infile temp firstobs=2;
length
annee 8
manufacturier $20
modele $20
categorie 8
cylindree 8
cylindres 8
transmission $20
ville 8
;
informat
cylindree 8.1
ville 8.1
;
input
annee
manufacturier
modele
categorie
cylindree
cylindres
transmission
ville
;
run;
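If your data values contained embedded spaces, for example manufacturier = Mercedes Benz, plain space-delimited list input would split the value in two. You would then need to read that column with an informat (e.g. $char20.) at fixed columns, or use a file with a different delimiter.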
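For the delimited case mentioned earlier, a PROC IMPORT call might look like the sketch below. The file name car.csv and the comma delimiter are assumptions for illustration, not something from the question; GUESSINGROWS=MAX asks SAS to scan the whole file before deciding each column's type.

```sas
/* Sketch: let PROC IMPORT guess column types for a comma-delimited file. */
/* car.csv is a hypothetical file with the variable names in row 1.       */
proc import datafile='car.csv'
            out=work.car
            dbms=csv
            replace;
  getnames=yes;        /* take variable names from the first row */
  guessingrows=max;    /* scan all rows before choosing types    */
run;
```

With a comma as the delimiter, a value such as Mercedes Benz stays in one field, which avoids the embedded-space problem entirely.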