I have a dataset like this:
time value
1990 22
1991 31
1992 21
1993 7
1994 32
And I have a macro variable that contains several observation numbers.
%put &p; returns: 1 4 5
I want to use this macro variable &p to select the matching rows, in their original order.
The result should be this:
time value
1990 22
1993 7
1994 32
data result;
set indata;
if _N_ in (&p);
run;
_N_ is an automatic variable that counts the iterations of the current DATA step. In simple cases like this one, with a single SET statement, it equals the number of the current observation. See the SAS documentation on automatic variables for more.
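For readers who want to check the selection logic outside SAS, here is a minimal Python sketch of the same idea. It is only an analogy, not SAS code: the names indata, p, wanted and result are illustrative, and enumerate(..., start=1) stands in for the 1-based _N_ counter.

```python
# Rows of the example dataset as (time, value) pairs.
indata = [(1990, 22), (1991, 31), (1992, 21), (1993, 7), (1994, 32)]

# The macro variable &p holds space-separated observation numbers.
p = "1 4 5"
wanted = {int(token) for token in p.split()}

# Keep a row when its 1-based position is in the list, like "if _N_ in (&p);".
result = [row for n, row in enumerate(indata, start=1) if n in wanted]
print(result)
```

The rows come out in dataset order, regardless of the order of the numbers in p.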
I need to apply a certain treatment to sas dates depending on what day of the month the 1st of the month falls on. I need this to go forward for a number of months.
I’ve created the following as I also need macro variables for the start and end dates of the month.
%let first_dt = '01Jun2020'd;
data _null_;
do j = 0 to 12;
/* create global macro variables monstrt0-monstrt12 and monend0-monend12 */
call symputx(cats("monstrt",j),put(intnx("month",&first_dt,j,"b"),date9.),'g');
call symputx(cats("monend",j),put(intnx("month",&first_dt,j,"e"),date9.),'g');
end;
run;
Based on the start-of-month dates I have, I now need to increment each one by a number of days. E.g. if the start of the month is a Monday I need to increment by 5 days, if the start of the month is a Tuesday I need to increment by 6 days, and so on.
I have attempted the following but it doesn’t appear to be working.
%macro weekdays(weekday);
data test;
if weekday("strt&i."d) = &weekday. then
new_stdt = put(intnx('day',"strt&i."d,+5),date9.)
;
%mend;
%weekdays(1,2,3,4,5,6,7);
Essentially I’m hoping to get all these dates populated from the first_dt variable, so that if it subsequently changes I can amend the original value and the new values will be repopulated from it.
Here is a method that lets you specify the values to add as a list of 7 numbers, one per weekday starting with Sunday. For example, this list matches the values in your comment.
%let add=11 5 6 7 8 9 10;
Notice how Monday and Tuesday are mapped to the 5 and 6 you mentioned in your question.
So if you have these values:
%let first_dt = '01Jun2020'd;
%let offset= 8 ;
You can generate a new date value like:
%let date = %sysfunc(intnx(month,&first_dt,&offset,b));
%let date = %eval(&date + %scan(&add,%sysfunc(weekday(&date))));
If you need the value of DATE to look like something a human would understand, you could add:
%let date = "%sysfunc(putn(&date,date9.))"d ;
So when the month offset is 8, the first of the month (01FEB2021) is a Monday, and the resulting date is 5 days after the first of the month.
%put &=date;
DATE="06FEB2021"d
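As a cross-check of the arithmetic outside SAS, the same lookup can be sketched in Python. This is an illustration under stated assumptions, not a translation of the macro code: the variable names are made up here, and the SAS WEEKDAY numbering (1 = Sunday through 7 = Saturday) is reproduced by hand from Python's isoweekday().

```python
from datetime import date, timedelta

# Days to add, indexed by SAS-style weekday: Sunday, Monday, ..., Saturday.
add = [11, 5, 6, 7, 8, 9, 10]

first_dt = date(2020, 6, 1)
offset = 8  # number of months to move forward

# Equivalent of intnx('month', first_dt, offset, 'b'): first day of the target month.
months = first_dt.year * 12 + (first_dt.month - 1) + offset
month_start = date(months // 12, months % 12 + 1, 1)

# isoweekday() is 1 = Monday ... 7 = Sunday; convert to 1 = Sunday ... 7 = Saturday.
sas_weekday = month_start.isoweekday() % 7 + 1

new_date = month_start + timedelta(days=add[sas_weekday - 1])
print(month_start, sas_weekday, new_date)
```

Changing first_dt or offset recomputes everything downstream, which is the behaviour the question asks for.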
I am having difficulty achieving this functionality in SPSS. The data set is formatted like this (apologies for the excel format)
In this example, the AGGREGATE function was used to combine the cases by the same variable. In other words, CITY, Tampa in the example, is the break variable.
Unfortunately, each entry for Tampa holds 10 unique temperatures, one per day: the first entry covers days 0-10 and the second covers days 10-20, so both carry useful information. I can't figure out how to use the AGGREGATE function to create new variables so that these days aren't lost. I want to do this so that I can run tests on the mean temperature in Tampa over days 0-20 relative to days 0-20 in other cities.
My current syntax is:
AGGREGATE
/OUTFILE=* MODE=ADDVARIABLES
/BREAK=CITY
/Temp=Max(Temp).
But this doesn't create the variables, and I'm not sure where to start on that end. I checked the SPSS manual and didn't see this as an option within aggregate, any idea on what function might allow this functionality?
If I understand right, you are trying to reorganize all the CITY information into one line, and not to aggregate it. So what you are looking for is the restructure command casestovars.
First we'll create some fake data to demonstrate on:
data list list/City (a10) temp1 to temp10 (10f6).
begin data
Tampa 10 11 12 13 14 15 16 17 18 19
Boston 20 21 22 23 24 25 26 27 28 29
Tampa 30 31 32 33 34 35 36 37 38 39
NY 40 41 42 43 44 45 46 47 48 49
Boston 50 51 52 53 54 55 56 57 58 59
End data.
casestovars needs an index variable (e.g. the row number within each city). Your example data doesn't have an index, so the following commands will create one:
sort cases by CITY.
if $casenum=1 or city<>lag(city) IndVar=1.
if city=lag(city) IndVar=lag(IndVar)+1.
format IndVar(f2).
Now we can restructure:
sort cases by CITY IndVar.
casestovars /id=CITY /index=IndVar/separator="_"/groupby=index.
This will also work if you have more rows per city.
Important note: my artificial index (IndVar) doesn't necessarily reflect the original order of rows in your file. If your file really doesn't contain an index and isn't ordered so that the first row holds the first measurements, and so on, the restructured file will not be ordered either: the earlier measurements might appear to the left or to the right of the later ones, depending on their order in the original file. To avoid this you should try to define a real index and use it in casestovars.
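To make the reshaping concrete, here is a small Python sketch of what the restructure does to the fake data above. It is only an illustration of the idea, not SPSS syntax: the dictionary layout is an assumption of this sketch, and the names of the form temp1_1 imitate the root name, separator and index used in the casestovars call.

```python
# Same fake data as above: one (city, temperatures) pair per input row.
rows = [
    ("Tampa", list(range(10, 20))),
    ("Boston", list(range(20, 30))),
    ("Tampa", list(range(30, 40))),
    ("NY", list(range(40, 50))),
    ("Boston", list(range(50, 60))),
]

wide = {}
# sorted() is stable, so rows of one city keep their original relative order.
for city, temps in sorted(rows, key=lambda r: r[0]):
    record = wide.setdefault(city, {})
    # Running row number within the city, playing the role of IndVar.
    ind_var = len(record) // len(temps) + 1
    for k, t in enumerate(temps, start=1):
        record[f"temp{k}_{ind_var}"] = t

print(wide["Tampa"])
```

Each city ends up as a single record, and a city with only one input row (NY here) simply has no second group of variables.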
Run EXECUTE or Transform > Run Pending Transformations to see the results of the AGGREGATE command.
I have an input file where my dates don't have leading zeros (like 25.3.2016), but I would like to display them in the DDMMYYP10. format.
Is there any format, informat, function etc. that could do that for me?
I'm using SAS Enterprise Guide 4.3.
There isn't any "transformation" required, really. The only two things you need are:
A proper informat (in your case, the ANYDTDTE10. should do) for SAS to adequately recognize the dates upon reading the data
An output format (you are asking for DDMMYYP10.) to display dates, given they are imported correctly with the informat above.
Illustration:
data dates;
format mydate DDMMYYP10.;
input mydate ANYDTDTE10.;
datalines;
25.3.2016
run;
proc print;run;
Results:
Obs mydate
1 25.03.2016
Of course you'll be needing an INFILE statement rather than a DATALINES if you are reading external data (which I assume is your case), but the results will be the same.
Remember that output formats are only formats. You can change them at will without affecting the underlying data. So the key here is really the informat.
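The split between reading a value (informat) and displaying it (format) is not specific to SAS. As a hedged illustration in Python, where strptime plays the role of the informat and strftime the role of the output format (the variable names are illustrative only):

```python
from datetime import datetime

raw = "25.3.2016"  # input text without leading zeros

# Reading step: parse the text into a real date value.
parsed = datetime.strptime(raw, "%d.%m.%Y").date()

# Display step: render the same value with zero-padded day and month.
shown = parsed.strftime("%d.%m.%Y")
print(shown)
```

The stored value never changes; only its rendering does, which is why the reading step is the one that has to be right.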
My input is a dataset, so this worked for me:
data dates;
set My_data;
format date1 DDMMYYP10.;
date1 = input (date, anydtdte10.);
run;
SAS is designed to store a date as a "SAS date", which is a numeric variable holding the number of days since Jan 1, 1960. Assuming you have a SAS date (and not a character variable that looks like a date), it should be straightforward to change the format used for this variable. Note that this doesn't actually transform the value; it only changes the format used to display the value when it is printed, etc.
data want;
mydate="1Mar2016"d;
put mydate=;
format mydate ddmmyyp10.;
run;
mydate=01.03.2016
Edit: I reread your question and realized that maybe you do not have a SAS dataset as input but a text file instead. If so, you can read dates like 25.3.2016 using the ddmmyy10. informat. The log below uses the ddmmyy10. informat to read the value and then the ddmmyyp10. format to display it with period separators. Note that the log does not echo the data line after CARDS, which is why the log line numbers skip 120:
115 data want;
116 input mydate ddmmyy10.;
117 put mydate=;
118 format mydate ddmmyyp10.;
119 cards;
mydate=01.03.1960
NOTE: The data set WORK.WANT has 1 observations and 1 variables.
121 ;
122 run;
I have a question about timestamps; I hope you can help me.
I'm reading one timestamp column from Excel into MATLAB using:
[temp, timestamps] = xlsread('2012_15min.xls', 'JAN', 'A25:A2999');
This column has dates like this:
01-01-2012 00:00
01-01-2012 00:15
01-01-2012 00:30
01-01-2012 00:45
01-01-2012 01:00
(it goes on until the end of January in periods of 15 minutes)
Now I want to get a new column in MATLAB that keeps only the year, month, day and hour. These fields must be separated, and I don't want repeated dates (e.g. I don't want four rows with 01 01 2012 0, only one).
So I want to get:
01 01 2012 0
01 01 2012 1
01 01 2012 2
It must go until the end of January with periods of 1 hour.
If you know that there is data for every hour you could construct this directly, but if you have possible missing data and you therefore need to convert from your timestamps, then some combination of datestr/datenum/datevec is usually the best bet.
First, convert timestamps with datevec:
times = datevec(timestamps, 'dd-mm-yyyy HH:MM'); % explicit format string avoids day/month ambiguity
Then, take only the year/month/day/hour, removing repetitions:
[times_hours,m,n] = unique(times(:,1:4), 'rows');
You can use the indices in m to extract the matching data for those times.
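If you want this converted back to some sort of string, you can use datestr and specify the format. datestr expects date vectors with six columns, so pad the minutes and seconds with zeros first:
timesout = datestr([times_hours, zeros(size(times_hours,1),2)], 'dd mm yyyy HH');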
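The same parse, truncate and de-duplicate steps can be sketched in Python for comparison. This is a sketch under assumptions, not MATLAB code: the timestamps are typed in by hand rather than read from the spreadsheet, and the hour is printed without a leading zero to match the output requested in the question.

```python
from datetime import datetime

# A few sample timestamps in the same dd-mm-yyyy HH:MM layout as the column.
timestamps = [
    "01-01-2012 00:00",
    "01-01-2012 00:15",
    "01-01-2012 00:30",
    "01-01-2012 00:45",
    "01-01-2012 01:00",
]

parsed = [datetime.strptime(s, "%d-%m-%Y %H:%M") for s in timestamps]

# Keep only (year, month, day, hour); the set removes repeats, sorted() restores order.
hours = sorted({(t.year, t.month, t.day, t.hour) for t in parsed})

timesout = [f"{d:02d} {m:02d} {y} {h}" for (y, m, d, h) in hours]
print(timesout)
```

Hours with no data simply do not appear, which matches the behaviour of unique on the truncated date vectors.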
I'm using SAS 9.3
I need a way to sum totals by week, and I have no idea how to do it. Basically I have a year-long list of dates (left column below), each with a total for that date (right column). Our week goes from Friday to the previous Thursday (e.g. Thursday Oct 17 through Friday the Oct 25th).
Another issue is that, as you can see, the dates on the left are not strictly daily and don't always include a Thursday before the last Friday. Would anyone know a way to add these weeks up as Week 1, Week 2, and so on?
Thanks for any help that can be provided
2013-01-01 3
2013-01-02 8
2013-01-03 8
2013-01-04 10
2013-01-06 1
2013-01-07 10
2013-01-08 14
2013-01-09 12
2013-01-10 8
2013-01-11 9
2013-01-12 1
2013-01-14 12
2013-01-15 8
2013-01-16 5
2013-01-17 15
2013-01-18 7
2013-01-20 1
Trivial way:
data want;
set have;
weekno = ceil((date-'03JAN2013'd)/7);
run;
I.e., subtract the first Thursday and divide by 7, rounding up (so 1/1 through 1/3 get weekno=0).
The INTCK function is also adept at calculating this. The basic structure is
weekno=intck('WEEK.6','04JAN2013'd,date); *the second argument is when you want your first week to start;
WEEK means count weeks. The number to the left of the decimal point is the multiple (two-week periods are WEEK2.), and the number to the right is the shift index, where 1 is the default Sunday start, so 6 makes weeks start on Friday. Note that this numbering starts at 0 for the week beginning 04JAN2013, one less than the CEIL version above.
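To see the numbering in action, here is a hedged Python sketch that applies the same formula to the sample data from the question and sums the totals per week. The names have, first_thursday and weekly are illustrative, and the sketch mirrors only the CEIL formula, not any SAS procedure.

```python
from datetime import date
from math import ceil

# (date, total) pairs copied from the question.
have = [
    (date(2013, 1, 1), 3), (date(2013, 1, 2), 8), (date(2013, 1, 3), 8),
    (date(2013, 1, 4), 10), (date(2013, 1, 6), 1), (date(2013, 1, 7), 10),
    (date(2013, 1, 8), 14), (date(2013, 1, 9), 12), (date(2013, 1, 10), 8),
    (date(2013, 1, 11), 9), (date(2013, 1, 12), 1), (date(2013, 1, 14), 12),
    (date(2013, 1, 15), 8), (date(2013, 1, 16), 5), (date(2013, 1, 17), 15),
    (date(2013, 1, 18), 7), (date(2013, 1, 20), 1),
]

first_thursday = date(2013, 1, 3)

weekly = {}
for d, total in have:
    # Same formula as the DATA step: days since the first Thursday, divided by 7, rounded up.
    weekno = ceil((d - first_thursday).days / 7)
    weekly[weekno] = weekly.get(weekno, 0) + total

print(weekly)
```

Missing days need no special handling, because each date is assigned to its week independently of its neighbours.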
You could also create a format that contained your weeks, and use that.