How to copy one value from dataset to another dataset using JCL - jcl

I want to copy one value from one dataset into another dataset.
Below is the screenshot of the first dataset.
Below is the screenshot of the second dataset.
The first dataset has a 0 value at the 50th position, and I want it placed in the second file at the 50th position. How can I do this in JCL using a sort step?
This needs to be done using a step in JCL. Please help me out.

I would suggest inserting sequence numbers at the end (position 81) of the records in both datasets and doing a right join in DFSORT (JOINKEYS with JOIN UNPAIRED,F2), using the sequence number as the key. Here is a solution using ICETOOL that does all the tasks in one step.
//XXXXXXA JOB 1,NOTIFY=&SYSUID
//STEP01 EXEC PGM=ICETOOL
//INA DD DSN=XXXXXX.PS.A,DISP=SHR
//INB DD DSN=XXXXXX.PS.B,DISP=SHR
//JNA DD DSN=&&JNA,DISP=(,DELETE),
// SPACE=(CYL,(1,0),RLSE),
// DCB=(LRECL=82,RECFM=FB,BLKSIZE=0)
//JNB DD DSN=&&JNB,DISP=(,DELETE),
// SPACE=(CYL,(1,0),RLSE),
// DCB=(LRECL=82,RECFM=FB,BLKSIZE=0)
//OUT DD DSN=XXXXXX.PS.OUT3,
// DISP=(,CATLG,DELETE),
// SPACE=(CYL,(1,0),RLSE),
// DCB=(LRECL=80,RECFM=FB,BLKSIZE=0)
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//SYSOUT DD SYSOUT=*
//TOOLIN DD *
SORT FROM(INA) TO(JNA) USING(CTL1)
SORT FROM(INB) TO(JNB) USING(CTL1)
SORT JKFROM TO(OUT) USING(CTL2)
/*
//CTL1CNTL DD *
SORT FIELDS=COPY
OUTREC BUILD=(1,80,81:SEQNUM,2,ZD)
//CTL2CNTL DD *
JOINKEYS F1=JNA,FIELDS=(81,2,A)
JOINKEYS F2=JNB,FIELDS=(81,2,A)
REFORMAT FIELDS=(F2:1,49,F1:50,1,F2:51,30)
JOIN UNPAIRED,F2
SORT FIELDS=COPY
Contents in XXXXXX.PS.A dataset:
----+----1----+----2----+----3----+----4----+----5----+----6----+----7--
***************************** Top of Data ******************************
7
**************************** Bottom of Data ****************************
Contents in XXXXXX.PS.B dataset:
----+----1----+----2----+----3----+----4----+----5----+----6----+----7--
***************************** Top of Data ******************************
PUT 'LCDT.ORDER.ORDRAD' LCD_BI_ORDERINFO_&OYMD._00 _01.txt
quit
/*
**************************** Bottom of Data ****************************
And, contents in the final output dataset, XXXXXX.PS.OUT3:
----+----1----+----2----+----3----+----4----+----5----+----6----+----7--
***************************** Top of Data ******************************
PUT 'LCDT.ORDER.ORDRAD' LCD_BI_ORDERINFO_&OYMD._07 _01.txt
quit
/*
**************************** Bottom of Data ****************************
Hope this helps.
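To make the join logic easier to follow outside of DFSORT, here is a minimal Python sketch of what the three ICETOOL operators do together. It is only an illustration, not a replacement for the JCL: the function name overlay_column and the in-memory lists are hypothetical, records are assumed to be 80-byte fixed-length text, and an unpaired F2 record is assumed to receive a blank in the F1 field, as it does with the default REFORMAT fill.

```python
def overlay_column(file_a, file_b, pos=50, width=80):
    """Pair records of file_a and file_b by sequence number and copy the
    character at 1-based column `pos` from file_a into file_b.

    Every record of file_b is kept (the JOIN UNPAIRED,F2 behaviour).
    """
    out = []
    for seq, rec_b in enumerate(file_b):
        rec_b = rec_b.ljust(width)[:width]
        if seq < len(file_a):
            # Paired record: take column `pos` from the file A record.
            ch = file_a[seq].ljust(width)[pos - 1]
        else:
            # Unpaired F2 record: the F1 field is filled with a blank.
            ch = " "
        # Same layout as REFORMAT FIELDS=(F2:1,49,F1:50,1,F2:51,30).
        out.append(rec_b[:pos - 1] + ch + rec_b[pos:])
    return out


file_a = [" " * 49 + "7"]
file_b = ["PUT".ljust(48) + "00 _01.txt", "quit", "/*"]
for record in overlay_column(file_a, file_b):
    print(record.rstrip())
```

Note that column 50 of the unpaired records is overwritten with a blank. That is harmless for the sample data, where those records are blank at that position anyway.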

Related

How do you merge lines in a single dataset with some duplicate values?

I am analyzing a medical record dataset where the patients were screened for STIs at 4 different time points. The data manager created one line per patient per STI for each time period. I want to merge the dataset so there is one line per patient at each time point with all of the diagnosed STIs listed.
I created new variables to capture each STI that would be listed under the Dx variable, but I can't figure out how to merge data within the same dataset so there is only one line per patient at each time point.
data dx;
set dx;
if dx='ANOGENITAL WARTS (CONDYLOMATA ACUMINATA)' then MRWarts=1;
if dx='CHLAMYDIA' then MRCHLAMYDIA=1;
if dx='DYSPLASIA (ANAL, CERVICAL, OR VAGINAL)' then MRDYSPLASIA=1;
if dx='GONORRHEA' then MRGONORRHEA=1;
if dx='HEPATITIS B (HBV)' then MRHEPB=1;
if dx='HUMAN PAPILLOMAVIRUSES (HPV)-ANY MANIFESTATION' then MRHPV=1;
if dx='PEDICULOSIS PUBIS' then MRPUBIS=1;
if dx='SYPHILIS' then MRSYPHILIS=1;
if dx='TRICHOMONAS VAGINALIS' then MRTRICHOMONAS=1;
run;
Image of data structure I am looking for
Taking the sample dataset that you provided in the image, you can use a simple transpose to get the desired outcome.
data have;
input Pt_ID interval_round DX $10.;
datalines;
4 1 HIV
4 1 Warts
3 1 HIV
5 2 Chlamydia
;
run;
proc sort data=have; by Pt_Id; run;
proc transpose data=have out=want(drop=_NAME_);
by Pt_Id;
id Dx;
var interval_round;
run;
proc print data=want; run;
This code creates one variable per diagnosis (holding the interval_round value) but no interval_round variable of its own. Say, for example, a patient was screened for HIV in round 1 and for warts in round 2. Technically that patient should have only one row, so how would you represent interval_round then?
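One possible way around that question, assuming the goal is one row per patient per round as the asker describes, is to group on both Pt_ID and interval_round and turn each diagnosis into a 0/1 flag. The Python sketch below only illustrates that grouping logic; the function name one_row_per_visit is hypothetical and the sample rows are the ones from the datalines above.

```python
def one_row_per_visit(rows):
    """rows: iterable of (pt_id, interval_round, dx) tuples.

    Returns one dict per (pt_id, interval_round) pair with a 0/1 flag
    for every diagnosis that occurs anywhere in the input.
    """
    rows = list(rows)
    dx_names = sorted({dx for _, _, dx in rows})
    seen = {}
    for pt_id, rnd, dx in rows:
        seen.setdefault((pt_id, rnd), set()).add(dx)
    return [
        {"Pt_ID": pt_id, "interval_round": rnd,
         **{dx: int(dx in dxs) for dx in dx_names}}
        for (pt_id, rnd), dxs in sorted(seen.items())
    ]


have = [(4, 1, "HIV"), (4, 1, "Warts"), (3, 1, "HIV"), (5, 2, "Chlamydia")]
for row in one_row_per_visit(have):
    print(row)
```

With this layout a patient diagnosed in two different rounds simply gets two rows, so interval_round never has to hold more than one value.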

SAS Placeholder value

I am looking to have a flexible importing structure into my SAS code. The import table from excel looks like this:
data have;
input Fixed_or_Floating $ asset_or_liability $ Base_rate_new;
datalines;
FIX A 10
FIX L Average Maturity
FLT A 20
FLT L Average Maturity
;
run;
The original dataset I'm working with looks like this:
data have2;
input ID Fixed_or_Floating $ asset_or_liability $ Base_rate;
datalines;
1 FIX A 10
2 FIX L 20
3 FIX A 30
4 FLT A 40
5 FLT L 30
6 FLT A 20
7 FIX L 10
;
run;
The placeholder "Average Maturity" exists in the Excel file only when the new interest rate is determined by the average maturity of the bond. I have a separate function for this which lets me search for and then left join the new base rate depending on the closest interest rate. For example, if the bond matures in 10 years, I'll use a 10-year interest rate.
So my question is, how can I perform a simple merge, using similar code to this:
proc sort data = have;
by fixed_or_floating asset_or_liability;
run;
proc sort data = have2;
by fixed_or_floating asset_or_liability;
run;
data have3 (drop = base_rate);
merge have2 (in = a)
have (in = b);
by fixed_or_floating asset_or_liability;
run;
The problem at the moment is that my placeholder value doesn't read in, and I need it to be a word because that is how the Excel lookup table works. I then use an if statement such as
if base_rate_new = "Average Maturity" then do;
(Insert existing Function Here)
end;
So I just need help with importing the Excel file with the placeholder value, please and thank you.
TIA.
I'm not 100% sure whether this matches how your data appears once you import it from Excel, but if I run your code to create have, I get:
NOTE: Invalid data for Base_rate_new in line 145 7-13.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+--
145 FIX L Average Maturity
Fixed_or_Floating=FIX asset_or_liability=L Base_rate_new=. _ERROR_=1 _N_=2
NOTE: Invalid data for Base_rate_new in line 147 7-13.
147 FLT L Average Maturity
Fixed_or_Floating=FLT asset_or_liability=L Base_rate_new=. _ERROR_=1 _N_=4
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.HAVE has 4 observations and 3 variables.
Basically it's saying that when you tried to read the character strings as numeric it couldn't do it, so it left them as missing values. If we print the table we can see the missing values:
proc print data=have;
run;
Result:
Fixed_or_ asset_or_ Base_
Floating liability rate_new
FIX A 10
FIX L .
FLT A 20
FLT L .
Assuming this truly is what your data looks like then we can use the coalesce function to achieve your goal.
data have3 (drop = base_rate);
merge have2 (in = a)
have (in = b);
by fixed_or_floating asset_or_liability;
base_rate_new = coalesce(base_rate_new,base_rate);
run;
The result of doing this gives us this table:
Fixed_or_ asset_or_ Base_
ID Floating liability rate_new
1 FIX A 10
3 FIX A 10
2 FIX L 20
7 FIX L 20
4 FLT A 20
6 FLT A 20
5 FLT L 30
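The coalesce function returns the first non-missing value among the arguments you pass to it. So when base_rate_new already has a value it uses that, and if it doesn't it uses the base_rate field instead. Note that ID 7 ends up with 20 rather than its own base_rate of 10: in a one-to-many merge, base_rate_new is read from have only once per BY group, so the value assigned on the first row of the group (ID 2) is carried over to the rows that follow.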
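As a language-neutral illustration of the fallback rule, here is a small Python sketch. It is not SAS: None stands in for a SAS missing value, and the names coalesce, lookup and have2 are only illustrative. The fallback is applied independently to every row, so each ID falls back to its own base_rate (ID 7 gets 10 here, unlike in the DATA step output shown above).

```python
def coalesce(*values):
    """Return the first argument that is not None, or None if all are."""
    for value in values:
        if value is not None:
            return value
    return None


# Lookup table: None marks the rows that held "Average Maturity".
lookup = {
    ("FIX", "A"): 10, ("FIX", "L"): None,
    ("FLT", "A"): 20, ("FLT", "L"): None,
}

# (ID, Fixed_or_Floating, asset_or_liability, Base_rate)
have2 = [
    (1, "FIX", "A", 10), (2, "FIX", "L", 20), (3, "FIX", "A", 30),
    (4, "FLT", "A", 40), (5, "FLT", "L", 30), (6, "FLT", "A", 20),
    (7, "FIX", "L", 10),
]

# Use the lookup value when present, otherwise the row's own base rate.
base_rate_new = {
    id_: coalesce(lookup.get((ff, al)), base_rate)
    for id_, ff, al, base_rate in have2
}
print(base_rate_new)
```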

SAS intnx quarter variation

Sorry, it is probably a very simple question, but I can't seem to find an answer to it.
Say, we want to create a table that contains 4 quarters back from the previous month:
%macro asd;
%let today = %sysfunc(today());
%let quarter_count_back = 4;
%let first_quarter = %sysfunc(intnx(month,&today.,-1));
proc sql;
create table quarters
(
Quarters num informat = date9. format = date9.
);
insert into quarters
%do i = 0 %to -&quarter_count_back.+1 %by -1;
values(%sysfunc(intnx(quarter,&first_quarter.,&i.)))
%end;
;
quit;
run;
%mend asd;
%asd;
run;
This code works just fine and creates a table, which starts from APR2016 and goes back in time by quarter. However, if I change the number in the 'first_quarter' line for -2, -3 etc... the code always starts from JAN2016 which just doesn't make any sense to me! For example:
%let first_quarter = %sysfunc(intnx(month,&today.,-2));
It seems logical that if I put this line in the code the table should start from MAR2016 and go back by quarter, but it does not, it starts from JAN2016.
Any ideas on what I am doing wrong here?
Thanks!
The default alignment for the INTNX function is the beginning of the interval, so with the quarter interval any date in January, February or March is moved to 01JAN before the quarters are counted back. Going back 3 months at a time is different from going back by calendar quarter. You can adjust this with the fourth parameter of the INTNX function, which controls the alignment. Options include:
Same
Beginning
End
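If you want steps of three months from the starting month rather than calendar quarters, use the Same alignment, for example %sysfunc(intnx(quarter,&first_quarter.,&i.,same)), or step the MONTH interval by multiples of 3 instead of using quarter.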
http://support.sas.com/documentation/cdl/en/lefunctionsref/63354/HTML/default/viewer.htm#p10v3sa3i4kfxfn1sovhi5xzxh8n.htm
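To see why every starting month in the first quarter ends up at JAN2016, here is a small Python sketch that mimics the two behaviours. It is an approximation for illustration only, not the INTNX implementation: the helper names shift_months and quarter_beginning are hypothetical, and the day of month is assumed to be 28 or lower so that every shifted date exists.

```python
from datetime import date


def shift_months(d, n):
    """Move d by n months and keep the day of month (assumes day <= 28)."""
    index = d.year * 12 + (d.month - 1) + n
    return date(index // 12, index % 12 + 1, d.day)


def quarter_beginning(d, n=0):
    """Snap d to the start of its calendar quarter, then move n quarters.

    This mirrors the default (beginning) alignment of the quarter interval.
    """
    first_month = 3 * ((d.month - 1) // 3) + 1
    return shift_months(date(d.year, first_month, 1), 3 * n)


start = date(2016, 3, 1)
# Default alignment: March is snapped back to January first.
print([quarter_beginning(start, -i) for i in range(4)])
# Three-month steps from the starting month itself.
print([shift_months(start, -3 * i) for i in range(4)])
```

The first list starts at 1 January 2016 and the second at 1 March 2016, which matches the difference between the default alignment and stepping three months at a time.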

split a double into parts - matlab

I have a date vector in form
20140117
20130325
20130530
etc.
There are 5,000,000 lines in the double vector.
How can I transfer it to a datevector recognized by matlab?
I don't like changing it to string and putting the parts in separately. It takes too long!
Please Help!
A combination of fix and mod lets you extract the digits you want (x is a column vector of yyyymmdd values):
%Matrix Columns YY,MM,DD,hh,mm,ss
[mod(fix(x/10000),10000),mod(fix(x/100),100),mod(x,100),zeros(size(x,1),3)]
%datenum
datenum(mod(fix(x/10000),10000),mod(fix(x/100),100),mod(x,100))
If you really want to avoid casting to string then back to number then you can use this method:
D = [20140117; 20130325; 20130530];
YY = fix( D./10000 ) ;
MM = fix( (D-YY.*10000) /100 ) ;
DD = fix( (D-YY.*10000-MM.*100 ) );
DateInMatlabformat = datenum( YY , MM , DD ) ;
You can package that in a one-liner if you want, but basically what it does is:
Divide by 10000 and truncate to get the year in the variable YY.
Remove this part from your original date (D-YY.*10000), then divide by 100 and truncate to get the month.
Remove all of that, and what remains is the day.
The last line merges all of that into a MATLAB serial date number. Read the doc on datenum and datestr for more information.
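The same integer arithmetic can be checked quickly outside MATLAB. The Python sketch below is only an illustration of the divide-and-remainder steps described above; the function name split_yyyymmdd is hypothetical, and it works on one value at a time rather than on a whole vector.

```python
from datetime import date


def split_yyyymmdd(n):
    """Split a number such as 20140117 into (year, month, day)."""
    year, rest = divmod(int(n), 10000)   # 2014 and 117
    month, day = divmod(rest, 100)       # 1 and 17
    return year, month, day


raw = [20140117, 20130325, 20130530]
dates = [date(*split_yyyymmdd(n)) for n in raw]
print(dates)
```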

displaying dates of values - time series data

I am trying to maintain a table using some panel data. I have all the data outputting fine, but I am having difficulty getting the correct dates to display. The method I am using is the following:
gen ymdny = date(date,"MDY"); /*<- date var from panel dataset that i import*/
sort name ymdny;
summ ymdny;
local lastdate : disp %tdM-D r(max);
local lastdate2 : disp %tdM-D (r(max)-1);
local lastw : disp %tdM-D (r(max)-7);
This would work fine if the data were daily, but the dataset I have is actually business daily (i.e. it has no observations for weekends and national bank holidays). It seems silly, but I have not been able to figure out a workaround that does the job. Ideally, there is a function that I can use to print the date corresponding to a particular value.
For example:
gen resbal_1d = round(l1.resbal,0.1);
gen dateOf = dateOf(resbal_1d); /* <- pseudocode example of what I would like */
I'm not sure what you're asking for, but my guess is that you want to see a human-readable date as the output, given a numerical input (this is your last sentence). So simply try something like:
display %td 10
The format is important as the following shows (see help format):
display %tq 10
Same numerical input, different format, different output.
Two other examples from the manual:
* string to integer
display date("5-12-1998", "MDY")
* string to date format
display %td date("5-12-1998", "MDY")
As for your example code, I don't get what you're aiming for. In effect, you can summarize the date variable because in Stata, dates are just integers. It's legal, but I couldn't say if it's good form. Below is a simple example.
clear all
set more off
set obs 10
gen date = _n // create the data
format date %td // give date format
list
summarize date
local onedate = r(max)
display %td `onedate'
Some references:
[U] 24 Working with dates and time
help datetime
help datetime business calendars
http://www.stata.com/support/faqs/data-management/creating-date-variables/
http://www.ats.ucla.edu/stat/stata/modules/dates.htm
(Maybe you can explain with more detail and context what it is you want.)
Edit
Your comment
I do not see how this helps with the date output. For example,
displaying r(max) - 1 on a monday will still display the sunday date.
does not explain, at all, the problems you're having with Stata's business calendars.
I'm adding what is basically an example taken from the help file I already referenced. I do this with the hope of convincing you that (re)-reading the help files is worthwhile.
*clear all
set more off
* import string dates
infile str10 sdate float x using http://www.stata-press.com/data/r13/bcal_simple
list
*----- Regular dates -----
* create elapsed dates - Stata's way of managing dates
generate rdate = date(sdate, "MD20Y")
format rdate %td
drop sdate x
list
* compute previous and next dates
generate tomorrow1 = rdate + 1
format tomorrow1 %td
generate yesterday1 = rdate - 1
format yesterday1 %td
list
*----- Business dates -----
* convert regular date to business dates
generate bdate = bofd("simple", rdate)
format bdate %tbsimple
* compute previous and next dates
generate tomorrow2 = bdate + 1
format tomorrow2 %tbsimple
generate yesterday2 = bdate - 1
format yesterday2 %tbsimple
order yesterday1 rdate tomorrow1 yesterday2 bdate tomorrow2
list
/*
The stbcal-file for simple, the calendar shown below,

        November 2011
 Su  Mo  Tu  We  Th  Fr  Sa
---------------------------
          1   2   3   4   X
  X   7   8   9  10  11   X
  X  14  15  16  17  18   X
  X  21  22  23   X   X   X
  X  28  29  30
---------------------------
*/
Notice that if you add or subtract 1 from a regular date, business days are not taken into account. If you do the same with a business calendar date, you get what you want. Business calendars are defined by .stbcal files; the example uses a built-in calendar called simple. You may need to make your own .stbcal file, but it is not difficult. Again, the details are in the help files.
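The idea behind a business calendar can be sketched in a few lines of Python: keep an ordered list of the dates that count as business days and move through that list by position instead of adding or subtracting calendar days. This is only an illustration of the concept, not how Stata implements %tb dates. The helper names build_calendar and shift_business_days are hypothetical, and the two holidays are the weekday dates marked X in the November 2011 calendar above.

```python
from bisect import bisect_left
from datetime import date, timedelta


def build_calendar(start, end, holidays=()):
    """Return the sorted business days between start and end, inclusive."""
    days = []
    current = start
    while current <= end:
        # weekday() is 0 for Monday and 6 for Sunday.
        if current.weekday() < 5 and current not in holidays:
            days.append(current)
        current += timedelta(days=1)
    return days


def shift_business_days(d, n, calendar):
    """Move n business days from d; d must itself be a business day."""
    i = bisect_left(calendar, d)
    if i == len(calendar) or calendar[i] != d:
        raise ValueError(f"{d} is not a business day in this calendar")
    j = i + n
    if not 0 <= j < len(calendar):
        raise IndexError("shift runs past the end of the calendar")
    return calendar[j]


holidays = {date(2011, 11, 24), date(2011, 11, 25)}
calendar = build_calendar(date(2011, 11, 1), date(2011, 11, 30), holidays)

monday = date(2011, 11, 28)
print(monday - timedelta(days=1))                  # the Sunday before
print(shift_business_days(monday, -1, calendar))   # the previous business day
```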