I am new to Stack Overflow and a beginner in SAS. I have two datasets: one with a list of IDs and medications by date, and one with IDs and dates by admission number. I am trying to get a list of medications by ID, organized by admission number, in SAS.
I've tried merging by ID number and creating an "admission number" variable by using:
if admission_date-admission_date_1=0 then admission_number="Admission 1";
but all values are missing when I do that.
Here's what I have:
Here's what I want:
Thank you for your help!
That second dataset doesn't seem useful at all. What you're doing there is creating an enumeration variable, which can be accomplished with BY-group processing.
proc sort data=have; by id admission_date; run;
data want;
set have;
by id admission_date;
if first.id then admission_number=0;
if first.admission_date then admission_number + 1;
run;
More details are available on the methods here if needed.
https://stats.idre.ucla.edu/sas/faq/how-can-i-create-an-enumeration-variable-by-groups/
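For anyone who wants to see the counting logic outside SAS, here is a rough Python sketch of the same idea: reset a counter at each new id and bump it at each new admission date within that id. The rows and the function name number_admissions are made up for illustration and are not from the question.

```python
from itertools import groupby

# Hypothetical rows: (id, admission_date, medication), invented for illustration.
rows = [
    (1, "2020-01-05", "aspirin"),
    (1, "2020-01-05", "heparin"),
    (1, "2020-03-10", "aspirin"),
    (2, "2020-02-01", "insulin"),
]

def number_admissions(rows):
    """Append a per-id counter that increases on each new admission date."""
    out = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))  # plays the role of PROC SORT
    for _id, id_rows in groupby(ordered, key=lambda r: r[0]):
        number = 0  # reset when a new id starts
        for _date, date_rows in groupby(id_rows, key=lambda r: r[1]):
            number += 1  # bump at each new date within the id
            for r in date_rows:
                out.append(r + (number,))
    return out

numbered = number_admissions(rows)
```

Rows that share an id and admission date get the same number, which is what the FIRST. flags do in the data step.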
I have a dataset in which each id has multiple incomplete records. It would make more sense to have a final dataset with one row per id, as shown. The idea is to fill each blank with the non-missing value, whether it comes from the 1st line or the 2nd line, as long as it belongs to the same id.
The easiest way to do this is the self-update. This uses the core property of the update statement, that only non-missing values can replace other values, in a fun way that collapses the rows like this. The first reference, have(obs=0), is there simply to give an empty base to update from; the dataset is really being read in from the second mention on that statement.
data have;
id = 1;
input x y z;
datalines;
1 . .
. 1 .
. . 1
;;;;
run;
data want;
update have(obs=0) have;
by id;
run;
proc sql;
create table need as
Select ID, max(v1) as v1,
max(v2) as v2,
max(v3) as v3,
max(v4) as v4
from have
group by ID;
quit;
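As a rough illustration of what both approaches are doing, here is a small Python sketch that collapses rows per id by letting non-missing values overwrite earlier ones. It mirrors the x, y, z example above; None stands in for a SAS missing value, and the function name collapse is hypothetical.

```python
# Hypothetical rows mirroring the datalines example; None stands for missing.
rows = [
    {"id": 1, "x": 1, "y": None, "z": None},
    {"id": 1, "x": None, "y": 1, "z": None},
    {"id": 1, "x": None, "y": None, "z": 1},
]

def collapse(rows, key="id"):
    """Keep one row per key; a later non-missing value overwrites an earlier one."""
    merged = {}
    for row in rows:
        target = merged.setdefault(row[key], {key: row[key]})
        for name, value in row.items():
            if name == key:
                continue
            if value is not None:
                target[name] = value           # non-missing always wins
            else:
                target.setdefault(name, None)  # keep the column even if always missing
    return list(merged.values())

want = collapse(rows)
```

One difference worth keeping in mind: with the update approach the last non-missing value per id wins, while max() picks the largest one. They agree when each variable has at most one non-missing value per id.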
How do I populate the number of purchases and sales per day in Tableau?
Here is my Sample Data:
In my first attempt, the sales numbers are not counted against the exact date.
In my second attempt, I tried to tabulate by dropping Sales Date into the rows. However, it returned two figures, purchases and sales.
I have also tried a calculated field, but Tableau cannot do a "for loop" like Python.
First attempt:
After dropping Sales Date into the Rows. This is what I get:
Is there any way to populate it like this? Please help, I am still new to Tableau. Special thanks to Fabio Fantoni for the first solution!
Desired Format:
I have another set of sample data (refer to Sample Data 2) which I would like to populate in the desired format (refer to Desired Format 2). In Sample Data 2, the purchase date "15/12/2020" does not appear among the sold dates.
My apologies, but I may require some guidance as I am still new to Tableau. Thank you in advance.
Sample Data 2:
Desired Format 2:
Based on this sample:
In order to bypass your double count for two different date columns, you may want to cross join your original data with a copy of it on original.Purchase = support.Sold, like this:
Doing so, you just have to create two calculated fields:
count Purchase:
count([Purchase Date])
count Sold:
Count([Purchase Date (Foglio11)])
The only thing you have to pay attention to is that in the second calculation you have to count the Purchase Date from the joined copy, because of your "inverted" cross join.
You should get something like this:
I have a table with members, dates and results. The date represents the month for which the results were generated for the various members. The table stores results for multiple months. Now I want to compare the current month's result with last month's for a particular member and assign "equal", ">" or "<" based on their values. How can I do that in Tableau?
Any help is greatly appreciated!
Thank you.
You can use the LOOKUP function. Assuming you have the data below:
You can create a calculated field in Tableau with the expression below:
LOOKUP(sum([Result]),-1)
Note that you need to use an aggregate function on your measure.
After creating the lookup column, create a calculated field to do the comparison:
This should give you the result you are looking for:
You can edit the table calculations to define the reset level for your calculation:
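To make the comparison logic concrete outside Tableau, here is a minimal Python sketch of "look one row back, then compare". The data and the function name compare_with_previous are invented for illustration; the first month gets no sign because there is nothing before it, much like LOOKUP returning Null at the start of a partition.

```python
def compare_with_previous(results):
    """results: list of (month, value) for one member, already sorted by month."""
    signs = []
    previous = None
    for month, value in results:
        if previous is None:
            sign = None        # first month: nothing to compare against
        elif value > previous:
            sign = ">"
        elif value < previous:
            sign = "<"
        else:
            sign = "equal"
        signs.append((month, value, sign))
        previous = value
    return signs

# Hypothetical monthly results for a single member.
member_results = [("2021-01", 10), ("2021-02", 12), ("2021-03", 12), ("2021-04", 7)]
compared = compare_with_previous(member_results)
```

Running this once per member corresponds to restarting the table calculation for every member.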
OK I'll start with the problem:
I have product tables being created every week which are named in the format:
products_20130701
products_20130708
.
.
.
I'm trying to automate some campaign analysis so that I don't have to manually change the table name in the code every week to use whichever product table is the first one after the maximum end date of my campaign.
e.g
%put &max_enddate.;
/*20130603*/
my product tables in June are:
products_20130602
*products_20130609*
products_20130616
products_20130623
In this instance I would like to use the second table in the list, ignoring over 12 months' worth of product tables and just selecting the table whose date is just after my max_enddate macro variable.
I've been Googling all day and I'm stumped so ANY advice would be much appreciated.
Thanks!
A SQL solution:
data product_20130603;
run;
data product_20130503;
run;
data product_20130703;
run;
%let campdate=20130601;
proc sql;
select min(memname) into :datasetname from dictionary.tables
where libname='WORK' and upcase(scan(memname,1,'_'))='PRODUCT' and
input(scan(memname,2,'_'),YYMMDD8.) ge input("&campdate.",YYMMDD8.);
quit;
Now you have &datasetname that you can use in the set statement, so
data my_analysis;
set &datasetname;
/* whatever you are doing */
run;
Modify 'WORK' to the appropriate libname, and if there are any other restrictions add those as well. You might get some warnings about invalid dates if you have product_somethingnotadate, but that shouldn't matter.
The way this works: dictionary.tables is a list of all tables in all libnames you have accessed (the same as sashelp.vtable, but only available in PROC SQL). The query first selects all rows whose name carries a date greater than or equal to your campaign end date, then takes min(memname) of those. Memname is of course a string, but because the names are identical apart from a fixed-width yyyymmdd suffix, the alphabetical minimum is also the earliest date, so min gives the expected result.
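Here is the same selection logic as a small Python sketch, in case it helps to see it step by step: keep the names with the right prefix and a valid date on or after the campaign date, then take the earliest. The function name pick_table and the list of names are assumptions for illustration only.

```python
from datetime import datetime

def pick_table(names, campaign_date, prefix="PRODUCT_"):
    """Return the earliest-dated table name on or after campaign_date, or None."""
    cutoff = datetime.strptime(campaign_date, "%Y%m%d").date()
    candidates = []
    for name in names:
        if not name.upper().startswith(prefix):
            continue
        suffix = name[len(prefix):]
        if len(suffix) != 8 or not suffix.isdigit():
            continue  # suffix is not shaped like yyyymmdd
        try:
            table_date = datetime.strptime(suffix, "%Y%m%d").date()
        except ValueError:
            continue  # eight digits, but not a real calendar date
        if table_date >= cutoff:
            candidates.append((table_date, name))
    return min(candidates)[1] if candidates else None

# Hypothetical table names matching the three empty datasets created above.
tables = ["PRODUCT_20130503", "PRODUCT_20130603", "PRODUCT_20130703"]
chosen = pick_table(tables, "20130601")
```

Unlike the SQL version, this sketch compares real dates rather than strings, but with fixed-width yyyymmdd names the two orderings are the same.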
This is probably not suitable for your application, but I find it very useful for the datasets I have, as they absolutely must exist for each Sunday and I check that the dataset exists at the beginning of my code. If it doesn't exist, the code sends an email to our IT guys telling them that the file is missing and needs to be re-created or restored.
%LET DSN = PRODUCTS_%SYSFUNC(PUTN(%SYSFUNC(INTNX(WEEK.2,%SYSFUNC(INPUTN(&MAX_ENDDATE.,YYMMDD8.)),0,END)),YYMMDDN8.));
The other suggestions above only return datasets that exist, so if the one you should have been using has been deleted, they will grab the next one and run the job regardless.
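To show what that %LET line computes, here is a minimal Python sketch of the same date arithmetic, assuming (as the WEEK.2 interval with END alignment implies) weeks that run Monday to Sunday and tables dated on the Sunday. The function name expected_table is hypothetical.

```python
from datetime import datetime, timedelta

def expected_table(max_enddate, prefix="PRODUCTS_"):
    """Name of the table dated on the Sunday closing max_enddate's Monday-Sunday week."""
    day = datetime.strptime(max_enddate, "%Y%m%d").date()
    sunday = day + timedelta(days=6 - day.weekday())  # weekday(): Monday=0 ... Sunday=6
    return prefix + sunday.strftime("%Y%m%d")

name = expected_table("20130603")
```

With the max_enddate from the question this points at products_20130609. Note that an end date that already falls on a Sunday maps to that same Sunday, not the following one.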
First, get all possible tables:
data PRODUCT_TABLES;
set SASHELP.VTABLE (keep=libname memname);
*get what you need, here i keep it simple;
where lowcase(substr(memname,1,9))='products_';
run;
Next, sort it by date, easily done due to the format of your dataset names.
proc sort data=PRODUCT_TABLES;
by memname;
run;
Finally, you just need to get out the first record where the date is large enough.
data _NULL_;
set PRODUCT_TABLES;
*compare to your macro variable, note that I keep it as simple as possible and let SAS implicitly convert the 8-character date to numeric;
if substr(memname,10,8)>=symgetn("max_enddate") then do;
*set your match into a macro variable, i have put together the libname and memname here;
call symput("selectedTable",cats(libname,'.',memname));
stop; *do not continue, otherwise you will output simply the latest dataset;
end;
run;
Now you can just put the macro variable when you want to use the appropriate dataset, e.g.:
data SOME_TABLE;
set &selectedTable.;
/*DO SOME STUFF*/
run;
A record in a table contains a range of valid dates, say:
tbl1.start_date and tbl1.end_date. So to ensure I get all records that are valid for a specific date range, the selection logic is: <...> WHERE end_date >= #dtFrom AND start_date < #dtTo (the #dtTo parameter used in the SQL statement is actually the calculated next day of the #prmDt_To parameter used in the report).
Now in a report I need to count the number of records for each day within the specified date range and include the days, if any, for which there were no valid records. Thus a retrieved record may be counted on several different days. I can do it relatively easily with a recursive CTE within the dataset, but my rule of thumb is to avoid unnecessary load on the SQL database and instead return just the necessary raw data and let the report engine handle the groupings. So is there a means to do this within SSRS?
Thank you,
Sergey
You might be able to do something in SSRS with custom code, but I recommend against it. The place to do this is in the dataset. SSRS is not designed to fill in groups that don't exist in the dataset. That sounds like what you are trying to do: SSRS would need to create the groups for each date whether or not that date is in the dataset.
If you don't have a number or date table in your database, I would just create a recursive CTE with a record for every date in the range you are interested in, as you mention. Then outer join this to your table and use COUNT(tbl1.start_date) to count the valid records for each day; days with no match count as zero. This shouldn't be too painful a query for SQL Server.
If you really need to avoid the CTE, then I would create a date or number table to use to generate the dates in your range.
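The logic of "one row per day, outer joined to the valid records" can be sketched in a few lines of Python. This is only an illustration of the counting, not SSRS or T-SQL; it assumes start and end dates are whole days with both ends inclusive, and the name daily_counts and the sample records are made up.

```python
from datetime import date, timedelta

def daily_counts(records, date_from, date_to):
    """Count records valid on each day from date_from to date_to, inclusive."""
    counts = {}
    day = date_from
    while day <= date_to:
        # A record counts for every day that falls inside its validity range.
        counts[day] = sum(1 for start, end in records if start <= day <= end)
        day += timedelta(days=1)
    return counts

# Hypothetical (start_date, end_date) pairs.
records = [
    (date(2021, 1, 1), date(2021, 1, 3)),
    (date(2021, 1, 3), date(2021, 1, 3)),
]
counts = daily_counts(records, date(2021, 1, 1), date(2021, 1, 5))
```

Days with no valid record still appear with a count of zero, which is the part the report cannot produce on its own when those dates are absent from the dataset.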