Self-taught at SPSS here. Need to know the appropriate syntax to recode four DATE variables into one, based on which would be the latest date. I have four DATE variables in a dataset with 165 cases:
wnd_heal_date
wnd_heal_d14_date
wnd_heal_d30_date
wnd_heal_3m_date
Each variable may or may not contain a value for each case. I want to compute a new variable (x_final_wound_heal_date) that scans the dates from all four and keeps only the latest one.
How can I use the SELECT IF function for this purpose?
SELECT IF selects cases (rows) in the data, so it is not appropriate here. What you can do instead is this:
compute x_final_wound_heal_date =
max(wnd_heal_date, wnd_heal_d14_date, wnd_heal_d30_date, wnd_heal_3m_date).
VARIABLE LABELS x_final_wound_heal_date 'Time to definitive wound healing (days)'.
VARIABLE LEVEL x_final_wound_heal_date(SCALE).
ALTER TYPE x_final_wound_heal_date(DATE11).
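This will put the latest of the available date values in the new variable; MAX ignores missing arguments, so cases with only some of the four dates filled in still get a result.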
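To make the missing-value behaviour concrete, here is a minimal sketch of the same "latest of the available dates" logic in Python rather than SPSS syntax. The helper name latest_date and the sample dates are made up for illustration; only the four variable names come from the question.

```python
from datetime import date
from typing import Optional


def latest_date(*dates: Optional[date]) -> Optional[date]:
    """Return the latest non-missing date, or None if every date is missing."""
    present = [d for d in dates if d is not None]
    return max(present) if present else None


# Hypothetical case in which two of the four dates are missing.
case = {
    "wnd_heal_date": None,
    "wnd_heal_d14_date": date(2021, 3, 14),
    "wnd_heal_d30_date": date(2021, 3, 30),
    "wnd_heal_3m_date": None,
}

x_final_wound_heal_date = latest_date(*case.values())
print(x_final_wound_heal_date)
```

A case with no dates at all ends up with a missing result instead of an error, which mirrors what you would want in the dataset.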
I have been struggling for days to use dates from Excel properly in AnyLogic.
I've created a database in which the date is stored as integers in separate columns, since otherwise Excel messes up the dates (for example, year=2021, month=12, day=5, hour=6, minute=44, second=0 stands for 2021/12/5 6:44:00).
Now I know this can be converted to a date with the function toDate(year, month, day, hour, minutes, seconds). But how can I use these integers to create agents with specific parameters from the database in a Source and add them to a custom population?
The simplest way would be to add a column to the database that holds the result of toDate(......), but I do not know how to do this (see picture if it is unclear). Or are there other solutions?
One way: use Dynamic Events.
Create one and in the action code, write mySource.inject(1)
In Main, on startup, load all database rows and create a Dynamic Event for each row, below assuming there is only an hour column:
(Use the database query wizard to adjust your query).
In your source object, set it to "call of inject() function"
This will work, but it is quite cumbersome, as you can see. Much easier if you get your Excel right and just import the date column clean and well so you can use the Source option "arrival table in database" directly. I know you need regular arrivals, so maybe code that up in Excel to give you these on specific dates...
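The date arithmetic behind scheduling one event per row can be sketched outside AnyLogic. The Python below is not AnyLogic code; it only illustrates turning the integer columns into a timestamp and computing how long after the model start the arrival should fire. The column names follow the question, while model_start and both function names are assumptions made for the example.

```python
from datetime import datetime


def row_to_datetime(row: dict) -> datetime:
    """Build a timestamp from the separate integer columns of one row."""
    return datetime(row["year"], row["month"], row["day"],
                    row["hour"], row["minute"], row["second"])


def delay_seconds(model_start: datetime, arrival: datetime) -> float:
    """Seconds between the (assumed) model start and the arrival time."""
    return (arrival - model_start).total_seconds()


# Example row taken from the question: 2021/12/5 6:44:00.
row = {"year": 2021, "month": 12, "day": 5, "hour": 6, "minute": 44, "second": 0}
arrival = row_to_datetime(row)

# Assumption: the model starts at midnight on the same day.
model_start = datetime(2021, 12, 5, 0, 0, 0)
print(arrival, delay_seconds(model_start, arrival))
```

In the model, a delay of this kind is what each per-row event would wait for before calling inject().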
I have an excel sheet with many tabs. Say one is called wsMain and the other is called wsDate.
In my data flow transformation I am able to successfully load the data from wsMain to my table.
Now I have to update this transformation: I need to fetch the maximum date from the worksheet wsDate and only load data from wsMain where the date is less than or equal to the maximum date in wsDate (that is the only column available).
So far I have figured out that I need to create a new Excel connection manager to read the data from wsDate, and I have used the Aggregate transformation to get the maximum date.
Now the question is how do I use this date to restrict the rows coming from wsMain?
I understand from the link below that you can store the value in a variable but what do I do next?:
SSIS set result set from data flow to variable
I have tried using a merge join but not sure if I am doing it right.
Here is what it looks like now:
I could not achieve the above but would be interested to know if it is possible. As a workaround I have created a separate data flow where I store the value in a variable and then use the variable in the Conditional Split to filter the required rows:
Here is a step by step guide I followed to write the variable:
https://www.proteanit.com/2008/12/11/ssis-writing-to-a-package-variable-in-a-dataflow/
You can obtain the maximum value of the wsDate column first, then use this as a filter to avoid introducing unnecessary records into the data flow which would be discarded by the Conditional Split. An overview of this process is below. I'd also recommend confirming the data types for all columns involved.
Create an SSIS DateTime variable and name this something descriptive such as MaxDate.
Create a Data Flow Task before the current one with an Excel Source component. Use the SQL command option for the Data Access Mode and enter a SQL statement to return the max value of the wsDate column. In the first example below, ExcelSource is the name of the sheet that you're pulling from. I'd suggest confirming the query with the Preview button on the Excel Source as well.
Add a Script Component (not Task) after the Excel Source. Add the MaxDate variable in the ReadWriteVariables field on the main page of the Script Component. On the Input Columns pane, add the output column from the Excel Source as an Input Column with the ReadOnly usage type. Example C# code for this is below. Note that variables can only be written to in the PostExecute method. The Input0_ProcessInputRow method is called once for each row that passes through; however, there will only be a single row in this case. In the code below, MaxExcelDate is the name of the output column from the Excel Source.
On the Excel Source component in the Data Flow Task where the records are imported from Excel, change the Data Access Mode to SQL command and enter a SQL statement to return records that have a date less than or equal to the maximum wsDate value. This is the last example and the ? is a placeholder for the parameter. After entering this SQL, click the Parameters button and select Parameter0 for the Parameters field, the MaxDate variable for Variables field, and a direction of Input. The Conditional Split can then be removed since these records will now be filtered out.
Excel MAX wsDate SELECT:
SELECT MAX(wsDate) AS MaxExcelDate FROM ExcelSource
C# Script Component:
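// Holds the value read from the single input row until PostExecute runs.
private DateTime maxDate;

public override void PostExecute()
{
    base.PostExecute();
    // ReadWrite variables can only be assigned in PostExecute.
    Variables.MaxDate = maxDate;
}

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // MaxExcelDate is the output column of the Excel Source query.
    maxDate = Row.MaxExcelDate;
}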
Excel Command with Date Filter:
SELECT
Column1,
Column2,
Column3
FROM ExcelSheet
WHERE DateColumn <= ?
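The two-step approach described above (read the maximum date first, then use it as a parameter to filter the main rows at the source) can be sketched in plain Python. This is only an illustration of the logic, not SSIS code; the row dictionaries, the sample values, and the function names are hypothetical, while wsDate and DateColumn follow the names used in the examples.

```python
from datetime import date


def max_date(ws_date_rows):
    """Step 1: equivalent of SELECT MAX(wsDate) over the date sheet."""
    return max(row["wsDate"] for row in ws_date_rows)


def rows_up_to(ws_main_rows, cutoff):
    """Step 2: equivalent of WHERE DateColumn <= ? with the cutoff as parameter."""
    return [row for row in ws_main_rows if row["DateColumn"] <= cutoff]


# Hypothetical sheet contents.
ws_date = [{"wsDate": date(2020, 1, 10)}, {"wsDate": date(2020, 1, 31)}]
ws_main = [
    {"Column1": "a", "DateColumn": date(2020, 1, 15)},
    {"Column1": "b", "DateColumn": date(2020, 1, 31)},
    {"Column1": "c", "DateColumn": date(2020, 2, 1)},
]

cutoff = max_date(ws_date)
kept = rows_up_to(ws_main, cutoff)
print(cutoff, [row["Column1"] for row in kept])
```

Filtering at the source means the row dated after the cutoff never enters the data flow, which is why the Conditional Split becomes unnecessary.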
Yes, it is possible. In the data flow, you will need to determine the max date, which you already have. Next, you will need to MERGE JOIN the two data flows on the date column. From there, you will feed it into a CONDITIONAL SPLIT and split where the date columns match [i.e., !ISNULL()] versus do not match [i.e., ISNULL()]. In your case, you only want the matches. The non-matches will be disregarded.
Note: if you use an INNER JOIN on the MERGE JOIN where there is only one date (i.e., MaxDate) to join on, then this will take care of the row filtering for you. You will not need a CONDITIONAL SPLIT.
Welcome to ETL.
Update
It is a real pain that SSIS's MERGE JOINs only perform joins on EQUAL operations as opposed to LESS THAN and GREATER THAN operations. You will need to separate the data flows.
Use a script component to scan the Excel file for the MAX date and assign that value to a package variable in SSIS. Alternatively, you can have a dates table in SQL Server and then use an Execute SQL Task in SSIS to retrieve the MAX date from the table and assign that value to a package variable.
Modify your existing data flow to remove the reading of the Excel date file completely. Then add a DERIVED COLUMN transformation and add a new column that is mapped to the package variable in SSIS that stores the MAX date. You can name the Derived Column Name 'MaxDate'.
Add a CONDITIONAL SPLIT transformation with the following CONDITION logic: [AsOfDt] <= [MaxDate]
Set the Output Name to Insert Records.
Note: The CONDITIONAL SPLIT creates a new output data flow with restricted/filtered rows. It does not create a new column within the existing data flow. Think of this as a transposition of data flow output from column modification to row modification. Only those rows that match the condition will be sent to the output that you desire. I assume you only want to insert these records, so I named it that. You can choose whatever naming convention you prefer.
Note 2: Sorry for not making the Update my original answer - I haven't used the AGGREGATE transformation before so I was not aware that it restricts row output as opposed to reading a value in the data flow and then assigning it to a variable. That would be a terrific transformation for Microsoft to add to SSIS. It appears that the ROWCOUNT and SCRIPT COMPONENT transformations are the only ones that have the ability to set a package variable value within the data flow.
I need to be able to use the AddDays function to derive the last week from the date column that I have in the dataset.
So, if I have a delivery_date of 3/21/2018, then I want to derive AddDays('3/21/2018',-7.0), only I want to do this for every row in the dataset. But the AddDays function only takes a metric. Can you suggest how I can work around this situation?
Thank you in advance,
Abhilash
As usual it depends on what you want to achieve.
If you need an attribute that returns delivery_date - 7, just create a new attribute and in the definition of the expression you can put a formula like [delivery_date] - 7 or use a pass-through function like ApplySimple to write the formula for your database (more info here).
Note: If you do this, apply the formula (or the ApplySimple) only to the form expressions mapped on the fact table; the forms mapped on the lookup table (your Day dimension table) should have no formula, otherwise parent levels will return wrong values. Also, if you don't have a lookup table for this new attribute, create an alias or enable Attribute Role Recognition (more here).
If you need to calculate metric values for delivery_date - 7, then in that case you need to use a transformation metric. You will need to create a minus 7 days or same day previous week transformation, then associate it to the Delivery Date attribute and create the needed metrics. The last week transformation is included in the MicroStrategy Tutorial project.
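For reference, the row-by-row result that both approaches aim for is simply each delivery date shifted back seven days. A minimal Python sketch follows; it is not MicroStrategy syntax, and the function name and the second sample date are made up for illustration.

```python
from datetime import date, timedelta


def same_day_previous_week(d: date) -> date:
    """Shift a date back by exactly seven days."""
    return d - timedelta(days=7)


# The first date is the one from the question; the second crosses a month boundary.
delivery_dates = [date(2018, 3, 21), date(2018, 3, 5)]
previous_week = [same_day_previous_week(d) for d in delivery_dates]
print(previous_week)
```

Month and year boundaries are handled by the date arithmetic itself, which is also what a database-side formula such as [delivery_date] - 7 relies on.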
My end goal is to have a box change color when the last 3 records entered into a field (based on the time of input) in FileMaker meet a certain criterion (e.g. variance < 2). I would like to know how to make this happen, or how a calculation/script can be written to only look at the last 3 records.
There are several ways you could approach this. A simple one would be to use a script to:
Show all records in the given table;
Unsort them (assuming they were entered in chronological order; otherwise sort them by creation timestamp);
Omit all records except the last three;
Get the value of a summary field defined as Standard Deviation of your value field;
Set a global variable/field to the square of the returned value.
Then use the global variable/field to conditionally format your "box".
If you don't want to use a script, you will have to define a relationship in order to get the last three values in the table, regardless of the current found set and/or sort order. Or you may use the ExecuteSQL() function for this.
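Whichever route you take, the calculation itself is: order the records by creation time, keep the last three, and square their standard deviation to get the variance. Here is a hedged sketch of that logic in Python, not FileMaker. The record layout and field names (created, value) are hypothetical, and it assumes the sample (n - 1) variance is wanted; use statistics.pvariance instead if you need the population variance.

```python
from datetime import datetime
from statistics import variance


def last_three_variance(records):
    """Sample variance of the three most recently created values, or None."""
    ordered = sorted(records, key=lambda r: r["created"])
    values = [r["value"] for r in ordered[-3:]]
    if len(values) < 3:
        return None  # not enough records yet
    return variance(values)


# Hypothetical records, deliberately not in chronological order.
records = [
    {"created": datetime(2023, 1, 3, 9, 0), "value": 5.0},
    {"created": datetime(2023, 1, 1, 9, 0), "value": 10.0},
    {"created": datetime(2023, 1, 4, 9, 0), "value": 6.0},
    {"created": datetime(2023, 1, 2, 9, 0), "value": 4.0},
]

result = last_three_variance(records)
highlight = result is not None and result < 2
print(result, highlight)
```

The boolean at the end plays the role of the conditional-formatting test on the global variable/field.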
I want to use one parameter for date and another one for time in my reports as shown below.
Start Time [16/01/2012][12.00 am]
Can anyone help me regarding that?
Sure, it is a multi-step process:
1. Set up a variable of type TEXT with 'DATE' as its name and prompt.
2. Set its 'Default Values' in the left pane to be '1/16/2012'.
3. Set up a variable of type TEXT with 'TIME' as its name and prompt.
4. Set its 'Default Values' in the left pane to be '00:00'.
5. Set up a dataset, 'AvailableDateTime', to combine the two into a legitimate datetime field:
SELECT CAST(@Date + ' ' + @Time AS DateTime) AS Datetime
6. Set up a third variable of type DATETIME with 'DATETIME' as its name and prompt.
7. Set up this variable to use 'AVAILABLE VALUES' in the left pane of properties with 'Get values from a query'. Use the dataset from step 5.
You now have set up separate fields for date and time.
A further consideration to avoid user input errors: you may wish to make the first two variables selectable ONLY from values you set in available values or from a query. The problem is that if a user fat-fingers the date or time, the report will not run, as the system is only trying to combine two strings and make a datetime out of them. You may wish to list values directly from a query from the get-go.
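The failure mode described here (two free-text strings that may not form a valid datetime) is easy to see in a small sketch. The Python below is not SSRS or T-SQL; it only illustrates the combine-and-validate idea, and the function name and the month/day/year and 24-hour formats are assumptions matching the default values above.

```python
from datetime import datetime
from typing import Optional


def combine_date_time(date_text: str, time_text: str) -> Optional[datetime]:
    """Combine two text inputs into a datetime, or return None if invalid."""
    try:
        return datetime.strptime(f"{date_text} {time_text}", "%m/%d/%Y %H:%M")
    except ValueError:
        return None  # mistyped date or time


print(combine_date_time("1/16/2012", "00:00"))   # valid input
print(combine_date_time("13/45/2012", "00:00"))  # mistyped date
```

Restricting both parameters to a list of available values removes the invalid case entirely, which is the point of the suggestion above.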
EDIT FOR CHANGING FIRST TWO VARIABLES:
1. You may set the first variable as datetime, which gives the end user a calendar.
2. You can set up a second dataset to get the available times for an end user:
DECLARE @time TABLE (tm int)
DECLARE @cursor int = 0
WHILE @cursor <= 23
BEGIN
    INSERT INTO @time VALUES (@cursor)
    SET @cursor += 1
END
SELECT CAST(CAST(tm AS varchar) + ':00' AS time) AS HourOfTheDay
FROM @time
3. Set your second variable to get its values from the query made in step 2 directly above.
You should now be able to put the values together as above.
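For comparison, the list of selectable hours that the query produces (one entry per hour, 0 through 23) looks like this when sketched in Python. This is only an illustration of the resulting values, not something SSRS runs; the function name and the "HH:MM" label format are assumptions.

```python
from datetime import time


def hours_of_the_day():
    """One time value per full hour, from 00:00 to 23:00."""
    return [time(hour=h) for h in range(24)]


labels = [t.strftime("%H:%M") for t in hours_of_the_day()]
print(labels[0], labels[-1], len(labels))
```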
As I said in my comment, SSRS does not allow you to have separate parameters for Date and Time.
It has only one parameter Date/Time.
As I see you have two options.
Add a text parameter and consider that as time. You could then do some validation depending on what tech you are using.
Another way to solve this would be creating a list of possible values. You select Integer type, for instance, and then create a list of Available Values. (see images)