Modeler question: Is there a function in SPSS for multiple 'if' statements? Forecasting dates

Modeler question: Is there a function in SPSS for multiple 'if' statements? Forecasting dates - spss-modeler

I am trying to build a forecast for interest expense for floating debt in my company.
I have been given a set of ResetDates which help me match a given rate based on when the ResetDate is.
I have been successful in forecasting one period, but I need a much longer set of periods to satisfy my requirements.
I've tried derive nodes and nested if statements as well as filler nodes.
I am given this data to work with, I can only look at one ResetDate ahead.
Here you will find the data I used: Columns A/B/C/D is what i'm given, Column E (or 5th column from left to right) is what I want to derive as my output
I want to use 'InterestPayDate' and derive:
if it's more than 'NextReset' , the add 90 days to the 'NextReset' to create 'NextReset2'
That is as far as I can get.... where my problem lies is I want to look at NextReset2 and derive:
if 'InterestPayDate' is more than 'NextReset2', then add 90 days to 'NextReset2', if it's less than 'NextReset2', keep the current value for 'NextReset2'
Output should look like Column E here
Not sure if I need to dig deeper into the logical functions, in all honesty, I've just picked up SPSS and I am really trying to learn. Hopefully, you can point me in the right direction.
Thank you.

After computing the first NextReset2, you need to use a Filler node like the one below to change the value of the field.
You might need more than one identical nodes like this - one for each potential 90-day period that you are looking to extend the NextReset2 date. In your sample data, you will need at least two Filler nodes to get the correct value of NextReset2 for the last of the records.
There might be a more elegant way to do it, but this will work and it's easy enough to make copies of a node and string them together like this.
Please also see a sample IBM SPSS Modeler stream showing this approach here and using your sample data.

Related

How do I sort this scatter plot?

I would like to sort this scatter plot, which is summarized with a Band that includes Minimum, Average, and Maximum.
I would like to sort it in 2 ways:
by Average
by Widest Range (ie difference between Minimum and Maximum values)
Tableau Public workbook
If you can't view this or I'm not allowed to post external resources on stackoverflow, then perhaps you can show me on this screenshot what I would click to get started on the following sort
Also, bonus question, is there a way to create a control for the user to toggle between the 2 sort methods in the same chart? Or do I have to duplicate the chart with a different sort type for each?
One note is that I only have Tableau Public version since I'm evaluating the product. Until I get a paid version, I can't open a workbook file unless you publish it to Tableau Public cloud. But rather than give me the workbook answer, I would just appreciate it if you gave me instructions to do this as this is more of a learning exercise.
Thanks!

Somewhat unfortunately, you'll have to replicate the min,avg,max by creating 3 calculated fields. Tableau cannot operate on the values placed on the view via reference lines.
These calculations might look something like these:
{Fixed [Cwe]: Min([Cvss Score])}
~
{Fixed [Cwe]: Avg([Cvss Score])}
~
{Fixed [Cwe]: Max([Cvss Score])}
In general, from there, you should pretty easily be able to apply them to the view and sort. Average will be easy. The difference between Min and Max will just need a subtracting calculated field to sort by. Once they're on the view, I'd put them as a dimension (column) to verify that the numbers look correct.
Take note that LOD calculations take place before filtering, so you'll want to put the Cvss filter you have there 'on context' by right clicking it and clicking 'add to context'
Here is how I would complete the sorts:
Starting with all the above calculations on 'Rows' and ensuring that they are 'Dimensions' (Blue).
After right clicking "Sort..." on [Sub-Category] on 'Rows'. Select which field to sort by.
From there, the calculated fields can be taken off the rows column. (They were only there in the first place to ensure that you could check that the sorts took place. They don't actually need to have been there in the first place.)

Tableau Dual Axis with different filters

I am trying to create a graph with two lines, with two filters from the same dimension.
I have a dimension which has 20+ values. I'd like one line to show data based on just one of the selected values and the other line to show a line excluding that same value.
I've tried the following:
-Creating a duplicate/copy dimension and filtering the original one with the first, and the copy with the 2nd. When I do this, the graphic disappears.
-Creating a calculated field that tries to split the measures up. This isn't letting me track the count.
I want this on the same axis; the best I've been able to do is create two sheets, one with the first filter and one with the 2nd, and stack them in a dashboard.
My end user wants the lines in the same visual, otherwise I'd be happy with the dashboard approach. Right now, though, I'd also like to know how to do this.

It is a little hard to tell exactly what you want to achieve, but the problem with filtering is common.
The principle that is important is that Tableau will filter the whole dataset by row. So duplicating the dimension you want to filter won't help as the filter on the original dimension will also filter the corresponding rows in the second dimension. Any solution has to be clever enough to work around this issue.
One solution is to build two new dimensions that use a calculation rather than a filter to create the new result. Let's say you have a dimension, [size] that has a range of numbers from 1 to 10 and you want to compare the total number of rows including and excluding the number 5. You could create a new field using a formula like if [size] <> 5 then 1 else 0 end
Summing the new field will give a count of the number of rows that don't contain a 5 and this can be compared directly to a rowcount of the original [size] field which will give the number including the value 5.
This basic principle can be extended to much more complex logic. The essential point is to realise that filters act on every row in your data and can't, by themselves, show comparisons with alternative filter choices on a single visualisation.
Depending on the nature of your problem there may be other solutions worth looking at including sets and groups but you would need to provide more specific details for users here to tell you whether they would be useful.

We can make a a set out of the values of the dimension and then place it in the required shelf. So, you will have your dimension which will plot accordingly and set which will have data as per the requirement because with filter you can't have that independence of showing data everytime you want.

Different Aggregation calculations of a measure using two dimensions in Tableau

It is a Tableau 8.3 Desktop Edition question.
I am trying to aggregate data using two different dimensions. So, I want to aggregate twice: first I want to sum over all the rows and then multiply the results in a cummulative manner (so I can build a graph). How do I do that? Ok, too vague, here follow some more details:
I have a set of historical data. The columns are the date, the rows are the categories.
Easy part: I would like to sum all the rows.
Hard part: Given this those summations I want to build a graph that for each date it shows the product of all the summations from the earlier date till this date.
In another words:
Take the sum of all rows, call it x_i, where i is the date.
For each date i find y_i such that y_i = x_0 * x_1 * ... * x_i (if there is missing data, consider it to be one)
Then show a line graph for the y values versus the date.
I have searched for a solution for this and tried to figure it out by myself, but failed.
Thank you very much for your time and help :)

You need n calculated fields (number of columns you have), and manually do the calculation you need:
y_i = sum(field0)*sum(field1)
Basically because you cannot iterate on columns. For tableau, each column represent a different dimension or measure. So it won't consider that there is a logic order among them, meaning, it won't assume that column A comes before column B. It will assume A and B are different things.
Tableau works better with tables organized as databases. So if you have year columns, you should reorganize your data, eliminate all those columns and create a single field called 'Date', which will identify the value of your measure for that date. Yes, you will have less columns but far more rows. But Tableau works better this way (for very good reasons).
Tableau 9.0 allows you to do that directly. I only watched a demo (it was launched yesterday), but I understand that now there is an option to selected those columns (in the Data Connection tab) and convert them to a database format.
With that done, you can use a PREVIOUS_VALUE function to help you. I'm not with Tableau right now. As soon as I get to it I'll update this with the final answer . Unless you take the lead and discover yourself before that ;)

Plotting data sets of different lengths from a struct - avoid padding

I hope this is a simple enough question, but I am a beginner and haven't managed it on my own after several sessions.
I have a 1x29 struct of financial market data with 8 fields: stock_market.
the first field is the date
As you can see, not all the components of the stock market i.e. the stocks have the same length time series (some comanies are newer than others or have since left the market etc.)
I would like to know in general how best to manipulate such an uneven struct of time series data. One specific example is that I'd like to write a function that can extract the 'Date' and 'AdjClose' (adjusted closing price) from the struct, for one individual stock, and plot that on a graph, with the dates shown/labelled on the x-axis and prices on the right. An extension is to plot several AdjClose time series on one x-axis together, even though the length of their respective Date values are different. I don't want to pad these strings out so that I have lines running along zero for a portion of the graph.
Other more complicated functions shouldn't be difficult once I could manage this task.
I haven't shared any code, as it is simply 'plot(stock_market(1).Date, stock_market(1).AdjClose)` and mostly red with errors. I have read a few discussions about padding, but like i mentioned above, don't want to pad with zeros or any other number for the diagrams. For other functions, I may have to pad them, but if anyone has a different solution - it'd be great to hear it :-)
Thanks in advance.

Prediction/delay forcasting using Machine Learning?

I have a set of data for the past 5 years. Approx 7000 rows of data with features that are binary {yes/no} or are multi-classed {product A, B, C} A total of about 20+ features.
I am trying to make a program (or one time analysis project) to determine (predict) the product shipdate(shipping delay days) based on this historical data. I have 2 columns that indicate when a product was planned to be shipped and another column of when it was actually shipped! Currently.
I'm wondering how I can make a prediction program that determines based on the historic data when new data input of a product will expect to ship. I don't care about a getting a specific date but even just a program that can tell me number of delay days to add...
I took an ML class a while back and I wasn't sure how to start something like this. Any advice? Plus the closest thing to this I can think of is an image recognition assignment using NN. but that was too easy here I have to deal with a date instead of pixel white/black.... I used Matlab back in the day (I still know how to use it) but I just downloaded Weka data mining tool.
I was thinking of a neural network but I'm not sure how to set it up to have my program give me a the expected delay time (# of days/month) from the inputed ship date.
Basically,
I want to input (size = 5, prod = A, ....,expected ship date = jan 1st)
and the program returns the number of days to add as a delay onto my expected ship date given the historical trends...
Would appreciate any any help on how start something like this the correct/easiest/best way... Thanks in advance.

If you use weka, then get your input/label data into the arff format and then you try out all the different regressors (this is a regression problem after all). To avoid having to do too much programming quite yet (if you are just in an exploratory phase), use the weka experimenter which has a GUI for trying out a whole bunch of regressors on your dataset.
Then when you find one that does something expected and you want to do some more data analysis using MATLAB, then you can use a weka/matlab interface.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse