Using Recursion to Generate a List of Items You Can Afford on a Budget - matlab

Recursion is still baffling me. I understand the basis of it and how it's supposed to work, but I am struggling with how to actually make it work. For my function, I'm given a cell array that has costume items and prices, as well as a budget (given as a double). I have to output a cell array of the items I can buy (in order from cheapest to most expensive) and output how much money I have leftover in my budget. There is a chance I will run out of money before I buy all of the items I need to, and a chance where I do buy everything I need. These would be my two terminating conditions. I have to use recursion and I am not allowed to use sort in this problem. So I am struggling a little. Mostly with figuring out the base case situation. I don't understand that bit. Or how to do recursion with two inputs and outputs. So basically my function looks like:
function[bought, money] = costumeParty(items, budget)
Here is what I have to output:
Test case:
Costume Items:
'Eyepatch' 8.94000000000000
'Adult-sized Teletubby Onesie' 2.89000000000000
'Cowboy Boots' 1.30000000000000
'Mermaid Tail' 1.75000000000000
'Life Vest' 8.10000000000000
'White Bedsheet With Eyeholes' 4.30000000000000
'Lizard Leggings' 0.650000000000000
'Gandalf Beard' 4.23000000000000
'Parachute Pants' 7.49000000000000
'Ballerina Tutu' 8.75000000000000
'Feather Boa' 1.69000000000000
'Groucho Glasses' 6.74000000000000
'80''s Leg Warmers' 5.08000000000000
'Cat Ear Headband' 6.36000000000000
'Ghostface Mask' 1.83000000000000
'Indoor Sunglasses' 2.25000000000000
'Vampire Fangs' 0.620000000000000
'Batman Utility Belt' 7.08000000000000
'Fairy Wand' 5.48000000000000
'Katana' 6.81000000000000
'Blue Body Paint' 5.70000000000000
'Superman Cape' 4.78000000000000
'Assorted Glow Sticks' 4.07000000000000
'Ash Ketchum''s Baseball Cap' 3.57000000000000
'Hipster Mustache' 6.47000000000000
'Camouflage Jacket' 8.73000000000000
'Two Chains Value Pack' 4.76000000000000
'Toy Pistol' 8.41000000000000
'Sushi Chef Headband' 2.59000000000000
'Pitchfork' 8.57000000000000
'Witch Hat' 4.27000000000000
'Dora''s Backpack' 4.13000000000000
'Fingerless Gloves' 0.270000000000000
'George Washington Wig' 7.35000000000000
'Clip-on Parrot' 4.32000000000000
'Christmas Stockings' 8.69000000000000
A lot of items sorry.
[costume1, leftover1] = costumeParty(costumeItems, 5);
costume1 => {'Fingerless Gloves'
'Vampire Fangs'
'Lizard Leggings'
'Cowboy Boots'
'Feather Boa' }
leftover1 => 0.47
What I have:
function[bought, money] = costumeParty(items, budget)
%// I commented these out, because I was unsure of using them:
%// item = [items(:,1)];
%// costumes = [item{:,:}];
%// price = [items{:,2}];
if budget == 0 %// One of the terminating conditions. I think.
money = budget;
bought ={};
%// Here is where I run into issues. I am trying to use recursion to find out the money leftover
else
money = costumeParty(items{:,2}) - costumeParty(budget);
%// My logic here was, costumeParty takes the second column of items and subtracts it from the budget, but it claims I have too many inputs. Any suggestions?
bought = {items(1,:)};
end
end
If I could get an example of how to do recursion with two inputs/outputs, that'd be great, but I couldn't seem to find any. Googling did not help. I'm just...baffled.
I did try to do something like this:
function[bought, money] = costumeParty(items, budget)
item = [items(:,1)];
costumes = [item{:,:}];
price = [items{:,2}];
if budget == 0
money = 0;
bought ={};
else
money = price - budget;
bought = {items(1,:)};
end
end
Unfortunately, that's not exactly recursive. Or, I don't think it is and that didn't really work anyway. One of the tricks to doing recursion is pretending the function is already doing what you want it to do (without you actually coding it in), but how does that work with two inputs and outputs?
Another attempt, because I'm going to figure this darn thing out somehow:
function[bought, money] = costumeParty(items, budget)
price = [items{:,2}]; %// Gives me the prices in a 1x36 double
if price <= budget %// If the price is less than the budget (which my function should calculate) you populate the list with these items
bought = [costumeParty(items,budget)];
else %// if not, keep going until you run out of budget money. Or something
bought = [costumeParty(items{:,2},budget)];
end
I think I need to figure out how to sort the prices first. Without using the sort function. I might just need a whole lesson on recursion. This stuff confuses me. I don't think it should be this hard .-.
I think I'm getting closer!
function[bought, money] = costumeParty(items, budget)
%My terminating conditions are when I run out of the budget and when buying
%the next item, would break my budget
price = [items{:,2}];
Costumes = [items(:,1)];
[~,c] = size(price);
bought = {};
Locate = [];
List = [];
for j = 1:c %// Need to figure out what to do with this
[Value, IND] = min(price(:));
List = [List price(IND)];
end
while budget >= 0
if Value < budget
bought = {Costumes(IND)};
money = budget - price(IND);
elseif length(Costumes) == length(items)
bought = {Costumes(IND)};
money = budget - price(IND);
else
bought=43; %// Arbitrary, ignore
budget = budget - price;
end
budget = budget - price;
end
duck = 32; %// Arbitrary, ignore

From my understanding of the question the recursion needs to be used for sorting the items arrays and then after you have a sorted array you can then decide how many objects and which can be bought based on the budget you have
Therefore, you need to implement a classic recursive sorting algorithm. You may find a few online but the idea is to split your whole list into sub lists and do the same sorting for them and so on.
After the implementation, you will then need to have a threshold of the budget in place.
Another approach will be as you started with 2 items. Then you will need to scan the whole list every time in the look for the cheapest item, cross it from the list and pass the next function an item list with this item missing and a budget that will be lower by that some. Though I don't see the need of a recursion for this implementation, since loops will be more then enough here.
Edit: Code:
This is an idea of a code, didn't run it, and it should have problems with the indexing (you nedd to address the budget and the lables differently) but I think it shows the point.
function main(items,budget)
boughtItemIndex=itemslist(items,budget)
moneyLeft=budget;
for i=1:1:length(boughtItemIndex)
disp(item(boughtItemIndex(i)))
moneyLeft=moneyLeft-boughtItemIndex(i);
end
disp('Money left:');
moneyLeft;
boughtItemIndex=function itemslist(items,budget)
[minVal minInd]=findmin(items)
if (budget>minVal)
newitems=items;
newitem(minInd)=[];
newbudget=budget-minVal;
boughtItemIndex=[minIn, itemlist(newitem,newbudget)];
end
[minVal minInd]=function findmin(items)
minVal=0;
minInd=0;
for i=1:1:length(items)
if (items(i)<minVal)
minVal=items(i);
minInd=i;
end
end

Related

Looking for advice on improving a custom function in AnyLogic

I'm estimating last mile delivery costs in an large urban network using by-route distances. I have over 8000 customer agents and over 100 retail store agents plotted in a GIS map using lat/long coordinates. Each customer receives deliveries from its nearest store (by route). The goal is to get two distance measures in this network for each store:
d0_bar: the average distance from a store to all of its assigned customers
d1_bar: the average distance between all customers common to a single store
I've written a startup function with a simple foreach loop to assign each customer to a store based on by-route distance (customers have a parameter, "customer.pStore" of Store type). This function also adds, in turn, each customer to the store agent's collection of customers ("store.colCusts"; it's an array list with Customer type elements).
Next, I have a function that iterates through the store agent population and calculates the two average distance measures above (d0_bar & d1_bar) and writes the results to a txt file (see code below). The code works, fortunately. However, the problem is that with such a massive dataset, the process of iterating through all customers/stores and retrieving distances via the openstreetmap.org API takes forever. It's been initializing ("Please wait...") for about 12 hours. What can I do to make this code more efficient? Or, is there a better way in AnyLogic of getting these two distance measures for each store in my network?
Thanks in advance.
//for each store, record all customers assigned to it
for (Store store : stores)
{
distancesStore.print(store.storeCode + "," + store.colCusts.size() + "," + store.colCusts.size()*(store.colCusts.size()-1)/2 + ",");
//calculates average distance from store j to customer nodes that belong to store j
double sumFirstDistByStore = 0.0;
int h = 0;
while (h < store.colCusts.size())
{
sumFirstDistByStore += store.distanceByRoute(store.colCusts.get(h));
h++;
}
distancesStore.print((sumFirstDistByStore/store.colCusts.size())/1609.34 + ",");
//calculates average of distances between all customer nodes belonging to store j
double custDistSumPerStore = 0.0;
int loopLimit = store.colCusts.size();
int i = 0;
while (i < loopLimit - 1)
{
int j = 1;
while (j < loopLimit)
{
custDistSumPerStore += store.colCusts.get(i).distanceByRoute(store.colCusts.get(j));
j++;
}
i++;
}
distancesStore.print((custDistSumPerStore/(loopLimit*(loopLimit-1)/2))/1609.34);
distancesStore.println();
}
Firstly a few simple comments:
Have you tried timing a single distanceByRoute call? E.g. can you try running store.distanceByRoute(store.colCusts.get(0)); just to see how long a single call takes on your system. Routing is generally pretty slow, but it would be good to know what the speed limit is.
The first simple change is to use java parallelism. Instead of using this:
for (Store store : stores)
{ ...
use this:
stores.parallelStream().forEach(store -> {
...
});
this will process stores entries in parallel using standard Java streams API.
It also looks like the second loop - where avg distance between customers is calculated doesn't take account of mirroring. That is to say distance a->b is equal to b->a. Hence, for example, 4 customers will require 6 calculations: 1->2, 1->3, 1->4, 2->3, 2->4, 3->4. Whereas in case of 4 customers your second while loop will perform 9 calculations: i=0, j in {1,2,3}; i=1, j in {1,2,3}; i=2, j in {1,2,3}, which seems wrong unless I am misunderstanding your intention.
Generally, for long running operations it is a good idea to include some traceln to show progress with associated timing.
Please have a look at above and post results. With more information additional performance improvements may be possible.

How do i exclude some elements of a list from further calculations

So I have a list of stars and their respective distances. My assignment is to find which stars are in a certain distance (+- 10parsec). I want to exclude some of them from further calculations in the program. The thing is I don't want to remove them completely so remove, pop etc isn't helping me. I still want those stars on the list to be present in my output csv. I just want a line saying something like those stars which don't support the if statement, don't use them in this calculation. So i guess the output would be blank for those.
I suppose it is an if or for statement, to mark those bad stars as False and then down the line use calculation that excludes those faulty stars.
I'm a physics student and this is my first python program ever! Please be cool about my ignorance...
Edit: forgive me if i include useless stuff i don't really know what's important. I also use uncertainties library if its of any use
column_names = ['id','pi','s_pi','v_r' ,'s_v', 'dis', 'X',
'ra_h', 'ra_m', 'ra_s','dec_d', 'dec_m',
'dec_s', 'ma', 's_ma', 'md', 's_md']
data = pd.read_csv("hyades_data.dat", skiprows=2, sep='\s+',
names=column_names)
calculations with all
v_r = unumpy.uarray(data['v_r'], data['s_v'])
ma = unumpy.uarray(data['ma'], data['s_ma'])
md = unumpy.uarray(data['md'], data['s_md'])
mi = unumpy.sqrt(ma**2+md**2)
r_m = v_r*unumpy.tan(th)/(4.74*mi/1000)
diff = np.abs(r_pc - r_m)
'''
if np.abs(dist-46.43) <=10:
r_m=True
else r_m=False
at this point i want to make the distiction
'''
mean_diff = diff.mean()
print("Mean : ")
print(mean_diff)
print(a_ref,d_ref)
df_va=pd.DataFrame(v_r)
df_mi = pd.DataFrame(mi)
df_rm = pd.DataFrame(r_m)
df_rpc = pd.DataFrame(r_pc)
df_diff = pd.DataFrame(diff)
#df_mean_diff = pd.DataFrame(mean_diff)
ve = v_r*np.tan(th)
output = pd.concat([data['id'], ra, dec, th_d, df_mi, df_rm, df_rpc,
df_diff,df_va], axis=1)
output.columns = ['id','ra', 'dec', 'th_d','mi', 'r_m', 'r_pc',
'dist_diff','va']
output.to_csv('results.csv', index=False)

"Appending" to an ArraySlice?

Say ...
you have about 20 Thing
very often, you do a complex calculation running through a loop of say 1000 items. The end result is a varying number around 20 each time
you don't know how many there will be until you run through the whole loop
you then want to quickly (and of course elegantly!) access the result set in many places
for performance reasons you don't want to just make a new array each time. note that unfortunately there's a differing amount so you can't just reuse the same array trivially.
What about ...
var thingsBacking = [Thing](repeating: Thing(), count: 100) // hard limit!
var things: ArraySlice<Thing> = []
func fatCalculation() {
var pin: Int = 0
// happily, no need to clean-out thingsBacking
for c in .. some huge loop {
... only some of the items (roughly 20 say) become the result
x = .. one of the result items
thingsBacking[pin] = Thing(... x, y, z )
pin += 1
}
// and then, magic of slices ...
things = thingsBacking[0..<pin]
(Then, you can do this anywhere... for t in things { .. } )
What I am wondering, is there a way you can call to an ArraySlice<Thing> to do that in one step - to "append to" an ArraySlice and avoid having to bother setting the length at the end?
So, something like this ..
things = ... set it to zero length
things.quasiAppend(x)
things.quasiAppend(x2)
things.quasiAppend(x3)
With no further effort, things now has a length of three and indeed the three items are already in the backing array.
I'm particularly interested in performance here (unusually!)
Another approach,
var thingsBacking = [Thing?](repeating: Thing(), count: 100) // hard limit!
and just set the first one after your data to nil as an end-marker. Again, you don't have to waste time zeroing. But the end marker is a nuisance.
Is there a more better way to solve this particular type of array-performance problem?
Based on MartinR's comments, it would seem that for the problem
the data points are incoming and
you don't know how many there will be until the last one (always less than a limit) and
you're having to redo the whole thing at high Hz
It would seem to be best to just:
(1) set up the array
var ra = [Thing](repeating: Thing(), count: 100) // hard limit!
(2) at the start of each run,
.removeAll(keepingCapacity: true)
(3) just go ahead and .append each one.
(4) you don't have to especially mark the end or set a length once finished.
It seems it will indeed then use the same array backing. And it of course "increases the length" as it were each time you append - and you can iterate happily at any time.
Slices - get lost!

Using For and While Loops to Determine Who to Hire MATLAB

It's that time of the week where I realize just how little I understand in MATLAB. This week, we have homework on iteration, so using for-loops and while-loops. The problem I am currently experiencing difficulties with is one where I have to write a function that decides who to hire somebody. I'm given a list of names, a list of GPAs and a logical vector that tells me whether or not a student stayed to talk. What I have to output is the names of people to hire and the time they spent chatting with the recruiter.
function[candidates_hire, time_spent] = CFRecruiter(names, GPAs, stays_to_talk)
In order to be hired, a canidate must have a GPA that is higher than 2.5 (not inclusive). In order to be hired, the student must stick around to talk, if they don't talk, they don't get hired. The names are separated by a ', ' and the GPAs is a vector. The time spent talking is determined by:
Time in minutes = (GPA - 2.5) * 4;
My code so far:
function[candidates_hire, time_spent] = CFRecruiter(names, GPAs, stays_to_talk)
candidates = strsplit(names, ', ');
%// My attempt to split up the candidates names.
%// I get a 1x3 cell array though
for i = 1:length(GPAs)
%// This is where I ran into trouble, I need to separate the GPAs
student_GPA = (GPAs(1:length(GPAs)));
%// The length is unknown, but this isn't working out quite yet.
%// Not too sure how to fix that
return
end
time_spent = (student_GPA - 2.5) * 4; %My second output
while stays_to_talk == 1 %// My first attempt at a while-loop!
if student_GPA > 2.5
%// If the student has a high enough GPA and talks, yay for them
student = 'hired';
else
student = 'nothired'; %If not, sadface
return
end
end
hired = 'hired';
%// Here was my attempt to get it to realize how was hired, but I need
%// to concatenate the names that qualify into a string for the end
nothired = 'nothired';
canidates_hire = [hired];
What my main issue is here is figuring out how to let the function know them names(1) has the GPA of GPAs(1). It was recommended that I start a counter, and that I had to make sure my loops kept the names with them. Any suggestions with this problem? Please and thank you :)
Test Codes
[Names, Time] = CFRecruiter('Jack, Rose, Tom', [3.9, 2.3, 3.3],...
[false true true])
=> Name = 'Tom'
Time = 3.2000
[Names, Time] = CFRecruiter('Vatech, George Burdell, Barnes Noble',...
[4.0, 2.5, 3.6], [true true true])
=> Name = 'Vatech, Barnes Noble'
Time = 10.4000
I'm going to do away with for and while loops for this particular problem, mainly because you can solve this problem very elegantly in (I kid you not) three lines of code... well four if you count returning the candidate names. Also, the person who is teaching you MATLAB (absolutely no offense intended) hasn't the faintest idea of what they're talking about. The #1 rule in MATLAB is that if you can vectorize your code, do it. However, there are certain situations where a for loop is very suitable due to the performance enhancements of the JIT (Just-In-Time) accelerator. If you're curious, you can check out this link for more details on what JIT is about. However, I can guarantee that using loops in this case will be slow.
We can decompose your problem into three steps:
Determine who stuck around to talk.
For those who stuck around to talk, check their GPAs to see if they are > 2.5.
For those that have satisfied (1) and (2), determine the total time spent on talking by using the formula in your post for each person and add up the times.
We can use a logical vector to generate a Boolean array that simultaneously checks steps #1 and #2 so that we can index into our GPA array that you are specifying. Once we do this, we simply apply the formula to the filtered GPAs, then sum up the time spent. Therefore, your code is very simply:
function [candidates_hire, time_spent] = CFRecruiter(names, GPAs, stays_to_talk)
%// Pre-processing - split up the names
candidates = strsplit(names, ', ');
%// Steps #1 and #2
filtered_candidates = GPAs > 2.5 & stays_to_talk;
%// Return candidates who are hired
candidates_hire = strjoin(candidates(filtered_candidates), ', ');
%// Step #3
time_spent = sum((GPAs(filtered_candidates) - 2.5) * 4);
You had the right idea to split up the names based on the commas. strsplit splits up a string that has the token you're looking for (which is , in your case) into separate strings inside a cell array. As such, you will get a cell array where each element has the name of the person to be interviewed. Now, I combined steps #1 and #2 into a single step where I have a logical vector calculated that tells you which candidates satisfied the requirements. I then use this to index into our candidates cell array, then use strjoin to join all of the names together in a single string, where each name is separated by , as per your example output.
The final step would be to use the logical vector to index into the GPAs vector, grab those GPAs from those candidates who are successful, then apply the formula to each of these elements and sum them up. With this, here are the results using your sample inputs:
>> [Names, Time] = CFRecruiter('Jack, Rose, Tom', [3.9, 2.3, 3.3],...
[false true true])
Names =
Tom
Time =
3.2000
>> [Names, Time] = CFRecruiter('Vatech, George Burdell, Barnes Noble',...
[4.0, 2.5, 3.6], [true true true])
Names =
Vatech, Barnes Noble
Time =
10.4000
To satisfy the masses...
Now, if you're absolutely hell bent on using for loops, we can replace steps #1 and #2 by using a loop and an if condition, as well as a counter to keep track of the total amount of time spent so far. We will also need an additional cell array to keep track of those names that have passed the requirements. As such:
function [candidates_hire, time_spent] = CFRecruiter(names, GPAs, stays_to_talk)
%// Pre-processing - split up the names
candidates = strsplit(names, ', ');
final_names = [];
time_spent = 0;
for idx = 1 : length(candidates)
%// Steps #1 and #2
if GPAs(idx) > 2.5 && stays_to_talk(idx)
%// Step #3
time_spent = time_spent + (GPAs(idx) - 2.5)*4;
final_names = [final_names candidates(idx)];
end
end
%// Return candidates who are hired
candidates_hire = strjoin(final_names, ', ');
The trick with the above code is that we are keeping an additional cell array around that stores those candidates that have passed. We will then join all of the strings together with a , between each name as we did before. You'll also notice that there is a difference in checking for steps #1 and #2 between the two methods. In particular, there is a & in the first method and a && in the second method. The single & is for arrays and matrices while && is for single values. If you don't know what that symbol is, that is the symbol for logical AND. This means that something is true only if both the left side of the & and the right side of the & are both true. In your case, this means that someone who has a GPA of > 2.5 and stays to talk must both be true if they are to be hired.

Summing specific fields in Matlab

How do I sum different fields? I want to sum all of the information for material(1) ...so I want to add 5+4+6+300 but I am unsure how. Like is there another way besides just doing material(1).May + material(1).June etc....
material(1).May= 5;
material(1).June=4;
material(1).July=6;
material(1).price=300;
material(2).May=10;
material(2).price=550;
material(3).May=90;
You can use structfun for this:
result = sum( structfun(#(x)x, material(1)) );
The inner portion (structfun(#(x)x, material(1))) runs a function each individual field in the structure, and returns the results in an array. By using the identity function (#(x)x) we just get the values. sum of course does the obvious thing.
A slightly longer way to do this is to access each field in a loop. For example:
fNames = fieldnames(material(1));
accumulatedValue = 0;
for ix = 1:length(fNames)
accumulatedValue = accumulatedValue + material(1).(fNames{ix});
end
result = accumulatedValue
For some users this will be easier to read, although for expert users the first will be easier to read. The result and (approximate) performance are the same.
I think Pursuit's answer is very good, but here is an alternative off the top of my head:
sum( cell2mat( struct2cell( material(1) )));