It's that time of the week where I realize just how little I understand in MATLAB. This week, we have homework on iteration, so using for-loops and while-loops. The problem I am currently experiencing difficulties with is one where I have to write a function that decides whom to hire. I'm given a list of names, a list of GPAs and a logical vector that tells me whether or not a student stayed to talk. What I have to output is the names of people to hire and the time they spent chatting with the recruiter.
function[candidates_hire, time_spent] = CFRecruiter(names, GPAs, stays_to_talk)
In order to be hired, a candidate must have a GPA that is higher than 2.5 (not inclusive). The student must also stick around to talk; if they don't talk, they don't get hired. The names are separated by a ', ' and the GPAs are given as a vector. The time spent talking is determined by:
Time in minutes = (GPA - 2.5) * 4;
My code so far:
function[candidates_hire, time_spent] = CFRecruiter(names, GPAs, stays_to_talk)
candidates = strsplit(names, ', ');
%// My attempt to split up the candidates names.
%// I get a 1x3 cell array though
for i = 1:length(GPAs)
%// This is where I ran into trouble, I need to separate the GPAs
student_GPA = (GPAs(1:length(GPAs)));
%// The length is unknown, but this isn't working out quite yet.
%// Not too sure how to fix that
return
end
time_spent = (student_GPA - 2.5) * 4; %My second output
while stays_to_talk == 1 %// My first attempt at a while-loop!
if student_GPA > 2.5
%// If the student has a high enough GPA and talks, yay for them
student = 'hired';
else
student = 'nothired'; %If not, sadface
return
end
end
hired = 'hired';
%// Here was my attempt to get it to realize how was hired, but I need
%// to concatenate the names that qualify into a string for the end
nothired = 'nothired';
canidates_hire = [hired];
My main issue here is figuring out how to let the function know that names(1) has the GPA of GPAs(1). It was recommended that I start a counter, and that I had to make sure my loops kept the names with them. Any suggestions with this problem? Please and thank you :)
Test Codes
[Names, Time] = CFRecruiter('Jack, Rose, Tom', [3.9, 2.3, 3.3],...
[false true true])
=> Name = 'Tom'
Time = 3.2000
[Names, Time] = CFRecruiter('Vatech, George Burdell, Barnes Noble',...
[4.0, 2.5, 3.6], [true true true])
=> Name = 'Vatech, Barnes Noble'
Time = 10.4000
I'm going to do away with for and while loops for this particular problem, mainly because you can solve this problem very elegantly in (I kid you not) three lines of code... well four if you count returning the candidate names. Also, the person who is teaching you MATLAB (absolutely no offense intended) hasn't the faintest idea of what they're talking about. The #1 rule in MATLAB is that if you can vectorize your code, do it. However, there are certain situations where a for loop is very suitable due to the performance enhancements of the JIT (Just-In-Time) accelerator. If you're curious, you can check out this link for more details on what JIT is about. In this case, though, a loop buys you nothing: the vectorized version is shorter and easier to read.
We can decompose your problem into three steps:
1. Determine who stuck around to talk.
2. For those who stuck around to talk, check their GPAs to see if they are > 2.5.
3. For those that have satisfied (1) and (2), determine the total time spent on talking by using the formula in your post for each person and add up the times.
We can use a logical vector to generate a Boolean array that simultaneously checks steps #1 and #2 so that we can index into our GPA array that you are specifying. Once we do this, we simply apply the formula to the filtered GPAs, then sum up the time spent. Therefore, your code is very simply:
function [candidates_hire, time_spent] = CFRecruiter(names, GPAs, stays_to_talk)
%// Pre-processing - split up the names
candidates = strsplit(names, ', ');
%// Steps #1 and #2
filtered_candidates = GPAs > 2.5 & stays_to_talk;
%// Return candidates who are hired
candidates_hire = strjoin(candidates(filtered_candidates), ', ');
%// Step #3
time_spent = sum((GPAs(filtered_candidates) - 2.5) * 4);
You had the right idea to split up the names based on the commas. strsplit splits up a string that has the token you're looking for (which is , in your case) into separate strings inside a cell array. As such, you will get a cell array where each element has the name of the person to be interviewed. Now, I combined steps #1 and #2 into a single step where I have a logical vector calculated that tells you which candidates satisfied the requirements. I then use this to index into our candidates cell array, then use strjoin to join all of the names together in a single string, where each name is separated by , as per your example output.
The final step would be to use the logical vector to index into the GPAs vector, grab those GPAs from those candidates who are successful, then apply the formula to each of these elements and sum them up. With this, here are the results using your sample inputs:
>> [Names, Time] = CFRecruiter('Jack, Rose, Tom', [3.9, 2.3, 3.3],...
[false true true])
Names =
Tom
Time =
3.2000
>> [Names, Time] = CFRecruiter('Vatech, George Burdell, Barnes Noble',...
[4.0, 2.5, 3.6], [true true true])
Names =
Vatech, Barnes Noble
Time =
10.4000
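If it helps to see what each intermediate value looks like, here is a small walk-through of the same mask-based approach on the first test case. It is only a sketch run as a script; the variable names mask and hired are mine and not part of the assignment.

```matlab
% Walk-through of the vectorized approach on the first test case.
% mask and hired are illustrative names, not required by the assignment.
names = 'Jack, Rose, Tom';
GPAs = [3.9, 2.3, 3.3];
stays_to_talk = [false true true];

candidates = strsplit(names, ', ');          % 1x3 cell: {'Jack', 'Rose', 'Tom'}
mask = GPAs > 2.5 & stays_to_talk;           % logical AND, element by element
hired = strjoin(candidates(mask), ', ');     % keep only the names where mask is true
time_spent = sum((GPAs(mask) - 2.5) * 4);    % apply the formula to the kept GPAs

disp(mask)
disp(hired)
disp(time_spent)
```

Jack has the GPA but did not stay, and Rose stayed but has a GPA below the cutoff, so only the third entry of mask is true.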
To satisfy the masses...
Now, if you're absolutely hell bent on using for loops, we can replace steps #1 and #2 by using a loop and an if condition, as well as a counter to keep track of the total amount of time spent so far. We will also need an additional cell array to keep track of those names that have passed the requirements. As such:
function [candidates_hire, time_spent] = CFRecruiter(names, GPAs, stays_to_talk)
%// Pre-processing - split up the names
candidates = strsplit(names, ', ');
final_names = {}; %// Start with an empty cell array of names
time_spent = 0;
for idx = 1 : length(candidates)
%// Steps #1 and #2
if GPAs(idx) > 2.5 && stays_to_talk(idx)
%// Step #3
time_spent = time_spent + (GPAs(idx) - 2.5)*4;
final_names = [final_names candidates(idx)];
end
end
%// Return candidates who are hired
candidates_hire = strjoin(final_names, ', ');
The trick with the above code is that we are keeping an additional cell array around that stores those candidates that have passed. We will then join all of the strings together with a , between each name as we did before. You'll also notice that there is a difference in checking for steps #1 and #2 between the two methods. In particular, there is a & in the first method and a && in the second method. The single & works element-wise on arrays and matrices, while && works only on single (scalar) values and skips the right side when the left side is already false. If you don't know what that symbol is, it is the symbol for logical AND. This means that something is true only if the left side and the right side of the & are both true. In your case, this means that someone must both have a GPA of > 2.5 and stay to talk if they are to be hired.
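To make the difference between & and && concrete, here is a tiny sketch. The variables a, b and x are made up for the illustration.

```matlab
% & compares two arrays element by element and returns an array.
a = [true false true];
b = [true true false];
elementwise = a & b;               % true only where both a and b are true

% && needs scalar operands and stops early when the left side is false,
% so the right side is not evaluated here even though x is empty.
x = [];
safe = ~isempty(x) && x(1) > 0;

disp(elementwise)
disp(safe)
```

The second half is the usual reason to prefer && inside an if on single values: the left-hand check can guard an expression that would otherwise error.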
I have a data matrix with 3 columns: DATA = [ID, DATE, Value]
I want to filter my data by ID, for example DATAid1 = DATA where ID == 1, and so on.
For that I wrote this code in MATLAB:
load calibrage_capteur.mat
data = [ID ,DATE , Valeur]
minid = min(data(:,1));
maxid = max(data(:,1));
for i=minid:maxid
ind=find(data(:,1) == i)
dataID = [ID(ind) ,DATE(ind) , Valeur(ind)]
end
As a result, only the last value is kept: in this example the max ID is 31, so only the dataID for ID 31 is stored. How can I save the variable at each iteration?
You will want to use a cell array to hold your data rather than saving them as independent variables that are named based upon the ID.
ids = minid:maxid;
data_by_ID = cell(1, numel(ids)); %// One cell per ID
for k = 1:numel(ids)
data_by_ID{k} = data(data(:,1) == ids(k),:);
end
Really though, depending on what you're doing with it, you can use data all of the time since all operations are going to be faster on a numeric matrix than they are on a cell array.
%// Do stuff with data ID = 10
do_stuff(data(data(:,1) == 10, :));
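As a quick check of the cell array approach, here is a sketch on made-up data. The matrix below is invented for the illustration, and using unique instead of minid:maxid is my own variation so that IDs that never occur do not get an empty cell.

```matlab
% Toy data: columns are ID, DATE, Value (values invented for the example).
data = [1 10 0.5;
        2 11 0.7;
        1 12 0.9;
        3 13 0.1];

ids = unique(data(:,1)).';             % only the IDs that actually occur
data_by_ID = cell(1, numel(ids));      % one cell per ID
for k = 1:numel(ids)
    % Logical mask on the first column selects every row with this ID
    data_by_ID{k} = data(data(:,1) == ids(k), :);
end

disp(data_by_ID{1})                    % both rows whose ID is 1
```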
Update
If you absolutely must name your variables you could do the following (but please don't do this and use one of the methods above).
for k = 1:numel(ids)
    eval(['dataId', num2str(ids(k)), ' = data(data(:,1) == ids(k), :);']);
end
Your question is a bit unclear but it sounds like you simply want to save the result at each iteration of the for loop.
I'm assuming min and max id are arbitrary and not necessarily the variable you are trying to index on.
kk = minid:maxid;
dataID = cell(size(kk)); %// Each ID can have a different number of rows
for ii = 1:numel(kk)
    ind = find(data(:,1) == kk(ii));
    dataID{ii} = [ID(ind), DATE(ind), Valeur(ind)];
end
This is better than indexing by minid or maxid directly, since it isn't clear that minid starts at 1 (maybe it starts at 0, or something else).
Recursion is still baffling me. I understand the basis of it and how it's supposed to work, but I am struggling with how to actually make it work. For my function, I'm given a cell array that has costume items and prices, as well as a budget (given as a double). I have to output a cell array of the items I can buy (in order from cheapest to most expensive) and output how much money I have leftover in my budget. There is a chance I will run out of money before I buy all of the items I need to, and a chance where I do buy everything I need. These would be my two terminating conditions. I have to use recursion and I am not allowed to use sort in this problem. So I am struggling a little. Mostly with figuring out the base case situation. I don't understand that bit. Or how to do recursion with two inputs and outputs. So basically my function looks like:
function[bought, money] = costumeParty(items, budget)
Here is what I have to output:
Test case:
Costume Items:
'Eyepatch' 8.94000000000000
'Adult-sized Teletubby Onesie' 2.89000000000000
'Cowboy Boots' 1.30000000000000
'Mermaid Tail' 1.75000000000000
'Life Vest' 8.10000000000000
'White Bedsheet With Eyeholes' 4.30000000000000
'Lizard Leggings' 0.650000000000000
'Gandalf Beard' 4.23000000000000
'Parachute Pants' 7.49000000000000
'Ballerina Tutu' 8.75000000000000
'Feather Boa' 1.69000000000000
'Groucho Glasses' 6.74000000000000
'80''s Leg Warmers' 5.08000000000000
'Cat Ear Headband' 6.36000000000000
'Ghostface Mask' 1.83000000000000
'Indoor Sunglasses' 2.25000000000000
'Vampire Fangs' 0.620000000000000
'Batman Utility Belt' 7.08000000000000
'Fairy Wand' 5.48000000000000
'Katana' 6.81000000000000
'Blue Body Paint' 5.70000000000000
'Superman Cape' 4.78000000000000
'Assorted Glow Sticks' 4.07000000000000
'Ash Ketchum''s Baseball Cap' 3.57000000000000
'Hipster Mustache' 6.47000000000000
'Camouflage Jacket' 8.73000000000000
'Two Chains Value Pack' 4.76000000000000
'Toy Pistol' 8.41000000000000
'Sushi Chef Headband' 2.59000000000000
'Pitchfork' 8.57000000000000
'Witch Hat' 4.27000000000000
'Dora''s Backpack' 4.13000000000000
'Fingerless Gloves' 0.270000000000000
'George Washington Wig' 7.35000000000000
'Clip-on Parrot' 4.32000000000000
'Christmas Stockings' 8.69000000000000
A lot of items sorry.
[costume1, leftover1] = costumeParty(costumeItems, 5);
costume1 => {'Fingerless Gloves'
'Vampire Fangs'
'Lizard Leggings'
'Cowboy Boots'
'Feather Boa' }
leftover1 => 0.47
What I have:
function[bought, money] = costumeParty(items, budget)
%// I commented these out, because I was unsure of using them:
%// item = [items(:,1)];
%// costumes = [item{:,:}];
%// price = [items{:,2}];
if budget == 0 %// One of the terminating conditions. I think.
money = budget;
bought ={};
%// Here is where I run into issues. I am trying to use recursion to find out the money leftover
else
money = costumeParty(items{:,2}) - costumeParty(budget);
%// My logic here was, costumeParty takes the second column of items and subtracts it from the budget, but it claims I have too many inputs. Any suggestions?
bought = {items(1,:)};
end
end
If I could get an example of how to do recursion with two inputs/outputs, that'd be great, but I couldn't seem to find any. Googling did not help. I'm just...baffled.
I did try to do something like this:
function[bought, money] = costumeParty(items, budget)
item = [items(:,1)];
costumes = [item{:,:}];
price = [items{:,2}];
if budget == 0
money = 0;
bought ={};
else
money = price - budget;
bought = {items(1,:)};
end
end
Unfortunately, that's not exactly recursive. Or, I don't think it is and that didn't really work anyway. One of the tricks to doing recursion is pretending the function is already doing what you want it to do (without you actually coding it in), but how does that work with two inputs and outputs?
Another attempt, because I'm going to figure this darn thing out somehow:
function[bought, money] = costumeParty(items, budget)
price = [items{:,2}]; %// Gives me the prices in a 1x36 double
if price <= budget %// If the price is less than the budget (which my function should calculate) you populate the list with these items
bought = [costumeParty(items,budget)];
else %// if not, keep going until you run out of budget money. Or something
bought = [costumeParty(items{:,2},budget)];
end
I think I need to figure out how to sort the prices first. Without using the sort function. I might just need a whole lesson on recursion. This stuff confuses me. I don't think it should be this hard .-.
I think I'm getting closer!
function[bought, money] = costumeParty(items, budget)
%My terminating conditions are when I run out of the budget and when buying
%the next item, would break my budget
price = [items{:,2}];
Costumes = [items(:,1)];
[~,c] = size(price);
bought = {};
Locate = [];
List = [];
for j = 1:c %// Need to figure out what to do with this
[Value, IND] = min(price(:));
List = [List price(IND)];
end
while budget >= 0
if Value < budget
bought = {Costumes(IND)};
money = budget - price(IND);
elseif length(Costumes) == length(items)
bought = {Costumes(IND)};
money = budget - price(IND);
else
bought=43; %// Arbitrary, ignore
budget = budget - price;
end
budget = budget - price;
end
duck = 32; %// Arbitrary, ignore
From my understanding of the question, the recursion needs to be used for sorting the items array. Once you have a sorted array, you can decide how many objects can be bought, and which ones, based on the budget you have.
Therefore, you need to implement a classic recursive sorting algorithm. You may find a few online, but the idea is to split your whole list into sublists, sort those the same way, and so on.
After the implementation, you will then need to have a threshold of the budget in place.
Another approach is the one you started with, using the two inputs. You scan the whole list every time looking for the cheapest item, cross it off the list, and pass the next call an item list with this item missing and a budget that is lower by that sum. Though I don't see the need for recursion in this implementation, since loops will be more than enough here.
Edit: Code:
This is an idea of the code. It works on a plain numeric vector of prices (you need to address the budget and the labels differently for your cell array), but I think it shows the point.
function main(items, budget)
%// items is a numeric vector of prices
[boughtItemIndex, moneyLeft] = itemslist(items, 1:numel(items), budget);
for i = 1:numel(boughtItemIndex)
    disp(items(boughtItemIndex(i)))
end
disp('Money left:');
disp(moneyLeft);
end
function [boughtItemIndex, moneyLeft] = itemslist(items, idx, budget)
%// idx holds the original positions of the prices still in items
[minVal, minInd] = findmin(items);
if budget < minVal
    %// Base case: list is empty (minVal is Inf) or nothing is affordable
    boughtItemIndex = [];
    moneyLeft = budget;
else
    newitems = items; newitems(minInd) = [];
    newidx = idx; newidx(minInd) = [];
    [rest, moneyLeft] = itemslist(newitems, newidx, budget - minVal);
    boughtItemIndex = [idx(minInd), rest];
end
end
function [minVal, minInd] = findmin(items)
minVal = Inf;
minInd = 0;
for i = 1:numel(items)
    if items(i) < minVal
        minVal = items(i);
        minInd = i;
    end
end
end
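For the two-inputs, two-outputs part of the question, the same "cheapest first, then recurse on the rest" idea can be written directly against the costumeParty signature. This is only a sketch under my own assumptions: items is an N-by-2 cell array of {name, price} rows, an item can be bought when its price is at most the remaining budget, and min is allowed (only sort was ruled out in the question).

```matlab
function [bought, money] = costumeParty(items, budget)
% Sketch only. items: N-by-2 cell array, one {name, price} row per item.
% budget: scalar double. Assumes an item is affordable when price <= budget.
    if isempty(items)
        % Base case 1: everything has been bought
        bought = {};
        money = budget;
        return
    end
    prices = [items{:, 2}];            % prices as a numeric row vector
    [cheapest, k] = min(prices);       % cheapest remaining item and its row
    if cheapest > budget
        % Base case 2: the cheapest remaining item would break the budget
        bought = {};
        money = budget;
        return
    end
    remaining = items;
    remaining(k, :) = [];              % cross the bought item off the list
    % Recursive call: smaller list, smaller budget, same two outputs
    [rest, money] = costumeParty(remaining, budget - cheapest);
    bought = [items(k, 1); rest];      % cheapest first, then whatever follows
end
```

Both outputs are simply passed back up from the recursive call: money comes straight from the deepest call, and bought grows by one name at each level, which is why the result ends up ordered from cheapest to most expensive without sorting.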
Alrighty everybody, it's the time of the week where I learn how to do weird things with MATLAB. This week it's DJing. What I need to do is figure out how to make my function output the name of the song whose length is closest to the time left. For instance, if I'm showing off my DJing skills and I have 3:22 left, I have to pick a song whose length is closest to the time left (can be shorter or longer). I'm given a .txt file to choose from.
Test Case
song1 = pickSong('Funeral.txt', '3:13')
song1 => 'Neighborhood #2 (Laika)'
The file for this looks like:
1. Neighborhood #1 (Tunnels) - 4:48
2. Neighborhood #2 (Laika) - 3:33
3. Une annee sans lumiere - 3:40
4. Neighborhood #3 (Power Out) - 5:12
5. Neighborhood #4 (7 Kettles) - 4:49
6. Crown of Love - 4:42
7. Wake Up - 5:39
8. Haiti - 4:07
9. Rebellion (Lies) - 5:10
10. In the Backseat - 6:21
I have most of it planned out; what I'm having an issue with is populating my cell array. It only puts in the last song, and then changes it to a -1 after my loop runs. I've tried doing it three different ways, the last one being the most complex (and gross looking, sorry). Once I get the cell array into its proper form (as the full song list and not just -1) I should be in the clear.
function[song] = pickSong(file_name,time_remain)
Song_list = fopen(file_name, 'r'); %// Opens the file
Song_names = fgetl(Song_list); %// Retrieves the lines, or song names here
Songs_in = ''; %// I had this as a cell array first, but tried to populate a string this time
while ischar(Songs) %// My while loop to pull out the song names
Songs_in = {Songs_in, Songs};
Songs = fgetl(Song_list);
if ischar(Songs_in) %//How I was trying to populate my string
song_info = [];
while ~isempty(Songs_in)
[name, time] = strtok(Songs_in);
song_info = [song_info {name}];
end
end
end
[songs, rest] = strtok(Songs, '-');
[minutes, seconds] = strtok(songs, ':');
[minutes2, seconds2] = strtok(time_remain, ':')
all_seconds = (minutes*60) + seconds; %// Converting the total time into seconds
all_seconds2 = (minutes2*60) + seconds2;
song_times = all_seconds;
time_remain = all_seconds2
time_remain = min(time_remain - song_times);
fclose(file_name);
end
Please and thank you for the help :)
A troublesome case:
song3 = pickSong('Resistance.txt', '3:57')
song3 => 'Exogenesis: Symphony Part 2 (Cross-Pollination)'
1. Uprising - 5:02
2. Resistance - 5:46
3. Undisclosed Desires - 3:56
4. United States of Eurasia (+Collateral Damage) - 5:47
5. Guiding Light - 4:13
6. Unnatural Selection - 6:54
7. MK ULTRA - 4:06
8. I Belong to You (+Mon Coeur S'ouvre a Ta Voix) - 5:38
9. Exogenesis: Symphony Part 1 (Overture) - 4:18
10. Exogenesis: Symphony Part 2 (Cross-Pollination) - 3:57
11. Exogenesis: Symphony Part 3 (Redemption) - 4:37
Here is my implementation:
function song = pickSong(filename, time_remain)
% read songs file into a table
t = readSongsFile(filename);
% query song length (in seconds)
len = str2double(regexp(time_remain, '(\d+):(\d+)', ...
'tokens', 'once')) * [60;1];
% find closest match
[~,idx] = min(abs(t.Duration - len));
% return song name
song = t.Title(idx);
end
function t = readSongsFile(filename)
% read the whole file (as a cell array of lines)
fid = fopen(filename,'rt');
C = textscan(fid, '%s', 'Delimiter',''); C = C{1};
fclose(fid);
% parse lines of the form: "0. some name - 00:00"
C = regexp(C, '^(\d+)\.\s+(.*)\s+-\s+(\d+):(\d+)$', 'tokens', 'once');
C = cat(1, C{:});
% extract columns and create a table
t = table(str2double(C(:,1)), ...
strtrim(C(:,2)), ...
str2double(C(:,3:4)) * [60;1], ...
'VariableNames',{'ID','Title','Duration'});
t.Properties.VariableUnits = {'', '', 'sec'};
end
We should get the expected results on the test files:
>> pickSong('Funeral.txt', '3:13')
ans =
'Neighborhood #2 (Laika)'
>> pickSong('Resistance.txt', '3:57')
ans =
'Exogenesis: Symphony Part 2 (Cross-Pollination)'
Note: The code above uses MATLAB tables to store the data, which allows for easy manipulation. For example:
>> t = readSongsFile('Funeral.txt');
>> t.Minutes = fix(t.Duration/60); % add minutes column
>> t.Seconds = rem(t.Duration,60); % add seconds column
>> sortrows(t, 'Duration', 'descend') % show table sorted by duration
ans =
ID Title Duration Minutes Seconds
__ _____________________________ ________ _______ _______
10 'In the Backseat' 381 6 21
7 'Wake Up' 339 5 39
4 'Neighborhood #3 (Power Out)' 312 5 12
9 'Rebellion (Lies)' 310 5 10
5 'Neighborhood #4 (7 Kettles)' 289 4 49
1 'Neighborhood #1 (Tunnels)' 288 4 48
6 'Crown of Love' 282 4 42
8 'Haiti' 247 4 7
3 'Une annee sans lumiere' 220 3 40
2 'Neighborhood #2 (Laika)' 213 3 33
% find songs that are at least 5 minutes long
>> t(t.Minutes >= 5,:)
% songs with the word "Neighborhood" in the title
>> t(~cellfun(@isempty, strfind(t.Title, 'Neighborhood')),:)
I'm going to write an answer using most of what you have already written, instead of suggesting something completely different. Though regexp is a powerful tool (and I like regular expressions), I find that it is too advanced for what you have learned so far, so let's scrap it for now.
This way, you get to learn what was wrong with your code, as well as how awesome of a debugger I am (just kidding). What you have when reading in the text file almost works. You made a good choice in creating a cell array to store all of the strings.
I'm also going to borrow MrAzzaman's logic in calculating the time in seconds through strtok (awesome job btw).
In addition, I'm going to change your logic a bit so that it makes sense to me on how I would do it. Here's the basic algorithm:
1. Open up the file and read the first line (song) as you did in your code.
2. Initialize a cell array that contains the first song in the text file.
3. Until we reach the end of the text file, read in the entire line and add it into the cell array. You've also noticed that as soon as you hit a -1, we don't have any more songs to read, so break out of the loop.
4. Now that we have our songs in a cell array, which include the track number, song and the time for each song, we are going to create two more cell arrays. The first one will store just the times of the songs as strings, with both the minutes and the seconds delimited by :. The next one will just contain the names of the songs themselves. Now, we go through each element in our cell array that we created from Step #3.
(a) To populate the first cell array, I use strfind to find all occurrences of where the - character occurs. Once I find where these occur, I choose the last location of where the - occurs. I use this to index into our song string, and skip over 2 characters to skip over the - character and the space character. We extract all of the characters from this point until the end of the line to extract our times.
(b) To populate the second cell array, I again use strfind, but then I figure out where the spaces occur, and choose the index of where the first space happens. This corresponds to the gap in between the song number and the title of the song. Using my result of the index from (a), I extract the song title by skipping one character from the index of the first space to the index two characters before the last - character to successfully get the song. This is because there will probably be a space in between the last word of the song title and the - character, so we want to remove that space.
5. Next, for each song time in the first cell array computed in Step #4, I use strtok like you have used and split up the string by the :. MrAzzaman has used this as well and I'm going to borrow his logic on computing the total amount of seconds that each time takes.
6. Finally, we figure out which time is the closest to the time remaining. Note that we also need to convert the time remaining into seconds like we did in Step #5. As MrAzzaman has said, you can use the min function in MATLAB, and use the second output of the function. This tells you where in the array the minimum occurred. As such, we simply search for the minimum difference between the time remaining and the length of each song. Take note that you said you don't care whether you go over or under the time remaining. You just want the closest time. In that case, you need to take the absolute value of the time differences. Let's say you had a song that took 3:59 and another song that was 6:00, and the time remaining was 4:00. Assuming that there is no song that is 4:00 long in your track list, you would want to choose the song that is at 3:59. However, if you just subtract the longer track (6:00) from the time remaining, you would get a negative difference, and min would return this track... not the song at 3:59. This is why you need to take the absolute value, so this will disregard whether you're over or under the time remaining.
7. Once we figure out which song to choose, return the song name that gives us the minimum. Make sure you close the file too!
Without further ado, here's the code:
function [song] = pickSong(file_name, time_remain)
%// Open up the file
fid = fopen(file_name, 'r');
%// Read the first line
song_name = fgetl(fid);
%// Initialize cell array
song_list = {song_name};
%// Read in the song list and place
%// each entry into a cell array
while ischar(song_name)
song_name = fgetl(fid);
if song_name == -1
break;
end
song_list = [song_list {song_name}];
end
%// Now, for each entry in our song list, find all occurrences of the '-'
%// with strfind, and choose the last index that '-' occurs at
%// Make sure you skip over by 2 spaces to remove the '-' and the space
song_times = cell(1,length(song_list));
song_names = cell(1,length(song_list));
for idx = 1 : length(song_list)
idxs = strfind(song_list{idx}, '-');
song_times{idx} = song_list{idx}(idxs(end)+2:end);
idxs2 = strfind(song_list{idx}, ' ');
%// Figure out the index of where the first space is, then extract
%// the string that starts from 1 over, to two places before the
%// last '-' character
song_names{idx} = song_list{idx}(idxs2(1)+1 : idxs(end)-2);
end
%// Now we have a list of times for each song. Tokenize by the ':' to
%// separate the minutes and times, then calculate the number of seconds
%// Logic borrowed from MrAzzaman
song_seconds = zeros(1,length(song_list));
for idx = 1 : length(song_list)
[minute_str, second_str] = strtok(song_times{idx}, ':');
song_seconds(idx) = str2double(minute_str)*60 + str2double(second_str(2:end));
end
%// Now, calculate how much time is remaining from the input
[minute_str, second_str] = strtok(time_remain, ':');
seconds_remain = str2double(minute_str)*60 + str2double(second_str(2:end));
%// Now, choose the song that is closest to the amount of time
%// remaining
[~,song_to_choose] = min(abs(seconds_remain - song_seconds));
%// Return the song you want
song = song_names{song_to_choose};
%// Close the file
fclose(fid);
end
With your two example cases you've shown above, this is the output I get. I've taken the liberty in creating my own text files with your (awesome taste in) music:
>> song1 = pickSong('Funeral.txt', '3:13')
song1 =
Neighborhood #2 (Laika)
>> song2 = pickSong('Resistance.txt', '3:57')
song2 =
Exogenesis: Symphony Part 2 (Cross-Pollination)
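The point about abs in the steps above can be checked with the 3:59 / 6:00 example directly. This is just a sketch with the numbers hard-coded; the variable names wrong and right are mine.

```matlab
% Two songs, 3:59 and 6:00, with 4:00 remaining (all in seconds).
song_seconds = [3*60 + 59, 6*60];
seconds_remain = 4*60;

% Without abs, the most negative difference wins, i.e. the 6:00 song.
[~, wrong] = min(seconds_remain - song_seconds);

% With abs, only the distance from the remaining time matters.
[~, right] = min(abs(seconds_remain - song_seconds));

disp(wrong)
disp(right)
```

The differences are 1 and -120 seconds, so plain min picks the second song, while the absolute values 1 and 120 correctly select the first.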
You can manage this with textscan, as follows:
function[song,len] = pickSong(file_name,time_remain)
fid = fopen(file_name);
toks = textscan(fid,'%[^-] - %d:%d');
fclose(fid);
songs = toks{1};
song_len = double(toks{2}*60 + toks{3});
[min_rem, sec_rem] = strtok(time_remain, ':');
time_rem = str2double(min_rem)*60 + str2double(sec_rem(2:end));
[len,i] = min(abs(time_rem - song_len));
song = songs{i};
Note that this will only work if none of your song names have a '-' character in them.
EDIT: Here's a solution that (should) work on any song titles:
function[song,len] = pickSong(file_name,time_remain)
file = fileread(file_name);
toks = regexp(file,'\d+. (.*?) - (\d+):(\d+)\n','tokens');
songs = cell(1,length(toks));
song_lens = zeros(1,length(toks));
for i=1:length(toks)
songs{i} = toks{i}{1};
song_lens(i) = str2double(toks{i}{2})*60 + str2double(toks{i}{3});
end
[min_rem, sec_rem] = strtok(time_remain, ':');
time_rem = str2double(min_rem)*60 + str2double(sec_rem(2:end));
[len,i] = min(abs(time_rem - song_lens));
song = songs{i};
regexp is a MATLAB function that runs regular expressions on a string (in this case your file of song names). The string '\d+. (.*?) - (\d+):(\d+)\n' scans each line, extracting the name and length of each song. \d+ matches one or more digits, while .*? matches any run of characters, taking as few as possible. The parentheses group the output. So, we have:
match n digits, followed by a (string), followed by (n-digits):(n-digits)
Everything in parentheses is returned as a cell array in the toks variable. The for loop just extracts the song names and lengths from the resulting cell array.
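To see what the tokens look like on a single line, here is a small sketch using the troublesome title that contains both a colon and a hyphen. It uses the anchored pattern from the table-based answer above rather than the one in this answer, and the variable names entry and song_title are mine.

```matlab
% One line of the song file, chosen because the title has ':' and '-' in it.
entry = '10. Exogenesis: Symphony Part 2 (Cross-Pollination) - 3:57';

% Track number, greedy title, then " - minutes:seconds" anchored at the end.
tok = regexp(entry, '^\d+\.\s+(.*)\s+-\s+(\d+):(\d+)$', 'tokens', 'once');

song_title = tok{1};                                      % first group
seconds = str2double(tok{2})*60 + str2double(tok{3});     % 3:57 in seconds

disp(song_title)
disp(seconds)
```

With the 'once' option the tokens come back as a flat cell array, so there is no extra level of nesting to unpack. Because the title group is greedy and the pattern is anchored at the end of the line, the hyphen inside the title is not mistaken for the separator.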
I have a cell array of dates, shown below. How can I extract the year, i.e. the last 4 digits? Could anyone show me how to locate the year in the string? Thank you!
'31.12.2001'
'31.12.2000'
'31.12.2004'
'31.12.2003'
'31.12.2002'
'31.12.2000'
'31.12.1999'
'31.12.1998'
'31.12.1997'
'31.12.2005'
'31.12.2004'
'31.12.2003'
'31.12.2002'
'31.12.2001'
'31.12.2000'
'31.12.1999'
'31.12.1998'
'31.12.2005'
'31.12.2004'
'31.12.2003'
'31.12.2002'
'31.12.2005'
Example cell array:
A = {'31.12.2001'; '31.12.2002'; '31.12.2003'};
Apply some regular expressions:
B = regexp(A, '\d\d\d\d', 'match')
B = [B{:}];
EDIT: I never realized that MATLAB will "nest" an extra layer of cells until I tested this. I don't like this solution as much now that I know the second line is necessary. Here is an alternative approach that gets you the years in numeric form:
C = datevec(A, 'dd.mm.yyyy');
C = C(:, 1);
SECOND EDIT: Surprisingly, if your cell array has fewer than 10000 elements, the regexp approach is faster on my machine. But its output is another cell array (which takes up much more memory than a numeric matrix). You can use B = cell2mat(B) to get a character array instead, but this brings the two approaches to approximately equal efficiency.
Just to add a fun answer, designed to take the OP to the stranger regions of Matlab:
D = char(C);
y = (D(:,7:end)-'0') * 10.^(3:-1:0).'
which is an order of magnitude faster than anything posted in the other answers :)
Or, to stay a bit closer to home,
y = cellfun(@(x)str2double(x(7:end)),C);
or, yet another regexp variation:
y = str2num(char(regexprep(C, '\d+\.\d+\.','')));
Assuming your matrix with dates is M or a cell array C:
In case your data is in a cell array start with
M = cell2mat(C)
Then get the relevant part
Y=M(:,end-3:end)
If required you can even make the year a number
Year = str2num(Y)
Using regexp, this also works with dates in slightly different formats, like 1.1.2000, which can mess with your offsets:
res = regexp(dates, '(?<=\d+\.\d+\.)\d+', 'match')
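If you want the years as numbers and the day or month may have one or two digits, counting from the end of each string avoids the offset problem. A minimal sketch, assuming the year is always the last four characters; the sample dates are made up.

```matlab
% Mixed-width dates; the year is assumed to be the last four characters.
A = {'31.12.2001'; '1.1.2000'; '31.12.1999'};

% Index from the end so the width of the day and month does not matter.
years = cellfun(@(s) str2double(s(end-3:end)), A);

disp(years)
```

Since every call returns a scalar, cellfun gives back an ordinary numeric column vector rather than another cell array.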
I am new to Mathematica and I am having difficulties with one thing. I have this Table that generates 10 000 times 13 numbers (12 numbers + 1 that is a starting number). I need to create a Histogram from all 10 000 13th numbers. I hope that's clear; it is quite tricky to explain.
This is the table:
F = Table[(Xi = RandomVariate[NormalDistribution[], 12];
Mu = -0.00644131;
Sigma = 0.0562005;
t = 1/12; s = 0.6416;
FoldList[(#1*Exp[(Mu - Sigma^2/2)*t + Sigma*Sqrt[t]*#2]) &, s,
Xi]), {SeedRandom[2]; 10000}]
The input for the histogram could be a table that collects all the 13th numbers; then it would be quite easy to create a histogram. Maybe with Select? Or maybe you know other ways to solve this.
You can access different parts of a list using Part or (depending on what parts you need) some of the more specialised commands, such as First, Rest, Most and (the one you need) Last. As noted in comments, Histogram[Last /@ F] or Histogram[F[[All,-1]]] will work fine.
Although it wasn't part of your question, I would like to note some things you could do for your specific problem that will speed it up enormously. You are defining Mu, Sigma etc 10,000 times, because they are inside the Table command. You are also recalculating (Mu - Sigma^2/2)*t and Sigma*Sqrt[t] 120,000 times, even though both are constants, because you have them inside the FoldList inside the Table.
On my machine:
F = Table[(Xi = RandomVariate[NormalDistribution[], 12];
Mu = -0.00644131;
Sigma = 0.0562005;
t = 1/12; s = 0.6416;
FoldList[(#1*Exp[(Mu - Sigma^2/2)*t + Sigma*Sqrt[t]*#2]) &, s,
Xi]), {SeedRandom[2]; 10000}]; // Timing
{4.19049, Null}
This alternative is ten times faster:
F = Module[{Xi, beta, vol}, With[{Mu = -0.00644131, Sigma = 0.0562005,
t = 1/12, s = 0.6416},
beta = (Mu - Sigma^2/2)*t; vol = Sigma*Sqrt[t];
Table[(Xi = RandomVariate[NormalDistribution[], 12];
FoldList[(#1*Exp[beta + vol*#2]) &, s, Xi]), {SeedRandom[2];
10000}] ]]; // Timing
{0.403365, Null}
I use With for the local constants and Module for the things that are either redefined within the Table (Xi) or are calculations based on the local constants (beta). This question on the Mathematica StackExchange will help explain when to use Module versus Block versus With. (I encourage you to explore the Mathematica StackExchange further, as this is where most of the Mathematica experts are hanging out now.)
For your specific code, the use of Part isn't really required. Instead of using FoldList, just use Fold. It only retains the final number in the folding, which is identical to the last number in the output of FoldList. So you could try:
FF = Module[{Xi, beta, vol}, With[{Mu = -0.00644131, Sigma = 0.0562005,
t = 1/12, s = 0.6416},
beta = (Mu - Sigma^2/2)*t; vol = Sigma*Sqrt[t];
Table[(Xi = RandomVariate[NormalDistribution[], 12];
Fold[(#1*Exp[beta + vol*#2]) &, s, Xi]), {SeedRandom[2];
10000}] ]];
Histogram[FF]
Calculating FF in this way is even a little faster than the previous version. On my system Timing reports 0.377 seconds - but such a difference from 0.4 seconds is hardly worth worrying about.
Because you are setting the seed with SeedRandom, it is easy to verify that all three code examples produce exactly the same results.
Making my comment an answer:
Histogram[Last /@ F]