Find a Name in an Email (Low-Level I/O) - matlab

Round 2: Picking out leaders in an email
Alrighty, so my next problem is trying to figure out who the leader is in a project. In order to determine this, we are given an email and have to find who says "Do you want..." (capitalization may vary). I feel like my code should work for the most part, but I really have an issue figuring out how to correctly populate my cell array. I can get it to create the cell array, but it just puts the first line of the email in it over and over again. So each cell is basically the name.
function[Leader_Name] = teamPowerHolder(email)
    email = fopen(email, 'r'); %// Opens my file
    lines = fgets(email); %// Reads the first line
    conversations = {lines}; %// Creates my cell array
    while ischar(lines) %// Populates my cell array, just not correct
        Convo = fgets(email);
        if Convo == -1 %// Prevents it from just logging -1 into my cell array like a jerk
            break; %// Returns to function
        end
        conversations = [conversations {lines}]; %// Populates my list
    end
    Sentences = strfind(conversations,'Do you want'); %// Locates the leader position
    Leader_Name = Sentences{1}; %// Indexes that position
    fclose(email);
end
What I ideally need it to do is find the '/n' character (hence why I used fgets) but I'm not sure how to make it do that. I tried to have my while loop be like:
while lines == '/n'
but that's incorrect. I feel like I know how to do the '/n' bit, I just can't think of it. So I'd appreciate some hints or tips to do that. I could always try to strsplit or strtok the function, but I need to then populate my cell array so that might get messy.
Please and thanks for help :)
Test Case:
Anna: Hey guys, so I know that he just assigned this project, but I want to go ahead and get started on it.
Can you guys please respond and let me know a weekly meeting time that will work for you?
Wiley: Ummmmm no because ain't nobody got time for that.
John: Wiley? What kind of a name is that? .-.
Wiley: It's better than john. >.>
Anna: Hey boys, let's grow up and talk about a meeting time.
Do you want to have a weekly meeting, or not?
Wiley: I'll just skip all of them and not end up doing anything for the project anyway.
So I really don't care so much.
John: Yes, Anna, I'd like to have a weekly meeting.
Thank you for actually being a good teammate and doing this. :)
out2 = teamPowerHolder('teamPowerHolder_convo2.txt')
=> 'Anna'

The main reason it isn't working is that you're supposed to update the lines variable in your loop, but you're reading each new line into a separate variable called Convo instead. Because lines never changes, every pass appends the first line to your cell array again, and the while condition never becomes false; the loop only ends when the break fires at the end of the file.
However, what I would suggest you do is read in each line, then look for the : character, then extract the string up until the first time you encounter this character minus 1 because you don't want to include the actual : character itself. This will most likely correspond to the name of the person that is speaking. If we are missing this occurrence, then that person is still talking. As such, you would have to keep a variable that keeps track of who is still currently talking, until you find the "do you want" string. Whoever says this, we return the person who is currently talking, breaking out of the loop of course! To ensure that the line is case insensitive, you'll want to convert the string to lower.
There may be a case where no leader is found. In that case, you'll probably want to return an empty result. As such, initialize Leader_Name to empty, which here is [] (strictly an empty array rather than the empty character array ''; isempty is true for both). That way, should we go through the e-mail and find no leader, MATLAB will return [].
The logic that you have is pretty much correct, but I wouldn't even bother storing stuff into a cell array. Just examine each line in your text file, and keep track of who is currently speaking until we encounter a sentence that has another : character. We can use strfind to facilitate this. However, one small caveat I'll mention is that if the person speaking includes a : in their conversation, then this method will break.
Judging from the conversation that I'm seeing in your test case, this probably won't be the case so we're OK. As such, borrowing from your current code, simply do this:
function[Leader_Name] = teamPowerHolder(email)
    Leader_Name = []; %// Initialize leader name to empty
    name = [];
    email = fopen(email, 'r'); %// Opens my file
    lines = fgets(email); %// Reads the first line
    while ischar(lines) %// fgets returns -1 at end of file, which ends the loop
        % // Check if this line has a ':' character.
        % // If we do, then another person is talking.
        % // Extract the characters just before the first ':' character
        % // as we don't want the ':' character in the name
        % // If we don't encounter a ':' character, then the same person is
        % // talking so don't change the current name
        idxs = strfind(lines, ':');
        if ~isempty(idxs)
            name = lines(1:idxs(1)-1);
        end
        % // If we find "do you want" in this sentence, then the leader
        % // is found, so quit.
        if ~isempty(strfind(lower(lines), 'do you want'))
            Leader_Name = name;
            break;
        end
        % // Get the next line in your e-mail only after the current one
        % // has been examined, so the first line is not skipped
        lines = fgets(email);
    end
    fclose(email); %// Close the file once we're done
end
By running the above code with your test case, this is what I get:
out2 = teamPowerHolder('teamPowerHolder_convo2.txt')
out2 =
Anna

Related

Defaultdict() the correct choice?

EDIT: mistake fixed
The idea is to read text from a file, clean it, and pair consecutive words (not permutations):
file = f.read()
words = [word.strip(string.punctuation).lower() for word in file.split()]
pairs = [(words[i]+" " + words[i+1]).split() for i in range(len(words)-1)]
Then, for each pair, create a list of all the possible individual words that can follow that pair throughout the text. The dict will look like
[ConsecWordPair]:[listOfFollowers]
Thus, referencing the dictionary for a given pair will return all of the words that can follow that pair. E.g.
wordsThatFollow[('she', 'was')]
>> ['alone', 'happy', 'not']
My algorithm to achieve this involves a defaultdict(list)...
wordsThatFollow = defaultdict(list)
for i in range(len(words)-1):
    try:
        # pairs overlap, want second word of next pair
        # wordsThatFollow[tuple(pairs[i])] = pairs[i+1][1]
        EDIT: wordsThatFollow[tuple(pairs[i])].update(pairs[i+1][1][0]
    except Exception:
        pass
I'm not so worried about the value error I have to circumvent with the 'try-except' (unless I should be). The problem is that the algorithm only successfully returns one of the followers:
wordsThatFollow[('she', 'was')]
>> ['not']
Sorry if this post is bad for the community I'm figuring things out as I go ^^
Your problem is that you are always overwriting the value, when you really want to extend it:
# Instead of this
wordsThatFollow[tuple(pairs[i])] = pairs[i+1][1]
# Do this
wordsThatFollow[tuple(pairs[i])].append(pairs[i+1][1])
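For what it's worth, here is a minimal end-to-end sketch of the same idea. The function name build_followers and the sample sentence are made up for illustration; the cleaning step is the one from the question. It walks the word list with zip instead of indexing into pairs, so there is no out-of-range lookup and no need for the try/except:

```python
from collections import defaultdict
import string


def build_followers(text):
    """Map each consecutive word pair to the list of words that follow it."""
    # Same cleaning as in the question: strip punctuation, lowercase.
    words = [word.strip(string.punctuation).lower() for word in text.split()]
    followers = defaultdict(list)
    # zip stops at the shortest of the three staggered views, so the last
    # two words never cause an IndexError.
    for first, second, third in zip(words, words[1:], words[2:]):
        followers[(first, second)].append(third)
    return followers


sample = "She was alone. She was happy, but she was not late."
words_that_follow = build_followers(sample)
print(words_that_follow[("she", "was")])  # ['alone', 'happy', 'not']
```

Because the values are lists and each follower is appended, repeated followers are kept in order of appearance; a defaultdict(set) with .add would be the alternative if duplicates should collapse.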

matlab: check which lines of a path are used - graphshortestpath

The related problem comes from the power Grid in Germany. I have a network of substations, which are connected according to the Lines. The shortest way from point A to B was calculated using the graphshortestpath function. The result is a path with the used substation ID's. I am interested in the Line ID's though, so I have written a sequential code to figure out the used Line_ID's for each path.
This algorithm uses two for loops. The first for-loop to access the path from a cell array, the second for-loop looks at each connection and searches the Line_ID from the array.
Question: Is there a better way of coding this? I am looking for the Line_ID's, graphshortestpath only returns the node ID's.
Here is the main code:
for i = i_entries
    path_i = LKzuLK_path{i_entries};
    if length(path_i) > 3 %If length <=3 no lines are used.
        id_vb = 2:length(path_i) - 2;
        for id = id_vb
            node_start = path_i(id);
            node_end = path_i(id+1);
            idx_line = find_line_idx(newlinks_vertices, node_start, ...
                node_end);
            Zuordnung_LKzuLK_pathLines(ind2sub(size_path,i),idx_line) = true;
        end
    end
end
Note: The first and last entry of path_i are area ID's, so they are skipped when searching for the Line_ID's.
function idx_line = find_line_idx(newlinks_vertices, v_id_1, v_id_2)
% newlinks_vertices includes the Line_ID, and then the two connecting substations
% Mirror v_id's in newlinks_vertices:
check_links = [newlinks_vertices; newlinks_vertices(:,1), newlinks_vertices(:,3), newlinks_vertices(:,2)];
tmp_dist1 = find(check_links(:,2) == v_id_1);
tmp_dist2 = find(check_links(tmp_dist1,3) == v_id_2,1);
tmp_dist3 = tmp_dist1(tmp_dist2);
idx_line = check_links(tmp_dist3,1);
end
Note: I have already tried to shorten the first find-search routine, by indexing the links list. This step will return a short list with only relevant entries of the links looked upon. That way the algorithm is reduced of the first and most time consuming find function. The result wasn't much better, the calculation time was still at approximately 7 hours for 401*401 connections, so too long to implement.
I would look into Dijkstra's algorithm to get a faster implementation. This is what Matlab's graphshortestpath uses by default. The linked wiki page probably explains it better than I ever could and even lays it out in pseudocode!
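To make that concrete, here is a rough sketch of Dijkstra's algorithm that remembers which line was used to reach each node, so the Line_ID's come out of the search directly and no find over the link list is needed afterwards. It is written in Python for brevity rather than MATLAB, and it is only an illustration under assumptions: the function name shortest_path_line_ids is made up, and each link is assumed to be a tuple (line_id, node_a, node_b, length), i.e. the three columns of newlinks_vertices plus a length that the question does not show.

```python
import heapq


def shortest_path_line_ids(links, source, target):
    """Return the line IDs along a shortest path, or None if unreachable.

    links: iterable of (line_id, node_a, node_b, length), treated as undirected.
    """
    # Build the adjacency list once, storing the line ID on every edge.
    adjacency = {}
    for line_id, a, b, length in links:
        adjacency.setdefault(a, []).append((b, length, line_id))
        adjacency.setdefault(b, []).append((a, length, line_id))

    dist = {source: 0.0}
    via = {}  # node -> (previous node, line ID used to reach it)
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter route was already found
        for neighbour, length, line_id in adjacency.get(node, []):
            new_dist = d + length
            if new_dist < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_dist
                via[neighbour] = (node, line_id)
                heapq.heappush(heap, (new_dist, neighbour))

    if target != source and target not in via:
        return None
    # Walk back from the target, collecting the line IDs on the way.
    line_ids = []
    node = target
    while node != source:
        node, line_id = via[node]
        line_ids.append(line_id)
    line_ids.reverse()
    return line_ids


links = [(101, 1, 2, 1.0), (102, 2, 3, 1.0), (103, 1, 3, 5.0)]
print(shortest_path_line_ids(links, 1, 3))  # [101, 102]
```

The same bookkeeping (one adjacency structure built up front, one "reached via this line" record per node) should carry over to MATLAB; whether it beats the built-in routine plus a precomputed node-pair lookup would need to be measured.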

Debugging a for loop in matlab

I've been looking through the documentation, but can't seem to find the bit I want.
I have a for loop and I would like to be able to view every value in the for loop.
for example here is a part of my code:
for d = 1 : nb
    %for loop performs blade by blade averaging and produces a column vector
    for cc = navg : length(atbmat);
        atb2 = (sum(atbmat((cc-(navg-1):cc),d)))/navg;
        atbvec2(:,cc) = atb2;
    end
    %assigns column vector 'atbvec2' to the correct column of the matrix 'atbmat2'
    atbmat2(d,1:length(atbvec2)) = atbvec2;
end
I would like to view every value of atb2. I'm a python user(new to MATLAB) and would normally use a simple print statement to find this.
I'm sure there is a way to do it, but I can't quite find how.
Thank you in advance.
You can use disp in Matlab to print to the screen, but you might want to use sprintf first to format the output nicely. However, for debugging you're better off setting a breakpoint and then inspecting the variable in the workspace browser graphically. To me, this is one of Matlab's best features.
Have a look at the "Examine Values" section of this article
The simplest way to view it everywhere is to change this line:
atb2 = (sum(atbmat((cc-(navg-1):cc),d)))/navg;
Into this, without semicolon:
atb2 = (sum(atbmat((cc-(navg-1):cc),d)))/navg
That being said, given the nature of your calculation, you could get the information you need as well by simply storing every value of atb2 and observing them afterwards. This may be done in atbmat2 already?
If you want to look at each value at the time it happens, consider setting a breakpoint or conditional breakpoint after the line where atb2 is assigned.

Matlab: dynamic name for structure

I want to create a structure with a variable name in a matlab script. The idea is to extract a part of an input string filled by the user and to create a structure with this name. For example:
CompleteCaseName = input('s');
USER WRITES '2013-06-12_test001_blabla';
CompleteCaseName = '2013-06-12_test001_blabla'
casename(12:18) = struct('x','y','z');
In this example, casename(12:18) gives me the result test001.
I would like to do this to allow me to compare easily two cases by importing the results of each case successively. So I could write, for instance :
plot(test001.x,test001.y,test002.x,test002.y);
The problem is that the line casename(12:18) = struct('x','y','z'); is invalid for Matlab because it makes me change a string to a struct. All the examples I find with struct are based on a definition like
S = struct('x','y','z');
And I can't find a way to make a dynamical name for S based on a string.
I hope someone understood what I write :) I checked on the FAQ and with Google but I wasn't able to find the same problem.
Use a structure with a dynamic field name.
For example,
mydata.(casename(12:18)) = struct;
will give you a struct mydata with a field test001.
You can then later add your x, y, z fields to this.
You can use the fields later either by mydata.test001.x, or by mydata.(casename(12:18)).x.
If at all possible, try to stay away from using eval, as another answer suggests. It makes things very difficult to debug, and the example given there, which directly evals user input:
eval('%s = struct(''x'',''y'',''z'');',casename(12:18));
is even a security risk - what happens if the user types in a string where the selected characters are system(''rm -r /''); a? Something bad, that's what.
As I already commented, the best case scenario is when all your x and y vectors have same length. In this case you can store all data from the different files into 2 matrices and call plot(x,y) to plot each column as a series.
Alternatively, you can use a cell array such that:
c = cell(2,numfiles);
for ii = 1:numfiles
    c{1,ii} = ... % import x data from file ii
    c{2,ii} = ... % import y data from file ii
end
plot(c{:})
A structure, on the other hand
s.('test001').x = ...
s.('test001').y = ...
Use eval:
eval(sprintf('%s = struct(''x'',''y'',''z'');',casename(12:18)));
Edit: apologies, forgot the sprintf.

How to add operator signs '+,-,/,*,mod' etc to a label for making a calculator?

I have made a calculator for simple operations, but I can't figure out how I should add the operator signs next to the numerals that I am entering.
I created 2 functions 1 on the number being entered
-(IBAction)buttonDigitPressed:(id)sender
and another for the operation
-(IBAction)buttonOperationPressed:(id)sender.
calculatorScreen.text = [NSString stringWithFormat:@"%.2f",result];
This is for the result to be shown on the label calculatorScreen.
The result I would like would be something like "1+2*3/4" on the calculatorScreen.
Sorry if I misunderstand your question, but what you want is to display on your calculator app the full equation that you've input thus far (e.g. 63+42-62).
Like any other calculator, you should have two labels: one for your current input, and one to show everything you've entered so far. (I'm guessing you need the latter.)
With the second label in place, you can append to it from your digit-pressed, enter/=, and operation handlers. If you want to tweak it so that 16+23-32 shows up as
1) 16+23
2) 39-32
3) 39-32=7
then you'll have to add your own specific code. Otherwise the label will read 16+23-32 = 7.
You can just append the character to whatever is already on calculatorScreen. Or you can save the current input in an instance variable and display where appropriate.
This is just a guideline, since I don't know the behavior of your calculator in case of this input: 1 + 2 * 3 (simple calculator will return 9, scientific will return 7).