Merging tables based on two fields instead of just one? - merge

I have data on sites where invasive species have been managed, and I'm trying to merge information between two tables based on two fields.
The first table includes things like site code, species, area.
| Site Code | Scientific Name | Area |
|---|---|---|
| 001 | Alliaria petiolata | Area 1 |
| 002 | Clematis ternifolia | Area 2 |
| 003 | Clematis ternifolia | Area 3 |
The second table has the "tier" or rank of importance according to experts (i.e. 1 means most important, 4 means least important, and there are some confusing letter categories as well).
| Scientific Name | Area 1 | Area 2 | Area 3 |
|---|---|---|---|
| Acer palmatum | 2 | M | M |
| Alliaria petiolata | 4 | 4 | 4 |
| Clematis ternifolia | 4 | 1a | 3 |
I'm trying to add a new column to the first table (because the actual thing has way more information) that shows the tier for that site code, based on the species and area.
| Site Code | Scientific Name | Area | Tier |
|---|---|---|---|
| 001 | Alliaria petiolata | Area 1 | 4 |
| 002 | Clematis ternifolia | Area 2 | 1a |
| 003 | Clematis ternifolia | Area 3 | 3 |
I'm not even really sure how to approach this because it's like a merge but it's contingent upon two pieces of information. I don't really want to do a giant if-else block.
Any ideas???
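One way to think about it: the tier table is "wide" (one column per area), so if it is reshaped to "long" form, each tier is identified by a (species, area) pair and the problem becomes an ordinary join on two keys. The question does not say which tool is in use, so the following is only a minimal sketch in plain Python; the names `sites`, `tiers_wide` and `add_tiers` are made up for illustration, and the values are the ones from the tables above.

```python
# Site table: one row per managed site (values from the question).
sites = [
    {"site_code": "001", "scientific_name": "Alliaria petiolata", "area": "Area 1"},
    {"site_code": "002", "scientific_name": "Clematis ternifolia", "area": "Area 2"},
    {"site_code": "003", "scientific_name": "Clematis ternifolia", "area": "Area 3"},
]

# Tier table in its original wide layout: one row per species, one column per area.
tiers_wide = {
    "Acer palmatum": {"Area 1": "2", "Area 2": "M", "Area 3": "M"},
    "Alliaria petiolata": {"Area 1": "4", "Area 2": "4", "Area 3": "4"},
    "Clematis ternifolia": {"Area 1": "4", "Area 2": "1a", "Area 3": "3"},
}


def add_tiers(sites, tiers_wide):
    """Return copies of the site rows with a 'tier' field looked up by (species, area)."""
    # Reshape wide -> long: key every tier by its (species, area) pair.
    tier_by_key = {
        (species, area): tier
        for species, by_area in tiers_wide.items()
        for area, tier in by_area.items()
    }
    # Left join: every site is kept; a pair with no match gets None.
    return [
        {**site, "tier": tier_by_key.get((site["scientific_name"], site["area"]))}
        for site in sites
    ]


merged = add_tiers(sites, tiers_wide)
for row in merged:
    print(row["site_code"], row["scientific_name"], row["area"], row["tier"])
```

Tiers are kept as strings because of the letter categories ("M", "1a"). If the data lives in pandas, the same idea would be `DataFrame.melt` on the tier table followed by `DataFrame.merge` with `on=` set to both column names and `how="left"`.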

Related

I literally don't know what I need to do for this but it needs to be done and evaluated within a couple of days

It's an Emergency.
For one of my programming units we have to create a 'UCAS' points calculator in NetBeans. UCAS sort out all of the university stuff in the UK.
So I need to create a GUI that has text boxes and buttons, and I need to enter the number of passes, merits, and distinctions I've got on my course, which will give me a total number of UCAS points.
Passes = 70 points
Merit = 80 points
Distinctions = 90 points
There are 18 units in total, so if I enter that I have 4 passes (a pass for 4 units), the box with the total number of UCAS points needs to go up.
I need three boxes, one each for passes, merits, and distinctions, and a box for the total number of UCAS points. It needs a calculate button of course and a 'reset facility'. It says that the current date needs to be seen on the interface.
It also says that there should be an option to quit the program at any time.
It has to be a **GUI** and I literally don't know what to do and I don't want to fail again.
Can someone help me?
I literally don't know how to make the buttons work or anything.
I don't even need any help I just need someone to show me exactly what to do because I haven't got a clue.
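Separately from the GUI, the calculation a Calculate button would have to run is small. Here is a rough sketch of just that arithmetic, in Python rather than the Java a NetBeans project would use. The function name `ucas_points` and the check that the three counts cannot add up to more than 18 units are assumptions, not part of the assignment text.

```python
# Points per grade, as listed in the question.
POINTS = {"pass": 70, "merit": 80, "distinction": 90}
TOTAL_UNITS = 18  # the course has 18 units in total


def ucas_points(passes, merits, distinctions):
    """Total UCAS points for the given number of passes, merits and distinctions."""
    counts = (passes, merits, distinctions)
    if any(count < 0 for count in counts):
        raise ValueError("counts cannot be negative")
    # Assumption: each unit earns exactly one grade, so the counts cannot exceed 18.
    if sum(counts) > TOTAL_UNITS:
        raise ValueError("more grades entered than there are units")
    return (
        passes * POINTS["pass"]
        + merits * POINTS["merit"]
        + distinctions * POINTS["distinction"]
    )


print(ucas_points(4, 0, 0))  # 4 passes at 70 points each
```

In a GUI, the button handler would read the three text boxes, convert them to integers, call something like this, and write the result into the total box.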

How to select one value to be displayed as a chart in SPSS

I have two variables, Career and Internet usage. There are two career options, but I only want to select one value.
When I generate the pie chart in the output window I get an overview of internet usage, but it includes both of the careers. I only want a specific career to be shown in the pie chart, e.g.
How do people from System engineering use the internet, as opposed to how does everyone use the internet?
I don't know if this is clear, but if it is, how would I do this in SPSS? Is there some sort of filter?
For one chart you can use TEMPORARY followed by a SELECT IF. If the chart command follows directly after, it will only plot that subset.
DATA LIST FREE / Career Internet.
BEGIN DATA
1 1
1 2
1 3
2 1
2 2
2 2
END DATA.
VALUE LABELS Internet
1 '2-4 Hours a Day'
2 '5-7 Hours a Day'
3 'Over 7 Hours a Day'.
TEMPORARY.
SELECT IF Career = 2.
GRAPH
/PIE=COUNT BY Internet.
If you have multiple charts, see the FILTER command.

How to interpret Turing Machine illustrations on p79 of Stephen Wolfram's “A New Kind of Science” book?

I am reading Stephen Wolfram's "A New Kind of Science".
At present, I cannot understand how the cellular automata illustrations on p79 are created.
In the patterns, the active cell, representing the head, appears to change orientation between up and -45 degrees. However, none of the rules seem to include an active cell with an orientation other than up or down. How does the active cell orientation of -45 degrees come about in the patterns?
Am I missing something obvious (I am a beginner in this area)?
You have a simple rule. Just a mapping from 3 binary digits to 1 binary digit. For example:
111 - 0
110 - 0
101 - 0
100 - 1
011 - 1
010 - 1
001 - 1
000 - 0
Then you have some sequence of digits at time t0, for example 00111010. To find what happens at time t1, apply this mapping to each group of three neighbouring digits: 001 gives 1, 011 gives 1, 111 gives 0, and so on up to 010, which gives 1. This gives a sequence of the same length for the second generation (t1). You keep going, generation after generation, until you see repetition.
In those pictures, the X axis is this sequence (an empty square is 0, a gray square is 1). The Y axis shows how the sequence evolves under a specific rule. My example is rule 30, because the outputs read from top to bottom are 00011110, which is 30 in binary.
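The update described above can be sketched in a few lines of Python. The answer does not say what happens at the two ends of the sequence, so this sketch assumes that cells beyond either end count as 0; the function names `make_rule` and `step` are made up for illustration.

```python
def make_rule(number):
    """Build the 3-digits -> 1-digit mapping for a rule number such as 30."""
    # The neighbourhood (a, b, c) read as a binary number selects one bit of `number`.
    return {
        (a, b, c): (number >> (a * 4 + b * 2 + c)) & 1
        for a in (0, 1)
        for b in (0, 1)
        for c in (0, 1)
    }


def step(cells, rule):
    """Compute the next generation from the current list of 0/1 cells."""
    # Assumption: cells outside the sequence are treated as 0.
    padded = [0] + cells + [0]
    return [rule[tuple(padded[i:i + 3])] for i in range(len(cells))]


rule30 = make_rule(30)
generation = [0, 0, 1, 1, 1, 0, 1, 0]  # the sequence 00111010 at time t0
for _ in range(4):
    # Print each generation as one row, like the rows of the pictures.
    print("".join(str(cell) for cell in generation))
    generation = step(generation, rule30)
```

Each printed row is one generation, so the output is a small text version of the pictures, with time running downwards. A different edge rule (for example wrapping around) would change the first and last cell of each row.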
You can read a high-level overview here. Although these rules are simple, they can give rise to complex behavior.
P.S. The paper was published in Nature (a high-profile science journal) and is considered revolutionary because it showed that complicated structures and motifs, like the spots on a leopard's skin or the pattern on a shell, can arise from really simple rules.
I think that it is an inconsistency between the printing of the rules and the diagrams.
If the downward (-90 degrees) arrows in the rules are replaced with arrows pointing to the bottom right (-45 degrees) then the rules seem to make sense.

Inter-annotator agreement when users annotate more than one category for any subject

I want to find the inter-annotator agreement for a few annotators.
Each annotator assigns a few categories (out of 10 categories) to each subject.
For example, there are 3 annotators, 10 categories and 100 subjects.
I am aware of http://en.wikipedia.org/wiki/Cohen's_kappa (for two annotators) and http://en.wikipedia.org/wiki/Fleiss%27_kappa (for more than two annotators), but I realized that they may not work if an annotator assigns more than one category to a subject.
Does anyone have an idea for determining inter-annotator agreement in this scenario?
Thanks
I had to do this several years back. I can't recall exactly how I did it (I don't have the code anymore), but I have a worked example that I reported to my professor. I was dealing with annotation of comments and had 56 categories and 4 annotators.
Note: at the time I needed a way to detect where annotators most disagreed, so that after each annotation session they could focus on why they disagreed and set out reasonable rules to maximize this statistic. It worked well for that purpose.
Let's assume A-D are annotators and 1-5 are categories. This is a possible scenario.
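| Category | A | B | C | D | Probability of agreement |
|---|---|---|---|---|---|
| 1 | X | X | X | X | 4/4 |
| 2 | X | X | X |   | 3/4 |
| 3 | X | X |   |   | 2/4 |
| 4 | X |   |   |   | 1/4 |
| 5 |   |   |   |   |     |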
A tags this comment as 1,2,3,4; B as 1,2,3; and so forth.
For each category, the probability of agreement is the fraction of annotators who assigned it.
These probabilities are summed (4/4 + 3/4 + 2/4 + 1/4 = 10/4) and then divided by the number of unique categories tagged for that particular comment (4).
Therefore, for the example comment, the annotators' agreement is 10/16. This is a value between 0 and 1.
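Since the original code is gone, here is a minimal reconstruction of that calculation in Python, based only on the worked example above. The function name `comment_agreement` is made up, and returning `None` when nobody tagged anything is an assumption (the example does not cover that case).

```python
from fractions import Fraction


def comment_agreement(tags_by_annotator):
    """Agreement score for one comment, given each annotator's set of categories."""
    n_annotators = len(tags_by_annotator)
    # Unique categories that at least one annotator assigned to this comment.
    tagged = set().union(*tags_by_annotator.values())
    if not tagged:
        return None  # assumption: the score is undefined when nothing was tagged
    # Per-category probability of agreement: fraction of annotators who used it.
    total = sum(
        Fraction(sum(category in tags for tags in tags_by_annotator.values()), n_annotators)
        for category in tagged
    )
    return total / len(tagged)


example = {"A": {1, 2, 3, 4}, "B": {1, 2, 3}, "C": {1, 2}, "D": {1}}
score = comment_agreement(example)
print(score)  # 5/8, i.e. the 10/16 from the worked example
```

Averaging this score over all comments would give one number per annotation session, and the comments with the lowest scores are the ones where the annotators disagree most.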
If this doesn't work for you, then see (http://www.mitpressjournals.org/doi/pdf/10.1162/coli.07-034-R2), p. 567, which is referenced by the case study on p. 587.
Compute agreement on a per-label basis. If you treat one of the annotators as the gold standard, you can then compute recall and precision on label assignments. Another option is label overlap: for a given category, among the subjects where either annotator assigned it, the proportion where both assigned it (intersection over union).
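For two annotators, the per-label measures could look roughly like this in Python. The subjects and labels below are invented purely for illustration, and `per_label_scores` is a hypothetical helper name; a measure is reported as `None` when its denominator would be zero.

```python
def per_label_scores(gold, other, label):
    """Precision, recall and overlap (intersection over union) for one label.

    gold and other map each subject to the set of labels that annotator assigned;
    gold is the annotator treated as the gold standard.
    """
    gold_subjects = {s for s, labels in gold.items() if label in labels}
    other_subjects = {s for s, labels in other.items() if label in labels}
    both = gold_subjects & other_subjects
    either = gold_subjects | other_subjects
    precision = len(both) / len(other_subjects) if other_subjects else None
    recall = len(both) / len(gold_subjects) if gold_subjects else None
    overlap = len(both) / len(either) if either else None
    return precision, recall, overlap


# Invented example data: four subjects, two labels.
gold = {"s1": {"x", "y"}, "s2": {"x"}, "s3": {"y"}, "s4": set()}
other = {"s1": {"x"}, "s2": {"x", "y"}, "s3": {"y"}, "s4": {"x"}}

for label in ("x", "y"):
    print(label, per_label_scores(gold, other, label))
```

With more than two annotators, one option is to compute these for every pair and average them, though that choice is not something the answer above specifies.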

Can I filter the same data differently in two columns in an Excel pivot table?

I've created a pivot table that lists the number of cases submitted by a series of locations.
Consider:
Location A 100
Location B 10
Location C 1000
TOTAL 1110
Our data for the table includes location and status. Now the client wants to see a percentage of CLOSED cases as a third column.
Consider:
Location A 100 50% (based on 50 cases marked as "closed")
Location B 10 10% (based on 1 case marked as "closed")
Location C 1000 20% (based on 200 cases marked as "closed")
TOTAL 1110 23% (based on 251 total cases marked as "closed")
I can add a third column to the table, but the second I filter on CLOSED cases, column two lists the totals of closed cases only, and my percentage is 100% for all of column three. Is there a way to leave column two selecting EVERYTHING, but set column three to look ONLY at closed cases?
In a perfect world, I could display the location, the count of ALL cases by location, and the count of CLOSED cases by location expressed as a percentage of column two.
Is this doable?
I think the best way to do this is to add a column to your source data that assigns a 0 for open cases and a 1 for closed. Then using my dictum:
The percentage of True items in a list is the average of zeros and ones, where True is represented by 1 and False by 0.
from this post, you can do this:
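The dictum itself can be checked outside Excel. The sketch below is only an illustration in Python, not the pivot-table setup the answer goes on to describe. It uses the case counts from the question, and `closed_share` is a made-up helper name.

```python
# One (location, is_closed) row per case; counts taken from the question's example.
rows = (
    [("Location A", 1)] * 50 + [("Location A", 0)] * 50
    + [("Location B", 1)] * 1 + [("Location B", 0)] * 9
    + [("Location C", 1)] * 200 + [("Location C", 0)] * 800
)


def closed_share(rows):
    """Average of the 0/1 flag per location, i.e. the fraction of closed cases."""
    flags_by_location = {}
    for location, is_closed in rows:
        flags_by_location.setdefault(location, []).append(is_closed)
    return {loc: sum(flags) / len(flags) for loc, flags in flags_by_location.items()}


shares = closed_share(rows)
overall = sum(flag for _, flag in rows) / len(rows)

for location, share in shares.items():
    print(f"{location}: {share:.0%}")
print(f"TOTAL: {overall:.0%}")
```

The point is that averaging the new 0/1 column needs no filter, so the count of all cases stays untouched.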