Merge records from two different tables - select

I have two tables in Teradata SQL, one with 9 fields and the other with 12 fields. I can join on the 9 common fields, but the remaining 3 fields do not take part in the join, which creates duplicate rows.
[JOIN ON t1.field1 = t2.field1
AND t1.field2 = t2.field2
AND t1.field3 = t2.field3
AND t1.field4 = t2.field4
AND t1.field5 = t2.field5
AND t1.field6 = t2.field6
AND t1.field7 = t2.field7
AND t1.field8 = t2.field8
AND t1.field9 = t2.field9]
However, table 2 has 3 more fields, t2.field10, t2.field11, and t2.field12, which create duplicate records from table 1.
Can you advise on how to build a SELECT statement that would not duplicate the records from table 1?
Table 1:
<tr> <th> semester </th><th> country </th><th> state </th><th> county </th><th> city </th><th> school name </th><th> type </th><th> type2 </th><th> class </th><th> number of students </th> </tr>
<tr> <td> spring2016 </td><td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 1 </td><td> 331 </td> </tr>
<tr> <td> spring2016 </td><td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 2 </td><td> 487 </td> </tr>
<tr> <td> spring2016 </td><td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 3 </td><td> 329 </td> </tr>
<tr> <td> spring2016 </td><td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 4 </td><td> 400 </td> </tr>
<tr> <td> spring2016 </td><td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 5 </td><td> 225 </td> </tr>
<tr> <td> fall2016 </td><td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 1 </td><td> 249 </td> </tr>
<tr> <td> fall2016 </td><td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 2 </td><td> 136 </td> </tr>
<tr> <td> fall2016 </td><td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 3 </td><td> 140 </td> </tr>
<tr> <td> fall2016 </td><td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 4 </td><td> 444 </td> </tr>
<tr> <td> fall2016 </td><td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 5 </td><td> 371 </td> </tr>
Table 2:
<tr> <th> country </th><th> state </th><th> county </th><th> city </th><th> school name </th><th> type </th><th> type2 </th><th> level </th><th> class </th><th> homework </th><th> field trip </th><th> tests </th><th> planned budget </th><th> actual budget </th> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 1 </td><td> american literature </td><td> n </td><td> n </td><td> 8 </td><td> 6856 </td><td> 5800.357992 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 1 </td><td> geography </td><td> y </td><td> y </td><td> 8 </td><td> 3040 </td><td> 963.4004114 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 1 </td><td> music </td><td> y </td><td> y </td><td> 10 </td><td> 3288 </td><td> 2362.845994 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 2 </td><td> american literature </td><td> n </td><td> n </td><td> 8 </td><td> 6984 </td><td> 4368.417857 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 2 </td><td> british literature </td><td> n </td><td> n </td><td> 4 </td><td> 3977 </td><td> 3861.683941 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 2 </td><td> geography </td><td> y </td><td> n </td><td> 5 </td><td> 5358 </td><td> 1727.575547 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 3 </td><td> biology </td><td> y </td><td> n </td><td> 6 </td><td> 4490 </td><td> 4241.514602 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 3 </td><td> british literature </td><td> n </td><td> y </td><td> 9 </td><td> 3476 </td><td> 2176.995858 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 3 </td><td> PE </td><td> y </td><td> y </td><td> 7 </td><td> 6060 </td><td> 713.4806136 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 4 </td><td> biology </td><td> y </td><td> y </td><td> 8 </td><td> 5059 </td><td> 2269.706168 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 4 </td><td> music </td><td> n </td><td> y </td><td> 8 </td><td> 3250 </td><td> 583.2956503 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 4 </td><td> PE </td><td> n </td><td> y </td><td> 3 </td><td> 3945 </td><td> 577.6461806 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 5 </td><td> american literature </td><td> n </td><td> y </td><td> 7 </td><td> 4083 </td><td> 2853.53736 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 5 </td><td> music </td><td> y </td><td> y </td><td> 8 </td><td> 3502 </td><td> 1257.361273 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 5 </td><td> PE </td><td> n </td><td> n </td><td> 3 </td><td> 5234 </td><td> 4075.859156 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 1 </td><td> american literature </td><td> n </td><td> n </td><td> 8 </td><td> 6856 </td><td> 5800.357992 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 1 </td><td> geography </td><td> y </td><td> y </td><td> 8 </td><td> 3040 </td><td> 963.4004114 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 1 </td><td> music </td><td> y </td><td> y </td><td> 10 </td><td> 3288 </td><td> 2362.845994 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 2 </td><td> american literature </td><td> n </td><td> n </td><td> 8 </td><td> 6984 </td><td> 4368.417857 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 2 </td><td> british literature </td><td> n </td><td> n </td><td> 4 </td><td> 3977 </td><td> 3861.683941 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 2 </td><td> geography </td><td> y </td><td> n </td><td> 5 </td><td> 5358 </td><td> 1727.575547 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 3 </td><td> biology </td><td> y </td><td> n </td><td> 6 </td><td> 4490 </td><td> 4241.514602 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 3 </td><td> british literature </td><td> n </td><td> y </td><td> 9 </td><td> 3476 </td><td> 2176.995858 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 3 </td><td> PE </td><td> y </td><td> y </td><td> 7 </td><td> 6060 </td><td> 713.4806136 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 4 </td><td> biology </td><td> y </td><td> y </td><td> 8 </td><td> 5059 </td><td> 2269.706168 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 4 </td><td> music </td><td> n </td><td> y </td><td> 8 </td><td> 3250 </td><td> 583.2956503 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 4 </td><td> PE </td><td> n </td><td> y </td><td> 3 </td><td> 3945 </td><td> 577.6461806 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 5 </td><td> american literature </td><td> n </td><td> y </td><td> 7 </td><td> 4083 </td><td> 2853.53736 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 5 </td><td> music </td><td> y </td><td> y </td><td> 8 </td><td> 3502 </td><td> 1257.361273 </td> </tr>
<tr> <td> USA </td><td> Illinois </td><td> Cook </td><td> Chicago </td><td> x1 </td><td> elementary </td><td> public </td><td> 5 </td><td> PE </td><td> n </td><td> n </td><td> 3 </td><td> 5234 </td><td> 4075.859156 </td> </tr>
Table 3 output should look like this:
<tr> <th> </th><th> type </th><th> type2 </th><th> level </th><th> number of students </th><th> class </th><th> homework </th><th> number of teachers </th><th> tests </th><th> planned budget </th><th> actual budget </th><th> tests/student </th><th> $planned per student </th><th> actual $/student </th> </tr>
However, every query I run returns a duplicated number of students, which then over-counts either #students/teacher or planned $/student and actual $/student.
I need that, if I filter on these three dimensions:
-class
-tests
-homework
then these measures:
-#teachers/student
-actual$/student
-planned$/student
are computed at the lowest level, students per class.
I tried to transpose the 3 fields that exist only in table 2, but when I plug the result into Tableau I am not able to filter on them correctly.
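
The duplication described above is a join fan-out: one row of table 1 matches several rows of table 2, so the student count is repeated once per class. The following is a minimal sketch of that effect, written in Python with the standard-library sqlite3 module rather than Teradata. The schema is hypothetical and reduced (tables `t1` and `t2`, columns `level`, `students`, `planned`), it uses only the first rows of the sample data, and it assumes that the "class" column of table 1 corresponds to the "level" column of table 2. It shows why sums inflate and one possible way to keep per-student ratios correct, not a definitive Teradata solution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, reduced versions of table 1 and table 2.
    CREATE TABLE t1 (semester TEXT, level INTEGER, students INTEGER);
    CREATE TABLE t2 (level INTEGER, class TEXT, tests INTEGER, planned REAL);
    INSERT INTO t1 VALUES ('spring2016', 1, 331);
    INSERT INTO t2 VALUES (1, 'american literature', 8, 6856),
                          (1, 'geography', 8, 3040),
                          (1, 'music', 10, 3288);
""")

# The single t1 row is repeated once per matching class in t2.
# Dividing row by row keeps each per-student ratio correct.
joined = conn.execute("""
    SELECT t1.semester, t1.level, t2.class, t1.students,
           t2.planned / t1.students AS planned_per_student
    FROM t1 JOIN t2 ON t1.level = t2.level
    ORDER BY t2.class
""").fetchall()

# Summing the repeated column over the joined rows over-counts students.
overcounted = conn.execute(
    "SELECT SUM(t1.students) FROM t1 JOIN t2 ON t1.level = t2.level"
).fetchone()[0]

# Summing from t1 alone counts each group of students once.
correct = conn.execute("SELECT SUM(students) FROM t1").fetchone()[0]

for row in joined:
    print(row)
print("sum over join:", overcounted, "sum over t1:", correct)
```

In this sketch the student count is only ever used as a divisor inside a joined row, and totals are taken from table 1 on its own, so the repeated rows never get added together.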

Related

In currency exchanges, how to deal with multiple values that give the same result (with rounding)?

For example, let's say that I have an invoice for 1000 CUA (Currency A) and the exchange rate is 1 CUA = 20.20 CUB (Currency B). So I make 10 payments of 2019.90 CUB:
#    Payment (CUB)    Payment (CUA)    Balance
0                                      1000.00
1    2019.90          100.00           900.00
2    2019.90          100.00           800.00
3    2019.90          100.00           700.00
4    2019.90          100.00           600.00
5    2019.90          100.00           500.00
6    2019.90          100.00           400.00
7    2019.90          100.00           300.00
8    2019.90          100.00           200.00
9    2019.90          100.00           100.00
10   2019.90          100.00           0.00
Σ    20199.00         1000.00
1000.00 CUA is 20200.00 CUB, but total payments were only 20199.00 CUB.
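
The arithmetic in this example can be reproduced with a short sketch in Python using the standard-library `decimal` module. The rounding mode (`ROUND_HALF_UP` to two decimal places) and the helper name `cub_to_cua` are assumptions for illustration; the question does not say how the conversion is rounded.

```python
from decimal import Decimal, ROUND_HALF_UP

RATE = Decimal("20.20")   # 1 CUA = 20.20 CUB, from the example
CENT = Decimal("0.01")

def cub_to_cua(cub):
    """Convert CUB to CUA, rounded to cents (assumed rounding mode)."""
    return (cub / RATE).quantize(CENT, rounding=ROUND_HALF_UP)

payment_cub = Decimal("2019.90")
payment_cua = cub_to_cua(payment_cub)      # 99.995... rounds to 100.00

total_cub = payment_cub * 10               # what was actually paid
total_cua = payment_cua * 10               # what the ledger credits

invoice_cub = (Decimal("1000.00") * RATE).quantize(CENT)
shortfall_cub = invoice_cub - total_cub    # CUB lost to rounding

print(payment_cua, total_cub, total_cua, invoice_cub, shortfall_cub)
```

Each payment is credited as a full 100.00 CUA even though 2019.90 CUB is slightly less than that, so the rounded CUA balance reaches zero while the CUB total stays below the invoice value.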

How to query with "IN" in Q (kdb)?

Let's assume that I have a table in kdb named "Automotive" with the following data:
Manufacturer  Country        Sales Id
Mercedes      United States  002
Mercedes      Canada         002
Mercedes      Germany        003
Mercedes      Switzerland    003
Mercedes      Japan          004
BMW           United States  002
BMW           Canada         002
BMW           Germany        003
BMW           Switzerland    003
BMW           Japan          004
How would I structure a query in Q such that I can fetch the records matching United States and Canada without using an OR clause?
In SQL, it would look something like:
SELECT Manufacturer, Country from Automotive WHERE Country IN ('United States', 'Canada')
Thanks in advance for helping this Q beginner!
It's basically the same in kdb. The way you write your query depends on the data type. Below is an example where manufacturer is a symbol and country is a string.
q)tbl:([]manufacturer:`Merc`Merc`BMW`BMW`BMW;country:("United States";"Canada";"United States";"Germany";"Japan");ID:til 5)
q)
q)tbl
manufacturer country         ID
-------------------------------
Merc         "United States" 0
Merc         "Canada"        1
BMW          "United States" 2
BMW          "Germany"       3
BMW          "Japan"         4
q)meta tbl
c           | t f a
------------| -----
manufacturer| s
country     | C
ID          | j
q)select from tbl where manufacturer in `Merc`Ford
manufacturer country         ID
-------------------------------
Merc         "United States" 0
Merc         "Canada"        1
q)
q)select from tbl where country in ("United States";"Canada")
manufacturer country         ID
-------------------------------
Merc         "United States" 0
Merc         "Canada"        1
BMW          "United States" 2
Check out how to use Q-sql here: https://code.kx.com/q4m3/9_Queries_q-sql/

Remove duplicates in spark with 90 percent column match

I want to compare rows in a Spark DataFrame and remove a row if 90 percent of its columns match another row (for example, 9 of 10 columns). How can I do this?
Name      Country   City     Married  Salary
Tony      India     Delhi    Yes      30000
Carol     USA       Chicago  Yes      35000
Shuaib    France    Paris    No       25000
Dimitris  Spain     Madrid   No       28000
Richard   Italy     Milan    Yes      32000
Adam      Portugal  Lisbon   Yes      36000
Tony      India     Delhi    Yes      22000  <--
Carol     USA       Chicago  Yes      21000  <--
Shuaib    France    Paris    No       20000  <--
The marked rows have to be removed because 4 of their 5 column values match an already existing row. How can I do this with a PySpark DataFrame? TIA
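
The matching rule itself can be sketched in plain Python before porting it to PySpark. This is only an illustration of the logic, not a distributed implementation: it compares every row against the rows already kept, which is quadratic and runs on a single machine. The function names and the `min_matching` parameter are hypothetical. In the sample data 4 of 5 columns match, so the sketch uses a threshold of 4 columns rather than a literal 90 percent.

```python
def count_matches(a, b):
    """Number of positions at which two rows hold the same value."""
    return sum(1 for x, y in zip(a, b) if x == y)

def drop_near_duplicates(rows, min_matching):
    """Keep a row only if it matches no earlier kept row in
    min_matching or more columns (first occurrence wins)."""
    kept = []
    for row in rows:
        if all(count_matches(row, k) < min_matching for k in kept):
            kept.append(row)
    return kept

rows = [
    ("Tony", "India", "Delhi", "Yes", 30000),
    ("Carol", "USA", "Chicago", "Yes", 35000),
    ("Shuaib", "France", "Paris", "No", 25000),
    ("Dimitris", "Spain", "Madrid", "No", 28000),
    ("Richard", "Italy", "Milan", "Yes", 32000),
    ("Adam", "Portugal", "Lisbon", "Yes", 36000),
    ("Tony", "India", "Delhi", "Yes", 22000),
    ("Carol", "USA", "Chicago", "Yes", 21000),
    ("Shuaib", "France", "Paris", "No", 20000),
]

result = drop_near_duplicates(rows, min_matching=4)
for row in result:
    print(row)
```

Which row of a near-duplicate pair survives depends on row order here; a DataFrame has no guaranteed order, so a real implementation would need an explicit tie-breaking rule.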

How to list documents from table grouped by id?

I'm making a template to show documents by its parts (title, annotation, contenttext).
{% block header %}
<h1>{% block title %}Documents{% endblock %}</h1>
{% endblock %}
{% block content %}
{% for document in documents | groupby('documentid' %}
<div class="card mb-3">
<div class="card-header">
{{ document.annotation }}
<span class="badge badge-pill badge-primary">{{ document.title}}</span>
</div>
<div class="card-body">
<p class="card-text">{{ document.contenttext }}</p>
</div>
</div>
{% endfor %}
{% endblock %}
The table looks like that:
documentid title annotation contenttext
1 abc abc abc
1 abc abc def
2 zzz xxx yyy
3 ooo mmm fff
The first two rows have the same documentid, so I want to display both 'abc' and 'def' in the same document, and then go further.
Which construction will allow me to do that?
title|annotation|contenttext
doc1: abc | abc | abc def
doc2: zzz | xxx | yyy
doc3: ooo | mmm | fff
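
The desired grouping can be sketched in plain Python, independently of the template engine. This is a minimal illustration that assumes the rows arrive as dictionaries with the four column names shown above; the helper name `group_documents` is hypothetical. It takes title and annotation from the first row of each group and joins the contenttext values.

```python
from itertools import groupby
from operator import itemgetter

rows = [
    {"documentid": 1, "title": "abc", "annotation": "abc", "contenttext": "abc"},
    {"documentid": 1, "title": "abc", "annotation": "abc", "contenttext": "def"},
    {"documentid": 2, "title": "zzz", "annotation": "xxx", "contenttext": "yyy"},
    {"documentid": 3, "title": "ooo", "annotation": "mmm", "contenttext": "fff"},
]

def group_documents(rows):
    """Collapse rows that share a documentid into one document each."""
    # groupby only merges adjacent items, so sort by the key first.
    ordered = sorted(rows, key=itemgetter("documentid"))
    grouped = []
    for doc_id, parts in groupby(ordered, key=itemgetter("documentid")):
        parts = list(parts)
        grouped.append({
            "documentid": doc_id,
            "title": parts[0]["title"],
            "annotation": parts[0]["annotation"],
            "contenttext": " ".join(p["contenttext"] for p in parts),
        })
    return grouped

documents = group_documents(rows)
for doc in documents:
    print(doc["title"], "|", doc["annotation"], "|", doc["contenttext"])
```

Passing a pre-grouped list like `documents` to the template would let the existing loop body read `document.title`, `document.annotation`, and `document.contenttext` directly.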

Merge tables together in matlab

How do I merge two tables together?
Table 1:
LastName      Age    Weight    Smoker
__________    ___    ______    ______
'Smith'       38     176       true
'Johnson'     43     163       false
'Williams'    38     131       false
Table 2:
LastName      Age    Weight    Smoker
__________    ___    ______    ______
'Jones'       40     133       false
'Brown'       49     119       false
into:
LastName      Age    Weight    Smoker
__________    ___    ______    ______
'Smith'       38     176       true
'Johnson'     43     163       false
'Williams'    38     131       false
'Jones'       40     133       false
'Brown'       49     119       false
You can simply do this using:
Complete_Table = [Table1; Table2]
Complete Code:-
LastName = {'Smith';'Johnson';'Williams'};
Age = [38;43;38];
Weight = [176;163;131];
Smoker=logical([1;0;0]);
Table1 = table(LastName,Age,Weight,Smoker)
%Overwriting
LastName = {'Jones';'Brown'};
Age = [40;49];
Weight = [133;119];
Smoker=logical([0;0]);
Table2 = table(LastName,Age,Weight,Smoker)
Complete_Table = [Table1; Table2]
Output:-
Table1 =
LastName      Age    Weight    Smoker
__________    ___    ______    ______
'Smith'       38     176       true
'Johnson'     43     163       false
'Williams'    38     131       false
Table2 =
LastName    Age    Weight    Smoker
________    ___    ______    ______
'Jones'     40     133       false
'Brown'     49     119       false
Complete_Table =
LastName      Age    Weight    Smoker
__________    ___    ______    ______
'Smith'       38     176       true
'Johnson'     43     163       false
'Williams'    38     131       false
'Jones'       40     133       false
'Brown'       49     119       false