Group unique values in R - group-by

I have the following data:
I want to summarise the data in three columns: Subject type, plus two other columns stating the number of schools and the number of students for each subject type. I have used the group_by function, but it keeps giving the number of times each subject type appeared rather than the number of schools or students.
TIA

Related

PowerBI group by separate columns

I'm struggling to group 2 columns in PowerBI. When I use the group by function I get "The Expression Refers to Multiple Columns". When using:
SUMX(VALUES('Table'[OrderNumber]), CALCULATE(SUM('Table'[OrderValue])))
I get the error message: "A circular dependency was detected."
Does anyone know how to group the raw data example from the image into the desired table in PowerBI?
Price product should be grouped by product and summarized for all orders.
Order value should be grouped by product and summarized for total order value.

SQLite query not working when trying to run the same query in postgresql

I have a database of users' stock purchases and sales, and I need to retrieve the data by summing all the shares for a specific user.
Ex: If I choose user_id = 7, it should return two rows, one with TESLA and one with Apple with the sums of the shares.
Database:
SQL queries I've tried include:
SELECT name,symbol,name,price,total, SUM(shares)
FROM symbol
WHERE user_id=7
GROUP BY name,symbol,name,price,total
This returns more rows than expected; it should return just two rows.
Your issue is your grouping.
The number of rows returned equals the number of unique combinations of the columns you group by.
With your where clause and grouping by name,symbol,name,price,total there are three unique rows.
Remove the price and total columns from the grouping and you will get your desired two rows, or include them in your query as summed columns.
e.g.
SELECT name, symbol, SUM(shares) AS total_shares
FROM symbol
WHERE user_id = 7
GROUP BY name, symbol
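The effect of the grouping columns on the row count can be checked with a small sketch. It uses Python's built-in sqlite3 module rather than PostgreSQL, and the sample rows are invented for illustration (the original table contents are not shown), so treat it as a demonstration of the idea rather than a reproduction of the asker's data.

```python
import sqlite3

# Hypothetical sample data; the real table contents are not shown in the post.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE symbol (user_id INTEGER, name TEXT, symbol TEXT, "
    "shares INTEGER, price REAL, total REAL)"
)
conn.executemany(
    "INSERT INTO symbol VALUES (?, ?, ?, ?, ?, ?)",
    [
        (7, "Tesla", "TSLA", 2, 700.0, 1400.0),
        (7, "Tesla", "TSLA", 3, 710.0, 2130.0),
        (7, "Apple", "AAPL", 5, 150.0, 750.0),
        (8, "Apple", "AAPL", 1, 150.0, 150.0),
    ],
)

# Grouping by price and total too: each distinct combination becomes its own row.
too_many = conn.execute(
    "SELECT name, symbol, price, total, SUM(shares) FROM symbol "
    "WHERE user_id = 7 GROUP BY name, symbol, price, total"
).fetchall()

# Grouping only by the columns that identify a stock: one row per stock.
per_stock = conn.execute(
    "SELECT name, symbol, SUM(shares) FROM symbol "
    "WHERE user_id = 7 GROUP BY name, symbol ORDER BY name"
).fetchall()

print(len(too_many))
print(per_stock)
```

With this sample data the first query yields one row per distinct price/total combination, while the second collapses each stock into a single row with its summed shares.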

In Tableau: Count entries Column B that have the same value in Column A

I have a table with two columns. For simplicity, let's say Column A is General Contractors and Column B is subcontractors. Any given general contractor can have a variable number of subcontractors. I would like to add a third column that simply displays a count of how many subcontractors each contractor has.
I have tried several calculations using "fixed" and "include" functions as well as "Count" and "CountD" functions and have tried directly using the count functions (right-click>>measure>>count) but all I get are 1's in the resulting column.
The data come from a table where there is one row for each subcontractor, so if a general contractor had 5 subcontractors then there would be 5 rows where the general contractor repeats itself over and over with a different subcontractor next to it.
There are far too many different general contractors to use conditional statements.
Is what I'm doing possible and what other things should I try?
Try this
{Fixed [general contractor]: Countd([sub contractor]) }
Add this field to your view after contractor and sub-contractor. You'll get the count (say, 5) repeated in each row for that general contractor.
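Outside Tableau, the same logic (a distinct count computed per general contractor and then repeated on every row of that contractor) can be sketched in plain Python. The contractor and subcontractor names below are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows: one (general contractor, subcontractor) pair per row.
rows = [
    ("Acme", "Sub A"),
    ("Acme", "Sub B"),
    ("Acme", "Sub B"),  # duplicate pair, counted only once
    ("Bolt", "Sub C"),
]

# Collect the distinct subcontractors seen for each general contractor.
subs_by_contractor = defaultdict(set)
for contractor, sub in rows:
    subs_by_contractor[contractor].add(sub)

# Attach the per-contractor distinct count to every original row.
with_counts = [(c, s, len(subs_by_contractor[c])) for c, s in rows]

for row in with_counts:
    print(row)
```

The count is computed at the contractor level, independently of how many rows are displayed, which is why every row of the same contractor shows the same number.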

Showing 4 records in a portal from same table

I have a table that contains students' results. These results are generally broken into four types: term1, term2, term3 and term4. So over a year, a student may have up to four records in that table containing his results.
I want to create a layout that contains a portal showing all 4 records in a single portal row. Is there any way to do this? Or any workaround?
The reason I do not want to display the records as four rows in the portal is that there are different subjects, and it would not look right if each subject occupied four rows, given how many subjects a student may take.
I can think of two ways to approach this, both of which would require a relationship from your Results table occurrence to another table occurrence based on Results, let's call it Results~SameStudentID. (The matching field would be the foreign key to the Student table, FK_StudentID = FK_StudentID.)
First approach: create 4 calculation fields in your Results table: Result_1, Result_2, and so on up to Result_4. The formula for each calculation (starting from the context of the Results table occurrence), where n is the number of the field, would be:
GetNthRecord ( Results~SameStudentID::Result ; n )
Then simply include the 4 "Result_n" fields in your portal.
Second approach: create just one field, Results_1_4, with the following formula:
Substitute ( List ( Results~SameStudentID::Result ) ; ¶ ; " " )
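To make the two approaches concrete, here is a rough Python analogue. It is not FileMaker code: the records, the helper names, and the choice to return an empty string for a missing term are all assumptions made for this sketch.

```python
# Hypothetical result records: (student_id, term, result).
results = [
    (1, "term1", "A"),
    (1, "term2", "B"),
    (1, "term3", "A"),
    (2, "term1", "C"),
]


def related_results(student_id):
    """Results for one student in stored order (stands in for the self-join)."""
    return [result for sid, _term, result in results if sid == student_id]


def nth_result(student_id, n):
    """1-based lookup, like one Result_n field; "" if there is no nth record."""
    related = related_results(student_id)
    return related[n - 1] if 1 <= n <= len(related) else ""


def results_1_4(student_id):
    """All related results joined with spaces, like the single-field approach."""
    return " ".join(related_results(student_id))


# One "portal row" per student: four separate columns, or one combined string.
four_columns = [nth_result(1, n) for n in range(1, 5)]
combined = results_1_4(1)
print(four_columns, combined)
```

The first approach keeps each term in its own column, which is easier to align in a layout; the second is simpler to set up but produces a single text value.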

Need help building complex multi-table queries

This question is something that a lot of people learning bioinformatics and new to DNA data analysis are struggling with:
Let's say I have 20 tables with the same column headings. Each table represents a patient sample and each row represents a locus (site) which has mutated in that sample. Each site is uniquely identified by two columns together: chromosome number and base number (e.g. 1 and 43535, 1 and 33456, 1 and 3454353). There are several columns which give different characteristics of each mutation, including a column called Gene which gives the gene at that site. Multiple sites can be mutated in a gene, meaning the Gene column can have the same value multiple times in one table.
I want to query all these tables at the same time by, let's say, Gene. I input a value from the Gene column and I want as output the names of all the tables (samples) in which the gene name is present in the Gene column, and preferably also the entire line(s) for each sample, so that I can compare the characteristics of the mutation in that gene across multiple samples on one output page.
I also want to input a number, say 4, and get as output a list of genes which have mutated in at least 4 of 20 patients (a list of genes whose names appear in the Gene column in at least 4 of 20 tables).
What is the "easiest way" to do this? What is the "best way" assuming I want to make more flexible queries, besides these two?
I am an MD and do not have any particular software expertise, but I am willing to put in the necessary time to build this query system. A few lines of code won't put me off.
Eg data:
Func Gene ExonicFunc Chr Start End Ref Obs
exonic ACTRT2 nonsynonymous SNV 1 2939346 2939346 G A
exonic EIF4G3 nonsynonymous SNV 1 21226201 21226201 G A
exonic CSMD2 nonsynonymous SNV 1 34123714 34123714 C T
This is just a third of the columns. Multiple columns were removed to fit the page size here...
Thank you.
Create a view that unions all the tables together. You should probably add additional information about which table each row comes from:
create view allpatients as
select 'a' as whichtable, t.*
from tableA t
union all
select 'b' as whichtable, t.*
from tableB t
...
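As a minimal sketch of how such a view behaves, the following uses Python's built-in sqlite3 module with only two tables. The trimmed column set (including the name StartPos) and the sample rows are assumptions for illustration; a real setup would repeat the pattern for all 20 tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two hypothetical per-patient tables sharing the same (trimmed) columns.
cols = "Func TEXT, Gene TEXT, Chr TEXT, StartPos INTEGER"
conn.execute(f"CREATE TABLE tableA ({cols})")
conn.execute(f"CREATE TABLE tableB ({cols})")
conn.execute("INSERT INTO tableA VALUES ('exonic', 'ACTRT2', '1', 2939346)")
conn.execute("INSERT INTO tableB VALUES ('exonic', 'ACTRT2', '1', 2939346)")
conn.execute("INSERT INTO tableB VALUES ('exonic', 'CSMD2', '1', 34123714)")

# The view tags each row with the table it came from.
conn.execute(
    "CREATE VIEW allpatients AS "
    "SELECT 'a' AS whichtable, t.* FROM tableA t "
    "UNION ALL "
    "SELECT 'b' AS whichtable, t.* FROM tableB t"
)

# Query every patient at once by gene name.
hits = conn.execute(
    "SELECT whichtable, Gene, StartPos FROM allpatients "
    "WHERE Gene = ? ORDER BY whichtable",
    ("ACTRT2",),
).fetchall()
print(hits)
```

Because the view carries the whichtable column, a single query by gene returns both the matching rows and the samples they belong to.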
You might find that it is easier to "instantiate" the view by creating a table with all patients. Just have a stored procedure that recreates the table by combining the 20 tables.
Alternatively, you could find that you have large individual tables (millions of rows). In this case, you would want to treat each of the original tables as a partition.
If what you have is a bunch of Excel files, you can import them all into the same table, with a distinct column for patient id. There is no need to create 20 different tables for this -- in fact, it would be a bad idea.
Once you do, go to Access' query design, SQL view and use these queries:
To create a query that returns all fields for the input gene name:
select *
from gene_data
where gene = [GeneName]
To create a query that returns gene names that are mutated in at least 4 samples:
select gene
from
(select gene, sample_id
from gene_data
group by gene, sample_id) g
group by gene
having count(sample_id) >= 4
After this, change to design view -- you'll see how to create similar queries using the GUI.
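For anyone who wants to try the single-table design outside Access, here is a small sketch using Python's built-in sqlite3 module. The table layout (sample_id, gene, chr, start_pos), the sample rows, and the helper function names are assumptions for illustration. It also uses COUNT(DISTINCT sample_id), which SQLite supports, as an alternative to the nested query, with the threshold passed in as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gene_data (sample_id TEXT, gene TEXT, chr TEXT, start_pos INTEGER)"
)
# Hypothetical rows: one row per mutated site per sample.
conn.executemany(
    "INSERT INTO gene_data VALUES (?, ?, ?, ?)",
    [
        ("s1", "ACTRT2", "1", 2939346),
        ("s1", "ACTRT2", "1", 2939400),  # second site, same gene and sample
        ("s2", "ACTRT2", "1", 2939346),
        ("s3", "ACTRT2", "1", 2939346),
        ("s1", "CSMD2", "1", 34123714),
        ("s2", "CSMD2", "1", 34123714),
    ],
)


def rows_for_gene(gene):
    """All rows, across samples, where the given gene is mutated."""
    return conn.execute(
        "SELECT * FROM gene_data WHERE gene = ? ORDER BY sample_id, start_pos",
        (gene,),
    ).fetchall()


def genes_in_at_least(n):
    """Genes mutated in at least n distinct samples."""
    cursor = conn.execute(
        "SELECT gene FROM gene_data GROUP BY gene "
        "HAVING COUNT(DISTINCT sample_id) >= ? ORDER BY gene",
        (n,),
    )
    return [gene for (gene,) in cursor]


print(rows_for_gene("ACTRT2"))
print(genes_in_at_least(3))
```

Counting distinct samples matters because a gene can have several mutated sites in one sample; counting rows alone would overstate how many patients carry a mutation in that gene.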