Group by specific column in PostgreSQL

I am trying to get a count of persons depending on the org_id.
Let's say I have 10 people in my person table, coming from different organizations.
I want to count the number of people working in each organization, with each person record still listed separately.
Please see the SQL Fiddle link for exactly what I am trying to do.
Postgres version 9.4
Below are my table records:
id  person_name     emp_id  org_id
1   John Mackenzie  TTT104  1
2   Raven Raid      TTT105  1
3   Albert Pinto    TTT106  2
4   Albert Pinto1   TTT119  2
5   Ram Raman       TTT108  2
6   Huge Jackman    TTT109  2
7   Peter Pan       TTT107  2
8   Albert Pinto2   TTT106  2
RESULT EXPECTED:
id  person_name     emp_id  count(org_id)
1   John Mackenzie  TTT104  2
2   Raven Raid      TTT105  2
3   Albert Pinto    TTT106  6
4   Albert Pinto1   TTT119  6
5   Ram Raman       TTT108  6
6   Huge Jackman    TTT109  6
7   Peter Pan       TTT107  6
8   Albert Pinto2   TTT106  6
This is how I want my records to look in my Velocity template.

The solution you are looking for is the following:
SELECT org_id, count(*)
FROM person
GROUP BY org_id;
Basically with this query you are collecting the number of people working in each distinct org_id.
The result of the query is then:
org_id | count
-------+------
     1 |     2
     2 |     6

Execute the query below to get the per-organization count on every person row:
SELECT p.id,
       p.person_name,
       p.emp_id,
       count(p.org_id) OVER w AS org
FROM person p
WINDOW w AS (PARTITION BY p.org_id);
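
For comparison, the same per-row counts can also be produced by joining the GROUP BY totals from the first answer back onto the person table. The sketch below is only an illustration: it uses Python's sqlite3 module with an in-memory database as a stand-in for PostgreSQL, the rows are copied from the question, and the names org_counts and cnt are made up for this example.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER, person_name TEXT, emp_id TEXT, org_id INTEGER)"
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?, ?)",
    [
        (1, "John Mackenzie", "TTT104", 1),
        (2, "Raven Raid", "TTT105", 1),
        (3, "Albert Pinto", "TTT106", 2),
        (4, "Albert Pinto1", "TTT119", 2),
        (5, "Ram Raman", "TTT108", 2),
        (6, "Huge Jackman", "TTT109", 2),
        (7, "Peter Pan", "TTT107", 2),
        (8, "Albert Pinto2", "TTT106", 2),
    ],
)

# Join every person row to the per-organization totals computed by GROUP BY.
# org_counts and cnt are hypothetical names chosen for this example.
rows = conn.execute(
    """
    SELECT p.id, p.person_name, p.emp_id, org_counts.cnt
    FROM person p
    JOIN (SELECT org_id, count(*) AS cnt
          FROM person
          GROUP BY org_id) AS org_counts
      ON org_counts.org_id = p.org_id
    ORDER BY p.id
    """
).fetchall()

for row in rows:
    print(row)
```

The window-function query is usually the shorter way to write this; the join form mainly shows how the two answers relate.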

Related

Using GraphFrames (Scala) to compute hierarchy

I have a dataframe below:
employee_id | employee_name | manager_employee_id
------------+---------------+--------------------
1           | eric (ceo)    | 1
2           | edward        | 1
3           | john          | 1
4           | james         | 2
5           | ella          | 4
I would like to use spark (scala) graphframe logic to achieve the following:
employee_id | employee_name | manager_employee_id | level | hierarchy
------------+---------------+---------------------+-------+----------
1           | eric          | 1                   | 0     | /1
2           | edward        | 1                   | 1     | /1/2
3           | john          | 1                   | 1     | /1/3
4           | james         | 2                   | 2     | /1/2/4
5           | ella          | 4                   | 3     | /1/2/4/5
Any help would be much appreciated.

I think the answer you are looking for is more or less related to this.
The only modification required at your end is aggregating the messages to get the complete hierarchy.
For that part you can refer to this.
A combination of the above two will get you the desired results.
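
To make the target computation concrete, here is a minimal sketch in plain Python. It is not GraphFrames or Spark code; it only follows the manager links from each employee up to the root, which is the information a message-passing approach would have to accumulate level by level. The dictionary layout and the helper name chain_to_root are assumptions made for this illustration, and the data is the question's input table.

```python
# employee_id -> (employee_name, manager_employee_id); the root manages itself.
employees = {
    1: ("eric", 1),
    2: ("edward", 1),
    3: ("john", 1),
    4: ("james", 2),
    5: ("ella", 4),
}


def chain_to_root(emp_id):
    """Return the ids from the root down to emp_id by following manager links.

    Hypothetical helper; assumes the manager links contain no cycles other
    than the self-managed root.
    """
    chain = [emp_id]
    while employees[chain[-1]][1] != chain[-1]:  # stop at the self-managed root
        chain.append(employees[chain[-1]][1])
    return list(reversed(chain))


# Level is the number of hops to the root; hierarchy is the joined id path.
result = {}
for emp_id in employees:
    chain = chain_to_root(emp_id)
    result[emp_id] = (len(chain) - 1, "/" + "/".join(str(i) for i in chain))

for emp_id, (level, hierarchy) in sorted(result.items()):
    print(emp_id, employees[emp_id][0], level, hierarchy)
```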

"Inserting" Records into Fields from a Database Feed

So the background to this is I'm trying to create a survival curve based on a database feed from the directions here.
What I have so far is three calculated fields per below. Patient ID is not a calculated field or necessary for the survival analysis, but I believe it could be useful for this question. For reference, there are about 20,000 unique patients.
Patient ID | Time | Censor | Group
Id1        | 3    | 0      | 1
Id2        | 8    | 0      | 2
Id3        | 1    | 1      | 1
Id4        | 3    | 1      | 1
Id5        | 11   | 0      | 1
Id5        | 7    | 1      | 2
What I would like to do is insert two records (one for each group) with Time set to 0, like this:
Patient ID | Time | Censor | Group | Link
           | 0    |        | 1     |
           | 0    |        | 2     |
Id1        | 3    | 0      | 1     | link
Id2        | 8    | 0      | 2     | link
Id3        | 1    | 1      | 1     | link
Id4        | 3    | 1      | 1     | link
Id5        | 11   | 0      | 1     | link
Id5        | 7    | 1      | 2     | link
I tried, unsuccessfully, to create an Excel spreadsheet with these base attributes to union with the columns; however, an Excel spreadsheet does not appear to be able to union with a database.
My next idea is to find 2 of the 20,000 patients for which I can create a calculated field along these lines (I am not sure this is feasible in Tableau; please excuse my syntax):
IF [Patient ID] = Id3 THEN [TIME] = 0 AND [CENSOR] IS NULL
END
and then a [Link] calculated formula:
IF [Patient ID] = Id3 THEN NULL
ELSE "link"
END
Any help would be appreciated. Would like to avoid inserting these records in the database.

The best and easiest option is to use an outer join to your Excel workbook. This is a new feature in Tableau version 10 (cross-database joins).
Then, once the dataset is combined, you can build business logic through a filter or calculated field based on the absence or presence of the Excel data.
http://www.tableau.com/about/blog/2016/7/integrate-your-data-cross-database-joins-56724

Please help me 'grok' triplestores

I'm an RDBMS person from way back. I'm trying to grok triplestore. I think my "confusion" may be addressed with the answer to the following question:
How is this...
Table (Subjects):
ID  Subject  Details
1   Barney   …
2   Fred     …
3   Picture  …
4   …
Table2 (Predicates):
ID  Predicate       Details
1   friendOf        …
2   marriedTo       …
3   hasTimeStamp    …
4   hasGeoCoord     …
5   hasEventName    …
6   belongsTo       …
7   containsPerson  …
8   …
Table3 (Objects) - These may be Subjects as well:
ID  Object                SubjectID  Details
1   Fred                  2          …
2   Wilma                 NULL       …
3   January 1, 2010 1530  NULL       …
4   46°12′N               NULL       …
5   6°09′E                NULL       …
6   Wedding               NULL       …
7   Ski Trip              NULL       …
8   Barney                1          …
9   …
Table4 (Triplestores)
ID  SubjectID  PredicateID  ObjectID  Details
1   1          1            2         …
2   2          2            3         …
3   3          3            3         …
4   3          4            4         …
5   3          4            5         …
6   3          5            6         …
7   3          5            7         …
8   3          7            8         …
9   3          7            2         …
10  3          7            1         …
11  …
So #10 in Triplestores is: Picture containsPerson Fred
...Not a triplestore?
If it is then please comment on why this implementation (as an RDBMS) is inefficient etc.
Thanks in advance!!

It's possible, and easy to some degree, to implement a triple store on top of an RDBMS. There are several systems currently available that do this with varying degrees of success. However, they tend not to perform all that well because of the transitive self-joins their design usually requires. This is why serious vendors who provide a triple store backed by a relational database, such as Oracle, have customized handling to improve efficiency in these situations.
In my experience, native triple stores, those designed for the purpose of storing and querying RDF, always outperform solutions shoehorned on top of a relational system. So while they're very much databases and have a lot in common with a traditional RDBMS, there are still design choices in their implementation that make them better suited to answering SPARQL queries.
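
To illustrate the self-join point, here is a small sketch using Python's sqlite3 module. It is an assumption-laden toy, not how any particular triple store works: the single triples table, its column names, and the specific facts (for example that Barney is a friend of Fred) are made up for the example, borrowing only the names and predicates from the question. Each extra hop in a question such as "who is married to a friend of Barney?" needs one more copy of the same table in the join.

```python
import sqlite3

# One three-column table holding every statement (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        # Illustrative facts only; they are not taken from the question's rows.
        ("Barney", "friendOf", "Fred"),
        ("Fred", "marriedTo", "Wilma"),
        ("Picture", "hasEventName", "Wedding"),
        ("Picture", "containsPerson", "Barney"),
        ("Picture", "containsPerson", "Fred"),
        ("Picture", "containsPerson", "Wilma"),
    ],
)

# Two hops means joining the table to itself:
# t1 covers the friendOf hop, t2 covers the marriedTo hop.
rows = conn.execute(
    """
    SELECT t2.object
    FROM triples t1
    JOIN triples t2 ON t2.subject = t1.object
    WHERE t1.subject = 'Barney'
      AND t1.predicate = 'friendOf'
      AND t2.predicate = 'marriedTo'
    """
).fetchall()

print(rows)
```

A path of unknown length (friends of friends of friends, and so on) would need one self-join per hop or a recursive query, which is where a generic relational layout tends to struggle.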

Aggregate path counts using HierarchyID

Business problem - understand process fallout using analytics data.
Here is what we have done so far:
Build a dictionary table with every possible process step
Find each process "start"
Find the last step for each start
Join dictionary table to last step to find path to final step
In the final report output we end up with a list of paths for each start to each final step:
User  Fallout Step HierarchyID.ToString()
A     1/1/1
B     1/1/1/1/1
C     1/1/1/1
D     1/1/1
E     1/1
What this means is that five users (A-E) started the process. Assume that only User B finished and the other four did not. Since this is a simple example (without branching), we want the output to look as follows:
Step  Unique Users
1     5
2     5
3     4
4     2
5     1
The easiest solution I could think of is to take each hierarchyID.ToString(), parse that out into a set of subpaths, JOIN back to the dictionary table, and output using GROUP BY.
Given the volume of data, I'd like to use the built-in HierarchyID functions, e.g. IsAncestorOf.
Any ideas or thoughts how I could write this? Maybe a recursive CTE?

Restructuring the data may help with this. For example, structuring the data like this:
User  Step  Process#
----  ----  --------
A     1     1
A     2     1
A     3     1
B     1     2
B     2     2
B     3     2
B     4     2
B     5     2
E     1     3
E     2     3
E     1     4
E     2     4
E     3     4
Allows you to run the following query:
-- [user] is bracketed because USER is a reserved word in SQL Server
select step,
       count(distinct process#) as process_iterations,
       count(distinct [user]) as unique_users
from stepdata
group by step
order by step;
which returns:
Step  Process_Iterations  Unique_Users
----  ------------------  ------------
1     4                   3
2     4                   3
3     3                   3
4     1                   1
5     1                   1
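
The result above can be checked with a small sketch that loads the same rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for the real database here, and the column names user_name and process_no are renamed placeholders for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# user_name and process_no are hypothetical column names for this sketch.
conn.execute(
    "CREATE TABLE stepdata (user_name TEXT, step INTEGER, process_no INTEGER)"
)

# Same rows as the restructured table: (user, step, process number).
data = (
    [("A", s, 1) for s in range(1, 4)]    # A reaches step 3 in process 1
    + [("B", s, 2) for s in range(1, 6)]  # B reaches step 5 in process 2
    + [("E", s, 3) for s in range(1, 3)]  # E reaches step 2 in process 3
    + [("E", s, 4) for s in range(1, 4)]  # E reaches step 3 in process 4
)
conn.executemany("INSERT INTO stepdata VALUES (?, ?, ?)", data)

# Count distinct process runs and distinct users that reached each step.
rows = conn.execute(
    """
    SELECT step,
           count(DISTINCT process_no) AS process_iterations,
           count(DISTINCT user_name) AS unique_users
    FROM stepdata
    GROUP BY step
    ORDER BY step
    """
).fetchall()

for row in rows:
    print(row)
```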

I'm not familiar with hierarchyid, but splitting out that data into chunks for analysis looks like the sort of problem numbers tables are very good for. Join a numbers table against the individual substrings in the fallout and it shouldn't be too hard to treat the whole thing as a table and analyse it on the fly, without any non-set operations.
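
As a rough sketch of the splitting idea, the Python below expands each path string from the question into one entry per step and counts users per step. It runs outside the database and does not use HierarchyID functions or a numbers table; it also relies on the question's no-branching assumption, so that a step is identified by its depth alone.

```python
from collections import Counter

# Fallout path per user, copied from the question's report output.
paths = {
    "A": "1/1/1",
    "B": "1/1/1/1/1",
    "C": "1/1/1/1",
    "D": "1/1/1",
    "E": "1/1",
}

# A user whose path has depth d passed through steps 1..d.
# Each user appears once, so the tally per step is a count of unique users.
users_per_step = Counter()
for user, path in paths.items():
    depth = len(path.split("/"))
    for step in range(1, depth + 1):
        users_per_step[step] += 1

for step in sorted(users_per_step):
    print(step, users_per_step[step])
```

With branching, the key would have to be the sub-path itself rather than its depth.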

Crystal Reports grouping two tables based on the primary key of one table

I have 2 tables:
Product
fields are: prod_no (primary key), prod_name
Sales
fields are: sal_no, sal_date, cust, prod1 (foreign key i.e. prod_no), prod2 (foreign key i.e. prod_no), fst_qty, snd_qty
I want to sort my report based on prod_no.
Example:
Product
prod_no  prod_name
1        printer
2        paper
3        cartridge
Sales
sal_no  sal_date  customer  prod1  prod2  fst_qty  snd_qty
1       5-sep-09  aaa       1      3      2        3
2       5-sep-09  bbb       2      3      1        1
3       5-sep-09  ccc       1      2      3        4
I want a report like this:
prod_no
1
    sal_no  sal_date  customer  qty
    1       5-sep-09  aaa       2
    3       5-sep-09  ccc       3
2
    2       5-sep-09  bbb       1
    3       5-sep-09  ccc       4
3
    1       5-sep-09  aaa       3
    2       5-sep-09  bbb       1

Book: Crystal Reports 10: The Complete Reference, by George Peck

Looks like a job for subreports to me, with Prod_No passed as a parameter.

You need to work on your grouping skills for this. I believe that if you simply create groupings for prod_no and then sal_no, and place the necessary fields in the proper group header and details sections, you should be able to display this info in the way that you want it.
Please see my answer on the below question for an example of a grouping similar to what you are looking for.
How to avoid the duplicate values?
Hope this helps and don't forget to mark an answer as the accepted answer and vote for answers that have helped. Thanks
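
Grouping on prod_no is easier to picture once each sale is split into one line per product column. The sketch below shows that reshaping in plain Python with the question's sample data; it is not Crystal Reports code and is not taken from any of the answers, and the variable names are made up for the illustration.

```python
from collections import defaultdict

sales = [
    # (sal_no, sal_date, customer, prod1, prod2, fst_qty, snd_qty)
    (1, "5-sep-09", "aaa", 1, 3, 2, 3),
    (2, "5-sep-09", "bbb", 2, 3, 1, 1),
    (3, "5-sep-09", "ccc", 1, 2, 3, 4),
]

# Turn each sale into two lines, one per product column, keyed by prod_no.
report = defaultdict(list)
for sal_no, sal_date, customer, prod1, prod2, fst_qty, snd_qty in sales:
    report[prod1].append((sal_no, sal_date, customer, fst_qty))
    report[prod2].append((sal_no, sal_date, customer, snd_qty))

# Print one group per product, with its sales lines ordered by sal_no.
for prod_no in sorted(report):
    print(prod_no)
    for line in sorted(report[prod_no]):
        print("   ", *line)
```

In a report tool, the same shape would typically come from a query that stacks the prod1 and prod2 columns before grouping.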
