Choose custom column name in SELECT expression depending on returned data, PostgreSQL

Let's say we have the following table:
id | bird | color
1 | raven | black
2 | gull | white
3 | broadbill | green
4 | goldfinch | yellow
5 | tit | yellow
Is it possible in PostgreSQL to write a SELECT expression that produces a dynamic alias for the color column? The alias name should depend on the data selected from the color column. It is assumed that only one row is returned (i.e., LIMIT 1 is applied at the end).
Pseudocode:
SELECT id, bird, color
as 'bw' if color.value in ['black', 'white']
else
as 'colored'
FROM table
WHERE color = 'white'
LIMIT 1
Expected results:
-- WHERE color = 'white'
id | bird | bw
2 | gull | white
-- WHERE color = 'yellow'
id | bird | colored
4 | goldfinch | yellow
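
For what it's worth, a truly dynamic alias is not possible in a plain SQL statement, because output column names are fixed when the statement is parsed; a data-dependent name would require building the query dynamically (e.g., EXECUTE format(...) inside a PL/pgSQL function). The closest static approximation is to return the label as data. A minimal sketch, assuming the table is named birds (the real table name is not given in the question):
-- Sketch only: "birds" and "color_group" are assumed names.
-- The alias itself cannot change per row; the label is returned as data instead.
SELECT id,
       bird,
       color,
       CASE WHEN color IN ('black', 'white') THEN 'bw' ELSE 'colored' END AS color_group
FROM birds
WHERE color = 'white'
LIMIT 1;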

Related

Creating a Filter Button in GridJS

I am new to gridjs and am struggling to find a way to add filter buttons. I am looking to add dropdown selectors or buttons that will filter out entire rows if a specified value is not in a specified column.
For example, if I have this table:
| Color | Another header |
| -------- | -------------- |
| Blue | Value1 |
| Green | Value2 |
| Blue | Value3 |
Is it possible within gridjs to create a filter dropdown so that selecting Blue filters out all non-Blue rows, and vice versa if you select Green?

How to check all values in columns efficiently using Spark?

I'm wondering how to build a dynamic filter in Spark when the columns are not known in advance.
For example, the DataFrame looks like this:
+-------+-------+-------+-------+-------+-------+
| colA | colB | colC | colD | colE | colF |
+-------+-------+-------+-------+-------+-------+
| Red | Red | Red | Red | Red | Red |
| Red | Red | Red | Red | Red | Red |
| Red | Blue | Red | Red | Red | Red |
| Red | Red | Red | Red | Red | Red |
| Red | Red | Red | Red | Blue | Red |
| Red | Red | White | Red | Red | Red |
+-------+-------+-------+-------+-------+-------+
The columns are only known at runtime, meaning there could also be colG, colH, and so on.
I need to check whether an entire column's values are all Red, and then get a count; in the case above the count is 3, since colA, colD and colF are all Red.
What I am doing is something like the code below, and it is SLOW:
val allColumns = df.columns
var count = 0
allColumns.foldLeft(df) { (df, column) =>
  // Keep the rows where this column is NOT "Red";
  // if nothing remains, the whole column is "Red"
  val tmpDf = df.filter(df(column) =!= "Red")
  if (tmpDf.rdd.isEmpty) {
    count += 1
  }
  df
}
I am wondering if there is a better way. Many thanks!
You get N RDD scans, where N is the number of columns. You can scan all of them at once and reduce in parallel, for example this way:
import org.apache.spark.sql.Row

// Combine rows pairwise: a column stays "Red" only if it is "Red" in every row
df.reduce((a, r) => Row.fromSeq(a.toSeq.zip(r.toSeq)
  .map { case (a, r) =>
    if (a == "Red" && r == "Red") "Red" else "Not"
  }))
res11: org.apache.spark.sql.Row = [Red,Not,Not,Red,Not,Red]
This code does a single RDD scan and then iterates over the Row columns inside reduce. Row.toSeq gets a Seq from a Row; Row.fromSeq rebuilds a Row from that Seq, so the same type is returned.
Edit: for the count, just append .toSeq.filter(_ == "Red").size to the result.
Why not simply do df.filter + df.count using only the DataFrame API?
val filter_expr = df.columns.map(c => col(c) === lit("Red")).reduce(_ and _)
val count = df.filter(filter_expr).count
//count: Long = 3
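
For what it's worth, the count of columns that are entirely Red can also be computed in a single aggregation pass. A sketch in Spark SQL, assuming the DataFrame is registered as a temporary view (the name colors is hypothetical) and one expression per column is generated from df.columns when the query string is built:
-- Sketch only: "colors" is an assumed view name; one IF(...) term per column,
-- yielding 1 when that column contains no non-Red value.
SELECT
    IF(SUM(CASE WHEN colA <> 'Red' THEN 1 ELSE 0 END) = 0, 1, 0)
  + IF(SUM(CASE WHEN colB <> 'Red' THEN 1 ELSE 0 END) = 0, 1, 0)
  + IF(SUM(CASE WHEN colC <> 'Red' THEN 1 ELSE 0 END) = 0, 1, 0)
  + IF(SUM(CASE WHEN colD <> 'Red' THEN 1 ELSE 0 END) = 0, 1, 0)
  + IF(SUM(CASE WHEN colE <> 'Red' THEN 1 ELSE 0 END) = 0, 1, 0)
  + IF(SUM(CASE WHEN colF <> 'Red' THEN 1 ELSE 0 END) = 0, 1, 0) AS all_red_columns
FROM colors
-- For the sample data above this returns 3 (colA, colD and colF).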

How to derive a column based on two different merged dimensions in SAP Business Objects?

I have two tables, and I want to derive a column from Table1 into Table2 using the common fields Name and Color (by merging them as dimensions).
Until now I was using only one merged dimension, and it was working well for me.
Now, with two dimensions, it is not working.
If I have two tables :
Table1 Name : Fruits
|--------|--------|---------------|
| Name | Color | Rateperunit |
|--------|--------|---------------|
| banana | yellow | 3 |
|--------|--------|---------------|
| apple | red | 25 |
|--------|--------|---------------|
| apple | green | 30 |
|--------|--------|---------------|
Table2 Name : Purchase
|--------|---------|-------------|-----------|----------|
| Item | Color | Rateperunit | Noofitems | Totalbill|
|--------|---------|-------------|-----------|----------|
| apple | green | 30 | 3 | 90 |
|--------|---------|-------------|-----------|----------|
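
For reference, the derivation being attempted corresponds in plain SQL to a join on both common fields; a sketch using the table and column names from the question (merging Name/Item and Color), purely to illustrate the intended result:
-- Reference sketch only: the SQL equivalent of the merge the report should perform
SELECT p.Item,
       p.Color,
       f.Rateperunit,
       p.Noofitems,
       f.Rateperunit * p.Noofitems AS Totalbill
FROM Purchase AS p
JOIN Fruits AS f
  ON f.Name  = p.Item
 AND f.Color = p.Color;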

Cross tab with a list of values instead of summation

I want a Crosstab that lists field values and counts them instead of just giving a count for the summation. I know I could do this with groups, but I can't list the values vertically that way. From my research I believe I have to use a Display String Formula.
SQL Field Data
-------------------------------------------------
| Play # | Formation |Back Set | R/P | PLAY |
-------------------------------------------------
| 1 | TREY | FG | R | TRUCK |
-------------------------------------------------
| 2 | T | FG | R | RHINO |
-------------------------------------------------
| 3 | D | FG | P | 5 STEP |
-------------------------------------------------
| 4 | D | FG | P | 5 STEP |
-------------------------------------------------
| 5 | K JET | NG | R | DOG |
-------------------------------------------------
Desired report structure:
-----------------------------------------------------------
| Back Set & Formation | Run | Pass |
-----------------------------------------------------------
| NG K JET | BULLA 1 | |
| | HELL 3 | |
-----------------------------------------------------------
| FG D | | 5 STEP 2 |
-----------------------------------------------------------
| NG K JET | DOG | |
-----------------------------------------------------------
| FG T | RHINO | |
-----------------------------------------------------------
I don't see why a Crosstab is necessary for this, especially if the entire body of the report is just that table.
1. Group your records by Back Set and Formation. If that's not something natively available in your table, make a new Formula field and group on that.
2. Drop the 3 relevant fields into whichever section you need to display. (It might be a Footer, depending on whether or not you want repeats.)
3. Write a formula to determine whether Run or Pass is displayed, and place it in their suppression field. (Good luck getting a Crosstab to do that for you! It tends to prefer 0s over blanks.)
If there's more to the report than just this table, you can cheat the system by placing your "table" into a subreport. And of course you can stretch Line objects across the sections and they will stretch to form the table outlines.
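
If the underlying data can be queried directly, the grouping described in the steps above can be previewed in SQL first; a hypothetical sketch, assuming the source table is named plays and has the column names shown in the question:
-- Hypothetical sketch: play counts per Back Set + Formation group,
-- split into the Run and Pass columns of the desired layout.
SELECT "Back Set",
       Formation,
       PLAY,
       COUNT(CASE WHEN "R/P" = 'R' THEN 1 END) AS run_count,
       COUNT(CASE WHEN "R/P" = 'P' THEN 1 END) AS pass_count
FROM plays
GROUP BY "Back Set", Formation, PLAY
ORDER BY "Back Set", Formation, PLAY;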

How can I combine rows in PostgreSQL using Python 2.7?

I have data of the following format:
|------------------------|
| Product | Color | Year |
|------------------------|
| Ball | Blue | 1999 |
| Ball | Blue | 2000 |
| Ball | Blue | 2001 |
| Stick | Green | 1984 |
| Stick | Green | 1985 |
|------------------------|
How can I convert this into the following:
|-----------------------------|
| Product | Color | Year Range|
|-----------------------------|
| Ball | Blue | 1999-2001 |
| Stick | Green | 1984-1985 |
|-----------------------------|
The data is in a PostgreSQL table and contains 187,000+ rows that desperately need to be consolidated in this fashion. How can I take care of this using Python 2.7?
The data is in a PostgreSQL table and contains 187,000+ rows that
desperately need to be consolidated in this fashion.
It might desperately need to be consolidated that way for reporting, but it almost certainly does not need to be consolidated that way for storage. Step lightly here.
You can get the data in roughly that format just with a GROUP BY clause. (I used "product_color_years" as the table name.)
select product, color, min(year), max(year)
from product_color_years
group by product, color
To consolidate the years into a single column, use the concatenation operator.
select product, color, min(year) || '-' || max(year) year_range
from product_color_years
group by product, color
This works only as long as
there aren't any gaps in the year range, or
there are gaps, but you don't care.
If there are gaps that you'd like to see reported like this:
product color year_range
--
Ball Blue 1999-2001
Ball Blue 2003-2005
Stick Mauve 2000, 2010
then you're probably better off using a report writer. (For example, Google "python reports".) The SQL above will report these blue balls as Ball Blue 1999-2005, which might not be what you want.
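
For completeness, gaps can also be handled directly in SQL with a gaps-and-islands grouping. A sketch, assuming year is an integer and there is at most one row per product, color and year:
-- Sketch: consecutive years get the same grp value, so every contiguous
-- run of years becomes its own output row.
SELECT product,
       color,
       min(year) || '-' || max(year) AS year_range
FROM (
    SELECT product,
           color,
           year,
           year - row_number() OVER (PARTITION BY product, color ORDER BY year) AS grp
    FROM product_color_years
) runs
GROUP BY product, color, grp
ORDER BY product, color, min(year);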