Using PostgreSQL 9.3, I have a table that shows individual permits issued across a single year:
permit_typ| zipcode| address| name
-------------+------+------+-----
CONSTRUCTION | 20004 | 124 fake streeet | billy joe
SUPPLEMENTAL | 20005 | 124 fake streeet | james oswald
POST CARD | 20005 | 124 fake streeet | who cares
HOME OCCUPATION | 20007 | 124 fake streeet | who cares
SHOP DRAWING | 20009 | 124 fake streeet | who cares
I am trying to flatten this so it looks like
CONSTRUCTION | SUPPLEMENTAL | POST CARD| HOME OCCUPATION | SHOP DRAWING | zipcode
-------------+--------------+-----------+----------------+--------------+--------
1 | 2 | 3 | 5 | 6 | 20004
1 | 2 | 3 | 5 | 6 | 20005
1 | 2 | 3 | 5 | 6 | 20006
1 | 2 | 3 | 5 | 6 | 20007
1 | 2 | 3 | 5 | 6 | 20008
I have been trying to use crosstab, but it's a bit beyond my rusty SQL experience. Does anybody have any ideas?
I usually approach this type of query using conditional aggregation. In Postgres, you can do:
select zipcode,
sum( (permit_typ = 'CONSTRUCTION')::int) as Construction,
sum( (permit_typ = 'SUPPLEMENTAL')::int) as SUPPLEMENTAL,
. . .
from t
group by zipcode;
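As a rough, runnable illustration of the conditional aggregation above, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example is self-contained). It swaps the Postgres-specific `::int` cast for the portable `SUM(CASE ...)` form, covers only three of the permit types, and keeps the table name `t` from the answer.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (permit_typ TEXT, zipcode TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("CONSTRUCTION", "20004"),
        ("SUPPLEMENTAL", "20005"),
        ("POST CARD", "20005"),
        ("HOME OCCUPATION", "20007"),
        ("SHOP DRAWING", "20009"),
    ],
)

# One SUM(CASE ...) per permit type; each counts the matching rows per zipcode.
query = """
SELECT zipcode,
       SUM(CASE WHEN permit_typ = 'CONSTRUCTION' THEN 1 ELSE 0 END) AS construction,
       SUM(CASE WHEN permit_typ = 'SUPPLEMENTAL' THEN 1 ELSE 0 END) AS supplemental,
       SUM(CASE WHEN permit_typ = 'POST CARD' THEN 1 ELSE 0 END) AS post_card
FROM t
GROUP BY zipcode
ORDER BY zipcode
"""
pivot_rows = conn.execute(query).fetchall()
for row in pivot_rows:
    print(row)
```

Each permit type becomes one output column, so the list of types has to be known when the query is written.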
I'm trying to recreate a simple SQL query in DAX. The output query needs to work in Power BI Report Builder, and I have spent all day reading Power BI / DAX online resources trying to rewrite it.
A little bit about the data:
The data is structured in three tables, CustomCar, Engine and Chassis.
Basically "CarId" is the key that connects all three tables.
Assume each table has more than 20 columns, so only a few of the columns are needed in the final output.
All three tables (CustomCar, Chassis and Engine) have an IsActive property. The relationship from Engine and Chassis to CustomCar is many-to-one: an engine might blow up and get replaced, so we want to track which engine is on the car today and which one was on it last year. At any time, however, there is only one active engine for each car. The same goes for Chassis.
Both Engine and Chassis have 'Manufacturer' and 'Model' columns, so the output query needs to distinguish them from each other.
I am not trying to sum any sort of sales number, just a list of cars with their current configuration.
Any help is appreciated.
Select
CC.Name, CC.Model as 'CustomCarModel', CC.MaxSpeed,
Ch.Manufacturer as 'ChassisManufacturer', Ch.Model as 'ChassisModel', Ch.ManufacturedDate as 'ChassisManfDate',
E.Manufacturer as 'EngineManufacturer', E.Model as 'EngineModel', E.Power, E.CylCount, E.ManufacturedDate
From CustomCars CC
Join Chassis Ch on Ch.CarID = CC.CarId
Join Engine E on E.CarID = CC.CarID
where
CC.IsActive = 1 and CC.FirstTestDriveYear < 1980 and
Ch.IsActive = 1 and
E.IsActive = 1
More info, here are my tables.
Classic Car:
CarId (Primary Key) | Model | MaxSpeed | NumOfPax | TankCapacity | IsActive | FirstTestDriveYear |....
1 | SuperChev | 220 | 2 | 60 | 1 | 1985 |
2 | CustomBranco | 185 | 2 | 90 | 1 | 1979 |
3 | RebuiltToyo | 251 | 4 | 20 | 0 | 1990 |
Chassis:
ChassisId (Primary Key) | CarId (Foreign Key)| IsActive | Manufacturer | Model | ManufacturedDate | ...
1 | 1 | 0 | ACME Chassis | M1 | '04-Jan-1985' | ...
2 | 1 | 1 | SuperChassis | T5 | '03-Feb-1987' | ...
3 | 2 | 0 | Ford | S2 | '25-Mar-1965' | ...
4 | 2 | 0 | Ford | S2 | '25-Mar-1968' | ...
5 | 3 | 0 | JapanChass | X123 | '25-Feb-1988' | ...
6 | 2 | 1 | Ford | S8 | '08-Jul-1978' | ...
7 | 2 | 0 | Ford | S2 | '25-Mar-1968' | ...
8 | 3 | 1 | JapanChass | Y765 | '25-Feb-1992' | ...
Engine:
EngineId (Primary Key) | CarId (Foreign Key)| IsActive | Manufacturer | Model | ManufacturedDate | Power | CylCount | ...
1 | 1 | 0 | GM | AB1 | '04-Jan-1985' | 320 | 8 | ...
2 | 1 | 1 | Bently | ZY2 | '03-Feb-1987' | 285 | 8 | ...
3 | 2 | 0 | Ford | S2 | '25-Mar-1965' | 290 | 6 | ...
4 | 2 | 0 | Ford | S2 | '25-Mar-1968' | 292 | 6 | ...
5 | 3 | 0 | Toyota | X123 | '25-Feb-1988' | 180 | 4 | ...
6 | 2 | 1 | Ford | S8 | '08-Jul-1978' | 222 | 8 | ...
7 | 2 | 0 | Ford | S2 | '25-Mar-1968' | 320 | 8 | ...
8 | 3 | 1 | Toyota | Y765 | '25-Feb-1992' | 211 | 6 | ...
I have found a workaround for this: I added the query when setting up the data pipeline in the Power BI dashboard, and I will use the values from the query as is.
I have a need to concatenate strings in the same field based on id in Informix. I realize this can be done easily in MSSQL.
Here is an example of my current table:
id | doc_num | page_num | description
-------------------------------------------------
1 | 1 | 1 | This is the story about
1 | 1 | 2 | a girl named Daisy.
1 | 2 | 1 | Daisy had a dog named
1 | 2 | 2 | Rover.
2 | 1 | 1 | This story is about Bob.
2 | 2 | 1 | Bob is a DBA who works
2 | 2 | 2 | at an important company
2 | 2 | 3 | that develops important
2 | 2 | 4 | software.
Desired output:
id | description
------------------------------------------------------------
1 | This is a story about a girl named Daisy.
| Daisy has a dog named Rover.
------------------------------------------------------------
2 | This story is about Bob. Bob is a DB who works at an
| important company that develops important software.
------------------------------------------------------------
I found my answer here:
https://dba.stackexchange.com/questions/65101/multiple-table-rows-in-one-row-informix
Since I am running Informix 12, it works using rank() over() sys_connect_by_path().
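For comparison, when the rows can be fetched into an application, the same concatenation can be done client-side. The sketch below is not the Informix approach from the linked answer; it is a plain Python alternative that assumes the rows arrive as (id, doc_num, page_num, description) tuples. The function name `concat_by_id` is hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question: (id, doc_num, page_num, description).
rows = [
    (1, 1, 1, "This is the story about"),
    (1, 1, 2, "a girl named Daisy."),
    (1, 2, 1, "Daisy had a dog named"),
    (1, 2, 2, "Rover."),
    (2, 1, 1, "This story is about Bob."),
    (2, 2, 1, "Bob is a DBA who works"),
    (2, 2, 2, "at an important company"),
    (2, 2, 3, "that develops important"),
    (2, 2, 4, "software."),
]


def concat_by_id(rows):
    """Join descriptions per id, ordered by doc_num and then page_num."""
    # groupby only merges adjacent items, so sort by (id, doc_num, page_num) first.
    ordered = sorted(rows, key=itemgetter(0, 1, 2))
    return {
        key: " ".join(r[3] for r in group)
        for key, group in groupby(ordered, key=itemgetter(0))
    }


combined = concat_by_id(rows)
for key, text in combined.items():
    print(key, "|", text)
```

Sorting before grouping is what guarantees the fragments are joined in document and page order.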
My Situation
I have some tables in my Redshift cluster that all break down into either an order_id, shipment_id, or shipment_item_id, depending on how granular the table is. order_id has a one-to-many relationship to shipment_id, and shipment_id has a one-to-many relationship to shipment_item_id.
My Question
I distribute on order_id, so all shipment_id and shipment_item_id records should be on the same nodes across the tables, since they are grouped by order_id. My question is: when I have to join on shipment_id or shipment_item_id, will Redshift know that the records are on the same nodes, or will it still broadcast the tables because they aren't joined on order_id?
Example Tables
unified_order shipment_details
+----------+-------------+------------------+ +-------------+-----------+--------------+
| order_id | shipment_id | shipment_item_id | | shipment_id | ship_day | ship_details |
+----------+-------------+------------------+ +-------------+-----------+--------------+
| 1 | 1 | 1 | | 1 | 1/1/2017 | stuff |
| 1 | 1 | 2 | | 2 | 5/1/2017 | other stuff |
| 1 | 1 | 3 | | 3 | 6/14/2017 | more stuff |
| 1 | 2 | 4 | | 4 | 5/13/2017 | less stuff |
| 1 | 2 | 5 | | 5 | 6/19/2017 | that stuff |
| 1 | 3 | 6 | | 6 | 7/31/2017 | what stuff |
| 2 | 4 | 7 | | 7 | 2/5/2017 | things |
| 2 | 4 | 8 | +-------------+-----------+--------------+
| 3 | 5 | 9 |
| 3 | 5 | 10 |
| 4 | 6 | 11 |
| 5 | 7 | 12 |
| 5 | 7 | 13 |
+----------+-------------+------------------+
Distribution
distribution_by_node
+------+----------+-------------+------------------+
| node | order_id | shipment_id | shipment_item_id |
+------+----------+-------------+------------------+
| 1 | 1 | 1 | 1 |
| 1 | 1 | 1 | 2 |
| 1 | 1 | 1 | 3 |
| 1 | 1 | 2 | 4 |
| 1 | 1 | 2 | 5 |
| 1 | 1 | 3 | 6 |
| 1 | 5 | 7 | 12 |
| 1 | 5 | 7 | 13 |
| 2 | 2 | 4 | 7 |
| 2 | 2 | 4 | 8 |
| 3 | 3 | 5 | 9 |
| 3 | 3 | 5 | 10 |
| 4 | 4 | 6 | 11 |
+------+----------+-------------+------------------+
The Amazon Redshift documentation does not go into detail about how information is shared between nodes, but it is doubtful that it "broadcasts the tables".
Rather, information is probably sent between nodes based on need -- only the relevant columns would be shared, and possibly only sub-ranges of the data.
Rather than worrying too much about the internal implementation, you should test various DISTKEY and SORTKEY strategies against real queries to determine performance.
Follow the recommendations from Choose the Best Distribution Style to minimize the amount of data that needs to be sent between nodes and consult Amazon Redshift Best Practices for Designing Queries to improve queries.
You can EXPLAIN your query to see how data will be distributed (or not) during the execution. In this doc you'll see how to read the query plan:
Evaluating the Query Plan
I have a data source connected to Crystal Reports for Enterprise that has this structure:
--------------------------------------
| OrderNo | Level | Customer | Price |
|=========|=======|==========|=======|
| 1 | 1 | Cus_A | 70.00 |
| 2 | 1 | Cus_A | 78.30 |
| 3 | 2 | Cus_B | 24.50 |
| 4 | 2 | Cus_B | 14.50 |
| 5 | 2 | Cus_B | 17.50 |
--------------------------------------
Now, for my page header, I need the customers where Level is 1 and 2, so that it looks like this:
I. Cus_A
II. Cus_B
Does anybody have an idea how to get these Customer values into two variables without using a subreport to calculate them?
Thanks!
I have this situation: I have one offer, and that offer has n dates and n options, so I have two additional tables for the offer. There is also a third one, price, but the price depends on both the date and the offer. It looks like this:
| | date 1 | date 2 | date 3 |
| offer 1 | price 11 | price 12 | price 13 |
| offer 2 | price 21 | price 22 | price 23 |
| offer 3 | price 31 | price 32 | price 33 |
Is there any way to create a custom TCA field to insert all of these price values at once?
So, basically, I need one table of input fields that also stores the uid of the date and the offer as references.
Make more than one table. Tables with a dynamic column count are horribly hard to maintain.
Table Offer:
uid | Name | Desc
1 | offer1 | This is some cool shit
2 | offer2 | dsadsad
3 | offer3 | sdadsdsadsada
Table Date:
uid | date
1 | 12.02.2014
2 | 12.03.2014
3 | 20.03.2014
Table Prices:
uid | date | offer | price
1 | 1 | 1 | price11
2 | 1 | 2 | price21
3 | 1 | 3 | price31
4 | 2 | 1 | price12
5 | 2 | 2 | price22
6 | 2 | 3 | price32
7 | 3 | 1 | price13
8 | 3 | 2 | price23
9 | 3 | 3 | price33
And then it's straightforward...
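To make the three-table layout concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is independent of TYPO3 and TCA and only illustrates the relational model. The table names `offers`, `dates`, `prices` and the column names `offer_uid`, `date_uid` are assumptions chosen for the example, and the prices are kept as the placeholder strings from the tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE offers (uid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dates  (uid INTEGER PRIMARY KEY, date TEXT);
    -- One row per (date, offer) pair instead of one column per date.
    CREATE TABLE prices (
        uid       INTEGER PRIMARY KEY,
        date_uid  INTEGER REFERENCES dates(uid),
        offer_uid INTEGER REFERENCES offers(uid),
        price     TEXT
    );
    """
)
conn.executemany(
    "INSERT INTO offers VALUES (?, ?)",
    [(1, "offer1"), (2, "offer2"), (3, "offer3")],
)
conn.executemany(
    "INSERT INTO dates VALUES (?, ?)",
    [(1, "12.02.2014"), (2, "12.03.2014"), (3, "20.03.2014")],
)
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, "price11"), (2, 1, 2, "price21"), (3, 1, 3, "price31"),
        (4, 2, 1, "price12"), (5, 2, 2, "price22"), (6, 2, 3, "price32"),
        (7, 3, 1, "price13"), (8, 3, 2, "price23"), (9, 3, 3, "price33"),
    ],
)

# Join the price rows back to their offer and date, then rebuild the grid.
query = """
SELECT o.name, d.date, p.price
FROM prices p
JOIN offers o ON o.uid = p.offer_uid
JOIN dates  d ON d.uid = p.date_uid
ORDER BY o.uid, d.uid
"""
matrix = {}
for offer_name, date_value, price in conn.execute(query):
    matrix.setdefault(offer_name, {})[date_value] = price

for offer_name, by_date in matrix.items():
    print(offer_name, by_date)
```

Adding a new date or offer then means inserting rows, not altering the table structure.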