I have a Tableau table with columns like this:
| ID | ww | count_flag |
| 1 | ww1 | 0 |
| 1 | ww2 | 1 |
| 1 | ww3 | 1 |
| 1 | ww4 | 0 |
| 1 | ww5 | 1 |
| 2 | ww1 | 1 |
| 2 | ww2 | 1 |
| 2 | ww3 | 1 |
| 2 | ww4 | 0 |
| 2 | ww5 | 1 |
...
Now I'd like to add a new column showing the consistent status for each ID across all ww (workweeks). The consistent status is a running count that resets whenever count_flag is 0 or the ID changes, so it would look like this:
| ID | ww | count_flag | consistent status |
| 1 | ww1 | 0 | 0 |
| 1 | ww2 | 1 | 1 |
| 1 | ww3 | 1 | 2 |
| 1 | ww4 | 0 | 0 |
| 1 | ww5 | 1 | 1 |
| 2 | ww1 | 1 | 1 |
| 2 | ww2 | 1 | 2 |
| 2 | ww3 | 1 | 3 |
| 2 | ww4 | 0 | 0 |
| 2 | ww5 | 1 | 1 |
...
How should I create a calculated field that adds such a column to the table?
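The reset rule described above can be sketched outside Tableau to make the expected values explicit. This is only an illustration in Python, not a Tableau formula: the function name `consistent_status` is hypothetical, and it assumes the rows are already sorted by ID and then by ww.

```python
def consistent_status(rows):
    """Return the running count per row.

    rows: list of (id, ww, count_flag) tuples, assumed sorted by id, then ww.
    The count restarts when the id changes and drops to 0 when count_flag is 0.
    """
    statuses = []
    prev_id = None
    streak = 0
    for row_id, _ww, flag in rows:
        if row_id != prev_id:
            streak = 0          # new ID: start counting from scratch
        if flag == 0:
            streak = 0          # a 0 flag resets the count
        else:
            streak += 1         # a 1 flag extends the current run
        statuses.append(streak)
        prev_id = row_id
    return statuses


# Sample rows copied from the question's table.
sample = [
    (1, "ww1", 0), (1, "ww2", 1), (1, "ww3", 1), (1, "ww4", 0), (1, "ww5", 1),
    (2, "ww1", 1), (2, "ww2", 1), (2, "ww3", 1), (2, "ww4", 0), (2, "ww5", 1),
]
status = consistent_status(sample)
```

In Tableau itself the same idea would presumably be written as a table calculation computed along ww and restarting for every ID, for example one built on PREVIOUS_VALUE, but that formula is an assumption and is not tested here.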
I have the following table:
SELECT * FROM trips_motion_xtics
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
| session_id | trip_id | lat_start | lat_end | lon_start | lon_end | alt | distance | segments_length | speed | acceleration | track | travel_mode |
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
| 652 | 303633 | 41.1523521 | 41.1524966 | -8.6097233 | -8.6096833 | 0 | 42.7424443438547 | 28.0353622436523 | 0 | 74.208 | 0 | foot |
| 652 | 303633 | 41.1523521 | 41.1524966 | -8.6097233 | -8.6096833 | 0 | 42.7424443438547 | 28.0353622436523 | 0 | 74.154 | 0 | foot |
| 652 | 303633 | 41.1523521 | 41.1524966 | -8.6097233 | -8.6096833 | 0 | 42.7424443438547 | 28.0353622436523 | 0 | 68.226 | 0 | foot |
| 656 | 303637 | 41.14454009 | 41.1631127 | -8.56292593 | -8.5870161 | 0 | 5921.07030809987 | 2785.6088546142 | 0 | 99.028 | 0 | car |
| 656 | 303637 | 41.14454009 | 41.1631127 | -8.56292593 | -8.5870161 | 0 | 5921.07030809987 | 2785.6088546142 | 0 | 109.992 | 0 | car |
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
Now I would like to compute the average of the columns alt, distance, speed, ... for each unique combination of session_id, trip_id, lat_start, ...
Query:
SELECT DISTINCT(session_id, trip_id, lat_start, lat_end, lon_start, lon_end, travel_mode), AVG(alt) AS avg_alt, AVG(distance) AS avg_disntance, AVG(speed) AS avg_speed, AVG(acceleration) AS avg_acc FROM akil.trips_motion_xtics;
ERROR: column "trips_motion_xtics.session_id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT DISTINCT(session_id, trip_id, lat_start, lat_end, lon...
Required result:
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
| session_id | trip_id | lat_start | lat_end | lon_start | lon_end | alt | distance | segments_length | speed | acceleration | track | travel_mode |
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
| 652 | 303633 | 41.1523521 | 41.1524966 | -8.6097233 | -8.6096833 | 0 | 42.7424443438547 | 28.0353622436523 | 0 | 72.196 | 0 | foot |
| 656 | 303637 | 41.14454009 | 41.1631127 | -8.56292593 | -8.5870161 | 0 | 5921.07030809987 | 2785.6088546142 | 0 | 104.51 | 0 | car |
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
You want aggregation, not DISTINCT. GROUP BY returns one row for each unique combination of the columns listed in the GROUP BY clause, and you can apply aggregate functions (such as AVG()) to the other columns:
SELECT
    session_id,
    trip_id,
    lat_start,
    lat_end,
    lon_start,
    lon_end,
    travel_mode,
    AVG(alt) AS avg_alt,
    AVG(distance) AS avg_distance,
    AVG(speed) AS avg_speed,
    AVG(acceleration) AS avg_acc
FROM akil.trips_motion_xtics
GROUP BY
    session_id,
    trip_id,
    lat_start,
    lat_end,
    lon_start,
    lon_end,
    travel_mode;
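To check the GROUP BY approach against the sample rows, here is a small sketch that runs the same kind of query in an in-memory SQLite database from Python. SQLite stands in for PostgreSQL here, the schema prefix is dropped, and the segments_length and track columns are left out for brevity; all of those are simplifications of mine, not part of the original setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE trips_motion_xtics (
        session_id INTEGER, trip_id INTEGER,
        lat_start REAL, lat_end REAL, lon_start REAL, lon_end REAL,
        alt REAL, distance REAL, speed REAL, acceleration REAL,
        travel_mode TEXT
    )
    """
)

# Rows copied from the question (segments_length and track omitted).
foot = (652, 303633, 41.1523521, 41.1524966, -8.6097233, -8.6096833,
        0, 42.7424443438547, 0)
car = (656, 303637, 41.14454009, 41.1631127, -8.56292593, -8.5870161,
       0, 5921.07030809987, 0)
rows = [
    foot + (74.208, "foot"),
    foot + (74.154, "foot"),
    foot + (68.226, "foot"),
    car + (99.028, "car"),
    car + (109.992, "car"),
]
conn.executemany(
    "INSERT INTO trips_motion_xtics VALUES (?,?,?,?,?,?,?,?,?,?,?)", rows
)

# One output row per unique combination of the grouped columns.
averages = conn.execute(
    """
    SELECT session_id, trip_id, travel_mode,
           AVG(alt) AS avg_alt,
           AVG(distance) AS avg_distance,
           AVG(speed) AS avg_speed,
           AVG(acceleration) AS avg_acc
    FROM trips_motion_xtics
    GROUP BY session_id, trip_id, lat_start, lat_end,
             lon_start, lon_end, travel_mode
    ORDER BY session_id
    """
).fetchall()

for row in averages:
    print(row)
```

The two resulting rows carry average accelerations of roughly 72.196 and 104.51, which matches the required result in the question.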
I have this table in my database:
| id | desc |
|-------------|
| 1 | A |
| 2 | B |
| NULL | C |
| 3 | D |
| NULL | D |
| NULL | E |
| 4 | F |
---------------
I want to transform this table so that the NULLs are replaced by consecutive negative ids:
| id | desc |
|-------------|
| 1 | A |
| 2 | B |
| -1 | C |
| 3 | D |
| -2 | D |
| -3 | E |
| 4 | F |
---------------
Does anyone know how I can do this in Hive?
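The approach below works. Note that the window has no ORDER BY, so the order in which the negative ids are assigned to the NULL rows is not guaranteed:
SELECT COALESCE(id, CONCAT('-', ROW_NUMBER() OVER (PARTITION BY id))) AS id, desc
FROM database_name.table_name;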
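The same COALESCE plus ROW_NUMBER idea can be tried out in an in-memory SQLite database from Python (SQLite 3.25 or later is needed for window functions). This is a sketch under assumptions rather than Hive code: the `pos` column is a hypothetical ordering column added so the negative ids come out in a deterministic order, the column is named `descr` to avoid the keyword, and the row number is negated numerically instead of being concatenated with '-'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# pos is a hypothetical column that records the intended row order.
conn.execute("CREATE TABLE t (pos INTEGER PRIMARY KEY, id INTEGER, descr TEXT)")
conn.executemany(
    "INSERT INTO t (id, descr) VALUES (?, ?)",
    [(1, "A"), (2, "B"), (None, "C"), (3, "D"), (None, "D"), (None, "E"), (4, "F")],
)

# All NULL ids fall into one partition, so ROW_NUMBER() numbers them 1, 2, 3.
# COALESCE keeps real ids and substitutes the negated row number for NULLs.
filled = conn.execute(
    """
    SELECT COALESCE(id, -ROW_NUMBER() OVER (PARTITION BY id ORDER BY pos)) AS id,
           descr
    FROM t
    ORDER BY pos
    """
).fetchall()

for row in filled:
    print(row)
```

Ordering the window by a real column is what makes the assignment of -1, -2, -3 repeatable; without it any of the NULL rows could receive any of the three values.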
I want to count the occurrences of particular values in a certain field for an ID. So what I have is this:
| Location ID | Group |
|:----------- |:---------|
| 1 | Group A |
| 2 | Group B |
| 3 | Group C |
| 4 | Group A |
| 4 | Group B |
| 4 | Group C |
| 3 | Group A |
| 2 | Group B |
| 1 | Group C |
| 2 | Group A |
And what I would hope to yield through some computer magic is this:
| Location ID | Group A Count | Group B Count | Group C Count |
|:----------- |:--------------|:--------------|:-------------|
| 1 | 1 | 0 | 1 |
| 2 | 1 | 2 | 0 |
| 3 | 1 | 0 | 1 |
| 4 | 1 | 1 | 1 |
Is there some sort of pivoting function I can use in Redshift to achieve this?
This requires a CASE expression combined with a GROUP BY clause, as in this example:
SELECT l_id,
       SUM(CASE WHEN l_group = 'Group A' THEN 1 ELSE 0 END) AS a,
       SUM(CASE WHEN l_group = 'Group B' THEN 1 ELSE 0 END) AS b -- and so on
FROM location
GROUP BY l_id;
This should give you a result like this:
| l_id | a | b |
|------|---|---|
| 4 | 1 | 1 |
| 1 | 1 | 0 |
| 3 | 1 | 0 |
| 2 | 1 | 2 |
You can play with it on this SQL Fiddle.
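For completeness, here is a sketch of the full three-group version run against the question's data, using an in-memory SQLite database from Python as a stand-in for Redshift. The table and column names (`location`, `l_id`, `l_group`) follow the answer above; the output aliases are my own choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE location (l_id INTEGER, l_group TEXT)")
# Rows copied from the question, in the order they appear there.
conn.executemany(
    "INSERT INTO location VALUES (?, ?)",
    [
        (1, "Group A"), (2, "Group B"), (3, "Group C"), (4, "Group A"),
        (4, "Group B"), (4, "Group C"), (3, "Group A"), (2, "Group B"),
        (1, "Group C"), (2, "Group A"),
    ],
)

# Conditional aggregation: each SUM counts only the rows of one group.
pivot = conn.execute(
    """
    SELECT l_id,
           SUM(CASE WHEN l_group = 'Group A' THEN 1 ELSE 0 END) AS group_a_count,
           SUM(CASE WHEN l_group = 'Group B' THEN 1 ELSE 0 END) AS group_b_count,
           SUM(CASE WHEN l_group = 'Group C' THEN 1 ELSE 0 END) AS group_c_count
    FROM location
    GROUP BY l_id
    ORDER BY l_id
    """
).fetchall()

for row in pivot:
    print(row)
```

The ORDER BY is added only so the rows come back in a predictable order; without it the row order is unspecified, as in the result shown above.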
+----+------------------+-----------------+
| id | template_type_id | url |
+----+------------------+-----------------+
| 1 | 1 | text |
| 2 | 2 | text |
| 3 | 1 | text |
| 4 | 1 | text |
| 5 | 1 | text |
| 6 | 1 | text |
| 7 | 1 | text |
| 8 | 1 | text |
| 9 | 2 | text |
| 10 | 2 | text |
+----+------------------+-----------------+
As I am using a 1-page template and a 2-page template, I need to reorder the above result according to the 1-page and 2-page layout, as below:
+----+------------------+-----------------+
| id | template_type_id | url |
+----+------------------+-----------------+
| 1 | 1 | text |
| 3 | 1 | text |
| 2 | 2 | text |
| 4 | 1 | text |
| 5 | 1 | text |
| 6 | 1 | text |
| 7 | 1 | text |
| 9 | 2 | text |
| 10 | 2 | text |
| 8 | 1 | text |
+----+------------------+-----------------+
+------------------------------------------+
---------------- ------------------
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
---------------- ------------------
+------------------------------------------+
Assuming there is a publish_date column in the table that is not shown, and that its values are consistent with the ordering of the records in examples 1 and 2, I suggest:
order by publish_date, template_type_id
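A sketch of how that ORDER BY would behave, run in an in-memory SQLite database from Python. The publish_date values below are entirely hypothetical: they were picked so that they agree with the desired ordering, which is exactly the assumption the suggestion rests on. The table name `templates` and the extra `id` tie-breaker are also my own additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE templates ("
    "id INTEGER, template_type_id INTEGER, url TEXT, publish_date TEXT)"
)
# Hypothetical dates: ids 1-3 share a date, ids 4-7, 9, 10 share the next,
# and id 8 is published last.
conn.executemany(
    "INSERT INTO templates VALUES (?, ?, 'text', ?)",
    [
        (1, 1, "2020-01-01"), (2, 2, "2020-01-01"), (3, 1, "2020-01-01"),
        (4, 1, "2020-01-02"), (5, 1, "2020-01-02"), (6, 1, "2020-01-02"),
        (7, 1, "2020-01-02"), (8, 1, "2020-01-03"), (9, 2, "2020-01-02"),
        (10, 2, "2020-01-02"),
    ],
)

# Within each date, 1-page templates sort before 2-page templates;
# id breaks remaining ties so the output is deterministic.
ordered_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM templates ORDER BY publish_date, template_type_id, id"
    )
]
print(ordered_ids)
```

If the real publish_date values do not line up with the desired grouping, this ordering will not reproduce the second table, so the column's contents should be checked first.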