I am trying to list where all users can go using BETWEEN, but it doesn't work properly - PostgreSQL

My code to identify where one user can go works properly, but I want to list where all users can go. For that I tried BETWEEN ... AND, but it did not work as expected.
Code: where ONE USER can go:
SELECT place_name, user_id, user_name
FROM schema.place, schema.person
WHERE schema.place_id NOT IN(
SELECT place_id
FROM went_to
WHERE went_to.user_id = 1
AND age(date) <= interval '4 months'
)
AND user_id=1
Result of the working query (screenshot not included): 40 rows in total, the places the user with id 1 can go.
Code: where ALL USERS can go:
SELECT place_name, user_id, user_name
FROM schema.place, schema.person
WHERE schema.place_id NOT IN(
SELECT place_id
FROM went_to
WHERE went_to.user_id BETWEEN 1 AND 15
AND age(date) <= interval '4 months'
)
AND user_id BETWEEN 1 AND 15
ORDER BY user_id
Result of the query that does not work (screenshot not included): it should still show a total of 40 rows for the user with id 1 (the places that user can go).
When I narrow the BETWEEN range, the result gets closer to the right answer, but it still isn't right.
What am I doing wrong with BETWEEN?
The tables:
CREATE TABLE schema.place (
place_id VARCHAR(8),
place_name VARCHAR (50),
CONSTRAINT pk_place_id PRIMARY KEY (place_id)
);
CREATE TABLE schema.user (
user_id VARCHAR(3),
user_name VARCHAR (50),
CONSTRAINT pk_user_id PRIMARY KEY (user_id)
);
CREATE TABLE schema.visit (
user_id VARCHAR(3),
place_id VARCHAR(8),
data DATE,
CONSTRAINT pk_user_id FOREIGN KEY (user_id) REFERENCES SCHEMA.user,
CONSTRAINT pk_place_id FOREIGN KEY (place_id) REFERENCES code.place,
EXCLUDE USING gist (pk_user_id WITH =, daterange(data, (data + interval '6 months')::date) WITH &&)
);

Seeing the schema would be helpful, but I believe the issue is in how you constructed your query.
SELECT place_id
FROM went_to
WHERE went_to.user_id BETWEEN 1 AND 15
AND age(date) <= interval '4 months'
Taken on its own, the subquery returns every place id that any of users 1-15 visited in the last 4 months. The outer query then returns only the place/user combinations whose place is not in that list. The issue is that you pool the places visited by all of those users into a single exclusion list, when what you really want is to exclude, for each user, only the places that particular user visited.
I think you want something like this:
SELECT schema.place.place_name, schema.user.user_id, schema.user.user_name
FROM schema.place, schema.user
WHERE (schema.place.place_id, schema.user.user_id) NOT IN(
SELECT place_id, schema.visit.user_id
FROM schema.visit
WHERE schema.visit.user_id::int BETWEEN 1 AND 15
AND age(data) <= interval '4 months'
)
AND user_id::int BETWEEN 1 AND 15
ORDER BY schema.user.user_id
Your schema has the ids as varchars rather than ints, and the date column is called data, so I had to make some tweaks (the ::int casts and age(data)).
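To make the difference between the pooled exclusion and the per-user exclusion concrete, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for PostgreSQL. The table names (place, person, visit), the sample rows, and the omission of the 4-month date filter are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE place  (place_id TEXT PRIMARY KEY, place_name TEXT);
    CREATE TABLE person (user_id  TEXT PRIMARY KEY, user_name  TEXT);
    CREATE TABLE visit  (user_id TEXT, place_id TEXT);
    INSERT INTO place  VALUES ('p1', 'Park'), ('p2', 'Museum'), ('p3', 'Zoo');
    INSERT INTO person VALUES ('1', 'Ann'), ('2', 'Bob');
    -- Hypothetical data: Ann visited the Park, Bob visited the Museum.
    INSERT INTO visit  VALUES ('1', 'p1'), ('2', 'p2');
""")

# Pooled exclusion (the question's approach): a place visited by ANY user
# is removed for EVERY user.
pooled = conn.execute("""
    SELECT person.user_id, place.place_name
    FROM place, person
    WHERE place.place_id NOT IN (SELECT place_id FROM visit)
    ORDER BY person.user_id, place.place_name
""").fetchall()

# Per-user exclusion (the answer's approach): only the exact
# (place, user) pairs that appear in visit are removed.
per_user = conn.execute("""
    SELECT person.user_id, place.place_name
    FROM place, person
    WHERE (place.place_id, person.user_id) NOT IN (
        SELECT place_id, user_id FROM visit
    )
    ORDER BY person.user_id, place.place_name
""").fetchall()

print(pooled)    # only the Zoo survives, for both users
print(per_user)  # each user keeps every place they did not visit themselves
```

With the pooled list, adding more users can only shrink the result for everyone, which matches the symptom that a narrower BETWEEN range gets closer to the right answer.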

Related

How to use a declare statement to update a table

I have this Declare Statement
declare @ReferralLevelData table([Type of Contact] varchar(10));
insert into @ReferralLevelData values ('f2f'),('nf2f'),('Travel'),('f2f'),('nf2f'),('Travel'),('f2f'),('nf2f'),('Travel');
select (row_number() over (order by [Type of Contact]) % 3) +1 as [Referral ID]
,[Type of Contact]
from @ReferralLevelData
order by [Referral ID]
,[Type of Contact];
It does not insert into the table, so I feel this is not working as expected, i.e. it doesn't modify the table.
If it did work, I was hoping to modify the statement to make it an update.
At the moment the query just prints this result:
1 f2f
1 nf2f
1 Travel
2 f2f
2 nf2f
2 Travel
3 f2f
3 nf2f
3 Travel
EDIT:
I want to update the table to enter recurring data in groups of three.
I have a table of data that is duplicated twice within the same table, making three sets.
"ReferenceID" is the primary key. I want to group the three rows that share a ReferenceID and put the three values "f2f", "NF2F", and "Travel" into the column called "Type", in any order, ensuring that each ReferenceID has each of those values only once.
Do you mean the following?
declare @ReferralLevelData table(
[Referral ID] int,
[Type of Contact] varchar(10)
);
insert into @ReferralLevelData([Referral ID],[Type of Contact])
select
(row_number() over (order by [Type of Contact]) % 3) +1 as [Referral ID]
,[Type of Contact]
from
(
values ('f2f'),('nf2f'),('Travel'),('f2f'),('nf2f'),('Travel'),('f2f'),('nf2f'),('Travel')
) v([Type of Contact]);
If it suits you, you can also use the following query to generate the data:
select r.[Referral ID],ct.[Type of Contact]
from
(
values ('f2f'),('nf2f'),('Travel')
) ct([Type of Contact])
cross join
(
values (1),(2),(3)
) r([Referral ID]);
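The cross join above pairs every referral id with every contact type. A minimal sketch of the same idea, run through Python's standard-library sqlite3 module instead of SQL Server, is shown here; the CTE names ct and r mirror the aliases in the answer, while the snake_case column names are an assumption made to keep the SQLite syntax simple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two small inline tables, cross-joined: 3 referral ids x 3 contact types.
rows = conn.execute("""
    WITH ct(type_of_contact) AS (VALUES ('f2f'), ('nf2f'), ('Travel')),
         r(referral_id)      AS (VALUES (1), (2), (3))
    SELECT r.referral_id, ct.type_of_contact
    FROM r CROSS JOIN ct
    ORDER BY r.referral_id, ct.type_of_contact
""").fetchall()

for referral_id, type_of_contact in rows:
    print(referral_id, type_of_contact)
```

Every referral id ends up with exactly one row per contact type, which is the grouping the question asks for.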

How to write a SQL query that selects rows where a column value changed from the previous row

CREATE TABLE data( id serial NOT NULL,
val integer,
plan smallint,
time timestamp without time zone,
CONSTRAINT data_pkey PRIMARY KEY (id))
WITH (OIDS=FALSE);
ALTER TABLE data
OWNER TO postgres;
-- Index: data_idx
CREATE INDEX data_idx
ON data
USING btree
(time, id);
I have a table like this:
id  val   plan  time
1   8300  1     2011-01-01
2   8300  1     2011-01-02
3   8300  2     2011-01-03
4   9600  1     2011-01-04
5   9600  2     2011-01-05
How do I select the rows where sigplan (the plan column) changed from the previous row for that siteId (the val column)?
In the example above, the query should return the rows
2011-01-03 (sigplan changed from 1 to 2 between 2011-01-02 and 2011-01-03 for 8300),
2011-01-05 (sigplan changed from 1 to 2 between 2011-01-04 and 2011-01-05 for 9600).
The table contains a lot of data, so the query should be optimized.
SELECT siteId, sigplan, MAX(server_time) FROM traffview.status_data
GROUP BY siteId, sigplan
HAVING COUNT(1) > 1 AND MAX(server_time) > 'XXXXX' AND MAX(server_time) < 'XXXXX'
The annoying part is figuring out the id of the previous row with the same siteId. After that it is pretty easy: join the table with itself.
SELECT t1.* FROM traffview.status_data t1, traffview.status_data t2
WHERE t1.sigplan != t2.sigplan
AND t2.id = (SELECT MAX(t3.id) FROM traffview.status_data t3 WHERE t3.siteId = t1.siteId AND t3.id < t1.id)
If the table is moderately (not extremely) large I would consider doing this in application code instead, or by storing the change flag in its own column when writing a new row. A subquery for each row in the table has very poor performance.
This version doesn't have a sub-query, but it assumes that consecutive rows for a siteId have consecutive IDs.
SELECT t1.*
FROM traffview.status_data AS t1, traffview.status_data AS t2
WHERE
t1.siteId = t2.siteId
AND t1.sigplan <> t2.sigplan
AND t1.id - t2.id = 1
ORDER BY
t1.server_time
When comparing with previous rows, the LAG window function does the job for you:
SELECT sub.*
FROM (
SELECT
plan AS curr_plan,
LAG(plan) OVER (PARTITION BY val ORDER BY time) AS prev_plan,
val,
time
FROM data
) sub
WHERE
sub.prev_plan IS NOT NULL AND sub.prev_plan <> sub.curr_plan;
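The LAG approach can be checked against the sample rows from the question. The following is a minimal sketch using Python's standard-library sqlite3 module (SQLite 3.25 or later is needed for window functions) rather than PostgreSQL; the table name status_data and the columns siteId, sigplan and server_time are taken from the question's prose and query, and storing dates as text is an assumption of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE status_data (
        id INTEGER PRIMARY KEY, siteId INTEGER, sigplan INTEGER, server_time TEXT
    );
    INSERT INTO status_data VALUES
        (1, 8300, 1, '2011-01-01'),
        (2, 8300, 1, '2011-01-02'),
        (3, 8300, 2, '2011-01-03'),
        (4, 9600, 1, '2011-01-04'),
        (5, 9600, 2, '2011-01-05');
""")

# LAG looks one row back within each siteId, ordered by time; the first row
# of each site has no predecessor (NULL) and is filtered out.
changed = conn.execute("""
    SELECT sub.siteId, sub.curr_plan, sub.server_time
    FROM (
        SELECT siteId, server_time, sigplan AS curr_plan,
               LAG(sigplan) OVER (PARTITION BY siteId ORDER BY server_time) AS prev_plan
        FROM status_data
    ) AS sub
    WHERE sub.prev_plan IS NOT NULL AND sub.prev_plan <> sub.curr_plan
    ORDER BY sub.server_time
""").fetchall()

print(changed)
```

Only the 2011-01-03 row for 8300 and the 2011-01-05 row for 9600 come back, which are the two rows the question expects.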

Days since last purchase postgres (for each purchase)

Just have a standard orders table:
order_id
order_date
customer_id
order_total
Trying to write a query that generates a column that shows the days since the last purchase, for each customer. If the customer had no prior orders, the value would be zero.
I have tried something like this:
WITH user_data AS (
SELECT customer_id, order_total, order_date::DATE,
ROW_NUMBER() OVER (
PARTITION BY customer_id ORDER BY order_date::DATE DESC
)
AS order_count
FROM transactions
WHERE STATUS = 100 AND order_total > 0
)
SELECT * FROM user_data WHERE order_count < 3;
I could feed this into Tableau and wrangle the data with table calculations, but I would really like to understand the SQL approach. My approach also only analyzes the two most recent transactions, which is a drawback.
Thanks
You should use the lag() function:
select *,
lag(order_date) over (partition by customer_id order by order_date)
as prior_order_date
from transactions
order by order_id
To have the number of days since last order, just subtract the prior order date from the current order date:
select *,
order_date- lag(order_date) over (partition by customer_id order by order_date)
as days_since_last_order
from transactions
order by order_id
The query selects null if there is no prior order. You can use coalesce() to change it to zero.
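The coalesce() step can be sketched as follows, using Python's standard-library sqlite3 module as a stand-in for PostgreSQL. SQLite has no date subtraction operator, so this sketch uses julianday() differences instead of the plain `order_date - lag(...)` shown above; the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (
        order_id INTEGER PRIMARY KEY,
        order_date TEXT,
        customer_id INTEGER,
        order_total INTEGER
    );
    -- Hypothetical sample orders.
    INSERT INTO transactions VALUES
        (1, '2015-01-01', 1, 20),
        (2, '2015-01-11', 1, 30),
        (3, '2015-02-01', 2, 40),
        (4, '2015-03-03', 1, 50);
""")

# For a customer's first order LAG returns NULL, so the difference is NULL
# and COALESCE turns it into 0.
rows = conn.execute("""
    SELECT order_id, customer_id,
           COALESCE(
               CAST(julianday(order_date)
                    - julianday(LAG(order_date) OVER (
                          PARTITION BY customer_id ORDER BY order_date))
                    AS INTEGER),
               0) AS days_since_last_order
    FROM transactions
    ORDER BY order_id
""").fetchall()

print(rows)
```

In PostgreSQL itself the equivalent expression would wrap the date subtraction directly, along the lines of `coalesce(order_date - lag(order_date) over (...), 0)`, assuming order_date is a date column.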
You indicated that you need to calculate the number of days since the last purchase:
"Trying to write a query that generates a column that shows the days since the last purchase"
So, basically, you need to get the difference between now and the last purchase date for each customer. The query can be the following:
-- test DDL
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
order_date DATE,
customer_id INTEGER,
order_total INTEGER
);
INSERT INTO orders(order_date, customer_id, order_total) VALUES
('01-01-2015'::DATE,1,2),
('01-02-2015'::DATE,1,3),
('02-01-2015'::DATE,2,4),
('02-02-2015'::DATE,2,5),
('03-01-2015'::DATE,3,6),
('03-02-2015'::DATE,3,7);
WITH orderdata AS (
SELECT customer_id,order_total,order_date,
(now()::DATE - max(order_date) OVER (PARTITION BY customer_id)) as days_since_purchase
FROM orders
WHERE order_total > 0
)
SELECT DISTINCT customer_id ,days_since_purchase FROM orderdata ORDER BY customer_id;

PostgreSQL get results that have been created 24 hours from now

I have two tables that I am joining together. I want to filter the results based on whether or not it had been created 24 hours prior. Here are my tables.
table user_infos (
id integer,
date_created timestamp with time zone,
name varchar(40)
);
table user_data (
id integer,
team_name varchar(40)
);
This is my query that I am using to join them together and hopefully filter them:
SELECT timestampdiff(HOUR, user_infos.date_created, now()) as hours_since,
user_data.id, user_data.team_name,
user_infos.name, user_infos.date_created
FROM user_data
JOIN user_infos
ON user_infos.id=user_data.id
WHERE timestampdiff(HOUR, user_infos.date_created, now()) < 24
ORDER BY name ASC, id ASC
LIMIT 50 OFFSET 0
What I am trying to do is join the two tables so that id, team_name, name, and date_created are treated as one table.
Then I would like to filter so that I only get the results created within the last 24 hours. This is what I am using timestampdiff for.
Then I order them by name and id in ascending order,
and limit the results to 50.
Everything looks good except that it doesn't work. When I run this query, it tells me that the "hour" column does not exist.
Clearly something subtle here is messing everything up. Does anyone have any suggestions?
Alternatively, I've tried this, but it tells me that there is a syntax error at 1:
SELECT
user_data.id, user_data.team_name,
user_infos.name, user_infos.date_created
FROM user_data
JOIN user_infos
ON user_infos.id=user_data.id
WHERE user_infos.date_created
BETWEEN DATE( DATE_SUB( NOW() , INTERVAL 1 DAY ) ) AND
DATE ( NOW() )
ORDER BY name ASC, id ASC
LIMIT 50 OFFSET 0
I think your problem is with your data types. You are checking whether a timestamp field lies between two values cast to date (which strips the time of day). NOW() is not the same as DATE(NOW()).
So you have two options: remove the DATE() casts, or cast date_created to a date as well. Note also that DATE_SUB is a MySQL function; in PostgreSQL you subtract an interval instead. Without the casts:
SELECT
user_data.id, user_data.team_name,
user_infos.name, user_infos.date_created
FROM user_data
JOIN user_infos
ON user_infos.id=user_data.id
WHERE user_infos.date_created
BETWEEN NOW() - INTERVAL '1 day' AND
NOW()
ORDER BY name ASC, id ASC
LIMIT 50 OFFSET 0
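The effect of comparing against a full timestamp instead of a date can be tried out with a minimal sketch in Python's standard-library sqlite3 module. SQLite is only a stand-in here: it writes the cutoff as datetime('now', '-1 day') rather than an interval subtraction, and the two sample users are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_infos (id INTEGER, date_created TEXT, name TEXT);
    CREATE TABLE user_data  (id INTEGER, team_name TEXT);
    -- One row created 2 hours ago, one created 3 days ago (hypothetical data).
    INSERT INTO user_infos VALUES
        (1, datetime('now', '-2 hours'), 'alice'),
        (2, datetime('now', '-3 days'),  'bob');
    INSERT INTO user_data VALUES (1, 'red'), (2, 'blue');
""")

# Keep only rows whose creation timestamp falls within the last 24 hours.
recent = conn.execute("""
    SELECT user_data.id, user_data.team_name, user_infos.name
    FROM user_data
    JOIN user_infos ON user_infos.id = user_data.id
    WHERE user_infos.date_created >= datetime('now', '-1 day')
    ORDER BY user_infos.name ASC, user_data.id ASC
    LIMIT 50 OFFSET 0
""").fetchall()

print(recent)
```

Only the row created two hours ago passes the filter; the one from three days ago is dropped.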

Choosing the first child record in a selfjoin in TSQL

I've got a visits table that looks like this:
id identity(1,1) not null,
visit_date datetime not null,
patient_id int not null,
flag bit not null
For each record, I need to find a matching record that is same time or earlier, has the same patient_id, and has flag set to 1. What I am doing now is:
select parent.id as parent_id,
(
select top 1
child.id as child_id
from
visits as child
where
child.visit_date <= parent.visit_date
and child.patient_id = parent.patient_id
and child.flag = 1
order by
visit_date desc
) as child_id
from
visits as parent
So, this query works correctly, except that it runs too slow -- I suspect that this is because of the subquery. Is it possible to rewrite it as a joined query?
View the query execution plan and look at the operators connected by thick arrows, since those move the most rows. It is worth learning what the different operators imply, such as Clustered Index Scan versus Clustered Index Seek.
Usually, though, when a query runs slowly I find that it lacks good indexes.
Take the columns used to filter and join, and create an index that covers all of them. This is usually called a covering index. It is something you can do for a query that really needs it, but keep in mind that too many indexes will slow down insert statements.
/*
id identity(1,1) not null,
visit_date datetime not null,
patient_id int not null,
flag bit not null
*/
SELECT
T.parentId,
T.patientId,
V.id AS childId
FROM
(
SELECT
visit.id AS parentId,
visit.patient_id AS patientId,
MAX(previousVisit.visit_date) AS previousVisitDate
FROM
visits AS visit
LEFT JOIN visits AS previousVisit ON
visit.patient_id = previousVisit.patient_id
AND visit.visit_date >= previousVisit.visit_date
-- note: this excludes the row itself, unlike the subquery in the question
AND visit.id <> previousVisit.id
AND previousVisit.flag = 1
GROUP BY
visit.id,
visit.visit_date,
visit.patient_id,
visit.flag
) AS T
LEFT JOIN visits AS V ON
T.patientId = V.patient_id
AND T.previousVisitDate = V.visit_date
-- keep the join from matching an unflagged visit on the same date
AND V.flag = 1
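The indexing advice above can be sketched as follows, using Python's standard-library sqlite3 module as a stand-in for SQL Server (so LIMIT 1 replaces TOP 1). The index name ix_visits_lookup, its column order, and the sample rows are assumptions for illustration, not a tuned recommendation; the real benefit has to be confirmed in the actual execution plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits (
        id INTEGER PRIMARY KEY,
        visit_date TEXT NOT NULL,
        patient_id INTEGER NOT NULL,
        flag INTEGER NOT NULL
    );
    -- Hypothetical sample visits.
    INSERT INTO visits VALUES
        (1, '2020-01-01', 10, 1),
        (2, '2020-01-05', 10, 0),
        (3, '2020-01-09', 10, 1),
        (4, '2020-01-02', 20, 0);
    -- Hypothetical index: equality columns first, then the range/sort column,
    -- then id so the lookup can be answered from the index alone.
    CREATE INDEX ix_visits_lookup ON visits (patient_id, flag, visit_date, id);
""")

# Same shape as the question's query: for each visit, the latest flagged
# visit of the same patient on or before that date.
pairs = conn.execute("""
    SELECT parent.id AS parent_id,
           (SELECT child.id
            FROM visits AS child
            WHERE child.visit_date <= parent.visit_date
              AND child.patient_id = parent.patient_id
              AND child.flag = 1
            ORDER BY child.visit_date DESC
            LIMIT 1) AS child_id
    FROM visits AS parent
    ORDER BY parent.id
""").fetchall()

print(pairs)
```

With such an index each per-row lookup becomes a short index seek instead of a scan, which is often enough to keep the original correlated subquery without rewriting it as a join.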