When I run the query below (1st Code) I get 1.37 million random departure dates based on the current ArrivalDate in the database, which is good news. However, when I try to update the database with the 2nd Code query, I get the error message below and I don't know why. Can you help?
Msg 116, Level 16, State 1, Line 5 Only one expression can be
specified in the select list when the subquery is not introduced with
1st Code
SELECT ArrivalDate, DATEADD(day, 1 + RAND(checksum(NEWID()))
* LengthOfStay.LengthofStay, ArrivalDate) AS DepartureDate
FROM Bookings, LengthOfStay
ORDER BY ArrivalDate
2nd Code
USE Occupancy
Update Bookings
Set DepartureDate = (SELECT ArrivalDate, DATEADD(day, 1 + RAND(checksum(NEWID()))*1.5
* LengthOfStay.LengthofStay, ArrivalDate))
FROM LengthOfStay, Bookings

You have several problems:
LengthOfStay, Bookings is a CROSS JOIN (Cartesian product). Is this intended?
The subquery returns 2 columns, but you are trying to update only one.
Assuming your CROSS JOIN is intended, you don't need the subquery:
UPDATE B
SET DepartureDate = DATEADD(day,
    1 + RAND(checksum(NEWID()))*1.5 * L.LengthofStay,
    B.ArrivalDate)
FROM LengthOfStay L, Bookings B

It seems you are selecting 2 columns (ArrivalDate and the DATEADD result) to update 1 column (DepartureDate).
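The one-column rule can be reproduced on a small scale. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for SQL Server; the single-row Bookings table, the fixed three-day stay, and the helper try_update are hypothetical and only illustrate the rule.

```python
import sqlite3

# Hypothetical miniature of the Bookings table; SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Bookings (ArrivalDate TEXT, DepartureDate TEXT)")
conn.execute("INSERT INTO Bookings (ArrivalDate) VALUES ('2024-01-01')")


def try_update(sql):
    """Run an UPDATE and report whether the engine accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.Error:
        return False


# Two expressions in the subquery feeding a single column: rejected.
two_columns_ok = try_update(
    "UPDATE Bookings SET DepartureDate = "
    "(SELECT ArrivalDate, date(ArrivalDate, '+3 day'))"
)

# One expression assigned directly, no subquery needed: accepted.
one_column_ok = try_update(
    "UPDATE Bookings SET DepartureDate = date(ArrivalDate, '+3 day')"
)

departure = conn.execute("SELECT DepartureDate FROM Bookings").fetchone()[0]
print(two_columns_ok, one_column_ok, departure)
```

The wording of the error differs between engines, but the cause is the same: a subquery used as a single value may return only one expression.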


Postgresql SUM calculated column

I am trying to write some SQL to calculate the worth of a user's inventory and have managed to get it to work up to the final step.
SELECT DISTINCT ON (pricing_cards.card_id)
(inventory_cards.nonfoil * pricing_cards.nonfoil) + (inventory_cards.foil * pricing_cards.foil) as x
FROM inventory_cards
INNER JOIN pricing_cards ON pricing_cards.card_id = inventory_cards.card_id
WHERE inventory_cards.user_id = 1
ORDER BY pricing_cards.card_id, pricing_cards.date DESC;
The code above brings back a single column that has the correct calculation for each card. I now need to sum this column, but I keep getting errors when I try.
Adding SUM((inventory_cards.nonfoil * pricing_cards.nonfoil) + (inventory_cards.foil * pricing_cards.foil)) throws the following error
ERROR: column "pricing_cards.card_id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 6: ORDER BY pricing_cards.card_id, pricing_cards.date DESC;
Adding GROUP BY pricing_cards.card_id, pricing_cards.date seems to fix the errors but is returning the same column of calculated values.
SELECT DISTINCT ON (pricing_cards.card_id)
SUM((inventory_cards.nonfoil * pricing_cards.nonfoil) + (inventory_cards.foil * pricing_cards.foil)) as x
FROM inventory_cards
INNER JOIN pricing_cards ON pricing_cards.card_id = inventory_cards.card_id
WHERE inventory_cards.user_id = 1
GROUP BY pricing_cards.card_id, pricing_cards.date
ORDER BY pricing_cards.card_id, pricing_cards.date DESC;
I suggest you use a subquery to get the latest pricing data, then join and sum:
SELECT
    SUM(inventory_cards.nonfoil * latest_pricing.nonfoil + inventory_cards.foil * latest_pricing.foil)
FROM inventory_cards
JOIN (
    SELECT DISTINCT ON (card_id)
        card_id, nonfoil, foil
    FROM pricing_cards
    ORDER BY pricing_cards.card_id, pricing_cards.date DESC
) AS latest_pricing USING (card_id)
WHERE inventory_cards.user_id = 1;
For alternatives in the subquery, see also "Select first row in each GROUP BY group?" and "Optimize GROUP BY query to retrieve latest row per user".
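The "latest price per card, then sum" idea can be tried out on sample data. This is a rough sketch, assuming SQLite through Python's sqlite3 module; SQLite has no DISTINCT ON, so a correlated MAX(date) subquery is swapped in to pick the newest price row per card. The sample rows and the function inventory_worth are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pricing_cards (card_id INTEGER, date TEXT, nonfoil REAL, foil REAL);
CREATE TABLE inventory_cards (user_id INTEGER, card_id INTEGER,
                              nonfoil INTEGER, foil INTEGER);
-- Card 1 has two price rows; only the newer one should count.
INSERT INTO pricing_cards VALUES
    (1, '2024-01-01', 1.0, 2.0),
    (1, '2024-01-02', 1.5, 3.0),
    (2, '2024-01-01', 10.0, 20.0);
INSERT INTO inventory_cards VALUES
    (1, 1, 2, 1),
    (1, 2, 1, 0),
    (2, 1, 5, 5);
""")


def inventory_worth(user_id):
    """Sum quantity * latest price over all cards owned by one user."""
    row = conn.execute("""
        SELECT SUM(i.nonfoil * p.nonfoil + i.foil * p.foil)
        FROM inventory_cards AS i
        JOIN pricing_cards AS p ON p.card_id = i.card_id
        WHERE i.user_id = ?
          -- Stand-in for DISTINCT ON: keep only the newest price row per card.
          AND p.date = (SELECT MAX(p2.date)
                        FROM pricing_cards AS p2
                        WHERE p2.card_id = p.card_id)
    """, (user_id,)).fetchone()
    return row[0]


worth = inventory_worth(1)
print(worth)  # 2*1.5 + 1*3.0 for card 1, plus 1*10.0 for card 2
```

The point of the design is the same as in the answer: reduce pricing_cards to one row per card first, and aggregate only afterwards, so that SUM never sees stale prices.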

How to make postgres (cursor?) start at particular row

I have created the following query:
select t.id, t.row_id, t.content, t.location, t.retweet_count, t.favorite_count, t.happened_at,
a.id, a.screen_name, a.name, a.description, a.followers_count, a.friends_count, a.statuses_count,
c.id, c.code, c.name
from tweets t
join accounts a on a.id = t.author_id
left outer join countries c on c.id = t.country_id
where t.row_id > %s
-- order by t.row_id
limit 100
Here %s is a number that starts at 0 and is incremented by 100 after each query. I want to fetch all records from the database this way, increasing only the %s in the WHERE condition. I found this approach at https://ivopereira.net/efficient-pagination-dont-use-offset-limit. I also added a column to my table that corresponds to the row number (I named it row_id). The problem is that the first time I run this query, it returns rows that have a row_id of 3 million. I would like the cursor (not sure if my terminology is correct) to start from the rows with row_id 1 through 100, and so on. The table contains 7 million rows. Am I missing something obvious that would let me achieve this?
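Without an ORDER BY, SQL makes no promise about which 100 matching rows come back, which would explain a first page that starts around row_id 3 million. Below is a minimal sketch of this style of pagination with the ordering restored, assuming SQLite through Python's sqlite3 module and a cut-down, hypothetical tweets table; fetch_page and fetch_all are illustrative names. It also carries forward the last row_id actually seen instead of adding 100, which stays correct if row_id has gaps.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (row_id INTEGER, content TEXT)")

# Insert the rows out of order on purpose, to mimic a table whose
# physical order does not follow row_id.
ids = list(range(1, 251))
random.Random(0).shuffle(ids)
conn.executemany(
    "INSERT INTO tweets VALUES (?, ?)", [(i, f"tweet {i}") for i in ids]
)


def fetch_page(last_row_id, page_size=100):
    """Return the next page of row_ids strictly after last_row_id."""
    cursor = conn.execute(
        "SELECT row_id FROM tweets WHERE row_id > ? ORDER BY row_id LIMIT ?",
        (last_row_id, page_size),
    )
    return [row[0] for row in cursor]


def fetch_all():
    """Walk the whole table page by page."""
    seen, last = [], 0
    while True:
        page = fetch_page(last)
        if not page:
            return seen
        seen.extend(page)
        last = page[-1]  # continue after the last row actually returned


first_page = fetch_page(0)
all_ids = fetch_all()
print(first_page[0], first_page[-1], len(all_ids))
```

On a large table this pattern is only fast if row_id is indexed, so that the database can seek to the starting point and read the next rows in order.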

'View' (NOT DELETE) Duplicate Rows from a Postgresql table obtained from joins

So I have a temp table that I created by joining three tables:
The Stop_times table has a list of trip_ids, the corresponding stops and the scheduled arrival and departure times of buses at those stops.
I searched online and everywhere I seem to find answers for how to delete duplicates (using ctid, nested queries) but not view them.
My query looks something like this:
SELECT
(CASE st.arrival_time < current_timestamp::time
WHEN true THEN (current_timestamp::date + interval '1 day') + st.arrival_time
ELSE (current_timestamp::date) + st.arrival_time
END) as arrival,
CASE st.departure_time < current_timestamp::time
WHEN true THEN (current_timestamp::date + interval '1 day') + st.departure_time
ELSE (current_timestamp::date) + st.departure_time
END as departure, st.trip_id, st.stop_id, st.stop_headsign, route_id, t.trip_headsign, s.stop_code, s.stop_name, s.stop_lat, s.stop_lon
FROM schema.stop_times st
JOIN schema.trips t ON t.trip_id=st.trip_id
JOIN schema.stops s ON s.stop_id=st.stop_id
order by arrival, departure;
I know that there are duplicates (from running SELECT * and SELECT DISTINCT on temp); I just need to identify them. Any help will be appreciated!
PS: I know I can use DISTINCT to get rid of duplicates, but it slows the query down a lot, so I need to rework the query, and for that I need to identify the duplicates. The result has more than 200,000 records, so exporting them to Excel and filtering duplicates there is not an option either (I tried, but Excel can't handle it).
I believe this will give you what you want:
SELECT arrival, departure, trip_id, stop_id, stop_headsign, route_id,
    trip_headsign, stop_code, stop_name, stop_lat, stop_lon, count(*)
FROM temp
GROUP BY arrival, departure, trip_id, stop_id, stop_headsign, route_id,
    trip_headsign, stop_code, stop_name, stop_lat, stop_lon
HAVING count(*) > 1;
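To see how the GROUP BY ... HAVING pattern surfaces duplicates without deleting anything, here is a small sketch, assuming SQLite through Python's sqlite3 module. The table temp_stops and its three columns are a hypothetical reduction of the real temp table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temp_stops (trip_id TEXT, stop_id TEXT, arrival TEXT)"
)
conn.executemany(
    "INSERT INTO temp_stops VALUES (?, ?, ?)",
    [
        ("T1", "S1", "08:00"),
        ("T1", "S1", "08:00"),  # exact duplicate of the row above
        ("T1", "S2", "08:10"),
        ("T2", "S1", "09:00"),
        ("T2", "S1", "09:00"),
        ("T2", "S1", "09:00"),  # this row appears three times in total
    ],
)

# Group on every column; any group with more than one member is a duplicate.
duplicates = conn.execute("""
    SELECT trip_id, stop_id, arrival, COUNT(*) AS copies
    FROM temp_stops
    GROUP BY trip_id, stop_id, arrival
    HAVING COUNT(*) > 1
    ORDER BY trip_id
""").fetchall()

for row in duplicates:
    print(row)
```

Each output row is one distinct duplicated record together with the number of copies, which is usually enough to trace which join is multiplying rows.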

multiple extract() with WHERE clause possible?

So far I have come up with the below:
WHERE (extract(month FROM orderdate)) =
(SELECT min(extract(month from orderdate))
FROM orders)
However, that will consequently return zero to many rows, and in my case, many, because many orders exist within that same earliest (minimum) month, i.e. 4th February, 9th February, 15th Feb, ...
I know that a WHERE clause can contain multiple columns, so why wouldn't the below work?
WHERE (extract(day FROM orderdate)), (extract(month FROM orderdate)) =
(SELECT min(extract(day from orderdate)), min(extract(month FROM orderdate))
FROM orders)
I simply get: SQL Error: ORA-00920: invalid relational operator
Any help would be great, thank you!
Sample data:
Desired output:
I recreated your table and found that you just got the brackets slightly wrong. The following works for me:
select *
from orders
where (extract(day from OrderDate), extract(month from OrderDate)) =
    (select min(extract(day from OrderDate)),
            min(extract(month from OrderDate))
     from orders)
Use something like this:
with cte1 as (
    select
        extract(month from OrderDate) date_month,
        extract(day from OrderDate) date_day
    from tablename
), cte2 as (
    select min(date_month) min_date_month, min(date_day) min_date_day
    from cte1
)
select cte1.*
from cte1
where (date_month, date_day) = (select min_date_month, min_date_day from cte2)
A common table expression lets you restructure your data and then select from the restructured result. The first CTE block (cte1) selects the month and the day for each of your table rows. cte2 then selects min(month) and min(day). The last select combines both CTEs to return all rows from cte1 that have the desired month and day.
There is probably a shorter solution, but I like common table expressions because they are almost always easier to understand than the "optimal, shortest" query.
If that is really what you want, as bizarre as it seems, then as a different approach you could forget the extracts and the subquery against the table to get the minimums, and use an analytic approach instead:
select orderdate
from (
    select o.*,
        row_number() over (order by to_char(orderdate, 'MMDD')) as rn
    from orders o
)
where rn = 1;
The row_number() effectively adds a pseudo-column to every row in your original table, based on the month and day in the order date. The rn values are unique, so there will be one row marked as 1, which will be from the earliest day in the earliest month. If you have multiple orders with the same day/month, say 01-Jan-2013 and 01-Jan-2014, then you'll still only get exactly one with rn = 1, but which is picked is indeterminate. You'd need to add further order by conditions to make it deterministic, but I have no idea what you might want.
That is done in the inner query; the outer query then filters so that only the record marked with rn = 1 is returned, so you get exactly one row back from the overall query.
This also avoids the situation where the earliest day number is not in the earliest month number - say if you only had 02-Jan-2014 and 01-Feb-2014; comparing the day and month separately would look for 01-Jan-2014, which doesn't exist.
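The difference between the two approaches shows up with just two rows. The sketch below assumes SQLite through Python's sqlite3 module, with strftime standing in for Oracle's extract and to_char, and ORDER BY ... LIMIT 1 swapped in for the row_number() filter; the two sample dates, 02-Jan-2014 and 01-Feb-2014, are chosen to trigger the mismatch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (orderdate TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?)", [("2014-01-02",), ("2014-02-01",)]
)

# Minimum day and minimum month taken separately: day 1 and month 1,
# a combination that no row has.
separate_minimums = conn.execute("""
    SELECT orderdate FROM orders
    WHERE CAST(strftime('%d', orderdate) AS INTEGER) =
              (SELECT MIN(CAST(strftime('%d', orderdate) AS INTEGER)) FROM orders)
      AND CAST(strftime('%m', orderdate) AS INTEGER) =
              (SELECT MIN(CAST(strftime('%m', orderdate) AS INTEGER)) FROM orders)
""").fetchall()

# Ordering by the combined month-day key: always lands on an existing row.
combined_key = conn.execute("""
    SELECT orderdate FROM orders
    ORDER BY strftime('%m%d', orderdate)
    LIMIT 1
""").fetchall()

print(separate_minimums)
print(combined_key)
```

The first query comes back empty, while the second returns the January row, because it compares month and day as one key rather than as two independent minimums.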
SQL Fiddle (with Thomas Tschernich's answer thrown in too, giving the same result for this data).
To join the result against your invoice table, you don't need to join to the orders table again - especially not with a cross join, which is skewing your results. You can do the join (at least) two ways:
SELECT o.orderno,
    to_char(o.orderdate, 'DD-MM-YYYY'),
    o.invno
FROM (
    SELECT o.*,
        row_number() over (order by to_char(orderdate, 'MMDD')) as rn
    FROM orders o
) o, invoices i
WHERE i.invno = o.invno
AND rn = 1;
SELECT o.orderno,
    to_char(o.orderdate, 'DD-MM-YYYY'),
    o.invno
FROM (
    SELECT orderno, orderdate, invno
    FROM (
        SELECT o.*,
            row_number() over (order by to_char(orderdate, 'MMDD')) as rn
        FROM orders o
    )
    WHERE rn = 1
) o, invoices i
WHERE i.invno = o.invno;
The first looks like it does more work but the execution plans are the same.
SQL Fiddle with your pastebin-supplied query that gets two rows back, and these two that get one.

Joining Against Derived Table

I'm not sure of the terminology here, so let me give an example. I have this query:
Id Name StartPeriodId EndPeriodId
1 MyEvent 34 32
In here, the PeriodIds specify how long the event lasts for, think of it as weeks of the year specified in another table if that helps. Notice that the EndPeriodId is not necessarily sequentially after the StartPeriodId. So I could do this:
SELECT * FROM Periods WHERE Id = 34
Id StartDate EndDate
34 2009-06-01 2009-08-01
Please do not dwell on this structure, as it's only an example and not how it actually works. What I need to do is come up with this result set:
Id Name PeriodId
1 MyEvent 34
1 MyEvent 33
1 MyEvent 32
In other words, I need to select an event row for each period in which the event exists. I can calculate the Period information (32, 33, 34) easily, but my problem lies in pulling it out in a single query.
This is in SQL Server 2008.
I may be mistaken, and I can't test it because I have no SQL Server available right now, but wouldn't that simply be:
SELECT Events.Id, Events.Name, Periods.Id AS PeriodId
FROM Periods
INNER JOIN Events
ON Periods.Id BETWEEN Events.StartPeriodId AND Events.EndPeriodId
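One caveat with a BETWEEN join is that BETWEEN expects the lower bound first, and the sample event has StartPeriodId 34 and EndPeriodId 32. The sketch below, assuming SQLite through Python's sqlite3 module and assuming that period Ids are contiguous integers (which the question does not guarantee), compares the join as written with a variant that orders the bounds using CASE expressions. The Periods rows 31 to 35 are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Events (Id INTEGER, Name TEXT,
                     StartPeriodId INTEGER, EndPeriodId INTEGER);
CREATE TABLE Periods (Id INTEGER);
INSERT INTO Events VALUES (1, 'MyEvent', 34, 32);
INSERT INTO Periods VALUES (31), (32), (33), (34), (35);
""")

# Bounds used exactly as stored: BETWEEN 34 AND 32 matches nothing.
as_written = conn.execute("""
    SELECT e.Id, e.Name, p.Id
    FROM Periods AS p
    JOIN Events AS e ON p.Id BETWEEN e.StartPeriodId AND e.EndPeriodId
""").fetchall()

# Bounds sorted with CASE so the smaller Id always comes first.
either_order = conn.execute("""
    SELECT e.Id, e.Name, p.Id
    FROM Periods AS p
    JOIN Events AS e
      ON p.Id BETWEEN CASE WHEN e.StartPeriodId <= e.EndPeriodId
                           THEN e.StartPeriodId ELSE e.EndPeriodId END
                  AND CASE WHEN e.StartPeriodId <= e.EndPeriodId
                           THEN e.EndPeriodId ELSE e.StartPeriodId END
    ORDER BY p.Id DESC
""").fetchall()

print(as_written)
print(either_order)
```

CASE is used rather than LEAST and GREATEST because the question targets SQL Server 2008, which does not provide those functions. If the Ids do not reflect chronological order, a date-based join is the safer route.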
I'm assuming that you want a listing of all periods that fall between the dates of the periods specified by the start/end period IDs.
With CTE_PeriodDate (ID, MaxDate, MinDate)
as (
Select Id, Max(Dates) MaxDate, MinDate=Min(Dates) from (
Select e.ID, StartDate as Dates from Events e
Inner join Periods P on P.ID=StartPeriodID
Union All
Select e.ID, EndDate from Events e
Inner join Periods P on P.ID=StartPeriodID
Union All
Select e.ID, StartDate from Events e
Inner join Periods P on P.ID=EndPeriodID
Union All
Select e.ID, EndDate from Events e
Inner join Periods P on P.ID=EndPeriodID ) as A
group by ID)
Select E.Name, P.ID from CTE_PeriodDate CTE
Inner Join Periods p on
(P.StartDate>=MinDate and P.StartDate<=MaxDate)
and (p.EndDate<=MaxDate and P.EndDate>=MinDate)
Inner Join Events E on E.ID=CTE.ID
It's not the best way to do this, but it does work.
It gets the minimum and maximum dates across the periods specified on the event.
Using these two dates, it joins with the Periods table on the periods that fall inside that range.