mysql subquery in single table - select

I have 2 columns in an "entries" table: a non-unique id, and a date stamp (YYYY-MM-DD). I want to select "all of the entry id's that were inserted today that have never been entered before."
I've been trying to use a subquery, but I don't think I'm using it right since they're both performed on the same table. Could someone help me out with the proper select statement? I can provide more details if need be.

Disclaimer: I don't have access to a mysql database right now, but this should help:
select
e.id
from
entries e
where
e.date = curdate() and
e.id not in
(select id from entries e2 where e2.date < e.date)
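For anyone who wants to try the idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. It is a stand-in, not the MySQL statement itself: today's date is passed as a parameter instead of calling curdate(), the sample rows are made up, and a DISTINCT is added because the id column is non-unique.

```python
import sqlite3

def first_seen_on(conn, today):
    """Return ids that have a row dated `today` and no row dated earlier."""
    sql = """
        SELECT DISTINCT e.id
        FROM entries e
        WHERE e.date = ?
          AND e.id NOT IN (SELECT e2.id FROM entries e2 WHERE e2.date < e.date)
        ORDER BY e.id
    """
    return [row[0] for row in conn.execute(sql, (today,))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER, date TEXT)")
# Hypothetical sample data: id 1 was seen before, id 2 is new on 2024-01-02.
conn.executemany("INSERT INTO entries VALUES (?, ?)", [
    (1, "2024-01-01"), (1, "2024-01-02"),
    (2, "2024-01-02"), (2, "2024-01-02"),
    (3, "2024-01-01"),
])
new_ids = first_seen_on(conn, "2024-01-02")
print(new_ids)
```

Note that NOT IN behaves unexpectedly if the subquery can return NULL ids; a NOT EXISTS form avoids that if the column is nullable.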

Querying Postgres INHERITED tables directly

Postgres allows you to create a table using inheritance. We have a design where we have 1400 tables that inherit from one main table. These tables are for each of our vendor's inventory.
When I want to query stock for a vendor, I just query the main table. EXPLAIN shows that the query goes through all 1400 indexes and quite a few of the inherited tables, which makes it run very slowly. If I query only the vendor's stock table, the query takes less than 50% of the time it takes against the main table.
We have a join on another table that pulls identifiers for the vendor's partner vendors and we also want to query their stock. Example:
SELECT
(select m2.company from sup.members m2 where m2.id = u.id) as company,
u.id,
u.item,
DATE_PART('day', CURRENT_TIMESTAMP - u.datein::timestamp) AS daysinstock,
u.grade as condition,
u.stockno AS stocknumber,
u.ic,
CASE WHEN u.rprice > 0 THEN
u.rprice
ELSE
NULL
END AS price,
u.qty
FROM pub.net u
LEFT JOIN sup.members m1
ON m1.id = u.id OR u.id = any(regexp_split_to_array(m1.partnerslist,','))
WHERE u.ic in ('01036') -- part to query
AND m1.id = 'N40' -- vendor to query
The n40_stock table has stock for the vendor with id = N40, and N40's partner vendors (partnerslist) are G01, G06, G21, K17, N49, V02, M16, so I would also want to query the g01_stock, g06_stock, g21_stock, k17_stock, n49_stock, v02_stock, and m16_stock tables.
I know about the ONLY clause, but is there a way to modify this query to get the data from ONLY the specific inherited tables?
Edit
This decreases the time to under 800ms, but I'd like it less:
WITH cte as (
SELECT partnerslist as a FROM sup.members WHERE id = 'N40'
)
SELECT
(select m2.company from sup.members m2 where m2.id = u.id) as company,
u.id,
u.item,
DATE_PART('day', CURRENT_TIMESTAMP - u.datein::timestamp) AS daysinstock,
u.grade as condition,
u.stockno AS stocknumber,
u.ic,
CASE WHEN u.rprice > 0 THEN
u.rprice
ELSE
NULL
END AS price,
u.qty
FROM pub.net u
WHERE u.ic in ('01036') -- part to query
AND u.id = any(regexp_split_to_array('N40,'||(select a from cte), ','))
I cannot retrieve the company from sup.members in the cte because I need the one from the u.id, which is different when the partner changes in the where clause.
The planner can skip inherited tables only when the WHERE clause can be checked against each child table's CHECK constraint (constraint exclusion). Simply inheriting tables is not enough.
https://www.postgresql.org/docs/9.6/static/ddl-partitioning.html
Caveat: constraint exclusion only works when the WHERE clause contains constants (or externally supplied parameters). A value that is computed while the query runs, such as the result of a subquery, cannot be used to exclude tables, so all inherited tables get checked.
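One possible way to act on this from application code is to look up the partner list first and then name each child table explicitly with ONLY, so the planner never sees the other tables. The sketch below only builds the SQL text and makes several assumptions that are not confirmed by the question: that the child tables live in the pub schema, that they follow the `<id>_stock` naming seen in n40_stock, that the listed columns exist on every child, and that the driver uses psycopg2-style %s placeholders.

```python
import re

# Assumed column list and schema; adjust to the real child tables.
SELECT_TEMPLATE = (
    "SELECT id, item, stockno, ic, qty FROM ONLY pub.{table} WHERE ic = %s"
)

def build_vendor_query(vendor_ids):
    """Build one UNION ALL query that names each child table explicitly."""
    parts = []
    for vendor_id in vendor_ids:
        # Identifiers cannot be bound as parameters, so validate them strictly.
        if not re.fullmatch(r"[A-Za-z][A-Za-z0-9]*", vendor_id):
            raise ValueError(f"unexpected vendor id: {vendor_id!r}")
        parts.append(SELECT_TEMPLATE.format(table=f"{vendor_id.lower()}_stock"))
    return "\nUNION ALL\n".join(parts)

sql = build_vendor_query(["N40", "G01", "G06"])
print(sql)
```

Each SELECT carries its own placeholder, so the part code has to be passed once per table when the statement is executed.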

How to Get Talend to Keep Table Names in tOracleInput

Is there a way to tell Talend not to remove the prefix of column names especially when they are specified in the query to retrieve data from data source and keep the names mentioned in the query itself?
Thanks!
I'm assuming you are using the 'guess schema' feature with a query that joins some tables. If your tables have columns with the same names, you run into trouble with the guessed schema. There is no way to have Talend use, or even know, the names of the tables the columns come from, because they are part of a 'projection' and could result from transformation and/or aggregation. Thus, you'll need to help Talend guess the correct schema, which means a) you can't use the * to select all columns and b) you should assign each column an alias that hints at the table the column comes from.
So instead of select * from employee join department on employee.department_id = department.id you'd have something like select e.id as emp_id, e.name as emp_name, d.id as department_id, d.name as department_name from employee e join department d on e.department_id = d.id. The id from employee will be emp_id in the guessed schema.
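The effect is not specific to Talend and can be seen with any driver that reports result-set column names. Here is a small sketch using Python's sqlite3 module as a stand-in for the Oracle source (an assumption made purely so the example runs anywhere); the tables are empty versions of the employee and department tables from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (id INTEGER, name TEXT);
    CREATE TABLE employee (id INTEGER, name TEXT, department_id INTEGER);
""")

def column_names(sql):
    """Return the result-set column names the driver reports for a query."""
    return [col[0] for col in conn.execute(sql).description]

# SELECT * exposes clashing names; the table prefix is not part of the result.
star = column_names(
    "SELECT * FROM employee JOIN department "
    "ON employee.department_id = department.id"
)
# Explicit aliases give every projected column a unique, descriptive name.
aliased = column_names(
    "SELECT e.id AS emp_id, e.name AS emp_name, "
    "d.id AS department_id, d.name AS department_name "
    "FROM employee e JOIN department d ON e.department_id = d.id"
)
print(star)
print(aliased)
```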

Postgres subquery has access to column in a higher level table. Is this a bug? or a feature I don't understand?

I don't understand why the following doesn't fail. How does the subquery have access to a column from a different table at the higher level?
drop table if exists temp_a;
create temp table temp_a as
(
select 1 as col_a
);
drop table if exists temp_b;
create temp table temp_b as
(
select 2 as col_b
);
select col_a from temp_a where col_a in (select col_a from temp_b);
/*why doesn't this fail?*/
The following fail, as I would expect them to.
select col_a from temp_b;
/*ERROR: column "col_a" does not exist*/
select * from temp_a cross join (select col_a from temp_b) as sq;
/*ERROR: column "col_a" does not exist
*HINT: There is a column named "col_a" in table "temp_a", but it cannot be referenced from this part of the query.*/
I know about the LATERAL keyword, but I'm not using LATERAL here. Also, this query succeeds even in Postgres versions before 9.3, the release that introduced LATERAL.
Here's a sqlfiddle: http://sqlfiddle.com/#!10/09f62/5/0
Thank you for any insights.
Although this feature might be confusing, without it, several types of queries would be more difficult, slower, or impossible to write in sql. This feature is called a "correlated subquery" and the correlation can serve a similar function as a join.
For example: Consider this statement
select first_name, last_name from users u
where exists (select * from orders o where o.user_id=u.user_id)
Now this query will get the names of all the users who have ever placed an order. Now, I know, you can get that info using a join to the orders table, but you'd also have to use a "distinct", which would internally require a sort and would likely perform a tad worse than this query. You could also produce a similar query with a group by.
Here's a better example that's pretty practical, and not just for performance reasons. Suppose you want to delete all users who have no orders and no tickets.
delete from users u where
not exists (select * from orders o where o.user_id = u.user_id)
and not exists (select * from tickets t where t.user_id = u.user_id)
One very important thing to note is that you should fully qualify or alias your table names when doing this or you might wind up with a typo that completely messes up the query and silently "just works" while returning bad data.
The following is an example of what NOT to do.
select * from users
where exists (select * from product where last_updated_by=user_id)
This looks fine until you look at the tables and realize that "product" has no "last_updated_by" column while "users" does. The unqualified name silently resolves to the outer table, so the query compares two columns of "users" and returns the wrong data. Add the aliases and the query fails, as it should, because no "last_updated_by" column exists in product.
I hope this has given you some examples that show you how to use this feature. I use them all the time in update and delete statements (as well as in selects-- but I find an absolute need for them in updates and deletes often)
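The pitfall is easy to reproduce. The sketch below uses Python's sqlite3 module rather than Postgres (an assumption for portability; the name resolution rule is the same), with a made-up users table that has a last_updated_by column and a product table that does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER, last_updated_by INTEGER);
    CREATE TABLE product (product_id INTEGER);
    INSERT INTO users VALUES (1, 1), (2, 1);
    INSERT INTO product VALUES (10);
""")

# Unqualified: both columns resolve to the outer users table, so the
# subquery silently becomes "users.last_updated_by = users.user_id".
silent = conn.execute(
    "SELECT user_id FROM users "
    "WHERE EXISTS (SELECT * FROM product WHERE last_updated_by = user_id)"
).fetchall()

# Qualified: the missing column is reported instead of being ignored.
try:
    conn.execute(
        "SELECT u.user_id FROM users u "
        "WHERE EXISTS (SELECT * FROM product p "
        "WHERE p.last_updated_by = u.user_id)"
    )
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

print(silent)
print(error)
```

The first query runs and returns a row even though it never really looks at product; the second fails immediately, which is what you want from a typo.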

select distinct from 2 columns but only 1 is duplicate

select a.subscriber_msisdn, war.created_datetime from
(
select distinct subscriber_msisdn from wiz_application_response
where application_item_id in
(select id from wiz_application_item where application_id=155)
and created_datetime between '2012-10-07 00:00' and '2012-11-15 00:00:54'
) a
left outer join wiz_application_response war on (war.subscriber_msisdn=a.subscriber_msisdn)
The subselect returns 11 rows, but the joined query returns 18 (with duplicates). The objective of this query is only to add the date column to the 11 rows of the subselect.
Based on your description, it stands to reason that there are multiple created_datetime values for some of the subscriber_msisdn values which is what prompted you to use the distinct in the subquery to begin with. By joining the sub query to the original table you are defeating this. A cleaner way to write the query would be:
SELECT
war.subscriber_msisdn
, war.created_datetime
FROM
wiz_application_response war
INNER JOIN wiz_application_item wai
ON war.application_item_id = wai.id
AND wai.application_id = 155
WHERE
war.created_datetime BETWEEN '2012-10-07 00:00' AND '2012-11-15 00:00:54'
This should return only the rows from the war table that satisfy the criteria based on the wai table. It should not be an outer join unless you want to return all the rows from the war table that satisfy the created_datetime condition regardless of the application_item_id condition.
This is my best guess based on the limited information I have about your tables and what I’m assuming you’re trying to accomplish. If this doesn’t get you what you are after, I will continue to offer other ideas based on additional information you could provide. Hope this works.
This can most probably be simplified to:
SELECT DISTINCT ON (1)
r.subscriber_msisdn, r.created_datetime
FROM wiz_application_item i
JOIN wiz_application_response r ON r.application_item_id = i.id
WHERE i.application_id = 155
AND r.created_datetime BETWEEN '2012-10-07 00:00' AND '2012-11-15 00:00:54'
ORDER BY 1, 2 DESC -- to pick the latest created_datetime
Details depend on missing information.
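DISTINCT ON is Postgres-specific. Where it is not available, the same "one row per subscriber, latest date" result can be sketched with GROUP BY and MAX, which is a different technique with the same outcome when only these two columns are needed. The example below runs on Python's sqlite3 module with made-up rows, and assumes the date filter applies to the response table, as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wiz_application_item (id INTEGER, application_id INTEGER);
    CREATE TABLE wiz_application_response (
        subscriber_msisdn TEXT,
        application_item_id INTEGER,
        created_datetime TEXT
    );
    INSERT INTO wiz_application_item VALUES (1, 155), (2, 999);
    -- Hypothetical sample data: subscriber 111 responded twice.
    INSERT INTO wiz_application_response VALUES
        ('111', 1, '2012-10-08 10:00'),
        ('111', 1, '2012-10-09 12:00'),
        ('222', 1, '2012-11-01 09:30'),
        ('333', 2, '2012-10-10 08:00');
""")

rows = conn.execute("""
    SELECT r.subscriber_msisdn, MAX(r.created_datetime) AS created_datetime
    FROM wiz_application_item i
    JOIN wiz_application_response r ON r.application_item_id = i.id
    WHERE i.application_id = 155
      AND r.created_datetime BETWEEN '2012-10-07 00:00' AND '2012-11-15 00:00:54'
    GROUP BY r.subscriber_msisdn
    ORDER BY r.subscriber_msisdn
""").fetchall()
print(rows)
```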

T-SQL - How to write query to get records that match ALL records in a many to many join

(I don't think I have titled this question correctly - but I don't know how to describe it)
Here is what I am trying to do:
Let's say I have a Person table that has a PersonID field. And let's say that a Person can belong to many Groups. So there is a Group table with a GroupID field and a GroupMembership table that is a many-to-many join between the two tables and the GroupMembership table has a PersonID field and a GroupID field. So far, it is a simple many to many join.
Given a list of GroupIDs I would like to be able to write a query that returns all of the people that are in ALL of those groups (not any one of those groups). And the query should be able to handle any number of GroupIDs. I would like to avoid dynamic SQL.
Is there some simple way of doing this that I am missing?
Thanks,
Corey
select person_id, count(*) from groupmembership
where group_id in ([your list of group ids])
group by person_id
having count(*) = [size of your list of group ids]
Edited: thank you dotjoe!
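Here is a runnable sketch of the count/having approach, using Python's sqlite3 module instead of SQL Server (an assumption so it can be tried anywhere) and made-up membership rows. It counts distinct GroupIDs so that duplicate membership rows or duplicate ids in the input list cannot skew the comparison. Only the placeholder list is generated; the values themselves are bound as parameters.

```python
import sqlite3

def people_in_all_groups(conn, group_ids):
    """Return PersonIDs that belong to every group in group_ids."""
    wanted = sorted(set(group_ids))
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT PersonID
        FROM GroupMembership
        WHERE GroupID IN ({placeholders})
        GROUP BY PersonID
        HAVING COUNT(DISTINCT GroupID) = ?
        ORDER BY PersonID
    """
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE GroupMembership (PersonID INTEGER, GroupID INTEGER)")
# Hypothetical sample data: person 3 is in all three groups.
conn.executemany("INSERT INTO GroupMembership VALUES (?, ?)", [
    (1, 10), (1, 20),
    (2, 10),
    (3, 10), (3, 20), (3, 30),
])
in_both = people_in_all_groups(conn, [10, 20])
in_all_three = people_in_all_groups(conn, [10, 20, 30])
print(in_both, in_all_three)
```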
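Basically, you are looking for persons for whom there is no group they are not a member of. (Group is a reserved word in T-SQL, so it is bracketed below. To test against a specific list rather than every group, filter g by that list.)
select *
from Person p
where not exists (
    select 1
    from [Group] g
    where not exists (
        select 1
        from GroupMembership gm
        where gm.PersonID = p.PersonID
        and gm.GroupID = g.GroupID
    )
)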
You're basically not going to avoid "dynamic" SQL in the sense of dynamically generating the query at query time. There's no way to hand a list around in SQL (well, there is, table variables, but getting them into the system from C# is either impossible (2005 & below) or else annoying (2008)).
One way that you could do it with multiple queries is to insert your list into a work table (probably a process-keyed table) and join against that table. The only other option would be to use a dynamic query such as the ones specified by Jonathan and hongliang.
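The work-table idea can be sketched as follows. This uses Python's sqlite3 module and a temp table named WantedGroups (both are assumptions for illustration; on SQL Server the equivalent would be a temp or process-keyed table). The point is that the query text stays fixed no matter how many GroupIDs are supplied.

```python
import sqlite3

# The query text never changes; only the contents of WantedGroups do.
FIXED_SQL = """
    SELECT gm.PersonID
    FROM GroupMembership gm
    JOIN WantedGroups w ON w.GroupID = gm.GroupID
    GROUP BY gm.PersonID
    HAVING COUNT(DISTINCT gm.GroupID) = (SELECT COUNT(*) FROM WantedGroups)
    ORDER BY gm.PersonID
"""

def people_in_all_wanted(conn, group_ids):
    """Load the list into the work table, then run the fixed query."""
    conn.execute("DELETE FROM WantedGroups")
    conn.executemany(
        "INSERT OR IGNORE INTO WantedGroups VALUES (?)",
        [(group_id,) for group_id in group_ids],
    )
    return [row[0] for row in conn.execute(FIXED_SQL)]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE GroupMembership (PersonID INTEGER, GroupID INTEGER);
    CREATE TEMP TABLE WantedGroups (GroupID INTEGER PRIMARY KEY);
    -- Hypothetical sample data.
    INSERT INTO GroupMembership VALUES
        (1, 10), (1, 20), (2, 10), (3, 10), (3, 20), (3, 30);
""")
both = people_in_all_wanted(conn, [10, 20])
only_thirty = people_in_all_wanted(conn, [30])
print(both, only_thirty)
```

With an empty list the join produces no rows, so the function returns an empty result rather than every person.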