I'm looking for a way to update all timestamp members of a column's (time_start) array with a fixed interval value of 'x days'.
Would I need to unnest the array first?
The table in question:
CREATE TABLE public.venue (
id integer NOT NULL,
time_start character varying(25)[]
);
Sample value of time_start (truncated, unordered):
{2019-05-22T04:12:46,2020-03-19T11:07:43,2020-01-08T03:56:40,2019-09-01T20:29:25,2019-09-29T21:38:18,2019-07-21T16:54:37}
Considering an increment of 1 day, a desired output would be:
{2019-05-23T04:12:46,2020-03-20T11:07:43,2020-01-09T03:56:40,2019-09-02T20:29:25,2019-09-30T21:38:18,2019-07-22T16:54:37}
Related
Is it possible to use a sum function in a calculated column?
If yes, I would like to create a calculated column, that calculates the sum of a column in the same table where the date is smaller than the date of this entry. is this possible?
And last, would this optimize repeated calls on this value over the exemplified view below?
SELECT ProductGroup, SalesDate, (
SELECT SUM(Sales)
FROM SomeList
WHERE (ProductGroup= KVU.ProductGroup) AND (SalesDate<= KVU.SalesDate)) AS cumulated
FROM SomeList AS KVU
Is it possible to use a sum function in a calculated column?
Yes, it's possible using a scalar valued function (scalar UDF) for you computed column but this would be a disaster. Using scalar UDFs for computed columns destroy performance. Adding a scalar UDF that accesses data (which would be required here) makes things even worse.
It sounds to me like you just need a good ol' fashioned index to speed things up. First some sample data:
IF OBJECT_ID('dbo.somelist','U') IS NOT NULL DROP TABLE dbo.somelist;
GO
CREATE TABLE dbo.somelist
(
ProductGroup INT NOT NULL,
[Month] TINYINT NOT NULL CHECK ([Month] <= 12),
Sales DECIMAL(10,2) NOT NULL
);
INSERT dbo.somelist
VALUES (1,1,22),(2,1,45),(2,1,25),(2,1,19),(1,2,100),(1,2,200),(2,2,50.55);
and the correct index:
CREATE NONCLUSTERED INDEX nc_somelist ON dbo.somelist(ProductGroup,[Month])
INCLUDE (Sales);
With this index in place this query would be extremely efficient:
SELECT s.ProductGroup, s.[Month], SUM(s.Sales)
FROM dbo.somelist AS s
GROUP BY s.ProductGroup, s.[Month];
If you needed to get a COUNT by month & product group you could create an indexed view like so:
CREATE VIEW dbo.vw_somelist WITH SCHEMABINDING AS
SELECT s.ProductGroup, s.[Month], TotalSales = COUNT_BIG(*)
FROM dbo.somelist AS s
GROUP BY s.ProductGroup, s.[Month];
GO
CREATE UNIQUE CLUSTERED INDEX uq_cl__vw_somelist ON dbo.vw_somelist(ProductGroup, [Month]);
Once that indexed view was in place your COUNTs would be pre-aggregated. You cannot, however, include SUM in an indexed view.
I am using: Microsoft SQL Server 2014 - 12.0.4213.0
Here is my sample table (numbers fuzzed):
CREATE TABLE most_recent_counts(
State VARCHAR(2) NOT NULL PRIMARY KEY
,BuildDate DATE NOT NULL
,Count_1725_Change INTEGER NOT NULL
,Count_1725_Percent_Change NUMERIC(20,2) NOT NULL
,Count_2635_Change INTEGER NOT NULL
,Count_2635_Percent_Change NUMERIC(20,2) NOT NULL
,Count_3645_Change INTEGER NOT NULL
,Count_3645_Percent_Change NUMERIC(20,2) NOT NULL
);
INSERT INTO most_recent_counts(State,BuildDate,Count_1725_Change,Count_1725_Percent_Change,Count_2635_Change,Count_2635_Percent_Change,Count_3645_Change,Count_3645_Percent_Change) VALUES ('AK','2018-06-05',1025,5.00,1700,2.50,2050,3.00);
INSERT INTO most_recent_counts(State,BuildDate,Count_1725_Change,Count_1725_Percent_Change,Count_2635_Change,Count_2635_Percent_Change,Count_3645_Change,Count_3645_Percent_Change) VALUES ('AL','2018-06-02',15000,4.00,10400,2.00,6800,1.25);
INSERT INTO most_recent_counts(State,BuildDate,Count_1725_Change,Count_1725_Percent_Change,Count_2635_Change,Count_2635_Percent_Change,Count_3645_Change,Count_3645_Percent_Change) VALUES ('AR','2018-06-07',2300,1.00,2700,1.00,1800,0.50);
INSERT INTO most_recent_counts(State,BuildDate,Count_1725_Change,Count_1725_Percent_Change,Count_2635_Change,Count_2635_Percent_Change,Count_3645_Change,Count_3645_Percent_Change) VALUES ('AZ','2018-04-26',107000,5.50,45400,3.00,180000,16.00);
INSERT INTO most_recent_counts(State,BuildDate,Count_1725_Change,Count_1725_Percent_Change,Count_2635_Change,Count_2635_Percent_Change,Count_3645_Change,Count_3645_Percent_Change) VALUES ('CA','2018-06-07',140000,6.00,550000,14.00,600000,18.00);
It should look something like this:
IMG: https://i.imgur.com/KGkfm66.png
In the real table, I have some 600ish such counts.
I would like to produce a table from this table, where for each state, I have the top ten (in magnitude) pairs of columns (i.e. The abs. change, and the percent change) (i.e. if in Alabama's row, there is a minus 10 million count in the sales to people in the 46-55 range, that should definitely be part of the result set, even if the rest of the columns are positive accruals in the thousands)
What's the best way to do this?
I have a column in a Postgresql table that is unique and character varying(10) type. The table contains old alpha-numeric values that I need to keep. Every time a new row is created from this point forward, I want it to be numeric only. I would like to get the max numeric-only value from this table for this column then create a new row with that max value incremented by 1.
Is there a way to query this table for the max numeric value only for this column?
For example, if this column currently has the values:
1111
A1111A
1234
1234A
3331
B3332
C-3333
33-D33
3**333*
Is there a query that will return 3333, AKA cut out all the non-numeric characters from the values and then perform a MAX() on them?
Not precisely what you asking, but something that I think will work better for you.
To go over all the columns, convert each to numbers, and then cast it to integer & return max.:
SELECT MAX(regexp_replace(my_column, '[^0-9]', '', 'g')::int) FROM public.foobar;
This gets you your max value... say 2999.
Now, going forward, consider making the default for your column a serial-like value, and convert it to text... that way you set the "MAX" once, and then let postgres do all the work for future values.
-- create simple integer sequence
CREATE SEQUENCE public.foobar_my_column_seq
INCREMENT 1
MINVALUE 1
MAXVALUE 9223372036854775807
START 1
CACHE 0;
-- use new sequence as default value for column __and__ convert to text
ALTER TABLE foobar
ALTER COLUMN my_column
SET DEFAULT nextval('publc.foobar_my_column_seq'::regclass)::text;
-- initialize "next value" of sequence to whatever is larger than
-- what you already have in your data ... say 3000:
ALTER SEQUENCE public.foobar_my_column_seq RESTART WITH 3000;
Because you're simply setting default, you don't change your current alpha-numeric values.
I figured it out. The following query works.
select text_value, regexp_replace(text_value, '[^0-9]+', '') as new_value from the_table;
Result:
text_value | new_value
-----------------------+-------------
4*215474 | 4215474
740024 | 740024
4*100535 | 4100535
42356 | 42356
CASH |
4*215474 | 4215474
740025 | 740025
740026 | 740026
4*5089655798 | 45089655798
4*15680 | 415680
4*224034 | 4224034
4*265718708 | 4265718708
I have a date column which I want to be unique once populated, but want the date field to be ignored if it is not populated.
In MySQL the way this is accomplished is to set the date column to "not null" and give it a default value of '0000-00-00' - this allows all other fields in the unique index to be "checked" even if the date column is not populated yet.
This does not work in PosgreSQL because '0000-00-00' is not a valid date, so you cannot store it in a date field (this makes sense to me).
At first glance, leaving the field nullable seemed like an option, but this creates a problem:
=> create table uniq_test(NUMBER bigint not null, date DATE, UNIQUE(number, date));
CREATE TABLE
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> select * from uniq_test;
number | date
--------+------
1 |
1 |
1 |
1 |
(4 rows)
NULL apparently "isn't equal to itself" and so it does not count towards constraints.
If I add an additional unique constraint only on the number field, it checks only number and not date and so I cannot have two numbers with different dates.
I could select a default date that is a 'valid date' (but outside working scope) to get around this, and could (in fact) get away with that for the current project, but there are actually cases I might be encountering in the next few years where it will not in fact be evident that the date is a non-real date just because it is "a long time ago" or "in the future."
The advantage the '0000-00-00' mechanic had for me was precisely that this date isn't real and therefore indicated a non-populated entry (where 'non-populated' was a valid uniqueness attribute). When I look around for solutions to this on the internet, most of what I find is "just use NULL" and "storing zeros is stupid."
TL;DR
Is there a PostgreSQL best practice for needing to include "not populated" as a possible value in a unique constraint including a date field?
Not clear what you want. This is my guess:
create table uniq_test (number bigint not null, date date);
create unique index i1 on uniq_test (number, date)
where date is not null;
create unique index i2 on uniq_test (number)
where date is null;
There will be an unique constraint for not null dates and another one for null dates effectively turning the (number, date) tuples into distinct values.
Check partial index
It's not a best practice, but you can do it such way:
t=# create table so35(i int, d date);
CREATE TABLE
t=# create unique index i35 on so35(i, coalesce(d,'-infinity'));
CREATE INDEX
t=# insert into so35 (i) select 1;
INSERT 0 1
t=# insert into so35 (i) select 2;
INSERT 0 1
t=# insert into so35 (i) select 2;
ERROR: duplicate key value violates unique constraint "i35"
DETAIL: Key (i, (COALESCE(d, '-infinity'::date)))=(2, -infinity) already exists.
STATEMENT: insert into so35 (i) select 2;
I'm using postgresql 9.0 beta 4.
After inserting a lot of data into a partitioned table, i found a weird thing. When I query the table, i can see an empty row with null-like values in 'not-null' fields.
That weird query result is like below.
689th row is empty. The first 3 fields, (stid, d, ticker), are composing primary key. So they should not be null. The query i used is this.
select * from st_daily2 where stid=267408 order by d
I can even do the group by on this data.
select stid, date_trunc('month', d) ym, count(*) from st_daily2
where stid=267408 group by stid, date_trunc('month', d)
The 'group by' results still has the empty row.
The 1st row is empty.
But if i query where 'stid' or 'd' is null, then it returns nothing.
Is this a bug of postgresql 9b4? Or some data corruption?
EDIT :
I added my table definition.
CREATE TABLE st_daily
(
stid integer NOT NULL,
d date NOT NULL,
ticker character varying(15) NOT NULL,
mp integer NOT NULL,
settlep double precision NOT NULL,
prft integer NOT NULL,
atr20 double precision NOT NULL,
upd timestamp with time zone,
ntrds double precision
)
WITH (
OIDS=FALSE
);
CREATE TABLE st_daily2
(
CONSTRAINT st_daily2_pk PRIMARY KEY (stid, d, ticker),
CONSTRAINT st_daily2_strgs_fk FOREIGN KEY (stid)
REFERENCES strgs (stid) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE,
CONSTRAINT st_daily2_ck CHECK (stid >= 200000 AND stid < 300000)
)
INHERITS (st_daily)
WITH (
OIDS=FALSE
);
The data in this table is simulation results. Multithreaded multiple simulation engines written in c# insert data into the database using Npgsql.
psql also shows the empty row.
You'd better leave a posting at http://www.postgresql.org/support/submitbug
Some questions:
Could you show use the table
definitions and constraints for the
partions?
How did you load your data?
You get the same result when using
another tool, like psql?
The answer to your problem may very well lie in your first sentence:
I'm using postgresql 9.0 beta 4.
Why would you do that? Upgrade to a stable release. Preferably the latest point-release of the current version.
This is 9.1.4 as of today.
I got to the same point: "what in the heck is that blank value?"
No, it's not a NULL, it's a -infinity.
To filter for such a row use:
WHERE
case when mytestcolumn = '-infinity'::timestamp or
mytestcolumn = 'infinity'::timestamp
then NULL else mytestcolumn end IS NULL
instead of:
WHERE mytestcolumn IS NULL