I need to store a location (an x,y point) in my database, where the point can be null and X and Y are always less than 999. At the moment I'm using EF Core Code First and a PostgreSQL database, but I'd like to stay flexible so that I can switch to MSSQL without too much work. I'm not planning to move away from EF Core.
Right now I have two columns, LocationX and LocationY, both of type int?. I'm not sure this is a good solution, because technically the DB allows (X=2, Y=null), and it shouldn't: either both are null, or both are not.
My second option is to store it in a single string column, e.g. "123x321", with a max length of 7.
Is there a better way?
Thanks,
A check constraint could be used to enforce that both columns are NULL or NOT NULL at the same time:
CREATE TABLE t(id INT,
x INT,
y INT,
CHECK((x IS NULL AND y IS NULL) OR (x IS NOT NULL AND y IS NOT NULL))
);
db<>fiddle demo
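For illustration, a minimal sketch of how the constraint behaves against the table above (values made up):
INSERT INTO t VALUES (1, 2, 3);        -- both set: accepted
INSERT INTO t VALUES (2, NULL, NULL);  -- both null: accepted
INSERT INTO t VALUES (3, 2, NULL);     -- mixed: rejected by the check constraint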
In addition to the check constraint suggested by @LukaszSzozda, you can restrict the x and y values with an additional check constraint on each. So assuming they must also be in the range 0-999:
CREATE TABLE t(id INT,
x INT constraint x_range check ( x>=0 and x<=999),
y INT constraint y_range check ( y>=0 and y<=999),
CHECK((x IS NULL AND y IS NULL) OR (x IS NOT NULL AND y IS NOT NULL))
);
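A quick sketch of what the range constraints now reject (values made up):
INSERT INTO t VALUES (1, 10, 20);    -- accepted
INSERT INTO t VALUES (2, 1000, 20);  -- rejected by x_range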
As for your idea of storing a single string: very bad. Not only will you have to separate the values every time you need them, it also allows distinctly invalid data. Values like '1234567' and even 'abcdefg' are completely valid as far as the database is concerned.
So your table definition must account for and eliminate them. With that, it becomes:
create table txy
( xy_string varchar(7)
, constraint xy_format check( xy_string ~* '^\d{1,3}x\d{1,3}$')
);
insert into txy(xy_string)
( values ('1x2'), ('354X512'), ('38x92') );
Which is actually a reduction, as it is back to a single constraint, but your queries now require something like:
select xy_string
, regexp_replace(xy_string, '^(\d+)(X|x)(\d+)','\1') x
, regexp_replace(xy_string, '^(\d+)(X|x)(\d+)','\3') y
from txy;
See demo here.
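As an aside (not from the original answer), split_part offers a simpler way to pull the two pieces back apart; a sketch against the txy table above:
select xy_string
     , split_part(lower(xy_string), 'x', 1)::int as x
     , split_part(lower(xy_string), 'x', 2)::int as y
from txy;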
In short: never store groups of numeric values as a single delimited string. The additional work is just not worth it.
Related
I can't speed up my query by adding indexes.
I create a table:
CREATE TABLE IF NOT EXISTS coordinate( Id serial primary key,
Lat DECIMAL(9,6),
Lon DECIMAL(9,6));
After that I add indexes:
CREATE INDEX indeLat ON coordinate(Lat);
CREATE INDEX indeLon ON coordinate(Lon);
Then the table is filled in:
INSERT INTO coordinate (Lat, Lon) VALUES(48.685444, 44.474254);
I fill it with 100k random coordinates.
Now I need to return all coordinates that are included in a radius of N km from a given coordinate.
SELECT id, Lat, Lon
FROM coordinate
WHERE acos(sin(radians(48.704578))*sin(radians(Lat)) + cos(radians(48.704578))*cos(radians(Lat))*cos(radians(Lon)-radians(44.507112))) * 6371 < 50;
The test execution time is approximately 0.2 seconds, and if I do not run CREATE INDEX the time does not change. I suspect there is an error in the query; maybe it needs to be rewritten somehow?
I'm sorry for my english
An index can only be used if the indexed expression is exactly what you have on the non-constant side of the operator. That is obviously not the case here.
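To illustrate (a simplified sketch against the question's coordinate table, not a fix for the distance query itself): the planner can only use an index whose indexed expression matches the WHERE expression exactly.
-- The plain index on Lat (indeLat) is not usable here, because the column
-- only appears inside function calls:
SELECT id FROM coordinate WHERE sin(radians(Lat)) > 0.5;
-- An expression index on exactly that expression could be used instead:
CREATE INDEX ON coordinate (sin(radians(Lat)));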
For operations like this, you need to use the PostGIS extension. Then you can define a table like:
CREATE TABLE coordinate (
id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
p geography NOT NULL
);
and query like this (WKT points are written longitude first, and geography distances are in meters, so 50 km is 50000):
SELECT id, p
FROM coordinate
WHERE ST_DWithin(p, 'POINT(44.507112 48.704578)'::geography, 50000);
This index would speed up the query:
CREATE INDEX ON coordinate USING gist (p);
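If the existing Lat/Lon rows need to be carried over into the geography column, a conversion sketch might look like this (coordinate_old is a hypothetical name for the original lat/lon table):
INSERT INTO coordinate (p)
SELECT ST_SetSRID(ST_MakePoint(Lon, Lat), 4326)::geography  -- ST_MakePoint takes longitude first
FROM coordinate_old;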
I have data in my database and I need to select all rows where one column's number is between 1 and 100.
I'm having problems because I can't use BETWEEN 1 AND 100; that column is character varying, not integer. But all the data are numbers (I can't change the column to integer).
Code:
dst_db1.eachRow("Select length_to_fault from diags where length_to_fault between 1 AND 100")
Error - operator does not exist: character varying >= integer
Since your column is supposed to contain numeric values but is defined as text (or a variation of text), there will be times when it does not. You need two validations: that the column actually contains numeric data and that the value falls within your range restriction. So add the following predicates to your query:
and length_to_fault ~ '^\+?\d+(\.\d*)?$'
and length_to_fault::numeric <@ ('[1.0,100.0]')::numrange;
The first is a regexp that ensures the column holds a valid floating point value. The second ensures the numeric value falls within the specified numeric range. See fiddle.
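Put together, the query would look something like this (a sketch assuming the diags table from the question):
select length_to_fault
from diags
where length_to_fault ~ '^\+?\d+(\.\d*)?$'
  and length_to_fault::numeric <@ '[1.0,100.0]'::numrange;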
I understand you cannot change the database, but this looks like a good place for a check constraint, especially if 'n/a' is the only non-numeric value allowed. You may want to talk with your DBA and see about the following constraint:
alter table diags
add constraint length_to_fault_check
check ( lower(length_to_fault) = 'n/a'
or ( length_to_fault ~ '^\+?\d+(\.\d*)?$'
and length_to_fault::numeric <@ ('[1.0,100.0]')::numrange
)
);
Then your query need only check that:
lower(length_to_fault) != 'n/a'
The PostgreSQL query below will also work, provided every value is numeric once whitespace and plus signs are stripped:
SELECT length_to_fault FROM diags WHERE regexp_replace(length_to_fault, '[\s+]', '', 'g')::numeric BETWEEN 1 AND 100;
Let's say I have a table with several columns [a, b, c, d] which can all be nullable. This table is managed with Typeorm.
I want to create a unique constraint on [a, b, c]. However, this constraint does not work if one of these columns is NULL. I can insert, for instance, [a=0, b=1, c=NULL, d=0] and [a=0, b=1, c=NULL, d=1], where d has different values.
With raw SQL I could set multiple partial constraints (Create unique constraint with null columns); however, in my case the unique constraint is on 10 columns. It seems absurd to set a constraint for every possible combination...
I could also create a sort of hash function, but this method does not seem proper to me?
Does Typeorm provide a solution for such cases?
If you have values that can never appear in those columns, you can use them as a replacement in the index:
create unique index on the_table (coalesce(a,-1), coalesce(b, -1), coalesce(c, -1));
That way NULL values are treated the same inside the index, without the replacement values ever appearing in the table.
If those columns are numeric or float (rather than integer or bigint) using '-Infinity' might be a better substitution value.
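A quick sketch of what that index enforces, assuming the_table has the four columns a, b, c, d from the question (values made up):
INSERT INTO the_table VALUES (0, 1, NULL, 0);  -- accepted
INSERT INTO the_table VALUES (0, 1, NULL, 1);  -- rejected as a duplicate key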
There is a drawback to this though:
This index will, however, not be usable for queries on those columns unless you also use the coalesce() expression. So with the above index a query like:
select *
from the_table
where a = 10
and b = 100;
would not use the index. You would need to use the same expressions as used in the index itself:
select *
from the_table
where coalesce(a, -1) = 10
and coalesce(b, -1) = 100;
I need to create some customer number on record insert, format is 'A' + 4 digits, based on the ID. So record ID 23 -> A0023 and so on. My solution is currently this:
-- Table
create table t (
id bigserial unique primary key,
x text,
y text
);
-- Insert
insert into t (x, y) select concat('A',lpad((currval(pg_get_serial_sequence('t','id')) + 1)::text, 4, '0')), 'test';
This works perfectly. Now my question is: is that 'safe', in the sense that currval(seq)+1 is guaranteed to be the same value the id column will receive? I think it should be locked during statement execution. Is this the correct way to do it, or is there any shortcut to access the to-be-created ID directly?
Instead of storing this data, you could just query it each time you need it, making the whole thing a lot less error-prone:
SELECT id, 'A' || LPAD(id::varchar, 4, '0')
FROM t
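If you'd rather not repeat that expression in application code, one option (a sketch, not part of the original answer; the view name is made up) is to wrap it in a view:
CREATE VIEW t_with_customer_number AS
SELECT id, x, y,
       'A' || lpad(id::text, 4, '0') AS customer_number
FROM t;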
I'm using postgresql 9.0 beta 4.
After inserting a lot of data into a partitioned table, I found a weird thing. When I query the table, I can see an empty row with null-like values in 'not-null' fields.
In the weird query result, the 689th row is empty. The first three fields (stid, d, ticker) compose the primary key, so they should not be null. The query I used is this:
select * from st_daily2 where stid=267408 order by d
I can even do a GROUP BY on this data.
select stid, date_trunc('month', d) ym, count(*) from st_daily2
where stid=267408 group by stid, date_trunc('month', d)
The GROUP BY result still has the empty row (the first row is empty).
But if I query where stid or d is null, it returns nothing.
Is this a bug in PostgreSQL 9.0 beta 4, or some data corruption?
EDIT: I added my table definitions.
CREATE TABLE st_daily
(
stid integer NOT NULL,
d date NOT NULL,
ticker character varying(15) NOT NULL,
mp integer NOT NULL,
settlep double precision NOT NULL,
prft integer NOT NULL,
atr20 double precision NOT NULL,
upd timestamp with time zone,
ntrds double precision
)
WITH (
OIDS=FALSE
);
CREATE TABLE st_daily2
(
CONSTRAINT st_daily2_pk PRIMARY KEY (stid, d, ticker),
CONSTRAINT st_daily2_strgs_fk FOREIGN KEY (stid)
REFERENCES strgs (stid) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE,
CONSTRAINT st_daily2_ck CHECK (stid >= 200000 AND stid < 300000)
)
INHERITS (st_daily)
WITH (
OIDS=FALSE
);
The data in this table is simulation results. Multiple multithreaded simulation engines written in C# insert data into the database using Npgsql.
psql also shows the empty row.
You'd better leave a posting at http://www.postgresql.org/support/submitbug
Some questions:
Could you show us the table definitions and constraints for the partitions?
How did you load your data?
Do you get the same result when using another tool, like psql?
The answer to your problem may very well lie in your first sentence:
I'm using postgresql 9.0 beta 4.
Why would you do that? Upgrade to a stable release. Preferably the latest point-release of the current version.
This is 9.1.4 as of today.
I got to the same point: "what in the heck is that blank value?"
No, it's not a NULL, it's a -infinity.
To filter for such a row use:
WHERE
  CASE WHEN mytestcolumn = '-infinity'::timestamp
         OR mytestcolumn = 'infinity'::timestamp
       THEN NULL ELSE mytestcolumn END IS NULL
instead of:
WHERE mytestcolumn IS NULL
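To see the difference for yourself, a small sketch with a hypothetical table reproduces it:
CREATE TABLE infinity_demo (ts timestamp);
INSERT INTO infinity_demo VALUES ('-infinity'), ('2012-01-01'), (NULL);
-- The blank-looking value is -infinity, not NULL:
SELECT ts, ts = '-infinity'::timestamp AS is_neg_infinity, ts IS NULL AS is_null
FROM infinity_demo;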