when converting text/string to date in postgres, random date is generated - postgresql

I have a text column indicating a date, e.g. 20170101:
UPDATE table_name
SET work_date = to_date(workdate, 'YYYYMMDD');
I used this command to convert it to a date. However, I got an odd result. I read through other existing posts but I'm not sure what's wrong here.
+----------+------------+
| workdate | work_date  |
+----------+------------+
| 20170211 | 2207-05-09 |
| 20170930 | 2209-04-27 |
| 20170507 | 2208-02-29 |
| 20170318 | 2207-08-24 |
+----------+------------+

I think you must be mistaken about the data you are supplying to to_date: a valid string such as '20170211' converts to 2017-02-11, so the values actually stored in workdate cannot be the ones shown above. Note also that to_date does not validate its input:
"For example, input to these functions is not restricted by normal ranges, thus to_date('20096040','YYYYMMDD') returns 2014-01-17 rather than causing an error."
Source: https://www.postgresql.org/docs/9.6/static/functions-formatting.html
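Since to_date silently rolls over out-of-range fields, it can help to find the rows that are not strictly valid YYYYMMDD before running the UPDATE. A minimal sketch of that check in Python (the function name is hypothetical; Python's strict parser rejects what to_date accepts):

```python
from datetime import datetime

def valid_yyyymmdd(s: str) -> bool:
    """Return True only if s is a strictly valid YYYYMMDD date string."""
    try:
        datetime.strptime(s, "%Y%m%d")
        return True
    except ValueError:
        return False

print(valid_yyyymmdd("20170211"))  # True  -- a sane workdate value
print(valid_yyyymmdd("20096040"))  # False -- month 60, day 40 is rejected here,
                                   # but accepted by to_date per the docs
```

Running the suspect column through a check like this quickly shows whether the stored strings are really what the table display suggests.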

Related

How do I make a specific query in a PSQL database

I need help with a project. Right now, I have a giant PostgreSQL database (~6 million rows, 25 columns) and I need to figure out how to get the following information:
For one specific range of one attribute, "20<np<400":
Find the local minimum values of another variable, "itt"
Record the whole row of data for that local minimum
Add an extra column holding the next local maximum of itt
Edit: the attributes are as follows. The database has this schema: id | at | itt | engine_torque | ng | np | fuel_flow | engine_oil_pressure | engine_oil_temp | airspeed | altitude | total_air_temp | weight_on_wheels | p25_p3 | bypass | chip_counter | number_of_flight_counter | number_of_engine_run | run_id | ectm_file_id | aircraft_sn | engine_sn | gas_generator_sn | power_section_sn | tail_number
By 'local minima and maxima' I mean that within a certain range of np, I'm looking for the highest and lowest values of itt in close proximity with respect to time, i.e. "at".
Thanks in advance!
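The first two steps (filter the np band, find strict local minima of itt in time order) can be sketched outside the database, e.g. in Python; the function name is hypothetical and the column names come from the schema above:

```python
def local_minima(rows):
    """rows: dicts with keys 'at', 'np', 'itt' (column names from the schema).
    Returns the rows where itt is a strict local minimum, restricted to 20 < np < 400."""
    band = sorted((r for r in rows if 20 < r["np"] < 400), key=lambda r: r["at"])
    return [band[i] for i in range(1, len(band) - 1)
            if band[i - 1]["itt"] > band[i]["itt"] < band[i + 1]["itt"]]

# small synthetic example: itt dips at t=2 and t=4
sample = [{"at": t, "np": 100, "itt": v}
          for t, v in [(1, 5.0), (2, 3.0), (3, 6.0), (4, 4.0), (5, 7.0)]]
print([r["at"] for r in local_minima(sample)])  # [2, 4]
```

Finding the "next local maximum" would mirror this with the comparisons reversed, scanning forward from each minimum's timestamp; inside PostgreSQL the same neighbor comparisons can be expressed with lag()/lead() window functions ordered by at.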

How to restore data values that were already converted into scientific notation in a table

So I have a problem where I inserted some values into my table and they were automatically converted into scientific notation (e.g. 8.24e+04). Does anyone know how to restore the original values, or how to keep the original values in the table?
I'm using double precision as the data type for the column, and I just noticed that the double precision data type often converts long number values into scientific notation.
This is how the table looks after I inserted some values:
test=# select * from demo;
| string_col | values                |
|------------|-----------------------|
| Rocket     | 123228435521          |
| Test       | 13328422942213        |
| Power      | 1.243343991231232e+15 |
| Pull       | 1.233433459353712e+15 |
| Drag       | 1244375399128         |
edb=# \d+ demo
                                 Table "public.demo"
   Column   |         Type          | Collation | Nullable | Default | Storage  | Stats target | Description
------------+-----------------------+-----------+----------+---------+----------+--------------+-------------
 string_col | character varying(20) |           |          |         | extended |              |
 values     | double precision      |           |          |         | plain    |              |
Access method: heap
This is just a dummy table I used to explain my question here.
You'll have to format the number using to_char if you want it in a specific format:
SELECT 31672516735473059594023526::double precision,
       to_char(
           31672516735473059594023526::double precision,
           '999999999999999999999999999.99999999999999FM'
       );

        float8         │          to_char
═══════════════════════╪════════════════════════════
 3.167251673547306e+25 │ 31672516735473058997862400
(1 row)
The result is not exact because the precision of double precision is not high enough.
If you don't want the rounding errors and want to avoid scientific notation as well, use the data type numeric instead.
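The same precision loss can be reproduced outside the database, since Postgres double precision and a Python float are both IEEE 754 binary64; a short sketch using the integer from the example above:

```python
n = 31672516735473059594023526   # the 26-digit integer from the to_char example

d = float(n)        # binary64 keeps only ~15-17 significant decimal digits
print(d)            # displayed in scientific notation, like the float8 column
print(int(d) == n)  # False: the exact value was lost the moment it became a float
print(int(d) - n)   # the rounding error that to_char made visible
```

Python's decimal.Decimal plays the role of numeric here: Decimal(n) round-trips the integer exactly, which is why switching the column to numeric avoids both the rounding error and the scientific notation.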

SPSS group by rows and concatenate string into one variable

I'm trying to export SPSS metadata to a custom format using SPSS syntax. The dataset with value labels contains one or more labels for the variables.
However, now I want to concatenate the value labels into one string per variable. For example, for the variable SEX, combine or group the rows F/Female and M/Male into the single string F=Female;M=Male;. I already concatenated the codes and labels into a new variable using Compute CodeValueLabel = concat(Code,'=',ValueLabel).
so the starting point for the source dataset is like this:
+--------------+------+----------------+------------------+
| VarName | Code | ValueLabel | CodeValueLabel |
+--------------+------+----------------+------------------+
| SEX | F | Female | F=Female |
| SEX | M | Male | M=Male |
| ICFORM | 1 | Yes | 1=Yes |
| LIMIT_DETECT | 0 | Too low | 0=Too low |
| LIMIT_DETECT | 1 | Normal | 1=Normal |
| LIMIT_DETECT | 2 | Too high | 2=Too high |
| LIMIT_DETECT | 9 | Not applicable | 9=Not applicable |
+--------------+------+----------------+------------------+
The goal is to get a dataset something like this:
+--------------+-------------------------------------------------+
| VarName | group_and_concatenate |
+--------------+-------------------------------------------------+
| SEX | F=Female;M=Male; |
| ICFORM | 1=Yes; |
| LIMIT_DETECT | 0=Too low;1=Normal;2=Too high;9=Not applicable; |
+--------------+-------------------------------------------------+
I tried using CASESTOVARS, but that creates several separate variables rather than one single string variable. I'm starting to suspect that I'm running up against the limits of what SPSS can do, although maybe it's possible using some AGGREGATE or OMS trickery. Any ideas on how to do this?
First I recreate your example here to demonstrate on:
data list list/varName CodeValueLabel (2a30).
begin data
"SEX" "F=Female"
"SEX" "M=Male"
"ICFORM" "1=Yes"
"LIMIT_DETECT" "0=Too low"
"LIMIT_DETECT" "1=Normal"
"LIMIT_DETECT" "2=Too high"
"LIMIT_DETECT" "9=Not applicable"
end data.
Now to work:
* sorting to make sure all labels are bunched together.
sort cases by varName CodeValueLabel.
string combineall (a300).
* adding ";" .
compute combineall=concat(rtrim(CodeValueLabel), ";").
* if this is the same varname as last row, attach the two together.
if $casenum>1 and varName=lag(varName)
combineall=concat(rtrim(lag(combineall)), " ", rtrim(combineall)).
exe.
*now to select only relevant lines - first I identify them.
match files /file=* /last=selectthis /by varName.
*now we can delete the rest.
select if selectthis=1.
exe.
NOTE: make combineall wide enough to contain all the values of your most populated variable.
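For reference, the sort-then-concatenate-per-group logic the syntax above implements can be sketched compactly outside SPSS, e.g. in Python with itertools.groupby (data inlined from the example):

```python
from itertools import groupby

rows = [("SEX", "F=Female"), ("SEX", "M=Male"),
        ("ICFORM", "1=Yes"),
        ("LIMIT_DETECT", "0=Too low"), ("LIMIT_DETECT", "1=Normal"),
        ("LIMIT_DETECT", "2=Too high"), ("LIMIT_DETECT", "9=Not applicable")]

# like the SPSS solution, the input must be sorted by the grouping key first
combined = {name: "".join(label + ";" for _, label in grp)
            for name, grp in groupby(sorted(rows), key=lambda r: r[0])}

print(combined["SEX"])           # F=Female;M=Male;
print(combined["LIMIT_DETECT"])  # 0=Too low;1=Normal;2=Too high;9=Not applicable;
```

The dictionary comprehension does in one pass what the SORT CASES / lag() accumulation / MATCH FILES selection does row by row.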

Storing dates in postgresql arrays

I have a large person-oriented historical dataset, which includes birth-dates recorded in YYYY, YYYY-MM, or YYYY-MM-DD format. I've been thinking I should use a date[] array for this field because the dataset frequently lists two or more birth-dates.
PG docs say that ISO 8601 dates are supported, and ISO 8601 (wikipedia link) accommodates reduced precision, but PG doesn't let me add a reduced-precision date (like 1882-11 for November 1882).
So, what's the best approach for handling records that need to contain multiple birth-dates that might look like 1883, 1882-11, or 1882-12-12?
Let's imagine you have a table person with
+-----------+--------+---------+-----------------------------+
| person_id | fname  | lname   | bdate[]                     |
+-----------+--------+---------+-----------------------------+
| 1         | 'Jhon' | 'Smith' | {1883, 1882-11, 1882-12-12} |
+-----------+--------+---------+-----------------------------+
You don't want that, because then it is hard to search for a single date or to update the array.
Instead you want an additional table birthdays:
+-------------+------+------------+
| birthday_id | type | bdate |
+-------------+------+------------+
| 1 | 1 | 1883-01-01 |
| 1 | 2 | 1882-11-01 |
| 1 | 3 | 1882-12-12 |
+-------------+------+------------+
This way, even though the date is saved as 1883-01-01, type = 1 tells you that it really means just the year 1883.
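The padding rule behind that table (fill the missing parts with 01 and record how many parts were really known) can be sketched like this; the function name and the 1/2/3 precision codes are assumptions matching the type column above:

```python
from datetime import date

def normalize_bdate(s: str):
    """'1883' / '1882-11' / '1882-12-12' -> (full date, precision code 1/2/3)."""
    parts = [int(p) for p in s.split("-")]
    precision = len(parts)              # how many components were actually known
    parts += [1] * (3 - len(parts))     # pad missing month/day with 01
    return date(*parts), precision

print(normalize_bdate("1883"))        # (datetime.date(1883, 1, 1), 1)
print(normalize_bdate("1882-11"))     # (datetime.date(1882, 11, 1), 2)
print(normalize_bdate("1882-12-12"))  # (datetime.date(1882, 12, 12), 3)
```

Normalizing on the way in keeps the bdate column a plain date, so it stays indexable and comparable, while the precision code preserves what the source actually said.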

Finding the last seven days in a time series

I have a spreadsheet with column A which holds a timestamp and updates daily. Column B holds a value. Like the following:
+--------------------+---------+
| 11/24/2012 1:14:21 | $487.20 |
| 11/25/2012 1:14:03 | $487.20 |
| 11/26/2012 1:14:14 | $487.20 |
| 11/27/2012 1:14:05 | $487.20 |
| 11/28/2012 1:13:56 | $487.20 |
| 11/29/2012 1:13:57 | $487.20 |
| 11/30/2012 1:13:53 | $487.20 |
| 12/1/2012 1:13:54  | $492.60 |
+--------------------+---------+
What I am trying to do is get the average of the last 7, 14, or 30 days.
I've been playing with the GoogleClock() function to filter the dates in column A, but I can't seem to find a way to subtract TODAY - 7 days. I suspect FILTER will also help, but I am a little bit lost.
There are a few ways to go about this; one way is to return an array of values with a QUERY function (this assumes a header row in row 1, and you want the last 7 dates):
=QUERY(A2:B;"select B order by A desc limit 7";0)
and you can wrap this in whatever aggregation function you like:
=AVERAGE(QUERY(A2:B;"select B order by A desc limit 7";0))
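The formula's logic (sort by timestamp descending, take the newest n rows, average column B) looks like this in Python, using the sample data from the question; the helper name is hypothetical:

```python
from datetime import datetime

rows = [("11/24/2012 1:14:21", 487.20), ("11/25/2012 1:14:03", 487.20),
        ("11/26/2012 1:14:14", 487.20), ("11/27/2012 1:14:05", 487.20),
        ("11/28/2012 1:13:56", 487.20), ("11/29/2012 1:13:57", 487.20),
        ("11/30/2012 1:13:53", 487.20), ("12/1/2012 1:13:54", 492.60)]

def avg_last(rows, n):
    """Average the n most recent values, like AVERAGE over the QUERY result."""
    newest_first = sorted(rows, reverse=True,
                          key=lambda r: datetime.strptime(r[0], "%m/%d/%Y %H:%M:%S"))
    vals = [v for _, v in newest_first[:n]]
    return sum(vals) / len(vals)

print(round(avg_last(rows, 7), 2))  # 487.97
```

Swapping 7 for 14 or 30 gives the other windows, just as changing the limit clause does in the QUERY formula.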