I am working with INSEE's professional mobility data, which has more than 1 million rows. I want to sum a field called ipondi only over the journeys from the commune of residence to the commune of work, not from the commune of work back to the commune of residence.
Here is a simple example, where the commune of residence is in the column "start", the commune of work is in the column "end", and the field I want to sum is "ipondi":
start; end; ipondi
La Ciotat; Marseille; 84
La Ciotat; Marseille; 15
Aubagne; Ceyreste; 12
Marseille; La Ciotat; 73
I run the following query:
select start, "end", sum(ipondi)
from trajets
group by start, "end"
So I get the following result:
La Ciotat; Marseille; 99
Aubagne; Ceyreste; 12
Marseille; La Ciotat; 73
That is expected. However, I would like to drop the Marseille; La Ciotat row, because it is the return journey of the first two rows.
The result I want is this:
start; end; ipondi
La Ciotat; Marseille; 99
Aubagne; Ceyreste; 12
Link to my database: https://drive.google.com/file/d/1TOB1MTAt8UNCjt0up6qcgnR593yMXkqt/view?usp=sharing
How can I do this in PostgreSQL?
Thank you.
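One possible reading of the expected result is that, for each pair of communes, only the direction with the larger total should be kept. The sketch below is a minimal illustration under that assumption (the tie-break on commune names and the handling of rows where start equals end are also assumptions, not something stated in the question). It runs the query through Python's built-in sqlite3 module on the sample rows; the SQL uses only a CTE and a self-join, so it should carry over to PostgreSQL with little or no change.

```python
import sqlite3

# Sample rows from the question; the real table has over a million rows.
rows = [
    ("La Ciotat", "Marseille", 84),
    ("La Ciotat", "Marseille", 15),
    ("Aubagne", "Ceyreste", 12),
    ("Marseille", "La Ciotat", 73),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE trajets ("start" TEXT, "end" TEXT, ipondi INTEGER)')
conn.executemany("INSERT INTO trajets VALUES (?, ?, ?)", rows)

QUERY = """
WITH totals AS (
    -- one row per direction, as in the original GROUP BY
    SELECT "start", "end", SUM(ipondi) AS ipondi
    FROM trajets
    GROUP BY "start", "end"
)
SELECT t."start", t."end", t.ipondi
FROM totals AS t
LEFT JOIN totals AS r              -- r is the reverse direction, if any
       ON r."start" = t."end"
      AND r."end" = t."start"
      AND t."start" <> t."end"     -- a commune paired with itself has no reverse
WHERE r."start" IS NULL            -- no reverse journey: keep
   OR t.ipondi > r.ipondi          -- assumption: keep the larger direction
   OR (t.ipondi = r.ipondi AND t."start" < t."end")  -- assumed tie-break
ORDER BY t.ipondi DESC
"""

result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

If the direction to keep is defined differently in the real data (for example by a flag that marks the commune of residence), only the WHERE clause would need to change.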
I have the following table:
t:([]date:.z.d+neg til 100; region:100#(`us`eu`asia`au); product:100#(`car`insurance`loan`other); qty:100?1f);
date region product qty
---------------------------------------
2019.11.28 us car 0.3998898
2019.11.27 eu insurance 0.592308
2019.11.26 asia loan 0.9796166
...
I want to check whether qty has stayed above a given threshold (say 0.5) for more than n days (say 3). Here "days" can simply mean the number of consecutive observations.
I want to check this for each of my 4 products, and by region.
I thought about applying a rolling function over n days, using the following function:
myfunc:{[l]
:all l>0.5;
};
where 0.5 is my threshold, but it does not really work.
One approach using fby:
q)consec:{s in(s:sums[differ y])where(x-1){x&prev x}/y};
q)`region`product`date xdesc select from t where (consec[3];0.5<qty)fby([]region;product)
date region product qty
-------------------------------------
2019.11.04 us car 0.949975
2019.10.31 us car 0.8481567
2019.10.27 us car 0.9367503
2019.09.28 eu insurance 0.6884756
2019.09.24 eu insurance 0.9598964
2019.09.20 eu insurance 0.6789082
2019.09.16 eu insurance 0.726781
2019.09.12 eu insurance 0.5830262
2019.09.08 eu insurance 0.7750292
2019.11.21 au other 0.5347096
2019.11.17 au other 0.5785203
2019.11.13 au other 0.6137452
2019.11.09 au other 0.6919531
2019.09.30 au other 0.7278528
2019.09.26 au other 0.7520102
2019.09.22 au other 0.6430982
2019.09.18 au other 0.9877844
2019.09.14 au other 0.8355065
2019.09.10 au other 0.9149882
2019.09.06 au other 0.6324114
/to make it a sequence of 5 consecutive
`region`product`date xdesc select from t where (consec[5];0.5<qty)fby([]region;product)
/to make the threshold >0.6
`region`product`date xdesc select from t where (consec[5];0.6<qty)fby([]region;product)
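As I read it, consec[n] flags every element that belongs to a run of at least n consecutive true values, and fby applies that within each region/product group. The Python sketch below illustrates only that run-marking idea for a single group; the function name and the sample qty values are made up for illustration and are not part of the answer above.

```python
from itertools import groupby


def in_run_of_at_least(flags, n):
    """Mark each position that lies inside a run of >= n consecutive True values."""
    marks = []
    for value, run in groupby(flags):
        length = sum(1 for _ in run)  # size of this run of equal values
        marks.extend([bool(value) and length >= n] * length)
    return marks


# Hypothetical qty series for one (region, product) group, in date order.
qty = [0.7, 0.6, 0.2, 0.9, 0.8, 0.55, 0.1]
above = [q > 0.5 for q in qty]          # the 0.5<qty condition
marks = in_run_of_at_least(above, 3)    # the consec[3] step
kept = [q for q, keep in zip(qty, marks) if keep]
print(kept)
```

The first two values are above the threshold but form a run of only two, so they are dropped; the run of three that follows is kept.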
I have a table like this
org_ID linenr text
811558672 10 Legevirksomhet.
811560782 10 Clavier Classics er et musikkselskap som produserer komposisjoner og
811560782 20 arrangementer av svært høy kvalitet. De kombinerer den klassiske
811560782 30 musikktradisjonen med moderne teknikker og deres kunder spenner fra
811560782 40 individuelle musikere til ensembler, festivalarrangører, konserthus,
811560782 50 kulturinstitusjoner, eventskapere og mediaprodusenter.
811560812 10 Grafisk design, illustrasjon og nærliggende virksomhet.
811561592 10 Sosial- og helsetjenesten. Konsulentvirksomhet: Veiledning til
811561592 20 foreldre, fosterhjem, skole og barnehage.
As you can see, some org_ID values appear several times because one line of text is not enough for them; in that case linenr numbers the lines. I want to concatenate the lines of text into one when org_ID is the same. How can I do this? Many thanks in advance.
Use the SAS RETAIN statement to accumulate the text, and output a row only when a new org_ID is read.
Note: The two IF statements handle the first and last rows, where there is no previous ID or no next ID.
Working code (your input data must be sorted by org_ID):
data have;
infile datalines dlm=',' dsd;
length org_ID 8. linenr 8. text $200.;
input org_ID linenr text $;
datalines;
811558672,10, "Legevirksomhet."
811560782,10, "Clavier Classics er et musikkselskap som produserer komposisjoner og"
811560782,20, "arrangementer av svært høy kvalitet. De kombinerer den klassiske"
811560782,30, "musikktradisjonen med moderne teknikker og deres kunder spenner fra"
811560782,40, "individuelle musikere til ensembler, festivalarrangører, konserthus,"
811560782,50, "kulturinstitusjoner, eventskapere og mediaprodusenter."
811560812,10, "Grafisk design, illustrasjon og nærliggende virksomhet."
811561592,10, "Sosial- og helsetjenesten. Konsulentvirksomhet: Veiledning til"
811561592,20, "foreldre, fosterhjem, skole og barnehage."
;
run;
data want;
set have nobs=nobs;
retain longtext;
retain id;
if(_N_=1) then do; longtext=text; id=org_ID; end;
else if org_ID ne id then do; output; longtext=text; id=org_ID; end;
else longtext=cats(longtext,text);
if (_N_=nobs) then do; output; end;
/* keep the retained id: when a row is output, org_ID may already belong to the next group */
keep id longtext;
rename id=org_ID;
run;
Output:
org_ID=811558672 longtext=Legevirksomhet.
org_ID=811560782 longtext=Clavier Classics er et musikkselskap som produserer komposisjoner ogarrangementer av svært høy kva
org_ID=811560812 longtext=Grafisk design, illustrasjon og nærliggende virksomhet.
org_ID=811561592 longtext=Sosial- og helsetjenesten. Konsulentvirksomhet: Veiledning tilforeldre, fosterhjem, skole og barneha
A DOW loop can accumulate each line of text in the org_ID group into a final longtext. The longtext should be assigned a specific length in order to prevent truncations that may occur if default lengths are used. You may or may not want a space separator between lines that are concatenated.
data want(keep=org_ID longtext);
do until (last.org_ID);
set have;
by org_ID;
length longtext $2000;
longtext = catx(' ', longtext, text);
end;
run;
If the data is not sorted, but the org_ID rows are contiguous, you can use
by org_ID notsorted;
So what is happening?
longtext is not a data set variable, so it is automatically reset to missing at the top of each data step iteration.
The DO UNTIL loop reads each row in the group until the last row in the group, and the implicit output at the bottom of the step then writes one row per group.
The length of longtext is specified after the set statement so that it is the last variable in the program data vector (PDV), and thus the second column of the kept variables.
catx is used to accumulate the concatenation of the text values within the group. A space separates the parts.
If you do not want the space separator, accumulate using
longtext = cats(longtext, text);
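For readers who do not use SAS, the same group-and-accumulate idea can be sketched in Python. This is only an analogy to the DOW loop above, not a translation of it; the function name is hypothetical and the rows are a subset of the sample data. Like by org_ID notsorted, itertools.groupby only needs the rows of each org_ID to be contiguous.

```python
from itertools import groupby
from operator import itemgetter

# (org_ID, linenr, text) rows, contiguous per org_ID.
rows = [
    (811558672, 10, "Legevirksomhet."),
    (811561592, 10, "Sosial- og helsetjenesten. Konsulentvirksomhet: Veiledning til"),
    (811561592, 20, "foreldre, fosterhjem, skole og barnehage."),
]


def concat_by_group(rows, sep=" "):
    """Return one (org_ID, longtext) pair per contiguous org_ID group."""
    out = []
    for org_id, group in groupby(rows, key=itemgetter(0)):
        # Join the text parts of the group, like catx(' ', longtext, text).
        out.append((org_id, sep.join(text for _, _, text in group)))
    return out


want = concat_by_group(rows)
for org_id, longtext in want:
    print(org_id, longtext)
```

Passing sep="" gives the cats-style result with no separator between the lines.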
Is there any way to sort a string field that contains a number by its number only?
I have values like this:
subject_code
DE 312
DE 313
DE 315
Eng 311
COMP 314
Can it be sorted like this?
subject_code
Eng 311
DE 312
DE 313
COMP 314
DE 315
I tried
order by SOUNDEX(subject_code),LENGTH(subject_code),subject_code
but it does not work as expected.
Thank you for any help and suggestions.
One workaround uses string operations to extract the numeric part of the subject code and sort on it.
SELECT
subject_code
FROM yourTable
ORDER BY
CAST(SUBSTR(subject_code,
INSTR(subject_code, ' ') + 1) AS UNSIGNED)
However, you should really be storing the text and numerical code in separate columns.
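To see the ordering on the sample values, here is a small sketch that runs the same idea through Python's built-in sqlite3 module. It assumes every subject_code has exactly one space before the number; note that SQLite spells the cast AS INTEGER, whereas the MySQL query above uses AS UNSIGNED.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (subject_code TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?)",
    [("DE 312",), ("DE 313",), ("DE 315",), ("Eng 311",), ("COMP 314",)],
)

# Sort on the text after the first space, converted to a number.
QUERY = """
SELECT subject_code
FROM yourTable
ORDER BY CAST(SUBSTR(subject_code, INSTR(subject_code, ' ') + 1) AS INTEGER)
"""

ordered = [row[0] for row in conn.execute(QUERY)]
print(ordered)
```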
Given a Postgres database with extensions such as address_standardizer, how can I run the statement below with the Query Builder?
SELECT pprint_addy(normalize_address('202 East Fremont Street, Las Vegas, Nevada 89101'));
Which yields:
202 E Fremont St, Las Vegas, NV 89101
You can run raw select statements with the DB::select() method:
$data = DB::select('SELECT pprint_addy(normalize_address(?)) as address', ['202 East Fremont Street, Las Vegas, Nevada 89101']);
print_r($data);
Friends, I wrote the code below to load data into the newemp table using UTL_FILE, but I get an error:
1 declare
2 EMPNO NUMBER(4);
3 ENAME VARCHAR2(10);
4 JOB VARCHAR2(10);
5 MGR NUMBER(4);
6 HIREDATE DATE;
7 SAL NUMBER(7,2);
8 COMM NUMBER(7,2);
9 DEPTNO NUMBER(2);
10 line varchar2(100);
11 namesfile UTL_FILE.FILE_TYPE;
12 begin
13 namesfile :=UTL_FILE.FOPEN('DIPRJDIR','empdata.txt','R');
14 loop
15 UTL_FILE.GET_LINE(namesfile,EMPNO,4);
16 dbms_output.put_line('EMPNO :' || EMPNO);
17 UTL_FILE.GET_LINE(namesfile,ENAME,10);
18 dbms_output.put_line('ENAME :' || ENAME);
19 UTL_FILE.GET_LINE(namesfile,JOB,9);
20 dbms_output.put_line('JOB :' || JOB);
21 UTL_FILE.GET_LINE(namesfile,MGR,4);
22 dbms_output.put_line('MGR :' || MGR);
23 UTL_FILE.GET_LINE(namesfile,HIREDATE,5);
24 dbms_output.put_line('HIREDATE :' || HIREDATE);
25 UTL_FILE.GET_LINE(namesfile,SAL,9);
26 dbms_output.put_line('SAL :' || SAL);
27 UTL_FILE.GET_LINE(namesfile,COMM,9);
28 dbms_output.put_line('COMM :' || COMM);
29 UTL_FILE.GET_LINE(namesfile,DEPTNO,2);
30 dbms_output.put_line('DEPTNO :' || DEPTNO);
31 insert into newemp values(EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO);
32 end loop;
33 utl_file.fclose(namesfile);
34* end;
SQL> /
EMPNO :7839
ENAME :KING
JOB :PRESIDENT
MGR :0
ERROR
declare
*
ERROR at line 1:
ORA-01843: not a valid month
ORA-06512: at line 23
My data is in the following format:
7839KING PRESIDENT 000017-nov-1981 005000.00 000000.0010
Please help me.
That's because the UTL_FILE.GET_LINE procedure takes a VARCHAR2 as its second parameter.
If you pass something other than a VARCHAR2, Oracle tries to implicitly convert the text to that type.
This worked well enough for the numbers, but for a DATE, Oracle uses NLS_DATE_FORMAT, which may not match the format of the string in the file. Note also that the call on line 23 asks for at most 5 bytes, which looks too short for a value such as 17-nov-1981.
You can fix it by reading into a VARCHAR2 in the UTL_FILE.GET_LINE call and then using the TO_DATE function with an explicit format to convert it to a DATE.
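The principle of reading the field as text and then converting it with an explicit format is not specific to Oracle. As a rough illustration only, here is the same step in Python, where the format string "%d-%b-%Y" plays the role that a mask such as 'DD-MON-YYYY' would play in TO_DATE. The variable names are made up, and the sketch assumes the default C/English locale for month abbreviations.

```python
from datetime import datetime

# The date exactly as it appears in the data file, kept as text first.
raw_hiredate = "17-nov-1981"

# Convert with an explicit format instead of relying on a session default.
hiredate = datetime.strptime(raw_hiredate, "%d-%b-%Y").date()
print(hiredate.isoformat())
```

A truncated value such as "17-no" fails to parse here as well, which mirrors the "not a valid month" error in the question.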
But there is a better way to do this job:
you can use an external table to read the file and then insert-select from it.