How to export data from a table to a CSV file sorted on one column - oracle-sqldeveloper

I have 2 billion records in a table in SQL Developer and want to export them to a CSV file, but during the export I want the data sorted on one column in ascending order. Is there an efficient or quick way to do this?
For example, suppose the table is named TEMP and I want to sort on the A_KEY column in ascending order and then export it:
TEMP
P_ID  ADDRESS      A_KEY
1     242 Street   4
2     242 Street   5
3     242 Street   3
4     242 Long St  1
Expected result in the CSV file:
P_ID, ADDRESS, A_KEY
4, 242 Long St, 1
3, 242 Street, 3
1, 242 Street, 4
2, 242 Street, 5
I have tried the query below:
insert into temp2 select * from TEMP order by A_KEY ASC;
and then exporting that table from SQL Developer, but is there a more efficient or quicker way to export the records directly, without the intermediate table?

Don't waste time creating a new table (TEMP2): the ORDER BY clause on an INSERT means nothing, because it guarantees nothing about the order in which rows are stored or later read back. It is the ORDER BY on the SELECT statement you export that matters.
Therefore, run
select * from temp order by a_key;
and export result returned by that query.
2 billion rows? It'll take time. What will you do with such a CSV file? That's difficult to work with.
If you're trying to move data into another Oracle database, then consider using Data Pump export and import utilities which are designed for such a purpose.
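If the export does have to happen, a command-line spool will cope with this volume better than the GUI export dialog. A minimal sketch using SQLcl (Oracle's command-line companion to SQL Developer); the output path is an assumption for illustration:
-- run in SQLcl; SET SQLFORMAT renders query results as CSV
SET SQLFORMAT csv
SET FEEDBACK OFF     -- suppress the row-count footer
SPOOL /tmp/temp_sorted.csv
SELECT * FROM temp ORDER BY a_key;
SPOOL OFF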

Related

More efficient way to fuzzy match large datasets in SAS

I have a dataset with over 33 million records that includes a name field. I need to flag records where this name field value also appears in a second dataset of about 5 million records. For my purposes, a fuzzy match would be both acceptable and beneficial.
I wrote the following program to do this. It works, but it has now been running for 4 days, so I'd like to find a more efficient way to write it.
proc sql noprint;
   create table INDIV_MATCH as
   select A.NAME,
          SPEDIS(A.NAME, B.NAME) as SPEDIS_VALUE,
          COMPGED(A.NAME, B.NAME) as COMPGED_SCORE
   from DATASET1 A join DATASET2 B
        on COMPGED(A.NAME, B.NAME) le 400 and SPEDIS(A.NAME, B.NAME) le 10
   order by A.NAME;
quit;
Any help would be much appreciated!
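No answer is recorded for this question, but the usual remedy for this pattern is to add a cheap "blocking" predicate so the expensive COMPGED and SPEDIS calls only run on plausible pairs. A minimal sketch; the first-letter blocking rule is an assumption about what is acceptable here, not part of the original post:
proc sql noprint;
   create table INDIV_MATCH as
   select A.NAME,
          SPEDIS(A.NAME, B.NAME) as SPEDIS_VALUE,
          COMPGED(A.NAME, B.NAME) as COMPGED_SCORE
   from DATASET1 A join DATASET2 B
        /* cheap blocking predicate (an assumed rule): the equi-join on first
           letter lets PROC SQL avoid scoring every cross pair, so the
           expensive COMPGED/SPEDIS calls only run on same-letter candidates */
        on substr(A.NAME, 1, 1) = substr(B.NAME, 1, 1)
           and COMPGED(A.NAME, B.NAME) le 400
           and SPEDIS(A.NAME, B.NAME) le 10
   order by A.NAME;
quit;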

TimescaleDB: Understanding the return values after creating hypertable and the creation of chunks after populating the hypertable

I have an existing table in my database named price (264 rows) and I converted it into a hypertable price_hypertable by doing:
CREATE TABLE price_hypertable (LIKE price INCLUDING DEFAULTS INCLUDING CONSTRAINTS EXCLUDING INDEXES);
SELECT create_hypertable('price_hypertable', 'start');
and the output it gave me is as follows:
create_hypertable
-------------------------------
(4,public,price_hypertable,t)
(1 row)
The next thing I did was to populate the price_hypertable as follows:
insert into price_hypertable select * from price;
And I got the following output:
INSERT 0 264
Now, I wanted to check the chunks created, for which I did:
select public.show_chunks('price_hypertable');
and the output I got:
show_chunks
----------------------------------------
_timescaledb_internal._hyper_4_3_chunk
_timescaledb_internal._hyper_4_4_chunk
(2 rows)
When I do:
select * from _timescaledb_internal._hyper_4_3_chunk;
select * from _timescaledb_internal._hyper_4_4_chunk ;
I see that the 264 entries are split as follows:
_timescaledb_internal._hyper_4_3_chunk has 98 rows
_timescaledb_internal._hyper_4_4_chunk has 166 rows
I have a few questions about these steps and their outputs:
Can someone please explain what the values 4 and t represent in the output of SELECT create_hypertable('price_hypertable', 'start');?
After populating price_hypertable, the data was automatically split into chunks, but of different sizes. Why does this happen? Why wasn't the data just split in half (132 rows in each chunk instead of 98 and 166)?
Any help is appreciated. Thanks
For the first question, it is easier to see what they represent by executing create_hypertable as
SELECT * FROM create_hypertable('price_hypertable', 'start');
This gives something like:
hypertable_id | schema_name | table_name | created
---------------+-------------+--------------------+---------
4 | public | price_hypertable | t
For the second question, TmTron already answered. The rows are sorted into buckets based on time, and your rows are not necessarily evenly distributed over time. There is no automation that picks the correct interval for each bucket.
You can find information about the return values in the API documentation on create_hypertable, which also discusses the chunk_time_interval parameter that can be used to set the chunk size.
Related to your 2nd question:
When you don't specify chunk_time_interval explicitly, the default is 7 days: see create-hypertable, Best Practices.
So the number of rows in each chunk depends on the distribution of your data (according to your start date-time column).
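For illustration, the interval can be set explicitly when creating the hypertable, or changed later for future chunks; a minimal sketch using the documented API (the one-day interval is an arbitrary example value):
-- set the chunk interval at creation time
SELECT create_hypertable('price_hypertable', 'start',
                         chunk_time_interval => INTERVAL '1 day');
-- or change it for chunks created from now on
SELECT set_chunk_time_interval('price_hypertable', INTERVAL '1 day');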

How to convert rows into column while inserting into temp table?

I want to convert rows into columns while inserting into a temp table. I read somewhere about crosstab, but I'm not exactly sure how to use it for my purpose.
So suppose I first run a query like:
select cat_id, category_name from category_tab;
cat_id category_name
111 books
112 pen
113 copy
Now I want a temp table which should be like
s_no books pen copy
1
2
3
s_no will be serial number and rows of previous query becomes the column in new temp table.
How can I achieve this in Postgres?
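No answer is recorded here, but for reference: crosstab comes from the tablefunc extension. A minimal sketch, assuming a hypothetical sales_tab(s_no, category_name, qty) to supply the cell values, since the question does not say what should fill the pivoted columns:
CREATE EXTENSION IF NOT EXISTS tablefunc;

CREATE TEMP TABLE category_pivot AS
SELECT *
FROM crosstab(
    -- source rows: row identifier, category, value
    $$ SELECT s_no, category_name, qty FROM sales_tab ORDER BY 1 $$,
    -- the category list, in a fixed order, one output column each
    $$ SELECT category_name FROM category_tab ORDER BY cat_id $$
) AS ct(s_no int, books int, pen int, copy int);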

How to duplicate partition content?

I'm trying to set up a testing environment for performance testing. Currently we have a table with 8 million records and we want to duplicate these records across 30 days.
In other words:
- Table 1
--Partition1(8 million records)
--Partition2(0 records)
.
.
--Partition30(0 records)
Now I want to take the 8 million records in Partition1 and duplicate them across the rest of the partitions. The only difference between the copies is a DATE column, which should advance by one day in each copy.
Partition1(DATE)
Partition2(DATE+1)
Partition3(DATE+2)
And so on.
The last restrictions: the original table has 2 indexes that must be preserved in the copies, and the Oracle DB is 10g.
How can I duplicate this content?
Thanks!
It seems to me this is as simple as running the most efficient insert possible. If you cross-join the existing data to a list of integers, 1 .. 29, you can generate the new dates you need.
with list_of_numbers as (
     select rownum day_add
     from   dual
     connect by level <= 29)
insert /*+ append */ into ...
select date_col + day_add, ...
from   ...,
       list_of_numbers;
You might want to set NOLOGGING on the table, since this is test data.
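Filled out with hypothetical names (table1 with columns date_col and payload stand in for the real table and column list), the same idea looks like this; because the table is partitioned by date, each shifted copy lands in the matching daily partition automatically:
-- hypothetical names: table1(date_col, payload), partitioned by date_col
insert /*+ append */ into table1 (date_col, payload)
select t.date_col + n.day_add, t.payload
from   table1 t
       cross join (select level as day_add
                   from   dual
                   connect by level <= 29) n;
commit;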

Export data from db2 with column names

I want to export data from DB2 tables to CSV format. I also need the first row to contain the column names. I have had little success using the following command:
EXPORT TO "TEST.csv"
OF DEL
MODIFIED BY NOCHARDEL coldel: ,
SELECT col1,'COL1',x'0A',col2,'COL2',x'0A'
FROM TEST_TABLE;
But with this I get data like:
Row1 Value:COL1:
Row1 Value:COL2:
Row2 Value:COL1:
Row2 Value:COL2:
etc.
I also tried the following query
EXPORT TO "TEST.csv"
OF DEL
MODIFIED BY NOCHARDEL
SELECT 'COL1',col1,'COL2',col2
FROM ADMIN_EXPORT;
But this repeats the column names alongside the data in every row when opened with Excel.
Is there a way I can get the data in the format below
COL1 COL2
value value
value value
when opened in Excel?
Thanks
After days of searching I solved this problem this way:
EXPORT TO ...
SELECT 1 as id, 'COL1', 'COL2', 'COL3' FROM sysibm.sysdummy1
UNION ALL
(SELECT 2 as id, COL1, COL2, COL3 FROM myTable)
ORDER BY id
You can't select a constant string in DB2 from nothing, so you have to select from sysibm.sysdummy1.
To get the manually added column names into the first row, you have to add a pseudo-id and sort the UNION result by that id. Otherwise the header row can end up at the bottom of the resulting file.
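Spelled out as a complete command (the file path and the myTable columns are placeholders; note that non-character columns must be cast, e.g. with CHAR(), so the UNION with the literal header row type-checks):
EXPORT TO /tmp/test.csv OF DEL MODIFIED BY NOCHARDEL
SELECT col1, col2
FROM (
    SELECT 1 AS id, 'COL1' AS col1, 'COL2' AS col2
    FROM sysibm.sysdummy1
    UNION ALL
    SELECT 2, CHAR(col1), CHAR(col2)
    FROM myTable
) AS t
ORDER BY id;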
Quite an old question, but I recently encountered a similar one and realized this can be achieved much more easily in the 11.5 release with the EXTERNAL TABLE feature; see the answer here:
https://stackoverflow.com/a/57584730/11946299
Example:
$ db2 "create external table '/home/db2v115/staff.csv'
using (delimiter ',' includeheader on) as select * from staff"
DB20000I The SQL command completed successfully.
$ head /home/db2v115/staff.csv | column -t -s ','
ID NAME DEPT JOB YEARS SALARY COMM
10 Sanders 20 Mgr 7 98357.50
20 Pernal 20 Sales 8 78171.25 612.45
30 Marenghi 38 Mgr 5 77506.75
40 O'Brien 38 Sales 6 78006.00 846.55
50 Hanes 15 Mgr 10 80659.80
60 Quigley 38 Sales 66808.30 650.25
70 Rothman 15 Sales 7 76502.83 1152.00
80 James 20 Clerk 43504.60 128.20
90 Koonitz 42 Sales 6 38001.75 1386.70
Insert the column names as the first row in your table.
Use order by to make sure that the row with the column names comes out first.