Dump subset of records in an OpenEdge database table in the ".d" file format - progress-4gl

I am looking for the easiest way to manually dump a subset of records in an OpenEdge database table in the Progress ".d" file format.
The best way I can imagine is creating an extra test database with the same schema as the source database, and then copying the subset of records over to the test database using FOR EACH and BUFFER-COPY statements. Then I would just export the data from the test database using the Dump Data and Definitions > Table Contents (.d file)... menu option.

That seems like a lot of trouble. If you can identify the subset of records in order to do the BUFFER-COPY, then you should also be able to do this:
OUTPUT TO VALUE( "table.d" ).
FOR EACH table NO-LOCK WHERE someCondition:
EXPORT table.
END.
OUTPUT CLOSE.
That is essentially what the dictionary "dump data" .d file is, less a few lines of administrivia at the bottom, which can be safely omitted for most purposes.

Related

PostgreSQL: Import columns into table, matching key/ID

I have a PostgreSQL database. I had to extend an existing, big table with a few more columns.
Now I need to fill those columns. I thought I could create a .csv file (out of Excel/Calc) which contains the IDs / primary keys of existing rows, plus the data for the new, empty fields. Is it possible to do so? If it is, how?
I remember doing exactly this pretty easily using Microsoft SQL Server Management Studio, but for PostgreSQL I am using PG Admin (I am of course willing to switch tools if that would be helpful). I tried the import function of PG Admin, which uses PostgreSQL's COPY command, but COPY doesn't seem suitable because it can only create whole new rows.
Edit: I guess I could write a script which loads the csv and iterates over the rows, using UPDATE. But I don't want to reinvent the wheel.
Edit2: I've found this question here on SO which provides an answer by using a temp table. I guess I will use it - although it's more of a workaround than an actual solution.
PostgreSQL can import data directly from CSV files with COPY statements. As you stated, however, this only works for new rows.
Instead of creating a CSV file you could just generate the necessary SQL UPDATE statements.
Suppose this is the CSV file:
PK;ExtraCol1;ExtraCol2
1;"foo";42
4;"bar";21
Then just produce the following:
UPDATE my_table SET ExtraCol1 = 'foo', ExtraCol2 = 42 WHERE PK = 1;
UPDATE my_table SET ExtraCol1 = 'bar', ExtraCol2 = 21 WHERE PK = 4;
You seem to work under Windows, so I don't really know how to accomplish this there (probably with PowerShell), but under Unix you could generate the SQL from a CSV easily with tools like awk or sed. An editor with regular expression support would probably suffice too.
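As a platform-independent alternative to awk or sed, the generation step could be sketched in Python. This is only a minimal sketch: it assumes every field is separated by a semicolon, that the first column is the key, and that the table and column names are trusted. The helper names `quote_literal` and `csv_to_updates` are made up for this illustration.

```python
import csv
import io
import re


def quote_literal(value: str) -> str:
    """Render a CSV field as a SQL literal.

    Plain integers are emitted as-is; everything else becomes a quoted
    string with embedded single quotes doubled.
    """
    if re.fullmatch(r"-?[0-9]+", value):
        return value
    return "'" + value.replace("'", "''") + "'"


def csv_to_updates(csv_text: str, table: str, delimiter: str = ";") -> list:
    """Build one UPDATE statement per CSV row, keyed on the first column."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    key, *columns = reader.fieldnames
    statements = []
    for row in reader:
        assignments = ", ".join(
            f"{col} = {quote_literal(row[col])}" for col in columns
        )
        statements.append(
            f"UPDATE {table} SET {assignments} "
            f"WHERE {key} = {quote_literal(row[key])};"
        )
    return statements


# Sample data in the same shape as the CSV example in this answer.
sample = 'PK;ExtraCol1;ExtraCol2\n1;"foo";42\n4;"bar";21\n'
for statement in csv_to_updates(sample, "my_table"):
    print(statement)
```

The printed statements can be saved to a file and run with psql. Identifiers are not quoted here, so this only suits column names that need no quoting.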

Remove header (column names) from query result

I am using a Java-based program and I am writing a simple select query inside that program to retrieve data from a PostgreSQL database. The data comes back with a header row, which breaks the rest of my code.
How do I get rid of all column headings in an SQL query? I just want to
print out the raw data without any headings.
I am using Building Controls Virtual Test Bed (BCVTB) to connect my database to EnergyPlus. BCVTB has a database actor in which you can write a query, receive the data, and send it on to another simulation program. I decided to use PostgreSQL. However, when I write Select * From mydb, it returns the data with the column names (header). I just want the raw data without the header. What should I do?
PostgreSQL does not send column headings as part of the data the way a CSV file does. The protocol (as used via JDBC) sends the rows. The driver does request a description of the rows that includes the column names, but that description is not part of the result set rows, unlike the "header first" convention for CSV.
Whatever is happening must be a consequence of the BCVTB tools you are using, and I suggest pursuing it on that side of things.

Redshift - Adding a column, do we have to change our previous CSVs to include it?

I currently have a redshift table in our database that has 10 columns, and I want to add another. It's trivial to do an alter table to do this.
My question - When I do this, will all my old CSV files fail to insert into redshift (via COPY from S3) given they won't have this new column?
I was hoping the columns would just be NULL vs. it failing on import, but I haven't seen any documentation on this.
Ideally I wish I could specify the actual column name in the header row of the CSV, but I haven't seen if that is possible anywhere.
The FILLRECORD option of the COPY command does that: it 'Allows data files to be loaded when contiguous columns are missing at the end of some of the records'.

Export tables to Flat File with some logic

I'm writing scripts to export some tables to flat files every day. I'm looking at the BCP utility, but I'm not sure it has the kind of features I really need.
For example, I need to output the fields out of order. That is, the 15th field in the MSSQL database should be the 2nd field in the flat file, etc.
More importantly, some of the fields need to be altered. For example, if a certain field is null or contains some special values, I need to replace them with codes.
Is BCP the right tool for this? My gut tells me to do this in Perl instead.
You can write a stored procedure and do all data transformations there.
Then feed this stored procedure to bcp.
It will surely be faster than Perl.
SSIS is fast too; could be an option in case transformations are very complex.
You can use a query to order and format the columns directly with bcp. From the bcp Utility documentation:
"query" is a Transact-SQL query that returns a result set.
Example:
bcp "SELECT Name FROM AdventureWorks.Sales.Currency" queryout Currency.Name.dat -T -c
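If the transformation ends up in a script instead, as the question considers, the reordering and code substitution are small. Below is a rough Python sketch of that idea, not a bcp feature. The function name `to_flat_line`, the `|` separator, and the codes `N/A` and `UNK` are all hypothetical choices for this illustration.

```python
def to_flat_line(row, order, null_code="N/A", special_codes=None, sep="|"):
    """Format one database row as a flat-file line.

    row           -- sequence of column values as fetched from the database
    order         -- source column indexes in the order the file needs them
    null_code     -- text written in place of a NULL (None) value
    special_codes -- mapping from special values to their replacement codes
    """
    special_codes = special_codes or {}
    fields = []
    for index in order:
        value = row[index]
        if value is None:
            fields.append(null_code)
        else:
            # Substitute a code for special values, otherwise keep the value.
            fields.append(special_codes.get(value, str(value)))
    return sep.join(fields)


# Hypothetical row: the 4th column moves to the front, NULL becomes a code.
example_row = ("USD", None, "unknown", 3)
line = to_flat_line(example_row, order=[3, 0, 2, 1],
                    special_codes={"unknown": "UNK"})
print(line)  # 3|USD|UNK|N/A
```

The same logic can be expressed in the SELECT passed to bcp queryout, which keeps the work inside the database.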

Mongodb import and deciphering changed rows

I have a large CSV file which contains over 30 million rows. I need to load this file on a daily basis and identify which of the rows have changed. Unfortunately there is no unique key field, but four of the fields can be combined to make a row unique. Once I have identified the changed rows, I want to export the data. I have tried a traditional SQL Server solution, but the performance is so slow that it's not going to work. Therefore I have been looking at MongoDB, which managed to import the file in about 20 minutes (which is fine). I don't have any experience with MongoDB and, more importantly, I don't know its best practices. So, my idea is the following:
As a one off - Import data into a collection using mongoimport.
Copy all of the unique IDs generated by mongo and put them in a separate collection.
Import new data into the existing collection using upsert fields which should create a new id for each new and changed row.
Compare the 'copy' to the new collection to list out all the changed rows.
Export changed data.
This to me will work but I am hoping there is a much better way to tackle this problem.
Use unix sort and diff.
Sort the file on disk
sort -o new_file.csv -t ',' big_file.csv
sort -o old_file.csv -t ',' yesterday.csv
diff new_file.csv old_file.csv
Commands may need some tweaking.
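A plain diff reports changed lines but does not say which composite key they belong to. The comparison keyed on the four identifying fields could be sketched in Python as below. This is an illustration only: the function names and the sample data are made up, the key is assumed to be the first four columns, and the sketch holds one digest per row in memory, which may or may not be practical at 30 million rows.

```python
import csv
import hashlib
import io


def row_digests(csv_text, key_indexes):
    """Map each row's composite key to a digest of the whole row."""
    digests = {}
    for row in csv.reader(io.StringIO(csv_text)):
        key = tuple(row[i] for i in key_indexes)
        joined = "\x1f".join(row).encode("utf-8")
        digests[key] = hashlib.sha1(joined).hexdigest()
    return digests


def changed_keys(old_text, new_text, key_indexes):
    """Return (added, removed, modified) composite keys between two files."""
    old = row_digests(old_text, key_indexes)
    new = row_digests(new_text, key_indexes)
    added = sorted(k for k in new if k not in old)
    removed = sorted(k for k in old if k not in new)
    modified = sorted(k for k in new if k in old and new[k] != old[k])
    return added, removed, modified


# Made-up sample: the first four columns form the key, the fifth is data.
yesterday = "a,b,c,d,1\ne,f,g,h,2\ni,j,k,l,3\n"
today = "a,b,c,d,1\ne,f,g,h,9\nm,n,o,p,4\n"
added, removed, modified = changed_keys(yesterday, today, [0, 1, 2, 3])
print(added)     # [('m', 'n', 'o', 'p')]
print(removed)   # [('i', 'j', 'k', 'l')]
print(modified)  # [('e', 'f', 'g', 'h')]
```

For real files, the text arguments would be replaced by open file handles so rows are streamed rather than loaded as one string.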
You can also use MySQL to import the file via
http://dev.mysql.com/doc/refman/5.1/en/load-data.html (LOAD DATA INFILE)
and then create a KEY (or primary key) on the 4 fields.
Then load yesterday's file into a different table and use two SQL statements to compare the files...
But, diff will work best!
-daniel