In OpenEdge, how do you transfer parts of the data in the database in an easy way? - progress-4gl

I have a lot of data in 2 different databases and in many different tables I would like to move from one computer into a few others. The others has the same definition of the db:s. Note, not all the data should be transfered, only some that I define. Some tables fully, and some others just partly.
How would I move these data in the easiest way? To dump each table and load separately in many .d files - is not an easy way. Could you do something similar to the Incremental .df File that contains all that has to be changed?

Dumping (and loading) entire tables is easy. You can do it from the GUI or by command line. Look at for instance this KnowledgeBase entry about command line dump & load and this about creating scripts for dumping the entire database.
Parts of the data is another story. This is very individual and depends on your database and your application. It's hard for a generic tool to compare data and tell if a difference in data depends on changed data, added data or deleted data. Different databases has different kinds of layout, keys and indices.
There are however several built in commands that could help you:
For instance:
IMPORT and EXPORT for importing and exporting data to files, streams etc.
Basic import and export
OUTPUT TO c:\temp\foo.data.
FOR EACH foo NO-LOCK:
EXPORT foo.
END.
OUTPUT CLOSE.
INPUT FROM c:\temp\foo.data.
REPEAT:
CREATE foo.
IMPORT foo.
END.
INPUT CLOSE.
BUFFER-COPY and BUFFER-COMPARE for copying and comparing data between tables (and possibly even databases).
You could also use the built in commands for doing "dump" and then manually edit the created files.
Calling Progress Built in commands
You can call the back end that dumps data from Data Administration. That will require you to extract those .p-files from it's archives and calling them manually. This will also require you to change PROPATHS etc so it's not straightforward. You could also look into modifying the extracted files to your needs. Remember that this might break when upgrading Progress so store away your changes in separate files.
Look at this Progress KB entry:
Progress KB 15884
Best way for you depends on if this is a one time or reacurring task, size and layout of database etc.

Related

Compare tables to ensure non regression in postgresql

Here is my issue: I often need to compare the same postgresql tables (or views that depend on it) between some ETL code refactoring to check for non regressions in my developments.
Let's say I have an ETL code I want to refactor, which regularly uploads data in a table. Currently, once my modifs are done, I often download my data from postgresql as a .csv file as a first step, then empty it, fill it again using my refactored code, and download the data again. Then, I compare the .csv files using for instance Python in a Jupyter Notebook.
That does not seem like the way to go at all. That notably supposes I am the only one to use that table during the operation, and so many other things I can't list them all here.
Is there a better way to go ?
It sounds to me like you have the correct approach. There's no magic to the CSV export operation: whatever tool you use runs a query and formats its resultset into the file. Any other before-and-after comparison operation would have to run the same query.
If you're doing this sort of regression test on an active database, it's probably wise to put some sort of distinctive tag on your test records, maybe prepend ETLTEST- to your customer names, so it's ETLTEST-John Bull. Then you can make your queries handle only your test records. And make sure you do something reliable for ORDER BY.
Juptyer seems a complex way to diff your csv files. Most operating systems have lightweight fast difftools.

Does parameters variation not update the builtin database?

I notice that whenever I run a ParametersVariation model, the built-in database does not update... I have PLE, so there is no way for me to write my own database. I am currently able to pull data from various logs present in the database, but only from a normal simulation run. Is there a way to have the parameters variation write its data to the database after each simulation run?
I am currently running this code in After simulation run
Database myFile = new Database(this, "A DB from Excel", "C:/Users/Downloads/DataExport.xlsx");
ModelDatabase modelDB = getEngine().getModelDatabase();
modelDB.exportToExternalDB("flowchart_stats_time_in_state_log", myFile.getConnection(), "Sheet", false, true);
The export works perfectly. But the data never changes and this is confirmed by exporting a distribution from a histogram that changes with every simulation run. But for this export, its the same data as was written to the database from the last standard (non-parametersvariation) simulation run.
Model log database tables aren't produced for multi-run experiments. It's not specifically stated anywhere, but they're designed more for testing/debugging (single runs of) models.
(Also, notice that the log tables don't have columns specifying a run ID or similar, so there's no way that you would have been able to distinguish rows for different runs anyway, even if there were rows written in multi-run experiments.)
Unfortunately, because they are one of the only ways to 'automatically' produce certain forms of output data (like the contents of datasets or histograms) many people try to use them for that (even though they have a pretty un-useful 'internal' format). In general you should write to your own internal database tables for any persistent outputs, where you can also govern whether you store outputs for multiple runs or not (which will require you to calculate some form of unique run IDs and use those in columns to differentiate outputs per run, plus have logic or UI elements to determine when the table data is cleared for a new run and when it isn't).
NB: Note that the kinds of data the model log tables (like flowchart_stats_time_in_state_log which you mention) create can in virtually all cases be determined and created 'manually' via your own model code. That table in particular has a large amount of detail on what's happened in each block and, in any given case, it's probably only a fraction of that data (or a simplification/aggregation of it) that you really want/need.

How to load data into Core Data?

thanks for you help.
I'm attempting to add core data to my project and I'm stuck at where and how to add the actual data into the persistent store (I'm assuming this is the place for the raw data).
I will have 1000 < objects so I don't want to use a plist approach. From my searches, there seems to be xml and csv approaches. Is there a way I can use SQL for input?
The data will not be changed by the user and the data file will be typed in by hand, so I won't need to update these files during runtime, and at this point I am not limited in any type of file - the lightest on syntax is preferred.
Thanks again for any help.
You could load your data from an xml/csv/json file and create the DB on the first lunch of your application (if the DB is not there, then read the data and create it).
A better/faster approach might be to ship your sqllite DB within your application. You can parse the file in any format you want on the simulator, create a DB with all your entities, then take it from the ApplicationData and just add it to your app as a resource.
Although I'm sure there are lighter file types that could be used, I would include a JSON file into the app bundle from which you import the initial dataset.
Update: some folks are recommending XML. NSXMLParser is almost as fast as JSONKit (but much faster than most other parsers), but the XML syntax is heavier than JSON. So an XML bundled file that holds the initial dataset would weight more than if it was in JSON.
Considering Apple considers the format of its persistent stores implementation details, shipping a prefabricated SQLite database is not a very good idea. I.e. the names of fields and tables may change between iOS versions/phones/whatever hidden variable you can think of. You should, in general, not concern yourself with how this serialization of your data is formatted.
There's a brief article about importing data on Apple's developer site: Efficiently Importing Data
You should ship initial data in whatever format you're comfortable with (XML allows you to do incremental parsing efficiently, which reduces memory footprint) and write an import routine to run if you need to import data.
Edit: With EliBud's comment in mind, I still consider the approach a bit "iffy"... The format of the SQLite database used by Core Data is not something you'd want to generate by yourself (it's weird, simply put, and still not something you should really rely on).
So you'd want to use a mock app running on the Simulator and use Core Data to create the database (as per EliBud's answer). But you'd still have to import the data into that mock-app! And while it might make sense to do this once on a "real" computer instead of a lot of times on a mobile device (i.e. copying a file is easy, importing data is hard), you're essentially using the Simulator as an administration tool.
But hey, if it works...

SQLite3: Batch Insert?

I've got some old code on a project I'm taking over.
One of my first tasks is to reduce the final size of the app binary.
Since the contents include a lot of text files (around 10.000 of them), my first thought was to create a database containing them all.
I'm not really used to SQLite and Core Data, so I've got basically two questions:
1 - Is my assumption correct? Should my SQLite file have a smaller size than all of the text files together?
2 - Is there any way of automating the task of getting them all into my newly created database (maybe using some kind of GUI or script), one file per record inside a single table?
I'm still experimenting with CoreData, but I've done a lot of searching already and could not find anything relevant to bringing everything together inside the database file. Doing that manually has proven no easy task already!
Thanks.
An alternative to using SQLite might be to use a zipfile instead. This is easy to create, and will surely safe space (and definitely reduce the number of files). There are several implementations of using zipfiles on the iphone, e.g. ziparchive or TWZipArchive.
1 - It probably won't be any smaller, but you can compress the files before storing them in the database. Or without the database for that matter.
2 - Sure. It's shouldn't be too hard to write a script to do that.
If you're looking for a SQLite bulk insert command to write your script for 2), there isn't one AFAIK. Prepared insert statments in a loop inside a transaction is the best you can do, I imagine it would take only a few seconds (if that) to insert 10,000 records.

coredata vs file access

I have 100s of file which needs to be accessed for displaying the content on iphone. They are all plists.
Which one is faster core data or file access ? which one is secured ?
You have to consider the file size first, a nice rule of thumb found in these boards is, if the file is under 100kB you can store it as an attribute in an entity as a BLOB, if it is greater that that you maybe want to create a ad-hoc entity for it, and in the end if it exceeds 1 MB in size you can access it through the filesystem.
Secondly, you shall evaluate the cost of the operation too, 100 files may appear many but if you access them few times, maybe file access is the way to go, on the other hand if you need that stored information multiple times frequently but you can even create ad hoc entities for Core Data and load the files at start up. And so on.
This is a nice book on Core Data. You can find many guide lines by reading it, but keep in mind also the general guide lines of designing databases.
If they are static files I would recommend pre-loading them into a Core Data SQLite file. That would yield far better performance, especially if you structure your model properly.