Transferring models between two PCs via a PostgreSQL database - postgresql

I have two PCs that need to share TensorFlow models (HDF5 format) in a federated learning manner via a PostgreSQL database.
The models will be trained locally on both machines and then uploaded to the database along with their training history. The transfer will be repeated over multiple cycles on a fixed schedule.
I searched online for ways to transfer files through a PostgreSQL database, but every solution I found deals with tabular data (e.g. CSV data), not arbitrary file formats like HDF5.
Can anyone help me, even with just a roadmap toward a solution?
Pointers to tutorials or examples of similar scenarios would also be appreciated.
Thanks for your help in advance!
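
For reference, PostgreSQL's bytea type can hold arbitrary binary files such as HDF5, so the model file can be stored verbatim rather than as tabular data. Below is a minimal sketch of one upload/download cycle, assuming psycopg2; the table, column names, node names, and connection string are illustrative assumptions, not prescribed.

    # Minimal sketch: store/retrieve an .h5 model file as bytea.
    # All identifiers below are illustrative assumptions.
    import json
    import psycopg2

    conn = psycopg2.connect("dbname=federated user=trainer")  # hypothetical DSN

    with conn, conn.cursor() as cur:
        cur.execute("""
            CREATE TABLE IF NOT EXISTS models (
                id        serial  PRIMARY KEY,
                node_name text    NOT NULL,
                cycle     integer NOT NULL,
                history   jsonb,           -- e.g. the Keras History.history dict
                weights   bytea   NOT NULL -- raw bytes of the .h5 file
            )
        """)

        # Upload this node's model after a local training cycle.
        history = {"loss": [0.9, 0.5], "accuracy": [0.6, 0.8]}  # placeholder
        with open("local_model.h5", "rb") as f:
            cur.execute(
                "INSERT INTO models (node_name, cycle, history, weights) "
                "VALUES (%s, %s, %s, %s)",
                ("pc-1", 0, json.dumps(history), psycopg2.Binary(f.read())),
            )

        # Download the other node's latest model and write it back to disk.
        cur.execute(
            "SELECT weights FROM models WHERE node_name = %s "
            "ORDER BY cycle DESC LIMIT 1",
            ("pc-2",),
        )
        row = cur.fetchone()
        if row:
            with open("remote_model.h5", "wb") as f:
                f.write(bytes(row[0]))

Each PC can then run the upload step after local training and the download step at the start of the next cycle. Very large models may warrant PostgreSQL's large-object facility instead, but bytea keeps the sketch simple.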

Related

Transferring data from a DMS

As mentioned in the title, I want to transfer data, such as PDF files, from a document management system (DMS) to a directory on my server.
However, I'm unsure how to approach the problem.
I thought about using the Talend ETL suite, but I don't think it offers components that address my problem.
Alternatively, I was wondering whether FileZilla could help.
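
For what it's worth, if the DMS exposes its documents over FTP (the same assumption FileZilla would rely on), the transfer can be scripted with Python's standard library alone. A minimal sketch; host, credentials, and paths are illustrative:

    # Minimal sketch: pull PDFs from an FTP-accessible DMS into a local directory.
    import os
    from ftplib import FTP

    ftp = FTP("dms.example.com")      # hypothetical host
    ftp.login("user", "password")     # hypothetical credentials
    ftp.cwd("/documents")
    os.makedirs("/srv/dms-export", exist_ok=True)
    for name in ftp.nlst():
        if name.lower().endswith(".pdf"):
            with open(os.path.join("/srv/dms-export", name), "wb") as f:
                ftp.retrbinary("RETR " + name, f.write)
    ftp.quit()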

Suggestions for allowing ad-hoc queries against PostgreSQL via a simple web service

I have a PostgreSQL 9.3 two-node cluster with a warm-standby (read-only) slave. There are around 30 individual databases with a few hundred tables in total and 1.3 TB of raw data. I'd really like the internets to have full access to these tables and to let folks write arbitrary queries against them. The main reason is my ignorance and incompetence with setting up useful things like REST services, etc.
So I suppose one approach would be to simply allow PostgreSQL TCP connections to the warm-standby host as a user with very limited SELECT permissions, and perhaps that is what I should do?
Another approach would be some simple JSON(P)-emitting service that takes a database name and a query string, then returns the results?
And I suspect you'll have a better approach, so that's why I am here :)
In general, I am not worried if the internets overrun this host with load and DoS it. I just don't want it to become a security liability or leave some way to delete data on the warm-standby host. The machine would be there for use, and if there are naughty users, too bad for the others, I guess. If it gets popular, I could set up more read-only hosts anyway...
Thanks in advance for your thoughts, and to those who say I just need to grit my teeth and figure out how to properly provide web services for the data: my main languages are PHP and Python, so if you have ideas for tools in those languages...
There is a site, SQL Fiddle, that allows simple querying of different databases. Its code is open source and available on GitHub here.
You can try to adapt the code to your needs.
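
If you do try the simple-service route from the question, here is a minimal sketch in Python, assuming Flask and psycopg2 and a pre-created read-only role; the role name, DSN, and limits are illustrative, and this is not a hardened design:

    # Minimal sketch: JSON-emitting ad-hoc query endpoint over a read-only role.
    # Assumes a role created roughly like:
    #   CREATE ROLE readonly LOGIN PASSWORD '...';
    #   GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly;
    import json
    import psycopg2
    from flask import Flask, Response, request

    app = Flask(__name__)

    @app.route("/query")
    def query():
        sql = request.args.get("q", "")
        if not sql:
            return Response('{"error": "missing q parameter"}', status=400,
                            mimetype="application/json")
        conn = psycopg2.connect("dbname=mydb user=readonly")  # hypothetical DSN
        conn.set_session(readonly=True)  # refuse writes at the session level too
        try:
            with conn.cursor() as cur:
                cur.execute("SET statement_timeout = '5s'")  # cap runaway queries
                cur.execute(sql)
                cols = [d[0] for d in cur.description]
                rows = cur.fetchmany(1000)  # cap result size
            body = json.dumps([dict(zip(cols, r)) for r in rows], default=str)
            return Response(body, mimetype="application/json")
        finally:
            conn.close()

    if __name__ == "__main__":
        app.run()

Combined with pointing the service at the warm-standby host only, the read-only session and limited SELECT grants cover the "no deletes" requirement, though the load and DoS exposure remain as you noted.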

Heroku database backup storage when database is not active

I'm considering using Heroku as a platform for a project I'm working on. This project will have many independent Postgres databases. Each database will spin up when someone is using it, then save its data to a dump file and spin down when no one is logged on (if all of these databases were always active, it would be colossally expensive).
Unfortunately I have no experience with Heroku, and their documentation has an annoying marketing slant to it, so I can't tell whether this is possible. How do I pay for the storage of backups? Is it possible to store backups without an associated running database?
My alternative is to build this on Amazon, but I'd rather not do all this engineering myself.
Many thanks in advance.
The Postgres schemas approach might fit your multi-tenancy use case well.
This Blog Post and RailsCast might help you further.
Spinning up multiple databases sounds like fighting the platform's defaults; what concerns are driving that design?
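
For concreteness, a minimal sketch of the schema-per-tenant approach with psycopg2; tenant and table names are illustrative:

    # Minimal sketch: one database, one schema per tenant.
    import psycopg2
    from psycopg2 import sql

    conn = psycopg2.connect("dbname=app")  # hypothetical DSN
    with conn, conn.cursor() as cur:
        for tenant in ("acme", "globex"):  # illustrative tenant names
            cur.execute(sql.SQL("CREATE SCHEMA IF NOT EXISTS {}")
                        .format(sql.Identifier(tenant)))
            cur.execute(sql.SQL(
                "CREATE TABLE IF NOT EXISTS {}.documents "
                "(id serial PRIMARY KEY, body text)"
            ).format(sql.Identifier(tenant)))
        # At request time, scope unqualified table names to one tenant:
        cur.execute(sql.SQL("SET search_path TO {}").format(sql.Identifier("acme")))
        cur.execute("SELECT count(*) FROM documents")  # resolves to acme.documents
        print(cur.fetchone()[0])

Since all tenants share one running database, there is nothing to spin up or down, which sidesteps the backup-storage question for idle tenants.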

Is there any GIS extension for Apache Cassandra?

I want to use Cassandra for my web application because it will manage a lot of information. The problem is that it will also handle a lot of geographical data, so I need a GIS (http://en.wikipedia.org/wiki/Geographic_information_system) extension for Cassandra to capture, store, manipulate, analyze, manage, and present all types of geographical data.
Something like PostGIS for PostgreSQL. Does one already exist? Or something similar? Any suggestions?
Thanks for your help in advance :)
Well, one of our clients at PlayOrm (a client layer on top of Cassandra with its own command-line client) is heavily into GIS, so we are going to be adding features to store GIS data, though I think some already exist. I'm meeting with someone next week regarding this, so in the meantime you may want to check out PlayOrm.
Data will be read from Cassandra and displayed on one of the largest monitors I have seen, backed by some huge machine(s) with tons of graphics cards... a pretty cool setup.
Currently, PlayOrm does joins and Scalable-SQL, but it is very likely we will add spatial queries to the mix if we continue working with GIS data.
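
In the meantime, a common do-it-yourself workaround in Cassandra, absent a PostGIS equivalent, is to key rows by geohash so that nearby points cluster under a shared key prefix. A minimal sketch of the standard geohash encoding in pure Python (no Cassandra specifics assumed):

    # Minimal sketch: standard base-32 geohash encoder. Storing the hash
    # in the row key lets prefix lookups approximate "nearby" queries.
    BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

    def geohash(lat, lon, precision=9):
        lat_lo, lat_hi = -90.0, 90.0
        lon_lo, lon_hi = -180.0, 180.0
        bits, bit_count, even, out = 0, 0, True, []
        while len(out) < precision:
            if even:  # even bit positions refine longitude
                mid = (lon_lo + lon_hi) / 2
                if lon >= mid:
                    bits, lon_lo = (bits << 1) | 1, mid
                else:
                    bits, lon_hi = bits << 1, mid
            else:     # odd bit positions refine latitude
                mid = (lat_lo + lat_hi) / 2
                if lat >= mid:
                    bits, lat_lo = (bits << 1) | 1, mid
                else:
                    bits, lat_hi = bits << 1, mid
            even = not even
            bit_count += 1
            if bit_count == 5:  # 5 bits per base-32 character
                out.append(BASE32[bits])
                bits, bit_count = 0, 0
        return "".join(out)

    print(geohash(48.8584, 2.2945))  # nearby points share a prefix

This is an approximation rather than a full GIS; range and polygon queries still need application-side filtering.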

Automatically generating test data for a DB from a schema?

I have a discussion DB, and I need a large amount of test data for different sized samples. Please see the ready-made SELECT, JOIN, and CREATE queries; scroll down in the link.
How can I automatically generate test data to the db?
How to generate test data in different sized samples?
Is there some ready tool?
Here are a couple of suggestions for free tools that generate test data:
Databene Benerator: supports many JDBC-capable database brands, uses XML format compatible with DbUnit, GPL license.
Super Smack: originally a load-test tool for MySQL, it also supports PostgreSQL and it includes a generator of mock data.
A current version of Super Smack appears to be available here.
I asked a similar question here on StackOverflow in February, and the two choices above seemed like the best options.
I know this question is super dated, but I was looking for the answer to this exact question today and I came across this:
http://wiki.postgresql.org/wiki/Sample_Databases
Out of the options listed (including built-in tools like pgbench), pgFoundry has several compelling options that work perfectly for the test cases I am working on.
I thought it might help someone like me, so there it is.
I'm not sure how to generate the data automatically and insert it into the database (I'm sure you could pull it off with a Python script or something), but if you're just looking for endless blabbering to stick into a DB, this should be helpful.
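
For what it's worth, the Python-script route is short. Here is a minimal sketch using psycopg2, with illustrative table and column names; vary sample_size for different sized samples:

    # Minimal sketch: fill a discussion table with random filler rows.
    import random
    import string
    import psycopg2

    def lorem(words=30):
        # Endless blabbering: random lowercase pseudo-words.
        return " ".join(
            "".join(random.choices(string.ascii_lowercase, k=random.randint(3, 10)))
            for _ in range(words)
        )

    conn = psycopg2.connect("dbname=discussion")  # hypothetical DSN
    with conn, conn.cursor() as cur:
        cur.execute("CREATE TABLE IF NOT EXISTS posts "
                    "(id serial PRIMARY KEY, author text, body text)")
        sample_size = 10_000  # adjust per sample
        rows = [("user%d" % random.randint(1, 500), lorem())
                for _ in range(sample_size)]
        cur.executemany("INSERT INTO posts (author, body) VALUES (%s, %s)", rows)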
I'm not a Postgres person, but in many other DBs I've used, a simple mechanism for generating large quantities of test data is a cross join; it is particularly handy when you need row counts to grow multiplicatively.
Here's a nice blog post on it (SQL Server specific though).
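
For a Postgres flavor of the same trick, cross-joining generate_series() with itself multiplies row counts; a sketch via psycopg2 with an illustrative table name:

    # Minimal sketch: 1000 x 1000 cross join = 1,000,000 test rows.
    import psycopg2

    conn = psycopg2.connect("dbname=discussion")  # hypothetical DSN
    with conn, conn.cursor() as cur:
        cur.execute("""
            CREATE TABLE bulk_posts AS
            SELECT 'post ' || a || '-' || b AS body
            FROM generate_series(1, 1000) AS a
            CROSS JOIN generate_series(1, 1000) AS b
        """)
        cur.execute("SELECT count(*) FROM bulk_posts")
        print(cur.fetchone()[0])  # 1000000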