I am wondering if there is a way to import data from an HTTP source from within a PL/pgSQL function.
I am porting an old system that harvests data from a website. Rather than maintaining a separate set of files to manage the downloading of the data, I was hoping to put the import routines directly into stored procedures.
I do know how to import data with COPY, but that requires the data to already be available locally. Is there a way to download the data with PL/pgSQL? Am I out to lunch?
Related: How to import CSV file data into a PostgreSQL table?
Depending on what you're after, the Postgres extension www_fdw might work for you: http://pgxn.org/dist/www_fdw/
If you want to download custom data over HTTP, PostgreSQL's extensive support for different procedural languages might be handy. Here is an example of connecting to the Google Translate service from a Postgres function written in Python:
https://wiki.postgresql.org/wiki/Google_Translate
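As a rough illustration of the procedural-language approach, the Python below downloads a CSV document over HTTP and parses it into rows using only the standard library. The function name `fetch_csv_rows`, the UTF-8 encoding, and the timeout value are illustrative assumptions, not part of any Postgres API. Inside the database, a body like this would presumably live in a function written in an untrusted Python language such as `plpython3u`, assuming that extension is installed.

```python
import csv
import io
import urllib.request


def fetch_csv_rows(url, timeout=30):
    """Download a CSV document from `url` and return its rows as lists of strings.

    Hypothetical helper: assumes the remote document is UTF-8 encoded CSV.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        text = response.read().decode("utf-8")
    # csv.reader handles quoting and embedded commas for us.
    return list(csv.reader(io.StringIO(text)))
```

The returned rows could then be inserted into a table from within the same function, so the download and the import stay in one place.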
Related
I know this may be quite an odd question, but I was wondering if there's a tool that lets me convert an Access DB to PostgreSQL, including the table structure. I've found some third-party tools, such as dBForce, that import the data but not the structure.
Thanks
I'm interested in storing long-term static data outside of the database, ideally in compressed files that are dynamically uncompressed when accessed. I am currently using the existing file_fdw for some purposes, but would really like to be able to compress the data.
We currently use 9.3.
There seems to be such a wrapper here. It requires Multicorn, so you'll have to install that first.
I have not tried it and don't know how well it works.
Alternatively, did you consider using compression at the storage level?
I'm a new Neo4j user, and I have played around with the Neo4j webadmin interface to create small databases and simple Cypher queries. Now I want to use Neo4j to create a graph from my existing database. It's a PostgreSQL database with millions of entries that all share the same structure (Neo4j is well suited to representing this data). My question is: how do I import this data? What is the easiest way to do that? I already saw that Cypher can read CSV files, but do I have to create a CSV file with my data, or is there another way to import it? Thank you for your help. Sam
One option is to export your postgres data to csv and apply LOAD CSV to import them into the graph.
Another way is to write a script in a language of your choice (I'd vote for Groovy here) that connects to Postgres using JDBC, connects to Neo4j, and then applies the business logic to transform between the two.
A third option is to use an ETL tool like Talend. It basically does the same as your custom script but provides a point-and-click interface to define the transformation; see http://neo4j.com/blog/fun-with-music-neo4j-and-talend/ for more details.
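The first option above (CSV export plus LOAD CSV) can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names `rows_to_csv` and `build_load_csv`, the `Person` label, and the column names are hypothetical, and the rows would in practice come from a Postgres query or a `COPY ... TO` export rather than a literal list.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialize a header and an iterable of rows into CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def build_load_csv(file_url, label, columns):
    """Build a Cypher LOAD CSV statement that creates one node per CSV row.

    Hypothetical helper: `label` and `columns` are assumed to be trusted
    identifiers, since they are interpolated directly into the statement.
    """
    props = ", ".join(f"{col}: row.{col}" for col in columns)
    return (
        f"LOAD CSV WITH HEADERS FROM '{file_url}' AS row "
        f"CREATE (:{label} {{{props}}})"
    )


if __name__ == "__main__":
    print(rows_to_csv(["id", "name"], [(1, "Ann"), (2, "Bob")]))
    print(build_load_csv("file:///people.csv", "Person", ["id", "name"]))
```

Note that LOAD CSV reads every field as a string, so numeric columns would need an explicit conversion in the Cypher statement.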
I've got some CSV data to import into Cassandra. This could work with the COPY command. The problem is that the CSV doesn't provide a unique ID for the data, so I need to create a timeuuid on import.
Is it possible to do this via the COPY command, or do I need to write an external script for the import?
I would write a quick script to do it; the COPY command can really only handle small amounts of data anyway. Try the new Python driver. I find it quite fast to set up loading scripts with it, especially if you need any sort of minor modification of the data before it is loaded.
If you have a really big set of data, bulk-loading is still the way to go.
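A minimal sketch of the "quick script" idea, using only the Python standard library rather than the driver: it prepends a version 1 (time-based) UUID, which is the format Cassandra's timeuuid type uses, to every CSV row. The function name `add_timeuuid_column` is hypothetical, and the generated IDs reflect the time the script runs, not any timestamp in the data.

```python
import csv
import io
import uuid


def add_timeuuid_column(src, dst):
    """Copy CSV rows from `src` to `dst`, prepending a version 1 UUID to each.

    Returns the number of rows written. Blank lines are skipped.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    count = 0
    for row in reader:
        if not row:
            continue
        # uuid1() is time-based, matching what a timeuuid column expects.
        writer.writerow([str(uuid.uuid1())] + row)
        count += 1
    return count


if __name__ == "__main__":
    source = io.StringIO("sensor-a,21.5\nsensor-b,19.0\n")
    target = io.StringIO()
    add_timeuuid_column(source, target)
    print(target.getvalue())
```

The resulting file has the ID as its first column, so it could be loaded either with COPY or row by row from a driver-based script.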
Is there a convenient, open-source method to generate a SAS XPORT Transport Format (xpt) file from a postgreSQL database for FDA submission?
I have checked the FDA specifications, available at http://www.fda.gov/downloads/ForIndustry/DataStandards/StudyDataStandards/UCM312964.pdf
These state that 'SAS XPORT transport files can be converted to various other formats using commercially available off the shelf software', but no software packages other than SAS are suggested.
The specifications for an SAS XPORT file are available at http://support.sas.com/techsup/technote/ts140.html
I have checked OpenClinica (which is the EDC software we are using), PGAdmin3 and AM (which can import .xpt files, but I didn't find an export method).
Easy way? Not that I know of. I think one way or another it will take some development work.
My recommendation is to do it as follows:
Write a user-defined function/stored procedure for pulling the data you need for each section.
Write a user-defined function to pull this data from each section and arrange it into an XML file. The XML functions are likely to come in handy for this.
Of course, you could also put the XML conversion in an arbitrary front-end. However, in general, you will find that the design above forces you to push everything into set-based logic, which is likely to be more powerful in your case.
If you don't mind using Python, my XPORT module can write xpt files. https://github.com/selik/xport
If you have trouble using it, write me a note and I'll try to help. https://github.com/selik/xport/issues