AWS RDS PostgreSQL - copying from/to CSV files on EC2 instance

I've run into a problem that I haven't been able to fix for a few days.
I have the following architecture:
Two EC2 instances, which are nodes running the Trifacta application (an application for data scientists),
An AWS RDS PostgreSQL instance.
Since the newest version, Trifacta uses a new database schema and performs some database migrations when the application starts. During startup, some tables are copied into *.csv files and then loaded back into tables from those *.csv files.
This works fine against a local database, because the superuser role in PostgreSQL allows such actions.
Against the AWS RDS PostgreSQL instance, it fails with errors like the following:
Error running query COPY (select "id" from workspaces) TO '/tmp/workspaces.csv' DELIMITER ',' CSV HEADER; error: must be superuser to COPY to or from a file
at Connection.parseE (/opt/trifacta/migration-framework/node_modules/pg/lib/connection.js:614:13)
at Connection.parseMessage (/opt/trifacta/migration-framework/node_modules/pg/lib/connection.js:413:19)
at Socket.<anonymous> (/opt/trifacta/migration-framework/node_modules/pg/lib/connection.js:129:22)
at Socket.emit (events.js:315:20)
at addChunk (_stream_readable.js:295:12)
at readableAddChunk (_stream_readable.js:271:9)
at Socket.Readable.push (_stream_readable.js:212:10)
at TCP.onStreamRead (internal/stream_base_commons.js:186:23) {
length: 178,
severity: 'ERROR',
code: '42501',
detail: undefined,
hint: "Anyone can COPY to stdout or from stdin. psql's \\copy command also works for anyone.",
position: undefined,
internalPosition: undefined,
internalQuery: undefined,
where: undefined,
schema: undefined,
table: undefined,
column: undefined,
dataType: undefined,
constraint: undefined,
file: 'copy.c',
line: '905',
routine: 'DoCopy'
}
That's just the first one; there are many more. I researched this and figured out why it happens: AWS provides the rds_superuser role instead of superuser, and the privileges of this role aren't sufficient for copying from/to the local filesystem.
From the psql console this can be done by using \copy instead of COPY, but that doesn't help in my case, because Trifacta executes SQL queries from its *.js files, and as far as I know \copy cannot be run from anywhere other than the psql CLI.
At the suggestion of @IMSoP, I am adding the code from the Trifacta *.js file where these actions are performed:
ConnectUtils.copyQuery = function(query, connection, options = {}) {
ensure.notNull(connection.base.DriverName, 'connection driver name');
ensure.notNull(options.tableName, 'table name');
const table = options.tableName;
const filePath = ConnectUtils.getOutputFilePath(table, options);
if (connection.base.DriverName === DATABASE_JS_TYPE[MYSQL]) {
return `${query} INTO OUTFILE \'${filePath}\' FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\n'`;
} else if (connection.base.DriverName === DATABASE_JS_TYPE[POSTGRES]) {
return `COPY (${query}) TO '${filePath}' DELIMITER ',' CSV HEADER;`;
} else if (connection.base.DriverName === DATABASE_JS_TYPE[SQLITE]) {
return query;
}
return;
};
ConnectUtils.loadQuery = function(connection, options = {}) {
ensure.notNull(connection.base.DriverName, 'connection driver name');
ensure.notNull(connection.base.Database, 'connection database');
ensure.notNull(options.tableName, 'table name');
const table = options.tableName;
const filePath = ConnectUtils.getOutputFilePath(table, options);
if (connection.base.DriverName === DATABASE_JS_TYPE[MYSQL]) {
return `LOAD DATA INFILE \'${filePath}\' INTO TABLE ${
connection.base.Database
}.${table} FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\n' IGNORE 1 ROWS;`;
} else if (connection.base.DriverName === DATABASE_JS_TYPE[POSTGRES]) {
return `COPY ${table} FROM '${filePath}' DELIMITER ',' CSV HEADER;`;
}
return;
};
${filePath} is a path on the EC2 instance, and ${table} refers to the tables on the AWS RDS instance. From the answers given before I edited my question, I assume there is no way to work around this, because the script tries to reach ${filePath} as a path on the AWS RDS instance. Right?
Thanks for reading.
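The hint in the error message points at one possible direction: COPY ... TO STDOUT and COPY ... FROM STDIN need no superuser rights, because the data travels over the client connection and the file is read or written on the client (here, the EC2 instance). The sketch below only shows how the two statements built by copyQuery and loadQuery could be rewritten. toClientSideCopy and toClientSideLoad are hypothetical helpers, not part of Trifacta, and actually streaming the data to or from filePath would still need a client-side streaming library (the pg-copy-streams package for node-postgres is one candidate, which is an assumption about the setup) and a change in how the migration script executes its queries.

```javascript
// Hypothetical helpers: rewrite server-side COPY statements (which need
// superuser rights) into their STDOUT/STDIN forms (which any role may run).

// Matches: COPY <source> TO '<path>' <options>
const COPY_TO_FILE = /^COPY\s+([\s\S]+?)\s+TO\s+'([^']+)'\s*([\s\S]*)$/i;
// Matches: COPY <table> FROM '<path>' <options>
const COPY_FROM_FILE = /^COPY\s+(\S+)\s+FROM\s+'([^']+)'\s*([\s\S]*)$/i;

// Returns the rewritten SQL plus the path the client must write to itself.
function toClientSideCopy(sql) {
  const match = COPY_TO_FILE.exec(sql.trim());
  if (!match) return null;
  const [, source, filePath, options] = match;
  return { sql: `COPY ${source} TO STDOUT ${options}`.trim(), filePath };
}

// Returns the rewritten SQL plus the path the client must read from itself.
function toClientSideLoad(sql) {
  const match = COPY_FROM_FILE.exec(sql.trim());
  if (!match) return null;
  const [, table, filePath, options] = match;
  return { sql: `COPY ${table} FROM STDIN ${options}`.trim(), filePath };
}

const exported = toClientSideCopy(
  `COPY (select "id" from workspaces) TO '/tmp/workspaces.csv' DELIMITER ',' CSV HEADER;`
);
console.log(exported.sql);
```

With this approach the CSV files would live on the EC2 instance, which matches where ${filePath} already points.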

Related

I want to insert with mikro-orm, but it doesn't find my table (TableNotFoundException)

So
Console:
yarn dev
yarn run v1.22.10
$ nodemon dist/index.js
[nodemon] 2.0.7
[nodemon] to restart at any time, enter `rs`
[nodemon] watching path(s): *.*
[nodemon] watching extensions: js,mjs,json
[nodemon] starting `node dist/index.js`
[discovery] ORM entity discovery started, using ReflectMetadataProvider
[discovery] - processing entity Post
[discovery] - entity discovery finished, found 1 entities, took 21 ms
[info] MikroORM successfully connected to database postgres on postgresql://postgres:*****@127.0.0.1:5432
[query] begin
[query] insert into "post" ("created_at", "title", "updated_at") values ('2021-04-05T21:04:23.126Z', 'my first post', '2021-04-05T21:04:23.126Z') returning "_id" [took 12 ms]
[query] rollback
TableNotFoundException: insert into "post" ("created_at", "title", "updated_at") values ('2021-04-05T21:04:23.126Z', 'my first post', '2021-04-05T21:04:23.126Z') returning "_id" - relation "post" does not exist
at PostgreSqlExceptionConverter.convertException (P:\.Projektek\lireddit-server\node_modules\@mikro-orm\postgresql\PostgreSqlExceptionConverter.js:36:24)
at PostgreSqlDriver.convertException (P:\.Projektek\lireddit-server\node_modules\@mikro-orm\core\drivers\DatabaseDriver.js:194:54)
at P:\.Projektek\lireddit-server\node_modules\@mikro-orm\core\drivers\DatabaseDriver.js:198:24
at processTicksAndRejections (internal/process/task_queues.js:93:5)
at async PostgreSqlDriver.nativeInsert (P:\.Projektek\lireddit-server\node_modules\@mikro-orm\knex\AbstractSqlDriver.js:150:21)
at async ChangeSetPersister.persistNewEntity (P:\.Projektek\lireddit-server\node_modules\@mikro-orm\core\unit-of-work\ChangeSetPersister.js:55:21)
at async ChangeSetPersister.executeInserts (P:\.Projektek\lireddit-server\node_modules\@mikro-orm\core\unit-of-work\ChangeSetPersister.js:24:13)
at async UnitOfWork.commitCreateChangeSets (P:\.Projektek\lireddit-server\node_modules\@mikro-orm\core\unit-of-work\UnitOfWork.js:496:9)
at async UnitOfWork.persistToDatabase (P:\.Projektek\lireddit-server\node_modules\@mikro-orm\core\unit-of-work\UnitOfWork.js:458:13)
at async PostgreSqlConnection.transactional (P:\.Projektek\lireddit-server\node_modules\@mikro-orm\knex\AbstractSqlConnection.js:53:25)
at async UnitOfWork.commit (P:\.Projektek\lireddit-server\node_modules\@mikro-orm\core\unit-of-work\UnitOfWork.js:183:17)
at async SqlEntityManager.flush (P:\.Projektek\lireddit-server\node_modules\@mikro-orm\core\EntityManager.js:486:9)
at async SqlEntityManager.persistAndFlush (P:\.Projektek\lireddit-server\node_modules\@mikro-orm\core\EntityManager.js:438:9)
previous error: insert into "post" ("created_at", "title", "updated_at") values ('2021-04-05T21:04:23.126Z', 'my
first post', '2021-04-05T21:04:23.126Z') returning "_id" - relation "post" does not exist
at Parser.parseErrorMessage (P:\.Projektek\lireddit-server\node_modules\pg-protocol\dist\parser.js:278:15)
at Parser.handlePacket (P:\.Projektek\lireddit-server\node_modules\pg-protocol\dist\parser.js:126:29)
at Parser.parse (P:\.Projektek\lireddit-server\node_modules\pg-protocol\dist\parser.js:39:38)
at Socket.<anonymous> (P:\.Projektek\lireddit-server\node_modules\pg-protocol\dist\index.js:10:42)
at Socket.emit (events.js:315:20)
at Socket.EventEmitter.emit (domain.js:467:12)
at addChunk (internal/streams/readable.js:309:12)
at readableAddChunk (internal/streams/readable.js:284:9)
at Socket.Readable.push (internal/streams/readable.js:223:10)
at TCP.onStreamRead (internal/stream_base_commons.js:188:23) {
length: 166,
severity: 'ERROR',
code: '42P01',
detail: undefined,
hint: undefined,
position: '13',
internalPosition: undefined,
internalQuery: undefined,
where: undefined,
schema: undefined,
table: undefined,
column: undefined,
dataType: undefined,
constraint: undefined,
file: 'd:\\pginstaller_13.auto\\postgres.windows-x64\\src\\backend\\parser\\parse_relation.c',
line: '1376',
routine: 'parserOpenTable'
}
Index.ts:
import { MikroORM } from "@mikro-orm/core";
import { __prod__ } from "./constants";
import { Post } from "./entities/Post";
import mikroConfig from "./mikro-orm.config";
const main = async () => {
const orm = await MikroORM.init(mikroConfig);
await orm.getMigrator().up;
const post = orm.em.create(Post, { title: "my first post" });
await orm.em.persistAndFlush(post);
};
main().catch((err) => {
console.error(err);
});
Post.ts:
import { Entity, PrimaryKey, Property } from "@mikro-orm/core";
@Entity()
export class Post {
@PrimaryKey()
_id!: number;
@Property({ type: "date" })
createdAt = new Date();
@Property({ type: "date", onUpdate: () => new Date() })
updatedAt = new Date();
@Property({ type: "text" })
title!: string;
}
mikro-orm.config.ts:
import { __prod__ } from "./constants";
import { Post } from "./entities/Post";
import { MikroORM } from "@mikro-orm/core";
import path from "path";
export default {
migrations: {
path: path.join(__dirname, "./migrations"),
pattern: /^[\w-]+\d+\.[tj]s$/,
},
entities: [Post],
dbName: "postgres",
debug: !__prod__,
type: "postgresql",
password: "hellothere",
} as Parameters<typeof MikroORM.init>[0];
And the migration I created with npx mikro-orm migration:create:
import { Migration } from '@mikro-orm/migrations';
export class Migration20210405205411 extends Migration {
async up(): Promise<void> {
this.addSql('create table "post" ("_id" serial primary key, "created_at" timestamptz(0) not null, "updated_at" timestamptz(0) not null, "title" text not null);');
}
}
After that I'm compiling it to JS, by the way. I guess the problem is somewhere in my code, but I don't know where. I can give more info if needed; I've been trying to fix this bug for 5 hours.
By the way, I'm doing Ben Awad's 14-hour fullstack tutorial, if that matters.
The TableNotFoundException happens when you try to add data before initializing the table's schema (or structure).
Passing the --initial flag, as in Mosh's answer, did not work for me, possibly because I am passing a username and password in ./mikro-orm.config.ts.
I used MikroORM's SchemaGenerator to initialize the table, as described in the official docs.
Add the following lines before adding data to post in your main function in index.ts:
const generator = orm.getSchemaGenerator();
await generator.updateSchema();
The main function in index.ts should now look like this:
const main = async () => {
const orm = await MikroORM.init(mikroConfig);
await orm.getMigrator().up;
const generator = orm.getSchemaGenerator();
await generator.updateSchema();
const post = orm.em.create(Post, { title: "my first post" });
await orm.em.persistAndFlush(post);
};
updateSchema creates a table or updates it based on ./entities/Post.ts. This could cause issues when the Post file is updated, though I haven't run into any while following Ben's tutorial. I'd still recommend creating ./create-schema.ts and running it when needed, as shown in the official docs.
I have had the same issue. This is what I did:
I deleted the migrations folder as well as the dist folder
I ran npx mikro-orm migration:create --initial
After that, I restarted yarn watch and yarn dev and it worked for me.
Notice the --initial flag. I recommend checking the official documentation. The migrations table is used to keep track of already executed migrations. When you only run npx mikro-orm migration:create, that table is not created, so MikroORM is unable to check whether the migration for the Post entity has already been performed (which includes creating the respective table in the database).
Ben does not use the --initial flag in his tutorial; he might have already run it before the tutorial.
I had a similar problem myself (Also doing Ben Awad's tutorial).
I used Mikro-ORM's schema generator to initialize the table like in Fares47's Answer, but the problem still persisted.
It wasn't until I set my user to have Superuser permissions that it started working.
I am using PostgreSQL for my database, which I installed using Homebrew. If you have a similar setup, here is what I did:
Start psql in your terminal using psql postgres. If you want, you can view your users and check their permissions by typing \du in the shell. Then, to change the permissions for a user, use the command ALTER ROLE <username> WITH SUPERUSER;. Make sure you include the semicolon, or else the command will not run.
Check this article out for more info on psql commands.
I had the same problem and solved it by installing ts-node in the project:
npm i -D ts-node
and setting useTsNode to true in package.json.
The problem is that the mikro-orm CLI only adds .ts file paths to configPaths if the useTsNode property is true and ts-node is installed.
Another problem I had was that the regex in the pattern property in mikro-orm.config.ts was wrong because of a typo.
If none of the suggested steps solved it for you, simply:
Quit yarn watch and yarn dev.
Run this command from the command line:
npx mikro-orm migration:up
Now restart watch and dev, and you should be good.
Source: https://mikro-orm.io/docs/migrations/#migration-class
I also experienced this. And, like Fares47 said, it's possibly because I passed the username and password in ./mikro-orm.config.ts.
My solution was simply to execute the SQL command generated in the ./src/migrations/Migration<NUMBERS>.ts file in the PostgreSQL terminal.
Here is the command that I executed in the database:
create table "post" ("id" serial primary key, "created_at" timestamptz(0) not null, "updated_at" timestamptz(0) not null, "title" text not null);
This is just what the docs suggest:
A safe approach would be generating the SQL on development server and
saving it into SQL Migration files that are executed manually on the
production server.

Knex cannot find table in Cloud SQL Postgres from Cloud Functions

I am trying to connect to a Postgres 12 DB running in Cloud SQL from a Cloud Function written in TypeScript.
I create the database connection with the following:
import * as Knex from "knex"
const { username, password, instance } = ... // username, password, connection name (<app-name>:<region>:<database>)
const config = {
client: 'pg',
connection: {
user: username,
password: password,
database: 'ingredients',
host: `/cloudsql/${instance}`
},
// pool is a top-level Knex option, not part of the connection settings
pool: { min: 1, max: 1 }
}
const knex = Knex(config as Knex.Config)
I am then querying the database using:
const query = ... // passed in as param
const result = await knex('tableName').where('name', 'ilike', query).select('*')
When I run this code, I get the following error in the Cloud Functions logs:
Unhandled error { error: select * from "tableName" where "name" ilike $1 - relation "tableName" does not exist
at Parser.parseErrorMessage (/workspace/node_modules/pg-protocol/dist/parser.js:278:15)
at Parser.handlePacket (/workspace/node_modules/pg-protocol/dist/parser.js:126:29)
at Parser.parse (/workspace/node_modules/pg-protocol/dist/parser.js:39:38)
at Socket.stream.on (/workspace/node_modules/pg-protocol/dist/index.js:10:42)
at Socket.emit (events.js:198:13)
at Socket.EventEmitter.emit (domain.js:448:20)
at addChunk (_stream_readable.js:288:12)
at readableAddChunk (_stream_readable.js:269:11)
at Socket.Readable.push (_stream_readable.js:224:10)
at Pipe.onStreamRead [as onread] (internal/stream_base_commons.js:94:17)
I created the table using the following commands in the GCP Cloud Shell (then populated it with data from a CSV):
\connect ingredients;
CREATE TABLE tableName (name VARCHAR(255), otherField VARCHAR(255), ... );
In that console, if I run the query SELECT * FROM tableName;, I see the correct data listed.
Why does Knex not see the table: tableName, but the GCP Cloud Shell does?
BTW, I am definitely connecting to the correct db, as I see the same error logs in the Cloud SQL logging interface.
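It looks like you created the table tableName without quoting the name, so PostgreSQL folded it to lower case (tablename). Knex quotes identifiers, as the "tableName" in the error message shows, so it looks for the case-sensitive name, which does not exist. So when creating the schema, do:
CREATE TABLE "tableName" ("name" VARCHAR(255), "otherField" VARCHAR(255), ... );
or use only lower-case table and column names.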
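To make the folding rule concrete, here is a small sketch that imitates how PostgreSQL resolves an identifier as it is written in SQL text. resolveIdentifier is a hypothetical helper written for illustration, not a Knex or PostgreSQL API.

```javascript
// Mimics PostgreSQL identifier resolution: unquoted names are folded to
// lower case, double-quoted names keep their exact case.
function resolveIdentifier(written) {
  const quoted = /^"((?:[^"]|"")*)"$/.exec(written);
  if (quoted) {
    // Inside quotes, a doubled "" stands for one literal quote character.
    return quoted[1].replace(/""/g, '"');
  }
  return written.toLowerCase();
}

// What CREATE TABLE tableName (...) actually created:
const created = resolveIdentifier('tableName');
// What Knex asks for when it sends: select * from "tableName"
const requested = resolveIdentifier('"tableName"');

console.log(created, requested, created === requested);
```

The unquoted SELECT * FROM tableName typed in the Cloud Shell is folded the same way as the CREATE TABLE, which is why it finds the table while the quoted Knex query does not.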

Mocha tests can't connect to postgres database, using knex

I am trying to implement some integration tests using mocha for functions that interact with a postgres database through knex, in a nodejs express app.
The functions work outside of mocha - I can start the app in node or nodemon, submit requests through Postman, retrieve records from the database, add new records, etc. But when I try to test the code through mocha, I get errors like the following for any functions that try to access the database:
select * from "item" where "user_id" = $1 - relation "item" does not exist
The environment variable for the database connection is set to connect to the right database; when I manually test the app end-to-end, everything works; I get data back from the database.
I've included what I think are the relevant code snippets below: the test script for one of the tests that won't work, the function I'm trying to test, and the modules that that function relies on.
TEST SCRIPT
const Item = require('../db/item');
const chai = require('chai');
const chaiAsPromised = require('chai-as-promised');
// set up the middleware
chai.use(chaiAsPromised);
var should = require('chai').should()
describe('Item.getByUser', function() {
context('With valid id', function() {
const item_id = 1;
const expectedResult = "Canoe";
it('should return items', function() {
return Item
.getByUser(item_id)
.then(items => {
items[0].name.should.equal(expectedResult);
});
});
});
});
SNIPPET FROM THE ITEM.GETBYUSER FUNCTION:
const knex = require('./connection');
module.exports = {
getByUser: function(id) {
return knex('item').where('user_id', id);
},
SNIPPET FROM THE CONNECTION MODULE:
require('dotenv-safe').config();
const environment = process.env.NODE_ENV || 'development';
const config = require('../knexfile')[environment];
module.exports = require('knex')(config);
SNIPPET FROM THE KNEXFILE MODULE:
module.exports = {
development: {
client: 'pg',
connection: process.env.DATABASE_URL
},
production: {
client: 'pg',
connection: process.env.DATABASE_URL
}
};
The error message I get for the above test is:
1) Item.getByUser
With valid id
should return items:
select * from "item" where "user_id" = $1 - relation "item" does not exist
error: relation "item" does not exist
at Connection.parseE (node_modules\pg\lib\connection.js:567:11)
at Connection.parseMessage (node_modules\pg\lib\connection.js:391:17)
at Socket.<anonymous> (node_modules\pg\lib\connection.js:129:22)
at addChunk (_stream_readable.js:284:12)
at readableAddChunk (_stream_readable.js:265:11)
at Socket.Readable.push (_stream_readable.js:220:10)
at TCP.onStreamRead [as onread] (internal/stream_base_commons.js:94:17)
Okay, it looks like this was in fact related to the environment variable for the database, in a way I can't quite figure out. Although I had the database connection set to 'postgres://localhost/mydatabase', the database actually used when I tested the app live, including when I migrated and seeded it through knex commands, was 'postgres://localhost/username', a database with the same name as the owner of 'mydatabase'. I think the mocha tests were connecting to mydatabase, which was still empty at that point, since the migrate and seed had affected the database 'username'.
So I think this can be closed. I'll try to replicate the situation where I was connected to the wrong db; I'm not sure how it happened, as I never intentionally set the connection or any environment variable to 'postgres://localhost/username'.
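A plausible explanation for the stray 'username' database, offered as an assumption rather than a diagnosis: the knexfile shown does not load dotenv itself, so when the knex CLI runs migrations and seeds, DATABASE_URL may be undefined. With no database name given, node-postgres falls back to a database named after the connecting user, which in turn defaults to the OS user. The hypothetical resolveDatabase helper below imitates that fallback so the target can be printed before running migrations or tests.

```javascript
// Hypothetical helper imitating the node-postgres fallback: with no database
// in the connection string, the database name defaults to the user name.
function resolveDatabase(connectionString, osUser) {
  if (!connectionString) {
    // Nothing configured at all: both user and database fall back.
    return { user: osUser, database: osUser };
  }
  const url = new URL(connectionString);
  const user = decodeURIComponent(url.username) || osUser;
  const database = decodeURIComponent(url.pathname.replace(/^\//, '')) || user;
  return { user, database };
}

// 'alice' stands in for the OS user name here.
console.log(resolveDatabase('postgres://localhost/mydatabase', 'alice'));
console.log(resolveDatabase(undefined, 'alice'));
```

Logging resolveDatabase(process.env.DATABASE_URL, require('os').userInfo().username) from both the knexfile and the test setup would show whether the two processes target the same database.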

Difficulty connecting to PostgreSQL@localhost using node-postgres (Error: 28000)

I'm currently running openSUSE on a Raspberry Pi 3B+ (aarch64) and have hit a wall running a Node.js connection script.
I went through the standard install of PostgreSQL (v10 is what is offered on this version of openSUSE), then created a new role with
CREATE ROLE new_role LOGIN PASSWORD 'passwd';
and then a db with
CREATE DATABASE new_db OWNER new_role;
Both \l and \du return the expected output, showing that the role and the db were created successfully with the correct owner.
So I then quickly created a node project directory and copied the test script from the docs: https://node-postgres.com/features/connecting
const { Pool, Client } = require('pg')
const connectionString = 'postgresql://new_role:passwd@localhost:5432/new_db'
const pool = new Pool({
connectionString: connectionString,
})
pool.query('SELECT NOW()', (err, res) => {
console.log(err, res)
pool.end()
})
const client = new Client({
connectionString: connectionString,
})
client.connect()
client.query('SELECT NOW()', (err, res) => {
console.log(err, res)
client.end()
})
This returns a few unhandled promise errors that I haven't caught (with .catch()) correctly yet, and an error code of 28000 that looks like this:
{ error: Ident authentication failed for user "new_role"
at Connection.parseE (/home/eru/postgresDB/node_modules/pg/lib/connection.js:554:11)
at Connection.parseMessage (/home/eru/postgresDB/node_modules/pg/lib/connection.js:379:19)
at Socket.<anonymous> (/home/eru/postgresDB/node_modules/pg/lib/connection.js:119:22)
at Socket.emit (events.js:182:13)
at addChunk (_stream_readable.js:283:12)
at readableAddChunk (_stream_readable.js:264:11)
at Socket.Readable.push (_stream_readable.js:219:10)
at TCP.onStreamRead [as onread] (internal/stream_base_commons.js:94:17)
name: 'error',
length: 99,
severity: 'FATAL',
code: '28000',
detail: undefined,
hint: undefined,
position: undefined,
internalPosition: undefined,
internalQuery: undefined,
where: undefined,
schema: undefined,
table: undefined,
column: undefined,
dataType: undefined,
constraint: undefined,
file: 'auth.c',
line: '328',
routine: 'auth_failed' } undefined
So I'm pretty sure the attempt made it to the intended port; otherwise I wouldn't have received the detailed error in the terminal. The error code 28000 means invalid_authorization_specification.
Is there something I need to do on the server, in the psql interface, that will fulfill the authorization specification?
When I looked into that specific error, I couldn't find search results relevant to my situation.
I'm fairly new to Postgres, so I'm sure this is a pretty noob mistake that I'm missing, but any help or input is very appreciated!
Found an answer after a little more digging in this question: "error: Ident authentication failed for user".
I ended up changing the authentication method in my pg_hba.conf from ident to md5.
This is rather crude, because I don't really understand what I changed, aside from telling PostgreSQL to check the md5-encrypted password instead of checking whether my username matched the roles created on the server.
If anyone has a proper explanation of what changed and why, I'm all ears.
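As a partial explanation: each line in pg_hba.conf is a rule made of a connection type, a database, a user, a client address (for host lines) and an authentication method, and the first rule that matches an incoming connection decides how it is authenticated. With ident, the server asks for the operating-system user name of the client and expects it to correspond to the database role, so a script run by an OS user other than new_role is rejected before the password is ever considered. With md5, the server performs a password challenge instead, which is what the connection string supplies. The sketch below is a deliberately simplified reader (it ignores address matching, quoted values and include directives), and the sample file contents are illustrative, not taken from the system in the question.

```javascript
// Simplified pg_hba.conf reader: shows which auth method the first matching
// rule selects. Address matching is intentionally left out of this sketch.
function parseHbaLine(line) {
  const text = line.replace(/#.*$/, '').trim();
  if (!text) return null;
  const fields = text.split(/\s+/);
  const type = fields[0];
  if (type === 'local') {
    // local lines have no address column
    const [, database, user, method] = fields;
    return { type, database, user, address: null, method };
  }
  const [, database, user, address, method] = fields;
  return { type, database, user, address, method };
}

function firstMatchingMethod(confText, type, database, user) {
  const rules = confText.split('\n').map(parseHbaLine).filter(Boolean);
  const matches = (field, value) =>
    field === 'all' || field.split(',').includes(value);
  const rule = rules.find(
    (r) => r.type === type && matches(r.database, database) && matches(r.user, user)
  );
  return rule ? rule.method : null;
}

// Illustrative file contents before and after the edit described above.
const before = `
# TYPE  DATABASE  USER  ADDRESS        METHOD
local   all       all                  peer
host    all       all   127.0.0.1/32   ident
`;
const after = before.replace('ident', 'md5');

console.log(firstMatchingMethod(before, 'host', 'new_db', 'new_role'));
console.log(firstMatchingMethod(after, 'host', 'new_db', 'new_role'));
```

A connection to localhost:5432 is a TCP connection, so it is the host rule, not the local (Unix socket) rule, that applies to the Node.js script.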

Granting privilege on a Postgres table

I am absolutely new to the PostgreSQL database. Using phpPgAdmin, I was able to create a database, a user and a table. I am trying to insert a row into the table from my PHP file with the following code:
$db = pg_connect( "$host $port $dbname $credentials" );
if($db){
$psql = "INSERT INTO LOGINS (mid, name,ip,date) VALUES ($usid,'$naam','$ipad','$dte')";
$ret = pg_query($db, $psql);
$tot = pg_affected_rows($ret);
}
I am getting the error:
Warning: pg_query(): Query failed: ERROR: permission denied for relation..
I understand that some privileges are to be declared, but where and how?
Use GRANT to give privileges to users in Postgres
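As a sketch of what that GRANT could look like for the table in the question: the role name app_user is a placeholder for whichever user the PHP script connects as, and the statement has to be run by the table owner or a superuser (for example from phpPgAdmin's SQL window). The helper below is hypothetical and only assembles the SQL text.

```javascript
// Hypothetical helper that assembles a GRANT statement as SQL text.
// Identifiers are double-quoted, so pass them in the case PostgreSQL stored
// them (a table created as unquoted LOGINS is stored as logins).
const ALLOWED = new Set(['SELECT', 'INSERT', 'UPDATE', 'DELETE']);

function quoteIdent(name) {
  return `"${name.replace(/"/g, '""')}"`;
}

function grantStatement(privileges, table, role) {
  const list = privileges.map((p) => p.toUpperCase());
  const unknown = list.filter((p) => !ALLOWED.has(p));
  if (list.length === 0 || unknown.length > 0) {
    throw new Error(
      `unsupported privileges: ${unknown.join(', ') || '(none given)'}`
    );
  }
  return `GRANT ${list.join(', ')} ON TABLE ${quoteIdent(table)} TO ${quoteIdent(role)};`;
}

// app_user is a placeholder role name.
console.log(grantStatement(['insert', 'select'], 'logins', 'app_user'));
```

If the table has a serial column, the role may additionally need USAGE on the underlying sequence before inserts succeed.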