PostgreSQL: How do I set up a local server / client environment for initial database experimentation?

I've recently decided to embark on a fun / educational personal project to create some data visualizations and power metrics for my fantasy football league. Since ESPN doesn't provide an API, I've decided to use a combination of elbow grease and the nfldb to pull relevant data (and am hoping to get familiar with Plotly for presenting the data). In setting up nfldb, I'm also getting my first exposure to databases, using postgresql in particular (as required by nfldb).
Since the installation guide provided by nfldb is Linux-centric and assumes a fair bit of previous database experience, I've looked to this guide for help and blindly followed its instructions in hopes of sidestepping postgresql (aka the "just make it work" "solution"). Of course, that didn't work, and I have no idea how to diagnose the problem(s), so I've decided to go ahead and use this opportunity to get a little familiar with databases / postgresql.
I've looked to the postgresql documentation for guidance. Having never worked in a server / client environment, the following text (from "18.1. The PostgreSQL User Account") has me particularly confused:
As with any server daemon that is accessible to the outside world, it is advisable
to run PostgreSQL under a separate user account. This user account should only own
the data that is managed by the server, and should not be shared with other
daemons. (For example, using the user nobody is a bad idea.) It is not advisable
to install executables owned by this user because compromised systems could then
modify their own binaries.
To add a Unix user account to your system, look for a command useradd or adduser.
The user name postgres is often used, and is assumed throughout this book, but you
can use another name if you like.
I'd really appreciate a well-annotated version of these paragraphs. How does it apply to someone like me, storing and accessing data on the same machine? Do I need to create a new system user account? How do I make sure it "only owns the data that is managed by the server"? Where is the responsible location to install postgresql? Am I exposed to some sort of security risk by downloading the nfldb database? Why is the user nobody a bad idea?
Relevant: I am using a Mac (v10.11.6) and plan to install (or re-install, if necessary) postgresql using Homebrew.

Related

Get ID for Postgres Server Installation

We want to create a license system where multiple users connected to a Postgres server share the same key.
This means the license key must somehow identify the server it was created for, so that the key won't work on a second server and its clients.
So I'm searching for an ID that is as unique as possible. So far without much luck.
SELECT version();
Returns a string like the following:
PostgreSQL 9.3.3 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-54), 64-bit
Which is probably a little different for each Postgres installation, but that's not good enough (especially since we want to bundle Postgres with our application).
Is there any information I can access that is a bit more varied, like maybe installation time etc.?

You can, as a_horse_with_no_name notes, use the database system identifier. However:
The sysid does not change when you clone a database via pg_basebackup, SAN snapshots, etc. The Timeline ID may increment, but doesn't always, depending on method used for copying.
The sysid does change if you dump and reload a database to a new instance, even though the database contents are the same.
Many PostgreSQL read/write master instances may have the same sysid if they're all cloned from a common pre-configured postgres in some container template or similar.
It's trivial to regenerate the sysid, or set it to whatever you want it to be.
The sysid is not preserved by pg_upgrade.
So personally, I do not recommend using the sysid. If I had to do disaster recovery and restore a dump into a newly initdb'd database and your software locked me out, I'd be finding a new vendor and a lawyer.
It doesn't help that it's not currently accessible via SQL, only by using pg_controldata.
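For anyone who wants to inspect the sysid anyway, here is a minimal Python sketch that runs pg_controldata and pulls out the "Database system identifier" line. The function names and the data directory argument are hypothetical, and the sketch assumes pg_controldata is on the PATH and prints its labels in English.

```python
import re
import subprocess


def parse_system_identifier(controldata_output: str) -> int:
    """Extract the 'Database system identifier' value from pg_controldata output."""
    match = re.search(
        r"^Database system identifier:\s*(\d+)\s*$",
        controldata_output,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("no 'Database system identifier' line found")
    return int(match.group(1))


def read_system_identifier(data_directory: str) -> int:
    """Run pg_controldata on a data directory (hypothetical path) and parse the result."""
    result = subprocess.run(
        ["pg_controldata", data_directory],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_system_identifier(result.stdout)
```

All the caveats listed above still apply: the value identifies a cluster's lineage, not a particular installation.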
Definitely do not use the version string. It'll be the same for any set of deployments from the same set of packages, distributed binaries, or whatever.
There isn't really anything like what you want in PostgreSQL, and I'm not sure there can be because of the things that people routinely do. Cloning DBs, snapshots and restores, copying a DB for a QA/staging instance, etc, etc.
My personal advice: don't do this. I've had to break systems like this in business emergencies when the license system threatened business continuity. (It's rarely hard.) I strongly advocate against tools with active license enforcement when I'm involved in purchasing. Dongles are even worse. Note that even Microsoft uses "soft" license enforcement, where it helps you track compliance and warns you, but doesn't try to break the system if it thinks you're not in compliance. This is what the legal system is for.
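To illustrate what "soft" enforcement could look like, here is a rough Python sketch. The seat-count model, the `LicenseCheck` type and the `check_license_soft` function are all assumptions made up for the example, not anything from PostgreSQL or a real licensing library. The point is only that the check reports and warns, and never refuses service.

```python
from dataclasses import dataclass


@dataclass
class LicenseCheck:
    compliant: bool
    message: str


def check_license_soft(licensed_seats: int, active_users: int) -> LicenseCheck:
    """Report compliance status without ever blocking the application."""
    if active_users <= licensed_seats:
        return LicenseCheck(
            True, f"{active_users} of {licensed_seats} licensed seats in use"
        )
    over = active_users - licensed_seats
    # Out of compliance: return a warning for logs or an admin screen,
    # but leave it to the caller to keep running normally.
    return LicenseCheck(
        False,
        f"warning: {over} user(s) over the {licensed_seats}-seat license",
    )
```

A caller would log or display `message` and carry on, so a restore onto new hardware or a cloned staging instance never locks anyone out.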
Oh, if you really feel you must do this, Microsoft Windows systems have a SID. But your customers will hate you if they have to reinstall the OS.
Similarly, there's the hardware TPM. But they'll hate you even more if they restore an OS backup image to new hardware and the DB refuses to start.

How to provide my application as a SAAS

I currently have a few apps that I provide to clients by setting them up on their own website. The app uses their own SQL database to record any transactions.
Recently, the number of customers I supply the app to has increased, leading to a higher maintenance work load as each installation must be managed separately.
I'm ready to move to the next level and want to host the app in a single cloud-based environment so that I only have to maintain one instance. I would then provide access to that app to each client site, for example by embedding it in an iframe or perhaps delivering it via a sub-domain. I am not sure where the DB would sit.
However, this is new territory for me and I'm not sure where to begin. The app is very small and quite simple. I've read a lot about SAAS, but most of it seems quite enterprise-level; I'm really looking for a simple, easy-to-use starting point.
What's the current best practice for this kind of setup and what might be a good guide to read or platform to use?

PostgreSQL - Do I need the command-line tool?

I decided to give PostgreSQL a try. It looks really interesting, but it isn't very user friendly at all.
I got some great help from the PostgreSQL e-mail list, but they insist that the tool to use is the command-line client (psql). Unfortunately, it's a total disaster. When I open it, it opens at least two instances, which soon multiply into a dozen. It also seems to somehow hijack my Apple terminal on my Mac. I can type the same command into two different terminal windows and get two different results. I don't have a clue what's going on.
Anyway, to get to the point, PostgreSQL is obviously over my head. There's a local PostgreSQL users group that meets once a month - at night, when I'm working. But I'd like to try and make the very beginning of their next meeting and drop them a note. I'd like to hire someone to help me get PostgreSQL set up on my laptop and online, fix whatever the problem is and show me how to create a database and table.
Actually, I've already created a database and table, which I can access with pgAdmin III. But I can't see them with the shell/terminal. So that's my question: If I can hire someone to get PostgreSQL up and running, will I be able to work with it using pgAdmin III or some other tool, or am I going to be chained to the shell (psql)?
If the shell/terminal is indispensable, then I think I'm going to abandon it. It looks like a great program, but I just don't have time to jump through all the hoops right now.

You don't need psql to use PostgreSQL. Many experienced users prefer it, but you can use nothing but PgAdmin and get by just fine. That's what a great many users do, much like many MySQL users use phpMyAdmin and never touch the mysql command-line tool.
PostgreSQL is fine on Mac OS X. A number of core developers use Mac OS X and do their development on it.
There were some packaging issues on OS X related to Apple's bundling of PostgreSQL but those are resolved in more recent versions of OS X.
There are also some challenges with different packages of PostgreSQL on OS X - EnterpriseDB, MacPorts, Homebrew, etc. But those are mostly a matter of documentation and user misunderstanding; each package is in itself just fine. Similar issues exist on Linux, where OS packages, PGDG packages, and EDB's packages can tread on each other's toes.
Characterising the PostgreSQL community as a "Microsoft/Linux fan club" is hilarious, by the way. Windows is tolerated at best by most of the core devs and users on the mailing list.
It's really hard to tell what problem you're encountering based on the description given. Maybe you have multiple different PostgreSQL packages installed, so you have more than one server instance, and are getting them confused? Similarly, I can't tell what's going on with the psql terminal link in the dock. I'd ignore it and use psql from the usual Apple Terminal.app if you want to use psql. Otherwise just use PgAdmin.
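One quick way to test the "multiple packages installed" theory is to list every psql on your PATH. The following is a small Python sketch of that check; `find_executables` is a hypothetical helper written for this answer, and it assumes a Unix-like system such as OS X.

```python
import os
from typing import List, Optional


def find_executables(name: str, path: Optional[str] = None) -> List[str]:
    """Return every executable called `name` on the search path, in PATH order."""
    search_path = os.environ.get("PATH", "") if path is None else path
    found = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            found.append(candidate)
    return found
```

Calling `find_executables("psql")` and getting back more than one path (say, one from Homebrew and one from another installer) would suggest that different terminals or tools are picking up different installations.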
One area where you will run into trouble is that because most experienced users use psql, if you ask questions specific to PgAdmin or other tools, rather than PostgreSQL itself, people will pretty much shrug their shoulders and say "dunno, but you do it like this in psql". I haven't used PgAdmin for my own stuff in years, and have to go hunting around in the manual if I want to figure out how to do something so I can explain it to somebody. More so with things like Navicat, which I've never used at all. The people who use those tools are usually not the ones spending their time helping other people out, so you get help from experienced and enthusiastic users who're also the ones most likely to use the expert-oriented tools.
Relevant link: http://phili.pe/posts/postgresql-on-the-command-line/

I've been using DBeaver to write and execute my Postgres queries because I like neither psql nor PgAdmin. Not that DBeaver is without its faults, but at least it has decent code completion and an easy way to switch databases / schemas. In the end it's also about what you're used to, and I guess coming from SQL Server with Management Studio I found this an easier way into Postgres.
It's true that in most Postgres books psql is the standard but I usually try to convert the psql specific code to queries I can run in DBeaver, which is usually an interesting (although somewhat frustrating at times) exercise...

Oracle SQL Developer: Tables?

I just installed my Oracle database and made some tables for school. My problem is that it looks different: the one we use in school had no tables (clean and empty), but the one I installed is filled with tables that I do not understand (and did not make).
It is a new install I got from the official website, and it has tables with names like AQ$_INTERNET_AGENT_PRIVS, AQ$QUEUES and many more. I have no idea where they come from, and every time I look at my tables I just get confused because of all these things mixed in.
Is it safe to remove them or are they important enough to keep? If removing them is a bad idea, what do I have to do so that I don't see them anymore and all that is listed are the tables that I have created myself?

Those are system tables; you're probably logging in to an account with DBA privileges. Create yourself a new schema (user_id/password) and don't give yourself DBA privileges, then you can remain as ignorant as you want about what Oracle is doing under the hood.

How to use isolated development database in shared Visual Studio solution

I'm leading a small software development team (4 people), and have just broken ground on a source-controlled SQL Server 2008 database project, with isolated development databases for each developer. I'm still implementing this one step at a time, but I'm envisioning each developer having their own database, with a naming scheme something like <ProjectName>_DEVELOPMENT_<TFSUserName>. This was all recommended per the MSDN articles I've been reading, but someone let me know if that sounds way off.
Anyway, we have a shared application solution that we've been developing for some time. In the past, we had no database version control, and just modified our database directly from SQL Server Management Studio when new reference data needed to be populated, or when we were testing functionality -- one change immediately affected everyone else. So with this new change, I'm wondering what the best way would be to have each person connect to their isolated development databases from the application solution. Prior to isolated databases, our connection to the database was specified in our application's web.config as a connection string. If we're each going to have our own database, the only way I can see it working is for each developer to set their connection string in their local solution to point to their personal database. But changing the web.config will check out that file in the solution, so developers will always have to specifically uncheck that file when checking in application changes to the baseline. Is there a less clunky way for each developer to use their isolated database when doing application testing?

I recommend that you not make the database names username-specific. Instead make the database the same name for each developer and always reference it via localhost (localhost\<ProjectName>_DEVELOPMENT). Then the same connection string will work for every developer.
MSDN's suggestion to use username-specific databases is better for a shared development environment. It's definitely not ideal for a localized environment.
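As a language-neutral illustration of the idea (sketched in Python rather than in web.config), the connection string depends only on the project name, never on who is running the code. The helper name and the `Integrated Security=True` setting are assumptions for the example; substitute whatever authentication your team uses.

```python
def dev_connection_string(project_name: str) -> str:
    """Build one connection string that is identical on every developer machine."""
    # Same database name for everyone; only the server behind "localhost" differs.
    database = f"{project_name}_DEVELOPMENT"
    return f"Server=localhost;Database={database};Integrated Security=True;"
```

Because the result contains no user name, the value can be checked in once and nobody needs to edit or exclude the config file on check-in.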
