PostgreSQL compiled from source versus Ubuntu package - postgresql

What are the advantages/disadvantages of using PostgreSQL compiled from source compared to the Ubuntu PostgreSQL package? Which of the two is recommended for a live production environment?
Thanks in advance.

I'd recommend the distribution-provided package over a self-compiled one. You'll get automatic security updates from your distribution, sane file locations, and the ability to verify or restore files using your package manager. You'll also be able to roll back a failed update quickly using the old package, etc.
If you compile it yourself, you'll have to check very often whether a new version with security updates is available. You'll forget which options you used for compilation, and if you mismatch them an update can make your data unreadable. You won't be available (on vacation) when a new security update is published, and your update will be late. Or you'll forget to update at all. You're lazy, and you'll end up working more.

I know this is an old question, but there is one very real advantage to building from source. If you find a bug, submit it to the pg mailing lists, and get a patch, you can apply it within a few hours, easily. I know this because I've done it twice in the last two years in production.

The only advantage of compiling it yourself is that you can optimize the build with different features and/or compilation modes. On the other hand, this means your build is much less tested than the one distributed via the package system, so doing it yourself might end up not being so great. Plus, updating/redoing it, as Tometzky mentioned, ends up being a lot more work. Unless you really need to build it yourself, don't.
(this is not specific to postgresql, but everything in a production environment)

Related

The Science of Installation

I have minimal exposure to RPM, Windows Installer mechanics, and WiX. That said, I'm interested in making a cross-platform installer tool (Linux, Windows) that supports upgrading and downgrading (versions and patches) of my own product. I don't believe this is a topic to be approached lightly; I would like to learn the science of the art (or the art of the science). If I succeed, and build a minimally successful installer tool, it would have these features:
does not depend on a platform-specific tool (such as Windows Installer).
reads XML or a declarative syntax to fulfill installation requirements.
attempts to minimize steps to upgrade or downgrade one of my products (rather than requiring a complete uninstall and re-install).
does not require knowledge of interim product versions, in order to jump versions (i.e. can upgrade one of my products from version 1 to version 3, without passing through version 2).
I'm convinced that "the key" to achieving this goal is by seeing versions as a "point A to point B" problem, which implies that A and B are described by two XML "version" documents that hold info about all the parts and actions (files, or platform specifics such as registry entries). My installer tool would "join" or compare the two documents and determine a minimal set of changes to transform A into B. To some extent, I believe this is precisely what Windows Installer does.
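To make that concrete, here is a minimal sketch of the idea, assuming each version document boils down to a flat list of files with content hashes (the XML layout and element names here are made up for illustration, not any real installer schema):

import xml.etree.ElementTree as ET

def load_manifest(path):
    # Parse a hypothetical version document of the form:
    #   <version><file path="bin/app.exe" sha256="..."/>...</version>
    root = ET.parse(path).getroot()
    return {f.get("path"): f.get("sha256") for f in root.iter("file")}

def plan(a, b):
    # The minimal change set that transforms state A into state B.
    adds    = sorted(p for p in b if p not in a)
    removes = sorted(p for p in a if p not in b)
    changes = sorted(p for p in a.keys() & b.keys() if a[p] != b[p])
    return adds, removes, changes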
Of course there are further complexities, but that is the point of this post. Where is "the bible" of information on this topic? Remember, I want to make my own installer - not use a platform-specific one. For those who care, my products are usually written in C++ or C#.
Or perhaps I should study something like Steam which is cross-platform and has "automated game updates" as part of its capabilities. In my case, the problem of online deployment is already handled. It is just the final installation step I'm examining. Does Steam use native installers (such as an MSI)? If yes, then that is not what I'm looking for.
In short, what path should I pursue to become somewhat competent on the science of this topic?
I'm not an expert and others can give you better answers but...
Don't declaratively list the steps required to install your product - you'll end up making assumptions which will eventually prove wrong. Instead, you should be looking at defining the final state of the installation and letting the installer worry about how to make that happen.
Another consideration is that being downgradable may involve huge complications depending on your product - Would it have to down-grade database schemas / file formats / ??? In short, every version of your app will need to be both fully forwards- and backwards-compatible (or at least fail gracefully). Also consider the scenario where V1 of your app stores settings in a file. V2 comes along and adds more settings. You downgrade to V1 - What should it do when changing settings? preserve the V2 settings? dump them? Do some of the V2 settings change the impact/meaning of the V1 settings? Are these decisions to be made by your app or your installer?
Anyway, all that aside, I'd say you need at the least:
A central server/farm with complete files for every version of your App and some API/Web Service which allows the installer to retrieve files/filesets/??? as appropriate (You may be able to tie this into a source control system like svn)
Some way of specifying the desired post-install state of the system in an environment-agnostic way. (Think install paths: should /usr/??? map to C:\Users\??? or C:\Program Files on Windows? Also don't forget it might be a 64-bit machine, so it could be C:\Program Files (x86).)
A very clever installer written for multiple platforms with as much code re-use as possible (Java, Mono, ???)
The installer should do (simply):
Determine the desired version of the product.
Download/read the appropriate manifest.
Compare the desired situation with the current situation (NB: What is currently on the local system, NOT what should be on the system according to the current version's manifest)
Generate a list of steps to reconcile the two, taking into account any dependencies (you can't set file permissions before you copy the file). You can make use of checksums/hashing/similar to compare existing files with desired files - thus only downloading the files actually required (see the sketch after this list).
Possibly take complete backups
Download/unpack required files.
Download/unpack 3rd-party dependencies (e.g. a newer .NET Framework version or similar)
Perform install steps in as atomic a manner as possible (at the very least keeping a record of steps taken so they can be undone)
Potentially apply any version-jump specific changes (up/down-grade db, config files, etc.)
Verify the installation as much as possible (checksums again)
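Here is a rough sketch of the compare-and-reconcile steps above, assuming the desired manifest is a simple path-to-SHA-256 mapping; fetch() is a hypothetical helper that retrieves one file from the server. Note that it hashes what is actually on disk, not what the old manifest claims should be there:

import hashlib, os

def sha256_of(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def reconcile(install_dir, desired, fetch):
    # desired: {relative_path: sha256}
    # fetch(rel): hypothetical caller-supplied helper returning that file's bytes
    journal = []  # record of steps taken, so a failed install can be undone
    for rel, want in desired.items():
        full = os.path.join(install_dir, rel)
        if os.path.exists(full) and sha256_of(full) == want:
            continue  # already in the desired state - skip the download
        parent = os.path.dirname(full)
        if parent:
            os.makedirs(parent, exist_ok=True)
        with open(full, "wb") as f:
            f.write(fetch(rel))
        journal.append(("wrote", rel))
    return journal

Files present locally but absent from the manifest would be deleted (after backup) in the same loop; the journal is what makes the undo step possible.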
None of this addresses the question of what to do when the installer itself needs upgrading.
A technique I've used on Windows is that the installer executable itself is little more than a wrapper with some interfaces which loads the actual installer dynamically at runtime - thus I can move files about/unload/reload assemblies, etc... from within a fixed process that almost never changes.
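In Python the same trick might look something like this (a sketch only - the poster describes it with .NET assemblies, and installer_core is a made-up module name):

import importlib

def run_installer():
    core = importlib.import_module("installer_core")  # the part that changes
    importlib.reload(core)  # pick up a freshly downloaded version, if any
    core.main()             # assumed entry point of the real installer

The wrapper itself stays trivial and stable; all the logic that actually varies between versions lives in the module it loads.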
As I said above, I am definitely not an expert, just a novice who's done some of this myself. I'm sure you can get more complete answers from others, but I hope this helped a little.

Do you put your development/runtime tools in the repository?

Putting development tools (compilers, IDEs, editors, ...) and runtime environments (JRE, .NET Framework, interpreters, ...) under version control has a couple of nice benefits. First, you can easily compile/run your program just by checking out your repository; you don't need anything else. Second, the combination is known to be version-compatible, since you tested it once. However, it has its own drawbacks. The main one is the large volume of big binary files that must be put under version control, which may make the VCS slower and the backup process harder. What's your idea?
Tools and dependencies actually used to compile and build the project, absolutely - it is very useful if you ever have to debug an issue or develop a fix for an older version and you've moved on to newer versions that aren't quite compatible with the old ones.
IDEs & editors, no - ideally your project should be buildable from a script, so these would not be necessary. The generated output should still be the same regardless of what you used to edit the source.
I include a text (and thus easily diff-able) file in every project root called "How-to-get-this-project-running" that includes any and all things necessary, including the correct .NET version and service packs.
Also, for proprietary IDEs (e.g. Visual Studio), there can be licensing issues, as this makes it difficult to manage who is using which pieces of software.
Edit:
We also used to store batch files in source control that automatically checked out the source code (and all dependencies). Developers just check out the "Setup" folder and run the batch scripts, instead of having to search the repository for the appropriate bits and pieces.
What I find very nice and common (in .NET projects I have experience with, anyway) is including any "non-default install" dependencies in a lib or dependencies folder under source control. The runtime is provided by the GAC and kind of assumed.
First, you can easily compile/run your program just by checking out your repository.
Not true: it often isn't enough to just get/copy/check out a tool; instead, the tool must also be installed on the workstation.
Personally I've seen libraries and 3rd-party components in the source version control system, but not the tools.
I keep all dependencies in a folder under source control named "3rdParty". I agree that this is very convenient and you can just pull down the source and get going. This really shouldn't affect the performance of the source control.
The only real drawback is that the initial size to pull down can be fairly large. In my situation anyone who pulls down the code usually will run it too, so it is OK. But if you expect many people to pull down the source just to read, then this can be annoying.
I've seen this done in more than one place where I worked. In all cases, I've found it to be pretty convenient.

Arguments for and against including 3rd-party libraries in version control? [closed]

I've met quite a few people lately who say that 3rd-party libraries don't belong in version control. These people haven't been able to explain to me why they shouldn't, so I hoped you guys could come to my rescue :)
Personally, I think that when I check out the trunk of a project, it should just work - no need to go to other sites to find libraries. More often than not, you'd otherwise end up with multiple versions of the same 3rd-party lib across different developers - and sometimes with incompatibility problems.
Is it so bad to have a libs folder up there, with "guaranteed-to-work" libraries you could reference?
In SVN, there is a pattern used to store third-party libraries called vendor branches. This same idea would work for any other SVN-like version control system. The basic idea is that you include the third-party source in its own branch and then copy that branch into your main tree so that you can easily apply new versions over your local customizations. It also cleanly keeps things separate. IMHO, it's wrong to directly include the third-party stuff in your tree, but a vendor branch strikes a nice balance.
Another reason to check in libraries to your source control which I haven't seen mentioned here is that it gives you the ability to rebuild your application from a specific snapshot or version. This allows you to recreate the exact version that someone may report a bug on. If you can't rebuild the exact version you risk not being able to reproduce/debug problems.
Yes you should (when feasible).
You should be able to take a fresh machine and build your project with as few steps as possible. For me, it's:
Install IDE (e.g. Visual Studio)
Install VCS (e.g. SVN)
Checkout
Build
Anything more has to have very good justification.
Here's an example: I have a project that uses Yahoo's YUI Compressor to minify JS and CSS. The YUI .jar files go in source control into a tools directory alongside the project. The Java runtime, however, does not - that has become a prereq for the project, much like the IDE. Considering how popular the JRE is, it seems like a reasonable requirement.
No - I don't think you should put third party libraries into source control. The clue is in the name 'source control'.
Although source control can be used for distribution and deployment, that is not its prime function. And the arguments that you should just be able to check out your project and have it work are not realistic. There are always dependencies. In a web project, they might be Apache, MySQL, the programming runtime itself, say Python 2.6. You wouldn't pile all those into your code repository.
Extra code libraries are just the same. Rather than include them in source control for ease of deployment, create a deployment/distribution mechanism that allows all dependencies to easily be obtained and installed. This makes the steps for checking out and running your software something like:
Install VCS
Sync code
Run setup script (which downloads and installs the correct version of all dependencies)
To give a specific example (and I realise this is quite web centric), a Python web application might contain a requirements.txt file which reads:
simplejson==1.2
django==1.0
otherlibrary==0.9
Run that through pip (pip install -r requirements.txt) and the job is done. Then when you want to upgrade to use Django 1.1 you simply change the version number in your requirements file and re-run the setup.
The source of 3rd-party software doesn't belong (except maybe as a static reference), but the compiled binaries do.
If your build process will compile an assembly/dll/jar/module, then only keep the 3rd party source code in source control.
If you won't compile it, then put the binary assembly/dll/jar/module into source control.
This could depend on the language and/or environment you have, but for projects I work on I place no libraries (jar files) in source control. It helps to be using a tool such as Maven which fetches the necessary libraries for you. (Each project maintains a list of required jars, Maven automatically fetches them from a common repository - http://repo1.maven.org/maven2/)
That being said, if you're not using Maven or some other means of managing and automatically fetching the necessary libraries, by all means check them into your version control system. When in doubt, be practical about it.
The way I've tended to handle this in the past is to take a pre-compiled version of 3rd party libraries and check that in to version control, along with header files. Instead of checking the source code itself into version control, we archive it off into a defined location (server hard drive).
This kind of gives you the best of both worlds: a one-step fetch process that fetches everything you need, but it doesn't bog down your version control system with a bunch of unnecessary files. Also, by fetching pre-compiled binaries, you can skip that phase of compilation, which makes your builds faster.
You should definitely put 3rd-party libraries under source control. Also, you should try to avoid relying on stuff installed on an individual developer's machine. Here's why:
All developers will then share the same version of the component. This is very important.
Your build environment will become much more portable. Just install a source control client on a fresh machine, download your repository, build, and that's it (in theory, at least :) ).
Sometimes it is difficult to obtain an old version of some library. Keeping them under your source control makes sure you won't have such problems.
However, you don't need to add 3rd party source code in your repository if you don't plan to change the code. I tend just to add binaries, but I make sure only these libraries are referenced in our code (and not the ones from Windows GAC, for example).
We do, because we want to have tested an updated version of the vendor branch before we integrate it with our code. We commit changes to it when testing new versions. Our philosophy is that everything you need to run the application should be in SVN, so that:
You can get new developers up and running
Everyone uses the same versions of various libraries
We can know exactly what code was current at a given point in time, including third party libraries.
No, it isn't a war crime to have third-party code in your repository, but I find that it upsets my sense of aesthetics. Many people here seem to be of the opinion that it's good to have your whole development team on the same version of these dependencies; I say it is a liability. You end up dependent on a specific version of that dependency, and it becomes a lot harder to use a different version later. I prefer a heterogeneous development environment - it forces you to decouple your code from specific versions of dependencies.
IMHO the right place to keep the dependencies is on your tape backups, and in your escrow deposit, if you have one. If your specific project requires it (and projects are not all the same in this respect), then also keep a document under your version control system that links to these specific versions.
I like to check 3rd-party binaries into a "lib" directory that contains any external dependencies. After all, you want to keep track of the specific versions of those libraries, right?
When I compile the binaries myself, I often check in a zipped-up copy of the code alongside the binaries. That makes it clear that the code is not there for compiling, manipulating, etc. I almost never need to go back and reference the zipped code, but a couple of times it has been helpful.
If I can get away with it, I keep them out of my version control and out of my file system. The best case of this is jQuery, where I'll use Google's AJAX Libraries API and load it from there:
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js" type="text/javascript"></script>
My next choice would be to use something like Git submodules. And if neither of those suffice, they'll end up in version control, but at that point, it's only as up to date as you are...

Version control of deliverables

We need to regularly synchronize many dozens of binary files (project executables and DLLs) between many developers at several different locations, so that every developer has an up-to-date environment to build and test in. Due to the nature of the project, updates must be done often and on demand (overnight updates are not sufficient). This is not pretty, but we are stuck with it for a time.
We settled on using a regular version (source) control system: put everything into it as binary files, get-latest before testing and check-in updated DLL after testing.
It works fine, but a version control client has a lot of features which don't make sense for us and people occasionally get confused.
Are there any tools better suited for the task? Or may be a completely different approach?
Update:
I need to clarify that it's not a tightly integrated project - it's more like an extensible system with a heap of "plugins", including third-party ones. We need to make sure those modules/plugins work nicely with recent versions of each other and the core. A centralised build, as was suggested, was considered initially, but it's not an option.
I'd probably take a look at rsync.
Just create a .CMD file that contains the call to rsync with all the correct parameters and let people call that. rsync is very smart in deciding which parts of files need to be transferred, so it'll be very fast even when large files are involved.
What rsync doesn't do, though, is conflict resolution (or even detection), but in the scenario you described it's more like reading from a central place, which is what rsync is designed to handle.
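As a sketch, the suggested wrapper could be a single fixed invocation like this (written here in Python rather than a .CMD file; the server name, daemon module, and local path are placeholders):

import subprocess

# -a preserves permissions/times, -z compresses on the wire,
# --delete drops local files that no longer exist on the server.
subprocess.run(
    ["rsync", "-avz", "--delete",
     "buildserver::binaries/",  # hypothetical rsync daemon module
     "./bin/"],                 # local target directory
    check=True,
)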
Another option is Unison.
You should look into continuous integration and having some kind of centralised build process. I can only imagine the kind of hell you're going through with your current approach.
Obviously that doesn't help with keeping your local files in sync, but I think you have bigger problems with your process.
Building the project should be a centralized process in order to allow for better control; otherwise your solution will be chaos in the long run. Anyway, here is what I'd do.
Create the usual repositories for source files, resources, documentation, etc. for each project.
Create a repository for resources. There will be the latest binary versions for each project as well as any required resources, files, etc. Keep a good folder structure for each project so developers can "reference" the files directly.
Create a repository for final builds which will hold the actual stable releases. This will get the stable files, produced in an automatic way (if possible) from the checked-in sources. This will hold the real product, the real version for integration testing and so on.
While far from perfect, you'll be able to define well-established protocols: check in your latest DLL here, generate the "real" version from the latest source there.
What about embedding a 'what' string in the executables and libraries? Then you can synchronise the desired list of versions with a manifest.
We tend to use CVS id strings as a part of the what string.
const char cvsid[] = "@(#)INETOPS_filter_ip_$Revision: 1.9 $";
Entering the command
what filter_ip | grep INETOPS
returns
INETOPS_filter_ip_$Revision: 1.9 $
We do this for all deliverables so we can see if the versions in a bundle of libraries and executables match the list in an associated manifest.
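For illustration, the scan that what(1) performs is simple enough to sketch in a few lines of Python (a rough re-implementation, not the real tool):

import re, sys

def what(path):
    data = open(path, "rb").read()
    # what(1) prints everything after "@(#)" up to the first NUL,
    # double quote, >, backslash, or newline.
    pattern = rb'@\(#\)([^\x00">\\\n]*)'
    return [m.group(1).decode("ascii", "replace")
            for m in re.finditer(pattern, data)]

for line in what(sys.argv[1]):
    print(line)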
HTH.
cheers,
Rob
Subversion handles binary files really well, is pretty fast, and scriptable. VisualSVN and TortoiseSVN make dealing with Subversion very easy too.
You could set up a folder that's checked out from Subversion with all your binary files (that all developers can commit to and update from), then just type "svn update" at the command line, or use TortoiseSVN: right-click on the folder, click "SVN Update" and it'll update all the files and tell you what's changed.

Storing third-party libraries in source control

Should libraries that the application relies on be stored in source control? One part of me says it should and another part says no. It feels wrong to add a 20 MB library that dwarfs the entire app just because you rely on a couple of functions from it (albeit rather heavily). Should you just store the jar/dll, or maybe even the distributed zip/tar of the project?
What do other people do?
Store everything you will need to build the project 10 years from now. I store the entire zip distribution of any library, just in case.
Edit for 2017:
This answer did not age well :-). If you are still using something old like Ant or make, the above still applies. If you use something more modern like Maven or Gradle (or NuGet on .NET, for example) with dependency management, you should be running a dependency management server in addition to your version control server. As long as you have good backups of both, and your dependency management server does not delete old dependencies, you should be OK. For an example of a dependency management server, see for example Sonatype Nexus or JFrog Artifactory, among many others.
As well as having third-party libraries in your repository, it's worth doing it in a way that makes it easy to track and merge in future updates to the library (for example, security fixes, etc.). If you are using Subversion, a proper vendor branch is worthwhile.
If you know that it'd be a cold day in hell before you'll be modifying your third party's code, then (as @Matt Sheppard said) an external makes sense and gives you the added benefit that it becomes very easy to switch up to the latest version of the library should security updates or a must-have new feature make that desirable.
Also, you can skip externals when updating your code base, saving on the long, slow load process should you need to.
@Stu Thompson mentions storing documentation etc. in source control. In bigger projects I've stored our entire "clients" folder in source control, including invoices / bills / meeting minutes / technical specifications, etc. The whole shooting match. Although, ahem, do remember to store these in a SEPARATE repository from the one you'll be making available to: other developers; the client; your "browser source view"... cough... :)
Don't store the libraries; they're not strictly speaking part of your project and they uselessly take up room in your revision control system. Do, however, use Maven (or Ivy for Ant builds) to keep track of what versions of external libraries your project uses. You should run a mirror of the repo within your organisation (that is backed up) to ensure you always have the dependencies under your control. This ought to give you the best of both worlds: external jars outside your project, but still reliably available and centrally accessible.
We store the libraries in source control because we want to be able to build a project by simply checking out the source code and running the build script. If you aren't able to get latest and build in one step then you're only going to run into problems later on.
Never store your 3rd-party binaries in source control. Source control systems are platforms that support concurrent file sharing, parallel work, merging efforts, and change history. Source control is not an FTP site for binaries. 3rd-party assemblies are NOT source code; they change maybe twice per SDLC. The desire to be able to wipe your workspace clean, pull everything down from source control and build does not mean 3rd-party assemblies need to be stuck in source control. You can use build scripts to control pulling 3rd-party assemblies from a distribution server. If you are worried about controlling which branch/version of your application uses a particular 3rd-party component, then you can control that through build scripts as well. People have mentioned Maven for Java, and you can do something similar with MSBuild for .NET.
I generally store them in the repository, but I do sympathise with your desire to keep the size down.
If you don't store them in the repository, they absolutely do need to be archived and versioned somehow, and your build system needs to know how to get them. Lots of people in the Java world seem to use Maven for fetching dependencies automatically, but I've not used it, so I can't really recommend for or against it.
One good option might be to keep a separate repository of third-party systems. If you're on Subversion, you could then use Subversion's externals support to automatically check out the libraries from the other repository. Otherwise, I'd suggest keeping an internal anonymous FTP (or similar) server which your build system can automatically fetch requirements from. Obviously you'll want to make sure you keep all the old versions of libraries, and have everything there backed up along with your repository.
What I have is an intranet Maven-like repository where all 3rd-party libraries are stored (not only the libraries, but their respective source distributions with documentation, Javadoc and everything). The reasons are the following:
why store files that don't change in a system specifically designed to manage files that change?
it dramatically speeds up check-outs
each time I see "something.jar" stored under source control I ask "and which version is it?"
I put everything except the JDK and IDE in source control.
Tony's philosophy is sound. Don't forget database creation scripts and data structure update scripts. Before wikis came out, I used to even store our documentation in source control.
My preference is to store third party libraries in a dependency repository (Artifactory with Maven for example) rather than keeping them in Subversion.
Since third-party libraries aren't managed or versioned like source code, it doesn't make a lot of sense to intermingle them. Remote developers also appreciate not having to download large libraries over a slow VPN link when they can get them more easily from any number of public repositories.
At a previous employer we stored everything necessary to build the application(s) in source control. Spinning up a new build machine was a matter of syncing with the source control and installing the necessary software.
Store third party libraries in source control so they are available if you check your code out to a new development environment. Any "includes" or build commands that you may have in build scripts should also reference these "local" copies.
As well as ensuring that third party code or libraries that you depend on are always available to you, it should also mean that code is (almost) ready to build on a fresh PC or user account when new developers join the team.
Store the libraries! The repository should be a snapshot of what is required to build a project at any moment in time. As the project requires different versions of external libraries, you will want to update / check in the newer versions of these libraries. That way you will be able to get all the right versions to go with an old snapshot if you have to patch an older release, etc.
Personally, I have a dependencies folder as part of my projects and store referenced libraries in there.
I find this makes life easier, as I work on a number of different projects, often with inter-depending parts that need the same version of a library, meaning it's not always feasible to update to the latest version of a given library.
Having all dependencies used at compile time for each project means that a few years down the line, when things have moved on, I can still build any part of a project without worrying about breaking other parts. Upgrading to a new version of a library is simply a case of replacing the file and rebuilding related components; not too difficult to manage if need be.
Having said that, I find most of the libraries I reference are relatively small weighing in at around a few hundred kb, rarely bigger, which makes it less of an issue for me to just stick them in source control.
Use Git submodules, and either reference the 3rd-party library's main Git repository or (if it doesn't have one) create a new Git repository for each required library. There's no reason why you're limited to just one Git repository, and I don't recommend you use somebody else's project as merely a directory in your own.
Store everything you'll need to build the project, so you can check it out and build without doing anything else.
(and, as someone who has experienced the pain - please keep a copy of everything needed to get the controls installed and working on a dev platform. I once got a project that could build - but without an installation file and reg keys, you couldn't make any alterations to the third-party control layout. That was a fun rewrite)
You have to store everything you need in order to build the project.
Furthermore different versions of your code may have different dependencies on 3rd parties.
You'll want to branch your code into a maintenance version together with its 3rd-party dependencies...
Personally, what I have done and have so far liked the results of is storing libraries in a separate repository and then linking to each library that I need in my other repositories through the use of the Subversion svn:externals feature. This works nicely because I can keep versioned copies of most of our libraries (mainly managed .NET assemblies) in source control without them bulking up the size of our main source code repository at all. Having the assemblies stored in the repository in this fashion means the build server doesn't have to have them installed to make a build. I will say that getting a build to succeed in the absence of Visual Studio being installed was quite a chore, but now that we've got it working we are happy with it.
Note that we don't currently use many commercial third-party control suites or that sort of thing, so we haven't run into licensing issues where it may be required to actually install an SDK on the build server, but I can see how that could easily become a problem. Unfortunately I don't have a solution for that and will plan on addressing it when I first run into it.