What is the best way to version control a Google Sheet using GitHub?

Our team's pipeline uses Google Sheets as a very basic database. We use it because it is a standard spreadsheet that can be accessed online and shared. All further analysis is carried out on the CSV exported from this Google Sheet.
Since shared editing leads to mistakes, I need to be able to revert the change that introduced a mistake without losing the other changes. Since Google Sheets' version history isn't as useful as Git's, I want to put this spreadsheet (ideally, the CSV) under GitHub version control automatically.
Would it be possible to do that?
If I have to do it manually, I will need to open the spreadsheet, export the CSV, and push it to the appropriate repository. I think it would be easy to automate, but I'm not sure how to do it.
I appreciate your help.

You do not need to export manually; the spreadsheet can be fetched as CSV from the following endpoint (replace ##ssID## and ##sheetName## with your spreadsheet ID and sheet name):
https://docs.google.com/spreadsheets/d/##ssID##/gviz/tq?tqx=out:csv&sheet=##sheetName##
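A minimal sketch of how that endpoint could be used to automate the snapshot, assuming the sheet is readable by the requester without a login (for example, shared by link) and that `repo_dir` is a local clone that already has push credentials. The function names, the `sheet.csv` filename, and the commit message are hypothetical, and scheduling (cron, a CI job, etc.) is left out:

```python
import subprocess
import urllib.parse
import urllib.request
from pathlib import Path


def gviz_csv_url(sheet_id: str, sheet_name: str) -> str:
    """Build the CSV endpoint URL for one tab of a spreadsheet."""
    query = urllib.parse.urlencode(
        {"tqx": "out:csv", "sheet": sheet_name},
        safe=":",                       # keep "out:csv" readable
        quote_via=urllib.parse.quote,   # encode spaces as %20
    )
    return f"https://docs.google.com/spreadsheets/d/{sheet_id}/gviz/tq?{query}"


def snapshot(sheet_id: str, sheet_name: str, repo_dir: str,
             filename: str = "sheet.csv") -> bool:
    """Download the CSV into the clone; commit and push only if it changed.

    Returns True if a commit was made. Assumes anonymous read access.
    """
    target = Path(repo_dir) / filename
    with urllib.request.urlopen(gviz_csv_url(sheet_id, sheet_name)) as resp:
        target.write_bytes(resp.read())

    # `git status --porcelain` prints nothing when the file is unchanged.
    status = subprocess.run(
        ["git", "status", "--porcelain", "--", filename],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    if not status.stdout.strip():
        return False

    subprocess.run(["git", "add", "--", filename], cwd=repo_dir, check=True)
    subprocess.run(["git", "commit", "-m", "Update spreadsheet snapshot"],
                   cwd=repo_dir, check=True)
    subprocess.run(["git", "push"], cwd=repo_dir, check=True)
    return True
```

Run on a schedule, each changed snapshot becomes one commit, so `git diff` and `git revert` can be used on the CSV history.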

DoltHub is an excellent fit for this use case where you have a spreadsheet being built collaboratively and you want the ability to see the full version history, audit where/when/who each cell's value came from, diff any versions of the spreadsheet, and much more. It's free to use DoltHub and you can easily export your data to CSV from the web, or pull it all down as a Dolt database and access everything locally.
Here's a DoltHub blog post that covers this exact use case in more detail:
https://www.dolthub.com/blog/2022-07-15-so-you-want-spreadsheet-version-control/
If you haven't heard of DoltDB or DoltHub yet, here's a little more background...
DoltDB is the first versioned SQL relational database. It has all the power of a SQL database with all the versioning features of Git. That gives you a database that you can branch and fork, push and pull, merge and diff, just like a Git repository. It's open-source and written from the ground up in Go, and targets full MySQL compliance, so you can use it seamlessly with any tools that connect to a MySQL database.
DoltHub is an online site for finding and collaborating on datasets. The Git-style versioning features built into DoltDB enable easy and safe collaboration and give you a Pull Request workflow for accepting changes, just like on GitHub. You can control whether your dataset is public or private on the free tier, and there's a Pro tier if you need to host private databases larger than 1GB. There's even a DoltLab product available for teams that need to keep their data on their own private network.
There's a very active and friendly DoltHub user community on Discord where the DoltHub dev team hangs out, too, if you have any questions/comments/feedback.

Related

Any suggestions on the latest trend in version control for SQL Server 2014 and above?

For example, when a developer changes any database element in a business-critical database, the tool should force them to commit the code before the changes are applied to the database itself. I came across Redgate SQL Source Control, which partly matches my expectations. Are there other tools or effective database practices that I am missing here?
If you use SQL Source Control or a tool like it (e.g., ReadyRoll or VS Database Projects), I'd recommend also using DLM Dashboard.
The reason for this is that no tool can force changes to go through a process if people are given (too many) rights and are able to apply changes to production. It's then up to these people to follow the process correctly.
Although DLM Dashboard doesn't enforce changes to your database, it will alert you to changes made to production, warning you of out-of-process changes (aka "drift").
DLM Dashboard is free, which is another reason to use it!

How to implement continuous migration for large website?

I am working on a website of 3,000+ pages that is updated on a daily basis. It's already built on an open source CMS. However, we cannot simply continue to apply hot fixes on a regular basis. We need to replace the entire system, and I anticipate having to do so again every 1-2 years. We don't have the staff to work on a replacement system while the other is being worked on, as it results in duplicate effort. We also cannot have a "code freeze" while we work on the new site.
So, this amounts to changing the tire while driving. Or fixing the wings while flying. Or all sorts of analogies.
This brings me to a concept called "continuous migration." I read this article here: https://www.acquia.com/blog/dont-wait-migrate-drupal-continuous-migration
The writer's suggestion is to use a CDN like Fastly. The idea is that a CDN allows you to switch between a legacy system and a new system on a URL basis. In theory, this sounds like a great idea that would work. The article claims that you can do this with Varnish but that Fastly makes the job easier. I don't work much with Varnish, so I can't really verify its claims.
I also don't know if this is a good idea or if there are better alternatives. I looked at Fastly's pricing scheme, and I simply cannot translate it into a specific price point. I don't understand these cryptic cloud-service pricing plans; they don't make sense to me. I don't know what kind of bandwidth the website uses. Another agency manages the website's servers.
Can someone help me understand whether using an online CDN would be better than using something like Varnish? Are there free or cheaper solutions? Can someone tell me what this amounts to, approximately, on a monthly or annual basis? Are there any other, better ways to roll out a new website on a phased basis for a large website?
Thanks!
I don't think I have exact answers to your questions, but maybe my answer helps a little.
I don't think the CDN itself gives you the advantage. The advantage comes from having more than one system.
Changes to the code
In professional environments I'm used to having three different CMS installations. The first is the development system, usually on my PC. That system is used to develop extensions, fix bugs and so on, supported by unit tests. The code is committed to a revision control system (like SVN, CVS or Git). A continuous integration system checks the commits to the RCS. When a feature is implemented (or some bugs are fixed), a named tag is created. Then this tagged version is installed on a test system where developers, customers and users can test the implementation. After a successful test, exactly this tagged version is installed on the production system.
At first sight this looks time-consuming. But it isn't, because most of the steps can be automated. The biggest advantage is that the customer can test the change on a test system, and it is very unlikely that an error occurs only on your production system. (A precondition is that your systems are built on similar or identical environments.)
Changes to the content
If your code changes the way your content is processed, it is an advantage when your CMS has strong workflow support. Then you can easily add a step to your workflow that decides whether the content is old and has to be migrated for the current document. This way you have a continuous migration of the content.
HTH
Varnish is a cache rather than a CDN. It intercepts page requests and delivers a cached version if one exists.
A CDN will serve up content (images, JS, other resources etc.) from an off-server location, typically in the cloud.
Pricing for cloud-based solutions is often very cryptic because the technology is quite complicated.
I would be careful with continuous migration. I've done both methods in the past (continuous and full migrations) and I have to say, continuous is a pain. It means double the admin time for everything, and assumes your requirements are the same at all points in time.
Unfortunately, I would say you're better off with a proper rebuild on a 1-2 year basis than a continuous migration, but obviously you know best about that.
I would suggest you maybe also consider a hybrid approach. Build yourself an export tool to keep all of your content in a transferable format like CSV/XML/JSON so you can just import it into a new system when ready. This means you can incorporate new build requests when you need them in a new system (what's the point of a new system if it does exactly the same as the old one?) and you get to keep all your content. Plus you don't need to build and maintain two CMSs all the time.
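As a rough illustration of the export-tool idea, here is a minimal sketch that round-trips content through JSON. The `Page` record and its fields are hypothetical; a real CMS would supply its own content model and a way to read pages out of its database:

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable, List


@dataclass
class Page:
    """Hypothetical, CMS-neutral representation of one page."""
    url: str
    title: str
    body: str


def export_pages(pages: Iterable[Page]) -> str:
    """Serialize pages to a JSON document any new system can read."""
    return json.dumps([asdict(p) for p in pages], indent=2, ensure_ascii=False)


def import_pages(text: str) -> List[Page]:
    """Rebuild the page records from an exported JSON document."""
    return [Page(**item) for item in json.loads(text)]


if __name__ == "__main__":
    pages = [Page("/about", "About us", "<p>Hello</p>")]
    dump = export_pages(pages)
    assert import_pages(dump) == pages  # the export loses nothing
```

Keeping the export independent of either CMS is the point of the design: the new system only has to understand the neutral format, not the old system's schema.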

Version control of databases

I am curious if there are any solutions out there, preferably free, that can have a central database to publish data to in a versioned manner.
For example,
Client 1 decides to edit a person's profile, so it gets a local copy on its machine to make changes to. When they are happy with their edit, they publish the results to the central database, just like you would do a submit in Perforce.
Client 2 tries to edit the same profile, but when they go to submit they have to resolve conflicts.
The central database must store compressed differences between versions of the data.
At any point someone can look at all versions of the data submitted.
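To make the requirements above concrete, here is a minimal in-memory sketch of the submit-with-conflict-check behaviour. It is an illustration only, not any product's API: all names are hypothetical, and for simplicity it stores zlib-compressed full snapshots of each version rather than compressed differences:

```python
import json
import zlib


class ConflictError(Exception):
    """Raised when a submit is based on an outdated version."""


class VersionedStore:
    def __init__(self):
        # key -> list of compressed snapshots, oldest first
        self._history = {}

    def checkout(self, key):
        """Return (version_number, data); version 0 means 'no data yet'."""
        versions = self._history.get(key, [])
        if not versions:
            return 0, None
        return len(versions), json.loads(zlib.decompress(versions[-1]))

    def submit(self, key, base_version, data):
        """Store a new version if the client edited the latest one."""
        versions = self._history.setdefault(key, [])
        if base_version != len(versions):
            raise ConflictError(
                f"{key}: based on v{base_version}, latest is v{len(versions)}"
            )
        blob = zlib.compress(json.dumps(data, sort_keys=True).encode("utf-8"))
        versions.append(blob)
        return len(versions)

    def history(self, key):
        """Return every submitted version of the data, oldest first."""
        return [json.loads(zlib.decompress(b)) for b in self._history.get(key, [])]
```

A client that catches `ConflictError` would check out the latest version again, merge its edit, and resubmit, which mirrors the resolve step in the example with Client 2.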
Check out OffScale DataGrove.
This product tracks changes to the entire DB, both schema and data. You can tag versions at any point in time, and return to older states of the DB with a simple command. It also allows you to create virtual, separate copies of the same database so each team member can have their own separate DB. All the virtual copies are tracked in the same repository, so it's super easy to revert your DB to someone else's version (you simply check out their version, just like you do with your source control). This means all your DBs can always be synchronized.
Disclaimer - I work at OffScale :-)
"Version control of databases" is a bit ambiguous for a title, because you are actually asking for a VCS using a database as repository "data store".
Subversion has such a model (either Berkeley DB or filesystem-based).
It also has a Copy-Modify-Merge model, which is similar to the kind of workflow you are describing.
The SQL tools from Redgate offer some of this functionality, but not implemented in the way you describe. For example, SQL Data Compare can compare the differences between data in two databases, and SQL Source Control can be used as well.
However, getting a copy of the database on a local machine, making changes and resubmitting would be more of a manual process.
What database server are you using? If you are using MySQL and PHP, Doctrine has 'Versionable' behavior which can be applied to a model.
The documentation on this behavior is here:
http://www.doctrine-project.org/projects/orm/1.2/docs/manual/behaviors/en#core-behaviors:versionable
This is exactly what my product (yes I'm biased :)) DBmaestro Teamwork does.
It enforces and keeps track of changes to structure and content.
It prevents parallel changes to an object's structure or content by two users (as long as they work on the same object, meaning same database, same schema, ...).
It uses a baseline-aware analysis that understands the nature of the change and knows whether the change should be promoted, should be ignored (as it was made from another environment), or is a conflict.
And much more…
I would encourage you to read a comprehensive, unbiased review of Database Enforced Management Solution by veteran database expert Ben Taylor, posted on LinkedIn: https://www.linkedin.com/pulse/article/20140907002729-287832-solve-database-change-mangement-with-dbmaestro

How do I sync my development with the users?

I create websites for people. I have given them the ability to edit certain areas of their published pages using CushyCMS. That works fine, and everyone is happy with it.
When I go to publish some of my more extensive changes, I first need to pull down the latest version that they have produced. Then I make my changes, and upload everything to production.
I would like to use some sort of version control in this process. This should be a classic update-edit-commit-publish workflow, but I'm not sure how to go about this. Basically I want to avoid pulling down everything locally and doing the commits. I only want to pull down what has changed.
I use FileZilla, and it doesn't do a good job of identifying changed files. I can't rely on the file size, because sometimes it stays the same. I can't rely on timestamps because the server time is different from my machine's, and it never seems to work correctly.
How can I get around my problem? I use Notepad++, Subversion and FileZilla, but I'm willing to try other tools if they would make this process easier.
It comes down to CushyCMS's decision to edit files directly and not put the user-provided content in a database like WordPress, DotNetNuke, Drupal, etc. So the real answer is you can't get there from here and should look into migrating to a database-backed CMS. That's not what you want to hear, though.
Version control will get you part of the way to concurrency but there is always the possibility of a user updating a page between your pull down and publishing the revised copy since your users wouldn't be checking into the version control system directly. That would require them to learn the version control system and negate the ease that CushyCMS (or any CMS really) provides. You'll want to try and find a system that allows your live site to be the Master to which you compare and check-out files from. I do not know of any mainstream systems that currently work that way.
I found that it was easiest to use a tool like Beyond Compare to handle the synchronization.
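Since the question rules out file size and timestamps, one alternative is to compare content hashes. The following is a minimal sketch, assuming you can read the files on both sides (for example, a freshly downloaded copy versus the copy from your last sync); the function names are hypothetical:

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's relative path to the SHA-256 of its contents."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def changed_files(old_manifest, new_manifest):
    """Return paths that are new or whose contents differ from the old manifest."""
    return sorted(
        name for name, digest in new_manifest.items()
        if old_manifest.get(name) != digest
    )
```

Because the comparison looks only at file contents, an edit that keeps the same size, or a clock difference between server and workstation, does not affect the result.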

Configuration Management with Subversion and SharePoint help

OK, when I was hired by my current company a year ago, I was tasked with migrating our development teams from VSS. They already had it in their minds that they wanted Subversion, and since I had experience using and setting up Subversion, I was a good candidate. I first tried to sell TFS because it would have solved the problem I am in right now, but since money is tight, and Subversion is free... well, you get it. Anyway, I have finalized the proposal and the only thing standing in the way is the following.
I proposed that we store only our source code in SVN, and that all documentation, release builds, and other project artifacts be stored in our SharePoint portal, so we don't have to give non-developer stakeholders access to SVN. When I presented the proposal, all was accepted, but the question arose about how to manage the synchronization between the artifacts (e.g., how is document x version 3.1.2 associated with release 4.5.2?). My initial reaction is to create a section in the SharePoint project page for each new release that will hold the artifacts (and keep track of changes too). Is there a better way of doing this? Does anyone know of anyone doing this? Or any integration packages to sync SVN with SharePoint?
Here is some info on the company's development environment. All of our software is for internal use, we sell none of it, so our customers are all in-house. We have two types of developers: 1. those who take care of maintenance and customization of third-party software, and 2. those who write proprietary software (which is where I fall). The software we write is mostly .NET, but the third-party software is all over the board (COBOL, C, FORTRAN, and other crap that no one cares about anymore).
Please advise, as I need to get this submitted soon. I HATE VSS!!!!!!!!!! and I need relief!
What we do internally is put all docs under our version control system; I think it's much easier. Then, of course, you have to give access to non-developers.
In your case, using SVN, why don't you put everything inside and then use the web interface to give access to the stakeholders? It's easy enough for them :-P
I would use SVN for both documents and source code.
Advantages:
You can synchronize versions of documents with versions of source code.
You have everything in one place, so no two repositories to administrate.
Disadvantages:
You'd probably need to manage the access rights for some stakeholders to some parts of the folder structures.
SVN is not the most appropriate tool for document management.
To avoid concurrent changes to the same document, you can set the SVN property svn:needs-lock on these items, so that only one person at a time (the one who holds the lock) can edit them.
As pablo said, you can access the documents (at least for reading them) through the web interface.
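For reference, the svn:needs-lock property is set with `svn propset` and then committed. Below is a minimal sketch that wraps the command, assuming the `svn` command-line client is installed and the paths are inside a working copy; the helper names are hypothetical:

```python
import subprocess


def needs_lock_command(path):
    """Build the svn command that marks one file as requiring a lock.

    The property's value is not significant; "*" is the usual convention.
    """
    return ["svn", "propset", "svn:needs-lock", "*", path]


def mark_needs_lock(paths, working_copy="."):
    """Set svn:needs-lock on each path (the change still has to be committed)."""
    for path in paths:
        subprocess.run(needs_lock_command(path), cwd=working_copy, check=True)
```

Once the property is committed, such files are checked out read-only until someone runs `svn lock` on them, which is what nudges editors into taking the lock first.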
You could expose the SVN repo via the web interface and link to that in SharePoint. That way people who need to edit the documents would need access to Subversion, but anyone could easily access the documents "read only".
In our organization, we keep everything (docs, artifacts, code) in SVN and have also given access to non-technical stakeholders, who use the TortoiseSVN client.
However, you could look at the following options:
Option 1: create an ASP.NET interface for non-technical users
You can build a simple web interface in ASP.NET and configure it with a single user, so you would not have to create separate users for all the non-technical stakeholders, and they would get access to the docs with proper version control, etc. You could look at SharpSvn for the implementation. The disadvantage of this approach is that you might have to invest some time in developing this app.
Option 2: of course, create separate users for each non-developer stakeholder
This answer is probably too late for your implementation, but the simplest integration path may be to store the docs in SVN and then publish to SharePoint with an SVN hook.
Build artifacts could be programmatically published the same way from your build scripts.
You can upload docs to SharePoint using a simple POST, e.g.:
http://blogs.msdn.com/rohitpuri/archive/2007/04/10/upload-download-file-to-from-wss-document-library-using-dav.aspx
Probably a little late, too, but I would avoid putting the documents in SVN if you have a SharePoint system set up. Though SVN does a fantastic job for source code, for document management it doesn't provide the ease of use of SharePoint. If you have it already set up and you are a primarily MS-based network, SharePoint makes a lot of sense and can handle revision control for the MS-based documentation much better than SVN.
Yes, you can manage access to SVN documents with a needs-lock, but chances are at some point you'll have a non-developer needing to access the documents. Explaining SVN to a non-developer, non-techie is not an easy thing.