Execute SQL scripts stored in folders in an Azure DevOps repo with the Java 11 HttpClient

I am version-controlling a database by uploading it to an Azure DevOps repository. The structure of the repository looks something like this:
repo_name
- schema1
- schema2
- schema3
  - Tables
    - table1.sql
    - table2.sql
  - Stored Procedures
    - stored_procedures.sql
  - Functions
    - functions.sql
(each schema directory contains the same Tables / Stored Procedures / Functions subfolders)
In my Java program, I initialize a Java 11 HttpClient and fetch the files using HttpRequest. I built a small helper class that takes a URI, but I thought it would be good to get an expert opinion on how to approach this problem.
When I read the repository, there is still a folder structure, organized by schema, that I need to work through. Within each schema directory, I would execute the CREATE TABLE commands in table1.sql, for instance, and then execute the CREATE PROCEDURE and CREATE FUNCTION commands in their respective files.
My question is: should I write an additional class that will traverse the folder structure and execute the files in the order I mentioned above or should I "flatten" the structure by combining all of the .sql files into one large file and then executing that on the target database? Or is there a better way to approach this outside of the two I proposed?
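For what it's worth, here is a minimal sketch of the "traverse and execute in order" option, assuming the list of file URIs for each schema has already been collected (for example from the Azure DevOps Items REST API) and that each URI returns the raw contents of a .sql file. The class and method names are purely illustrative, and any batch separators (such as GO) would still need to be split out before execution:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.sql.Connection;
import java.sql.Statement;
import java.util.List;

public class SchemaDeployer {

    private final HttpClient http = HttpClient.newHttpClient();

    // Fetch the raw contents of one .sql file; the URI is assumed to already
    // point at an endpoint that returns the file body as plain text.
    String fetchSql(URI fileUri) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(fileUri).GET().build();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    // Execute the files for one schema in dependency order:
    // tables first, then functions, then stored procedures.
    void deploySchema(Connection conn,
                      List<URI> tableFiles,
                      List<URI> functionFiles,
                      List<URI> procedureFiles) throws Exception {
        for (List<URI> group : List.of(tableFiles, functionFiles, procedureFiles)) {
            for (URI file : group) {
                try (Statement stmt = conn.createStatement()) {
                    stmt.execute(fetchSql(file));   // assumes one batch per file
                }
            }
        }
    }
}

One point in favour of per-file execution over a single flattened script is error reporting: a failed statement can be traced back to the specific file and schema it came from.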

Related

How to drop only the tables from a target database that are not in the source DB, using a DACPAC or SqlPackage.exe arguments? [Azure DevOps]

I have tried multiple ways to drop only the tables from the target database that are not in the source database using a dacpac.
If anyone has a better suggestion or solution for keeping the source and target DBs identical in terms of tables only, please suggest a solution using any of these:
- a dacpac file
- a SQL project in .NET
- SqlPackage.exe arguments
I added these SqlPackage.exe arguments:
/p:DropObjectsNotInSource=true /p:BlockOnPossibleDataLoss=false /p:AllowDropBlockingAssemblies=true
I am facing these errors:
*** Could not deploy package.
Warning SQL72012:
Warning SQL72015:
Warning SQL72014:
Warning SQL72045:
These Stack Overflow links, and similar ones, didn't help me:
Deploy DACPAC with SqlPackage from Azure Pipeline is ignoring arguments and dropping users
Is it possible to exclude objects/object types from sqlpackage?
I am expecting my source and target DBs to end up with identical tables.
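For reference, a sketch of how those properties might be combined in a single publish call; the file, server, and database names are placeholders, and DoNotDropObjectTypes (a documented publish property) is one way to keep DropObjectsNotInSource from dropping anything other than tables. The exact list of object types to protect depends on what else lives in the target database:

SqlPackage.exe /Action:Publish ^
  /SourceFile:"MyDatabase.dacpac" ^
  /TargetServerName:"myserver.database.windows.net" ^
  /TargetDatabaseName:"TargetDb" ^
  /p:DropObjectsNotInSource=true ^
  /p:BlockOnPossibleDataLoss=false ^
  /p:AllowDropBlockingAssemblies=true ^
  /p:DoNotDropObjectTypes="Users;Logins;Permissions;RoleMembership;StoredProcedures;ScalarValuedFunctions"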

How to make Snakemake recognize Globus remote files using Globus CLI?

I am working in a high performance computing grid environment, where large-scale data transfers are done via Globus. I would like to use Snakemake to pull data from a Globus path, process the data, and then push the processed data to a different Globus path. Globus has a command-line interface.
Pulling the data is no problem, for I'd just create a rule that would run globus transfer to create the requisite local file. But for pushing the data back to Globus, I think I'll need a rule that can "see" that the file is missing at the remote location, and then work backwards to determine what needs to happen to create the file.
I could create local "proxy" files that represent the remote files. For example I could make a rule for creating 'processed_data_1234.tar.gz' output files in a directory. These files would just be created using touch (thus empty), and the same rule will run globus transfer to push the files remotely. But then there's the overhead of making sure that the proxy files don't get out of sync with the real Globus-hosted files.
Is there a more elegant way to do this, akin to the remote file capability? Is it difficult to add Globus CLI support to Snakemake? Thanks in advance for any advice!
Would it help to create a utility function that generates a list of all desired files and compares it against the list of files available on Globus? Something like this (pseudocode):
def return_needed_files(wildcards):
    list_needed_files = []  # either hard-coded or specified with some logic
    list_available = []     # as appropriate, e.g. parsed from `globus ls` output
    return [i for i in list_needed_files if i not in list_available]

# include all the needed files in the all rule
rule all:
    input: return_needed_files

Understanding the output files in a Datastore export

We need to export our Datastore DB from Google Cloud to our local development environment. I managed to export it and save it in a folder in Cloud Storage. However, there are over a hundred files named "output-{number}". It is not clear to me whether we must use all of them in order to import the DB locally, or whether I just need one of these outputs.
The export created has the following structure:
default_namespace/
  all_kinds/
    default_namespace_all_kinds.export_metadata
    output-0
    output-1
    ...
    output-N
Is the entire "default_namespace" directory needed to successfully import the data from Prod to Local?
If you need more information please write a comment and I will provide it to you.
Datastore exports are expected to generate many different files, as specified in the docs.
However, the file you should use to perform the import is the one with the extension .overall_export_metadata (example: file-name.overall_export_metadata).
If what you want to do is import the Datastore database into a local instance of the Datastore emulator, take a look at this documentation.
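If you do go the emulator route, the import is performed against the emulator's local REST endpoint rather than with gcloud. A rough sketch, assuming the whole export folder has been downloaded locally and the emulator is listening on its usual localhost:8081; the project ID and paths are placeholders:

curl -X POST localhost:8081/v1/projects/my-project-id:import \
  -H 'Content-Type: application/json' \
  -d '{"input_url": "/path/to/export/file-name.overall_export_metadata"}'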

Reading file names from an Azure file storage directory

I have a file storage within my Azure portal which looks roughly like this:
- 01_file.txt
- 02_file.txt
- 03_file.txt
In Azure Data Studio I have a dataset which is linked to this file storage.
If possible, I would like to loop through this directory and get a list of all the file names in my ETL pipeline.
I've had a look at the ForEach and Lookup activities but I can't figure out how to apply them to the directory.
The end result would be a list of file names on which I would carry out some further processing before ingesting the data into Azure.
My current workaround is to create a JSON file listing the file names when I load the data into the file storage, then parse that using Lookup and ForEach, but I'd like to know if there is a better solution using Data Factory.
Please use the Get Metadata activity. You can get the folder's metadata and then get the list of file names by accessing the childItems property. For more details, please refer to https://learn.microsoft.com/en-us/azure/data-factory/control-flow-get-metadata-activity#get-a-folders-metadata
(Pipeline configuration and execution screenshots omitted.)
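As a rough illustration of what that looks like in the pipeline JSON (the activity and dataset names here are placeholders), the Get Metadata activity is essentially:

{
    "name": "GetFileList",
    "type": "GetMetadata",
    "typeProperties": {
        "dataset": {
            "referenceName": "FileShareDataset",
            "type": "DatasetReference"
        },
        "fieldList": [ "childItems" ]
    }
}

A ForEach activity can then use @activity('GetFileList').output.childItems as its Items expression, and each file name is available inside the loop as @item().name.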

How to insert data into my SQLDB service instance (Bluemix)

I have created an SQLDB service instance and bound it to my application. I have created some tables and need to load data into them. If I write an INSERT statement in Run DDL, I receive an SQL -104 error. How can I run INSERT statements against my SQLDB service instance?
If you need to run your SQL from an application, there are several examples (sample code included) of how to accomplish this at the site listed below:
http://www.ng.bluemix.net/docs/services/SQLDB/index.html#run-a-query-in-java
Additionally, you can execute SQL in the SQL Database Console by navigating to Manage -> Work with Database Objects. More information can be found here:
http://www.ng.bluemix.net/docs/services/SQLDB/index.html#sqldb_005
s.executeUpdate("CREATE TABLE MYLIBRARY.MYTABLE (NAME VARCHAR(20), ID INTEGER)");
s.executeUpdate("INSERT INTO MYLIBRARY.MYTABLE (NAME, ID) VALUES ('BlueMix', 123)");
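Those two statements assume a java.sql.Statement s obtained from an open connection. A self-contained sketch of that surrounding boilerplate might look like the following; the JDBC URL and credentials are placeholders, and in Bluemix they would come from the bound SQLDB service's VCAP_SERVICES credentials rather than being hard-coded:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class SqldbInsertExample {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details: in Bluemix these come from the
        // credentials of the bound SQLDB service, not hard-coded values.
        String url = "jdbc:db2://<host>:50000/SQLDB";
        String user = "<username>";
        String password = "<password>";

        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement s = conn.createStatement()) {
            s.executeUpdate("CREATE TABLE MYLIBRARY.MYTABLE (NAME VARCHAR(20), ID INTEGER)");
            s.executeUpdate("INSERT INTO MYLIBRARY.MYTABLE (NAME, ID) VALUES ('BlueMix', 123)");
        }
    }
}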
Most people do initial database population or migrations when they deploy their application. Often these database commands are programming-language specific. The poster didn't include the programming language. You can accomplish this in two ways:
- Append a bash script that calls the database scripts you uploaded. This project shows how you can call that bash script from within your manifest file as part of doing a cf push.
- Some languages offer a file type or service that will automatically be used to populate the database on initial deploy or when you migrate/sync the DB. For example, Python's Django offers "fixtures" files that will automatically take a JSON file and populate your database tables.