How can I aggregate all untracked files in one place? - atlassian-sourcetree

I'm new to SourceTree; I used to use Git Tower.
How can I group all the untracked files together in one place?

Related

How to put XML from a zip into a Postgres table?

How can I put XML files from a zip archive into a PostgreSQL table? I tried:
COPY FROM 'zip/file.zip/*.xml';
There are a lot of XML files in the zip archive, and I want to add one row per XML document to the table.
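COPY reads a single file (or program output) on the server, so it can't expand a wildcard inside a zip archive. One option is to unpack the archive client-side and INSERT each document; here is a minimal Python sketch, assuming a hypothetical table created as CREATE TABLE docs (name text, doc xml):

import zipfile
import psycopg2

conn = psycopg2.connect("dbname=mydb")  # adjust the connection string
with conn, conn.cursor() as cur, zipfile.ZipFile("zip/file.zip") as zf:
    for name in zf.namelist():
        if name.endswith(".xml"):
            # one row per XML document; the text is coerced to xml on insert
            cur.execute("INSERT INTO docs (name, doc) VALUES (%s, %s)",
                        (name, zf.read(name).decode("utf-8")))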

How to create unique list of codes from multiple files in multiple subfolders in ADF?

We have a folder structure in the data lake like this:
folder1/subfolderA/parquet files
folder1/subfolderB/parquet files
folder1/subfolderC/parquet files
folder1/subfolderD/parquet files
etc.
All the parquet files have the same schema, and all the parquet files have, amongst other fields, a code field, let's call it code_XX.
Now I want the distinct values of code_XX across all parquet files in all subfolders.
So if the value 'A345' of code_XX appears multiple times in the parquet files in subfolderA and subfolderC, I want it only once.
Output must be a Parquet file with all unique codes.
Is this doable in Azure Data Factory, and how?
If not, can it be done in Databricks?
You can try the following.
Set the source folder path to recursively look for all parquet files, and choose a column to store the file names.
Since it seems you only need the file names in the output parquet file, use a Select transformation to forward only that column.
Use an expression in a Derived Column transformation to get the file names from the path string:
distinct(array(right(fileNames,locate('/',reverse(fileNames))-1)))
If you have access to SQL, it can be done with two Copy activities; no need for data flows.
Copy Activity 1 (Parquet to SQL): ingest all files into a staging table.
Copy Activity 2 (SQL to Parquet): SELECT DISTINCT code_XX from the staging table.
NOTE:
Use Mapping to extract only the column you need.
Use a wildcard file path with the recursive option enabled to copy all files from the subfolders: https://learn.microsoft.com/en-us/azure/data-factory/connector-azure-blob-storage?tabs=data-factory#blob-storage-as-a-source-type
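As for the Databricks option: yes, a few lines of PySpark produce the same result. A minimal sketch, assuming the folder layout above (the mount path and output path are placeholders):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# read every parquet file under folder1, descending into all subfolders
df = (spark.read
      .option("recursiveFileLookup", "true")
      .parquet("/mnt/datalake/folder1"))

# keep only the code column and deduplicate across all files
codes = df.select("code_XX").distinct()

# write the unique codes out as a single parquet result
codes.write.mode("overwrite").parquet("/mnt/datalake/output/unique_codes")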

ADF / Dataflow - Convert Multiple CSV to Parquet

In ADLS Gen2, the TextFiles folder has 3 CSV files. The column names are different in each file.
We need to convert all 3 CSV files to 3 parquet files and put them in the ParquetFiles folder.
I tried to use a Copy activity, and it fails because the column names contain spaces, which parquet files don't allow.
To remove the spaces, I used a data flow: Source -> Select (replace spaces with underscores in column names) -> Sink. This worked for a single file. When I tried to do it for all 3 files, it merged them and generated a single file with incorrect data.
How do I solve this, mainly removing spaces from the column names in all files? What other options are there?
Pipeline: a ForEach activity (loop over the CSV files in the folder and pass the current iteration item to the data flow as a parameter) -> a Data Flow activity whose source points to that folder (parameterize the file name in the source path). A sketch of the rename step follows below.
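Outside ADF, the same rename-and-convert step is only a few lines; here is a minimal pandas sketch (assuming local TextFiles and ParquetFiles folders and a parquet engine such as pyarrow installed):

from pathlib import Path
import pandas as pd

src = Path("TextFiles")
dst = Path("ParquetFiles")
dst.mkdir(exist_ok=True)

for csv_file in src.glob("*.csv"):
    df = pd.read_csv(csv_file)
    # parquet rejects spaces in column names, so replace them with underscores
    df.columns = [c.replace(" ", "_") for c in df.columns]
    df.to_parquet(dst / (csv_file.stem + ".parquet"), index=False)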
I created 2 datasets, one in CSV with a wildcard path and the other in Parquet. I used the Copy Data activity with the parquet dataset as sink and the csv dataset as source, and set the copy behavior to Merge files.

Paraview: Merge Multiple .stl and save as single (.stl)

How can I merge multiple .stl files and save them as a single .stl file?
I would like to save all of the .stl files as a single .stl file in ParaView.
Select all the STL sources and run Filters -> Append Datasets. Save the result.
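The same steps can be scripted with pvpython; a minimal sketch, assuming ParaView's paraview.simple API (the file names are placeholders):

from paraview.simple import OpenDataFile, AppendDatasets, ExtractSurface, SaveData

# load each STL source, as when selecting them in the Pipeline Browser
sources = [OpenDataFile(f) for f in ["part1.stl", "part2.stl", "part3.stl"]]

# Filters -> Append Datasets
merged = AppendDatasets(Input=sources)

# Append Datasets yields an unstructured grid; the STL writer wants
# polygonal data, so extract the surface before saving
surface = ExtractSurface(Input=merged)

SaveData("merged.stl", proxy=surface)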

How to merge only some files?

I am trying to merge part of a commit from the default branch (not all of the files, and only parts of other files) to a named branch. I tried graft, but it just takes the whole commit without giving me a chance to choose. How would this be done?
Example:
A---B---C---D
     \
      E---(G)
G does not exist yet. Let's say C and D each added 5 files and modified 5 files. I want G to have 2 of the 5 files added at C, all the modifications to one of the files, and one modification to another file. I would ideally like it to also have something similar from D.
When I selected graft to local..., all I got was the whole C changeset. Same for merge with local...
The unit of merging is a whole changeset, so C and D should have been committed in smaller pieces. You could merge the whole thing now and revert some files, but then you won't be able to merge the rest later -- it's considered merged already.
What I'd do is make a branch parallel to C-D, rooted at B in your example, that contains copies of the changes in C and D but splits them into coherent parts. Then you can merge whole changesets from that, and close (or perhaps even delete) the original C-D branch.
      C---D
     /
A---B--C1--D1--C2--D2   (equivalent to C--D)
     \
      E---(G?)
In the above, C1 and C2 together are equivalent to C. While I was at it, I went ahead and reordered the four new changesets (use a history-rewriting tool such as rebase), so that you can then simply merge D1 with E:
      C---D
     /
A---B--C1--D1--C2--D2
     \       \
      E-------G
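In command terms, the rebuild might look like this hedged sketch (revision names match the diagrams; fileA and fileB are placeholders for the files you want in C1; reordering requires the rebase extension):

hg update B
hg revert -r C fileA fileB     # take only these files as they are in C
hg commit -m "C1: part of C"
hg revert -r C --all           # pick up the rest of C's changes
hg commit -m "C2: rest of C"   # split D into D1/D2 the same way
# reorder to C1--D1--C2--D2 with rebase, then:
hg update E
hg merge D1                    # pulls in C1 and D1 only
hg commit -m "G"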
If reordering the new changesets is not an option, you'd have to do some fancy tap-dancing to commit the partial changesets in the order C1, D1, C2, D2; it's probably a lot less trouble to use graft (or transplant) to copy the changes that you're not allowed to merge separately. E.g., in the following you can still merge C1, but then you need a copy of D1 (labeled D1'), since there's no way to merge D1 without pulling C2 along with it.
      C---D
     /
A---B--C1--C2--D1--D2
     \   \
      E--G1--D1'