Merge PDF files using CSV file list using Powershell

I want to create multiple merged PDF files from 1400+ PDF files.
I have a data.csv file with 2 columns, as shown below.
The PDF files, whose names match the Filename column, are in the same folder as data.csv.
Each merged PDF should contain the group of files whose filenames share the same first three characters.
E.g.,
the files starting with EIN* need to be merged into one PDF file, in the same order as they appear in data.csv. The merged PDF should be named Y followed by the first three characters, so in this example it should be YEIN.pdf.
This process needs to repeat until all the rows in data.csv have been processed.
sample data.csv file
FilePath Filename
$FilePath1 EINCO01-174.pdf
$FilePath2 EINCO02-174.pdf
$FilePath3 EINCO03-174.pdf
$FilePath4 EINCO04-174.pdf
$FilePath5 EINCL01-174.pdf
$FilePath6 EINCL02-174.pdf
$FilePath7 EINCL03-174.pdf
$FilePath8 EINCL04-174.pdf
$FilePath9 EINCL05-174.pdf
$FilePath10 EINCL06-174.pdf
$FilePath11 EINCL07-174.pdf
$FilePath12 EINCL08-174.pdf
$FilePath13 EINCL09-174.pdf
$FilePath14 EINCL10-174.pdf
$FilePath15 EINCL11-174.pdf
$FilePath16 EINCL12-174.pdf
$FilePath17 EINCL13-174.pdf
$FilePath18 EINCL14-174.pdf
$FilePath19 EINCL15-174.pdf
$FilePath20 EINCL16-174.pdf
$FilePath21 EINCL17-174.pdf
$FilePath22 EINCL18-174.pdf
$FilePath23 EINCL19-174.pdf
$FilePath25 GINLG01-170.pdf
$FilePath26 GINLG02-166.pdf
$FilePath27 GINLG03-159.pdf
$FilePath28 GINLG04-159.pdf
$FilePath29 GINLG05-168.pdf
$FilePath30 GINLG06-152.pdf
$FilePath31 GINNO01-174.pdf
$FilePath32 GINNO02-131.pdf
$FilePath33 GINNO04-150.pdf
$FilePath34 GINNO05-174.pdf
$FilePath35 GINTA01-130.pdf
$FilePath36 GINTA02-139.pdf
$FilePath37 GINTA03-139.pdf
So to tackle this I have created a script to split data.csv file into multiple CSV files grouped by the First three characters as below.
$data = Import-Csv '.\data.csv' |
Select-Object Filepath,Filename,@{n='Group';e={$_.Filename.Substring(0,3)}}
$data | Format-Table -GroupBy Group
Group-Object {$_.Group}| ForEach-Object {
$_.Group | Export-Csv "$($_.Group).csv" -NoTypeInformation
}
foreach ($Group in $data | Group Group)
{
$data | Where-Object {$_.Group -eq $group.name} |
ConvertTo-Csv -delimiter "`t" -NoTypeInformation |
foreach {$_.Replace('"','')} |
Out-File "$($group.name).csv"
}
From here, I am unable to proceed to next step to achieve what I need to. I presume there could be a better way to do this.
PS: I have installed PSWritePDF module on my machine.
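One possible way forward is to skip the intermediate CSV files and build the merge jobs directly from data.csv. The sketch below is only that, a sketch: Get-PdfMergePlan is a hypothetical helper, the inline rows stand in for Import-Csv '.\data.csv', and the file is assumed to be comma-delimited. An ordered dictionary keeps both the groups and the files inside each group in CSV order. The final merge step is left as a comment because it relies on PSWritePDF's Merge-PDF with -InputFile and -OutputFile, which is assumed here rather than verified against your installed version.

```powershell
# Hypothetical helper: builds one merge job per three-character filename prefix.
function Get-PdfMergePlan {
    param(
        [Parameter(Mandatory)] [object[]] $Rows,
        [string] $Folder = '.'
    )
    # Ordered dictionary: prefixes stay in order of first appearance,
    # and each list keeps the files in the order of the CSV rows.
    $groups = [ordered]@{}
    foreach ($row in $Rows) {
        # Assumes every filename has at least three characters
        $prefix = $row.Filename.Substring(0, 3)
        if (-not $groups.Contains($prefix)) {
            $groups[$prefix] = [System.Collections.Generic.List[string]]::new()
        }
        $groups[$prefix].Add((Join-Path $Folder $row.Filename))
    }
    foreach ($prefix in $groups.Keys) {
        [PSCustomObject]@{
            OutputFile = Join-Path $Folder "Y${prefix}.pdf"
            InputFiles = $groups[$prefix].ToArray()
        }
    }
}

# A few rows from the question, inlined in place of Import-Csv '.\data.csv'
$rows = @'
FilePath,Filename
$FilePath1,EINCO01-174.pdf
$FilePath2,EINCO02-174.pdf
$FilePath5,EINCL01-174.pdf
$FilePath25,GINLG01-170.pdf
$FilePath26,GINLG02-166.pdf
'@ | ConvertFrom-Csv

$plan = @(Get-PdfMergePlan -Rows $rows -Folder '.')
$plan | Format-List

# Assumed PSWritePDF usage, one call per job:
# foreach ($job in $plan) { Merge-PDF -InputFile $job.InputFiles -OutputFile $job.OutputFile }
```

Printing the plan first lets you check the grouping and ordering on all 1400+ rows before any PDF is written.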

Related

Powershell - Finding the output of get-contents and searching for all occurrences in another file using wild cards

I'm trying to get the output of two separate files although I'm stuck on the wild card or contains select-string search from file A (Names) in file B (name-rank).
The contents of file A is:
adam
george
william
assa
kate
mark
The contents of file B is:
12-march-2020,Mark-1
12-march-2020,Mark-2
12-march-2020,Mark-3
12-march-2020,william-4
12-march-2020,william-2
12-march-2020,william-7
12-march-2020,kate-54
12-march-2020,kate-12
12-march-2020,kate-44
And I need to match on every occurrence of the names after the '-' so my ordered output should look like this which is a combination of both files as the output:
mark
Mark-1
Mark-2
Mark-3
william
william-2
william-4
william-7
Kate
kate-12
kate-44
kate-54
So far I only have the following and I'd be grateful for any pointers or assistance please.
import-csv (c:\temp\names.csv) |
select-string -simplematch (import-csv c:\temp\names-rank.csv -header "Date", "RankedName" | select RankedName) |
set-content c:\temp\names-and-ranks.csv
I imagine the select-string isn't going to be enough and I need to write a loop instead.
The data you give in the example does not give you much to work with, and the desired output is not that intuitive. Most of the time with PowerShell you would want to combine the data into a much richer output at the end.
But anyway, with what is given here and what you want, the code below will get you what you need. I have left comments in the code for you.
$pathDir='C:\Users\myUser\Downloads\trash'
$names="$pathDir\names.csv"
$namesRank="$pathDir\names-rank.csv"
$nameImport = Import-Csv -Path $names -Header names
$nameRankImport= Import-Csv -Path $namesRank -Header date,rankName
#create an empty array to collect the result
$list=@()
foreach($name in $nameImport){
    #get all the matching names
    $match=$nameRankImport.RankName -like "$($name.names)*"
    #add the name from the first list
    $list+=($name.names)
    #if there are any matches, add them too
    if($match){
        $list+=$match
    }
}
#Because it's a one-column list of strings, Export-Csv will not show us what we want, so use Set-Content
$list | Set-Content -Path "$pathDir\names-and-ranks.csv" -Force
For this I would use a combination of Group-Object and Where-Object to first group all "RankedName" items by the name before the dash, then filter on those names to be part of the names we got from the 'names.csv' file and output the properties you need.
# read the names from the file as string array
$names = Get-Content -Path 'c:\temp\names.csv' # just a list of names, so really not a CSV
# import the CSV file and loop through
Import-Csv -Path 'c:\temp\names-rank.csv' -Header "Date", "RankedName" |
Group-Object { ($_.RankedName -split '-')[0] } | # group on the name before the dash in the 'RankedName' property
Where-Object { $_.Name -in $names } | # use only the groups that have a name that can be found in the $names array
ForEach-Object {
$_.Name # output the group name (which is one of the $names)
$_.Group.RankedName -join [environment]::NewLine # output the group's 'RankedName' property joined with a newline
} |
Set-Content -Path 'c:\temp\names-and-ranks.csv'
Output:
Mark
Mark-1
Mark-2
Mark-3
william
william-4
william-2
william-7
kate
kate-54
kate-12
kate-44

Copy table from .txt file to a new .txt file by skipping certain lines

I have one table (.txt file) in this form:
Table: HHBB
Displayed Fields: 1 of 5 Fixed Columns: 4
-----------------------------------------------------------------------------
| |ID |NAME |Zähler |Obj |ID-RON |MANI |Felder |Nim
|----------------------------------------------------------------------------
| |007 |Kano |000001 |Lad |19283712 | |/HA |
| |007 |Bani |000002 |Bad |917391823 | |/LA |
I want to save this table into another .txt file but just want to skip the lines that match Table and Displayed Fields for example. What I tried:
If ([string]::IsNullOrWhitespace($tempInputRecord2) -or $_ -match "=|Table:|Displayed|----") {
continue
}
How can I do that?
And another question:
What is the best way to write the lines one by one into a new text file?
So you just want to remove the lines which start with Table: or Displayed Fields: and output results to a new file? Use Where-Object to filter lines, and Out-File to write them to the file:
Get-Content test.txt |
Where-Object { $_ -notlike "Table:*" -and $_ -notlike "Displayed Fields:*" } |
Out-File test2.txt
There are many ways for simple tasks:
If the header to skip occurs only once:
Get-Content test.txt|Select-Object -Skip 2|Set-Content test2.txt
A similar approach to yours with -notmatch and RegEx alternation
Get-Content test.txt|Where-Object {$_ -notmatch '^Table:|^Displayed Fields:'}|Set-Content test2.txt
When you force a complete read into memory by enclosing Get-Content in parentheses, you can write back to the same file:
(Get-Content test.txt)|Select-Object -Skip 2|Set-Content test.txt

Need to remove specific portion from rows in a csv using powershell

I have a csv file with two columns and multiple rows, which has the information of files with folder location and its corresponding size, like below
"Folder_Path","Size"
"C:\MSSQL\DATA\UsersData\FTP.txt","21345"
"C:\MSSQL\DATA\UsersData\Norman\abc.csv","78956"
"C:\MSSQL\DATA\UsersData\Market_Database\123.bak","1234456"
What I want to do is remove the "C:\MSSQL\DATA\" part from every row in the CSV, since it is repetitive, and keep the rest of the folder path (starting from UsersData) and all other data intact. So my CSV should look like this:
"Folder_Path","Size"
"UsersData\FTP.txt","21345"
"UsersData\Norman\abc.csv","78956"
"UsersData\Market_Database\123.bak","1234456"
What I am running is shown below:
Import-Csv ".\abc.csv" |
Select-Object -Property @{n='Folder_Path';e={$_.'Folder_Path'.Split('C:\MSSQL\DATA\*')[0]}}, * |
Export-Csv '.\output.csv' -NTI
Any help is appreciated!
Seems like a job for a simple string replace:
Get-Content "abc.csv" | foreach { $_.replace("C:\MSSQL\DATA\", "") } | Set-Content "output.csv"
or:
[System.IO.File]::WriteAllText("output.csv", [System.IO.File]::ReadAllText("abc.csv" ).Replace("C:\MSSQL\DATA\", ""))
This should work:
Import-Csv ".\abc.csv" |
Select-Object -Property @{n='Folder_Path';e={$_.'Folder_Path' -replace '^C:\\MSSQL\\DATA\\', ''}}, Size |
Export-Csv '.\output.csv' -NoTypeInformation
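Because -replace takes a regular expression, a literal prefix full of backslashes has to be escaped before it is used as the pattern. A minimal sketch of that, with the question's rows inlined through ConvertFrom-Csv so it runs without abc.csv; the $pattern and $trimmed variable names are only illustrative:

```powershell
# Sample rows from the question, inlined so no input file is needed
$rows = @'
"Folder_Path","Size"
"C:\MSSQL\DATA\UsersData\FTP.txt","21345"
"C:\MSSQL\DATA\UsersData\Norman\abc.csv","78956"
"C:\MSSQL\DATA\UsersData\Market_Database\123.bak","1234456"
'@ | ConvertFrom-Csv

# Escape the literal prefix so its backslashes are not read as regex escapes,
# and anchor it with ^ so only a leading occurrence is removed
$pattern = '^' + [regex]::Escape('C:\MSSQL\DATA\')

$trimmed = $rows |
    Select-Object @{ n = 'Folder_Path'; e = { $_.Folder_Path -replace $pattern, '' } }, Size

# Show the result as CSV text; use Export-Csv '.\output.csv' -NoTypeInformation to write a file
$trimmed | ConvertTo-Csv -NoTypeInformation
```

Anchoring on the known prefix keeps however many subfolders follow it, so nested paths such as UsersData\Norman\abc.csv survive intact. Note that -replace is case-insensitive by default.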

How do I merge 2 adjacent columns in a CSV file into a single column, separated by comma? (PowerShell)

I have a CSV file with 2 columns, latitude & longitude. I am trying to merge the 2 columns into 1, separated by a comma (no spaces).
Input CSV file, first 5 rows
latitude longitude
35.1868 -106.6652
42.3688 -83.4799
40.3926 -79.9052
40.5124 -88.9883
38.5352 -90.0006
My goal is to take this CSV and create a new one with a single column with both values separated by a comma (no spaces in-between) using PowerShell. See the desired output below...
location
35.1868,-106.6652
42.3688,-83.4799
40.3926,-79.9052
40.5124,-88.9883
38.5352,-90.0006
Any help will be greatly appreciated!
IMO the easiest way is a Select-Object with a calculated property:
Import-Csv .\input.csv |
Select-Object @{Name='Location';Expression={$_.latitude,$_.longitude -join ','}} |
Export-Csv .\output.csv -NoTypeInformation
> Get-Content .\output.csv
"Location"
"35.1868,-106.6652"
"42.3688,-83.4799"
"40.3926,-79.9052"
"40.5124,-88.9883"
"38.5352,-90.0006"
Edit
In case there are other columns which should not be affected by the merge,
see this modified Select-Object
Select-Object *,@{N='Location';E={$_.latitude,$_.longitude -join ','}} -Exclude latitude,longitude|
But the new column will then be the last one.
the 1st ten lines are just a way to embed sample data in a script without needing to write it to a file & then read it back in. [grin]
use Import-CSV to get the real data into the script.
# fake reading in a CSV file
# in real life, use Import-CSV
$InStuff = @'
latitude, longitude
35.1868, -106.6652
42.3688, -83.4799
40.3926, -79.9052
40.5124, -88.9883
38.5352, -90.0006
'@ | ConvertFrom-Csv
$LocationList = foreach ($IS_Item in $InStuff)
    {
    [PSCustomObject]@{
        Location = @($IS_Item.Latitude, $IS_Item.Longitude) -join ','
        }
    }
# on screen
$LocationList
# CSV file
$LocationList |
Export-Csv -LiteralPath "$env:TEMP\JohnnyCarino_LocationList.csv" -NoTypeInformation
screen output ...
Location
--------
35.1868,-106.6652
42.3688,-83.4799
40.3926,-79.9052
40.5124,-88.9883
38.5352,-90.0006
CSV file content ...
"Location"
"35.1868,-106.6652"
"42.3688,-83.4799"
"40.3926,-79.9052"
"40.5124,-88.9883"
"38.5352,-90.0006"

CSV file header changes in powershell

I have a CSV file in which I want to change the headers names.
The current header is: name,id and I want to change it to company,transit
Following is what I wrote in script:
$a = import-csv .\finalexam\employees.csv -header name,id
foreach ($a in $as[1-$as.count-1]){
# I used 1 here because I want it to ignore the existing headers.
$_.name -eq company, $_.id -eq transit
}
I don't think this is the correct way to do this.
You're overthinking this... All you want to do is replace the header row. So write the new header as the first line of a new file, then read in the old file, skip its first line, and append the rest to the new file.
"Company,Transit"|Set-Content C:\Path\To\NewFile.csv
Get-Content C:\Path\To\Old.csv | Select -skip 1 | Add-Content C:\Path\To\NewFile.csv
Something very simple like this:
$file = Get-Content C:\temp\data.csv
"new,column,name" | Set-Content C:\temp\data.csv
$file | Select-Object -Skip 1 | Add-Content C:\temp\data.csv
Collect the complete file contents first, then write the new header. Then append the rest of the file content, skipping the original header with -Skip 1.
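If you would rather treat the data as CSV objects than as raw text, the same idea can be sketched with ConvertFrom-Csv -Header: drop the old header line and supply the new column names yourself. The sample rows below are made up for illustration and stand in for the contents of the real file:

```powershell
# Made-up sample content standing in for the old CSV file
$old = @'
name,id
Contoso,101
Fabrikam,202
'@

# Split into lines, skip the old header, and parse with the new header names
$renamed = $old -split '\r?\n' |
    Select-Object -Skip 1 |
    ConvertFrom-Csv -Header company, transit

# Show the result as CSV text
$renamed | ConvertTo-Csv -NoTypeInformation
```

With real files this becomes Get-Content on the old file, then Select-Object -Skip 1, then ConvertFrom-Csv -Header company, transit, then Export-Csv with -NoTypeInformation. Unlike the plain text approach, Export-Csv re-serializes every value, so the quoting in the output may differ from the original file.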