Copy table from .txt file to a new .txt file by skipping certain lines - powershell

I have one table (.txt file) in this form:
Table: HHBB
Displayed Fields: 1 of 5 Fixed Columns: 4
-----------------------------------------------------------------------------
| |ID |NAME |Zähler |Obj |ID-RON |MANI |Felder |Nim
|----------------------------------------------------------------------------
| |007 |Kano |000001 |Lad |19283712 | |/HA |
| |007 |Bani |000002 |Bad |917391823 | |/LA |
I want to save this table into another .txt file but just want to skip the lines that match Table and Displayed Fields for example. What I tried:
If ([string]::IsNullOrWhitespace($tempInputRecord2) -or $_ -match "=|Table:|Displayed|----") {
continue
}
How can I do that?
And another question:
What is the best way to write the lines one by one into a new text file?

So you just want to remove the lines which start with Table: or Displayed Fields: and output results to a new file? Use Where-Object to filter lines, and Out-File to write them to the file:
Get-Content test.txt |
Where-Object { $_ -notlike "Table:*" -and $_ -notlike "Displayed Fields:*" } |
Out-File test2.txt

There are many ways for simple tasks:
If the header to skip occurs only once:
Get-Content test.txt|Select-Object -Skip 2|Set-Content test2.txt
A similar approach to yours with -notmatch and RegEx alternation
Get-Content test.txt|Where-Object {$_ -notmatch '^Table:|^Displayed Fields:'}|Set-Content test2.txt
If you force a complete read into memory by enclosing Get-Content in parentheses, you can write back to the same file:
(Get-Content test.txt)|Select-Object -Skip 2|Set-Content test.txt
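The filters above can also be combined to drop blank lines and the dashed separator rows, which is what the attempt in the question seems to be aiming for. This is a minimal sketch, assuming the layout shown in the question; the temporary file names and the shortened sample rows are made up for illustration.

```powershell
# Hypothetical sample files in the temp folder, mirroring the question's layout.
$source = Join-Path ([System.IO.Path]::GetTempPath()) 'table-in.txt'
$target = Join-Path ([System.IO.Path]::GetTempPath()) 'table-out.txt'
@(
    'Table: HHBB'
    'Displayed Fields: 1 of 5 Fixed Columns: 4'
    '-----------------'
    '| |ID |NAME |'
    '|----------------'
    ''
    '| |007 |Kano |'
) | Set-Content -Path $source

# Skip the two header lines, rows made only of dashes (optionally led by a pipe),
# and blank lines. Anchoring with ^ avoids dropping data rows that merely
# contain one of these words somewhere in a field.
$skip = '^(Table:|Displayed Fields:)|^\|?-+$|^\s*$'

Get-Content -Path $source |
    Where-Object { $_ -notmatch $skip } |
    Set-Content -Path $target

Get-Content -Path $target
```

Only the column-header row and the data row remain in the target file. Set-Content writes each pipeline string as its own line, so no explicit loop is needed to write the lines one by one.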

Related

Merge PDF files using CSV file list using Powershell

I want to create multiple merged PDF files from around 1400+ pdf files.
I have a data.csv file with 2 columns as below.
The PDF files with filename matching Filename column and data.csv file are in the same folder.
I need to create multiple merged PDF files and each merged PDF will have group of files that have the same First three characters in the filename.
e.g.,
The filenames starting with EIN* need to be merged into one PDF file in the same sorting order as in the data.csv file. The filename of merged PDF should be Y followed by the first three characters. so in this example it should be YEIN.pdf
This process need to be looped in until all the rows in data.csv are actioned.
sample data.csv file
FilePath Filename
$FilePath1 EINCO01-174.pdf
$FilePath2 EINCO02-174.pdf
$FilePath3 EINCO03-174.pdf
$FilePath4 EINCO04-174.pdf
$FilePath5 EINCL01-174.pdf
$FilePath6 EINCL02-174.pdf
$FilePath7 EINCL03-174.pdf
$FilePath8 EINCL04-174.pdf
$FilePath9 EINCL05-174.pdf
$FilePath10 EINCL06-174.pdf
$FilePath11 EINCL07-174.pdf
$FilePath12 EINCL08-174.pdf
$FilePath13 EINCL09-174.pdf
$FilePath14 EINCL10-174.pdf
$FilePath15 EINCL11-174.pdf
$FilePath16 EINCL12-174.pdf
$FilePath17 EINCL13-174.pdf
$FilePath18 EINCL14-174.pdf
$FilePath19 EINCL15-174.pdf
$FilePath20 EINCL16-174.pdf
$FilePath21 EINCL17-174.pdf
$FilePath22 EINCL18-174.pdf
$FilePath23 EINCL19-174.pdf
$FilePath25 GINLG01-170.pdf
$FilePath26 GINLG02-166.pdf
$FilePath27 GINLG03-159.pdf
$FilePath28 GINLG04-159.pdf
$FilePath29 GINLG05-168.pdf
$FilePath30 GINLG06-152.pdf
$FilePath31 GINNO01-174.pdf
$FilePath32 GINNO02-131.pdf
$FilePath33 GINNO04-150.pdf
$FilePath34 GINNO05-174.pdf
$FilePath35 GINTA01-130.pdf
$FilePath36 GINTA02-139.pdf
$FilePath37 GINTA03-139.pdf
So to tackle this I have created a script to split data.csv file into multiple CSV files grouped by the First three characters as below.
$data = Import-Csv '.\data.csv' |
Select-Object Filepath,Filename,@{n='Group';e={$_.Filename.Substring(0,3)}}
$data | Format-Table -GroupBy Group
Group-Object {$_.Group}| ForEach-Object {
$_.Group | Export-Csv "$($_.Group).csv" -NoTypeInformation
}
foreach ($Group in $data | Group Group)
{
$data | Where-Object {$_.Group -eq $group.name} |
ConvertTo-Csv -delimiter "`t" -NoTypeInformation |
foreach {$_.Replace('"','')} |
Out-File "$($group.name).csv"
}
From here, I am unable to proceed to next step to achieve what I need to. I presume there could be a better way to do this.
PS: I have installed PSWritePDF module on my machine.
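One possible direction, sketched under assumptions: the intermediate per-group CSV files may not be needed, because Group-Object keeps the rows of each group in their original order. The sample rows and the C:\pdf folder below are hypothetical stand-ins for the result of Import-Csv on data.csv. The Merge-PDF call and its -InputFile and -OutputFile parameters are written from memory of the PSWritePDF module and should be checked against its documentation, so the call is left commented out.

```powershell
# Hypothetical rows standing in for: Import-Csv '.\data.csv'
$data = @(
    [pscustomobject]@{ FilePath = 'C:\pdf'; Filename = 'EINCO01-174.pdf' }
    [pscustomobject]@{ FilePath = 'C:\pdf'; Filename = 'EINCO02-174.pdf' }
    [pscustomobject]@{ FilePath = 'C:\pdf'; Filename = 'GINLG01-170.pdf' }
    [pscustomobject]@{ FilePath = 'C:\pdf'; Filename = 'EINCL01-174.pdf' }
)

# One plan entry per three-character prefix; rows keep their CSV order inside a group.
$plan = $data |
    Group-Object { $_.Filename.Substring(0, 3) } |
    ForEach-Object {
        [pscustomobject]@{
            OutputFile = 'Y{0}.pdf' -f $_.Name
            InputFile  = @($_.Group | ForEach-Object {
                [System.IO.Path]::Combine($_.FilePath, $_.Filename)
            })
        }
    }

foreach ($item in $plan) {
    '{0} <- {1} file(s)' -f $item.OutputFile, $item.InputFile.Count
    # Assumed PSWritePDF usage; verify the parameter names before enabling:
    # Merge-PDF -InputFile $item.InputFile -OutputFile $item.OutputFile
}
```

Building the plan first makes it easy to review which files end up in which merged PDF before anything is written.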

Powershell CSV removing rows and then remove from whole file if A column matches

I've created the following small script to remove 2++ strings from a CSV.
Each row is a log of a given person and an answer they gave.
The CSV has X columns.
The column named FIRST identifies the person.
What I need to do is when I delete a row matching the answer, I also need to delete the person from the whole CSV if it had one of the two strings.
What I've made so far, removes the row of people having the answers but the person is still left in the overall CSV with other answers. I want to remove the person fully if the questions have been answered.
Can somebody help me out with making the addition or changes to make this happen?
INPUT File
FIRST,LAST,ADDR,ADDR2,GENDER,HOME,WORK
1,N/A,N/A,N/A,N/A,BAF,N/A
10005,JAS,AA,N/A,,ZAV,N/A
10007,JADE,BB,N/A,OMA,N/A,N/A
10007,JADE,N/A,RAV,N/A,N/A,N/A
10011,KIAH,N/A,N/A,BALI,BB,N/A
SCRIPT
$CSVfile = "C:\Temp\Test\Test.csv"
$CSVfile_filtered = "C:\Temp\Test\Test.csv"
$regex001 = "AA"
$regex002 = "BB"
$filterArray = @($regex001,$regex002)
Get-Content $CSVfile | Select-String -pattern $filterArray -notmatch | Set-Content $CSVfile_filtered
The file should then remove 10005, 10011 and both lines of 10007. But my version only removes one of the 10007 since it only matches one of the two patterns.
Using more of PowerShell's built-in cmdlets can make this a little easier to manage.
# Assuming searching only properties ADDR and ADDR2
$filter = 'AA','BB'
# Grouping by First and Last values to easily remove duplicates
# -match uses regex so | is needed for an OR of multiple items
Import-Csv Test.csv | Group-Object First,Last |
Where {!($_.Group.ADDR,$_.Group.ADDR2 -match ($filter -join '|'))} |
Foreach-Object Group |
Export-Csv output.csv -NoType
You would think strictly using text manipulation would be simpler, but it adds other scenarios to consider:
You will need to track users that have duplicate entries and potentially back track to remove them (if not grouping). This could require reading the file contents twice.
Your header row could match the string you want to filter so you will need to add it to the output if filtering removes it.
Keeping the scenarios above in mind, you can still use a grouping concept:
$filter = 'AA','BB'
$file = Get-Content Test.csv
# $file[0] is the header row
# -split string uses regex and splits at the second comma
# -split results' [0] element is First,Last values
$file[0],($file |
Select-Object -Skip 1 |
Group-Object {($_ -split '(?<=^[^,]*,[^,]*),')[0]} |
where {!($_.Group -match ($filter -join '|'))} |
Foreach-Object Group) | Set-Content output.csv
If I got it right you could do something like this:
$SearchPattern = 'AA', 'BB'
$INPUTCSV = @'
FIRST,LAST,ADDR,ADDR2,GENDER,HOME,WORK
1,N/A,N/A,N/A,N/A,BAF,N/A
10005,JAS,AA,N/A,,ZAV,N/A
10007,JADE,BB,N/A,OMA,N/A,N/A
10007,JADE,N/A,RAV,N/A,N/A,N/A
10011,KIAH,N/A,N/A,BALI,BB,N/A
'@ | ConvertFrom-Csv
$ActualSearchPattern =
$INPUTCSV |
Where-Object {
$_.LAST -in $SearchPattern -or
$_.ADDR -in $SearchPattern -or
$_.ADDR2 -in $SearchPattern -or
$_.GENDER -in $SearchPattern -or
$_.HOME -in $SearchPattern -or
$_.Work -in $SearchPattern
} |
Select-Object -ExpandProperty FIRST
$INPUTCSV |
Where-Object -Property FIRST -NotIn -Value $ActualSearchPattern |
Format-Table -AutoSize
There might be more sophisticated or more elegant ways, but I cannot think of one at the moment. ;-)
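The two-pass idea above can be written without naming every column, by walking each row's properties. This is a sketch rather than a tested drop-in: it assumes that FIRST is the identifying column and that a cell has to equal one of the filter values exactly; the variable names are illustrative.

```powershell
$filter = 'AA', 'BB'

$rows = @'
FIRST,LAST,ADDR,ADDR2,GENDER,HOME,WORK
1,N/A,N/A,N/A,N/A,BAF,N/A
10005,JAS,AA,N/A,,ZAV,N/A
10007,JADE,BB,N/A,OMA,N/A,N/A
10007,JADE,N/A,RAV,N/A,N/A,N/A
10011,KIAH,N/A,N/A,BALI,BB,N/A
'@ | ConvertFrom-Csv

# Pass 1: collect the FIRST value of every row in which any other column
# equals one of the filter values.
$flagged = $rows |
    Where-Object {
        $_.PSObject.Properties |
            Where-Object { $_.Name -ne 'FIRST' -and $_.Value -in $filter }
    } |
    Select-Object -ExpandProperty FIRST -Unique

# Pass 2: drop every row that belongs to a flagged person.
$kept = $rows | Where-Object { $_.FIRST -notin $flagged }
$kept | ConvertTo-Csv -NoTypeInformation
```

With the sample data, 10005, 10007 and 10011 are flagged, so only the row with FIRST equal to 1 is kept, including no leftover line for 10007.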
There is a nice PowerShell module you can use to manipulate the content of a csv or xlsx file: ImportExcel
This gives you a lot of options to manipulate the sheets, columns etc.

Need to remove specific portion from rows in a csv using powershell

I have a csv file with two columns and multiple rows, which has the information of files with folder location and its corresponding size, like below
"Folder_Path","Size"
"C:\MSSQL\DATA\UsersData\FTP.txt","21345"
"C:\MSSQL\DATA\UsersData\Norman\abc.csv","78956"
"C:\MSSQL\DATA\UsersData\Market_Database\123.bak","1234456"
What I want to do is remove the "C:\MSSQL\DATA\" part from every row in the csv and keep the rest of the folder path, starting from UsersData, with all other data intact, as this prefix is repetitive. So my csv should look like this:
"Folder_Path","Size"
"UsersData\FTP.txt","21345"
"UsersData\Norman\abc.csv","78956"
"UsersData\Market_Database\123.bak","1234456"
What i am running is as below
Import-Csv ".\abc.csv" |
Select-Object -Property @{n='Folder_Path';e={$_.'Folder_Path'.Split('C:\MSSQL\DATA\*')[0]}}, * |
Export-Csv '.\output.csv' -NTI
Any help is appreciated!
Seems like a job for a simple string replace:
Get-Content "abc.csv" | ForEach-Object { $_.Replace("C:\MSSQL\DATA\", "") } | Set-Content "output.csv"
or:
[System.IO.File]::WriteAllText("output.csv", [System.IO.File]::ReadAllText("abc.csv" ).Replace("C:\MSSQL\DATA\", ""))
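This should work (the pattern is anchored to the literal prefix, so deeper paths such as UsersData\Norman\abc.csv keep all of their remaining folders):
Import-Csv ".\abc.csv" |
Select-Object -Property @{n='Folder_Path';e={$_.'Folder_Path' -replace '^C:\\MSSQL\\DATA\\', ''}}, Size |
Export-Csv '.\output.csv' -NoTypeInformation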
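If the prefix lives in a variable, [regex]::Escape saves writing the doubled backslashes by hand. The following is a small sketch using inline sample rows instead of abc.csv; $prefix, $pattern and $trimmed are illustrative names.

```powershell
$prefix = 'C:\MSSQL\DATA\'

$rows = @'
"Folder_Path","Size"
"C:\MSSQL\DATA\UsersData\FTP.txt","21345"
"C:\MSSQL\DATA\UsersData\Norman\abc.csv","78956"
'@ | ConvertFrom-Csv

# Escape the literal prefix so its backslashes are not read as regex escapes,
# and anchor it with ^ so only a leading occurrence is removed.
$pattern = '^' + [regex]::Escape($prefix)

$trimmed = $rows |
    Select-Object @{ n = 'Folder_Path'; e = { $_.Folder_Path -replace $pattern, '' } }, Size

$trimmed | ConvertTo-Csv -NoTypeInformation
```

Note that -replace is case-insensitive, whereas the .Replace() string method is case-sensitive; which one fits depends on how consistently the paths are written.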

CSV file header changes in powershell

I have a CSV file in which I want to change the headers names.
The current header is: name,id and I want to change it to company,transit
Following is what I wrote in script:
$a = import-csv .\finalexam\employees.csv -header name,id
foreach ($a in $as[1-$as.count-1]){
# I used 1 here because I want it to ignore the existing headers.
$_.name -eq company, $_.id -eq transit
}
I don't think this is the correct way to do this.
You're overthinking this. All you want to do is replace the header row, so write the new header to a new file, then read the old file, skip its first line, and append the remaining lines to the new file.
"Company,Transit"|Set-Content C:\Path\To\NewFile.csv
Get-Content C:\Path\To\Old.csv | Select -skip 1 | Add-Content C:\Path\To\NewFile.csv
Something very simple like this:
$file = Get-Content C:\temp\data.csv
"new,column,name" | Set-Content C:\temp\data.csv
$file | Select-Object -Skip 1 | Add-Content C:\temp\data.csv
Collect the complete file contents, write the new header, and then append the rest of the file content while skipping the original header with -Skip 1.
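A CSV-aware variant of the same idea is to let Import-Csv assign the new names with -Header and then skip the old header row, which is read as ordinary data in that case. This is a sketch with a made-up sample file in the temp folder; the company names are placeholders.

```powershell
# Hypothetical sample file with the old header.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'employees-sample.csv'
'name,id', 'Contoso,17', 'Fabrikam,42' | Set-Content -Path $path

# -Header supplies the new column names, so the original header line
# arrives as the first data row and has to be skipped.
$renamed = Import-Csv -Path $path -Header company, transit |
    Select-Object -Skip 1

# The rows are fully read into $renamed before the file is overwritten.
$renamed | Export-Csv -Path $path -NoTypeInformation

Get-Content -Path $path
```

Unlike the plain-text approaches, Export-Csv may add quotes around the values, so the text-based versions are the better fit when the file must stay byte-for-byte the same apart from the header.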

How to change column position in powershell?

Is there an easy way to change column position? I'm looking for a way to move column 1 from the beginning to the end of each row, and I would also like to add a zero column as the second-to-last column. Please see the txt file example below.
Thank you for any suggestions.
File sample
TEXT1,02/10/2015,55.930,57.005,55.600,56.890,1890
TEXT2,02/10/2015,51.060,52.620,50.850,52.510,4935
TEXT3,02/10/2015,50.014,50.74,55.55,52.55,5551
Output:
02/10/2015,55.930,57.005,55.600,56.890,1890,0,TEXT1
02/10/2015,51.060,52.620,50.850,52.510,4935,0,TEXT2
02/10/2015,50.014,50.74,55.55,52.55,5551,0,TEXT3
Another option:
#Prepare test file
(@'
TEXT1,02/10/2015,55.930,57.005,55.600,56.890,1890
TEXT2,02/10/2015,51.060,52.620,50.850,52.510,4935
TEXT3,02/10/2015,50.014,50.74,55.55,52.55,5551
'@).split("`n") |
foreach {$_.trim()} |
Set-Content testfile.txt
#Script starts here
$file = 'testfile.txt'
(get-content $file -ReadCount 0) |
foreach {
'{1},{2},{3},{4},{5},{6},0,{0}' -f $_.split(',')
} | Set-Content $file
#End of script
#show results
get-content $file
02/10/2015,55.930,57.005,55.600,56.890,1890,0,TEXT1
02/10/2015,51.060,52.620,50.850,52.510,4935,0,TEXT2
02/10/2015,50.014,50.74,55.55,52.55,5551,0,TEXT3
Sure, split on commas, spit the results back minus the first result joined by commas, add a 0, and then add the first result to the end and join the whole thing with commas. Something like:
# $Input is an automatic variable in PowerShell, so the sample data uses another name
$lines = @"
TEXT1,02/10/2015,55.930,57.005,55.600,56.890,1890
TEXT2,02/10/2015,51.060,52.620,50.850,52.510,4935
TEXT3,02/10/2015,50.014,50.74,55.55,52.55,5551
"@ -split "`n"|ForEach{$_.trim()}
$lines|ForEach{
$split = $_.split(',')
($Split[1..($split.count-1)]-join ','),0,$split[0] -join ','
}
I created the file test.txt to contain your sample data. I assigned each field a name ("one", "two", "three" etc.) so that I could select them by name, then selected them and exported back to csv in the order you wanted.
First, add the zero to the end, it will end up as second last.
gc .\test.txt | %{ "$_,0" } | Out-File test1.txt
Then, rearrange order.
Import-Csv .\test1.txt -Header "one","two","three","four","five","six","seven","eight" | Select-Object -Property two,three,four,five,six,seven,eight,one | Export-Csv test2.txt -NoTypeInformation
This will take the output file and get rid of quotes and header line if you would rather not have them.
gc .\test2.txt | %{ $_.replace('"','')} | Select-Object -Skip 1 | out-file test3.txt