Powershell Searching within a text file and exporting horizontally to CSV

I have a big file which contains some blocks of text data like this:
[SERVER1]
LIBELLE=DATA, SOMME DATA
VARIABLES=A,B,C,D,E
PHYSICAL NAME=E:\SOMME\PATH\FILE.INI
ARTICLE SIZE=50
MACHINE=SOME SERVER
PARAMETER =
OPTION =
[SERVER2]
LIBELLE=DATA2, SOMME DATA2
VARIABLES=A,B,C,D,E
PHYSICAL NAME=Z:\SOMME\PATH\FILE2.INI
ARTICLE SIZE=150
MACHINE=SOME SERVER XY
PARAMETER =
OPTION 1 = VOID
OPTION 2 =
OPTION 3 =
OPTION 4 =
OPTION 5 =
What I would like to do is retrieve every [SERVERX] block and put it into a CSV file in this format (horizontally):
ColumnA__ | ColumnB (LIBELLE)___ | ColumnC (VARIABLES) | ColumnD etc...
[SERVER1] | DATA1, SOMME DATA1 | A,B,C,D,E___________ | etc...
[SERVER2] | DATA2, SOMME DATA2 | A,B,C,D,E___________ | etc...
I've tried this. The output looks the way I want, but it needs to be automated and exported to CSV, which doesn't work for me.
$mydata = Get-Content my_file.txt
write-Host $mydata[0] $mydata[1] $mydata[2] $mydata[3] $mydata[4] $mydata[5] | Export-Csv -Path $rep\results.csv -Force -UseCulture -NoTypeInformation
I also tried something with Select-String, but I don't know if this is the right way to do the job.
select-String -path $my_file -Pattern '\[*\]', 'IDENTIFIANTS=','LIBELLE=','VARIABLES=' | Select-Object -Property LineNumber, Line | Export-Csv -Path $rep\results.csv -Force -UseCulture -NoTypeInformation
Thanks for your advice.

So, I would read the whole file in as a multi-line string using the -Raw switch for Get-Content. Then split the file up on the [ character to denote records. Then get the properties from the ConvertFrom-StringData cmdlet (you have to prepend "SERVER=" to each record), and make an object from them. Then we find out all the properties any given record can have, and add the missing ones to the first record (this is done because when you export to CSV the columns are based on the first entry's property list). Then you can export a CSV.
$Data = (Get-Content my_file.txt -Raw) -split "(\[[^[]+)" | ?{![string]::IsNullOrWhiteSpace($_)}
$Records = $Data -replace '\\','\\'|%{$Record="SERVER="+$_.trim()|ConvertFrom-StringData;New-Object PSObject -Prop $Record}
$Props = $Records|%{$_.psobject.properties.name}|select -Unique
$Props | Where{$_ -notin $Records[0].PSObject.Properties.Name}|%{Add-Member -InputObject $Records[0] -NotepropertyName $_ -NotepropertyValue $Null}
$Records|Export-CSV .\my_file.csv -notype
Edit: For those of you out there running PowerShell 2.0 (3 versions out of date at this point in time), you can't use the -Raw parameter. Here's the alternative:
$Data = (Get-Content my_file.txt) -Join "`r`n" -split "(\[[^[]+)" | ?{![string]::IsNullOrWhiteSpace($_)}
Alternative: Thanks @Matt for the suggestion, it is always good to have a different point of view on these things. As Matt suggested, you can use Out-String to combine the array of strings that Get-Content generates, and end up with a single multi-line string. Here's the usage!
$Data = (GC my_file.txt | Out-String) -split "(\[[^[]+)" | ?{![string]::IsNullOrWhiteSpace($_)}
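For comparison, here is a minimal sketch of a line-by-line variant that avoids ConvertFrom-StringData and does not need to patch the first record. It is not the code from the answer above. The $lines array is a shortened, hypothetical stand-in for Get-Content my_file.txt, and the names $records, $current and $columns are made up for illustration. The idea is to collect each block into an ordered hashtable, then pass every record through Select-Object with the full list of property names so all rows end up with the same columns.

```powershell
# Hypothetical, shortened stand-in for: $lines = Get-Content my_file.txt
$lines = @(
    '[SERVER1]'
    'LIBELLE=DATA, SOMME DATA'
    'ARTICLE SIZE=50'
    '[SERVER2]'
    'LIBELLE=DATA2, SOMME DATA2'
    'OPTION 1 = VOID'
)

$current = $null
$records = @(
    foreach ($line in $lines) {
        if ($line -match '^\[.+\]$') {
            # A new [SERVERX] header: emit the previous block, start a new one
            if ($null -ne $current) { [pscustomobject]$current }
            $current = [ordered]@{ SERVER = $line }
        }
        elseif ($null -ne $current -and $line -match '^([^=]+)=(.*)$') {
            # KEY=VALUE line: trim both sides, empty values are kept as ''
            $current[$Matches[1].Trim()] = $Matches[2].Trim()
        }
    }
    # Emit the last block, which has no header after it
    if ($null -ne $current) { [pscustomobject]$current }
)

# Union of all property names, in order of first appearance
$columns = $records | ForEach-Object { $_.PSObject.Properties.Name } | Select-Object -Unique

# Select-Object adds the missing properties (as $null) to every record
$csv = $records | Select-Object -Property $columns | ConvertTo-Csv -NoTypeInformation
$csv
```

To write a file instead of strings, the last pipeline could end in Export-Csv -NoTypeInformation with a path of your choice.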

Related

Powershell CSV removing rows and then remove from whole file if A column matches

I've created the following small script to remove rows matching two or more strings from a CSV.
Each row is a log of a given person and an answer they gave.
The CSV has X columns.
The column named FIRST identifies the person.
What I need to do is when I delete a row matching the answer, I also need to delete the person from the whole CSV if it had one of the two strings.
What I've made so far, removes the row of people having the answers but the person is still left in the overall CSV with other answers. I want to remove the person fully if the questions have been answered.
Can somebody help me out with making the addition or changes to make this happen?
INPUT File
FIRST,LAST,ADDR,ADDR2,GENDER,HOME,WORK
1,N/A,N/A,N/A,N/A,BAF,N/A
10005,JAS,AA,N/A,,ZAV,N/A
10007,JADE,BB,N/A,OMA,N/A,N/A
10007,JADE,N/A,RAV,N/A,N/A,N/A
10011,KIAH,N/A,N/A,BALI,BB,N/A
SCRIPT
$CSVfile = "C:\Temp\Test\Test.csv"
$CSVfile_filtered = "C:\Temp\Test\Test.csv"
$regex001 = "AA"
$regex002 = "BB"
$filterArray = @($regex001,$regex002)
Get-Content $CSVfile | Select-String -pattern $filterArray -notmatch | Set-Content $CSVfile_filtered
The file should then remove 10005, 10011 and both lines of 10007. But my version only removes one of the 10007 since it only matches one of the two patterns.

Using more of PowerShell's built-in cmdlets can make this a little easier to manage.
# Assuming searching only properties ADDR and ADDR2
$filter = 'AA','BB'
# Grouping by First and Last values to easily remove duplicates
# -match uses regex so | is needed for an OR of multiple items
Import-Csv Test.csv | Group-Object First,Last |
Where {!($_.Group.ADDR,$_.Group.ADDR2 -match ($filter -join '|'))} |
Foreach-Object Group |
Export-Csv output.csv -NoType
You would think strictly using text manipulation would be simpler, but it adds other scenarios to consider:
You will need to track users that have duplicate entries and potentially back track to remove them (if not grouping). This could require reading the file contents twice.
Your header row could match the string you want to filter so you will need to add it to the output if filtering removes it.
Keeping the scenarios above in mind, you can still use a grouping concept:
$filter = 'AA','BB'
$file = Get-Content Test.csv
# $file[0] is the header row
# -split string uses regex and splits at the second comma
# -split results' [0] element is First,Last values
$file[0],($file |
Select-Object -Skip 1 |
Group-Object {($_ -split '(?<=^[^,]*,[^,]*),')[0]} |
where {!($_.Group -match ($filter -join '|'))} |
Foreach-Object Group) | Set-Content output.csv

If I got it right you could do something like this:
$SearchPattern = 'AA', 'BB'
$INPUTCSV = @'
FIRST,LAST,ADDR,ADDR2,GENDER,HOME,WORK
1,N/A,N/A,N/A,N/A,BAF,N/A
10005,JAS,AA,N/A,,ZAV,N/A
10007,JADE,BB,N/A,OMA,N/A,N/A
10007,JADE,N/A,RAV,N/A,N/A,N/A
10011,KIAH,N/A,N/A,BALI,BB,N/A
'@ | ConvertFrom-Csv
$ActualSearchPattern =
$INPUTCSV |
Where-Object {
$_.LAST -in $SearchPattern -or
$_.ADDR -in $SearchPattern -or
$_.ADDR2 -in $SearchPattern -or
$_.GENDER -in $SearchPattern -or
$_.HOME -in $SearchPattern -or
$_.Work -in $SearchPattern
} |
Select-Object -ExpandProperty FIRST
$INPUTCSV |
Where-Object -Property FIRST -NotIn -Value $ActualSearchPattern |
Format-Table -AutoSize
There might be more sophisticated or more elegant ways but I cannot think about one at the moment. ;-)
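One possible way to shorten the approach above is to loop over each row's properties instead of listing every column by name. This is only a sketch under the same assumptions as the answer: FIRST identifies the person and a cell must equal 'AA' or 'BB' exactly. The names $rows, $flagged and $kept are made up for illustration.

```powershell
$filter = 'AA', 'BB'
$rows = @'
FIRST,LAST,ADDR,ADDR2,GENDER,HOME,WORK
1,N/A,N/A,N/A,N/A,BAF,N/A
10005,JAS,AA,N/A,,ZAV,N/A
10007,JADE,BB,N/A,OMA,N/A,N/A
10007,JADE,N/A,RAV,N/A,N/A,N/A
10011,KIAH,N/A,N/A,BALI,BB,N/A
'@ | ConvertFrom-Csv

# Pass 1: collect the FIRST value of every row where any other column
# holds one of the filter strings (exact, case-insensitive match)
$flagged = $rows |
    Where-Object {
        $_.PSObject.Properties |
            Where-Object { $_.Name -ne 'FIRST' -and $_.Value -in $filter }
    } |
    Select-Object -ExpandProperty FIRST -Unique

# Pass 2: drop every row that belongs to a flagged person
$kept = @($rows | Where-Object { $_.FIRST -notin $flagged })
$kept | Format-Table -AutoSize
```

With the sample data, 10005, 10007 and 10011 are flagged, so both 10007 rows are dropped and only the row with FIRST = 1 remains.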

There is a nice PowerShell module you can use to manipulate the content of a CSV or XLSX file: ImportExcel
This gives you a lot of options to manipulate the sheets, columns, etc.

Read CSV row 1 columns and save them to variables

I would like to read data from CSV or other text files. Only row 1 should be read: a few of its columns should be saved to variables, and after saving, the row should be deleted. So far I have done it like this:
Get-ChildItem -Path C:\path | ForEach-Object -Process {
$YourContent = Get-Content -Path $_.FullName
$YourVariable = $YourContent | Select-Object -First 1
$YourContent | Select-Object -Skip 1 | Set-Content -Path $_.FullName
}
My problem is that my variable prints out like this:
Elvis;867.5390;elvis@geocities.com
So I would like to save each value to its own variable. Example of what the CSV could look like:
Elvis | 867.5309 | Elvis@Geocities.com
Sammy | 555.1234 | SamSosa@Hotmail.com

Use Import-Csv instead of Get-Content:
Import-Csv file.csv -Delimiter ";" -Header A, B, C

here's one way to do what i think you want.
the 1st 8 lines make a file to work with. [grin]
line 10 reads in that file
lines 11-13 convert the 1st line into an object & remove the unwanted property
lines 14-15 grab all BUT the 1st line & send it to overwrite the source file
the remaining lines show what was done [grin]
Code:
$FileName = "$env:TEMP\Pimeydentimo.txt"
# create a file to work with
@'
Alfa;123.456;Some unwanted info;Alfa@example.com
Bravo;234.567;More info that can be dropped;Bravo@example.com
Charlie;345.678;This is also ignoreable;Charlie@example.com
'@ | Set-Content -LiteralPath $FileName
$InStuff = Get-Content -LiteralPath $FileName
$TempObject = $InStuff[0] |
ConvertFrom-Csv -Delimiter ';' -Header 'Name', 'Number', 'DropThisOne', 'Email' |
Select-Object -Property * -ExcludeProperty DropThisOne
$InStuff[1..$InStuff.GetUpperBound(0)] |
Set-Content -LiteralPath $FileName
$InStuff
'=' * 30
$TempObject
'=' * 30
Get-Content -LiteralPath $FileName
output ...
Alfa;123.456;Some unwanted info;Alfa@example.com
Bravo;234.567;More info that can be dropped;Bravo@example.com
Charlie;345.678;This is also ignoreable;Charlie@example.com
==============================
Name Number Email
---- ------ -----
Alfa 123.456 Alfa@example.com
==============================
Bravo;234.567;More info that can be dropped;Bravo@example.com
Charlie;345.678;This is also ignoreable;Charlie@example.com

Thanks for the answers!
I'll try to clarify a bit more what I was trying to do. The answers might do it already, but I'm not that good at PowerShell yet and am still learning a lot.
If I have a CSV or any other text file, I want to read the first row of the file. The row contains more than one piece of information. I also want to save each piece of information to its own variable. After saving the information to variables, I would like to delete the row.
Example:
Car Model Year
Ford Fiesta 2015
Audi A6 2018
In this example, I would like to save Ford, Fiesta and 2015 to variables (row 1) ($Car, $Model, $Year) and then delete the row. The 2nd row should not be deleted, because it is used later on.
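A minimal sketch of what is described here, under a few assumptions: the file has no header row, the values are separated by semicolons as in the earlier Elvis sample, and the demo file name in the temp folder is made up. The first line is split and assigned to three variables in one statement, then everything except that line is written back to the same file.

```powershell
# Hypothetical demo file; in practice $path would point at the real file
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'first-row-demo.txt'
Set-Content -LiteralPath $path -Value 'Ford;Fiesta;2015', 'Audi;A6;2018'

# Read the whole file into memory first, so it can be overwritten afterwards
$content = @(Get-Content -LiteralPath $path)

# Multiple assignment: one variable per field of the first row
$Car, $Model, $Year = $content[0] -split ';'

# Write back everything except the first row
$content | Select-Object -Skip 1 | Set-Content -LiteralPath $path

"$Car / $Model / $Year"
Get-Content -LiteralPath $path
```

If the file uses another separator, only the -split argument should need to change. If a row has more fields than variables, the last variable receives the remaining fields as an array.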

I use -NoTypeInformation so why do I get header back when using Out-File?

I filtered this file, data1.csv, by date:
2017.11.1,09:55,1.1,1.2,1.3,1.4,1
2017.11.2,09:55,1.5,1.6,1.7,1.8,2
I don't get a header with -NoTypeInformation:
$CutOff = (Get-Date).AddDays(-2)
$filePath = "data1.csv"
$Data = Import-Csv $filePath -Header Date,Time,A,B,C,D,E
$Data2 = $Data | Where-Object {$_.Date -as [datetime] -gt $Cutoff} | convertto-csv -NoTypeInformation -Delimiter "," | % {$_ -replace '"',''}
But when rewriting with Out-File
$Data2 | Out-File "data2.csv" -Encoding utf8 -Force
I get header back as data2.csv contains:
Date,Time,A,B,C,D,E
2017.11.2,09:55,1.5,1.6,1.7,1.8,2
Why do I have Date,Time,A,B,C,D,E ?

-NoTypeInformation is not about the header but the data type of the rows in the file. Remove it to see what shows up. From Microsoft
Omits the type information header from the output. By default, the string in the output contains #TYPE followed by the fully-qualified name of the object type.
Emphasis mine.
CSVs need headers. That is why it is making one. If you don't want to see the header in the output use Select-Object -Skip 1 to remove it.
$Data |
Where-Object {$_.Date -as [datetime] -gt $Cutoff} |
ConvertTo-CSV -NoTypeInformation -Delimiter "," |
Select-Object -Skip 1 |
% {$_ -replace '"'}
I would not pipe Out-File to itself. You could pipe to Set-Content here just as well.
I am guessing this whole process is to keep the source file in the same state just with some lines filtered out based on date. You could skip most of this just by parsing the date out in each line.
$threshold = (Get-Date).AddDays(-2)
$filePath = "c:\temp\bagel.txt"
(Get-Content $filePath) | Where-Object{
$date,$null=$_.Split(",",2)
[datetime]$date -gt $threshold
} | Set-Content $filePath
Now you don't have to worry about PowerShell CSV object structure or output since we act on the raw data of the file itself.
That will take each line of the input file and filter it out if the parsed date does not match the threshold. Change encoding on the input output cmdlets as you see necessary. What $date,$null=$_.Split(",",2) is doing is splitting the line on the comma into 2 parts. First of which becomes $date and since this is just a filtering condition we dump the rest of the line into $null.

Properly-formed CSV files must have column headers. Your use of -NoTypeInformation in generating the CSV does not affect column headers; instead, it affects whether the PowerShell object type information is included. If you Export-CSV without -NoTypeInformation, the first line of your CSV file will have a line that looks like #TYPE System.PSCustomObject, which you don't want if you're going to open the CSV in a spreadsheet program.
If you subsequently Import-CSV, the headers (Date, Time, A, B, C) are used to create the fields of a PSObject, so that you can refer to them using the standard dot notation (e.g., $CSV[$line].Date).
The ability to specify -Header on Import-CSV is essentially a "hack" to allow the cmdlet to handle files that are comma-separated, but which did not include column headers.
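To see the difference for yourself, here is a small sketch with made-up in-memory rows (the $rows objects are hypothetical and only borrow three column names from the question). It shows that ConvertTo-Csv -NoTypeInformation still emits the column header as its first line, and that Select-Object -Skip 1 is what drops it.

```powershell
# Hypothetical rows standing in for the imported data
$rows = @(
    [pscustomobject]@{ Date = '2017.11.1'; Time = '09:55'; E = '1' }
    [pscustomobject]@{ Date = '2017.11.2'; Time = '09:55'; E = '2' }
)

# -NoTypeInformation suppresses only the "#TYPE ..." line, not the header
$withHeader = @($rows | ConvertTo-Csv -NoTypeInformation)

# Skipping the first output line removes the column header
$withoutHeader = @($rows | ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1)

$withHeader
'-' * 20
$withoutHeader
```

$withHeader holds three lines (the header plus two data rows), while $withoutHeader holds only the two data rows.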

How can I shift column values and add new ones in a CSV

I have to create a new column in my CSV data with PowerShell.
There is my code:
$csv = Import-Csv .\test1.csv -Delimiter ';'
$NewCSVObject = @()
foreach ($item in $csv)
{
$NewCSVObject += $item | Add-Member -name "ref" -value " " -MemberType NoteProperty
}
$NewCSVObject | export-csv -Path ".\test2.csv" -NoType
$csv | Export-CSV -Path ".\test2.csv" -NoTypeInformation -Delimiter ";" -Append
When I open the file, the column is there but at the right, and I would like to have it at the left, as column A. I also don't know if I can export the two objects in one line like this (it doesn't work):
$csv,$NewCSVObject | Export-CSV -Path ".\test2.csv" -NoTypeInformation -Delimiter ";" -Append
The input file (It would have more lines than just the one):
A B C D E F G H
T-89 T-75 T-22 Y-23 Y-7 Y-71
The current output file:
A B C D E F G H
Y-23 Y-7 Y-71 ref: ref2:
The expected result in the Excel table, displaying "ref:" and "ref2:" before the product columns:
A B C D E F G H
ref: T-89 T-75 T-22 ref2: Y-23 Y-7 Y-71

This might be simpler if we just treat the file as a flat text file and save it in a CSV format. You could use the CSV objects and shift the values into other columns, but that is not really necessary. Your approach of adding columns via Add-Member does not accomplish this goal, as it adds new columns and would not match your desired output. Export-CSV also expects the objects it writes to a file to share the same properties; you were mixing objects with different properties, which gave you unexpected results.
This is a verbose way of doing this. You could shorten this easily with something like regular expressions (see below). I opted for this method since it is a little easier to follow what is going on.
# Equivalent to Get-Content $filepath. This just shows what I am doing and is a portable solution.
$fileContents = "A;B;C;D;E;F;G;H",
"T-89;T-75;T-22;Y-23;Y-7;Y-71",
"T-89;T-75;T-22;Y-23;Y-7;Y-71"
$newFile = "C:\temp\csv.csv"
# Write the header to the output file.
$fileContents[0] | Set-Content $newFile
# Process the rest of the lines.
$fileContents | Select-Object -Skip 1 | ForEach-Object{
# Split the line into its elements
$splitLine = $_ -split ";"
# Rejoin the elements. adding the ref strings
(@("ref:") + $splitLine[0..2] + "ref2:" + $splitLine[3..5]) -join ";"
} | Add-Content $newFile
What the last line is doing is concatenating arrays. It starts with "ref:", adds the first 3 elements of the split line, followed by "ref2:" and the remaining elements. That new array is joined on semicolons and sent down the pipe to be written to the file.
If you are willing to give regex a shot this could be done with less code.
$fileContents = Get-Content "C:\source\file\path.csv"
$newFile = "C:\new\file\path.csv"
$fileContents[0] | Set-Content $newFile
($fileContents | Select-Object -Skip 1) -replace "((?:.*?;){3})(.*)",'ref:;$1ref2:;$2' | Add-Content $newFile
What that does is split each line beyond the first at the 3rd semicolon. The replacement string is built from the ref strings and the matched content.

You can use Select-Object to specify order.
Assuming your headers are A-H (I know that instead of A it should be ref, from the code, but not sure if T-89 etc are your other headers)
$NewCSVObject | Select-Object A,B,C,D,E,F,G,H | Export-Csv -Path ".\test2.csv" -NoType

Powershell removing columns and rows from CSV

I'm having trouble making some changes to a series of CSV files, all with the same data structure. I'm trying to combine all of the files into one CSV file or one tab delimited text file (don't really mind), however each file needs to have 2 empty rows removed and two of the columns removed, below is an example:
col1,col2,col3,col4,col5,col6 <-remove
col1,col2,col3,col4,col5,col6 <-remove
col1,col2,col3,col4,col5,col6
col1,col2,col3,col4,col5,col6
^ ^
remove remove
End Result:
col1,col2,col4,col6
col1,col2,col4,col6
This is my attempt at doing this (I'm very new to Powershell)
$ListofFiles = "example.csv" #this is a list of all the CSV files
ForEach ($file in $ListofFiles)
{
$content = Get-Content ($file)
$content = $content[2..($content.Count)]
$contentArray = @()
[string[]]$contentArray = $content -split ","
$content = $content[0..2 + 4 + 6]
Add-Content '...\output.txt' $content
}
Where am I going wrong here...

Your example file should be read before the foreach, to fetch the file list:
$ListofFiles = get-content "example.csv"
Inside the foreach you are getting content of mainfile
$content = Get-Content ($ListofFiles)
instead of
$content = Get-Content $file
For removing rows, I would recommend selecting the row indexes you want to keep:
$obj = get-content C:\t.csv | select -Index 0,1,3
For removing columns (keeping column indexes 0, 1, 3 and 5):
$obj | %{(($_.split(","))[0,1,3,5]) -join "," } | out-file test.csv -Append

Assuming the initial files look like
col1,col2,col3,col4,col5,col6
col1,col2,col3,col4,col5,col6
,,,,,
,,,,,
You can also try this one liner
Import-Csv D:\temp\*.csv -Header 'C1','C2','C3','C4','C5','C6' | where {$_.c1 -ne ''} | select -Property 'C1','C2','C5' | Export-Csv 'd:\temp\final.csv' -NoTypeInformation
Since your CSVs all have the same structure, you can import them directly by providing the header, then remove the objects with missing data, then export all the objects to a single CSV file.

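It is sufficient to specify fictitious column names (you can list more names than the file has columns), adjust the where filter as needed, and exclude the columns that you do not want to keep.
# C3 and C5 are the two columns the question wants removed.
# "select *" is used because Windows PowerShell ignores -ExcludeProperty without -Property.
# Export-Csv runs once, after all files are read, so one file does not overwrite another.
gci "c:\yourdirwithcsv" -file -filter *.csv |
%{ Import-Csv $_.FullName -Header C1,C2,C3,C4,C5,C6 } |
where C1 -ne '' |
select * -ExcludeProperty C3, C5 |
export-csv "c:\temp\merged.csv" -NoTypeInformation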
