Removing Header and Footer from imported .csv - powershell

I have 3 .csv files that I am combining into one. This bit of code works:
Get-ChildItem 'C:\Scripts\testing\csvStuffer\temp\Individual.*.csv' |
ForEach-Object {Import-Csv $_} |
Export-Csv -NoTypeInformation 'C:\Scripts\testing\csvStuffer\temp\MergedCsvFiles.csv'
The problem is that each .csv file has a header and a footer.
I do not want to keep the header or footer from any of the files.
Any suggestions of what I need to add to the above code to remove the headers and footers???
Thanks!

This is not the most elegant solution but it worked for my test files.
Get-ChildItem 'C:\Scripts\testing\csvStuffer\temp\Individual.*.csv' |
ForEach-Object {
$filecontent = get-content $_ | select-object -skip 1;
$filecontent | select -First $($filecontent.length -1) | Set-Content -Path $_;
};
Skipping the first line is easy with select-object. Dropping the last line requires a bit more work, but since get-content returns an array of lines, you can just grab all but the last element in that array.

Looks like alroc already gave an answer, but since I already had it written up I figured I'd post this too. It doesn't load it all into a variable, it just reads each file, strips the first and last line of the current file, and then pipes to out-file with -append on it.
gci 'C:\Scripts\testing\csvStuffer\temp\Individual.*.csv' | %{
$(gc $_.fullname|select -skip 1)|select -First ($(gc $_.fullname|select -skip 1).count-1)
}|Out-File -Append 'C:\Scripts\testing\csvStuffer\temp\MergedCsvFiles.csv'
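On PowerShell 5.0 or later, Select-Object also has a -SkipLast parameter, so the first and last lines can be dropped without counting lines first. The following is a minimal sketch, assuming each file has exactly one header line and one footer line. The function name Merge-CsvBody and the sample files are made up for illustration.

```powershell
# Hypothetical helper: concatenates the body lines of each file,
# dropping the first (header) and last (footer) line of every file.
function Merge-CsvBody {
    param(
        [Parameter(Mandatory)] [string[]] $Path,
        [Parameter(Mandatory)] [string]   $Destination
    )
    $Path | ForEach-Object {
        # -Skip 1 drops the header; -SkipLast 1 (PowerShell 5.0+) drops the footer.
        # The two switches are used in separate Select-Object calls.
        Get-Content -LiteralPath $_ | Select-Object -Skip 1 | Select-Object -SkipLast 1
    } | Set-Content -LiteralPath $Destination
}

# Demo with two throwaway files in a temp folder
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
'HEADER', 'a1,a2', 'b1,b2', 'FOOTER' | Set-Content (Join-Path $dir 'Individual.1.csv')
'HEADER', 'c1,c2', 'FOOTER'          | Set-Content (Join-Path $dir 'Individual.2.csv')

$files  = Get-ChildItem (Join-Path $dir 'Individual.*.csv') | Sort-Object Name
$merged = Join-Path $dir 'MergedCsvFiles.csv'
Merge-CsvBody -Path $files.FullName -Destination $merged
Get-Content $merged
```

Because the lines are streamed through the pipeline, no file is rewritten in place and the source files stay untouched.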

Related

From a CSV file get the file header and a portion of the file based on starting and ending line number parameters using PowerShell

So I have a very huge CSV file, the first line has the column headers. I want to keep the first line as a header and add a portion of the file from the file's mid-section or perhaps the end. I'm also trying to select only a few of the columns from the file. And finally, it would be great if the solution also changed the file delimiter from a comma to a tab.
I'm aiming for a solution that's a one-liner or perhaps 2?
Non-working Code version 30 ...
Get-Content -Tail 100 filename.csv | Export-Csv -Delimiter "`t" -NoTypeInformation -Path .\filename_out.csv
I'm trying to get a better grip on PowerShell. So far, so good, but I'm not quite there yet. Still, trying to solve such challenges is helping me (and hopefully others) build a good collection of coding idioms. (FYI: the boss is trying PowerShell thanks to our efforts.)
OK, thanks to iRon's tip: Import-Csv defaults to comma-separated input, Select-Object -Property gets the columns I want, select -Last gets the last 200 rows, and Export-Csv changes the delimiter to a tab:
Import-Csv iarf.csv |
Select-Object -Property Id,Name,RecordTypeId,CreatedDate |
select -Last 200 |
Export-Csv -Delimiter "`t" -NoTypeInformation -Path .\iarf100props6.csv
iRon provided the crucial pointer: Using Import-Csv rather than Get-Content allows you to retrieve arbitrary ranges from the original file as objects, if selected via Select-Object, and exporting these objects again via Export-Csv automatically includes a header line whose column names are the input objects' property names, as initially derived from the input file's header line.
In order to select an arbitrary range of rows, use Select-Object's -First, -Last, and -Skip parameters:
To only get rows from the beginning, use just -First $count.
To only get rows from the end, use just -Last $count.
To get rows in a given range, combine -Skip $startRowMinus1 with -First $rangeRowCount.
For instance, the following command extracts rows 10 through 30:
Import-Csv iarf.csv |
Select-Object -Property Id,Name,RecordTypeId,CreatedDate -Skip 9 -First 21 |
Export-Csv -Delimiter "`t" -NoTypeInformation -Path .\iarf100props6.csv
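The arithmetic for turning 1-based start and end row numbers into -Skip and -First values can be sketched on in-memory data. The variable names and the sample CSV text below are assumptions for illustration only, and ConvertFrom-Csv / ConvertTo-Csv stand in for Import-Csv / Export-Csv so that no files are needed.

```powershell
# Sample data standing in for the CSV file: a header plus 50 data rows
$csvText = @('Id,Name') + (1..50 | ForEach-Object { "$_,Item$_" })
$rows = $csvText | ConvertFrom-Csv

$startRow = 10   # first data row wanted (1-based, header not counted)
$endRow   = 30   # last data row wanted (inclusive)

# Skip everything before the start row, then take the inclusive range
$selected = $rows |
    Select-Object -Skip ($startRow - 1) -First ($endRow - $startRow + 1)

# Tab-delimited output; the header line is regenerated from the property names
$selected | ConvertTo-Csv -Delimiter "`t" -NoTypeInformation
```

The row count for an inclusive range is end minus start plus one, which is easy to get off by one when the values are hard-coded.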

Adding Extra Headers in CSV

Input CSV:
CHeader1,CHeader2,CHeader3,CHeader4,CHeader5
a1,a2,a3,a4,a5
b1,b2,b3,b4,b5
Output CSV:
PHeader1,PHeader2
CHeader1,CHeader2,CHeader5
a1,a2,a5
b1,b2,b5
Already tried
Import-Csv -Path .\before.csv |
select CHeader1, CHeader2, CHeader5 |
Export-Csv -Path .\after.csv
This produces the file without parent level headers.
Any suggestions to add parent level headers in first line of CSV followed by client headers and then the data?
What you're trying to create there is not actually a CSV, at least not in a way that the *-Csv cmdlets could handle. You can manually create it like this, though:
'PHeader1,PHeader2' | Set-Content '.\after.csv'
Import-Csv '.\before.csv' |
Select-Object CHeader1, CHeader2, CHeader5 |
ConvertTo-Csv -NoType |
Add-Content '.\after.csv'

Removing blank lines from text file using batch

Recently this question was posted on Stack Overflow. I have a similar problem to J. Williams, except I only need to remove empty lines (removing spaces would not hurt the program, it just isn't necessary). When I tried his original as well as the solution compo gave, it only cleared the file instead of removing the extra lines. I'm using it by itself in a batch file.
Example:
I need this:
Joe
Bob
Mark
Frank
Dutch
(blank line here too)
to become this:
Joe
Bob
Mark
Frank
Dutch
I'm open to attaching such a solution to this PowerShell script too, as it is what is giving me the blank lines:
(Get-Content friends.txt) | Where-Object {$_ -notmatch "\bJoe\b"} | Set-Content friends.txt
Thanks for your help.
This should work for you - in short, it reads all the lines of the file, then writes back only the lines with something in it to the file.
$friends = Get-Content friends.txt
$friends | Where-Object { $_ } | Set-Content friends.txt
If it's a relatively small file:
(Get-Content $infile) -match '\S' | Set-Content $infile
If it's too large to read into memory all at once:
Get-Content $infile -ReadCount 1000 |
ForEach-Object {
$_ -match '\S' |
Add-Content $outfile
}
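The two filters used above are not quite equivalent: Where-Object { $_ } keeps any non-empty string, including lines that contain only spaces, while -match '\S' keeps only lines with at least one non-whitespace character. A small sketch on an in-memory array (the sample values are made up):

```powershell
# Sample lines: one empty string and one whitespace-only string among the names
$lines = 'Joe', '', 'Bob', '   ', 'Mark'

# -match applied to an array returns the elements that match the pattern
$nonBlank = $lines -match '\S'

# A non-empty string is truthy, so the whitespace-only line survives here
$nonEmpty = $lines | Where-Object { $_ }

"Non-blank: $($nonBlank.Count) lines; non-empty: $($nonEmpty.Count) lines"
```

If whitespace-only lines should also go, the -match '\S' form is the one to use.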
Another PowerShell method would be to use the Select-String cmdlet with the regex pattern .+, which matches one or more of any character. To avoid the extra newline that Set-Content writes at the end, you can use its -NoNewline parameter (requires PS5+), but note that -NoNewline also suppresses the newlines between the input strings, so join the lines yourself first.
(Select-String -Path C:\example.txt -Pattern '.+' |
Select-Object -ExpandProperty Line) -join "`r`n" |
Set-Content C:\exampleoutput.txt -NoNewline

Searching through a text file

I have a script that searches for the latest modified log file. It is then supposed to read that text file, pick up a key phrase, and display the line after it.
So far I have this:
$logfile = get-childitem 'C:\logs' | sort {$_.lastwritetime} | where {$_ -notmatch "X|Zr" }| select -last 1
$error = get-content $logfile | select-string -pattern "Failed to Modify"
an example line it reads is this
20150721 12:46:26 398fbb92 To CV Failed to Modify
CN=ROLE-x-USERS,OU=Role Groups,OU=Groups,DC=gyp,DC=gypuy,DC=net
MDS_E_BAD_MEMBERSHIP One or more members do not exist in the directory
The key bit of information I'm trying to get here is
Can anyone help?
Thanks
Try this:
$error = get-content $logfile |
Where-Object { $_ -like "*Failed to Modify*" } |
Select-Object -First 1
This is provided you are looking for the first match in the file. The Select-String cmdlet returns a MatchInfo object. Depending on your requirements there might be no reason to add that level of complexity if you're just looking to pull the first occurrence of this error in the file.
Failing this, my recommendation would be to debug this and step through it. Break on the Get-Content call and see what $logfile is. Run Get-Content $logfile and see what that content looks like. Then do your Select-String on that output. See what MatchInfo.ToString() looks like. Maybe you'll see some disconnect.
Again, my recommendation would be to just parse manually through the file and work with the Where-Object cmdlet at this point.
This should work:
get-childitem 'c:\logs' | where {$_.Name -notmatch "X|Zr" } | sort {$_.lastwritetime} | select -last 1 | select-string "Failed to Modify"
But I don't like the "X|Zr" part. If your log files have a .txt extension, none of them will be listed, because the pattern excludes any file with "x" or "zr" anywhere in its name, extension included (-notmatch is case-insensitive). Use $_.BaseName (the name without the extension), or modify the regular expression.
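Since the stated goal is the line after the key phrase, Select-String's -Context parameter may be a closer fit: -Context 0,1 captures zero lines before and one line after each match, and the captured lines are exposed as Context.PostContext on the resulting MatchInfo object. The sketch below uses a throwaway log file built from the sample lines in the question. It stores the result in $hit rather than $error, because $Error is an automatic variable in PowerShell.

```powershell
# Throwaway log file containing the sample lines from the question
$log = Join-Path ([System.IO.Path]::GetTempPath()) "sample-$([guid]::NewGuid()).log"
@(
    '20150721 12:46:26 398fbb92 To CV Failed to Modify'
    'CN=ROLE-x-USERS,OU=Role Groups,OU=Groups,DC=gyp,DC=gypuy,DC=net'
    'MDS_E_BAD_MEMBERSHIP One or more members do not exist in the directory'
) | Set-Content $log

# -SimpleMatch treats the pattern as literal text; -Context 0,1 keeps one line after the match
$hit = Select-String -Path $log -Pattern 'Failed to Modify' -SimpleMatch -Context 0,1 |
    Select-Object -First 1

# PostContext is an array of the lines following the matched line
$nextLine = $hit.Context.PostContext[0]
$nextLine
```

To get two lines after the match instead, -Context 0,2 with PostContext[1] would be the analogous change.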

Count number of PDF files in Directory and output to .csv

I am trying to report on the number of PDF files in a directory. The code below works fine on its own, but after I added Export-Csv the output is wrong. The file is created, but instead of the file count I get "#TYPE System.Int32" in cell A1 of the output file, and I'm not sure why.
(get-ChildItem C:\Test\* -Filter *.pdf -Recurse).Count | Export-Csv C:\TEMP\Test.csv
Export-Csv works best with objects that have named properties: each property becomes a column and its name becomes the column heading. All you have in this case is a bare number, so it has no idea what the column heading should be. If all you want is a number in a file, try this:
(get-ChildItem C:\Test\* -Filter *.pdf -Recurse).Count | Set-Content C:\TEMP\Test.csv
But if you really want a csv file or an example for other projects, try this:
$Row = [pscustomobject]@{NumberOfPDFFiles = ((get-ChildItem C:\Test\* -Filter *.pdf -Recurse).Count)}
$Row | Export-csv C:\TEMP\Test.csv -NoTypeInformation
Or something like this to stick with the one line idea
(get-ChildItem C:\Test\* -Filter *.pdf -Recurse).Count |
Select-Object @{n='PdfCount';e={$_}} |
Export-CSV C:\TEMP\Test.csv -NoTypeInformation
To complement kevmar's answer, since it addresses the issue but doesn't explain why it happens:
From TechNet
By default, the first line of the CSV file contains "#TYPE " followed
by the fully-qualified name of the type of the object.
That is why your first line is: #TYPE System.Int32 and why -NoTypeInformation removes it. If all you are doing is outputting a count then Set-Content makes more sense.