Adding two broken rows using PowerShell - powershell

I have a file which has a header in the first row and data in the remaining rows. I want to check whether every row has the same number of fields as the header.
For example: if the header has 10 fields, then I want each of the remaining rows to have 10 fields, so there will be no error while loading the data.
Suppose lines 5 and 6 have only 5 fields each. In such a case I want to combine these two rows.
My expected output is the file with row 5 containing the merged data.
There may be such broken data in many rows of the file, so I just want to scan the whole file and merge the two rows whenever such a case is found.
So, I tried using:
$splitway=' '
$firstLine = Get-Content -Path $filepath -TotalCount 1
$firstrowheader=$firstLine.split($splitway,[System.StringSplitOptions]::RemoveEmptyEntries)
$requireddataineachrow=$firstrowheader.Count
echo $requireddataineachrow
The above code gives me 10, since my header has 10 fields.
For ($i = 1; $i -lt $totalrows; $i++) {
$singleline=Get-Content $filepath| Select -Index $i
$singlelinesplit=$singleline.split($splitway,[System.StringSplitOptions]::RemoveEmptyEntries)
if($singlelinesplit.Count -lt $requireddataineachrow){
$curr=Get-Content $filepath| Select -Index $i
$next=Get-Content $filepath| Select -Index $i+1
Write-Host (-join($curr, " ", $next))
}
echo $singlelinesplit.Count
}
I tested joining the two lines with Write-Host (-join($curr, " ", $next)), but it's not giving the correct output.
echo $singlelinesplit.Count shows the correct result:
My whole data is:
billing summary_id offer_id vendor_id import_v system_ha rand_dat mand_no sad_no cad_no
11 23 44 77 88 99 100 11 12 500
1111 2333 4444 6666 7777777 8888888888 8888888888888 9999999999 1111111111111 2000000000
33333 444444 As per new account ddddddd gggggggggggg wwwwwwwwwww bbbbbbbbbbb qqqqqqqqqq rrrrrrrrr 5555555
22 33 44 55 666<CR>
42 65 66 55 244
11 23 44 76 88 99 100 11 12 500
1111 2333 new document 664466 7777777 8888888888 8888888888888 9999999999 111111144111 200055000
My whole code if needed is:
cls
$filepath='D:\test.txt'
$splitway=' '
$totalrows=@(Get-Content $filepath).Length
write-host $totalrows.gettype()
$firstLine = Get-Content -Path $filepath -TotalCount 1
$firstrowheader=$firstLine.split($splitway,[System.StringSplitOptions]::RemoveEmptyEntries)
$requireddataineachrow=$firstrowheader.Count
For ($i = 1; $i -lt $totalrows; $i++) {
$singleline=Get-Content $filepath| Select -Index $i
$singlelinesplit=$singleline.split($splitway,[System.StringSplitOptions]::RemoveEmptyEntries)
if($singlelinesplit.Count -lt $requireddataineachrow){
$curr=Get-Content $filepath| Select -Index $i
$next=Get-Content $filepath| Select -Index $i+1
Write-Host (-join($curr, " ", $next))
}
echo $singlelinesplit.Count
}

Update: It seems that instances of the string <CR> are a verbatim part of your input file, in which case the following solution should suffice:
(Get-Content -Raw sample.txt) -replace '<CR>\s*', ' ' | Set-Content sample.txt
Here's a solution that makes the following assumptions:
<CR> is just a placeholder to help visualize an actual newline in the input file.
Only data rows with fewer columns than the header row require fixing (as Mathias points out, your data is ambiguous, because a column value such as As per new account technically comprises three values, due to its embedded spaces).
Such a data row can blindly be joined with the subsequent line (only) to form a complete data row.
# Create a sample file.
@'
billing summary_id offer_id vendor_id import_v system_ha rand_dat mand_no sad_no cad_no
11 23 44 77 88 99 100 11 12 500
1111 2333 4444 6666 7777777 8888888888 8888888888888 9999999999 1111111111111 2000000000
33333 444444 As per new account ddddddd gggggggggggg wwwwwwwwwww bbbbbbbbbbb qqqqqqqqqq rrrrrrrrr 5555555
22 33 44 55 666
42 65 66 55 244
11 23 44 76 88 99 100 11 12 500
1111 2333 new document 664466 7777777 8888888888 8888888888888 9999999999 111111144111 200055000
'@ > sample.txt
# Read the file into the header row and an array of data rows.
$headerRow, $dataRows = Get-Content sample.txt
# Determine the number of whitespace-separated columns.
$columnCount = (-split $headerRow).Count
# Process all data rows and save the results back to the input file:
# Whenever a data row with fewer columns is encountered,
# join it with the next row.
$headerRow | Set-Content sample.txt
$joinWithNext = $false
$dataRows |
  ForEach-Object {
    if ($joinWithNext) {
      $partialRow + ' ' + $_
      $joinWithNext = $false
    }
    elseif ((-split $_).Count -lt $columnCount) {
      $partialRow = $_
      $joinWithNext = $true
    }
    else {
      $_
    }
  } | Add-Content sample.txt
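Two asides. First, the reason Write-Host (-join($curr, " ", $next)) in the question did not print the expected result is most likely the Select -Index $i+1 call: in argument mode PowerShell does not perform the addition, it passes the literal text "<value of $i>+1" (e.g. "4+1"), which -Index cannot bind; writing Select-Object -Index ($i + 1), or reading the file once into an array and using $allLines[$i + 1], avoids that. Second, after running the fix-up above you can check that no short rows remain, reusing the same fewer-columns test the solution relies on; a minimal sketch:
# Verify the repaired file: warn about any data row that still has
# fewer whitespace-separated columns than the header.
$headerRow, $dataRows = Get-Content sample.txt
$columnCount = (-split $headerRow).Count
$dataRows |
  Where-Object { (-split $_).Count -lt $columnCount } |
  ForEach-Object { Write-Warning "Row is still too short: $_" }
Note that the row containing As per new account will still split into more tokens than the header; that is exactly the ambiguity mentioned in the assumptions above.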

Related

Powershell to match the current line and next line then out-file

I'm trying to extract the data whereby:
line 1 = Report ID + line 2 = "Machine no" + line 3 = OFFLINE
then Out-File the result to a new file.
Sample data
Report ID page1
Machine no 1234
OTHERS
12
offline
12
OTHERS
23
offline
37
OTHERS
89
offline
65
The result I'm looking for looks something like the below after processing:
Report ID page 4
Machine no 1234
offline
12
offline
37
offline
65
You can use the Select-String cmdlet with the -Context parameter to search through a file and then select how many lines of contextual info you want to get back around each match.
For instance, if we take your input file and store it in a variable called $inputFile like so:
$inputFile= Get-Content C:\Path\To\YourFile.txt
$inputFile| Select-string 'machine no'
>Machine no 1234
We can then find matches for the phrase 'offline' with this:
$inputFile| Select-String offline -Context 0,1
This says: search for the word 'offline' and give me zero lines preceding it and one line after it, giving us this output:
> offline
12
> offline
37
> offline
65
We can put this all together to build the combined output and generate a new file:
$out= ''
$out += $inputFile| Select-string 'machine no'
$out += "`n"
$out += $inputFile| Select-String offline -Context 0,1 | ForEach-Object {$_.Context.DisplayPostContext}
#The resulting file would look like this: just the name of the machine and then the number of each offline...uh, whatever it is.
Machine no 1234
12 37 65
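To actually write that combined text to a new file, as the question asks, $out can simply be piped to Out-File (the path below is just an example):
$out | Out-File C:\Path\To\ReportSummary.txt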
If I were you, I'd adapt this flow to make PowerShell objects and properties instead, like this:
$Report = [pscustomobject]@{ID='';MachineNo='';OfflineMembers=@()}
$Report.ID = ($inputFile | Select-String page -Context 0).ToString().Split('page')[-1]
$Report.MachineNo = ($inputFile | Select-String 'machine no').ToString().Trim().Split()[-1]
$Report.OfflineMembers = $inputFile | Select-String offline -Context 0,1 | ForEach-Object {
    [pscustomobject]@{Value='Offline';ID=$_.Context.DisplayPostContext.Trim()}
}
>$Report
ID MachineNo OfflineMembers
-- --------- --------------
1 1234 {@{Value=Offline; ID=12}, @{Value=Offline; ID=37}, @{Value=Offline; ID=65}}
$Report.OfflineMembers
Value ID
----- --
Offline 12
Offline 37
Offline 65
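If you go the object route and still need the plain-text file the question asks for, the report can be flattened back into lines and written out; a minimal sketch (the output path is just an example):
@(
    "Report ID page $($Report.ID)"
    "Machine no $($Report.MachineNo)"
    $Report.OfflineMembers | ForEach-Object { 'offline'; $_.ID }
) | Out-File C:\Path\To\ReportOffline.txt
Keeping the data as objects until the last step like this makes it easy to switch the final output format later (Export-Csv, ConvertTo-Json, and so on).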

Split large excel file to multiple smaller file by user defined rows through powershell

I'm looking to split a large Excel file into multiple Excel files.
My sample excel file
Name Value value2
abc1 10 100
abc2 20 200
abc3 30 300
abc4 40 400
abc5 50 500
abc6 60 600
abc7 70 700
abc8 80 800
Expected result
Batch1.xlsx
Name Value Value2
abc1 10 100
abc2 20 200
abc3 30 300
Batch2.xlsx
Name Value Value2
abc4 40 400
abc5 50 500
abc6 60 600
Batch3.xlsx
Name Value Value2
abc7 70 700
abc8 80 800
My script gets stuck in its loop. As a beginner, I'm looking for some assistance.
Example: the user chooses to split the data every 3 rows; if the input file has 8 rows of data, file3.xlsx can keep the remaining 2 rows.
$nom = Read-Host 'Enter number of rows of data want to be in a file'
$nom = [int]$nom + [int]1
$nom1 = 'A'+ $nom
$nxc = 100
$meto = 'A1'
For ($nom; $nom -le $nxc) {
$excel = New-Object -ComObject Excel.Application
$excel.visible = $true
$workbook = $excel.workbooks.open("C:\Users\admin\Desktop\rax\Master\in.xlsx")
$worksheet = $workbook.sheets.item("Sheet1")
$worksheet.Range("$meto","$nom1").EntireRow.copy()
$wb2=$excel.workbooks.open("C:\Users\admin\Desktop\rax\out.xlsx")
$targetRange=$wb2.Worksheets.Item('Sheet1').Range("A1").EntireRow
$wb2.Worksheets.Item('Sheet1').Activate()
$targetRange.PasteSpecial(-4163)
$meto = $wb2.Worksheets.Item('Sheet1').UsedRange.Rows.Count
$wb2.RefreshAll()
$wb2.Save()
$workbook.Worksheets.Item('Sheet1').Activate()
$met = $workbook.Worksheets.Item('Sheet1').UsedRange.Rows.Count
$nxc = $met
$meto = [int]$meto + [int]1
$nom = [int]$meto - [int]1
$nom = [int]$nom + [int]$nom
$nom
$excel.Quit()
}
You can utilize an awesome module developed by Doug Finke: ImportExcel.
The code below will solve your problem.
$r = @()
$t = $c = 1
Import-Excel -Path C:\Temp\test.xlsx | ForEach-Object -Process {
    # Append rows in an array
    $r += $_
    # Save in a new excel when count reaches 3
    if ($c -eq 3) {
        $r | Export-Excel -Path C:\Temp\test_$t.xlsx
        # reset values
        $r = @()
        $c = 1
        $t++
    }
    else {
        # increment row count
        $c++
    }
}
#save remaining rows
$r|Export-Excel -Path C:\Temp\test_$t.xlsx
You can rename variables accordingly.
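If the chunk size should come from the user instead of being hard-coded to 3 (the original question reads it with Read-Host), the same pattern can be parameterized; a minimal sketch using an illustrative $chunkSize variable and the same example paths:
$chunkSize = [int](Read-Host 'Enter number of rows of data want to be in a file')
$r = @()
$t = 1
Import-Excel -Path C:\Temp\test.xlsx | ForEach-Object {
    $r += $_
    if ($r.Count -eq $chunkSize) {
        $r | Export-Excel -Path C:\Temp\test_$t.xlsx
        $r = @()
        $t++
    }
}
# save any remaining rows (e.g. the last 2 of 8 when splitting by 3)
if ($r.Count -gt 0) { $r | Export-Excel -Path C:\Temp\test_$t.xlsx }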

PowerShell - Header in CSV file and modify a value in a specific column

I am a newbie in PowerShell and I would like to get some answers to my few questions.
What I want to do:
I have a CSV file without a header (about 30-35 columns) and I would like to modify the value in column 20. I would replace this value with (newvalue = thisvalue * 5).
My problems:
- Is it an obligation to create a header for my CSV file to access column 20 and modify my value? (If yes, how can I do that??)
- How can I modify column 20 on every line of my CSV file like this:
newvalue = column20.value * 5
Thanks for the help
$csvImport = Import-Csv .\listOfUsers.csv -Header "name","samname","fullname","firstName","lastname","description","company","country","city","multi","targetNum"
foreach ($item in $csvImport){
[int]$item.targetNum= [int]$item.targetNum * 5
}
$csvImport | ConvertTo-Csv -NoTypeInformation | select -Skip 1 | Set-Content .\listOfUsers.csv
I chose the 11th column, but it should work for your 20th column just as well.
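Adapted to the question's roughly 35 columns and column 20, the same approach could look like the sketch below (the file name and the generated C1..C35 header names are placeholders; add -Delimiter ';' if the file happens to be semicolon-separated):
# Generate 35 placeholder header names so column 20 can be addressed as C20.
$header = 1..35 | ForEach-Object { "C$_" }
$csvImport = Import-Csv .\noHeaderFile.csv -Header $header
foreach ($item in $csvImport) {
    $item.C20 = [double]$item.C20 * 5
}
# Drop the synthetic header line again when writing the file back.
$csvImport | ConvertTo-Csv -NoTypeInformation | Select-Object -Skip 1 | Set-Content .\noHeaderFile.csv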
For demonstration purposes, the following script first generates a file noHeader.csv with a random number of columns (10..15), rows (5..10) and content.
The second part reads in the first line and splits it at the ',' comma to get the column count, then builds a pseudo header H1,H2,...,Hx where x is the column count.
It then imports the CSV and multiplies H10 by 5 (casting to double).
## Q:\Test\2018\05\29\SO_50581599.ps1
$CsvFile = '.\NoHeader.csv'
$NewFile = '.\WithHeader.csv'
## ======================================================================
## Generate Csv without Header with random no of cols,rows,content
$cols= get-random -min 10 -max 15
$rows= get-random -min 5 -max 10
$Data = ForEach ($Row in (1..$Rows)){
(1..$Cols | ForEach-Object{Get-Random -min 100 -max 1000} ) -join ','
}
$Data | Out-File $CsvFile
"Generated {0} with {1} columns and {2} rows" -f $CsvFile,$cols,$rows
GC $CsvFile
"=" * 70
## ======================================================================
## evaluate number of columns in file $CsvFile
$ColCnt = ((Get-Content $CsvFile | Select -First 1) -split ',').Count
"File {0} has {1} columns" -f $CsvFile,$ColCnt
## generate Header H1..Hx
$Header = (1..$ColCnt | ForEach-Object{"H{0}" -f $_})
$NewCsv = ForEach($row in (Import-Csv $CsvFile -Header $Header)){
[double]$row.H10 *= 5
$row
}
$NewCsv |ft * -auto
$NewCsv | Export-csv $NewFile -NoType
Generated .\NoHeader.csv with 11 columns and 8 rows
591,864,196,281,599,216,152,236,621,311,934
942,222,522,590,649,788,421,665,493,116,282
359,984,645,276,297,320,632,934,296,197,750
746,486,375,756,462,286,977,834,765,828,907
156,899,258,941,222,517,170,301,739,501,414
395,891,182,224,430,773,115,645,114,919,401
743,475,335,650,222,899,690,442,525,695,539
861,341,114,498,550,824,635,407,235,285,631
======================================================================
File .\NoHeader.csv has 11 columns
H1 H2 H3 H4 H5 H6 H7 H8 H9 H10 H11
-- -- -- -- -- -- -- -- -- --- ---
591 864 196 281 599 216 152 236 621 1555 934
942 222 522 590 649 788 421 665 493 580 282
359 984 645 276 297 320 632 934 296 985 750
746 486 375 756 462 286 977 834 765 4140 907
156 899 258 941 222 517 170 301 739 2505 414
395 891 182 224 430 773 115 645 114 4595 401
743 475 335 650 222 899 690 442 525 3475 539
861 341 114 498 550 824 635 407 235 1425 631
Finally I got it the way I wanted:
Use Unicode because of the accented characters:
$FicCSV = "C:\Temp\mycsvfile.csv"
Get-Content $FicCSV | Out-File "C:\Temp\mycsvfile_Unicode.csv" -Encoding Unicode
$csvImport = Import-Csv -path "C:\Temp\mycsvfile_Unicode.csv" -Header "A","B","C","D","E","F","G","H","I","J","K","L1","L2","M","N","O","P","Q","R1","R2","R3","R4","R5","R6","R7","R8","R9","S","T","U","V","W" -delimiter ';'
Use decimal to do my maths operation:
foreach ($item in $csvImport){
[decimal]$item.P= [decimal]$item.P * 5
}
My file must contain records like this: lea;john;katy;10.2;george;..
$csvImport | ConvertTo-Csv -NoTypeInformation -Delimiter ';' | % {$_.replace('"','').replace(',','.')}| select -Skip 1 | Set-Content "C:\Temp\mycsvfile_back.csv"
Thanks to everyone who helped me succeed :)

How to transpose one column (multiple rows) to multiple columns?

I have a file with the below contents:
0
ABC
1
181.12
2
05/07/16
3
1002
4
1211511108
6
1902
7
1902
10
hello
-1
0
ABC
1
1333.21
2
02/02/16
3
1294
4
1202514258
6
1294
7
1294
10
HAI
-1
...
I want to transpose the above file contents like below. The '-1' in the list above is the record separator, which indicates the start of the next record.
ABC,181.12,05/07/16,1002,1211511108,1902,1902,hello
ABC,1333.21,02/02/16,1294,1202514258,1294,1294,HAI
...
Please let me know how to achieve this.
Read the file as a single string:
$txt = Get-Content 'C:\path\to\your.txt' | Out-String
Split the content at -1 lines:
$txt -split '(?m)^-1\r?\n'
Split each block at line breaks:
... | ForEach-Object {
$arr = $_ -split '\r?\n'
}
Select the values at odd indexes (skip the number lines) and join them by commas:
$indexes = 1..$($arr.Count - 1) | Where-Object { ($_ % 2) -ne 0 }
$arr[$indexes] -join ','
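Put together, those steps could form a single pipeline; a minimal end-to-end sketch (the input and output paths are just examples):
$txt = Get-Content 'C:\path\to\your.txt' | Out-String
$txt -split '(?m)^-1\r?\n' |
    Where-Object { $_.Trim() } |    # skip the empty block after the final -1
    ForEach-Object {
        $arr = $_.Trim() -split '\r?\n'
        # keep the values at odd indexes (the lines after each number label)
        $indexes = 1..($arr.Count - 1) | Where-Object { ($_ % 2) -ne 0 }
        $arr[$indexes] -join ','
    } | Set-Content 'C:\path\to\output.txt'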

PowerShell: How to remove columns from delimited text input?

I have a text file with 5 columns of text delimited by whitespace. For example:
10 45 5 23 78
89 3 56 12 56
999 4 67 93 5
Using PowerShell, how do I remove the rightmost two columns? The resulting file should be:
10 45 5
89 3 56
999 4 67
I can extract the individual items using the -split operator. But, the items appear on different lines and I do not see how I can get them back as 3 items per line.
And to make the question more generic (and helpful to others): How to use PowerShell to remove the data at multiple columns in the range [0,n-1] given an input that has lines with delimited data of n columns each?
Read the file content, convert it to a csv and select just the first 3 columns:
Import-Csv .\file.txt -Header col1,col2,col3,col4,col5 -Delimiter ' ' | Select-Object col1,col2,col3
If you want just the values (without a header):
Import-Csv .\file.txt -Header col1,col2,col3,col4,col5 -Delimiter ' ' | Select-Object col1,col2,col3 | Format-Table -HideTableHeaders -AutoSize
To save back the results to the file:
(Import-Csv .\file.txt -Header col1,col2,col3,col4,col5 -Delimiter ' ') | Foreach-Object { "{0} {1} {2}" -f $_.col1,$_.col2,$_.col3} | Out-File .\file.txt
UPDATE:
Just another option:
(Get-Content .\file.txt) | Foreach-Object { $_.split()[0..2] -join ' ' } | Out-File .\file.txt
One way is:
gc input.txt | %{[string]::join(" ",$_.split()[0..2]) } | out-file output.txt
(replace 2 by n-1)
Here is the generic solution:
param
(
    # Input data file
    [string]$Path = 'data.txt',
    # Columns to be removed, any order, dupes are allowed
    [int[]]$Remove = (4, 3, 4, 3)
)
# sort indexes descending and remove dupes
$Remove = $Remove | Sort-Object -Unique -Descending
# read input lines
Get-Content $Path | .{process{
    # split and add to ArrayList which allows to remove items
    $list = [Collections.ArrayList]($_ -split '\s')
    # remove data at the indexes (from tail to head due to descending order)
    foreach($i in $Remove) {
        $list.RemoveAt($i)
    }
    # join and output
    $list -join ' '
}}
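Assuming the script above is saved to a file (the name Remove-Columns.ps1 below is just an example), it could be invoked like this to drop the two rightmost columns of the 5-column sample:
# remove the columns at 0-based indexes 3 and 4 and write the result to a new file
.\Remove-Columns.ps1 -Path input.txt -Remove 3, 4 | Out-File output.txt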