I am not looking for a working solution but rather an idea to push me in the right direction as I was thrown a curveball I am not sure how to tackle.
Recently I wrote a script that used split command to check the first part of the file against the folder name. This was successful so now there is a new ask: check all the files against the naming matrix, problem is there are like 50+ file formats on the list.
So for example format of a document would be ID-someID-otherID-date.xls
So for example 12345678-xxxx-1234abc.xls as a format for the amount of characters to see if files have the correct amount of characters in each section to spot typos etc.
Is there any other reasonable way of tackling that besides Regex? I was thinking of using multiple splits using the hyphens but don't really have anything to reference that against other than the amount of characters required in each part.
As always, any (even vague) pointers are most welcome ;-)
Although I would use a regex (as also commented by zett42), there is indeed an other way which is using the ConvertFrom-String cmdlet with a template:
$template = #'
{[Int]Some*:12345678}-{[String]Other:xxxx}-{[DateTime]Date:2022-11-18}.xls
{[Int]Some*:87654321}-{[String]Other:yyyy}-{[DateTime]Date:18Nov2022}.xls
'#
'23565679-iRon-7Oct1963.xls' | ConvertFrom-String -TemplateContent $template
Some : 23565679
Other : iRon
Date : 10/7/1963 12:00:00 AM
RunspaceId : 3bf191e9-8077-4577-8372-e77da6d5f38d
Related
I have a table that contains message_content. It looks like this:
message_content | WFUS54 ABNT 080344\r\r
| TORLCH\r\r
| TXC245-361-080415-\r
How would I extract only the 2nd line of that output(TORLCH)? I've tried to shorten the output to a certain number of characters but that ultimately doesn't provide what I want. I've also tried removing carriage returns and new lines. I am outputting my results to a CSV I could manipulate with Python, but was wondering if there's a way to do it in the query first.
Based on other examples, it seems like I could use a regular expression to maybe do this? Not sure where to start with learning that though.
you can split the line into an array, then take the second element:
(string_to_array(message_content, e'\r\r'))[2]
Online example: https://rextester.com/MDYLXB40812
This question already has answers here:
Export-CSV exports length but not name
(6 answers)
Closed 4 years ago.
Operating System - Windows 10
Powershell version - 5.1.15063.1088
Ok, I'm really trying hard to think logically what can be wrong with this PowerShell script, but apparently can't get an idea and asking for some help. So here is what I'm trying to do, simple as 1+1
If I understood the tutorial correctly, creating an array in PowerShell is like this:
$someVariable = "PowerShell", "MowerShell", "HowerShell", "ZowerShell"
Then I'm simply trying to write this thing to csv file with comma as delimeter, but firstly give it a try in the console output
$someVariable | ConvertTo-Csv -NoTypeInformation
According to PowerShell 5.1 official documentation
...Specifies a delimiter to separate the property values. The default
is a comma (,).
So no additional writing that I would like to use comma as delimiter is not required. Once the command Write-Host $someVariable is executed, I see this weird output:
"Length" "10" "10" "10" "10"
What is this? Am I suppose to see the values of my variable separated with simple comma? So from the numbers I can guess that scripts calculates the amount of alphabet letters in each word -
P o w e r S h e l l
contains 10 letters.
Is this the suggested way to calculate the amount of letters in the string (in case I get PowerShell task on my next job interview) using ConvertTo-Csv command?
Writing this funky data to the csv file itself leads to more unexpected results:
Now I'm completely lost what those numbers are...
Is this possible to write my strings as STRINGS to the csv file in one line rather then silly numbers?
The desired output is this entry as headers in the csv file:
"PowerShell","MowerShell","HowerShell","ZowerShell"
The output reads "Length", and has a series of 10's. Each of your strings are 10 characters long (the double quotes aren't factored in).
Length can be calculated many ways. I wouldn't say there is one suggested way, only the ways that fit what you're trying to do.
To get the literal text of what you posted (no headers, etc.) in a csv, try:
$someVariable | Out-File foo.csv
I've got a bunch of directory names and file names, some are absolute path, some are relative path. I just wish to get the 2 leading parts of each path. Input:
D:\a\b\c\d.txt\
c:\a
\my\desk\n.txt
you\their\mine
I expect to get:
D:\a
c:\a
\my\desk
you\their
Is there a convenient way in PowerShell to achieve this?
You can sometimes get your hand slapped for suggesting string manipulation as it can sometimes be "unreliable". However your test data contains 3 different possibilities. Also, never seen someone looking for the first parts from a path.
I present a simple solution the nets your desired output as you have it in your question
"D:\a\b\c\d.txt\","c:\a","\my\desk\n.txt","you\their\mine" | ForEach-Object{
($_ -split "(?<=\S)\\")[0..1] -join "\"
}
I needed to use a lookbehind since your sample output contains a leading a leading slash that you wanted to retain. It splits every string on slashes that have a non white-space character in front of them.
This would not return the correct path for UNC's. Split-Path would be the obvious choice if you only wanted a single portion of the path. I suppose you could nest the call to get 2 but at this time I am unable to find a simple way to account for all of your examples with the same logic.
I have an existing perl one-liner (from the Edwards lab) that works wonderfully to read a text file (named ids.file) that contains one column of IDs and searches a second, specially formatted text file (named fasta.file in this example - in "fasta" format for those who know bioinformatics) and returns sequences that match the ID from the first file. I was hoping to expand this script to do two additional things:
The current perl one-liner only seems to work if the ids.file contains one column of data. I would like it to work on a file that contains two columns (separated by spaces), and act on the second column of data (well, really any column of data, but I assume that it will be obvious enough to adapt it if someone can give an example using a second column)
I would like to append the any results returned from the output of the search to a third column, instead of just to a new file.
If someone is kind enough to offer an example but only has time or inclination to work on one of these, I would prefer that you try to solve #2 - I have come close to solving #1 with a for loop that uses awk to only use the Perl code on the second column - I haven't gotten it yet, but am close, so #2 seems like the harder one to me.
The perl one liner is as follows:
perl -ne 'if(/^>(\S+)/){$c=$i{$1}}$c?print:chomp;$i{$_}=1 if #ARGV' ids.file fasta.file
I appreciate any help you can give!
Not quite sure but will this do?
perl -ne 'chomp; s/^>(\S+).*/$c=$i{$1}/e; print if $c;
$i{(/^\S*\s(\S*)$/)[0]}="$_ " if #ARGV'
ids.file fasta.file
My script reads a log file once a minute and selects (and acts upon) the lines where the timestamp begins with the previous minute.
This is easy (the regex is simply "^$timestamp"), but when the log gets big it can take a while.
My thinking is the lines I want will always be near the bottom of the file, so I'd be searching far fewer lines if I started at the bottom and searched upwards, stopping when I get to the minute prior to the one I'm interested in.
My question is, how can I search from the bottom of the file instead of the top? Can I even say "read line $length", or even "read line n" (if so I could do a sort of binary search thing to find the length of the file and work backwards from there)?
Last question: would this even be faster (I'd still like to know how to do it even if it wouldn't be faster)?
Ideally, I'd like to do this all in my own code without installing anything extra.
Thanks
get-content bigfile.txt -tail 10
This words on huge files nearly instantly without any big memory usage.
I did it with a 22 GB text file in my testing.
Doing something like "get-context bigfile.txt | select -Last 10" works but it seems to have to load all of the lines (or objects in powershell) then does the select.
May I suggest just changing the regex to equal Get-Date + whatever time period you want?
For example (and this is without your log so i apologize)
$a = Get-Date
$hr = $a.Hour
$min = $a.Minute
Then work off those values to build out the regex to select the times you want. And if you don't already use it this website is awesome for building regex's quickly and easily http://gskinner.com/RegExr/ .
Got another fix, I think you will like this..
$a = get-content .\biglog.text
Use the length to slice the array from back to front change write host to select-string and your regex or whatever you want to do in reverse..
foreach($x in $a.length..0){ write-host $a[$x] }
Another option after the get-content cmdlet again, this option just reverse orders the array then you are reading $a from bottom to top
[array]::Reverse($a)
dc
If you only want the last bit of the file, depending on the format, you can just do this:
Get-Content C:\Windows\WindowsUpdate.log | Select -last 10
This will return the last 10 lines found in the file.