Been struggling for a few hours. I'm trying to hit this link
That has these contents:
# Generated on: Python 3.6.12
# With snowflake-connector-python version: 2.4.6
asn1crypto==1.4.0
azure-common==1.1.27
azure-core==1.15.0
azure-storage-blob==12.8.1
boto3==1.17.98
botocore==1.20.98
certifi==2021.5.30
cffi==1.14.5
chardet==4.0.0
cryptography==3.4.7
dataclasses==0.8
idna==2.10
isodate==0.6.0
jmespath==0.10.0
msrest==0.6.21
oauthlib==3.1.1
oscrypto==1.2.1
pycparser==2.20
pycryptodomex==3.10.1
PyJWT==2.1.0
pyOpenSSL==20.0.1
python-dateutil==2.8.1
pytz==2021.1
requests==2.25.1
requests-oauthlib==1.3.0
s3transfer==0.4.2
six==1.16.0
urllib3==1.26.5
And pass each non-commented out line to Poetry (a command line tool for Python dependency management)
This is my first step
(iwr https://raw.githubusercontent.com/snowflakedb/snowflake-connector-python/v2.5.1/tested_requirements/requirements_36.reqs | Select-Object).Content > req.txt
Where I'm struggling: I've tried ConvertFrom-String with various delimiters, and .Split(), but I can't seem to parse out the pieces I need. Poetry needs just the "packagename==" part as input, although the version number is optional. So I essentially want to ignore lines that start with a "#" and then pass each remaining line through a pipe, or save them as an array. It doesn't seem to respond to setting the delimiter to "\t" or a carriage return "`r".
So next I would do something like
foreach($package in $package_array){poetry add $package}
Any help would be appreciated.
AdminOfThings provided a good pointer in a comment, but let me try to put it all together:
$url = 'https://raw.githubusercontent.com/snowflakedb/snowflake-connector-python/v2.5.1/tested_requirements/requirements_36.reqs'
foreach ($pkgLine in (irm $url).Trim() -split '\r?\n' -notmatch '^\s*#') {
# Remove `Write-Host` to perform the actual poetry call.
Write-Host poetry add ($pkgLine -replace '=.*')
}
irm is the built-in alias for Invoke-RestMethod, which is a simpler alternative to Invoke-WebRequest (iwr) in this case, because it directly returns the text of interest, as a multi-line string.
As an aside: the | Select-Object in your code is effectively a no-op and can be omitted.
.Trim() trims a trailing newline (all trailing whitespace).
-split '\r?\n' splits the string into individual lines.
-notmatch '^\s*#' filters out all lines that start with #, optionally preceded by whitespace.
-replace '=.*' removes everything starting with = from each package line.
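Since the question also mentions saving the result as an array, here is a minimal sketch of that variant. It uses a short inline sample instead of the real download (an assumption, to keep it runnable offline), and $packages is just an illustrative name:

```powershell
# Inline sample standing in for the text returned by irm (assumption: same layout as the real file).
$text = @'
# Generated on: Python 3.6.12
asn1crypto==1.4.0
azure-common==1.1.27
'@

# Split into lines, drop comment lines, strip everything from the first "=" onward.
$packages = @($text.Trim() -split '\r?\n' -notmatch '^\s*#' -replace '=.*')

$packages   # asn1crypto, azure-common
```

The three operators have equal precedence and are evaluated left to right, so each one works on the array produced by the previous one.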
Related
I have a source file which is in .txt format. It looks like a semi-colon separated file:
100;200;ThisisastringcolumnA;4;
101;400;Thisisastringc;lumnA;5;
102;600;ThisisastringcolumnB;6;
104;600;Thisisa;;ringcolumnB;6;
However, it is determined by length. So it is a length-delimited file.
The first column, for example, runs from the first character to the third (100), then a semi-colon follows.
Second column starts at 5th position (including), until (including) 7th position. A string column can contain a semi-colon.
Now I want to import this length-delimited txt file with Powershell and export it as a csv file. This file should be really semi-colon separated. The result should look like
100;200;ThisisastringcolumnA;4;
101;400;"Thisisastringc;lumnA";5;
102;600;ThisisastringcolumnB;6;
104;600;"Thisisa;;ringcolumnB";6;
But I simply have no idea how to do it. I googled it, but I did not find many useful code examples for importing length-delimited txt files with PowerShell.
Unfortunately, I cannot use Python. I am not sure, if this task is generally possible using Powershell? Because when exporting, Powershell also needs to recognize that there are string values containing the separator, so it has to pay attention to the quoting: "Thisisa;;ringcolumnB". I think it would be also ok for me, if the whole column is quoted, so every entry in a string column gets quotes added.
You can use regex to describe a string in which the 3rd "column" contains a ; and then inject the quotation marks with the -replace operator:
$lines = Get-Content path\to\file.txt
@($lines) -replace '(.{3});(.{3});(.{20}(?<=;.{0,19}));(.);', '$1;$2;"$3";$4;'
The expression (.{20}(?<=;.{0,19})) is going to match the 20-char 3rd column value only if it contains at least one semi-colon - so lines with no semicolon in that column will be left alone:
# let's try it out with your test data
$lines = @'
100;200;ThisisastringcolumnA;4;
101;400;Thisisastringc;lumnA;5;
102;600;ThisisastringcolumnB;6;
104;600;Thisisa;;ringcolumnB;6;
'@ -split '\r?\n'
@($lines) -replace '(.{3});(.{3});(.{20}(?<=;.{0,19}));(.);', '$1;$2;"$3";$4;'
Which yields the following four strings:
100;200;ThisisastringcolumnA;4;
101;400;"Thisisastringc;lumnA";5;
102;600;ThisisastringcolumnB;6;
104;600;"Thisisa;;ringcolumnB";6;
To write the output back to file, use Set-Content:
@($lines) -replace '(.{3});(.{3});(.{20}(?<=;.{0,19}));(.);', '$1;$2;"$3";$4;' |Set-Content path\to\fixed_output.scsv
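If quoting every value is acceptable, as the question suggests, a different route is to cut the fields out by position and let ConvertTo-Csv do the quoting. This is only a sketch under the assumption that every line has the fixed widths 3, 3, 20 and 1 shown in the sample; the property names Id, Value, Text and Flag are made up for illustration:

```powershell
# Sample lines from the question; in practice these would come from Get-Content.
$lines = @(
    '101;400;Thisisastringc;lumnA;5;'
    '104;600;Thisisa;;ringcolumnB;6;'
)

$rows = foreach ($line in $lines) {
    # Cut each field out by 0-based offset and width, skipping the separators.
    [pscustomobject]@{
        Id    = $line.Substring(0, 3)
        Value = $line.Substring(4, 3)
        Text  = $line.Substring(8, 20)
        Flag  = $line.Substring(29, 1)
    }
}

# Emit semicolon-separated CSV text; pipe to Set-Content to save it.
$rows | ConvertTo-Csv -Delimiter ';' -NoTypeInformation
```

Unlike the regex approach, this emits a header row and no trailing semicolon, and Substring throws on lines shorter than expected, so it only fits if the input is strictly fixed-width.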
I have looked at this question, and it's close to what I need to do, but the text I need to replace is inconsistent.
I need to replace "`r`n with ", but only the first of the 2 adjacent lines
example: (the full file is 50k lines and up to 500 chars wide)
ID,Name,LinkedRecords
54429,Abe,
54247,Jonathan,"
63460|63461"
54249,Teresa,
54418,Cody,
58046,Joseph,
58243,David,
,Barry,"
74330"
C8876,Simon,
X_10934,David,
should become
ID,Name,LinkedRecords
54429,Abe,
54247,Jonathan,"63460|63461"
54249,Teresa,
54418,Cody,
58046,Joseph,
58243,David,
,Barry,"74330"
C8876,Simon,
X_10934,David,
I can see this will probably be useful, but I'm having a hard time getting the command to work as desired
If the `r`n characters are literal, then you can do the following:
[System.IO.File]::ReadAllText('c:\path\file.txt') -replace '(?<=,")`r`n\r?\n' |
Set-Content c:\path\file.txt
If `r`n are actual carriage return and line feed chars, then you can do the following:
[System.IO.File]::ReadAllText('c:\path\file.txt') -replace '(?<=,")\r\n' |
Set-Content c:\path\file.txt
Note if memory becomes an issue, a different approach may be needed.
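To see what the lookbehind does without touching a file, here is a small sketch on an in-memory string. The sample text is a shortened version of the data in the question, and $csv and $fixed are illustrative names:

```powershell
# Shortened sample with real CRLF line breaks; `" is an escaped double quote.
$csv = "ID,Name,LinkedRecords`r`n54247,Jonathan,`"`r`n63460|63461`"`r`n54249,Teresa,`r`n"

# Remove a line break only when it directly follows ," (comma plus opening quote).
$fixed = $csv -replace '(?<=,")\r?\n'

$fixed
```

The line break after 54249,Teresa, is kept because it follows a bare comma, not the ," pair. Using \r?\n instead of \r\n is a small variation that also covers files with LF-only line endings.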
How can I get just a part of XML node text?
I have this piece of XML:
<CorpusLink>../Metadata/A_short_autobiography_of_Herculino_Alves.xml</CorpusLink>
<CorpusLink >../Metadata/Wordlist_and_phrases_-_modifiers.xml</CorpusLink>
<CorpusLink >../desano-silva-0151/Metadata/Wordlist_fruits_and_cultural_items.xml</CorpusLink>
<CorpusLink >../desano-silva-0151/Metadata/The_Turtle_and_the_Deer.xml</CorpusLink>
<CorpusLink >../desano-silva-0151/Metadata/Wordlist_and_phrases_parts_of_a_tree.xml</CorpusLink>
<CorpusLink >../desano-silva-0151/Metadata/Wordlist_and_phrases_.xml</CorpusLink>
I need to extract only this piece of text in each one:
../Metadata
../desano-silva-0151/Metadata
I have this code :
$j = 0
$TrgContent.METATRANSCRIPT.Corpus.CorpusLink | ForEach-Object {
[String]$_.'#text' = % { $alltext[$j] + "xml"; $j++ } }
But it gives me all the text:
../Metadata/A_short_autobiography_of_Herculino_Alves.xml
../desano-silva-0151/Metadata/Wordlist_fruits_and_cultural_items.xml
Thanks in advance for any help.
To achieve what you have asked, I think there are two main steps:
Extract the content of XML nodes.
Trim the content and take what you need only.
I'm not really familiar with your existing scripts, so I will explain both steps here. The first step is optional for you.
Extract content of XML nodes
My example XML document:
<Corpus>
<CorpusLink>../Metadata/A_short_autobiography_of_Herculino_Alves.xml</CorpusLink>
<CorpusLink>../Metadata/Wordlist_and_phrases_-_modifiers.xml</CorpusLink>
<CorpusLink>../desano-silva-0151/Metadata/Wordlist_fruits_and_cultural_items.xml</CorpusLink>
<CorpusLink>../desano-silva-0151/Metadata/The_Turtle_and_the_Deer.xml</CorpusLink>
<CorpusLink>../desano-silva-0151/Metadata/Wordlist_and_phrases_parts_of_a_tree.xml</CorpusLink>
<CorpusLink>../desano-silva-0151/Metadata/Wordlist_and_phrases_.xml</CorpusLink>
</Corpus>
PS script to get the content:
[xml] $XmlDocument = Get-Content D:\Path_To_Your_File
$XmlDocument.Corpus.CorpusLink # Content of the nodes you need
Trim the content
There are many methods but I think I will go with regex. Simply loop through all the contents and run the regex.
$XmlDocument.Corpus.CorpusLink | Foreach-Object {
if ($_ -match "\.\.\/.*?\/") {
$Matches.Values
}
}
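About the regex: it matches ../ followed by as few characters as possible (excluding line terminators) up to and including the next /. For a path with more than one directory level, such as ../desano-silva-0151/Metadata/..., the match therefore ends at the first slash after ../ and returns ../desano-silva-0151/.
\.\. # Escape for 2 dots `..`
\/ # Escape for slash `/`
.*? # Takes any character except for line terminators in between other listed characters (above and below)
\/ # Escape for slash `/`
I assume the structure of these strings is stable like that, hence the regex.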
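If the goal is the full directory part, as in the expected output in the question (../Metadata and ../desano-silva-0151/Metadata), one alternative is to strip the last path segment instead of matching the first one. A minimal sketch, using two of the sample values as plain strings; $links and $dirs are illustrative names:

```powershell
# Two of the CorpusLink values from the question, as plain strings.
$links = @(
    '../Metadata/A_short_autobiography_of_Herculino_Alves.xml'
    '../desano-silva-0151/Metadata/The_Turtle_and_the_Deer.xml'
)

# Remove the last slash and everything after it (the file name).
$dirs = $links -replace '/[^/]*$'

$dirs   # ../Metadata, ../desano-silva-0151/Metadata
```

Against the real document, the same -replace could be applied to $XmlDocument.Corpus.CorpusLink, assuming each value contains at least one slash.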
If I run the below code, $SRN can be written as output or added to another variable, but trying to include either another variable or regular text causes it to be overwritten from the beginning of the line. I'm assuming it's something to do with how I'm assigning $autocode and $SRN initially but can't tell what it's trying to do.
# Load the property set to allow us to get to the email body.
$item.load($psPropertySet) # Load the data.
$bod = ($item.Body.Text -creplace '(?m)^\s*\r?\n','') -split "\n" # Get the body text, remove blank lines, split on line breaks to create an array (otherwise it is a single string).
$autocode = $bod[4].split('-')[2] # Get line 4 (should be Title), split on dash, look for 3rd element, this should contain our automation code.
$SRN = $bod[1] -replace 'ID: ','' # Get line 2 (should be ID), find and replace the preceding text.
# Skip processing if autocode does not match our list of handled ones.
if ($autocode -cin $autocodes)
{
write-host "$SRN $autocode"
write-host "$autocode $SRN"
write-host "$SRN test"
$var = "$SRN $autocode"
$var
}
The code results in this, you can see if $SRN isn't at the start of the line it is fine. Unsure where the extra spaces come from either:
KRNE8385
KRNE SR1788385
test8385
KRNE8385
I would expect to see this:
SR1788385 KRNE
KRNE SR1788385
SR1788385 test
SR1788385 KRNE
LotPings pointed me down the right path, both variables still had either "0D" or "\r" in them. My regex replace was only getting rid of them on blank lines, and I split the array on "\n" only. Changing line 3 in the original code to the below appears to have resolved the issue. First time seeing Format-Hex, but it appears to be excellent for troubleshooting such issues.
$bod = ($item.Body.Text -creplace '(?m)^\s*\r?\n','') -split "\r\n"
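For anyone hitting the same symptom, here is a small sketch of why the output appeared to overwrite itself. The body text is made up to resemble the email in the question; splitting on "\n" alone leaves a carriage return at the end of each line, which moves the cursor back to the start of the line when printed:

```powershell
# Made-up body text with CRLF line endings (assumption: similar to the real email body).
$body = "ID: SR1788385`r`nTitle: x-y-KRNE`r`n"

# Splitting on LF only keeps the CR at the end of each element.
$dirty = $body -split "\n"
$dirty[0].EndsWith("`r")    # True

# Splitting on an optional CR plus LF removes it, whichever line ending is used.
$clean = $body -split '\r?\n'
$clean[0].EndsWith("`r")    # False

# Format-Hex shows the trailing 0D byte in the first case.
$dirty[0] | Format-Hex
```

Using '\r?\n' rather than "\r\n" is a slightly more forgiving variant, since it also handles text that contains LF-only line breaks.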
I am using PowerShell and I need to replace a line in a .txt file.
The .txt file always has different number at the end of the line.
For example:
...............................txt (first)....................................
appversion= 10.10.1
............................txt (a second time)................................
appversion= 10.10.2
...............................txt (third)...................................
appversion= 10.10.5
I need to replace appversion + number behind it (the number is always different). I have set the required value in variable.
How do I do this?
Part of the issue you are having, which I see from your comments, is that you are trying to replace text in a file and save it back to the same file while you are still reading it.
I will try to show a similar solution while addressing this. Again we are going to use -replace's functionality as an array operator.
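$NewVersion = "Awesome"
$filecontent = Get-Content C:\temp\file.txt
# ${1} instead of $1 keeps the group reference unambiguous when $NewVersion starts with a digit.
$filecontent -replace '(^appversion=.*\.).*',"`${1}$NewVersion" | Set-Content C:\temp\file.txt
This regex will match lines starting with "appversion=" and capture everything up to and including the last period. Since we are storing the text in memory, we can write it back to the same file. Change $NewVersion to a number ... unless that is your versioning structure. With a plain $1, a numeric value such as 5 would turn the replacement into $15, which .NET reads as a reference to a non-existent group 15.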
I'm not sure which part of the number, if any, you are trying to preserve. If you intend to change the whole number, then you can just change .*\. in the regex to a space. That way you replace everything after the equals sign.
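As a sketch of that whole-number variant, here it is on a single string rather than on the file. The value 10.10.6 is just an example, and $line and $updated are illustrative names:

```powershell
$NewVersion = '10.10.6'
$line = 'appversion= 10.10.5'

# Capture "appversion= " and replace everything after it.
# ${1} is used so the digits of $NewVersion are not read as part of the group number.
$updated = $line -replace '(^appversion= ).*', "`${1}$NewVersion"

$updated   # appversion= 10.10.6
```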
Yes, you can do this with regex.
Let's call the variables holding the text and the version number $myString and $verNumber:
$myString = "appversion= 10.10.1";
$verNumber = 7;
You can use the -replace operator to capture the version parts and replace only the last subversion number this way (the dots are escaped so that they match literal periods only):
$mystring -replace 'appversion= (\d+)\.(\d+)\.(\d+)', "appversion= `$1.`$2.$verNumber";