How To Produce HTML Table of Filehashes - with Relative Path Only? - powershell
I want to generate an HTML table that shows the file hash (SHA1) of each file in a directory. The filenames should be relative to my current directory, not absolute.
I know how to do all the different bits separately, but I can't figure out how to chain them up.
Here's what I've got so far:
dir|get-filehash -Algorithm sha1
Which gives me this:
Algorithm Hash                                     Path
--------- ----                                     ----
SHA1      DA39A3EE5E6B4B0D3255BFEF95601890AFD80709 C:\temp\test\empty.txt
SHA1      88A5B867C3D110207786E66523CD1E4A484DA697 C:\temp\test\hello.txt
Now I only want the hash and filename, so I can do this:
dir|get-filehash -Algorithm sha1|select-object hash, path
Which gives me:
Hash                                     Path
----                                     ----
DA39A3EE5E6B4B0D3255BFEF95601890AFD80709 C:\temp\test\empty.txt
88A5B867C3D110207786E66523CD1E4A484DA697 C:\temp\test\hello.txt
So I can output this to an HTML file like this:
(dir|get-filehash -Algorithm sha1|select-object hash, path)|ConvertTo-html|add-content output.htm
[ignore the fact that this only works properly if the output file doesn't exist for now].
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>HTML TABLE</title>
</head><body>
<table>
<colgroup><col/><col/></colgroup>
<tr><th>Hash</th><th>Path</th></tr>
<tr><td>DA39A3EE5E6B4B0D3255BFEF95601890AFD80709</td><td>C:\temp\test\empty.txt</td></tr>
<tr><td>88A5B867C3D110207786E66523CD1E4A484DA697</td><td>C:\temp\test\hello.txt</td></tr>
</table>
So this gives me an HTML table, but the Path values are absolute.
I know a simple way of getting a relative path using the 'Resolve-Path' cmdlet:
dir | Resolve-Path -Relative
.\empty.txt
.\hello.txt
But I can't get it to 'fit' in the rest of my script. I guess there might be a .NET function to do this in a different way? Or is there some fancy ninja-use of brackets that lets me squeeze this cmdlet call inside the 'select-object' list?
I tried this, but it doesn't work:
# NOTE: this code does not work !
PS > dir|get-filehash|select-object hash, (path|Resolve-Path -relative)
Made some progress on this - but the solution still isn't very satisfactory - since I also need control over the headers (rather than just 'name', 'value' which the HashTable provides).
I can get these headers working in 'format-table', but still not in 'convertto-html'.
Here's the code so far:
function get-relative($infile) {
    # collect relative path -> SHA1 hash pairs across all pipeline input
    begin { $return_hash = @{} }
    process {
        $relative_path = ($_ | Resolve-Path -Relative)
        $filehash = ($_ | Get-FileHash -Algorithm sha1).Hash
        $return_hash.Add($relative_path, $filehash)
    }
    end { return $return_hash }
}
And to call it with a formatter:
$table_format = @{Expression={$_.Name} ; Label="Filename"}, @{Expression={$_.Value} ; Label="CHECKSUM(SHA1)"}
dir|get-relative|format-table $table_format
This gives:
Filename      CHECKSUM(SHA1)
--------      --------------
.\goodbye.txt DA39A3EE5E6B4B0D3255BFEF95601890AFD80709
.\hello.txt   88A5B867C3D110207786E66523CD1E4A484DA697
But if I shove that through a 'convertto-html', weirdness ensues. (I guess this is because the output from 'format-table' is just a string by now....)
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>HTML TABLE</title>
</head><body>
<table>
<colgroup><col/><col/><col/><col/><col/><col/></colgroup>
<tr><th>ClassId2e4f51ef21dd47e99d3c952918aff9cd</th><th>pageHeaderEntry</th><th>pageFooterEntry</th><th>autosizeInfo</th><th>shapeInfo</th><th>groupingEntry</th></tr>
<tr><td>033ecb2bc07a4d43b5ef94ed5a35d280</td><td></td><td></td><td></td><td>Microsoft.PowerShell.Commands.Internal.Format.TableHeaderInfo</td><td></td></tr>
<tr><td>9e210fe47d09416682b841769c78b8a3</td><td></td><td></td><td></td><td></td><td></td></tr>
<tr><td>27c87ef9bbda4f709f6b4002fa4af63c</td><td></td><td></td><td></td><td></td><td></td></tr>
<tr><td>27c87ef9bbda4f709f6b4002fa4af63c</td><td></td><td></td><td></td><td></td><td></td></tr>
<tr><td>4ec4f0187cb04f4cb6973460dfe252df</td><td></td><td></td><td></td><td></td><td></td></tr>
<tr><td>cf522b78d86c486691226b40aa69e95c</td><td></td><td></td><td></td><td></td><td></td></tr>
</table>
</body></html>
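The columns in that last table (pageHeaderEntry, shapeInfo and so on) appear because Format-Table emits formatting objects rather than the original data, so ConvertTo-Html ends up rendering the properties of those objects. The general shape of a fix is to build plain rows that already carry the desired column names and to convert to HTML only as the last step; in PowerShell that is usually done with calculated properties on Select-Object. A minimal sketch of that shape, written in Python purely for illustration (the function name sha1_table_html is hypothetical, and the column labels are taken from the Format-Table attempt above):

```python
import hashlib
import html
import os


def sha1_table_html(directory="."):
    """Return an HTML table of (relative path, SHA1) for the files in directory."""
    rows = []
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)
        if not os.path.isfile(full_path):
            continue  # skip subdirectories
        with open(full_path, "rb") as handle:
            digest = hashlib.sha1(handle.read()).hexdigest().upper()
        # Path relative to the listed directory, never the absolute path.
        relative = os.path.join(".", os.path.relpath(full_path, directory))
        rows.append((relative, digest))

    # Render only at the very end, once the rows have their final shape.
    lines = ["<table>", "<tr><th>Filename</th><th>CHECKSUM(SHA1)</th></tr>"]
    for relative, digest in rows:
        lines.append(f"<tr><td>{html.escape(relative)}</td><td>{digest}</td></tr>")
    lines.append("</table>")
    return "\n".join(lines)
```

The point of the sketch is the ordering: data first, presentation last. Anything that formats for the console in between loses the original columns.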
Related
Search and get content from data in XML and then place that value in another tag using powershell
Sample text file contains:
<?xml version="1.0" encoding="UTF-8" ?>
<Document xmlns ="urn:iso:std:iso:20022:tech:xsd:camt.056.001.01" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <FIToFIPmtCxlReq>
    <Assgnmt>
      <Id>ID123456</Id>
      <Assgnr>
        <Agt>
          <FinInstnId>
            <BIC>BICSEND</BIC>
          </FinInstnId>
        </Agt>
      </Assgnr>
      <Assgne>
        <Agt>
          <FinInstnId>
            <BIC>BICRCV</BIC>
          </FinInstnId>
        </Agt>
      </Assgne>
      <CreDtTm>2020-12-16T09:05:15.0Z</CreDtTm>
    </Assgnmt>
    <CtrlData>
      <NbOfTxs>1</NbOfTxs>
      <CtrlSum>0</CtrlSum>
    </CtrlData>
    <Undrlyg>
      <TxInf>
        <CxlId>20201216.105.19344855940590400</CxlId>
        <OrgnlGrpInf>
          <OrgnlMsgId>REF123456789</OrgnlMsgId>
          <OrgnlMsgNmId>pacs.008</OrgnlMsgNmId>
        </OrgnlGrpInf>
        <OrgnlInstrId>FT123456</OrgnlInstrId>
        <OrgnlEndToEndId>NOTPROVIDED</OrgnlEndToEndId>
        <OrgnlTxId>20201216.100.02202020</OrgnlTxId>
        <OrgnlIntrBkSttlmAmt Ccy="EUR">25.23</OrgnlIntrBkSttlmAmt>
        <OrgnlIntrBkSttlmDt>2020-12-16</OrgnlIntrBkSttlmDt>
I would like to write PowerShell code that extracts the value of the <OrgnlIntrBkSttlmAmt> tag (the length can vary, since this is an amount field) and then replaces the "0" in the <CtrlSum> tag with that value, "25.23". Can someone help me with this? Thank you for your time.
The XML you show us is invalid, as it is missing the following closing tags:
</TxInf>
</Undrlyg>
</FIToFIPmtCxlReq>
</Document>
If I add these, you could do this to update the value in the <CtrlSum> tag:
# load the xml from file
[xml]$xml = Get-Content -Path 'D:\Test\test.xml' -Raw
# get the amount from the 'OrgnlIntrBkSttlmAmt' tag
$amount = $xml.Document.FIToFIPmtCxlReq.Undrlyg.TxInf.OrgnlIntrBkSttlmAmt.'#text'
# use that amount to put in the 'CtrlSum' tag
$xml.Document.FIToFIPmtCxlReq.CtrlData.CtrlSum = $amount
# save the updated xml to file
$xml.Save('D:\Test\test.xml')
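The same read-then-write step can also be sketched with an explicit, namespace-aware lookup, since every element in the sample lives in the camt.056.001.01 default namespace. This is a minimal Python illustration that assumes the document has been completed with the missing closing tags; the function name copy_amount_to_ctrl_sum and the prefix "c" are made up for the example.

```python
import xml.etree.ElementTree as ET

# Map a local prefix (hypothetical, any name works) to the document's namespace.
NS = {"c": "urn:iso:std:iso:20022:tech:xsd:camt.056.001.01"}


def copy_amount_to_ctrl_sum(xml_text):
    """Copy the OrgnlIntrBkSttlmAmt text into CtrlSum and return the root element."""
    root = ET.fromstring(xml_text)
    amount = root.find(".//c:OrgnlIntrBkSttlmAmt", NS)
    ctrl_sum = root.find(".//c:CtrlSum", NS)
    if amount is None or ctrl_sum is None:
        raise ValueError("expected elements not found")
    # The amount is read as text, so its length does not matter.
    ctrl_sum.text = amount.text
    return root
```

Because the value is copied as text rather than parsed as a number, formatting such as "25.23" is preserved exactly.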
Jsoup parse with unreadable characters
When parsing the flashscore server's response with JSoup, I get unreadable characters.
Jsoup code:
document = Jsoup.connect(URL + LABEL + SEASON + 1 + END)
        .userAgent(USER_AGENT)
        .header("x-fsign", FSIGN)
        .get();
Server response:
<html>
<head></head>
<body>
SA÷1¬~ZA÷ИТАЛИЯ: Серия В¬ZEE÷6oug4RRc¬ZB÷98¬ZY÷Италия¬ZC÷GbNgKxPB¬ZD÷p¬ZE÷K28bJgeL
How can I work with it?
Set the correct charset in the charset argument (see "JSoup character encoding issue"):
document = Jsoup.parse(new URL(url).openStream(), "ISO-8859-1", url);
What is preventing some of the content from an Invoke-WebRequest showing?
I am trying to get some ASCII art characters directly from a webpage. You can navigate to the page using the following URL:
http://patorjk.com/software/taag/#p=display&f=Acrobatic&t=A
If you go to that page you will see a rendering of the character A using the Acrobatic font:
o <|> / \ o/ \o <|__ __|> / \ o/ \o /v v\ /> <\
Using the following code nets me most of the page:
$fontUrlTemplate = "http://patorjk.com/software/taag/#p=display&f={0}&t={1}"
$fontName = [uri]::EscapeUriString("Acrobatic")
$character = "A"
$fontUrl = $fontUrlTemplate -f $fontName, $character
$webResult = Invoke-WebRequest $fontUrl
$webResult.Content
However, when I inspect the Content, the actual result I am looking for is missing:
...
<div id="maincontent" >
<div id="outputFigDisplay" ></div>
</div>
...
There should be something like this in there:
<pre id="taag_output_text" style="float:left;" class="fig" contenteditable="true">...</pre>
I am sure there is a server-side reason for this, but I would like to better understand and, if possible, mitigate it. I have tried mucking around with -ContentType and -UserAgent, but it didn't change anything.
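One thing that can be checked without touching the site: everything after '#' in a URL is a fragment, and HTTP clients do not send fragments to the server. The request therefore only asks for /software/taag/, whatever font and text are named after the '#'. That the page then fills in the output with client-side script, which Invoke-WebRequest does not run, is an assumption about this particular site. A small Python sketch of the URL split:

```python
from urllib.parse import urlsplit

url = "http://patorjk.com/software/taag/#p=display&f=Acrobatic&t=A"
parts = urlsplit(url)

# What the server is actually asked for: path plus query, never the fragment.
request_target = parts.path + ("?" + parts.query if parts.query else "")

print(request_target)   # /software/taag/
print(parts.fragment)   # p=display&f=Acrobatic&t=A
```

Since the font and character never reach the server, no request header or content-type setting can change what comes back.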
powershell invoke-webrequest with codepage win 1251
I need to get data from a page that has the win-1251 codepage.
$SiteAdress = "http://www.gisinfo.ru/download/download.htm"
$HttpContent = Invoke-WebRequest -URI $SiteAdress
echo $HttpContent
And it shows me:
StatusCode        : 200
StatusDescription : OK
Content           : <!DOCTYPE html>
                    <html><!-- #BeginTemplate "/Templates/panorama.dwt" --><!-- DW6 -->
                    <head>
                    <!-- #BeginEditable "doctitle" -->
                    <title>ÃÈÑ ÏÀÍÎÐÀÌÀ - Ñêà÷àòü ïðîãðàììû</title>
                    <meta name="keywords" con...
RawContent        : HTTP/1.1 200 OK
                    Transfer-Encoding: chunked
                    Connection: keep-alive
                    Keep-Alive: timeout=20
                    Content-Type: text/html
                    Date: Fri, 16 Oct 2015 12:40:45 GMT
                    Server: nginx/1.5.7
                    X-Powered-By: PHP/5.2.17...
Title is Cyrillic. I have tried the variant below, but the result is the same.
$HttpContent = Invoke-WebRequest -URI $SiteAdress -ContentType "text/html; charset=windows-1251"
The -ContentType parameter to Invoke-WebRequest sets the content type for the request, not the response. Since you don't send any content with your request, it's quite irrelevant here.
I didn't find an easy way of enforcing a particular encoding for the response. Since the encoding is only specified within the HTML, and not in the response header, there's little you can do here, I fear, as Invoke-WebRequest isn't smart enough to figure that out on its own.
You can, however, convert the text you read:
filter Convert-Encoding {
    $1251 = [System.Text.Encoding]::GetEncoding(1251)
    $1251.GetString([System.Text.Encoding]::Default.GetBytes($_))
}
$HttpContent.Content | Convert-Encoding
will then yield the proper Cyrillic text.
<!DOCTYPE html>
<html><!-- #BeginTemplate "/Templates/panorama.dwt" --><!-- DW6 -->
<head>
<!-- #BeginEditable "doctitle" -->
<title>ГИС ПАНОРАМА - Скачать программы</title>
<meta name="keywords" content="ГИС, карта, геодезия, картография, фотограмметрия, топография, электронная карта, классификатор, трехмерное моделирование, модель местности, карта Москвы, Ногинск, кадастр, межевое дело, Гаусс, эллипсоид Красовского, 1942, оротофотоснимок, WGS, растр, план, схема, бланковка, фотодокумент, земля, право, документация, map, sit, mtw, mtr, rsw, rsc, s57, s52, gis, 2003, 2004, Tool, Kit">
<meta name="description" content="Новые версии ГИС Карта 2000, GIS ToolKit , СУРЗ Земля и Право, документации, библиотек и примеров электронных карт">
<!-- #EndEditable -->
In any case, you need to know the exact encoding beforehand, regardless of how you solve it. You could try finding it in the HTML source, though:
[Regex]::Matches($HttpContent.Content, 'text/html;\s*charset=(?<encoding>[0-9a-z-]+)')
[System.Text.Encoding]::GetEncoding can cope with a string like windows-1251, at least.
My working variant:
$client = New-Object System.Net.WebClient
$url = "http://www.gisinfo.ru/download/download.htm"
$results = [System.Text.Encoding]::GetEncoding('windows-1251').GetString([Byte[]]$client.DownloadData($url))
Thanks to Joey for the help.
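Both variants rest on the same idea: the bytes on the wire are Windows-1251, so they have to be decoded as Windows-1251 at some point. The Convert-Encoding filter repairs text that was already decoded with the wrong encoding by turning it back into bytes first. A minimal Python sketch of that round trip, assuming the wrong decode was ISO-8859-1 (an assumption that matches the garbled title in the question; the function name fix_mojibake is made up for the example):

```python
def fix_mojibake(text, wrong="latin-1", right="cp1251"):
    """Undo a wrong decode: recover the raw bytes, then decode them properly."""
    return text.encode(wrong).decode(right)


garbled = "ÃÈÑ ÏÀÍÎÐÀÌÀ"
print(fix_mojibake(garbled))  # ГИС ПАНОРАМА
```

Downloading the raw bytes and decoding them once, as the working variant does, avoids the round trip entirely and does not depend on which wrong encoding was applied in between.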
Trouble understanding HTML::Element documentation in Perl
I was looking into the HTML::Element documentation and came across the attr_get_i method, which, according to the documentation:
In list context, returns a list consisting of the values of the given attribute for $h and for all its ancestors starting from $h and working its way up.
Now, according to the example given there:
<html lang='i-klingon'>
<head><title>Pati Pata</title></head>
<body>
<h1 lang='la'>Stuff</h1>
<p lang='es-MX' align='center'>
Foo bar baz <cite>Quux</cite>.
</p>
<p>Hooboy.</p>
</body>
</html>
If $h is the <cite> element, $h->attr_get_i("lang") in list context will return the list ('es-MX', 'i-klingon').
According to my understanding, the returned list should be ('es-MX', 'la', 'i-klingon'); that is, it should also consider <h1 lang='la'>Stuff</h1>, but according to the documentation it doesn't. Why am I wrong here?
The 'lang' attributes here are:
+-------------+------------------+
| lang        | path             |
+-------------+------------------+
| i-klingon   | /html            |
| la          | /html/body/h1    |
| es-MX       | /html/body/p     |
+-------------+------------------+
The <cite> node does not have <h1> as its parent (path is /html/body/p/cite), so <h1> is not its ancestor. This is why the method does not return it.
<h1 lang='la'>Stuff</h1> is not an ancestor of <cite>; it is a sibling of <cite>'s parent <p>.
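The "working its way up" part of the documentation can be pictured as a walk along parent links only. The following Python sketch uses a hypothetical Node class and an inherited_attr helper that merely mimic the documented list-context behaviour; it is not how HTML::Element is implemented.

```python
class Node:
    """A bare-bones tree node: tag name, parent link, and attributes."""

    def __init__(self, tag, parent=None, **attrs):
        self.tag = tag
        self.parent = parent
        self.attrs = attrs


def inherited_attr(node, name):
    """Collect `name` from node and each of its ancestors, nearest first."""
    values = []
    while node is not None:
        if name in node.attrs:
            values.append(node.attrs[name])
        node = node.parent  # only ever moves upward, never sideways
    return values


# The tree from the documentation example.
html_el = Node("html", lang="i-klingon")
body = Node("body", parent=html_el)
h1 = Node("h1", parent=body, lang="la")
p = Node("p", parent=body, lang="es-MX", align="center")
cite = Node("cite", parent=p)

print(inherited_attr(cite, "lang"))  # ['es-MX', 'i-klingon']
```

The walk from cite visits p, body and html; h1 hangs off body as a separate child, so it is never on the path.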