Search and print out information from .html files in PowerShell
I want to use PowerShell to search .html documents for specific strings and print them out.
Let me first explain how my function works: I use it to search the .html documents in a path for those that contain the string "tag". After that I search for the string "id:", skip the tags "</td><td>", and use the following regular expression to print out the next 32 characters, which are the ID. Below you can see part of the HTML file and the function.
<tr valign=top><td>lokation:</td><td>\test1\blabla\asdf\1234\ws auswertungen</td></tr>
<tr valign=top><td>beschreibung:</td><td></td></tr>
<tr valign=top><td>eigentümer:</td><td><img align=middle src="file:///c:\users\d0262290\appdata\local\temp\23\user.bmp"> wilmes, tanja</td></tr>
<tr valign=top><td>id:</td><td>55c7b7f411e2661e001000806c38eba0</td></tr>
</table></td><td><img align=middle src="file:///c:\users\d0262290\appdata\local\temp\23\user.bmp">
The function:
function searchstringid {
    Get-ChildItem -Path c:\users\blub\lala\dokus -Filter *.html |
        Select-String -Pattern "tag" |
        Select-Object Path |
        Get-ChildItem |
        ForEach-Object {
            # Strip all tags and whitespace, then look for "id:" followed by 32 characters
            if ((Get-Content -Raw -Path $_.FullName) -replace "<.*?>|\s" -match "(?s)id:(?<id>[a-z0-9]{32})") {
                printtooutputlog
            }
        }
}
This all works fine.
Now I need to check two more pieces of information, and I can't figure out which regular expressions I have to use because the values have no fixed length. I still have to check for the string "tag" in both problems below.
My first problem: I need the location of the file, so I have to search for the string "lokation:" (you can see it in the HTML posted above). To get this information I have to skip the tags </td><td> again and use a regular expression for the location. The problem is that I have no idea how to handle the non-fixed number of characters. Is there a way to print out all characters between "lokation:</td><td>" and "</td></tr>"? The tags are the same in the other HTML files, so I only need a solution that works for this example.
My second problem: I have to read out the object's name. In the HTML document it is stored in a comment. The object's name begins after "[object:" and ends with "]". Here again, I can't figure out which expression to use. The special characters in the example object name below are the ones that can occur.
<!-- ################################################################## -->
<!-- # [object: name bla bla/ bla_bla 1 22:34] # -->
<!-- ################################################################## -->
I would be thankful if someone could help me. Every hint is useful because my brain is stuck here. Thanks and cheers.
OK, this one gets the contents of each file and runs each line through a switch that matches it against three regular expressions. It worked for me against your sample data. It assigns each match to a variable, one for each of the three things you are looking for, and outputs an object for each file.
function searchstringid {
    Get-ChildItem -Path c:\users\blub\lala\dokus -Filter *.html |
        Select-String -Pattern "tag" |
        Select-Object Path |
        Get-ChildItem |
        ForEach-Object {
            # Reset per file so a missing match does not reuse the previous file's value
            $id = $location = $object = $null
            switch -Regex (Get-Content -Path $_.FullName) {
                "((?<=id:.+?)[a-z0-9]{32})" { $id = $Matches[1] }
                "lokation:.+?>(\\[^<]+)"    { $location = $Matches[1] }
                "object: ?([^\]]+)"         { $object = $Matches[1] }
            }
            [pscustomobject][ordered]@{
                'id'       = $id
                'location' = $location
                'name'     = $object
            }
        }
}
So assign the output to a variable and you have an array of results to do with as you please (output to CSV? Sure! Display on screen as a table? Can do! Email it to the entire company? Um, yeah, I wouldn't recommend that.)
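As a rough sketch of the first two options: normally you would write `$results = searchstringid`, but so that this runs on its own, a hand-built object with the values from your sample stands in for the function's output. The `results.csv` file name and the temp-directory location are just placeholders I picked.

```powershell
# Stand-in for: $results = searchstringid
# (one hand-built object so the snippet runs without the HTML files)
$results = @(
    [pscustomobject]@{
        id       = '55c7b7f411e2661e001000806c38eba0'
        location = '\test1\blabla\asdf\1234\ws auswertungen'
        name     = 'name bla bla/ bla_bla 1 22:34'
    }
)

# Write the results to a CSV file (placeholder path in the temp directory).
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'results.csv'
$results | Export-Csv -Path $csvPath -NoTypeInformation

# Or show them on screen as a table.
$results | Format-Table -AutoSize
```

`-NoTypeInformation` keeps the type header line out of the CSV on Windows PowerShell 5.1; on newer versions it is already the default.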
Here's what the function gave me when I ran it against your sample:
id                               location                                name
--                               --------                                ----
55c7b7f411e2661e001000806c38eba0 \test1\blabla\asdf\1234\ws auswertungen name bla bla/ bla_bla 1 22:34
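On the "no fixed length" part of the question: the trick in both patterns is a negated character class, which keeps consuming characters until it hits the delimiter, however long the value is. Here is a small sketch that tries the two patterns on their own with `-match`; the two sample lines are copied from the question and the variable names are only for illustration.

```powershell
# Sample lines copied from the question's HTML.
$locationLine = '<tr valign=top><td>lokation:</td><td>\test1\blabla\asdf\1234\ws auswertungen</td></tr>'
$commentLine  = '<!-- # [object: name bla bla/ bla_bla 1 22:34] # -->'

# [^<]+ consumes everything up to the next "<", so the length does not matter.
if ($locationLine -match 'lokation:.+?>(\\[^<]+)') {
    $location = $Matches[1]
}

# [^\]]+ consumes everything up to the closing "]".
if ($commentLine -match 'object: ?([^\]]+)') {
    $object = $Matches[1]
}

$location   # \test1\blabla\asdf\1234\ws auswertungen
$object     # name bla bla/ bla_bla 1 22:34
```

Note that the location pattern expects the value to start with a backslash, as it does in the sample. If some locations start differently, that leading `\\` would have to be relaxed.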