Written for Python 2.7. Enjoy! Licensed under GPL v3+.
Code:
import logging
import re
import string

import lxml.html


def qg_html_to_ascii(qg_html_text):
    """Convert and return QualysGuard's quasi HTML text to ASCII text."""
    text = qg_html_text
    # Handle tagged line breaks (<p>, <br>)
    text = re.sub(r'(?i)<br>[ ]*', '\n', text)
    text = re.sub(r'(?i)<p>[ ]*', '\n', text)
    # Remove consecutive line breaks
    text = re.sub(r"^\s+", "", text, flags=re.MULTILINE)
    # Remove empty lines at the end. The replacement is an empty string:
    # a '$' in the replacement would be inserted as a literal dollar sign.
    text = re.sub('[\n]+$', '', text)
    # Store anchor tags href attribute
    links = list(lxml.html.iterlinks(text))
    # Remove anchor tags
    html_element = lxml.html.fromstring(text)
    # Convert anchor tags to "link_text (link: link_url )".
    logging.debug('Converting anchor tags...')
    text = html_element.text_content().encode('ascii', 'ignore')
    # Convert each link.
    for l in links:
        # Find and replace each link.
        link_text = l[0].text_content().encode('ascii', 'ignore').strip()
        link_url = l[2].strip()
        # Replacing link_text
        if link_text != link_url:
            # Link text is different, most likely a description.
            text = string.replace(text, link_text, '%s (link: %s )' % (link_text, link_url))
        else:
            # Link text is the same as the href. No need to duplicate link.
            text = string.replace(text, link_text, '%s' % (link_url))
    logging.debug('Done.')
    return text
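The line-break handling at the top of the function can be tried on its own with just the standard library. The sketch below is only an illustration, not part of the original function: the helper name normalize_breaks and the sample string are made up. It uses an empty replacement for the trailing newlines, since a `$` in a replacement string is a literal dollar sign rather than an anchor.

```python
import re


def normalize_breaks(text):
    """Hypothetical helper: only the <br>/<p> and blank-line steps."""
    # Turn <br> and <p> tags (any case), plus the spaces after them,
    # into newlines.
    text = re.sub(r'(?i)<br>[ ]*', '\n', text)
    text = re.sub(r'(?i)<p>[ ]*', '\n', text)
    # Drop leading whitespace on every line; because \s also matches '\n',
    # this collapses runs of blank lines as well.
    text = re.sub(r'^\s+', '', text, flags=re.MULTILINE)
    # Strip the newlines left at the very end.
    return re.sub(r'\n+$', '', text)


# Made-up sample in the same style as the QualysGuard text.
sample = '<P>Patch:<BR> First line<P> <p>Second line<br>'
print(normalize_breaks(sample))
```

The three input lines come out separated by single newlines, with no blank line at the start or the end.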
Before:
<P>Patch:<BR>
Following are links for downloading patches to fix the vulnerabilities: <P>
<A HREF="http://www.microsoft.com/downloads/details.aspx?familyid=E22EB3AE-1295-4FE2-9775-6F43C5C2AED3" TARGET="_blank">MS08-067: Microsoft Windows 2000 Service Pack 4</A><P>
<A HREF="http://www.microsoft.com/downloads/details.aspx?familyid=0D5F9B6E-9265-44B9-A376-2067B73D6A03" TARGET="_blank">MS08-067: Windows XP Service Pack 2</A><P>
<A HREF="http://www.microsoft.com/downloads/details.aspx?familyid=0D5F9B6E-9265-44B9-A376-2067B73D6A03" TARGET="_blank">MS08-067: Windows XP Service Pack 3</A><P>
<A HREF="http://www.microsoft.com/downloads/details.aspx?familyid=4C16A372-7BF8-4571-B982-DAC6B2992B25" TARGET="_blank">MS08-067: Windows XP Professional x64 Edition</A><P>
<A HREF="http://www.microsoft.com/downloads/details.aspx?familyid=4C16A372-7BF8-4571-B982-DAC6B2992B25" TARGET="_blank">MS08-067: Windows XP Professional x64 Edition Service Pack 2</A><P>
<A HREF="http://www.microsoft.com/downloads/details.aspx?familyid=F26D395D-2459-4E40-8C92-3DE1C52C390D" TARGET="_blank">MS08-067: Windows Server 2003 Service Pack 1</A><P>
<A HREF="http://www.microsoft.com/downloads/details.aspx?familyid=F26D395D-2459-4E40-8C92-3DE1C52C390D" TARGET="_blank">MS08-067: Windows Server 2003 Service Pack 2</A><P>
<A HREF="http://www.microsoft.com/downloads/details.aspx?familyid=C04D2AFB-F9D0-4E42-9E1F-4B944A2DE400" TARGET="_blank">MS08-067: Windows Server 2003 x64 Edition</A><P>
<A HREF="http://www.microsoft.com/downloads/details.aspx?familyid=C04D2AFB-F9D0-4E42-9E1F-4B944A2DE400" TARGET="_blank">MS08-067: Windows Server 2003 x64 Edition Service Pack 2</A><P>
<A HREF="http://www.microsoft.com/downloads/details.aspx?familyid=AB590756-F11F-43C9-9DCC-A85A43077ACF" TARGET="_blank">MS08-067: Windows Server 2003 with SP1 for Itanium-based Systems</A><P>
<A HREF="http://www.microsoft.com/downloads/details.aspx?familyid=AB590756-F11F-43C9-9DCC-A85A43077ACF" TARGET="_blank">MS08-067: Windows Server 2003 with SP2 for Itanium-based Systems</A><P>
<A HREF="http://www.microsoft.com/downloads/details.aspx?familyid=18FDFF67-C723-42BD-AC5C-CAC7D8713B21" TARGET="_blank">MS08-067: Windows Vista and Windows Vista Service Pack 1</A><P>
<A HREF="http://www.microsoft.com/downloads/details.aspx?familyid=A976999D-264F-4E6A-9BD6-3AD9D214A4BD" TARGET="_blank">MS08-067: Windows Vista x64 Edition and Windows Vista x64 Edition Service Pack 1</A><P>
<A HREF="http://www.microsoft.com/downloads/details.aspx?familyid=25C17B07-1EFE-43D7-9B01-3DFDF1CE0BD7" TARGET="_blank">MS08-067: Windows Server 2008 for 32-bit Systems</A><P>
<A HREF="http://www.microsoft.com/downloads/details.aspx?familyid=7B12018E-0CC1-4136-A68C-BE4E1633C8DF" TARGET="_blank">MS08-067: Windows Server 2008 for x64-based Systems</A><P>
<A HREF="http://www.microsoft.com/downloads/details.aspx?familyid=2BCF89EF-6446-406C-9C53-222E0F0BAF7A" TARGET="_blank">MS08-067: Windows Server 2008 for Itanium-based Systems</A>
<P>Virtual Patches:<BR>
<A HREF="http://www.trendmicro.com/vulnerabilitycontrols "TARGET="_blank">Trend Micro Virtual Patching</A><BR>
Virtual Patch #1002975: Server Service Vulnerability (wkssvc)<BR>
Virtual Patch #1003080: Server Service Vulnerability (srvsvc)<BR>
Virtual Patch #1003292: Block Conficker.B++ Worm Incoming Named Pipe Connection<BR>
Virtual Patch #1003293: Block Conficker.B++ Worm Outgoing Named Pipe Connection<BR>
After:
Patch:
Following are links for downloading patches to fix the vulnerabilities:
MS08-067: Microsoft Windows 2000 Service Pack 4 (link: http://www.microsoft.com/downloads/details.aspx?familyid=E22EB3AE-1295-4FE2-9775-6F43C5C2AED3 )
MS08-067: Windows XP Service Pack 2 (link: http://www.microsoft.com/downloads/details.aspx?familyid=0D5F9B6E-9265-44B9-A376-2067B73D6A03 )
MS08-067: Windows XP Service Pack 3 (link: http://www.microsoft.com/downloads/details.aspx?familyid=0D5F9B6E-9265-44B9-A376-2067B73D6A03 )
MS08-067: Windows XP Professional x64 Edition (link: http://www.microsoft.com/downloads/details.aspx?familyid=4C16A372-7BF8-4571-B982-DAC6B2992B25 )
MS08-067: Windows XP Professional x64 Edition (link: http://www.microsoft.com/downloads/details.aspx?familyid=4C16A372-7BF8-4571-B982-DAC6B2992B25 ) Service Pack 2
MS08-067: Windows Server 2003 Service Pack 1 (link: http://www.microsoft.com/downloads/details.aspx?familyid=F26D395D-2459-4E40-8C92-3DE1C52C390D )
MS08-067: Windows Server 2003 Service Pack 2 (link: http://www.microsoft.com/downloads/details.aspx?familyid=F26D395D-2459-4E40-8C92-3DE1C52C390D )
MS08-067: Windows Server 2003 x64 Edition (link: http://www.microsoft.com/downloads/details.aspx?familyid=C04D2AFB-F9D0-4E42-9E1F-4B944A2DE400 )
MS08-067: Windows Server 2003 x64 Edition (link: http://www.microsoft.com/downloads/details.aspx?familyid=C04D2AFB-F9D0-4E42-9E1F-4B944A2DE400 ) Service Pack 2
MS08-067: Windows Server 2003 with SP1 for Itanium-based Systems (link: http://www.microsoft.com/downloads/details.aspx?familyid=AB590756-F11F-43C9-9DCC-A85A43077ACF )
MS08-067: Windows Server 2003 with SP2 for Itanium-based Systems (link: http://www.microsoft.com/downloads/details.aspx?familyid=AB590756-F11F-43C9-9DCC-A85A43077ACF )
MS08-067: Windows Vista and Windows Vista Service Pack 1 (link: http://www.microsoft.com/downloads/details.aspx?familyid=18FDFF67-C723-42BD-AC5C-CAC7D8713B21 )
MS08-067: Windows Vista x64 Edition and Windows Vista x64 Edition Service Pack 1 (link: http://www.microsoft.com/downloads/details.aspx?familyid=A976999D-264F-4E6A-9BD6-3AD9D214A4BD )
MS08-067: Windows Server 2008 for 32-bit Systems (link: http://www.microsoft.com/downloads/details.aspx?familyid=25C17B07-1EFE-43D7-9B01-3DFDF1CE0BD7 )
MS08-067: Windows Server 2008 for x64-based Systems (link: http://www.microsoft.com/downloads/details.aspx?familyid=7B12018E-0CC1-4136-A68C-BE4E1633C8DF )
MS08-067: Windows Server 2008 for Itanium-based Systems (link: http://www.microsoft.com/downloads/details.aspx?familyid=2BCF89EF-6446-406C-9C53-222E0F0BAF7A )
Virtual Patches:
Trend Micro Virtual Patching (link: http://www.trendmicro.com/vulnerabilitycontrols )
Virtual Patch #1002975: Server Service Vulnerability (wkssvc)
Virtual Patch #1003080: Server Service Vulnerability (srvsvc)
Virtual Patch #1003292: Block Conficker.B++ Worm Incoming Named Pipe Connection
Virtual Patch #1003293: Block Conficker.B++ Worm Outgoing Named Pipe Connection
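Note that in the output above, "Service Pack 2" lands after the link for the two plain "x64 Edition" entries. The function replaces every occurrence of a link's text, and "MS08-067: Windows XP Professional x64 Edition" is also the start of a longer link text. One possible way around this, sketched below with only the standard library, is to rewrite each anchor where it stands, before any other tags are touched. This is an alternative under stated assumptions, not the method used in the function above: anchors_to_text and ANCHOR_RE are hypothetical names, the example.com URLs are placeholders, and the regex only handles anchors whose first attribute is a double-quoted HREF, as in the sample input.

```python
import re

# Assumption: anchors look like <A HREF="url" ...>text</A>, as in the sample.
ANCHOR_RE = re.compile(r'(?is)<a\s+href="([^"]*)"[^>]*>(.*?)</a>')


def anchors_to_text(html_text):
    """Hypothetical helper: replace each anchor with 'text (link: url )'."""
    def render(match):
        link_url = match.group(1).strip()
        link_text = match.group(2).strip()
        if link_text == link_url:
            # Link text is the same as the href; no need to duplicate it.
            return link_url
        return '%s (link: %s )' % (link_text, link_url)
    # Each anchor is rewritten at its own position, so a link text that is
    # a prefix of another link text is never replaced a second time.
    return ANCHOR_RE.sub(render, html_text)


# Placeholder input: the first link text is a prefix of the second.
sample = ('<A HREF="http://example.com/a" TARGET="_blank">Edition</A><P> '
          '<A HREF="http://example.com/a" TARGET="_blank">Edition Service Pack 2</A>')
print(anchors_to_text(sample))
```

The remaining <P> and <BR> tags are left in place by this helper, so the line-break steps would still need to run afterwards.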