From comp.editors Mon Apr 1 04:22:27 1996
Path: fu-berlin.de!news.belwue.de!news.uni-ulm.de!rz.uni-karlsruhe.de!blackbush.xlink.net!tank.news.pipex.net!pipex!news.mathworks.com!newsfeed.internetmci.com!news.sprintlink.net!new-news.sprintlink.net!news.cc.sunysb.edu!usenet
From: gene@calph.physics.sunysb.edu (Eugene Tyurin)
Newsgroups: comp.editors
Subject: Re: HTML REMOVER
Date: 29 Mar 1996 09:10:16 -0500
Organization: Institute for Theoretical Physics, Stony Brook University
Lines: 21
Sender: gene@calph.physics.sunysb.edu
Distribution: inet
Message-ID:
References: <4jd0ea$6ak@zeke.ebtech.net>
NNTP-Posting-Host: calph.physics.sunysb.edu
In-reply-to: blefebvr@sleepy.ebtech.net's message of 28 Mar 1996 03:18:34 GMT
X-Newsreader: Gnus v5.0.15

Don't know how much this will help you under MS-DOG, but here's a
little dirty sed script I wrote (works fine for my purposes):

#
# html2txt.sed -- Strip HTML commands.
# Copyleft (Cl) Eugene Tyurin
# Last modified: Wed Feb 21 17:20:41 EST 1996
#
# Assumes that a single HTML command doesn't take more than 2 lines.
#
s/<[^<]*>/ /g
// /g
}

--
Eugene Tyurin - PhD student at ITP, Stony Brook U.
http://www.physics.sunysb.edu/~gene/
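
The script above says it assumes a tag spans at most two lines, but the lines that would handle a split tag are not legible here. One way that approach could be written is sketched below: strip the tags that close on the current line, and if a tag is still left open at the end of the line, pull in one more line and strip again. This is an illustrative reconstruction under assumptions, not the author's original text. The function name strip_html, the $!N guard, and the [^<>] bracket expression are choices made for this sketch, and it assumes a POSIX-style sed.

```sh
# Hypothetical wrapper: reads HTML on stdin, writes text on stdout.
strip_html() {
  sed '
# Replace every tag that opens and closes on this line with a space.
s/<[^<>]*>/ /g
# If a "<" is still open at end of line, the tag continues on the next line.
/<[^<>]*$/ {
# Append the next line, unless this is already the last line of input.
$!N
# Strip again; this now catches the tag that spanned the line break.
s/<[^<>]*>/ /g
}
'
}
```

It could be used as `strip_html < page.html > page.txt` (file names made up for the example). As with the original, a tag that runs over three or more lines is not removed, and a literal "<" in running text at the end of a line will cause the next line to be joined to it.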