From tilh%sin-co.sin-ro.DHL.COM@sinco.sin-co.sin-ro.DHL.COM Mon Jul 17 03:24:23 1995
Date: Mon, 17 Jul 1995 09:22:32 +0800
From: Ti Lian Hwang
Subject: Enhanced html_parser

Here's an enhanced html_parser to add to your collection of HTML Converters. It rationalises conversions to various other printer formats.

I am attempting to standardise reports to HTML format, from which they can be converted to other formats, e.g. for printers. Unfortunately, most of the converters convert to PostScript. Faced with non-PostScript printers, like the HP LaserJet, I was searching for a tool to do the job.

I'm using the html_parser programmes by Jim Davis of dri.cornell.edu and decided that enhancing them to work with HP-PCL would be worth the while.

Included below are my efforts. They are perl scripts running under perl 5.0. Minimal changes have been made to the programmes. They are all in parse-html.pl.

The strategy is this: parse-html will get the TERM environment variable and 'require' the relevant file, 'h2a_$TERM.pl'. This file should contain the following functions:

    html_begin_doc ()
    html_end_doc ()
    begin_font ($element,$tag)
    end_font ($element,$tag)

The routines should do whatever font changes are necessary for the HTML tag.

If 'h2a_$TERM.pl' is not available, the file 'dummy.pl' will be used. I've included a 'h2a_pcl.pl' file for HP LaserJets.

I've included the calls to begin_font and end_font in html_begin and html_end respectively.

Noted 2 errors in the original distribution:

1) the 'U' attribute is not catered for - I've included it.
2) there is an additional html_begin_doc in 'html-ascii.pl'. I've removed it, and left the code there; not elegant, but it works!

I would like to have the 'CENTER' attribute working (the problem is I haven't figured out how to integrate it into the code yet).

Regards,
email : tilh@sin-co.sin-ro.dhl.com
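
For readers who want to see the dispatch strategy described in the message, here is a minimal sketch in Perl. It is not the code from parse-html.pl or h2a_pcl.pl. The helper names handler_file and write_file, the temporary directory, and the contents of both handler files are assumptions made for illustration. The handlers here return strings rather than printing them, only so the result is easy to inspect. The escape sequences are standard PCL 5 commands for bold, italic, underline and printer reset, and may differ from what the real h2a_pcl.pl emits.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: pick 'h2a_<term>.pl' in $dir if it exists,
# otherwise fall back to 'dummy.pl', as the message describes.
sub handler_file {
    my ($term, $dir) = @_;
    if (defined $term && $term =~ /^\w+$/) {
        my $candidate = File::Spec->catfile($dir, "h2a_$term.pl");
        return $candidate if -e $candidate;
    }
    return File::Spec->catfile($dir, 'dummy.pl');
}

# Hypothetical helper: write $text to $path.
sub write_file {
    my ($path, $text) = @_;
    open my $fh, '>', $path or die "cannot write $path: $!";
    print {$fh} $text;
    close $fh or die "cannot close $path: $!";
}

# Demo setup: create two stand-in handler files in a scratch directory.
my $dir = tempdir(CLEANUP => 1);

# Assumed contents of a PCL handler (not the author's actual file).
write_file(File::Spec->catfile($dir, 'h2a_pcl.pl'), <<'EOF');
my %on  = (B => "\e(s3B", I => "\e(s1S", U => "\e&d0D");
my %off = (B => "\e(s0B", I => "\e(s0S", U => "\e&d\@");
sub html_begin_doc { return "\eE" }    # printer reset
sub html_end_doc   { return "\eE" }
sub begin_font { my ($element, $tag) = @_; return $on{uc $element}  // '' }
sub end_font   { my ($element, $tag) = @_; return $off{uc $element} // '' }
1;
EOF

# Assumed contents of the fallback: every routine does nothing.
write_file(File::Spec->catfile($dir, 'dummy.pl'), <<'EOF');
sub html_begin_doc { return '' }
sub html_end_doc   { return '' }
sub begin_font     { return '' }
sub end_font       { return '' }
1;
EOF

# The real script would pass $ENV{TERM}; 'pcl' is fixed here for the demo.
my $file = handler_file('pcl', $dir);
require $file;

# Wrap some bold text in the font-change sequences from the handler.
my $out = html_begin_doc() . begin_font('B', '') . 'Report'
        . end_font('B', '') . html_end_doc();
printf "loaded %s, produced %d bytes\n", $file, length $out;
```

Running with a TERM value that has no matching file (for example 'vt100') makes handler_file return the dummy.pl path, so the same calls produce no font changes.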