"EXTRACT" "1" "15 Feb 2008" "4.x" "HTML-XML-utils"
extract - extract selected elements from an HTML or XML file
extract [ -h | -? ] [ -x ] [ -s text ] [ -e text ] [ -b base ] element-or-class [ -c configfile | file-or-URL ]
extract outputs all elements with a certain name and/or class.
Input must be well-formed, since no HTML heuristics are applied.
The following options are supported:
-x
Use XML format conventions.
-s text
Insert text at the start of the output.
-e text
Insert text at the end of the output.
-b base
URL base
-c configfile
Read @chapter lines from configfile (lines must be of the form "@chapter filename") and extract elements from each of those files.
-h, -?
Print command usage.
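The "@chapter filename" line format read by -c can be illustrated with a minimal Python sketch. This is not the tool's own code: the function name chapter_files is hypothetical, and skipping lines that do not start with "@chapter" is an assumption, since this page does not say how other lines are treated.

```python
def chapter_files(config_text):
    """Return the file names named on "@chapter filename" lines."""
    files = []
    for line in config_text.splitlines():
        # Split into the keyword and the rest of the line.
        parts = line.split(None, 1)
        # Assumption: lines that are not "@chapter ..." are ignored.
        if len(parts) == 2 and parts[0] == "@chapter":
            files.append(parts[1].strip())
    return files


if __name__ == "__main__":
    example = "@chapter intro.html\n@chapter syntax.html\n"
    for name in chapter_files(example):
        print(name)
```

Each name returned would then be processed as if it had been given as the file-or-URL operand.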
The following operands are supported:
element-or-class
The name of an element to extract (e.g., "H2"), or the name of a class
preceded by "." (e.g., ".example") or a combination of both (e.g.,
"H2.example").
file-or-URL
A file name or a URL. To read from standard input, use "-".
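The three forms of the element-or-class operand can be sketched as a small Python function. It only illustrates the syntax described above; the name split_selector is hypothetical, and details such as case sensitivity or operands containing more than one "." are not specified on this page and are not handled here.

```python
def split_selector(operand):
    """Split "H2", ".example" or "H2.example" into (element, class).

    A missing part is returned as None.
    """
    element, dot, cls = operand.partition(".")
    return (element or None, cls if dot else None)


if __name__ == "__main__":
    for operand in ("H2", ".example", "H2.example"):
        print(operand, split_selector(operand))
```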
xselect(1)