Transforms HTML documents into Thot Abstract Trees
- It reads tags and builds the Abstract Tree step by step
- It is driven by an internal table and uses the Thot API
- It dynamically corrects structure errors:
  => adds missing elements (for example, a <Head> when <Title> appears without one)
  => moves misplaced elements
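
The behaviour listed above can be illustrated with a minimal sketch. It is not the actual parser and does not use the Thot API: it relies on Python's standard html.parser, and the names REQUIRED_PARENT, Node, TreeBuilder and parse are hypothetical, chosen only for this illustration. The table is assumed to record, for each tag, the element type that must contain it.

```python
from html.parser import HTMLParser

# Hypothetical table (an assumption, not the real one):
# tag name -> element type that must contain it.
REQUIRED_PARENT = {
    "head": "html",
    "body": "html",
    "title": "head",
    "p": "body",
}


class Node:
    """One element of the simplified tree."""

    def __init__(self, kind, parent=None):
        self.kind = kind
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def outline(self):
        """Return the subtree as nested tuples, for easy inspection."""
        return (self.kind, [child.outline() for child in self.children])


class TreeBuilder(HTMLParser):
    """Builds the tree tag by tag, driven by REQUIRED_PARENT."""

    def __init__(self):
        super().__init__()
        self.root = Node("html")
        self.current = self.root

    def _find_open(self, kind):
        # Walk up from the current element looking for an open `kind`.
        node = self.current
        while node is not None and node.kind != kind:
            node = node.parent
        return node

    def _ensure(self, kind):
        # Return an element of type `kind`, creating it if it is missing.
        if kind == self.root.kind:
            return self.root
        open_node = self._find_open(kind)
        if open_node is not None:
            return open_node
        container = self._ensure(REQUIRED_PARENT[kind])
        for child in container.children:
            if child.kind == kind:
                return child  # reuse it: a misplaced element moves here
        return Node(kind, container)  # add the missing element

    def handle_starttag(self, tag, attrs):
        if tag == self.root.kind:
            return  # the root already exists
        required = REQUIRED_PARENT.get(tag)
        # Tags absent from the table are attached where the parser stands.
        parent = self.current if required is None else self._ensure(required)
        self.current = Node(tag, parent)

    def handle_endtag(self, tag):
        node = self._find_open(tag)
        if node is not None and node.parent is not None:
            self.current = node.parent


def parse(html):
    """Parse `html` and return the tree outline (text content is ignored)."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root.outline()
```

With this sketch, parse("<title>T</title><p>x</p>") returns ("html", [("head", [("title", [])]), ("body", [("p", [])])]): the missing head and body elements are added. A title found inside the body is attached to the existing head instead. Supporting another element here only requires a new row in REQUIRED_PARENT.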
How to extend?
- Change the external table
- Update the parser code according to the logical structure