String Identity Matching Choices
1. Which representations to treat as equivalent (and which not)
2. Which components in the WWW architecture to make responsible for equivalences:
   2.1 Each individual component that performs a string identity check has to take
       equivalences into account (Late Normalization)
   2.2 Duplicates and ambiguities are removed as close to their source as possible
       (Early Normalization)
3. Which way to normalize (in the case that early normalization (2.2) is needed,
   even if only in some cases)
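
The difference between late and early normalization can be illustrated with a minimal sketch. It assumes Unicode canonical equivalence as the notion of "equivalent" and NFC as the normalization form; neither choice is fixed by the outline above. The helper names `late_match` and `early_normalize` are hypothetical and exist only for illustration.

```python
import unicodedata

# Two representations of the same letter, e with acute accent:
precomposed = "\u00e9"    # single code point U+00E9
decomposed = "e\u0301"    # "e" followed by combining acute accent U+0301


def late_match(a: str, b: str, form: str = "NFC") -> bool:
    """Late normalization: the component doing the identity check
    normalizes both operands itself, on every comparison."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


def early_normalize(text: str, form: str = "NFC") -> str:
    """Early normalization: text is normalized once, close to its source,
    so that later components can compare strings code point by code point."""
    return unicodedata.normalize(form, text)


# A naive comparison treats the two representations as different strings.
raw_equal = precomposed == decomposed

# Late normalization: the matcher takes the equivalence into account.
late_equal = late_match(precomposed, decomposed)

# Early normalization: both strings are normalized at the source,
# after which a plain comparison is sufficient.
stored_a = early_normalize(precomposed)
stored_b = early_normalize(decomposed)
early_equal = stored_a == stored_b
```

In this sketch `raw_equal` is false while `late_equal` and `early_equal` are both true. With late normalization every comparing component pays the normalization cost and must agree on the same form; with early normalization that agreement is needed only where text is produced.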