All posts by Olivier Thereaux

Character encoding in HTML

In this first issue of the cookbook for the web series, we look at character encodings, or "charsets": we discuss the ingredients, give a reliable recipe for detecting character encodings in (X)HTML, and offer a quick tip for web authors on an HTML diet.

Link test suite

Building a web spider or link checker isn't as simple as the number of existing implementations seems to suggest. There are lots of things to check, from the many HTML attributes that can carry links to the intricacies of HTTP's Content-Location header. To see a little more clearly in all this, here comes the mini "Link Test Suite".