A rather futile pair of programs, but they were easy to write (180 lines of C in total, including comments), and they illustrate an issue with XML...
These two simple programs transcode any file from ASCII to UTF-8, or vice versa. Round-trip conversion is ensured by the use of &#-escapes ("numeric character references") in the ASCII output. (Also a great way to create UTF-8 files if you have no UTF-8 editor...)
The programs don't understand XML, but they leave all ASCII characters alone, since those may have special roles in XML. That is, an ASCII character in the input is copied verbatim and never written as &#nnn;. Escaped ASCII characters (&#nnn;, where nnn <= 127) are left escaped for the same reason.
Actually, this is not enough. In XML, there are contexts in which non-ASCII characters cannot be written as &#-escapes. In particular, names of elements, attributes and entities cannot be encoded with &#-escapes.
For example, this XML file cannot be transcoded, because of the "é" that occurs in an entity name:
<!DOCTYPE xml SYSTEM "my.dtd">
<xml>&één;</xml>
Neither program accepts arguments. Just call them as:
xml2asc <file1 >file2
asc2xml <file2 >file1
The former reads a UTF-8 file and outputs it as ASCII, using &#-escapes for all characters that cannot be encoded in ASCII directly (Unicode codes >127).
The latter reads an ASCII file and writes it as UTF-8, expanding all &#-escapes for characters >127.
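The UTF-8-to-ASCII direction can be sketched as follows. This is not the code from xmlrecode.c: the function name utf8_to_ascii is hypothetical, it works on strings instead of stdin/stdout to keep the example short, and it assumes decimal escapes. Like the real program, it does no error checking.

```c
#include <stdio.h>

/* Hypothetical helper, not taken from xmlrecode.c.
 * Converts the NUL-terminated UTF-8 string `in` to ASCII in `out`,
 * writing a decimal &#nnn; escape for every code point > 127.
 * Assumes valid UTF-8 input; `out` must hold 4 * strlen(in) + 1 bytes. */
void utf8_to_ascii(const char *in, char *out)
{
    const unsigned char *p = (const unsigned char *)in;

    while (*p) {
        unsigned long cp;
        int extra;                      /* number of continuation bytes */

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else                          { cp = *p & 0x07; extra = 3; }
        p++;

        /* Each continuation byte contributes its low 6 bits. */
        while (extra-- > 0 && *p)
            cp = (cp << 6) | (*p++ & 0x3F);

        if (cp < 0x80)
            *out++ = (char)cp;          /* ASCII is copied verbatim */
        else
            out += sprintf(out, "&#%lu;", cp);
    }
    *out = '\0';
}
```

Under these assumptions, the two-byte sequence for "é" (0xC3 0xA9) comes out as &#233;, while an existing "&amp;" passes through unchanged because all of its characters are ASCII.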
Neither program does any error checking. If there are syntax errors in the &#-escapes or if file1 is not a proper UTF-8 file, results are undefined.
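The ASCII-to-UTF-8 direction can be sketched in the same spirit. Again, this is not the code from xmlrecode.c: the names put_utf8 and ascii_to_utf8 are hypothetical, the sketch works on strings rather than stdin/stdout, and it only recognizes decimal escapes (whether the real program also handles the hexadecimal &#x...; form is not stated here). Anything that does not look like a complete escape is simply copied.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: write code point `cp` as UTF-8 at `out`
 * and return a pointer just past the bytes written. */
char *put_utf8(char *out, unsigned long cp)
{
    if (cp < 0x80) {
        *out++ = (char)cp;
    } else if (cp < 0x800) {
        *out++ = (char)(0xC0 | (cp >> 6));
        *out++ = (char)(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        *out++ = (char)(0xE0 | (cp >> 12));
        *out++ = (char)(0x80 | ((cp >> 6) & 0x3F));
        *out++ = (char)(0x80 | (cp & 0x3F));
    } else {
        *out++ = (char)(0xF0 | (cp >> 18));
        *out++ = (char)(0x80 | ((cp >> 12) & 0x3F));
        *out++ = (char)(0x80 | ((cp >> 6) & 0x3F));
        *out++ = (char)(0x80 | (cp & 0x3F));
    }
    return out;
}

/* Hypothetical helper, not taken from xmlrecode.c.
 * Copies `in` to `out`, expanding decimal &#nnn; escapes with nnn > 127.
 * Escapes with nnn <= 127 stay escaped. The output is never longer than
 * the input, so `out` needs strlen(in) + 1 bytes. */
void ascii_to_utf8(const char *in, char *out)
{
    while (*in) {
        if (in[0] == '&' && in[1] == '#' && isdigit((unsigned char)in[2])) {
            char *end;
            unsigned long cp = strtoul(in + 2, &end, 10);

            if (*end == ';' && cp > 127) {
                out = put_utf8(out, cp);
                in = end + 1;           /* skip past the ';' */
                continue;
            }
        }
        *out++ = *in++;                 /* everything else is copied as is */
    }
    *out = '\0';
}
```

With this sketch, "&#233;" becomes the two bytes 0xC3 0xA9, whereas "&#38;" is left alone, since an unescaped "&" would change the meaning of the XML.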
Download the source, call it "xmlrecode.c" and compile it, then link the result to both xml2asc and asc2xml. (There is a single binary; what it does depends on the name with which it is invoked.)
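Choosing the behavior from the invocation name usually amounts to inspecting argv[0]. The following is a minimal sketch of that technique, not the actual code in xmlrecode.c; the enum, its constants and the function mode_from_name are hypothetical names, and how the real binary reacts to an unexpected name is not specified here.

```c
#include <string.h>

/* Hypothetical names for the two directions; not taken from xmlrecode.c. */
enum mode { MODE_TO_ASCII, MODE_TO_UTF8, MODE_UNKNOWN };

/* Decide what to do from the name the program was invoked with.
 * Any leading directory part (up to the last '/') is ignored. */
enum mode mode_from_name(const char *argv0)
{
    const char *base = strrchr(argv0, '/');

    base = base ? base + 1 : argv0;
    if (strcmp(base, "xml2asc") == 0)
        return MODE_TO_ASCII;
    if (strcmp(base, "asc2xml") == 0)
        return MODE_TO_UTF8;
    return MODE_UNKNOWN;                /* e.g. invoked as "xmlrecode" */
}
```

A main function would call mode_from_name(argv[0]) once and branch on the result. Because both names are hard links to the same file, only one copy of the binary exists on disk.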
There is also a Makefile, which consists of just these three lines:
all: xml2asc asc2xml
asc2xml: xmlrecode; ln $< $@
xml2asc: xmlrecode; ln $< $@