No reasonably sized keyboard could possibly include all the characters in Unicode. U.S. keyboards are especially weak when it comes to typing in foreign languages with unusual accents and non-Latin scripts. XML allows you to use either character references or entity references to address this problem. In general, named entity references like &Ecaron; should be preferred to character references like &#x11B; because they're easier on the human beings who have to read the source code.
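As a minimal sketch of the difference, the following document writes the same character, Ě (U+011A), both ways. The element names and the internal DTD subset are hypothetical, chosen only for illustration; the entity name Ecaron follows the ISO Latin 2 entity set. The named reference works only because the entity is declared, whereas the character reference needs no declaration at all.

```xml
<?xml version="1.0"?>
<!DOCTYPE sample [
  <!-- Hypothetical declaration for illustration: maps the name Ecaron to U+011A -->
  <!ENTITY Ecaron "&#x11A;">
]>
<sample>
  <!-- Named entity reference: readable, but requires the declaration above -->
  <named>&Ecaron;</named>
  <!-- Character reference: always available, but opaque to human readers -->
  <numeric>&#x11A;</numeric>
</sample>
```

A parser that reads the internal DTD subset should report identical content for both elements.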

While there are no standards for how to name these entity references, there are some useful entity sets bundled with XHTML, DocBook, and MathML. Since these are all modular specifications, you can even use the DTDs that define their entity sets without pulling in the rest of the application. For example, if you just want to use the standard HTML entity references that many designers have already memorized, like &copy; and &nbsp;, you could add the following lines to your DTD.

<!ENTITY nbsp   "&#160;">
<!ENTITY iexcl  "&#161;">
<!ENTITY cent   "&#162;">
<!ENTITY pound  "&#163;">
<!ENTITY curren "&#164;">
<!ENTITY yen    "&#165;">
<!ENTITY brvbar "&#166;">
<!ENTITY sect   "&#167;">
<!ENTITY uml    "&#168;">
<!ENTITY copy   "&#169;">
...

Better yet, you could store local copies of the relevant DTDs in the same directory as your own DTD and just point to them.

<!ENTITY % HTMLlat1 PUBLIC
   "-//W3C//ENTITIES Latin 1 for XHTML//EN"
   "xhtml-lat1.ent">
%HTMLlat1;

<!ENTITY % HTMLsymbol PUBLIC
   "-//W3C//ENTITIES Symbols for XHTML//EN"
   "xhtml-symbol.ent">
%HTMLsymbol;

<!ENTITY % HTMLspecial PUBLIC
   "-//W3C//ENTITIES Special for XHTML//EN"
   "xhtml-special.ent">
%HTMLspecial;

If you're using catalogs, you can use the public IDs to locate the local caches of these entity sets.
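For instance, an OASIS XML Catalog along the following lines could map the three public IDs to local files. This is only a sketch: the entities/ directory is a hypothetical location, and how the catalog gets registered depends on the parser or resolver you use.

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Each entry maps a public ID to a local copy.
       The entities/ path is an assumption made for illustration. -->
  <public publicId="-//W3C//ENTITIES Latin 1 for XHTML//EN"
          uri="entities/xhtml-lat1.ent"/>
  <public publicId="-//W3C//ENTITIES Symbols for XHTML//EN"
          uri="entities/xhtml-symbol.ent"/>
  <public publicId="-//W3C//ENTITIES Special for XHTML//EN"
          uri="entities/xhtml-special.ent"/>
</catalog>
```

With a catalog like this in place, a catalog-aware resolver can use the public ID to find the local file instead of relying on the system identifier alone.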

Even if you're defining your own entity references for a particular subset of Unicode, I still suggest using the standard names. The HTML names are far and away the most popular, so if there's an HTML name for a certain character, by all means use it. For example, I would never call Unicode character 0xA0, the nonbreaking space, anything other than &nbsp;.