Latin I

Character sets:

 ISO 8859-1 Latin-1, plus general information
 ISO 8859-2, Latin Slavic and Central European
 ISO 8859-3, Maltese & Turkish
 ISO 8859-4, Baltic
 ISO 8859-5, Cyrillic
 ISO 8859-6, Arabic
 ISO 8859-7, Greek
 ISO 8859-8, Hebrew
 ISO 8859-9, Turkish
 ISO 8859-10, Nordic

Penn State has put together a nice short page of suggested CSS font styles for international websites. It can be found on their "Tips for Developing Non-English Web Sites" page.

Character sets and Language encoding

To write web pages in languages that use characters not found in standard 7-bit ASCII, you need to specify the character set. Even if the page has been made correctly, the reader still may not be able to read it if they don't have a corresponding font installed.
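
As a rough sketch of what "specifying the character set" can look like in practice, the Python snippet below builds a tiny HTML page whose meta tag names the same encoding that is used to turn the text into bytes. The function name `build_page`, the sample word "Łódź", and the choice of ISO-8859-2 are assumptions made for illustration only; they are not taken from this page.

```python
# Hypothetical example: declare a character set in the HTML and encode the
# page bytes with that same character set, so the two always agree.
CHARSET = "ISO-8859-2"  # assumed choice; any ISO 8859 part works the same way


def build_page(body_text: str, charset: str = CHARSET) -> bytes:
    """Return the page as bytes, encoded in the charset named in its meta tag."""
    html = (
        "<html><head>"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">'
        "</head><body>"
        f"{body_text}"
        "</body></html>"
    )
    # encode() raises UnicodeEncodeError if a character is not in the set.
    return html.encode(charset)


page = build_page("Łódź")  # these letters exist in ISO 8859-2, not in ASCII

try:
    "Łódź".encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False  # 7-bit ASCII has no code for these letters
```

The point of keeping the declaration and the `encode()` call tied to one variable is that a mismatch between the two is the usual reason a correctly written page still shows the wrong characters.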

7-bit ASCII (As far as I know, every computer that uses ASCII can display these characters.)
0000000-1111111 in binary (7 bits, get it? 2 to the 7th power, or 128 characters)
7-bit ASCII is also called US ASCII.

00-15 control characters (non-printing)
16-31 control characters (non-printing)
32-47 (space) ! " # $ % & ' ( ) * + , - . /
48-63 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
64-79 @ A B C D E F G H I J K L M N O
80-95 P Q R S T U V W X Y Z [ \ ] ^ _
96-111 ` a b c d e f g h i j k l m n o
112-127 p q r s t u v w x y z { | } ~ (DEL)
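
The rows above can be regenerated rather than typed by hand. This is a minimal Python sketch, assuming the same layout of 16 characters per row; the helper names `ascii_rows` and `label` are made up for this example.

```python
def ascii_rows(start=32, stop=128, width=16):
    """Group character codes start..stop-1 into rows of `width` characters."""
    rows = []
    for base in range(start, stop, width):
        chars = [chr(code) for code in range(base, base + width)]
        rows.append((base, base + width - 1, chars))
    return rows


def label(ch):
    """Give the two characters that have no visible glyph a readable name."""
    if ch == " ":
        return "(space)"
    if ch == "\x7f":
        return "(DEL)"
    return ch


# Print one line per row, in the same "first-last characters" shape as the table.
for first, last, chars in ascii_rows():
    print(f"{first}-{last}", " ".join(label(c) for c in chars))
```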


ISO 8859-1, Latin I

The upper 128 characters are defined differently for different languages. The ISO has developed the following standard sets, which are in wide use. There are other character sets as well, such as KOI8 and the CJK (Chinese, Japanese, Korean) sets.
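
To make "defined differently" concrete, here is a small Python sketch that decodes one and the same byte under three of the ISO 8859 sets. The byte value 0xE4 is an arbitrary pick for illustration; the codec names are the ones Python's standard library uses.

```python
# One byte from the upper half, interpreted under three different sets.
RAW = bytes([0xE4])
ENCODINGS = ["iso8859_1", "iso8859_5", "iso8859_7"]  # Latin-1, Cyrillic, Greek

decoded = {name: RAW.decode(name) for name in ENCODINGS}

for name, ch in decoded.items():
    # Show the Unicode code point next to the character itself.
    print(f"{name}: U+{ord(ch):04X} {ch}")
```

The byte never changes; only the table used to look it up does, which is why a page read with the wrong set shows the wrong letters rather than an error.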

128-143 control characters (non-printing)
144-159 control characters (non-printing)
This table to the right is an image. The characters in the table above should match these.

To extend US ASCII, an additional bit was added, extending ASCII from 128 characters (000-127, or 2^7) to 256 characters (000-255, or 2^8).
The upper 128 are the range 10000000-11111111 in binary.
This range holds the second 128 characters, and its contents differ for each language encoding.
The set shown here is called Latin-1 and is the base character set for HTML. Its characters will be displayed when the HTML has no character set definition, but only on computers whose language does not use another base set.
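
The arithmetic behind the extra bit can be checked with a few lines of Python. This is only a sketch; the names `lower`, `upper`, and `is_upper_half` are invented for the example. It also checks one property of Latin-1 as implemented by Python's `latin-1` codec: each byte value decodes to the character with the same number.

```python
lower = range(0, 2**7)     # 0-127, the 7-bit ASCII half
upper = range(2**7, 2**8)  # 128-255, the half added by the eighth bit


def is_upper_half(byte_value: int) -> bool:
    """True when the eighth (highest) bit of the byte is set."""
    return (byte_value & 0b10000000) != 0


# First and last value of the upper half, written as 8 binary digits.
upper_first = format(upper[0], "08b")
upper_last = format(upper[-1], "08b")
print(upper_first, upper_last)

# In Python's latin-1 codec, byte N decodes to the character numbered N.
latin1_matches = all(bytes([b]).decode("latin-1") == chr(b) for b in range(256))
```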