Character sets: |
ISO 8859-1 Latin-1, plus general information |
ISO 8859-2, Latin Slavic and Central European |
ISO 8859-3, Maltese & Turkish |
ISO 8859-4, Baltic |
ISO 8859-5, Cyrillic |
ISO 8859-6, Arabic |
ISO 8859-7, Greek |
ISO 8859-8, Hebrew |
ISO 8859-9, Turkish |
ISO 8859-10, Nordic |
|
|
|
Penn State has put together a short, useful page of suggested CSS font styles for international websites. It can be found on their "Tips for Developing Non-English Web Sites" page.
|
|
Character sets and Language encoding
To write web pages in languages that use characters not found in standard 7-bit ASCII, you need to specify the character set. Even if the page has been made correctly, the reader may still be unable to read it if they don't have a corresponding font installed.
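As a small illustration of why a character set has to be named, here is a Python sketch. The sample text is my own assumption, not taken from any real page. It shows that accented letters cannot be written in 7-bit ASCII at all, but fit in one byte each once a set such as ISO 8859-1 is chosen.

```python
# Hypothetical sample text containing characters outside 7-bit ASCII.
text = "Grüße aus Köln"

try:
    text.encode("ascii")
except UnicodeEncodeError as err:
    # err.object[err.start:err.end] is the first run of offending characters
    print("not in 7-bit ASCII:", err.object[err.start:err.end])

# With a named character set, every character here fits in a single byte.
data = text.encode("iso-8859-1")
print(len(text), "characters ->", len(data), "bytes")

# Reading the bytes back with the same character set restores the text.
print(data.decode("iso-8859-1") == text)
```

The reader's software has to be told the same name that the writer used, which is the job of the character set definition in the page.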
7-bit ASCII (as far as I know, every computer that uses ASCII can display these characters) covers binary 0000000-1111111. (7 bits, get it? 2 to the 7th power, or 128 characters.)
7-bit ASCII is also called US ASCII.
00-15 | control characters (non-printing) |
16-31 | control characters (non-printing) |
32-47 | (space) | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
48-63 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
64-79 | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
80-95 | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
96-111 | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
112-127 | p | q | r | s | t | u | v | w | x | y | z | { | | | } | ~ | (DEL) |
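The rows of the table above can be regenerated with a few lines of Python. This is only a sketch: the function name `ascii_rows` and the choice of 16 characters per row are my own, picked to match the layout of the table.

```python
def ascii_rows(start=32, stop=128, width=16):
    """Return (label, characters) pairs, one per row of `width` code points."""
    rows = []
    for base in range(start, stop, width):
        cells = [chr(code) for code in range(base, base + width)]
        rows.append((f"{base}-{base + width - 1}", cells))
    return rows


def show(ch):
    # Non-printing characters such as DEL (127) are shown by number instead.
    return ch if ch.isprintable() else f"({ord(ch)})"


for label, cells in ascii_rows():
    print(label, " ".join(show(ch) for ch in cells))
```

Codes 0-31 are left out by default because they are control characters with nothing to display.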
ISO 8859-1, Latin-1
The upper 128 characters are defined differently for different languages. The ISO has developed the following standard sets, which are in wide use. There are other character sets as well, such as KOI8 and the CJK encodings.
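To see how the upper half differs from set to set, the Python sketch below decodes one byte value under several of the ISO 8859 sets listed at the top of this page. The byte 241 and the helper name `decode_in_each` are arbitrary choices of mine for illustration; the codec names are the aliases Python's standard library accepts.

```python
# One byte from the upper half (241 decimal, 0xF1 hex), chosen arbitrarily.
RAW = bytes([0xF1])

CHARSETS = ["iso-8859-1", "iso-8859-2", "iso-8859-5", "iso-8859-7"]


def decode_in_each(raw, charsets=CHARSETS):
    """Decode the same bytes under each named character set."""
    return {name: raw.decode(name) for name in charsets}


for name, char in decode_in_each(RAW).items():
    print(f"{name}: {char} (U+{ord(char):04X})")
```

The same stored byte comes out as a Latin, Central European, Cyrillic, or Greek letter depending only on which set the reader assumes.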
128-143 | unused control characters |
144-159 | unused control characters |
160-175 | (no-break space) | ¡ | ¢ | £ | ¤ | ¥ | ¦ | § | ¨ | © | ª | « | ¬ | (soft hyphen) | ® | ¯ |
176-191 | ° | ± | ² | ³ | ´ | µ | ¶ | · | ¸ | ¹ | º | » | ¼ | ½ | ¾ | ¿ |
192-207 | À | Á | Â | Ã | Ä | Å | Æ | Ç | È | É | Ê | Ë | Ì | Í | Î | Ï |
208-223 | Ð | Ñ | Ò | Ó | Ô | Õ | Ö | × | Ø | Ù | Ú | Û | Ü | Ý | Þ | ß |
224-239 | à | á | â | ã | ä | å | æ | ç | è | é | ê | ë | ì | í | î | ï |
240-255 | ð | ñ | ò | ó | ô | õ | ö | ÷ | ø | ù | ú | û | ü | ý | þ | ÿ |
|
This table to the right is an image. The characters in the table above should match these. |
|
To extend US ASCII, an additional bit was added, taking the set from 128 characters (0-127, or 2^7) to 256 characters (0-255, or 2^8).
The upper 128 are the range 10000000-11111111 in binary.
This range holds the second 128 characters, and its contents are different for each language encoding.
The set shown here is called Latin-1 and is the base character set for HTML. Its characters will be displayed even if there is no character set definition in the HTML, but only on computers whose language does not use another base set.
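The arithmetic of the extra bit can be checked in Python. This is just a sketch, and the function name `in_upper_half` is my own: a code belongs to the upper 128 exactly when its eighth bit (value 128) is set.

```python
def in_upper_half(code: int) -> bool:
    """True if `code` is an 8-bit value with the eighth (highest) bit set."""
    return 0 <= code <= 0xFF and bool(code & 0x80)


print(2 ** 7, 2 ** 8)       # sizes of the 7-bit and 8-bit sets
print(format(128, "08b"))   # first code of the upper half, in binary
print(format(255, "08b"))   # last code of the upper half, in binary
print(in_upper_half(127), in_upper_half(128))
```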
|
|
|
|