ASCII
ASCII, short for "American Standard Code for Information Interchange", was the first character-encoding scheme used between computers on the Internet.
Modern character encoding schemes like UTF-8 and ISO-8859 are built on ASCII.
The ASCII Character Set
The ASCII character set was designed in the 1960s as a standard character set for computers and hardware devices such as printers and tape drives.
ASCII was originally based on the English alphabet. It is a 7-bit character set containing 128 characters: the digits 0-9, the uppercase and lowercase English letters A to Z, some basic punctuation symbols, and some special characters.
The character sets used in modern computers, in HTML, and on the Internet are all based on ASCII.
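As a rough illustration of the 7-bit layout described above, the following Python sketch walks over all 128 code points and sorts them into groups. The helper name `classify` and the category labels are made up for this example.

```python
import string

def classify(code: int) -> str:
    """Return a coarse category for an ASCII code point (0-127)."""
    if not 0 <= code <= 127:
        raise ValueError("ASCII only defines code points 0-127")
    ch = chr(code)
    if code < 32 or code == 127:
        return "control"
    if ch == " ":
        return "space"
    if ch in string.digits:
        return "digit"
    if ch in string.ascii_uppercase:
        return "uppercase"
    if ch in string.ascii_lowercase:
        return "lowercase"
    return "punctuation"

counts = {}
for code in range(128):  # 7 bits give 2**7 == 128 possible values
    kind = classify(code)
    counts[kind] = counts.get(kind, 0) + 1

print(counts)
```

The counts add up to 128: 33 control characters, the space, 10 digits, 26 + 26 letters, and 32 punctuation and special characters.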
The tables below list the 128 ASCII characters and their equivalent HTML entity codes.
ASCII Printable Characters
ASCII Character | HTML Entity Code | Description |
---|---|---|
(space) | `&#32;` | space |
! | `&#33;` | exclamation mark |
" | `&#34;` | quotation mark |
# | `&#35;` | number sign |
$ | `&#36;` | dollar sign |
% | `&#37;` | percent sign |
& | `&#38;` | ampersand |
' | `&#39;` | apostrophe |
( | `&#40;` | left parenthesis |
) | `&#41;` | right parenthesis |
* | `&#42;` | asterisk |
+ | `&#43;` | plus sign |
, | `&#44;` | comma |
- | `&#45;` | hyphen |
. | `&#46;` | period |
/ | `&#47;` | slash |
0 | `&#48;` | digit 0 |
1 | `&#49;` | digit 1 |
2 | `&#50;` | digit 2 |
3 | `&#51;` | digit 3 |
4 | `&#52;` | digit 4 |
5 | `&#53;` | digit 5 |
6 | `&#54;` | digit 6 |
7 | `&#55;` | digit 7 |
8 | `&#56;` | digit 8 |
9 | `&#57;` | digit 9 |
: | `&#58;` | colon |
; | `&#59;` | semicolon |
< | `&#60;` | less-than |
= | `&#61;` | equals sign |
> | `&#62;` | greater-than |
? | `&#63;` | question mark |
@ | `&#64;` | at sign |
A | `&#65;` | uppercase A |
B | `&#66;` | uppercase B |
C | `&#67;` | uppercase C |
D | `&#68;` | uppercase D |
E | `&#69;` | uppercase E |
F | `&#70;` | uppercase F |
G | `&#71;` | uppercase G |
H | `&#72;` | uppercase H |
I | `&#73;` | uppercase I |
J | `&#74;` | uppercase J |
K | `&#75;` | uppercase K |
L | `&#76;` | uppercase L |
M | `&#77;` | uppercase M |
N | `&#78;` | uppercase N |
O | `&#79;` | uppercase O |
P | `&#80;` | uppercase P |
Q | `&#81;` | uppercase Q |
R | `&#82;` | uppercase R |
S | `&#83;` | uppercase S |
T | `&#84;` | uppercase T |
U | `&#85;` | uppercase U |
V | `&#86;` | uppercase V |
W | `&#87;` | uppercase W |
X | `&#88;` | uppercase X |
Y | `&#89;` | uppercase Y |
Z | `&#90;` | uppercase Z |
[ | `&#91;` | left square bracket |
\ | `&#92;` | backslash |
] | `&#93;` | right square bracket |
^ | `&#94;` | caret |
_ | `&#95;` | underscore |
\` | `&#96;` | grave accent |
a | `&#97;` | lowercase a |
b | `&#98;` | lowercase b |
c | `&#99;` | lowercase c |
d | `&#100;` | lowercase d |
e | `&#101;` | lowercase e |
f | `&#102;` | lowercase f |
g | `&#103;` | lowercase g |
h | `&#104;` | lowercase h |
i | `&#105;` | lowercase i |
j | `&#106;` | lowercase j |
k | `&#107;` | lowercase k |
l | `&#108;` | lowercase l |
m | `&#109;` | lowercase m |
n | `&#110;` | lowercase n |
o | `&#111;` | lowercase o |
p | `&#112;` | lowercase p |
q | `&#113;` | lowercase q |
r | `&#114;` | lowercase r |
s | `&#115;` | lowercase s |
t | `&#116;` | lowercase t |
u | `&#117;` | lowercase u |
v | `&#118;` | lowercase v |
w | `&#119;` | lowercase w |
x | `&#120;` | lowercase x |
y | `&#121;` | lowercase y |
z | `&#122;` | lowercase z |
{ | `&#123;` | left curly brace |
\| | `&#124;` | vertical bar |
} | `&#125;` | right curly brace |
~ | `&#126;` | tilde |
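The entity codes in this table follow one pattern: `&#`, the character's decimal number, then `;`. A minimal Python sketch of that mapping is shown below; `to_entity` is a hypothetical helper name, while `html.unescape` is the standard-library function that turns a reference back into its character.

```python
import html

def to_entity(ch: str) -> str:
    """Build the decimal numeric character reference for one character."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return f"&#{ord(ch)};"

for ch in ["!", "<", "A", "~"]:
    entity = to_entity(ch)
    # html.unescape converts the reference back into the character
    assert html.unescape(entity) == ch
    print(repr(ch), entity)
```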
ASCII Device Control Characters
The ASCII device control characters (range 00-31, plus 127) were originally designed to control hardware devices. Except for horizontal tab, line feed, and carriage return, they serve no purpose inside an HTML document.
ASCII Character | HTML Entity Code | Description |
---|---|---|
NUL | `&#00;` | null character |
SOH | `&#01;` | start of header |
STX | `&#02;` | start of text |
ETX | `&#03;` | end of text |
EOT | `&#04;` | end of transmission |
ENQ | `&#05;` | enquiry |
ACK | `&#06;` | acknowledge |
BEL | `&#07;` | bell (ring) |
BS | `&#08;` | backspace |
HT | `&#09;` | horizontal tab |
LF | `&#10;` | line feed |
VT | `&#11;` | vertical tab |
FF | `&#12;` | form feed |
CR | `&#13;` | carriage return |
SO | `&#14;` | shift out |
SI | `&#15;` | shift in |
DLE | `&#16;` | data link escape |
DC1 | `&#17;` | device control 1 |
DC2 | `&#18;` | device control 2 |
DC3 | `&#19;` | device control 3 |
DC4 | `&#20;` | device control 4 |
NAK | `&#21;` | negative acknowledge |
SYN | `&#22;` | synchronize |
ETB | `&#23;` | end transmission block |
CAN | `&#24;` | cancel |
EM | `&#25;` | end of medium |
SUB | `&#26;` | substitute |
ESC | `&#27;` | escape |
FS | `&#28;` | file separator |
GS | `&#29;` | group separator |
RS | `&#30;` | record separator |
US | `&#31;` | unit separator |
DEL | `&#127;` | delete (rubout) |
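Since only horizontal tab, line feed, and carriage return are useful in an HTML document, text is sometimes cleaned of the other control characters before it is embedded in a page. The sketch below shows one way this might be done; the function name `strip_control_chars` and the sample string are assumptions made for illustration.

```python
ALLOWED_CONTROLS = {"\t", "\n", "\r"}  # HT (9), LF (10), CR (13)

def strip_control_chars(text: str) -> str:
    """Drop ASCII control characters (0-31 and 127) except HT, LF, CR."""
    return "".join(
        ch for ch in text
        if ch in ALLOWED_CONTROLS or not (ord(ch) < 32 or ord(ch) == 127)
    )

sample = "Hello\x07 world\x00!\n\tDone\x7f"  # contains BEL, NUL, and DEL
cleaned = strip_control_chars(sample)
print(repr(cleaned))
```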
HTML ASCII Reference
ASCII was the first character set (encoding standard) used between computers on the Internet.
Both ISO-8859-1 (the default in HTML 4.01) and UTF-8 (the default in HTML5) are built on ASCII.
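One way to see what "built on ASCII" means in practice: the 128 ASCII byte values decode to the same characters under all three encodings. The short Python check below uses the standard codec names `ascii`, `iso-8859-1`, and `utf-8`.

```python
data = bytes(range(128))  # every 7-bit ASCII value, 0 through 127

as_ascii = data.decode("ascii")
as_latin1 = data.decode("iso-8859-1")
as_utf8 = data.decode("utf-8")

# All three decoders agree on the ASCII range
print(as_ascii == as_latin1 == as_utf8)
```

The encodings only start to differ for byte values of 128 and above.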
The ASCII Character Set
ASCII stands for "American Standard Code for Information Interchange".
It was designed in the early 1960s as a standard character set for computers and electronic devices.
ASCII is a 7-bit character set containing 128 characters.
It contains the digits 0-9, the uppercase and lowercase English letters from A to Z, and some special characters.
The character sets used in modern computers, in HTML, and on the Internet are all based on ASCII.
The following tables list the 128 ASCII characters and their equivalent numbers.
ASCII Printable Characters
Char | Number | Description |
---|---|---|
(control) | 0-31 | control characters (see below) |
(space) | 32 | space |
! | 33 | exclamation mark |
" | 34 | quotation mark |
# | 35 | number sign |
$ | 36 | dollar sign |
% | 37 | percent sign |
& | 38 | ampersand |
' | 39 | apostrophe |
( | 40 | left parenthesis |
) | 41 | right parenthesis |
* | 42 | asterisk |
+ | 43 | plus sign |
, | 44 | comma |
- | 45 | hyphen |
. | 46 | period |
/ | 47 | slash |
0 | 48 | digit 0 |
1 | 49 | digit 1 |
2 | 50 | digit 2 |
3 | 51 | digit 3 |
4 | 52 | digit 4 |
5 | 53 | digit 5 |
6 | 54 | digit 6 |
7 | 55 | digit 7 |
8 | 56 | digit 8 |
9 | 57 | digit 9 |
: | 58 | colon |
; | 59 | semicolon |
< | 60 | less-than |
= | 61 | equals sign |
> | 62 | greater-than |
? | 63 | question mark |
@ | 64 | at sign |
A | 65 | uppercase A |
B | 66 | uppercase B |
C | 67 | uppercase C |
D | 68 | uppercase D |
E | 69 | uppercase E |
F | 70 | uppercase F |
G | 71 | uppercase G |
H | 72 | uppercase H |
I | 73 | uppercase I |
J | 74 | uppercase J |
K | 75 | uppercase K |
L | 76 | uppercase L |
M | 77 | uppercase M |
N | 78 | uppercase N |
O | 79 | uppercase O |
P | 80 | uppercase P |
Q | 81 | uppercase Q |
R | 82 | uppercase R |
S | 83 | uppercase S |
T | 84 | uppercase T |
U | 85 | uppercase U |
V | 86 | uppercase V |
W | 87 | uppercase W |
X | 88 | uppercase X |
Y | 89 | uppercase Y |
Z | 90 | uppercase Z |
[ | 91 | left square bracket |
\ | 92 | backslash |
] | 93 | right square bracket |
^ | 94 | caret |
_ | 95 | underscore |
` | 96 | grave accent |
a | 97 | lowercase a |
b | 98 | lowercase b |
c | 99 | lowercase c |
d | 100 | lowercase d |
e | 101 | lowercase e |
f | 102 | lowercase f |
g | 103 | lowercase g |
h | 104 | lowercase h |
i | 105 | lowercase i |
j | 106 | lowercase j |
k | 107 | lowercase k |
l | 108 | lowercase l |
m | 109 | lowercase m |
n | 110 | lowercase n |
o | 111 | lowercase o |
p | 112 | lowercase p |
q | 113 | lowercase q |
r | 114 | lowercase r |
s | 115 | lowercase s |
t | 116 | lowercase t |
u | 117 | lowercase u |
v | 118 | lowercase v |
w | 119 | lowercase w |
x | 120 | lowercase x |
y | 121 | lowercase y |
z | 122 | lowercase z |
{ | 123 | left curly brace |
\| | 124 | vertical bar |
} | 125 | right curly brace |
~ | 126 | tilde |
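The numbering in this table has two regularities that are easy to check: the digits start at 48, and each lowercase letter sits exactly 32 positions after its uppercase counterpart. The Python sketch below uses both; the helper names `digit_value` and `to_upper_ascii` are invented for this example.

```python
def digit_value(ch: str) -> int:
    """Digits start at 48, so subtracting ord('0') recovers the value."""
    if not "0" <= ch <= "9":
        raise ValueError(f"not an ASCII digit: {ch!r}")
    return ord(ch) - ord("0")

def to_upper_ascii(ch: str) -> str:
    """Lowercase letters are 32 positions above their uppercase forms."""
    if "a" <= ch <= "z":
        return chr(ord(ch) - 32)
    return ch

print(digit_value("7"))       # 7
print(to_upper_ascii("q"))    # Q
print(ord("a") - ord("A"))    # 32
```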
ASCII Device Control Characters
The ASCII control characters (range 00-31, plus 127) were designed to control hardware devices.
Control characters (except horizontal tab, line feed, and carriage return) serve no purpose inside an HTML document.
Char | Number | Description |
---|---|---|
NUL | 00 | null character |
SOH | 01 | start of header |
STX | 02 | start of text |
ETX | 03 | end of text |
EOT | 04 | end of transmission |
ENQ | 05 | enquiry |
ACK | 06 | acknowledge |
BEL | 07 | bell (ring) |
BS | 08 | backspace |
HT | 09 | horizontal tab |
LF | 10 | line feed |
VT | 11 | vertical tab |
FF | 12 | form feed |
CR | 13 | carriage return |
SO | 14 | shift out |
SI | 15 | shift in |
DLE | 16 | data link escape |
DC1 | 17 | device control 1 |
DC2 | 18 | device control 2 |
DC3 | 19 | device control 3 |
DC4 | 20 | device control 4 |
NAK | 21 | negative acknowledge |
SYN | 22 | synchronize |
ETB | 23 | end transmission block |
CAN | 24 | cancel |
EM | 25 | end of medium |
SUB | 26 | substitute |
ESC | 27 | escape |
FS | 28 | file separator |
GS | 29 | group separator |
RS | 30 | record separator |
US | 31 | unit separator |
DEL | 127 | delete (rubout) |
HTML Character Sets
To display an HTML page correctly, the browser must know which character set (encoding) to use.
The HTML5 specification encourages web developers to use the UTF-8 character set!
This has not always been the case. The character encoding for the early web was ASCII.
Later, from HTML 2.0 to HTML 4.01, ISO-8859-1 was considered the standard character set.
With XML and HTML5, UTF-8 finally arrived and solved a lot of character encoding problems.
In the Beginning: ASCII
Computers store data as binary codes (for example, 01000101).
To standardize the storing of text, the American Standard Code for Information Interchange (ASCII) was created. It assigns a unique binary number to each character it supports: the digits 0-9, the uppercase and lowercase alphabet (A-Z, a-z), and special characters such as ! $ + - ( ) @ < > and the comma.
Since ASCII uses 7 bits per character, it can represent only 128 different characters.
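To make the binary view concrete, the Python snippet below converts the bit pattern 01000101 quoted above into its number and character, then prints the 7-bit pattern for a few other characters. The sample string `"Hi!"` is arbitrary.

```python
bits = "01000101"        # the binary code quoted in the text
code = int(bits, 2)      # interpret the string as a base-2 number
print(code, chr(code))   # 69 E

# The "07b" format pads each number to the 7 bits ASCII needs
for ch in "Hi!":
    print(ch, ord(ch), format(ord(ch), "07b"))

print(2 ** 7)            # 128 distinct values fit in 7 bits
```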
The biggest weakness of ASCII was that it excluded non-English letters.
ASCII is still in use today, especially in large mainframe computer systems.
For a closer look, please study our Complete ASCII Reference.
In Windows: Windows-1252
Windows-1252 was the default character set in Windows, up to Windows 95.
It is an extension to ASCII, with added international characters.
It uses a full byte (8 bits), which leaves room for up to 256 different characters.
Since Windows-1252 has been the default in Windows, it is supported by all browsers.
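The difference between 7-bit ASCII and an 8-bit extension such as Windows-1252 can be tried out in Python, where the codec for Windows-1252 is named `cp1252`. The sample text below is an arbitrary choice for this sketch.

```python
text = "café €5"

try:
    text.encode("ascii")
except UnicodeEncodeError as err:
    # ASCII has no code for characters outside the English alphabet
    print("ASCII cannot encode:", err.object[err.start:err.end])

encoded = text.encode("cp1252")   # Python's codec name for Windows-1252
print(encoded)
print(len(text), len(encoded))    # one byte per character
```

Each character still fits in a single byte, but the accented letter and the euro sign use byte values above 127.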
In HTML 4: ISO-8859-1
The character set most often used in HTML 4 was ISO-8859-1.
ISO-8859-1 is an extension to ASCII, with added international characters.
In HTML 4, a character set other than ISO-8859-1 can be specified in the `<meta>` tag.
All HTML 4 processors also support UTF-8.
When a browser detects ISO-8859-1, it normally defaults to Windows-1252, because Windows-1252 assigns additional printable characters to the 32 code positions (128-159) that ISO-8859-1 reserves for control codes.
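The effect of that fallback can be reproduced by decoding the same bytes both ways. In the Python sketch below, the byte string is a made-up sample that uses three values from the 128-159 range.

```python
raw = b"\x93Hello\x94 \x80"   # 0x93, 0x94, and 0x80 lie in the 128-159 range

as_latin1 = raw.decode("iso-8859-1")
as_cp1252 = raw.decode("cp1252")

print(repr(as_latin1))   # control codes, invisible on screen
print(as_cp1252)         # curly quotation marks and the euro sign
```

Treating the data as Windows-1252 yields readable punctuation where a strict ISO-8859-1 reading would yield control codes.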
In HTML5: Unicode UTF-8
The HTML5 specification encourages web developers to use the UTF-8 character set.
A character set other than UTF-8 can be specified in the `<meta>` tag.
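For illustration, the Python sketch below reads the declared character set out of a page, covering both the short HTML5 form (`<meta charset="UTF-8">`) and the older HTML 4 form that uses `http-equiv`. The class `CharsetFinder`, the function `declared_charset`, and the two sample pages are hypothetical; real browsers use a more elaborate detection procedure than this.

```python
from html.parser import HTMLParser

class CharsetFinder(HTMLParser):
    """Hypothetical helper: record the charset declared in a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attr = dict(attrs)
        if attr.get("charset"):
            # HTML5 style: <meta charset="UTF-8">
            self.charset = attr["charset"].strip().lower()
        elif (attr.get("http-equiv") or "").lower() == "content-type":
            # HTML 4 style: content="text/html;charset=..."
            content = attr.get("content") or ""
            for part in content.split(";"):
                name, _, value = part.strip().partition("=")
                if name.strip().lower() == "charset" and value:
                    self.charset = value.strip().lower()

def declared_charset(markup: str):
    """Return the first declared charset in lowercase, or None."""
    finder = CharsetFinder()
    finder.feed(markup)
    return finder.charset

html5_page = '<head><meta charset="UTF-8"></head>'
html4_page = ('<head><meta http-equiv="Content-Type" '
              'content="text/html;charset=ISO-8859-1"></head>')

print(declared_charset(html5_page))
print(declared_charset(html4_page))
```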
The Unicode Standard defines the UTF-8 and UTF-16 encodings. They were adopted because the ISO-8859 character sets are limited and not suited to a multilingual environment.
The Unicode Standard covers (almost) all the characters, punctuation marks, and symbols in the world.
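A quick way to see how these encodings cope with many scripts is to encode a few characters and count the bytes. The sample characters in this Python sketch are arbitrary; `utf-16-le` is used so that no byte-order mark is added to the output.

```python
samples = ["A", "é", "€", "日", "😀"]

for ch in samples:
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-le")   # little-endian, no byte-order mark
    print(ch, f"U+{ord(ch):04X}", len(utf8), len(utf16))
```

ASCII characters stay at one byte in UTF-8, while other characters take two to four bytes; UTF-16 uses two or four bytes per character.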
All HTML5 and XML processors support UTF-8, UTF-16, Windows-1252, and ISO-8859.