The META tag, charset attribute

Contents

HTML Character Sets
In the Beginning: ASCII
In Windows: Windows-1252
In HTML 4: ISO-8859-1
In HTML5: Unicode UTF-8
The charset attribute
Syntax
Values
Default value
What is <meta charset="utf-8">?
2 Answers

HTML Character Sets

To display an HTML page correctly, the browser must know which character set (encoding) to use. In HTML5 this is declared in the document head with <meta charset="UTF-8">.
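
Why the browser needs this information can be illustrated with a small Python sketch (Python is used here only for illustration; the sample word and variable names are assumptions, not part of the original page). The same bytes are decoded twice, once with the encoding they were written in and once with a different one:

```python
# The bytes of a page do not say which encoding produced them;
# decoding them with the wrong character set yields garbled text.
original = "café"
raw_bytes = original.encode("utf-8")      # what a server might send

as_utf8 = raw_bytes.decode("utf-8")       # decoded with the right encoding
as_cp1252 = raw_bytes.decode("cp1252")    # decoded with the wrong encoding

print(as_utf8)     # café
print(as_cp1252)   # cafÃ©
```

The two-byte UTF-8 sequence for "é" is read as two separate Windows-1252 characters, which is the kind of garbling a correct charset declaration is meant to prevent.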

HTML Character Sets

The HTML5 specification encourages web developers to use the UTF-8 character set!

This has not always been the case. The character encoding for the early web was ASCII.

Later, from HTML 2.0 to HTML 4.01, ISO-8859-1 was considered the standard character set.

With XML and HTML5, UTF-8 finally arrived and solved a lot of character encoding problems.

In the Beginning: ASCII

Computer data is stored as binary codes (01000101) in the electronics.

To standardize the storage of text, the American Standard Code for Information Interchange (ASCII) was created. It defined a unique binary number for each character, covering the digits 0-9, the upper- and lower-case English alphabet (A-Z, a-z), and special characters such as ! $ + - ( ) @ < > .

Since ASCII used 7 bits per character, it could only represent 128 different characters.
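
The arithmetic behind the 128-character limit can be checked with a short Python sketch. The helper name is_ascii_encodable is hypothetical and introduced only for this illustration:

```python
# 7 bits give 2**7 distinct code values, numbered 0 to 127.
ascii_size = 2 ** 7

# Every ASCII character has a fixed number; "E" is the 01000101 shown above.
code_for_a = ord("A")
bits_for_e = format(ord("E"), "08b")


def is_ascii_encodable(text):
    """Return True if every character of text fits into ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(ascii_size)                    # 128
print(code_for_a, bits_for_e)        # 65 01000101
print(is_ascii_encodable("Hello!"))  # True
print(is_ascii_encodable("naïve"))   # False
```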

The biggest weakness of ASCII was that it excluded non-English letters.

ASCII is still in use today, especially in large mainframe computer systems.

In Windows: Windows-1252

Windows-1252 was the default character set in Windows, up to Windows 95.

It is an extension to ASCII, with added international characters.

It uses a full byte (8 bits) to represent 256 different characters.

Since Windows-1252 has been the default in Windows, it is supported by all browsers.

In HTML 4: ISO-8859-1

The character set most often used in HTML 4 was ISO-8859-1.

ISO-8859-1 is an extension to ASCII, with added international characters.

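In HTML 4, the character set is declared in the <meta> tag using the long form, for example <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">.

A character set other than ISO-8859-1 can be specified by changing the charset value in that tag.

All HTML 4 processors also support UTF-8, which is declared the same way with charset=UTF-8.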

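When a browser detects ISO-8859-1 it normally defaults to Windows-1252, because Windows-1252 uses the 32 code positions 0x80-0x9F, which ISO-8859-1 reserves for control codes, for additional printable characters such as the euro sign and curly quotation marks.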
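
The difference between the two character sets can be seen by decoding the same bytes both ways. This is a minimal Python sketch that relies on Python's built-in cp1252 and latin-1 codecs (latin-1 is Python's name for ISO-8859-1); the sample byte values are chosen only for illustration:

```python
# Bytes from the 0x80-0x9F range, where the two character sets differ.
byte_values = bytes([0x80, 0x93, 0x94])

as_cp1252 = byte_values.decode("cp1252")    # printable characters
as_latin1 = byte_values.decode("latin-1")   # invisible control codes

# Outside that range the two character sets agree, e.g. 0xE9 is "é" in both.
shared_byte = bytes([0xE9])
same_elsewhere = shared_byte.decode("cp1252") == shared_byte.decode("latin-1")

print(as_cp1252)         # €“”
print(repr(as_latin1))   # '\x80\x93\x94'
print(same_elsewhere)    # True
```

Treating a page labelled ISO-8859-1 as Windows-1252 therefore loses nothing for ordinary letters and gains readable output for bytes in the differing range.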

In HTML5: Unicode UTF-8

The HTML5 specification encourages web developers to use the UTF-8 character set.

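In HTML5 the declaration is <meta charset="UTF-8">.

A character set other than UTF-8 can be specified in the <meta> tag by changing the charset value, for example <meta charset="ISO-8859-1">.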

The Unicode Consortium developed the Unicode Standard, with encodings such as UTF-8 and UTF-16, because the ISO-8859 character sets are limited and not compatible with a multilingual environment.

The Unicode Standard covers (almost) all the characters, punctuation, and symbols in the world.

All HTML5 and XML processors support UTF-8, UTF-16, Windows-1252, and ISO-8859.
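
How UTF-8 and UTF-16 fit such a large repertoire into bytes can be sketched in Python. The four sample characters are an arbitrary choice for illustration:

```python
# UTF-8 uses 1 to 4 bytes per character; UTF-16 uses 2 or 4.
samples = ["A", "é", "€", "😀"]

utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}
utf16_lengths = {ch: len(ch.encode("utf-16-le")) for ch in samples}

print(utf8_lengths)    # {'A': 1, 'é': 2, '€': 3, '😀': 4}
print(utf16_lengths)   # {'A': 2, 'é': 2, '€': 2, '😀': 4}

# ASCII text is unchanged in UTF-8, which keeps old ASCII pages readable.
print("Hello".encode("utf-8") == "Hello".encode("ascii"))   # True
```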

The charset attribute

Specifies the character encoding of the document. The attribute was introduced in HTML5 as a shorthand for the form of the <meta> tag that set the encoding in earlier versions of HTML and XHTML.

Syntax

<meta charset="encoding">

Values

The name of an encoding, for example UTF-8.
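
To make the relationship between the short HTML5 form and the older long form concrete, here is a minimal Python sketch that reads the declared encoding from either one. It uses the standard html.parser module; the names CharsetFinder and declared_charset are hypothetical helpers for this illustration, and real browsers follow a more elaborate detection procedure:

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Remembers the first encoding declared by a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if "charset" in attributes:
            # HTML5 short form: <meta charset="UTF-8">
            self.charset = attributes["charset"]
        elif (attributes.get("http-equiv") or "").lower() == "content-type":
            # Older long form: content="text/html; charset=ISO-8859-1"
            content = attributes.get("content") or ""
            for part in content.split(";"):
                name, _, value = part.strip().partition("=")
                if name.strip().lower() == "charset":
                    self.charset = value.strip()


def declared_charset(html_text):
    """Return the encoding named in html_text, or None if there is none."""
    finder = CharsetFinder()
    finder.feed(html_text)
    return finder.charset


html5_page = '<!DOCTYPE html><html><head><meta charset="UTF-8"></head></html>'
html4_page = ('<html><head><meta http-equiv="Content-Type" '
              'content="text/html; charset=ISO-8859-1"></head></html>')

print(declared_charset(html5_page))   # UTF-8
print(declared_charset(html4_page))   # ISO-8859-1
```

Both forms carry the same information; the short form simply drops the http-equiv and content attributes.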

Default value


What is <meta charset="utf-8">?

I just started learning HTML (no coding background) and don’t know what this means. I generally write it when I start the code after , but I don’t have any idea what it means. I also do not know what «doctype» means. What will happen if I don’t use it?

2 Answers

The characters you are reading on your screen now each have a numerical value. In the ASCII format, for example, the letter ‘A’ is 65, ‘B’ is 66, and so on. If you look at a table of characters available in ASCII you will see that it isn’t much use for someone who wishes to write something in Mandarin, Arabic, or Japanese. For characters / words from those languages to be displayed we needed another system of encoding them to and from numbers stored in computer memory.

UTF-8 is just one of the encoding methods that were invented to implement this requirement. It lets you write text in all kinds of languages, so French accents will appear perfectly fine, as will text like this:

Бзиа збаша (Bzia zbaşa), Фэсапщы, Ç’kemi, ሰላም, and even right-to-left writing such as this السلام عليكم

If you copy and paste the above text into Notepad and then try to save the file as ANSI (another format) you will receive a warning that saving in this format will lose some of the formatting. Accept it, then reload the text file and you’ll see something like this:

???? ????? (Bzia zbasa), ???????, Ç’kemi, ???, and even right-to-left writing such as this ?????? ?????
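
The loss described above can be reproduced in a few lines of Python. This is only a sketch under two assumptions: that "ANSI" on the machine in question means Windows-1252, and that unrepresentable characters are replaced with "?". Notepad's own conversion may differ in detail (the output above shows ş becoming s, which this sketch does not reproduce):

```python
# A shortened version of the sample text from the answer.
text = "Бзиа, Ç’kemi, ሰላም"

# Saving as "ANSI": characters missing from Windows-1252 become "?".
ansi_bytes = text.encode("cp1252", errors="replace")
after_ansi = ansi_bytes.decode("cp1252")

# Saving as UTF-8: every character survives the round trip.
after_utf8 = text.encode("utf-8").decode("utf-8")

print(after_ansi)           # ????, Ç’kemi, ???
print(after_utf8 == text)   # True
```

Only "Ç" and the curly apostrophe survive the first round trip, because they happen to exist in Windows-1252.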
