- Character sets
- The character set and character escaping
- mysqli_set_charset
- Parameters
- Return Values
- Errors/Exceptions
- Examples
- Notes
- See Also
- User Contributed Notes 5 notes
- mysqli_character_set_name
- Parameters
- Return Values
- Examples
- See Also
- User Contributed Notes
Character sets
Ideally, a proper character set will be set at the server level; doing this is described in the Character Set Configuration section of the MySQL Server manual. Alternatively, each MySQL API offers a method to set the character set at runtime.
The character set and character escaping
The character set should be understood and defined, as it has an effect on every action and has security implications. For example, the escaping mechanism (e.g., mysqli_real_escape_string() for mysqli, mysql_real_escape_string() for mysql, and PDO::quote() for PDO_MySQL) will adhere to this setting. It is important to realize that these functions will not use the character set that is defined with a query, so, for example, the following will not have an effect on them:
Example #1 Problems with setting the character set with SQL
<?php
$mysqli = new mysqli("localhost", "my_user", "my_password", "world");
// Will NOT affect $mysqli->real_escape_string();
$mysqli->query("SET NAMES utf8");
// Will NOT affect $mysqli->real_escape_string();
$mysqli->query("SET CHARACTER SET utf8");
// But, this will affect $mysqli->real_escape_string();
$mysqli->set_charset('utf8');
// But, this will NOT affect it (utf-8 vs utf8); don't use dashes here
$mysqli->set_charset('utf-8');
?>
Below are examples that demonstrate how to properly alter the character set at runtime using each API.
Note: Possible UTF-8 confusion
Because character set names in MySQL do not contain dashes, the string "utf8" is valid in MySQL to set the character set to UTF-8. The string "utf-8" is not valid: using "utf-8" will fail to change the character set.
Example #2 Setting the character set example: mysqli
<?php
$mysqli = new mysqli("localhost", "my_user", "my_password", "world");
printf("Initial character set: %s\n", $mysqli->character_set_name());
if (!$mysqli->set_charset('utf8')) {
    printf("Error loading character set utf8: %s\n", $mysqli->error);
    exit;
}
echo "New character set information:\n";
print_r($mysqli->get_charset());
?>
Example #3 Setting the character set example: pdo_mysql
Note: Setting the character set through the charset parameter of the DSN only works as of PHP 5.3.6.
Example #4 Setting the character set example: mysql
<?php
// The legacy mysql extension used here was removed in PHP 7.0.
$conn = mysql_connect("localhost", "my_user", "my_pass");
$db = mysql_select_db("world");
echo 'Initial character set: ' . mysql_client_encoding($conn) . "\n";
if (!mysql_set_charset('utf8', $conn)) {
    echo "Error: Unable to set the character set.\n";
    exit;
}
echo 'Your current character set is: ' . mysql_client_encoding($conn);
?>
The character set and character escaping
The character set should be understood and defined, as it has an effect on every action and has security implications. For example, the escaping mechanism (e.g., mysqli_real_escape_string() for mysqli and PDO::quote() for PDO_MySQL) will adhere to this setting. It is important to realize that these functions will not use the character set that is defined with a query, so, for example, the following will not have an effect on them:
Example #1 Problems with setting the character set with SQL
<?php
$mysqli = new mysqli("localhost", "my_user", "my_password", "world");
// Will NOT affect $mysqli->real_escape_string();
$mysqli->query("SET NAMES utf8mb4");
// Will NOT affect $mysqli->real_escape_string();
$mysqli->query("SET CHARACTER SET utf8mb4");
// But, this will affect $mysqli->real_escape_string();
$mysqli->set_charset('utf8mb4');
// But, this will NOT affect it (UTF-8 vs utf8mb4); don't use dashes here
$mysqli->set_charset('UTF-8');
?>
Below are examples that demonstrate how to properly alter the character set at runtime using each API.
Note: Possible UTF-8 confusion
Because character set names in MySQL do not contain dashes, the string "utf8" is valid in MySQL to set the character set to UTF-8 (up to 3-byte UTF-8 Unicode encoding). The string "UTF-8" is not valid: using "UTF-8" will fail to change the character set and will throw an error.
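Since the note above says that MySQL character set names contain no dashes, a cheap pre-check can reject names such as "UTF-8" before they reach set_charset(). The following is a minimal sketch; the function name looks_like_mysql_charset() is hypothetical and not part of mysqli, and passing the check does not prove that the server actually supports the name.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of mysqli): returns true when $name uses only
 * letters, digits, and underscores, which is the shape of MySQL character set
 * names such as utf8mb4 or latin1. It does not ask the server anything.
 */
function looks_like_mysql_charset(string $name): bool
{
    return preg_match('/^[A-Za-z0-9_]+$/', $name) === 1;
}

var_dump(looks_like_mysql_charset('utf8mb4')); // bool(true)
var_dump(looks_like_mysql_charset('UTF-8'));   // bool(false): contains a dash
```

A check like this only catches the dash mistake early; the return value of set_charset() still has to be checked.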
Example #2 Setting the character set example: mysqli
<?php
$mysqli = new mysqli("localhost", "my_user", "my_password", "world");
echo 'Initial character set: ' . $mysqli->character_set_name() . "\n";
if (!$mysqli->set_charset('utf8mb4')) {
    printf("Error loading character set utf8mb4: %s\n", $mysqli->error);
    exit;
}
echo 'Your current character set is: ' . $mysqli->character_set_name() . "\n";
?>
Example #3 Setting the character set example: pdo_mysql
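The code for this example did not survive in this copy of the page. As a rough sketch of the usual approach, pdo_mysql accepts a charset parameter in the DSN. The helper name mysql_dsn() is hypothetical, and the host, database, and credentials are the placeholder values used elsewhere on this page.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: build a pdo_mysql DSN that carries the charset.
 * All argument values are placeholders chosen for illustration.
 */
function mysql_dsn(string $host, string $dbname, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

$dsn = mysql_dsn('localhost', 'world');
echo $dsn, "\n"; // mysql:host=localhost;dbname=world;charset=utf8mb4

// With a reachable server, the connection would then be opened as:
// $pdo = new PDO($dsn, 'my_user', 'my_password');
```

Putting the charset in the DSN means it is applied when the connection is made, so PDO::quote() sees it from the start.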
mysqli_set_charset
Sets the character set to be used when sending data to and from the database server.
Parameters
Procedural style only: A mysqli object returned by mysqli_connect() or mysqli_init().
The character set to be set.
Return Values
Returns true on success or false on failure.
Errors/Exceptions
If mysqli error reporting is enabled (MYSQLI_REPORT_ERROR) and the requested operation fails, a warning is generated. If, in addition, the mode is set to MYSQLI_REPORT_STRICT, a mysqli_sql_exception is thrown instead.
Examples
Example #1 mysqli::set_charset() example
Object-oriented style
<?php
mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);
$mysqli = new mysqli("localhost", "my_user", "my_password", "test");
printf("Initial character set: %s\n", $mysqli->character_set_name());
/* change character set to utf8mb4 */
$mysqli->set_charset("utf8mb4");
printf("Current character set: %s\n", $mysqli->character_set_name());
?>
Procedural style
<?php
mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);
$link = mysqli_connect('localhost', 'my_user', 'my_password', 'test');
printf("Initial character set: %s\n", mysqli_character_set_name($link));
/* change character set to utf8mb4 */
mysqli_set_charset($link, "utf8mb4");
printf("Current character set: %s\n", mysqli_character_set_name($link));
?>
The above examples will output:
Initial character set: latin1
Current character set: utf8mb4
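Because a failure is reported either as a false return value or as a mysqli_sql_exception, depending on the reporting mode described in the errors section of this page, it can help to handle both paths in one place. The wrapper below is a hypothetical sketch, not part of mysqli; its name and return convention are assumptions made for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical wrapper (not part of mysqli): try to switch the connection
 * character set and report failure the same way in every reporting mode.
 * Returns null on success or an error message on failure.
 */
function try_set_charset(mysqli $link, string $charset): ?string
{
    try {
        // Without MYSQLI_REPORT_STRICT, failure is signalled by false.
        if ($link->set_charset($charset)) {
            return null;
        }
        return $link->error;
    } catch (mysqli_sql_exception $e) {
        // With MYSQLI_REPORT_STRICT, failure arrives as an exception.
        return $e->getMessage();
    }
}

// Usage, assuming a reachable server and placeholder credentials:
// $mysqli = new mysqli("localhost", "my_user", "my_password", "test");
// $error = try_set_charset($mysqli, "utf-8"); // invalid name, contains a dash
// if ($error !== null) {
//     echo "Could not set the character set: $error\n";
// }
```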
Notes
Note:
To use this function on Windows platforms, you need MySQL client library version 4.1.11 or above (for MySQL 5.0, version 5.0.6 or above).
Note:
This is the preferred way to set the character set. Using mysqli_query() for this purpose (for example, SET NAMES utf8) is not recommended. See also the section on MySQL character sets.
See Also
- mysqli_character_set_name() - Returns the current character set of the database connection
- mysqli_real_escape_string() - Escapes special characters in a string for use in an SQL statement, taking into account the current charset of the connection
- MySQL character set concepts
- List of character sets that MySQL supports
User Contributed Notes 5 notes
Setting the charset (it’s really the encoding) like this after setting up your connection:
$connection->set_charset("utf8mb4");
FAILS to set the proper collation for the connection:
character_set_client: utf8mb4
character_set_connection: utf8mb4
character_set_database: utf8mb4
character_set_filesystem: binary
character_set_results: utf8mb4
character_set_server: utf8mb4
character_set_system: utf8
collation_connection: utf8mb4_general_ci
collation_database: utf8mb4_unicode_ci
collation_server: utf8mb4_unicode_ci
If you use SET NAMES, that works:
$connection->query("SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci");
character_set_client: utf8mb4
character_set_connection: utf8mb4
character_set_database: utf8mb4
character_set_filesystem: binary
character_set_results: utf8mb4
character_set_server: utf8mb4
character_set_system: utf8
collation_connection: utf8mb4_unicode_ci
collation_database: utf8mb4_unicode_ci
collation_server: utf8mb4_unicode_ci
Please note, that I set the following variables on the server:
Set the following to be: utf8mb4_unicode_ci
character-set-client-handshake = FALSE or 0
skip-character-set-client-handshake = TRUE or 1
So in my case, I had tried changing the collation from utf8mb4_unicode_ci for mysql and had to change it to utf8_general_ci. Then I added:
mysqli_set_charset($con, 'utf8');
right before I did the SELECT command.
This is my code for reading from the db:
<?php
$con = mysqli_connect($DB_SERVER, $DB_USER_READER, $DB_PASS_READER, $DB_NAME, $DB_PORT); // this is the unique connection for the selection
mysqli_set_charset($con, 'utf8');
$slct_stmnt = "SELECT " . $SELECT_WHAT . " FROM " . $WHICH_TBL . " WHERE " . $ON_WHAT_CONDITION;
$slct_query = mysqli_query($con, $slct_stmnt);
if ($slct_query == true) {
    // Do your stuff here . . .
}
?>
And it worked like a charm. All the best. The above code can read Chinese, Russian, Arabic, or any other international language from a database table column holding such data.
Although the documentation says that using this function is preferred over using SET NAMES, it is not sufficient if you use a collation different from the default one:
<?php
// That will reset collation_connection to latin1_swedish_ci
// (the default collation for latin1):
$mysqli->set_charset('latin1');
// You have to execute the following statement *after* mysqli::set_charset()
// in order to get the desired value for collation_connection:
$mysqli->query("SET NAMES latin1 COLLATE latin1_german1_ci");
?>
To align both the character set (e.g., utf8mb4) AND the collation sequence with the schema (database) settings:
<?php
$mysqli = new mysqli(DB_HOST, DB_USER, DB_PASSWORD, DB_SCHEMA, DB_PORT);
if (0 !== $mysqli->connect_errno)
    throw new \Exception($mysqli->connect_error, $mysqli->connect_errno);
if (TRUE !== $mysqli->set_charset('utf8mb4'))
    throw new \Exception($mysqli->error, $mysqli->errno);
if (TRUE !== $mysqli->query('SET collation_connection = @@collation_database;'))
    throw new \Exception($mysqli->error, $mysqli->errno);
?>
To confirm:
<?php
echo 'character_set_name: ', $mysqli->character_set_name(), PHP_EOL;
foreach ($mysqli->query("SHOW VARIABLES LIKE '%_connection';")->fetch_all() as $setting)
    echo $setting[0], ': ', $setting[1], PHP_EOL;
?>
will output something like:
character_set_name: utf8mb4
character_set_connection: utf8mb4
collation_connection: utf8mb4_unicode_520_ci
Note that using utf8mb4 with this function may cause it to return false, depending on the MySQL client library compiled into PHP. If the client library predates the introduction of utf8mb4, then PHP's call to the library's mysql_set_character_set will return an error because it won't recognise that character set.
The only way you will know there's an error is by checking the return value, because PHP warnings are not emitted by this function.
mysqli_error will return something like:
"Can't initialize character set utf8mb4 (path: /usr/share/mysql/charsets/)"
(I don't think the directory has anything to do with it; I think the utf8mb4 vs utf8 distinction is handled internally)
A workaround is to call the function again with utf8, then do a 'SET NAMES' query with utf8mb4.
If your MySQL server is configured to use utf8 by default, then you may not notice any of this until you get obscure bugs. It seems it will still save into the database correctly in terms of bytes. However, you may get "Data too long for column" errors if you are truncating strings to fit fields, because from MySQL's point of view during the length check every 4-byte character will actually be multiple individual characters. This caused me hours of debugging.
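To see why 4-byte characters matter for the length problem described above, the following sketch counts the bytes of one such character and detects whether a string contains any. The helper name has_four_byte_chars() is hypothetical, and the input is assumed to be valid UTF-8.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: true if $s (assumed to be valid UTF-8) contains a
 * character above U+FFFF, i.e. one that takes 4 bytes in UTF-8 and so
 * needs utf8mb4 rather than the 3-byte utf8 character set in MySQL.
 */
function has_four_byte_chars(string $s): bool
{
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $s) === 1;
}

$emoji = "\u{1F600}"; // U+1F600, stored as 4 bytes in UTF-8

var_dump(strlen($emoji));                    // int(4): bytes, not characters
var_dump(has_four_byte_chars($emoji));       // bool(true)
var_dump(has_four_byte_chars("na\u{EF}ve")); // bool(false): 2-byte character only
```

A check like this could be run on input before truncating it to a column width, so that strings needing utf8mb4 are noticed early.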
mysqli_character_set_name
Returns the current character set of the database connection.
Parameters
Procedural style only: A mysqli object returned by mysqli_connect() or mysqli_init()
Return Values
The current character set of the connection
Examples
Example #1 mysqli::character_set_name() example
Object-oriented style
<?php
mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);
$mysqli = new mysqli("localhost", "my_user", "my_password", "world");
/* Set the default character set */
$mysqli->set_charset('utf8mb4');
/* Print current character set */
$charset = $mysqli->character_set_name();
printf("Current character set is %s\n", $charset);
?>
Procedural style
<?php
mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);
$mysqli = mysqli_connect("localhost", "my_user", "my_password", "world");
/* Set the default character set */
mysqli_set_charset($mysqli, 'utf8mb4');
/* Print current character set */
$charset = mysqli_character_set_name($mysqli);
printf("Current character set is %s\n", $charset);
?>
The above examples will output:
Current character set is utf8mb4
See Also
- mysqli_set_charset() — Sets the client character set
- mysqli_real_escape_string() — Escapes special characters in a string for use in an SQL statement, taking into account the current charset of the connection