測試環境為 CentOS 7 / CentOS 8 x86_64
MariaDB (MySQL) 資料庫編碼集預設為 latin1 (西歐語言).
[root@localhost ~]# mysql -u root -p Enter password: Welcome to the MariaDB monitor. Commands end with ; or \g. Your MariaDB connection id is 3 Server version: 5.5.60-MariaDB MariaDB Server Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. MariaDB [(none)]> SHOW VARIABLES LIKE "%character%"; +--------------------------+----------------------------+ | Variable_name | Value | +--------------------------+----------------------------+ | character_set_client | utf8 | | character_set_connection | utf8 | | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | utf8 | | character_set_server | latin1 | | character_set_system | utf8 | | character_sets_dir | /usr/share/mysql/charsets/ | +--------------------------+----------------------------+ 8 rows in set (0.00 sec)
- character_set_client : 來自客戶端語句的字元編碼 ( Character Sets ).
- character_set_connection : “connection” 指的是從客戶端連接到服務器交互的會話的過程.如客戶端發送SQL 查詢語句,服務器回應客戶端的過程.
- character_set_database : 資料庫字元編碼 ( Character Sets ).
- character_set_filesystem : 檔案系統的字元編碼 ( Character Sets ).
- character_set_results : 查詢結果回傳給客戶端的字元編碼 ( Character Sets ).
- character_set_server : 資料庫伺服器的字元編碼 ( Character Sets ).
- character_set_system : The character set used by the server for storing identifiers. 該值始終為utf8.
文字排序 ( Collations )
MariaDB [(none)]> SHOW VARIABLES LIKE "%collation%"; +----------------------+-----------------+ | Variable_name | Value | +----------------------+-----------------+ | collation_connection | utf8_general_ci | | collation_database | utf8_general_ci | | collation_server | utf8_general_ci | +----------------------+-----------------+ 3 rows in set (0.001 sec)
更多關於字元編碼 ( Character Sets ) 設定,請參考 – https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_character_set_client
我的資料庫主要是儲存英語與中文,之前一直遇到搜尋不正確或是搜尋不到的問題,後來是將資料庫字元編碼 ( Character Sets ) 設定為 utf8 (8-bit Unicode Transformation Format).除了字元編碼設定為 utf8 外還要注意文字排序 ( Collations ) 的問題.
utf8 字元編碼 ( Character Sets ) 其 Collations 預設為 utf8_general_ci .
MariaDB [(none)]> SHOW CHARACTER SET LIKE 'utf8'; +---------+---------------+-------------------+--------+ | Charset | Description | Default collation | Maxlen | +---------+---------------+-------------------+--------+ | utf8 | UTF-8 Unicode | utf8_general_ci | 3 | +---------+---------------+-------------------+--------+ 1 row in set (0.00 sec)
不過蠻多討論都建議將 utf8_general_ci 改成 utf8_unicode_ci ,關於文字排序 ( Collations ) 請自行參考:
utf8_general_ci – http://collation-charts.org/mysql60/mysql604.utf8_general_ci.european.html
utf8_unicode_ci – http://collation-charts.org/mysql60/mysql604.utf8_unicode_ci.european.html
下面提供幾種方式來設定 Character Sets 與 Collations.
Database
- 建立資料庫時指定
下面我們在建立資料庫時指定資料庫字元編碼 ( Character Sets ) 以及 文字排序 ( Collations ) 設定為 utf8 與 utf8_general_ci.MariaDB [(none)]> CREATE DATABASE test1 DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci; Query OK, 1 row affected (0.00 sec)
檢視一下.
MariaDB [(none)]> USE test1; Database changed MariaDB [test1]> STATUS; -------------- mysql Ver 15.1 Distrib 5.5.60-MariaDB, for Linux (x86_64) using readline 5.1 Connection id: 3 Current database: test1 Current user: root@localhost SSL: Not in use Current pager: stdout Using outfile: '' Using delimiter: ; Server: MariaDB Server version: 5.5.60-MariaDB MariaDB Server Protocol version: 10 Connection: Localhost via UNIX socket Server characterset: latin1 Db characterset: utf8 Client characterset: utf8 Conn. characterset: utf8 UNIX socket: /var/lib/mysql/mysql.sock Uptime: 1 hour 13 min 44 sec Threads: 1 Questions: 19 Slow queries: 0 Opens: 1 Flush tables: 2 Open tables: 27 Queries per second avg: 0.004 --------------
- 修改資料庫編碼
如果在建立資料庫時沒有指定資料庫字元編碼 ( Character Sets ) 以及 文字排序 ( Collations ) 時,可以透過 ALTER 設定為 utf8 與 utf8_general_ci.MariaDB [(none)]> CREATE DATABASE test2; Query OK, 1 row affected (0.00 sec) MariaDB [(none)]> USE test2; Database changed MariaDB [test2]> STATUS; -------------- mysql Ver 15.1 Distrib 5.5.60-MariaDB, for Linux (x86_64) using readline 5.1 Connection id: 3 Current database: test2 Current user: root@localhost SSL: Not in use Current pager: stdout Using outfile: '' Using delimiter: ; Server: MariaDB Server version: 5.5.60-MariaDB MariaDB Server Protocol version: 10 Connection: Localhost via UNIX socket Server characterset: latin1 Db characterset: latin1 Client characterset: utf8 Conn. characterset: utf8 UNIX socket: /var/lib/mysql/mysql.sock Uptime: 1 hour 16 min 23 sec Threads: 1 Questions: 26 Slow queries: 0 Opens: 1 Flush tables: 2 Open tables: 27 Queries per second avg: 0.005 --------------
透過 ALTER 設定為 utf8 與 utf8_general_ci.
MariaDB [test1]> ALTER DATABASE test2 CHARACTER SET utf8 COLLATE utf8_general_ci; Query OK, 1 row affected (0.00 sec) MariaDB [test2]> STATUS; -------------- mysql Ver 15.1 Distrib 5.5.60-MariaDB, for Linux (x86_64) using readline 5.1 Connection id: 3 Current database: test2 Current user: root@localhost SSL: Not in use Current pager: stdout Using outfile: '' Using delimiter: ; Server: MariaDB Server version: 5.5.60-MariaDB MariaDB Server Protocol version: 10 Connection: Localhost via UNIX socket Server characterset: latin1 Db characterset: utf8 Client characterset: utf8 Conn. characterset: utf8 UNIX socket: /var/lib/mysql/mysql.sock Uptime: 1 hour 17 min 16 sec Threads: 1 Questions: 29 Slow queries: 0 Opens: 1 Flush tables: 2 Open tables: 27 Queries per second avg: 0.006 --------------
- 預設 Character Sets
這個方式是透過修改設定檔 /etc/my.cnf (資料庫服務需要重啟),主要在 [mysqld] , [mysqld_safe] 區塊下新增 character-set-server=utf8 與 character-set-filesystem = utf8 , collation-server=utf8_general_ci ,以及新增 [client] 區塊及 #default-character-set=utf8- CentOS 7
[root@localhost ~]# vi /etc/my.cnf [mysqld] datadir=/var/lib/mysql socket=/var/lib/mysql/mysql.sock # Disabling symbolic-links is recommended to prevent assorted security risks symbolic-links=0 # Settings user and group are ignored when systemd is used. # If you need to run mysqld under a different user or group, # customize your systemd unit file for mariadb according to the # instructions in http://fedoraproject.org/wiki/Systemd character-set-server=utf8 character-set-filesystem=utf8 collation-server=utf8_general_ci [mysqld_safe] log-error=/var/log/mariadb/mariadb.log pid-file=/var/run/mariadb/mariadb.pid character-set-server=utf8 character-set-filesystem=utf8 collation-server=utf8_general_ci # # include all files from the config directory # !includedir /etc/my.cnf.d [client] default-character-set=utf8
[root@localhost ~]# systemctl restart mariadb [root@localhost ~]# systemctl status mariadb ● mariadb.service - MariaDB database server Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled) Active: active (running) since 三 2019-10-16 11:31:06 CST; 8s ago Process: 14422 ExecStartPost=/usr/libexec/mariadb-wait-ready $MAINPID (code=exited, status=0/SUCCESS) Process: 14387 ExecStartPre=/usr/libexec/mariadb-prepare-db-dir %n (code=exited, status=0/SUCCESS) Main PID: 14420 (mysqld_safe) Tasks: 20 CGroup: /system.slice/mariadb.service ├─14420 /bin/sh /usr/bin/mysqld_safe --basedir=/usr └─14644 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/pl...
- CentOS 8
設定方式同上, [mysqld] 區塊位於檔案 /etc/my.cnf.d/mariadb-server.cnf 與 [client] 區塊位於檔案 /etc/my.cnf.d/client.cnf
檢視一下.
[root@localhost ~]# mysql -u root -p Enter password: Welcome to the MariaDB monitor. Commands end with ; or \g. Your MariaDB connection id is 3 Server version: 5.5.60-MariaDB MariaDB Server Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. MariaDB [(none)]> SHOW VARIABLES LIKE "char%"; +--------------------------+----------------------------+ | Variable_name | Value | +--------------------------+----------------------------+ | character_set_client | utf8 | | character_set_connection | utf8 | | character_set_database | utf8 | | character_set_filesystem | utf8 | | character_set_results | utf8 | | character_set_server | utf8 | | character_set_system | utf8 | | character_sets_dir | /usr/share/mysql/charsets/ | +--------------------------+----------------------------+ 8 rows in set (0.00 sec)
- CentOS 7
Table
- 建立資料表時指定
MariaDB [astl1]> USE test1; Database changed
MariaDB [test1]> CREATE TABLE T1 (ID INT(11) UNSIGNED , Data1 TEXT ) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci; Query OK, 0 rows affected (0.004 sec)
MariaDB [test1]> SHOW CREATE TABLE T1; +-------+-------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +-------+-------------------------------------------------------------------------------------------------------------------------------+ | T1 | CREATE TABLE `T1` ( `ID` int(11) unsigned DEFAULT NULL, `Data1` text DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 | +-------+-------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.001 sec)
- 修改資料表編碼
透過 ALTER 設定為 utf8 與 utf8_general_ci.MariaDB [test1]> ALTER TABLE T1 CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; Query OK, 0 rows affected (0.016 sec) Records: 0 Duplicates: 0 Warnings: 0 MariaDB [test1]> SHOW CREATE TABLE T1; +-------+----------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +-------+----------------------------------------------------------------------------------------------------------------------------+ | T1 | CREATE TABLE `T1` ( `ID` int(11) unsigned DEFAULT NULL, `Data1` text DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8 | +-------+----------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.001 sec)
Column
- 修改資料欄位編碼
MariaDB [test1]> ALTER TABLE T1 MODIFY Data1 text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; Query OK, 0 rows affected (0.010 sec) Records: 0 Duplicates: 0 Warnings: 0 MariaDB [test1]> SHOW CREATE TABLE T1; +-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Table | Create Table | +-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | T1 | CREATE TABLE `T1` ( `ID` int(11) unsigned DEFAULT NULL, `Data1` text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8 | +-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.000 sec)
遇過問題
要把資料輸入到資料表時遇到以下的錯誤訊息.
Incorrect string value: '\\xF0\x9F\x91\xBD\xF0
查詢後是編碼問題,我的資料含有特殊字元,建議使用 CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci .