MariaDB 資料庫編碼

Loading

測試環境為 CentOS 7 / CentOS 8 x86_64

MariaDB (MySQL) 資料庫編碼集預設為 latin1 (西歐語言).

[root@localhost ~]# mysql -u root -p
Enter password: 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 5.5.60-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> SHOW VARIABLES LIKE "%character%";
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
  • character_set_client : 來自客戶端語句的字元編碼 ( Character Sets ).
  • character_set_connection : “connection” 指的是從客戶端連接到服務器交互的會話的過程.如客戶端發送SQL 查詢語句,服務器回應客戶端的過程.
  • character_set_database : 資料庫字元編碼 ( Character Sets ).
  • character_set_filesystem : 檔案系統的字元編碼 ( Character Sets ).
  • character_set_results : 查詢結果回傳給客戶端的字元編碼 ( Character Sets ).
  • character_set_server : 資料庫伺服器的字元編碼 ( Character Sets ).
  • character_set_system : The character set used by the server for storing identifiers. 該值始終為utf8.

文字排序 ( Collations )

MariaDB [(none)]> SHOW VARIABLES LIKE "%collation%";
+----------------------+-----------------+
| Variable_name        | Value           |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database   | utf8_general_ci |
| collation_server     | utf8_general_ci |
+----------------------+-----------------+
3 rows in set (0.001 sec)

更多關於字元編碼 ( Character Sets ) 設定,請參考 – https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_character_set_client

我的資料庫主要是儲存英語與中文,之前一直遇到搜尋不正確或是搜尋不到的問題,後來是將資料庫字元編碼 ( Character Sets ) 設定為 utf8 (8-bit Unicode Transformation Format).除了字元編碼設定為 utf8 外還要注意文字排序 ( Collations ) 的問題.

utf8 字元編碼 ( Character Sets ) 其 Collations 預設為 utf8_general_ci .

MariaDB [(none)]> SHOW CHARACTER SET LIKE 'utf8';
+---------+---------------+-------------------+--------+
| Charset | Description   | Default collation | Maxlen |
+---------+---------------+-------------------+--------+
| utf8    | UTF-8 Unicode | utf8_general_ci   |      3 |
+---------+---------------+-------------------+--------+
1 row in set (0.00 sec)

不過蠻多討論都建議將 utf8_general_ci 改成 utf8_unicode_ci ,關於文字排序 ( Collations ) 請自行參考:
utf8_general_ci – http://collation-charts.org/mysql60/mysql604.utf8_general_ci.european.html
utf8_unicode_ci – http://collation-charts.org/mysql60/mysql604.utf8_unicode_ci.european.html

下面提供幾種方式來設定 Character Sets 與 Collations.

Database

  • 建立資料庫時指定
    下面我們在建立資料庫時指定資料庫字元編碼 ( Character Sets ) 以及 文字排序 ( Collations ) 設定為 utf8 與 utf8_general_ci.

    MariaDB [(none)]> CREATE DATABASE test1 DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
    Query OK, 1 row affected (0.00 sec)
    

    檢視一下.

    MariaDB [(none)]> USE test1;
    Database changed
    MariaDB [test1]> STATUS;
    --------------
    mysql  Ver 15.1 Distrib 5.5.60-MariaDB, for Linux (x86_64) using readline 5.1
    
    Connection id:          3
    Current database:       test1
    Current user:           root@localhost
    SSL:                    Not in use
    Current pager:          stdout
    Using outfile:          ''
    Using delimiter:        ;
    Server:                 MariaDB
    Server version:         5.5.60-MariaDB MariaDB Server
    Protocol version:       10
    Connection:             Localhost via UNIX socket
    Server characterset:    latin1
    Db     characterset:    utf8
    Client characterset:    utf8
    Conn.  characterset:    utf8
    UNIX socket:            /var/lib/mysql/mysql.sock
    Uptime:                 1 hour 13 min 44 sec
    
    Threads: 1  Questions: 19  Slow queries: 0  Opens: 1  Flush tables: 2  Open tables: 27  Queries per second avg: 0.004
    --------------
    
  • 修改資料庫編碼
    如果在建立資料庫時沒有指定資料庫字元編碼 ( Character Sets ) 以及 文字排序 ( Collations ) 時,可以透過 ALTER 設定為 utf8 與 utf8_general_ci.

    MariaDB [(none)]> CREATE DATABASE test2;
    Query OK, 1 row affected (0.00 sec)
    
    MariaDB [(none)]> USE test2;
    Database changed
    MariaDB [test2]> STATUS;
    --------------
    mysql  Ver 15.1 Distrib 5.5.60-MariaDB, for Linux (x86_64) using readline 5.1
    
    Connection id:          3
    Current database:       test2
    Current user:           root@localhost
    SSL:                    Not in use
    Current pager:          stdout
    Using outfile:          ''
    Using delimiter:        ;
    Server:                 MariaDB
    Server version:         5.5.60-MariaDB MariaDB Server
    Protocol version:       10
    Connection:             Localhost via UNIX socket
    Server characterset:    latin1
    Db     characterset:    latin1
    Client characterset:    utf8
    Conn.  characterset:    utf8
    UNIX socket:            /var/lib/mysql/mysql.sock
    Uptime:                 1 hour 16 min 23 sec
    
    Threads: 1  Questions: 26  Slow queries: 0  Opens: 1  Flush tables: 2  Open tables: 27  Queries per second avg: 0.005
    --------------
    

    透過 ALTER 設定為 utf8 與 utf8_general_ci.

    MariaDB [test1]> ALTER DATABASE test2 CHARACTER SET utf8 COLLATE utf8_general_ci;
    Query OK, 1 row affected (0.00 sec)
    
    MariaDB [test2]> STATUS;
    --------------
    mysql  Ver 15.1 Distrib 5.5.60-MariaDB, for Linux (x86_64) using readline 5.1
    
    Connection id:          3
    Current database:       test2
    Current user:           root@localhost
    SSL:                    Not in use
    Current pager:          stdout
    Using outfile:          ''
    Using delimiter:        ;
    Server:                 MariaDB
    Server version:         5.5.60-MariaDB MariaDB Server
    Protocol version:       10
    Connection:             Localhost via UNIX socket
    Server characterset:    latin1
    Db     characterset:    utf8
    Client characterset:    utf8
    Conn.  characterset:    utf8
    UNIX socket:            /var/lib/mysql/mysql.sock
    Uptime:                 1 hour 17 min 16 sec
    
    Threads: 1  Questions: 29  Slow queries: 0  Opens: 1  Flush tables: 2  Open tables: 27  Queries per second avg: 0.006
    --------------
    
  • 預設 Character Sets
    這個方式是透過修改設定檔 /etc/my.cnf (資料庫服務需要重啟),主要在 [mysqld] , [mysqld_safe] 區塊下新增 character-set-server=utf8 與 character-set-filesystem = utf8 , collation-server=utf8_general_ci ,以及新增 [client] 區塊及 #default-character-set=utf8

    • CentOS 7
      [root@localhost ~]# vi /etc/my.cnf
      [mysqld]
      datadir=/var/lib/mysql
      socket=/var/lib/mysql/mysql.sock
      # Disabling symbolic-links is recommended to prevent assorted security risks
      symbolic-links=0
      # Settings user and group are ignored when systemd is used.
      # If you need to run mysqld under a different user or group,
      # customize your systemd unit file for mariadb according to the
      # instructions in http://fedoraproject.org/wiki/Systemd
      character-set-server=utf8
      character-set-filesystem=utf8
      collation-server=utf8_general_ci
      
      [mysqld_safe]
      log-error=/var/log/mariadb/mariadb.log
      pid-file=/var/run/mariadb/mariadb.pid
      character-set-server=utf8
      character-set-filesystem=utf8
      collation-server=utf8_general_ci
      
      #
      # include all files from the config directory
      #
      !includedir /etc/my.cnf.d
      
      [client]
      default-character-set=utf8
      
      [root@localhost ~]# systemctl restart mariadb
      [root@localhost ~]# systemctl status mariadb
      ● mariadb.service - MariaDB database server
         Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
         Active: active (running) since 三 2019-10-16 11:31:06 CST; 8s ago
        Process: 14422 ExecStartPost=/usr/libexec/mariadb-wait-ready $MAINPID (code=exited, status=0/SUCCESS)
        Process: 14387 ExecStartPre=/usr/libexec/mariadb-prepare-db-dir %n (code=exited, status=0/SUCCESS)
       Main PID: 14420 (mysqld_safe)
          Tasks: 20
         CGroup: /system.slice/mariadb.service
                 ├─14420 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
                 └─14644 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/pl...
      
    • CentOS 8
      設定方式同上, [mysqld] 區塊位於檔案 /etc/my.cnf.d/mariadb-server.cnf 與 [client] 區塊位於檔案 /etc/my.cnf.d/client.cnf

    檢視一下.

    [root@localhost ~]# mysql -u root -p
    Enter password: 
    Welcome to the MariaDB monitor.  Commands end with ; or \g.
    Your MariaDB connection id is 3
    Server version: 5.5.60-MariaDB MariaDB Server
    
    Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    MariaDB [(none)]> SHOW VARIABLES LIKE "char%";
    +--------------------------+----------------------------+
    | Variable_name            | Value                      |
    +--------------------------+----------------------------+
    | character_set_client     | utf8                       |
    | character_set_connection | utf8                       |
    | character_set_database   | utf8                       |
    | character_set_filesystem | utf8                       |
    | character_set_results    | utf8                       |
    | character_set_server     | utf8                       |
    | character_set_system     | utf8                       |
    | character_sets_dir       | /usr/share/mysql/charsets/ |
    +--------------------------+----------------------------+
    8 rows in set (0.00 sec)
    

Table

  • 建立資料表時指定
    MariaDB [astl1]> USE test1;
    Database changed
    
    MariaDB [test1]> CREATE TABLE T1 (ID INT(11) UNSIGNED , Data1 TEXT ) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
    Query OK, 0 rows affected (0.004 sec)
    
    MariaDB [test1]> SHOW CREATE TABLE T1;
    +-------+-------------------------------------------------------------------------------------------------------------------------------+
    | Table | Create Table                                                                                                                  |
    +-------+-------------------------------------------------------------------------------------------------------------------------------+
    | T1    | CREATE TABLE `T1` (
      `ID` int(11) unsigned DEFAULT NULL,
      `Data1` text DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 |
    +-------+-------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.001 sec)
    
  • 修改資料表編碼
    透過 ALTER 設定為 utf8 與 utf8_general_ci.

    MariaDB [test1]> ALTER TABLE T1 CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
    Query OK, 0 rows affected (0.016 sec)
    Records: 0  Duplicates: 0  Warnings: 0
    
    MariaDB [test1]> SHOW CREATE TABLE T1;
    +-------+----------------------------------------------------------------------------------------------------------------------------+
    | Table | Create Table                                                                                                               |
    +-------+----------------------------------------------------------------------------------------------------------------------------+
    | T1    | CREATE TABLE `T1` (
      `ID` int(11) unsigned DEFAULT NULL,
      `Data1` text DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
    +-------+----------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.001 sec)
    

Column

  • 修改資料欄位編碼
    MariaDB [test1]> ALTER TABLE T1 MODIFY Data1 text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
    Query OK, 0 rows affected (0.010 sec)
    Records: 0  Duplicates: 0  Warnings: 0
    
    MariaDB [test1]> SHOW CREATE TABLE T1;
    +-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Table | Create Table                                                                                                                                                                |
    +-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | T1    | CREATE TABLE `T1` (
      `ID` int(11) unsigned DEFAULT NULL,
      `Data1` text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
    +-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.000 sec)
    

遇過問題

要把資料輸入到資料表時遇到以下的錯誤訊息.

Incorrect string value: '\\xF0\x9F\x91\xBD\xF0

查詢後是編碼問題,我的資料含有特殊字元,建議使用 CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci .

沒有解決問題,試試搜尋本站其他內容

發佈留言

發佈留言必須填寫的電子郵件地址不會公開。 必填欄位標示為 *

這個網站採用 Akismet 服務減少垃圾留言。進一步了解 Akismet 如何處理網站訪客的留言資料