SQL 語法 SELECT REGEXP

Loading

SQL 語法 SELECT 的字串比對 (需注意資料庫字元編碼 Character Sets 與 文字排序 Collations ,請參考資料庫編碼 – https://benjr.tw/102156 ),除了常用 LIKE – https://benjr.tw/98899 外還有 REGEXP (正規表示式) 可以使用.

SELECT * FROM Table1 WHERE col1 REGEXP '^abc';

測試環境為 CentOS 7 x86_64 (虛擬機)

REGEXP 正規表示式字元可以使用下面的的特殊符號來表示:

  • ^ 表示匹配的字元串在行首.
    搜尋 a 開頭的字串 (0 沒有 , 1 有).

    MariaDB [(none)]> SELECT 'abc' REGEXP '^a';
    +-------------------+
    | 'abc' REGEXP '^a' |
    +-------------------+
    |                 1 |
    +-------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'babc' REGEXP '^a';
    +--------------------+
    | 'babc' REGEXP '^a' |
    +--------------------+
    |                  0 |
    +--------------------+
    1 row in set (0.01 sec)
    
  • $ 表示匹配的字元串在行尾.
    搜尋 有 a 結尾的字串 (0 沒有 , 1 有).

    MariaDB [(none)]> SELECT 'babc' REGEXP 'a$';
    +--------------------+
    | 'babc' REGEXP 'a$' |
    +--------------------+
    |                  0 |
    +--------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'babca' REGEXP 'a$';
    +---------------------+
    | 'babca' REGEXP 'a$' |
    +---------------------+
    |                   1 |
    +---------------------+
    1 row in set (0.00 sec)
    
  • . 表示為任一字元.
    搜尋 有 a任一字元 的字串 (0 沒有 , 1 有).

    MariaDB [(none)]> SELECT 'babc' REGEXP 'a.';
    +---------------------+
    | 'babca' REGEXP 'a.' |
    +---------------------+
    |                   1 |
    +---------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'bbca' REGEXP 'a.';
    +--------------------+
    | 'bbca' REGEXP 'a.' |
    +--------------------+
    |                  0 |
    +--------------------+
    1 row in set (0.00 sec)
    
  • * 表示 “重複” 零個到無窮多個的字元.
    搜尋 有 a aa aaa aaaa… 的字串 (0 沒有 , 1 有).

    MariaDB [(none)]> SELECT 'bbc' REGEXP 'aa*';
    +--------------------+
    | 'bbc' REGEXP 'aa*' |
    +--------------------+
    |                  0 |
    +--------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'babc' REGEXP 'aa*';
    +---------------------+
    | 'babc' REGEXP 'aa*' |
    +---------------------+
    |                   1 |
    +---------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'baaaabc' REGEXP 'aa*';
    +------------------------+
    | 'baaaabc' REGEXP 'aa*' |
    +------------------------+
    |                      1 |
    +------------------------+
    1 row in set (0.00 sec)
    
  • .* 表示為任意字元.
    . 表示為任一字元,* 表示 “重複” 零個到無窮多個的前字元, e.* 表示為只需要有 e 的任意字串.

    MariaDB [(none)]> SELECT 'ven' REGEXP 'e.*';
    +--------------------+
    | 'ven' REGEXP 'e.*' |
    +--------------------+
    |                  1 |
    +--------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 've' REGEXP 'e.*';
    +-------------------+
    | 've' REGEXP 'e.*' |
    +-------------------+
    |                 1 |
    +-------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'vbc' REGEXP 'e.*';
    +--------------------+
    | 'vbc' REGEXP 'e.*' |
    +--------------------+
    |                  0 |
    +--------------------+
    1 row in set (0.00 sec)
    
  • + 表示 “重複” 一個到無窮多個的字元.
    搜尋 有 a aa aaa aaaa… 的字串 (0 沒有 , 1 有).

    MariaDB [(none)]> SELECT 'bbc' REGEXP 'a+';
    +-------------------+
    | 'bbc' REGEXP 'a+' |
    +-------------------+
    |                 0 |
    +-------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'baaaabc' REGEXP 'a+';
    +-----------------------+
    | 'baaaabc' REGEXP 'a+' |
    +-----------------------+
    |                     1 |
    +-----------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'babc' REGEXP 'a+';
    +--------------------+
    | 'babc' REGEXP 'a+' |
    +--------------------+
    |                  1 |
    +--------------------+
    1 row in set (0.00 sec)
    
  • ? 表示 “重複” 零個到一個的字元.
    搜尋 有 a aa 的字串 (0 沒有 , 1 有).

    MariaDB [(none)]> SELECT 'babc' REGEXP 'aa?';
    +---------------------+
    | 'babc' REGEXP 'aa?' |
    +---------------------+
    |                   1 |
    +---------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'bbbc' REGEXP 'aa?';
    +---------------------+
    | 'bbbc' REGEXP 'aa?' |
    +---------------------+
    |                   0 |
    +---------------------+
    1 row in set (0.00 sec)
    
  • A|B , (A|B) 可以是 A 或是 B.
    搜尋含有 aa 或 bb 的字串 (0 沒有 , 1 有).

    MariaDB [(none)]> SELECT 'bbbc' REGEXP 'aa|bb';
    +-----------------------+
    | 'bbbc' REGEXP 'aa|bb' |
    +-----------------------+
    |                     1 |
    +-----------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'babc' REGEXP 'aa|bb';
    +-----------------------+
    | 'babc' REGEXP 'aa|bb' |
    +-----------------------+
    |                     0 |
    +-----------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'baac' REGEXP 'aa|bb';
    +-----------------------+
    | 'baac' REGEXP 'aa|bb' |
    +-----------------------+
    |                     1 |
    +-----------------------+
    1 row in set (0.00 sec)
    

    搜尋含有 ab 或 bb 的字串 (0 沒有 , 1 有).

    MariaDB [(none)]> SELECT 'bbbc' REGEXP '(a|b)b';
    +------------------------+
    | 'bbbc' REGEXP '(a|b)b' |
    +------------------------+
    |                      1 |
    +------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'abbc' REGEXP '(a|b)b';
    +------------------------+
    | 'abbc' REGEXP '(a|b)b' |
    +------------------------+
    |                      1 |
    +------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'acbc' REGEXP '(a|b)b';
    +------------------------+
    | 'acbc' REGEXP '(a|b)b' |
    +------------------------+
    |                      0 |
    +------------------------+
    1 row in set (0.00 sec)
    
  • {n} 配對次數 n 次(須為整數).
    搜尋含有 aa 的字串 (0 沒有 , 1 有).

    MariaDB [(none)]> SELECT 'babac' REGEXP 'a{2}';
    +-----------------------+
    | 'babac' REGEXP 'a{2}' |
    +-----------------------+
    |                     0 |
    +-----------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'baabac' REGEXP 'a{2}';
    +------------------------+
    | 'baabac' REGEXP 'a{2}' |
    +------------------------+
    |                      1 |
    +------------------------+
    1 row in set (0.00 sec)
    
  • [ ] 包含 [ ] 內的任一字元.
    搜尋含有 aa 或是 ac 的字串 (0 沒有 , 1 有).

    MariaDB [(none)]> SELECT 'babac' REGEXP 'a[ac]';
    +------------------------+
    | 'babac' REGEXP 'a[ac]' |
    +------------------------+
    |                      1 |
    +------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'babad' REGEXP 'a[ac]';
    +------------------------+
    | 'babad' REGEXP 'a[ac]' |
    +------------------------+
    |                      0 |
    +------------------------+
    1 row in set (0.00 sec)
    

    a[a-bf] 表示搜尋含有 aa , ab ,或是 af 的字串 (0 沒有 , 1 有).

    MariaDB [(none)]> SELECT 'baaa' REGEXP 'a[a-bf]';
    +-------------------------+
    | 'baaa' REGEXP 'a[a-bf]' |
    +-------------------------+
    |                       1 |
    +-------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'baba' REGEXP 'a[a-bf]';
    +-------------------------+
    | 'baba' REGEXP 'a[a-bf]' |
    +-------------------------+
    |                       1 |
    +-------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'baca' REGEXP 'a[a-bf]';
    +-------------------------+
    | 'baca' REGEXP 'a[a-bf]' |
    +-------------------------+
    |                       0 |
    +-------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'bafa' REGEXP 'a[a-bf]';
    +-------------------------+
    | 'bafa' REGEXP 'a[a-bf]' |
    +-------------------------+
    |                       1 |
    +-------------------------+
    1 row in set (0.00 sec)
    

    搜尋含有 a0 , a1 ,或是 a2 的字串 (0 沒有 , 1 有).

    MariaDB [(none)]> SELECT 'baafa' REGEXP 'a[0-2]';
    +-------------------------+
    | 'baafa' REGEXP 'a[0-2]' |
    +-------------------------+
    |                       0 |
    +-------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'ba0fa' REGEXP 'a[0-2]';
    +-------------------------+
    | 'ba0fa' REGEXP 'a[0-2]' |
    +-------------------------+
    |                       1 |
    +-------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'ba1fa' REGEXP 'a[0-2]';
    +-------------------------+
    | 'ba1fa' REGEXP 'a[0-2]' |
    +-------------------------+
    |                       1 |
    +-------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'ba2fa' REGEXP 'a[0-2]';
    +-------------------------+
    | 'ba2fa' REGEXP 'a[0-2]' |
    +-------------------------+
    |                       1 |
    +-------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'ba3fa' REGEXP 'a[0-2]';
    +-------------------------+
    | 'ba3fa' REGEXP 'a[0-2]' |
    +-------------------------+
    |                       0 |
    +-------------------------+
    1 row in set (0.00 sec)
    
  • {n,m} 配對次數從 n 到 m 次(皆須為整數,n 需小於等於 m , n<=m).
    搜尋含有 abbc 或是 accc 的字串 (0 沒有 , 1 有).

    MariaDB [(none)]> SELECT 'abbcdf' REGEXP 'a[bc]{2}c';
    +-----------------------------+
    | 'abbcdf' REGEXP 'a[bc]{2}c' |
    +-----------------------------+
    |                           1 |
    +-----------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'abbddf' REGEXP 'a[bc]{2}c';
    +-----------------------------+
    | 'abbddf' REGEXP 'a[bc]{2}c' |
    +-----------------------------+
    |                           0 |
    +-----------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'acccdf' REGEXP 'a[bc]{2}c';
    +-----------------------------+
    | 'acccdf' REGEXP 'a[bc]{2}c' |
    +-----------------------------+
    |                           1 |
    +-----------------------------+
    1 row in set (0.00 sec)
    

    搜尋含有 abbdf , abbbdf , accdf 或是 acccdf 的字串 (0 沒有 , 1 有).

    MariaDB [(none)]> SELECT 'acdf' REGEXP 'a[bc]{2,3}df';
    +------------------------------+
    | 'acdf' REGEXP 'a[bc]{2,3}df' |
    +------------------------------+
    |                            0 |
    +------------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'accdf' REGEXP 'a[bc]{2,3}df';
    +-------------------------------+
    | 'accdf' REGEXP 'a[bc]{2,3}df' |
    +-------------------------------+
    |                             1 |
    +-------------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'acccdf' REGEXP 'a[bc]{2,3}df';
    +--------------------------------+
    | 'acccdf' REGEXP 'a[bc]{2,3}df' |
    +--------------------------------+
    |                              1 |
    +--------------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'accccdf' REGEXP 'a[bc]{2,3}df';
    +---------------------------------+
    | 'accccdf' REGEXP 'a[bc]{2,3}df' |
    +---------------------------------+
    |                               0 |
    +---------------------------------+
    1 row in set (0.00 sec)
    
  • [^ ] 不包含 [ ] 內的任一字元.
    搜尋 Name 所有不包含 A.. , B.. 或是 C.. 的紀錄.

    MariaDB [(none)]> SELECT 'ben' REGEXP '[^abc]..';
    +-------------------------+
    | 'ben' REGEXP '[^abc]..' |
    +-------------------------+
    |                       0 |
    +-------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'Ann' REGEXP '[^abc]..';
    +-------------------------+
    | 'Ann' REGEXP '[^abc]..' |
    +-------------------------+
    |                       0 |
    +-------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'Vic' REGEXP '[^abc]..';
    +-------------------------+
    | 'Vic' REGEXP '[^abc]..' |
    +-------------------------+
    |                       1 |
    +-------------------------+
    1 row in set (0.00 sec)
    

REGEXP : character_class

前面有用到 [0-9] 來指示需包含 0-9 內的任一數字,我們還可以使用 character_class 的方式,用 [[:digit:]] 取代 [0-9] 來指定.

  • [:alnum:] 文字或是數位字元,表示為 [A-Z,a-z,0-9]
    MariaDB [(none)]> SELECT 'babac' REGEXP 'a[a-z]';
    +-------------------------+
    | 'babac' REGEXP 'a[a-z]' |
    +-------------------------+
    |                       1 |
    +-------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'babac' REGEXP 'a[[:alnum:]]';
    +-------------------------------+
    | 'babac' REGEXP 'a[[:alnum:]]' |
    +-------------------------------+
    |                             1 |
    +-------------------------------+
    1 row in set (0.00 sec)
    MariaDB [(none)]> SELECT 'ba9ac' REGEXP 'a[[:alnum:]]';
    +-------------------------------+
    | 'ba9ac' REGEXP 'a[[:alnum:]]' |
    +-------------------------------+
    |                             1 |
    +-------------------------------+
    1 row in set (0.00 sec)
    

    搭配 ^[[:alnum:]]+$ 可以確保資料只包含 0-9 與 英文字母.

    MariaDB [(none)]> SELECT 'ba12' REGEXP '^[[:alnum:]]+$';
    +--------------------------------+
    | 'ba12' REGEXP '^[[:alnum:]]+$' |
    +--------------------------------+
    |                              1 |
    +--------------------------------+
    1 row in set (0.000 sec)
    
    MariaDB [(none)]> SELECT 'ba12+' REGEXP '^[[:alnum:]]+$';
    +---------------------------------+
    | 'ba12+' REGEXP '^[[:alnum:]]+$' |
    +---------------------------------+
    |                               0 |
    +---------------------------------+
    1 row in set (0.000 sec)
    
  • [:alpha:] 文字字元,表示為 [A-Z,a-z]
    MariaDB [(none)]> SELECT 'babac' REGEXP 'a[[:alpha:]]';
    +-------------------------------+
    | 'babac' REGEXP 'a[[:alpha:]]' |
    +-------------------------------+
    |                             1 |
    +-------------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'ba9a3' REGEXP 'a[[:alpha:]]';
    +-------------------------------+
    | 'ba9a3' REGEXP 'a[[:alpha:]]' |
    +-------------------------------+
    |                             0 |
    +-------------------------------+
    1 row in set (0.00 sec)
    
  • [:lower:] 小寫字元,表示為 [A-Z] 不確定大小寫是否會區分? 要看編碼.
  • [:upper:] 大寫字元,表示為 [a-z] 不確定大小寫是否會區分? 要看編碼.
  • [:digit:] 數位字元,表示為 [0-9]
    MariaDB [(none)]> SELECT '0988-123456' REGEXP '[0-9]'; 
    +-----------------------+
    | '0988' REGEXP '[0-9]' |
    +-----------------------+
    |                     1 |
    +-----------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT '0988' REGEXP '[[:digit:]]'; 
    +-----------------------------+
    | '0988' REGEXP '[[:digit:]]' |
    +-----------------------------+
    |                           1 |
    +-----------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'ABCD' REGEXP '[0-9]'; 
    +-----------------------+
    | 'ABCD' REGEXP '[0-9]' |
    +-----------------------+
    |                     0 |
    +-----------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'ABCD' REGEXP '[[:digit:]]'; 
    +-----------------------------+
    | 'ABCD' REGEXP '[[:digit:]]' |
    +-----------------------------+
    |                           0 |
    +-----------------------------+
    1 row in set (0.00 sec)
    

    搭配 ^[[:digit:]]+$ 可以確保資料只包含 0-9.

    MariaDB [(none)]> SELECT '0988-123456' REGEXP '^[[:digit:]]+$';
    +---------------------------------------+
    | '0988-123456' REGEXP '^[[:digit:]]+$' |
    +---------------------------------------+
    |                                     0 |
    +---------------------------------------+
    1 row in set (0.000 sec)
    
    MariaDB [(none)]> SELECT '0988123456' REGEXP '^[[:digit:]]+$';
    +--------------------------------------+
    | '0988123456' REGEXP '^[[:digit:]]+$' |
    +--------------------------------------+
    |                                    1 |
    +--------------------------------------+
    1 row in set (0.000 sec)
    

    如果要允許包含 – 可以使用 , 來分隔.

    MariaDB [(none)]> SELECT '0988-123456' REGEXP '^[[:digit:],-]+$';
    +-----------------------------------------+
    | '0988-123456' REGEXP '^[[:digit:],-]+$' |
    +-----------------------------------------+
    |                                       1 |
    +-----------------------------------------+
    
  • [:xdigit:] 十六進位數字 0-9,A(10),B(11),C(12),D(13),E(14),F(15),表示為 [0-9,a-f,A-F]
    MariaDB [(none)]> SELECT 'XXYY' REGEXP '[[:xdigit:]]+'; 
    +-------------------------------+
    | 'XXYY' REGEXP '[[:xdigit:]]+' |
    +-------------------------------+
    |                             0 |
    +-------------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'AXXYY' REGEXP '[[:xdigit:]]+'; 
    +--------------------------------+
    | 'AXXYY' REGEXP '[[:xdigit:]]+' |
    +--------------------------------+
    |                              1 |
    +--------------------------------+
    1 row in set (0.00 sec)
    
  • [:space:] 所有空白字元 ,含 空白鍵, Tab, CR (新行)
    MariaDB [(none)]> SELECT 'AXX.YY' REGEXP '[[:space:]]+'; 
    +--------------------------------+
    | 'AXX.YY' REGEXP '[[:space:]]+' |
    +--------------------------------+
    |                              0 |
    +--------------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'AXX YY' REGEXP '[[:space:]]+'; 
    +--------------------------------+
    | 'AXX YY' REGEXP '[[:space:]]+' |
    +--------------------------------+
    |                              1 |
    +--------------------------------+
    1 row in set (0.00 sec)
    
  • [:graph:] 圖型字元,不包括空格.
  • [:print:] 同 graph 但包括空格.
  • [:cntrl:] 控制字元.
  • [:blank:] 空白鍵.
  • [:punct:] 標點符號.
    MariaDB [(none)]> SELECT 'AXXYY' REGEXP '[[:punct:]]+'; 
    +-------------------------------+
    | 'AXXYY' REGEXP '[[:punct:]]+' |
    +-------------------------------+
    |                             0 |
    +-------------------------------+
    1 row in set (0.00 sec)
    
    MariaDB [(none)]> SELECT 'AXX.YY' REGEXP '[[:punct:]]+'; 
    +--------------------------------+
    | 'AXX.YY' REGEXP '[[:punct:]]+' |
    +--------------------------------+
    |                              1 |
    +--------------------------------+
    1 row in set (0.01 sec)
    

如果是要全域搜尋(不指定欄位),可以使用 Full Text Index 來處理,詳細請參考 – https://benjr.tw/101644

沒有解決問題,試試搜尋本站其他內容

發佈留言

發佈留言必須填寫的電子郵件地址不會公開。 必填欄位標示為 *

這個網站採用 Akismet 服務減少垃圾留言。進一步了解 Akismet 如何處理網站訪客的留言資料