MariaDB – String Split using CTE

Loading

測試環境為 CentOS 8 x86_64 虛擬機.

有辦法在單一欄位的資料依據分隔字元將字串拆解成陣列嗎?

如:

+-------------+
|    Data     | 
+-------------+
| 'A , B , C' | 
+-------------+

依據分隔字元 ‘ , ‘ 將以上字串拆解成以下陣列.

+------+
| Data | 
+------+
|    A | 
|    B | 
|    C | 
+------+

MyaSQL / MariaDB 找不到相關函數 ( PostgreSQL 有相關 string_to_array() 函数) .不過找到一個方式,參考文章 – https://stackoverflow.com/questions/6381587/efficient-way-to-string-split-using-cte (範例為 微軟的 T-SQL,下面轉成 MySQL 語法)

[root@localhost ~]# mysql -u root -p
Enter password: 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 9
Server version: 10.3.17-MariaDB MariaDB Server
  
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
  
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
  
MariaDB [(none)]> CREATE DATABASE testdb;
Query OK, 1 row affected (0.00 sec)
   
MariaDB [(none)]> USE testdb;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
  
Database changed

建立測試用資料表.

MariaDB [testdb]> CREATE TABLE String_Content (RowID int, Layout varchar(200));
Query OK, 0 rows affected (0.003 sec)

MariaDB [testdb]> INSERT String_Content VALUE(1,'hello,world,welcome,to,tsql');
Query OK, 1 row affected (0.001 sec)

MariaDB [testdb]> INSERT String_Content VALUE(2,'welcome,to,stackoverflow');
Query OK, 1 row affected (0.001 sec)

MariaDB [testdb]> INSERT String_Content VALUE(3,'welcome,to,stackoverflow,tese1,test2,test3,test4');
Query OK, 1 row affected (0.001 sec)

檢視一下資料.

MariaDB [testdb]> SELECT * FROM String_Content;
+-------+--------------------------------------------------+
| RowID | Layout                                           |
+-------+--------------------------------------------------+
|     1 | hello,world,welcome,to,tsql                      |
|     2 | welcome,to,stackoverflow                         |
|     3 | welcome,to,stackoverflow,tese1,test2,test3,test4 |
+-------+--------------------------------------------------+
3 rows in set (0.004 sec)

利用下面 SQL 語法即可將字串做拆解.

WITH recursive SplitSting AS
(
    SELECT
        RowID,LEFT(Layout, INSTR(layout , ',') -1) AS Part
            ,RIGHT(Layout,LENGTH(Layout)-INSTR(layout , ',')) AS Remainder
        FROM String_Content
        WHERE Layout IS NOT NULL AND INSTR(layout , ',')>0
    UNION ALL
    SELECT
        RowID,LEFT(Remainder,INSTR( Remainder, ',')-1)
            ,RIGHT(Remainder,LENGTH(Remainder)-INSTR(Remainder , ','))
        FROM SplitSting
        WHERE Remainder IS NOT NULL AND INSTR( Remainder , ',')>0
    UNION ALL
    SELECT
        RowID,Remainder,null
        FROM SplitSting
        WHERE Remainder IS NOT NULL AND INSTR(Remainder ,',')=0
)
SELECT * FROM SplitSting ORDER BY RowID;

程式說明:
主要利用了 Recursive Common Table Expressions (CTE) 的功能,更多關於請參考 – https://benjr.tw/103485

WITH recursive SplitSting AS
  • 第一段程式
    Anchor Part 只會執行一次,資料為外部資料表.

    裡面用到 INSTR 函數可以查看子字串在字串的位置.

    MariaDB [testdb]> SELECT layout , INSTR(layout , ',') FROM String_Content;
    +--------------------------------------------------+---------------------+
    | layout                                           | INSTR(layout , ',') |
    +--------------------------------------------------+---------------------+
    | hello,world,welcome,to,tsql                      |                   6 |
    | welcome,to,stackoverflow                         |                   8 |
    | welcome,to,stackoverflow,tese1,test2,test3,test4 |                   8 |
    +--------------------------------------------------+---------------------+
    3 rows in set (0.000 sec)
    

    LEFT 與 RIGHT 是擷取指定字元數量,從左邊或是右邊.

    MariaDB [testdb]> SELECT layout , LEFT(layout,5) , RIGHT(layout,5) FROM String_Content;
    +--------------------------------------------------+----------------+-----------------+
    | layout                                           | LEFT(layout,5) | RIGHT(layout,5) |
    +--------------------------------------------------+----------------+-----------------+
    | hello,world,welcome,to,tsql                      | hello          | ,tsql           |
    | welcome,to,stackoverflow                         | welco          | rflow           |
    | welcome,to,stackoverflow,tese1,test2,test3,test4 | welco          | test4           |
    +--------------------------------------------------+----------------+-----------------+
    3 rows in set (0.000 sec)
    
        SELECT
            RowID,LEFT(Layout, INSTR(layout , ',') -1) AS Part
                ,RIGHT(Layout,LENGTH(Layout)-INSTR(layout , ',')) AS Remainder
            FROM String_Content
            WHERE Layout IS NOT NULL AND INSTR(layout , ',')>0;
    

    執行:

    +-------+---------+------------------------------------------+
    | RowID | Part    | Remainder                                |
    +-------+---------+------------------------------------------+
    |     1 | hello   | world,welcome,to,tsql                    |
    |     2 | welcome | to,stackoverflow                         |
    |     3 | welcome | to,stackoverflow,tese1,test2,test3,test4 |
    +-------+---------+------------------------------------------+
    3 rows in set (0.000 sec)
    

    透過 LEFT 與 RIGHT 依據 INSTR 提供的 字串與 ‘,’ 位置來將資料來擷取.

  • UNION ALL
    我們還有下面動作需執行,需靠 UNION ALL 將資料結合在一起.

    UNION ALL
    
  • 第二段程式
    Recursive Part 會重複執行直接 Where 判斷為非,資料為 CTE 本身資料表.

        SELECT
            RowID,LEFT(Remainder,INSTR( Remainder, ',')-1)
                ,RIGHT(Remainder,LENGTH(Remainder)-INSTR(Remainder , ','))
            FROM SplitSting
            WHERE Remainder IS NOT NULL AND INSTR( Remainder , ',')>0
    

    程式與第一段程式一樣,差別在於資料來源變成從 Recursive Common Table Expressions (CTE) 本身 SplitSting 這個資料表,也就是第一段 SQL 所產生的資料表.

  • 第三段程式
    Recursive Part 會重複執行直接 Where 判斷為非,資料為 CTE 本身資料表.
    這一段要讓 SQL 判斷前面產生 SplitSting 資料表的 Remainder 欄位是否為空並且無 ‘,’ 存在時就結束 Recursive .

        SELECT
            RowID,Remainder,null
            FROM SplitSting
            WHERE Remainder IS NOT NULL AND INSTR(Remainder ,',')=0
    

執行結果.

MariaDB [testdb]> WITH recursive SplitSting AS
    -> (
    ->     SELECT
    ->         RowID,LEFT(Layout, INSTR(layout , ',') -1) AS Part
    ->             ,RIGHT(Layout,LENGTH(Layout)-INSTR(layout , ',')) AS Remainder
    ->         FROM String_Content
    ->         WHERE Layout IS NOT NULL AND INSTR(layout , ',')>0
    ->     UNION ALL
    ->     SELECT
    ->         RowID,LEFT(Remainder,INSTR( Remainder, ',')-1)
    ->             ,RIGHT(Remainder,LENGTH(Remainder)-INSTR(Remainder , ','))
    ->         FROM SplitSting
    ->         WHERE Remainder IS NOT NULL AND INSTR( Remainder , ',')>0
    ->     UNION ALL
    ->     SELECT
    ->         RowID,Remainder,null
    ->         FROM SplitSting
    ->         WHERE Remainder IS NOT NULL AND INSTR(Remainder ,',')=0
    -> )
    -> SELECT * FROM SplitSting ORDER BY RowID;
+-------+---------------+------------------------------------------+
| RowID | Part          | Remainder                                |
+-------+---------------+------------------------------------------+
|     1 | to            | tsql                                     |
|     1 | hello         | world,welcome,to,tsql                    |
|     1 | tsql          | NULL                                     |
|     1 | world         | welcome,to,tsql                          |
|     1 | welcome       | to,tsql                                  |
|     2 | welcome       | to,stackoverflow                         |
|     2 | to            | stackoverflow                            |
|     2 | stackoverflow | NULL                                     |
|     3 | test3         | test4                                    |
|     3 | to            | stackoverflow,tese1,test2,test3,test4    |
|     3 | stackoverflow | tese1,test2,test3,test4                  |
|     3 | test2         | test3,test4                              |
|     3 | test4         | NULL                                     |
|     3 | welcome       | to,stackoverflow,tese1,test2,test3,test4 |
|     3 | tese1         | test2,test3,test4                        |
+-------+---------------+------------------------------------------+
15 rows in set (0.001 sec)
沒有解決問題,試試搜尋本站其他內容

發佈留言

發佈留言必須填寫的電子郵件地址不會公開。 必填欄位標示為 *

這個網站採用 Akismet 服務減少垃圾留言。進一步了解 Akismet 如何處理網站訪客的留言資料