Linux – Convert CSV to JSON

Loading

測試環境為 CentOS 8 x86_64

在 Linux 下有辦法把 CSV (每筆資料用 “,” 做區分) 轉成 JSON 格式嗎? 轉換後的 JSON 儲存在資料庫要怎麼存取資料?

Convert CSV to JSON

找到下面兩種方式.

  • jq : jq is a lightweight and flexible command-line JSON processor.
    套件安裝後就可以使用了,下載網站 https://stedolan.github.io/jq/download/
  • csv2json.sh
    這邊 (網站 https://gist.github.com/dsliberty/3de707bc656cf757a0cb) 提供多種把 CSV 轉成 JSON 陣列 (Array) 格式的 Linux Bash,我選擇的是下面版本的程式.

    [root@localhost ~]# vi csv2json.sh
    #!/bin/bash
    
    # CSV to JSON converter using BASH
    # original script from https://gist.github.com/dsliberty/3de707bc656cf757a0cb
    # Usage ./csv2json.sh input.csv > output.json
    
    #set -x
    shopt -s extglob
    
    input="${1:-}"
    SEP=","
    
    [ -z "${input}" ] && echo "No CSV input file specified" && exit 1
    [ ! -e "${input}" ] && echo "Unable to locate ${input}" && exit 1
    
    read first_line < "${input}"
    a=0
    headings=`echo ${first_line} | awk -F"${SEP}" {'print NF'}`
    lines=`cat "${input}" | wc -l`
    while [ ${a} -lt ${headings} ]
    do
        head_array[${a}]=$(echo ${first_line} | awk -v x=$(( ${a} + 1 )) -F"${SEP}" '{print $x}' | sed 's/\r//g')
        a=$(( ${a} + 1 ))
    done
    
    c=0
    echo "["
    while [ ${c} -lt ${lines} ]
    do
        read each_line
        each_line="$(echo "${each_line}" | sed 's/\r//g')"
    
        if [[ ${c} -eq 0 ]]; then
            c=$(( ${c} + 1 ))
        else
            d=0
            echo "    {"
            while [[ ${d} -lt ${headings} ]]; do
                item="$(echo "${each_line}" | awk -v x=$(( ${d} + 1 )) -F"${SEP}" '{print $x}')"
                echo -n "        \"${head_array[${d}]}\": "
                case "${item}" in
                    "")
                        echo -n "null"
                        ;;
                    null|true|false|\"*\"|+([0123456789]))
                        echo -n ${item}
                        ;;
                    *)
                        echo -n "\"${item}\""
                        ;;
                esac
                d=$(( ${d} + 1 ))
                [[ ${d} -lt ${headings} ]] && echo "," || echo ""
            done
    
            echo -n "    }"
    
            c=$(( ${c} + 1 ))
            [[ ${c} -lt ${lines} ]] && echo "," || echo ""
        fi
    
    done < "${input}"
    echo "]"
    

    把權限改成可執行 +x (Execution)

    [root@localhost ~]# chmod a+x csv2json.sh 
    

    透過 csv2json.sh 程式把下面 CSV 轉成 JSON (陣列格式)

    [root@localhost ~]# cat Sensor.csv 
    TimeStamp,Status,Temp1,Temp2
    01/07/2021 07:56:35 PM,ok,30,30
    01/07/2021 07:56:48 PM,ok,31,31
    01/07/2021 07:57:00 PM,ok,32,32
    
    [root@localhost ~]# ./csv2json.sh Sensor.csv > Sensor.json
    

    資料轉換成 陣列 (Array) 格式.

    JSON 資料類型:

    1. 物件 (Object) : 使用 { } , 無序成對 (pair) 的 {鍵 Key name :值 value} 方式來儲存,鍵值之間使用逗號來分隔.
    2. 陣列 (Array) : 使用 [ ] , 資料彼此間使用逗號來分割,如:[value, value] .
    3. 數字 (Number) : 直接使用整數或是浮點數.
    4. 字串 (String) : 使用 “” (括號)來表示字串.
    5. 布林函數 (Boolean) : 使用 TRUE 或 FALSE.
    6. NULL 類型.
    [root@localhost ~]# cat Sensor.json 
    [
        {
            "TimeStamp": "01/07/2021 07:56:35 PM",
            "Status": "ok",
            "Temp1": 30,
            "Temp2": 30
        },
        {
            "TimeStamp": "01/07/2021 07:56:48 PM",
            "Status": "ok",
            "Temp1": 31,
            "Temp2": 31
        },
        {
            "TimeStamp": "01/07/2021 07:57:00 PM",
            "Status": "ok",
            "Temp1": 32,
            "Temp2": 32
        }
    ]
    

MariaDB JSON Arrary

使用關聯式資料庫( RDBMS : Relational Database Management System )的 MariaDB 也可以儲存與查詢 NoSQL ( Not Only SQL : 使用非關聯式資料庫的資料庫,資料儲存不需要固定的欄位 ) 的 JSON ( JavaScript Object Notation )資料格式.

[root@localhost ~]# mysql -u root -p
Enter password: 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 10
Server version: 10.3.17-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> CREATE DATABASE testdb;
Query OK, 1 row affected (0.001 sec)

MariaDB [(none)]> USE testdb;
Database changed

定義包含 json 欄位 (Sensor) 的資料表 (jsontest),並使用 JSON_VALID 函數檢查 JSON 格式是否正確.

MariaDB [testdb]> CREATE TABLE jsontest ( 
ID int(10) unsigned NOT NULL AUTO_INCREMENT, 
Sensor JSON, TS TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (ID) , CHECK (JSON_VALID(Sensor))
);
Query OK, 0 rows affected (0.008 sec)

使用剛剛透過 csv2json.sh 程式轉換出來的 Json 檔案當輸入資料.

MariaDB [testdb]> INSERT INTO jsontest (Sensor) VALUES ('[{"TimeStamp": "01/07/2021 07:56:35 PM","Status": "ok","Temp1": 30,"Temp2": 30},{"TimeStamp": "01/07/2021 07:56:48 PM","Status": "ok","Temp1": 31,"Temp2": 31},{"TimeStamp": "01/07/2021 07:57:00 PM","Status": "ok","Temp1": 32,"Temp2": 32}]');
Query OK, 1 row affected (0.002 sec)

可以透過 JSON_KEYS 函數來讀取 鍵 Key name.

MariaDB [testdb]> SELECT JSON_KEYS(Sensor,'$[0]') FROM jsontest;
+-------------------------------------------+
| JSON_KEYS(Sensor,'$[0]')                  |
+-------------------------------------------+
| ["TimeStamp", "Status", "Temp1", "Temp2"] |
+-------------------------------------------+
1 row in set (0.001 sec)

可以透過 JSON_EXTRACT 函數來讀取特定鍵 Key name 的 值 value.

MariaDB [testdb]> SELECT JSON_EXTRACT(Sensor,'$[0].TimeStamp') FROM jsontest;
+---------------------------------------+
| JSON_EXTRACT(Sensor,'$[0].TimeStamp') |
+---------------------------------------+
| "01/07/2021 07:56:35 PM"              |
+---------------------------------------+
1 row in set (0.000 sec)
MariaDB [testdb]> SELECT JSON_EXTRACT(Sensor,'$[*].TimeStamp') FROM jsontest;
+--------------------------------------------------------------------------------+
| JSON_EXTRACT(Sensor,'$[*].TimeStamp')                                          |
+--------------------------------------------------------------------------------+
| ["01/07/2021 07:56:35 PM", "01/07/2021 07:56:48 PM", "01/07/2021 07:57:00 PM"] |
+--------------------------------------------------------------------------------+
1 row in set (0.001 sec)
MariaDB [testdb]> SELECT JSON_EXTRACT(Sensor,'$[*].TimeStamp','$[*].Temp1') FROM jsontest;
+--------------------------------------------------------------------------------------------+
| JSON_EXTRACT(Sensor,'$[*].TimeStamp','$[*].Temp1')                                         |
+--------------------------------------------------------------------------------------------+
| ["01/07/2021 07:56:35 PM", 30, "01/07/2021 07:56:48 PM", 31, "01/07/2021 07:57:00 PM", 32] |
+--------------------------------------------------------------------------------------------+
1 row in set (0.001 sec)
沒有解決問題,試試搜尋本站其他內容

發佈留言

發佈留言必須填寫的電子郵件地址不會公開。 必填欄位標示為 *

這個網站採用 Akismet 服務減少垃圾留言。進一步了解 Akismet 如何處理網站訪客的留言資料