
Installing CoreOS – Configuring etcd2

Let's first take a look at the cloud-config.yaml configuration file.

#cloud-config
hostname: coreos1
ssh_authorized_keys:
  - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC5wZYPD/mBs+9O9CrUxdg9kpOus24VrMuNncdt4BRc4iF5npV90HYe5j/y3IG6+2MRbAb2edyf/FUcaJHN/V+i123456yuqyAT2rv9T0eB2+wpmYCUQzqZscJP2uLK8jMhezKWS0l7X5CgJf+d17VooS6CADR9MyTbku3upKp5yEnsCfB+pBLGdrqCUTnGHPfJcLTBIvuMriz/kae0azxcderfbw7YWR8oKdWjKYKlznnBmH6VYFcgv/jSXbRbdZjKNSXIm2xIj6TIIJmo6sWhptcGohi467ODyrzCDioXD1MsYx6ImTMcY5mzL2RDePAW7CM4gWIMaIxDeL5e10SX ben@appledeAir

coreos:
  units:
    - name: etcd2.service
      command: start
    - name: systemd-networkd.service
      command: stop
    - name: 00-eth0.network
      runtime: true
      content: |
        [Match]
        Name=ens32

        [Network]
        Address=172.16.15.21/24
        Gateway=172.16.15.2
        DNS=168.95.1.1
    - name: systemd-networkd.service
      command: start
  etcd2:
    name: "node01"
    discovery: https://discovery.etcd.io/9dd875ca6dd759d67445a681adde3875
    advertise-client-urls: http://172.16.15.21:2379
    initial-advertise-peer-urls: http://172.16.15.21:2380
    listen-client-urls: http://0.0.0.0:2379
    listen-peer-urls: http://172.16.15.21:2380

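hostname, ssh_authorized_keys, units, and the network settings were all covered in an earlier post, http://benjr.tw/96511, so this time the focus is on etcd2.

So what does etcd do? etcd is a distributed key/value store. A cluster needs at least three nodes to be fault tolerant, and the data is replicated to each node so that it stays reliable. Unlike a traditional relational database (which is basically a collection of tables), etcd2 stores nothing but key/value pairs, and data is looked up by its key, much like a hash table (the NoSQL approach).

For fleet, see http://benjr.tw/96502

This CoreOS machine uses the static IP 172.16.15.21/24 and DNS 168.95.1.1 (Chunghwa Telecom's DNS).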
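
The "at least three nodes" rule comes from majority voting: a cluster of N members needs N/2+1 of them (integer division) to agree before it can make progress. Here is a small sketch of that arithmetic in shell. The function names `majority` and `failure_tolerance` are my own and are not etcd commands.

```shell
# Quorum arithmetic for an N-member cluster (illustrative helpers, not etcd commands).
majority() {
  # Smallest number of members that is more than half of $1.
  echo $(( $1 / 2 + 1 ))
}

failure_tolerance() {
  # Members that can fail while a majority still remains.
  echo $(( $1 - ($1 / 2 + 1) ))
}

for size in 1 2 3 4 5; do
  printf 'size=%s majority=%s failure_tolerance=%s\n' \
    "$size" "$(majority "$size")" "$(failure_tolerance "$size")"
done
```

A single node tolerates no failure and three nodes tolerate one, which is why three is the usual minimum. Going from three to four members does not raise the tolerance.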

etcd2

The etcd2 settings in the config file:

  • name: "node01"
    The name here is the etcd node name, which is separate from the hostname above. It is optional; if omitted, the system assigns a generated ID string as the name.
  • discovery: https://discovery.etcd.io/9dd875ca6dd759d67445a681adde3875
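    The discovery service is how the nodes find each other and build the cluster's peer list (peer IPs are used for direct server-to-server communication). The token is generated at https://discovery.etcd.io/ by opening https://discovery.etcd.io/new?size=1 in a browser. A cluster needs at least 3 nodes to be fault tolerant, but I am setting up only one etcd2 node for now and will add the others later with the etcdctl member add command.

    For cluster fault tolerance (SIZE, MAJORITY, FAILURE TOLERANCE), see https://coreos.com/etcd/docs/latest/v2/admin_guide.html#optimal-cluster-size

    Besides using the website, the token can also be obtained from the command line.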

    core@localhost ~ $ curl -w "\n" 'https://discovery.etcd.io/new?size=1'
    https://discovery.etcd.io/9dd875ca6dd759d67445a681adde3875
    
  • advertise-client-urls: http://172.16.15.21:2379
    Many sample configs also set port 4001. I looked it up: that port exists for compatibility with older etcd versions, so I do not set it here.

    advertise-client-urls – List of this member’s client URLs to advertise to the rest of the cluster. These URLs can contain domain names.

    Lists this member's client URLs to announce to the other cluster members. Besides an IP address, a domain name can be used (it must be resolvable).

  • initial-advertise-peer-urls: http://172.16.15.21:2380
    initial-advertise-peer-urls – List of this member’s peer URLs to advertise to the rest of the cluster. These addresses are used for communicating etcd data around the cluster. At least one must be routable to all cluster members. These URLs can contain domain names.

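    Lists this member's peer URLs (used for direct server-to-server communication) to announce to the other cluster members. These addresses carry the cluster's etcd data. At least one of them must be routable to all cluster members. Besides an IP address, a domain name can be used (it must be resolvable).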

  • listen-client-urls: http://0.0.0.0:2379
    listen-client-urls – List of URLs to listen on for client traffic.

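    As with advertise-client-urls, many sample configs also listen on port 4001 for compatibility with older etcd versions; I do not set it here.

    Used for etcd data traffic with clients. http://0.0.0.0 means listening on all interfaces, so any client can connect.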

  • listen-peer-urls: http://172.16.15.21:2380
    listen-peer-urls – List of URLs to listen on for peer traffic.

    Used for data exchange between peer nodes.
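
Since only the node name and IP change from machine to machine, the etcd2 section can be generated instead of typed by hand. This is a rough sketch that assumes every node follows the same port layout as the config above. `render_etcd2` is a hypothetical helper of mine, not a CoreOS tool.

```shell
# Print the etcd2 section of a cloud-config for one node.
# render_etcd2 is a hypothetical helper; arguments: node name, node IP, discovery URL.
render_etcd2() {
  name=$1
  ip=$2
  discovery=$3
  cat <<EOF
  etcd2:
    name: "$name"
    discovery: $discovery
    advertise-client-urls: http://$ip:2379
    initial-advertise-peer-urls: http://$ip:2380
    listen-client-urls: http://0.0.0.0:2379
    listen-peer-urls: http://$ip:2380
EOF
}

# Reproduces the etcd2 section used in this post.
render_etcd2 node01 172.16.15.21 https://discovery.etcd.io/9dd875ca6dd759d67445a681adde3875
```

The output is indented by two spaces so it can be pasted under the coreos: key.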

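For the other parameters, see the official documentation: https://coreos.com/etcd/docs/latest/op-guide/configuration.html

After booting from the CoreOS CD you land in text mode, and the installation is done directly with the coreos-install command. This time I pass -c to specify the cloud-init config.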

core@localhost ~ $ sudo coreos-install -d /dev/sda -C stable -c ~/cloud-config.yaml
2016/12/21 09:41:12 Checking availability of "local-file"
2016/12/21 09:41:12 Fetching user-data from datasource of type "local-file"
Downloading the signature for https://stable.release.core-os.net/amd64-usr/1185.3.0/coreos_production_image.bin.bz2...
2016-12-21 09:41:14 URL:https://stable.release.core-os.net/amd64-usr/1185.3.0/coreos_production_image.bin.bz2.sig [543/543] -> "/tmp/coreos-install.fmCj9mKD5k/coreos_production_image.bin.bz2.sig" [1]
Downloading, writing and verifying coreos_production_image.bin.bz2...
...
Success! CoreOS stable 1185.3.0 is installed on /dev/sda
core@localhost ~ $ sudo reboot

Options used:
-d ( DEVICE ) – Install CoreOS to the given device.
-C ( CHANNEL ) – Release channel to use (e.g. stable, beta)
-c ( CLOUD ) – Insert a cloud-init config to be executed on boot.
After the reboot you can connect over SSH.

Once the installation is done, check that the etcd2 service is running properly.
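
Before running the installer it is worth checking the config file, because as far as I know the file is only treated as a cloud-config when its first line is exactly #cloud-config. This is a minimal sketch of that one check. `check_cloud_config` is a hypothetical helper and does not validate the YAML itself.

```shell
# check_cloud_config (hypothetical helper): succeed only if the first line
# of the given file is exactly "#cloud-config". It does not parse the YAML.
check_cloud_config() {
  [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
  first_line=$(head -n 1 "$1")
  if [ "$first_line" = "#cloud-config" ]; then
    echo "ok: $1 starts with #cloud-config"
  else
    echo "error: first line of $1 is not #cloud-config" >&2
    return 1
  fi
}
```

It could be chained in front of the install command, for example: check_cloud_config ~/cloud-config.yaml && sudo coreos-install -d /dev/sda -C stable -c ~/cloud-config.yaml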

core@coreos1 ~ $ sudo systemctl status etcd2
● etcd2.service - etcd2
   Loaded: loaded (/usr/lib/systemd/system/etcd2.service; disabled; vendor preset: disabled)
  Drop-In: /run/systemd/system/etcd2.service.d
           └─20-cloudinit.conf
   Active: active (running) since Wed 2017-01-11 05:55:26 UTC; 27min ago
 Main PID: 957 (etcd2)
    Tasks: 7
   Memory: 20.9M
      CPU: 7.889s
   CGroup: /system.slice/etcd2.service
           └─957 /usr/bin/etcd2

Jan 11 05:55:26 coreos1 systemd[1]: Started etcd2.
Jan 11 05:55:26 coreos1 etcd2[957]: added local member e380570f06dea90a [http://172.16.15.21:2380] to 
Jan 11 05:55:26 coreos1 etcd2[957]: e380570f06dea90a is starting a new election at term 1
Jan 11 05:55:26 coreos1 etcd2[957]: e380570f06dea90a became candidate at term 2
Jan 11 05:55:26 coreos1 etcd2[957]: e380570f06dea90a received vote from e380570f06dea90a at term 2
Jan 11 05:55:26 coreos1 etcd2[957]: e380570f06dea90a became leader at term 2
Jan 11 05:55:26 coreos1 etcd2[957]: raft.node: e380570f06dea90a elected leader e380570f06dea90a at ter
Jan 11 05:55:26 coreos1 etcd2[957]: setting up the initial cluster version to 2.3
Jan 11 05:55:26 coreos1 etcd2[957]: set the initial cluster version to 2.3
Jan 11 05:55:26 coreos1 etcd2[957]: published {Name:92d7c022309e4cf2a4d6acd621471130 ClientURLs:[http:

When debugging, journalctl gives detailed messages about etcd2.

core@coreos1 ~ $ journalctl -u etcd2

The following commands confirm whether the cluster is healthy.

core@coreos1 ~ $ etcdctl cluster-health
member e380570f06dea90a is healthy: got healthy result from http://172.16.15.21:2379
cluster is healthy
core@coreos1 ~ $ etcdctl member list   
e380570f06dea90a: name=node01 peerURLs=http://172.16.15.21:2380 clientURLs=http://172.16.15.21:2379 isLeader=true
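
For scripts it can be handy to turn that output into a yes/no answer. This sketch relies only on the text format shown above, namely a final "cluster is healthy" line and one "member ... is healthy" line per member. Both helper names are my own.

```shell
# cluster_is_healthy (hypothetical helper): read `etcdctl cluster-health`
# output on stdin; succeed only if one line is exactly "cluster is healthy".
cluster_is_healthy() {
  grep -qx 'cluster is healthy'
}

# count_healthy_members (hypothetical helper): print how many lines on stdin
# report a healthy member.
count_healthy_members() {
  grep -c '^member .* is healthy'
}
```

Typical use would be: etcdctl cluster-health | cluster_is_healthy && echo "etcd OK"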

The member files are also created under the etcd2 data directory.

core@coreos1 ~ $ sudo ls -l /var/lib/etcd2/member/
total 16
drwx------. 2 etcd etcd 4096 Jan 11 05:55 snap
drwx------. 2 etcd etcd 4096 Jan 11 05:55 wal

To check the current etcd2 settings:

core@coreos1 ~ $ cat /run/systemd/system/etcd2.service.d/20-cloudinit.conf 
[Service]
Environment="ETCD_ADVERTISE_CLIENT_URLS=http://172.16.15.21:2379"
Environment="ETCD_DISCOVERY=https://discovery.etcd.io/9dd875ca6dd759d67445a681adde3875"
Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=http://172.16.15.21:2380"
Environment="ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379"
Environment="ETCD_LISTEN_PEER_URLS=http://172.16.15.21:2380"
Environment="ETCD_NAME=node01"
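
Comparing this drop-in with the cloud-config shows the naming pattern. Each etcd2 key is upper-cased, its dashes become underscores, and it gets an ETCD_ prefix. Here is a small sketch of that mapping as observed from the file above; `to_etcd_env` is a hypothetical helper.

```shell
# to_etcd_env (hypothetical helper): turn an etcd2 cloud-config key into the
# environment variable name seen in 20-cloudinit.conf.
to_etcd_env() {
  # Upper-case the letters and map '-' to '_', then add the ETCD_ prefix.
  printf 'ETCD_%s\n' "$(printf '%s' "$1" | tr 'a-z-' 'A-Z_')"
}

for key in name discovery advertise-client-urls initial-advertise-peer-urls \
           listen-client-urls listen-peer-urls; do
  printf '%s -> %s\n' "$key" "$(to_etcd_env "$key")"
done
```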

etcdctl

Once the service and the cluster status are confirmed healthy, we can use the etcdctl command to test whether etcd's key/value storage works.

set stores key = test, value = CoreOS testing

core@coreos1 ~ $ etcdctl set /test "CoreOS testing"
CoreOS testing

get reads the key test

core@coreos1 ~ $ etcdctl get /test                 
CoreOS testing
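
The set/get pair above can be wrapped into a quick write-then-read check. This is a sketch using only the set and get subcommands shown here. `roundtrip_check` and the ETCDCTL variable are my own additions, the variable being there so the command can be swapped out.

```shell
# roundtrip_check (hypothetical helper): write a value, read it back, compare.
# ETCDCTL may name another command to run; it defaults to etcdctl.
roundtrip_check() {
  key=$1
  value=$2
  # Store the value; etcdctl set echoes it back, which is discarded here.
  "${ETCDCTL:-etcdctl}" set "$key" "$value" >/dev/null || return 1
  # Read it back and succeed only if it matches what was written.
  got=$("${ETCDCTL:-etcdctl}" get "$key") || return 1
  [ "$got" = "$value" ]
}
```

For example: roundtrip_check /test "CoreOS testing" && echo "etcd read/write OK"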

Other related commands:

  • backup – backup an etcd directory
  • mk – make a new key with a given value
  • mkdir – make a new directory
  • rm – remove a key
  • rmdir – removes the key if it is an empty directory or a key-value pair
  • get – retrieve the value of a key
  • ls – retrieve a directory
  • set – set the value of a key
  • setdir – create a new or existing directory
  • update – update an existing key with a given value
  • updatedir – update an existing directory
  • watch – watch a key for changes
  • exec-watch – watch a key for changes and exec an executable
  • member – member add, remove and list subcommands

Troubleshooting

  1. Temporary failure in name resolution
    If you see the error below, check whether your network can reach http://discovery.etcd.io. For network configuration and usage, see http://benjr.tw/96370

    core@coreos1 ~ $ journalctl -u etcd2
    ...
    Jan 11 08:00:32 coreos1 etcd2[890]: error #0: dial tcp: lookup discovery.etcd.io: Temporary failure in name resolution
    Jan 11 08:00:32 coreos1 etcd2[890]: cluster status check: error connecting to https://discovery.etcd.io, retrying in 8s
    
  2. has previously registered with discovery service
    You need to request a new token.

    core@coreos1 ~ $ journalctl -u etcd2
    ...
    Jan 10 06:03:48 coreos1 etcd2[867]: member "09c2ea6e6df44dfda86ce9f8e2e64eb1" has previously registered with discovery service (https:// discovery.etcd.io/<disco key>), 
    Jan 10 06:03:48 coreos1 etcd2[867]: But etcd could not find valid cluster configuration in the given data dir (/var/lib/etcd2)
    Jan 10 06:03:48 coreos1 etcd2[867]: Please check the given data dir path if the previous bootstrap succeeded
    Jan 10 06:03:48 coreos1 systemd[1]: etcd2.service: Main process exited, code=exited, status=1/FAILURE
    Jan 10 06:03:48 coreos1 systemd[1]: Failed to start etcd2.
    
  3. server error Gateway Timeout
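    At first I set the cluster up for 3 nodes and generated the token with https://discovery.etcd.io/new?size=3, but when the first node came up the etcd2 service was not working properly. The error messages were as follows.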

    core@coreos1 ~ $ etcdctl cluster-health
    cluster may be unhealthy: failed to list members
    Error:  client: etcd cluster is unavailable or misconfigured
    error #0: client: endpoint http://127.0.0.1:2379 exceeded header timeout
    error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused
    
    core@coreos1 ~ $ journalctl -u etcd2
    ...
    18 coreos1 etcd2[970]: found 1 peer(s), waiting for 2 more
    Jan 11 08:37:18 coreos1 etcd2[970]: error #0: client: etcd member https://discovery.etcd.io returns server error [Gateway Timeout]
    Jan 11 08:37:18 coreos1 etcd2[970]: waiting for other nodes: error connecting to https://discovery.etcd.io, retrying in 2s
    Jan 11 08:37:20 coreos1 etcd2[970]: found self e5f1821e81b8d32d in the cluster
    Jan 11 08:37:20 coreos1 etcd2[970]: found 1 peer(s), waiting for 2 more
    

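    When a token is created for a given number of nodes, all of those nodes have to start the service before the cluster works. If you only want to try out etcd2 first, I suggest simply using https://discovery.etcd.io/new?size=1.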
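
The three cases above can be told apart by the text in the journal. This is a rough sketch that matches only the message fragments quoted in this post. `classify_etcd2_error` and its hint strings are my own, and other etcd versions may word these messages differently.

```shell
# classify_etcd2_error (hypothetical helper): read journalctl output on stdin
# and print a hint for whichever known message it contains. The patterns are
# checked in the order listed, not in the order they appear in the log.
classify_etcd2_error() {
  log=$(cat)
  case $log in
    *"Temporary failure in name resolution"*)
      echo "dns: check the network and DNS settings" ;;
    *"has previously registered with discovery service"*)
      echo "token: request a new discovery token" ;;
    *"waiting for "*" more"*)
      echo "size: the cluster is still waiting for the remaining members" ;;
    *)
      echo "unknown" ;;
  esac
}
```

It would be fed from the journal, for example: journalctl -u etcd2 | classify_etcd2_error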

