SR-IOV

What is SR-IOV | Why use SR-IOV | Pros and cons of SR-IOV | Enabling VFs on Proxmox

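What is SR-IOV

SR-IOV lets a physical PCIe device present virtual PCIe devices directly. It bypasses the virtual switch:

the NIC switch inside the PF distributes packets to the VFs/VMs beneath it.

SR-IOV is made up of PFs (Physical Functions) and VFs (Virtual Functions).

PF: a full PCIe function that includes the SR-IOV extended capability. It is the physical PCIe function, it is used to configure and manage SR-IOV, and VFs are created from it.

VF: a lightweight PCIe function created by the PF.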

 

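Why use SR-IOV?

Why enable SR-IOV instead of the usual bridge/openvswitch/nat modes?

1. In most cases, bridge/openvswitch/nat modes cost some performance, because every packet passes through a software switch on the host.

2. SR-IOV gets the most out of the NIC.

As the figure below shows, the pfSense on the left uses 3 physical NICs, while the one on the right uses a full 5.

The NICs on the left also still have other VFs available for other VMs; they are not dedicated to the single pfSense guest VM.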

 

 

 

 

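Pros and cons of SR-IOV

Pros:

Makes full use of the PCIe device: VFs do not pay the performance cost that bridge/openvswitch/nat modes do.

Lower latency: packets are switched by the NIC instead of by a software switch running on the host CPU, so that path adds no latency.

Simpler cabling.

On motherboards with few PCIe slots, the available slots are used more effectively.

Cons:

Hardware dependency: SR-IOV requires support from the CPU, the motherboard, and the device driver.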

 

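Enabling VFs on Proxmox (Intel NICs)

Prerequisites:

The CPU, motherboard, and device must all support SR-IOV.

Before starting, enable SR-IOV in the BIOS.

Check how many VFs the NIC can provide

The following uses an i350 to enable SR-IOV; eno1 is the NIC's logical name in Proxmox.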

cat /sys/class/net/eno1/device/sriov_totalvfs
8
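
To survey every NIC at once instead of one interface at a time, a small helper can walk sysfs. This is only a sketch: the function name list_sriov_nics and its optional directory argument are illustrative assumptions, not something Proxmox ships.

```shell
# Print "<interface> <max VFs>" for every NIC that exposes sriov_totalvfs.
# $1 = directory to scan (assumed override for testing); defaults to /sys/class/net
list_sriov_nics() {
    local net_dir="${1:-/sys/class/net}"
    local f iface
    for f in "$net_dir"/*/device/sriov_totalvfs; do
        [ -r "$f" ] || continue      # glob matched nothing, or file unreadable
        iface="${f#"$net_dir"/}"     # strip the leading directory
        iface="${iface%%/*}"         # keep only the interface name
        printf '%s %s\n' "$iface" "$(cat "$f")"
    done
}
```

Calling list_sriov_nics with no argument scans the real /sys/class/net. Interfaces without SR-IOV support have no sriov_totalvfs file and are skipped.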

Enable 2 VFs on eno1

echo 2 > /sys/class/net/eno1/device/sriov_numvfs
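
The write can fail if the requested count exceeds sriov_totalvfs, or if VFs are already enabled (the kernel refuses to change one non-zero count directly to another). The sketch below wraps both checks; set_numvfs and its third argument are hypothetical names used for illustration.

```shell
# Set the number of VFs on a PF, disabling existing VFs first if needed.
# $1 = interface, $2 = VF count, $3 = sysfs net dir (assumed override for dry runs)
set_numvfs() {
    local iface="$1" want="$2" net_dir="${3:-/sys/class/net}"
    local dev="$net_dir/$iface/device"
    local total

    if [ ! -r "$dev/sriov_totalvfs" ]; then
        echo "error: $iface does not expose sriov_totalvfs" >&2
        return 1
    fi
    total=$(cat "$dev/sriov_totalvfs")
    if [ "$want" -gt "$total" ]; then
        echo "error: $iface supports at most $total VFs" >&2
        return 1
    fi
    # A change from one non-zero value to another is rejected,
    # so reset to 0 before writing the new count.
    if [ "$(cat "$dev/sriov_numvfs")" -ne 0 ]; then
        echo 0 > "$dev/sriov_numvfs"
    fi
    echo "$want" > "$dev/sriov_numvfs"
}
```

For example, set_numvfs eno1 2 has the same effect as the echo above, but reports a readable error instead of a bare write failure. It must run as root on a real system.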

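If writing to sriov_numvfs does not work,

try the following instead. Reloading the igb driver takes its interfaces down, so if you are connected over SSH, switch to IPMI or the local console first to avoid being disconnected.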

modprobe -r igb 
modprobe igb max_vfs=2,4

max_vfs=2,4 means 2 VFs on 07:00.0 and 4 VFs on 07:00.1

root@pve:/etc# lspci |grep Eth
03:00.0 Ethernet controller: Intel Corporation Ethernet Connection X552/X557-AT 10GBASE-T
03:00.1 Ethernet controller: Intel Corporation Ethernet Connection X552/X557-AT 10GBASE-T
05:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
05:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
07:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
07:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)

Check that the VFs were created

root@pve:~# lspci |grep Eth
03:00.0 Ethernet controller: Intel Corporation Ethernet Connection X552/X557-AT 10GBASE-T
03:00.1 Ethernet controller: Intel Corporation Ethernet Connection X552/X557-AT 10GBASE-T
05:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
05:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
07:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
07:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
08:10.0 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
08:10.1 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
08:10.4 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
08:10.5 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
08:11.1 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
08:11.5 Ethernet controller: Intel Corporation I350 Ethernet Controller Virtual Function (rev 01)
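
The same check can be made per PF through sysfs, where each VF appears as a virtfnN symlink under the PF's device directory. A minimal sketch, with list_vfs and its optional second argument as illustrative names:

```shell
# Print "<VF index> <PCI address>" for each VF of a PF.
# $1 = PF interface, $2 = sysfs net dir (assumed override for testing)
list_vfs() {
    local iface="$1" net_dir="${2:-/sys/class/net}"
    local link
    for link in "$net_dir/$iface"/device/virtfn*; do
        [ -L "$link" ] || continue   # no VFs: the glob stays unexpanded
        # The symlink name carries the VF index, its target the PCI address.
        printf '%s %s\n' "${link##*/virtfn}" "$(basename "$(readlink "$link")")"
    done
}
```

Running list_vfs eno1 prints one line per VF, which is useful later when a specific VF has to be matched to its PCI address for passthrough.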

Enable SR-IOV automatically at boot

echo "options igb max_vfs=2,4" > /etc/modprobe.d/igb.conf
depmod -a
update-initramfs -u

Set a MAC address for a VF
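
A VF's MAC address is set from the host with ip link set <PF> vf <N> mac <address>. The sketch below only prints one such command per VF so they can be reviewed before being run. The function name vf_mac_commands and the 02:00:00:00:01 prefix (a locally administered range) are assumptions chosen for illustration; substitute addresses that suit your network.

```shell
# Print (do not run) one "ip link set ... vf N mac ..." command per VF.
# $1 = PF interface, $2 = number of VFs, $3 = first five octets of the MAC
vf_mac_commands() {
    local pf="$1" count="$2" prefix="$3"
    local i=0
    while [ "$i" -lt "$count" ]; do
        # The VF index becomes the last octet, in two-digit hex.
        printf 'ip link set %s vf %d mac %s:%02x\n' "$pf" "$i" "$prefix" "$i"
        i=$((i + 1))
    done
}
```

For example, vf_mac_commands eno1 2 02:00:00:00:01 prints the commands for VF 0 and VF 1; piping the output to sh as root applies them.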

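Note:

pfSense is based on FreeBSD, so for hardware compatibility Intel is the only NIC vendor worth considering.
For other VMs and LXC containers, whether Linux or Windows, any SR-IOV-capable card will do.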

Enabling VFs on Proxmox (Mellanox NICs)

Start mst

# mst start

Starting MST (Mellanox Software Tools) driver set
Loading MST PCI module - Success
Loading MST PCI configuration module - Success
Create devices

Find the PCI slot

# mst status

MST modules:
------------
    MST PCI module loaded
    MST PCI configuration module loaded
 

MST devices:
------------
/dev/mst/mt4103_pciconf0         - PCI configuration cycles access.
                                   domain:bus:dev.fn=0000:81:00.0 addr.reg=88 data.reg=92
                                   Chip revision is: 00
/dev/mst/mt4103_pci_cr0          - PCI direct access.
                                   domain:bus:dev.fn=0000:81:00.0 bar=0xc8000000 size=0x100000
                                   Chip revision is: 00
/dev/mst/mt4115_pciconf0         - PCI configuration cycles access.
                                   domain:bus:dev.fn=0000:05:00.0 addr.reg=88 data.reg=92
                                   Chip revision is: 00

Check the device status

#  mlxconfig -d /dev/mst/mt4115_pciconf0 q

Device #1:
----------
Device type:    ConnectX4      
PCI device:     /dev/mst/mt4115_pciconf0

Configurations:                              Current
         SRIOV_EN                            0              
         NUM_OF_VFS                          0              
         LINK_TYPE_P1                        2              
         LINK_TYPE_P2                        2              
         INT_LOG_MAX_PAYLOAD_SIZE            0              
         LOG_DCR_HASH_TABLE_SIZE             14             
         DCR_LIFO_SIZE                       16384          
         ROCE_NEXT_PROTOCOL                  254            
         ROCE_CC_ALGORITHM_P1                0              
         ROCE_CC_PRIO_MASK_P1                0              
         ROCE_CC_ALGORITHM_P2                0              
         ROCE_CC_PRIO_MASK_P2                0              
         CLAMP_TGT_RATE_P1                   0              
         CLAMP_TGT_RATE_AFTER_TIME_INC_P1    1              
         RPG_TIME_RESET_P1                   5000           
         RPG_BYTE_RESET_P1                   150            
         RPG_THRESHOLD_P1                    5              
         RPG_MAX_RATE_P1                     0              
         RPG_AI_RATE_P1                      10             
         RPG_HAI_RATE_P1                     50             
         RPG_GD_P1                           7              
         RPG_MIN_DEC_FAC_P1                  50             
         RPG_MIN_RATE_P1                     1              
         RATE_TO_SET_ON_FIRST_CNP_P1         0              
         DCE_TCP_G_P1                        64             
         DCE_TCP_RTT_P1                      2              
         RATE_REDUCE_MONITOR_PERIOD_P1       2              
         INITIAL_ALPHA_VALUE_P1              3              
         MIN_TIME_BETWEEN_CNPS_P1            0              
         CNP_DSCP_P1                         0              
         CNP_802P_PRIO_P1                    7              
         CLAMP_TGT_RATE_P2                   0              
         CLAMP_TGT_RATE_AFTER_TIME_INC_P2    1              
         RPG_TIME_RESET_P2                   5000           
         RPG_BYTE_RESET_P2                   150            
         RPG_THRESHOLD_P2                    5              
         RPG_MAX_RATE_P2                     0              
         RPG_AI_RATE_P2                      10             
         RPG_HAI_RATE_P2                     50             
         RPG_GD_P2                           7              
         RPG_MIN_DEC_FAC_P2                  50             
         RPG_MIN_RATE_P2                     1              
         RATE_TO_SET_ON_FIRST_CNP_P2         0              
         DCE_TCP_G_P2                        64             
         DCE_TCP_RTT_P2                      2              
         RATE_REDUCE_MONITOR_PERIOD_P2       2              
         INITIAL_ALPHA_VALUE_P2              3              
         MIN_TIME_BETWEEN_CNPS_P2            0              
         CNP_DSCP_P2                         0              
         CNP_802P_PRIO_P2                    7      
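
Only SRIOV_EN and NUM_OF_VFS matter for this procedure, so it can help to pull just those out of the long listing. A minimal sketch, assuming the two-column "name value" layout shown above (other tool versions may format values differently); mlx_setting is a hypothetical helper name.

```shell
# Read "mlxconfig ... q" output on stdin and print the value of one setting.
# $1 = setting name, e.g. SRIOV_EN or NUM_OF_VFS
mlx_setting() {
    # Match the setting name in the first column and print the second.
    awk -v key="$1" '$1 == key { print $2; exit }'
}
```

Usage: mlxconfig -d /dev/mst/mt4115_pciconf0 q | mlx_setting SRIOV_EN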

Enable SR-IOV in the firmware

# mlxconfig -d /dev/mst/mt4115_pciconf0 set SRIOV_EN=1 NUM_OF_VFS=4

Device #1:
----------
 
Device type:    ConnectX4      
PCI device:     /dev/mst/mt4115_pciconf0
 

Configurations:                              Current         New
         SRIOV_EN                            0               1              
         NUM_OF_VFS                          0               4              
         LINK_TYPE_P1                        2               2              
         LINK_TYPE_P2                        2               2              
         INT_LOG_MAX_PAYLOAD_SIZE            0               0              
         LOG_DCR_HASH_TABLE_SIZE             14              14             
         DCR_LIFO_SIZE                       16384           16384       

...

Apply new Configuration? ? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.

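Reboot the server.

Enable SR-IOV in the driver

Write the number of VFs to create, up to the NUM_OF_VFS value set in the firmware. To change a count that is already non-zero, write 0 first; the kernel rejects a direct change from one non-zero value to another.

echo 1 > /sys/class/net/eno3/device/sriov_numvfs

echo 0 > /sys/class/net/eno3/device/sriov_numvfs
echo 4 > /sys/class/net/eno3/device/sriov_numvfs

Assign a MAC address to VF 0:

ip link set eno3 vf 0 mac ec:0d:9a:f3:37:42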

Confirm that SR-IOV is enabled

# lspci -D | grep Mellanox

0000:05:00.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
0000:05:00.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
0000:05:00.6 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0000:05:00.7 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0000:05:01.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
0000:05:01.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4 Virtual Function]
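
To compare the result with the NUM_OF_VFS value set in the firmware, the VF lines in this output can be counted. A small sketch; count_vf_lines is an illustrative name, and it assumes the device description contains the text "Virtual Function" as in the listing above.

```shell
# Count the lines on stdin that describe a Virtual Function.
count_vf_lines() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c 'Virtual Function' || true
}
```

Usage: lspci -D | grep Mellanox | count_vf_lines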