Help: PCIe passthrough of an RM520N-GL (AP version) in PVE

【Problem description】

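As the title says, I am trying to pass the module through to a PVE virtual machine via PCIe, but the VM fails to start with the errors below. I can't tell whether the problem lies with the module or with PVE, so I'd appreciate any advice.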

error writing '1' to '/sys/bus/pci/devices/0000:01:00.0/reset': Inappropriate ioctl for device
failed to reset PCI device '0000:01:00.0', but trying to continue as not all devices need a reset
kvm: -device vfio-pci,host=0000:01:00.0,id=hostpci0,bus=ich9-pcie-port-1,addr=0x0: vfio 0000:01:00.0: error getting device from group 13: Permission denied
Verify all devices in group 13 are bound to vfio-<bus> or pci-stub and not already in use
TASK ERROR: start failed: QEMU exited with code 1
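
The QEMU message above asks to verify that every device in IOMMU group 13 is bound to vfio or pci-stub. A minimal sketch of that check, assuming the usual sysfs layout under /sys/kernel/iommu_groups/<group>/devices; the function name and the sysfs_root parameter are only illustrative and are not part of any PVE tooling:

```python
from pathlib import Path


def iommu_group_bindings(group, sysfs_root="/sys"):
    """Return {pci_address: driver_name_or_None} for each device in an IOMMU group."""
    devices_dir = Path(sysfs_root) / "kernel" / "iommu_groups" / str(group) / "devices"
    bindings = {}
    for dev in sorted(devices_dir.iterdir()):
        # 'driver' is a symlink to the bound driver's directory; it is absent
        # when no driver is bound to the device.
        driver_link = dev / "driver"
        bindings[dev.name] = driver_link.resolve().name if driver_link.exists() else None
    return bindings
```

Calling iommu_group_bindings(13) on the host should list 0000:01:00.0 together with anything else that shares the group. Any entry whose driver is not vfio-pci (or pci-stub) would be worth looking at first.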

The driver is Quectel_Linux_PCIE_MHI_Driver_V1.3.8, compiled from source, and the in-kernel mhi_pci_generic is blacklisted. Before the VM starts, mhi_ entries are visible under /dev. Other details:

  • PVE 9.1
  • Kernel 6.17.4-2-pve
  • The module is connected to the motherboard through an M.2 B-key to M-key adapter board

【dmesg before starting the VM】

root@slayN150:~# dmesg | grep mhi
[    3.455762] pcie_mhi: loading out-of-tree module taints kernel.
[    3.455770] pcie_mhi: module verification failed: signature and/or required key missing - tainting kernel
[    3.516135] mhi_init Quectel_Linux_PCIE_MHI_Driver_V1.3.8
[    3.516212] mhi_pci_probe pci_dev->name = 0000:01:00.0, domain=0, bus=1, slot=0, vendor=17CB, device=0308
[    3.520870] [I][mhi0][mhi_arch_set_bus_request] Setting bus request to index 1
[    3.520877] mhi_q 0000:01:00.0: BAR 0 [mem 0x81001000-0x81001fff 64bit]: assigned
[    3.520896] mhi_q 0000:01:00.0: enabling device (0000 -> 0002)
[    3.521043] [I][mhi0][mhi_init_pci_dev] msi_required = 4, msi_allocated = 4, msi_irq = 159
[    3.521054] [I][mhi0][mhi_power_up] dev_state:RESET
[    3.521055] [I][mhi0][mhi_async_power_up] Requested to power on
[    3.521077] [I][mhi0][mhi_alloc_coherent] size = 98304, dma_handle = 1127a0000
[    3.521078] [I][mhi0][mhi_init_dev_ctxt] mhi_ctxt->ctrl_seg = 0000000031e60640
[    3.521916] [I][mhi0][mhi_async_power_up] dev_state:RESET ee:AMSS
[    3.521939] [I][mhi0][mhi_pm_st_worker] Transition to state:READY
[    3.521946] [I][mhi0][mhi_pm_st_worker] INVALID_EE -> AMSS
[    3.521948] [I][mhi0][mhi_ready_state_transition] Waiting to enter READY state
[    3.521966] [I][mhi0][mhi_async_power_up] Power on setup success
[    3.521980] [I][mhi0][mhi_pci_show_link] LnkCap:     Speed 8GT/s, Width x2
[    3.521981] [I][mhi0][mhi_pci_show_link] LnkSta:     Speed 8GT/s, Width x1
[    3.521982] [I][mhi0][mhi_pci_probe] Return successful
[    6.524034] [I][mhi0][mhi_ready_state_transition] Device in READY State
[    6.524039] [I][mhi0][mhi_tryset_pm_state] Transition to pm state from:POR to:POR
[    6.524041] [I][mhi0][mhi_init_mmio] Initializing MMIO
[    6.524050] [I][mhi0][mhi_init_mmio] CHDBOFF:0x300
[    6.524058] [I][mhi0][mhi_init_mmio] ERDBOFF:0x700
[    6.524059] [I][mhi0][mhi_init_mmio] Programming all MMIO values.
[    6.587885] [I][mhi0][mhi_dump_tre] carl_ev evt_state_change mhistate=2
[    6.587928] [I][mhi0][mhi_process_ctrl_ev_ring] MHI state change event to state:M0
[    6.587930] [I][mhi0][mhi_pm_m0_transition] Entered With State:READY PM_STATE:POR
[    6.587932] [I][mhi0][mhi_tryset_pm_state] Transition to pm state from:POR to:M0
[    6.587937] [I][mhi0][mhi_dump_tre] carl_ev evt_ee_state execenv=2
[    6.587939] [I][mhi0][mhi_process_ctrl_ev_ring] MHI EE received event:AMSS
[    6.588125] [I][mhi0][mhi_pm_st_worker] Transition to state:MISSION MODE
[    6.588131] [I][mhi0][mhi_pm_st_worker] INVALID_EE -> AMSS
[    6.588132] [I][mhi0][mhi_pm_mission_mode_transition] Processing Mission Mode Transition
[    6.588137] [I][mhi0][mhi_init_timesync] TIME OFFS:0xa20
[    6.588145] [I][mhi0][mhi_pm_mission_mode_transition] Adding new devices
[    6.588358] [I][mhi0][mhi_dtr_probe] Enter for DTR control channel
[    6.588360] [I][mhi0][__mhi_prepare_channel] Entered: preparing channel:18
[    6.592441] [I][mhi0][mhi_dump_tre] carl_ev evt_cmd_comp code=1, type=33
[    6.592478] [I][mhi0][__mhi_prepare_channel] Chan:18 successfully moved to start state
[    6.592485] [I][mhi0][__mhi_prepare_channel] Entered: preparing channel:19
[    6.596073] [I][mhi0][mhi_dump_tre] carl_ev evt_cmd_comp code=1, type=33
[    6.596079] [I][mhi0][__mhi_prepare_channel] Chan:19 successfully moved to start state
[    6.596090] [I][mhi0][mhi_dtr_probe] Exit with ret:0
[    6.596179] [I][mhi_netdev_enable_iface] Prepare the channels for transfer
[    6.596181] [I][mhi0][__mhi_prepare_channel] Entered: preparing channel:100
[    6.619857] [I][mhi0][mhi_dump_tre] carl_ev evt_cmd_comp code=1, type=33
[    6.619923] [I][mhi0][__mhi_prepare_channel] Chan:100 successfully moved to start state
[    6.619924] [I][mhi0][__mhi_prepare_channel] Entered: preparing channel:101
[    6.628306] [I][mhi0][mhi_dump_tre] carl_ev evt_cmd_comp code=1, type=33
[    6.628370] [I][mhi0][__mhi_prepare_channel] Chan:101 successfully moved to start state
[    6.629622] [I][mhi_netdev_enable_iface] Exited.
[    6.629826] rmnet_vnd_register_device(rmnet_mhi0.1)=0
[    6.629854] [I][mhi0][mhi_pm_mission_mode_transition] Exit with ret:0

【dmesg after starting the VM】

[   11.470913] [I][mhi0][mhi_pci_device_removed] Triggering shutdown process
[   11.470918] [I][mhi0][mhi_tryset_pm_state] Transition to pm state from:M0 to:LD or Error Fatal Detect
[   11.470920] [I][mhi0][mhi_pm_disable_transition] Enter with from pm_state:LD or Error Fatal Detect MHI_STATE:M0 to pm_state:SHUTDOWN Process
[   11.470922] [I][mhi0][mhi_tryset_pm_state] Transition to pm state from:LD or Error Fatal Detect to:SHUTDOWN Process
[   11.470923] [I][mhi0][mhi_pm_disable_transition] Waiting for all pending event ring processing to complete
[   11.470924] [I][mhi0][mhi_pm_disable_transition] Reset all active channels and remove mhi devices
[   11.471485] [I][mhi_netdev_remove] Remove notification received
[   11.471488] rmnet_vnd_unregister_device(rmnet_mhi0.1)
[   11.495550] [I][mhi0][mhi_pm_disable_transition] Finish resetting channels
[   11.495553] [I][mhi0][mhi_pm_disable_transition] Waiting for all pending threads to complete
[   11.495554] [I][mhi0][mhi_pm_disable_transition] Resetting EV CTXT and CMD CTXT
[   11.495562] [I][mhi0][mhi_tryset_pm_state] Transition to pm state from:SHUTDOWN Process to:DISABLE
[   11.495563] [I][mhi0][mhi_pm_disable_transition] Exit with pm_state:DISABLE mhi_state:RESET
[   11.495738] [I][mhi0][mhi_arch_set_bus_request] Setting bus request to index 0
[   11.496183] vfio-pci 0000:01:00.0: Runtime PM usage count underflow!
[   11.496516] vfio-pci 0000:01:00.0: resetting
[   12.521517] pcieport 0000:00:1c.0: broken device, retraining non-functional downstream link at 2.5GT/s
[   13.521492] pcieport 0000:00:1c.0: retraining failed
[   14.521355] pcieport 0000:00:1c.0: Data Link Layer Link Active not set in 100 msec
[   14.582847] vfio-pci 0000:01:00.0: reset done

【lspci -vvv before starting the VM】

01:00.0 Unassigned class [ff00]: Qualcomm Technologies, Inc Device 0308
        Subsystem: Qualcomm Technologies, Inc Device 5201
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin ? routed to IRQ 158
        IOMMU group: 13
        Region 0: Memory at 81001000 (64-bit, non-prefetchable) [size=4K]
        Region 2: Memory at 81000000 (64-bit, non-prefetchable) [size=4K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable+ Count=4/32 Maskable+ 64bit+
                Address: 00000000fee00818  Data: 0000
                Masking: fffffff0  Pending: 00000000
        Capabilities: [70] Express (v2) Endpoint, IntMsgNum 0
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10W TEE-IO-
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x2, ASPM L0s L1, Exit Latency L0s <1us, L1 <64us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, LnkDisable- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s, Width x1 (downgraded)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
                         10BitTagComp+ 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- TPHComp+ ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
                         AtomicOpsCtl: ReqEn-
                         IDOReq- IDOCompl- LTR+ EmergencyPowerReductionReq-
                         10BitTagReq- OBFF Disabled, EETLPPrefixBlk-
                LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
                         EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: Upstream Port
        Capabilities: [100 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
                        ECRC- UnsupReq- ACSViol- UncorrIntErr- BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
                        PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
                        ECRC- UnsupReq- ACSViol- UncorrIntErr+ BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
                        PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+
                        ECRC- UnsupReq- ACSViol- UncorrIntErr+ BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
                        PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr- CorrIntErr- HeaderOF-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ CorrIntErr+ HeaderOF+
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [148 v1] Secondary PCI Express
                LnkCtl3: LnkEquIntrruptEn- PerformEqu-
                LaneErrStat: 0
        Capabilities: [168 v1] Physical Layer 16.0 GT/s <?>
        Capabilities: [18c v1] Lane Margining at the Receiver
                PortCap: Uses Driver-
                PortSta: MargReady- MargSoftReady-
        Capabilities: [19c v1] Transaction Processing Hints
                No steering table available
        Capabilities: [228 v1] Latency Tolerance Reporting
                Max snoop latency: 3145728ns
                Max no snoop latency: 3145728ns
        Capabilities: [230 v1] L1 PM Substates
                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
                          PortCommonModeRestoreTime=70us PortTPowerOnTime=0us
                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
                           T_CommonMode=0us LTR1.2_Threshold=81920ns
                L1SubCtl2: T_PwrOn=0us
        Capabilities: [240 v1] Data Link Feature <?>
        Kernel driver in use: mhi_q
        Kernel modules: mhi_pci_generic, pcie_mhi
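
Two flags in the output above may bear on the reset error: FLReset- (function-level reset is not advertised) and NoSoftRst+ (the device does not reset itself on a D3hot to D0 transition). A small sketch that pulls both flags out of saved lspci -vvv text; the function name is made up for illustration, and reading the result as "only a bus-level reset is left" is my assumption rather than something the logs state:

```python
import re


def reset_hints(lspci_text):
    """Extract the FLReset and NoSoftRst flags from `lspci -vvv` output.

    Each value is True for '+', False for '-', or None if the flag is absent.
    """
    flags = {}
    for name in ("FLReset", "NoSoftRst"):
        match = re.search(rf"\b{name}([+-])", lspci_text)
        flags[name] = None if match is None else (match.group(1) == "+")
    return flags
```

For the dump above this gives FLReset False and NoSoftRst True, which would be consistent with the host falling back to a reset through the upstream port 0000:00:1c.0, as the post-start dmesg suggests.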

【lspci -vvv after starting the VM】

01:00.0 Unassigned class [ff00]: Qualcomm Technologies, Inc Device 0308
        Subsystem: Qualcomm Technologies, Inc Device 5201
        !!! Unknown header type 7f
        IOMMU group: 13
        Region 0: Memory at 81001000 (64-bit, non-prefetchable) [size=4K]
        Region 2: Memory at 81000000 (64-bit, non-prefetchable) [size=4K]
        Kernel driver in use: vfio-pci
        Kernel modules: mhi_pci_generic, pcie_mhi
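
lspci typically prints "Unknown header type 7f" when the header-type byte reads back as 0xff, which usually means the device has stopped answering configuration reads. A rough way to confirm this from the host, assuming the standard sysfs config file for the device (the helper name is hypothetical):

```python
def config_space_reads_all_ones(config_path):
    """Return True if the vendor and device ID bytes of PCI config space are all 0xFF."""
    with open(config_path, "rb") as config:
        header = config.read(4)  # bytes 0-1: vendor ID, bytes 2-3: device ID
    return header == b"\xff\xff\xff\xff"
```

For this module the path would be /sys/bus/pci/devices/0000:01:00.0/config. If it returns True after the failed VM start, the link to the module has probably not come back after the reset, which would match the "retraining failed" lines in dmesg.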