run_tests.sh locks host machine #10
Comments
This usually has 2 causes:
If you try the new driver and it doesn't solve the problem, let me know and I'll send you another LiteFury. The units are fully tested before they ship, but stuff happens.
Also, the best way to get support is to email me ([email protected]).
Thanks for your reply! I built the driver from the repo you posted and it no longer hangs when running run_tests.sh. It does, however, complete with errors. Should I be concerned?
Tobias,
I don't think the XDMA test you are running will work with the factory
bitstream.
Can you try dma-test-2.py from here:
https://github.com/RHSResearchLLC/NiteFury-and-LiteFury/tree/master/Sample-Projects/Project-0/Host/Host
Best regards,
Dave.
…On Tue, Feb 2, 2021 at 2:40 PM Tobias Björnsson ***@***.***> wrote:
Thanks for your reply! I built the driver from the repo you posted and it
no longer hangs when running run_tests.sh. It does however complete with
errors, should I be concerned?
Info: Number of enabled h2c channels = 1
Info: Number of enabled c2h channels = 1
Info: The PCIe DMA core is memory mapped.
Info: Running PCIe DMA memory mapped write read test
transfer size: 1024
transfer count: 1
Info: Writing to h2c channel 0 at address offset 0.
Info: Wait for current transactions to complete.
/dev/xdma0_h2c_0 ** Average BW = 1024, 5.287372
Info: Writing to h2c channel 0 at address offset 1024.
Info: Wait for current transactions to complete.
/dev/xdma0_h2c_0 ** Average BW = 1024, 3.657091
Info: Writing to h2c channel 0 at address offset 2048.
Info: Wait for current transactions to complete.
/dev/xdma0_h2c_0 ** Average BW = 1024, 3.771014
Info: Writing to h2c channel 0 at address offset 3072.
Info: Wait for current transactions to complete.
/dev/xdma0_h2c_0 ** Average BW = 1024, 3.803708
Info: Reading from c2h channel 0 at address offset 0.
Info: Wait for the current transactions to complete.
/dev/xdma0_c2h_0 ** Average BW = 1024, 3.376251
Info: Reading from c2h channel 0 at address offset 1024.
Info: Wait for the current transactions to complete.
/dev/xdma0_c2h_0 ** Average BW = 1024, 2.740687
Info: Reading from c2h channel 0 at address offset 2048.
Info: Wait for the current transactions to complete.
/dev/xdma0_c2h_0 ** Average BW = 1024, 4.069499
Info: Reading from c2h channel 0 at address offset 3072.
Info: Wait for the current transactions to complete.
/dev/xdma0_c2h_0 ** Average BW = 1024, 2.467690
Info: Checking data integrity.
data/output_datafile0_4K.bin data/datafile0_4K.bin differ: char 53, line 1
Error: The data written did not match the data that was read.
address range: 0 - 1024
write data file: data/datafile0_4K.bin
read data file: data/output_datafile0_4K.bin
data/output_datafile1_4K.bin data/datafile1_4K.bin differ: char 53, line 2
Error: The data written did not match the data that was read.
address range: 1024 - 2048
write data file: data/datafile1_4K.bin
read data file: data/output_datafile1_4K.bin
data/output_datafile2_4K.bin data/datafile2_4K.bin differ: char 54, line 2
Error: The data written did not match the data that was read.
address range: 2048 - 3072
write data file: data/datafile2_4K.bin
read data file: data/output_datafile2_4K.bin
data/output_datafile3_4K.bin data/datafile3_4K.bin differ: char 53, line 2
Error: The data written did not match the data that was read.
address range: 3072 - 4096
write data file: data/datafile3_4K.bin
read data file: data/output_datafile3_4K.bin
Error: Test completed with Errors.
Error: Test completed with Errors.
dmesg:
[ 870.983951] xdma:xdma_mod_init: Xilinx XDMA Reference Driver xdma v2020.1.8
[ 870.983965] xdma:xdma_mod_init: desc_blen_max: 0xfffffff/268435455, timeout: h2c 10 c2h 10 sec.
[ 870.985805] xdma:xdma_device_open: xdma device 0000:01:00.0, 0xffffffc0ed678800.
[ 870.985876] xdma 0000:01:00.0: enabling device (0000 -> 0002)
[ 870.986124] xdma:map_single_bar: BAR0 at 0xfa000000 mapped at 0xffffff800e400000, length=1048576(/1048576)
[ 870.986220] xdma:map_single_bar: BAR1 at 0xfa100000 mapped at 0xffffff800bec0000, length=65536(/65536)
[ 870.986241] xdma:map_bars: config bar 1, pos 1.
[ 870.986259] xdma:identify_bars: 2 BARs: config 1, user 0, bypass -1.
[ 870.986728] xdma:pci_keep_intx_enabled: 0000:01:00.0: clear INTX_DISABLE, 0x406 -> 0x6.
[ 870.986873] xdma:probe_one: 0000:01:00.0 xdma0, pdev 0xffffffc0ed678800, xdev 0xffffffc0eb376000, 0xffffffc0eb374000, usr 16, ch 1,1.
[ 870.997049] xdma:cdev_xvc_init: xcdev 0xffffffc0eb377b88, bar 0, offset 0x40000.
Hi, I tried the dma-test-2.py with the following results: #0 Running it again locks up the host. Error 512 sent me down a rabbit hole with the post here: https://forums.xilinx.com/t5/PCIe-and-CPM/debug-the-driver-of-IP-PCIE-with-DMA/m-p/1003427/highlight/true#M14300 I tested the suggested solution but the test failed anyway: Info: Number of enabled h2c channels = 1 I'm not sure if the board is faulty or just not compatible with the RockPi4. |
OK, stick with dma-test-2.py. Reduce the transfer size and see if that
helps. Also, once you get that error (unknown error 512), the test won't
work properly; it seems to mess up the driver. So if you see error 512,
it's best to reboot and try again. I have seen that error happen when the
transfer size is too large for the OS to handle.
Best regards,
Dave.
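
The suggestion above (reduce the transfer size) could be tried with a small loopback check like the following. This is only a sketch, not the logic of dma-test-2.py: the function names `chunked_loopback` and `run_on_device` and the default chunk size are hypothetical, and the device paths are the ones that appear in the logs in this thread. Whether smaller transfers actually avoid error 512 on a given host is not confirmed here.

```python
import os


def chunked_loopback(fd_h2c, fd_c2h, data, chunk_size):
    """Write `data` in chunk_size pieces and read each piece back.

    Returns the list of byte offsets whose chunk did not read back identically.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    bad_offsets = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # Host-to-card write at the given address offset.
        written = os.pwrite(fd_h2c, chunk, offset)
        if written != len(chunk):
            raise OSError(f"short write at offset {offset}: {written} of {len(chunk)} bytes")
        # Card-to-host read of the same range.
        readback = os.pread(fd_c2h, len(chunk), offset)
        if readback != chunk:
            bad_offsets.append(offset)
    return bad_offsets


def run_on_device(total_bytes=4096, chunk_size=1024):
    """Hypothetical driver: assumes the XDMA character devices seen in the logs."""
    fd_h2c = os.open("/dev/xdma0_h2c_0", os.O_WRONLY)
    fd_c2h = os.open("/dev/xdma0_c2h_0", os.O_RDONLY)
    try:
        data = os.urandom(total_bytes)
        return chunked_loopback(fd_h2c, fd_c2h, data, chunk_size)
    finally:
        os.close(fd_h2c)
        os.close(fd_c2h)
```

Calling `run_on_device()` with a small `chunk_size` first and increasing it step by step would show at which size, if any, the failure starts. An empty list means every chunk read back correctly.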
…On Mon, Feb 8, 2021 at 2:53 PM Tobias Björnsson ***@***.***> wrote:
Hi, I tried the dma-test-2.py with the following results:
#0
Sent in 2190.97900390625 milliseconds (490.07399070719003 MBPS)
Traceback (most recent call last):
  File "dma-test-2.py", line 81, in <module>
    main()
  File "dma-test-2.py", line 68, in main
    mem_test_random()
  File "dma-test-2.py", line 41, in mem_test_random
    rx_data.append(os.pread(fd_c2h, TRANSFER_SIZE, page * TRANSFER_SIZE))
OSError: [Errno 512] Unknown error 512
Running it again locks up the host.
Error 512 sent me down a rabbit hole with the post here:
https://forums.xilinx.com/t5/PCIe-and-CPM/debug-the-driver-of-IP-PCIE-with-DMA/m-p/1003427/highlight/true#M14300
I tested the suggested solution but the test failed anyway:
Info: Number of enabled h2c channels = 1
Info: Number of enabled c2h channels = 1
Info: The PCIe DMA core is memory mapped.
Info: Running PCIe DMA memory mapped write read test
transfer size: 1024
transfer count: 1
Info: Writing to h2c channel 0 at address offset 0. TransferSize: 1024 TransferCount: 1
Info: Wait for current transactions to complete.
/dev/xdma0_h2c_0 ** Average BW = 1024, 3.382519
Info: Writing to h2c channel 0 at address offset 1024. TransferSize: 1024 TransferCount: 1
Info: Wait for current transactions to complete.
/dev/xdma0_h2c_0 ** Average BW = 1024, 2.005331
Info: Writing to h2c channel 0 at address offset 2048. TransferSize: 1024 TransferCount: 1
Info: Wait for current transactions to complete.
/dev/xdma0_h2c_0 ** Average BW = 1024, 3.795151
Info: Writing to h2c channel 0 at address offset 3072. TransferSize: 1024 TransferCount: 1
Info: Wait for current transactions to complete.
/dev/xdma0_h2c_0 ** Average BW = 1024, 3.331175
Info: Reading from c2h channel 0 at address offset 0. TransferSize: 1024 TransferCount: 1
Info: Wait for the current transactions to complete.
/dev/xdma0_c2h_0 ** Average BW = 1024, 3.323952
Info: Reading from c2h channel 0 at address offset 1024. TransferSize: 1024 TransferCount: 1
Info: Wait for the current transactions to complete.
/dev/xdma0_c2h_0 ** Average BW = 1024, 3.227494
Info: Reading from c2h channel 0 at address offset 2048. TransferSize: 1024 TransferCount: 1
Info: Wait for the current transactions to complete.
/dev/xdma0_c2h_0 ** Average BW = 1024, 3.982716
Info: Reading from c2h channel 0 at address offset 3072. TransferSize: 1024 TransferCount: 1
Info: Wait for the current transactions to complete.
/dev/xdma0_c2h_0 ** Average BW = 1024, 2.669127
Info: Checking data integrity.
data/output_datafile0_4K.bin data/datafile0_4K.bin differ: char 53, line 1
Error: The data written did not match the data that was read.
address range: 0 - 1024
write data file: data/datafile0_4K.bin
read data file: data/output_datafile0_4K.bin
data/output_datafile1_4K.bin data/datafile1_4K.bin differ: char 53, line 2
Error: The data written did not match the data that was read.
address range: 1024 - 2048
write data file: data/datafile1_4K.bin
read data file: data/output_datafile1_4K.bin
data/output_datafile2_4K.bin data/datafile2_4K.bin differ: char 54, line 2
Error: The data written did not match the data that was read.
address range: 2048 - 3072
write data file: data/datafile2_4K.bin
read data file: data/output_datafile2_4K.bin
data/output_datafile3_4K.bin data/datafile3_4K.bin differ: char 53, line 2
Error: The data written did not match the data that was read.
address range: 3072 - 4096
write data file: data/datafile3_4K.bin
read data file: data/output_datafile3_4K.bin
Error: Test completed with Errors.
Error: Test completed with Errors.
I'm not sure if the board is faulty or just not compatible with the
RockPi4.
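
The `cmp` lines in the output above report only the first differing byte of each file pair (`cmp` counts bytes from 1, so "char 53" is offset 52). To see whether the mismatches follow a pattern, one could list every differing offset instead. The following is a minimal sketch; the helper names `diff_offsets` and `diff_files` are hypothetical, and the file paths in the usage note are the ones printed by the test script.

```python
def diff_offsets(written, read_back, limit=16):
    """Return up to `limit` tuples (offset, written_byte, read_byte) where the buffers differ.

    Offsets are 0-based. Only the overlapping length of the two buffers is compared.
    """
    diffs = []
    for offset, (w, r) in enumerate(zip(written, read_back)):
        if w != r:
            diffs.append((offset, w, r))
            if len(diffs) >= limit:
                break
    return diffs


def diff_files(written_path, read_path, limit=16):
    """Compare two files on disk using diff_offsets."""
    with open(written_path, "rb") as fw, open(read_path, "rb") as fr:
        return diff_offsets(fw.read(), fr.read(), limit)
```

For example, `diff_files("data/datafile0_4K.bin", "data/output_datafile0_4K.bin")` would show whether the bytes from offset 52 onward are all wrong or only a few scattered positions differ, which may be useful detail for a support request.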
Right, tested the following:
All resulted in: #0 dmesg from driver being loaded to the seg fault: [ 169.797974] xdma: loading out-of-tree module taints kernel. |
Doesn't look like enough information to really help. Probably the last thing you could try:
I'm not familiar with the Xilinx project, so I'm not sure what offset they map memory to; they probably just use BRAM as the memory. If the problem persists, you'll have a nice clean project that you can use to open a support request. If the problem doesn't persist, you can use MIG to replace BRAM with DDR and see if the problem comes back. It's not likely to be the LiteFury itself; it's more likely a compatibility issue between the XDMA driver and the RockPi. But if you really think the LiteFury is defective, I can send you another. My recommendation is to use BRAM instead of DDR: if BRAM works and DDR doesn't, that would indicate a defective LiteFury.
OK, thanks Dave for all your help!
Hi,
I'm running the NiteFury on a RockPi4 (ARM). I had to recompile everything, but after that step the kernel module loaded and everything seemed fine. When I ran run_tests.sh, the whole machine locked up. The lights are still blinking on the NiteFury, but the machine is unresponsive. What might cause this?
lspci -vv
01:00.0 Serial controller: Xilinx Corporation Device 7024 (prog-if 01 [16450])
Subsystem: Xilinx Corporation Device 0007
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ 234
Region 0: Memory at fa000000 (32-bit, non-prefetchable) [disabled] [size=1M]
Region 1: Memory at fa100000 (32-bit, non-prefetchable) [disabled] [size=64K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [60] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x4, ASPM L0s, Exit Latency L0s unlimited, L1 unlimited
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range B, TimeoutDis-, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v1] Device Serial Number 00-00-00-00-00-00-00-00
sudo ./load_driver.sh
Loading xdma driver...
The Kernel module installed correctly and the xmda devices were recognized.
DONE
dmesg
[ 7641.378840] xdma: loading out-of-tree module taints kernel.
[ 7641.382338] xdma:xdma_mod_init: Xilinx XDMA Reference Driver xdma v2017.1.47
[ 7641.382351] xdma:xdma_mod_init: desc_blen_max: 0xfffffff/268435455, sgdma_timeout: 10 sec.
[ 7641.383695] xdma:xdma_device_open: xdma device 0000:01:00.0, 0xffffffc0ed638800.
[ 7641.383731] xdma 0000:01:00.0: enabling device (0000 -> 0002)
[ 7641.383757] xdma:pci_check_extended_tag: 0xffffffc0ed638800 EXT_TAG disabled.
[ 7641.383765] xdma:pci_check_extended_tag: pdev 0xffffffc0ed638800, xdev 0xffffffc0e8568000, config bar UNKNOWN.
[ 7641.383931] xdma:map_single_bar: BAR0 at 0xfa000000 mapped at 0xffffff800e400000, length=1048576(/1048576)
[ 7641.383969] xdma:map_single_bar: BAR1 at 0xfa100000 mapped at 0xffffff800bf80000, length=65536(/65536)
[ 7641.383980] xdma:map_bars: config bar 1, pos 1.
[ 7641.383987] xdma:identify_bars: 2 BARs: config 1, user 0, bypass -1.
[ 7641.384198] xdma:probe_one: 0000:01:00.0 xdma0, pdev 0xffffffc0ed638800, xdev 0xffffffc0dd202000, 0xffffffc0e8568000, usr 16, ch 1,1.
[ 7641.409170] xdma:cdev_xvc_init: xcdev 0xffffffc0dd203b88, bar 0, offset 0x40000.
sudo ./run_test.sh
Info: Number of enabled h2c channels = 1
Info: Number of enabled c2h channels = 1
Info: The PCIe DMA core is memory mapped.
Info: Running PCIe DMA memory mapped write read test
transfer size: 1024
transfer count: 1
Info: Writing to h2c channel 0 at address offset 0.
Info: Wait for current transactions to complete.
After this the system becomes unresponsive.