Device could not be initialized or missing initialization #41

Open
Zamion101 opened this issue Nov 26, 2024 · 4 comments

Comments

@Zamion101

I'm trying to set up Aurora-HLS on Noctua 2, on node n2fpga17, with two xcu280_u55c_0 cards connected together in the following configuration: https://pc2.github.io/fpgalink-gui/index.html?import=%20--fpgalink%3Dn00%3Aacl0%3Ach0-n00%3Aacl1%3Ach0%20--fpgalink%3Dn00%3Aacl0%3Ach1-n00%3Aacl1%3Ach1

I followed the README and compiled the project with make aurora, then built the example with

make host
make xclbin

After waiting two hours for the xclbin to compile, I ran ./host_aurora_hls_test and got the message [n2fpga17:3185664] MCW rank 0 is not bound (or bound to all available processors), as well as [n2fpga17:3185664:0:3185664] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil)).

I added the following debug prints to host/host_aurora_hls_test.cpp and tried again:

uint32_t device_id = emulation ? 0 : (((node_rank / 2) + config.device_id_offset) % 3);
printf("device_id: %u %u \n", device_id, config.device_id_offset);

uint32_t instance = node_rank % 2;
printf("instance: %u \n", instance);

xrt::device device = xrt::device(device_id);
std::cout << "device name:     " << device.get_info<xrt::info::device::name>() << "\n";
std::cout << "device bdf:      " << device.get_info<xrt::info::device::bdf>() << "\n";

This gave me the following output, which shows where the problem occurs:

emulation: 0 
device_id: 0 0 
instance: 0 
[n2fpga17:3185664:0:3185664] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
==== backtrace (tid:3185664) ====
 0 0x0000000000012cf0 __funlockfile()  :0
 1 0x00000000000ccc6f xrt_core::system_linux::get_userpf_device()  ???:0
 2 0x00000000000d1187 xrt_core::get_userpf_device()  ???:0
 3 0x00000000001186cb xrt::device::device()  ???:0
 4 0x00000000004046b8 main()  /scratch/hpc-prf-gripv/Aurora-HLS/host/host_aurora_hls_test.cpp:102
 5 0x000000000003ad85 __libc_start_main()  ???:0
 6 0x000000000040415e _start()  ???:0
=================================
[n2fpga17:3185664] *** Process received signal ***
[n2fpga17:3185664] Signal: Segmentation fault (11)
[n2fpga17:3185664] Signal code:  (-6)
[n2fpga17:3185664] Failing at address: 0x12d6f00309c00
[n2fpga17:3185664] [ 0] /lib64/libpthread.so.0(+0x12cf0)[0x1553fbcd6cf0]
[n2fpga17:3185664] [ 1] /opt/software/FPGA/Xilinx/xrt/xrt_2.15/lib/libxrt_core.so.2(_ZNK8xrt_core12system_linux17get_userpf_deviceEj+0x1f)[0x1553e1719c6f]
[n2fpga17:3185664] [ 2] /opt/software/FPGA/Xilinx/xrt/xrt_2.15/lib/libxrt_coreutil.so.2(_ZN8xrt_core17get_userpf_deviceEj+0x67)[0x1553fc564187]
[n2fpga17:3185664] [ 3] /opt/software/FPGA/Xilinx/xrt/xrt_2.15/lib/libxrt_coreutil.so.2(_ZN3xrt6deviceC2Ej+0x5b)[0x1553fc5ab6cb]
[n2fpga17:3185664] [ 4] ./host_aurora_hls_test[0x4046b8]
[n2fpga17:3185664] [ 5] /lib64/libc.so.6(__libc_start_main+0xe5)[0x1553fb939d85]
[n2fpga17:3185664] [ 6] ./host_aurora_hls_test[0x40415e]
[n2fpga17:3185664] *** End of error message ***
Segmentation fault (core dumped)
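
The second half of the trace prints mangled C++ symbol names. As a small aid for reading it, here is a minimal sketch that demangles such names with `abi::__cxa_demangle` from `<cxxabi.h>` (available with GCC and Clang). The helper name `demangle` is my own and is not part of XRT or Aurora-HLS.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Hypothetical helper: demangle an Itanium-ABI symbol name.
// Returns the input unchanged if demangling fails.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With null buffer arguments, __cxa_demangle allocates the result with malloc.
    char* raw = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || raw == nullptr) {
        return mangled;
    }
    std::string result(raw);
    std::free(raw);  // the caller owns the malloc'ed buffer
    return result;
}
```

For example, `demangle("_ZN3xrt6deviceC2Ej")` gives `xrt::device::device(unsigned int)`, which matches frame 3 of the first backtrace: the crash happens inside the `xrt::device` constructor.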

I know the cards are present and working as shown in the screenshot below:
image
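
For reference, the rank-to-device mapping from the debug snippet can be isolated into a small function to see which device and instance each rank would pick. This is only a sketch of the two expressions quoted above; the names `Placement` and `place_rank` are hypothetical and do not appear in host_aurora_hls_test.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the two values printed by the debug code.
struct Placement {
    uint32_t device_id;
    uint32_t instance;
};

// Same arithmetic as the quoted host code: two ranks share one device,
// device ids wrap around modulo 3, and emulation always uses device 0.
Placement place_rank(uint32_t node_rank, uint32_t device_id_offset, bool emulation) {
    Placement p;
    p.device_id = emulation ? 0 : (((node_rank / 2) + device_id_offset) % 3);
    p.instance = node_rank % 2;
    return p;
}
```

With `node_rank = 0` and an offset of 0 this yields device 0 and instance 0, matching the printed output. Note that the `% 3` can produce device id 2, so on a node with only two cards the offset or the rank count would presumably need to keep the result below 2.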

@Mellich
Collaborator

Mellich commented Nov 26, 2024

According to your output, the application seems to crash on this line

xrt::device device = xrt::device(device_id);

or when retrieving the device info.
Is XRT properly set up on the compute node (maybe try running xbutil examine)? Does creating an xrt::device object work for you with other applications?
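
A cheap pre-flight check along these lines can be sketched in the host code before constructing the device. It assumes that the XRT setup script exports the `XILINX_XRT` environment variable, which is the usual case; the helper name `env_var_is_set` is hypothetical. It only tells you whether the environment was sourced, not whether the driver or the cards are usable.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: true if the environment variable exists and is non-empty.
bool env_var_is_set(const char* name) {
    const char* value = std::getenv(name);
    return value != nullptr && *value != '\0';
}

// Assumption: a sourced XRT environment exports XILINX_XRT.
bool xrt_env_looks_configured() {
    return env_var_is_set("XILINX_XRT");
}
```

Calling `xrt_env_looks_configured()` at the top of `main()` and printing a clear message when it returns false is easier to interpret than a segmentation fault inside the XRT library.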

Please also note that our project is currently only tested with the U280.

@papeg
Collaborator

papeg commented Nov 26, 2024

When running on Noctua 2, you should run "make xclbin" as a compute job, as it is a long-running synthesis. There is a script for it: https://github.com/pc2/Aurora-HLS/blob/main/scripts/synth.sh

The example runs on a U280 card. The n2fpga17 node has a custom configuration, so you need to set the correct PART and PLATFORM when running the synthesis. It could be u280 or u55. The following is just a guess for u55, so please check which values are actually correct.

sbatch ./scripts/synth.sh make xclbin PART=xcu55c-fsvh2892-2L-e PLATFORM=xilinx_u55c_gen3x16_xdma_3_202210_1

The correct XRT version also needs to be set in the env.sh file, if it is a fixed one.

For running it is also best to use the provided scripts (adjusted to the configuration): https://github.com/pc2/Aurora-HLS/blob/main/scripts/run_N1.sh

@Zamion101
Author

Okay, after talking with Riebler, I found out that the custom configuration on n2fpga17 does not include XRT, which is why Aurora-HLS does not work on it. Also, n2fpga17 has two Alveo U280 cards, not U55C. I'm trying another node at the moment.

I would also like to ask: is there a way to import or port Aurora-HLS for XDMA as well?

@Mellich
Collaborator

Mellich commented Nov 27, 2024

The focus of Aurora-HLS is support for the Vitis/XRT toolchains. But maybe you can just build the object files by running

make aurora_hls_0.xo aurora_hls_1.xo

0 for QSFP0 and 1 for QSFP1.
Actually, the object files should already be there from your previous synthesis.
Both kernels should have two AXI streams (512 bits wide by default), rx_axis and tx_axis. Then you need to link the object files into your design using your tool flow and connect the AXI streams accordingly. That should do the trick for the streaming configuration.
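
When wiring the streams into another flow, it can help to know how many stream words a transfer occupies. A minimal sketch, assuming the default 512-bit width mentioned above, so 64 bytes per beat; the names `kStreamWidthBits` and `beats_for_bytes` are hypothetical, and how a partially filled last beat is handled is not covered in this thread.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: default AXI stream width of 512 bits, as stated above.
constexpr std::size_t kStreamWidthBits = 512;
constexpr std::size_t kBytesPerBeat = kStreamWidthBits / 8;  // 64 bytes

// Number of stream beats needed to carry `bytes` bytes, rounding up.
constexpr std::size_t beats_for_bytes(std::size_t bytes) {
    return (bytes + kBytesPerBeat - 1) / kBytesPerBeat;
}
```

For example, a 4096-byte buffer takes 64 beats, while 65 bytes already need two.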
