Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let XRT allocate host memory (include again) #72

Merged
merged 1 commit into from
Jul 20, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions Getting_Started/Vitis/Part3.md
Original file line number Diff line number Diff line change
Expand Up @@ -68,11 +68,11 @@ There are 4 main steps in the source code for this simple example.

* **Step 1:** The OpenCL environment is initialized. In this section, the host detects the attached Xilinx device, loads the FPGA binary (.xclbin file) from file and programs it into the first Xilinx device it found. Then a command queue and the kernel object are created. All Vitis applications will have code very similar to the one in this section.

* **Step 2:** The application creates the three buffers needed to share data with the kernel: one for each input and one for the output. On data-center platforms, it is more efficient to allocate memory aligned on 4k page boundaries. On embedded platforms, it is more efficient to perform contiguous memory allocation. A simple way of achieving either of these is to let the Xilinx Runtime allocate host memory when creating the buffers. This is done by using the cl::Buffer constructor to create the buffers and then mapping the allocated memory to user-space pointers.
* **Step 2:** The application creates the three buffers needed to share data with the kernel: one for each input and one for the output. On data-center platforms, it is more efficient to allocate memory aligned on 4k page boundaries. On embedded platforms, it is more efficient to perform contiguous memory allocation. A simple way of achieving either of these is to let the Xilinx Runtime allocate host memory when creating the buffers. This is done by using the cl::Buffer constructor to create the buffers and then mapping the allocated memory to user-space pointers.

```cpp
// Create the buffers and allocate memory
cl::Buffer in1_buf(context, CL_MEM_READ_ONLY, sizeof(int) * DATA_SIZE, NULL, &err);
cl::Buffer in1_buf(context, CL_MEM_READ_ONLY, sizeof(int) * DATA_SIZE, NULL, &err);

// Map host-side buffer memory to user-space pointers
int *in1 = (int *)q.enqueueMapBuffer(in1_buf, CL_TRUE, CL_MAP_WRITE, 0, sizeof(int) * DATA_SIZE);
Expand Down
6 changes: 3 additions & 3 deletions Getting_Started/Vitis/example/src/host.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -65,9 +65,9 @@ int main(int argc, char **argv)
// Step 2: Create buffers and initialize test values
// ------------------------------------------------------------------------------------
// Create the buffers and allocate memory
cl::Buffer in1_buf(context, CL_MEM_READ_ONLY, sizeof(int) * DATA_SIZE, NULL, &err);
cl::Buffer in2_buf(context, CL_MEM_READ_ONLY, sizeof(int) * DATA_SIZE, NULL, &err);
cl::Buffer out_buf(context, CL_MEM_WRITE_ONLY, sizeof(int) * DATA_SIZE, NULL, &err);
cl::Buffer in1_buf(context, CL_MEM_ALLOC_HOST_PTR | CL_MEM_READ_ONLY, sizeof(int) * DATA_SIZE, NULL, &err);
cl::Buffer in2_buf(context, CL_MEM_ALLOC_HOST_PTR | CL_MEM_READ_ONLY, sizeof(int) * DATA_SIZE, NULL, &err);
cl::Buffer out_buf(context, CL_MEM_ALLOC_HOST_PTR | CL_MEM_WRITE_ONLY, sizeof(int) * DATA_SIZE, NULL, &err);

// Map buffers to kernel arguments, thereby assigning them to specific device memory banks
krnl_vector_add.setArg(0, in1_buf);
Expand Down