diff --git a/AI_Engine_Development/Design_Tutorials/01-aie_lenet_tutorial/README.md b/AI_Engine_Development/Design_Tutorials/01-aie_lenet_tutorial/README.md
index 01e2b5fde3..a46e474df2 100644
--- a/AI_Engine_Development/Design_Tutorials/01-aie_lenet_tutorial/README.md
+++ b/AI_Engine_Development/Design_Tutorials/01-aie_lenet_tutorial/README.md
@@ -46,7 +46,7 @@ After completing the tutorial, you should be able to:
 Tutorial Overview

 ## Tutorial Overview
-In this application tutorial, the LeNet algorithm is used to perform image classification on an input image using five AI Engine tiles and PL resources including block RAM. A top level block diagram is shown in the following figure. An image is loaded from DDR memory through the NoC to block RAM and then to the AI Engine. The PL input pre-processing unit receives the input image and sends the output to the first AI Engine tile to perform matrix multiplication. The output from the first AI Engine tile goes to a PL unit to perform the first level of max pool and data rearrangement (M1R1). The output is fed to the second AI Engine tile and the output from that tile is sent to the PL to perform the second level max pooling and data rearrangement (M2R2). The output is then sent to a fully connected layer (FC1) implemented in two AI Engine tiles and uses the rectified linear unit layer (ReLu) as an activation function. The outputs from the two AI Engine tiles are then fed into a second fully connected layer implemented in the fifth AI Engine tile. The output is sent to a data conversion unit in the PL and then to the DDR memory through the NoC. In between the AI Engine and PL units is a datamover module (refer to the Lenet Controller in the following figure) that contains the following kernels:
+In this application tutorial, the LeNet algorithm is used to perform image classification on an input image using five AI Engine tiles and PL resources including block RAM. A top level block diagram is shown in the following figure. An image is loaded from DDR memory through the NoC to block RAM and then to the AI Engine. The PL input pre-processing unit receives the input image and sends the output to the first AI Engine tile to perform matrix multiplication. The output from the first AI Engine tile goes to a PL unit to perform the first level of max pool and data rearrangement (M1R1). The output is fed to the second AI Engine tile and the output from that tile is sent to the PL to perform the second level max pooling and data rearrangement (M2R2). The output is then sent to a fully connected layer (FC1) implemented in two AI Engine tiles and uses the rectified linear unit layer (ReLu) as an activation function. The outputs from the two AI Engine tiles are then fed into a second fully connected layer implemented in the `core04` AI Engine tile. The output is sent to a data conversion unit in the PL and then to the DDR memory through the NoC. In between the AI Engine and PL units is a datamover module (refer to the Lenet Controller in the following figure) that contains the following kernels:

 * `mm2s`: a memory mapped to stream kernel to feed data from DDR memory through the NoC to the AI Engine Array
 * `s2mm`: a stream to memory mapped kernel to feed data from the AI Engine Array through NoC to DDR memory
@@ -264,7 +264,7 @@ v++ --target hw_emu \
 make graph: Creating the AI Engine ADF Graph for Vitis Compiler Flow

 ## make graph: Creating the AI Engine ADF Graph for Vitis Compiler Flow
-An ADF graph can be connected to an extensible Vitis platform (the graph I/Os can be connected either to platform ports or to ports on Vitis kernels through Vitis compiler connectivity directives.
+An ADF graph can be connected to an extensible Vitis platform (the graph I/Os can be connected either to platform ports or to ports on Vitis kernels through Vitis compiler connectivity directives).
 * The AI Engine ADF C++ graph of the design contains AI Engine kernels and PL kernels.
 * All interconnects between kernels are defined in the C++ graph.
 * All interconnections to external I/O are fully specified in the C++ simulation testbench (`graph.cpp`) that instantiates the C++ ADF graph object. All `adf::sim` platform connections from graph to PLIO map onto ports on the AI Engine subsystem graph that are connected using the Vitis compiler connectivity directives. No dangling ports or implicit connections are allowed by the Vitis compiler.
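
Reviewer note: the overview paragraph touched by the first hunk describes max pool stages (M1R1, M2R2) in the PL and a ReLU activation on FC1. As a reading aid only, the following is a minimal scalar reference sketch of those two operations in plain C++. It is not the tutorial's PL or AI Engine code. The function names `max_pool_2x2` and `relu`, the 2x2 window with stride 2, the `int` element type, and the row-major layout are all assumptions made for illustration, and the data rearrangement half of M1R1/M2R2 is not modelled.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical reference model (assumption, not from the tutorial sources).
// Applies a 2x2 max pool with stride 2 to a row-major feature map.
// Assumes `in` holds rows * cols elements and that rows and cols are even.
std::vector<int> max_pool_2x2(const std::vector<int>& in,
                              std::size_t rows, std::size_t cols) {
    std::vector<int> out;
    out.reserve((rows / 2) * (cols / 2));
    for (std::size_t r = 0; r + 1 < rows; r += 2) {
        for (std::size_t c = 0; c + 1 < cols; c += 2) {
            // Maximum of the two elements in the upper row of the window.
            int top = std::max(in[r * cols + c], in[r * cols + c + 1]);
            // Maximum of the two elements in the lower row of the window.
            int bottom = std::max(in[(r + 1) * cols + c],
                                  in[(r + 1) * cols + c + 1]);
            out.push_back(std::max(top, bottom));
        }
    }
    return out;
}

// Hypothetical ReLU helper: replaces every negative value with zero.
std::vector<int> relu(std::vector<int> v) {
    for (int& x : v) {
        x = std::max(x, 0);
    }
    return v;
}
```

A model like this could serve as a golden reference when checking hardware emulation output against expected values, provided the real window size, data type, and layout are confirmed from the design sources first.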