EPEE User Guide
Contents
1. [Figure 3.12: UDI test. Terminal output in SW_src/app repeats "test N", "wait for interrupt", "interrupt ok" up to test 999.]

3.4 About the Example Design

The structure of the example design is illustrated in figure 3.13. The example design has three main parts. The PCIE_TOP module is the main part of the hardware side of the EPEE library; the Xilinx PCIe IP core is in the CORE_WRAPPER module of PCIE_TOP. The EXTEN_LIB part consists of four clock switch modules and a RESET_GEN module; these five modules perform the clock domain switch between the EPEE library and the user hardware. The USER_HW part contains three small demos for the UCR (PIO), DMA, and UDI interfaces.

The usr_pio module has four registers. Reg 0 and reg 1 can be read and written by software, while reg 2 and reg 3 are read-only. The values of reg 2 and reg 3 follow these equations:

Reg2 = reg0 [...] reg1
Reg3 = reg0 [...] reg1

[Figure 3.13: Example design structure. Legible block labels: usr_pio, usr_dma, usr_int, DMA_HOST2BOARD_CLK_SWITCH, DMA_BOARD2HOST_CLK_SWITCH.]

Module usr
2. Center for Energy-efficient Computing and Applications

EPEE User Guide
Jian Gong, jian gong pku edu cn

1 Hardware Interface

The hardware interface consists of the UCR (user controllable register), DMA (direct memory access), and UDI (user defined interrupt) interfaces, as figure 1.1 shows. The DMA interface is used for transferring large amounts of data. The UCR and UDI interfaces are used to control the hardware.

[Figure 1.1: EPEE user interface, between the hardware (FPGA) application and the software application.]

1.1 DMA Interface

The DMA interface is divided into a host-to-board (DMA read) part and a board-to-host (DMA write) part. FIFOs are used as the interface. Table 1.1 describes the DMA interface.

Table 1.1 DMA Interface

FIFO_host2board_dout[71:0] (64-bit interface), FIFO_host2board_dout[129:0] (128-bit interface), output: DMA host-to-board (DMA read) data output.
  64-bit interface: data is transferred in bits [63:0]. Bit 65 indicates that dout[63:32] contains valid data; bit 64 indicates that dout[31:0] contains valid data.
  128-bit interface: data is transferred in bits [127:0]. Bits [129:128] indicate whether the QWs (8 bytes) in [127:64] and [63:0] contain valid data.
FIFO_host2board_empty, output: Empty signal of the Xilinx FIFO. The FIFO is a First-Word Fall-Through FIFO.
FIFO_host2board_rd_en: Read enable signal of the Xilinx FIFO (see note 1).

1 For more information on the Xilinx FIFO, please see Xilinx PG057, LogiCORE IP FIFO Genera
3. [Figure 1.8: User defined interrupt interface timing.]

2 Software API Definition

The software APIs are declared in sPcie.h and sPcieZerocopy.h. They can be divided into DMA (direct memory access), UCR (user controllable register), UDI (user defined interrupt), and Zero Copy DMA parts. Besides these, we also provide some other APIs, such as the reset APIs. These APIs are listed in tables 2.1 to 2.5.

Note: For the 64-bit interface, the DMA data length should be an integral multiple of a DW (4 bytes); for the 128-bit interface hardware, it should be an integral multiple of a QW (8 bytes).

Table 2.1 DMA APIs

dma_host2board: DMA data from host to board. It blocks until the DMA is done.
dma_host2board_unblocking: DMA data from host to board. It returns immediately after the DMA is started. If the hardware is currently busy, it returns 1.
dma_board2host: DMA data from board to host. It blocks until the DMA is done.
get_host2board_count: Return the number of DWs that have been transferred by DMA from host to board. It can be used for debugging.
get_board2host_count: Return the number of DWs that have gone through the board2host FIFO. It can be used for debugging.

Table 2.2 UCR APIs

[name missing]: Read a user controllable register.
[name missing]: Write a user controllable register.

Table 2.3 UDI APIs

block_until_interrupt: Block until the corresponding interrupt oc
4. slot in your computer, as figure 3.5 shows. Use the included PC power adaptor and then turn on the power switch. Do not use the PCIe connector from the PC power supply (see note 2).

[Figure 3.5: photograph of the board installed in the computer.]

Turn on the computer to power on the VC707. Use the iMPACT tool to program the generated bit file to the VC707. Then restart the computer to reset the board.

2 See Xilinx XTP144 for details on the hardware setup.

3.3 Software Setup (Linux)

After rebooting the computer, open a terminal and follow these steps:

1. List the PCI devices to make sure EPEE can be detected. Figure 3.6 shows that the lspci command lists all the PCI devices, including PCIe devices, and that the Xilinx VC707 board with EPEE is detected as "Xilinx Corporation Device 7024".

[Figure 3.6: List PCI devices. The lspci device tree on ceca-PC includes the entry "Xilinx Corporation Device 7024" among the Intel chipset devices.]

2. Make the driver module. The following commands require sudo privilege, so we use sudo
5. 8 bytes each time. For each DMA, the DMA_test program first transfers data from host to board, then transfers data from board to host. The user hardware (see USER_HW/usr_dma.v) applies a bitwise operation to the data from the host. The DMA_test program checks whether the data from the board matches the result of that bitwise operation on the original data. If no error occurs, as figure 3.10 shows, the test passes.

[Figure 3.10: Run DMA test. Terminal output in SW_src/app lists Test 4016 through Test 4096 in steps of 8.]

PIO test

[Figure 3.11: PIO test. Terminal output of PIO_test: configure mode x4gen2, current mode x4gen2, reg0 0x1, reg1 0x2, reg2 0xffffffff, reg3 0x3.]

In this test, the PIO_test program first writes 1 to reg0 and then writes 2 to reg1. Then it reads reg2 and reg3. Reg2 and reg3 are read-only from the software side. The user hardware in this demo calculates reg2 and reg3 this way (see USER_HW/usr_pio.v):

reg2 <= reg0 [...] reg1
reg3 <= reg0 [...] reg1

UDI test
6. _dma loops each data word from the host2board FIFO back to the board2host FIFO. It also applies a bitwise operation to each data word. Module usr_int generates two kinds of user defined interrupts: vector 0 and vector 1. While software is waiting for an interrupt, the counter inside this module counts. An interrupt is generated when the counter reaches a certain number: 10 for vector 0 and 1000 for vector 1.

3.5 About Xilinx PCIe Core

The Xilinx PCIe core is in the CORE_WRAPPER_1Mbar directory. Users can also generate the IP core themselves by following these steps. The component name of the core should be set to PCIE_CORE_1MBar, as figure 3.14 shows.

[Figure 3.14: Modify the name of component. Screenshot of the 7 Series Integrated Block for PCI Express wizard, with Device Port Type set to PCI Express Endpoint device and the Number of Lanes selection.]

The base address register is 1 MB with 64-bit enabled, and only Bar0 is used in EPEE, as shown in figure 3.15.

[Screenshot of the 7 Series Integrated Block for PCI Express wizard, Base Address Registers page; the caption is cut off at the page break.]
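The 1 MB, 64-bit Bar0 setting can be cross-checked against the value FFF00004 that appears under the BAR 0 options in the figure 3.15 dialog. The following is a minimal sketch of that check in C. It assumes the standard PCI BAR bit layout (bit 0 selects I/O or memory, bits 2:1 give the type, bit 3 is the prefetchable flag) and assumes the dialog value is the lower 32-bit mask of the BAR. The names bar_info_t and decode_bar_mask are hypothetical and are not part of EPEE.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not from the EPEE sources. */
typedef struct {
    int is_memory;        /* bit 0 == 0: memory BAR */
    int is_64bit;         /* bits 2:1 == 2'b10: 64-bit BAR */
    int prefetchable;     /* bit 3 */
    uint64_t size_bytes;  /* aperture size implied by the address mask */
} bar_info_t;

/* Decode the lower 32-bit BAR mask (assumed layout: standard PCI BAR). */
bar_info_t decode_bar_mask(uint32_t lo_mask)
{
    uint32_t addr_mask = lo_mask & 0xFFFFFFF0u;  /* drop the 4 flag bits */
    assert(addr_mask != 0);  /* an all-zero mask would mean an unused BAR */

    bar_info_t info;
    info.is_memory    = (lo_mask & 0x1u) == 0;
    info.is_64bit     = ((lo_mask >> 1) & 0x3u) == 0x2u;
    info.prefetchable = (lo_mask & 0x8u) != 0;
    /* Size is the two's complement of the address mask. */
    info.size_bytes   = (uint64_t)(uint32_t)(~addr_mask) + 1u;
    return info;
}
```

Under these assumptions, decode_bar_mask(0xFFF00004u) reports a 64-bit, non-prefetchable memory BAR of 0x100000 bytes, which agrees with the 1 MB, 64-bit description in the text.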
7. [Timing diagram, continued from the previous page: FIFO_host2board_dout[63:32] carries DW0, DW2, ...; FIFO_host2board_dout[31:0] carries DW1, DW3, ....]

[Figure 1.3: Timing of host2board FIFO (2N+1 data). Signals: clk, FIFO_host2board_rd_en, FIFO_host2board_empty, FIFO_host2board_dout[65:64], FIFO_host2board_dout[63:32], FIFO_host2board_dout[31:0].]

Timing for the 128-bit interface is like that of the 64-bit interface. Data within one DMA transaction from host to board is continuous; FIFO_host2board_dout[129:128] can be 2'b10 only in the last cycle.

For the board-to-host side, taking the 64-bit interface as an example, the user can assert and deassert FIFO_board2host_din[65:64] at any time. EPEE packs the data for the user, and software sees continuous data. As figures 1.4 and 1.5 show, the software side sees the continuous data DW0, DW1, DW2, DW3, DW4.

[Figure 1.4: Timing of board2host FIFO. Signals: clk, FIFO_board2host_wr_en, FIFO_board2host_prog_full, FIFO_board2host_din[65:64], FIFO_board2host_din[63:32] (DW0, DW2), FIFO_board2host_din[31:0] (DW1, DW3).]

[Timing diagram with the same signals, FIFO_board2host_din[63:32] carrying DW0, DW2, DW4; the caption is cut off at the page break.]
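The board-to-host packing described above can be illustrated with a small software model. This is a sketch under stated assumptions, not EPEE code: the type b2h_word_t and the function model_board2host_pack are hypothetical, the model ignores FIFO_board2host_wr_en and FIFO_board2host_prog_full, and it assumes that within one FIFO word the DW in din[63:32] precedes the DW in din[31:0], as the DW labels in the timing figures suggest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one 64-bit-interface write into the board2host FIFO. */
typedef struct {
    uint32_t hi;     /* din[63:32] */
    uint32_t lo;     /* din[31:0]  */
    unsigned valid;  /* din[65:64]: bit 1 marks hi valid, bit 0 marks lo valid */
} b2h_word_t;

/* Collect only the valid DWs, in order, as software would see them.
 * Returns the number of DWs written to out (at most out_cap). */
size_t model_board2host_pack(const b2h_word_t *words, size_t n_words,
                             uint32_t *out, size_t out_cap)
{
    assert(words != NULL && out != NULL);
    size_t n = 0;
    for (size_t i = 0; i < n_words; i++) {
        if ((words[i].valid & 0x2u) && n < out_cap)
            out[n++] = words[i].hi;   /* upper DW first (assumed order) */
        if ((words[i].valid & 0x1u) && n < out_cap)
            out[n++] = words[i].lo;
    }
    return n;
}
```

For instance, writing the three words {DW0, DW1, 2'b11}, {DW2, unused, 2'b10}, {DW3, DW4, 2'b11} yields the continuous sequence DW0 to DW4 on the software side in this model.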
8. cur

Table 2.4 Zero Copy APIs

Table 2.5 Other APIs

get_pcie_cfg_mode: Return the configured PCIe mode, e.g. Gen2 X4.
get_pcie_cur_mode: Return the currently used PCIe mode.
sys_reset: Reset the whole system.
usr_reset: Reset the user hardware.
host2board_reset: Reset the host-to-board side DMA data transfer, including the FIFO.
board2host_reset: Reset the board-to-host side DMA data transfer, including the FIFO.

3 Example Design

3.1 Generate the Example Design

Take the VC707 as an example. Extract the source code to V7_485T_X4Gen2, as figure 3.1 shows.

[Figure 3.1: Extracted source files in folder V7_485T_X4Gen2. The listing shows the BUS_MASTER, CORE_WRAPPER, EXTEN_LIB, and USER_HW folders, the top-level demo Verilog file, the PCIE_TOP Verilog file, and the UCF constraint file.]

Open the ISE Project Navigator and use File > New Project to create a new project. Then choose the Virtex 7 VC707 Evaluation Platform in the Evaluation Development Board option to target the VC707 board, as figure 3.2 shows.

[Project Settings dialog: "Specify device and project properties. Select the device and design flow for the project." Evaluation Development Board: Virt
9. est: Assert to send an interrupt. The usr_int_req signal should stay asserted until usr_int_clr is asserted.
usr_int_vector[2:0]: Interrupt vector, indicating which interrupt will be sent. EPEE [...]
usr_int_sw_waiting[7:0], output: Indicates whether software is waiting for the interrupt to occur. A software call to block_until_interrupt causes the corresponding bit to be asserted.
usr_int_clr, output: Clear the interrupt.
usr_int_enable, output: User interrupt enable signal.

After user software calls block_until_interrupt(vector_num), it is blocked and usr_int_sw_waiting[vector_num] on the hardware side is asserted. If the hardware then sends an interrupt whose usr_int_vector[2:0] selects that vector, the software is woken up and usr_int_sw_waiting[vector_num] deasserts.

Figure 1.8 shows how an interrupt with vector 1 works. A software process calls block_until_interrupt(1) before clock edge 2, so bit 1 of usr_int_sw_waiting is asserted. After the interrupt has finished and that process has been woken up, the bit is deasserted.

[Timing diagram: clk, usr_int_enable, usr_int_req, usr_int_clr, usr_int_vector, usr_int_sw_waiting (bit 1 asserted as 8'b00000010 while software waits).]
10. [Figure 3.2: Project settings. Evaluation Development Board: Virtex 7 VC707 Evaluation Platform; Product Category: All; Family: Virtex; the device, package, and speed fields correspond to the xc7vx485t-2ffg1761 part shown in figure 3.3.]

Add all the source files in the V7_485T_X4Gen2 folder to the ISE project. The hierarchy of the source code then appears as figure 3.3 shows.

[Figure 3.3: Project hierarchy. EPEE_VC707_Demo, xc7vx485t-2ffg1761, the top-level demo module, PCIE_TOP (PCIE_TOP.v), EXTEN_LIB (EXTEN_LIB.v), USER_HW (USER_HW.v) containing usr_dma (usr_dma.v), usr_pio (usr_pio.v), and usr_int (usr_int.v), and the UCF constraint file.]

Select the top-level module demo_top_ x4gen2, then double-click Generate Programming File in the Processes window (figure 3.4).

[Figure 3.4: Generate programming file. The Processes pane lists Design Summary/Reports, Design Utilities, User Constraints, Synthesize - XST, Implement Design (Translate, Map, Place & Route), Generate Programming File, Configure Target Device, and Analyze Design Using ChipScope.]

Wait for all processes to finish. The bit file is generated in your project's directory.

3.2 Hardware Setup

Insert the VC707 board into PCI Express
11. [Figure 1.5: Timing of board2host FIFO 2.]

1.2 UCR Interface

The PIO read and PIO write buses make up the UCR interface. Users can define controllable registers with these bus signals. The PIO bus signals are listed in table 1.2.

Table 1.2 UCR interface signals

usr_pio_rd_req, output: Read request on the PIO bus.
usr_pio_rd_ack, input: Read acknowledge on the PIO bus. It should be asserted for one cycle after usr_pio_rd_req is asserted and the data is prepared in usr_pio_rd_data.
usr_pio_rd_addr[16:0]: Read address. It is only valid when usr_pio_rd_req is asserted.
usr_pio_rd_data[31:0], input: The read data. It should be prepared when usr_pio_rd_ack is asserted.
usr_pio_wr_req, output: Write request on the PIO bus.
usr_pio_wr_ack, input: Write acknowledge on the PIO bus. Assert it for one cycle when the data is written into the register.
usr_pio_wr_addr[16:0]: Write address. It is valid when usr_pio_wr_req is asserted.
usr_pio_wr_data[31:0]: Write data. It is valid when usr_pio_wr_req is asserted.

[Figure 1.6: UCR read timing. Signals: clk, usr_pio_rd_req, usr_pio_rd_ack, usr_pio_rd_addr, usr_pio_rd_data.]

[Figure 1.7: UCR write timing. Signals: clk, usr_pio_wr_req, usr_pio_wr_ack, usr_pio_wr_addr, usr_pio_wr_data.]

1.3 UDI Interface

The UDI interface supports up to 8 interrupts. The signals are listed in table 1.3.

Table 1.3 UDI interface signals

usr_int_req, input: User interrupt requ
12. [Figure 3.15: Base address registers option. The Base Address Registers page of the wizard shows BAR 0 enabled with Type Memory and Value FFF00004 (hex); the BAR 1, BAR 2, and BAR 3 options are also shown.]

For the other options, the defaults can be used.

Note: With the 7 Series Integrated Block for PCI Express, the Intel Z77 chipset cannot detect the board. This is a known issue, recorded in Xilinx AR 51135. See http://www.xilinx.com/support/answers/51135.html for details.

Note: For the KC705 evaluation board, the UCF differs between board revisions. The following line should be changed in the UCF:

INST "PCIE_TOP/CORE_WRAPPER/refclk_ibuf" LOC = IBUFDS_GTE2_X0Y1;

Revision of KC705    Location constraint (LOC)
Rev B  Rev C         IBUFDS_GTE2_X0Y1
13. su to change to the root user.

[Figure 3.7: Make the driver module. Terminal session in SW_src/sPciDriver: sudo su, then make, which builds sPciDriver.ko against the 3.2.0-60-generic kernel headers.]

3. Insert the driver module with the make_device script.

[Figure 3.8: Insert driver module into Linux kernel. Running make_device in SW_src/sPciDriver; the output lists the character device /dev/sPciDriver, owned by root.]

4. Go to the app directory: cd app

5. Make the test applications with the make.sh script.

[Figure 3.9: Compile test applications. Running make.sh in SW_src/app.]

DMA test

The DMA length ranges from 8 to 4096 bytes, increasing
14. tor. The FIFO we use is a native interface FIFO with First-Word Fall-Through.
2 The PCIe Gen2 X8 mode uses the 128-bit interface.

FIFO_board2host_din[71:0] (64-bit interface), FIFO_board2host_din[129:0] (128-bit interface), input: DMA board-to-host (DMA write) data input.
  64-bit interface: data should be transferred in bits [63:0]. Bits [65:64] indicate whether the DWs (4 bytes) in [63:32] and [31:0] are valid.
  128-bit interface: data should be transferred in bits [127:0]. Bits [129:128] indicate whether the QWs (8 bytes) in [127:64] and [63:0] are valid.
FIFO_board2host_prog_full: The FIFO's program full signal. This FIFO's depth is 512. It is configured to assert program full if it contains more than 500 items.

For the host-to-board side, taking the 64-bit interface as an example, FIFO_host2board_dout[65:64] = 2'b00 never occurs. Data within one DMA transaction is continuous; that is, FIFO_host2board_dout[65:64] can be 2'b10 only in the last DW of a DMA transaction. For example, when software transfers 2N (N = 1, 2, 3, ...) DWs (1 DW = 4 bytes) to the board, FIFO_host2board_dout[65:64] is always 2'b11, as figure 1.2 shows. When it transfers 2N+1 DWs, the last data word in the host2board FIFO has FIFO_host2board_dout[65:64] = 2'b10, indicating that FIFO_host2board_dout[31:0] is invalid in that cycle.

[Timing diagram: clk, FIFO_host2board_rd_en, FIFO_host2board_empty, FIFO_host2board_dout[65:64] (2'b11); the diagram continues on the next page.]
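The valid-bit rule for the 64-bit host-to-board interface can be modelled in a few lines of C. This is a sketch for illustration only: fifo_word_t and model_host2board are hypothetical names, not part of the EPEE sources, and the value placed in the unused lower DW of the last word is an arbitrary choice of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one word read from the host2board FIFO. */
typedef struct {
    uint32_t hi;     /* dout[63:32] */
    uint32_t lo;     /* dout[31:0]  */
    unsigned valid;  /* dout[65:64]: bit 1 marks hi valid, bit 0 marks lo valid */
} fifo_word_t;

/* Split n_dw DWs into FIFO words as the text describes.
 * Returns the number of words written to out (at most out_cap). */
size_t model_host2board(const uint32_t *dw, size_t n_dw,
                        fifo_word_t *out, size_t out_cap)
{
    assert(dw != NULL && out != NULL);
    size_t n_words = 0;
    for (size_t i = 0; i < n_dw && n_words < out_cap; i += 2) {
        fifo_word_t w;
        w.hi = dw[i];              /* the earlier DW rides in [63:32] */
        if (i + 1 < n_dw) {
            w.lo = dw[i + 1];
            w.valid = 0x3u;        /* 2'b11: both DWs valid */
        } else {
            w.lo = 0;              /* unused in the last word (model choice) */
            w.valid = 0x2u;        /* 2'b10: only [63:32] valid */
        }
        out[n_words++] = w;
    }
    return n_words;
}
```

In this model a transfer of 4 DWs produces two words that both carry 2'b11, while a transfer of 5 DWs produces three words whose last one carries 2'b10, and 2'b00 never appears.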