“I hereby declare that I have read this thesis and in my opinion this
Contents
1. Visual Basic for RFID card verification. 1. The Interface Design. [Figure: Microsoft Visual Basic design view of the Mifare project (Mifare.vbp) with the form frmScanMifare (frmScanMifare.frm) open and its Properties window displayed.]
2. case (state)
     4'b0000: if (RxD_bit_inv) state <= 4'b1000;  // start bit found?
     ...                                          // eight data-bit states, each advancing on if (next_bit)
     4'b0001: if (next_bit) state <= 4'b0000;     // stop bit
     default: state <= 4'b0000;
   endcase

   reg [7:0] RxD_data;
   always @(posedge clk)
     if (Baud8Tick && next_bit && state[3]) RxD_data <= {~RxD_bit_inv, RxD_data[7:1]};

   reg RxD_data_ready, RxD_data_error;
   always @(posedge clk) begin
     RxD_data_ready <= (Baud8Tick && next_bit && state == 4'b0001 && ~RxD_bit_inv);  // ready only if the stop bit is received
     RxD_data_error <= (Baud8Tick && next_bit && state == 4'b0001 &&  RxD_bit_inv);  // error if the stop bit is not received
   end

   reg [4:0] gap_count;
   always @(posedge clk)
     if (state != 0) gap_count <= 5'h00;
     else if (Baud8Tick & ~gap_count[4]) gap_count <= gap_count + 5'h01;
   assign RxD_idle = gap_count[4];

   reg RxD_endofpacket;
   always @(posedge clk) RxD_endofpacket <= Baud8Tick &
3. Basis block diagram of the proposed system
   Project Methodology Flow
   Processor Core Design Flow
   Quartus Design Flow
   Altera DE2 FPGA Board
   Processor Flow
   Architecture overview
   The processor four stages
   RISC processor flow
   Instruction hexcode
   GPR Structure
   The GPR Write Enable Multiplexer
   The Operand Selection
   ALU for ADD and SUB
   The multiplexing stages
   Conditional Codes
   Figure 5.10 Conditional Codes Vector
   Figure 5.11 Portion of Program Control Coding
   Figure 5.12 TTL CMOS Serial Logic Waveform
   Figure 5.13 The Asynchronous Receiver I/O
   Figure 5.14 Parameter Values
   Figure 6.1 Final.txt control program
   Figure 6.2 Timing simulation of Final.txt control program
   Figure 6.3 Last.txt control program
   Figure 6.4 Timing simulation of Final.txt control program
   Figure 6.5 VB6 RFID card verification
   Figure 6.6 Output waveform of RxD RFID Reader
   Figure 6.7 Flow of RFID card serial number verification on DE2 board
   Figure 6.8 Shifted and converted data of 12 bytes RFID card serial number
   Figure 6.9 Negedge clock solve the timing problem for read and write

   LIST OF ABBREVIATIONS
   ASIC  Application Specific Integrated Circuit
   ALU   Arithmetic Logic Unit
   CPU   Central Processing Unit
   CISC  Complex Instructi
4. Name: Returns the name used in code to identify an object. [Form Layout window]

   2. The Coding

   Private Sub cmdClearID_Click()
       txtMifareID.Text = ""
   End Sub

   Private Sub Form_Load()
       With MSComm1
           ' make sure the serial port is not open (by this program)
           If .PortOpen Then .PortOpen = False
           ' set the active serial port
           .CommPort = …
           ' set the baud rate, parity, data bits and stop bits for the connection
           .Settings = …
           ' set the DTR and RTS flags
           .DTREnable = True
           .RTSEnable = True
           ' enable the OnComm event for every received character
           .RThreshold = 1
           ' disable the OnComm event for sent characters
           .SThreshold = 0
           ' open the serial port
           .PortOpen = True
       End With 'MSComm1
   End Sub

   Private Sub MSComm1_OnComm()
       Dim strInput As String
       …
       With MSComm1
           ' test for incoming event
           Select Case .CommEvent
               Case comEvReceive
                   ' display incoming event data in the display textbox
                   strInput = .Input
                   txtMifareID.SelText = Trim(strInput)
           End Select
       End With 'MSComm1
   End Sub
5. I hereby declare that I have read this thesis and in my opinion this thesis is sufficient in terms of scope and quality for the award of the degree of Bachelor's Degree of Electrical Engineering (Microelectronic). Signature: … Name of Supervisor: Dr Ooi Chia Yee. Date: …

   A DESIGN OF PROCESSOR CORE FOR RFID IMPLEMENTATION. AMIR ZAKI BIN AMRAN. A report submitted in partial fulfillment of the requirements for the award of the degree of Bachelor of Electrical Engineering (Microelectronic). Faculty of Electrical Engineering, Universiti Teknologi Malaysia. MAY 2009.

   Specially dedicated to my beloved parents, Amran Bin Jamaludin and Norzakiah Binti Ahmad, and all my friends, for their never-ending support.

   ACKNOWLEDGEMENTS
   I would like to thank my supervisor, Dr Ooi Chia Yee, for her help and guidance over the past two semesters. It is not easy to supervise a topic proposed by a student, but she took on the task and has been a steady source of support throughout the project. Her reasoning has always been beneficial to this project and I am thankful to her. My appreciation also goes to my family, who have supported me all these years. I am grateful for their encouragement, love, steady patience, prayers and financial support, and for having faith in me. I would also like to offer my special thanks to my colleagues Logeish Al Raj an
6. [Timing simulation waveform; values not recoverable]
   APPENDIX L: Twelve Stages of Shift Register [schematic; not recoverable]
   APPENDIX M: Converter ASCII to 7 Segment
   module ASCII27SEG(…) [remainder of the listing not recoverable]
7. Table 5.1: The decoder organization (output, bit width, notation)
   op (4 bits): used to distinguish all the instructions from the assembly code.
   rd: the address of the Destination Register for further operation, such as in the Register File.
   rs: the address of the Source Register for further operation, such as in the Register File.
   Immediate (8 bits): since the Register File, PORT and ROM registers are in 8-bit format, the immediate value is also 8 bits, to keep all processes consistent.
   Displacement: the displacement value is used in branching and jump instructions; the value is the reference value of the …

   5.6 Register File
   In most processor cores the GPR is a very important component. Each instruction reads a maximum of two registers, rd and rs, and some instructions need to write back one result to the rd register. This processor uses only 16 registers to complete all of its instruction execution. Each general purpose register is 8 bits wide. Figure 5.4 shows the structure of the 16 general purpose registers.
   [Figure 5.4: GPR Structure]
   The rd and rs fields select the registers that are read out onto the dreg and sreg buses, which are then used as inputs to the ALU. The data bus is connected directly to the destination register for the write-back process. As shown in Figure 5.5, the write enable for the GPR is multiplexed to distinguish ROM PORT register, JMP, CMP and SB, because these four instructions are not used to write a result back to a register.
8. The input of the shift register comes from the RxD asynchronous receiver. The control signal that moves each 8-bit value from stage to stage is the data-ready signal of the asynchronous receiver.

   6 RESULTS AND DISCUSSIONS
   This chapter discusses the results obtained from the work done.

   6.1 Results
   6.1.1 The Processor Core
   The processor was tested with two control programs. The first control program is Final.txt, shown in Figure 6.1. The objective of testing the processor core with Final.txt is to determine whether the processor core calculates the correct values. Figure 6.2 shows that the processor core operates correctly with all of the designed instructions.
   [Figure 6.1: Final.txt control program (Notepad listing; contents not recoverable)]
   [Simulation waveform, timing mode; signals include pcinc and next_pc; values not recoverable]
9. … // negative
   assign co = add ? c_W : ~c_W;
   assign v = c_W ^ sum[7] ^ a[7] ^ b[7];  // overflow
   [Figure 5.9: Conditional Codes]
   The ALU also determines the condition codes ZNCV (zero, negative, carry and overflow), as shown in Figure 5.9. These are used by any conditional branch instruction that may follow. The zero condition is self-evident, and any two's complement number is negative if its most significant bit is set. Carry is the carry out of the most significant bit of the add/sub operation, complemented for subtracts. For overflow detection, the carry out is XORed with the most significant bits of a, b and sum; an overflow has occurred when the result is high. The values of z, n, co and v are then captured into the conditional codes vector (ccz, ccn, ccc and ccv) as each instruction completes, as shown in Figure 5.10. If the rst signal is asserted, the conditional codes go to zero; otherwise, if the instruction is valid, z, n, co and v are transferred into ccz, ccn, ccc and ccv.
   reg ccz, ccn, ccc, ccv;  // conditional codes vector
   always @(negedge clk)
     if (rst) {ccz, ccn, ccc, ccv} <= 0;
     else if (valid_insn_ce) {ccz, ccn, ccc, ccv} <= {z, n, co, v};
   [Figure 5.10: Conditional Codes Vector]

   5.9 ROM Register
   The ROM register operates much like the GPR. The difference is that the ROM registers provide registers for memory and register transfers. The memory discussed here is the real ROM, which contains ROM non volat
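The condition code rules above can be checked with a small behavioural model. The following Python sketch is not the thesis's Verilog. It assumes that a subtract is computed as a 9-bit a - b so that the raw ninth bit (c_w) is a borrow, and the function name alu_add_sub is hypothetical.

```python
def alu_add_sub(a, b, add=True):
    """Behavioural model of an 8-bit add/subtract with Z, N, C, V flags.

    Assumption: subtract is modelled as a 9-bit (a - b), so the raw ninth
    bit is a borrow and the reported carry is its complement.
    """
    a &= 0xFF
    b &= 0xFF
    total = (a + b) if add else (a - b)
    total &= 0x1FF                      # keep 9 bits: carry/borrow plus 8-bit result
    c_w = (total >> 8) & 1              # raw carry (add) or borrow (subtract) out of bit 7
    result = total & 0xFF
    flags = {
        "z": int(result == 0),          # zero
        "n": (result >> 7) & 1,         # negative: MSB of the result
        "co": c_w if add else c_w ^ 1,  # carry out, complemented for subtracts
        "v": c_w ^ (result >> 7) ^ (a >> 7) ^ (b >> 7),  # overflow
    }
    return result, flags
```

For example, adding 0x7F and 0x01 gives 0x80 with n and v set, which is the signed overflow case the XOR expression is meant to catch.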
10. … disp

    APPENDIX D: Register File Verilog Module

    module RegFile16x8bit(rd, rs, data, we, clk, dreg1, sreg1, dreg2, sreg2);
      input  [3:0] rd, rs;
      input        we, clk;
      input  [7:0] data;
      reg    [7:0] mem [0:15];
      reg    [7:0] dreg1, sreg1;
      output [7:0] dreg1, sreg1, dreg2, sreg2;

      always @(negedge clk) begin
        if (we) begin
          mem[rd] = data;
        end
        dreg1 = mem[rd];
        sreg1 = mem[rs];
      end

      assign dreg2 = mem[rd];
      assign sreg2 = mem[rs];
    endmodule

    APPENDIX E: ALU and Conditional Codes Verilog Module

    module ALUandCC(…);
      input add; …
      input  [7:0] a, b;
      output [7:0] sum; …
      input valid_insn_ce; …

      // condition codes
      wire c_W;
      assign z  = …;
      assign n  = …;   // negative
      assign co = …;   // carry out
      assign v  = …;   // overflow

      always @(negedge clk)
        if (rst) … <= 0;
        else if (valid_insn_ce) …
    endmodule

    APPENDIX F: ROM Register Verilog Module

    module ROM16x8bit(clk, we, data, addr, ROMout1, ROMout2);
      input clk, we;
      inpu
11. [Timing simulation waveform; values not recoverable]
12. ARCHITECTURE
      4.1 Central Processing Unit
      4.2 Instruction Set
    CHAPTER 5 PROCESSOR CORE DESIGN
      5.1 Overview
      5.2 Architecture Overview
      5.3 Program Counter
      5.4 Instruction Memory
      5.5 Decoder
      5.6 General Purpose Register
      5.7 Operand Selection
      5.8 Arithmetic Logic Unit and Conditional Codes
      5.9 ROM Register
      5.10 PORT
      5.11 Universal Asynchronous Receiver Transmitter and Shift Register
    CHAPTER 6 RESULTS AND DISCUSSIONS
      6.1 Results
        6.1.1 Processor Core Simulation
        6.1.2 Implementation of RFID on UART
      6.2 Discussions
        6.2.1 Instruction Set
        6.2.2 Memory Initialization File
        6.2.3 Negedge Clock
        6.2.4 RFID and Receiver Design System
    CHAPTER 7 CONCLUSION AND RECOMMENDATIONS
      Recommendation for Future Works
      Conclusion
    REFERENCES
    APPENDICES

    LIST OF TABLES
      Table 4.1 Six instruction formats
      Table 4.2 Designed Instructions
      Table 5.1 The decoder organization
      Table 5.2 Instruction grouping

    LIST OF FIGURES
      Figure 1.1, Figure 3.1, Figure 3.2, Figure 3.3, Figure 3.4, Figure 4.1, Figure 5.1, Figure 5.2, Figure 5.3, Figure 5.4, Figure 5.5, Figure 5.6, Figure 5.7, Figure 5.8, Figure 5.9
13. … file of GPR
    wire rf_we = … `JMP ? 0 : … `CMP ? 0 : valid_insn_ce & ~rst;
    [Figure 5.5: The GPR Write Enable Multiplexer]

    5.7 Operand Selection
    Reviewing the instruction set architecture of Section 4.2, we can see that several instruction formats influence the inputs of the ALU. In the rr and ri formats, all instructions have two operands: one is either an immediate constant or the register selected by rs, and the other is the register selected by rd. The memory-register formats rm and mr also have two operands: one is the output from the memory and the other is rd or rs. From this we obtain operands a and b, as shown in Figure 5.6.
    wire [N:0] a = (`LB | `CLR | `SB) ? 0 : dreg2;
    wire [N:0] b = `LB ? ROMout2 : (`ADDI | `SUBI | `MOV) ? imm : sreg2;
    [Figure 5.6: The Operand Selection]
    Operand a is either the data from the register file addressed by rd, or 0. Zero is selected only for the LB, CLR and SB instructions. Operand b is the data from ROMout2, the immediate value or the rs data. For LB, which loads ROM register data into the GPR, the output from the ROM register is needed. For the ri format and the MOV instruction, an immediate value is taken into the ALU. For all other instructions, the rs data is taken from the GPR.

    5.8 Arithmetic Logic Unit (ALU) and Conditional Codes
    With the 8-bit operands a and b, we can perform arithmetic and logic operations. The add, subtract and logic units operate concurrently upon two
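The operand selection rules in Section 5.7 amount to two small multiplexers. The Python sketch below models them from the prose description only; the function name select_operands and the use of mnemonic strings for the opcode are assumptions made for illustration.

```python
ZERO_A_OPS = {"LB", "CLR", "SB"}      # operand a is forced to 0 for these
IMM_B_OPS = {"ADDI", "SUBI", "MOV"}   # operand b is the immediate for these

def select_operands(op, rd_data, rs_data, imm, rom_out):
    """Return the 8-bit ALU operands (a, b) for the mnemonic op."""
    a = 0 if op in ZERO_A_OPS else rd_data
    if op == "LB":
        b = rom_out          # load: take the ROM register output
    elif op in IMM_B_OPS:
        b = imm              # ri format and MOV: take the immediate value
    else:
        b = rs_data          # all other instructions: take the rs data
    return a & 0xFF, b & 0xFF
```

With this model, an ADD passes the rd and rs data through unchanged, while LB yields a zero operand a and the ROM output as operand b, so the adder simply forwards the loaded byte.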
14. … (gap_count == 5'h0F);
    endmodule

    APPENDIX I: Control Program to Test (Assembly Language)
    1. Control Program for Final.txt [Notepad listing of Final.txt; contents not recoverable]
    2. Control Program for Last.txt [Notepad listing of Last.txt; contents not recoverable]

    APPENDIX J: Control Program Hexcode (Machine Language)
    [Notepad listing of Final hexcode.txt; contents not recoverable]
15. … The hardware description language for this project is written in Verilog using Altera Quartus II 9.1. As an application of this processor core, the design will be used on an RFID reader. The processor core has produced good results based on the instructions designed earlier, and it satisfies all the requirements of those instructions. It can therefore be concluded that this processor core is simple yet highly reliable, and that it is the best alternative for processor designers whose design requirements change often.

    TABLE OF CONTENTS
    DECLARATION OF THESIS
    ACKNOWLEDGEMENTS
    ABSTRACT
    ABSTRAK
    TABLE OF CONTENTS
    LIST OF TABLES
    LIST OF FIGURES
    LIST OF ABBREVIATIONS
    LIST OF APPENDICES
    CHAPTER 1 INTRODUCTION
      1.1 Background
      1.2 Basis of Project
      1.3 Problem Statement
      1.4 Problem Objective
      1.5 Scope of Project
    CHAPTER 2 LITERATURE REVIEW
      2.1 Discrete Processor
      2.2 Hard Processor Core
      2.3 Soft Processor Core
      2.4 Customizable Processor Core
      2.5 Board level Configurability
      2.6 Multiple Processors
      2.7 Complex Instruction Set Computer (CISC)
      2.8 Reduced Instruction Set Computer (RISC)
    CHAPTER 3 PROJECT METHODOLOGY
      3.1 Introduction
      3.2 Design Flow
      3.3 Altera Quartus
      3.4 Verilog HDL
      3.5 Altera DE2 Board
    CHAPTER 4 INSTRUCTION SET
16. … of Shift Register
    APPENDIX M: Converter ASCII to 7 Segment
    APPENDIX N: Asynchronous Receiver System to catch 12 data bytes
    APPENDIX O: Visual Basic for RFID card verification

    CHAPTER 1 INTRODUCTION
    This chapter gives an overview of the project, including its background and basic idea.

    1.1 Background
    Soft-core processors on field programmable gate array (FPGA) chips are becoming an increasingly popular software implementation platform because of their custom logic. A soft-core processor is synthesized onto the FPGA's fabric. On an FPGA device, soft-core processors have the advantages of using standard, mass-produced and hence lower-cost FPGA parts, and of allowing a custom number of microprocessors per FPGA: over 100 soft-core processors can fit on modern high-end FPGAs, as described by David Sheldon, Rakesh Kumar, Roman Lysecky, Frank Vahid and Dean Tullsen (2006). FPGA soft-core processors have instruction sets, arithmetic logic units, register files and other features specifically tailored to use FPGA resources efficiently, and they can be reconfigured. The reconfigurability of FPGAs gives FPGA designers an advantage over ASIC designers: they can tune, develop, debug and test the processor configuration much faster and more accurately, using simulation to enhance the processor. The flexibility of FPGAs provides unique opportunities in FPGA processor design. A FPGA designer can change their FP
17. operands. Then a multiplexer selects one of these destinations to write the result of the instruction. Table 5.2 shows how the ALU is organized. The rf_we signal is asserted if the result of the ALU will be written back to the destination register. Note that rf_we is a register control signal, while the weX signal is asserted only for the SB instruction.
    Table 5.2: Instruction grouping [table garbled in the source; legible group names include ADD, SUB, AND, MEMREG, CLEAR (CLR), MOVE and BRANCH/JUMP (BEQ, JMP)]
    The most straightforward way to code the ADD and SUB groups is shown in Figure 5.7. The add signal comes from the processor core and is determined by the instruction, such as ADDI or ADD. The cout signal is asserted when there is a carry out in the addition, as shown in Figure 5.7.
    assign {cout, sum} = add ? … ;
    [Figure 5.7: ALU for ADD and SUB]
    There are five logic unit operations: AND, OR, NOT, XOR and XNOR. To keep the design simple, a five-stage multiplexer is needed to distinguish the five logic operations. The result of each logic operation is stored in a destination register. The flow of these operations is shown in Figure 5.8.
    [Figure 5.8: The multiplexing stages (…, OR, NOT, XOR, XNOR, WRITING)]
    // condition codes
    assign z = (sum == 0);
    assign n = sum[7];
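The five logic operations and the multiplexer that picks one of them can be modelled in a few lines. This Python sketch is illustrative only: the name logic_unit is hypothetical, and it assumes that NOT inverts operand a, which the text does not state.

```python
# Each entry models one logic unit; the dictionary lookup plays the role
# of the multiplexer that selects which result is written back.
LOGIC_OPS = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "XOR":  lambda a, b: a ^ b,
    "XNOR": lambda a, b: ~(a ^ b),
    "NOT":  lambda a, b: ~a,        # assumption: NOT acts on operand a
}

def logic_unit(op, a, b=0):
    """Return the 8-bit result of the selected logic operation."""
    return LOGIC_OPS[op](a & 0xFF, b & 0xFF) & 0xFF
```

Masking the result with 0xFF keeps the inverted values within the 8-bit register width used by the GPR.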
18. [Timing simulation waveform (continued). Signals shown include branch, jump, insn, rs, sreg2, rd, dreg2, imm, bus writing, sum, addr and ROMout2; the insn trace begins 7111, 7222, 7333, 7444, 7555. The remaining values are not recoverable.]
19. … 8, we can see that the data received on the asynchronous receiver, after being shifted and converted, is only STX (02h) for a few shift register stages, while some stages contain no data at all. Thus the verification of the RFID card serial number on the DE2 FPGA failed, and the work of having the processor core control the RFID reader could not proceed. This must be due either to the fast multiple-byte data transfer of 9600 bps at 5 …, or to the shift register not working well when implemented in a real application. The module was also tried at a slower baud rate and with a 25 MHz clock, but the result on the 7-segment display was worse: nothing was displayed. When the asynchronous receiver was used to receive one data byte the result was correct, but for multiple-byte transfers the data received was not correct. It is also possible that the one-clock control signal used to shift the received data is not sufficient. As the example output waveform of the 12-byte RFID data on RxD shows, the receiver is in asynchronous mode, where the data is sent without a clock. The transition of each byte is therefore not synchronized with the clock, while the control signal of the shift register is. This problem could be solved more easily with a Universal Synchronous Asynchronous Receiver Transmitter (USART), where a clock synchronizes the data transitions.

    CHAPTER 7 CONCLUSION AND RECOMMENDATIONS
    This chapter explains the project conclusion and the recommendation for future
20. [Timing simulation waveform; values not recoverable]
    Figure 6.4: Timing simulation of Final.txt control program

    6.1.2 Implementation of Radio Frequency Identification (RFID) Mifare Type Technology and UART
    The processor core was intended to be implemented with an RFID reader in order to control it, so that the processor can save the data sent out from the RFID reader into the ROM register. Before the processor can control the RFID reader, however, it must first be tested whether the receive UART of the DE2 FPGA receives correct data from the RFID reader. The RFID system has two main parts, the RFID card and the RFID reader. The RFID card contains a 10-digit decimal number as its serial number. These 10 digits are sent out from the RFID reader in the format STX, 10 decimal digits of data, ETX (12 bytes). The format is in ASCII when it is sent out to the UART of Receiver F
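The 12-byte frame described above (STX, ten ASCII digits, ETX) and the twelve-stage shift register that collects it can be modelled as follows. This is a Python sketch under stated assumptions, not the thesis's hardware: STX is 02h as reported in the text, ETX is taken as the standard ASCII value 03h, the names ShiftRegister12 and parse_frame are hypothetical, and the serial number used in the usage note is made up.

```python
from collections import deque

STX, ETX = 0x02, 0x03   # ASCII start-of-text and end-of-text
FRAME_LEN = 12          # STX + 10 decimal digits + ETX

class ShiftRegister12:
    """Twelve 8-bit stages; one byte moves in per data-ready pulse."""

    def __init__(self):
        self.stages = deque([0] * FRAME_LEN, maxlen=FRAME_LEN)

    def shift_in(self, byte):
        # Appending to a full deque drops the oldest byte off the far end.
        self.stages.append(byte & 0xFF)

    def frame(self):
        return bytes(self.stages)

def parse_frame(frame):
    """Return the 10-digit serial number, or None if the frame is malformed."""
    if len(frame) != FRAME_LEN or frame[0] != STX or frame[-1] != ETX:
        return None
    digits = frame[1:-1].decode("ascii", errors="replace")
    return digits if digits.isdigit() else None
```

Shifting the bytes of a hypothetical frame such as STX, "0123456789", ETX into the register and calling parse_frame on the result yields the ten digits, while a register filled only with STX bytes is rejected.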
21. … DE2 FPGA board with the RFID reader connected, the data received on the UART is only STX, which is 02 in ASCII. This is not the desired operation. Figure 6.8 shows the result when the RFID card is touched onto the RFID reader.
    [Figure 6.8: Shifted and converted data of 12 bytes RFID card serial number]

    6.2 Discussion
    6.2.1 Instruction Set
    A set of 16 instructions was designed. It consists of addition, subtraction, loading and storing data from memory to register or register to memory, branching and jumping, and logic operations. All instructions except JMP and BEQ use data from the Register File and memory data (ROM register). The most difficult instruction to design is BEQ, because a few steps need to be carried out. First, it needs to check whether the instruction executed before BEQ set the zero flag high in the conditional codes. If so, the branch can be executed. The displacement value pointed to by the BEQ instruction in the control program is in 8-bit format; it is then extended to 16 bits to follow the format of the program counter. The easiest instructions are subtraction and addition, because these two operations only need to consider two data values from either the ROM register (memory) or the Register File. After execution in the ALU, the result is written back to the desired location.

    6.2.2 Memory Initialization File
    The control program is first written in text format. It is then converted to hexcode, as the processor core only understands machine language. Altera Quartus has a special function named MIF to init
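Since the thesis converts the control program to hexcode by hand, the last step of producing a Memory Initialization File can be sketched in Python. The function name to_mif and the memory depth are assumptions; the instruction words in the usage note are the first values visible in the simulation trace (7111, 7222).

```python
def to_mif(words, width=16, depth=None):
    """Render a list of instruction words as Quartus MIF text."""
    depth = len(words) if depth is None else depth
    if len(words) > depth:
        raise ValueError("program does not fit in memory")
    digits = (width + 3) // 4            # hex digits per word
    lines = [
        f"WIDTH={width};",
        f"DEPTH={depth};",
        "ADDRESS_RADIX=HEX;",
        "DATA_RADIX=HEX;",
        "CONTENT BEGIN",
    ]
    for addr, word in enumerate(words):
        lines.append(f"  {addr:X} : {word:0{digits}X};")
    if len(words) < depth:
        # Fill the unused addresses with zeros using an address range.
        lines.append(f"  [{len(words):X}..{depth - 1:X}] : {0:0{digits}X};")
    lines.append("END;")
    return "\n".join(lines)
```

Calling to_mif([0x7111, 0x7222], depth=4) produces a four-word file whose first two addresses hold the program and whose remaining addresses are zero-filled.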
22. … bus_writing, dreg2, …, a, sum, …, addr, ROMout2, weY, PortOUTdataout, …);
      input clk, rst; …
      input [7:0] RXDATA;
      assign hit = 1;

      parameter N  = 7;    // register MSB
      parameter AN = 15;   // address MSB
      parameter IN = 15;   // instruction MSB

      // opcode decoding
      `define ADDI (op == 0)
      `define SUBI (op == 1)
      `define ADD  (op == 2)
      `define SUB  (op == 3)
      `define LB   (op == 4)
      `define SB   (op == 5)
      `define CMP  (op == 6)
      `define MOV  (op == 7)
      `define AND  (op == 8)
      `define XOR  (op == 9)
      `define JMP  (op == 10)
      `define BEQ  (op == 11)
      `define XNOR (op == 12)
      `define OR   (op == 13)
      `define CLR  (op == 14)
      `define NOT  (op == 15)

      // instruction memory fetch
      … fetch(next_instruction, clk, insn);
      wire [15:0] insn;
      wire [15:0] next_instruction, next_pc;

      // DECODE PROCESS
      Decoder dec(insn, op, rd, rs, imm, disp);
      …

      // Register file
      wire [N:0] dreg2, sreg2;
      wire valid_insn_ce = hit & insn_ce;   // the instruction is valid
      …
      wire rf_we = … `JMP ? 0 : … valid_insn_ce & ~rst;
      RegFile16x8bit regfile(rd, rs, writing, rf_we, clk, dreg1
23. GA processor configuration whenever design requirements change. An ASIC designer cannot change their ASIC processor configuration without creating a new ASIC. Jari Nurmi (2007) explains in his book that the challenge of FPGA processor design is to accommodate the different relative performance of FPGA resources such as logic elements, RAMs, multipliers and routing, because FPGA manufacturers do not all produce the same chips and there is a wide range of FPGA chips nowadays. However, soft-core processors have the disadvantages of reduced processor performance, higher power consumption and larger size. Processor design is not rocket science and is no longer the exclusive realm of elite designers in large companies. Jan Gray (2000) said that FPGAs are now large and fast enough for many embedded systems, with processor core speeds in the 33 to 100 MHz range. HDL synthesis tools and FPGA place-and-route tools are now fast and inexpensive, and open source software tools help to bridge the compiler chasm.

    1.2 Basis of Project
    The idea behind this project is to design a processor core that will be implemented on an FPGA. Then, with dedicated pins acting as a PORT, the processor core can interact with peripherals such as an RFID reader.
    [Figure 1.1: Basis block diagram of the proposed system (Processor Core > FPGA > Peripherals)]

    1.3 Problem Statement
    A microcontroller such as the PIC from Microchip usually uses 40% of the total instructions, making the other 60% of th
24. [Timing simulation waveform; values not recoverable]
25. [Notepad listing of Last hexcode.txt; contents not recoverable]

    APPENDIX K: Timing Simulation of Control Program
    1. Control Program for Final.txt [timing simulation waveform; values not recoverable]
    2. Control Program for Last.txt [timing simulation waveform; values not recoverable]
26. [Timing simulation waveform; signals include rdy, weY and PortOUTdataout; values not recoverable]
    Figure 6.2: Timing simulation of Final.txt control program
    [Figure 6.3 listing: Last.txt in Notepad. The listing references PORTOUT, a checkport label that reads PORTIN, and the transfers RXDATA1 to RXDATA12 through RD1 to RD12 into ROM1 to ROM12; the exact instructions are not recoverable. Program comments, translated from Malay: "pc to process data to ROM reg", "pc to loop back to check port", "start saving data to ROM reg".]
    Figure 6.3: Last.txt control program
    The processor core was then tested with the second control program, Last.txt. Last.txt shows the process when the UART detects 12 bytes of data from the RxD asynchronous receiver (RXDATA). These 12 data bytes are first moved into the Register File and then moved again to ROM memory. From Figure 6.4 we see that the data are moved correctly: RXDATA1 to RXDATA12 are taken from the shift register to ROM1 to ROM12 in memory.
    [Timing simulation waveform; values not recoverable]
27. Table 4.1: Six instruction formats [table rows not recoverable; the last format listed is jmp]
    These formats are simple and useful for a beginner in FPGA soft-core processors. The first format, rr, is for instructions that handle register-to-register operations. Instruction bits 12 to 15 are the opcode section; every instruction has a different opcode value so that the instructions can be told apart. Instruction bits 8 to 11 are set to 0 so that the rr format can be distinguished from the other instruction formats. There are two kinds of register here: rs is the source register, whose address is held in bits 4 to 7, and rd is the destination register, whose address is held in bits 0 to 3. These two registers hold the data for the ALU operands. For example, the ADD instruction takes the data from rs and adds it to the rd data to complete the arithmetic operation, then writes the result back to rd. The second format differs little: instruction bits 4 to 11 become the immediate value, and rd is unchanged. The best example of the second format is ADDI, where an immediate value is added to the register data value and stored into rd. The rm and mr formats are specially designed to let GPR data interact with the data memory register, so loading and storing data can be done easily. There is one limitation, however: when using the SRAM, SDRAM and Flash memory on the Altera DE2 board, only 8 bits of data can be accessed at one time. The value of
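The bit positions given above for the rr and ri formats are enough to sketch a field decoder. The Python below is a behavioural illustration, not the thesis's Decoder module. The function name decode is hypothetical, the opcode numbering is taken from the opcode definitions in the Appendix A listing, and the displacement field is left out because its position is not given in this excerpt.

```python
# Opcode numbering 0..15 as listed in the processor core's opcode definitions.
OPCODES = ["ADDI", "SUBI", "ADD", "SUB", "LB", "SB", "CMP", "MOV",
           "AND", "XOR", "JMP", "BEQ", "XNOR", "OR", "CLR", "NOT"]

def decode(insn):
    """Split a 16-bit instruction word into its fields."""
    insn &= 0xFFFF
    return {
        "op": OPCODES[(insn >> 12) & 0xF],  # bits 15..12: opcode
        "rs": (insn >> 4) & 0xF,            # bits 7..4: source register (rr format)
        "rd": insn & 0xF,                   # bits 3..0: destination register
        "imm": (insn >> 4) & 0xFF,          # bits 11..4: immediate (ri format)
    }
```

Decoding the word 7111 from the simulation trace gives MOV with rd = 1 and an immediate of 11h, which matches the imm values visible in the waveform residue.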
28. … can work nicely. It is possible for mere mortals to build a compact, reasonably fast embedded processor, or even a complete system on a chip, in a small fraction of a small FPGA, if the processor and system are designed to make the best use of the FPGA.

    REFERENCES
    Dr Mohamed Khalil Hani. Digital Systems: VHDL & Verilog Design. 2nd edition. UTM Skudai, Malaysia: Prentice Hall, 2007.
    D. Sulik, M. Vasilko, D. Durackova and P. Fuchs. Design of a RISC Microcontroller Core in 48 Hours. Bournemouth University, UK.
    Jari Nurmi. Processor Design: SoC Computing for ASICs and FPGAs. Tampere University of Technology, Finland: Springer, 2007.
    Don Arbinger and Jeremy Erdmann. Designing an Embedded Soft-core Processor. The Plexus Technology Group, 2006.
    Jan Gray. Designing a Simple FPGA-Optimized RISC CPU and System-on-a-Chip. Gray Research LLC, Bellevue, 2000.
    Yap Zi He. Building a RISC Microcontroller in an FPGA. UTM Skudai, 2002.
    David Sheldon, Rakesh Kumar, Roman Lysecky, Frank Vahid and Dean Tullsen. Application-specific Customization of Parameterized FPGA Soft-core Processors. … 06, San Jose, 2006.
    DE2 Development and Education Board User Manual, Version 1.4. Altera, 2006.
    Gareth Knight. CISC vs RISC. http://www.amigau.com/aig/riscisc.html

    APPENDIX A: The Processor Core Verilog Module
    module …(clk, rst, RXDATA, insn, …
29. cess from assembly language to hex code is manual; there is no assembler.

3.3 Altera Quartus II

Quartus II is free software provided by Altera. It has many functions, so it is used to design the processor core: it includes its own compiler, simulator, waveform editor and programmer.

Figure 3.3 Quartus Design Flow [labels recovered from the figure: Design Entry (including block-based design, system-level design and software development), Power Analysis, Place & Route, Debugging, Engineering Change Management, Timing Analysis, Simulation, Timing Closure, Programming & Configuration]

3.4 Verilog HDL

Verilog is a hardware description language (HDL) used to model electronic systems. It is used to design, verify and implement digital logic chips at the Register Transfer Level (RTL) of abstraction. The Altera Quartus II compiler uses the Verilog-2001 standard.

3.5 Altera DE2 Board

Figure 3.4 Altera DE2 FPGA Board

The FPGA device used in my design is the Altera Cyclone II 2C35 on the Altera DE2 board. The DE2 board has many features that allow a wide range of circuits to be implemented. The features of the Cyclone II 2C35 that relate to my Verilog system design are:

• 33,216 Logic Elements (LEs)
• 105 M4K RAM blocks
• 483,840 total RAM bits
• 35 embedded multipliers
• 4 Phase-Locked Loops (PLLs)
• 475 user I/O pins

CHAPTER 4 INSTRUCTION SET ARCHITECTURE

This chapter gives the inf
30. concerned, because the design might not meet the requirements of the FPGA device. Then, to use the FPGA to control the RFID reader, the I/O of the RFID reader and the data contained in the RFID card must be studied so that the correct data can be extracted. After these three stages are done, the design of the processor core can start.

3.2 Design Flow

Figure 3.2 Processor Core Design Flow [stages shown: Technical Specification, Architectural Specification, Abstract Instruction Set, Instruction Set Design and Coding, Architecture Design, FPGA Implementation, Testing, Complete System]

Figure 3.2 shows the processor core design flow. The design flow can be divided into two main parts: first the microcontroller design in Verilog HDL, and second the FPGA implementation.

There are eight stages in the design flow that completes the processor core design. The first step is to define the technical specification by capturing the requirements for the processor. In this project it is a general processor core, yet it can be used to control a specific device such as an RFID reader. The next step is to define the microcontroller's architectural specification, that is, which features the microcontroller must have, such as a Universal Asynchronous Receiver Transmitter (UART). The next big step, which determines the strength of the processor core, is to draw up an abstract instruction set, or prototype instructions, that supports efficient execution of the known algorithms. We als
31. d Muhammad Juffri for their advice and for lending a helping hand, which greatly helped make this project and thesis a reality.

ABSTRACT

Designing a processor core on an FPGA board is no longer rocket science, and the approach is popular because it offers advantages that an ASIC does not have. This project is mainly about designing an FPGA-based processor core using the Altera DE2 board's Cyclone II FPGA device. This work covers only the complete method of designing the instruction set and the architecture of the processor. The design is written in the Verilog hardware description language using Altera Quartus II 8.1. As an application of this FPGA-based processor core, the design was intended to control an RFID reader. The processor core produced good results for the instructions designed: all of them performed the desired operations. It can be concluded that the processor core design is simple but reliable, and a great alternative for processor designers whose design requirements always change.

ABSTRAK

The design of a processor core on an FPGA is no longer something out of the ordinary; moreover, it is very popular because of its advantages and flexibility, which an ASIC lacks. The project covers only the design of an FPGA-based processor using the Altera DE2 FPGA Cyclone device. This work deals only with the complete design of a processor core based on the instruction set that has been built. Meanwhile
32. e instruction since it is not used for the control program coded. More complex microcontrollers require more transistors and more design time, making them more expensive to manufacture. With the proposed soft-core processor, we can run a control program, such as one for an RFID reader, with a reduced instruction set.

1.4 Problem Objective

From the problem statement, I have derived one objective: to design a soft-core processor with a set of instructions and a few peripherals that acts as a microcontroller processor core on an FPGA board, where it can run a control program for a targeted device, an RFID reader.

1.5 Scope of Project

The aim of the project is to design a soft-core processor that can run a control program for the RFID reader. The soft-core processor must fit into the targeted FPGA device, which is the Altera Cyclone II 2C35 provided on the Altera DE2 Education Board. The HDL used to write the processor core is Verilog HDL.

CHAPTER 2 LITERATURE REVIEW

This chapter compiles research, information, articles and theories on the specific parts, components and systems that make up the whole project. It highlights the basic concepts and fundamental theories of each chosen part.

2.1 Discrete Processor

A discrete off-the-shelf (OTS) microprocessor solution is the traditional approach that designers have used. These types of processors are available from a
33. ialize desired data and then it can be called. To make this easier to understand: the MIF is used to store the control program hex code, which is then loaded with the $readmemh system task in a specific design module. In my design, the MIF of the control program hex code is loaded in module inst_mem_mushy.

6.2.3 Design of Negedge Clock

At first, all clocked Verilog logic in the design used the positive clock edge (posedge), and the design was simulated functionally. Under functional simulation, operations such as Register File reads and writes were wrong. Timing simulation then showed a gap between the positive clock edge and the moment the instruction executed, so Register File reads and writes happened one clock late and used the wrong address. I overcame this problem by changing every posedge clock condition to negedge. This made reads and writes correct for every process, not only for the Register File. Figure 6.9 shows the correct read and write operation.

Figure 6.9 Negedge clock solves the timing problem for read and write

6.2.4 The RFID and Receiver Design System

The design consists of an asynchronous receiver, a 12-stage shift register, a converter and a 7-segment display, to retrieve 12 data bytes and display them. However, the system does not work. From Figure 6
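As a rough illustration of loading program hex code with $readmemh, the sketch below initializes a small instruction memory from a text file. The module name, the file name program.hex and the depth of 256 words are assumptions made for the example; they are not the values used in inst_mem_mushy.

```verilog
// Illustrative sketch; names, file name and depth are assumptions.
module rom_readmemh_sketch (
    input  wire [7:0]  addr,      // word address (256 words assumed)
    output wire [15:0] data_out   // 16-bit instruction at that address
);
    reg [15:0] mem [0:255];

    // Load one hexadecimal word per entry from a plain text file
    // at the start of simulation (or at synthesis, where supported).
    initial $readmemh("program.hex", mem);

    // Asynchronous read: the addressed word is always visible.
    assign data_out = mem[addr];
endmodule
```

The file is expected to hold whitespace-separated hexadecimal words, for example one 4-digit word per line. If it holds fewer words than the memory depth, the remaining entries stay uninitialized in simulation.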
34. ile data. Non-volatile means that the content of the memory remains even when the power is switched off. The instructions that use this ROM register are LB and SB.

5.10 PORT

There is one 8-bit, one-directional output port in the design, intended to drive outputs such as LEDs. The output has its own register in the RTL design, and the data bus is connected directly to this register. When data is written to the PORT, the data is taken from the data bus and the write signal to the PORT's register is asserted. Figure 5.10 shows a portion of the control program that writes data to the PORT register.

MOV 11 RD1
MOV 01 PORT
ADDI 03 RD4

Figure 5.10 Portion of Program Control Coding

5.11 Universal Asynchronous Receiver Transmitter (UART) and Shift Register

The UART is implemented for RS-232 communication. As Figure 5.11 shows, RS-232 communication is asynchronous: the clock signal is not sent with the data. Each word is synchronized using its start bit, and an internal clock on each side keeps track of the timing. This design uses the 9600 8N1 format. The diagram shows the expected waveform from the UART when using the common 8N1 format. 9600 8N1 signifies a baud rate of 9600, 8 data bits, no parity and 1 stop bit. When idle, the RS-232 line is in the Mark state (logic 1). A transmission starts with a start bit, which is logic 0. Then each bit is sent down the line one at a ti
35. ing text will briefly introduce the whole system. The system can be divided into four stages: the fetch stage, decode stage, execute stage and write-back stage. The fetch stage fetches the next instruction. The decode stage splits the 16-bit instruction into several parts. The execute stage executes the instruction, and the write-back stage writes the result of execution to the desired destination. The flow is illustrated in Figure 5.2.

5.3 Program Counter

The first instruction is at the reset vector address. The instructions that follow fall into three types: sequential, branch and jump.

For a branch instruction, the condition codes are considered first. For BEQ, the zero flag Z is considered: if Z is high, the branch is taken and the PC is incremented by the displacement value coded in the instruction. The JMP instruction is much simpler: execution goes straight to the effective address given by the displacement value. Both displacement values need to be size-extended because the instruction memory size is 16 bits. For anything other than a branch or jump, execution continues with the next sequential instruction: the current instruction address is incremented by one to move to the next instruction in the next cycle.

5.4 Instruction Memory

As its name suggests, the instruction memory is where the instructions are stored for execution by the processor. Then the in
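The three cases of the program counter update (sequential, BEQ, JMP) can be sketched as a single next-address selection. This is a minimal sketch under stated assumptions: the module name and the port names are hypothetical, and the 8-bit displacement is zero-extended to 16 bits, which is one plausible reading of "size-extended".

```verilog
// Illustrative sketch; names and the zero-extension choice are assumptions.
module next_pc_sketch (
    input  wire [15:0] pc,       // address of the current instruction
    input  wire [7:0]  disp,     // displacement field of the instruction
    input  wire        is_beq,   // current instruction is BEQ
    input  wire        is_jmp,   // current instruction is JMP
    input  wire        z_flag,   // zero flag from the condition codes
    output wire [15:0] next_pc   // address of the next instruction
);
    wire [15:0] disp_ext    = {8'b00000000, disp}; // extend 8 bits to 16
    wire        take_branch = is_beq & z_flag;     // BEQ taken only if Z is high

    assign next_pc = is_jmp      ? disp_ext      :  // jump: go straight to disp
                     take_branch ? pc + disp_ext :  // branch: PC plus disp
                                   pc + 16'd1;      // sequential: PC plus one
endmodule
```

Keeping the selection in one continuous assignment makes the priority explicit: a jump overrides a branch, and a branch that is not taken falls through to the sequential case.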
36. instruction bits 8 to 11 is set to 1 for the rm and mr formats.

For a branching instruction, the outcome depends on the condition codes of the ALU operation. For BEQ, if the zero flag in the condition codes goes high after the ALU operation, the branch occurs, and the displacement (disp) value decides where it branches to. For the jump instruction, JMP, no condition codes need to be considered; it goes directly to the value of disp. Instruction bits 8 to 11 differentiate the JMP instruction from the other instructions.

Table 4.2 Designed Instructions [table contents garbled in extraction; the legible entries include SB, AND, XNOR, JMP (jump to) and CLR (clear reg)]

Table 4.2 shows all the instructions designed for this processor core. All instructions take one cycle, but the implementation is ordinary sequential execution rather than pipelined, so this processor is somewhat slower than a pipelined processor.

CHAPTER 5 PROCESSOR CORE DESIGN

This chapter explains the full design of the processor core according to the instruction set designed in the previous chapter.

5.1 Overview

The design of the processor core is divided into se
37. instructions to only those most frequently used, the computer would get more done in a shorter amount of time. The term RISC, short for Reduced Instruction Set Computer, was later coined by David Patterson, a teacher at the University of California in Berkeley [9]. These are the features associated with RISC [3]:

• Provides basic primitives, not complete solutions, as instructions. This leads to the reduced instruction set [3].
• Orthogonality and regularity in the instructions as much as possible [3].
• Single-cycle execution of most instructions [3].
• Easy to pipeline [3].
• A lot of general-purpose registers (GPR) [3].
• Arithmetic and logic operations are done on register operands or immediates (the load/store architecture principle) [3].

CHAPTER 3 PROJECT METHODOLOGY

This chapter elaborates the procedure of the whole project in line with the objective of the project.

3.1 Introduction

Figure 3.1 Project Methodology Flow [first stage shown: Literature Review]

The first stage of this project was the literature review. Many papers and books gave good information on how to design a processor core, such as Digital Systems Verilog Design by Dr Mohamed Khalil Hani, Design of a RISC Microcontroller by D. Sulik, M. Vasilko, D. Durackova and P. Fuchs, and others. The next stage is the FPGA. The FPGA also influenced the design because the processor core is implemented on an FPGA. The device used must be
38. ional IO that is needed, whereas an ASIC solution restricts the IO to what is on the IC; no expansion is available [4]. An SoPC solution also provides more options for prototyping possible solutions without a significant change to the hardware. This flexibility is most readily realized when using hardware provided by the vendor in the form of a development kit [4].

2.6 Multiple Processors

More complex embedded systems could benefit from the use of multiple processors, which decrease execution time by executing tasks in parallel. Soft-core processors and their accompanying toolsets can make the task of implementing multiple processor cores that interface with a common set of peripherals much more feasible and appealing to designers. Also, there are no additional BOM costs for adding a soft-core processor to an FPGA, as long as there is enough space in the FPGA for the implementation. The only restriction on how many processors can be in an SoPC is the logic available in the FPGA. Therefore, when an SoPC design requires more parallel processing, adding another soft-core processor is a viable solution that does not impact the hardware significantly.

For the most part, the soft-core design process is not too different from any other embedded development. Perhaps the only major differences are additional roles that may not be found in other development approaches and the stress on continuous communication be
39. [Garbled listing: tail of a Verilog case-statement converter module (always @ ... case (a) ... default ... endcase end endmodule); the case values are not recoverable from the extraction.]

APPENDIX N Asynchronous Receiver System to catch 12 data bytes

[Schematic residue; not recoverable from the extraction.]
40. mance. In this case, a higher-performance version of the processor would have more pipeline stages, increasing throughput. This offers more flexibility to the user [4]. It is important to note that a performance increase comes with an increase in the number of logic elements or the amount of memory that the processor consumes, leaving fewer resources for peripherals and custom logic [4]. At a higher level of complexity, a designer can take the source code of the processor core and modify it to meet the needs of the application. Being able to modify the source code of the actual processor core offers the greatest flexibility obtainable. Not all vendors offer source code for their soft-core processor solution; sometimes the core is encrypted [4].

2.5 Board-level Configurability

Using a System on a Programmable Chip (SoPC) solution also offers flexibility external to the FPGA. A discrete microprocessor solution has a fixed pinout, which sometimes makes routing difficult. Since an SoPC exists in an FPGA, the pinout is flexible. This gives the board designer almost complete freedom in component placement, provided the FPGA still meets the timing constraints with the final pin placement [4]. Another benefit is that more GPIO is available in an SoPC solution than in a discrete microprocessor. The FPGA can be scaled up in size if necessary to accommodate any addit
41. me. The LSB (Least Significant Bit) is sent first. A stop bit (logic 1) is then appended to the signal to complete the transmission.

Figure 5.11 TTL/CMOS Serial Logic Waveform

A serial interface is a simple way to connect an FPGA to a PC. In this design only an asynchronous receiver is needed. It takes an RS-232 signal, RxD, from outside the FPGA and deserializes it for easy use inside the FPGA. Figure 5.12 shows the I/O of the asynchronous receiver.

Figure 5.12 The asynchronous receiver I/O

The asynchronous receiver works like this:

• The module assembles data from the RxD line as it comes.
• As a byte is being received, it appears on the data bus. Once a complete byte has been received, data_ready is asserted for one clock.

Oversampling the data: an asynchronous receiver has to get in sync with the incoming signal somehow, but it does not have access to the clock used during transmission, since the format is asynchronous. The receiver therefore oversamples the incoming signal at 8 times the baud rate. At 9600 baud, that gives a sampling rate of 76,800 Hz.

The baud rate and the clock frequency of the asynchronous receiver have been parameterized for easier future use, as shown in Figure 5.13.

parameter ClkFrequency = 50000000; // 50 MHz
parameter Baud = 9600;

Figure 5.13 Parameter Values

This shift register was intended to shift 8-bit data for 12 stages
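A 12-stage shift of received bytes, driven by the receiver's one-clock data-ready pulse, could look like the sketch below. The module name, the port names and the choice to expose the 12 bytes as one 96-bit bus are assumptions for illustration; the thesis's own shift register module is not reproduced here.

```verilog
// Illustrative sketch; names and the packed 96-bit output are assumptions.
module shift12_sketch (
    input  wire        clk,
    input  wire        data_ready,  // one-clock pulse per received byte
    input  wire [7:0]  data_in,     // byte from the asynchronous receiver
    output wire [95:0] bytes        // 12 bytes; the newest is in bits 7..0
);
    reg [95:0] sr = 96'd0;

    // On every received byte, move all stored bytes up by one stage
    // and place the new byte in the lowest stage.
    always @(posedge clk)
        if (data_ready)
            sr <= {sr[87:0], data_in};

    assign bytes = sr;
endmodule
```

After 12 data-ready pulses the first byte received sits in bits 95..88 and the last in bits 7..0, so a complete 12-byte message can be read out in parallel.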
42. multitude of vendors and have a wide range of features. A discrete microprocessor is implemented as an ASIC with a specific peripheral set along with the processor core [4]. Selecting a discrete processor that meets the application's cost and functional requirements can be a time-consuming process. There are times, however, when an OTS processor solution will not meet those requirements. An example would be an application that requires custom logic or a significant amount of peripheral functionality that is not available in a discrete solution. In this case, the logical place to look is a processor and peripheral set that can be tailored to the application and included with the custom logic that the application needs [4].

2.2 Hard Processor Core

A hard processor core differs from a discrete processor and a soft-core processor because it has dedicated silicon on the FPGA. The dedicated silicon allows it to operate with a core frequency and a DMIPS rating similar to those of a discrete microprocessor. The benefit a hard core provides is that it exists in an environment where the surrounding peripherals can be customized for the application. The hard processor core does not provide the ability to adjust the core for the application, nor does it allow the flexibility of adding a processor to an existing design or an additional processor for more processing capability. In addition, only specific FPGAs will have the o
43. o have to figure out how to address the operands and what type of data the instructions will process. In a simple case, the operations can be sketched by looking at the algorithm descriptions; in a more complicated case, some profiling is needed to find out how frequently some operations, operation patterns or common subroutines are executed. Then we can design the instruction set and its coding. Next we must start capturing the organizational architecture. This can be done with pen-and-paper methods, spreadsheet calculations of cycle counts and so on; independently of the method used, estimating the foreseen implementation of the architectures explored is of key importance.

FPGA implementation is performed by downloading the design into the targeted FPGA device. We have to make sure that the device has sufficient resources, such as ROM and RAM, for the microcontroller design. After the correct device has been selected and the design downloaded to it, the FPGA implementation can be tested in a real physical environment by running the control program for the RFID reader. Before the microcontroller is downloaded into the FPGA, however, the control program for the RFID reader must be written, for example to load data extracted from the reader into the ROM register. The control program is first written in assembly language; I then have to convert it to hex code so that the processor core can execute it. The convert pro
44. on Set Computer
DDR - Double Data Rate
DMIPS - Dhrystone Million Instructions Per Second
FPGA - Field Programmable Gate Array
GPIO - General Purpose Input Output
GPR - General Purpose Registers
HDL - Hardware Description Language
IC - Integrated Circuit
IR - Instruction Register
MIF - Memory Initialization File
OTS - Off-the-shelf
PC - Program Counter
PCI - Peripheral Component Interconnect
PIC - Peripheral Interface Controller
RAM - Random Access Memory
RFID - Radio Frequency Identification
RISC - Reduced Instruction Set Computer
ROM - Read Only Memory
RTL - Register Transfer Level
SDRAM - Synchronous Dynamic Random Access Memory
SRAM - Static Random Access Memory
SoPC - System on a Programmable Chip
UART - Universal Asynchronous Receiver Transmitter
USART - Universal Synchronous Asynchronous Receiver Transmitter

LIST OF APPENDICES

APPENDIX A - Processor Core Verilog Module
APPENDIX B - Instruction Memory Verilog Module
APPENDIX C - Decoder Verilog Module
APPENDIX D - Register File Verilog Module
APPENDIX E - ALU and Conditional Codes Verilog Module
APPENDIX F - ROM Register Verilog Module
APPENDIX G - Port Verilog Module
APPENDIX H - Asynchronous Receiver
APPENDIX I - Control Program to Test (Assembly Language)
APPENDIX J - Control Program Hexcode (Machine Language)
APPENDIX K - Timing Simulation of Control Program
APPENDIX L - Twelve Stages
45. [Figure residue: simulation waveform with hexadecimal signal values; not recoverable from the extraction.]
46. or the verification of the RFID card's 10-digit data, a Visual Basic 6 GUI was made to display the data; for this, the RFID reader must be connected to the DB-9 RS-232 port on the back of the computer instead of to the DE2 FPGA board. The RS-232 format in the VB6 GUI is 9600 8N1. Figure 6.5 shows the output when the RFID card touches the RFID reader while the reader is connected to the computer.

[Screenshot: Mifare ID 1585579260, Clear ID button]

Figure 6.5 VB6 RFID card verification

The code for the VB6 Mifare ID GUI is in Appendix. Figure 6.6 below shows an example output waveform of the RFID reader's RxD line when connected to an oscilloscope. From Figure 6.6 we can see the 12 bytes of the RFID serial number being sent out.

Figure 6.6 Output waveform of RxD RFID Reader

These 12 bytes of data must therefore be received by the RxD asynchronous receiver of the UART on the FPGA. The 12 bytes must be shifted through 12 stages, RXDATA1 to RXDATA12, and then converted from ASCII to ordinary hexadecimal. Each byte is 8 bits with no parity, and the raw bytes are not used directly as the characters of the 10-digit RFID card serial number.

Figure 6.7 Flow of RFID card serial number verification [steps shown: RFID card touches the RFID reader; 12 bytes of data sent out; RxD UART; 12 stages of shift register; convert; display on DE2 board]

Figure 6.7 shows the flow when the RFID reader and the DE2 board are connected together for verification. Unfortunately, when the UART was implemented on the Altera
47. ormation of the proposed processor core design from the view of processor instructions.

4.1 Central Processing Unit (CPU)

Figure 4.1 Processor Flow [legible labels: Multiple operands; Instruction complete, fetch next instruction; Return for string or vector data]

The operation of a processor core is rather simple in nature: it repeatedly fetches an instruction from memory, decodes it, executes it, and then returns to the fetch cycle to fetch the next instruction. The next instruction to be executed is normally the next instruction in sequence in memory. Figure 4.1 shows the processor core flow of my design. The processor core in this design is non-pipelined.

Initially, the instruction fetched is the first instruction in the instruction memory, since the instruction address held in the program counter (PC) is zero. The fetched instruction is copied from the cache into the instruction register (IR). After an instruction is fetched, it is decoded. From the decoded instruction, the processor knows what to do next: which operand addresses to calculate and which arithmetic operation to perform. For example, the ADD instruction adds the contents of the source register and the destination register and places the result in the destination register. Before that, the values of both the source and destination registers need to be fetched from the register file to perform the addition. While for ADDI operation da
48. parameter Baud8GeneratorAccWidth = 16;
wire [Baud8GeneratorAccWidth:0] Baud8GeneratorInc = ((Baud8<<(Baud8GeneratorAccWidth-7))+(ClkFrequency>>8))/(ClkFrequency>>7);
reg [Baud8GeneratorAccWidth:0] Baud8GeneratorAcc;
always @(posedge clk) Baud8GeneratorAcc <= Baud8GeneratorAcc[Baud8GeneratorAccWidth-1:0] + Baud8GeneratorInc;
wire Baud8Tick = Baud8GeneratorAcc[Baud8GeneratorAccWidth];

reg [1:0] RxD_sync_inv;
always @(posedge clk) if(Baud8Tick) RxD_sync_inv <= {RxD_sync_inv[0], ~RxD};
// we invert RxD, so that the idle becomes "0", to prevent a phantom character to be received at startup

reg [1:0] RxD_cnt_inv;
reg RxD_bit_inv;

always @(posedge clk)
if(Baud8Tick)
begin
  if( RxD_sync_inv[1] && RxD_cnt_inv!=2'b11) RxD_cnt_inv <= RxD_cnt_inv + 2'h1;
  else
  if(~RxD_sync_inv[1] && RxD_cnt_inv!=2'b00) RxD_cnt_inv <= RxD_cnt_inv - 2'h1;

  if(RxD_cnt_inv==2'b00) RxD_bit_inv <= 1'b0;
  else
  if(RxD_cnt_inv==2'b11) RxD_bit_inv <= 1'b1;
end

reg [3:0] state;
reg [3:0] bit_spacing;

// "next_bit" controls when the data sampling occurs
// depending on how noisy the RxD is, different values might work better
// with a clean connection, values from 8 to 11 work
wire next_bit = (bit_spacing==4'd10);

always @(posedge clk)
if(state==0)
  bit_spacing <= 4'b0000;
else
if(Baud8Tick)
  bit_spacing <= {bit_spacing[2:0] + 4'b0001} | {bit_spacing[3], 3'b000};

always @(posedge clk)
49. ption of having a hard core; therefore the choice of vendors and FPGAs is limited [4].

2.3 Soft Processor Core

A soft-core processor solution is one that is implemented entirely in the logic primitives of an FPGA. Because of this implementation, the processor will not operate at the speeds, or have the performance, of a hard core or a discrete solution. In many embedded applications, the high performance achieved by the previous two processing options is not required, and performance can be traded for expanded functionality and flexibility. Soft-core processors may be appropriate for a simple system where the only functionality is the manipulation of GPIO (General Purpose Input Output). They may also fit a complex system that incorporates an operating system and whose interfaces include Ethernet, PCI (Peripheral Component Interconnect), DDR SDRAM (Double Data Rate Synchronous Dynamic Random Access Memory) and any other custom IP [4].

2.4 Customizable Processor Core

A soft-core processor also offers the flexibility of tailoring the core itself to the application. There are a few different levels at which this can be accomplished, depending on the vendor. On one level, things such as cache size can easily be adjusted [4]. Most toolsets offer the option to configure different cache sizes to suit what the application requires. A vendor may also offer different versions of the processor that have varying levels of perfor
50. sregl dreg2 sreg2 TEST DECODER IS OK wire 7 0 bus writing MOV amp amp insn 11 8 8 RXDATA CLR sreg2 XNOR 2 COR alb 2 lt a AND 2 asb XOR 2 sum output 7 0 dreg2 sreg2 output eso pus wetting Operand Selection wire N 0 a SB 2 0 dreg2 wire N 0 b OB 2 ROMOUEZ X d OEE I SUBIT MOV 2 imm sreg2 EXECUTION PROCESS ALU and conditional codes wire add SURI GURI CMP ALUSnaccT add Sed out Valid insi Ge Clk St ex CO Cov Output D output INO sum CCO COV ROM 52 wire 3 0 addr SB rs LB rs 0 wire insn 11 8 15 0 SB 1 0 ROM16x8bit ROM clk writing addr ROMOULI ROMout2 output output 270 addr output 770 PORTout port wey cik PortOUIdataoutl PoreOUlTdatacucZ wire weY amp insn 11 8 15 1 0 wire 7 0 dataY insn 7 0 output output 7201 PorbtOUIdataouti PROGRAM COUNTER AND INSTRUCTION FETCH conditional branch for BEQ reg T always negedge clk begin Bei eb 6 O07 t 1 else t 0 end 53 assign out1111 insn 7 0 disp assign branch hit amp E output Yu s output eie ebek o tpu
51. struction fetched from the memory is loaded into the Instruction Register (IR). The IR latches the new instruction only if the HIT signal is asserted. HIT indicates whether the instruction is valid; for simplicity, HIT is always asserted high. The instruction fetched depends on the PC address from the previous module. The data fetched is the 16-bit instruction, which is decoded in the Decoder module. The size of the instruction memory can be changed to the desired size in the Verilog module of the Instruction Memory.

Figure 5.3 Instruction hexcode

The control program of the processor core is first written in assembly language, which is the programmer's view. It is then converted into machine code so that the processor can work with it. There is no assembler, so I have to convert it manually. Figure 5.3 shows how the instructions are arranged in the Instruction Memory; the PC address in the instruction memory starts after 20. The contents of the instruction memory are in hexadecimal format.

5.5 Decoder

The decoder plays an important role in the processor core: it is where the instruction is decoded into several parts. The input of the decoder is the 16-bit instruction from the Instruction Memory. There are five parts, as shown in Table 5.1.

Table 5 1
52. t [7:0] data;
input [3:0] addr;
output [7:0] ROMout2;
reg [7:0] mem [0:15];
reg [7:0] ROMout1;
always @(negedge clk)
begin
  if (we)
  begin
    mem[addr] <= data;
  end
  ROMout1 <= mem[addr];
end
assign ROMout2 = mem[addr];
endmodule

APPENDIX G Port Verilog Module

module PortOut(we, clk, data, PortOUTdataout1, PortOUTdataout2);
input we, clk;
input [7:0] data;
output [7:0] PortOUTdataout1, PortOUTdataout2;
reg [7:0] PortOUTdataout1;
always @(negedge clk)
begin
  if (we) PortOUTdataout1 <= data;
end
assign PortOUTdataout2 = data;
endmodule

APPENDIX H Asynchronous Receiver

// RS-232 RX module
// (c) fpga4fun.com KNJN LLC - 2003, 2004, 2005, 2006
module async_receiver(clk, RxD, RxD_data_ready, RxD_data, RxD_endofpacket, RxD_idle);
input clk, RxD;
output RxD_data_ready; // one clock pulse when RxD_data is valid
output [7:0] RxD_data;

parameter ClkFrequency = 50000000; // 50MHz
parameter Baud = 9600;

// We also detect if a gap occurs in the received stream of characters
// That can be useful if multiple characters are sent in burst
// so that multiple characters can be treated as a "packet"
output RxD_endofpacket; // one clock pulse, when no more data is received (RxD_idle is going high)
output RxD_idle; // no data is being received

// Baud generator (we use 8 times oversampling)
parameter Baud8 = Baud*8;
53. t branch Jump output 1370 next pes output L370 wire 15 0 extl6 8 b00000000 insn 7 0 branch reg 15 0 increment always negedge clk begin increment lt 16 end jump reg 15 0 for jump always negedge clk begin for jump lt 16 end reg jump always 8 negedge clk begin Ir 01E Ae MP jump lt 1 else jump lt 0 end 54 last PG check wire AN 0 pcinc branch increment pc 1 assign next po Jump Tor Jump reg AN 0 pc always negedge clk begin ie Troe DG else if valid insn ce lt Next PE end assign insn ce rst s 0 amp rdy as SiG pe endmodule 55 APPENDIX B Instruction Memory Verilog Module module inst_mem_mushy addr pc clk data out input elk input 45206 sa 1370 data ot reg 1520 601 begin Sreadmemh LAST txt mem end assign data out mem addr pc endmodule APPENDIX C Decoder Verilog Module module Decoder insn op rd rs imm input 11530 OI ES OND NS os BAe wire 3 wire 3 wire 3 wire 7 wire 7 Dos Opr Boh 3557 LJ SO mm sms 0 op GT 0 rs 0 imm OU disp 7 1 structlion decoding assign assign assign assign assign endmodule Op es reek de rd mesmnbsso s rs insn 7 4 imm insn 11
54. data will be fetched from a register and an immediate value will be extracted from the instruction. The data and the immediate value are then added. The data operation, which is normally executed by the arithmetic logic unit (ALU), performs a specific operation according to the decoded instruction. The proposed processor supports operations such as add, subtract, load, store, multiply, AND, OR, and branching. In this processor core, memory is accessed only during load and store instructions, and the memory accessed is the data cache. The result of the operation performed in the ALU is then written back to the appropriate register in the register file. Lastly, the value of the Program Counter (PC) is incremented by one (PC = PC + 1). On branch and jump instructions, however, the PC can be updated to another address.

4.2 Instruction Set

The operation of the processor is determined by the instructions it executes, referred to as machine instructions or computer instructions. There are six instruction formats in this design, as shown in Table 4.1. Each instruction is 16 bits long, and the 16 bits are divided into several fields that tell the processor what to do. The design uses a simple architecture with 16 8-bit general-purpose registers and 16 8-bit registers in the memory register file for communication with real memory such as RAM and ROM. T
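The PC update rule described above (PC = PC + 1 unless a branch or jump redirects it) can be sketched in Python as follows. This is a behavioral model, not the thesis's implementation: the 16-bit PC width, the function name `next_pc`, and the way the target address is passed in are assumptions made for illustration.

```python
PC_MASK = 0xFFFF  # ASSUMED 16-bit program counter


def next_pc(pc: int, jump: bool = False, branch_taken: bool = False,
            target: int = 0) -> int:
    """Return the PC value for the next cycle.

    A jump or a taken branch loads the target address; otherwise the
    PC advances to the next instruction and wraps at the PC width.
    """
    if jump or branch_taken:
        return target & PC_MASK
    return (pc + 1) & PC_MASK


sequential = next_pc(0x0010)
redirected = next_pc(0x0010, jump=True, target=0x0200)
wrapped = next_pc(0xFFFF)
print(hex(sequential), hex(redirected), hex(wrapped))
```

Masking with `PC_MASK` models the fixed register width, so the counter wraps to zero after the last address instead of growing without bound.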
55. between the different designer roles to maximize the use of additional features that may be available.

2.7 Complex Instruction Set Computer (CISC)

Early computers had only a small number of instructions and used simple instruction sets, forced mainly by the need to minimize the hardware used to implement them. As digital hardware became cheaper, computer instructions tended to increase both in number and complexity. These computers also employ a variety of data types and a large number of addressing modes. A computer with a large number of instructions is known as a complex instruction set computer, abbreviated CISC. The major characteristics of CISC architecture are [6]:
• A large number of instructions, typically from 100 to 250 [6]
• Some instructions that perform specialized tasks and are used infrequently [6]
• A large variety of addressing modes, typically from 5 to 20 different modes [6]
• Variable-length instruction formats [6]
• Instructions that manipulate operands in memory [6]

2.8 Reduced Instruction Set Computer (RISC)

The concept was developed by John Cocke of IBM Research during 1974. His argument was based on the notion that a computer uses only 20% of its instructions, making the other 80% superfluous to requirements [9]. A processor based on this concept would use few instructions, which would require fewer transistors and make it cheaper to manufacture. By reducing the number of transistors and
56. several sections, each of which explains one of the design modules. This processor core is a sequential processor: it processes and completes one instruction in one cycle before it fetches a new instruction for the next cycle. The modules inside the processor core are:
• Program Counter
• Instruction Memory
• Decoder
• General Purpose Register
• Operand Selection
• ALU and Conditional Codes
• ROM register
• PORT
• Shift Register

5.2 Architecture Overview

Figure 5.1: Architecture overview

Figure 5.1 shows the top-level block diagram of the design; every block represents a module of the processor. At first glance, there are 10 modules to be designed separately using the top-down design approach. Some modules, like the decoder, are easy to design, but modules like the ALU require a lot of understanding. The overall dataflow and bus structure between all modules must be understood before designing the modules individually. Buses provide the connections between modules. The data bus connects modules such as the PORT, the ROM register, and the GPR. It is called a common bus in the design because it is shared by many modules. For example, the GPR can receive data from the data bus, and the other modules can both receive data from and send data to the data bus.

Figure 5.2: The processor four stages RISC processor flow

The follow
57. works of this project to readers.

7.1 Recommendation for Future Works

At present, the processor core is not connected to real ROM or RAM, so the memory register design in the processor core works only at a low level, providing 8-bit data at register level. A real implementation of memory such as ROM and RAM would let the LB and SB instructions show their benefit to the processor. The processor core is a sequential processor, where each instruction must complete before the processor can fetch a new instruction to process. This makes the processor run somewhat slowly. Implementing a pipelined design in this processor core would let it run at a faster rate, because fetch, decode, execute, and write back could then all proceed in one cycle.

For the processor core to run a control program for the RFID reader, the first stage, extracting the serial numbers from the RFID card, must work reliably. A problem with the control signal, either at the UART or at the shift register, prevented this implementation from being completed. To solve the uncontrolled transition of retrieved data, the UART could be replaced with a USART design module.

7.2 Conclusion

In conclusion, the processor core of this project has been completed successfully, fulfilling the objective and scope specified for the processor core design. However, the processor core is not implemented well on the Altera DE2 FPGA due to the problems with the UART and RFID. If this problem can be overcome this processor