Thesis Paper - BRAC University Institutional Repository

1.     end
    else if (X_Cont == 0 && Y_Cont == 0)   // when inside a frame
    begin
        inside_active_frame = 1;
    end
    else if (X_Cont == 2047 && Y_Cont == 1943 && inside_active_frame)
    begin
        inside_active_frame = 0;
    end
end

reg  prev_odval;
wire busodval;
assign busodval = sCCD_DVAL;

// Raise new_pixel on a rising edge of the data-valid signal and
// clear it once the bridge has acknowledged the write.
always @(posedge CLOCK_50 or negedge DLY_RST_2)
begin
    if (DLY_RST_2)
    begin
        if ({prev_odval, busodval} == 2'b01)
        begin
            new_pixel <= 1;
        end
        prev_odval <= busodval;
        if (write && acknowledge)
        begin
            new_pixel <= 0;
        end
    end
end

8.3.2 Interfacing Qsys Component with FPGA board

Jan_18th u0 (
    .clk_clk                                       (CLOCK_50),     // clk.clk
    .fifo_bridge_write                             (write),        // fifo_bridge.write
    .fifo_bridge_write_data                        (grayscale),    // .write_data
    .fifo_bridge_acknowledge                       (acknowledge),  // .acknowledge
    .sdram_clk_clk                                 (CLOCK_50_2),   // sdram_clk.clk
    .video_vga_controller_0_external_interface_CLK (CLOCK_50),     // video_vga_controller_0_external_interface.CLK
    .video_vga_controller_0_external_interface_HS  (VGA_HS),       // .HS
    .video_vga_controller_0_external_interface_VS  (VGA_VS),       // .VS
    .video_vga_controller_0_external_interface_R   (VGA_R),        // .R
    .video_vga_controller_0_external_interface_G   (VGA_G),        // .G
    .video_vga_controller_0_external_interface_B   (VGA_B)         // .B
);

8.3.3 Creating SDRAM allocation

// One address per valid pixel; 5038848 = 2592 x 1944 pixels in a frame.
always @(posedge sCCD_DVAL)
begin
    if (address_count < 5038848)
    begin
        address_count = address_count + 1;
    end
    else
    begin
        address_count = 0;
    end
end
assign address_line = address_count;
2. [Figure 2.1: Face Recognition approaches]

2.2 FFT

The fast Fourier transform (FFT) is simply a fast, computationally efficient way to calculate the Discrete Fourier Transform (DFT). The FFT algorithm was first published by Cooley and Tukey in 1965. It is a clever algorithm that can be used to transform a signal from the time domain to the frequency domain. The FFT greatly reduces the amount of calculation required. Working in the frequency domain also makes it easier to suppress noise that is present in the time domain signal. Functionally, the FFT decomposes the set of data to be transformed into a series of smaller data sets. Then it decomposes those smaller sets into even smaller sets. At each stage of processing, the results of the previous stages are combined in a special way. Finally, it calculates the DFT of each small data set. For example, an FF
3. Qsys system content output. The connections between the components are made as per our project's requirements. The image is given below.

[Figure 5.8: Qsys system content]

After the connections were made, Verilog code had to be written for some of the components to work. That is, the HDL example generated from the Qsys system output had to be assigned input and output signals. For example, we had to write code to make the External Bus to Avalon Bridge work. The logic we created is shown in the flowchart below.

[Figure 5.9: Flowchart for valid frame capture]

The logic we proposed is that when the button for capturing the image is pressed, a frame from the streaming data is captured. First we check whether the current frame, at the moment the button is pressed, meets the inside active frame
4. [Figure: Matched, Equivalent Database Image]

We then experimented with another image showing a different expression, along with glasses, to verify whether the FFT can recognize it, and it showed:

[Figure: Equivalent Database Image, 78.6761% Matched]

This suggests that the FFT-based method can recognize faces with different expressions. On the other hand, if we place in the Test Database the same image that is already stored in the Train Database, the accuracy is 100%. For example:

[Figure: Equivalent Database Image, 100% Matched]

This means that if we test exactly the same image, the algorithm can identify it in the Train Database. We used the above steps to verify whether, and to what extent, our proposed algorithm is suitable as a face recognition algorithm.

Chapter 4

4.1 DE0 Board and TRDB-D5M Specifications

The field programmable gate array (FPGA) is a semiconductor device that can be programmed after manufacturing. We can use an FPGA to implement any logical function that an application specific integrated circuit (ASIC) could perform. Unlike previous generation FPGAs, which used I/Os with programmable logic and interconnects, today's FPGAs consist of various mixes of configurable embedded SRAM, high speed transceivers, high speed I/Os, logic blocks and routing. Most importantly, an FPGA contains programmable logic components called lo
5. Next, we computed the mean value of the FFT images using the mean2 function. We repeated the above steps on the Test Database as well. Then we built an array containing the differences between the means of the test and train images. Note that the Test Database can contain a single image. We set a threshold of 21 by trial and error. If the difference of the means is less than 21, the image is declared matched. For values greater than 21, we consider the images not matched. Images that are not included in the Train Database will therefore not match in the end. The whole procedure is shown in the following flowchart.

[Figure 3.1: Flowchart of FFT based face recognition]

[Figure 3.2: Train Database]

Figure 3.2 shows our Train Database. It is noticeable that the images we have taken are not uniform. One image has a lighting effect, another may not. One person is smiling, another is not. This was done intentionally so that we can identify the limitations of the FFT, that is, to check whether it can still match under different lighting and different expressions.

The other database is the Test Database. Its image may or may not come from the Train Database, and it can include only one image. For now, let us consider that the Test Database has the following image. Figure 3.3 Test Databa
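The matching rule described in this excerpt can be sketched in C as follows. This is only a minimal illustration, not the thesis code: the function names mean_magnitude and match_by_mean are hypothetical, the magnitudes are assumed to be already computed, and picking the closest train image below the threshold is an assumption (the text only states that a difference below the threshold counts as a match).

```c
#include <assert.h>
#include <stddef.h>

/* Mean of n FFT magnitude values (stands in for MATLAB's mean2). */
double mean_magnitude(const double *mag, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += mag[i];
    return sum / (double)n;
}

/*
 * Compare the test mean with every train mean.
 * Returns the index of the closest train image when its absolute
 * difference is below the threshold, otherwise -1 (not matched).
 * Choosing the closest candidate is an assumption of this sketch.
 */
int match_by_mean(double test_mean, const double *train_means,
                  size_t n_train, double threshold)
{
    int best = -1;
    double best_diff = threshold;
    for (size_t i = 0; i < n_train; i++) {
        double diff = test_mean - train_means[i];
        if (diff < 0.0)
            diff = -diff;
        if (diff < best_diff) {
            best_diff = diff;
            best = (int)i;
        }
    }
    return best;
}
```

With the threshold of 21 quoted above, a call such as match_by_mean(test_mean, train_means, n_train, 21.0) would report either a train index or -1 for "not matched".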
6. We have chosen the VGA connector to suit our purpose.

5.6 Video DMA Controller

The DMA Controller IP core stores and retrieves video frames to and from memory. In the from-stream-to-memory mode, the core stores frames from an incoming stream to an external memory, using its Avalon memory-mapped (MM) master interface to send the data to the memory. In the from-memory-to-stream mode, the DMA controller uses its Avalon memory-mapped master interface to read video frames from an external memory and then sends those frames out through its Avalon streaming interface [28].

[Figure: Block diagram for DMA controller core]

The DMA controller's configuration wizard is used to specify the desired characteristics, such as:
• Mode
  o DMA Direction: specifies whether a video stream is to be stored to or retrieved from memory.
• Addressing Parameters
  o Addressing Mode: specifies the addressing mode.
  o Default Buffer Start Address: the start address of the buffer upon reset.
  o Default Back Buffer Start Address: the start address of the back buffer upon reset; it can be equal to the Default Buffer Start Address if no back buffer is desired.
  o Frame Resolut
7. choices are 8, 16, 32, 64, 128, 256, 512 or 1024 bits. Assign the data width to match the width of the master that accesses this memory most frequently or has the most critical throughput requirements. For example, if we connect the on-chip memory to the data master of a Nios II processor, we should set the data width of the on-chip memory to 32 bits, the same as the data width of the Nios II data master. Otherwise, the access latency could be longer than one cycle because the Avalon interconnect fabric performs width translation.
• Total memory size: this setting determines the total size of the on-chip memory block. The total memory size must be less than the available memory in the target FPGA.
• Minimize memory block usage (may impact fmax): this option is only available for devices that include M4K memory blocks, but we are using M9K memory blocks in our system [24].

Read Latency: On-chip memory components use synchronous, pipelined Avalon-MM (Memory Mapped) slaves.

Non-Default Memory Initialization: For ROM memories, we can specify our own initialization file by selecting Enable non-default initialization file. This option allows the specified file to be used to initialize the ROM in place of the default initialization file created by Qsys.

Enable In-System Memory Content Editor Feature: Enables a JTAG interface used to read and write to the RAM while it is operating. We can use this int
8. component analysis (ICA), genetic algorithms, neural networks and FFT are the algorithms established so far by researchers. In our project we focus on implementing FFT and PCA on the FPGA board and then compare the results. The FFT is used in a wide range of applications such as image analysis, image filtering, image reconstruction and image compression, and we used it for image recognition as well.

2.7 Benefits of using FPGA

As mentioned earlier, one of the important objectives of our project is to get acquainted with the FPGA board, since it is a completely new area for us. While researching, we learned very interesting things about FPGAs and decided to base our project on this board. An FPGA is exactly what the name suggests: a Field Programmable Gate Array. We program it as a piece of hardware. The FPGA basically implements look-up tables and is good at evaluating complex logic very fast. Using hardware description languages such as VHDL and Verilog, one can create complex logic structures. Speed is the biggest advantage of an FPGA. It is reprogrammable, and more than one project can be implemented using the same FPGA board. FPGAs can exceed the computing power of digital signal processors by taking advantage of hardware parallelism, accomplishing more per clock cycle. An FPGA has specialized functionality to closely match application requirements, and it supports long-term maintenance. As a product functio
9. 8.4 Code for SGDMA

/* Code for SGDMA */
#include <stdio.h>
#include "altera_avalon_sgdma.h"
#include "altera_avalon_sgdma_descriptor.h"
#include "altera_avalon_sgdma_regs.h"

int main()
{
    /* initialize scatter gather dma */
    alt_sgdma_dev *dev;

    /* Memory To Stream */
    alt_sgdma_descriptor *desc;     /* 1 */
    alt_sgdma_descriptor *next;     /* 2 */
    alt_u32 *read_addr;             /* 3 */
    alt_u16 length;                 /* 4 */
    int read_fixed;                 /* 5 */
    int generate_sop;               /* 6 */
    int generate_eop;               /* 7 */
    alt_u8 atlantic_channel;        /* 8 */
    alt_avalon_sgdma_construct_mem_to_stream_desc(desc, next, read_addr, length,
        read_fixed, generate_sop, generate_eop, atlantic_channel);
    alt_avalon_sgdma_start(dev);

    /* Stream To Memory */
    /* desc: 1 */
    /* next: this does not need to be a complete or functional descriptor,
       but must be properly allocated */
    int write_fixed;                /* 3 */
    alt_u32 *write_addr;
    alt_u16 length_or_eop;          /* 5 */
    alt_avalon_sgdma_construct_stream_to_mem_desc(desc, next, write_addr,
        length_or_eop, write_fixed);

    /* stop scatter gather dma */
    alt_avalon_sgdma_stop(dev);
    return 0;
}

8.5 Code for recognition in C (NIOS II)

#include <stdio.h>
#include <math.h>
#include <stdlib.h>   /* abs() */

/* mean function */
float mean(int m, int a[])
{
    int sum = 0, i;
    for (i = 0; i < m; i++)
        sum += a[i];
    return ((float)sum / m);
}

int main(void)
{
    int n, i, max, min;
    int absvalue[4][8];
    float mean(int, int[]);
    int meancalc[4];
    int temp[8];
    int diff[4];
    int threshold = 10;
    int fft_value[4][8] = {
        {90, 42, 21, 5, 43, 9, 34, 2},
        {2, 42, 21, 12, 4,
10. [Figure: Bayer Pattern Filter]

The above figure shows a Bayer pattern filter, in which each pixel holds only one primary color component. To convert an image from Bayer format to RGB format, each pixel needs to have values for all three primary colors.

4.9.1 RGB conversion

The camera is configured so that a Bayer image of 960 rows and 1280 columns is captured at 5 frames per second. The camera outputs the data in Bayer pattern, 12 bits wide, on a parallel bus. In Bayer pattern format, each pixel contains one of the three primary colors, output as four color channels: green1, blue, red and green2. The layout is shown in the following figure; two of the three color components are therefore missing in each pixel of the Bayer pattern.

[Figure 4.9: Bayer image pixels, with column and row readout directions]

This Bayer pattern data is then passed through a module that converts it into RGB values, using four pixels of the Bayer pattern to construct one RGB pixel. After applying the formula, the values of the other two components can be found. The camera manages green pixels as two different colors depending on which line they come from. In Bayer format, when one complete row and only the first two pixels of the second row have been scanned, the filter creates the first RGB pixel.

[Figure 4.10: RGB pixel from Bayer format]

The above figure shows the RGB pixel format. As the second row out of the camera completes scanning, fi
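As a rough illustration of building one RGB pixel from four Bayer pixels, the C sketch below combines a 2x2 cell (green1 and red from the first row, blue and green2 from the second). The excerpt does not give the conversion formula, so averaging the two green samples is purely an assumption here, and the names rgb_pixel and bayer_quad_to_rgb are hypothetical.

```c
#include <stdint.h>

/* One RGB pixel; each channel keeps the sample width of the input. */
typedef struct {
    uint16_t r;
    uint16_t g;
    uint16_t b;
} rgb_pixel;

/*
 * Combine one 2x2 Bayer cell into a single RGB pixel.
 * Red and blue are taken directly; green is the average of the two
 * green samples (an assumption, the source does not state the formula).
 */
rgb_pixel bayer_quad_to_rgb(uint16_t g1, uint16_t r, uint16_t b, uint16_t g2)
{
    rgb_pixel out;
    out.r = r;
    out.g = (uint16_t)(((uint32_t)g1 + (uint32_t)g2) / 2u);
    out.b = b;
    return out;
}
```

The sum of the greens is widened to 32 bits before dividing so that two full-scale 12-bit samples cannot overflow.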
11. Figure List
Figure 2.1 Face Recognition Approaches
Figure 2.2 Time Domain Decomposition
Figure 2.3 Rearrangement Pattern Required
Figure 2.4 Time domain to frequency domain
Figure 2.5 FFT Synthesis Flow Diagram
Figure 2.6 Result of Cooley-Tukey
Figure 2.7 The Magnitude calculated from the complex result
Figure 2.8 Magnitude after logarithmic transform
Figure 2.9 The Phase of FFT
Figure 2.10 Magnitude and phase of a Fourier image
Figure 3.1 Flowchart of FFT based Face Recognition
Figure 3.2 Train Database
Figure 3.3 Test Database Image
Figure 4.1 Cyclone III Device Architecture Overview Floorplan
Figure 4.2 Cyclone III FPGA
Figure 4.3 Cyclone III logic elements
Figure 4.4 DE0 FPGA Specification
12. [Figure 2.4: Time domain to frequency domain]

The figure shows how two frequency spectra, each composed of 4 points, are combined into a single frequency spectrum of 8 points. This synthesis must undo the interlaced decomposition done in the time domain. In other words, the frequency domain operation must correspond to the time domain procedure of combining two 4-point signals by interlacing. Consider two time domain signals, abcd and efgh. An 8-point time domain signal can be formed in two steps: dilute each 4-point signal with zeroes to make it an 8-point signal, and then add the signals together. That is, abcd becomes a0b0c0d0 and efgh becomes 0e0f0g0h. Adding these two 8-point signals produces aebfcgdh. Diluting the time domain with zeroes corresponds to a duplication of the frequency spectrum. Therefore, the frequency spectra are combined in the FFT by duplicating them and then adding the duplicated spectra together.

[Figure 2.5: FFT synthesis flow diagram. An odd four-point frequency spectrum and an even four-point frequency spectrum are combined into an eight-point frequency spectrum.]

This shows the method of combining two 4-point frequency spectra into a single 8-point frequency spectrum. The S operation means that the signal is multiplied by a sinusoid with an appropriately selected frequency.

In order to match up when added, the two time domain signals diluted with zeroes
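The time domain side of this argument, dilution with zeroes followed by addition, can be checked with a small C sketch. The function name interlace is hypothetical and integer samples are used only to keep the example short; note that the second signal is shifted by one sample so that its values land in the odd positions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Interlace two n-point signals into one 2n-point signal by diluting
 * each with zeroes and adding the results:
 *   even -> a 0 b 0 c 0 d 0   (samples at even indices)
 *   odd  -> 0 e 0 f 0 g 0 h   (samples at odd indices)
 * The caller must provide room for 2 * n values in out.
 */
void interlace(const int *even, const int *odd, size_t n, int *out)
{
    assert(even != NULL && odd != NULL && out != NULL);
    for (size_t i = 0; i < 2 * n; i++)
        out[i] = 0;
    for (size_t i = 0; i < n; i++) {
        out[2 * i] += even[i];      /* diluted first signal */
        out[2 * i + 1] += odd[i];   /* diluted, shifted second signal */
    }
}
```

For abcd = {1, 2, 3, 4} and efgh = {5, 6, 7, 8}, the output is the interlaced sequence aebfcgdh.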
13. [illegible entry]
4.5 Altera Cyclone III 3C16 FPGA device
4.6 Camera Module Pixel Array Structure
4.7 I2C Protocol
4.8 Camera Image Acquisition System
[two illegible entries]
4.9 Bayer to RGB conversion in FPGA
[illegible entry] conversion
Chapter 5 Hardware Implementation
5.1 External Bus to Avalon Bridge
5.2 SDRAM Controller
5.3 Phase Locked Loop (PLL)
5.4 Scatter Gather DMA (Direct Memory Access)
5.5 VGA Controller
5.6 Video DMA Controller
5.7 Fast Fourier Transform (FFT) Generated from Mega wizard
5.8 Create FFT block [illegible]
5.9 On chip Memory (RAM or ROM)
5.10 Nios II Processor
5.11 Hardware Ab
14. are used.

alt_avalon_sgdma_open: Returns a pointer to the SG-DMA controller with the given name.

Table 5.3 Function List

5.5 VGA Controller

The VGA controller IP core generates the timing signals required by the on-board VGA DAC on the DE-series boards and by Terasic's LCD with touchscreen daughtercards. In our project we use this controller IP core to view the data on the LCD monitor for display and verification. Data is provided to the VGA Controller via its Avalon Streaming Interface. The controller takes the incoming data, adds the suitable VGA timing signals, and then sends that information to either the on-board VGA DAC (digital-to-analog converter) or the LCD with touchscreen daughtercard [23][28].

The VGA Controller core generates the timing signals, including the vertical and horizontal synchronization signals. The timing information generated by the VGA Controller core produces screen resolutions of 640 x 480, 800 x 480 and 800 x 600 pixels for the VGA DAC, the LCD with touchscreen (TRDB-LTM) and the 8-inch LCD on the tPad, respectively.

[Figure: VGA Controller block diagram]

The parameters to be assigned in the Qsys configuration wizard are:
DE-Series Board: Specifies the Altera DE-series board that the system is being designed for. For our project we have used the DE0.
Video Out Device: Specifies the VGA-compatible device being used and, by extension, the screen resolution
15. available in the SDRAM. The above steps are repeated to store the information of another image, which is the concept behind creating the database. Nios II then carries out the further processing of comparison and recognition: for both images, the FFT values are compared. The data route can be viewed in the RTL viewer to get an idea of the logic gate implementation of the different blocks we have used.

[Figure 5.2: RTL Viewer]

To accomplish the above processes we have used some Qsys components. The components that we have put in our system are as follows (their order in the system might not be exactly that of the list):
1. External Bus to Avalon Bridge
2. SDRAM controller
3. Avalon ALTPLL
4. SG-DMA (scatter-gather) Controllers
5. VGA controller
6. Video DMA Controller
7. FFT block generated from MegaWizard
8. Nios II Processor
9. On-chip Memory (RAM)

5.1 External Bus to Avalon Bridge

We have used this IP core, or component, to interface our external camera module to our system. This bridge provides a simple interface for a peripheral device, in our case t
16. conditions. That is, if the frame has the dimensions of an active image (1944 x 2592), it must start from x = 0 and y = 0 and run to the end of the active image frame. If this condition is satisfied, the data is transferred on the next rising edge of odval, as sampled by the driving clock. This condition is detected by the new_pixel block, which drives the write signal high. After the whole frame has been written to SDRAM, inside_active_frame becomes 0, and this generates an Acknowledge from the external bus. When Acknowledge is high (1), the write signal is set to 0 to stop the writing process, as we have successfully collected one frame, that is, the data of one image. By following the above steps we tried to implement our proposed architecture in hardware.

Chapter 6 Results and Discussion

6.1 Software

Principal component analysis decomposes the covariance structure of the dependent variables into orthogonal components by calculating the eigenvalues and eigenvectors of the data covariance matrix. Eigenvalues assist in deciding the number of orthogonal components to be used in further analysis, while eigenvectors assist in determining the relationship between the original variables and these new components. Eigenvalues and eigenvectors transform the original variable space into a new set of variables called principal components (PCs) [32]. The Fast Fourier Transform (FFT) is the most common
17. To provide maximum flexibility for the user, all connections are made through the Cyclone III FPGA device. Thus the user can configure the FPGA to implement any system design.

[Figure 4.5: DE0 FPGA Components. Labels: Pushbutton Switches x3, Slide Switches x10, Expansion Headers x2, 7-Segment Display x4, 16x2 LCD Interface, RS-232 Transceiver, EPCS4, USB-Blaster, SDRAM 8 Mbytes, Flash 4 Mbytes, SD Card Socket, Triple 4-bit VGA DAC]

The DE0 board has a 50 MHz clock input and a Cyclone III 3C16 device, which has 15,408 LEs, 56 M9K embedded memory blocks, 504K total RAM bits, 56 embedded multipliers, 4 PLLs, 346 user I/O pins and a FineLine BGA 484-pin package. It has a built-in USB-Blaster circuit and SDRAM consisting of one 8-Mbyte single data rate synchronous dynamic memory chip that supports a 16-bit data bus. In addition, it has 4-Mbyte NOR flash memory, which supports byte (8-bit) and word (16-bit) modes, and general user interfaces that include 10 green LEDs (active high), 4 seven-segment displays (active low) and a 16x2 LCD interface (LCD module not included). It also has an SD card socket, which provides both SPI and SD 1-bit mode SD card access. Furthermore, it has pushbutton switches, slide switches, VGA output, serial ports and two 40-pin expansion headers [20].

4.6 Camera Module Pixel Array Structure

The TRDB-D5M camera module is used to capture the image of a
18. flow chart below.

[Figure 5.7: Flowchart for Nios II Instruction. Steps: Open Scatter Gather DMA; Initialization (adding pointers to input and output memory); Rearrange memory bytes; Construct descriptors (memory-to-stream descriptor, stream-to-memory descriptor); Start asynchronous data transfer; Start asynchronous data receive; Apply recognition algorithm using C language; Display whether matched or unmatched]

Our initial task was to read the image from SDRAM and then store the fast Fourier transformed (FFT) image at another location in SDRAM so that the two do not overlap. The SDRAM is connected to the Scatter Gather DMA for asynchronous data transfer. Since the SDRAM has no software-configurable settings and no memory-mapped registers, we programmed the SGDMA using its built-in library routines. For this, we first opened the SGDMA and added pointers to the input and output memory as part of the initialization. Before constructing the descriptors, it was necessary to rearrange the memory blocks. Then we called the two built-in functions and passed the required parameters to them:
alt_avalon_sgdma_construct_mem_to_stream_desc
alt_avalon_sgdma_construct_stream_to_mem_desc
After that, the SGDMA was ready for asynchronous transmit and receive. When the transmit and receive were complete, we compared the resulting values with the existing database for recognition.

After configuring and adding all the components, we finally get a
19. [Figure: Phase Spectrum]

The 2D FFTs are computed using fft2. The image files are imported as uint8, so they should be converted to double arrays before computing the FFTs. The FFT of real, non-even data is complex, so the magnitude and phase of the 2D FFTs should be displayed. The function fftshift is used to shift the quadrants of the FFT so that the lowest frequencies appear in the center of the plot [18].

Looking at the FFT of the above image, it can be seen that most of the energy in the Fourier domain is concentrated in the center of the image, which corresponds to low-frequency data in the image domain, that is, to many gradual changes in the image. The phase of the FFT is hard to interpret and generally looks like noise. However, the phase holds a great deal of the information needed to reconstruct the image. To demonstrate the role of the phase of the FFT, we switched the magnitude and phase of the image. If we want to reconstruct the image, it is necessary to show the magnitude and phase parts separately. However, our project is not concerned with reconstructing the image using the inverse 2D FFT; therefore we have considered the magnitude and phase parts together in a single frame [18].

3.3 Functions used in Matlab

In this section we discuss the functions that have been used in Matlab for recognition, and the results.

Import images:
sdirectory = 'Train Database';
tifffiles = dir([sdi
20. person. The addressing starts from Column 0, Row 0, which is located at the upper right corner of the whole region. The TRDB-D5M pixel array consists of 2,752 columns by 2,004 rows. However, the whole region is not considered active. An array of 2,592 columns by 1,944 rows is considered the active region, including the boundary region. The boundary region is not used to show pictures, in order to avoid edge effects. Moreover, the black region, which surrounds the boundary region, is not used to display any pictures.

[Figure 4.6: Pixel Array Description. The active image of 2592 x 1944 pixels is surrounded by active boundary pixels and by dark pixel regions.]

Pixels are output in a Bayer pattern format consisting of four colors, Green1, Green2, Red and Blue (G1, G2, R, B), representing three filter colors. When no mirror modes are enabled, the first row output alternates between G1 and R pixels, and the second row output alternates between B and G2 pixels. The Green1 and Green2 pixels have the same color filter, but they are treated as separate colors by the data path and analog signal chain.

4.7 I2C Protocol

In the early 1980s, Philips designed the I2C bus. The name is taken from Inter-IC, and it is mostly written as IIC or I2C [21]. It permits simple data communication between components that reside on the same circuit board. It is not as
21. [Figure: Cyclone III logic element diagram]

4.4 Cyclone III FPGA Applications

The Cyclone III FPGAs are the first to implement a complete suite of security features at the silicon, software and IP level on a low-power, high-functionality FPGA platform [20]. Cyclone III FPGAs have the following application areas: automotive, consumer, displays of all sizes, industrial, military, video and image processing, and wireless communications.

4.5 Altera Cyclone III 3C16 FPGA device

The DE0 board has many features that allow the user to implement a wide range of designs, from simple circuits to various multimedia projects. The DE0 has:
• Altera Cyclone III 3C16 FPGA device
• Altera Serial Configuration device (EPCS4)
• USB Blaster
• 8-Mbyte SDRAM
• 4-Mbyte Flash memory
• SD Card socket
• 3 pushbutton switches
• 10 toggle switches
• 10 green user LEDs
• 50 MHz oscillator for clock sources
• VGA DAC with VGA-out connector
• RS-232 transceiver
• PS/2 mouse/keyboard connector
• Two 40-pin Expansion Headers [20]

[Figure 4.4: DE0 FPGA Specifications]
22. the Bridge. It is possible to specify an address range of 1, 2, 4, 8, 16, 32, 64, 128, 256, 512 or 1024 in either bytes, kilobytes (KB) or megabytes (MB).

5.2 SDRAM controller

The SDRAM controller allows designers to create custom systems in an Altera device that connect easily to SDRAM chips. It connects to one or more SDRAM chips and handles all SDRAM protocol requirements [23].

[Figure: SDRAM controller with Avalon interface block diagram]

Avalon-MM interface: The Avalon-MM slave port is the user-visible part of the SDRAM controller core. The slave port presents a flat, contiguous memory space as large as the SDRAM chip(s). The Avalon-MM interface behaves as a simple memory interface. There are no memory-mapped configuration registers.

Signal Timing and Electrical Characteristics: The timing and sequencing of signals depend on the configuration of the core. The hardware designer configures the core to match the SDRAM chip chosen for the system.

The SDRAM controller MegaWizard has two pages: Memory Profile and Timing. These can be configured using the Custom option, or we could use any of the several pr
23. www.altera.com/literature/ds/ds_nios2_perf.pdf
[27] www.altera.com
[28] ftp://ftp.altera.com/up/pub/Altera_Material/9.1/University_Program_IP_Cores/Audio_Video/Video.pdf
[29] http://www.altera.com/literature/ug/ug_fft.pdf
[30] Altera Corporation (2011, May). Nios II Software Developer's Handbook. U.S.A.
[31] http://blog.goo.ne.jp
[32] Srinivasulu Asadi, Dr. Ch. D. V. Subba Rao and V. Saikrishna (November 2010). A Comparative Study of Face Recognition with Principal Component Analysis and Cross-Correlation Technique. International Journal of Computer Applications, 10(8), 0975-8887.
[33] Abhishek Kesh, Rachit Gupta, Siddharth S. Seth and T. Anish (August 2004-05). Implement of Fast Fourier Transform (FFT) on FPGA using Verilog HDL. An Advanced VLSI Design Lab Term Project, AVDL 76, Kharagpur, India.
[34] external_bus_to_avalon_bridge.pdf
[35] Face Recognition On FPGA, Final year project.
[36] Eigenfaces for recognition, MIT.

Chapter: Appendix

8.1 FFT Matlab code

clc
clear all
sdirectory = 'Train Database';
tifffiles = dir([sdirectory '/*.jpg']);
% length(tifffiles)
I = cell(1, numel(tifffiles));
for k = 1:length(tifffiles)
    filename = [sdirectory '/' tifffiles(k).name];
    I{k} = imread(filename);
    Rb = imresize(I{k}, [50 50]);
    J = rgb2gray(Rb);
    fftb = fft2(J);
    figure, imshow(I{k})
    imshow(Rb)
    figure, imshow(J)
    figure, imshow(uint8(fftb))
    Absolutevalue_train{k} = abs(fftb);
    mean_train(k) = mean2(Absolutevalue_train{k});
24. [Figure 5.1: Block diagram of our proposed architecture]

The hardware architecture we proposed is as follows. An image is captured by the FPGA-board-compatible camera module, the TRDB-D5M. The output pixels, or raw data, are in Bayer color pattern. Therefore the data is passed through a Bayer color pattern to 30-bit RGB (red, green, blue) module. Once the pixels are in RGB, they are converted to grayscale to reduce the number of planes, in this case from 3 planes to 1 plane, which reduces the complexity of data manipulation. This grayscale data, the captured image, is then stored in memory (SDRAM) with the assistance of the external bus bridge and the SDRAM controller. At this stage, to verify that the data is actually stored in the SDRAM, we can include the VGA controller and the Video-In decoder and display the data on an LCD monitor. Once the image is stored in the SDRAM, the data is read from the SDRAM through the Scatter-Gather DMA (direct memory access) controller and passed to the FFT (Fast Fourier Transform) block. The output, that is, the data after the FFT, is stored in SDRAM again, this time at a different memory location so that both sets of data are kept. At this stage another DMA controller is used to transfer these data and access the SDRAM. Then both sets of data are now
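25. 3, 2, 34, 2, 24, 32, 100, 90, 2, 24, 56, 45, 4, 21, 5, 43, 45, 34, 23

    /* magnitude of every FFT output value */
    for (int i = 0; i < 4; i++) {
        for (int p = 0; p < 8; p++) {
            absvalue[i][p] = abs(fft_value[i][p]);
            printf("abs_value=%d\n", absvalue[i][p]);
        }
    }

    /* mean magnitude of each image */
    for (int i = 0; i < 4; i++) {
        for (int p = 0; p < 8; p++) {
            temp[p] = absvalue[i][p];
            printf("temp_value=%d\n", temp[p]);
        }
        meancalc[i] = mean(8, temp);
        printf("meancalc=%d\n", meancalc[i]);
    }

    /* difference between image 0 and every image */
    for (int g = 0; g < 4; g++) {
        diff[g] = meancalc[0] - meancalc[g];
        printf("diff=%d\n", diff[g]);
    }

    /* smallest and largest absolute difference over images 1..3 */
    max = abs(diff[1]);
    min = abs(diff[1]);
    for (int t = 2; t < 4; t++) {
        if (max < abs(diff[t])) max = abs(diff[t]);
        if (min > abs(diff[t])) min = abs(diff[t]);
    }
    printf("min=%d\n", min);

    if (min < threshold) {
        printf("Image is matched");
    } else {
        printf("Image DO NOT Matched");
    }
    }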
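The matching rule used by the listing above (mean magnitude per image, absolute difference of means, smallest difference compared with a threshold) can be sketched in a few lines. This is a minimal illustration, not the thesis code: the function names, the sample values and the threshold are made up, and the stored images are compared against a separate test spectrum rather than against image 0.

```python
def mean_abs(values):
    """Mean magnitude of a sequence of (real or complex) FFT outputs."""
    return sum(abs(v) for v in values) / len(values)


def match(test_values, train_sets, threshold):
    """Return (index of closest training set, its difference, matched flag).

    The score is the absolute difference between mean magnitudes,
    as in the listing above; a smaller score means a closer image.
    """
    test_mean = mean_abs(test_values)
    diffs = [abs(test_mean - mean_abs(t)) for t in train_sets]
    best = min(range(len(diffs)), key=diffs.__getitem__)
    return best, diffs[best], diffs[best] < threshold


train = [[3, -4, 5, -6], [10, -20, 30, -40]]   # hypothetical spectra
test = [3, 4, -5, 7]
print(match(test, train, threshold=1.0))
```

Because the whole spectrum is reduced to one number per image, the comparison is cheap enough for a soft processor, at the cost of discarding all spatial-frequency detail.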
26. 3: The $M$ eigenvalues of $A^T A$ (along with their corresponding eigenvectors) correspond to the $M$ largest eigenvalues of $A A^T$ (along with their corresponding eigenvectors).

Step 6.3: compute the $M$ best eigenvectors of $A A^T$: $u_i = A v_i$ (important: normalize $u_i$ such that $\|u_i\| = 1$).
Step 7: keep only $K$ eigenvectors (corresponding to the $K$ largest eigenvalues).

Representing faces onto this basis

Each face (minus the mean) $\Phi_i$ in the training set can be represented as a linear combination of the best $K$ eigenvectors:

$\hat{\Phi}_i = \sum_{j=1}^{K} w_j u_j, \quad w_j = u_j^T \Phi_i$

(we call the $u_j$'s eigenfaces). Each normalized training face $\Phi_i$ is represented in this basis by a vector $\Omega_i = [w_1^i, w_2^i, \ldots, w_K^i]^T$.

Face Recognition Using Eigenfaces

Given an unknown face image $\Gamma$ (centered and of the same size as the training faces), follow these steps:

Step 1: normalize $\Gamma$: $\Phi = \Gamma - \Psi$.
Step 2: project on the eigenspace: $\hat{\Phi} = \sum_{i=1}^{K} w_i u_i$, with $w_i = u_i^T \Phi$.
Step 3: represent $\Phi$ as $\Omega = [w_1, w_2, \ldots, w_K]^T$.
Step 4: find $e_r = \min_l \|\Omega - \Omega^l\|$.
Step 5: if $e_r < T_r$, then $\Gamma$ is recognized as face $l$ from the training set.

The distance $e_r$ is called the distance within the face space (difs).

Comment: we can use the common Euclidean distance to compute $e_r$; however, it has been reported that the Mahalanobis distance performs better:

$\|\Omega - \Omega^k\| = \sum_{i=1}^{K} \frac{1}{\lambda_i} (w_i - w_i^k)^2$
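The recognition steps (subtract the mean face, project onto the eigenfaces, pick the nearest training vector, apply a threshold) can be sketched as follows. This is only an illustration under assumptions: the eigenfaces are taken as already computed and orthonormal, images are flat lists of pixel values, plain Euclidean distance is used, and all names and numbers are hypothetical.

```python
import math


def project(face, mean_face, eigenfaces):
    """Steps 1-3: return the weight vector [w_1, ..., w_K] of a face."""
    phi = [g - m for g, m in zip(face, mean_face)]           # face minus mean
    return [sum(u_i * p for u_i, p in zip(u, phi)) for u in eigenfaces]


def recognize(face, mean_face, eigenfaces, train_weights, threshold):
    """Steps 4-5: nearest training weight vector, accepted below threshold.

    Returns (index or None, distance), using Euclidean distance.
    """
    omega = project(face, mean_face, eigenfaces)
    dists = [math.dist(omega, w) for w in train_weights]
    best = min(range(len(dists)), key=dists.__getitem__)
    return (best if dists[best] < threshold else None), dists[best]


# Toy data: 4-pixel "images" and two orthonormal eigenfaces.
mean_face = [1, 1, 1, 1]
eigenfaces = [[0.5, 0.5, 0.5, 0.5], [0.5, -0.5, 0.5, -0.5]]
train = [[3, 1, 3, 1], [1, 3, 1, 3]]
train_weights = [project(f, mean_face, eigenfaces) for f in train]
print(recognize([3, 1, 3, 2], mean_face, eigenfaces, train_weights, 1.0))
```

Swapping `math.dist` for a distance that divides each squared term by the corresponding eigenvalue would give the Mahalanobis variant mentioned in the comment above.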
27. Design and VLSI Implementation of High Performance Face Recognition System

A report submitted to the Department of Electrical & Electronic Engineering, BRAC University, in partial fulfillment of the requirements for thesis work.

BRAC UNIVERSITY

Priyanka Das Dewan (10221078)
Tasnim Harun Shamma (09221032)
Afifa Abbas (10221073)
Raktim Kumar Mondol (09221232)

April 2013

Declaration

We do hereby declare that the thesis titled "Design and VLSI Implementation of High Performance Face Recognition System" is submitted to the Department of Electrical and Electronics Engineering of BRAC University in partial fulfillment of the Bachelor of Science in Electronics and Electrical Engineering. This is our original work and has not been submitted elsewhere for the award of any other degree or any other publication.

Date:
Supervisor: Professor Dr. A. B. M. Harun-Ur-Rashid
Priyanka Das Dewan, Student ID 10221078, priyanka bracu gmail com
Tasnim Harun Shamma, Student ID 09221032, tasnim h shamma gmail com
Afifa Abbas, Student ID 10221073, afifa abbas 118 gmail com
Raktim Kumar Mondol, Student ID 09221232, raktim live gmail com

Abstract

In this paper we propose a novel hardware architecture for a face recognition system. To make the system cost-effective, we use a simple yet efficient face recognition algorithm. We have designed, implemented and verified the algor
28. LL (Phase-Locked Loop) is used to adjust the phase of the SDRAM clock so that its edges occur in the middle of the valid window. Tuning the PLL might require trial and error to align the phase shift to the properties of the target board, but usually the phase shift is 3 ns for a 50 MHz clock and 1.5 ns for a 100 MHz clock [23][25]. The PLL that we select from Qsys depends on the device family; for ours, three kinds are available. We have chosen ALTPLL for our Cyclone III family.

Example Calculation

Table 5.2: PLL calculations

Timing parameters for the Micron MT48LC4M32B2 SDRAM device (value in ns, -7 speed grade). Parameters: access time from CLK (pos. edge), address hold time, address setup time, CLK high-level width, CLK low-level width, clock period, CKE hold time, CKE setup time, data-in hold time, data-in setup time, data-out high-impedance time, data-out low-impedance time, data-out hold time.

FPGA I/O timing parameters (value in ns):
Minimum clock-to-output time: 2.399
Maximum clock-to-output time: 2.477
Maximum hold time after clock: -5.607
Maximum setup time before clock: 5.936

The SDRAM clock can lag the controller clock by the lesser of Read Lag or Write Lag:

Read Lag = tOH(SDRAM) - tH_MAX(FPGA) = 2.5 ns - (-5.607 ns) = 8.107 ns
or
Write Lag = tCLK - tCO
29. ...
2.1.5 Neural Network
2.2 Fast Fourier Transform (FFT)
2.3 How does FFT work
2.4 FFT Algorithms
2.4.1 FFT Implementation in NIOS II using Cooley-Tukey algorithm
2.5 Applying FFT on an image
2.6 Applications
2.7 Benefits of using FPGA
Chapter 3 MATLAB Implementation
3.1 ...
3.2 Two dimensional FFT on an image
3.3 Functions used in MATLAB
Chapter 4 DE0 board and TRDB-D5M Specifications
4.1 Introduction to FPGA
4.2 Cyclone III FPGA Architecture
4.3 ...
4.4 Cyclone III FPGA Applications
30. T of size 32 is broken into 2 FFTs of size 16, which are broken into 4 FFTs of size 8, which are broken into 8 FFTs of size 4, which are broken into 16 FFTs of size 2 [10].

The number of complex multiplication and addition operations required by the simple forms of both the Discrete Fourier Transform (DFT) and the Inverse Discrete Fourier Transform (IDFT) is of order $N^2$, as there are $N$ data points to calculate, each of which requires $N$ complex arithmetic operations. For a length-$n$ input vector $x$, the DFT is a length-$n$ vector $X$ with elements

$X_k = \sum_{j=0}^{n-1} x_j \, e^{-2\pi i jk/n}, \quad k = 0, \ldots, n-1$

Evaluated directly, the DFT therefore has $O(N^2)$ complexity and is not a very efficient method; it is of limited use for the majority of practical DSP applications. However, there are a number of different Fast Fourier Transform (FFT) algorithms that compute the same transform of a signal much faster than direct evaluation of the DFT.

2.3 How does FFT work

As discussed earlier, the FFT operates by decomposing an $N$-point time-domain signal into $N$ time-domain signals, each composed of a single point. The second step is to calculate the $N$ frequency spectra corresponding to these $N$ time-domain signals. Lastly, the $N$ spectra are synthesized into a single frequency spectrum [11].

Figure: FFT decomposition of a 16-point signal (1 signal of 16 points, 2 signals of 8 points, and so on down to 16 signals of 1 point).
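The decomposition described above can be made concrete with a short sketch that compares direct evaluation of the DFT with a recursive radix-2 Cooley-Tukey FFT. This is an illustrative textbook form, not the Nios II implementation of the thesis; the function names are hypothetical and the input length is assumed to be a power of two.

```python
import cmath


def dft(x):
    """Direct DFT: N outputs, each a sum of N terms, so O(N^2) work."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def fft(x):
    """Recursive radix-2 decimation-in-time FFT, O(N log N) work."""
    n = len(x)
    if n == 1:                      # a 1-point signal is its own spectrum
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])             # spectrum of even-indexed samples
    odd = fft(x[1::2])              # spectrum of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]   # twiddle factor
        out[k] = even[k] + t        # butterfly: combine the two half spectra
        out[k + n // 2] = even[k] - t
    return out


signal = [1, 2, 3, 4, 0, 0, 0, 0]
error = max(abs(a - b) for a, b in zip(fft(signal), dft(signal)))
print(error < 1e-9)
```

Each level of recursion halves the problem, which is the "1 signal of 16 points, 2 signals of 8 points" splitting from the text; the butterfly loop is the synthesis step that merges the spectra back together.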
31. $u_1, u_2, \ldots, u_K$ is a basis of the $K$-dimensional space. Note: if both bases have the same size ($N = K$), then $x = \hat{x}$.

Methodology

Suppose $x_1, x_2, \ldots, x_M$ are $N \times 1$ vectors.

Step 1: compute the mean $\bar{x} = \frac{1}{M} \sum_{i=1}^{M} x_i$.
Step 2: subtract the mean: $\Phi_i = x_i - \bar{x}$.
Step 3: form the matrix $A = [\Phi_1 \; \Phi_2 \; \cdots \; \Phi_M]$ ($N \times M$ matrix), then compute $C = \frac{1}{M} \sum_{n=1}^{M} \Phi_n \Phi_n^T = \frac{1}{M} A A^T$ (the sample covariance matrix, $N \times N$), which characterizes the scatter of the data.
Step 4: compute the eigenvalues of $C$: $\lambda_1 > \lambda_2 > \cdots > \lambda_N$.
Step 5: compute the eigenvectors of $C$: $u_1, u_2, \ldots, u_N$. Since $C$ is symmetric, $u_1, u_2, \ldots, u_N$ form a basis, i.e. any vector $x$ (or actually $x - \bar{x}$) can be written as a linear combination of the eigenvectors: $x - \bar{x} = b_1 u_1 + b_2 u_2 + \cdots + b_N u_N = \sum_{i=1}^{N} b_i u_i$.
Step 6 (dimensionality reduction step): keep only the terms corresponding to the $K$ largest eigenvalues: $\hat{x} - \bar{x} = \sum_{i=1}^{K} b_i u_i$, where $K \ll N$. The representation of $\hat{x} - \bar{x}$ in the basis $u_1, u_2, \ldots, u_K$ is thus $[b_1, b_2, \ldots, b_K]^T$.

Linear transformation implied by PCA

The linear transformation $R^N \to R^K$ that performs the dimensionality reduction is

$[b_1, b_2, \ldots, b_K]^T = [u_1 \; u_2 \; \cdots \; u_K]^T (x - \bar{x}) = U^T (x - \bar{x})$

Geometric Interpretation

PCA projects the data along the directions where the data varies the most. These directions are determined by the eigenvectors of the covariance matrix corresponding to the largest eigenvalues. The magnitude of the Eigen
32. _MAX(FPGA) - tDS(SDRAM) = 20 ns - 2.477 ns - 2 ns = 15.523 ns

The SDRAM clock can lead the controller clock by the lesser of Read Lead or Write Lead:

Read Lead = tCO_MIN(FPGA) - tDH(SDRAM) = 2.399 ns - 1.0 ns = 1.399 ns
or
Write Lead = tCLK - tHZ(3)(SDRAM) - tSU_MAX(FPGA) = 20 ns - 5.5 ns - 5.936 ns = 8.564 ns

Therefore, for this example, the phase of the SDRAM clock can be shifted from -8.107 ns to 1.399 ns relative to the controller clock. Choosing a phase shift in the middle of this window gives (-8.107 + 1.399) / 2 = -3.35 ns. These values are collected from the datasheets of the corresponding devices.

To drive the SDRAM we required a PLL (Phase-Locked Loop). Cyclone series supports only one type of PLL. A phase-locked loop is a control system that generates an output signal whose phase is related to the phase of an input reference signal. PLL circuitry is an electronic circuit consisting of a phase detector and a variable-frequency oscillator. The PLL compares the phase of the input signal with the phase of the signal derived from its output oscillator and adjusts the frequency of its oscillator to keep the phases matched. A PLL can be used to generate stable frequencies, recover signals from a noisy communication channel, or distribute clock signals throughout the design. Usually we chose 3 ns for 50 MHz and 1.5 ns for 100 MHz.

5.4 Scatter-Gather DMA (Direct Memory Access)

The S
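The lag and lead arithmetic of the PLL example can be reproduced with a small sketch. The function and parameter names are hypothetical. One assumption is made loudly: the minus signs were lost in the extracted text, so the FPGA hold time is taken as -5.607 ns, which is the value that reproduces the printed Read Lag of 8.107 ns.

```python
def sdram_phase_window(t_clk, t_oh, t_ds, t_dh, t_hz,
                       t_co_min, t_co_max, t_h_max, t_su_max):
    """Return (earliest shift, latest shift, midpoint) in ns.

    Negative values mean the SDRAM clock lags the controller clock.
    """
    read_lag = t_oh - t_h_max                 # 2.5 - (-5.607)
    write_lag = t_clk - t_co_max - t_ds       # 20 - 2.477 - 2
    read_lead = t_co_min - t_dh               # 2.399 - 1.0
    write_lead = t_clk - t_hz - t_su_max      # 20 - 5.5 - 5.936
    max_lag = min(read_lag, write_lag)        # lesser of the two lags
    max_lead = min(read_lead, write_lead)     # lesser of the two leads
    return -max_lag, max_lead, (-max_lag + max_lead) / 2


window = sdram_phase_window(
    t_clk=20.0, t_oh=2.5, t_ds=2.0, t_dh=1.0, t_hz=5.5,
    t_co_min=2.399, t_co_max=2.477,
    t_h_max=-5.607,   # assumed sign, see the note above
    t_su_max=5.936,
)
print([round(v, 3) for v in window])
```

With the 20 ns period of a 50 MHz clock the midpoint comes out near -3.35 ns, which is consistent with the rule-of-thumb magnitude of 3 ns quoted in the text.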
33. ansform is a global transform, applicable to full images. If we are going to use it to recognize faces, we must consider a local version of it. One of the key properties of the Trace transform is that it can be used to construct features invariant to rotation, translation and scaling. We should point out that invariance to rotation and scaling is harder to achieve than invariance to translation. It is assumed that an object is subjected to linear distortions such as rotation, translation and scaling. This is equivalent to saying that the image remains the same but is viewed from a linearly distorted coordinate system.

2.1.5 Neural Network

A neural network is a system of programs and data structures that approximates the operation of the human brain. A neural network usually involves a large number of processors operating in parallel, each with its own small sphere of knowledge and access to data in its local memory. Typically, a neural network is initially trained, i.e. fed large amounts of data and rules about the data. A program can tell the network how to behave in response to an external stimulus, or the network can initiate activity on its own. The main disadvantages of neural networks are that there is no clear method for finding the initial topology and that training takes a long time. For face recognition, a neural network must be trained to recognize each individual, which is time-consuming and not well suited to real-time applications [9].
34. catter-Gather Direct Memory Access (SG-DMA) controller core implements high-speed data transfer between two components [23]. We can use the SG-DMA controller core to transfer data from:

- Data stream to memory
- Memory to data stream
- Memory to memory

In our project, to transfer data to the FFT block and to store the output data from the FFT block, we used the first two of these three modes. First, we used memory to data stream to read the data from SDRAM as the streaming input of the FFT block. Then we used data stream to memory to pass the output of the FFT block back to the SDRAM.

The SG-DMA controller core transfers and merges non-contiguous memory to a continuous address space, and vice versa. The core reads a series of descriptors that specify the data to be transferred. For applications requiring more than one DMA channel, as in our case, multiple instantiations of the core can provide the desired throughput. Each SG-DMA controller has its own series of descriptors that specify its data transfers. The SG-DMA controller core is Qsys-ready and integrates easily into any Qsys-generated system. Device drivers are provided in the Hardware Abstraction Layer (HAL) system library; if we want to use the Nios II processor, the SG-DMA can be called from the library available in Nios II. Since we have used internal memory (SDRAM), here is an example of how the SG-DMA controller core transfers data betwee
35. ch we had was to create our own small database with images of our university students showing different facial expressions. We then applied some of the recognized algorithms to our database using MATLAB, followed by our proposed algorithm based on the Fast Fourier Transform, to assess the feasibility of using the FFT as a face recognition algorithm. Afterwards we moved to the hardware part, which was our main interest. We used the FPGA board with the digital camera that is compatible with the board. We took images and stored them in the board's SDRAM. We then read these images from memory, applied the FFT to them and wrote the transformed images back to the SDRAM. The next step was to compare the values of the transformed images to verify that our algorithm was working.

Chapter 2

2.1 Algorithms for face recognition

2.1.1 Principal Component Analysis

Principal Component Analysis (PCA) was invented in 1901 by Karl Pearson. The algorithm consists of extracting the relevant information in a face image, called the principal components, and encoding that information in a suitable data structure. For recognition, it takes the sample image, encodes it in the same way and compares it with the set of encoded images. In mathematical terms, we want to find the eigenvectors and eigenvalues of a covariance matrix of images, where one image is just a single point in a high-dimensional space of dimension n × n, where n × n are the dimensions of an image. Th
36. ct using the New Project Wizard, available from the File menu in the Quartus II software.
2. Launch the MegaWizard Plug-In Manager from the Tools menu and select the option to create a new custom megafunction variation.

We then parameterized the core according to our purpose. In our project we chose Streaming as the input/output data flow. To set up simulation from the IP Toolbench, Step 2 (Set Up Simulation) is selected and Generate Simulation Model is turned on. The language we used was Verilog HDL. To generate the MegaCore we then selected Generate from the IP Toolbench, which produces a list of files. After reviewing the generation report, we click Yes on the Quartus II IP Files prompt to add the .qip file to the current Quartus II project.

5.8 Creating FFT block in Qsys

In our project, for easier manipulation and interconnectivity, we turned this MegaCore function into a Qsys IP component in a few steps. MegaCore functions can be included as new components in the Qsys IP library. The steps are given below:

o The FFT MegaCore function is not supported by Qsys, so the MegaWizard Plug-In Manager is launched from the Tools menu of Quartus II.
o In the MegaWizard Plug-In Manager, a new custom megafunction variation is selected.
o On the next page, the FFT megafunction is created.
o In the MegaCore function we clicked Parameterize.
o In Parameters tab sp
37. e 6.2: Recognition Result with FFT-based Algorithm
Figure 6.3: Recognition Result with PCA-based Algorithm

Chapter 1 Introduction

1.1 Background

A facial recognition system is a computer application for automatically identifying or verifying a person from a digital image or from a video frame taken from a video source. There are therefore two types of approach to face recognition: image-based and video-based. Systems are now also classified as partially automated or fully automated. Face recognition has become a popular area of research in image analysis and understanding, and in computer vision as well. The topic has attracted computer science researchers, neurologists and psychologists. In our case, face recognition means that, given still images of a person, the system can verify and identify one or more persons using a stored database of faces. Many databases have already been created to support research, for example the AT&T face database and the Yale face databases, comprising different poses and illumination conditions. Many universities and institutions have shown interest in image processing and recognition systems from very early on and still aspire to excel in this field. Recognition algorithms are divided into two main categories, or approached in two differe
38. e SDRAM controller refreshes the SDRAM. A typical SDRAM requires 4,096 refresh commands every 64 ms, which can be achieved by issuing one refresh command every 64 ms / 4,096 = 15.625 µs. (This is the description of the setting "Issue one refresh command every".)

Settings and descriptions:
Delay after power up, before initialization: the delay from stable clock and power to SDRAM initialization.
Duration of refresh command (t_rfc): auto refresh period.
Duration of precharge command (t_rp): precharge command period.
ACTIVE to READ or WRITE delay (t_rcd): ACTIVE to READ or WRITE delay.
Access time (t_ac): access time from clock edge. This value may depend on CAS latency.
Write recovery time (t_wr, no auto precharge): write recovery if explicit precharge commands are issued. This SDRAM controller always issues explicit precharge commands.

Table 5.1: Descriptions of SDRAM parameters

There are issues related to synchronizing signals from the SDRAM controller core with the clock that drives the SDRAM chip. During SDRAM transactions, the address, data and control signals are valid at the SDRAM pins for a small window of time, and during this time the SDRAM clock must toggle to capture the correct values. At slower clock frequencies the clock naturally falls within the valid window, but at higher frequencies the SDRAM clock must be compensated to align with the valid window. This is usually done either by calculation or by analyzing the SDRAM pins with an oscilloscope.

5.3 PLL

A P
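The refresh arithmetic quoted in the SDRAM settings (4,096 refresh commands every 64 ms) can be checked with a short sketch, which also converts the interval into controller clock cycles. The function names are hypothetical, and the 50 MHz clock in the example is only the board clock mentioned elsewhere in the text, used here as an assumption.

```python
def refresh_interval_us(retention_ms=64.0, refresh_commands=4096):
    """Time between refresh commands, in microseconds."""
    return retention_ms * 1000.0 / refresh_commands


def refresh_interval_cycles(clock_hz, retention_ms=64.0, refresh_commands=4096):
    """Whole controller clock cycles between refresh commands.

    Rounded down so that refreshes are never issued too rarely.
    """
    return int(clock_hz * retention_ms / 1000.0 / refresh_commands)


print(refresh_interval_us())                   # microseconds per refresh
print(refresh_interval_cycles(50_000_000))     # cycles at an assumed 50 MHz
```

Rounding the cycle count down means the controller refreshes slightly more often than strictly required, which is the safe direction for data retention.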
39. e objective of our project is to work with a still-image-based algorithm and to implement it on a Cyclone III FPGA chip from Altera Inc. The Cyclone chip is relatively cheap and includes ROM. The DE0 board has been chosen as the tool for the debugging process. We have emphasized the use of the FFT: since FPGA implementation is itself a huge challenge, we start with a simpler function, the FFT (Fast Fourier Transform). We have used the Cooley-Tukey algorithm for the FFT. In addition, we have followed a hardware-software co-design approach. Our aim was to do the whole recognition in MATLAB using FFT and PCA, verify that both worked properly, and then compare the algorithms. After that we focused on implementing the FFT on the board first, using the Nios II processor, SDRAM, on-chip memory, DMA blocks, etc.

1.3 Research Goal

Our research goal is to get acquainted with the FPGA board and learn how to use it. Our goal was also to enter the vast area of image processing; combining these two fields can broaden our knowledge. One of the prime concerns of our research is to start with a simpler algorithm in order to confirm that it is possible to implement other algorithms on an FPGA, so that we can work on them in future. An FPGA is itself a complex device, so we could not take our goal as far as the benchmark. Hopefully we will learn from our mistakes and can go on to further algorithms.

1.4 Problem Formulation

The first approa
40. ecified the size of the FFT and the target device family.
o In the Architecture tab, specified the I/O data flow. This time we chose Streaming.
o When finished, clicked Generate on the FFT MegaCore function screen.
o The MegaCore function is generated and added to the Quartus II project through the fft.qip file.
o To bring the FFT MegaCore function generated by the MegaWizard into Qsys, a wrapper module is created.
o To add the FFT MegaCore function as a new Qsys component, clicked New Component from the Component Library tab.
o In the HDL Files tab of the Component Editor, wrapper.v is added as the top-level module.
o Used the Signals tab to set the signal type and interface of each signal.
o Then opened the Interfaces tab. The error "Master has no read or write interface" is eliminated by clicking the Remove Interfaces with No Signals button.
o The error "Interface must have an associated reset" was resolved by choosing the appropriate reset signal from the Associated Reset pull-down menu.
o When a component is successfully generated, it is added as a new component to the library. We added this FFT to the system.
o Then the other connections are made according to our project's needs [31].

5.9 On-chip Memory (RAM or ROM)

Altera FPGAs include on-chip memory blocks that can be used as RAM or ROM in Qsys systems. On-chip memory has the following benefits for Qsys systems: On-chip memory has fast access time compared to
41. edefined SDRAM configurations are provided for the case in which the SDRAM subsystem on the target board (DE0 in our case) matches one of the preset configurations. Some of the preset configurations are for:

- Micron MT8LSDT1664HG module
- Four SDR100 8 MByte x 16 chips
- Single Micron MT48LC2M32B2-7 chip
- Single Micron MT48LC4M32B2-7 chip
- Single NEC D4564163-A80 chip (64 MByte x 16)
- Single Alliance AS4LC1M16S1-10 chip
- Single Alliance AS4LC2M8S0-10 chip

However, we configured the core ourselves, in the way that was appropriate for our SDRAM subsystem. The Memory Profile page allows one to indicate the structure of the SDRAM subsystem, such as the address and data bus widths, the number of chip-select signals and the number of banks.

Settings (allowed values, default value) and descriptions:
Data Width (default 32): SDRAM data bus width. This value determines the width of the dq bus (data) and the dqm bus (byte-enable).
Architecture Settings, Chip Selects (allowed 1, 2, 4, 8): number of independent chip selects in the SDRAM subsystem. By using multiple chip selects, the SDRAM controller can combine multiple SDRAM chips into one memory subsystem.
Architecture Settings, Banks: number of SDRAM banks. This value determines the width of the ba bus (bank address) that connects to the SDRAM. The correct value is provided in the data sheet for the target SDRAM.
Address Width Settings, Row (allowed 11, 12, 13, 14): number of row address bits. This value determines the width of the addr bus. The Row and Column values depend on the geometry of t
42. em until the resulting composite wave gets closer and closer to the actual profile of the original image. Eventually, by adding enough waves, we can exactly reproduce the original image. It can therefore be said that images are nothing but summations of sine and cosine waves. In other words, by adding together a sufficient number of sine waves of the right frequency and amplitude, any fluctuating pattern can be reproduced. The Fourier transform finds the waves that make up an image [14].

The Fast Fourier Transform is an important image processing tool used to decompose an image into its sine and cosine components. The output of the FFT represents the image in the frequency domain, while the input image is the spatial-domain (or time-domain) equivalent. In the Fourier-domain image, each point represents a particular frequency contained in the spatial-domain image. The FFT can be used when we want to access the geometric characteristics of a spatial-domain image: because the image in the Fourier domain is decomposed into its sinusoidal components, it is easy to examine or process particular frequencies of the image that influence the geometric structure in the spatial domain [14].

In most implementations the Fourier image is shifted so that the DC value, i.e. the image mean, is displayed in the center of the image. The further away from center of an image point is the higher is its co
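The statement that the DC value carries the image mean and is shifted to the center of the Fourier image can be illustrated with a tiny sketch. It uses a direct two-dimensional DFT on a toy image rather than an FFT, purely for brevity; the function names and the 2 x 2 image are made up for illustration.

```python
import cmath


def dft2(img):
    """Direct 2-D DFT of a small image given as a list of rows."""
    rows, cols = len(img), len(img[0])
    return [[sum(img[r][c] * cmath.exp(-2j * cmath.pi * (u * r / rows + v * c / cols))
                 for r in range(rows) for c in range(cols))
             for v in range(cols)]
            for u in range(rows)]


def center_dc(spec):
    """Circularly shift a spectrum so index (0, 0) moves to the center."""
    rows, cols = len(spec), len(spec[0])
    return [[spec[(r - rows // 2) % rows][(c - cols // 2) % cols]
             for c in range(cols)]
            for r in range(rows)]


image = [[1, 2], [3, 4]]
spectrum = dft2(image)
shifted = center_dc(spectrum)
print(abs(spectrum[0][0]), abs(shifted[1][1]))
```

The term at index (0, 0) is the plain sum of all pixels, i.e. the image mean multiplied by the number of pixels, and after the shift the same value sits at the center position.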
43.     end

    sdirectory1 = 'Test Database';
    tifffiles1 = dir(fullfile(sdirectory1, '*.jpg'));
    % length(tifffiles1)
    Q = cell(1, numel(tifffiles1));
    for R = 1:length(tifffiles1)
        filename1 = fullfile(sdirectory1, tifffiles1(R).name);
        Q{R} = imread(filename1);
        Rb1 = imresize(Q{R}, [50 50]);
        J1 = rgb2gray(Rb1);
        fftb_1 = fft2(J1);
        figure, imshow(Q{R}); imshow(Rb1);
        figure, imshow(J1);
        figure, imshow(uint8(abs(fftb_1))); % show the magnitude, imshow needs real data
        Absolutevalue_test{R} = abs(fftb_1);
        mean_test(R) = mean2(Absolutevalue_test{R});
    end

    % difference between every training mean and the test mean
    for i = 1:1:length(tifffiles)
        j = 1;
        difference(i) = mean_train(i) - mean_test(j);
        axxx(i) = abs(difference(i));
    end
    t = min(axxx);

    % to show the equivalent image
    for us = 1:1:length(tifffiles)
        post = min(axxx);
        compare = axxx(us);
        if post == compare
            counter = us;
        else
        end
    end

    if t < 21   % threshold level
        disp('Matched');
        percentage = ((21 - t) / 21) * 100;
        figure, imshow(Q{R}); title([num2str(percentage) '% Matched']);
        figure, imshow(I{counter}); title('Equivalent Database Image');
    else
        disp('Not Matched');
        percentage = ((t - 21) / t) * 100;
        figure, imshow(Q{R}); title([num2str(percentage) '% Deviated. Not Matched']);
    end

8.2 PCA based Face Recognition Matlab code (Collected)

    % A sample script which shows the usage of functions included in
    % PCA-based face recognition system (Eigenface method)
    % See also: CREATEDATABASE, EIGENFACECORE, RECOGNITION
    clear all
    clc
    close all
    % You can customize and fix initial directory paths
    TrainDatabasePath = uig
44. ere can be many eigenvectors for a covariance matrix, but very few of them are the principal ones. Each eigenvector can be used to capture a different amount of variation among the face images. However, we concentrate only on the principal eigenvectors, because these account for substantial variation among a set of images and show the most significant relationships between the data dimensions. The eigenvectors with the highest eigenvalues are the principal components of the image set. We may lose some information if we ignore components of lesser significance, but if their eigenvalues are small we will not lose much. Using this set of eigenvectors we can construct eigenfaces.

The goal of PCA is to reduce the dimensionality of the data while retaining as much as possible of the variation present in the original dataset. PCA allows us to compute a linear transformation that maps data from a high-dimensional space to a lower-dimensional sub-space [1][2]:

$b_1 = t_{11} a_1 + t_{12} a_2 + \cdots + t_{1N} a_N$
$b_2 = t_{21} a_1 + t_{22} a_2 + \cdots + t_{2N} a_N$
$\vdots$
$b_K = t_{K1} a_1 + t_{K2} a_2 + \cdots + t_{KN} a_N$

or $y = Tx$, where $T = \begin{bmatrix} t_{11} & t_{12} & \cdots & t_{1N} \\ t_{21} & t_{22} & \cdots & t_{2N} \\ \vdots & & & \vdots \\ t_{K1} & t_{K2} & \cdots & t_{KN} \end{bmatrix}$

Lower dimensionality basis

Approximate vectors by finding a basis in an appropriate lower-dimensional space.

(1) Higher-dimensional space representation: $x = a_1 v_1 + a_2 v_2 + \cdots + a_N v_N$, where $v_1, v_2, \ldots, v_N$ is a basis of the $N$-dimensional space.
(2) Lower-dimensional space representation: $\hat{x} = b_1 u_1 + b_2 u_2 + \cdots + b_K u_K$
45. erface to update or read the contents of the memory from your host PC. That is, the on-chip memory contents can be viewed using this feature.

5.10 Nios II Processor

The Nios II processor is a resourceful and versatile embedded processor. Like any other processor, it interprets program instructions and processes data, makes the appropriate services available to other parts of the system, presents user interfaces and interprets user input [26][27].

Figure 5.4: Nios II processor

This processor is the most widely used soft processor in the FPGA industry. The Nios II processor offers flexibility and performance for cost-sensitive, real-time, ASIC-optimized, safety-critical and applications-processing needs. Nios II comprises three configurable cores, from which one is selected on the basis of the individual design's needs:

- Nios II/f: The Nios II/f (fast) processor is designed for superior performance and presents the most configuration options, which are unavailable in the other Nios II processors.
- Nios II/s: The Nios II/s (standard) processor is designed for small size while maintaining fair performance.
- Nios II/e: The Nios II/e (economy) processor is designed for the smallest possible processor size while providing sufficient performance.

A summary of the features supported by the Nios II processor is listed below:

- MMU (memory management unit)
- Memory protection unit (MPU)
- External Vector Inter
46. ess bandwidth used and offer a wider range of choices for compression methods. It is also possible to implement several vision detection algorithms in software: already implemented algorithms can be cross-compiled for this TERASIC TRDB-D5M camera system. Other hardware or software applications can be implemented on this system, since the hardware present on the FPGA can be changed or extended with HDL modules and the operating system allows easy software development that takes the system constraints into account.

Another way in which the present implementation can be improved is by changing the input/output process. The input/output block remains idle while processing is going on: we cannot enter a new set of data until the entered set has been completely computed. The proposed architectural modification ensures that the input and output blocks do not stay idle while the computation of one set is going on. This leads to a kind of pipelined input/output architecture for the whole block.

6.5 Conclusion

Face recognition is biometric identification by scanning a person's face and matching it against a library of known faces. The end result of this project focuses on developing a face recognition system on an FPGA. An advantage of developing this system on an FPGA was the ability to update the functionality or correct any error by re-programming the FPGA with a new version of the system. This system is targeted for access contr
47. etdir('D:\Program Files\MATLAB\R2006a\work', 'Select training database path');
    TestDatabasePath = uigetdir('D:\Program Files\MATLAB\R2006a\work', 'Select test database path');
    prompt = {'Enter test image name (a number between 1 to 10):'};
    dlg_title = 'Input of PCA-Based Face Recognition System';
    num_lines = 1;
    def = {'1'};
    TestImage = inputdlg(prompt, dlg_title, num_lines, def);
    TestImage = strcat(TestDatabasePath, '\', char(TestImage), '.jpg');
    im = imread(TestImage);
    % Time counting
    start_time = cputime;
    T = CreateDatabase(TrainDatabasePath);
    [m, A, Eigenfaces] = EigenfaceCore(T);
    OutputName = Recognition(TestImage, m, A, Eigenfaces);
    SelectedImage = strcat(TrainDatabasePath, '\', OutputName);
    SelectedImage = imread(SelectedImage);
    % Time result
    Time = cputime - start_time
    imshow(im); title('Test Image');
    figure, imshow(SelectedImage); title('Equivalent Image');
    str = strcat('Matched image is : ', OutputName);
    disp(str)

8.3 FPGA Code

8.3.1 Storing data from camera module to SDRAM

    always @(posedge CLOCK_50 or negedge DLY_RST_2)
    begin
        if (!DLY_RST_2)
        begin
            write <= 0;
        end
        else
        begin
            if (write)
            begin
                if (acknowledge)
                begin
                    write <= 0;
                end
            end
            else
            begin
                if (button_pressed == 0) // used twice
                begin
                    if (inside_active_frame)
                    begin
                        if (new_pixel)
                        begin
                            write <= 1;
                        end
                    end
                end
            end
        end
    end

    always @(posedge CLOCK_50 or negedge DLY_RST_2)
    begin
        if (!DLY_RST_2)
        begin
            inside_active_frame = 0;
48. famous as USB or Ethernet, but many electronic devices depend on the I2C protocol. It is unique in its use of special combinations of signal conditions and changes. It needs only 2 signals (bus lines) for serial communication: one is the clock and the other is the data. The clock is known as SCL or SCK (serial clock) and the data is known as SDA. Over the I2C protocol, certain registers are used for common settings: resolution, frame rate, LVAL, FVAL, exposure time, green gain, red gain and blue gain.

4.8 Camera Image Acquisition System

When the FPGA is powered up, the system initializes the sensor chip and determines the mode of operation; the values of certain registers in the image sensor control the corresponding parameters [22]. From the following figure it can be seen that FVAL (frame valid) is the vertical synchronization signal and LVAL (line valid) is the horizontal reference signal. PIXCLK is the pixel output synchronization signal. When the FVAL signal goes high, the system starts sending out image data: LVAL goes high 960 times (once per row) while FVAL is high, and each time it is high 1280 columns of data are sent out. One frame with a resolution of 1280 x 960 has been collected completely when the next rising edge of FVAL arrives.

Figure 4.7: Default Pixel Output Timing (FVAL, LVAL, valid image data, horizontal blank, vertical blank)

4.8.1 Frame Valid

This hardware pin is asserted during the total number of active rows in the image. This
49. gic elements (LEs) and a hierarchy of reconfigurable interconnects that allow the LEs to be physically connected. We can configure LEs to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory. In addition, newer FPGA families are being developed with hard embedded processors, transforming the devices into systems on a chip (SoC) [20].
The advantages of using FPGAs over ASICs and ASSPs include:
- Rapid prototyping
- Shorter time to market
- The ability to re-program in the field for debugging
- Lower NRE costs
- Long product life cycle to mitigate obsolescence risk

4.2 Cyclone III FPGA Architecture
[Figure 4.1: Cyclone III Device Architecture Overview]
[Figure 4.2: Cyclone III FPGA Floorplan (phase-locked loops, M9K memory blocks, logic array, embedded 18-bit x 18-bit multipliers, side I/O cells with LVDS signals up to 875 Mbps, top and bottom I/O cells for memory interfaces up to 400 Mbps)]
Cyclone III FPGAs offer low power, high functionality and low cost. The 65nm architecture co
50. he TRDB-D5M to connect with the Avalon Switch Fabric as a master device. The Bridge creates a bus-like interface to which one or more master peripherals can be connected [34].

[Figure 5.3: External bus to Avalon bridge (clock, avalon_address, avalon_writedata, avalon_read, avalon_write, avalon_byteenable, avalon_waitrequest, avalon_readdata)]

The bus signals provided are:
1. Address: k bits (up to 32)
2. Read: 1 bit
3. Write: 1 bit
4. Byte Enable: 16, 8, 4, 2 or 1 bit
5. Write Data: 128, 64, 32, 16 or 8 bits
6. Read Data: 128, 64, 32, 16 or 8 bits
7. Acknowledge: 1 bit

The bus is synchronous: all bus signals must be read by the master peripheral on the rising edge of the clock. A bus transfer happens when either Write or Read is high. For our project, we coded the design so that the bridge only performs write commands, because we want to write the data from the camera to the SDRAM.

[Figure: External Bus to Avalon Bridge with Nios II system (external master peripheral, ByteEnable, WriteData, ReadData, Acknowledge)]

Two parameters need to be specified in the Qsys External Bridge to Avalon core:
1. Data Width: the number of data bits involved in a transfer. The Bridge supports data widths of 8, 16, 32, 64 and 128 bits.
2. Address Range: the addressable space supported by
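The write side of this bus can be pictured as a small handshake: Write is raised when a new pixel is available and held until Acknowledge is seen. The following is a minimal Python sketch of that behaviour, stepped once per clock edge. It is only an illustrative model; the function name and the signal lists are assumptions, with the signal names borrowed from the Verilog fragments in the thesis.

```python
def run_write_handshake(new_pixel, acknowledge, inside_active_frame=True):
    """Model the Write register of the bridge, one list entry per clock edge.

    new_pixel and acknowledge are equal-length lists of 0/1 input samples.
    Returns the value of Write after each clock edge.
    """
    write = 0
    trace = []
    for pixel, ack in zip(new_pixel, acknowledge):
        if write:
            # A transfer is in progress: hold Write until it is acknowledged.
            if ack:
                write = 0
        elif inside_active_frame and pixel:
            # Idle and a new pixel arrived inside the active frame: start a write.
            write = 1
        trace.append(write)
    return trace


if __name__ == "__main__":
    # Write rises on the first edge, is held, then drops once acknowledged.
    print(run_write_handshake([1, 0, 0, 0], [0, 0, 1, 0]))
```

Holding Write until Acknowledge arrives is what keeps a pixel from being dropped when the SDRAM side is momentarily busy.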
51. he binary numbers are the reversals of each other. For example, sample 3 (0011) is exchanged with sample 12 (1100). Likewise, sample 14 (1110) is swapped with sample 7 (0111), and so forth. The FFT time domain decomposition is usually carried out by a bit reversal sorting algorithm. This involves rearranging the order of the N time domain samples by counting in binary with the bits flipped left for right.
The next step in the FFT algorithm is to find the frequency spectra of the 1-point time domain signals. The frequency spectrum of a 1-point signal is equal to itself, so nothing needs to be done in this step. Each of the 1-point signals is now a frequency spectrum rather than a time domain signal.
The last step in the FFT is to combine the N frequency spectra in the exact reverse order of the time domain decomposition. The algorithm gets messy here: the bit reversal shortcut does not apply, and we must go back one stage at a time. In the first stage, 16 frequency spectra (1 point each) are synthesized into 8 frequency spectra (2 points each). In the second stage, the 8 frequency spectra (2 points each) are synthesized into 4 frequency spectra (4 points each), and so on. The last stage produces the output of the FFT, a 16-point frequency spectrum.
[Figure: FFT synthesis of time domain and frequency domain signals]
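The bit reversal sorting described above can be sketched in a few lines of Python. This is an illustrative sketch, not the code used in the project; the function names are chosen here for the example.

```python
def bit_reverse(index, bits):
    """Return index with its lowest `bits` bits flipped left for right."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # move the lowest bit of index into result
        index >>= 1
    return result


def bit_reversal_order(n):
    """Sample order after bit reversal for an n-point signal (n a power of two)."""
    if n <= 0 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    bits = n.bit_length() - 1
    return [bit_reverse(i, bits) for i in range(n)]


if __name__ == "__main__":
    print(bit_reverse(3, 4), bit_reverse(14, 4))
    print(bit_reversal_order(16))
```

For a 16-point signal each index uses 4 bits, so sample 3 (0011) maps to 12 (1100) and sample 14 (1110) maps to 7 (0111), as in the text.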
52. he chosen SDRAM. For example, an SDRAM organized as 4096 rows by 512 columns has a Row value of 12.

Address Width Settings
Column (allowed values: >= 8 and less than the Row value): Number of column address bits. For example, the SDRAM organized as 4096 rows by 512 columns has a Column value of 9.
Share pins via tri-state bridge (dq, dqm, addr I/O pins): When set to No, all pins are dedicated to the SDRAM chip. When set to Yes, the addr, dq and dqm pins can be shared with a tristate bridge in the system. In this case, select the appropriate tristate bridge from the pull-down menu.
Include a functional memory model in the system testbench: When on, SOPC Builder creates a functional simulation model for the SDRAM chip. This default memory model accelerates the process of creating and verifying systems that use the SDRAM controller. See Hardware Simulation Considerations on page 2-7.

Table 5.1: Descriptions of SDRAM parameters

The Timing page allows designers to enter the timing specifications of the SDRAM chip(s) used. The correct values are available in the manufacturer's data sheet for the target SDRAM. In our case it is the IS42S16400.

Timing settings (allowed values, default value, description):
CAS latency (allowed 1, 2, 3; default 3): Latency in clock cycles from a read command to data out.
Initialization refresh cycles (allowed 1-8): This value specifies how many refresh cycles the SDRAM controller performs as part of the initialization sequence after reset.
This value specifies how often th
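The Row and Column values in the table are simply the number of address bits needed to index the rows and columns. A minimal Python sketch of that arithmetic follows; the helper name is hypothetical and not part of any Altera tool.

```python
def address_bits(count):
    """Number of address bits needed to index `count` locations (a power of two)."""
    if count <= 0 or count & (count - 1):
        raise ValueError("count must be a positive power of two")
    return count.bit_length() - 1


if __name__ == "__main__":
    rows, columns = 4096, 512
    print("Row value:", address_bits(rows))
    print("Column value:", address_bits(columns))
```

For the 4096 x 512 organization used as the example in the table, this gives a Row value of 12 and a Column value of 9.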
53. igital Signal Processing Central, Fast Fourier Transform (FFT) FAQ, http://www.dspguru.com/dsp/faqs/fft
[11] Steven W. Smith, The Scientist and Engineer's Guide to Digital Signal Processing, Chap. 12: Fast Fourier Transform.
[12] James W. Cooley and John W. Tukey, "An algorithm for the machine calculation of complex Fourier series," Math. Comput. 19, 297-301, 1965.
[13] Raghu Muthyalam, "Implementation of Fast Fourier Transform for Image Processing in DirectX 10."
[14] ImageMagick v6 Examples, Fourier Transforms, http://www.imagemagick.org/Usage/fourier/
[15] R. Gonzales, R. Woods, Digital Image Processing, Addison-Wesley Publishing Company, 1992, pp. 81-125.
[16] A. Jain, Fundamentals of Digital Image Processing, Prentice-Hall, 1989, pp. 15-20.
[17] http Answers yahoo com question index qid 1005 120802386
[18] Eric Verner, "How to do a 2D Fourier Transform in Matlab," Matlab Geeks, http://matlabgeeks.com/tips-tutorials/how-to-do-a-2-d-fourier-transform-in-matlab/
[19] "How to plot 2D FFT in Matlab," Stack Overflow, http://stackoverflow.com/questions/13549186/how-to-plot-a-2d-fft-in-matlab
[20] Terasic Technologies, 2011, DE0 User Manual.
[21] http www i2c bus org i2c Interface
[22] Terasic, TRDB-D5M Hardware specification, 2010, www.terasic.com
[23] http://www.altera.com/literature/ug/ug_embedded_ip.pdf
[24] http://www.altera.com/literature/ug/ug_sopc_builder.pdf
[25] http://www.altera.com/literature/ug/altera_pll.pdf
[26] http
54. in a slightly different way. In one signal the odd points are zero, while in the other signal the even points are zero. In other words, one of the time domain signals (0e0f0g0h) is shifted to the right by one sample. This time domain shift corresponds to multiplying the spectrum by a sinusoid. A shift in the time domain is equivalent to convolving the signal with a shifted delta function. This multiplies the signal's spectrum by the spectrum of the shifted delta function, and the spectrum of a shifted delta function is a sinusoid. This is the basis of the FFT. For images it works somewhat differently, as discussed later [11].

2.4 FFT Algorithms
As discussed earlier, computing the DFT directly is computationally expensive. Because it is slow, the direct DFT is impractical for many real-world problems. To make the DFT calculation faster and more efficient, there are a number of FFT algorithms, such as radix-2 (butterfly), Cooley-Tukey, the prime-factor FFT algorithm, Bruun's FFT algorithm, Rader's FFT algorithm, Bluestein's FFT algorithm, etc. In our project we used the Cooley-Tukey FFT algorithm for recognition.
The Cooley-Tukey algorithm is the most common FFT algorithm. It is named after J. W. Cooley and John Tukey. It re-expresses the Discrete Fourier Transform (DFT) of an arbitrary composite size N = N1N2 in terms of smaller DFTs of sizes N1 and N2, recursively, in order to reduce the computation time to O(N log N) for highly composite N. The Cooley-Tukey a
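The time-shift property used in this passage can be checked numerically. The Python sketch below uses a naive DFT (not an FFT) on an 8-point signal of the form e0f0g0h0, shifts it right by one sample to get 0e0f0g0h, and compares the spectrum of the shifted signal with the original spectrum multiplied by a complex sinusoid. The sample values and function names are arbitrary choices for illustration.

```python
import cmath


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def shift_right(x):
    """Circularly shift a sequence to the right by one sample."""
    return [x[-1]] + list(x[:-1])


def shifted_spectrum(spectrum):
    """Spectrum predicted by the shift property: X[k] * exp(-2*pi*i*k/N)."""
    n = len(spectrum)
    return [spectrum[k] * cmath.exp(-2j * cmath.pi * k / n) for k in range(n)]


if __name__ == "__main__":
    signal = [5, 0, 6, 0, 7, 0, 8, 0]      # e 0 f 0 g 0 h 0
    direct = dft(shift_right(signal))      # spectrum of 0 e 0 f 0 g 0 h
    predicted = shifted_spectrum(dft(signal))
    print(max(abs(a - b) for a, b in zip(direct, predicted)))
```

The printed maximum difference is at floating-point rounding level, which is the sense in which a one-sample shift "multiplies the spectrum by a sinusoid".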
55. ion
Width (number of pixels): specifies the incoming stream's width.
Height (number of lines): specifies the incoming stream's height.
Pixel Format
Color Bits: specifies the number of bits per color plane.
Color Planes: specifies the number of color planes.

5.7 Fast Fourier transform (FFT) generated from MegaWizard
The FFT MegaCore function is an IP core that must be purchased, or that can be used with a licensed version of Quartus. It is a high performance, highly parameterizable Fast Fourier transform (FFT) processor. The FFT MegaCore function implements a complex FFT or inverse FFT (IFFT) for high performance applications [29]. The FFT MegaCore function implements two architectures:
- Fixed transform size architecture
- Variable streaming architecture
To use this core, the installation and licensing procedures must be followed. The FFT MegaCore function supports the following design flows:
- DSP Builder: Use this flow if you want to create a DSP Builder model that includes an FFT MegaCore function variation.
- MegaWizard Plug-In Manager: Use this flow if you would like to create an FFT MegaCore function variation that you can instantiate manually in your design.
In our project we chose the MegaWizard Plug-In Manager. The MegaWizard Plug-In Manager flow allows you to customize an FFT MegaCore function and manually integrate the MegaCore function variation into a Quartus II design. The steps are:
1. Create a new proje
56. ithm in a Cyclone III Field Programmable Gate Array (FPGA) chip. An Altera DE0 development board, which carries a Cyclone III chip, was used for debugging. We also aimed for low power consumption so that the chip could be used in a wide range of security systems. To develop a simple yet efficient face recognition algorithm (such as PCA, FFT, etc.) on digital hardware, we studied various face recognition algorithms using Matlab code and examined their detection efficiency under various postures and backgrounds, as well as the complexity of each algorithm. To save hardware resources and at the same time obtain an acceptable level of recognition, we chose to use the Fast Fourier Transform. The search database was built by taking pictures of BRAC University students in various backgrounds and postures, and these pictures were used to evaluate the developed face recognition system. Images were captured using a TRDB-D5M camera module, and digital data from the camera was transferred to the SDRAM of the DE0 board over the GPIO interface. A Nios II microprocessor was synthesized in the Cyclone III chip; it controlled the overall recognition system and the communication between the FFT core, the SDRAM and the on-chip memory. The performance of the hardware is now under evaluation.

Keywords: FFT, FPGA, Face Recognition, Nios II, TRDB-D5M

Preface
One of the most important reasons for choosing this task as our undergraduate thesis is
57. lgorithm can be combined arbitrarily with any other algorithm, as it breaks the DFT into smaller DFTs [12].

2.4.1 FFT implementation in Nios II using the Cooley-Tukey Algorithm
We implemented the FFT on the Nios II using the Cooley-Tukey algorithm. To achieve this, we first created a processor using Qsys. We added various components such as the CPU, SDRAM, PLL, tri-state bridge, on-chip memory, etc. We made the connections by connecting master to slave and source to sink, assigned base addresses, and connected the clock through the PLL. After all the components have been added, Qsys automatically generates a blank code template, which we use in our Verilog project. After that, we wrote our Verilog code to interface with our FPGA through pin assignment. Then we included our SOPC code in the Verilog code and interfaced it with our board's pins, which generates the SOF file. This completed our hardware configuration. Next, we wrote our C code for the FFT in Eclipse. Finally, we wrote the code for the Cooley-Tukey algorithm in C and implemented it on the Nios II processor. We compiled the code and saw the result in the console pane.

[Figure 2.6: Result of Cooley-Tukey. Nios II console (hardware configuration: cable USB-Blaster on localhost [USB-0], device ID 1, instance ID 0, name jtag_uart) showing the values 4 0, 1 2.41421, 0 0, 1 0.414214, 0 0, 1 0.414214, 0 0, 1 2.41421]

If we compare the resul
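For readers who want to see the Cooley-Tukey recursion itself, here is a minimal radix-2 sketch in Python. It is not the C code that ran on the Nios II, which is not reproduced in this text. The 8-point input used in the example is an assumption: it was chosen because its spectrum has real and imaginary parts whose magnitudes resemble the console values in Figure 2.6.

```python
import cmath


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a positive power of two")
    if n == 1:
        return [complex(x[0])]           # a 1-point signal is its own spectrum
    even = fft(x[0::2])                  # DFT of the even-numbered samples
    odd = fft(x[1::2])                   # DFT of the odd-numbered samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle       # butterfly: upper half
        out[k + n // 2] = even[k] - twiddle  # butterfly: lower half
    return out


if __name__ == "__main__":
    # Assumed test input, not taken from the thesis.
    for value in fft([1, 1, 1, 1, 0, 0, 0, 0]):
        print(round(value.real, 5), round(value.imag, 5))
```

Each level of recursion halves the problem, which is where the O(N log N) cost comes from.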
58. like VGA display. We stored the grayscale data in the SDRAM. Using DMA1, we could fetch the stored image and feed it to the FFT core by writing code for the Nios II. Using DMA2, we stored the FFT image in the SDRAM again. We have prepared approximate code for the final recognition on the Nios II.

6.3 Limitation
6.3.1 Software
The FFT is a complicated and not especially effective algorithm for recognition. Still, we tried to implement it on the FPGA board, because our topic spans two large areas: image processing and its implementation on an FPGA board. Since FPGA implementation is our priority, we started with an easier algorithm based on the FFT. The FFT may not perform as well as other algorithms; however, to reduce noise, we emphasize converting the image into the frequency domain.

6.3.2 Hardware
After we generate an IP for the FFT from the MegaWizard, the .sof file is not generated. We assume that this problem is due to the unavailability of the licensed version. Therefore, we could not verify our proposed architecture as a whole. Individually, however, we could verify many components of the proposed architecture, for example that the data can be saved in the SDRAM and that the FFT core is working.

6.4 Future Work
As a future improvement, the PCA algorithm could be implemented on the FPGA. This algorithm must be implemented directly on the FPGA to accelerate the encoding process and not overload the processor. This hardware implementation leads to a higher frame rate, encoding less data per frame, consequently l
59. method of face recognition with respect to frequency spectrum. The FFT variables are ranked according to their variance, thereby reflecting a decreasing importance as to their ability to capture the whole information content of the original data set for signal reconstruction purposes. By virtue of its ability to reduce the complexity of the resulting feature space, PCA is widely used in a number of pattern recognition applications [33]. Two face recognition strategies, i.e. PCA (Principal Component Analysis) and FFT (Fast Fourier Transform), were implemented in our project. If we try to recognize the same image of a student with the FFT algorithm, the accuracy is 100%, and if we take a slightly changed expression of the same person, the accuracy is 40%, whereas PCA gives 70% accuracy for changed expressions.

[Figure 6.1: Accuracy rate of Face Recognition for PCA and FFT (accuracy in percentage)]

Principal Component Analysis gave better results for varying poses. The Fast Fourier Transform can recognize faces, but recognition suffers when the person smiles or when the image contains too much light. The results are pretty good for the test samples that we have considered.

[Figure 6.2: Recognition result with FFT based algorithm]
[Figure 6.3: Recognition result with PCA based algorithm]

6.2 Hardware
Individually we could verify many components of our proposed architecture
60. n an internal and external memory.

[Figure: SG-DMA controller (descriptor processor, DMA read, DMA write, status registers, Avalon-MM master port, Avalon-MM slave port)]

Programming with SG-DMA Controller
The device and descriptor data structures and the application programming interface (API) for the SG-DMA controller core are given below.

typedef struct alt_sgdma_dev
{
    alt_llist                  llist;               /* Device linked-list entry */
    const char                *name;                /* Name of SGDMA in SOPC System */
    void                      *base;                /* Base address of SGDMA */
    alt_u32                   *descriptor_base;     /* reserved */
    alt_u32                    next_index;          /* reserved */
    alt_u32                    num_descriptors;     /* reserved */
    alt_sgdma_descriptor      *current_descriptor;  /* reserved */
    alt_sgdma_descriptor      *next_descriptor;     /* reserved */
    alt_avalon_sgdma_callback  callback;            /* Callback routine pointer */
    void                      *callback_context;    /* Callback context pointer */
    alt_u32                    chain_control;       /* Value OR'd into control reg */
} alt_sgdma_dev;

Fig: Device data structure

typedef struct {
    alt_u32 *read_addr;
    alt_u32  read_addr_pad;
    alt_u32 *write_addr;
    alt_u32  write_addr_pad;
    alt_u32 *next;
    alt_u32  next_pad;
    alt_u16  bytes_to_transfer;
    alt_u8   read_burst;   /* Reserved field. Set to 0. */
    alt_u8   write_burst;  /* Reserved field. Set to 0. */
    alt_u16  actual_bytes_transferred;
    alt_u8   status;
    alt_u8   control;
} alt_avalon_sgdma_packed alt_sgdma_descriptor;

Fig: Descriptor data structure

alt_avalon_sgdma_do_async_transfer: Starts a non bl
61. nal enhancement can be made without spending time on redesigning hardware or modifying the board layout. However, FPGAs are much more expensive than microcontrollers. If our design needs greater integration density, then FPGAs are appropriate. For smaller projects we go for microcontrollers [17].

Chapter 3 MATLAB Implementation
3.1 Basic Approach
Before implementing the process in hardware, we first verified our project in Matlab. In our project, we first used the FFT as the basic algorithm. We made our own database consisting of BRAC University students and used it to develop our recognition system. The database that contains the images of different expressions of the students is named the Train Database. It has 20 images in total. Here we have considered two different expressions per image. The database that contains the image to be compared with the Train Database's images is named the Test Database. It may contain an image that is inside or outside the Train Database. First, we placed the Train Database containing 20 images in a directory using Matlab. Then we resized each image to 50 x 50 to ensure the same dimensions for every image. For ease of further processing, we converted the RGB data to grayscale, which reduces the matrix dimension. After that, we applied the FFT to the entire database using the Matlab function fft2, as images are two dimensional
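The grayscale conversion and two-dimensional transform steps described above can be illustrated without Matlab. The Python sketch below is an illustration under stated assumptions, not the thesis code: the grayscale weights are the ones commonly documented for Matlab's rgb2gray, the image is a tiny made-up example instead of a 50 x 50 photograph, and the transform is a naive row-column DFT standing in for fft2.

```python
import cmath


def rgb_to_gray(pixel):
    """Weighted sum of R, G, B (weights assumed to match Matlab's rgb2gray)."""
    r, g, b = pixel
    return 0.2989 * r + 0.5870 * g + 0.1140 * b


def dft(x):
    """Naive one-dimensional discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dft2(image):
    """Two-dimensional DFT: transform every row, then every column."""
    rows = [dft(row) for row in image]
    cols = [dft(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]   # transpose back to row-major


if __name__ == "__main__":
    rgb = [[(255, 0, 0), (0, 255, 0)],
           [(0, 0, 255), (255, 255, 255)]]     # hypothetical 2 x 2 RGB image
    gray = [[rgb_to_gray(p) for p in row] for row in rgb]
    spectrum = dft2(gray)
    print([[round(abs(v), 3) for v in row] for row in spectrum])
```

Converting to grayscale first means one 2-D transform per image rather than one per color plane, which is the reduction in matrix dimension the text refers to.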
62. [Figure 2.2: Time Domain Decomposition]
The above figure shows an example of the time domain decomposition used in the FFT. In this example, a 16-point signal is decomposed through four separate stages. The first stage breaks the 16-point signal into two signals, each consisting of 8 points. The second stage decomposes the data into four signals of 4 points. This pattern continues until there are N signals composed of a single point. An interlaced decomposition is used each time a signal is broken in two; that is, the signal is separated into its even and odd numbered samples. Once the structure of the decomposition is understood, it is clear that any N-point signal can be broken down in this way. It is nothing more than a reordering of the samples in the signal.

Sample numbers in normal order    Sample numbers after bit reversal
Decimal   Binary                  Decimal   Binary
0         0000                    0         0000
1         0001                    8         1000
2         0010                    4         0100
3         0011                    12        1100
4         0100                    2         0010
5         0101                    10        1010
6         0110                    6         0110
7         0111                    14        1110
8         1000                    1         0001
9         1001                    9         1001
10        1010                    5         0101
11        1011                    13        1101
12        1100                    3         0011
13        1101                    11        1011
14        1110                    7         0111
15        1111                    15        1111

Figure 2.3: Rearrangement pattern required
The given figure shows the rearrangement pattern required. On the left, the sample numbers of the original signal are listed along with their binary equivalents. On the right, the rearranged sample numbers are listed, also along with their binary equivalents. The important part is that t
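The interlaced decomposition described above can be reproduced directly: split the samples into even and odd positions, and repeat on each half until every piece holds a single sample. The Python sketch below is illustrative only, and the function name is chosen for this example.

```python
def interlaced_decomposition(samples):
    """Recursively split a sequence into its even- and odd-positioned samples."""
    samples = list(samples)
    if len(samples) <= 1:
        return samples
    return (interlaced_decomposition(samples[0::2])
            + interlaced_decomposition(samples[1::2]))


if __name__ == "__main__":
    # Sample numbers of a 16-point signal after all four stages.
    print(interlaced_decomposition(range(16)))
```

The order produced for a 16-point signal matches the right-hand column of Figure 2.3, which is why the decomposition amounts to nothing more than a reordering of the samples.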
63. nsists of up to 120K vertically arranged logic elements (LEs), 4 Mbits of embedded memory arranged as 9-Kbit (M9K) blocks, and 200 18x18 embedded multipliers. Cyclone III LS FPGAs have a memory-rich and multiplier-rich floor plan consisting of up to 200K logic elements, 8.2 Mbits of embedded memory, and 396 embedded multipliers [20]. Both architectures include highly efficient interconnect and low-skew clock networks, providing connectivity between logic structures for clock and data signals. The logic and routing core fabric is surrounded by I/O elements (IOEs) and phase-locked loops (PLLs), as shown in Figure 4.2.

4.3 Logic Elements
The logic array consists of LABs, with 16 LEs, LAB control signals, LE carry chains, register chains and local interconnect in each LAB. LABs are grouped into rows and columns across the device. Cyclone III devices range from 5,136 to 119,088 LEs. An LE is compact and provides advanced features with efficient logic utilization. Each LE has a four-input look-up table (LUT), a programmable register, a carry chain connection, a register chain connection, and support for register packing and register feedback. Moreover, it has the ability to drive all types of interconnect: local, row, column, register chain and direct link interconnect [20].

[Figure: Cyclone III logic element (register chain, register bypass, LAB-wide synchronous load, programmable register, LE carry-in)]
64. nt ways. One is geometric, the other is photometric. The geometric approach focuses on the distinguishing features of a face, while the photometric approach is statistical: it distils an image into values, and these values are compared with templates so that variances are eliminated. Research directions include recognition from outdoor images and non-frontal facial images, increased understanding of the effects of demographic factors on performance, the development of improved models for predicting identification performance on very large galleries, and many more [1].

1.2 Motivation and objectives
After extensive research in the field of face recognition [35, 36], we found that none of the projects we surveyed used the FFT as a face recognition algorithm. We have designed the hardware architecture from a new perspective. Most of the available algorithms are implemented in software. As a result, the recognition speed is not as expected. Hardware implementation, on the other hand, holds much promise. Therefore we emphasized hardware implementation. The improvements include robustness, speed and accuracy of the system. An FPGA can provide the necessary resources to achieve such improvements in face recognition. The resources include built-in blocks, various communication interfaces, millions of logic gates, scope to run C code on the digital hardware circuitry, high-level design tools, performance, long-term maintenance, reliability, etcetera. Th
65. ocking transfer of a descriptor chain.
alt_avalon_sgdma_do_sync_transfer: Starts a blocking transfer of a descriptor chain. This function blocks both before transfer, if the controller is busy, and until the requested transfer has completed.
alt_avalon_sgdma_construct_mem_to_mem_desc: Constructs a single SG-DMA descriptor in the specified memory for an Avalon-MM to Avalon-MM transfer.
alt_avalon_sgdma_construct_stream_to_mem_desc: Constructs a single SG-DMA descriptor in the specified memory for an Avalon-ST to Avalon-MM transfer. The function automatically terminates the descriptor chain with a NULL descriptor.
alt_avalon_sgdma_construct_mem_to_stream_desc: Constructs a single SG-DMA descriptor in the specified memory for an Avalon-MM to Avalon-ST transfer.
alt_avalon_sgdma_enable_desc_poll: Enables descriptor polling mode. To use this feature, you need to make sure that the hardware supports polling.
alt_avalon_sgdma_disable_desc_poll: Disables descriptor polling mode.
alt_avalon_sgdma_check_descriptor_status: Reads the status of a given descriptor.
alt_avalon_sgdma_register_callback: Associates a user-specific callback routine with the SG-DMA interrupt handler.
alt_avalon_sgdma_start: Starts the DMA engine. This is not required when alt_avalon_sgdma_do_async_transfer and alt_avalon_sgdma_do_sync_transfer are used.
alt_avalon_sgdma_stop: Stops the DMA engine. This is not required when alt_avalon_sgdma_do_async_transfer and alt_avalon_sgdma_do_sync_transfer are used
66. off chip memory. Qsys automatically instantiates on-chip memory inside the Qsys system, so no manual connections are needed. Certain memory blocks can have initialized contents when the FPGA powers up. This feature is useful, for example, for storing data constants or processor boot code. On-chip memories support dual-port accesses, allowing two masters to access the same memory concurrently [23]. The configuration wizard for the On-chip Memory (RAM or ROM) component has the following options: Memory type, Size, and Read latency.

Memory Type
The Memory type options define the structure of the on-chip memory.
- RAM (writable): this setting creates a readable and writable memory.
- ROM (read only): this setting creates a read-only memory.
- Dual-port access: this setting creates a memory component with two slaves, which allows two masters to access the memory simultaneously.
- Block type: this setting directs the Quartus II software to use a specific type of memory block when fitting the on-chip memory in the FPGA. Because of the constraints on some memory types, it is frequently best to use the Auto setting. Auto allows the Quartus II software to choose a type, and the other settings direct the Quartus II software to select a particular type.

Size
The Size options define the size and width of the memory.
- Data width: this setting determines the data width of the memory. The available
67. ol, face databases, face identification, human-computer interaction, law enforcement, smart cards, featuring important characteristics to achieve this goal. Hardware development was done using an Altera DE0 development board with a Cyclone III FPGA, which was found to be appropriate for multimedia projects. The Terasic TRDB-D5M CMOS sensor, with 5-megapixel resolution, comes from the same vendor and was specially made for use with the DE0 board. This development kit includes Verilog HDL examples for image acquisition, conversion and image storage. Some of them were used, with a couple of changes, to meet the project's needs. Using SOPC Builder or Qsys, it was possible to implement a Nios II soft-core processor with all necessary options enabled. A Nios II system includes a processor core, a UART peripheral, an interval timer, input/output components, an SDRAM memory controller, an LCD module, an Ethernet interface, an SD/MMC card interface and a CMOS slave controller. All of these modules use an Avalon Memory-Mapped interface and are connected using the system interconnect fabric. The gateware design was implemented in Verilog HDL using Quartus II 12.0 Web Edition. This face recognition system can be used in any application, but it is optimized for security purposes. The fact that this system was developed on an FPGA and with an open source operating system allows any developer to continue this project and implement more features.

Chapter 7 Refe
68. op
Figure 4.5 DE0 Board Components
Figure 4.6 Pixel Array Description
Figure 4.7 Default Pixel Output Timing
Figure 4.8 Bayer Pattern Filter
Figure 4.9 Bayer Image Pixels
Figure 4.10 RGB Pixel from Bayer Format
Figure 5.1 Block Diagram of Our Proposed Architecture
Figure 5.2 RTL Viewer
Figure 5.3 External Bus to Avalon Bridge
Figure 5.4 Nios II Processor
Figure 5.5 (title illegible in the source)
Figure 5.6 Nios II HAL Project Structure
Figure 5.7 Flowchart for Nios II Instruction
Figure 5.8 Qsys System Content
Figure 5.9 Flowchart for Valid Frame Capture
Figure 6.1 Accuracy rate of Face Recognition for PCA and FFT
Figur
69. pin is also responsible for the start and end of the pixel stream in the image. This pin goes high only once during each image provided by the camera. In the figure above, FVAL goes high when the camera provides an image. For a complete configuration, we also need to write valid values to the various configuration registers in the camera. For example, we configure where the camera starts its rows and columns and what the rate of images provided by the camera should be. The digital and analog gains for the three color components are adjusted to give the best performance in a specific environment.

4.8.2 Line Valid
This is the hardware pin on the camera which goes high during the valid pixels in a row of the image. This pin is asserted once for every row of the image. For our configuration, this pin is asserted 960 times for one image. Each time the line valid pin goes high, 1280 pixels are transferred by the camera. Each pixel is transferred by triggering the pixel clock pin of the camera.

4.9 Bayer to RGB conversion in FPGA
The image sensor exports the image in Bayer format, and in the FPGA a Bayer color filter array converts the Bayer pattern image into RGB. The pattern of this filter shows that half of its pixels are green, while a quarter of the total number is assigned to red and the same to blue. Odd pixel lines in the image sensor contain green and blue components, while the even lines contain red and green color components. Figure 4 8
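The proportions stated for the Bayer pattern can be checked by counting colors over one 1280 x 960 frame. In the Python sketch below, the assignment of colors to lines follows the text (odd lines green and blue, even lines red and green), but the order of colors within a line is an assumption made for illustration, as are the function names.

```python
from collections import Counter


def bayer_color(row, col):
    """Color of the filter at (row, col), both numbered from 1.

    Odd lines carry green and blue, even lines carry red and green.
    Which color sits on odd columns is an assumption of this sketch.
    """
    if row % 2 == 1:
        return "G" if col % 2 == 1 else "B"
    return "R" if col % 2 == 1 else "G"


def color_counts(rows, cols):
    """Count how many pixels of each color a rows x cols sensor contains."""
    return Counter(
        bayer_color(r, c)
        for r in range(1, rows + 1)
        for c in range(1, cols + 1)
    )


if __name__ == "__main__":
    counts = color_counts(960, 1280)
    total = sum(counts.values())
    for color in "RGB":
        print(color, counts[color], counts[color] / total)
```

Half of the 1,228,800 pixels come out green and a quarter each red and blue, so two of the three color values at every pixel have to be interpolated from neighbours during the conversion to RGB.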
70. re we will confine ourselves to displaying only the magnitude of the Fourier Transform, unless we intend to reconstruct the image [15, 16]. On the other hand, if we do not separate the magnitude and phase parts of an image after applying the FFT to that image, we obtain:

[Figure 2.10: Magnitude and Phase of a Fourier image]

The above diagram contains both the magnitude and phase values of a Fourier image. In our project we have considered both parts.

2.6 Applications
Face recognition systems have achieved huge popularity due to their wide range of applications. Face recognition has long been an area of research. Applications fall into two main categories: practical applications and research applications. From a practical standpoint, face recognition is extensively used in security systems. The FBI is already using it to identify suspects who are caught on surveillance cameras. In places like airports and international borders, the need is rising for a face recognition system that identifies individuals. Face recognition systems can also be used for entertainment purposes, such as video games. In research applications, face recognition has paved the way for research in areas like image and video processing. Due to the increasing demand for this system in many sectors, researchers are working on developing many face recognition algorithms: Principal Component Analysis (PCA) and KPCA, Linear Discriminant Analysis (LDA), Independent
71. rectory jpg
I = cell(1, numel(tifffiles));
for k = 1:length(tifffiles)
    filename = fullfile(sdirectory, tifffiles(k).name);
    I{k} = imread(filename);
    % Resize images
    Rb = imresize(I{k}, [50 50]);
    % RGB to Gray images
    J = rgb2gray(Rb);
    % 2D FFT
    fftb = fft2(J);
    % Display images
    figure, imshow(I{k});
    imshow(Rb);
    figure, imshow(J);
    figure, imshow(uint8(fftb));
end

imread is used to import the images into Matlab. This function can handle most of the standard image file formats, such as bmp, jpg, tiff and png [18]. In our code, imshow is used to display the images. imshow is one of several functions that plot images, but this function automatically removes the axes, displaying the image nicely. This function works well for original images. When we applied imshow to our original image, it shows:
[Figure: Original Image]
After applying the rgb2gray function to the original image, we obtained:
[Figure: Gray scale Image]
After turning the original image into grayscale, we performed a 2D FFT on the image, considering both the magnitude and the phase. It results in:
[Figure: FFT Image]
FFT based face recognition is able to recognize faces with a slight change in expression. In the Test Database, we put an image showing a different expression of one of the images of the Train Database. After simulation, they matched. Though it is not as effective as the PCA algorithm, to some extent it works well, and we get an 85.0% match.
[Figure: Equivalent Database image, 85.0% matched]
72. rences
[1] K. Delac, M. Grgic and M. Stewart Bartlett (Eds.), Face Recognition, IN-TECH, Vienna, Austria, 2008.
[2] M. Turk, A. Pentland, Eigenfaces for Recognition, Journal of Cognitive Neuroscience, 3(1), pp. 71-86, 1991.
[3] D. Swets, J. Weng, Using Discriminant Eigenfeatures for Image Retrieval, IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(8), pp. 831-836, 1996.
[4] P. N. Belhumeur, J. P. Hespanha, D. J. Kriegman, Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection, IEEE Trans. Pattern Anal. Machine Intell., vol. 19, pp. 711-720, May 1997.
[5] J. Karhunen, E. Oja, L. Wang, R. Vigario and J. Joutsensalo, A class of neural networks for independent component analysis, IEEE Trans. on Neural Networks, 8(3), pp. 486-504, 1997.
[6] P. Comon, Independent Component Analysis, a new concept?, Signal Processing, 36, pp. 287-314, 1994.
[7] M. S. Bartlett, J. R. Movellan, T. J. Sejnowski, Face Recognition by Independent Component Analysis, IEEE Trans. on Neural Networks, Vol. 13, No. 6, pp. 1450-1464, November 2002.
[8] A. Kadyrov and M. Petrou, The Trace Transform and Its Applications, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 23, No. 8, pp. 811-828, August 2001.
[9] http://www.authorstream.com/Presentation/gaurav22788-300314-face-recognition-using-neural-networks-final-present-2-science-technology-ppt-powerpoint/
[10] dspGuru by Iowegian International D
73. riations along all axes are treated as equally significant. In the recognition phase, a subject face is normalized with respect to the average face and then projected onto face space using the eigenvector matrix. Next, the Euclidean distance is computed between this projection and all known projections. The minimum of these distances is selected and compared with the threshold calculated during the training phase. If the value is greater than the threshold, the face is new; otherwise, it is a known face. 2.1.2 LDA. Linear discriminant analysis (LDA) is another effective algorithm for face recognition. It is closely related to PCA and factor analysis in that they all look for linear combinations of variables which best explain the data. LDA explicitly attempts to model the difference between the classes of data. PCA, on the other hand, does not take into account any difference in class, and factor analysis builds the feature combinations based on differences rather than similarities. The face space created in LDA gives higher weight to the variations between individuals than to those of the same individual. LDA is less sensitive than the phase spectrum. Indeed, it is the phase spectrum that contains information which humans use to identify faces [4]. 2.1.3 ICA. As PCA considers the 2nd order moments only, it lacks information on higher order statistics. Independent Component Analysis (ICA) accounts for higher order s
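The recognition phase described above (normalize, project, nearest Euclidean distance, threshold test) can be sketched in a few lines of MATLAB. This is an illustrative sketch only: the mean face, eigenvector matrix, stored projections and threshold are made-up toy values, and the variable names are assumptions rather than names from the thesis code.

```matlab
% Sketch of the recognition step: project, find the nearest known face, apply threshold.
% All data below are toy values chosen for illustration.
mean_face = [4; 4; 4; 4];                 % average face as a column vector
U         = [1 0; 0 1; 0 0; 0 0];         % eigenvector matrix (columns = eigenfaces)
known     = [1 -2; 2 1];                  % projections of known faces, one per column
threshold = 1.5;                          % assumed to come from the training phase

probe = [5.2; 5.9; 4; 4];                 % face to be recognised
w     = U' * (probe - mean_face);         % normalise, then project onto face space

% Euclidean distance from the probe projection to every known projection
d = sqrt(sum((known - repmat(w, 1, size(known, 2))).^2, 1));
[dmin, idx] = min(d);

if dmin > threshold
    result = 'new face';
else
    result = sprintf('known face #%d', idx);
end
disp(result);
```

With these toy numbers the probe lies close to the first stored projection, so it is reported as a known face; raising the probe's distance above the threshold would make it a new face instead.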
74. rresponding frequency [15, 16]. In general, if we apply the FFT to an image we get a complex result. The magnitude calculated from the complex result is shown in [15, 16]: Figure 2.7: The magnitude calculated from the complex result. It is seen that the DC value is by far the largest component of the image. However, the dynamic range of the Fourier coefficients (that is, the range of intensity values in the Fourier image) is too large to be displayed on the screen, so all other values appear black. If we apply a logarithmic transformation to the image we obtain: Figure 2.8: Magnitude after logarithmic transform. We can see that the image contains components of all frequencies, but their magnitude gets smaller for higher frequencies. Hence, low frequencies contain more image information than the higher ones. The transformed image also tells us that there are two dominating directions in the Fourier image, one passing vertically and one horizontally through the center. These originate from the regular patterns in the background of the original image. The phase of the FFT of the same image can be shown as: Figure 2.9: The phase of FFT. The value of each point determines the phase of the corresponding frequency. As in the magnitude image, we can identify the vertical and horizontal lines corresponding to the patterns in the original image. The phase image does not contain much new information about the structures of the spatial domain image. Therefo
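The effect of the logarithmic transformation can be reproduced on a synthetic image. The sketch below is an assumption-laden illustration, not the thesis code: the striped test image, the scaling constant c and the variable names are all invented here.

```matlab
% Sketch: why the raw FFT magnitude looks black and the log-scaled one does not.
% The striped 64x64 test image is synthetic.
x   = repmat(0:63, 64, 1);                     % column index of every pixel
img = 128 + 2 * cos(2 * pi * 8 * x / 64);      % faint vertical stripes on a bright background

F   = fftshift(fft2(img));                     % move the DC term to the centre
mag = abs(F);

% Linear scaling to 0..255: the DC term dominates everything else.
lin = uint8(255 * mag / max(mag(:)));

% Logarithmic scaling c*log(1 + |F|), with c chosen so the maximum maps to 255.
c    = 255 / log(1 + max(mag(:)));
logm = uint8(c * log(1 + mag));

fprintf('DC magnitude:            %g\n', mag(33, 33));
fprintf('stripe peak, linear:     %d\n', lin(33, 41));
fprintf('stripe peak, log scaled: %d\n', logm(33, 41));
```

With these numbers the stripe component maps to about 2 out of 255 under linear scaling, which is visually black, and to about 161 after the logarithmic transformation.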
75. rst complete row of the RGB image is created. Similarly, with the completion of the 3rd and 4th rows of the Bayer pattern image, the 2nd RGB pixel row is completed. As the pixels are received from the camera, they are simultaneously transformed into RGB and sent to the memory module in the FPGA. After that, we converted each RGB pixel into grayscale using the following formula: Grayscale = (Red + Green + Blue) / 3. This conversion is used to reduce the matrix dimension. Next, the memory module stores this pixel in the external SDRAM through the external bus, and so on. Chapter 5: Hardware Implementation. Altera Corporation is the pioneer of programmable logic solutions, and we have used Altera's FPGA board in our project. Our FPGA board is from the Cyclone III device family and its model number is DE0 [27]. In our project we have used Qsys extensively. Qsys is Altera's system integration tool. It saves a significant amount of time and effort in the FPGA design process by automatically generating interconnect logic to connect intellectual property (IP) functions and subsystems. Qsys is the next generation SOPC Builder tool, powered by a new FPGA-optimized network-on-a-chip (NoC) technology that delivers higher performance, enhanced design reuse and faster verification compared to SOPC Builder [27]. A block diagram of our problem formulation for the Qsys part is given below: CMOS S ja I
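The averaging formula above, and the way it reduces the matrix dimension, can be illustrated as follows. This is a MATLAB sketch for clarity only; the hardware performs the conversion per pixel, and the 2x2 test image here is invented.

```matlab
% Sketch of the averaging formula: Grayscale = (Red + Green + Blue) / 3.
% The 2x2 RGB image is made up for illustration.
rgb = zeros(2, 2, 3);
rgb(:, :, 1) = [255 0; 30 90];     % red plane
rgb(:, :, 2) = [255 0; 60 90];     % green plane
rgb(:, :, 3) = [255 0; 90 93];     % blue plane

gray = sum(rgb, 3) / 3;            % one value per pixel instead of three
disp(size(rgb));                   % 2 2 3
disp(size(gray));                  % 2 2
```

Note that this plain average differs from Matlab's rgb2gray, which uses a weighted sum of the three channels; the equal-weight form is simpler to implement in logic.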
76. rt fresh perspectives we have got. We are also thankful to Sir Jahangir Alam, Lecturer of BRAC University, for his constant guidance when we were stuck. We highly appreciate the assistance and guidance of Shafiur Rahman, student of BUET, and Ahsan Ashfaq, an alumnus of Halmstad University, Sweden, throughout the process. April 13, 2013.
…
Acknowledgements
Table of Contents
Table List
Figure List
Table of Contents
Chapter 1: Introduction
1.1 …
1.2 Motivation and Objectives
1.3 …
1.4 Problem Formulation
Chapter 2
2.1 Algorithms for Face Recognition
2.1.1 Principal Component Analysis (PCA)
2.1.2 Linear Discriminant Analysis (LDA)
2.1.3 Independent Component Analysis (ICA)
77. rty of FFT is that the transform of N points can be written as the sum of two N/2 transforms. This is important because some of the computations can be reused, thus eliminating expensive operations [13]. The output of the Fourier transform is a complex number and has a much greater range than the image in the spatial domain. Therefore, to store these values accurately, they are stored as floats. Furthermore, the dynamic range of the Fourier coefficients is too large to be displayed on the screen, so these values are scaled to bring them within the range of values that can be displayed [13]. A modern interpretation of the FFT states that any well behaved function can be represented by a superposition (combination or sum) of sinusoidal waves. It can be said that the frequency domain representation is just another way to store and reproduce the spatial domain image [14]. If we take a single row or column of pixels from any image and graph it, we will find that it looks like a wave. If the fluctuations are more regular in spacing and amplitude, we would get something more like a wave pattern. If we were to add more waves together, we might get a pattern that is closer to the original image. The superposition (addition) of waves is much closer but still does not match the image pattern. However, we can continue in this manner, adding more waves and adjusting th
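The property that an N-point transform can be assembled from two N/2-point transforms is one step of the radix-2 decimation-in-time scheme. The MATLAB sketch below illustrates that single step under stated assumptions: the 8-sample signal is invented, and the built-in fft is used for the half-length transforms rather than a hand-written recursion.

```matlab
% Sketch: one decimation-in-time step of the radix-2 FFT.
% An N-point DFT is assembled from the DFTs of the even- and odd-indexed samples.
x = [1 2 3 4 0 -1 -2 -3];          % made-up signal, N = 8
N = numel(x);

E = fft(x(1:2:end));               % N/2-point DFT of samples x0, x2, x4, ...
O = fft(x(2:2:end));               % N/2-point DFT of samples x1, x3, x5, ...

k = 0:(N/2 - 1);
W = exp(-2i * pi * k / N);         % twiddle factors

X = [E + W .* O, E - W .* O];      % first and second half of the N-point DFT

err = max(abs(X - fft(x)));        % compare with the built-in transform
fprintf('largest difference from fft(x): %g\n', err);
```

The products W .* O are computed once and used for both halves of the output, which is the reuse of computations the text refers to. Applying the same split recursively is what requires the length to be a power of two.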
78. rupt Controller with up to 32 interrupts per controller
• Advanced exception support
• Separate instruction and data caches (configurable from 512 bytes to 64 KB)
• Access to up to 2 GB of external address space
• Optional tightly coupled memory for instructions and data
• Up to six-stage pipeline to achieve maximum MIPS (Dhrystones 2.1 benchmark) per MHz
• Single-cycle hardware multiply and barrel shifter
• Hardware divide option
• Dynamic branch prediction
• Up to 256 custom instructions and unlimited hardware accelerators
• Configurable JTAG debug module
• Optional JTAG debug module enhancements, including hardware breakpoints, data triggers and real-time trace
5.11 Hardware Abstraction Layer
The HAL serves as a device driver package for Nios II processor systems. The HAL is a lightweight embedded runtime environment that provides a simple device driver interface for programs to connect to the underlying hardware. Moreover, the HAL device driver abstraction provides a clear distinction between application and device driver software. The HAL application program interface (API) is integrated with the ANSI C standard library. The HAL API allows us to access devices and files using familiar C library functions. The Nios II software development tools extract system information from our SOPC Information File (.sopcinfo). Most noteworthy is that we need not write low-level routines to establish basic communica
79. se Image. Since another two expressions of the above image are present in the Train Database, if we run the system the answer will be Matched. 3.2 Two dimensional FFT on an image. In image processing, the 2D FFT allows one to see the frequency spectrum of the data in both dimensions and lets one visualize filtering operations more easily. The 2D FFT is simply a Fourier transform over one dimension of the data followed by a Fourier transform over the second dimension of the data. In the following example we have performed a 2D FFT on an image and separated the magnitude and phase content. We now show what actually happened in Matlab when we applied the 2D FFT to an image from our own database. Consider the code written below [19]:
    close all; clear all;
    img = imread('Farhan.jpg', 'jpg');
    imagesc(img)
    img = fftshift(img(:,:,2));
    F = fft2(img);
    figure;
    imagesc(100*log(1+abs(fftshift(F)))); colormap(gray);
    title('magnitude spectrum');
    figure;
    imagesc(angle(F)); colormap(gray);
    title('phase spectrum');
Before entering into our main recognition code, we used the above code to apply fft2 to an image and observe the output, which results in the following figures: The Original Image; Magnitude Spectrum; Phase Spectrum.
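The description of the 2D FFT as a transform over one dimension followed by a transform over the other can be verified directly. The following MATLAB sketch uses a made-up 3x4 matrix in place of an image; the variable names are illustrative assumptions.

```matlab
% Sketch: a 2D FFT equals a 1D FFT over one dimension followed by a 1D FFT over the other.
A = [1 2 3 4; 5 6 7 8; 9 10 11 12];   % made-up 3x4 "image"

step1 = fft(A, [], 1);                % 1D FFT down each column
step2 = fft(step1, [], 2);            % 1D FFT along each row of the result

diff_2d = max(max(abs(step2 - fft2(A))));
fprintf('largest difference from fft2(A): %g\n', diff_2d);
```

The order of the two passes does not matter; transforming the rows first and the columns second gives the same result up to rounding.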
80. straction Layer
Chapter 6: Results and Discussion
6.1 Software
6.2 Hardware
…
Future Work
Conclusion
Chapter 7: References
Chapter 8: Appendix
8.1 …
8.2 PCA based Face Recognition Matlab Code
8.3 …
8.3.1 Storing Data from Camera Module to SDRAM
8.3.2 Interfacing Qsys Components with FPGA
8.3.3 Creating SDRAM Allocation
8.4 Code for …
8.5 Code for Recognition in C (NIOS II)
Table List
Table 5.1 Description of SDRAM Parameters
Table 5.2 PLL Calculation
Table 5.3 Function List
81. t with the Matlab's result it matches. Therefore it can be said that our approach is correct. 2.5 Applying FFT on an image. The Fast Fourier Transform of an image is a representation of the image in the frequency domain. Its function is to decompose the image into its real and imaginary components. If we take an image as an input, then the number of frequencies in the frequency domain is equal to the number of pixels in the original image [13]. The inverse FFT re-transforms the image from the frequency domain to the spatial domain (or time domain). The FFT and its inverse for a 2D image are given by the following equations:
$$F(x,y) = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} f(m,n)\, e^{-j 2\pi \left( \frac{xm}{M} + \frac{yn}{N} \right)}$$
$$f(m,n) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} F(x,y)\, e^{j 2\pi \left( \frac{xm}{M} + \frac{yn}{N} \right)}$$
Here $f(m,n)$ is the pixel at coordinates $(m,n)$, $F(x,y)$ is the value of the image in the frequency domain at coordinates $(x,y)$, and $M$ and $N$ are the dimensions of the image. Since an image is two dimensional, we applied a 2D FFT to it. The 2D transform can be done as two 1D transforms, one in the horizontal direction followed by the other in the vertical direction on the result of the horizontal transform. The end result is equivalent to performing the 2D transform in the frequency space:
$$F(x,y) = \sum_{m=0}^{M-1} \left[ \sum_{n=0}^{N-1} f(m,n)\, e^{-j 2\pi \frac{yn}{N}} \right] e^{-j 2\pi \frac{xm}{M}}$$
$$f(m,n) = \frac{1}{MN} \sum_{x=0}^{M-1} \left[ \sum_{y=0}^{N-1} F(x,y)\, e^{j 2\pi \frac{yn}{N}} \right] e^{j 2\pi \frac{xm}{M}}$$
The FFT that is implemented in the application here requires that the dimensions of the image are powers of two. An interesting prope
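As a cross-check of the forward and inverse definitions, the double sum F(x,y) = sum over m and n of f(m,n) exp(-j 2 pi (xm/M + yn/N)) can be evaluated term by term and compared with Matlab's fft2. This is a deliberately slow sketch for illustration; the 4x4 matrix is invented and the loop is not how an FFT is actually computed.

```matlab
% Sketch: evaluate the forward 2D DFT sum directly and compare with fft2.
f = [1 2 0 1; 3 1 2 0; 0 4 1 2; 2 0 3 1];     % made-up 4x4 image
[M, N] = size(f);
F = zeros(M, N);

for x = 0:M-1
    for y = 0:N-1
        acc = 0;
        for m = 0:M-1
            for n = 0:N-1
                acc = acc + f(m+1, n+1) * exp(-2i * pi * (x*m/M + y*n/N));
            end
        end
        F(x+1, y+1) = acc;         % Matlab indices start at 1, the formula at 0
    end
end

G = fft2(f);
err_forward = max(abs(F(:) - G(:)));               % direct sum versus fft2
err_inverse = max(max(abs(real(ifft2(F)) - f)));   % inverse returns to the spatial domain
fprintf('forward error: %g, round-trip error: %g\n', err_forward, err_inverse);
```

The direct sum needs on the order of (MN)^2 operations, which is why the FFT is used in practice even for small images.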
82. tatistics and it identifies the independent source components from their linear mixtures. ICA thus provides a more powerful data representation than PCA [5], as its goal is to provide an independent, rather than merely uncorrelated, image decomposition and representation. ICA of a random vector searches for a linear transformation which minimizes the statistical dependence between its components [6]. ICA represents the input as an n-dimensional random vector. This random vector is then reduced using PCA without losing the higher order statistics. Then the ICA algorithm finds the covariance matrix of the result and obtains its factorized form. Finally, whitening, rotation and normalization are performed to obtain the independent components that constitute the face space of the individuals. Since the higher order relationships between pixels are used, ICA is robust in the presence of noise. Thus recognition is less sensitive to lighting conditions, changes in hair, make-up and facial expressions [7]. 2.1.4 Trace Transform. The Trace transform [8], a generalization of the Radon transform, is a new tool for image processing which can be used for recognizing objects under transformations (rotation, translation and scaling). To produce the Trace transform, one computes a functional along tracing lines of an image. Each line is characterized by two parameters, namely its distance from the centre of the axes and its orientation. The trace tr
83. that it gave us the possibility to use the theory and knowledge that we have gained over the years to make something useful and practical. We also believed that the design task would be good preparation for future challenges. We have always been fascinated by electronics and the wide area of application this technology presents. Since our interests include both VLSI and working with FPGAs, this project became a great opportunity to combine our interests and education. The reason behind choosing an FPGA is that in our country very few people have worked with this board, and we took it as a challenge. This challenge was the most effective way to learn new things. We have learned a lot about image processing, the DE0 board, the TRDB-D5M camera, Quartus 12.0 and Nios II. The Altera DE0 platform was a very good platform to work with. Many projects can be implemented using this board. Acknowledgements. This thesis is submitted to BRAC University in partial fulfillment of the requirements for the degree of Bachelor in Electrical and Electronic Engineering. We are thankful to Almighty Allah for his blessings upon us and for giving us the courage to take on such a task, and also to our parents for their support, love and patience. This project would not have been possible without all the help and support we have received. We would like to thank our supervisor, Professor Dr. A. B. M. Harun Ur Rashid, Bangladesh University of Engineering and Technology, for all the help suppo
84. tion with the hardware. Therefore, application programmers call the ANSI C or HAL API to access hardware, rather than calling driver routines directly. The HAL does not support MPU (Memory Protection Unit) and MMU (Memory Management Unit) hardware [30]. Figure 5.5: HAL (Hardware Abstraction Layer) Architecture. The HAL provides the following services:
• Integration with the newlib ANSI C standard library: provides the familiar C standard library functions
• Device drivers: provide access to each device in the system
• The HAL API: provides a consistent, standard interface to HAL services, such as device access, interrupt handling and alarm facilities
• System initialization: performs initialization tasks for the processor and the runtime environment before main()
• Device initialization: instantiates and initializes each device in the system before main() runs
Figure 5.6: Nios II HAL Project Structure (Nios II Program Based on HAL: Application Project, HAL BSP Project, Hardware System). Every HAL based Nios II program consists of two Nios II projects. One is the user application project and the other is the HAL BSP project. The HAL drivers relevant to the hardware system are incorporated in the BSP project. The BSP project depends on the hardware system defined by an SOPC Information File (.sopcinfo). The procedure we have followed with the assistance of the HAL library is given as a
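85. values corresponds to the variance of the data along the eigenvector directions.
• Main idea behind eigenfaces
Suppose $\Gamma$ is an $N^2 \times 1$ vector corresponding to an $N \times N$ face image $I$. The idea is to represent $\Gamma$ (with $\Phi = \Gamma - \text{mean face}$) in a low-dimensional space:
$$\hat{\Phi} - \text{mean} = w_1 u_1 + w_2 u_2 + \dots + w_K u_K \quad (K \ll N^2)$$
• Computation of the eigenfaces
Step 1: obtain face images $I_1, I_2, \dots, I_M$ (training faces). Very important: the face images must be centered and of the same size.
Step 2: represent every image $I_i$ as a vector $\Gamma_i$.
Step 3: compute the average face vector $\Psi$:
$$\Psi = \frac{1}{M} \sum_{i=1}^{M} \Gamma_i$$
Step 4: subtract the mean face:
$$\Phi_i = \Gamma_i - \Psi$$
Step 5: compute the covariance matrix $C$:
$$C = \frac{1}{M} \sum_{n=1}^{M} \Phi_n \Phi_n^T = A A^T \quad (N^2 \times N^2 \text{ matrix})$$
where $A = [\Phi_1\ \Phi_2\ \dots\ \Phi_M]$ is an $N^2 \times M$ matrix (the constant factor $1/M$ is dropped in what follows, since it does not change the eigenvectors).
Step 6: compute the eigenvectors $u_i$ of $A A^T$. The matrix $A A^T$ is very large, so this is not practical.
Step 6.1: consider the matrix $A^T A$ ($M \times M$ matrix).
Step 6.2: compute the eigenvectors $v_i$ of $A^T A$:
$$A^T A v_i = \mu_i v_i$$
What is the relationship between $u_i$ and $v_i$?
$$A^T A v_i = \mu_i v_i \;\Rightarrow\; A A^T A v_i = \mu_i A v_i \;\Rightarrow\; C A v_i = \mu_i A v_i, \quad \text{or} \quad C u_i = \mu_i u_i \ \text{where} \ u_i = A v_i$$
Thus $A A^T$ and $A^T A$ have the same eigenvalues, and their eigenvectors are related as follows: $u_i = A v_i$.
Note 1: $A A^T$ can have up to $N^2$ eigenvalues and eigenvectors.
Note 2: $A^T A$ can have up to $M$ eigenvalues and eigenvectors.
Note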
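The steps above can be sketched in MATLAB on a toy data set. This is an illustrative sketch only, not the thesis code: the matrix G of vectorised "faces" is invented, the variable names are assumptions, and the eigenfaces are left unnormalised.

```matlab
% Sketch of the eigenface computation using the small M x M matrix A'*A.
% Each column of G is a vectorised training image; the numbers are made up.
G = [2 4 6; 1 3 8; 0 5 7; 3 3 9; 4 1 1];   % 5 pixels per image, M = 3 training faces
M = size(G, 2);

psi = mean(G, 2);                          % Step 3: average face
A   = G - repmat(psi, 1, M);               % Step 4: subtract the mean from every face

S = A' * A;                                % Step 6.1: small M x M matrix
S = (S + S') / 2;                          % enforce exact symmetry against rounding
[V, D] = eig(S);                           % Step 6.2: eigenvectors v_i, eigenvalues mu_i
mu = diag(D);

U = A * V;                                 % u_i = A * v_i, one eigenface per column

% Check that every u_i satisfies (A*A') * u_i = mu_i * u_i.
residual = max(max(abs((A * A') * U - U * diag(mu))));
fprintf('largest residual: %g\n', residual);
```

Only a 3x3 eigenproblem is solved here instead of a 5x5 one; for real images the saving is far larger. Because the mean is subtracted, the columns of A sum to zero, so one of the M eigenvalues is numerically zero and its eigenface carries no information.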
