Home

Part II - Random Processes Goals for this unit: • Give overview of

1. Now we need to map pairs of random numbers from 0 1 to 3 3 x 3 3 and then reject those that lie outside the circle x7 y 9 We simply use x 3 0 6 0 rand n 1 y 3 0 6 0 rand n 1 if x 2 y 72 lt 9 Note that here we are generating n random numbers at once and using the vector operation for exponentiation exact 63 617251235193313079 area pi 3 3 h domain we are integrating over X v 2 r 2x rx rand n 1 r 2 r rand n 1 i find x x y y lt rx r n2 length i result area sum x i x i n2 because our funct fprintf 1 8d ME he n n2 result abs result exact end If we apply the Monte Carlo Method we get decent convergence 1 74 493641 1 087639e 01 52 240562 1 137669e 01 78 62 942561 6 746898e 01 784 61 244910 2 372341e 00 7816 62 653662 9 635895e 01 78714 63 405031 2 122204e 01 785403 63 669047 5 179565e 02 7853972 63 624209 6 958217e 03 Exact 63 617251235193313079 Using Monte Carlo Sampling for Optimization e n the lab you will explore another application of MC sampling find the extrema of a function e A function such as f x cos x 5 cos 1 6 x 2 cos 2 0 x 5 cos 4 5 x x 7 cos 9 0 x plotted over 2 7 has many local extrema Many minimization routines get stuck at these local minima and can t find the global minimum e You will use MC sampli
2. 69 66 68 71 65 67 66 66 67 70 72 We know that if we want the average height of the players we simply compute 69 69 66 68 71 65 67 66 66 67 70 72 12 We can also interpret 67 9 as the mean or expected value of a random variable To see this consider choosing one player at random and let the random variable X be the player s height Then the expected value of X is 67 9 which can also be calculated by forming the product of each x and its corresponding probability 69 60 lt 66 72 12 12 12 DD which is just the formula we gave for the mean expected value 67 9 Example Suppose we are going to toss a fair coin 3 times Let X the random variable be the number of heads which appear Find the mean or expected value and explain what it means We know that there are 2 possible outcomes which are HHH HHT HTH HTT TTT TTH THT THH and each has a probability of 1 8 Now heads can occur no times 1 time 2 times or 3 times The probability that it occurs 0 times is 1 8 that it occurs 1 time is 3 8 that it occurs 2 times is 3 8 and it occurs 3 times is 1 8 Therefore the mean or expected value is 1 oO a9 L 2 g O 5 This says that if we toss the coin three times the number of times we expect heads to appear is 3 which is one half the total possible times exactly what we ex pected Another way to get this answer is to consider the set 3 2 2 1 0 1 1 2 indicating the numbe
3. Now we can move to the right or left or to the top or bottom You will implement this in the lab and use a random walk to solve Laplace s equation in two dimensions 2D Random Walk 1 walks 10 steps 2D Random Walk 1 walks 100 steps Darisss EAE peersssertsseenregeerseeerieeenesaa Seseasecsinenedad QO Das PAE PET prererreeessregitresessatestimerssesanteseejanenneneerssn x Average X T Exercise Modify the code for using random walks for the drunken sailor problem if each sailor has a 60 chance of moving to the right and a 40 chance of moving to the left Output your results in a frequency plot where the x axis is the number of steps in each direction from the origin and the y axis is the frequency Game Theory In the 1950 s John von Neumann who wrote the first computer user manual for the ENIAC developed game theory which sought strategies in multi player games The Monte Carlo method was used to simulate the games and determine the best strategies We will look at two examples of this simulating a gambling game and simulating a duel The first example is a modification to the Fermat Pascal point game which makes it much more difficult Recall in that problem that if one person won the other person did not lose Here we assume that one player wins a dollar from the other player In particular assume that there are two gamblers A and B who decide to play a game A has p dollars and B has q dollars
4. They decide to flip a coin If it comes up heads A wins 1 from B while tails works the other way The game is over when one gambler is bankrupt Here are some questions we can ask 1 What is the probability that A will win 2 What is the expected value of the game to A that is how much will he win or lose on average 3 How long will a typical game last 4 What are the chances that the winner will be ahead the entire game 5 What happens if we change the amount of money that A and B have Here s a typical game in which A starts with 3 and B with 7 assume A wins when a head comes up 2 gt 09 OF Ge H 5 5 OF FW wr e T o N A oO NUTO o NAN O e H 0o 9 H 2 8 10 11 12 13 14 15 T HHT TIT ITI Le 3 2 10 9 7I A loses after 15 tosses having tossed 6 heads and 9 tails Clearly the gambler with the most starting money has an advantage but how much It s easy to see that if A starts with 3 dollars the game can end in 3 steps Can we expect that in such a case a typical game might last 6 or 9 or 12 steps Does it depend in any way on how much more money B has Suppose that A and B start with the same amount of money In a typical game is it likely that they will each be in the lead about the same number of tosses A single plot can show you an example of how the game works but it can t tell you all the things that can happen Here is a plot of the history of A
5. e With our assumptions p C A p C but p BIA p B e We say that A and C are independent but A and B are not e For independent events A and C the probability of p AMC p A p C Example If we roll a single die with sides 1 2 3 4 5 6 what is the probability that you will roll the number 4 If we roll two dice what is the probability that they will both be 4 Clearly all numbers are equally probable and since the probabilities have to sum to 1 and there are 6 possible outcomes p X 4 1 6 To determine the probability that a 4 occurs on both dice we note that they are independent events so the probability is 4 Example Let s return to the Fermat Pascal problem of flipping a coin We know that heads and tails are each equally likely to occur If heads occurs then Fermat gets a point otherwise Pascal gets one When the game is interrupted the score is Fermat 8 and Pascal 7 There is 100 francs at stake so how do you fairly divide the winnings We first ask ourselves what is the maximum number of flips of the coin that guarantee someone will win Clearly Fermat needs 2 points and Pascal 3 The worse case scenario would be for Fermat to get 1 and Pascal to get 3 so in 4 more flips we are guaranteed a winner If we determine the possible outcomes in these 4 flips then we can assign a winner to each Then Fermat s probability of winning will be the number of outcomes he would win over the total number of outcomes
6. rand 2 10000 Of course if our domain is not the unit square then we must map the points Sampling the Unit Circle Suppose we need to evenly sample points in the unit circle i e the circle centered at 0 0 with radius 1 The first thing that might come to mind is to try to choose a radius from 0 to 1 and then choose an angle between 0 and 27 m rand b eo errand but although this seems random it is not a uniform way to sample the circle as the following figure demonstrates More points in a circle have a big radius than a small one so choosing the radius uniformly actually is the wrong thing to do How can we fix this problem The problem is that when we choose the radius to vary uniformly we re saying there are the same number of points at every value of the radius we choose between 0 and 1 But of course there aren t If we double the radius of a circle the area of the circle increases by a factor of 4 So to measure area uniformly we 4000 NONUNIFORM points in a circle 73 oe eae 0 5 oo 0 5 need r to be uniformly random in other words we want to set r to the square root of a uniform random number r rand e eto Another approach to sampling the circle is to use rejection Suppose we sample the square that contains the circle and then only plot the points that are in the circle Now we re sure to get uniformly distributed points We ll also have to reject many poi
7. 2 1e 04 1 2e 03 3 6e 04 Q 1a 0R P21 Error Decrease With N Log Error 10 10 10 10 10 Log N Here is a plot of the error on a log log plot Note that the slope is approximately 1 2 This is what we expect because we said that the error Cr for a constant C so the log of the error is log n 0 5 log n Now let s approximate the integral of a function when the interval is not 0 1 Specifically we want to integrate f x cos x 5 cos 1 6 x 2x cos 2 0 x 5x cos 4 5 x 7 cos 9 0 x over 2 7 This is a very wiggly function which we you can see from the plot Because the interval is 2 7 we have to shift the interval from 0 1 to 2 7 This is easy to do with a linear map we want 0 2 and 1 7 so the linear function that does this is y 2 5x so a random point x in the interval 2 7 is just x 2 0 7 0 2 0 rand n 1 Applying the Monte Carlo Method we get decent convergence to the exact solution 4 527569 Sum of Cosines to Integrate 1 21 486936 2 6e 01 10 21 701206 1 7e 01 100 2 472846 2 0e 00 1000 4 911594 3 8e 01 10000 4 253230 2 7e 01 100000 4 424016 1 0e 01 1000000 4 509781 1 7e 02 10000000 4 529185 1 6e 03 1 Again we see the approximation to a slope of 5 on a log log plot Error decrease for integral of sum of cosines Log fError 10 10 10 10 10 Log N e The two functions both seem hard to in
8. Consider the following two outcomes e A 3 or less hurricanes to hit Florida in 2011 e B 4 or 5 hurricanes to hit Florida in 2011 Then AU B is 5 or less hurricanes to hit Florida in 2011 The probability of A U B is p A U B p A p B because the events are mutually exclusive Then AQA B is zero because both events can t be satisfied Now consider two events which are not mutually exclusive e C rain tomorrow e D rain the day after tomorrow Then CU D rain in the next two days i e rain tomorrow or rain the next day Now the intersection of C and D is everything that is in event and in event D so C N D rain tomorrow and rain the next day What is the probability of C U D We can t simply sum the probabilities For example if there was a probability of 1 4 each day i e 25 chance then we can t say there is a 50 chance that it will rain in the next two days In the case where the events are not mutually exclusive we have p C UD p C p D p C nD Conditional Probability e If we know that one event has occurred it may change our view of the probability of another event For example let A rain today B rain tomorrow C rain in 90 days from now Assume that knowledge that A has occurred will change your view of the probability that B will occur but not of the probability that C will occur e We write p B A to denote the conditional probability of B given A
9. E a d c tenax tent Our convergence to the exact solution 1 433812586520645 1 10 100 1000 10000 100000 1000000 10000000 0 443790 1 547768 1 419920 1 464754 1 430735 1 428808 1 432345 1 433867 9 900230e 01 1 139549e 01 1 389309e 02 3 0941 70e 02 3 077735e 03 5 004836e 03 1 467762e 03 5 394032e 05 Sometimes we have to integrate a function over a more complicated domain where it is not as easy to sample in but we have a way to determine if a point is in the region for example an E shaped region What can we do in that case The answer is to return to the original way we evaluated an integral by putting a bounding box around the region This approach is usually called MC with rejection If we want to integrate over a region in 2 d that is not a box we enclose that region in a box We then generate a point in the box and reject it if it is not in the region of integration To see how this rejection technique works we look at the example of computing an integral over a circle Our integrand will be Tesa and we will integrate over the circle of radius 3 centered at 0 0 For this problem the function is smooth but the region is not a rectangle It s not hard to work out the exact answer using polar coordinates because the circle is special But the Monte Carlo approach with rejection would handle an ellipse or a region shaped like Mickey Mouse just as easily z x Z F X
10. We said A and B should shoot at each other It seems that C should shoot at the better shot say A so that if he accidentally hits him he has to face the weaker player B Another possibility C could shoot at no one if that s allowed Then once A or B is out C is sure to get the first shot at the survivor Coding the Three Person Duel e The three way duel is harder Instead of writing it from scratch let s see what we need to change in the two way duel code When we do this we should see what we would have to revise if we had a four person duel e Now we can t assume that as soon as one person is eliminated the game is over So the first thing we have to change is the part of the code that says if the shot is accurate the game is over and the shooter is the winner e Another issue is that now we have a choice of who to aim at We will take the rational strategy that each player aims at the other player who is the most accurate shot e A brute force approach to coding this would be to write out what Player 1 will do then what Player 2 will do and so on This is straightforward but cluttered It might be better to try to abstract what is going on e To this end let s concentrate on one arbitrary player whose index is I i e 1 2 or 3 If it is J s turn then what should he do check to see if he is alive if alive proceed decide who to aim at there may be one or two choices based on their probabilitie
11. can then write a script to call this function n_duel times the number of duels we are simulating and compute the average number of shots required to end the duel We would also keep a tally of how many duels were won by each player in the calling script so we can estimate probabilities How do we decide if the player shooting hits the other player We generate a random number r between 0 and 1 then Player 1 hits his target if r lt p 1 similarly for Player 2 The duel continues until a target is hit survivor turn_num function duel p turn_num 0 count for number of shots required to complete due if p 1 lt O amp p 2 lt 0 make sure input probabilities are valid fprintf invalid probabilities so duel fails n end while 1 Player 1 fires turn_num turn_num 1 r rand if r lt p 1 amp Player 1 hits his target survivor 1 break leave the while loop end Player 2 fires if r lt p 2 Player 2 hits his target survivor 2 break end end Note the use of the while 1 here The expression which must be satisfied in this statement is always true so the only way to escape the loop is through the break statement This is why we added a check at the first of the program to verify that not both probabilities are lt 0 A 3 person duel is more interesting and we will look at it next Exercise Download the function duel m which has the code described above
12. f the top of the needle has a y coordinate greater than one then the needle touches the line i e we have a hit If the bottom of the needle has a y coordinate less than zero then it touches the other line and we have a hit e Since it is known that the probability that the needle will touch the line is 2 7 then number of hits 2 number of points r and thus number of points number of hits One way to see that the probability is 2 7 is to note that the shaded portion in the figure below is found by using the definite integral of 1 2 sin theta evaluated from zero to pi The result is that the shaded portion has a value of 1 The value of the entire rectangle is 1 2 pi or pi 2 So the probability of a hit is 1 pi 2 or 2 pi That s approximately 6366197 oe f x 12 sint B from centerof L needle to Z nearest line Possible values for 6 Exercise Write pseudo code for an algorithm to approximate 7 using Buffon s needle problem Then modify the code for estimating 7 using the unit circle to solve this problem Using Monte Carlo to Approximate an Integral b e Suppose we want to evaluate f x dz a elf f x gt 0 for a lt x lt b then we know that this integral represents the area under the curve y f x and above the x axis e Standard deterministic numerical integration rules approximate this integral by evaluating the integrand f x at a set number of points and multiplying by appr
13. find the cumulative sum of integers from 1 to 5 we use cumsum 1 5 and the response is 1 3 6 10 15 Here is a script to draw the last bar graph for the CDF x 2 12 Pat li 2 8 458 4655s 4 342 423 7386 cdf cumsum pdf bar edi xlabel Score ylabel CDF Score title CDr for 2 dice Using the CDF for Simulation e The CDF can be used to simulate the behavior of a discrete system e As an example consider the problem of simulating rolling two dice and using the CDF to estimate the probabilities where we are given the CDF as cdf 03 08 16 28 42 58 72 83 92 97 1 00 e First recall that the probability that x is between z and z is cdf x2 cdf 21 e For our problem we generate a random number r between 0 and 1 for a probability We determine 7 such that cdf z 1 lt r lt edf 2 However when 2 1 this doesn t work so we append our CDF to include 0 cdf 0 03 08 16 28 42 58 72 83 92 97 1 00 For example if r 61 then i 8 e We do this lots of times and keep track of how many random numbers were assigned to 7 1 2 2 etc e Then our estimate for the probability of the sum being say 2 i e corre sponding to 2 is the number of times we have chosen a random number between 0 and 0 03 divided by the total number of random numbers gener ated function prob cd_sim n count 1 12 0 for k i n r rand
14. implemented Then write a script which calls this function for each simulation of a duel The input for your script should be the probabilities of each player p 1 2 and the number of simulations you wish to perform Your code should i compute the average number of shots to complete the duel and ii estimate the probabilities and compare with exact After this is done make a plot of the average number of shots required versus the exact probability of Player 1 The Good The Bad and The Ugly The Three Way Duel from In a two person duel there s no strategy except of course to shoot at your Opponent on your turn But suppose we move to a three person duel involving A B and C Let s suppose that C is a relatively poor shot If A and B are reasonable they will probably shoot at each other since each represents the biggest threat to the other As soon as one gets hit it will be C s turn and that s his best chance to have the first shot at the remaining opponent Remember they take turns shooting But a disaster occurs if instead of A or B knocking the other out poor shot C accidentally hits A or B on his turn Because then C does not get the next shot rather the survivor gets first shot at C and we know that whether A or B shoots either is a better shot than C Now the coding for this problem is similar to that for the two person duel except that on each shot the player has a choice of whom to shoot at
15. of what happened e However if we keep records long enough we may see that the frequency plot can be used to make statements about what is likely to happen in the future For two dice we can say that a score of 7 is very likely and can even guess how much more likely it is than say 3 e This suggests that there is an underlying probability that influences which outcomes occur In cases where we are simply collecting data we can turn our frequencies into estimates of probability by normalizing by the total number of cases we observed frequency of result 1 estimated probability of result i W _ total number of results Now let s compare the approximate probabilities found using this formula for the case where we used n throws of 2 dice and then compare this with their exact probabilities To calculate the actual probabilities of each sum we determine the possible out comes of rolling the dice Because there are more outcomes in this case it s better to make an outcome table where we list all possible outcomes of the roll of 2 fair dice H 2 3 4 5 6 101 1 1 2 1 3 1 4 1 5 1 6 2 2 1 2 2 2 3 2 4 2 5 2 6 3 3 1 3 2 3 3 3 4 3 5 3 6 441 4 2 4 3 44 4 4 6 5 6 1 6 2 6 3 5 4 55 6 6 6 1 6 2 6 3 6 4 6 5 6 6 e We see that there are a total of 36 combinations of the dice e f we want to compute the p
16. piecewise constant function defined for all x e cdf x 0 for x less than the smallest possible outcome e cdf x 1 for x greater than or equal to the largest outcome e cdf x is essentially the discrete integral of pdf X over the appropriate interval e cdf x is monotonic and hence invertible e the probability that x is between x and z 1 lt x lt x2 is cdf a2 cdf x1 Return again to the problem of rolling two dice and calculating the probability that the sum of the dice is a number between 2 and 12 We found the exact probability of each sum as 2 3 4 5 6 7 8 9 10 11 12 eel ele al ee 03 06 08 1t 14 17 14 11 08 06 03 For a discrete case like this it s easy to make the corresponding cumulative density function table 2 3 4 5 6 7 8 9 10 11 12 O Q O Q O eases eee eae 03 08 16 28 42 68 72 83 92 97 1 00 The CDF is actually defined for all z not just the values 2 through 12 For example the table tells us that cdf 4 3 0 16 and cdf 15 1 The latter just tells us that the probability of rolling two dice and getting a sum lt 15 is 1 Cumulative Probability for 2 dice 0 9 0 4 0 3 0 2 i 2 3 4 5 6 7 8 9 10 Score bag 2 N gt D CDF Score a ao 11 12 Here is the plot of the CDF for 2 dice Matlab has a built in command cumsum which can assist us in calculating the CDF from a discrete PDF For example to
17. s money for the example game above Gambler s Ruin A 3 B 7 Steps 15 Flips 1 For a simulation we need to look at a lot of games and we try different starting bets the parameters of this model Let s consider the following cases when we run our simulations 3 7 Our starting example 30 70 10 times as long 10 10 Equal stakes equal odds 1 100 Game will be quick Structure of the code e input starting stakes a_stakes b_stakes and number of games to simu late n_games initialize set a wins 0 b_wins 0 a a_stakes b b_stakes e for each game while a gt 0 and b gt 0O generate a random number to see who wins ifawins a a 1 b b 1 if b wins b b 1 a a 1 e stop when a 0 or b 0 e after the game is over increment wins e after all the games are over approximate the probability that each player wins b_wins O we a a_stakes b_stakes o II while 0 lt a amp X amp O lt b end increment wins if bh Q a_wins a_wins else b_wins end y Averag rA prob_a_win a_ prob_b_win The results of 1000 games for each set of starting stakes 7 3 21 30 70 2 010 12 992 10 10 101 1 100 104 10 472 From this data you might be able to guess the following e the expected number of steps is A B e A s chance of winning is A A B What is the expected value of this gam
18. similarly for Pascal Then we will know how to divide the money Here are the 2 possible outcomes for 4 flips and we have indicated the winner in each case remembering that Fermat needs 2 heads to win and Pascal 3 tails HHHH F HHHT F HHTH F HHTT F HTHH F HTTH F HTHT F HTTT P TTTT P TTHT P TTTH P TTHH F THTT P THHH F THTH F THHT F So Fermat will win 11 out of 16 times so his probability is 11 16 Pascal will win 5 out of 16 times so his probability is 5 16 Fermat should get 11 16 of the money and Pascal the rest Simulating random processes using Matlab s rand command e Random events are easily simulated in Matlab with the function rand which you have already encountered A computer cannot generate a truly random number rather a pseudorandom number but for most purposes it suffices because it is practically unpredictable We have seen that if you use the command x rand you get a number between 0 and 1 It can also generate row or column vectors For example rand 1 5 returns a row vector of five random numbers 1 row 5 columns such as a rand b rand c rand d rand e rand TN amp N amp N ON ON D oao uwe ao w 2 2 2 1 5 4 wy NI NAI 2416 0 9644 0 4519 0 3278 0 8913 lt a scalar value lt a column vector lt a row vector lt a matrix lt a 5x5 matrix e f we want to generate n integers between say 0 and k then we first genera
19. tomorrow crop yield number of hurricanes to hit Florida s coast in 2011 the number of times the sum of two dice thrown is even are all examples of events for which we may be interested in determining their probabilities Notation amp Terminology e Let X denote the random variable for example the sum of two dice when they are thrown e The set of all possible outcomes is denoted 2 For example if we roll one die the possible outcomes are Q 1 2 3 4 5 6 e We will denote an event as So for example if we roll a single die and want to determine the likelihood of the result being an even number then E 2 4 6 note that E C Q e We are interested in determining the probability of an event and will denote it p The sum of all probabilities for all possible outcomes is 1 The probability of an event will be the sum of the probabilities of each outcome in the event The probability of the complement of an event will be 1 p What is meant by the probability of an event Suppose the weather forecast is that the probability of rain tomorrow is 1 4 0 25 We say the likelihood of rain is 25 What does this mean The complement of this event is that there is NO rain tomorrow so the probability of no rain is 1 1 0 75 4 So there is a 75 chance of no rain tomorrow and thus it is 3 times more likely that there is no rain tomorrow than there is rain Unions and Intersections
20. want to know the probabilities that either player survives If we assume Player A has an accuracy of and Player B an accuracy of 2 and A gets the first shot then Player A can survive if 1 A hits on the first shot probability 4 6 OR 2 A misses B misses A hits probability 2 6 1 6 4 6 OR 3 A misses B misses A misses B misses A hits probability 2 6 1 6 2 6 1 6 4 6 OR The probability that A survives is thus an infinite sum Dene 1 0 This has the form 4 6 l x2 2 2 where 0 lt x lt 1 so this is equal to 4 6 x 1 1 x so the probability that A wins is p A 12 p A p B p A p B 17 while B s chance is p B 1 p A 9 p A p B p A p B 17 We have worked out the exact probabilities in this case however now we want to simulate duels so we can approximate the probabilities and compare with our exact ones To estzmate the probabilities of A and B surviving we would simulate the duel 100 or 1 000 times counting the frequency with which each player survived and dividing by the number of duels We will also keep track of the number of turns it takes to complete a duel i e until we get a hit and then compute the average Basically we just need a function which performs the duel and returns the number of shots required and the survivor either Player 1 or Player 2 The only input that is needed is p 1 2 which stores each player s probability We
21. 1 ORE if cdf i 1 lt r amp r lt cdf i count i count i 1 end end end freq count 2 12 n x 2 12 bar x freq end Probability sae Exact blue and estimated red PDF N 1000 0 16 0 14 0 12 0 1 0 08 0 06 0 04 0 02 Score Exact blue Approximate red Exercise Let s return again to the problem where we have a die which is not fair in the sense that each side has the following probabilities 1 2 2 2 1 1 lo 2 gt 3 gt 4 gt 5 gt gt 9 9 9 9 9 9 Compute the exact CDF What is the probability that you will roll a 3 or less Write a program to use your CDF to approximate the exact probabilities Com pare The Monte Carlo Method L 336 24 3 50 102 AN 5 peat Spin gt The Monte Carlo Method MCM uses random numbers to solve problems Here we will view MCM as a sampling method First we will see how we can sample different geometric shapes and then apply the sampling to different problems The MC method is named after the city in the Monaco principality because of a roulette a simple random number generator The name and the systematic development of Monte Carlo methods dates from about 1944 The real use of Monte Carlo methods as a research tool stems from work on the atomic bomb during the second world war This work involved a direct simulation of the probabilistic problems concerned with random neutron diffusion in fissile material but even a
22. Part Il Random Processes Goals for this unit e Give overview of concepts from discrete probability e Give analogous concepts from continuous probability e See how the Monte Carlo method can be viewed as sampling technique e See how Matlab can help us to simulate random processes e See applications of the Monte Carlo method such as approximating an inte gral or finding extrema of a function e See how Random Walks can be used to simulate experiments that can t typically be done in any other way e Random processes are important because most real world problems exhibit random variations the term stochastic processes is also used e The Monte Carlo Method is based on the principles of probability and statis tics so we will first look at some basic ideas of probability e Similar to the situation where we looked at continuous and discrete problems recall this was the main division in Algorithms and II we will consider concepts from both discrete and continuous probability e When we talk about discrete probability we consider experiments with a finite number of possible outcomes For example if we flip a coin or roll a die we have a fixed number of outcomes e When we talk about continuous probability we consider experiments where the random variable takes on all values in a range For example if we spin a Spinner and see what point on the circle it lands the random variable is the point and it takes on all values on the cir
23. an motion In 1905 the same year that he published his paper on special relativity Albert Einstein wrote a paper explaining Brownian motion Each pollen grain he said was constantly being jostled by the motions of the water molecules on all sides of it Random imbalances in these forces would cause the pollen grains to twitch and shift Moreover if we observed the particle to be at position x y at time t 0 then its distance from that point at a later time t was a normal random variable with mean 0 and variance t In other words its typical distance would grown as vt Recall that the command randn used here generates numbers with a normal distribution T 10 0 N 1000 h sart T N x 1 0 0 y 1 0 0 for i 1 N x i 1 x i h randn y i 1 y i h randn end We can write this program another way to get rid of the loop T 10 0 N 1000 h sart T N x 1 0 0 y 1 0 0 x 2 N 1 h cumsum randn 1 N 1 y 2 N 1 h cumsum randn 1 N 1 Brownian motion also explained the phenomenon of diffusion in which a drop of ink in water slowly expands and mixes As particles of ink randomly collide with water molecules they spread and mix without requiring any stirring The mixing obeys the vt law so that roughly speaking if the diameter of the ink drop doubles in 10 seconds it will double again in 40 seconds The physical phenomenon of Brownian Motion has be
24. choices right or left e So if we have 120 steps 21 10 possible random walks e How many of them can end up at a given position e There is only one that can end up at n e There are n out of 2 i e 120 out of 10 that end at n 2 they involve n 1 steps of 1 and one step of 1 these can occur at n places For example one sailor could go left on the first step and right on all remaining ones another sailor could go right on all steps except the second one etc e One can show that in general there are ia distinct random walks that will end up at n 2 k e That means the probability of ending up at a particular spot is just the corresponding combinatorial coefficient divided by 2 e In other words row n of Pascal s triangle tells you what a random walk can do But row n of Pascal s triangle can be thought of as dropping balls in a pachinko game that can randomly go left or right at each level An entry in Pascal s triangle is found by summing the two numbers above it to the left and right The top 1 is called the zeroth row An entry in the nth row can be found by n choose r where r is the element number in that row For example in row 3 1 is the zeroth element 3 is element number 1 the next three is the 2nd element and the last 1 is the 3rd element The formula for n Choose r Is n ri n r where 0 1 riin r Random Walks in 2D We can do the same thing in two dimensions
25. cle Historical Notes Archaeologists have found evidence of games of chance on prehistoric digs show ing that gaming and gambling have been a major pastime for different peoples since the dawn of civilization Given the Greek Egyptian Chinese and Indian dynasties other great mathematical discoveries many of which predated the more often quoted European works and the propensity of people to gamble one would expect the mathematics of chance to have been one of the earliest devel oped Surprisingly it wasn t until the 17th century that a rigorous mathematics of probability was developed by French mathematicians Pierre de Fermat and Blaise Pascal t lnformation taken from MathForum http mathforum org isaac problems prob1 html The problem that inspired the development of mathematical probability in Re naissance Europe was the problem of points It can be stated this way Two equally skilled players are interrupted while playing a game of chance for a certain amount of money Given the score of the game at that point how should the stakes be divided In this case equally skilled indicates that each player started the game with an equal chance of winning for whatever reason For the sake of illustration imagine the following scenario Pascal and Fermat are sitting in a cafe in Paris and decide after many arduous hours discussing more difficult scenarios to play the simplest of all games flipping a coin If the coin
26. comes up heads Fermat gets a point If it comes up tails Pascal gets a point The first to get 10 points wins Knowing that they ll just end up taking each other out to dinner anyway they each ante up a generous 50 Francs making the total pot worth 100 They are of course playing winner takes all But then a strange thing happens Fermat is winning 8 points to 7 when he receives an urgent message that a friend is sick and he must rush to his home town of Toulouse The carriage man who has delivered the message offers to take him but only if they leave immediately Of course Pascal understands but later in correspondence the problem arises how should the 100 Francs be divided How would you divide the 100 Francs After we discuss probability you should be able to divide the money and argue that it is a fair division What do we mean by probability and events e Probability measures the chance of a given event occurring It is a number between 0 and 1 e The complement of an event is everything that is not in the event for example the complement of 3 or less hurricanes hitting Florida in 2011 is 4 or more hurricanes hitting Florida e f the probability of an event is 1 then it is certain to occur e A probability of 0 represents an event that cannot occur i e is impossible e Probabilities in between 0 and 1 are events that may occur and the bigger the number the more likely that the event will occur e Rain
27. e ied ETE A s chance of winning is eT SA SB the chance of A losing i e B winning is 1 this probability or ing i winning p i Oly ee The expected value is value win p win value loss x p loss or A B Boz E SA FSB so even when A has 1 against B s 100 it s a fair game small chance of big win big chance of small loss Here are the results of four simulations which mimic our cases above Frequency 300 250 200 150 100 50 Gambler s ruin number of steps 50 100 a 150 200 Steps A 3 B 7 250 Gambler s ruin number of steps 180 T T T 160 F 140F 120 100 Frequency iN D ro ro ro iy Oo 0 0 2000 4000 6000 A 30 B 70 Steps 8000 10000 12000 14000 Frequency 140 120 100 o a oO 40 20 Gambler s ruin number of steps Steps A 10 B 10 500 Frequency Gambler s ruin number of steps 1000 900 800 700 600 500 400 300 200 100 0 2000 4000 6000 8000 10000 12000 Steps A 1 B 100 Game Theory Simulating a Duel Another example in game theory is a duel Two players take turns firing at each other until one is hit The accuracy or probability of hitting the opponent on one shot is known for both players In the two person duel there is no strategy just keep firing but we
28. en explained by assuming Phen s a eden os iagi Sate dee a 2 S55 F ma L eee Ae ae L Poe a i a2 AD ES Seg piat z m a MAES Na that a pollen grain was subjected to repeated random impulses This model was intriguing to physicists and mathematicians and they soon made a series of abstract simplified versions of it whose properties they were able to analyze and which often could be applied to new problems The simplest version is known as the Random Walk in 1D Random Walks 100 Drunken Sailors We ll introduce the random walk with a story about a captain whose ship was carrying a load of rum The ship was tied up at the dock the captain was asleep and the 100 sailors of the crew broke into the rum got drunk and staggered out onto the dock The dock was 100 yards long and the ship was at the 50 yard mark The sailors were so drunk that each step they took was in a random direction to the left or right We re ignoring the width of the pier because they can only move to the right or left They were only able to manage one step a minute Two hours later the captain woke up Oh no he said There s only 50 steps to the left or right and they fall into the sea And two hours makes 120 steps They will all be drowned for sure But he was surprised to see that around 80 of the crew were actually within 11 steps to the left or right and that all of the crew was alive and safe though in
29. he integral is simply the usual fraction times the area of the testing rectangle whose area is M b a number of hits x M b a number of points 2 4 Here we are approximating 2 312 g x de 2907 0 3 10 a Using 12 points we have that the approximation is 5 40 8SM 3 00 12 12 where 8 is the area of the box we are sampling in and 5 out of 12 points were in the desired region Not a good approximation but we only used 12 points How do you choose the bounding box It is not unique Example Approximate the integral 1 2 1 cos 7x amp 0 3183 0 TT using MC We choose the bounding box to be 0 5 x 0 1 The results are given in the table n approx error 100 0 3150 0 0033 1000 0 3065 0 0118 10000 0 3160 0 0023 100 000 0 3196 0 0013 fan 2 In the example of approximating iF x dx we saw that there are a lot of random points generated which do not lie in the region under the curve f x x We will now look at using MCM for approximating an integral from a slightly different viewpoint This has the advantage that we don t have to take a bounding box and we get a better approximation with the same number of points Recall that the Riemann sum for approximating f F dx with the uniform partition en R bhaa aa FAL where Ar e a f toa Ysea se Here is any point in the interval x 1 x so if Z is the midpoint x 1 x 2 then we have the midpoint ru
30. is an easy way to see if everyone has been shot except you so that you can be declared the winner If you hit your target set the target s probability to 0 indicating he has been shot Then all you have to do is sum the probability array p and if you get 0 then you are the only survivor Remember that you have temporarily set your probability to 0 so you won t shoot yourself turn num tura nom 1 p_save p i pti o 0 pmax target max p TARGET is index of max r rand if r lt p_save p target 0 0 if sum p survivor break end end p i p_save reset your probability h you shot opponent target 0 0 everyone has been shot except you Il Conclusion for Three Way Duel e This is not the best way or the only way but it is a way and it has some real advantages e There are hardly any extra variables except for p_save e he code works the same for every player We never assume that the players were given in a particular order e he code works the same if we increase the number of players e Your homework will be simulate a three person duel using this strategy and then to modify your code to change the strategy
31. le As n oo we get the integral The MCM approach to this problem can be written as f to ders iene where now the points x are random ere in the interval fa b This is not a determinate formula because we choose the x randomly For large n we can get a very good approximation Another way to look at this expression is to recall from calculus that the average of the function f x over a b is b b 7 z f sear f se de af x 0 0 2 gt fa One can show rigorously that the error is O Example Consider the example we did before of approximating the integral 1 2 cos 7x 0 3183 0 TT using MC We now compare the two methods We have to do the same number of function values in each case n approx error 100 0 3150 0 0033 0 3278 0 0095 1000 0 3065 0 0118 0 3157 0 0026 10000 0 3160 0 0023 0 3193 0 0010 Example Consider the problem of estimating a hard integral J fix dx where f x sech 10 0 x 0 2 sech 100 0 x x 0 4 sech 1000 0 x 0 6 From the plot it should be clear that this is actually a difficult function to integrate If we apply the Monte Carlo Method we get decent convergence to the exact value of 0 21080273631018169851 Problem 21 0 8 0 6 0 4 0 2 X 1 0 096400 10 0 179749 100 0 278890 1000 0 221436 10 000 0 210584 100 000 0 212075 1 000 000 0 211172 1NA NANN NANN NA 910791 1 1e 01 3 1e 02 6 8e 02 1 0e 02
32. lues of n 3 2 3 16 3 164 3 1292 3 1491 0 0584 0 0184 0 0224 0 0124 0 0075 Note that the error is NOT monotonically decreasing as n increases Buffon s Needle Problem An interesting related problem is Buffon s Needle problem which was first pro posed in 1777 Here s the problem in a simplified form e Suppose you have a table top which you have drawn horizontal lines every 1 inch e You now drop a needle of length 1 inch onto the table e What is the probability that the needle will be lying across or touching one of the lines 2 e Actually one can show that the probability is just r pf o we So if we could simulate this on a computer then we could have another method for approximating 7 e Let s analyze the problem to see how we can implement it e Let s assume without loss of generality that the lines are horizontal they are spaced 1 unit apart and the length of the needle is 1 unit e Assume as in the figure that we have dropped a needle and that we know the location of the middle of the needle actually we just need the y coordinate and the angle 0 the needle makes with the horizon e So in the figure we see that the vertical side of the triangle has length sin 0 Ea A sin 6 e Since we know the y coordinate of the middle of the needle we know the y coordinates of the end of the needle 1 sin 0 y 5 sin e Here are the 3 situations we can have 0 fy e
33. mize his profits How can we use MC to answer this question e Let Q denote the quantity that he orders D the demand amount sold e Set Q 80 e Generate n replications of D i e generate n random numbers between 80 and 140 e For each replication compute the daily profit e After n replications estimate the earnings ordering Q cookies by X daily profit n earnings e Repeat for integer values of Q between 80 and 140 e Select the value of which yields the best earnings prorat Oz for k ll 5 d rand 1 d 80 60 x profit else profit 6 q profit profit 6 d 4 q d end 60 T T T T T 56 F 54 F 50 F 80 90 100 110 120 130 140 Profit is maximized by ordering approximately 116 cookies Brownian Motion In 1827 Scottish botanist Robert Brown was studying pollen grains which he had mixed in water Although the water was absolutely still the pollen grains seemed to quiver and move about randomly He could not stop the motion or explain it He carefully described his observations in a paper When other researchers were able to reproduce the same behav ior even using other liquids and other particles the phenomenon was named Brownian Motion although no one had a good explanation for what they were observing Check out the You Tube video for Brownian Motion at http www youtube com watch v apU1l_baT Ke Here is the result of a simulation of Browni
34. nd at each step generate a random number r between 0 and 1 we could have mapped r between 1 and 1 if 0 lt r lt 0 5 we move to the left so x x 1 if 5 lt r lt 1 we move to the right sor 1 A Take WALK_NUM walks each of length STEP_NUM random 1 or 1 ste h time 1 step_num for walk 1 walk_num x OQ for step 1 step_num r rand h h Take the step h h if r lt 0 5 x x 1 else end Update the average and max x2_ave step x2_ave step x72 x2_ave x2_ave walk_num Plot the results plot time x2_ave time x2_max LineW 1D Random Walk Max and average of distance versus time 1200 T T T T T 1000 800 600 Distance Squared 200 0 20 40 60 80 100 120 Blue square of average distance Green square of maximum distance The square of the average distance behaves linearly but the square of the maxi mum distance each sailor traveled from the starting point varies a lot e In 1D how many possible random walks of n steps are there e To answer this consider flipping a coin If we flip it once there are two choices H and T If we flip it two times there are 2 4 choices HH HT TT TH and if we flip it 3 times there are 2 8 choices HHH HHT ATH ATT TTT TAT THH TTH so in general there are 2 outcomes for flipping a coin n times e Our random walk in 1D is like flipping a coin there are only two
35. ng to estimate where the global minimum is Note that if we have an algorithm to find a minimum of f x we can find its maximum by find the minimum of f x Sum of Cosines to Integrate Exercise Write a script to implement MC for approximating an integral in 1D using the formula Try out the code on the integral 1 cos 47a da 0 Modify the script to to perform integration in two spatial dimensions Try your code out on the integral 3 p2 J 1 8zy dydz 57 0 J1 Simulations using the Monte Carlo Method e So far we ve concentrated on the Monte Carlo method as a means of sam pling This gave us an alternate means of solving integration problems Of course there are already other methods for dealing with integration prob lems e We will next consider a Monte Carlo application in which no other method is available the field of simulation e We will consider the common features of tossing a coin many times watching a drop of ink spread into water determining game strategies and observing a drunk stagger back and forth along Main Street e Using ideas from the Monte Carlo method we will be able to pose and answer questions about these random processes e Computer simulation is intended to model a complex process The underlying model is made up of simplifications arbitrary choices and rough measure ments e Our hope is that there is still enough relationship to the real world that we can study the result
36. nts that we generate which of course is not optimal We imbed the unit circle in a square whose vertices are 1 1 1 1 1 1 1 1 x 1 2 rand y 1 2 rand i find x 2 y2 lt 1 plot xG yi be In this example there were 3110 points in the circle out of 4000 generated Is there any significance in this ratio 3110 4000 UNIFORM points in a circle Sampling using rejection Sampling the Surface of the Unit Sphere Another important shape to sample is the surface of the unit sphere We can do this but we ll have to cheat slightly and use normal random numbers rather than uniform ones Recall that the command randn did this It turns out that if you want a point on the surface of the unit sphere you can just make a vector of 3 normal random numbers and scale by its length xyz randn 3 1000 for j 1 1000 xyz j xyz j norm xyz j end scatter3 xyz 1 xyz 2 Xyz 3 This will also work for higher dimensional spheres 1000 Uniform Points on Sphere Exercise Use MCM to sample the L shaped region O lt 2 lt 10 O lt y lt 4 forO lt x lt 2and0 lt y lt 2 forx gt 2 Use 1000 points plot Using MCM to approximate 7 e We know that the area of the unit circle with radius 1 is just 7 Suppose we enclose the unit circle in a square say the one centered at the origin with side 2 e We want to randomly throw darts at the
37. numbers it only initializes the generator e Sometimes people will use the system clock to make sure the seed is being reset rand state sum 100 clock e Try setting the seed for rand generating a random number then resetting the seed and generate another random number see that you get the same random number two times in a row e g rand state 12345 rand 1 Exercise Use two calls to Matlab s rand command to generate two random row vectors of length 4 Now repeat the exercise where you reset the seed before each call to rand You should get the same vector each time in this latter case Probability Density Function e f we measure a random variable many times then we can build up a distri bution of the values it can take e As we take more and more measurements and plot them we will see an underlying structure or distribution of the values e This underlying structure is called the probability density function PDF e For a discrete probability distribution the PDF associates a probability with each value of a discrete random variable For example let the random variable X be the number of rainy days in a 10 day period The number of outcomes is 11 because we can have 0 days 1 day 10 days The discrete PDF would be a plot of each probability e For a continuous probability distribution the random variables can take all values in a given range so we can not assign probabilities to individual values Ins
38. opriate weights For example the midpoint rule is te dem f b a Simpson s rule is ie dem f a 4f F0 e Deterministic numerical integration or numerical quadrature formulas have the general form b Fle de Y flea i 1 where the points x are called quadrature points and the values w are quadra ture weights To approximate the integral using Monte Carlo which is a probablistic approach we first extend the idea we used to approximate 7 Assume for now that the integrand is greater than or equal to zero in the interval la b then we e choose a rectangle which includes the area you want to determine e g if f x lt M for all a lt x lt b then choose the rectangle with base b a and height M generate a random point in the rectangle determine if the random point is in desired region take area under curve as a fraction of the area of the rectangle e First we generate two random points call them x y and map them to the rectangle enclosing our integral e In our method for approximating m we checked to see if x y lt 1 What do we check in this case At this given x point we want to see if the random point y is above the curve or not It is easy to check to see if y lt f x where x is the random point If this is true then the point lies in the desired area and our counter is incremented e Our approximation to the area i e the value of t
39. pute the mean or expected value as 1 2 3 9 i ee ee ee ee pmen 1 93g a gg 7 O36 and the variance as l 2 2 2 3 2 1 2 Var X T 7 ae te a tant 7 75 12 T 20 8003 with a standard deviation of o X v 5 833 2 415 The mean is what we expected from looking at the PDF If for some reason 7 was impossible to roll it would still be the expected value Normal distribution In probability theory the normal or Gaussian distribution is a continuous PDF that is often used as a first approximation to describe real valued random variables that tend to cluster around a single mean value The graph of the associated PDF is the well known bell shaped curve known as the Gaussian function or bell curve given by 1 z fle e where parameter u is the mean location of the peak and g is the variance the measure of the width of the distribution The distribution with u 0 and g 1 is called the standard normal distribution a Qi 3 Ot 1 Son ae 03 Satan s so So 40 36 20 lo 3 ilo 2u 3v 4a 60 fo 2 86 Omo 067450 26880 The normal distribution is considered the most basic continuous PDF and it is used throughout statistics natural sciences and social sciences as a simple model for complex phenomena randn Matlab has a built in command randn which returns pseudorandom values drawn from the standard normal distribution We can use it to generate values from a normal distribution with a par
40. r of heads appearing in 3 tosses and average them to get OFFI I la 8 12 78 37 2 Statisticians use the variance and standard deviation of a continuous random variable X as a way of measuring its dispersion or the degree to which it is scattered The formula for the variance Var X of a discrete PDF is Var X gt pla a p l and for a continuous PDF b Var p x x u dx where u is the expected value or mean The standard deviation o X is just Var X Note that Var X is the mean or expected value of the function X 1 which measures the square of the distance of X from its mean It is for this reason that Var X is sometimes called the mean square deviation and a X is called the root mean square deviation Var X will be larger if X tends to wander far away from its mean and smaller if the values of X tend to cluster near its mean The reason we take the square root in the definition of o X is that Var X is the mean expected value of the square of the deviation from the mean and thus is measured in square units Its square root therefore gives us a measure in ordinary units Example Compute the mean expected value variance and standard deviation for our example where we throw 2 dice and sum the values Recall that here the random variable X is the sum of the values Returning to our probability table 2 3 4 5 6 7 8 9 10 11 12 Q Q Q Q Q Q Q Q Q Q Q we can com
41. robability of an outcome how would we do it As an example consider the probability of the outcome 3 5 Remember that these events are independent and because the probability of rolling a 3 with the first die is 1 out of 6 and the probability of rolling a 5 with the second die is also 1 out of 6 so the probability of the outcome 3 5 is es eee 6 6 36 e However we want to determine the probability that a particular sum will occur The number of outcomes resulting in a sum of 2 to 12 are 2 1 3 2 4 3 5 4 6 5 1 6 8 5 9 4 10 3 U 12 1 e How do we compute the probability that a particular sum will occur Con sider the probability of the sum 5 occurring There are a total of 4 ways out of 36 outcomes that yield the sum of 5 so its exact probability is 4 36 1 9 e For a pair of fair dice the exact probability of each total is 2 3 4 5 6 7 8 9 10 11 12 A a e aha Ba OL eels a a E 36 36 36 36 36 36 36 36 36 36 36 05068 3114 3714 1108 060s Note that the probabilities are all positive and the sum of the probabilities is 1 as expected Now we compare these to our estimated probabilities Clearly they don t agree but for large n they are a good approximation Remember that are using a frequency interpretation of probability 1181 1358 e Our plot of the exact probabilities is an example of a discrete PDF and the corresponding plot of our estimated probabilitie
42. s generate a random number and see if he hits his target if he hits his target then check to see if he is the only surviving if he is the only survivor then set J as the survivor and return Pseudo code for Player I s turn IF I am alive turn_num turn_num 1 IF two other players are alive TARGET the better one ELSE TARGET the remaining player END r rand IF r lt p I TARGET is eliminated IF I am the ONLY remaining player now survivor I break gt gt END END END e To turn our pseudocode into code we need an efficient way to keep track of who s in the game and who s out e One way to do this is to use with the probability array p Once a person has been shot we could set their p value to zero e That also makes it easy to choose the target Always pick the person with the biggest p value But what if that person is yourself e One simple thing to do is when it is your turn temporarily set your probability to zero so you don t choose to shoot yourself Then reset the value after you ve taken your shot unless you win and it doesn t matter e If you have an array p then the Matlab command max p will give you the maximum entry of that array but if you use value loc max p it will give you the maximum entry and its location So in our case it will give us the player s number which has the largest probability of hitting his opponent e Now there
43. s is an approximation to the discrete PDF The PDF assigns a probability p x to each outcome X in our set of all possible outcomes In our example there were 11 outcomes and Q 2 3 4 5 6 7 8 9 10 11 12 e Recall that the discrete PDF describes the relative likelihood for a random variable to occur at a given point e What properties should a discrete PDF have Clearly two obvious require ments for a discrete PDF are 1 0 lt p a for each x 2 So p x 1 where n is the number of possible outcomes Exercise Suppose you have a die which is not fair in the sense that each side has the following probabilities 2 1 1 2 2 1 l gt 2 gt 3 gt 4 gt gt gt 9 9 9 9 9 9 Would you bet on rolling an even or an odd number Why Write a code to simulate rolling one of these loaded dice and estimate the PDF Graph it Continuous Probability Density Functions e We have seen that the discrete PDF provides us with probabilities for each value in the finite set of outcomes The continuous PDF provides us with probabilities for a range of values in the set of outcomes Recall that in the discrete case the sum of all probabilities had to be 1 The analog of that for the continuous case is that the total area under the curve is 1 e f we determine the area under the curve between two points then it is just the probability that the outcome will be between those values e For example in the plot belo
44. s of the simulation and get an idea of how the original process works predict outputs for given inputs and how we can improve it by changing some of the parameters e A famous applied mathematician once said the purpose of computations is insight not numbers It is good to always keep this in mind e Often it is particularly hard to use rigorous mathematical analysis to prove that a given model will always behave in a certain way for example the output won t go to infinity e Sampling tries to answer these questions by practical examples we try lots of inputs and see what happens e We want to consider some examples of simulations that can be done using the MC method and you will look at another in the lab A Simple Example from Business Suppose a friend is starting a business selling cookies and you want to help him succeed and show off your computational skills Assume he buys the cookies from a local bakery at the cost of 0 40 and sells them for 1 00 assume that your friend has no overhead so he makes 0 60 in profit per cookie if he sells all the ones he ordered However if he has some cookies left over at the end of the day those are given to the homeless shelter and he looses 0 40 per cookie He works for four hours each afternoon and feels there is a fairly uniform demand during those hours he has never sold less than 80 or more than 140 cookies Use MC to recommend how many cookies he should order to maxi
45. sorry shape How can this be explained We will use the idea of Random Walk to simulate it Here s a frequency plot for the simulation 1D Random Walk 100 sailors and 120 steps 1 0 T T T T 30 0 10 30 20 10 20 40 steps from ship right or left number of sailors ine wo A o op N ee T T T T 4 KO oO e We want to simulate this experiment using a random walk e Taking n random steps is like adding up random 1 s and 1 s the average of such a sum tends to zero with an error that is roughly a in our problem fn 11 then and so Simulating the Drunken Sailor Random Walk Experiment Our model for a random walk in 1D is very simple e We let x represent the position of a point on a line e We assume that at step n 0 the position is x 0 e We assume that on each new step we move one unit left or right chosen at random e From what we just said we can expect that after n steps the distance of the point from 0 will on average be about y n If we compare n to the square of the distance we can hope for a nice straight line e First let s look at how we might simulate this experiment In this code we want to plot the number of steps n on the x axis and the square of the distance traveled and the square of the maximum distance from the origin each sailor got e For each sailor we will take n steps before starting we set x 0 the location given in units of steps to right or left a
46. square Our approximation to 7 will be the number of darts inside or on the unit circle the total number of darts thrown Here is a a plot of the region in the first quadrant This method is easily implemented in Matlab We simply generate a random point x y and decide if it is inside the circle or not Because rand generates random numbers between 0 and 1 we either have to map the numbers between 1 and 1 or use numbers between 0 1 to calculate an approximation to 7 4 i e in the first quadrant and multiply the result by 4 we choose the later Input number of random points to generate Output approximation to 7 count 0 for i i n x rand 1 y rand 1 if x 2 y 2 lt 1 count count 1 par end end pi 4 count n It is more efficient to generate all the random numbers at once and do vector operations Note that to square each entry in the vector x i e take dot prod uct we have to use the appropriate syntax Here the command find finds all occurrences where the condition is satisfied it returns a vector with these indices so the total number of times this condition was satisfied is the length of this vector rand etate 123545 x rand 1 n generate row of random numbers y rand 1 n i find x 72 y 72 lt 1 find locations where this condition is m length i pi_approx 4 m n error abs pi pi_approx Here are some approximations and the corresponding errors for some va
47. t an early stage of these investigations von Neumann and Ulam refined this particular Russian roulette and splitting methods However the systematic development of these ideas had to await the work of Harris and Herman Kahn in 1948 Other early researchers into the field were Fermi Metropolis and Ulam e We can think of the output of rand or any stream of uniform random numbers as a method of sampling the unit interval e We expect that if we plot the points along a line they would be roughly evenly distributed e Random numbers are actually a very good way of exploring or sampling many geometrical regions Since we have a good source of random numbers for the unit interval it s worth while to think about how we could modify such numbers to sample these other regions Sampling the interval a b Suppose we need uniform random numbers but we want them in the interval ja b We have seen that we can use rand but now we have to shift and scale The numbers have to be shifted by a and scaled by b a Why r rand s a b a r or we can do this for hundreds of values at once r rand 10000 1 s a b a r 1000 uniformly sampled points on 1 10 Sampling the Unit Square We might need uniform random numbers in the unit square We could compute an x and y value separately for each point x rand y rand or we can do this for hundreds of values at once to create a 2 x n array xy
48. t shows a nonuniformity This is because there is only one way to score 2 but many ways to score 7 Frequency plot for 1000 throws of 2 dice Frequency 2 3 4 5 6 7 8 9 10 11 12 Sum of Dice When we repeat the calculations using 3 and 4 dice and draw a frequency plot for the sums we begin to notice a trend in the shape of the plot and suspect that there is an underlying function f a which is determining its shape Frequency plot for 10000 throws of 3 dice T T r T r r Frequency plot for 100000 throws of 4 dice 12000 z T T T 10000 8000 6000 Frequenc 4000 2000 0 10 Sum of Dice e n fact this plot suggests the normal curve or bell shaped distribution even though we expect to see that curve only for continuous variables How can we simulate rolling the dice and generating these plots using Matlab Here is a version where we don t generate all the random numbers at once n_dice 2 freq 1 12 0 for k i n x rand 1 n_dice h generate 2 random numbers between O0 and 1 x ceil 6 x h turn them into an integer 1 to 6 value sum x h sum the two dice freq value freq value 41 increment the frequency end We can also generate all the random numbers for all the throws for the two dice using throws rand n n_dice throws ceil 6 x throws How can we use the frequency plot to estimate the probabilities e A frequency plot is simply a record
49. te the numbers between 0 and 1 using x rand 1 n and then use either of the following commands x floor k 1 x x ceil k x For example if we generate 5 random numbers 8147 9058 1770 9134 6324 and we want integers lt 10 then x floor 11 x gives 8 9 1 10 6 and x ceil 10 x gives 9 10 2 10 7 e The only thing that may be disconcerting is that if we run our program which is based on generating random numbers two times in a row we will get different numbers It is sometimes useful such as in debugging to be able to reproduce your results exactly If you continually generate random numbers during the same Matlab session you will get a different sequence each time as you would expect e However each time you start a session the random number sequence begins at the same place 0 9501 and continues in the same way This is not true to life as every gambler knows e Clearly we don t want to terminate our session every time we want to repro duce results so we need a way to fix this To do this we provide a seed for our random number generator Just like Matlab has a seed at the beginning of a session we will give it a seed to start with at the beginning of our code Seeding rand e The random number generator rand can be seeded with the statement rand state n where n is any integer By default n is set to 0 when a Matlab session starts e Note that this statement does not generate any random
50. tead we have a continuous curve called our continuous PDF which allows us to calculate the probability of a value within any interval e How will we determine the probability for a range of values from the contin uous PDF We will see that the probability is calculated as the area under the curve between the values of interest e We first look at discrete PDFs Estimating the Probabilities amp the Discrete PDF e f we have an experiment like rolling dice or flipping a coin we may be able to compute the exact probabilities for each outcome as we did in the Fermat Pascal problem and then plot the PDF e But what if we don t have the exact probabilities Can we estimate them and then use the estimates to plot the PDF The answer is yes We essentially take many measurements of the random variable and accumulate the frequency that each outcome occurred The relative frequency of each outcome gives us an estimate of its probability For example if a particular outcomes occurs 50 of the time then its prob ability is 0 5 As a concrete example suppose we roll two dice and sum the numbers that appear The individual outcomes on each die are still equally likely but now we are going to look at the sum of the two dice which is not equally likely The possible outcomes are 2 through 12 Suppose we roll the dice 1000 times and count the frequency that each sum appears When we look at the frequency plot for the sums we notice that i
51. tegrate but for one we got an error 20 times smaller with the same number of points Why e The error depends in part on the variance of the function e To get a feeling for the variance of these two functions remember that the variance is given by the formula Var f ala mean dz The first function has an integral of about 0 2 over the interval 0 1 so its average value is 0 2 because the length of the interval is 1 The function ranges between 0 and 1 so the pointwise variance is never more than 1 2 0 64 The second function has an integral of about 4 5 over an interval of length 5 for an average value of about 1 Its values vary from 14 to 16 or so so the pointwise variance can be as much as 200 Integrals in two dimensions Since we know how to sample a square let s try to use the Monte Carlo method to compute the integral of f x y where f x y z y 0 75 over the square 1 1 x 1 1 Because of the absolute value function f x y is not differentiable It would be tricky to try to work out the integral even though this is a pretty simple function Because we are in two dimensions we need to use pairs of random numbers when we generate each it will be between 0 1 so we map to 1 1 x 1 1 by x 1 0 2 0 rand n 1 y 1 0 2 0 rand n 1 Here is a surface plot of our function Z X2 Y 314 We apply MC in an analogous way n
52. ticular mean and standard deviation for example if we want to generate 100 points with mean 1 and standard deviation 2 we use r 1 2 x randn 100 1 Example Generate 1000 random numbers using rand and randn and make a histogram of the results Compute the mean and standard deviation of each using the Matlab commands mean and std The command to make a histogram of y values is hist y The Cumulative Density Function If we have a discrete system with a known PDF the value of the PDF at 2 say pdf x is the probability that the outcome z will occur But suppose we want to know the chances of rolling a 7 or less using two dice This is the probability that the outcome is less than or equal to 7 it is so important that it has its own name the cumulative density function or CDF cdf x probability outcome is less than or equal to x y pdf y YKT Example If we return to the example when we rolled one die we know that each number was equally probable so each had probability of 1 6 What is the CDF for each x 1 2 3 4 5 6 If we ask what is cdf 6 then it should be 1 because we know it is 100 sure that we will roll a number lt 6 For the other values we simply sum up the PDFs 1 ih ee etc 060 3 dD edf 2 Cumulative Probability for 1 die T T OTF 0 6 F T g 6 0 5 ie a O 0 4 b 0 3 F 0 2 m 0 1 2 3 4 5 Score We see that for our discrete PDF e cdf x is a
53. w the probability that the random variable is lt a is the integral under the curve from oo to a J f a dx is the probability that the random variable is lt a Mean Variance and Expected Values e When we have a large collection of numbers such as scores from a large lecture exam we are not usually interested in the individual numbers but rather in certain descriptive quantities such as average and how far from the average a certain percentage of students are The same is true for a probability density function e Three important characteristics of a PDF are its mean or expected value or average its variance and the standard deviation e For a discrete PDF assume that we have a finite set Q of all possible n outcomes and the tth outcome for the random variable X is denoted by 7 We can calculate the expected value i e the mean of a random variable X denoted either u or ELX from the formula p plas i e For a continuous PDF where the random variables range between a and b we can calculate the mean from the formula b u p x a dx a e If you recall the definition of the center of mass of a laminar sheet fp xp x dA Jpp dA where p is the density you can see an analog of this formula because in our case op dz 1 e o understand the discrete case let s look at an example Consider the situation where we have the heights in inches of the 12 members of a women s basketball team given by 69

Part II - Random Processes Goals for this unit: • Give overview of

Contents

Download Pdf Manuals

Related Search

Related Contents

Part II - Random Processes Goals for this unit: &bull; Give overview of

Contents

Download Pdf Manuals

Related Search

Related Contents

Part II - Random Processes Goals for this unit: • Give overview of