MONTE CARLO INTEGRATION
İrem Tanrıverdi, iremt@metu.edu.tr
Monte Carlo methods are numerical techniques that rely on random sampling to approximate results. Monte Carlo integration applies this idea to the numerical estimation of integrals. Monte Carlo methods are a particularly attractive choice for the multidimensional integration problems common in computer graphics.
Basics:
The basic idea becomes clear from the “dartboard method” of estimating the area of an irregular domain:
- Choose points uniformly at random within a rectangular box enclosing the domain.
- p(hit inside area) = number of hits inside the area / number of hits inside the box
- A(area) ≈ A(box) * p(hit inside area)
For example, take the unit circle inscribed in a square of side 2, so A(square) = 4. If 40 of 50 random points land inside the circle, then
A(circle) ≈ 4 * (40/50) = 3.2 ≈ 𝜋
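The dartboard estimate can be reproduced in R. A minimal sketch, assuming a unit circle inscribed in the square [−1, 1] × [−1, 1]; the sample size and seed are illustrative choices, not part of the original example:

```r
# Dartboard method: throw n random points into the square [-1, 1] x [-1, 1]
# (area 4) and count how many land inside the inscribed unit circle.
set.seed(1)
n <- 10^6
x <- runif(n, -1, 1)
y <- runif(n, -1, 1)
hits <- sum(x^2 + y^2 <= 1)   # number of hits inside the circle
pi_hat <- 4 * hits / n        # A(circle) = A(square) * p(hit inside)
pi_hat
```

With only 50 points the result is as noisy as the 3.2 above; with 10⁶ points the estimate typically lands within a few thousandths of 𝜋.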
The Monte Carlo Estimator
This is the basic estimator. Monte Carlo integration uses random sampling of a function to numerically compute an estimate of its integral. Suppose that we want to integrate the one-dimensional function f(x) from a to b:
𝜃 = ∫ₐᵇ f(x) dx
We can approximate this integral by averaging samples of the function f at uniform random points within the interval. Given a set of N uniform random variables Xᵢ ∈ [a, b) with corresponding PDF 1/(b − a), the Monte Carlo estimator for computing 𝜃 is
𝜃̂ = (b − a) · (1/N) ∑ᵢ₌₁ᴺ f(Xᵢ),
which converges to (b − a)·E[f(X)] = 𝜃 with probability 1, by the Strong Law of Large Numbers (SLLN).
More simply:
Monte Carlo integration relies on the law of large numbers, which implies that the sample mean of a large random sample will tend to be close to the expected value of the distribution being sampled.
Suppose f(x) is any function that is integrable on the closed interval [a, b]. Then the integral 𝜃 = ∫ₐᵇ f(x) dx gives the (signed) area of the region bounded by the graph of f and the x-axis over [a, b].
Then,
Let u₁, u₂, u₃, …, u𝑛 be independent uniform random variables on the interval [a, b], each with density 1/(b − a). Then
E[f(U)] = ∫ₐᵇ f(u) · 1/(b − a) du = 𝜃 / (b − a).
This recovers the original integral ∫ₐᵇ f(u) du, which can be approximated by multiplying the sample mean f̄ = (1/n) ∑ᵢ f(uᵢ) by (b − a).
Strong law of large numbers:
Let X be a real-valued random variable, and let X₁, X₂, X₃, …, X𝑛 be an infinite sequence of independent and identically distributed copies of X. Let
X̄𝑛 = (X₁ + X₂ + ⋯ + X𝑛) / n
be the empirical averages of this sequence. Suppose that the first moment E|X| of X is finite. Then X̄𝑛 converges almost surely to E(X):
X̄𝑛 → E(X) almost surely as n → ∞.
The simple Monte Carlo estimator of 𝜃 = ∫ₐᵇ g(x) dx is (b − a) · ḡ, where ḡ is the sample mean of g over uniform draws.
To summarize, the simple Monte Carlo estimator of the integral 𝜃 = ∫ₐᵇ g(x) dx is computed as follows:
- Generate X₁, X₂, X₃, …, Xm i.i.d. from Uniform(a, b).
- Compute ḡ = (1/m) ∑ᵢ₌₁ᵐ g(Xᵢ).
- Estimate 𝜃̂ = (b − a) · ḡ.
The convergence 𝜃̂ = (b − a) · ḡ → (b − a) · E[g(X)] = 𝜃 almost surely is valid by the Strong Law of Large Numbers.
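To see the SLLN at work, here is a small demonstration (not from the original notes) using the test integral ∫₀¹ x² dx = 1/3; the sample sizes are arbitrary choices:

```r
# The estimate (b - a) * mean(g(X)) settles near the true value 1/3
# as the number of uniform samples grows.
set.seed(1)
g <- function(x) x^2
a <- 0; b <- 1
for (n in c(10^2, 10^4, 10^6)) {
  x <- runif(n, a, b)
  cat("n =", n, " estimate =", (b - a) * mean(g(x)), "\n")
}
```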
Advantages of Monte Carlo integration
- It converges in any dimension at the same O(n^(−1/2)) rate, regardless of the smoothness of the integrand. This makes it particularly useful in graphics, where we often need to calculate multi-dimensional integrals of discontinuous functions.
- Monte Carlo integration is simple. Only two basic operations are required, namely sampling and point evaluation.
- It is general. Because it relies only on random sampling and point evaluation, it applies even when the integrand has no closed-form antiderivative.
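As a sketch of the multidimensional claim (this example is illustrative, not from the notes): the same recipe estimates the double integral of sin(x + y) over the unit square, whose exact value is 2·sin(1) − sin(2) ≈ 0.7736.

```r
# 2-D Monte Carlo: average the integrand at uniform points in the unit
# square and multiply by the region's area (here 1).
set.seed(1)
n <- 10^6
x <- runif(n); y <- runif(n)
area <- 1
estimate <- area * mean(sin(x + y))
estimate   # close to 2*sin(1) - sin(2)
```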
In R, to evaluate an integral numerically for comparison, we use the “integrate()” function.
Example 1)
Based on the function given below
f(x)=3x⁵+4x²+8x, 0<x<1
Find an approximation to the integral I = ∫₀¹ f(x) dx. Take n = 10000.
# Step 1: Define f(X)
f <- function(x) 3*x^5+4*x^2+8*x
# Step 2: Generate x1,...,xn iid Uniform(min=a, max=b)
n=10^4
a <- 0; b <- 1
x <- runif(n= n, min= a, max= b)
# Step 3: Compute (b-a)*g(Xbar)
g.Xbar = mean(sapply(x ,f))
estimate <- (b-a)*g.Xbar
estimate
5.81195541562804
# calculate real integral value
integrate(f, a, b)
5.833333 with absolute error < 6.5e-14
y1 <- seq(a, b, 0.001)
y.low1 <- rep(0,times = length(y1))
plot(y1, f(y1), type = "n", main = "f(x) = 3x^5+4x^2+8x", lwd = 2.5)
lines(y1, y.low1, col = 'grey')
lines(y1, f(y1), col = 'grey')
polygon(c(y1, rev(y1)), c(f(y1), rev(y.low1)), col = "darkred", border = NA)
Example 2)
Based on the function given below, approximate the integral I = ∫₀¹² f(x) dx with n = 10⁴, where
f(x) = e^(−2x³)/4 + |x − 6|,  0 < x < 12
# Step 1: Define f(X)
f <- function(x) exp(-2*x^3)/4 + abs(x-6)
# Step 2: Generate x1,...,xn iid Uniform(min=a, max=b)
n=10^4
a <- 0; b <- 12
x <- runif(n= n, min= a, max= b)
# Step 3: Compute (b-a)*g(Xbar)
g.Xbar = mean(sapply(x ,f))
estimate <- (b-a)*g.Xbar
estimate
35.8452894973456
# calculate real integral value
integrate(f, a, b)
36.17719 with absolute error < 9.6e-05
y1 <- seq(a, b, 0.001)
y.low1 <- rep(0,times = length(y1))
plot(y1, f(y1), type = "n", main = "f(x) = exp(-2x^3)/4 + abs(x-6)", lwd = 2.5)
lines(y1, y.low1, col = 'grey')
lines(y1, f(y1), col = 'grey')
polygon(c(y1, rev(y1)), c(f(y1), rev(y.low1)), col = "grey30", border = NA)
Example 3)
Based on the function given below, compute the integral
I = ∫₀³ √(x + √(x + √x)) dx
# Step 1: Define f(X)
f <- function(x) sqrt(x + sqrt(x + sqrt(x)))
# Step 2: Generate x1,...,xn iid Uniform(min=a, max=b)
n=10^4
a <- 0; b <- 3
x <- runif(n= n, min= a, max= b)
# Step 3: Compute (b-a)*g(Xbar)
g.Xbar = mean(sapply(x ,f))
estimate <- (b-a)*g.Xbar
estimate
5.10777795038029
# calculate real integral value
integrate(f, a, b)
5.111806 with absolute error < 2.7e-07
y1 <- seq(a, b, 0.001)
y.low1 <- rep(0,times = length(y1))
plot(y1, f(y1), type = "n", main = "f(x) = sqrt(x + sqrt(x + sqrt(x)))", lwd = 2.5)
lines(y1, y.low1, col = 'grey')
lines(y1, f(y1), col = 'grey')
polygon(c(y1, rev(y1)), c(f(y1), rev(y.low1)), col = "darkgreen", border = NA)
Example 4)
Find the area of the unit circle (with radius 1) by using Monte Carlo integration.
We know that we could find the area of the unit circle using the following integral:
A = 4 ∫₀¹ √(1 − x²) dx = 𝜋
f <- function(x) 4*sqrt(1-x^2)
n <- 10^5
a <- 0; b <- 1
x <- runif(n,a,b)
estimate <- mean(sapply(x,f)) * (b-a)
estimate
3.13888960701166
# calculate real integral value
integrate(f, a, b)
3.141593 with absolute error < 0.00016
y1 <- seq(a, b, 0.001)
plot(y1, f(y1), type = "l",
main = "Unit Circle",
lwd = 2.5, col = "Red")
Example 5)
Based on the functions given below;
a)
Compute the integral
I = ∫₀¹ (cos(50x) + sin(20x))² dx
# Step 1: Define f(X)
f <- function(x) (cos(50*x)+sin(20*x))^2
# Step 2: Generate x1,...,xn iid Uniform(min=a, max=b)
n=10^4
a <- 0; b <- 1
x <- runif(n= n, min= a, max= b)
# Step 3: Compute (b-a)*g(Xbar)
g.Xbar = mean(sapply(x ,f))
estimate <- (b-a)*g.Xbar
estimate
0.959628737075019
# calculate real integral value
integrate(f, a, b)
0.9652009 with absolute error < 1.9e-10
y1 <- seq(a, b, 0.001)
plot(y1, f(y1), type = "l",
main = "f(x) = (cos(50x) + sin(20x))^2",
lwd = 2.5, col = "Red")
b) How do we simulate an integral with infinite range?
Compute
E[(sin(X) − cos(X))²] = ∫₋∞^∞ (sin(x) − cos(x))² 𝜙(x) dx,  where X ~ N(0, 1) and 𝜙 is the standard normal density.
Since the weight 𝜙(x) is itself a density, we draw Xᵢ ~ N(0, 1) and simply average f(Xᵢ); no (b − a) factor is needed.
# Step 1: Define f(X)
f <- function(x) (sin(x)-cos(x))^2
# Step 2: Generate x1,...,xn iid N(0, 1)
n=10^4
a <- -Inf; b <- Inf
x <- rnorm(n = n)
# Step 3: Compute g(Xbar)
g.Xbar = mean(sapply(x ,f))
estimate <- g.Xbar
estimate
0.998723658182567
integrand = function(x) (sin(x)-cos(x))^2*dnorm(x)
integrate(integrand, a, b)
1 with absolute error < 9.4e-05
y1 <- seq(-50, 50, 0.001)
plot(y1, f(y1), type = "l",
main = "f(x) = (sin(x) - cos(x))^2",
lwd = 2.5, col = "Red")
Example 6)
Based on the function given below, compute the integral
I = ∫₀^(𝜋/2) (e^(sin x) + x⁵ + log(x²)) dx
Now, write an R function which calculates the Monte Carlo integral for a given function and given upper and lower bounds of x.
monte.carlo <- function(upper, lower, f, n){
set.seed(2022)
# Step 1: Generate x1,...,xn iid Uniform(min=a, max=b)
x <- runif(n= n, min= lower, max= upper)
# Step 2: Compute (b-a)*g(Xbar)
g.Xbar = mean(sapply(x ,f))
estimate <- (upper-lower)*g.Xbar
cat("estimate:", estimate, "\n")
cat("integrate:", integrate(f,lower,upper)$value)
y <- seq(lower, upper,0.001)
plot(y, f(y), type = "l", main = "f(x) = exp(sin(x)) + x^5 + log(x^2)", lwd = 2.5, col = "Red")
}
f <- function(x) exp(sin(x)) + x^5 + log(x^2)
upper <- pi/2
lower <- 0
n <- 10^6
result <- monte.carlo(upper, lower, f, n)
estimate: 3.885552
integrate: 3.885093
How can we use Monte Carlo Integration in Statistics?
Example 7)
Let X∼Exp(0.5) be an exponential random variable with rate =0.5. Approximate P(X>2) using Monte Carlo approximation and check your answer using “pexp”.
We know that
P(X > 2) = ∫₂^∞ f(x) dx,  where f(x) = 𝜆e^(−𝜆x) is the Exp(𝜆) density with 𝜆 = 0.5.
How can we find P(X > 2)?
In all the questions above we had a g function evaluated at Uniform(a, b) draws. Here we do not have such a g function, so in such cases we can define an indicator function for g. Let
g(x) = 1 if x > 2, and g(x) = 0 otherwise.
This gives us
E[g(X)] = ∫₀^∞ g(x) f(x) dx = P(X > 2).
Thus our Monte Carlo approximation is
P(X > 2) ≈ (1/n) ∑ᵢ₌₁ⁿ g(Xᵢ),  Xᵢ ~ Exp(0.5).
# Step 1: Define f(X), the Exp(rate = lambda) density
f <- function(x) lambda*exp(-lambda*x)
# Step 2: Define g(X)
g <- function(x, p){
if (x > p) return(1)
else return(0)
}
# Step 3: Generate x1,...,xn iid random variables from desired distribution
n=10^4
lambda <- 0.5
p <- 2
x <- rexp(n = n, rate = 0.5)
# Step 4: Compute the estimate
estimate = mean(sapply(x ,g, p))
estimate
0.3599
P(X > 2) = 1 − P(X ≤ 2)
1 - pexp(2, rate = 0.5) # exact value from the CDF
0.367879441171442
Example 8) FINDING NORMAL DISTRIBUTION CDF
Let X ~ N(𝜇, 𝜎²) and let p be a real number. Approximate the probability P(X < p) using Monte Carlo integration. Take 𝜇 = 5, 𝜎 = 3 and p = 2.4.
We know that
P(X < p) = ∫₋∞^p f(x) dx,  where f is the N(𝜇, 𝜎²) density.
How can we find P(X < p)?
As before, we do not have a g function evaluated at uniform draws, so we define an indicator function for g. Let
g(x) = 1 if x < p, and g(x) = 0 otherwise.
This gives us
E[g(X)] = P(X < p).
Thus our Monte Carlo approximation is
P(X < p) ≈ (1/n) ∑ᵢ₌₁ⁿ g(Xᵢ),  Xᵢ ~ N(𝜇, 𝜎²).
# Step 1: Define f(X), the N(mu, sigma^2) density
f <- function(x) 1/sqrt(2*pi*sigma^2) * exp(-(x-mu)^2 / (2*sigma^2))
# Step 2: Define g(X)
g <- function(x, t){
if (x < t) return(1)
else return(0)
}
# Step 3: Generate x1,...,xn iid random variables
n=10^4
mu <- 5
sigma <- 3
p <- 2.4
x <- rnorm(n = n, mean = mu, sd = sigma)
# Step 4: Compute the estimate
estimate = mean(sapply(x ,g, p))
estimate
0.1951
pnorm(2.4, mean = 5, sd = 3) # exact value from the CDF
0.193062337141907
Example 9) FINDING GAMMA DISTRIBUTION CDF
Let X ~ Gamma(𝛼, 𝛽) and let p₁, p₂ be real numbers. Approximate the probability P(X < p₁) + P(X > p₂) using Monte Carlo integration. Take 𝛼 = 9, 𝛽 = 8, p₁ = 0.4 and p₂ = 2. Also, check your answer using “pgamma”.
We know that
P(X < p₁) + P(X > p₂) = ∫₀^(p₁) f(x) dx + ∫_(p₂)^∞ f(x) dx,  where f is the Gamma(𝛼, 𝛽) density.
As before, we do not have a g function evaluated at uniform draws, so we define indicator functions. Let
g₁(x) = 1 if x < p₁ (0 otherwise), and g₂(x) = 1 if x > p₂ (0 otherwise).
This gives us E[g₁(X)] = P(X < p₁) and E[g₂(X)] = P(X > p₂).
Thus our Monte Carlo approximation is
P(X < p₁) + P(X > p₂) ≈ (1/n) ∑ᵢ g₁(Xᵢ) + (1/n) ∑ᵢ g₂(Xᵢ),  Xᵢ ~ Gamma(𝛼, 𝛽).
# Step 1: Define f(X), the Gamma(shape = alpha, rate = beta) density
f <- function(x) beta^alpha/gamma(alpha) * x^(alpha-1) * exp(-beta*x)
# Step 2: Define g(X)
g1 <- function(x, p1){
if (x < p1) return(1)
else return(0)
}
g2 <- function(x, p2){
if (x > p2) return(1)
else return(0)
}
# Step 3: Generate x1,...,xn iid random variables
n=10^5
alpha <- 9
beta <- 8
p1 <- 0.4
p2 <- 2
x <- rgamma(n = n, alpha, beta)
# Step 4: Compute the estimate
probability1 = mean(sapply(x ,g1, p1))
probability2 = mean(sapply(x ,g2, p2))
estimate = probability1 + probability2
cat("P(x < 0.4) + P(x > 2) =", estimate)
P(x < 0.4) + P(x > 2) = 0.02754
pgamma(p1, alpha, beta) + (1 - pgamma(p2, alpha, beta))
0.0277013916154789
EXERCISE FROM THE BOOK
Exercise 5.4 (page 150)
Write a function to compute a Monte Carlo estimate of the Beta(3, 3) cdf, and use the function to estimate F(x) for x = 0.1, 0.2, . . . , 0.9. Compare the estimates with the values returned by the pbeta function in R.
The CDF of the Beta(𝛼, 𝛽) distribution is
F(x) = ∫₀ˣ Γ(𝛼+𝛽)/(Γ(𝛼)Γ(𝛽)) · t^(𝛼−1) (1−t)^(𝛽−1) dt,  0 < x < 1.
We should sample from a uniform distribution over the range of the integral. The range of our integral is (0, x), so we take U ~ Uniform(0, x), and the estimate is F̂(x) = (x − 0) · (1/n) ∑ᵢ f(Uᵢ), where f is the Beta density.
set.seed(361)
n<- 10^4
beta_cdf <- function(x, alpha, beta) {
  u <- runif(n, min = 0, max = x)
  # Beta(alpha, beta) density evaluated at the uniform draws
  g <- factorial(alpha + beta - 1)/(factorial(alpha - 1)*factorial(beta - 1)) * u^(alpha - 1)*(1 - u)^(beta - 1)
  (x - 0) * mean(g)   # (b - a) * sample mean
}
x=seq(0.1,0.9,0.1)
for (i in x) {
estimates = beta_cdf(i,3,3)
real_values = pbeta(i,3,3)
cat("x=",i , "\t","estimates=", estimates, "\t", "real beta values=", real_values, "\n")
}
x= 0.1 estimates= 0.008630343 real beta values= 0.00856
x= 0.2 estimates= 0.05783376 real beta values= 0.05792
x= 0.3 estimates= 0.1639318 real beta values= 0.16308
x= 0.4 estimates= 0.3190771 real beta values= 0.31744
x= 0.5 estimates= 0.4977538 real beta values= 0.5
x= 0.6 estimates= 0.6863473 real beta values= 0.68256
x= 0.7 estimates= 0.8389897 real beta values= 0.83692
x= 0.8 estimates= 0.9435028 real beta values= 0.94208
x= 0.9 estimates= 0.9852201 real beta values= 0.99144