MONTE CARLO INTEGRATION

İrem Tanrıverdi, iremt@metu.edu.tr

11 min read · Jun 27, 2022

Monte Carlo methods are numerical techniques which rely on random sampling to approximate their results. Monte Carlo integration applies this process to the numerical estimation of integrals. Monte Carlo methods are a particularly attractive choice for the multidimensional integration problems common in computer graphics.

Basics:

The basic idea becomes clear from the “dartboard method” of estimating the area of an irregular domain:

Choose points randomly within the rectangular box


p(hit inside circle) = number of hits inside the circle / number of hits inside the box

A(square) = 4 (a 2 × 2 box enclosing the unit circle)

points inside the circle = 40

total number of points = 50

A(circle) ≈ A(square) × (40/50) = 4 × 0.8 = 3.2 ≈ 𝜋
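The dartboard calculation above can be sketched in R. This is a minimal illustration, assuming the 2 × 2 square [−1, 1] × [−1, 1] around the unit circle:

```r
# Dartboard estimate of pi: throw points uniformly into the square
# [-1, 1] x [-1, 1] (area 4) and count hits inside the unit circle.
set.seed(1)
n <- 10^5
px <- runif(n, min = -1, max = 1)
py <- runif(n, min = -1, max = 1)
hits <- px^2 + py^2 <= 1        # TRUE when the point lands inside the circle
pi_hat <- 4 * mean(hits)        # A(circle) = A(square) * p(hit inside circle)
pi_hat
```

With 10⁵ points the estimate typically lands within about 0.01 of 𝜋.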

Please visit this page to understand the basics of Monte Carlo integration.

The Monte Carlo Estimator

This is the basic estimator. Monte Carlo integration uses random sampling of a function to numerically compute an estimate of its integral. Suppose that we want to integrate the one-dimensional function f(x) from a to b:

𝜃 = ∫ₐᵇ f(x) dx

We can approximate this integral by averaging samples of the function f at uniform random points within the interval. Given a set of N uniform random variables Xᵢ ∈ [a, b) with a corresponding PDF of 1/(b − a), the Monte Carlo estimator for computing 𝜃 is

𝜃̂ = (b − a) · (1/N) · Σᵢ f(Xᵢ),

which converges to E[𝜃̂] = 𝜃 with probability 1 by the Strong Law of Large Numbers (SLLN).

More simply:

Monte Carlo integration relies on the law of large numbers, which implies that the sample mean of a large random sample will tend to be close to the expected value of the distribution being sampled.

Suppose f(x) is any function that is integrable on the closed interval [a, b]; then the integral 𝜃 = ∫ₐᵇ f(x) dx gives the area of the region bounded by the curve and the interval [a, b].

Then,

Let u₁, u₂, u₃, …, u𝑛 be independent uniform random variables on the interval [a, b]. From the uniform distribution, each uᵢ has density 1/(b − a), so

E[f(u)] = (1/(b − a)) ∫ₐᵇ f(u) du.

This means the original integral ∫ₐᵇ f(u) du can be approximated by multiplying the sample mean f̄(u) = (1/n) Σᵢ f(uᵢ) by (b − a).

Strong law of large numbers:

Let X be a real-valued random variable, and let X₁, X₂, X₃, …, X𝑛 be an infinite sequence of independent and identically distributed copies of X. Let

X̄𝑛 = (1/n)(X₁ + ⋯ + X𝑛)

be the empirical averages of this sequence. Suppose that the first moment E|X| of X is finite. Then the sample mean X̄𝑛 converges almost surely to E(X), thus

P(lim𝑛→∞ X̄𝑛 = E(X)) = 1.

The simple Monte Carlo estimator of ∫ₐᵇ g(x) dx is therefore (b − a) · ḡ(X), where ḡ(X) is the sample mean of g over uniform draws.

To summarize, the simple Monte Carlo estimator of the integral 𝜃 = ∫ₐᵇ g(x) dx is computed as follows:

  1. Generate X₁, X₂, X₃, …, X𝑚 i.i.d. from Uniform(a, b).
  2. Compute the sample mean ḡ(X) = (1/m) Σᵢ g(Xᵢ).
  3. Return the estimate 𝜃̂ = (b − a) · ḡ(X).

The generic problem is to evaluate 𝜃 = ∫ₐᵇ g(x) dx. The convergence

𝜃̂ → 𝜃 almost surely as m → ∞

is valid by the Strong Law of Large Numbers.
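The SLLN convergence can be observed empirically: as the number of samples grows, the estimate settles on the true value. A small sketch, assuming the test function g(x) = x² on [0, 1], whose integral is 1/3:

```r
# Watch the Monte Carlo estimate (b - a) * mean(g(X)) approach the
# true value 1/3 as the number of samples m increases.
set.seed(2022)
g <- function(x) x^2
a <- 0; b <- 1
for (m in c(10^2, 10^4, 10^6)) {
  x <- runif(m, min = a, max = b)
  est <- (b - a) * mean(g(x))
  cat("m =", m, " estimate =", est, "\n")
}
```

The printed estimates approach 1/3 at the usual O(1/√m) Monte Carlo rate.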

Advantages of Monte Carlo integration

  1. It converges in any dimension, at a rate that does not depend on the dimension or on the smoothness of the integrand. This makes it particularly useful in graphics, where we often need to calculate multi-dimensional integrals of discontinuous functions.
  2. Monte Carlo integration is simple. Only two basic operations are required, namely sampling and point evaluation.
  3. It is general: because it relies only on random sampling, it applies even when the integrand has no closed-form antiderivative.
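As a sketch of the first advantage, the same recipe extends to higher dimensions with no change except sampling every coordinate. Here we assume the test integrand f(x, y) = x² + y² over the unit square, whose true integral is 2/3:

```r
# Two-dimensional Monte Carlo: sample points uniformly in [0,1] x [0,1]
# and average the integrand; the region's area is 1, so no extra factor.
set.seed(2022)
n <- 10^5
x <- runif(n)
y <- runif(n)
estimate <- mean(x^2 + y^2)
estimate
```

The only changes from the one-dimensional case are the second `runif` call and the two-argument integrand.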

In R, we use the “integrate()” function to compute integrals numerically, which lets us check our Monte Carlo estimates against a reference value.


Example 1)

Based on the function given below

f(x)=3x⁵+4x²+8x, 0<x<1

Find an approximation to the integral 𝜃 = ∫₀¹ f(x) dx. Take n = 10000.

# Step 1: Define f(X)

f <- function(x) 3*x^5+4*x^2+8*x


# Step 2: Generate x1,...,xn iid Uniform(min=a, max=b)

n=10^4
a <- 0; b <- 1

x <- runif(n= n, min= a, max= b)


# Step 3: Compute (b-a)*g(Xbar)

g.Xbar = mean(sapply(x ,f))

estimate <- (b-a)*g.Xbar

estimate

5.81195541562804

# calculate real integral value
integrate(f, a, b)

5.833333 with absolute error < 6.5e-14

y1 <- seq(a, b, 0.001)
y.low1 <- rep(0, times = length(y1))

plot(y1, f(y1), type = "n", main = "f(x) = 3x^5+4x^2+8x", lwd = 2.5)

lines(y1, y.low1, col = 'grey')
lines(y1, f(y1), col = 'grey')

polygon(c(y1, rev(y1)), c(f(y1), rev(y.low1)), col = "darkred", border = NA)

Example 2)

Based on the function given below, find an approximation to the integral 𝜃 = ∫₀¹² f(x) dx, where f(x) = e^(−2x³)/4 + |x − 6|. Take n = 10000.

# Step 1: Define f(X)

f <- function(x) exp(-2*x^3)/4 + abs(x-6)


# Step 2: Generate x1,...,xn iid Uniform(min=a, max=b)

n=10^4
a <- 0; b <- 12

x <- runif(n= n, min= a, max= b)


# Step 3: Compute (b-a)*g(Xbar)

g.Xbar = mean(sapply(x ,f))

estimate <- (b-a)*g.Xbar

estimate

35.8452894973456

# calculate real integral value
integrate(f, a, b)

36.17719 with absolute error < 9.6e-05

y1 <- seq(a, b, 0.001)
y.low1 <- rep(0, times = length(y1))

plot(y1, f(y1), type = "n", main = "f(x) = exp(-2x^3)/4 + abs(x-6)", lwd = 2.5)

lines(y1, y.low1, col = 'grey')
lines(y1, f(y1), col = 'grey')

polygon(c(y1, rev(y1)), c(f(y1), rev(y.low1)), col = "grey30", border = NA)

Example 3)

Compute the integral 𝜃 = ∫₀³ f(x) dx, where f(x) = √(x + √(x + √x)). Take n = 10000.

# Step 1: Define f(X)

f <- function(x) sqrt(x + sqrt(x + sqrt(x)))


# Step 2: Generate x1,...,xn iid Uniform(min=a, max=b)

n=10^4
a <- 0; b <- 3

x <- runif(n= n, min= a, max= b)


# Step 3: Compute (b-a)*g(Xbar)

g.Xbar = mean(sapply(x ,f))

estimate <- (b-a)*g.Xbar

estimate

5.10777795038029

# calculate real integral value
integrate(f, a, b)

5.111806 with absolute error < 2.7e-07

y1 <- seq(a, b, 0.001)
y.low1 <- rep(0, times = length(y1))

plot(y1, f(y1), type = "n", main = "f(x) = sqrt(x + sqrt(x + sqrt(x)))", lwd = 2.5)

lines(y1, y.low1, col = 'grey')
lines(y1, f(y1), col = 'grey')

polygon(c(y1, rev(y1)), c(f(y1), rev(y.low1)), col = "darkgreen", border = NA)

Example 4)

Find the area of the unit circle (with radius 1) by using Monte Carlo integration.

We know that we can find the area of the unit circle using the following integral: A = 4 ∫₀¹ √(1 − x²) dx.

f <- function(x) 4*sqrt(1-x^2)
n <- 10^5
a <- 0; b <- 1
x <- runif(n,a,b)


estimate <- mean(sapply(x,f)) * (b-a)
estimate

3.13888960701166

# calculate real integral value
integrate(f, a, b)

3.141593 with absolute error < 0.00016

y1 <- seq(a, b, 0.001)
plot(y1, f(y1), type = "l",
     main = "Unit Circle",
     lwd = 2.5, col = "Red")

Example 5)

Based on the functions given below;

a)

Compute the integral 𝜃 = ∫₀¹ (cos(50x) + sin(20x))² dx. Take n = 10000.

# Step 1: Define f(X)

f <- function(x) (cos(50*x)+sin(20*x))^2


# Step 2: Generate x1,...,xn iid Uniform(min=a, max=b)

n=10^4
a <- 0; b <- 1

x <- runif(n= n, min= a, max= b)


# Step 3: Compute (b-a)*g(Xbar)

g.Xbar = mean(sapply(x ,f))

estimate <- (b-a)*g.Xbar

estimate

0.959628737075019

# calculate real integral value
integrate(f, a, b)

0.9652009 with absolute error < 1.9e-10

y1 <- seq(a, b, 0.001)
plot(y1, f(y1), type = "l",
     main = "f(x) = (cos(50x) + sin(20x))^2",
     lwd = 2.5, col = "Red")

b) How do we simulate an integral with an infinite range?

Compute 𝜃 = E[(sin X − cos X)²] = ∫₋∞^∞ (sin x − cos x)² 𝜙(x) dx, where X ~ N(0, 1) and 𝜙 is the standard normal density. Because the density is built into the sampling, we simply draw xᵢ from N(0, 1) and average f(xᵢ); no (b − a) factor is needed.

# Step 1: Define f(X)

f <- function(x) (sin(x)-cos(x))^2


# Step 2: Generate x1,...,xn iid N(0, 1)

n=10^4
a <- -Inf; b <- Inf

x <- rnorm(n = n)


# Step 3: Compute g(Xbar)

g.Xbar = mean(sapply(x ,f))

estimate <- g.Xbar

estimate

0.998723658182567

integrand = function(x) (sin(x)-cos(x))^2*dnorm(x)

integrate(integrand, a, b)

1 with absolute error < 9.4e-05

y1 <- seq(-50, 50, 0.001)
plot(y1, f(y1), type = "l",
     main = "f(x) = (sin(x) - cos(x))^2",
     lwd = 2.5, col = "Red")

Example 6)

Compute the integral 𝜃 = ∫₀^(𝜋/2) f(x) dx, where f(x) = e^(sin x) + x⁵ + log(x²).

Now, write an R function which calculates the Monte Carlo integral for a given function and given upper and lower bounds of x.

monte.carlo <- function(upper, lower, f, n){

  set.seed(2022)

  # Step 1: Generate x1,...,xn iid Uniform(min=a, max=b)
  x <- runif(n = n, min = lower, max = upper)

  # Step 2: Compute (b-a)*g(Xbar)
  g.Xbar <- mean(sapply(x, f))
  estimate <- (upper - lower) * g.Xbar

  cat("estimate:", estimate, "\n")
  cat("integrate:", integrate(f, lower, upper)$value, "\n")

  y <- seq(lower, upper, 0.001)
  plot(y, f(y), type = "l", main = "f(x) = exp(sin(x)) + x^5 + log(x^2)", lwd = 2.5, col = "Red")
}


f <- function(x) exp(sin(x)) + x^5 + log(x^2)
upper <- pi/2
lower <- 0
n <- 10^6

result <- monte.carlo(upper, lower, f, n)
estimate: 3.885552
integrate: 3.885093

How can we use Monte Carlo Integration in Statistics?

Example 7)

Let X∼Exp(0.5) be an exponential random variable with rate =0.5. Approximate P(X>2) using Monte Carlo approximation and check your answer using “pexp”.

We know that X has density f(x) = 𝜆e^(−𝜆x), x > 0, with 𝜆 = 0.5.

How can we find P(X > 2)?

In all of the questions above we had a g function and sampled from Uniform(a, b). Here, we do not have a g function.

In such cases we can define an indicator function for g. Let

g(x) = 1 if x > 2, and g(x) = 0 otherwise.

This gives us

P(X > 2) = E[g(X)] = ∫₀^∞ g(x) f(x) dx.

Thus our Monte Carlo approximation is

P(X > 2) ≈ (1/n) Σᵢ g(xᵢ), where xᵢ ~ Exp(0.5).

# Step 1: Define f(X)

f <- function(x) lambda*exp(-lambda*x)   # Exp(rate = lambda) density, matching rexp


# Step 2: Define g(X)

g <- function(x, p){
if (x > p) return(1)
else return(0)
}

# Step 3: Generate x1,...,xn iid random variables from desired distribution

n=10^4
lambda <- 0.5
p <- 2


x <- rexp(n = n, rate = 0.5)


# Step 4: Compute the estimate

estimate = mean(sapply(x ,g, p))


estimate

0.3599

P(X>2) = 1- P(X ≤ 2)

1 - pexp(2, rate=0.5) #defining cdf

0.367879441171442

Example 8) FINDING NORMAL DISTRIBUTION CDF

Let X ~ N(𝜇, 𝜎²) and let p be a real number. Approximate the probability P(X < p) using Monte Carlo integration. Take 𝜇 = 5, 𝜎 = 3, and p = 2.4.

We know that X has density f(x) = 1/(𝜎√(2𝜋)) · e^(−(x−𝜇)²/(2𝜎²)).

How can we find P(X < p)?

In all of the questions above we had a g function and sampled from Uniform(a, b). Here, we do not have a g function.

In such cases we can define an indicator function for g. Let

g(x) = 1 if x < p, and g(x) = 0 otherwise.

This gives us

P(X < p) = E[g(X)] = ∫₋∞^∞ g(x) f(x) dx.

Thus our Monte Carlo approximation is

P(X < p) ≈ (1/n) Σᵢ g(xᵢ), where xᵢ ~ N(𝜇, 𝜎²).

# Step 1: Define f(X)

f <- function(x) 1/(sqrt(2*pi*sigma^2)) * exp(-(x-mu)^2 / (2*sigma^2))   # note the parentheses around 2*sigma^2


# Step 2: Define g(X)

g <- function(x, t){
if (x < t) return(1)
else return(0)
}

# Step 3: Generate x1,...,xn iid random variables

n=10^4
mu <- 5
sigma <- 3
p <- 2.4


x <- rnorm(n = n, mean = mu, sd = sigma)


# Step 4: Compute the estimate

estimate = mean(sapply(x ,g, p))


estimate

0.1951

pnorm(2.4, mean = 5, sd = 3) #defining cdf

0.193062337141907

Example 9) FINDING GAMMA DISTRIBUTION CDF

Let X ~ Gamma(𝛼, 𝛽) and let p1 and p2 be real numbers. Approximate the probability P(X < p1) + P(X > p2) using Monte Carlo integration. Take 𝛼 = 9, 𝛽 = 8, p1 = 0.4, and p2 = 2. Besides, check your answer using “pgamma”.

We know that X has density f(x) = 𝛽^𝛼/Γ(𝛼) · x^(𝛼−1) e^(−𝛽x), x > 0 (the rate parameterization, matching rgamma and pgamma).

How can we find P(X < p1) + P(X > p2)?

As in the previous examples, we define indicator functions for g. Let

g1(x) = 1 if x < p1, else 0;  g2(x) = 1 if x > p2, else 0.

This gives us

P(X < p1) + P(X > p2) = E[g1(X)] + E[g2(X)].

Thus our Monte Carlo approximation is

(1/n) Σᵢ g1(xᵢ) + (1/n) Σᵢ g2(xᵢ), where xᵢ ~ Gamma(𝛼, 𝛽).

# Step 1: Define f(X)

f <- function(x) beta^alpha/gamma(alpha) * x^(alpha-1) * exp(-beta*x)   # rate parameterization, matching rgamma


# Step 2: Define g(X)

g1 <- function(x, p1){
if (x < p1) return(1)
else return(0)
}


g2 <- function(x, p2){
if (x > p2) return(1)
else return(0)
}

# Step 3: Generate x1,...,xn iid random variables

n=10^5
alpha <- 9
beta <- 8
p1 <- 0.4
p2 <- 2

x <- rgamma(n = n, shape = alpha, rate = beta)


# Step 4: Compute the estimate

probability1 = mean(sapply(x ,g1, p1))
probability2 = mean(sapply(x ,g2, p2))

estimate = probability1 + probability2
cat("P(x < 0.4) + P(x > 2) =", estimate)

P(x < 0.4) + P(x > 2) = 0.02754

pgamma(p1, alpha, beta) + (1 - pgamma(p2, alpha, beta))

0.0277013916154789

EXERCISE FROM THE BOOK

Exercise 5.4 (page 150)

Write a function to compute a Monte Carlo estimate of the Beta(3, 3) cdf, and use the function to estimate F(x) for x = 0.1, 0.2, . . . , 0.9. Compare the estimates with the values returned by the pbeta function in R.

The CDF of the Beta(𝛼, 𝛽) distribution is

F(x) = ∫₀ˣ Γ(𝛼+𝛽)/(Γ(𝛼)Γ(𝛽)) · u^(𝛼−1) (1−u)^(𝛽−1) du.

We should define a g function that is uniformly distributed, with range equal to the range of the integral. The range of our integral is (0, x), so g ~ Uniform(0, x), and the estimator multiplies the sample mean of the integrand by the interval length x.

set.seed(361)
n <- 10^4

beta_cdf <- function(x, alpha, beta) {
  u <- runif(n, min = 0, max = x)
  g <- factorial(alpha + beta - 1)/(factorial(alpha - 1)*factorial(beta - 1)) * u^(alpha - 1)*(1-u)^(beta - 1)
  estimated_cdf <- mean(x*g)   # interval length x times the mean of the density values
  estimated_cdf                # return the estimate explicitly
}

x <- seq(0.1, 0.9, 0.1)

for (i in x) {
  estimates <- beta_cdf(i, 3, 3)
  real_values <- pbeta(i, 3, 3)
  cat("x=", i, "\t", "estimates=", estimates, "\t", "real beta values=", real_values, "\n")
}
x= 0.1 estimates= 0.008630343 real beta values= 0.00856
x= 0.2 estimates= 0.05783376 real beta values= 0.05792
x= 0.3 estimates= 0.1639318 real beta values= 0.16308
x= 0.4 estimates= 0.3190771 real beta values= 0.31744
x= 0.5 estimates= 0.4977538 real beta values= 0.5
x= 0.6 estimates= 0.6863473 real beta values= 0.68256
x= 0.7 estimates= 0.8389897 real beta values= 0.83692
x= 0.8 estimates= 0.9435028 real beta values= 0.94208
x= 0.9 estimates= 0.9852201 real beta values= 0.99144


İrem Tanrıverdi

Research and Teaching Assistant. MSc in Statistics. Interested in programming, machine learning and artificial intelligence.