Differently exposed images (e.g. individual frames of video)
of the same subject matter are denoted as
vectors: $f_0, f_1, \ldots, f_i, \ldots, f_{I-1}$, where $0 \le i < I$.
Each video frame is some unknown function, $f$, of the actual
quantity of light, $q$, falling on the image sensor:
$$ f_i = f\left( k_i \, q\left( \frac{A_i \mathbf{x} + b_i}{c_i \mathbf{x} + d_i} \right) \right), $$
where $\mathbf{x} = (x, y)$ denotes the spatial coordinates of the image,
$k_i$ is a single unknown scalar exposure constant, and the parameters
$A_i$, $b_i$, $c_i$, and $d_i$ denote the projective
coordinate transformation between successive pairs of images:
$A_i$ is the linear coordinate transformation
(e.g. it accounts for magnification in each of the $x$ and $y$ directions and
shear in each of the $x$ and $y$ directions),
$b_i$ is the translation in each
of these two coordinate directions, and
$c_i$ is the projective chirp rate in
each of these two coordinate directions [3].
The additional constant $d_i$
makes the coordinate transformation into a group.
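As a rough illustration of the coordinate transformation described above, the following sketch maps a single image point through $(A\mathbf{x} + b)/(c\mathbf{x} + d)$. It is not the method of [3]: the function name `projective_transform` and the example parameter values are hypothetical, and the product $c\mathbf{x}$ is assumed here to be the inner product of the chirp-rate vector with the coordinate vector.

```python
def projective_transform(point, A, b, c, d):
    """Map a 2-D point (x, y) to (A x + b) / (c . x + d).

    A is a 2x2 linear map, b a 2-vector translation, c a 2-vector chirp
    rate, and d a scalar. Treating c x as an inner product is an assumption.
    """
    x, y = point
    denominator = c[0] * x + c[1] * y + d  # scalar shared by both outputs
    if denominator == 0:
        raise ZeroDivisionError("point maps to infinity under this transformation")
    new_x = (A[0][0] * x + A[0][1] * y + b[0]) / denominator
    new_y = (A[1][0] * x + A[1][1] * y + b[1]) / denominator
    return (new_x, new_y)


IDENTITY_A = ((1.0, 0.0), (0.0, 1.0))

# Identity parameters (A = I, b = 0, c = 0, d = 1) leave coordinates unchanged.
unchanged = projective_transform((1.0, 2.0), IDENTITY_A, (0.0, 0.0), (0.0, 0.0), 1.0)

# A nonzero chirp rate makes the scaling depend on position in the image.
chirped = projective_transform((1.0, 2.0), IDENTITY_A, (0.0, 0.0), (0.1, 0.0), 1.0)
```

With the identity parameters the point is returned unchanged; with the hypothetical chirp rate `(0.1, 0.0)` both coordinates are divided by `1.1`.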
For simplicity, this coordinate transformation is assumed to be independently recoverable (e.g. using the methods of [3]). Therefore, without loss of generality, it is taken in this paper to be the identity coordinate transformation, which corresponds to the special case of images differing only in exposure.
Without loss of generality, $k_0$ will be called the reference
exposure and set to unity, and frame zero will be called the reference
frame, so that $f_0 = f(q)$.
Thus we have:
$$ \frac{1}{k_i} f^{-1}(f_i) = f^{-1}(f_0), \quad \forall i,\ 0 < i < I. $$
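This relation follows from the imaging model in two steps. The sketch below assumes the identity coordinate transformation and that $f$ is invertible (e.g. monotonic) over the range of interest, which is implicit in writing $f^{-1}$:

$$
\begin{aligned}
f_i &= f(k_i q), & f_0 &= f(q), \\
f^{-1}(f_i) &= k_i q, & f^{-1}(f_0) &= q, \\
\frac{1}{k_i} f^{-1}(f_i) &= q = f^{-1}(f_0).
\end{aligned}
$$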
Taking the logarithm of both sides,
$$ F^{-1}(f_i) - K_i = F^{-1}(f_0), \quad \forall i,\ 0 < i < I, $$
where $K_i = \log(k_i)$, and $F^{-1} = \log(f^{-1})$
is the logarithmic inverse camera response
function (e.g. a lookup table converting pixel values into exposure values).
Rearranging, we have:
$$ F^{-1}(f_i) - F^{-1}(f_0) = K_i, \quad \forall i,\ 0 < i < I. $$
This relation suggests a way to estimate the camera response function
from a pair of differently exposed images of the same subject matter.
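The relation can be checked numerically on synthetic data. The sketch below is only an illustration: it assumes a hypothetical power-law response with `GAMMA = 2.2` (real camera responses are unknown, which is the point of the estimation problem), and it ignores noise, quantization, and saturation. The names `f`, `f_inverse`, and `F_inverse` mirror the symbols in the text; the sample values are made up.

```python
import math

GAMMA = 2.2  # hypothetical response exponent, chosen only for illustration


def f(q):
    """Assumed camera response: maps quantity of light q to a pixel value."""
    return q ** (1.0 / GAMMA)


def f_inverse(pixel):
    """Inverse of the assumed response: recovers q from a pixel value."""
    return pixel ** GAMMA


def F_inverse(pixel):
    """Logarithmic inverse response, F^-1 = log(f^-1)."""
    return math.log(f_inverse(pixel))


k_i = 0.5                         # exposure of frame i relative to frame 0
q_values = [0.05, 0.2, 0.6, 0.9]  # made-up quantities of light at four pixels

frame_0 = [f(q) for q in q_values]        # reference frame, k_0 = 1
frame_i = [f(k_i * q) for q in q_values]  # same scene, different exposure

# F^-1(f_i) - F^-1(f_0) should equal K_i = log(k_i) at every pixel.
differences = [F_inverse(fi) - F_inverse(f0) for fi, f0 in zip(frame_i, frame_0)]
K_i = math.log(k_i)
```

Under these assumptions every entry of `differences` matches `K_i` up to floating-point rounding, regardless of the quantity of light at that pixel. In practice the response is not known in advance, so the same constraint is used in the other direction, to constrain the response from observed pixel pairs.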
Before estimating the camera response function, we consider how the noise
will affect the estimation.