Image segmentation algorithm
In this section image segmentation algorithm is introduced based on the
clustering technique proposed in Section 4. The algorithm
is based on the assumption that salient regions in an image are presented by
dense clouds of points in the range domain, the feature space spanned by color
coordinates. Revealed clusters of pixels are mapped to image segments,
spatially continuous regions in the image.
Before describing the details of the algorithm, we comment the selection of
the range domain.
Since the use of the radially symmetric kernel
relies on the Euclidean metric, the implemented color space should satisfy the
assumption that the difference between two colors is proportional to the
length of the straight line joining them. Although the proposed
algorithm is not designed for any particular color space, the performance
could degrade if the assumed Euclidean metric of the color space is not
valid. Further discussion hereafter refers to the algorithm implementation in
approximately uniform color space.
The Fast two-step histogram-based image segmentation algorithm (FHS)
can be described in following steps:
- Color space transformation
Minimal bounding rectangle (MBR) of the input data
in the designated color space is determined, where
are image pixels represented by their color coordinates and is the number of pixels. The computational complexity of this step is .
- Domain discretization
The range domain is partitioned into cells with side length
, where fixed bandwidth and proportionality
factor are algorithm parameters.
- Initial density estimation
The discrete density
is acquired by counting pixels which
populate each histogram cell. Proportionality constant
is computed as the geometric mean of
. The
complexity of this step is .
- Adaptive density estimation
The discrete density is estimated according to
(15) using the Epanechnikov
kernel (17):
|
(17) |
where is equal to the volume of the -dimensional hypersphere.
As the influence of the
Epanechnikov kernel centered at vanishes for
,
the contribution of all samples populating the -th cell with center
is accounted for in the set of cells with center satisfying
. The variable bandwidth is computed
(8) from the initial estimate
. The
computational complexity of this step is , where is the number
of populated cells and
.
- Range domain clustering
The step wise hill-climbing procedure, outlined in Section
4.2, is started from cells satisfying
. The hill-climbing procedure
partitions the input data into clusters of pixels pertaining to basins
of attraction determined by strong attractors satisfying
and noise, where is the geometric mean of
and
is the algorithm parameter. The
computational complexity of this step is .
- Mapping range domain clusters to the spatial image domain
Spatial constraints of image regions are recovered using the region growing
procedure. The growing process is continued as long as adjacent pixels
belong to the same range domain cluster. The computational complexity of
this step is .
The region growing mapping procedure is selected to evaluate the ability of
the proposed histogram-based approach to produce quality segmentations. When
implemented as a part of a complex vision system, the
output of the algorithm can be accommodated for a particular application and
the desired input for higher level processing modules. An alternative mapping
procedure (step 6) could be based on the fuzzy region growing [31],
with fuzzy segments which retain uncertainty for propagation to
higher level processing, where more intelligent decisions can be made.
Overall complexity of the FHS algorithm is
.
The algorithm is driven by three controlling parameters: fixed bandwidth
, discretization granularity and relative noise level
. Noise level, separating statistically important clusters from
the noise and outliers, is given by (16) relatively to the
geometric mean of the estimated density.
In Fig. 1 number of clusters (strong
attractors) in the range domain with respect to the relative noise level
is shown, normalized by the number of clusters disclosed for
.
The results, shown for two values of , present the average number of
clusters for all images in the publicly available Berkeley Segmentation
Dataset [32] collection. In both cases number of clusters for
is above of number of clusters disclosed for
, suggesting low susceptibility of the algorithm with respect
to this parameter. As shown in Fig. 1, the
resulting number of clusters retains temporal stability starting at
, thus we fix relative noise level to this value.
Factor determines discretization granularity of the color space. This
parameter prescribes the complexity of the density estimation procedure
, as constant
directly depends on the average number
of neighboring cells in the discretized domain in which kernel contribution is
accounted for. We fix the discretization granularity to . This
value represents a compromise between the efficiency and the quality of the
provided segmentations. For greater space and time complexity of the
algorithm rapidly grows, without prominent advances in the quality of
generated segmentations.
After fixing parameters
and , the algorithm is driven with
the single parameter which defines the scale of observation in the range
domain. The additional parameter, often used in the segmentation algorithms,
can be the minimum segment size.
Damir Krstinic
2011-11-04