Interactive Hierarchical Image Segmentation on Irregular Pyramids
by Michael Gerstmayer
Abstract:
Image segmentation, in general, is the process of dividing a digital image into segments having a strong correlation with objects in it. Various techniques exist to locate objects of interest formed by visual cues. However, general purpose segmentation methods cannot produce a perfect final segmentation by using low-level cues only. One way around the problem is to create a stack of segmentations at different resolution levels. Higher-level knowledge can then be used to confirm or select regions for further processing. In automatic region-based segmentation, usually such a stack of segmentations is built in a bottom-up manner, guided by low-level image feature data and the defined homogeneity criteria. We should also take into account that the accuracy of an image segmentation is measurable, but its quality and usability are highly subjective and also depend on the scope of the application. This thesis deals with modifying such an irregular image segmentation pyramid and embedding additional knowledge about the problem domain such that the results of the image segmentation best suit the user. Based on an existing automatic segmentation framework in which a minimum spanning tree based method tries to capture perceptually important groupings, we bring the user into the loop and define interactive operations that guide the segmentation process. Semi-automatic approaches offer multiple benefits (such as flexibility and acceptance) and may sometimes also be required from a legal point of view. The interactive operations of merging and inhibition from merging require a representation that encodes the edge and the parent-child relationship of the merging tree. In this work, each level of the irregular pyramid is represented by a combinatorial map, which encodes both region and boundary information in a single structure.
Using the connecting paths between the different levels of the pyramid, it is possible to focus on regions at different granularities. In contrast to related approaches, this work is not limited to a single working level or to purely sequential processing. Moreover, regions at different resolutions, down to the pixel level, may be selected in parallel. This requires dedicated (pre-)processing and conflict resolution methods which guarantee that the operations are applied consistently throughout the hierarchy. The output is a stack of segmentations whose topmost level holds the final result that best suits the user's application. We investigate usability questions about the interactive segmentation tool developed and collect empirical values for the operations defined. As it turned out, the participants (beginners) were able to produce results satisfying their expectations. The data recorded during the segmentation sessions reveals different strategies and provides evidence of how the interactive operations are used. This work can be used for problems that require accurate image segmentation, image annotation, or ground-truth creation, among others.
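The two interactive operations named in the abstract, merging and inhibition from merging, can be illustrated with a small sketch. This is not the report's implementation: it uses a flat union-find structure instead of combinatorial maps and a pyramid, and a plain weight threshold stands in for the framework's homogeneity criterion. All names (`UnionFind`, `segment`, `must_merge`, `inhibit`, `threshold`) are hypothetical and chosen for illustration only.

```python
class UnionFind:
    """Disjoint-set forest over region ids 0..n-1."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        # Path halving keeps the trees shallow.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra
        return ra


def segment(n, edges, threshold, must_merge=(), inhibit=()):
    """Kruskal-style merging with user constraints (illustrative only).

    edges      -- iterable of (weight, u, v) between neighbouring regions
    threshold  -- assumed stand-in for a homogeneity criterion
    must_merge -- user-selected pairs that are merged unconditionally
    inhibit    -- user-selected pairs that must never share a region
    """
    uf = UnionFind(n)

    # Apply the user's merge requests first.
    for u, v in must_merge:
        uf.union(u, v)

    # Minimal conflict check: a pair cannot be both merged and inhibited.
    for a, b in inhibit:
        if uf.find(a) == uf.find(b):
            raise ValueError(f"regions {a} and {b} are both merged and inhibited")

    def blocked(ra, rb):
        # Joining ra and rb is blocked if it would unite an inhibited pair.
        return any({uf.find(a), uf.find(b)} == {ra, rb} for a, b in inhibit)

    # Process edges in order of increasing weight, as in Kruskal's algorithm.
    for w, u, v in sorted(edges):
        if w > threshold:
            break
        ru, rv = uf.find(u), uf.find(v)
        if ru != rv and not blocked(ru, rv):
            uf.union(ru, rv)

    return [uf.find(i) for i in range(n)]


# A 1-D "image" of six pixels; edge weights are absolute intensity differences.
pixels = [10, 11, 12, 50, 51, 52]
edges = [(abs(pixels[i] - pixels[i + 1]), i, i + 1) for i in range(len(pixels) - 1)]

automatic = segment(len(pixels), edges, threshold=5)
merged = segment(len(pixels), edges, threshold=5, must_merge=[(2, 3)])
inhibited = segment(len(pixels), edges, threshold=5, inhibit=[(0, 2)])
```

In this toy setting the automatic run separates the dark and bright pixels, the merge request across the strong edge joins them into one region, and the inhibition keeps pixel 2 apart from pixel 0 even though the intensities are similar.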
Reference:
Interactive Hierarchical Image Segmentation on Irregular Pyramids (Michael Gerstmayer), Technical report, PRIP, TU Wien, 2013.
Bibtex Entry:
@TechReport{PTR-Gerstmayer13a,
  author =	 "Michael Gerstmayer",
  title =	 "Interactive Hierarchical Image Segmentation on Irregular Pyramids",
  institution =	 "PRIP, TU Wien",
  number =	 "PRIP-TR-129",
  year =	 "2013",
  url =		 "ftp://ftp.prip.tuwien.ac.at/pub/publications/trs/tr129.pdf",
  abstract =	 "Image segmentation, in general, is the process of dividing a digital image into segments having a
strong correlation with objects in it. Various techniques exist to locate objects of interest formed
by visual cues. However, general purpose segmentation methods cannot produce a perfect final
segmentation by using low-level cues only. One way around the problem is to create a stack
of segmentations at different resolution levels. Higher-level knowledge can then be used to
confirm or select regions for further processing. In automatic region-based segmentation, usually
such a stack of segmentations is built in a bottom-up manner, guided by low-level image feature
data and the defined homogeneity criteria. We should also take into account that the accuracy
of an image segmentation is measurable, but its quality and usability are highly subjective and
also depend on the scope of the application. This thesis deals with modifying such an
irregular image segmentation pyramid and
embedding additional knowledge about the problem domain such that the results of the image
segmentation best suit the user. Based on an existing automatic segmentation framework in which
a minimum spanning tree based method tries to capture perceptually important groupings, we
bring the user into the loop and define interactive operations that guide the segmentation process.
Semi-automatic approaches offer multiple benefits (such as flexibility and acceptance) and may
sometimes also be required from a legal point of view. The interactive operations of merging
and inhibition from merging require a representation that encodes the edge and the parent-child
relationship of the merging tree. In this work, each level of the irregular pyramid is represented by
a combinatorial map, which encodes both region and boundary information in a single
structure. Using the connecting paths between the different levels of the pyramid, it is
possible to focus on regions at different granularities. In contrast to related approaches,
this work is not limited to a single working level or to purely sequential processing. Moreover,
regions at different resolutions, down to the pixel level, may be selected in parallel. This
requires dedicated (pre-)processing and conflict resolution methods which guarantee that the
operations are applied consistently throughout the hierarchy. The output is a stack of segmentations
whose topmost level holds the final result that best suits the user's application.
We investigate usability questions about the interactive segmentation
tool developed and collect empirical values for the operations defined. As it turned out, the participants
(beginners) were able to produce results satisfying their expectations. The data recorded
during the segmentation sessions reveals different strategies and provides evidence of how
the interactive operations are used. This work can be used for problems that require accurate image
segmentation, image annotation, or ground-truth creation, among others.",
}