Pattern Recognition and
Image Processing Group
Institute of Computer Graphics and Algorithms
PRIP, Fakultät für Informatik, Technische Universität Wien (TU Wien)

183.304 DiplDiss Seminar

For schedule & exams see TISS

Important Information:

  • Date: 12.11.2014 (Wednesday) at 13:00.
  • Location: Seminar Room 183/2, Favoritenstraße 9/4, 1040, Vienna.
  • Speakers: Students of Bachelor, Master, and PhD levels.

  • (182Kb) Program


    Title: Structurally Correct Image Segmentation using Local Binary Patterns and the Combinatorial Pyramid
    Speaker: Martin Cerman
    Time: 13:00-13:08
    Abstract: Local Binary Patterns (LBP) were first introduced in 1996 as a way to locally describe the texture of a 2-dimensional surface. The basic LBP operator captures the spatial structure around a pixel by thresholding the 8 pixels in its 3×3 neighborhood against its gray-scale value and encoding the result as an 8-bit binary number. The histogram of these binary numbers describes the texture within a specified region. Current image segmentation methods based on the LBP texture descriptor either use the LBP around a pixel as an additional feature and segment the image with a clustering method, or they employ hierarchical windowed split-and-merge techniques in conjunction with the LBP histogram. Our method, in contrast, uses the LBP to define a representative image which is structurally equal to the original image and which is then used to partition the image into irregular regions. The minimal internal and external contrast between these regions defines the order of the merging operations. Our algorithm additionally uses the combinatorial pyramid data structure to preserve the topological properties of the image, capture the merging history, and enable a full top-to-bottom reconstruction of the image.
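    The basic LBP operator described in this abstract can be sketched as follows. This is a minimal illustration and not the speaker's implementation: the neighbor ordering (clockwise from the top-left pixel), the ">=" thresholding convention, and the function names lbp_code and lbp_histogram are assumptions made for the example.

```python
def lbp_code(img, r, c):
    """Return the 8-bit LBP code of pixel (r, c) from its 3x3 neighborhood.

    img is a list of rows of gray values; (r, c) must not lie on the border.
    """
    center = img[r][c]
    # Assumed ordering: clockwise, starting at the top-left neighbor.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # Assumed convention: a neighbor >= center contributes a 1 bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Histogram of LBP codes over all interior pixels of img."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

    The 256-bin histogram is the texture descriptor of the region passed in; the segmentation method of the talk, which builds on a combinatorial pyramid, is not reproduced here.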

    Title: Markerless tracking of facial features for facial palsy analysis
    Speaker: Barbara Koneczny
    Time: 13:10-13:18
    Abstract: Facial nerve paralysis is a paralysis of the muscles which are innervated by the seventh cranial nerve, resulting in partial or total loss of muscle tone. It restricts the neural activation of the muscles responsible for facial expressions, which causes asymmetric facial movement. One way to treat facial palsy is to apply neuromuscular reconstruction methods to reestablish muscle tone and symmetric facial movement. In order to measure the progress quantitatively, physicians require clinical measures extracted from those locations of the face which provide the most information about the facial expression. Small artificial markers indicate these locations. These markers are placed manually on the patient's face before the evaluation session begins. During the evaluation session, the patient is recorded while performing various facial expressions. In order to evaluate the overall condition, the distance between two marker positions is measured. The software which performs the distance calculation needs a manual annotation of the marker points. This task is currently performed by the physician and can take several hours. Object tracking refers to the estimation of the position of objects from an image sequence. Natural objects, such as the human face, have a high potential for deformation and are characterized by an irregular texture. As not only one but multiple markers have to be tracked simultaneously, additional difficulty is imposed by ensuring that each marker can be uniquely identified. The aim of this thesis is to automate the tracking process and to establish markerless tracking of facial features. The markers in the mouth region could be replaced by methods of visual speech recognition. An alternative to the markers around the eye region is the application of methods used in drowsy driver detection. In order to increase the accuracy of the tracking, a reference frame will be defined manually, and then all other frames similar to the reference frame will be marked as anchor frames. Due to this similarity, the tracker can compute the flow between two anchor frames independently. This approach prevents error accumulation in lengthy performances.
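    The anchor-frame idea in this abstract can be illustrated with a small sketch. The abstract does not say how frame similarity is measured, so the mean absolute gray-value difference, the threshold parameter, and the function names mean_abs_diff and find_anchor_frames are assumptions made only for this example.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute gray-value difference between two equally sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def find_anchor_frames(frames, reference_index, threshold):
    """Indices of all frames that are similar to the reference frame.

    Similarity is assumed to mean mean_abs_diff <= threshold; the
    reference frame itself is always included.
    """
    reference = frames[reference_index]
    return [i for i, frame in enumerate(frames)
            if mean_abs_diff(frame, reference) <= threshold]
```

    A tracker could then treat each stretch between two consecutive anchor indices as an independent segment, which is what keeps drift from accumulating over a long recording.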

    Title: Robust real-time face tracking with a moving camera
    Speaker: Bernd Artmüller
    Time: 13:20-13:28
    Abstract: I present a real-time face tracking system used with a Parrot AR Drone as part of the exhibition "Because human beings can be expected to face reality". The performance was hosted by the Vienna-based artist collective Team[:]Niel in the Austrian embassy in Washington, D.C. To convey a feeling of being permanently observed and of losing individual privacy, my former colleague Bálint Kovács and I faced the challenge of implementing a robust real-time face detection and tracking system. I discuss the technologies used, which are based on the open-source library OpenCV, their major drawbacks, and how we minimized or avoided them.

    Title: Robust Segmentation of Human Teeth Contours in Dental Radiographs using Active Shape Models
    Speaker: Michael Sprinzl
    Time: 13:30-13:38
    Abstract: In my thesis I present a framework for robust segmentation of human teeth contours in dental radiographs. I propose "Active Shape Models" (ASMs) as the segmentation approach. ASMs, a term introduced by Cootes and Taylor in 1992, are flexible, statistically based models which iteratively move toward structures in images similar to those on which they were trained in advance. An ASM consists of a set of landmark points, each representing the position of a particular part of the tooth to be located. For human molars and premolars, a tooth model is built from the statistics of the positions of landmark points placed on each of a set of training images. For image interpretation, the model of the tooth to be segmented is placed in a previously unseen tooth image. The model parameters are then iteratively adjusted to move the landmark points toward better positions. Constraints are applied so that the overall tooth shape cannot deform more than the teeth seen in the corresponding training set. My proposed segmentation framework consists of noise reduction, building the ASM for the specific teeth using corresponding training images, and searching for teeth in previously unseen images. It can be used with either GNU Octave or MATLAB. The accuracy and speed of the framework are evaluated using a set of dental radiographs containing intra-oral posterior periapical views of 60 molars and 70 premolars from 24 patients (22 female, 2 male) born between 1942 and 1993, taken over a period of 10 years.
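    The shape constraint mentioned in this abstract is commonly realized in the ASM literature by limiting each shape parameter to a few standard deviations of its mode of variation. The sketch below assumes that formulation: a shape is the mean shape plus a weighted sum of mode vectors, and each weight is clamped to plus or minus k times the square root of the mode's eigenvalue. The value k = 3 and the function names are assumptions; the abstract does not state which limits the thesis uses, and the thesis itself targets GNU Octave or MATLAB rather than Python.

```python
import math


def clamp_shape_params(b, eigenvalues, k=3.0):
    """Clamp each shape parameter b[i] to [-k*sqrt(lambda_i), k*sqrt(lambda_i)]."""
    clamped = []
    for b_i, lam in zip(b, eigenvalues):
        limit = k * math.sqrt(lam)
        clamped.append(max(-limit, min(limit, b_i)))
    return clamped


def reconstruct_shape(mean_shape, modes, b):
    """Return mean_shape + sum_i b[i] * modes[i] as a flat coordinate list.

    mean_shape and every modes[i] are flat lists (x1, y1, x2, y2, ...)
    of landmark coordinates with the same length.
    """
    shape = list(mean_shape)
    for b_i, mode in zip(b, modes):
        for j, component in enumerate(mode):
            shape[j] += b_i * component
    return shape
```

    During the iterative search, clamping would be applied after each parameter update so that the reconstructed landmark positions stay within the variation observed in the training set.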

    Title: Recognizing Structure in Spatio-Temporal Classes of Images
    Speaker: Ines Janusch
    Time: 13:40-13:48
    Abstract: By extending a single capture of a 2D image (defined in the spatial domain) to a sequence of such images, temporal information is added and the image is defined in the spatio-temporal domain. The image content (of either a single image or an image sequence) may then be represented using a medial axis or Reeb graphs. However, these representations rely on a clear foreground/background segmentation as their input. One aim of the planned thesis is to compute critical points, which form the basis for graph representations, directly on the unsegmented data. One option is to use local descriptors, for example local binary patterns (LBP), as a Morse function. For this purpose, the LBP description of local structures needs to be extended in time to describe local features in the spatio-temporal domain. Another aim of the planned research is to find robust representations of spatio-temporal images. A first application is given by root images, which are used in plant phenotyping. Due to their specific shape, roots may be well represented using parabolas. The proposed representation transforms a curved root into a parabola, thereby straightening the root. On the one hand, this new representation allows the root pixels to be mapped into the parabola shape, which reduces possible segmentation artefacts. On the other hand, it allows for an analysis of root images: images of the same root on different days of growth can be aligned using the branching points along the root, so that the growth process of the whole root system may be analysed. Side roots and the main root may both be represented using this parabola approach, and they can be overlaid to analyse the overall growth pattern of a root.

    Title: On the N-1 Property and Min-Weight Max-Entropy Problem
    Speaker: Samuel de Sousa
    Time: 13:50-13:58
    Abstract: Consider a point-set coming from an object which was sampled with a digital sensor (depth range, camera, etc.). We are interested in finding a graph that represents that point-set according to some properties. Such a representation would allow us to match two objects (graphs) by exploiting topological properties instead of relying solely on geometrical properties. The Delaunay triangulation is a common off-the-shelf strategy to triangulate a point-set, and it is used by many researchers as the standard way to create the so-called data-graph. We are interested in generating a graph with the following properties: the graph is (i) as unique as possible and (ii) as discriminative as possible regarding the degree distribution. We pose a combinatorial optimization problem to build such a graph by minimizing the total weight cost of the edges and at the same time maximizing the entropy of the degree distribution. Our optimization approach is based on Dynamic Programming (DP) and yields a polynomial-time algorithm. In this presentation, we derive the Min-Weight Max-Entropy Problem for the unconstrained and constrained scenarios (the latter when the graph has to be embedded in the plane) and demonstrate how our algorithm is able to solve both versions of the problem.
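    The two quantities traded off in this abstract can be made concrete with a short sketch. It only evaluates a given graph and does not implement the dynamic programming algorithm of the talk. Euclidean edge length as the weight, Shannon entropy in bits, and the function names total_weight and degree_entropy are assumptions made for the example.

```python
import math
from collections import Counter


def total_weight(points, edges):
    """Sum of Euclidean edge lengths; edges are pairs of point indices."""
    return sum(math.dist(points[u], points[v]) for u, v in edges)


def degree_entropy(n_vertices, edges):
    """Shannon entropy (in bits) of the vertex degree distribution."""
    degree = [0] * n_vertices
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree)  # maps degree value -> number of vertices
    entropy = 0.0
    for count in counts.values():
        p = count / n_vertices
        entropy -= p * math.log2(p)
    return entropy
```

    A graph in which all vertices have the same degree has entropy 0, so maximizing the entropy favors graphs whose vertices are easier to tell apart by degree, while the weight term keeps the edges short.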
This page is maintained by Webmaster ( webmaster(at) ) and was last modified on 03. November 2014 17:10