aib.c File Reference

Agglomerative Information Bottleneck (AIB) - Definition. More...

#include "aib.h"
#include <stdio.h>
#include <stdlib.h>
#include <float.h>
#include <math.h>

Data Structures

struct  _VlAIB
 AIB algorithm data. More...

Defines

#define BETA_MAX   DBL_MAX
 The maximum value which beta may take.

Functions

void vl_aib_normalize_P (vl_aib_prob *P, vl_aib_node nelem)
 Normalizes an array of probabilities to sum to 1.
vl_aib_nodevl_aib_new_nodelist (vl_aib_node nentries)
 Allocates and creates a list of nodes.
vl_aib_probvl_aib_new_Px (vl_aib_prob *Pcx, vl_aib_node nvalues, vl_aib_node nlabels)
 Allocates and creates the marginal distribution Px.
vl_aib_probvl_aib_new_Pc (vl_aib_prob *Pcx, vl_aib_node nvalues, vl_aib_node nlabels)
 Allocates and creates the marginal distribution Pc.
void vl_aib_min_beta (VlAIB *aib, vl_aib_node *besti, vl_aib_node *bestj, vl_double *minbeta)
 Find the two nodes which have minimum beta.
void vl_aib_merge_nodes (VlAIB *aib, vl_aib_node i, vl_aib_node j, vl_aib_node new)
 Merges two nodes i,j in the internal datastructure.
void vl_aib_update_beta (VlAIB *aib)
 Updates aib->beta and aib->bidx according to aib->which.
void vl_aib_calculate_information (VlAIB *aib, vl_aib_prob *I, vl_aib_prob *H)
 Calculates the current information and entropy.
VlAIBvl_aib_new_aib (vl_aib_prob *Pcx, vl_aib_node nvalues, vl_aib_node nlabels)
 Allocates and initializes the internal data structure.
void vl_aib_delete_aib (VlAIB *aib)
 Deletes AIB data structure.
vl_aib_nodevl_aib (vl_aib_prob *Pcx, vl_uint nlabels, vl_uint nvalues, double **cost)
 Runs AIB on Pcx.


Detailed Description

For internal use only.

Author:
Brian Fulkerson

Andrea Vedaldi


Define Documentation

#define BETA_MAX   DBL_MAX

For internal use only.


Function Documentation

vl_aib_node* vl_aib ( vl_aib_prob Pcx,
vl_uint  nlabels,
vl_uint  nvalues,
double **  cost 
)

Parameters:
Pcx joint probability table (column major).
nlabels number of rows in Pcx (number of labels).
nvalues number of columns in Pcx (number of feature values).
cost cost of each node (out).
The function runs Agglomerative Information Bottleneck (AIB) on the joint probabilit talbe Pcx which has labels along the columns and feature valeus along the rows. AIB iteratively merges the two vlaues of the feature x that casuses the smallest decrease in mutual information between the random variables x and c.

Merge operations are arranged in a binary tree. The nodes of the tree correspond to the original feature values and any other value obtained as a result of a merge operation. The nodes are indexed in breadth-first order, starting from the leaves. The first index is zero. In this way, the leaves correspond directly to the original feature values. In total there are 2*nvalues-1 nodes.

The function returns an array whit one element per tree node. Each element is the index the parent node. The root parent is equal to zero.

Feature values with null probability are ignored by the algorithm and their nodes have parents indexing a non-existant tree node (a value bigger than 2*nvalues-1).

If cost is not equal to NULL, then the function will also return a vector with the information level after each merge. cost has nvalues entries: The first is the value of the cost funcitonal before any merge, and the other are the cost after the nvalues-1 merges.

Returns:
the array of parents representing the tree. The array has 2*nvalues-1 elements.

void vl_aib_calculate_information ( VlAIB aib,
vl_aib_prob I,
vl_aib_prob H 
)

For internal use only.

Parameters:
aib A pointer to the internal data structure
I The current mutual information (out).
H The current entropy (out).
Calculates the current mutual information and entropy of Pcx and sets I and H to these new values.

void vl_aib_delete_aib ( VlAIB aib  ) 

For internal use only.

Parameters:
aib data structure to delete.

void vl_aib_merge_nodes ( VlAIB aib,
vl_aib_node  i,
vl_aib_node  j,
vl_aib_node  new 
)

For internal use only.

Parameters:
aib A pointer to the internal data structure
i The index of one member of the pair to merge
j The index of the other member of the pair to merge
new The index of the new node which corresponds to the union of (i, j).
Nodes are merged by replacing the entry i with the union of ij, moving the node stored in last position (called lastnode) back to jth position and the entry at the end.

After the nodes have been merged, it updates which nodes should be considered on the next iteration based on which beta values could potentially change. The merged node will always be part of this list.

void vl_aib_min_beta ( VlAIB aib,
vl_aib_node besti,
vl_aib_node bestj,
vl_double minbeta 
)

For internal use only.

Parameters:
aib A pointer to the internal data structure
besti The index of one member of the pair which has mininum beta
bestj The index of the other member of the pair which minimizes beta
minbeta The minimum beta value corresponding to (i, j)
Searches aib->beta to find the minimum value and fills minbeta and besti and bestj with this information.

VlAIB* vl_aib_new_aib ( vl_aib_prob Pcx,
vl_aib_node  nvalues,
vl_aib_node  nlabels 
)

For internal use only.

Parameters:
Pcx A pointer to a 2D array of probabilities
nvalues The number of rows in the array
nlabels The number of columns in the array
Creates a new VlAIB struct containing pointers to all the data that will be used during the AIB process.

Allocates memory for the following:

Since it simply copies to pointer to Pcx, the total additional memory requirement is: (nvalues+nlabels)*sizeof(vl_aib_prob) + 3*nvalues*sizeof(vl_aib_prob) + nvalues*sizeof(vl_double)

Returns:
An allocated and initialized VlAIB pointer

vl_aib_node* vl_aib_new_nodelist ( vl_aib_node  nentries  ) 

For internal use only.

Parameters:
nentries The size of the list which will be created
Returns:
an array containing elements 0...nentries

vl_aib_prob* vl_aib_new_Pc ( vl_aib_prob Pcx,
vl_aib_node  nvalues,
vl_aib_node  nlabels 
)

For internal use only.

Parameters:
Pcx A two-dimensional array of probabilities
nvalues The number of rows in Pcx
nlabels The number of columns in Pcx
Returns:
an array of size nlabels which contains the marginal distribution over the columns

vl_aib_prob* vl_aib_new_Px ( vl_aib_prob Pcx,
vl_aib_node  nvalues,
vl_aib_node  nlabels 
)

For internal use only.

Parameters:
Pcx A two-dimensional array of probabilities
nvalues The number of rows in Pcx
nlabels The number of columns in Pcx
Returns:
an array of size nvalues which contains the marginal distribution over the rows.

void vl_aib_normalize_P ( vl_aib_prob P,
vl_aib_node  nelem 
)

For internal use only.

Parameters:
P The array of probabilities
nelem The number of elements in the array
Returns:
Modifies P to contain values which sum to 1

void vl_aib_update_beta ( VlAIB aib  ) 

For internal use only.

Parameters:
aib AIB data structure.
The function calculates beta[i] and bidx[i] for the nodes i listed in aib->which. beta[i] is the minimal variation of mutual information (or other score) caused by merging entry i with another entry and bidx[i] is the index of this best matching entry.

Notice that for each entry i that we need to update, a full scan of all the other entries must be performed.


Generated on Mon Jan 21 17:43:32 2008 for vlfeat by  doxygen 1.5.4