JExtractor

Signal Feature Extractor Toolkit

Author

John Talbot

Institution

Physics Dept.,
University of Ottawa

Latest version

0.3 pre-alpha (progress: 30%)

Description

Feature extraction toolkit based on a wavelet multiscale vision model. Includes 1D signal processing sample application for line identification and classification of astronomical spectra.

Purpose

Provide a highly extensible toolkit for signal feature extraction.
Emphasis is placed on usability rather than efficiency.
JExtractor is not a wavelet toolkit: the focus is on feature extraction, not on wavelets.
The à trous discrete wavelet transform (DWT) is the implementation of choice.
There are no plans to implement other kinds of wavelet transforms.
The toolkit will conform to the Java Data Mining API version 1.0 (JSR-73), and to the JDMAPI 2.0 (JSR-247) when its final draft is published.

License

GPL version 2

Source Code

Located on SourceForge

API Documentation

JavaDoc API and hyperlinked source code

Programming Language

Java 5.0 required

Persistence

Input : Text, FITS and GIF, XML for database (JAXB)
Output : XML for database, signal and features

Dependencies

Build Time
Run Time
- JFits 0.92
- JAXB 1.0.3

Reliability

Unit testing

Algorithm

Input
1. Signal containing features to extract
2. Non-uniform noise for each pixel in the signal
Output
1. A set of individual features contained in the signal
2. A graphical display of features
3. A set of metrics for peak-like features

Procedure

Detection : find wavelet coefficients which exceed threshold
Separation : artifacts are grouped and separated into objects
Reconstruction : each object is iteratively reconstructed
Measurement : certain objects such as peaks are measured
Display : graphical display of the signal and features

(a) Interpretation : measurements are physically interpreted and labeled
(b) Classification : labeled features can be grouped, sorted, classified

Progress

Detection	80%
Separation	30%
Reconstruction	20%
Measurement	70%
Display	20%
Interpretation	25%
Classification	20%

Procedure Summary

1. Detection

The first step in the Multiscale Vision Model for feature extraction is to transform the signal using a discrete wavelet transform based on the à trous algorithm with a B³-spline smoothing function (see Starck, 2002). This produces the wavelet coefficients at several scales, together with the 1-sigma noise threshold. Each above-threshold artifact is associated with a structure: a set of contiguous wavelet coefficients, within a single scale, that exceed the noise threshold for that scale. A structure optionally contains a reference to its parent structure and a list of child structures, connected via interscale relationships. A structure may be part of an object. A structure may be due to noise if it possesses very few pixels (the upper limit of which is application specific) and is not connected to any other structure via an InterscaleRelationship, or if it is located near the boundary of the signal, in which case it might be due to boundary effects.
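
The grouping of above-threshold coefficients into structures can be sketched as follows. This is a minimal illustration, not the toolkit's actual code: the class and method names (StructureFinder, Structure, find) are hypothetical, the threshold is assumed to be k times a per-pixel noise estimate, and only positive coefficients are treated as significant.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: groups above-threshold coefficients of one scale into structures. */
final class StructureFinder {

    /** A run of contiguous significant coefficients, [start, end] inclusive, in one scale. */
    static final class Structure {
        final int scale;
        final int start;
        final int end;

        Structure(int scale, int start, int end) {
            this.scale = scale;
            this.start = start;
            this.end = end;
        }

        /** Number of pixels in the structure. */
        int size() {
            return end - start + 1;
        }
    }

    /**
     * Scans one wavelet scale and returns every maximal run of coefficients
     * whose value exceeds k * sigma[i] (assumed form of the noise threshold).
     */
    static List<Structure> find(double[] w, double[] sigma, double k, int scale) {
        List<Structure> result = new ArrayList<Structure>();
        int start = -1; // -1 means "not currently inside a structure"
        for (int i = 0; i < w.length; i++) {
            boolean significant = w[i] > k * sigma[i];
            if (significant && start < 0) {
                start = i; // a new structure begins here
            } else if (!significant && start >= 0) {
                result.add(new Structure(scale, start, i - 1));
                start = -1;
            }
        }
        if (start >= 0) {
            // the last structure touches the signal boundary
            result.add(new Structure(scale, start, w.length - 1));
        }
        return result;
    }
}
```

A structure whose size() is very small, or whose start or end touches the signal boundary, would be a candidate for rejection as noise under the criteria described above.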

2. Separation

The next step in the Multiscale Vision Model attempts to associate structures into an object tree, which corresponds to a feature in the original signal. An object contains a set of connected interscale relationships representing a tree of structures that spans multiple scales and is usually restricted to a small subset of positions in the original signal. Each interscale relationship shares one structure with another interscale relationship in the same object. An object is composed of zero or more sub-objects, each represented by a subset of the interscale relationships of the parent object. Separation into sub-objects is based on the local maxima contained in each object.
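
One way to picture the object tree is sketched below. The names (InterscaleLinker, Node, link, treeSize) are hypothetical, and the linking rule is an assumption that this page does not state: a structure at scale j is attached to the structure at scale j + 1 that contains the position of its largest coefficient.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of structures linked across scales into object trees. */
final class InterscaleLinker {

    /** A structure: contiguous significant coefficients [start, end] at one scale. */
    static final class Node {
        final int scale;
        final int start;
        final int end;
        final int maxPosition; // index of the largest coefficient in the structure
        Node parent;           // structure at scale + 1, or null for a tree root
        final List<Node> children = new ArrayList<Node>();

        Node(int scale, int start, int end, int maxPosition) {
            this.scale = scale;
            this.start = start;
            this.end = end;
            this.maxPosition = maxPosition;
        }

        boolean contains(int position) {
            return position >= start && position <= end;
        }
    }

    /**
     * Links each structure in 'finer' (scale j) to the first structure in
     * 'coarser' (scale j + 1) that contains the position of its maximum.
     * This linking rule is an assumption of the sketch.
     */
    static void link(List<Node> finer, List<Node> coarser) {
        for (Node child : finer) {
            for (Node candidate : coarser) {
                if (candidate.contains(child.maxPosition)) {
                    child.parent = candidate;
                    candidate.children.add(child);
                    break;
                }
            }
        }
    }

    /** Counts the structures in the tree rooted at 'root', i.e. the size of one object. */
    static int treeSize(Node root) {
        int count = 1;
        for (Node child : root.children) {
            count += treeSize(child);
        }
        return count;
    }
}
```

In this picture, a root with several children at the next finer scale is a natural candidate for separation into sub-objects, one per local maximum.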

3. Reconstruction

The final step in the Multiscale Vision Model is to accurately reconstruct each object using an iterative conjugate gradient matrix solver. All objects are extracted, irrespective of shape or extent.

4. Measurement

The various features in a signal can be measured. For instance, in 1D signals some features resemble peaks, which can be measured by fitting a Gaussian function with a Levenberg-Marquardt nonlinear least-squares fit to obtain the width, height and center. This type of measurement is only one example; many other kinds of features, such as discontinuities, could also be measured.
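
A nonlinear fit needs a starting guess for the peak parameters. The sketch below is not a Levenberg-Marquardt implementation; it only shows a simple moment-based estimate that could seed such a fit. The names (GaussianMoments, gaussian, estimate) are hypothetical, the width is assumed to mean the standard deviation of the Gaussian, and the samples are assumed to lie at x = 0, 1, 2, ... on a zero baseline.

```java
/** Hypothetical sketch: moment-based estimate of Gaussian peak parameters. */
final class GaussianMoments {

    /** Evaluates height * exp(-(x - center)^2 / (2 * width^2)). */
    static double gaussian(double x, double height, double center, double width) {
        double z = (x - center) / width;
        return height * Math.exp(-0.5 * z * z);
    }

    /**
     * Estimates {height, center, width} of a single positive peak sampled at
     * x = 0, 1, 2, ... from its zeroth, first and second moments.
     */
    static double[] estimate(double[] y) {
        double area = 0.0;
        double first = 0.0;
        for (int i = 0; i < y.length; i++) {
            area += y[i];
            first += i * y[i];
        }
        double center = first / area; // centroid of the peak
        double second = 0.0;
        for (int i = 0; i < y.length; i++) {
            second += (i - center) * (i - center) * y[i];
        }
        double width = Math.sqrt(second / area); // standard deviation
        // For a Gaussian with unit sample spacing, area = height * width * sqrt(2 * pi).
        double height = area / (width * Math.sqrt(2.0 * Math.PI));
        return new double[] {height, center, width};
    }
}
```

The estimate is only reliable for an isolated peak that is fully contained in the sampled range, which is the situation after an object has been reconstructed on its own.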

5. Display

A Java GUI for viewing the original signal with overlays of its extracted features.

(a) Interpretation : With the aid of the GUI display, physical interpretations can be associated with individual features and measurements, such as theoretical wavelength identifications, ionic species such as carbon++ and transition quantum levels 2P^o - 3P^o.
(b) Classification : With the aid of the GUI display, labeled and/or interpreted features can be grouped, sorted and classified, with the ultimate goal of reaching a scientific conclusion about the observed phenomena. It can also be used to justify a given physical theory based on labeled, interpreted features found in the experimental data.

Procedure Details

The following sections go into greater detail for each step :

Step 1 : Detection using à trous Discrete Wavelet Transform

Description

The DWT is a kind of mathematical transform of a signal
The DWT is unlike the discrete Fourier transform in that it produces results with one dimension more than the signal
The à trous DWT produces a set of scales which roughly correspond to aspects of the signal at different spatial frequencies
The results of the à trous DWT have an element of redundancy
The term 'à trous' comes from the 'holes' inserted between the smoothing function coefficients

Variables

kernel : {1, 4, 6, 4, 1} / 16 compact B³ spline smoothing function
j : wavelet scale
c(j) : successively smoothed signal
w(j) : multiresolution discrete wavelet transform scales of signal

Algorithm

1. c(0) = signal
2. j = 1
3. c(j) = convolution of c(j-1) with kernel
4. w(j) = c(j-1) - c(j)
5. insert 2^j - 1 zero(s) between the convolution kernel filter coefficients (holes), i.e. the filter expands by a factor of 2 at every iteration
6. j = j + 1
7. go to step 3 until the desired number of wavelet scales is obtained
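
The loop above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the toolkit's implementation: the names (ATrous, mirror, transform) are hypothetical, a mirror boundary condition is used, and instead of physically inserting zeros into the kernel the code doubles the spacing between kernel taps, which is equivalent. It uses the sign convention w(j) = c(j-1) - c(j), the one for which c(0) = c(p) + Σ w(j) holds.

```java
/** Minimal sketch of the 1D a trous transform with a B3-spline kernel and mirror boundaries. */
final class ATrous {

    private static final double[] KERNEL = {1 / 16.0, 4 / 16.0, 6 / 16.0, 4 / 16.0, 1 / 16.0};

    /** Reflects an out-of-range index back into [0, n). */
    static int mirror(int i, int n) {
        if (n == 1) {
            return 0;
        }
        int period = 2 * (n - 1);
        int m = Math.abs(i) % period;
        return m < n ? m : period - m;
    }

    /**
     * Returns scales + 1 arrays: result[j - 1] holds w(j) for j = 1..scales,
     * and result[scales] holds the final smoothed array c(scales).
     */
    static double[][] transform(double[] signal, int scales) {
        int n = signal.length;
        double[][] result = new double[scales + 1][];
        double[] previous = signal.clone(); // c(0)
        int step = 1; // spacing between kernel taps: 2^(j-1) at scale j
        for (int j = 1; j <= scales; j++) {
            double[] smoothed = new double[n]; // c(j)
            for (int i = 0; i < n; i++) {
                double sum = 0.0;
                for (int k = -2; k <= 2; k++) {
                    sum += KERNEL[k + 2] * previous[mirror(i + k * step, n)];
                }
                smoothed[i] = sum;
            }
            double[] w = new double[n]; // w(j) = c(j-1) - c(j)
            for (int i = 0; i < n; i++) {
                w[i] = previous[i] - smoothed[i];
            }
            result[j - 1] = w;
            previous = smoothed;
            step *= 2; // same effect as inserting holes into the kernel
        }
        result[scales] = previous;
        return result;
    }
}
```

Calling ATrous.transform(signal, 3) on a signal of any length returns three wavelet scales plus the last smoothed array, each with the same length as the input, which reflects the redundancy noted in the description.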

Features

Signal of any size; no power of two restriction
Can be extended to a signal of any number of dimensions
Compact scaling function (B³ spline)
Any boundary condition can be used : (a) mirror (b) periodic (c) continuous etc...
Translation invariant
Trivial reconstruction : c(0) = c(p) + Σ w(j), where c(p) is the smoothed array at the last scale p
Evolution of the transform from one scale to the next can be followed easily because :
- no decimation occurs
- The transform is carried out in direct space
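
The trivial reconstruction listed among the features amounts to a pixel-by-pixel sum. A minimal sketch, with a hypothetical class name and array layout (wavelets[j - 1] holding w(j)):

```java
/** Minimal sketch of the trivial a trous reconstruction c(0) = c(p) + sum of w(j). */
final class ATrousInverse {

    /**
     * @param wavelets wavelets[j - 1] holds the scale w(j), for j = 1..p
     * @param smoothed the smoothed array c(p) at the last scale
     * @return the reconstructed signal c(0)
     */
    static double[] reconstruct(double[][] wavelets, double[] smoothed) {
        double[] signal = smoothed.clone();
        for (double[] scale : wavelets) {
            for (int i = 0; i < signal.length; i++) {
                signal[i] += scale[i];
            }
        }
        return signal;
    }
}
```

Because no decimation occurs, every scale has the same length as the signal, so no interpolation or upsampling step is needed.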

Step 3: Reconstruction using Multiscale Vision Models

Description

The multiscale vision model that we use is based on Starck (2002), pp. 102-103. It reconstructs each object iteratively using a conjugate gradient matrix method :

Operators and Variables

Operators
W	Wavelet transform operator
P_w	Projection onto O_i structures operator
A = P_w o W	Composition of the two previously defined operators
A^-1	Adjoint of A operator
W^-1	Inverse transform of W operator

Variables
O_i	Signal corresponding to object with index i
W_i	Wavelet space signal restricted to O_i structures, zero elsewhere
n	Reconstruction iteration
O_i(n)	Object with index i at iteration n
w_r(n)	Wavelet residual signal at iteration n
R(n)	Residual signal at iteration n
α(n)	Convergence parameter at iteration n
β(n)	Convergence parameter at iteration n
threshold	Threshold for wavelet residual (constant)

Algorithm

1	Initialization	O_i(0) = W^-1 W_i ; w_r(0) = W_i − A O_i(0) ; R(0) = A^-1 w_r(0)
2	Convergence parameter	α(n) = ||A^-1 w_r(n)||² / ||A R(n)||²
3	Correction	O_i(n+1) = O_i(n) + α(n) R(n)
4	Positivity constraint	Negative values of O_i(n+1) are set to zero
5	Wavelet residual	w_r(n+1) = W_i − A O_i(n+1)
6	Convergence test	if ||w_r(n+1)|| < threshold then STOP
7	Convergence parameter	β(n+1) = ||A^-1 w_r(n+1)||² / ||A^-1 w_r(n)||²
8	Residual image	R(n+1) = A^-1 w_r(n+1) + β(n+1) R(n)
9	Return to step 2 with n = n + 1
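
The iteration above can be sketched generically in Java. This is an illustration under stated assumptions rather than the toolkit's implementation: the operator A and its adjoint are hidden behind a hypothetical Operator interface (in the toolkit, A would be the wavelet transform followed by the projection onto the object's structures), the starting estimate O_i(0) is passed in by the caller, and a maximum iteration count and a zero-denominator guard are added as safety measures that the listed algorithm does not include.

```java
/** Hypothetical sketch of the iterative object reconstruction loop. */
final class ObjectReconstructor {

    /** A linear operator together with its adjoint (illustrative interface). */
    interface Operator {
        double[] apply(double[] x);        // A x
        double[] applyAdjoint(double[] y); // adjoint of A applied to y
    }

    static double squaredNorm(double[] v) {
        double sum = 0.0;
        for (double x : v) {
            sum += x * x;
        }
        return sum;
    }

    static double[] difference(double[] target, double[] estimate) {
        double[] r = new double[target.length];
        for (int i = 0; i < r.length; i++) {
            r[i] = target[i] - estimate[i];
        }
        return r;
    }

    /**
     * @param a         the operator A and its adjoint
     * @param wi        W_i, the wavelet coefficients of the object
     * @param initial   O_i(0), the starting estimate
     * @param threshold stop when the norm of the wavelet residual drops below this
     * @param maxIter   safety cap on iterations (not part of the listed algorithm)
     */
    static double[] reconstruct(Operator a, double[] wi, double[] initial,
                                double threshold, int maxIter) {
        double[] object = initial.clone();
        double[] wr = difference(wi, a.apply(object));         // step 1: w_r(0)
        double[] gradient = a.applyAdjoint(wr);
        double[] direction = gradient.clone();                 // step 1: R(0)
        double gradientNorm = squaredNorm(gradient);
        for (int n = 0; n < maxIter; n++) {
            double denominator = squaredNorm(a.apply(direction));
            if (denominator == 0.0) {
                break;                                         // nothing left to correct
            }
            double alpha = gradientNorm / denominator;         // step 2
            for (int i = 0; i < object.length; i++) {
                // steps 3 and 4: correction followed by the positivity constraint
                object[i] = Math.max(0.0, object[i] + alpha * direction[i]);
            }
            wr = difference(wi, a.apply(object));              // step 5
            if (Math.sqrt(squaredNorm(wr)) < threshold) {
                break;                                         // step 6
            }
            gradient = a.applyAdjoint(wr);
            double newNorm = squaredNorm(gradient);
            double beta = newNorm / gradientNorm;              // step 7
            for (int i = 0; i < direction.length; i++) {
                direction[i] = gradient[i] + beta * direction[i]; // step 8
            }
            gradientNorm = newNorm;
        }
        return object;
    }
}
```

Keeping the operator behind an interface lets the loop be checked against a small matrix with a known solution before it is wired to the wavelet transform.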

Features

Fully automatic feature extraction
Requires no a-priori information on feature dimensions
By limiting the maximum wavelet scale, a lower bound on the spatial frequency of extracted features can be set

Reference and Data Sources

Meinel et al. 1975, Catalog of emission lines in astrophysical objects. Frequently observed emission lines from the literature (1930 and 1966)
Sloan Digital Sky Survey Data Release 3
Hewitt, A. and Burbidge, G.: 1993, Astrophys. J.S. 87, 451
Starck, J.-L., Siebenmorgen, R., Gredel, R.: 1997, Astrophys. J. 482, 1011. Spectral Analysis Using the Wavelet Transform
Starck, J.-L., Murtagh, F.: 2002, Astronomical Image Processing and Data Analysis, Springer-Verlag, ISBN 3540428852.