FunFEM

This page references the official documentation of FunFEM.

Method Description

FunFEM is a model-based clustering method specifically designed for functional data, such as time series. It employs a discriminative functional mixture (DFM) model that projects the observed curves into a latent functional subspace, where clustering is performed. The key steps of the method are:

Functional Data Representation

Each observed curve is first smoothed using a basis expansion (e.g., Fourier or spline basis), converting discrete observations into continuous functional forms.
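The smoothing step can be illustrated with plain NumPy. The sketch below is not BiFuncLib's implementation: `fourier_design` and `smooth_curves` are hypothetical helper names, the data are synthetic, and the fit is an unpenalized least-squares projection onto a small Fourier basis.

```python
import numpy as np

def fourier_design(t, n_basis, period):
    """Evaluate a Fourier basis (constant, then sine/cosine pairs) at times t."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    k = 1
    while len(cols) < n_basis:
        w = 2.0 * np.pi * k / period
        cols.append(np.sin(w * t))
        if len(cols) < n_basis:
            cols.append(np.cos(w * t))
        k += 1
    return np.column_stack(cols)  # shape (n_times, n_basis)

def smooth_curves(t, curves, n_basis, period):
    """Least-squares basis coefficients, one row per observed curve."""
    Phi = fourier_design(t, n_basis, period)
    Y = np.asarray(curves, dtype=float).T          # (n_times, n_curves)
    coef, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return coef.T, Phi

# Synthetic example: 4 noisy copies of one smooth periodic curve
t = np.arange(1, 182)
rng = np.random.default_rng(0)
clean = 0.5 + 0.3 * np.sin(2 * np.pi * t / 181.0)
noisy = clean + 0.05 * rng.standard_normal((4, t.size))

coef, Phi = smooth_curves(t, noisy, n_basis=5, period=181.0)
fitted = coef @ Phi.T   # smoothed curves evaluated on the time grid
```

Each row of `coef` is the finite-dimensional representation of one curve; clustering then operates on these coefficients rather than on the raw observations.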

Discriminative Subspace Learning

A low-dimensional discriminative subspace is identified via a generalized eigenvalue problem, maximizing between-cluster variance while minimizing within-cluster variance (Fisher’s criterion).
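As a rough illustration of Fisher's criterion, the sketch below solves the generalized eigenvalue problem Sb u = λ Sw u for hard cluster labels using NumPy only. It is a simplification under stated assumptions: FunFEM works with soft (posterior) memberships on basis coefficients, and `fisher_subspace` is a hypothetical name, not a BiFuncLib function.

```python
import numpy as np

def fisher_subspace(X, labels, d):
    """Return d directions maximizing between- over within-cluster variance."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    p = X.shape[1]
    overall_mean = X.mean(axis=0)
    Sw = np.zeros((p, p))   # within-cluster scatter
    Sb = np.zeros((p, p))   # between-cluster scatter
    for k in np.unique(labels):
        Xk = X[labels == k]
        mk = Xk.mean(axis=0)
        Sw += (Xk - mk).T @ (Xk - mk)
        diff = (mk - overall_mean)[:, None]
        Sb += Xk.shape[0] * (diff @ diff.T)
    # Whiten with the Cholesky factor of Sw (small ridge for stability),
    # which turns the generalized problem into a symmetric standard one.
    L = np.linalg.cholesky(Sw + 1e-8 * np.eye(p))
    L_inv = np.linalg.inv(L)
    evals, evecs = np.linalg.eigh(L_inv @ Sb @ L_inv.T)
    order = np.argsort(evals)[::-1][:d]
    U = L_inv.T @ evecs[:, order]   # map back to the original coordinates
    return U, evals[order]

# Two synthetic clusters separated along the first coordinate
rng = np.random.default_rng(1)
X = np.vstack([rng.standard_normal((50, 3)),
               rng.standard_normal((50, 3)) + np.array([4.0, 0.0, 0.0])])
labels = np.repeat([0, 1], 50)
U, ev = fisher_subspace(X, labels, d=1)
```

In this toy example the leading direction in `U` is dominated by the first coordinate, the only one along which the clusters differ.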

Model Inference (FunFEM Algorithm)

An iterative Expectation-Maximization (EM)-like algorithm alternates between:

F-step: Update the discriminative subspace orientation.

M-step: Estimate cluster parameters (means, covariances, and noise variances).

E-step: Compute posterior cluster membership probabilities for each curve.
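To make the E-step concrete, here is a generic sketch of posterior membership probabilities for a mixture of diagonal Gaussians, computed in log space for numerical stability. It is an assumption-laden stand-in: the actual DFM model also accounts for noise outside the discriminative subspace, and `e_step` is a hypothetical helper rather than part of BiFuncLib.

```python
import numpy as np

def e_step(Y, weights, means, variances):
    """Posterior cluster probabilities for a diagonal Gaussian mixture.

    Y: (n, d) points, weights: (K,), means: (K, d), variances: (K, d).
    Returns the (n, K) posterior matrix and the total log-likelihood.
    """
    Y = np.asarray(Y, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    diff = Y[:, None, :] - means[None, :, :]                      # (n, K, d)
    log_dens = -0.5 * np.sum(diff ** 2 / variances
                             + np.log(2.0 * np.pi * variances), axis=2)
    log_joint = np.log(weights) + log_dens                        # (n, K)
    log_norm = np.logaddexp.reduce(log_joint, axis=1, keepdims=True)
    return np.exp(log_joint - log_norm), float(log_norm.sum())

Y = np.array([[0.0, 0.0], [5.0, 5.0], [0.2, -0.1]])
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
variances = np.ones((2, 2))
P, loglik = e_step(Y, weights, means, variances)
```

Each row of `P` sums to one; taking the row-wise argmax gives a hard cluster assignment.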

Model Selection

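In the method as described, the optimal number of clusters (K) and intrinsic dimensionality (d) are selected using the slope heuristic, a data-driven penalty calibration method that is reported to outperform BIC/AIC in practice. Note that the fem_bifunc function documented on this page selects among models with the ‘bic’, ‘aic’ or ‘icl’ criterion.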
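One common formulation of the slope heuristic can be sketched as follows: fit a line to the log-likelihood of the most complex models as a function of their number of free parameters, then penalize every model by twice that slope times its parameter count. The function name, the `frac` parameter and the numbers are illustrative assumptions, and this is not claimed to match any particular implementation.

```python
import numpy as np

def slope_heuristic(loglik, n_params, frac=0.5):
    """Pick a model by penalizing with twice the estimated log-likelihood slope."""
    loglik = np.asarray(loglik, dtype=float)
    n_params = np.asarray(n_params, dtype=float)
    order = np.argsort(n_params)
    n_tail = max(2, int(np.ceil(frac * loglik.size)))
    tail = order[-n_tail:]                      # the most complex models
    slope, _ = np.polyfit(n_params[tail], loglik[tail], 1)
    penalized = loglik - 2.0 * slope * n_params
    return int(np.argmax(penalized)), float(slope)

# Made-up values: the log-likelihood flattens out after 30 parameters
n_params = [10, 20, 30, 40, 50, 60]
loglik = [-500.0, -300.0, -250.0, -245.0, -240.0, -235.0]
best, slope = slope_heuristic(loglik, n_params)
```

Here the three largest models give a slope of 0.5, so the penalty equals the parameter count and the model with 30 parameters (index 2) is selected.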

Sparse Basis Selection

Optionally, sparsity-inducing regularization (l₁ penalty) is applied to select the most discriminative basis functions (e.g., key time intervals or frequencies) for interpretability.
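The effect of an l₁ penalty on a loading vector can be illustrated with the soft-thresholding operator, which shrinks every entry toward zero and sets small ones exactly to zero. This is only a generic sketch of how l₁ regularization induces sparsity; the library's sparse procedure may be implemented differently, and the loadings are made up.

```python
import numpy as np

def soft_threshold(u, lam):
    """Proximal operator of the l1 norm: shrink by lam, zero out small entries."""
    u = np.asarray(u, dtype=float)
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

# Hypothetical loadings of one discriminative axis on five basis functions
loadings = np.array([0.80, -0.05, 0.02, -0.60, 0.01])
sparse_loadings = soft_threshold(loadings, 0.1)
selected = np.flatnonzero(sparse_loadings)   # indices of retained basis functions
```

Only the first and fourth basis functions keep a nonzero loading, which is what makes the sparse subspace easier to interpret.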

Function

This method provides three core functions: fem_sim_data, fem_bifunc, and FDPlot.fem_fdplot. In this section, we detail their respective usage, as well as parameters, output values and usage examples for each function.

fem_sim_data

fem_sim_data loads real-world data sourced from the French bike-sharing system.

fem_sim_data()

Parameter

The data are loaded internally; the function takes no parameters.

Value

The function fem_sim_data returns a dict representing the French bike-sharing system data.

data: the loading profiles (number of available bikes / number of bike docks) of the 345 stations at 181 times.
pos: the longitude and latitude of the 345 bike stations.
dates: the download dates.
bonus: indicates if the station is on a hill (bonus = 1).
names: the names of the stations.

Example

from BiFuncLib.simulation_data import fem_sim_data
fem_simdata = fem_sim_data()

fem_bifunc

fem_bifunc performs model fitting.

fem_bifunc(fd, K = np.arange(2, 7), model = ['AkjBk'], crit = 'bic', init = 'kmeans', Tinit = (), maxit = 50, eps = 1e-6, disp = False, lambda_ = 0, graph = False)

Parameter

Parameter	Description
fd	dict, a functional data dict produced by the GENetLib package.
K	integer or list, the candidate numbers of mixture components (clusters); the model selection criterion chooses the most appropriate number of groups among them. Default is np.arange(2, 7), i.e. 2 through 6.
model	list, a list defining discriminative latent mixture (DLM) models to fit. There are 12 different models: “DkBk”, “DkB”, “DBk”, “DB”, “AkjBk”, “AkjB”, “AkBk”, “AkB”, “AjBk”, “AjB”, “ABk”, “AB”. Users may supply any subset of models as a list; the optimal result will be selected according to the specified criterion.
crit	character, the criterion to be used for model selection (‘bic’, ‘aic’ or ‘icl’). ‘bic’ is the default.
init	character, the initialization type (‘random’, ‘kmeans’ or ‘hclust’). ‘kmeans’ is the default. The example below also uses ‘user’ together with Tinit.
Tinit	array, an n x K matrix which contains posterior probabilities for initializing the algorithm (each row corresponds to an individual). Default is ().
maxit	integer, the maximum number of iterations before the Fisher-EM algorithm stops. Default is 50.
eps	numeric, the threshold value for the likelihood differences to stop the Fisher-EM algorithm. Default is 1e-6.
disp	bool, if True, some messages are printed during the clustering. Default is False.
lambda_	numeric, the l₁ penalty (between 0 and 1) for the sparse version. Default is 0.
graph	bool, if True, plot the evolution of the log-likelihood. Default is False.
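The Tinit parameter is described as an n x K matrix of posterior probabilities with one row per individual. A minimal sketch of building such a matrix from an initial hard labeling is shown here; `labels_to_tinit` is a hypothetical helper, the labels are assumed to be 0-based, and whether fem_bifunc accepts exactly this layout should be checked against the library.

```python
import numpy as np

def labels_to_tinit(labels, K):
    """One-hot n x K matrix: row i has probability 1 on the cluster of individual i."""
    labels = np.asarray(labels, dtype=int)
    T = np.zeros((labels.size, K))
    T[np.arange(labels.size), labels] = 1.0
    return T

# Hypothetical initial partition of five individuals into three clusters
init_labels = [0, 2, 1, 1, 0]
Tinit = labels_to_tinit(init_labels, K=3)
```

A soft initialization works the same way as long as each row is non-negative and sums to one.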

Value

The function fem_bifunc returns a dict containing the clustering results and information about the fitted model.

model: the model name.
K: the number of groups.
cls: the group membership of each individual estimated by the Fisher-EM algorithm.
P: the posterior probabilities of each individual for each group.
prms: the model parameters.
U: the orientation of the functional subspace according to the basis functions.
aic: the value of the Akaike information criterion.
bic: the value of the Bayesian information criterion.
icl: the value of the integrated completed likelihood criterion.
loglik: the log-likelihood values computed at each iteration of the FEM algorithm.
ll: the log-likelihood value obtained at the last iteration of the FEM algorithm.
nbprm: the number of free parameters in the model.
crit: the model selection criterion used.
allCriterions: stores the criterion values for all models under every combination of K and init.

If disp=True, progress messages are printed during the clustering.

If graph=True, a plot of the log-likelihood versus iteration number is displayed.
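The posterior matrix P can be summarized without any further library calls. The sketch below uses a mock dict with made-up values that only mimics the 'K' and 'P' keys listed above; it derives hard assignments, cluster sizes and a simple per-individual uncertainty. Whether the library's cls field uses 0-based or 1-based labels is not stated here, so the comparison with cls is left out.

```python
import numpy as np

# Mock result with the same 'K' and 'P' keys as the documented output (values are invented)
res_mock = {
    'K': 3,
    'P': np.array([[0.90, 0.05, 0.05],
                   [0.20, 0.70, 0.10],
                   [0.10, 0.10, 0.80],
                   [0.60, 0.30, 0.10]]),
}

hard_labels = res_mock['P'].argmax(axis=1)                        # most probable cluster
cluster_sizes = np.bincount(hard_labels, minlength=res_mock['K'])
uncertainty = 1.0 - res_mock['P'].max(axis=1)                     # 0 means fully confident
```

Individuals with high `uncertainty` lie between clusters and are worth inspecting before interpreting the partition.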

Example

import numpy as np
from BiFuncLib.fem_bifunc import fem_bifunc
from BiFuncLib.simulation_data import fem_sim_data
from BiFuncLib.BsplineFunc import BsplineFunc
from GENetLib.fda_func import create_fourier_basis
fem_simdata = fem_sim_data()
# Create fd object
basis = create_fourier_basis((0, 181), nbasis=25)
time_grid = np.arange(1, 182).tolist()
fdobj = BsplineFunc(basis).smooth_basis(time_grid, np.array(fem_simdata['data'].T))['fd']
# Biclustering
res = fem_bifunc(fdobj, K=[5,6], model=['AkjBk', 'DkBk', 'DB'], crit = 'icl',
                init='hclust', lambda_=0.01, disp=True)
# Another setting
res2 = fem_bifunc(fdobj, K=[res['K']], model=['AkjBk', 'DkBk'], init='user', Tinit=res['P'],
                lambda_=0.01, disp=True, graph = True)

FDPlot.fem_fdplot

FDPlot.fem_fdplot produces visualizations.

FDPlot(result).fem_fdplot(data, fdobj)

Parameter

Parameter	Description
result	dict, a clustering result generated by fem_bifunc function.
data	dict, a data set loaded by fem_sim_data function.
fdobj	dict, a fd object serving as the first input to fem_bifunc function.

Value

The function FDPlot.fem_fdplot reconstructs the functional profiles for each cluster category, and displays a scatter plot which visualizes the distribution of data samples across different classes.

[Figure: reconstructed functional profiles for each cluster category]

[Figure: scatter plot of the data samples by class]

Example

import numpy as np
from BiFuncLib.fem_bifunc import fem_bifunc
from BiFuncLib.simulation_data import fem_sim_data
from BiFuncLib.BsplineFunc import BsplineFunc
from GENetLib.fda_func import create_fourier_basis
from BiFuncLib.FDPlot import FDPlot
fem_simdata = fem_sim_data()
# Create fd object
basis = create_fourier_basis((0, 181), nbasis=25)
time_grid = np.arange(1, 182).tolist()
fdobj = BsplineFunc(basis).smooth_basis(time_grid, np.array(fem_simdata['data'].T))['fd']
# Biclustering
res = fem_bifunc(fdobj, K=[5,6], model=['AkjBk', 'DkBk', 'DB'], crit = 'icl',
                init='hclust', lambda_=0.01, disp=True)
# Another setting
res2 = fem_bifunc(fdobj, K=[res['K']], model=['AkjBk', 'DkBk'], init='user', Tinit=res['P'],
                lambda_=0.01, disp=True, graph = True)
# plot
FDPlot(res).fem_fdplot(fem_simdata, fdobj)
