Package: RcppML 1.0.0

RcppML: Fast Non-Negative Matrix Factorization and Divisive Clustering

High-performance non-negative matrix factorization (NMF), singular value decomposition (SVD), and divisive clustering for large sparse and dense matrices. Implements alternating least squares with coordinate descent and Cholesky NNLS solvers, diagonal scaling for interpretable factors, cross-validation for automatic rank selection, multiple distribution-based losses (Gaussian, Poisson, Generalized Poisson, Negative Binomial, Gamma, Inverse Gaussian, Tweedie) via iteratively reweighted least squares, regularization (L1, L2, L21, angular, graph Laplacian), and optional GPU acceleration via CUDA. Includes divisive clustering via recursive rank-2 factorization, consensus clustering, and the StreamPress compressed sparse matrix format. Methods are described in DeBruine, Melcher, and Triche (2021) <doi:10.1101/2021.09.01.458620>.
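The alternating least squares scheme with a coordinate descent NNLS solver mentioned above can be sketched in base R. This is an illustrative sketch only, not RcppML's implementation or API: the helper names `cd_nnls` and `als_nmf` are hypothetical, and it omits diagonal scaling, regularization, convergence checks, and sparse-matrix handling.

```r
# Illustrative base-R sketch only; this is not RcppML's code or API.
# cd_nnls and als_nmf are hypothetical helper names.

# Minimize 0.5 * x'Gx - b'x subject to x >= 0 by cyclic coordinate descent,
# where G = t(B) %*% B and b = t(B) %*% y for a least-squares problem B x ~ y.
cd_nnls <- function(G, b, x, sweeps = 10) {
  for (s in seq_len(sweeps)) {
    for (i in seq_along(x)) {
      if (G[i, i] <= 0) next            # skip degenerate coordinates
      grad_i <- sum(G[i, ] * x) - b[i]  # partial derivative at the current x
      x[i] <- max(0, x[i] - grad_i / G[i, i])
    }
  }
  x
}

# Alternate between solving for h with w fixed and for w with h fixed.
als_nmf <- function(A, k, iters = 50, seed = 1) {
  set.seed(seed)
  w <- matrix(runif(nrow(A) * k), nrow(A), k)
  h <- matrix(0, k, ncol(A))
  for (it in seq_len(iters)) {
    # Fix w: one NNLS problem per column of A.
    G <- crossprod(w)       # k x k
    C <- crossprod(w, A)    # k x ncol(A)
    for (j in seq_len(ncol(A))) h[, j] <- cd_nnls(G, C[, j], h[, j])
    # Fix h: one NNLS problem per row of A.
    G <- tcrossprod(h)      # k x k
    C <- tcrossprod(h, A)   # k x nrow(A)
    for (i in seq_len(nrow(A))) w[i, ] <- cd_nnls(G, C[, i], w[i, ])
  }
  list(w = w, h = h)
}

# Usage on a small synthetic non-negative matrix built from 3 factors.
set.seed(42)
A <- matrix(runif(30 * 3), 30, 3) %*% matrix(runif(3 * 20), 3, 20)
fit <- als_nmf(A, k = 3)
rel_err <- norm(A - fit$w %*% fit$h, "F") / norm(A, "F")
```

Each NNLS subproblem starts from the previous iterate (a warm start), so a few coordinate descent sweeps per update are usually enough in a sketch like this.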

Authors: Zachary DeBruine [aut, cre]

RcppML_1.0.0.tar.gz
RcppML_1.0.0.zip (r-4.7), RcppML_1.0.0.zip (r-4.6), RcppML_1.0.0.zip (r-4.5)
RcppML_1.0.0.tgz (r-4.6-x86_64), RcppML_1.0.0.tgz (r-4.6-arm64), RcppML_1.0.0.tgz (r-4.5-x86_64), RcppML_1.0.0.tgz (r-4.5-arm64)
RcppML_1.0.0.tar.gz (r-4.6-arm64), RcppML_1.0.0.tar.gz (r-4.6-x86_64), RcppML_1.0.0.tar.gz (r-4.5-arm64), RcppML_1.0.0.tar.gz (r-4.5-x86_64)
RcppML_1.0.0.tgz (r-4.5-emscripten)
RcppML.pdf | RcppML.html
RcppML/json (API)

# Install 'RcppML' in R:
install.packages('RcppML', repos = c('https://zdebruine.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/zdebruine/rcppml/issues

Uses libs:
  • c++ - GNU Standard C++ Library v3
  • openmp - GCC OpenMP (GOMP) support library
Datasets:
  • aml - Acute Myelogenous Leukemia (AML) Dataset
  • digits - MNIST Digits Dataset
  • golub - Golub ALL-AML Dataset
  • hawaiibirds - Hawaii Bird Species Frequency Dataset
  • movielens - MovieLens Dataset
  • olivetti - Olivetti Faces Dataset
  • pbmc3k - PBMC 3k Single-Cell RNA-seq Dataset

clustering, matrix-factorization, nmf, rcpp, rcppeigen, sparse-matrix, cpp, openmp

10.76 score, 114 stars, 61 packages, 402 scripts, 23k downloads, 72 exports, 4 dependencies

Last updated from: df69dddbe3. Checks: 11 WARNING, 1 ERROR, 1 OK. Indexed: yes.

Target | Result | Time
linux-devel-arm64 | WARNING | 432
linux-devel-x86_64 | WARNING | 434
source / vignettes | ERROR | 410
linux-release-arm64 | WARNING | 400
linux-release-x86_64 | WARNING | 404
macos-release-arm64 | WARNING | 464
macos-release-x86_64 | WARNING | 707
macos-oldrel-arm64 | WARNING | 405
macos-oldrel-x86_64 | WARNING | 510
windows-devel | WARNING | 465
windows-release | WARNING | 424
windows-oldrel | WARNING | 483
wasm-release | OK | 277

Exports: align, assess, auto_nmf_distribution, bipartiteMatch, bipartition, biplot, classify_embedding, classify_logistic, classify_rf, coerce, compare_nmf, compute_target, consensus_nmf, cosine, cross_validate_graph, dclust, diagnose_dispersion, diagnose_zero_inflation, evaluate, export_log, factor_add, factor_concat, factor_condition, factor_config, factor_input, factor_net, factor_shared, fit, gpu_available, gpu_info, H, head, nmf, nmf_layer, nnls, pca, predict, reconstruct, refine, score_test_distribution, show, simulateNMF, simulateSwimmer, sort, sparsity, st_add_transpose, st_chunk_ranges, st_filter_cols, st_filter_rows, st_free_gpu, st_info, st_map_chunks, st_obs_indices, st_read, st_read_dense, st_read_gpu, st_read_obs, st_read_var, st_slice, st_slice_cols, st_slice_rows, st_write, st_write_dense, st_write_list, subset, summary, svd, svd_layer, t, training_logger, variance_explained, W

Dependencies: lattice, Matrix, Rcpp, RcppEigen

Readme and manuals

Help Manual

Help page | Topics
RcppML: Fast Non-Negative Matrix Factorization and Divisive Clustering | RcppML-package RcppML
Access layer results by name | $.factor_net_result
Align two NMF models | align align,nmf-method
Acute Myelogenous Leukemia (AML) Dataset | aml
Convert assessment results to a one-row data frame | as.data.frame.nmf_assessment
Convert training log to data.frame | as.data.frame.training_logger
Assess Embedding Quality | assess
Auto-select NMF distribution | auto_nmf_distribution
Bipartite graph matching | bipartiteMatch
Bipartition a sample set | bipartition
Biplot for NMF factors | biplot,nmf-method
Evaluate classification performance of factor embeddings | classify_embedding
Logistic regression classifier for factor embeddings | classify_logistic
Random forest classifier for factor embeddings | classify_rf
Compare Multiple NMF Models | compare_nmf
Compute a Target Matrix for Guided NMF | compute_target
Consensus Clustering for NMF | consensus_nmf
Cosine similarity | cosine
Cross-validate a factorization network | cross_validate_graph
Divisive clustering | dclust
Diagnose dispersion mode | diagnose_dispersion
Diagnose zero inflation | diagnose_zero_inflation
MNIST Digits Dataset | digits
Evaluate an NMF model | evaluate evaluate,nmf-method
Export training log to CSV | export_log
Element-wise H addition (skip/residual connection) | factor_add
Concatenate H factors from branches (row-bind) | factor_concat
Concatenate conditioning metadata to a layer's H | factor_condition
Global configuration for a factorization network | factor_config
Create an input node for a factorization network | factor_input
Compile a factorization network | factor_net
Shared factorization across multiple inputs (multi-modal) | factor_shared
Fit a factorization network | fit fit.factor_net
Golub ALL-AML Dataset (Brunet et al. 2004) | golub
Check if GPU acceleration is available | gpu_available
Get GPU device information | gpu_info
Methods for gpu_sparse_matrix objects | dim.gpu_sparse_matrix gpu_sparse_matrix-methods ncol.gpu_sparse_matrix nrow.gpu_sparse_matrix print.gpu_sparse_matrix
Hawaii Bird Species Frequency Dataset | hawaiibirds
MovieLens Dataset | movielens
Non-negative matrix factorization | nmf
Create an NMF factorization layer | nmf_layer
nmf S4 Class | nmf-class
Non-negative Least Squares Projection | nnls
Olivetti Faces Dataset | olivetti
PBMC 3k Single-Cell RNA-seq Dataset (StreamPress Compressed) | pbmc3k
PCA (centered SVD) | pca
Plot Consensus Matrix Heatmap | plot.consensus_nmf
Plot divisive clustering hierarchy | plot.dclust
Plot NMF Training History and Diagnostics | plot.nmf
Plot Cross-Validation Results | plot.nmfCrossValidate
Plot training log | plot.training_logger
Project new data through a trained factor network | predict.factor_net_result
Print a factor_net | print.factor_net
Print a factor_net_cv result | print.factor_net_cv
Print a factor_net_result | print.factor_net_result
Print a classifier evaluation result | print.fn_classifier_eval
Print an fn_factor_config | print.fn_factor_config
Print an fn_global_config | print.fn_global_config
Print an fn_node | print.fn_node
Print method for nmf_assessment objects | print.nmf_assessment
Print a training log | print.training_logger
Refine an NMF Model Using Label-Guided Correction | refine
Score-test distribution diagnostic | score_test_distribution
Simulate an NMF dataset | simulateNMF
Simulate Swimmer Dataset | simulateSwimmer
Compute the sparsity of each NMF factor | sparsity sparsity,nmf-method
Add Transpose Section to an Existing StreamPress File | st_add_transpose
Get Column Ranges for Each Chunk in a StreamPress File | st_chunk_ranges
Slice Columns Matching Variable Metadata Filter | st_filter_cols
Slice Rows Matching Observation Metadata Filter | st_filter_rows
Free GPU-Resident Sparse Matrix | st_free_gpu
Get metadata from a StreamPress file | st_info
Apply a Function to Every Chunk in a StreamPress File | st_map_chunks
Get Row Indices Matching Observation Metadata Filter | st_obs_indices
Read a StreamPress file into a dgCMatrix | st_read
Read a Dense Matrix from StreamPress v3 Format | st_read_dense
Read StreamPress File Directly to GPU Memory | st_read_gpu
Read Observation (Row) Metadata from a StreamPress File | st_read_obs
Read Variable (Column) Metadata from a StreamPress File | st_read_var
Slice Rows and/or Columns from a StreamPress File | st_slice
Slice Columns from a StreamPress File | st_slice_cols
Slice Rows from a StreamPress File | st_slice_rows
Write a sparse matrix to a StreamPress file | st_write
Write a Dense Matrix to StreamPress v3 Format | st_write_dense
Write a List of Matrices as a Single StreamPress File | st_write_list
StreamPress I/O: Read, Write, and Inspect Compressed Matrices | streampress
nmf class methods | $,nmf-method coerce,nmf,list-method dim,nmf-method dimnames,nmf-method head,nmf-method predict,nmf-method prod,nmf-method show,nmf-method sort,nmf-method subset,nmf-method t,nmf-method [,nmf,ANY,ANY,ANY-method [[,nmf-method
Summarize NMF factors | plot.nmfSummary summary,nmf-method
Summary for Consensus NMF | summary.consensus_nmf
Summarize a factor_net_result | summary.factor_net_result
Summarize a classifier evaluation result | summary.fn_classifier_eval
Truncated SVD / PCA with constraints and regularization | svd
Create an SVD/PCA factorization layer | svd_layer
svd S4 Class | dim,svd-method head,svd-method predict,svd-method reconstruct reconstruct,svd-method show,svd-method svd-class variance_explained variance_explained,svd-method [,svd,ANY,ANY,ANY-method
Create a training logger for factor network fitting | training_logger
Per-factor configuration for factorization layers | H W