diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 7aae1445..564aaa47 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -36,13 +36,13 @@ By contributing, you understand and agree that your work becomes the part of the * Use package [styler](http://styler.r-lib.org/) with RStudio add-in to easily re-style your code to comply with the guidelines. * When appropriate, prefer this naming scheme for the internal functions – `._<...>` with ``, ``, etc. the first, second and later parameter/operand: - - _ ... missing - - a ... array - - d ... data frame - - f ... formula - - h ... hyperSpec object - - m ... matrix - - n ... numeric (scalar, vector, or matrix) + - _ ... missing + - a ... array + - d ... data frame + - f ... formula + - h ... hyperSpec object + - m ... matrix + - n ... numeric (scalar, vector, or matrix) * Each new function should be accompanied with appropriate unit tests. * If a unit test needs to be disabled temporarily, please use `skip("reason for switching off")`. @@ -130,22 +130,24 @@ Every commit should be related to one feature only, but the commit should group The project adheres to the semantic versioning guidelines, as outlined at https://semver.org/ (Work in progress, see [#123](https://github.com/cbeleites/hyperSpec/issues/123)). -Briefly, the version string has the form `x.y.z` (or `major.minor.patch`), where the major number gets incremeted if a release introduces breaking changes, the minor one after any changes in functionality (new features of bugfixes), and the patch number is increased after any trivial change. If a major or minor number is incremented, all subsequent ones are set to zero. +Briefly, the version string has the form `x.y.z` (or `major.minor.patch`), where the major number gets incremented if a release introduces breaking changes, the minor one after any changes in functionality (new features or bug fixes), and the patch number is increased after any trivial change. 
If a major or minor number is incremented, all subsequent ones are set to zero. The version numbers refer only to commits in the `master` branch, and get incremented in one of two cases: + * during the release preparation, when a `release/x.y.z` branch buds off `develop` and merges into `master`. * after a hotfix, which also results in a new commit on `master`. * development branches have version `x.x.x.9000` (or `.9001` and so on - but that is rarely needed). This is important since **pkgdown** uses the `.9000` to distinguish between documentation for the released version vs. the development version. ### Release Process + The process starts when the package is in a stable state that can be released to CRAN (release candidate). First, decide on a new version number `x.y.z` based on the severity of changes. Then: * Create a `release/x.y.z` branch using `git flow release start ` and push it with `git flow publish` * Open a pull request that merges into `master` * Update the version number in the `DESCRIPTION` file * Verify that the changes are listed in `NEWS.md` -* Confirm that the package can be built for each plaftorm +* Confirm that the package can be built for each platform * Ensure that all check are passed on the tarballs you build (either on your machine or using CI) with `R CMD check --as-cran `. The checks must pass for `R` versions `R-oldrel`, `R-release`, `R-patched`, and `R-devel`. * If any bugs are found, they must be fixed in the very same branch (see [here](https://stackoverflow.com/a/57507373/6029703) for details) * Once everything works use `git flow release finish `. It will merge the release branch into both `master` and `develop`, and will assign a tag to the newly created commit in the `master` branch. diff --git a/DESCRIPTION b/DESCRIPTION index 5e7cec28..5876adba 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -26,7 +26,7 @@ Description: Comfortable ways to work with hyperspectral data sets, of information associated with each of the spectra. 
The spectra can be data as obtained in XRF, UV/VIS, Fluorescence, AES, NIR, IR, Raman, NMR, MS, etc. More generally, any data that is recorded over a discretized variable, - e.g. absorbance = f (wavelength), stored as a vector of absorbance values + e.g. absorbance = f(wavelength), stored as a vector of absorbance values for discrete wavelengths is suitable. License: GPL (>= 3) + file LICENSE LazyLoad: yes diff --git a/R/initialize.R b/R/initialize.R index ae970082..13e3ffbe 100644 --- a/R/initialize.R +++ b/R/initialize.R @@ -1,15 +1,14 @@ - -#' Create a `hyperSpec` object +#' Create a `hyperSpec` Object #' -#' To create a new `hyperSpec` object, the following functions can be used: +#' To create a new `hyperSpec` object, you can use one of the following functions: #' - [new()] (i.e., `new("hyperSpec", ...)`); #' - `new_hyperSpec()`. #' #' @note #' -#' A `hyperSpec` object is an S4 object, so its initialization is carried out -#' by calling `new("hyperSpec", ...)`. Function `new_hyperSpec()` is just -#' a convenience function. +#' A `hyperSpec` object is an S4 object, so its initialization is performed +#' by calling `new("hyperSpec", ...)`. The function `new_hyperSpec()` is provided +#' for convenience. #' #' @docType methods #' @@ -29,30 +28,30 @@ #' A spectra matrix with spectra in rows and wavelength intensities in #' columns. #' -#' The `spc` does not need to be an R `matrix`, but must be an object +#' The `spc` does not need to be an R `matrix`, but it must be an object #' convertible to a matrix via `I(as.matrix(spc))`. #' #' @param data (`data.frame`) \cr #' A `data.frame` with extra (non-spectroscopic) data in columns. #' The data frame may also contain a special column `spc` with a `matrix` #' of spectroscopic data. -#' (Such single column that contains matrix can be created with +#' (Such a single column that contains a matrix can be created with #' `data.frame(spc = I(as.matrix(spc)))`. 
-#' However, it will usually be more convenient if the spectra are given -#' via argument `spc`.) +#' However, it is usually more convenient to provide the spectra +#' via the `spc` argument.) #' #' @param wavelength (numeric vector) \cr #' The wavelengths corresponding to the columns of `spc`. #' #' If no wavelengths are given, an appropriate vector is derived from the -#' column the column names of `data$spc`. If this is not possible, +#' column names of `data$spc`. If this is not possible, #' `1:ncol(data$spc)` is used instead. #' #' @param labels A named `list`: -#' - list's element names should containing one or more names of `data` -#' columns as well as special name `.wavelength` for `wavelength`s ). +#' - list's element names should contain one or more names of `data` +#' columns as well as the special name `.wavelength` for `wavelength`s. #' - list's element values should contain the labels for the indicated -#' names usually either in a for of character strings or +#' names, usually in the form of character strings or #' [plotmath][grDevices::plotmath()] expressions. #' (The labels should be given in a form ready #' for the text-drawing functions, see [grDevices::plotmath()]). @@ -61,24 +60,26 @@ #' columns of `data` and `wavelength` is used. #' #' @param gc (logical) \cr Use garbage collection. -#' If option `gc` is `TRUE`, the initialization will have frequent calls -#' to [base::gc()], which can help to avoid swapping or running out of -#' memory. The default value of `gc` can be set via [hy_set_options()]. +#' If the option `gc` is set to `TRUE`, the initialization will have +#' frequent calls to [base::gc()], which can help avoid swapping or +#' running out of memory. +#' The default value of `gc` can be set via [hy_set_options()]. #' #' @param log This parameter is currently **ignored**. It is present due to -#' backwards compatibility. +#' backward compatibility. #' #' @param .Object A new `hyperSpec` object. #' #' -#' @author C.Beleites, V. 
Gegzna +#' @author C. Beleites, V. Gegzna #' @seealso #' #' - [methods::new()] for more information on creating and initializing S4 objects. -#' - [grDevices::plotmath()] on expressions for math annotations as for slot `label`. -#' - [hy_set_options()] setting `hyperSpec` options. +#' - [grDevices::plotmath()] for expressions used for math annotations as in the `label` slot. +#' - [hy_set_options()] for setting `hyperSpec` options. #' #' @keywords methods datagen +#' #' @concept hyperSpec conversion #' #' @examples @@ -98,7 +99,7 @@ #' colnames(spc) <- 600:603 #' new("hyperSpec", spc = spc) # wavelength taken from colnames (spc) #' -#' # given wavelengths precede over colnames of spc +#' # given wavelengths take precedence over colnames of spc #' new("hyperSpec", spc = spc, wavelength = 700:703) #' #' # specifying labels @@ -123,7 +124,7 @@ NULL .initialize <- function(.Object, spc = NULL, data = NULL, wavelength = NULL, labels = NULL, gc = hy_get_option("gc"), log = "ignored") { - # Do the small stuff first, so we need not be too careful about copies + # Handle the small stuff first, so we don't need to be too careful about copies # The wavelength axis if (!is.null(wavelength) && !is.numeric(wavelength)) { @@ -197,7 +198,7 @@ NULL if (gc) base::gc() if (!is.null(data$spc) && !(is.null(spc))) { - warning("Spectra in data are overwritten by argument spc.") + warning("Spectra in data are overwritten by the argument spc.") } # Deal with spectra @@ -216,7 +217,7 @@ NULL dim <- dim(spc) spc <- suppressWarnings(as.numeric(spc)) if (all(is.na(spc))) { - stop("spectra matrix needs to be numeric or convertable to numeric") + stop("spectra matrix needs to be numeric or convertible to numeric") } else { warning("spectra matrix is converted from ", class(data$spc), " to numeric.") } @@ -227,7 +228,7 @@ NULL if (gc) base::gc() if (!is.null(spc)) { - attr(spc, "class") <- "AsIs" # I seems to make more than one copy + attr(spc, "class") <- "AsIs" # I() seems to make more than one 
copy if (gc) base::gc() } @@ -254,7 +255,7 @@ NULL .Object <- .spc_fix_colnames(.Object) # For consistency with .wl<- - # Finally: check whether we got a valid hyperSpec object + # Finally, check whether we got a valid hyperSpec object validObject(.Object) .Object diff --git a/R/write_txt_long.R b/R/write_txt_long.R index b9340e5c..0786e3c6 100644 --- a/R/write_txt_long.R +++ b/R/write_txt_long.R @@ -8,25 +8,25 @@ #' @rdname write_txt #' @aliases write_txt_long #' -#' @param file Filename or connection. -#' @param object `hyperSpec` object. -#' @param cols Column names specifying the column order. -#' @param order Which columns should be [base::order()]ed? Parameter `order` is -#' used as index vector into a `data.frame` with columns given by `cols`. -#' @param na.last Handed to [base::order()] by `write_txt_long`. -#' @param quote,sep,col.names,row.names Have their usual meaning -#' (see [utils::write.table()]), but different default values. -#' -#' For file import, `row.names` should usually be `NULL` so that the first -#' column becomes a extra data column (as opposed to row names of the -#' extra data). -#' @param col.labels Should the column labels be used rather than the -#' colnames? +#' @param file Filename or connection to write the data. +#' @param object A `hyperSpec` object to export. +#' @param cols Column names specifying the order of columns in the output file. +#' @param order Which columns should be sorted using [base::order()]? The `order` +#' parameter is used as an index vector into a `data.frame` with columns +#' specified by `cols`. +#' @param na.last Passed to [base::order()] by `write_txt_long`. +#' @param quote,sep,col.names,row.names These parameters have their usual meaning +#' as used in [utils::write.table()], but with different default values. +#' +#' For file import, `row.names` should usually be set to `NULL` so that the first +#' column becomes an extra data column (instead of row names of the extra data). 
+#' +#' @param col.labels Should the column labels be used rather than the colnames? #' @param append Should the output be appended to an existing file? -#' @param decreasing logical vector giving the sort order. -#' @param header.lines Toggle one or two line header (wavelengths in the +#' @param decreasing A logical vector specifying the sort order for columns. +#' @param header.lines Toggle one or two-line headers (wavelengths in the #' second header line) for `write_txt_wide`. -#' @param ... arguments handed to [utils::write.table()]. +#' @param ... Additional arguments passed to [utils::write.table()]. #' #' #' @concept io @@ -39,13 +39,13 @@ #' #' ## Export & import Matlab files #' if (require(R.matlab)) { -#' # export to matlab file +#' # Export to a Matlab file #' writeMat(paste0(tempdir(), "/test.mat"), #' x = flu[[]], wavelength = flu@wavelength, #' label = lapply(flu@label, as.character) #' ) #' -#' # reading a matlab file +#' # Read a Matlab file #' data <- readMat(paste0(tempdir(), "/test.mat")) #' print(data) #' mat <- new("hyperSpec", @@ -86,12 +86,12 @@ #' #' read.txt.wide( #' file = paste0(tempdir(), "/flu.txt"), -#' # give columns in same order as they are in the file +#' # Give columns in the same order as they are in the file #' cols = list( #' spc = "I / a.u", #' c = expression("/"("c", "mg/l")), #' filename = "filename", -#' # plus wavelength label last +#' # Plus wavelength label last #' .wavelength = "lambda / nm" #' ), #' header = TRUE diff --git a/README.md b/README.md index 7429daf3..fd53cf8a 100644 --- a/README.md +++ b/README.md @@ -5,9 +5,9 @@ [![CRAN status](https://www.r-pkg.org/badges/version-last-release/hyperSpec)](https://cran.r-project.org/package=hyperSpec) [![metacran downloads](https://cranlogs.r-pkg.org/badges/grand-total/hyperSpec)](https://cran.r-project.org/package=hyperSpec) [![metacran downloads](https://cranlogs.r-pkg.org/badges/hyperSpec)](https://cran.r-project.org/package=hyperSpec) 
-[![R-CMD-check](https://github.com/r-hyperspec/hyperSpec/workflows/R-CMD-check/badge.svg?branch=develop)](https://github.com/r-hyperspec/hyperSpec/actions) +[![R-CMD-check](https://github.com/r-hyperspec/hyperSpec/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/r-hyperspec/hyperSpec/actions/workflows/R-CMD-check.yaml) [![Codecov](https://codecov.io/gh/r-hyperspec/hyperSpec/branch/develop/graph/badge.svg)](https://codecov.io/gh/r-hyperspec/hyperSpec?branch=develop) -![Website (pkgdown)](https://github.com/r-hyperspec/hyperSpec/workflows/Website%20(pkgdown)/badge.svg) +[![Website (pkgdown)](https://github.com/r-hyperspec/hyperSpec/actions/workflows/pkgdown.yaml/badge.svg)](https://github.com/r-hyperspec/hyperSpec/actions/workflows/pkgdown.yaml) @@ -21,7 +21,7 @@ Package `hyperSpec` is under overhaul now. So this website is still under construction and the contents as well as resources are not fully updated yet. -The documentation of version `0.100.0` is not present here too. +The documentation of version 0.100.2 is not present here either.
@@ -30,7 +30,7 @@ The documentation of version `0.100.0` is not present here too. [**R**](https://www.r-project.org/) package **hyperSpec** is the main package in the [**`r-hyperspec`**](https://r-hyperspec.github.io/) family of packages. The goal of **hyperSpec** (and whole **`r-hyperspec`**) is to make the work with hyperspectral data sets, (i.e. spatially or time-resolved spectra, or spectra with any other kind of information associated with each of the spectra) more comfortable. -The spectra can be data as obtained during +The spectra can be data obtained during [XRF](https://en.wikipedia.org/wiki/X-ray_fluorescence), [UV/VIS](https://en.wikipedia.org/wiki/Ultraviolet%E2%80%93visible_spectroscopy), [Fluorescence](https://en.wikipedia.org/wiki/Fluorescence_spectroscopy), @@ -59,7 +59,7 @@ The documentation of the other **`r-hyperspec`** family packages can be found at ## Issues, Bug Reports and Feature Requests -Issues, bug reports and feature requests should go [here](https://github.com/r-hyperspec/hyperSpec/issues)! +Issues, bug reports, and feature requests should go [here](https://github.com/r-hyperspec/hyperSpec/issues)! @@ -96,7 +96,7 @@ remotes::install_github("r-hyperspec/hyperSpec") ``` **NOTE 1:** -Usually, "Windows" users need to download, install and properly configure **Rtools** (see [these instructions](https://cran.r-project.org/bin/windows/Rtools/)) to make the code above work. +Usually, "Windows" users need to download, install, and properly configure **Rtools** (see [these instructions](https://cran.r-project.org/bin/windows/Rtools/)) to make the code above work. **NOTE 2:** This method will **not** install package's documentation (help pages and vignettes) into your computer. @@ -115,14 +115,14 @@ So you can either use the [online documentation](https://r-hyperspec.github.io/) 1. From the **hyperSpec**'s GitHub [repository](https://github.com/r-hyperspec/hyperSpec): - If you use Git, `git clone` the branch of interest. 
You may need to fork it before cloning. - - Or just chose the branch of interest (1 in Figure below), download a ZIP archive with the code (2, 3) and unzip it on your computer. + - Or just choose the branch of interest (1 in Figure below), download a ZIP archive with the code (2, 3), and unzip it on your computer. ![image](https://user-images.githubusercontent.com/12725868/89338263-ffa1dd00-d6a4-11ea-94c2-fa36ee026691.png) 2. Open the downloaded directory in RStudio (preferably, as an RStudio project). - The code below works correctly only if your current working directory coincides with the root of the repository, i.e., if it is in the directory that contains file `README.md`. - If you open RStudio project correctly (e.g., by clicking `project.Rproj` icon ![image](https://user-images.githubusercontent.com/12725868/89340903-26621280-d6a9-11ea-8299-0ec5e9cf7e3e.png) in the directory), then the working directory is set correctly by default. -3. In RStudio 'Console' window, run the code (provided below) to: +3. In the RStudio 'Console' window, run the code (provided below) to: a. Install packages **remotes** and **devtools**. b. Install **hyperSpec**'s dependencies. c. Create **hyperSpec**'s documentation. @@ -146,7 +146,7 @@ devtools::install(build_vignettes = TRUE) ``` **NOTE 1:** -Usually, "Windows" users need to download, install and properly configure **Rtools** (see [these instructions](https://cran.r-project.org/bin/windows/Rtools/)) to make the code above work. +Usually, "Windows" users need to download, install, and properly configure **Rtools** (see [these instructions](https://cran.r-project.org/bin/windows/Rtools/)) to make the code above work. diff --git a/_pkgdown.yml b/_pkgdown.yml index 236292e7..ea9cd15a 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -215,6 +215,7 @@ reference: - contents: - has_concept("labels") + - title: '8. Other functions' # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ desc: "Functions not mentioned above." 
- contents: diff --git a/vignettes/flu.Rmd b/vignettes/flu.Rmd index ffc0a498..3db7ba50 100644 --- a/vignettes/flu.Rmd +++ b/vignettes/flu.Rmd @@ -1,8 +1,9 @@ --- -# For vignette --------------------------------------------------------------- +# For Vignette --------------------------------------------------------------- title: Calibration of Quinine Fluorescence Emission subtitle: "Example Workflow for Fluorescence Emission Dataset `flu`" description: "flu: Example workflow for fluorescence emission dataset `flu`." + # Authors -------------------------------------------------------------------- author: - name: Claudia Beleites^1,2,3,4,5^, Vilmantas Gegzna @@ -14,6 +15,7 @@ author: 3. ÖPV, JKI, Berlin/Germany (2017--2019) 4. Arbeitskreis Lebensmittelmikrobiologie und Biotechnologie, Hamburg University, Hamburg/Germany (2019 -- 2020) 5. Chemometric Consulting and Chemometrix GmbH, Wölfersheim/Germany (since 2016) + # Document ------------------------------------------------------------------- date: "`r Sys.Date()`" output: @@ -25,6 +27,7 @@ output: css: - vignette.css - style.css + vignette: > % \VignetteIndexEntry{Calibration of Quinine Fluorescence Emission} % \VignetteKeyword{calibration} @@ -40,14 +43,14 @@ link-citations: yes bibliography: resources/flu-pkg.bib biblio-style: plain csl: elsevier-with-titles.csl + # Pkgdown -------------------------------------------------------------------- pkgdown: as_is: true --- - ```{r cleanup-flu, include = FALSE} -# Clean up to ensure reproducible workspace ---------------------------------- +# Clean up to ensure a reproducible workspace ---------------------------------- rm(list = ls(all.names = TRUE)) ``` @@ -61,7 +64,7 @@ source("vignette-functions.R", encoding = "UTF-8") # Settings ------------------------------------------------------------------- source("vignette-default-settings.R", encoding = "UTF-8") -# Temporaty options ---------------------------------------------------------- +# Temporary options 
---------------------------------------------------------- # Change the value of this option in "vignette-default-settings.R" show_reviewers_notes <- getOption("show_reviewers_notes", TRUE) ``` @@ -78,15 +81,13 @@ knitr::write_bib( ``` - - ```{block, type="note-t", echo=show_reviewers_notes} **V. Gegzna's notes** `flu-1` 1. `# FIXME:`{.r} After the translation is completed, the contents in the box below must be fixed. -2. `# FIXME:`{.r} do not mention `vignettes.def` +2. `# FIXME:`{.r} Do not mention `vignettes.def`. ``` @@ -98,35 +99,32 @@ knitr::write_bib( **Reproducing this vignette.** -The spectra files are shipped with package **hyperSpec**. -This allows reproduction of the whole vignette (source code and spectra files are in the package's documentation directory and its `rawdata` subdirectory). +The files with spectra are included with the **hyperSpec** package. +This allows the reproduction of the entire vignette (source code and spectra files are in the package's documentation directory and its `rawdata` subdirectory). -For reproducing the examples in a live session, the full file names of the spectra can be found with the command: +To reproduce the examples in a live session, you can obtain the full file names of the spectra with the following command: -`list.files(system.file("doc/rawdata", package = "hyperSpec"), pattern = "flu[1-6][.]txt")`{.r}. +```{r, eval=FALSE} +list.files(system.file("doc/rawdata", package = "hyperSpec"), pattern = "flu[1-6][.]txt") ``` +This tutorial provides an example of how to: +- Write an import function for proprietary ASCII files from a spectrometer manufacturer. +- Add additional data columns to the spectra. +- Set up a linear calibration (inverse least squares). -This tutorial gives an example how to: - -- write an import function for a spectrometer manufacturer's proprietary ASCII files, -- add further data columns to the spectra, and -- set up a linear calibration (inverse least squares). 
- -The data set `flu`{.r} in package **hyperSpec** consists of 6 fluorescence emission spectra of quinine solutions. -They were acquired during an student practicum and were kindly provided by M. Kammer. - -The concentrations of the solutions range from 0.05 to 0.30 mg/l. -Spectra were acquired with a Perkin Elmer LS50-B fluorescence spectrometer at 350 nm excitation. +The `flu` dataset in the **hyperSpec** package consists of six fluorescence emission spectra of quinine solutions. +These spectra were acquired during a student practicum and were kindly provided by M. Kammer. +The concentrations of the solutions range from 0.05 to 0.30 mg/l, and the spectra were acquired using a Perkin Elmer LS50-B fluorescence spectrometer at 350 nm excitation. # Writing an Import Function -The raw spectra are in Perkin Elmer's ASCII file format, one spectrum per file. -The files are completely ASCII text, with the actual spectra starting at line 55. +The raw spectra are stored in Perkin Elmer's ASCII file format, with one spectrum per file. +These files are completely ASCII text, and the actual spectra start at line 55. ```{block, type="note-t", echo=show_reviewers_notes} @@ -148,10 +146,10 @@ The function to import these files, `read.txt.PerkinElmer()`{.r}, is discussed i 1. `# FIXME:`{.r} Why do we need `source("read.txt.PerkinElmer.R")`? -Why is `read.txt.PerkinElmer()` not a function of package **hyperSpec**. +Why is `read.txt.PerkinElmer()` not a function of the package **hyperSpec**? This part (`source("read.txt.PerkinElmer.R")`) is not user-friendly and must be fixed. -**Selfnote**: review after `fileio` vignette is translated. +**Selfnote**: Review after the `fileio` vignette is translated. 
``` @@ -220,7 +218,7 @@ folder <- system.file("extdata/flu", package = "hyperSpec") files <- Sys.glob(paste0(folder, "/flu?.txt")) flu <- read.txt.PerkinElmer(files, skip = 54) ``` -Now the spectra are in a `hyperSpec`{.r} object and can be examined, e.g., by: +Now, the spectra are contained within a `hyperSpec`{.r} object and can be examined, e.g., by: ```{r rawspc} flu @@ -236,8 +234,8 @@ plot(flu) # Adding Further Data Columns -The calibration model needs the quinine concentrations for the spectra. -This information can be stored together with the spectra, and also gets an appropriate label: +The calibration model requires the quinine concentrations for the spectra. +This information can be stored alongside the spectra and also gets an appropriate label: ```{r newdata} flu$c <- seq(from = 0.05, to = 0.30, by = 0.05) @@ -250,7 +248,7 @@ flu ```{block, type="note-t", echo=show_reviewers_notes} **V. Gegzna's notes** `flu-4` -1. `# FIXME:`{.r} why file saving is needed? +1. `# FIXME:`{.r} why is file saving needed? `save(flu, file = "flu.rda")` ``` @@ -260,7 +258,7 @@ save(flu, file = "flu.rda") ``` -Now the `hyperSpec`{.r} object `flu`{.r} contains two data columns, holding the actual spectra and the respective concentrations. The dollar operator returns such a data column: +Now, the `hyperSpec`{.r} object `flu`{.r} contains two data columns, holding the actual spectra and the respective concentrations. The dollar operator returns such a data column: ```{r newc} flu$c @@ -285,12 +283,11 @@ flu$filename <- NULL # Linear Calibration -As R is developed for the purpose of statistical analysis, tools for a least squares calibration -model are readily available. +As R is developed for statistical analysis, tools for a least squares calibration model are readily available. The original spectra range from `r min(wl(flu))` to `r max(wl(flu))` nm. However, the intensities at 450 nm are perfect for a univariate calibration. 
-Plotting them over the concentration is done by: +Plotting them against the concentration can be done as follows: diff --git a/vignettes/hyperSpec.Rmd b/vignettes/hyperSpec.Rmd index 38f9d8e2..27664691 100644 --- a/vignettes/hyperSpec.Rmd +++ b/vignettes/hyperSpec.Rmd @@ -93,9 +93,9 @@ cat(res, sep = "\n") # Things to Know About **_hyperSpec_** -Package **hyperSpec** is a `R` package that allows convenient handling of \index{hyperspectral data sets} hyperspectral data sets, i.e., data sets combining spectra with further data on a per-spectrum basis. +Package **hyperSpec** is an `R` package that allows convenient handling of \index{hyperspectral data sets} hyperspectral data sets, i.e., data sets combining spectra with further data on a per-spectrum basis. The spectra can be anything that is recorded over a common discretized axis. -This vignette gives an introduction on basic working techniques using the `R` package package **hyperSpec**. +This vignette gives an introduction to basic working techniques using the `R` package **hyperSpec**. This is done mostly from a spectroscopic point of view: rather than going through the functions provided by package **hyperSpec**, it is organized by spectroscopic tasks. ## Terms & Notations Used Here @@ -113,7 +113,7 @@ transmission, absorbance, $\frac{e^{-}}{s}$, intensity, etc. **extra data** \index{extra data} : further information/data belonging to each spectrum such as -spatial information (spectral images, maps, or profiles), temporal information (kinetics, time series), concentrations (calibration series), class membership information, etc. Class `hyperSpec` object may contain arbitrary numbers of extra data columns. +spatial information (spectral images, maps, or profiles), temporal information (kinetics, time series), concentrations (calibration series), class membership information, etc. A `hyperSpec` object may contain an arbitrary number of extra data columns. 
In R, slots of an S4 class are accessed by the \index{`"@`{.r} operator}`@`{.r} operator. In this vignette, the notation `@xxx`{.r} will thus mean *"slot xxx of an object"*. @@ -134,7 +134,7 @@ It is then a matrix with zero columns. ```{r structure, echo = FALSE, fig.cap = CAPTION, out.width = "600"} knitr::include_graphics("intro--structure--hyperSpec--objects.png") -CAPTION <- "The structure of the data in a `hyperSpec` object. In this example the 'extra data' are the `x`, `y` and `c` columns in `@data`." +CAPTION <- "The structure of the data in a `hyperSpec` object. In this example, the 'extra data' are the `x`, `y` and `c` columns in `@data`." ``` Slot `@label`{.r} contains an element for each of the columns in `@data`{.r} plus one holding the label for the wavelength axis, `.wavelength`. @@ -169,7 +169,7 @@ Package **hyperSpec** comes with several data sets: ----------------- ---------------------------------------------------------------------- In this vignette, the data sets are used to illustrate appropriate procedures for different tasks and different spectra. -In addition, [`laser`](#list-of-vignettes) and [`flu`](#list-of-vignettes) are accompanied by their own vignettes showing sample work flows for the respective data type. +In addition, [`laser`](#list-of-vignettes) and [`flu`](#list-of-vignettes) are accompanied by their own vignettes showing sample workflows for the respective data type. This document describes how to accomplish typical tasks in the analysis of spectra. It does not give a complete reference on particular functions. @@ -181,9 +181,10 @@ It is therefore recommended to look up the methods in the `R` help system using \index{options|textbf} \index{options!debuglevel} \index{options!gc} -The global behaviour of package **hyperSpec** can be configured via options. +The global behavior of package **hyperSpec** can be configured via options. 
The values of the options are retrieved with `hy_get_options()`{.r} and `hy_get_option()`{.r}, and changed with `hy_set_options()`{.r}. -Table \@ref(tab:options) gives an overview of the options. You should not worry about these at the start of your exploration of **hyperSpec**. +Table \@ref(tab:options) gives an overview of the options. +You should not worry about these at the start of your exploration of **hyperSpec**. # Obtaining Basic Information about **_hyperSpec_** Objects {#info-hyperspec-objs} @@ -228,11 +229,11 @@ The column names of the spectra matrix contain the wavelengths as a character ve # Accessing & Manipulating **_hyperSpec_** Objects {#sec:access-parts} - + While the parts of the `hyperSpec` object can be accessed directly, it is good practice to use the functions provided by the package to handle the objects rather than accessing the slots directly. This also ensures that proper (i.e. *valid*) objects are returned. -In some cases, however, direct access to the slots can considerably speed up calculations (see the appendicies). +In some cases, however, direct access to the slots can considerably speed up calculations (see the appendices). The main functions to retrieve the data of a `hyperSpec` object are `[]` and `[[]]`. \mFun{`[]`,`[[]]`} The difference between these functions is that `[]` returns a `hyperSpec` object, whereas `[[]]` returns a `data.frame` containing `x$spc`{.r}, the spectral data. @@ -245,7 +246,7 @@ To modify a `hyperSpec` object, the corresponding functions are `[<-`{.r} and `[ * `i` refers to rows of the `@data`{.r} slot. `i` can be integer indices or a logical vector. * `j` refers to columns of the `@data`{.r} slot. `j` can be integer indices, a logical vector or the name of a column. _However, there is no guaranteed order to_ `colnames(x)`{.r} _so using integer indices and logical vectors is unwise._ * `l` refers to wavelengths. Note the argument `wl.index` which determines how `l` is interpreted. 
-* If there is only one index given, e.g. `x[1:3]`{.r}, it refers to the row index `i`. Likewise if there are only two indices given they refer to `i` and `j`. +* If there is only one index given, e.g. `x[1:3]`{.r}, it refers to the row index `i`. Likewise, if there are only two indices given they refer to `i` and `j`. * See \@ref(sec:square-brack-replace) and \@ref(sec:accessing-extra-data) for even more ways to specify the indices. @@ -256,21 +257,22 @@ Table \@ref(tab:getters) shows the main functions that can be used with class `h : **(\#tab:getters)** Getter functions for the slots of `hyperSpec` objects -------------------------------------------- ----------------------------------------------------------------------------------------------------------------------- -**`x[]`{.r}** Returns the entire `hyperSpec` object unchanged. -**`x[i, , ]`{.r}** Returns the `hyperSpec` object with selected rows; equivalent to `x[i]`{.r}. -**`x[, j, ]`{.r}** Returns the `hyperSpec` object with empty `x$spc`{.r} slot. If you want the column `j`, `x[["name"]]` returns a `data.frame` containing `j` or `x$name`{.r} returns it as a vector. -**`x[, , l, wl.index = TRUE/FALSE]`{.r}** Returns the `hyperSpec` object with selected wavelengths. -**`x[[]]`{.r}** Returns the spectra matrix (`x$spc`{.r}). -**`x[[i, , ]]`{.r}** Returns the spectra matrix (`x$spc`{.r}) with selected rows. -**`x[[, j, ]]`{.r}** Returns a `data.frame`{.r} with the selected columns. Safest to give `j` as a character string. -**`x[[, , l, wl.index = TRUE/FALSE]]`{.r}** Returns the spectra matrix (`x$spc`{.r}) with selected wavelengths. -**`x$name`{.r}** Returns the column `name` as a vector. -**`x$.`{.r}** Returns the complete `data.frame`{.r} `x@data`{.r}, with the spectra in column `$spc`{.r}. -**`x$..`{.r}** Returns all the extra data (`x@data`{.r} without `x$spc`{.r}). -**`wl()`{.r}** Returns the wavelengths. -**`labels()`{.r}** Returns the labels. 
-------------------------------------------- ----------------------------------------------------------------------------------------------------------------------- +Syntax | Description +--------------------------------------------|----------------------------------------------------------------------------------------------------------------------- +**`x[]`{.r}** | Returns the entire `hyperSpec` object unchanged. +**`x[i, , ]`{.r}** | Returns the `hyperSpec` object with selected rows; equivalent to `x[i]`{.r}. +**`x[, j, ]`{.r}** | Returns the `hyperSpec` object with empty `x$spc`{.r} slot. If you want the column `j`, `x[["name"]]` returns a `data.frame` containing `j` or `x$name`{.r} returns it as a vector. +**`x[, , l, wl.index = TRUE/FALSE]`{.r}** | Returns the `hyperSpec` object with selected wavelengths. +**`x[[]]`{.r}** | Returns the spectra matrix (`x$spc`{.r}). +**`x[[i, , ]]`{.r}** | Returns the spectra matrix (`x$spc`{.r}) with selected rows. +**`x[[, j, ]]`{.r}** | Returns a `data.frame`{.r} with the selected columns. Safest to give `j` as a character string. +**`x[[, , l, wl.index = TRUE/FALSE]]`{.r}** | Returns the spectra matrix (`x$spc`{.r}) with selected wavelengths. +**`x$name`{.r}** | Returns the column `name` as a vector. +**`x$.`{.r}** | Returns the complete `data.frame`{.r} `x@data`{.r}, with the spectra in column `$spc`{.r}. +**`x$..`{.r}** | Returns all the extra data (`x@data`{.r} without `x$spc`{.r}). +**`wl()`{.r}** | Returns the wavelengths. +**`labels()`{.r}** | Returns the labels. 
+ @@ -285,18 +287,22 @@ Table \@ref(tab:setters) shows the main functions that can be used with class `h : **(\#tab:setters)** Setter functions for the slots of `hyperSpec` objects ------------------------------------------------ ----------------------------------------------------------------------------------------------------------------------- -**`x[i, ,] <-`{.r}** Replaces the specified rows of the `@data`{.r} slot, including `x$spc`{.r} and any extra data columns. Other approaches are probably easier. -**`x[, j,] <-`{.r}** Replaces the specified columns. Safest to give `j` as a character string. -**`x[i, j] <-`{.r}** Replaces the specified column limited to the specified rows. Safest to give `j` as a character string.**`x[, , l, wl.index = TRUE/FALSE] <-`{.r}** Replaces the specified wavelengths. -**`x[[i, ,]] <-`{.r}** Replaces the specified row of `x$spc`{.r} -**`x[[, j,]] <-`{.r}** As `[[]]`{.r} refers to just the spectral data in `x$spc`{.r}, this operation is not valid. See below. -**`x[[, , l, wl.index = TRUE/FALSE]] <-`{.r}** Replaces the intensity values in `x$spc`{.r} for the specified wavelengths. -**`x[[i, , l, wl.index = TRUE/FALSE]] <-`{.r}** Replaces the intensity values in `x$spc`{.r} for the specified wavelengths limited to the specified rows. -**`x$.. <-`{.r}** Sets the extra data (`x@data`{.r} without touching `x$spc`{.r}). The column names must match exactly in this case. -**`wl<-`{.r}** Sets the wavelength vector. -**`labels<-`{.r}** Sets the labels. ------------------------------------------------ ----------------------------------------------------------------------------------------------------------------------- + + +Syntax | Description +-------------------------------------------- |----------------------------------------------------------------------------------------------------------------------- +**`x[i, ,] <-`{.r}** | Replaces the specified rows of the `@data`{.r} slot, including `x$spc`{.r} and any extra data columns. 
Other approaches are probably easier. +**`x[, j,] <-`{.r}** | Replaces the specified columns. Safest to give `j` as a character string. +**`x[i, j] <-`{.r}** | Replaces the specified column limited to the specified rows. Safest to give `j` as a character string. +**`x[, , l, wl.index = TRUE/FALSE] <-`{.r}** | Replaces the specified wavelengths. +**`x[[i, ,]] <-`{.r}** | Replaces the specified row of `x$spc`{.r}. +**`x[[, j,]] <-`{.r}** | As `[[]]`{.r} refers to just the spectral data in `x$spc`{.r}, this operation is not valid. See below. +**`x[[, , l, wl.index = TRUE/FALSE]] <-`{.r}** | Replaces the intensity values in `x$spc`{.r} for the specified wavelengths. +**`x[[i, , l, wl.index = TRUE/FALSE]] <-`{.r}** | Replaces the intensity values in `x$spc`{.r} for the specified wavelengths limited to the specified rows. +**`x$.. <-`{.r}** | Sets the extra data (`x@data`{.r} without touching `x$spc`{.r}). The column names must match exactly in this case. +**`wl<-`{.r}** | Sets the wavelength vector. +**`labels<-`{.r}** | Sets the labels. + \mFun{`[]<-`, `[[]]<-`, `$<-`} @@ -372,7 +378,7 @@ Here, indices may be requested using `index = TRUE`{.r}. ## Selecting Extra Data Columns {#sec:accessing-extra-data} The second argument of the extraction functions `[]` and `[[]]` specifies the extra data columns. -They can be given like any column specification for a `data.frame`{.r}, i.e., numeric, logical, or by a vector of the column names. However, since there is intrinsic order the column names of a `hyperSpec` object, using the column names is safest: +They can be given like any column specification for a `data.frame`{.r}, i.e., numeric, logical, or by a vector of the column names. However, since there is no intrinsic order in the column names of a `hyperSpec` object, using the column names is safest: ```{r data} colnames(faux_cell) @@ -444,7 +450,7 @@ Operator `[[]]<-` also accepts index matrices of size $n × 2$.
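As a rough illustration of what an $n × 2$ index matrix means, the following sketch uses a plain base-R matrix instead of a `hyperSpec` object. The names `spc` and `idx` are made up for this example, and how `[[<-`{.r} treats such a matrix on real `hyperSpec` objects should be checked in the package help.

```r
# Plain base-R matrix standing in for a spectra matrix (hypothetical data)
spc <- matrix(0, nrow = 3, ncol = 4)

# n x 2 index matrix: each row holds one (row, column) pair
idx <- cbind(c(1, 2, 3), c(4, 1, 2))

# Assign one value per (row, column) pair
spc[idx] <- c(10, 20, 30)
spc
```

Each row of `idx` addresses exactly one cell, so the replacement vector needs one value per row of the index matrix.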
### Converting Wavelengths to Indices and Vice Versa {#sec:wavelength-indices} \mFun{`wl2i()`{.r} `i2wl()`{.r}} -Spectra in package **hyperSpec** always have discrete wavelength axes, and are stored in a matrix with each column corresponding to one wavelength. +Spectra in package **hyperSpec** always have discrete wavelength axes and are stored in a matrix with each column corresponding to one wavelength. Two functions are provided to convert the respective column indices into wavelengths and vice versa: `i2wl()`{.r} and `wl2i()`{.r}. For `i2wl()`{.r} you should provide a vector of integers to serve as the indices. For `wl2i()`{.r} you can provide a vector of integers giving the wavelength range, or you can use a *formula* interface. The basic syntax for the formula is **`start ~ end`**. @@ -472,7 +478,7 @@ wl2i(faux_cell, 1000 ~ 1010) ``` - If the object's wavelength axis is not ordered, the formula approach will give weird results. - In that (probably rare) case, use `wl_sort()`{.r} first to obtain an object with ordered wavelength axis. + In that (probably rare) case, use `wl_sort()`{.r} first to obtain an object with an ordered wavelength axis. Values *start* and *end* may contain the special variables `min`{.r} and `max`{.r} that correspond to the lowest and highest wavelengths of the object: @@ -514,7 +520,7 @@ plot(paracetamol[, , 2800 ~ 3200]) ``` By default, the values given are treated as wavelengths. -If they are indices into the columns of the spectra matrix, use `wl.index = TRUE`{.r}: +If they are indices of the spectra matrix columns, use `wl.index = TRUE`{.r}: ```{r include = FALSE} @@ -569,7 +575,7 @@ For details see the [plotting](#list-of-vignettes) vignette. \mFun{`wl()`{.r}, `wl<-`{.r}} Sometimes wavelength axes need to be transformed, e.g., converting from wavelengths to frequencies. In this case, retrieve the wavelength axis vector with `wl()`{.r}, convert each value of the resulting vector and assign the result with `wl<-`{.r}. 
-Also the label of the wavelength axis may need to be adjusted. +Also, the label of the wavelength axis may need to be adjusted. As an example, convert the wavelength axis of `laser`{.r} to frequencies. As the wavelengths are in nanometers, and the frequencies are easiest expressed in terahertz, an additional conversion factor of 1000 is needed: @@ -618,7 +624,7 @@ wl(barb) ## Conversion to Long- and Wide-Format `data.frame`{.r}s {#sec:conv-long-form} \mFun{`as.data.frame()`{.r}} -Function `as.data.frame()`{.r} extracts the `@data`{.r} slot as a `data.frame`{.r}: +The function `as.data.frame()`{.r} extracts the `@data`{.r} slot as a `data.frame`{.r}: ```{r} flu <- flu[, , 400 ~ 407] # make a small and handy version of the flu data set @@ -665,7 +671,7 @@ as.t.df(apply(flu, 2, mean_pm_sd)) \mFun{`as.long.df()`{.r}} Some functions need the data in an *unstacked* or *long-format* `data.frame`{.r}. -Function `as.long.df()`{.r} is the appropriate conversion function. +The function `as.long.df()`{.r} is the appropriate conversion function. ```{r} head(as.long.df(flu), 20) @@ -790,9 +796,9 @@ plot(faux_cell[135]) plot(tmp[135, , ], add = TRUE, col = palette_colorblind[4]) ``` -But the method cannot be used if each spectrum (or groups of spectra) are shifted individually. +However, the method cannot be used if each spectrum (or group of spectra) is shifted individually. In that case, interpolation is needed. -`R` offers many possibilities to interpolate (e.g., `approx()`{.r} for constant / linear approximation, `spline()`{.r} for spline interpolation, `loess()`{.r} can be used to obtain smoothed approximations, etc.). +`R` offers many possibilities to interpolate (e.g., `approx()`{.r} for constant/linear approximation, `spline()`{.r} for spline interpolation, `loess()`{.r} for smoothed approximations, etc.).
The appropriate interpolation strategy will depend on the spectra, and package **hyperSpec** therefore leaves it up to the user to select a sensible interpolation function. As an example, we will use natural splines to do the interpolation. @@ -843,7 +849,7 @@ wl = wl(tmp), shift = -0.5 plot(tmp, lines.args = list(type = "b", pch = 19, cex = 0.5), add = TRUE, col = palette_colorblind[3]) ``` -If different spectra need to be offset by different shift, use a loop^[Function `sweep()`{.r} cannot be used here, and while there is the possibility to use `sapply()`{.r} or `mapply()`{.r}, they are not faster than the for loop in this case. +If different spectra need to be offset by a different shift, use a loop^[Function `sweep()`{.r} cannot be used here, and while there is the possibility to use `sapply()`{.r} or `mapply()`{.r}, they are not faster than the for loop in this case. Make sure to work on a copy of the spectra matrix, as that is much faster than row-wise extracting and changing the spectra by `[[`{.r} and `[[<-`{.r}.]. ```{r} @@ -869,7 +875,7 @@ As just the very maximum is too coarse, we'll use the maximum of a square polyno **V. Gegzna's notes** `hyperspec-9` 1. `# TODO: `{.r} use vanderMonde -2. `# FIXME: `{.r} remove `tmp_false` when the bug is solved (issue in `qr.solve()`{.r}) +2. `# FIXME: `{.r} remove `tmp_false` when the bug is solved (the issue in `qr.solve()`{.r}) ``` @@ -1047,8 +1053,8 @@ Correction of cosmic spikes is not a part of `hyperSpec` package, but can be add ## Smoothing Interpolation \mFun{`spc_bin()`{.r} `spc_loess()`{.r}} -Spectra acquired by grating instruments are frequently interpolated onto a new wavelength axis, e.g., because the unequal data point spacing should be removed. -Also, the spectra can be smoothed: reducing the spectral resolution allows to increase the signal to noise ratio. +Spectra acquired by grating instruments are frequently interpolated onto a new wavelength axis, e.g. 
because the unequal data point spacing should be removed. +Also, the spectra can be smoothed: reducing the spectral resolution allows for an increase in the signal-to-noise ratio. For chemometric data analysis reducing the number of data points per spectrum may be crucial as it reduces the dimensionality of the data. Package **hyperSpec** provides two functions to do so: `spc_bin()`{.r} and `spc_loess()`{.r}. @@ -1129,10 +1135,10 @@ A least-squares fit is done so that the function may be used on rather noisy spe However, the user must supply an object that is cut appropriately. Particularly, the supplied wavelength ranges are not weighted. -Function `spc_fit_poly_below()`{.r} tries to find appropriate support points for the baseline iteratively. +The function `spc_fit_poly_below()`{.r} tries to find appropriate support points for the baseline iteratively. Both functions return a `hyperSpec` object containing the fitted baselines. -They need to be subtracted afterwards: +They need to be subtracted afterward: ```{r bl} bl <- spc_fit_poly_below(faux_cell) @@ -1194,9 +1200,9 @@ Such a constant can be immediately subtracted: `spectra - constant`{.r}. ### Correcting Wavelength Dependence \mFun{`sweep()`{.r}} -For each of the wavelengths the same correction needs to be applied to all spectra. +For each of the wavelengths, the same correction needs to be applied to all spectra. -1. There might be wavelength dependent offsets (background or dark spectra). +1. There might be wavelength-dependent offsets (background or dark spectra). They are subtracted: ```{r eval = FALSE} sweep(spectra, 2, offset.spectrum, "-") @@ -1212,7 +1218,7 @@ sweep(spectra, 2, photon.efficiency, "/") \mFun{`sweep()`{.r}} If the correction depends on the spectra (e. g. due to inhomogeneous illumination while collecting imaging data, differing optical path length, etc.), the `MARGIN`{.r} of the `sweep()`{.r} function needs to be 1 or `SPC`{.r}: -1. Pixel dependent offsets are subtracted: +1. 
Pixel-dependent offsets are subtracted: ```{r eval = FALSE} sweep(spectra, SPC, pixel.offsets, "-") ``` @@ -1236,7 +1242,7 @@ faux_cell_tmp <- sweep(faux_cell, 1, mean, "/") If the calculation of the normalization factors is more elaborate, use a two step procedure: 1. Calculate appropriate normalization factors - You may calculate the factors using only a certain wavelength range, thereby normalizing on a particular band or peak. + You may calculate the factors using only a certain wavelength range, thereby normalizing on a particular band or peak. 2. Again, sweep the factor off the spectra: ```{r eval = FALSE} normalized <- sweep (spectra, 1, factors, "*") @@ -1274,11 +1280,11 @@ For minimum-maximum-normalization, first do an offset- or baseline correction, t \mFun{`scale()`{.r}} Centering means that the mean spectrum is subtracted from each of the spectra. Many data analysis techniques, like principal component analysis, partial least squares, etc., work much better on centered data. -From a spectroscopic point of view it depends on the particular data set whether centering does make sense or not. +From a spectroscopic point of view, it depends on the particular data set whether centering does make sense or not. Variance scaling is often used in multivariate analysis to adjust the influence and scaling of the variates (that are typically different physical values). -However, spectra already do have the same scale of the same physical value. -Thus one has to trade off the the expected numeric benefit with the fact that for wavelengths with low signal the noise level will greatly increase when using variance scaling. +However, spectra already have the same scale of the same physical value. +Thus, one has to trade off the expected numeric benefit against the fact that for wavelengths with low signal, the noise level will greatly increase when using variance scaling. Scaling usually makes sense only for centered data.
Both tasks are carried out by the same method in `R`, `scale()`{.r}, which will by default both mean center and variance scale the spectra matrix. @@ -1296,12 +1302,12 @@ flu.centered <- scale(flu, scale = FALSE) plot(flu.centered) ``` -On the other hand, the `faux_cell`{.r} data set consists of Raman spectra, so the spectroscopic interpretation of centering is getting rid of the the average chemical composition of the sample. +On the other hand, the `faux_cell`{.r} data set consists of Raman spectra, so the spectroscopic interpretation of centering is getting rid of the average chemical composition of the sample. But what is the meaning of the "Average spectrum" of an inhomogeneous sample? -In this case it may be better to subtract the minimum spectrum (which will hopefully have almost the same benefit on the data analysis) as it is the spectrum of that chemical composition that is underlying the whole sample. +In this case, it may be better to subtract the minimum spectrum (which will hopefully have almost the same benefit on the data analysis) as it is the spectrum of that chemical composition that is underlying the whole sample. One more point to consider is that the actual minimum spectrum will pick up (negative) noise. -In order to avoid that, using, e.g., the 5^th^ percentile spectrum is more suitable: +To avoid that, using, e.g., the 5^th^ percentile spectrum is more suitable: @@ -1320,7 +1326,7 @@ See section the appendices for some tips to speed up these calculations. ## Multiplicative Scatter Correction \mFun{`pls::msc()`{.r}} -Multiplicative scatter correction (MSC) can be done using `msc()`{.r} from package package **pls** [`r cite_pkg("pls")`]. +Multiplicative scatter correction (MSC) can be done using `msc()`{.r} from package **pls** [`r cite_pkg("pls")`]. 
It operates on the spectra matrix: ```{r msc, eval = FALSE} @@ -1351,9 +1357,9 @@ labels(absorbance.spectra)$spc <- "A" Be careful: `R`'s `log()`{.r} function calculates the *natural* logarithm if no base is given. The basic arithmetic operators work element-wise in `R`. -Thus they all need either a scalar, or a matrix (or `hyperSpec` object) of the correct size. +Thus they all need either a scalar or a matrix (or `hyperSpec` object) of the correct size. -Matrix multiplication is done by `%*%`\mFun{`\%*\%`}, again each of the operands may be a matrix or a `hyperSpec` object, and must have the correct dimensions. +Matrix multiplication is done by `%*%`\mFun{`\%*\%`}, again each of the operands may be a matrix or a `hyperSpec` object and must have the correct dimensions. There are many more mathematical functions that understand a `hyperSpec` object. See `?Arith` for more details. @@ -1460,11 +1466,11 @@ plot_map(scores[, , 3], col.regions = diverging_hcl(20, palette = "Blue-Red2")) ### PCA as Noise Filter {#sec:pca-as-noise} -Principal component analysis is sometimes used as a noise filtering technique. +Principal component analysis is sometimes used as a noise-filtering technique. The idea is that the relevant differences are captured in the first few components while the higher components contain noise only. Thus the spectra are reconstructed using only the first $p$ components. -This reconstruction is in fact a matrix multiplication: +This reconstruction is a matrix multiplication: \begin{equation} @@ -1487,10 +1493,10 @@ Note that this corresponds to a model based on the Beer-Lambert law: The matrix formulation puts the $n$ spectra into the rows of $A$ and $c$, while the $i$ pure components appear in the columns of $c$ and rows of the absorbance coefficients $\epsilon$. 
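The matrix formulation above can be sketched in base R. This is a toy illustration under stated assumptions, not code from the package: the matrices `conc` and `eps` are invented, and `prcomp()`{.r} stands in for whatever PCA routine is applied to the real data.

```r
# Hypothetical toy data: 5 spectra, 2 pure components, 4 wavelengths
conc <- matrix(c(1, 0,
                 0, 1,
                 1, 1,
                 2, 1,
                 1, 2), ncol = 2, byrow = TRUE)       # n x i concentrations
eps <- matrix(c(1, 2, 0, 0,
                0, 0, 3, 1), nrow = 2, byrow = TRUE)  # i x wavelengths

# Beer-Lambert in matrix form: spectra in rows, wavelengths in columns
A <- conc %*% eps

# PCA on the spectra matrix; prcomp() centers the columns by default
pca <- prcomp(A)
p <- 2  # number of components kept

# Rank-p reconstruction: scores %*% t(loadings), then add the center back
A_hat <- pca$x[, 1:p, drop = FALSE] %*% t(pca$rotation[, 1:p, drop = FALSE])
A_hat <- sweep(A_hat, 2, pca$center, "+")

# TRUE here because the toy data contain exactly two components and no noise
isTRUE(all.equal(A_hat, A, check.attributes = FALSE))
```

With noisy measured spectra the reconstruction would no longer be exact; the discarded components are then assumed to carry mostly noise.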
-For an ideal data set (constituents varying independently, sufficient signal to noise ratio) one would expect the principal component analysis to extract something like the concentrations and pure component spectra. +For an ideal data set (constituents varying independently, sufficient signal-to-noise ratio) one would expect the principal component analysis to extract something like the concentrations and pure component spectra. \mFun{`\%*\%`} -If we decide that only the first 10 components actually carry spectroscopic information, we can reconstruct spectra with better signal to noise ratio: +If we decide that only the first 10 components actually carry spectroscopic information, we can reconstruct spectra with a better signal-to-noise ratio: ```{r pca-smooth} smoothed <- scores[, , 1:10] %*% loadings[1:10] @@ -1532,7 +1538,7 @@ First, cut the dendrogram so that three clusters result: ```{r dendcut} faux_cell$region <- as.factor(cutree(dendrogram, k = 3)) ``` -As the cluster membership was stored as factor, the levels can be meaningful names, which are displayed in the color legend. +As the cluster membership was stored as a factor, the levels can be meaningful names, which are displayed in the color legend. ```{r clustname} levels(faux_cell$region) <- c("matrix", "lacuna", "cell") @@ -1559,7 +1565,7 @@ So we may plot the cluster mean spectra: ```{r include = FALSE} -CAPTION <- "The results of the cluster analysis: the the mean spectra. " +CAPTION <- "The results of the cluster analysis: the mean spectra. " ``` ```{r clustmean, fig.cap = CAPTION} @@ -1586,7 +1592,7 @@ dim(cbind(flu, flu)) dim(rbind(flu, flu)) ``` -There is also a more general function, `bind()`{.r}, taking the direction (`"r"`{.r} or `"c"`{.r}) as first argument followed by the objects to bind either in separate arguments or in a list. 
+There is also a more general function, `bind()`{.r}, taking the direction (`"r"`{.r} or `"c"`{.r}) as the first argument followed by the objects to bind either in separate arguments or in a list. As usual for `rbind()`{.r} and `cbind()`{.r}, the objects that should be bound together must have the same number of columns (for `rbind()`{.r}) and the same number of rows (for `cbind()`{.r}), @@ -1600,7 +1606,7 @@ For binding row-wise (`rbind()`{.r}), `collapse()`{.r} is more flexible and fast \mFun{`collapse()`{.r}} Function `collapse()`{.r} combines objects that should be bound together by row, but they do not share the columns and/or spectral range. -The resulting object has all columns from all input objects, and all wavelengths from the input objects. +The resulting object has all columns from all input objects and all wavelengths from the input objects. If an input object does not have a particular column or wavelength, its value in the resulting object is `NA`{.r}. The `barbiturates`{.r} data is a list of `r length(barbiturates)` `hyperSpec` objects, each containing one mass spectrum. @@ -1622,7 +1628,7 @@ barb[[1:3, , min ~ min + 10i]] ## Binding Objects that Do not Share the Same Spectra {#sec:merge} \mFun{`merge()`{.r}} -Function `merge()`{.r} adds a new spectral range (like `cbind()`{.r}), but works also if spectra are missing in one of the objects. +The function `merge()`{.r} adds a new spectral range (like `cbind()`{.r}) but works also if spectra are missing in one of the objects. The arguments `by`{.r}, `by.x`{.r}, and `by.y`{.r} specify which columns should be used to decide which spectra are the same. The arguments `all`{.r}, `all.x`{.r}, and `all.y`{.r} determine whether spectra should be kept for the result set if they appear in only one of the objects. For details, see also the help on the base function `merge()`{.r}. @@ -1737,7 +1743,7 @@ flu.merged$.. ``` The usual rules for `merge()`{.r} apply. -E. 
g., if to preserver all spectra of flu, use `all.x = TRUE`{.r}: +E.g. to preserve all spectra of flu, use `all.x = TRUE`{.r}: ```{r} flu.merged <- merge(flu, flu.ref, all.x = TRUE) @@ -1848,7 +1854,7 @@ stopifnot(all(names(hy_get_options(TRUE)) %in% c( ## Speed and Memory Considerations {- #sec:speed-considerations} -While most of package **hyperSpec**'s functions work at a decent speed for interactive sessions (of course depending on the size of the object), iterated (repeated) calculations as for bootstrapping or iterated cross validation may ask for special speed considerations. +While most of the package **hyperSpec**'s functions work at a decent speed for interactive sessions (of course depending on the size of the object), iterated (repeated) calculations for bootstrapping or iterated cross-validation may ask for special speed considerations. As an example, consider the code for shifting the spectra: @@ -1879,7 +1885,7 @@ system.time({ ## Additional Packages -Package **matrixStats**[`r cite_pkg("matrixStats")`] implements fast functions to calculate summary statistics for each row or each column of a matrix. +Package **matrixStats**[`r cite_pkg("matrixStats")`] implements fast functions to calculate summary statistics for each row or each column of a matrix. ## Memory Usage @@ -1889,7 +1895,7 @@ At certain points, package **hyperSpec** provides switches that allow working wi \mFun{`new ("hyperSpec")`{.r}, `read.ENVI*()`{.r},`read.txt.Renishaw()`{.r}} \index{options!gc} -The initialization method `new("hyperSpec", ...)`{.r} takes particular care to avoid unneccessary copies of the spectra matrix. +The initialization method `new("hyperSpec", ...)`{.r} takes particular care to avoid unnecessary copies of the spectra matrix. In addition, frequent calls to `gc()`{.r} can be requested by `hy_set_option(gc = TRUE)`{.r}. The same behaviour is triggered in `read.ENVI()`{.r} and its derivatives (`read.ENVI.`_`Manufacturer`_`()`{.r}). 
The memory consumption of `read.txt.Renishaw()`{.r} can be lowered by importing the data in chunks (argument `nlines`{.r}). diff --git a/vignettes/list-of-vignettes.md b/vignettes/list-of-vignettes.md index b60b15a6..03f3ac6c 100644 --- a/vignettes/list-of-vignettes.md +++ b/vignettes/list-of-vignettes.md @@ -1,23 +1,22 @@ ```{block, type="note-t", echo = TRUE} -**Note**: +**Notice**: -Package **`hyperSpec`** and it's friends provide a number of vignettes to help you get started. +Package **`hyperSpec`** and its associated packages offer a range of vignettes intended to assist users in familiarizing themselves with its functionality. -* You can access the vignettes via these links: +* The vignettes are accessible through the following links: + [Introduction to **`hyperSpec`**](http://r-hyperspec.github.io/hyperSpec/articles/hyperSpec.html) + [Plotting Functions in **`hyperSpec`**](http://r-hyperspec.github.io/hyperSpec/articles/plotting.html) + [Fitting Baselines to Spectra](http://r-hyperspec.github.io/hyperSpec/articles/baseline.html) + [Importing Files into **`hyperSpec`**](http://r-hyperspec.github.io/hyperSpec/articles/fileio.html) - + [flu: Example Workflow for Fluorescene Emission](http://r-hyperspec.github.io/hyperSpec/articles/flu.html) + + [flu: Example Workflow for Fluorescence Emission](http://r-hyperspec.github.io/hyperSpec/articles/flu.html) + [laser: Example Workflow for Spectral Time Series](http://r-hyperspec.github.io/hyperSpec/articles/laser.html) -* Alternatively, if you are offline or prefer accessing the vignettes with *R*, simply type `browseVignettes("hyperSpec")`{.r} to get a clickable list in a browser window. +* Alternatively, if you are offline or prefer accessing the vignettes within *R*, simply type `browseVignettes("hyperSpec")`{.r} to obtain a clickable list in a browser window. 
-* Vignettes in other packages: - + [Example Workflow for 2D Raman Spectra ](https://r-hyperspec.github.io/hySpc.chondro/articles/hySpc-chondro.html) (`chondro` dataset) +* Vignettes from other packages: + + [Example Workflow for 2D Raman Spectra](https://r-hyperspec.github.io/hySpc.chondro/articles/hySpc-chondro.html) (`chondro` dataset) + [Plotting `hyperSpec` objects with **`ggplot2`**](https://r-hyperspec.github.io/hySpc.ggplot2/articles/hySpc-ggplot2.html) - + [Using **`dplyr`** functions with `hyperSpec` objects](https://r-hyperspec.github.io/hySpc.dplyr/articles/hySpc-dplyr.html) ```