---
title: "1. Overview of how to use fundiversity"
output: rmarkdown::html_vignette
link-citations: yes
bibliography: fundiversity.bib
vignette: >
  %\VignetteIndexEntry{1. Overview of how to use fundiversity}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r vignette-setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

`fundiversity` lets you compute functional diversity indices. Currently it can compute five indices:

- functional richness (FRic),
- functional volume intersections (FRic_intersect),
- functional divergence (FDiv),
- functional evenness (FEve),
- functional dispersion (FDis)
- Rao's quadratic entropy (Q).

This vignette will introduce you to the data needed as well as how to compute and interpret each index. We made sure the computations of these indices are correct based on a test dataset as specified in the [correctness vignette](./fundiversity_3-correctness.html).

```{r setup}
library("fundiversity")
```

# Required data 

To compute functional diversity indices, you will need at least a dataset describing species traits, i.e. species characteristics. Note that we here talk about species but the reasoning could apply on whatever unit you're interested in whether it's individual organisms, ecological plots, or even entire ecosystems. The traits are the features that describe these units.

`fundiversity` comes with one example trait dataset. The dataset comes from @Nowak_Projecting_2019 and describe the traits of birds and plants along a tropical gradient [@Nowak_Data_2019].
You can see the datasets available in `fundiversity` using the `data()` function:

```{r see-data}
data(package = "fundiversity")
```

To load them use their names into the `data()` function:

```{r load-data-traits}
data("traits_birds", package = "fundiversity")

head(traits_birds)

data("traits_plants", package = "fundiversity")

head(traits_plants)
```

Note that in these datasets the **species** are shown in **rows**, with species names as row names, and **traits** are in **columns**.

Functional diversity indices are generally computed at different locations that we hereafter call sites. We thus need a description of which species is in which site in the form of a site-species matrix. Again, we're calling it a site-species matrix but the granularity of both your "species" and "site" units can vary depending on what you want to compute functional diversity on.

`fundiversity` contains the corresponding site-species matrices to the above-mentioned trait dataset [@Nowak_Data_2019]: 

```{r load-data-site-sp}
# Site-species matrix for birds
data("site_sp_birds", package = "fundiversity")

head(site_sp_birds)[, 1:5]

# Site-species matrix for plants
data("site_sp_plants", package = "fundiversity")

head(site_sp_plants)[, 1:5]
```

The site species matrix represent the presence of a given **species** (in **column**) in a given **site** (in **row**), similar to the format used in the [`vegan`](https://cran.r-project.org/package=vegan) package. Here the site-species matrix contains only 0 (absence) and 1 (presence), but `fundiversity` can also use matrices that contain abundances for some functional diversity indices (FDiv and Q).

To ensure the good computation of functional diversity indices, at least some of species names (row names) in the trait data need to be in to the column names of the site species matrix:

```{r not-enough-species, error=TRUE}
# Fewer species in trait dataset than species in the site-species matrix
fd_fric(traits_birds[2:217,], site_sp_birds)

# Fewer species in the site-species matrix than in the traits
fd_fric(traits_birds, site_sp_birds[, 1:60])

# No species in common between both dataset
fd_fric(traits_birds[1:5,], site_sp_birds[, 6:10])
```


# Functional Richness (FRic) - `fd_fric()`

Functional Richness (FRic) represents the total amount of functional space filed by a community in a dataset [@Villeger_New_2008]. You can compute FRic in `fundiversity` using the `fd_fric()` function.

For a single trait range FRic is the range of trait observed in the dataset:

```{r fric-1d}
# Range of bill width in the birds dataset
diff(range(traits_birds[, "Bill.width..mm."]))

# Using fundiversity::fd_fric()
fd_fric(traits_birds)
```

The first column `site` describes the site on which FRic has been computed while the `FRic` column contains the computed FRic values. If no site-species matrix has been provided the site is named by default `s1`.

For multiple traits, FRic can be thought as a multi-dimensional range which is computed as the convex hull volume of the considered species [@Villeger_New_2008]:

```{r fric-nd}
fd_fric(traits_birds)
```

If you provide only a trait dataset without specifying site-species matrix `fd_fric()` computes FRic on the full trait dataset.
You can compute FRic values for different sites by providing both a trait dataset and a site-species matrix to `fd_fric()`:

```{r fric-nd-sites}
fd_fric(traits_birds, site_sp_birds)
```

Because the convex hull volume depends on the number and the units of the traits used, it is difficult to compare across datasets, that is why it has been suggested to standardize its value by the total volume comprising all species in the dataset [@Villeger_New_2008]:

```{r fric-stand}
fd_fric(traits_birds, stand = TRUE)
```

The newly computed FRic values will then be comprised between 0 and 1. It is especially useful when comparing different sites:

```{r fric-stand-sites}
fd_fric(traits_birds, site_sp_birds, stand = TRUE)
```

Each row gives the standardized FRic values of each site.

**Parallelization**. The computation of this function can be parallelized thanks to the `future` package. Refer to the [parallelization vignette](./fundiversity_1-parallel.html) to get more information about how to do so.

**Memoization**. By default, when loading `fundiversity`, the functions to compute convex hulls are [memoised](https://en.wikipedia.org/wiki/Memoization) through the `memoise`  package if it is installed (their results are cached to avoid recomputing the  same functional volume twice). It means repeated call to `fd_fric()` with the same arguments won't be recomputed but recovered from cache. To deactivate this behavior you can set the  option `fundiversity.memoise` to `FALSE` by running the following line: `options(fundiversity.memoise = FALSE)`. By changing the option, the behavior will automatically change the next time you run the function. Add it to your script(s) or `.Rprofile`  file to avoid toggling it each time. By changing the option, the behavior will automatically change the next time you run the function. **Important note**:  memoisation is only available when the `memoise` package has been installed  **and without parallelization**, otherwise `fundiversity` will use unmemoised versions of the functions.


# Functional volume intersect (FRic_intersect) - `fd_fric_intersect()`

Sometimes you're interested in the shared functional volumes between pairs of sites more than in the functional volumes of each site separately. fundiversity provides the 
`fd_fric_intersect()` function for this exact use case.

It follows the same interface as `fd_fric()` with similar named arguments:

```{r fric-intersect-intro}
fd_fric_intersect(traits_birds)
```

`fd_fric_intersect()` computes the shared functional volumes between each pair of sites, including self-intersection which correspond to the functional volume of each site. Similarly to `fd_fric()` if no site-species data is provided, `fd_fric_intersect()` considers a site that contains all species from the trait dataset.

```{r fric-intersect-all}
fd_fric_intersect(traits_birds, site_sp_birds[1:2,])
```

The output is a data.frame where the two first columns (`first_site` and `second_site`) define the sites on which the intersection is computed, the third column (`FRic_intersect`) contains the volume of the intersection.

Similarly to `fd_fric()` the intersections volumes can be standardized:

```{r fric-intersect-stand}
fd_fric_intersect(traits_birds, site_sp_birds[1:2,], stand = TRUE)
```

Note that when standardizing the volumes, the behavior is similar to that of `fd_fric()` which means the function considers the total volume occupied by provided trait values, even if they are absent from all sites, this can lead to standardized self-intersection volumes lower than one.

**Parallelization**. The computation of this function can be parallelized thanks to the `future` package. Refer to the [parallelization vignette](./fundiversity_1-parallel.html) to get more information about how to do so.

**Memoization**. By default, when loading `fundiversity`, the functions to compute convex hulls are [memoised](https://en.wikipedia.org/wiki/Memoization) through the `memoise` package if it is installed. It means that repeated calls to `fd_fric_intersect()` with similar arguments won't be recomputed each time but recovered from memory. To deactivate this behavior you can set the option `fundiversity.memoise` to `FALSE` by running the following line: `options(fundiversity.memoise = FALSE)`. By changing the option, the behavior will automatically change the next time you run the function. Add it to your script(s) or `.Rprofile` file to avoid toggling it each time. **Important note**:  memoisation is only available when the `memoise` package has been installed  **and without parallelization**, otherwise `fundiversity` will use unmemoised versions of the functions.


# Functional Divergence (FDiv) - `fd_fdiv()`

Functional Divergence (FDiv) represents how abundance is spread along the different traits [@Villeger_New_2008]. When a species with extreme trait values has the highest abundance, then functional divergence is high.

Use the `fd_fdiv()` function to compute functional divergence:

```{r fdiv-intro}
# One-dimension FDiv
fd_fdiv(traits_birds[, 1, drop = FALSE])

# Multiple dimension FDiv
fd_fdiv(traits_birds)
```

When no site-species matrix is provided, FDiv is computed by default considering all the species together. If you provide a site-species matrix, then FDiv is computed across all sites:

```{r fdiv-sites}
fd_fdiv(traits_birds, site_sp_birds)
```

Similarly to FRic, if the included species differ between the site-species matrix and the trait dataset, `fd_fdiv()` will take the common subset of species.

The computation of this function can be parallelized thanks to the `future` package. Refer to the [parallelization vignette](./fundiversity_1-parallel.html) to get more information about how to do so.

# Functional Evenness (FEve) - `fd_feve()`

Functional Evenness (FEve) describes the regularity of the distribution of species (and their abundances) in trait space [@Villeger_New_2008]. FEve is bounded between 0 and 1. FEve is close to 0 when most species (and abundances) are tightly packed in a portion of the trait space while it is close to 1 if species are regularly spread (with even abundances) along the trait space.

Use the `fd_fdiv()` function to compute functional divergence:

```{r feve-intro}
# One-dimension FEve
fd_feve(traits_birds[, 1, drop = FALSE])

# Multiple dimension FEve
fd_feve(traits_birds)
```

When no site-species matrix is provided, FEve is computed by default considering all the species together. If you provide a site-species matrix, then FEve is computed across all sites:

```{r feve-sites}
fd_feve(traits_birds, site_sp_birds)
```

Similarly to FRic, if the included species differ between the site-species matrix and the trait dataset, `fd_feve()` will take the common subset of species.

The computation of this function can be parallelized thanks to the `future` package. Refer to the [parallelization vignette](./fundiversity_1-parallel.html) to get more information about how to do so.

**Memoization**. By default, when loading `fundiversity`, the functions to compute convex hulls are [memoised](https://en.wikipedia.org/wiki/Memoization) through the `memoise`  package if it is installed (their results are cached to avoid recomputing the  same functional volume twice). It means repeated call to `fd_fdiv()` with the same arguments won't be recomputed but recovered from cache. To deactivate this behavior you can set the  option `fundiversity.memoise` to `FALSE` by running the following line: `options(fundiversity.memoise = FALSE)`. By changing the option, the behavior will automatically change the next time you run the function. Add it to your script(s) or `.Rprofile`  file to avoid toggling it each time. By changing the option, the behavior will automatically change the next time you run the function. **Important note**:  memoisation is only available when the `memoise` package has been installed  **and without parallelization**, otherwise `fundiversity` will use unmemoised versions of the functions.


# Functional Dispersion (FDis) - `fd_fdis()`

Functional Dispersion reflects changes in the abundance-weighted deviation of species trait values from the center of the functional space.

You can compute Functional Dispersion (FDis) using the `fd_fdis()` function by providing a trait dataset:

```{r fdis-intro}
fd_fdis(traits_birds)
```

If you don't provide a site-species matrix, `fd_fdis()` considers all species provided in the trait dataset present at equal abundances in the same site.
You can also provide a site-species matrix to compute FDis at different sites:

```{r fdis-sites}
fd_fdis(traits_birds, site_sp_birds)
```

The computation of this function can be parallelized thanks to the `future` package. Refer to the [parallelization vignette](./fundiversity_1-parallel.html) to get more information about how to do so.


# Rao's Quadratic Entropy (Q) - `fd_raoq()`

Rao's Quadratic entropy assesses the multi-dimensional divergence in trait space [@Rao_Diversity_1982]. It is the abundance-weighted variance of the trait dissimilarities between all species pairs.

You can compute Rao's Quadratic entropy (Q) using the `fd_raoq()` function by providing a trait dataset:

```{r raoq-intro}
fd_raoq(traits_birds)
```

If you don't provide a site-species matrix, `fd_raoq()` considers all species provided in the trait dataset present at equal abundances in the same site.
You can also provide a site-species matrix to compute Q at different sites:

```{r raoq-sites}
fd_raoq(traits_birds, site_sp_birds)
```

Because the computation of Rao's quadratic entropy requires dissimilarities between all pair of species in the dataset, if you provide a trait dataset `fd_raoq()`, the function will compute the Euclidean distance between all pairs of species.
If you wish to directly provide species dissimilarities, you can do so through the `dist_matrix` argument:

```{r raoq-dissim}
# Compute dissimilarity between species with the Manhattan distance
trait_dissim <- dist(traits_birds, method = "manhattan")

fd_raoq(dist_matrix = trait_dissim)

fd_raoq(sp_com = site_sp_birds, dist_matrix = as.matrix(trait_dissim))
```

NB: if you want to provide both a site-species matrix and a trait dissimilarity matrix please specify explicitly the arguments names.


# Large site-species data / sparse matrices

[Sparse matrices](https://en.wikipedia.org/wiki/Sparse_matrix) are memory efficient ways of storing matrix object that contains many zeros. `fundiversity` is fully compatible with sparse matrices through the [`Matrix` package](https://cran.r-project.org/package=Matrix). They can be used to encode site-species information or distance matrices.

Provide `Matrix` objects as inputs of the indices function `fundiversity`, they will transparently use them for efficient computation.

```{r sparse-matrix-example}
# Convert site-species matrix to sparse matrix
sparse_site_sp <- Matrix::Matrix(site_sp_birds, sparse = TRUE)

fd_raoq(traits_birds, site_sp_birds)
```


# Standardizing trait data

`fundiversity` does not perform any transformation on the input trait or dissimilarity data. In `fd_raoq()` if you provide only continuous trait data then the function will attempt computing Euclidean distance between the species.

In order to get comparable functional diversity indices you can standardize the trait data. One option would be to consider the `scale()` function to scale each continuous trait with a mean of zero and a standard deviation of one (z-score). Each trait will then have the same importance when computing functional diversity indices:

```{r scale_traits}
traits_birds_sc <- scale(traits_birds)
summary(traits_birds_sc)

# Unscaled
fd_fric(traits_birds)

# Scaled
fd_fric(traits_birds_sc)
```
Another solution to make trait comparable is to scale them between 0 and 1 by scaling each trait by its maximum and minimum values:

```{r minmax_traits}
min_values <- as.numeric(lapply(as.data.frame(traits_birds), min))
max_values <- as.numeric(lapply(as.data.frame(traits_birds), max))

traits_birds_minmax <- apply(traits_birds, 1, function(x) {
  (x - min_values)/(max_values - min_values)
})
traits_birds_minmax <- t(traits_birds_minmax)
summary(traits_birds_minmax)
```

There are several other options available to standardize trait values, reviewed in @Leps_Quantifying_2006.

If not all the traits you use are continuous, refer to the next section, which suggests ways of computing functional diversity indices with non-continuous traits.


# Non-continuous traits?

Do not panic. You can still compute the above-mentioned functional diversity indices. However, as all indices need continuous descriptors for all considered species, you need to transform the non-continuous trait data into a continuous form. The general idea is to obtain from the trait table a table of quantitative descriptions by defining specific dissimilarity and projecting species dissimilarities onto quantitative space using Principal Coordinates Analysis (PCoA). The framework is fully described in @Maire_How_2015.

To compute dissimilarity with non-continuous traits you can user Gower's distance [@Gower_general_1971] or its following adaptations [@Pavoine_challenge_2009; @Podani_Extending_1999]. You can use the following functions: `cluster::daisy()`, `FD::gowdis()`, `ade4::dist.ktab()`, or `vegan::vegdist()`.

Then you can project these dissimilarities with Principal Coordinates using `ape::pcoa()` for example. You can then select the first dimensions that explains the most variance and use theses as the input "traits" to compute functional diversity indices.


# Missing values in traits?

Sometimes, some of the trait values can be missing for some species in your dataset. Because `fundiversity` does not want to make assumptions without telling you, by default it **drops** the species data for which the trait is missing.

If you want to use data with missing values you can use dissimilarity metrics that accept missing trait values such as some of the methods specified in `vegan::vegdist()`.

Another solution, would be to impute the missing trait value to fill it. Many imputation methods exists and trait imputation is out of the scope of `fundiversity` but you can find some details on how to proceed in the review by @Penone_Imputation_2014.


# Functions summary table

```{r child="../man/rmdchunks/_fundiversity_functions.Rmd"}
```

# References