From 9914fde5dc6cccec010bac8c1f813ba7c2e681ac Mon Sep 17 00:00:00 2001 From: nfrerebeau Date: Thu, 8 Aug 2024 12:43:47 +0200 Subject: [PATCH 1/5] Update diversity module (#4) --- tfqar.csv | 94 +++++++++++++++++++++++++++---------------------------- 1 file changed, 47 insertions(+), 47 deletions(-) diff --git a/tfqar.csv b/tfqar.csv index a20430b..adf271c 100644 --- a/tfqar.csv +++ b/tfqar.csv @@ -1,47 +1,47 @@ -tfqa_group,tfqa_program,tfqa_description,r,r_package,r_function,notes -Spatial Analysis,CONTIG,Monte Carlo evaluation of the statistical significance of the observed degree of contiguity of grid units assigned to the same cluster.,,,, -Spatial Analysis,FISHER,Calculates Fisher's Exact test,Yes,stats,stats::fisher_test(), -Spatial Analysis,GRID,Aggregates point-provenience data into counts by type for each grid unit.,Yes,[sf](https://r-spatial.github.io/sf/),sf::st_join(),[Tutorial](https://mattherman.info/blog/point-in-poly) -Spatial Analysis,HOA,Computes Hodder and Okell's A and dispersion ratios,Yes,[GmAMisc](https://cran.r-project.org/web/packages/GmAMisc/index.html),GmAMisc::Aindex(), -Spatial Analysis,KMEANS,"Performs k-means cluster analysis with extensive output designed to facilitate interpretation. The program can be used to cluster analyze any data set, but has special features developed for use in archaeological spatial analysis. In particular, Kintigh and Ammerman's (1982) k-means pure locational clustering method can be performed. 
The program also executes the clustering for Whallon's (1984) unconstrained clustering method on data smoothed using the GRID or LDEN programs.",,,,Unpackaged script: [mpeeples2008/Kmeans](https://github.com/mpeeples2008/Kmeans) -Spatial Analysis,KMPLT,Plots the SSE and (2 dimensional) cluster configuration results of KMEANS on screen and creates hard-copy publishable quality plots,,,,Unpackaged script: [mpeeples2008/Kmeans](https://github.com/mpeeples2008/Kmeans) -Spatial Analysis,KOETJE,Performs the Monte Carlo analysis of homogeneity of cluster configurations as suggested by Koetje (1987).,,,, -Spatial Analysis,LDEN,Performs Johnson's (1984) Local Density Analysis on point-provenienced or grid data. The program also outputs counts or percentages of points of different types that occur within a circular neighborhood around each data point. ,,,, -Spatial Analysis,LDPLT,"Plots selected local density coefficients computed by LDEN against radius, so behavior of coefficients for different pairs of classes can be easily observed over a range of radii",,,, -Spatial Analysis,NEIG,"An efficient, general-purpose nearest-neighbor (Whallon 1984) and gravity model program useful for intrasite spatial analysis or regional analysis. It allows categorization of items by class (e.g. site type or tool type) and permits the calculation of within or between class neighbors.",,,, -Spatial Analysis,RANDPT,"Generates random sets of coordinates, including for clumped distributions with different parameters. Also random walks any number of points in an existing distribution with arbitrary number of steps and step length.",Partially,[spatstat](https://spatstat.org/),"spatstat::rpoint(), spatstat::runifpoint(), spatstat::rpoispp()","Not sure about the ""random walk"" part." 
-Diversity,BOONE,"Calculates, for a set of proveniences with counts by artifact class, Boone's (1987) assemblage heterogeneity measure and related values.",,,, -Diversity,DIVERS,"Calculates richness and evenness (H/Hmax) dimensions of diversity for a given data set and uses Monte-Carlo methods to derive expected diversity for a model distribution over a range of sample sizes (Kintigh 1984, 1989).",,,, -Diversity,DIVMEAS,"Calculates several diversity measures including Richness, Simpson's, Shannon's, Brillouin's, and the Renyi and Delta families of generalized diversity measures for any given distribution of counts.",Yes,"[tabula](https://tabula.archaeo.science/), [vegan](https://CRAN.R-project.org/package=vegan)","tabula::index_richness(), tabula::index_heterogeneity(), vegan::renyi()",tabula is not currently available on CRAN -Diversity,DIVPLT,Plots the results of DIVERS on screen and creates publishable quality plots,,,, -Diversity,EVALC,Performs a Monte Carlo evaluation of the significance of an observed value of Simpson's C measure of diversity relative to a given assumption about the population.,,,, -Diversity,RAREFY,"Performs rarefaction analysis for sets of sample counts in a CSV file as described by Baxter (2001). Provides expected richness, standard deviation of the expected, Z score, and probability for each larger sample to every smaller sample size. Also outputs expected richness for each sample up to its sample size for graphing.",,,, -Distance,BAYES,This program implements Bayesian methods for proportions as described by Iversen (1984). 
Intervals are calculated and graphed for Bayesian estimates of proportions based on both flat and informative priors.,,,, -Distance,BINOMIAL,Computes binomial probabilities and population proportion intervals for a sample.,,,, -Distance,BRSAMPLE,Provides a Monte Carlo estimate of the sampling error of differences of the Brainerd Robinson coefficient calculated between a sample and a known population or between two samples drawn from the same population,,,, -Distance,CLCA,"Performs a Complete Linkage Cluster Analysis on up to 180 cases. It takes as input an upper triangular distance matrix, as is created by the DIST program. As output, it lists the sequence of item/cluster joins and fusion values but does not create a dendrogram.",,,, -Distance,DIST,"Computes a triangular matrix of distance or similarity measures: Euclidean Distance, Pearson's r, Brainerd-Robinson Coefficient, Jaccard's Coefficient, Simple Matching Coefficient, and Gower Coefficient.",Partially,[vegan](https://CRAN.R-project.org/package=vegan),vegan::vegdist(),"vegan implements Euclidean, Jaccard, and Gower distances." -Distance,FORD,Plots a publishable quality battleship curve (Ford) diagram,Yes,[tabula](https://tabula.archaeo.science/),tabula::plot_ford(),tabula is not currently available on CRAN -Distance,POISSON,"Computes Poisson and negative binomial probabilities, given expected counts.",,,, -Distance,resampleBRED,"Provide Monte Carlo estimates of the sampling error of differences of the Brainerd-Robinson and Euclidean Distance coefficients calculated between a sample and a known population or between two samples drawn from the same population, as described and applied in Deboer et al. (1996).",,,, -Distance,TWOWAY,"Provides tests of independence and measures of association and prints tables that have been standardized with a number of techniques. Standard Chi² and G tests of independence are provided. 
Using Monte Carlo methods, Chi² and G tests can be performed on tables with very small expected counts. A Chi² goodness of fit test (with externally determined expected values) can also be calculated. Measures of association include Yule's Q, Phi, Cramer's V and proportional reduction of error measures Tau and Lambda. Table standardization methods include median polish (Lewis 1986) and Mosteller (multiplicative) standardization as well as Haberman's z-score standardization for independent variables used by Grayson (1984) and Allison's binomial probability-based z-score standardization. It will also print row, column, and cell percents, Chi² cell contributions, and Chi² expected values. ",,,, -Dating and Demography,ARRANGE,Creates a probabilistic estimate of the range of site dates based on the proportions of dated ceramic types in the assemblage. Output includes a density plot against time. The program also calculates mean ceramic dates. This method is described in Steponaitis and Kintigh (1993).,,,,Unpackaged script: [mpeeples2008/Mean-Ceramic-Date-and-Error-Estimation](https://github.com/mpeeples2008/Mean-Ceramic-Date-and-Error-Estimation) -Dating and Demography,C14,"provides a graphical way to analyze sets of radiocarbon dates. Each radiocarbon date is treated not as a single point in time but as a normally distributed probability with a mean and standard deviation given by the lab. In evaluating several dates, for each interval the probability distributions associated with the dates are summed. For each temporal interval, an expected number of dates is calculated and plotted in a histogram.",Yes,[rcarbon](https://github.com/ahb108/rcarbon/),"rcarbon::plot(), rcarbon::spd()",Also [stratigraphr](http://stratigraphr.joeroe.io/) for tidy alternatives. -Dating and Demography,CALCULATE_K,Calculates K for for use in Cowgill's formula that estimates the span of true interval producing an observed set of measured dates with Gaussian errors. 
It calculates the value of K for any standard deviation of a Normal Distribution. See Cowgill and Kintigh (2020).,No,,,Pascal source available: [kintigh/phaselen](https://github.com/kintigh/phaselen) -Dating and Demography,DSPLIT,Compares and combines radiocarbon samples using the procedure published in Archaeometry by Wilson and Ward (1981).,,,, -Dating and Demography,MATCHINTERVAL,Performs a MonteCarlo evaluation of the correspondence between temporal intervals with extreme climate events and the occurrence dates of major cultural changes as described and applied by Kintigh & Ingram (2018).,,,, -Dating and Demography,PHASELEN,Provides a Monte Carlo analysis to estimate the span of true span producing an observed set of measured dates with Gaussian errors such as radiocarbon and obsidian hydration dates. The program has an option for calibration. ,No,,,Pascal source available: [kintigh/phaselen](https://github.com/kintigh/phaselen) -Dating and Demography,ROOMACCUM,"Estimates within-period rates of population growth (or decline) given structure counts dated to a sequence of chronological periods as described and applied by Kintigh and Peeples (2020). It assumes a knowledge of the number of structures that date to each specific period, the period lengths, and an estimated structure use life. The population growth rate estimates are derived by simulating the construction (due to replacement and population growth) and abandonment (due to the completion of the use life or population decline) of individual structures such that the observed number of rooms dating to a period matches the simulated number of rooms.",No,,,Pascal source available: [kintigh/RoomAccum](https://github.com/kintigh/RoomAccum). -Subsurface Testing,PLACESTP,"Calculates the optimal placement of test units in a rectangular or linear survey area. 
For a user-specified number of survey transects (or user-specified lengthwise and width-wise spacing of test units), in any one of three basic configurations, the program will print out the coordinates of the optimal test unit placement, along with some statistics about the largest circular site that can go unsampled in the survey area. This program implements the formulae provided by Krakker, Shott, and Welch (1983) and revised in Kintigh (1988).",No,,,Could be implemented in [fieldwalkr](https://github.com/joeroe/fieldwalkr) -Subsurface Testing,STP,Probabilistic evaluation of subsurface testing designs as described in Kintigh 1988. STP uses Monte-Carlo methods to evaluate the effectiveness of a test unit layout within a survey area to locate sites with a given size and artifact density.,,,, -Utility,ADFUTIL,"Generates random data sets and manipulates files in the data format used by the analysis programs. It allows the creation of random data set of any size. Variables may be uniform or normally distributed variables with user specified ranges or means standard deviations. ADFUTIL allows the deletion of columns (variables), selective deletion of rows (observations) based on values in a column, replacement of values in a column, randomization of columns for Monte Carlo analysis, the addition of new columns from another data set, and selection of a random sample of cases.",,,, -Utility,CNTCNV,"Program to speed data input and increase entry accuracy for count data, where the number of categories is large relative to the number of items counted for an observation (e.g. surface collection counts of 40 ceramic type divided into 8 vessel forms). It permits a highly abbreviated input format but it writes out a standard matrix (of the sort read by most analysis programs) with one count per category of each observation. 
The program provides labeled printouts of the data and can perform elaborate aggregation of count categories and simple aggregation of observations.",,,, -Utility,CntEdit,CntEdit is a companion program to CNTCNV and can be used to do global or selective substititions of row or column field values in a data file formatted for CNTCNV.,,,, -Utility,CntRefmt,"CntRefmt is a companion program to CNTCNV that reformats row-column-count segments of records formatted for CntCnv, e.g, to make differently formatted files consistent or to change the spacing to make reading easier.",,,, -Utility,CONVSYS,"Converts a SYSTAT internal format data file into a raw data file, a variable label file, and a case label file that can be used these and other programs that read free-format ASCII data. Works with versions 2.0 and above of SYSTAT, on files of any size.",,,, -Utility,HPPLOT,Provides a flexible user interface to a Hewlett Packard compatible plotters. Its can create a customized analysis graphics from a raw data file edited to include the plot commands.,,,, -Utility,MVC,Permits arbitrarily complex copying of sets of columns in an input record into sets of columns in an output record. It can extract data from fixed-format data records for use with analytical programs that require free format input. Files of any size can be processed.,,,, -Utility,SCAT,"Produces screen and publishable quality scatter plots of variables. All points may be plotted with the same symbol, or different symbols can be plotted based on the value of a variable.",Yes,[ggplot2](https://ggplot2.tidyverse.org/),ggplot2::geom_point(), -Utility,SORTLINE,"A general purpose sort utility, SORTLINE sorts fixed-format data files of up to 32,767 lines into an order defined by any number of user-specified sort fields.",Yes,[dplyr](https://dplyr.tidyverse.org/),dplyr::arrange(), -Utility,SPLIT,"Divides a large file into sections that can be recombined with the DOS COPY command. 
Thus, large hard disk file can be split and copied onto several floppies.",,,, -Utility,UNTAB,Replaces tabs and control characters in a file with blanks so they can be used with analysis programs that require pure ASCII files (e.g. SYSTAT).,,,, \ No newline at end of file +tfqa_group,tfqa_program,tfqa_description,r,r_package,r_function,notes +Spatial Analysis,CONTIG,Monte Carlo evaluation of the statistical significance of the observed degree of contiguity of grid units assigned to the same cluster.,,,, +Spatial Analysis,FISHER,Calculates Fisher's Exact test,Yes,stats,stats::fisher.test(), +Spatial Analysis,GRID,Aggregates point-provenience data into counts by type for each grid unit.,Yes,[sf](https://r-spatial.github.io/sf/),sf::st_join(),[Tutorial](https://mattherman.info/blog/point-in-poly) +Spatial Analysis,HOA,Computes Hodder and Okell's A and dispersion ratios,Yes,[GmAMisc](https://cran.r-project.org/web/packages/GmAMisc/index.html),GmAMisc::Aindex(), +Spatial Analysis,KMEANS,"Performs k-means cluster analysis with extensive output designed to facilitate interpretation. The program can be used to cluster analyze any data set, but has special features developed for use in archaeological spatial analysis. In particular, Kintigh and Ammerman's (1982) k-means pure locational clustering method can be performed. 
The program also executes the clustering for Whallon's (1984) unconstrained clustering method on data smoothed using the GRID or LDEN programs.",,,,Unpackaged script: [mpeeples2008/Kmeans](https://github.com/mpeeples2008/Kmeans) +Spatial Analysis,KMPLT,Plots the SSE and (2 dimensional) cluster configuration results of KMEANS on screen and creates hard-copy publishable quality plots,,,,Unpackaged script: [mpeeples2008/Kmeans](https://github.com/mpeeples2008/Kmeans) +Spatial Analysis,KOETJE,Performs the Monte Carlo analysis of homogeneity of cluster configurations as suggested by Koetje (1987).,,,, +Spatial Analysis,LDEN,Performs Johnson's (1984) Local Density Analysis on point-provenienced or grid data. The program also outputs counts or percentages of points of different types that occur within a circular neighborhood around each data point. ,,,, +Spatial Analysis,LDPLT,"Plots selected local density coefficients computed by LDEN against radius, so behavior of coefficients for different pairs of classes can be easily observed over a range of radii",,,, +Spatial Analysis,NEIG,"An efficient, general-purpose nearest-neighbor (Whallon 1984) and gravity model program useful for intrasite spatial analysis or regional analysis. It allows categorization of items by class (e.g. site type or tool type) and permits the calculation of within or between class neighbors.",,,, +Spatial Analysis,RANDPT,"Generates random sets of coordinates, including for clumped distributions with different parameters. Also random walks any number of points in an existing distribution with arbitrary number of steps and step length.",Partially,[spatstat](https://spatstat.org/),"spatstat::rpoint(), spatstat::runifpoint(), spatstat::rpoispp()","Not sure about the ""random walk"" part." 
+Diversity,BOONE,"Calculates, for a set of proveniences with counts by artifact class, Boone's (1987) assemblage heterogeneity measure and related values.",Yes,[tabula](https://packages.tesselle.org/tabula/),tabula::index_boone(),"Not sure about the ""related values"" part." +Diversity,DIVERS,"Calculates richness and evenness (H/Hmax) dimensions of diversity for a given data set and uses Monte-Carlo methods to derive expected diversity for a model distribution over a range of sample sizes (Kintigh 1984, 1989).",Yes,[tabula](https://packages.tesselle.org/tabula/),tabula::simulate(), +Diversity,DIVMEAS,"Calculates several diversity measures including Richness, Simpson's, Shannon's, Brillouin's, and the Renyi and Delta families of generalized diversity measures for any given distribution of counts.",Yes,"[tabula](https://packages.tesselle.org/tabula/), [vegan](https://CRAN.R-project.org/package=vegan)","tabula::heterogeneity(), tabula::evenness(), tabula::richness(), tabula::composition(), vegan::renyi()", +Diversity,DIVPLT,Plots the results of DIVERS on screen and creates publishable quality plots,Yes,[tabula](https://packages.tesselle.org/tabula/),tabula::plot(), +Diversity,EVALC,Performs a Monte Carlo evaluation of the significance of an observed value of Simpson's C measure of diversity relative to a given assumption about the population.,,,, +Diversity,RAREFY,"Performs rarefaction analysis for sets of sample counts in a CSV file as described by Baxter (2001). Provides expected richness, standard deviation of the expected, Z score, and probability for each larger sample to every smaller sample size. Also outputs expected richness for each sample up to its sample size for graphing.",Yes,[tabula](https://packages.tesselle.org/tabula/),tabula::rarefaction(), +Distance,BAYES,This program implements Bayesian methods for proportions as described by Iversen (1984). 
Intervals are calculated and graphed for Bayesian estimates of proportions based on both flat and informative priors.,,,, +Distance,BINOMIAL,Computes binomial probabilities and population proportion intervals for a sample.,,,, +Distance,BRSAMPLE,Provides a Monte Carlo estimate of the sampling error of differences of the Brainerd Robinson coefficient calculated between a sample and a known population or between two samples drawn from the same population,,,, +Distance,CLCA,"Performs a Complete Linkage Cluster Analysis on up to 180 cases. It takes as input an upper triangular distance matrix, as is created by the DIST program. As output, it lists the sequence of item/cluster joins and fusion values but does not create a dendrogram.",,,, +Distance,DIST,"Computes a triangular matrix of distance or similarity measures: Euclidean Distance, Pearson's r, Brainerd-Robinson Coefficient, Jaccard's Coefficient, Simple Matching Coefficient, and Gower Coefficient.",Partially,[vegan](https://CRAN.R-project.org/package=vegan),vegan::vegdist(),"vegan implements Euclidean, Jaccard, and Gower distances." +Distance,FORD,Plots a publishable quality battleship curve (Ford) diagram,Yes,[tabula](https://tabula.archaeo.science/),tabula::plot_ford(),tabula is not currently available on CRAN +Distance,POISSON,"Computes Poisson and negative binomial probabilities, given expected counts.",,,, +Distance,resampleBRED,"Provide Monte Carlo estimates of the sampling error of differences of the Brainerd-Robinson and Euclidean Distance coefficients calculated between a sample and a known population or between two samples drawn from the same population, as described and applied in Deboer et al. (1996).",,,, +Distance,TWOWAY,"Provides tests of independence and measures of association and prints tables that have been standardized with a number of techniques. Standard Chi² and G tests of independence are provided. 
Using Monte Carlo methods, Chi² and G tests can be performed on tables with very small expected counts. A Chi² goodness of fit test (with externally determined expected values) can also be calculated. Measures of association include Yule's Q, Phi, Cramer's V and proportional reduction of error measures Tau and Lambda. Table standardization methods include median polish (Lewis 1986) and Mosteller (multiplicative) standardization as well as Haberman's z-score standardization for independent variables used by Grayson (1984) and Allison's binomial probability-based z-score standardization. It will also print row, column, and cell percents, Chi² cell contributions, and Chi² expected values. ",,,, +Dating and Demography,ARRANGE,Creates a probabilistic estimate of the range of site dates based on the proportions of dated ceramic types in the assemblage. Output includes a density plot against time. The program also calculates mean ceramic dates. This method is described in Steponaitis and Kintigh (1993).,,,,Unpackaged script: [mpeeples2008/Mean-Ceramic-Date-and-Error-Estimation](https://github.com/mpeeples2008/Mean-Ceramic-Date-and-Error-Estimation) +Dating and Demography,C14,"provides a graphical way to analyze sets of radiocarbon dates. Each radiocarbon date is treated not as a single point in time but as a normally distributed probability with a mean and standard deviation given by the lab. In evaluating several dates, for each interval the probability distributions associated with the dates are summed. For each temporal interval, an expected number of dates is calculated and plotted in a histogram.",Yes,[rcarbon](https://github.com/ahb108/rcarbon/),"rcarbon::plot(), rcarbon::spd()",Also [stratigraphr](http://stratigraphr.joeroe.io/) for tidy alternatives. +Dating and Demography,CALCULATE_K,Calculates K for for use in Cowgill's formula that estimates the span of true interval producing an observed set of measured dates with Gaussian errors. 
It calculates the value of K for any standard deviation of a Normal Distribution. See Cowgill and Kintigh (2020).,No,,,Pascal source available: [kintigh/phaselen](https://github.com/kintigh/phaselen) +Dating and Demography,DSPLIT,Compares and combines radiocarbon samples using the procedure published in Archaeometry by Wilson and Ward (1981).,,,, +Dating and Demography,MATCHINTERVAL,Performs a MonteCarlo evaluation of the correspondence between temporal intervals with extreme climate events and the occurrence dates of major cultural changes as described and applied by Kintigh & Ingram (2018).,,,, +Dating and Demography,PHASELEN,Provides a Monte Carlo analysis to estimate the span of true span producing an observed set of measured dates with Gaussian errors such as radiocarbon and obsidian hydration dates. The program has an option for calibration. ,No,,,Pascal source available: [kintigh/phaselen](https://github.com/kintigh/phaselen) +Dating and Demography,ROOMACCUM,"Estimates within-period rates of population growth (or decline) given structure counts dated to a sequence of chronological periods as described and applied by Kintigh and Peeples (2020). It assumes a knowledge of the number of structures that date to each specific period, the period lengths, and an estimated structure use life. The population growth rate estimates are derived by simulating the construction (due to replacement and population growth) and abandonment (due to the completion of the use life or population decline) of individual structures such that the observed number of rooms dating to a period matches the simulated number of rooms.",No,,,Pascal source available: [kintigh/RoomAccum](https://github.com/kintigh/RoomAccum). +Subsurface Testing,PLACESTP,"Calculates the optimal placement of test units in a rectangular or linear survey area. 
For a user-specified number of survey transects (or user-specified lengthwise and width-wise spacing of test units), in any one of three basic configurations, the program will print out the coordinates of the optimal test unit placement, along with some statistics about the largest circular site that can go unsampled in the survey area. This program implements the formulae provided by Krakker, Shott, and Welch (1983) and revised in Kintigh (1988).",No,,,Could be implemented in [fieldwalkr](https://github.com/joeroe/fieldwalkr) +Subsurface Testing,STP,Probabilistic evaluation of subsurface testing designs as described in Kintigh 1988. STP uses Monte-Carlo methods to evaluate the effectiveness of a test unit layout within a survey area to locate sites with a given size and artifact density.,,,, +Utility,ADFUTIL,"Generates random data sets and manipulates files in the data format used by the analysis programs. It allows the creation of random data set of any size. Variables may be uniform or normally distributed variables with user specified ranges or means standard deviations. ADFUTIL allows the deletion of columns (variables), selective deletion of rows (observations) based on values in a column, replacement of values in a column, randomization of columns for Monte Carlo analysis, the addition of new columns from another data set, and selection of a random sample of cases.",,,, +Utility,CNTCNV,"Program to speed data input and increase entry accuracy for count data, where the number of categories is large relative to the number of items counted for an observation (e.g. surface collection counts of 40 ceramic type divided into 8 vessel forms). It permits a highly abbreviated input format but it writes out a standard matrix (of the sort read by most analysis programs) with one count per category of each observation. 
The program provides labeled printouts of the data and can perform elaborate aggregation of count categories and simple aggregation of observations.",,,, +Utility,CntEdit,CntEdit is a companion program to CNTCNV and can be used to do global or selective substititions of row or column field values in a data file formatted for CNTCNV.,,,, +Utility,CntRefmt,"CntRefmt is a companion program to CNTCNV that reformats row-column-count segments of records formatted for CntCnv, e.g, to make differently formatted files consistent or to change the spacing to make reading easier.",,,, +Utility,CONVSYS,"Converts a SYSTAT internal format data file into a raw data file, a variable label file, and a case label file that can be used these and other programs that read free-format ASCII data. Works with versions 2.0 and above of SYSTAT, on files of any size.",,,, +Utility,HPPLOT,Provides a flexible user interface to a Hewlett Packard compatible plotters. Its can create a customized analysis graphics from a raw data file edited to include the plot commands.,,,, +Utility,MVC,Permits arbitrarily complex copying of sets of columns in an input record into sets of columns in an output record. It can extract data from fixed-format data records for use with analytical programs that require free format input. Files of any size can be processed.,,,, +Utility,SCAT,"Produces screen and publishable quality scatter plots of variables. All points may be plotted with the same symbol, or different symbols can be plotted based on the value of a variable.",Yes,[ggplot2](https://ggplot2.tidyverse.org/),ggplot2::geom_point(), +Utility,SORTLINE,"A general purpose sort utility, SORTLINE sorts fixed-format data files of up to 32,767 lines into an order defined by any number of user-specified sort fields.",Yes,[dplyr](https://dplyr.tidyverse.org/),dplyr::arrange(), +Utility,SPLIT,"Divides a large file into sections that can be recombined with the DOS COPY command. 
Thus, large hard disk file can be split and copied onto several floppies.",,,, +Utility,UNTAB,Replaces tabs and control characters in a file with blanks so they can be used with analysis programs that require pure ASCII files (e.g. SYSTAT).,,,, From 174d98b880e3c1b9eb787484608afa520a7f3312 Mon Sep 17 00:00:00 2001 From: nfrerebeau Date: Thu, 8 Aug 2024 12:52:33 +0200 Subject: [PATCH 2/5] Update distance module (#5) --- tfqar.csv | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tfqar.csv b/tfqar.csv index adf271c..4135921 100644 --- a/tfqar.csv +++ b/tfqar.csv @@ -20,8 +20,8 @@ Distance,BAYES,This program implements Bayesian methods for proportions as descr Distance,BINOMIAL,Computes binomial probabilities and population proportion intervals for a sample.,,,, Distance,BRSAMPLE,Provides a Monte Carlo estimate of the sampling error of differences of the Brainerd Robinson coefficient calculated between a sample and a known population or between two samples drawn from the same population,,,, Distance,CLCA,"Performs a Complete Linkage Cluster Analysis on up to 180 cases. It takes as input an upper triangular distance matrix, as is created by the DIST program. As output, it lists the sequence of item/cluster joins and fusion values but does not create a dendrogram.",,,, -Distance,DIST,"Computes a triangular matrix of distance or similarity measures: Euclidean Distance, Pearson's r, Brainerd-Robinson Coefficient, Jaccard's Coefficient, Simple Matching Coefficient, and Gower Coefficient.",Partially,[vegan](https://CRAN.R-project.org/package=vegan),vegan::vegdist(),"vegan implements Euclidean, Jaccard, and Gower distances." 
-Distance,FORD,Plots a publishable quality battleship curve (Ford) diagram,Yes,[tabula](https://tabula.archaeo.science/),tabula::plot_ford(),tabula is not currently available on CRAN +Distance,DIST,"Computes a triangular matrix of distance or similarity measures: Euclidean Distance, Pearson's r, Brainerd-Robinson Coefficient, Jaccard's Coefficient, Simple Matching Coefficient, and Gower Coefficient.",Partially,"[tabula](https://packages.tesselle.org/tabula/), [vegan](https://CRAN.R-project.org/package=vegan)","tabula::similarity(), vegan::vegdist()","tabula implements the Brainerd-Robinson and Jaccard coefficients; vegan implements Euclidean, Jaccard, and Gower distances." +Distance,FORD,Plots a publishable quality battleship curve (Ford) diagram,Yes,[tabula](https://packages.tesselle.org/tabula/),tabula::plot_ford(), Distance,POISSON,"Computes Poisson and negative binomial probabilities, given expected counts.",,,, Distance,resampleBRED,"Provide Monte Carlo estimates of the sampling error of differences of the Brainerd-Robinson and Euclidean Distance coefficients calculated between a sample and a known population or between two samples drawn from the same population, as described and applied in Deboer et al. (1996).",,,, Distance,TWOWAY,"Provides tests of independence and measures of association and prints tables that have been standardized with a number of techniques. Standard Chi² and G tests of independence are provided. Using Monte Carlo methods, Chi² and G tests can be performed on tables with very small expected counts. A Chi² goodness of fit test (with externally determined expected values) can also be calculated. Measures of association include Yule's Q, Phi, Cramer's V and proportional reduction of error measures Tau and Lambda. 
Table standardization methods include median polish (Lewis 1986) and Mosteller (multiplicative) standardization as well as Haberman's z-score standardization for independent variables used by Grayson (1984) and Allison's binomial probability-based z-score standardization. It will also print row, column, and cell percents, Chi² cell contributions, and Chi² expected values. ",,,, From c95de1d939cb542091931a0ad1d0a7c3ea3919c5 Mon Sep 17 00:00:00 2001 From: nfrerebeau Date: Thu, 8 Aug 2024 12:56:01 +0200 Subject: [PATCH 3/5] Update dating and demography module (#6) --- tfqar.csv | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tfqar.csv b/tfqar.csv index 4135921..f3930ee 100644 --- a/tfqar.csv +++ b/tfqar.csv @@ -25,7 +25,7 @@ Distance,FORD,Plots a publishable quality battleship curve (Ford) diagram,Yes,[t Distance,POISSON,"Computes Poisson and negative binomial probabilities, given expected counts.",,,, Distance,resampleBRED,"Provide Monte Carlo estimates of the sampling error of differences of the Brainerd-Robinson and Euclidean Distance coefficients calculated between a sample and a known population or between two samples drawn from the same population, as described and applied in Deboer et al. (1996).",,,, Distance,TWOWAY,"Provides tests of independence and measures of association and prints tables that have been standardized with a number of techniques. Standard Chi² and G tests of independence are provided. Using Monte Carlo methods, Chi² and G tests can be performed on tables with very small expected counts. A Chi² goodness of fit test (with externally determined expected values) can also be calculated. Measures of association include Yule's Q, Phi, Cramer's V and proportional reduction of error measures Tau and Lambda. 
Table standardization methods include median polish (Lewis 1986) and Mosteller (multiplicative) standardization as well as Haberman's z-score standardization for independent variables used by Grayson (1984) and Allison's binomial probability-based z-score standardization. It will also print row, column, and cell percents, Chi² cell contributions, and Chi² expected values. ",,,, -Dating and Demography,ARRANGE,Creates a probabilistic estimate of the range of site dates based on the proportions of dated ceramic types in the assemblage. Output includes a density plot against time. The program also calculates mean ceramic dates. This method is described in Steponaitis and Kintigh (1993).,,,,Unpackaged script: [mpeeples2008/Mean-Ceramic-Date-and-Error-Estimation](https://github.com/mpeeples2008/Mean-Ceramic-Date-and-Error-Estimation) +Dating and Demography,ARRANGE,Creates a probabilistic estimate of the range of site dates based on the proportions of dated ceramic types in the assemblage. Output includes a density plot against time. The program also calculates mean ceramic dates. This method is described in Steponaitis and Kintigh (1993).,Partially,[kairos](https://packages.tesselle.org/kairos/),kairos::mcd(),Unpackaged script: [mpeeples2008/Mean-Ceramic-Date-and-Error-Estimation](https://github.com/mpeeples2008/Mean-Ceramic-Date-and-Error-Estimation) Dating and Demography,C14,"provides a graphical way to analyze sets of radiocarbon dates. Each radiocarbon date is treated not as a single point in time but as a normally distributed probability with a mean and standard deviation given by the lab. In evaluating several dates, for each interval the probability distributions associated with the dates are summed. For each temporal interval, an expected number of dates is calculated and plotted in a histogram.",Yes,[rcarbon](https://github.com/ahb108/rcarbon/),"rcarbon::plot(), rcarbon::spd()",Also [stratigraphr](http://stratigraphr.joeroe.io/) for tidy alternatives. 
Dating and Demography,CALCULATE_K,Calculates K for for use in Cowgill's formula that estimates the span of true interval producing an observed set of measured dates with Gaussian errors. It calculates the value of K for any standard deviation of a Normal Distribution. See Cowgill and Kintigh (2020).,No,,,Pascal source available: [kintigh/phaselen](https://github.com/kintigh/phaselen) Dating and Demography,DSPLIT,Compares and combines radiocarbon samples using the procedure published in Archaeometry by Wilson and Ward (1981).,,,, From 47985494f01da8a5ec4b4b21d6eba4396961682d Mon Sep 17 00:00:00 2001 From: nfrerebeau Date: Thu, 8 Aug 2024 13:08:49 +0200 Subject: [PATCH 4/5] Fix table creation --- README.Rmd | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/README.Rmd b/README.Rmd index 46c80af..41da834 100644 --- a/README.Rmd +++ b/README.Rmd @@ -76,12 +76,13 @@ tfqar %>% r_package = "R package(s)", r_function = "R function(s)", notes = "Notes") %>% - fmt_missing(everything(), missing_text = "") %>% - text_transform(cells_body(vars(tfqa_description)), md_collapse) %>% - text_transform(cells_body(vars(r_function)), plain_list) %>% - fmt_markdown(vars(r_package, notes)) %>% + sub_missing(everything(), missing_text = "") %>% + text_transform(md_collapse, cells_body(tfqa_description)) %>% + text_transform(plain_list, cells_body(r_function)) %>% + fmt_markdown(c(r_package, notes)) %>% tab_source_note(md("TFQA program descriptions copied from ")) %>% tab_style(cell_text(weight = "bold"), list(cells_column_labels(everything()), cells_row_groups())) %>% - tab_style(cell_text(v_align = "top"), cells_body()) + tab_style(cell_text(v_align = "top"), cells_body()) %>% + as_raw_html() ``` \ No newline at end of file From 6950690630c68ed298a1686513b4d5bc92ff65ee Mon Sep 17 00:00:00 2001 From: nfrerebeau Date: Thu, 8 Aug 2024 13:30:32 +0200 Subject: [PATCH 5/5] Knit README --- README.md | 1759 +++++++++++------------------------------------------ 1 file 
changed, 356 insertions(+), 1403 deletions(-) diff --git a/README.md b/README.md index 3d4c8c3..9261b45 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ Tools for Quantitative Archaeology – in R ================ -2020-11-06 +2024-08-08 @@ -42,1406 +42,359 @@ request, or [opening an issue](/sslarch/tfqar/issues) with suggestions. Generated from [tfqar.csv](/sslarch/tfqar/blob/main/tfqar.csv). - -
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
-TFQA Program - -Description - -Available in R? - -R package(s) - -R function(s) - -Notes -
-Spatial Analysis -
-CONTIG - -
- -Monte Carlo evaluation of the statistical significance of the observed… - -Monte Carlo evaluation of the statistical significance of the observed -degree of contiguity of grid units assigned to the same cluster. -
-
- - - -
-FISHER - -Calculates Fisher’s Exact test - -Yes - - -
- -

-stats -

- +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
TFQA ProgramDescriptionAvailable in R?R package(s)R function(s)Notes
Spatial Analysis
CONTIG
Monte Carlo evaluation of the statistical significance of the observed...Monte Carlo evaluation of the statistical significance of the observed degree of contiguity of grid units assigned to the same cluster.
+




FISHERCalculates Fisher's Exact testYesstatsstats::fisher_test()
GRID
Aggregates point-provenience data into counts by type for each grid...Aggregates point-provenience data into counts by type for each grid unit.
+
Yessfsf::st_join()Tutorial
HOAComputes Hodder and Okell's A and dispersion ratiosYesGmAMiscGmAMisc::Aindex()
KMEANS
Performs k-means cluster analysis with extensive output designed to facilitate...Performs k-means cluster analysis with extensive output designed to facilitate interpretation. The program can be used to cluster analyze any data set, but has special features developed for use in archaeological spatial analysis. In particular, Kintigh and Ammerman's (1982) k-means pure locational clustering method can be performed. The program also executes the clustering for Whallon's (1984) unconstrained clustering method on data smoothed using the GRID or LDEN programs.
+



Unpackaged script: mpeeples2008/Kmeans
KMPLT
Plots the SSE and (2 dimensional) cluster configuration results of...Plots the SSE and (2 dimensional) cluster configuration results of KMEANS on screen and creates hard-copy publishable quality plots
+



Unpackaged script: mpeeples2008/Kmeans
KOETJE
Performs the Monte Carlo analysis of homogeneity of cluster configurations...Performs the Monte Carlo analysis of homogeneity of cluster configurations as suggested by Koetje (1987).
+




LDEN
Performs Johnson's (1984) Local Density Analysis on point-provenienced or grid...Performs Johnson's (1984) Local Density Analysis on point-provenienced or grid data. The program also outputs counts or percentages of points of different types that occur within a circular neighborhood around each data point.
+




LDPLT
Plots selected local density coefficients computed by LDEN against radius,...Plots selected local density coefficients computed by LDEN against radius, so behavior of coefficients for different pairs of classes can be easily observed over a range of radii
+




NEIG
An efficient, general-purpose nearest-neighbor (Whallon 1984) and gravity model program...An efficient, general-purpose nearest-neighbor (Whallon 1984) and gravity model program useful for intrasite spatial analysis or regional analysis. It allows categorization of items by class (e.g. site type or tool type) and permits the calculation of within or between class neighbors.
+




RANDPT
Generates random sets of coordinates, including for clumped distributions with...Generates random sets of coordinates, including for clumped distributions with different parameters. Also random walks any number of points in an existing distribution with arbitrary number of steps and step length.
+
Partiallyspatstatspatstat::rpoint()
spatstat::runifpoint()
spatstat::rpoispp()
Not sure about the “random walk” part.
Diversity
BOONE
Calculates, for a set of proveniences with counts by artifact...Calculates, for a set of proveniences with counts by artifact class, Boone's (1987) assemblage heterogeneity measure and related values.
+
Yestabulatabula::index_boone()Not sure about the “related values” part.
DIVERS
Calculates richness and evenness (H/Hmax) dimensions of diversity for a...Calculates richness and evenness (H/Hmax) dimensions of diversity for a given data set and uses Monte-Carlo methods to derive expected diversity for a model distribution over a range of sample sizes (Kintigh 1984, 1989).
+
Yestabulatabula::simulate()
DIVMEAS
Calculates several diversity measures including Richness, Simpson's, Shannon's, Brillouin's, and...Calculates several diversity measures including Richness, Simpson's, Shannon's, Brillouin's, and the Renyi and Delta families of generalized diversity measures for any given distribution of counts.
+
Yestabula, vegantabula::heterogeneity()
tabula::evenness()
tabula::richness()
tabula::composition()
vegan::renyi()

DIVPLT
Plots the results of DIVERS on screen and creates publishable...Plots the results of DIVERS on screen and creates publishable quality plots
+
Yestabulatabula::plot()
EVALC
Performs a Monte Carlo evaluation of the significance of an...Performs a Monte Carlo evaluation of the significance of an observed value of Simpson's C measure of diversity relative to a given assumption about the population.
+




RAREFY
Performs rarefaction analysis for sets of sample counts in a...Performs rarefaction analysis for sets of sample counts in a CSV file as described by Baxter (2001). Provides expected richness, standard deviation of the expected, Z score, and probability for each larger sample to every smaller sample size. Also outputs expected richness for each sample up to its sample size for graphing.
+
YEStabulatabula::rarefaction()
Distance
BAYES
This program implements Bayesian methods for proportions as described by...This program implements Bayesian methods for proportions as described by Iversen (1984). Intervals are calculated and graphed for Bayesian estimates of proportions based on both flat and informative priors.
+




BINOMIALComputes binomial probabilities and population proportion intervals for a sample.



BRSAMPLE
Provides a Monte Carlo estimate of the sampling error of...Provides a Monte Carlo estimate of the sampling error of differences of the Brainerd Robinson coefficient calculated between a sample and a known population or between two samples drawn from the same population
+




CLCA
Performs a Complete Linkage Cluster Analysis on up to 180...Performs a Complete Linkage Cluster Analysis on up to 180 cases. It takes as input an upper triangular distance matrix, as is created by the DIST program. As output, it lists the sequence of item/cluster joins and fusion values but does not create a dendrogram.
+




DIST
Computes a triangular matrix of distance or similarity measures: Euclidean...Computes a triangular matrix of distance or similarity measures: Euclidean Distance, Pearson's r, Brainerd-Robinson Coefficient, Jaccard's Coefficient, Simple Matching Coefficient, and Gower Coefficient.
+
Partiallytabula, vegantabula::similarity()
vegan::vegdist()
tabula implements Brainerd-Robinson, Jaccard; vegan implements Euclidean, Jaccard, and Gower distances.
FORDPlots a publishable quality battleship curve (Ford) diagramYestabulatabula::plot_ford()
POISSONComputes Poisson and negative binomial probabilities, given expected counts.



resampleBRED
Provide Monte Carlo estimates of the sampling error of differences...Provide Monte Carlo estimates of the sampling error of differences of the Brainerd-Robinson and Euclidean Distance coefficients calculated between a sample and a known population or between two samples drawn from the same population, as described and applied in Deboer et al. (1996).
+




TWOWAY
Provides tests of independence and measures of association and prints...Provides tests of independence and measures of association and prints tables that have been standardized with a number of techniques. Standard Chi² and G tests of independence are provided. Using Monte Carlo methods, Chi² and G tests can be performed on tables with very small expected counts. A Chi² goodness of fit test (with externally determined expected values) can also be calculated. Measures of association include Yule's Q, Phi, Cramer's V and proportional reduction of error measures Tau and Lambda. Table standardization methods include median polish (Lewis 1986) and Mosteller (multiplicative) standardization as well as Haberman's z-score standardization for independent variables used by Grayson (1984) and Allison's binomial probability-based z-score standardization. It will also print row, column, and cell percents, Chi² cell contributions, and Chi² expected values.
+




Dating and Demography
ARRANGE
Creates a probabilistic estimate of the range of site dates...Creates a probabilistic estimate of the range of site dates based on the proportions of dated ceramic types in the assemblage. Output includes a density plot against time. The program also calculates mean ceramic dates. This method is described in Steponaitis and Kintigh (1993).
+
Partiallykairoskairos::mcd()Unpackaged script: mpeeples2008/Mean-Ceramic-Date-and-Error-Estimation
C14
provides a graphical way to analyze sets of radiocarbon dates....provides a graphical way to analyze sets of radiocarbon dates. Each radiocarbon date is treated not as a single point in time but as a normally distributed probability with a mean and standard deviation given by the lab. In evaluating several dates, for each interval the probability distributions associated with the dates are summed. For each temporal interval, an expected number of dates is calculated and plotted in a histogram.
+
Yesrcarbonrcarbon::plot()
rcarbon::spd()
Also stratigraphr for tidy alternatives.
CALCULATE_K
Calculates K for for use in Cowgill's formula that estimates...Calculates K for for use in Cowgill's formula that estimates the span of true interval producing an observed set of measured dates with Gaussian errors. It calculates the value of K for any standard deviation of a Normal Distribution. See Cowgill and Kintigh (2020).
+
No

Pascal source available: kintigh/phaselen
DSPLIT
Compares and combines radiocarbon samples using the procedure published in...Compares and combines radiocarbon samples using the procedure published in Archaeometry by Wilson and Ward (1981).
+




MATCHINTERVAL
Performs a MonteCarlo evaluation of the correspondence between temporal intervals...Performs a MonteCarlo evaluation of the correspondence between temporal intervals with extreme climate events and the occurrence dates of major cultural changes as described and applied by Kintigh & Ingram (2018).
+




PHASELEN
Provides a Monte Carlo analysis to estimate the span of...Provides a Monte Carlo analysis to estimate the span of true span producing an observed set of measured dates with Gaussian errors such as radiocarbon and obsidian hydration dates. The program has an option for calibration.
+
No

Pascal source available: kintigh/phaselen
ROOMACCUM
Estimates within-period rates of population growth (or decline) given structure...Estimates within-period rates of population growth (or decline) given structure counts dated to a sequence of chronological periods as described and applied by Kintigh and Peeples (2020). It assumes a knowledge of the number of structures that date to each specific period, the period lengths, and an estimated structure use life. The population growth rate estimates are derived by simulating the construction (due to replacement and population growth) and abandonment (due to the completion of the use life or population decline) of individual structures such that the observed number of rooms dating to a period matches the simulated number of rooms.
+
No

Pascal source available: kintigh/RoomAccum.
Subsurface Testing
PLACESTP
Calculates the optimal placement of test units in a rectangular...Calculates the optimal placement of test units in a rectangular or linear survey area. For a user-specified number of survey transects (or user-specified lengthwise and width-wise spacing of test units), in any one of three basic configurations, the program will print out the coordinates of the optimal test unit placement, along with some statistics about the largest circular site that can go unsampled in the survey area. This program implements the formulae provided by Krakker, Shott, and Welch (1983) and revised in Kintigh (1988).
+
No

Could be implemented in fieldwalkr
STP
Probabilistic evaluation of subsurface testing designs as described in Kintigh...Probabilistic evaluation of subsurface testing designs as described in Kintigh 1988. STP uses Monte-Carlo methods to evaluate the effectiveness of a test unit layout within a survey area to locate sites with a given size and artifact density.
+




Utility
ADFUTIL
Generates random data sets and manipulates files in the data...Generates random data sets and manipulates files in the data format used by the analysis programs. It allows the creation of random data set of any size. Variables may be uniform or normally distributed variables with user specified ranges or means standard deviations. ADFUTIL allows the deletion of columns (variables), selective deletion of rows (observations) based on values in a column, replacement of values in a column, randomization of columns for Monte Carlo analysis, the addition of new columns from another data set, and selection of a random sample of cases.
+




CNTCNV
Program to speed data input and increase entry accuracy for...Program to speed data input and increase entry accuracy for count data, where the number of categories is large relative to the number of items counted for an observation (e.g. surface collection counts of 40 ceramic type divided into 8 vessel forms). It permits a highly abbreviated input format but it writes out a standard matrix (of the sort read by most analysis programs) with one count per category of each observation. The program provides labeled printouts of the data and can perform elaborate aggregation of count categories and simple aggregation of observations.
+




CntEdit
CntEdit is a companion program to CNTCNV and can be...CntEdit is a companion program to CNTCNV and can be used to do global or selective substititions of row or column field values in a data file formatted for CNTCNV.
+




CntRefmt
CntRefmt is a companion program to CNTCNV that reformats row-column-count...CntRefmt is a companion program to CNTCNV that reformats row-column-count segments of records formatted for CntCnv, e.g, to make differently formatted files consistent or to change the spacing to make reading easier.
+




CONVSYS
Converts a SYSTAT internal format data file into a raw...Converts a SYSTAT internal format data file into a raw data file, a variable label file, and a case label file that can be used these and other programs that read free-format ASCII data. Works with versions 2.0 and above of SYSTAT, on files of any size.
+




HPPLOT
Provides a flexible user interface to a Hewlett Packard compatible...Provides a flexible user interface to a Hewlett Packard compatible plotters. Its can create a customized analysis graphics from a raw data file edited to include the plot commands.
+




MVC
Permits arbitrarily complex copying of sets of columns in an...Permits arbitrarily complex copying of sets of columns in an input record into sets of columns in an output record. It can extract data from fixed-format data records for use with analytical programs that require free format input. Files of any size can be processed.
+




SCAT
Produces screen and publishable quality scatter plots of variables. All...Produces screen and publishable quality scatter plots of variables. All points may be plotted with the same symbol, or different symbols can be plotted based on the value of a variable.
+
Yesggplot2ggplot2::geom_point()
SORTLINE
A general purpose sort utility, SORTLINE sorts fixed-format data files...A general purpose sort utility, SORTLINE sorts fixed-format data files of up to 32,767 lines into an order defined by any number of user-specified sort fields.
+
Yesdplyrdplyr::arrange()
SPLIT
Divides a large file into sections that can be recombined...Divides a large file into sections that can be recombined with the DOS COPY command. Thus, large hard disk file can be split and copied onto several floppies.
+




UNTAB
Replaces tabs and control characters in a file with blanks...Replaces tabs and control characters in a file with blanks so they can be used with analysis programs that require pure ASCII files (e.g. SYSTAT).
+




TFQA program descriptions copied from http://tfqa.com/programs.htm
- -
-stats::fisher\_test() - -
-GRID - -
- -Aggregates point-provenience data into counts by type for each grid… - -Aggregates point-provenience data into counts by type for each grid -unit. -
-
-Yes - - -
- -

-sf -

- -
- -
-sf::st\_join() - - -
- -

-Tutorial -

- -
- -
-HOA - -Computes Hodder and Okell’s A and dispersion ratios - -Yes - - -
- -

-GmAMisc -

- -
- -
-GmAMisc::Aindex() - -
-KMEANS - -
- -Performs k-means cluster analysis with extensive output designed to -facilitate… - -Performs k-means cluster analysis with extensive output designed to -facilitate interpretation. The program can be used to cluster analyze -any data set, but has special features developed for use in -archaeological spatial analysis. In particular, Kintigh and Ammerman’s -(1982) k-means pure locational clustering method can be performed. The -program also executes the clustering for Whallon’s (1984) unconstrained -clustering method on data smoothed using the GRID or LDEN programs. -
-
- - - - -
- -

-Unpackaged script: -mpeeples2008/Kmeans -

- -
- -
-KMPLT - -
- -Plots the SSE and (2 dimensional) cluster configuration results of… - -Plots the SSE and (2 dimensional) cluster configuration results of -KMEANS on screen and creates hard-copy publishable quality plots -
-
- - - - -
- -

-Unpackaged script: -mpeeples2008/Kmeans -

- -
- -
-KOETJE - -
- -Performs the Monte Carlo analysis of homogeneity of cluster -configurations… - -Performs the Monte Carlo analysis of homogeneity of cluster -configurations as suggested by Koetje (1987). -
-
- - - -
-LDEN - -
- -Performs Johnson’s (1984) Local Density Analysis on point-provenienced -or grid… - -Performs Johnson’s (1984) Local Density Analysis on point-provenienced -or grid data. The program also outputs counts or percentages of points -of different types that occur within a circular neighborhood around each -data point. -
-
- - - -
-LDPLT - -
- -Plots selected local density coefficients computed by LDEN against -radius,… - -Plots selected local density coefficients computed by LDEN against -radius, so behavior of coefficients for different pairs of classes can -be easily observed over a range of radii -
-
- - - -
-NEIG - -
- -An efficient, general-purpose nearest-neighbor (Whallon 1984) and -gravity model program… - -An efficient, general-purpose nearest-neighbor (Whallon 1984) and -gravity model program useful for intrasite spatial analysis or regional -analysis. It allows categorization of items by class (e.g. site type or -tool type) and permits the calculation of within or between class -neighbors. -
-
- - - -
-RANDPT - -
- -Generates random sets of coordinates, including for clumped -distributions with… - -Generates random sets of coordinates, including for clumped -distributions with different parameters. Also random walks any number of -points in an existing distribution with arbitrary number of steps and -step length. -
-
-Partially - - -
- -

-spatstat -

- -
- -
-spatstat::rpoint()
spatstat::runifpoint()
spatstat::rpoispp() -
- -
- -

-Not sure about the “random walk” part. -

- -
- -
-Diversity -
-BOONE - -
- -Calculates, for a set of proveniences with counts by artifact… - -Calculates, for a set of proveniences with counts by artifact class, -Boone’s (1987) assemblage heterogeneity measure and related values. -
-
- - - -
-DIVERS - -
- -Calculates richness and evenness (H/Hmax) dimensions of diversity for a… - -Calculates richness and evenness (H/Hmax) dimensions of diversity for a -given data set and uses Monte-Carlo methods to derive expected diversity -for a model distribution over a range of sample sizes (Kintigh 1984, -1989). -
-
- - - -
-DIVMEAS - -
- -Calculates several diversity measures including Richness, Simpson’s, -Shannon’s, Brillouin’s, and… - -Calculates several diversity measures including Richness, Simpson’s, -Shannon’s, Brillouin’s, and the Renyi and Delta families of generalized -diversity measures for any given distribution of counts. -
-
-Yes - - -
- -

-tabula, -vegan -

- -
- -
-tabula::index\_richness()
tabula::index\_heterogeneity()
vegan::renyi() -
- -
- -

-tabula is not currently available on CRAN -

- -
- -
-DIVPLT - -
- -Plots the results of DIVERS on screen and creates publishable… - -Plots the results of DIVERS on screen and creates publishable quality -plots -
-
- - - -
-EVALC - -
- -Performs a Monte Carlo evaluation of the significance of an… - -Performs a Monte Carlo evaluation of the significance of an observed -value of Simpson’s C measure of diversity relative to a given assumption -about the population. -
-
- - - -
-RAREFY - -
- -Performs rarefaction analysis for sets of sample counts in a… - -Performs rarefaction analysis for sets of sample counts in a CSV file as -described by Baxter (2001). Provides expected richness, standard -deviation of the expected, Z score, and probability for each larger -sample to every smaller sample size. Also outputs expected richness for -each sample up to its sample size for graphing. -
-
- - - -
-Distance -
-BAYES - -
- -This program implements Bayesian methods for proportions as described -by… - -This program implements Bayesian methods for proportions as described by -Iversen (1984). Intervals are calculated and graphed for Bayesian -estimates of proportions based on both flat and informative priors. -
-
- - - -
-BINOMIAL - -Computes binomial probabilities and population proportion intervals for -a sample. - - - - -
-BRSAMPLE - -
- -Provides a Monte Carlo estimate of the sampling error of… - -Provides a Monte Carlo estimate of the sampling error of differences of -the Brainerd Robinson coefficient calculated between a sample and a -known population or between two samples drawn from the same population -
-
- - - -
-CLCA - -
- -Performs a Complete Linkage Cluster Analysis on up to 180… - -Performs a Complete Linkage Cluster Analysis on up to 180 cases. It -takes as input an upper triangular distance matrix, as is created by the -DIST program. As output, it lists the sequence of item/cluster joins and -fusion values but does not create a dendrogram. -
-
- - - -
-DIST - -
- -Computes a triangular matrix of distance or similarity measures: -Euclidean… - -Computes a triangular matrix of distance or similarity measures: -Euclidean Distance, Pearson’s r, Brainerd-Robinson Coefficient, -Jaccard’s Coefficient, Simple Matching Coefficient, and Gower -Coefficient. -
-
-Partially - - -
- -

-vegan -

- -
- -
-vegan::vegdist() - - -
- -

-vegan implements Euclidean, Jaccard, and Gower distances. -

- -
- -
-FORD - -Plots a publishable quality battleship curve (Ford) diagram - -Yes - - -
- -

-tabula -

- -
- -
-tabula::plot\_ford() - - -
- -

-tabula is not currently available on CRAN -

- -
- -
-POISSON - -Computes Poisson and negative binomial probabilities, given expected -counts. - - - - -
-resampleBRED - -
- -Provide Monte Carlo estimates of the sampling error of differences… - -Provide Monte Carlo estimates of the sampling error of differences of -the Brainerd-Robinson and Euclidean Distance coefficients calculated -between a sample and a known population or between two samples drawn -from the same population, as described and applied in Deboer et -al. (1996). -
-
- - - -
-TWOWAY - -
- -Provides tests of independence and measures of association and prints… - -Provides tests of independence and measures of association and prints -tables that have been standardized with a number of techniques. Standard -Chi² and G tests of independence are provided. Using Monte Carlo -methods, Chi² and G tests can be performed on tables with very small -expected counts. A Chi² goodness of fit test (with externally determined -expected values) can also be calculated. Measures of association include -Yule’s Q, Phi, Cramer’s V and proportional reduction of error measures -Tau and Lambda. Table standardization methods include median polish -(Lewis 1986) and Mosteller (multiplicative) standardization as well as -Haberman’s z-score standardization for independent variables used by -Grayson (1984) and Allison’s binomial probability-based z-score -standardization. It will also print row, column, and cell percents, Chi² -cell contributions, and Chi² expected values. -
-
- - - -
-Dating and Demography -
-ARRANGE - -
- -Creates a probabilistic estimate of the range of site dates… - -Creates a probabilistic estimate of the range of site dates based on the -proportions of dated ceramic types in the assemblage. Output includes a -density plot against time. The program also calculates mean ceramic -dates. This method is described in Steponaitis and Kintigh (1993). -
-
- - - - - - -
-C14 - -
- -Provides a graphical way to analyze sets of radiocarbon dates… - -Provides a graphical way to analyze sets of radiocarbon dates. Each -radiocarbon date is treated not as a single point in time but as a -normally distributed probability with a mean and standard deviation -given by the lab. In evaluating several dates, for each interval the -probability distributions associated with the dates are summed. For each -temporal interval, an expected number of dates is calculated and plotted -in a histogram. -
-
-Yes - - -
- -

-rcarbon -

- -
- -
-rcarbon::plot()
rcarbon::spd() -
- -
- -

-Also stratigraphr for tidy -alternatives. -

- -
- -
-CALCULATE\_K - -
- -Calculates K for use in Cowgill’s formula that estimates… - -Calculates K for use in Cowgill’s formula that estimates the span of -the true interval producing an observed set of measured dates with Gaussian -errors. It calculates the value of K for any standard deviation of a -Normal Distribution. See Cowgill and Kintigh (2020). -
-
-No - - - - -
- -

-Pascal source available: -kintigh/phaselen -

- -
- -
-DSPLIT - -
- -Compares and combines radiocarbon samples using the procedure published -in… - -Compares and combines radiocarbon samples using the procedure published -in Archaeometry by Wilson and Ward (1981). -
-
- - - -
-MATCHINTERVAL - -
- -Performs a Monte Carlo evaluation of the correspondence between temporal -intervals… - -Performs a Monte Carlo evaluation of the correspondence between temporal -intervals with extreme climate events and the occurrence dates of major -cultural changes as described and applied by Kintigh & Ingram (2018). -
-
- - - -
-PHASELEN - -
- -Provides a Monte Carlo analysis to estimate the span of… - -Provides a Monte Carlo analysis to estimate the span of the true interval -producing an observed set of measured dates with Gaussian errors such as -radiocarbon and obsidian hydration dates. The program has an option for -calibration. -
-
-No - - - - -
- -

-Pascal source available: -kintigh/phaselen -

- -
- -
-ROOMACCUM - -
- -Estimates within-period rates of population growth (or decline) given -structure… - -Estimates within-period rates of population growth (or decline) given -structure counts dated to a sequence of chronological periods as -described and applied by Kintigh and Peeples (2020). It assumes a -knowledge of the number of structures that date to each specific period, -the period lengths, and an estimated structure use life. The population -growth rate estimates are derived by simulating the construction (due to -replacement and population growth) and abandonment (due to the -completion of the use life or population decline) of individual -structures such that the observed number of rooms dating to a period -matches the simulated number of rooms. -
-
-No - - - - -
- -

-Pascal source available: -kintigh/RoomAccum. -

- -
- -
-Subsurface Testing -
-PLACESTP - -
- -Calculates the optimal placement of test units in a rectangular… - -Calculates the optimal placement of test units in a rectangular or -linear survey area. For a user-specified number of survey transects (or -user-specified lengthwise and width-wise spacing of test units), in any -one of three basic configurations, the program will print out the -coordinates of the optimal test unit placement, along with some -statistics about the largest circular site that can go unsampled in the -survey area. This program implements the formulae provided by Krakker, -Shott, and Welch (1983) and revised in Kintigh (1988). -
-
-No - - - - -
- -

-Could be implemented in -fieldwalkr -

- -
- -
-STP - -
- -Probabilistic evaluation of subsurface testing designs as described in -Kintigh… - -Probabilistic evaluation of subsurface testing designs as described in -Kintigh 1988. STP uses Monte-Carlo methods to evaluate the effectiveness -of a test unit layout within a survey area to locate sites with a given -size and artifact density. -
-
- - - -
-Utility -
-ADFUTIL - -
- -Generates random data sets and manipulates files in the data… - -Generates random data sets and manipulates files in the data format used -by the analysis programs. It allows the creation of random data sets of -any size. Variables may be uniformly or normally distributed, -with user-specified ranges or means and standard deviations. ADFUTIL allows -the deletion of columns (variables), selective deletion of rows -(observations) based on values in a column, replacement of values in a -column, randomization of columns for Monte Carlo analysis, the addition -of new columns from another data set, and selection of a random sample -of cases. -
-
- - - -
-CNTCNV - -
- -Program to speed data input and increase entry accuracy for… - -Program to speed data input and increase entry accuracy for count data, -where the number of categories is large relative to the number of items -counted for an observation (e.g. surface collection counts of 40 ceramic -types divided into 8 vessel forms). It permits a highly abbreviated input -format but it writes out a standard matrix (of the sort read by most -analysis programs) with one count per category of each observation. The -program provides labeled printouts of the data and can perform elaborate -aggregation of count categories and simple aggregation of observations. -
-
- - - -
-CntEdit - -
- -CntEdit is a companion program to CNTCNV and can be… - -CntEdit is a companion program to CNTCNV and can be used to do global or -selective substitutions of row or column field values in a data file -formatted for CNTCNV. -
-
- - - -
-CntRefmt - -
- -CntRefmt is a companion program to CNTCNV that reformats -row-column-count… - -CntRefmt is a companion program to CNTCNV that reformats -row-column-count segments of records formatted for CNTCNV, e.g., to make -differently formatted files consistent or to change the spacing to make -reading easier. -
-
- - - -
-CONVSYS - -
- -Converts a SYSTAT internal format data file into a raw… - -Converts a SYSTAT internal format data file into a raw data file, a -variable label file, and a case label file that can be used by these and -other programs that read free-format ASCII data. Works with versions 2.0 -and above of SYSTAT, on files of any size. -
-
- - - -
-HPPLOT - -
- -Provides a flexible user interface to Hewlett Packard compatible… - -Provides a flexible user interface to Hewlett Packard compatible -plotters. It can create customized analysis graphics from a raw data -file edited to include the plot commands. -
-
- - - -
-MVC - -
- -Permits arbitrarily complex copying of sets of columns in an… - -Permits arbitrarily complex copying of sets of columns in an input -record into sets of columns in an output record. It can extract data -from fixed-format data records for use with analytical programs that -require free format input. Files of any size can be processed. -
-
- - - -
-SCAT - -
- -Produces screen and publishable quality scatter plots of variables. All… - -Produces screen and publishable quality scatter plots of variables. All -points may be plotted with the same symbol, or different symbols can be -plotted based on the value of a variable. -
-
-Yes - - -
- -

-ggplot2 -

- -
- -
-ggplot2::geom\_point() - -
-SORTLINE - -
- -A general purpose sort utility, SORTLINE sorts fixed-format data files… - -A general purpose sort utility, SORTLINE sorts fixed-format data files -of up to 32,767 lines into an order defined by any number of -user-specified sort fields. -
-
-Yes - - -
- -

-dplyr -

- -
- -
-dplyr::arrange() - -
-SPLIT - -
- -Divides a large file into sections that can be recombined… - -Divides a large file into sections that can be recombined with the DOS -COPY command. Thus, a large hard disk file can be split and copied onto -several floppies. -
-
- - - -
-UNTAB - -
- -Replaces tabs and control characters in a file with blanks… - -Replaces tabs and control characters in a file with blanks so they can -be used with analysis programs that require pure ASCII files -(e.g. SYSTAT). -
-
- - - -
-TFQA program descriptions copied from -http://tfqa.com/programs.htm -
- -
- -