## CRAN Task View: Archaeological Science

Maintainer: Ben Marwick | Contact: benmarwick at gmail.com | Version: 2016-09-16

This CRAN Task View contains a list of packages useful for scientific work in Archaeology, grouped by topic. Note that this is not an official CRAN Task View, just one I have prepared for my own convenience, so it includes some packages only on GitHub and other non-CRAN resources I find useful. Many of the most highly recommended packages listed here can be installed in a single step by installing the tidyverse package.

Besides these packages, a very wide variety of functions suitable for scientific work in Archaeology is provided by both the basic R system (and its set of recommended core packages), and a number of other packages on the Comprehensive R Archive Network (CRAN) and GitHub. Consequently, several of the other CRAN Task Views may contain suitable packages, in particular the Social Sciences, Spatial, Spatio-temporal, Cluster analysis, Multivariate Statistics, Bayesian inference, Visualization, and Reproducible research Task Views.

Contributions to this Task View are always welcome, and encouraged. The source file for this particular task view file resides in a GitHub repository (see below), and pull requests are the preferred method for contributions.

**Data acquisition**

- The ideal method is to export your spreadsheets from Excel (or whatever program you made them in) as CSV files (comma-separated values, a simple, non-proprietary, plain-text format that is very transparent, being human-readable, easily machine-processable and suitable for archival storage) and read them into R using the base function `read.csv()`. Data in other types of plain text files can be read in with `read.table()`.
- To read Microsoft Excel files into R there are a number of packages: readxl (requires Rcpp, which in turn requires Rtools for Windows or Xcode for OSX), gdata (requires Perl), openxlsx (requires Rcpp, which in turn requires Rtools for Windows or Xcode for OSX), XLConnect (requires rJava and Java), xlsx (also requires rJava and Java). OpenDocument Spreadsheet files can be read into R using readODS.
- For working with untidy and/or complex Excel spreadsheets (i.e. many tables in one sheet, coloured cells, cells with formulas, etc.), use jailbreakr, xlsxtractr, and tidyxl
- Text data (as in sentences and paragraphs) can be read in and analysed with the tm package. If the text is in a PDF file, use pdftools.
- For quickly reading in a very large number of CSV files, or very large CSV files, use `fread()` from the data.table package or functions in the readr package.
- RODBC, RMySQL, RPostgreSQL and RSQLite connect R to SQL databases. RODBC can also connect to Microsoft Access databases.
- The haven and foreign packages can be used for reading and writing files from certain versions of Minitab, S, SAS, SPSS, Stata, Systat and Weka.
- ESRI shapefiles can be read using rgdal or maptools
- R can receive data directly from the web using httr, XML, jsonlite, rvest and RSelenium (requires Selenium 2.0 Remote WebDriver). R can be programmed as a web scraper using rvest and/or RSelenium. The Web Technologies task view gives more details.
- Google spreadsheets can be read into R using the googlesheets package
- Tables can be read directly from Microsoft Word documents with docxtractr, and from PDF documents with tabulizer
- Datasets from the Open Context repository can be browsed and read into R using the opencontext package
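
As a minimal sketch of the CSV route recommended above, the following writes a small, entirely hypothetical table of artefact counts to a temporary file and reads it back with `read.csv()`. The column names and values are invented for illustration only.

```r
# Hypothetical example data: write a small CSV to a temporary file,
# then read it back into a data frame with read.csv().
csv_path <- tempfile(fileext = ".csv")
writeLines(c("context,artefact_type,count",
             "A1,flake,12",
             "A1,core,3",
             "B2,flake,7"),
           csv_path)

# stringsAsFactors = FALSE keeps text columns as character vectors
artefacts <- read.csv(csv_path, stringsAsFactors = FALSE)

# Inspect the structure: three rows, three columns
str(artefacts)
```

For tab- or space-delimited text files, `read.table()` takes the same file path plus `header` and `sep` arguments.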

**Data manipulation**

- dplyr and data.table for splitting the data up by groups, applying some common or custom functions, and combining the output back into a convenient form (i.e. typical aggregation, splitting and summarising operations). Both packages are fast on very large datasets.
- tidyr for rearranging the data from long to wide forms, and more complex reshaping.
- purrr simplifies working with lists, applying functions to list elements and collecting the results
- broom takes the output of many built-in functions in R, such as `lm` or `t.test`, and turns them into tidy data frames.
- measurements, convertr, and units convert between metric and imperial units, or calculate a dimension’s unknown value from other dimensions’ measurements.
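
The split-apply-combine pattern that dplyr and data.table are listed for can be illustrated without installing anything. The sketch below uses base R's `aggregate()` as a stand-in (it is not the dplyr or data.table API) on a hypothetical data frame of flake lengths.

```r
# Hypothetical measurements: flake lengths (mm) from two sites
lithics <- data.frame(
  site   = c("A", "A", "A", "B", "B"),
  length = c(25, 31, 28, 40, 44)
)

# Split by site, apply mean() to each group, combine into one data frame
mean_length <- aggregate(length ~ site, data = lithics, FUN = mean)
mean_length
```

dplyr and data.table express the same grouping and summarising step in their own syntax, and are considerably faster on large tables.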

**Visualising data**

- ggplot2 produces a very wide variety of attractive plots with a highly flexible and logical syntax.
- Extensions include ggbiplot (PCA biplots with ellipses), GGally (plot matrices), ggtern (ternary plots), gridExtra (arranging multiple plots, see also wq), ggfortify (many methods for plotting PCA, clustering, linear model output, etc., using ggplot2), ggalt (more geoms, coords, stats, scales and fonts, including splines, 1d and 2d densities), waffle (for square pie charts), ggraph for treemaps, and ggrepel for moving overlapping text labels away from each other.
- For showing distributions across several categories: ggbeeswarm, vipor, sinaplot
- plotly and ggiraph make ggplots interactive with mouse-over pop-ups, zooming, click-actions, etc. scatterD3 makes highly interactive scatter plots
- circlize implements Circos in R for circular and chord plots. Rose plots can be made with ggplot2, and Schmidt diagrams can be made with the `net` function in RFOC or the `Stereo*` functions in RockFab.
- plotrix has the function `battleship.plot()` to make Ford’s battleship diagrams.
- ggmap combines the spatial information of static maps from Google Maps, OpenStreetMap, Stamen Maps or CloudMade Maps with the layered grammar of graphics implementation of ggplot2
- Stratigraphic data plots can be drawn using the `Stratiplot()` function in analogue and the functions `strat.plot()` and `strat.plot.simple` in the rioja package. The rioja package also includes `chclust()` for stratigraphically constrained clustering, and related dendrogram plotting methods.
- rgl and plotly for interactive 3D plots, and scatterplot3d also draws 3D point clouds.
- tabplot for exploratory data visualisation of tables
- For schematic diagrams, such as Harris matrices, DiagrammeR is useful.
- For colour schemes in plots: viridis for perfectly perceptually-uniform colours, RColorBrewer, wesanderson, and munsell for exploring and using the Munsell colour system, and for some extra themes for ggplot2, including some Tufte-inspired themes, see ggthemes.

**Analysis in general**

- Base R, especially the stats package, has a lot of functionality useful for analysing archaeological data. For example, `chisq.test()`, `prop.test()`, `binom.test()`, `t.test()`, `wilcox.test()`, `kruskal.test()`, `mcnemar.test()`, `cor.test()`, `power.t.test()`, `power.prop.test()` and `power.anova.test()`, among many others. Hmisc includes bootstrapping, setting confidence intervals, and power analysis functions; see also psych for useful descriptive statistics and visualizations.
- corrr contains many convenient functions for exploring correlations.
- Bayesian and resampling variants of these also exist, for example in the MCMCpack, BEST (requires JAGS), Bayesian First Aid (also requires JAGS) packages (see the Bayesian task view for more) and the coin, boot, and bootstrap packages (see also msm).
- For analysing change over time, bcp, changepoint, ecp, AnomalyDetection and BreakoutDetection provide functions for detecting distributional changes within time-ordered observations.
- abc provides functions for parameter estimation, model selection, and goodness-of-fit using approximate Bayesian computation (ABC).
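
To show how the base tests listed above are typically called, here is a small sketch comparing two simulated samples. The data are randomly generated and purely hypothetical (for example, artefact lengths from two sites); only functions from the stats package are used.

```r
set.seed(1)  # make the simulated data reproducible

# Hypothetical measurements from two sites
site_a <- rnorm(20, mean = 30, sd = 4)
site_b <- rnorm(20, mean = 34, sd = 4)

# Welch two-sample t-test (the default in t.test) and its
# rank-based counterpart, the Wilcoxon rank-sum test
tt <- t.test(site_a, site_b)
wt <- wilcox.test(site_a, site_b)
c(t_test = tt$p.value, wilcoxon = wt$p.value)

# Sample size per group needed to detect a difference of 4 units
# (sd = 4) with 80% power at the default 5% significance level
pw <- power.t.test(delta = 4, sd = 4, power = 0.8)
pw$n
```

Each test returns an object of class `htest`, so the p-value, statistic and confidence interval can be extracted by name rather than read off the printed output.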

**Analysis of categorical and count data**

- The `table()` function in the base package and the `xtabs()` and `ftable()` functions in the stats package construct contingency tables.
- The `chisq.test()` and `fisher.test()` functions in the stats package may be used to test for independence in two-way contingency tables.
- The `assocstats()` function in the vcd package computes the Pearson chi-squared test, the likelihood ratio chi-squared test, the phi coefficient, the contingency coefficient and Cramer’s V for plain or stratified contingency tables.
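
The following sketch strings the functions above together: it builds a two-way contingency table with `xtabs()` and tests it for independence. The counts are invented for illustration and do not come from any real assemblage.

```r
# Hypothetical counts of artefact types in two excavation areas
counts <- data.frame(
  area = c("north", "north", "south", "south"),
  type = c("lithic", "pottery", "lithic", "pottery"),
  n    = c(30, 10, 15, 25)
)

# Cross-tabulate the counts into a 2 x 2 contingency table
tab <- xtabs(n ~ area + type, data = counts)
tab

# Test for independence of area and artefact type
chi <- chisq.test(tab)   # applies Yates' continuity correction for 2 x 2 tables
fis <- fisher.test(tab)  # exact test, preferable when expected counts are small
c(chi_squared = chi$p.value, fisher = fis$p.value)
```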

**Linear, generalized linear models, and non-linear models**

- Linear models can be fitted (via OLS) with `lm()` (from stats). The modelr package has helper functions for pipeable modelling (e.g. cross-validation, bootstrapping). For data from a non-normal population, or when there are apparent outliers, lmPerm computes linear models using permutation tests.
- Bayesian fitting of linear and non-linear models is possible with rstanarm, brms, and rethinking.
- The `nls()` function (from stats) as well as the package minpack.lm allow the solution of nonlinear least squares problems.
- Correlated and/or unequal variances can be modeled using the `gnls()` function of the nlme package and by nlreg. The nlme package is supported by Pinheiro & Bates (2000) Mixed-effects Models in S and S-PLUS, Springer, New York.
- The generic `anova()` function in the stats package constructs sequential analysis of variance and analysis of deviance tables, and can compute F and likelihood-ratio tests for nested models. (It is typical for other classes of statistical models in R to have anova methods as well.) The generic `Anova()` function in the car package (associated with Fox, An R and S-PLUS Companion to Applied Regression, Sage, 2002) constructs so-called “Type-II” and “Type-III” tests for linear and generalized linear models.
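
As a rough illustration of fitting and comparing nested linear models with `lm()` and `anova()`, the sketch below simulates a hypothetical data set in which artefact mass depends on length and width. The variable names and coefficients are assumptions made up for the example.

```r
set.seed(42)  # reproducible simulated data

# Hypothetical artefact dimensions (mm) and mass (g)
d <- data.frame(length = runif(50, 20, 80),
                width  = runif(50, 10, 40))
d$mass <- 0.5 * d$length + 0.2 * d$width + rnorm(50, sd = 3)

# Two nested OLS models: the second adds width as a predictor
m1 <- lm(mass ~ length, data = d)
m2 <- lm(mass ~ length + width, data = d)

# F test of whether adding width improves the fit
cmp <- anova(m1, m2)
cmp

# Estimated coefficients of the larger model
coef(m2)
```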

**Multivariate statistics**

- The Cluster task view provides a more detailed discussion of available cluster analysis methods and appropriate R functions and packages.
- caret and FactoMineR are popular packages with a suite of multivariate methods
- aplpack provides `bagplot()` for bagplots (bivariate boxplots).
- `lda()` and `qda()` within MASS provide linear and quadratic discrimination respectively.

Model Testing and Validation

- caret provides many functions for model training, testing, and validation. There is also Max Kuhn’s excellent companion book called “Applied Predictive Modeling” (Springer, also available as an ebook). The mlr package also provides classification, regression, and machine learning methods.
- Other packages that enable tuning and evaluation of models include bootstrap and Hmisc
- The cvTools and DAAG packages include cross-validation functions for evaluating the optimality of tuning parameters, such as sample sizes or number of predictors, in statistical models

Hierarchical cluster analysis

- The package cluster provides functions for cluster analysis following the methods described in Kaufman and Rousseeuw (1990) Finding Groups in data: an introduction to cluster analysis, Wiley, New York
- There are also `hclust()` in the stats package and `hcluster()` in amap.
- pvclust is a package for assessing the uncertainty in hierarchical cluster analysis. It provides approximately unbiased p-values as well as bootstrap p-values. Enhanced plotting is also available through the dendextend package.
- dendextend offers a set of functions for extending dendrogram objects in R. It allows you both to adjust a tree’s graphical parameters (the color, size, type, etc. of its branches, nodes and labels) and to visually (and statistically) compare different dendrograms to one another.
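
A minimal sketch of hierarchical clustering with `hclust()` from the stats package follows. The ten objects are simulated as two well-separated hypothetical groups, so the expected grouping is known in advance; average linkage is an arbitrary choice for the example.

```r
set.seed(1)  # reproducible simulated data

# Two hypothetical groups of five objects, each measured on two variables
x <- rbind(matrix(rnorm(10, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(10, mean = 5, sd = 0.3), ncol = 2))

# Agglomerative clustering on Euclidean distances, average linkage
hc <- hclust(dist(x), method = "average")

# Cut the dendrogram into two groups and show the membership
groups <- cutree(hc, k = 2)
groups

# plot(hc) draws the dendrogram
```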

Other partitioning methods

- `kmeans()` in stats provides k-means clustering and `cmeans()` in e1071 implements a fuzzy version of the k-means algorithm. The recommended package cluster also provides functions for various partitioning methodologies.
- To compute the optimum number of clusters there is the `pamk()` function in the fpc package, `cascadeKM()` in vegan, `Mclust()` in mclust, and `apcluster()` in apcluster
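
For comparison with the hierarchical approach, here is a small sketch of k-means partitioning with `kmeans()` on simulated data. The two groups are hypothetical and deliberately well separated, and the number of clusters is assumed to be known.

```r
set.seed(2)  # reproducible simulated data and random starts

# Two hypothetical groups of 20 objects, each measured on two variables
x <- rbind(matrix(rnorm(40, mean = 0, sd = 0.5), ncol = 2),
           matrix(rnorm(40, mean = 4, sd = 0.5), ncol = 2))

# k-means with two centres; nstart = 10 repeats the algorithm from
# ten random starts and keeps the best solution
km <- kmeans(x, centers = 2, nstart = 10)

table(km$cluster)  # cluster sizes
km$centers         # estimated cluster centres
```

When the number of clusters is not known in advance, the functions listed above for choosing it are the place to start.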

Mixture models and model-based cluster analysis

Principal components and other projection, scaling, and ordination methods

- Principal Components Analysis (PCA) is available via the `prcomp()` function (based on svd). `rda()` (in package vegan), `pca()` (in package labdsv) and `dudi.pca()` (in package ade4) provide more ecologically-orientated implementations. Plotting of PCA output is available in ggbiplot and ggfortify.
- Redundancy Analysis (RDA) is available via `rda()` in vegan and `pcaiv()` in ade4.
- Canonical Correspondence Analysis (CCA) is implemented in `cca()` in both vegan and ade4.
- Detrended Correspondence Analysis (DCA) is implemented in `decorana()` in vegan.
- Principal coordinates analysis (PCO) is implemented in `dudi.pco()` in ade4, `pco()` in labdsv, `pco()` in ecodist, and `cmdscale()` in package stats.
- Non-Metric multi-Dimensional Scaling (NMDS) is provided by `isoMDS()` in package MASS and `nmds()` in ecodist. `nmds()`, a wrapper function for `isoMDS()`, is also provided by package labdsv. vegan provides helper function `metaMDS()` for `isoMDS()`, implementing random starts of the algorithm and standardised scaling of the NMDS results. The approach adopted by vegan with `metaMDS()` is the recommended approach for ecological data.
- `corresp()` and `mca()` in MASS provide simple and multiple correspondence analysis respectively. ca also provides single, multiple and joint correspondence analysis. `dudi.coa()` and `dudi.acm()` in ade4 provide correspondence and multiple correspondence analysis respectively, as well as adding homogeneous table analysis with `dudi.hta()`. Further functionality is also available within vegan, and co-correspondence analysis is available from cocorresp. FactoMineR provides `CA()` and `MCA()` which also enable simple and multiple correspondence analysis as well as associated graphical routines. CAinterprTools has functions for correspondence analysis and diagnostics.
- Seriation methods are available in seriation, which includes `bertinplot()` for producing battleship plots, and CAseriation which also has a battleship plotting function.
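
The sketch below runs a PCA with `prcomp()` and a principal coordinates analysis with `cmdscale()`, both from the stats package. The built-in `iris` measurements are used only as a convenient stand-in for a table of artefact measurements; nothing about them is archaeological.

```r
# Four numeric measurement columns from a built-in dataset
measurements <- iris[, 1:4]

# PCA on centred, unit-variance variables
pca <- prcomp(measurements, scale. = TRUE)

# Proportion of variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(var_explained, 3)

# Scores of the first few objects on the first two components
head(pca$x[, 1:2])

# Principal coordinates analysis on Euclidean distances between the
# same standardised variables; with Euclidean distances the first axis
# matches the first PCA axis up to an arbitrary sign
mds <- cmdscale(dist(scale(measurements)), k = 2)
head(mds)
```

With non-Euclidean dissimilarities, the two methods no longer coincide, which is where PCO and NMDS become useful in their own right.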

Dissimilarity coefficients

- `dist()` in standard package stats, `daisy()` in recommended package cluster, `vegdist()` in vegan, `dsvdis()` in labdsv, `Dist()` in amap, `distance()` in ecodist, and a suite of functions in ade4
- simba provides functions for the calculation of similarity and multiple plot similarity measures with binary data (for instance presence/absence data)
- `distance()` in the analogue package can be used to calculate dissimilarity between samples of one matrix and those of a second matrix. The same function can be used to produce pair-wise dissimilarity matrices, though the other functions listed above are faster. `distance()` can also be used to generate matrices based on Gower’s coefficient for mixed data (mixtures of binary, ordinal/nominal and continuous variables). Function `daisy()` in package cluster provides a faster implementation of Gower’s coefficient for mixed-mode data than `distance()` if a standard dissimilarity matrix is required. Function `gowdis()` in package FD also computes Gower’s coefficient and implements extensions to ordinal variables.
- DistatisR provides functions for three-way multidimensional scaling for the analysis of multiple distance/covariance matrices collected on the same set of observations.
- Simple and partial Mantel tests to compute the Mantel statistic as a matrix correlation between two dissimilarity matrices are available in vegan and ecodist
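
As a small worked example of a dissimilarity matrix, the sketch below applies `dist()` from the stats package to three hypothetical samples whose coordinates were chosen so the distances are easy to check by hand.

```r
# Three hypothetical samples described by two variables
m <- matrix(c(0, 0,
              3, 4,
              6, 8),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("s1", "s2", "s3"), c("x", "y")))

# Pairwise Euclidean distances (the default method)
d_euc <- dist(m)

# Pairwise Manhattan (city-block) distances
d_man <- dist(m, method = "manhattan")

# View the Euclidean distances as a full symmetric matrix
as.matrix(d_euc)
```

The resulting `dist` objects can be passed directly to functions such as `hclust()` and `cmdscale()`.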

**Making maps and using R as a Geographical Information System**

- Making maps: maps, rworldmap, mapdata, maptools, mapproj, ggplot2, ggmap, RgoogleMaps, cartography, RColorBrewer
- Scale bars and North arrows can be added to maps made with ggplot2 and ggmap using GISTools, ggsn or legendMap.
- Interactive mapping of spatial objects with zooming and panning is possible with leaflet and geomapview
- R has many packages that enable it to be used as a GIS for spatial analysis: sp, raster, rasterVis, shapefiles, spatial, spatstat, splancs, ipdw, geoR, argosfilter, ads, spdep, gstat, GISTools
- spgrass6 and rgrass7 provide facilities for using all GRASS geographical information system commands from the R command line. RQGIS establishes an interface between R and QGIS, i.e. it allows the user to access QGIS functionalities from within R.
- spdplyr provides methods for dplyr verbs for ‘sp’ and ‘Spatial’ class objects.
- rgdal uses the GDAL (Geospatial Data Abstraction Library) (raster) and OGR (vector) data I/O library, as well as PROJ.4 for CRS (coordinate reference systems) (re)projections
- rgeos uses the GEOS (Geometry Open Source) library, which powers PostGIS: does the ‘usual’ geometry operations for features
- The Spatial and Spatio Temporal task views have more details.

**Environmental & geological analysis**

- Transfer function models including weighted averaging (WA), modern analogue technique (MAT), Locally-weighted WA, & maximum likelihood (aka Gaussian logistic) regression (GLR) are provided by analogue, vegan, and rioja for stratigraphic analyses
- G2Sd gives full descriptive statistics and a physical description of sediments based on grain-size distributions, soiltexture and ggtern for ternary plots of soil texture
- Constrained clustering of stratigraphic data is provided by function `chclust()` in the form of constrained hierarchical clustering in rioja.
- Stratigraphic columns can be plotted and analysed with the SDAR package.
- Benn diagrams can be drawn with plotrix and Woodcock diagrams with RFOC.
- Functions for circular statistics, such as the Rayleigh test and many others, can be found in CircStats, RFOC, circular, Directional, and heR.Misc
- The siar package takes data on organism isotopes and fits a Bayesian model to their dietary habits based upon a Gaussian likelihood with a mixture dirichlet-distributed prior on the mean
- The zooaRch package has functions for survival analysis of zooarchaeological datasets
- Functions for tree ring analysis can be found in dplR
- See the Environmetrics task view for more.

**Dating**

- Radiocarbon dates can be calibrated using Bchron with various calibration curves (including user-generated ones). Bchron also does age-depth modelling, relative sea level rate estimation incorporating time uncertainty in polynomial regression models, and non-parametric phase modelling via Gaussian mixtures as a means to determine the activity of a site (and as an alternative to the OxCal function SUM).
- Bayesian age-depth modelling of radiocarbon dates is also available in Bacon, and clam contains functions for “classical”, non-Bayesian age-depth modelling. These are not R packages, but clam has been packaged for easy use.
- The roxcal package allows you to use R to connect to a local installation of the OxCal software to calibrate radiocarbon dates and a variety of other OxCal operations.
- Various R functions for Luminescence Dating data analysis are in the Luminescence package (including radial plotting) and in the numOSL package, including equivalent dose calculation, annual dose rate determination, growth curve fitting, decay curve decomposition, statistical age model optimization, and statistical plot visualization.
- The archSeries package makes chronologies from information from multiple entities with varying chronological resolution and overlapping date ranges

**Phylogenetics, morphometrics, evolution and shape analysis**

- The Phylogenetics task view provides more detailed coverage of the subject area and related functions within R.
- Packages specifically tailored for the analysis of phylogenetic and evolutionary data include: ape, phytools, phangorn, Rphylip (requires PHYLIP), ouch, and pegas.
- For plotting trees most of these packages include their own modifications of the base `plot()` function, and there are also ggtree, ggdendro, dendextend, and ggphylo
- Morphometric and shape analysis methods are provided by shapes, geomorph, paleomorph and Momocs. Related packages include shapeR, Anthropometry and Morpho.
- StereoMorph allows users to collect 3D landmarks and curves from objects using two standard digital cameras.

**Image analysis**

- pixmap provides methods for creating, plotting and converting bitmapped images in three different formats: RGB, grey and indexed pixmaps. Similarly, jpeg provides an easy and simple way to read, write and display bitmap images stored in the JPEG format.
- EBImage (requires ImageMagick) provides general purpose functionality for the reading, writing, processing and analysis of images (and is very well documented). Various functions for image processing and analysis can also be found in ripa and imager
- magick provides bindings to the ImageMagick image-processing library, the most comprehensive open-source image processing package available.

**Simulations**

- RNetLogo links R and NetLogo
- simecol for simulating ecological (and other) dynamic systems. It can be used for differential equations, individual-based (or agent-based) and other models as well.
- One-dimensional cellular automata are also possible to model with the package CellularAutomaton.

**Network analysis**

- The two major packages are igraph, which is a generic network analysis and visualisation package, and sna, which performs social analysis of networks.
- Other packages include statnet, intergraph, network (manipulates and displays network objects), ergm (a set of tools to analyze and simulate networks based on exponential random graph models), hergm (implements hierarchical exponential random graph models), and RSiena (allows the analyses of the evolution of social networks using dynamic actor-oriented models)

**Reproducible research**

- knitr enables R code and text with formatting instructions (eg. markdown or LaTeX) to be combined in a single document and executed to produce a document that contains rendered plots, analysed data and formatted text. The remake package has functions that enable declarative workflows so that each time an analysis is run, it updates only the parts of the workflow that have changed.
- The rrrpkg essay explains why the R package is a suitable file-and-folder structure for almost any research project, with real-world examples. manuscriptPackage, template, template (yes, that’s two slightly different packages with the same name), and prodigenr, makeProject, and ProjectTemplate are packages that give templates for organising an analysis as an R package (eg. where the manuscript is the package vignette, or similarly bundled with the package). rlp is a package that lets you write an analysis as a Rmd file and then converts it into a package.
- The rmarkdown package implements the simple markdown document formatting language with some minor customizations to recognize R code blocks and inline code. The bookdown package provides tools for single file and multi-chapter rmarkdown documents with all the usual scholarly accessories: citations, figures, tables, captions and cross-referencing. captioner and kfigr also provide functions for figure and table captions and cross-referencing in Rmd documents.
- The rticles package includes templates for converting markdown documents into PDF files formatted ready for submission for publication, such as in the PLOS journals, the Frontiers In journals, and Elsevier journals. Similarly, the papaja package contains a template for converting a markdown file into an APA-formatted PDF. These packages depend on pandoc, a universal document format converter (not an R package). In this context it is used to convert rmarkdown or LaTeX to PDF, MS Word or HTML files. It is included with RStudio but can also be used stand-alone from the command line.
- packrat supports the development of isolated, stand-alone projects that include all the packages used and their dependencies. miniCRAN has functions to create a local repository to install packages (and their dependencies) from without internet access. Related packages include rbundler for package development which manages dependencies listed in a package’s DESCRIPTION file by storing them in a local project-specific library for installation, and pkgsnap for creating a snapshot of your installed CRAN packages with ‘snap’, and then using ‘restore’ on another system to recreate exactly the same environment.
- checkpoint allows you to install R packages from a specific snapshot date in the past, ensuring that you use the same package version that you started with, not a more recent one (related: gRAN can retrieve and build sources for any version of any non-base package that has ever been released on CRAN or BioConductor).
- rocker is a project that provides Docker containers to run R in a lightweight virtual environment; the hadleyverse container includes dplyr, ggplot2, etc., as well as RStudio server and LaTeX. The package harbor provides functions for controlling Docker containers on local and remote hosts. The analogsea package has functions for deploying R and RStudio quickly and easily on DigitalOcean clusters using Docker images for cloud computing. The dockertest package contains functions for generating Dockerfiles from R packages and other R projects, and building Docker containers that contain all the package dependencies.

**Developing R code and packages**

- devtools (requires Rtools for Windows or Xcode for OSX) for easily creating R packages. Includes `use_travis()` and related functions for easily adding continuous integration for automated building and testing during package development. Mason helps you to quickly build R packages using an interactive Q&A to generate metadata files, READMEs with badges, git repositories, etc.
- Goodpractice gives advice about good practices when building R packages. Advice includes functions and syntax to avoid, package structure, code complexity, code formatting, etc.
- badgecreatr generates badges for your readme file to signal the quality and current status of your package.
- roxygen2 for simplifying the creation of documentation for packages
- testthat for developing tests of functions in packages
- Rcpp enables the use of C++ code in R packages for high performance computing, requires Rtools for Windows or Xcode for OSX
- editR is a basic Rmarkdown editor with instant previewing of your document. It allows you to create and edit Rmarkdown documents while instantly previewing the result of your writing and coding.
- RStudio is an integrated development environment that simplifies developing R code with numerous built-in conveniences, including vim keyboard shortcuts. There are also packages that make scholarly writing in RStudio easy: wordcountaddin, citr. And several for making nice tables: carpenter, htmlTable, pixiedust, pander, simpletable, stargazer
- Emacs is a highly flexible text editor, which when used with the Emacs Speaks Statistics package, is a comprehensive R development environment. Org-mode provides a literate programming environment in Emacs similar to knitr.
- Style guide for writing R code by Hadley Wickham, and the packages formatR and rfmt which are designed to reformat R code to improve readability. The lintr package analyses code to check that it conforms to Hadley Wickham’s style guide (this package is built into RStudio)
- Idioms of R are discussed in the vignette of the rockchalk package, and in Pat Burns’ essay The R Inferno.

**Datasets**

- archdata contains eleven archaeological datasets from around the world reported in published studies. These represent typical forms of archaeological data (and so are useful for teaching)
- binford contains more than 200 variables coding aspects of hunter-gatherer subsistence, mobility, and social organization for 339 ethnographically documented groups of hunter-gatherers, as used in Binford (2001) *Constructing Frames of Reference: An Analytical Method for Archaeological Theory Building Using Ethnographic and Environmental Data Sets*
- BSDA contains a dataset of 60 radiocarbon ages of observations taken from an archaeological site with four phases of occupation.
- cawd contains 15 datasets of ancient Greek, Roman and Persian maps and digital atlas data
- chemometrics contains a dataset of elemental concentrations for 180 archaeological glass vessels excavated from 15th - 17th century contexts in Antwerp.
- zooaRch contains two zooarchaeological datasets.

**Places to go for help**

- The `?` function, e.g. `?mean` to get built-in help on the mean function
- `sos::findFn("rose diagram")` searches the help pages of contributed R packages for the search term, using the sos package
- Most major packages come with vignettes that narrate typical uses of the package’s core functions. The vignettes of a package can be listed with the command `vignette(package = "packagename")`.
- Google searches in the form: r help [search terms]
- A custom search engine of R resources: http://www.rseek.org/
- Graphical output from the examples in the documentation for all CRAN packages
- All R package documentation (including CRAN, GitHub and Bioconductor packages) is online in an easy-to-read format at http://www.rdocumentation.org/
- Cheatsheets to print for handy reference: short one on base, longer one on base, ggplot2, dplyr, tidyr, rmarkdown, making packages, data.table, using colours, colours, numerous others
- stackoverflow is an online Q&A point-scoring website where questions and answers can be voted on to indicate their quality. Many highly skilled R programmers are active participants. Cross Validated is a similar Q&A site for questions about statistics
- You could send a message to the official r-help email list, but do be sure to read, follow and cite the posting guide. The list is also searchable.

### Related links:

- CRAN Task View: SocialSciences
- CRAN Task View: Spatial
- CRAN Task View: Spatio-temporal
- CRAN Task View: Cluster analysis
- CRAN Task View: Multivariate Statistics
- CRAN Task View: Bayesian inference
- CRAN Task View: Phylogenetics
- CRAN Task View: Robust
- CRAN Task View: Visualization
- CRAN Task View: Reproducible research
- David L. Carlson’s guides on using R for ‘Quantifying Archaeology’ by Stephen Shennan and ‘Statistics for Archaeologists’ by Robert Drennan
- Matt Peeples’ scripts for archaeological statistics
- Gianmarco Alberti’s pages on Correspondence Analysis in Archaeology
- Quantitative Archaeology Wiki, including some code for battleship plots
- Michael Baxter’s ‘Notes on Quantitative Archaeology using R’
- GitHub repository for this Task View

#### Publications that include R code

Clarkson, C., Mike Smith, Ben Marwick, Richard Fullagar, Lynley A. Wallis, Patrick Faulkner, Tiina Manne, Elspeth Hayes, Richard G. Roberts, Zenobia Jacobs, Xavier Carah, Kelsey M. Lowe, Jacqueline Matthews, S. Anna Florin 2015 The archaeology, chronology and stratigraphy of Madjedbebe (Malakunanja II): A site in northern Australia with early occupation. Journal of Human Evolution 83, 46–64 http://dx.doi.org/10.1016/j.jhevol.2015.03.014

Conrad, C., Higham, C., Eda, M. and Marwick, B. (2016) Paleoecology and Forager Subsistence Strategies During the Pleistocene-Holocene Transition: A Reinvestigation of the Zooarchaeological Assemblage from Spirit Cave, Mae Hong Son Province, Thailand. Asian Perspectives 55(1). https://github.com/cylerc/AP_SC

Contreras, Daniel A. and John Meadows. (2014) “Summed radiocarbon calibrations as a population proxy: a critical evaluation using a realistic simulation approach.” Journal of Archaeological Science 52:591-608. doi:10.1016/j.jas.2014.05.030

Crema, E. R., J. Habu, K. Kobayashi and M. Madella (2016). “Summed Probability Distribution of 14C Dates Suggests Regional Divergences in the Population Dynamics of the Jomon Period in Eastern Japan.” PLoS ONE 11(4): e0154809. GitHub repo, Zenodo repo.

Crema, E.R., K. Edinborough, T. Kerig, S.J. Shennan (2014) An Approximate Bayesian Computation approach for inferring patterns of cultural evolutionary change, Journal of Archaeological Science, Volume 50 Pages 160-170 http://dx.doi.org/10.1016/j.jas.2014.07.014

Drake BL, Wills WH, Hamilton MI, Dorshow W (2014) Strontium Isotopes and the Reconstruction of the Chaco Regional System: Evaluating Uncertainty with Bayesian Mixing Models. PLoS ONE 9(5): e95580. doi:10.1371/journal.pone.0095580

Drake, Brandon L., David T. Hanson, James L. Boone (2012) The use of radiocarbon-derived Δ13C as a paleoclimate indicator: applications in the Lower Alentejo of Portugal, Journal of Archaeological Science, Volume 39, Issue 9, September 2012, Pages 2888-2896, http://dx.doi.org/10.1016/j.jas.2012.04.027

Drake, Brandon L., (2012) The influence of climatic change on the Late Bronze Age Collapse and the Greek Dark Ages, Journal of Archaeological Science, Volume 39, Issue 6, June 2012, Pages 1862-1870 http://dx.doi.org/10.1016/j.jas.2012.01.029

Drake, Brandon L., WH Wills, and Erik B Erhardt (2012) The 5.1 ka aridization event, expansion of piñon-juniper woodlands, and the introduction of maize (Zea mays) in the American Southwest. The Holocene December 2012 22: 1353-1360, first published on July 9, 2012 doi:10.1177/0959683612449758

Dye, Thomas S. (2011). “A Model-based Age Estimate for Polynesian Colonization of Hawai‘i”. Archaeology in Oceania 46, pp. 130–138 https://github.com/tsdye/hawaii-colonization

Dye, T. S. (2016). “Long-term rhythms in the development of Hawaiian social stratification.” Journal of Archaeological Science 71: 1-9.

Lightfoot E and O’Connell TC (2016). “On The Use of Biomineral Oxygen Isotope Data to Identify Human Migrants in the Archaeological Record: Intra-Sample Variation, Statistical Methods and Geographical Considerations.” PLoS ONE 11(4). doi:10.1371/journal.pone.0153850, code and data: https://www.repository.cam.ac.uk/handle/1810/252773

Lowe, K., Wallis, L., Pardoe, C., Marwick, B., Clarkson, C., Manne, T., Smith, M. and R. Fullagar 2014 Ground-penetrating radar and burial practices in western Arnhem Land, Australia. Archaeology in Oceania 49(3): 148–157 http://onlinelibrary.wiley.com/doi/10.1002/arco.5039/abstract

Mackay A, Sumner A, Jacobs Z, Marwick B, Bluff K and Shaw M 2014. Putslaagte 1 (PL1), the Doring River, and the later Middle Stone Age in southern Africa’s Winter Rainfall Zone. Quaternary International. http://dx.doi.org/10.1016/j.quaint.2014.05.007

Marwick, B. 2016. Computational reproducibility in archaeological research: Basic principles and a case study of their implementation. Journal of Archaeological Method and Theory, 1-27. doi: 10.1007/s10816-015-9272-9, text source repo

Marwick, B., 2013. Multiple Optima in Hoabinhian flaked stone artefact palaeoeconomics and palaeoecology at two archaeological sites in Northwest Thailand. Journal of Anthropological Archaeology 32, 553-564. http://dx.doi.org/10.1016/j.jaa.2013.08.004

Marwick, B. 2013. Discovery of Emergent Issues and Controversies in Anthropology Using Text Mining, Topic Modeling, and Social Network Analysis of Microblog Content. In Yanchang Zhao, Yonghua Cen (eds) Data Mining Applications with R. Elsevier. p. 63-93 https://github.com/benmarwick/AAA2011-Tweets

Shennan, SJ, Enrico R. Crema, Tim Kerig, (2014) Isolation-by-distance, homophily, and ‘core’ vs. ‘package’ cultural evolution models in Neolithic Europe, Evolution and Human Behavior, Available online 2 October 2014, http://dx.doi.org/10.1016/j.evolhumbehav.2014.09.006

#### Contributors

Ben Marwick, Agustin Diez Castillo, Allar Haav, Sebastian Heath, Phil Riris, Tom Brughmans, Lee Drake, Stefano Costa, Enrico Crema, Domenico Giusti, Matt Peeples, Mark Madsen, Daniel Contreras, Tal Galili