Abstract
Easier access to increasingly powerful computational approaches and tools in the field of distribution modelling has contributed to a proliferation of data, applications, practitioners, guidelines, and novel theoretical understandings. Recognising how these elements dynamically influence one another is critical as the discipline and its practices develop. The challenge of implementing the statistically and computationally complex theory behind the MaxEnt modelling method has been overcome by the practical simplicity of the powerful, platform-independent and free Java™ tool, maxent.jar. Lowering this computational and accessibility threshold has increased the use and further development of relevant digital ecological data, such as biodiversity occurrence records held in natural history collections worldwide (GBIF, the Global Biodiversity Information Facility) and GIS layers of spatio-temporal environmental background data being developed across a diverse range of fields.
However, the computational advantages of the fixed options offered by the software have come at the expense of a full exploration of the potential of this statistical method. Over time, the popularity of the practical shortcuts has resulted in an uncritical acceptance of the defaults, a conflation of the statistical method with the software’s black-box approach, and a disconnection between the theoretical and practical implications of the modelling process. A more flexible and explicit integration of theory and practice facilitates a much-needed comparison and testing of these theoretical and practical defaults, options and settings.
The aim of this thesis is to reduce the gap between how practitioners work with these practical tools and their understanding of the body of distribution modelling (DM) theory, and of MaxEnt in particular. PAPER 1 lays out the theoretical description of a novel interpretation of MaxEnt, with new settings and options, such as new model selection and model assessment criteria, and improved user control of the variable selection process. To test this new theory in a practical way, new informatics-driven approaches and tools were developed. PAPER 2 describes them in detail and presents them as a modular toolbox in the form of a set of flexible R scripts and functions. This new MaxEnt modelling approach and toolbox are used in PAPER 3, which looks specifically at how to identify and tackle the potential effects of sampling bias in presence-only (PO) data obtained from museum collections. The application value of this alternative MaxEnt modelling procedure (aMp) is further explored and tested in PAPERS 4 and 5, which address conservation management issues as well as model purpose, model fitting and properties of the data. PAPER 4 explores how distribution modelling can be combined with phylogeographic analysis to address spatio-temporal conservation issues. PAPER 5 makes use of fine-grained, remotely sensed LiDAR data to explore issues related both to data properties (accuracy, spatial autocorrelation) and to model complexity (variable and model selection, and model improvement). All MaxEnt models are evaluated against an independently collected field dataset, and theoretical and practical implications are discussed. PAPER 6 makes full use of this new theoretical approach and practical toolbox, and addresses MaxEnt model selection strategy by testing eight different combinations of model complexity and data properties. Finally, the paper discusses the additional benefits of these tool enhancements for MaxEnt model performance and ecological interpretability.
In modelling, there is no single best approach that works for everyone. Alternative approaches always exist, arising from our individual differences as practitioners and not solely from the modelling tools or purposes. This thesis makes explicit use of both Ecological and Informatics approaches to perform a broad-scoped assessment of the relative performance of different combinations of MaxEnt options and their settings for DM with different modelling purposes, including the specific properties of the data. By adding a flexible and traceable way to tackle this both theoretically and practically, I have attempted to reduce the gap between how practitioners work with the tools and the body of theory.