Author: M.B. Dale 1
  • 1 Australian School of Environmental Studies, Griffith University, Nathan, Qld. 4111, Australia

This paper examines how the continuum theory might be tested against the community-unit theory. Adhering to one or the other of these models without testing simply assigns an extreme prior probability to the preferred option. The question can be rephrased: for a given set of observations, is a single model adequate, or would a mixture of models be preferable? Judging between them involves first defining the nature of the model(s) to be fitted in each case and then comparing their complexity and quality of fit. Occam's razor suggests that we should seek the simplest model with adequate fit, with parameters estimated to optimal precision. The simplest comparison of the two theories thus requires only estimating the number of clusters for the chosen model(s) of within-cluster variation. If a single cluster is adequate, the continuum model is appropriate; if several are needed, the community model is preferable for that particular dataset. Establishing the universal applicability of either model would require investigating many datasets. Model quality can be assessed in several ways; here I concentrate on the minimal message length principle, under which the message length is a function of the prior probability of the model and its fit to the observed data, assuming the model to be correct. This principle has been shown to perform well in comparison with the alternatives. I first illustrate the procedure for choosing between models using a simple model, then examine two alternative formulations of within-cluster models that seem more appropriate, one static and the other dynamic.
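
The comparison of one cluster against several can be sketched in a few lines of Python. This is only an illustration under strong assumptions, not the procedure used in the paper: it takes one-dimensional data, assumes Gaussian within-cluster variation, and replaces the full minimal message length calculation with a crude two-part code length (negative log-likelihood plus (p/2) ln n for p free parameters). The function names `fit_gmm_1d`, `message_length` and `choose_k` are hypothetical.

```python
import numpy as np


def fit_gmm_1d(x, k, n_iter=200):
    """Fit a k-component 1-D Gaussian mixture by EM; return its log-likelihood."""
    # Start the means at evenly spaced quantiles so the fit is deterministic.
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    variances = np.full(k, x.var())
    weights = np.full(k, 1.0 / k)

    def component_log_density():
        # Log of weight times Gaussian density, one column per component.
        return (np.log(weights)
                - 0.5 * np.log(2.0 * np.pi * variances)
                - 0.5 * (x[:, None] - means) ** 2 / variances)

    for _ in range(n_iter):
        # E-step: posterior membership of each observation in each cluster.
        log_p = component_log_density()
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        resp = np.exp(log_p - log_norm[:, None])
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-3

    return np.logaddexp.reduce(component_log_density(), axis=1).sum()


def message_length(x, k):
    """Approximate two-part code length in nats (an assumed stand-in for MML)."""
    n_params = 3 * k - 1  # k means, k variances, k - 1 free mixing weights
    return -fit_gmm_1d(x, k) + 0.5 * n_params * np.log(x.size)


def choose_k(x, k_max=3):
    """Return the number of clusters with the shortest approximate message."""
    lengths = [message_length(x, k) for k in range(1, k_max + 1)]
    return int(np.argmin(lengths)) + 1


# Synthetic example: one homogeneous sample and one drawn from two clusters.
rng = np.random.default_rng(0)
continuum_like = rng.normal(0.0, 1.0, 400)
community_like = np.concatenate([rng.normal(-4.0, 1.0, 200),
                                 rng.normal(4.0, 1.0, 200)])
print(choose_k(continuum_like), choose_k(community_like))
```

Under these assumptions, a result of one cluster corresponds to the continuum model being adequate for that dataset, and a result of two or more to the community model being preferable. The penalty term here is a generic large-sample approximation; a proper minimal message length analysis would also encode the priors and the precision of each parameter estimate.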
