Correlation

From NECSIWiki


This page is about correlation in the physical sense. The correlation between strings of numbers or characters is a special case, as is the information one has about the state of a thermodynamic system (simple or complex). Some measures focus only on second-moment correlations (like the Pearson correlation coefficient, parameterized auto- and cross-covariances, and pair correlation functions), while others consider all moments (like mutual information and Kullback-Leibler divergence).



Multi-moment correlation measures

Multi-moment expansion

Surprisals add where probabilities multiply. The surprisal for an event of probability p is defined as s ≡ k ln[1/p]. If k is {1, 1/ln 2, 1.38×10^-23} then surprisal is in {nats, bits, J/K}, so that, for instance, there are N bits of surprisal for landing all "heads" on a toss of N coins.
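As a minimal sketch of the definition above, the following computes surprisal in each of the three units and checks the coin-toss example. The function name `surprisal` and the table `K_BY_UNIT` are illustrative choices, not part of the original page.

```python
import math

# Values of the constant k for each unit, as listed in the text.
K_BY_UNIT = {
    "nats": 1.0,
    "bits": 1.0 / math.log(2),
    "J/K": 1.380649e-23,  # Boltzmann constant
}

def surprisal(p, unit="bits"):
    """Return s = k ln(1/p) for an event of probability p, 0 < p <= 1."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must satisfy 0 < p <= 1")
    return K_BY_UNIT[unit] * math.log(1.0 / p)

# Probabilities multiply, surprisals add: N fair coins all landing heads.
n_coins = 10
p_all_heads = 0.5 ** n_coins
print(surprisal(p_all_heads, "bits"))    # close to 10 bits
print(n_coins * surprisal(0.5, "bits"))  # same value, summed coin by coin
```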

Entropy, uncertainty, or average surprisal: S ≡ Σ_i p_i s[p_i] ≥ 0, where Σ_i p_i = 1 and i = 1, ..., N.
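A short sketch of average surprisal, assuming the probabilities are supplied as a plain list. The name `entropy` and the example distributions are made up for illustration.

```python
import math

def entropy(probs, k=1.0 / math.log(2)):
    """Average surprisal S = sum_i p_i * k ln(1/p_i); the default k gives bits."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Terms with p_i = 0 contribute nothing (p ln(1/p) -> 0 as p -> 0).
    return sum(p * k * math.log(1.0 / p) for p in probs if p > 0.0)

fair_die = [1.0 / 6.0] * 6
loaded_die = [0.5, 0.1, 0.1, 0.1, 0.1, 0.1]
certain = [1.0, 0.0, 0.0]
print(entropy(fair_die))    # largest: every face equally likely
print(entropy(loaded_die))  # smaller: one face is favored
print(entropy(certain))     # zero: no uncertainty at all
```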

Kullback-Leibler divergence or net surprisal of {p_o} from {p}, a general measure of useful information, is defined via the quantities above as ΔI ≡ Σ_i p_i (s[p_oi] - s[p_i]) = k Σ_i p_i ln[p_i/p_oi] ≥ 0. This is multi-moment in that the Taylor expansion of ln[p/p_o] about p = p_o can be used to rewrite ΔI as an infinite series of dimensionless moments (specifically, averaged powers of p/p_o - 1) starting with mean, variance, etc. Mutual information is simply KL divergence in the multiple-subsystem case, when the reference (o) state assumes that the subsystems are uncorrelated.
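The two definitions above can be sketched as follows, with mutual information computed as the KL divergence of the product of marginals from a joint distribution. The function names and the two-subsystem example tables are hypothetical.

```python
import math

def kl_divergence(p, p_ref, k=1.0 / math.log(2)):
    """Net surprisal k * sum_i p_i ln(p_i / p_ref_i); the default k gives bits."""
    total = 0.0
    for p_i, q_i in zip(p, p_ref):
        if p_i > 0.0:
            if q_i == 0.0:
                return math.inf  # p is possible where the reference forbids it
            total += p_i * math.log(p_i / q_i)
    return k * total

def mutual_information(joint, k=1.0 / math.log(2)):
    """KL divergence of the product of marginals from the joint distribution."""
    row_marginals = [sum(row) for row in joint]
    col_marginals = [sum(col) for col in zip(*joint)]
    flat_joint = [p for row in joint for p in row]
    flat_uncorrelated = [r * c for r in row_marginals for c in col_marginals]
    return kl_divergence(flat_joint, flat_uncorrelated, k)

# Two binary subsystems: perfectly correlated versus independent.
correlated = [[0.5, 0.0], [0.0, 0.5]]
independent = [[0.25, 0.25], [0.25, 0.25]]
print(mutual_information(correlated))   # about 1 bit shared
print(mutual_information(independent))  # 0 bits shared
```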

Applications

Image offset analysis

Cross-covariance functions show maxima at offset vectors (Δx values) associated with similar features in two separate data fields. These maxima allow one to estimate the relative offset of the same object when seen from two separate vantage points (in effect to do fuzzy pattern recognition).
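A one-dimensional sketch of this idea, assuming two equal-length data fields and a brute-force search over integer shifts. The helper names and the sample "feature" are invented for illustration; real image work would use two-dimensional fields and usually an FFT-based computation.

```python
def cross_covariance(a, b, max_shift):
    """Mean-subtracted cross-covariance of two equal-length 1-D fields.

    Returns {shift: value}, where value averages
    (a[i] - mean_a) * (b[i + shift] - mean_b) over the overlapping samples.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    result = {}
    for shift in range(-max_shift, max_shift + 1):
        pairs = [(a[i] - mean_a) * (b[i + shift] - mean_b)
                 for i in range(n) if 0 <= i + shift < n]
        result[shift] = sum(pairs) / len(pairs)
    return result

def estimate_offset(a, b, max_shift):
    """Shift at which the cross-covariance peaks."""
    cov = cross_covariance(a, b, max_shift)
    return max(cov, key=cov.get)

# A hypothetical 1-D "feature" seen in two fields, displaced by 3 samples.
feature = [0, 1, 4, 9, 4, 1, 0]
left = [0] * 5 + feature + [0] * 12
right = [0] * 8 + feature + [0] * 9
print(estimate_offset(left, right, max_shift=6))  # 3
```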

Such analyses are likely done very rapidly by the human visual system in our evolved implementation of depth perception from stereo images (one from each of two eyes). These calculations are also useful when done by computers, for example in alignment algorithms for facial recognition analysis, in lateral-displacement maps obtained from atomic force microscope images of the same field acquired by a tip scanning in two different directions, and in Microsoft's Photosynth software for reconstructing three-dimensional structures from a collection of photographs taken from a wide range of vantage points.

Harmonic analysis

The Fourier transform of a real n-dimensional dataset's auto-covariance (or pair correlation) function is a real n-dimensional power spectrum. For the n=1 case, digital fast Fourier transforms (FFTs) are often used to examine the strength of temporal frequencies in a complex sound or an NMR output signal.
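For the n=1 case, the relationship can be checked numerically. The sketch below assumes a periodic (circular) auto-covariance and uses a plain discrete Fourier transform in place of an FFT so that it needs only the standard library; the test signal is made up.

```python
import cmath
import math

def dft(values):
    """Plain O(N^2) discrete Fourier transform (an FFT gives the same numbers)."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * k * m / n)
                for m, v in enumerate(values))
            for k in range(n)]

def circular_autocovariance(x):
    """r[m] = sum_n d[n] * d[(n + m) mod N], with d the mean-subtracted data."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[i] * d[(i + m) % n] for i in range(n)) for m in range(n)]

def power_spectrum(x):
    """|DFT|^2 of the mean-subtracted data."""
    n = len(x)
    mean = sum(x) / n
    return [abs(c) ** 2 for c in dft([v - mean for v in x])]

# A test signal with 3 cycles per record plus a weaker component at 7 cycles.
n_samples = 32
signal = [math.sin(2 * math.pi * 3 * t / n_samples)
          + 0.3 * math.cos(2 * math.pi * 7 * t / n_samples)
          for t in range(n_samples)]

spectrum = power_spectrum(signal)
from_autocov = [c.real for c in dft(circular_autocovariance(signal))]
strongest = max(range(1, n_samples // 2), key=lambda k: spectrum[k])
print(strongest)  # 3
```

The two lists `spectrum` and `from_autocov` agree to within rounding error, which is the statement in the text for a sampled, periodic dataset.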

Diffraction (in the kinematical approximation) is an optical analog of digital power spectrum analysis for n=2,3. It provides information on scattering power as a function of spatial frequency in objects that scatter an incoming plane wave. The pair correlation or Patterson function in X-ray diffraction is used to help recover the Fourier phase information not available from diffraction. Fluctuation or variable coherence-width microscopy also utilizes some of that Fourier phase information to go one step beyond 2nd-moment analysis, and explore pair-pair correlations from point to point in disordered solids.

Ecology and related fields

Unbiased estimates for the KL divergence "of model from reality" are useful in ranking models against experimental data with help from the Akaike Information Criterion (AIC), applied to the residuals that the models fail to explain. The fundamental underpinnings of this approach suggest that it could be adapted to assist with paradigm optimization in other application areas as well.
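A minimal sketch of such a ranking, assuming least-squares fits with Gaussian errors so that AIC reduces (up to an additive constant) to n ln(RSS/n) + 2K, where RSS is the residual sum of squares and K counts the fitted parameters plus the residual variance. The data and the two candidate models are invented for illustration.

```python
import math

def aic_least_squares(residuals, n_fit_params):
    """AIC for a least-squares fit with Gaussian errors, up to a constant.

    Uses AIC = n ln(RSS / n) + 2K, where K counts the fitted parameters
    plus one for the estimated residual variance.
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * (n_fit_params + 1)

def fit_constant(xs, ys):
    """Residuals left by the best constant model y = c."""
    mean_y = sum(ys) / len(ys)
    return [y - mean_y for y in ys]

def fit_line(xs, ys):
    """Residuals left by the best straight line y = a + b x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Made-up observations with an obvious trend plus scatter.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0, 12.8, 15.1]

scores = {
    "constant": aic_least_squares(fit_constant(xs, ys), n_fit_params=1),
    "line": aic_least_squares(fit_line(xs, ys), n_fit_params=2),
}
best = min(scores, key=scores.get)  # lower AIC means less estimated information loss
print(best, scores)
```

Only differences in AIC between models fitted to the same data are meaningful; the absolute values carry no interpretation on their own.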

Mutual information

The KL divergence "of uncorrelated from correlated" measures the mutual information associated with fidelity in communications theory, inheritance in clade analysis, and entanglement in quantum computing. The applications are wide ranging. For instance, Lempel-Ziv compression methods (e.g. those used in ZIP), developed with help from communications theory, can as a result also be used to track plagiarism in student reports, the phylogeny of mitochondrial DNA strings, and the path through the mails of a chain letter copied with transcription errors before the days of electronic replication.
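One common way to put a compressor to this use is the normalized compression distance (NCD), which is a technique assumed here for illustration rather than one named on this page. The sketch uses Python's `zlib` module (a Lempel-Ziv based compressor); the three sample texts are made up.

```python
import zlib

def compressed_size(data):
    """Length in bytes of the zlib (Lempel-Ziv based) compression of data."""
    return len(zlib.compress(data, 9))

def normalized_compression_distance(x, y):
    """NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

original = (b"This letter has been sent to you for good luck. The original "
            b"is in New England. It has been around the world nine times. "
            b"You will receive good luck within four days of receiving this "
            b"letter, provided you in turn send it on.")
# A copy with transcription errors, as in a hand-copied chain letter.
copy = (b"This leter has been sent to you for good luck. The origanal "
        b"is in New England. It has been round the world nine times. "
        b"You will recieve good luck within four days of recieving this "
        b"letter, provided you in turn send it on.")
unrelated = (b"Stars form by gravitational collapse of a cloud of gas and "
             b"dust, and planetary surfaces emerge as solids condensed in "
             b"a cooling nebula accrete in orbit around the new star, "
             b"after which slower chemistry can begin on those surfaces.")

# Texts that share structure compress well together, so their distance is small.
print(normalized_compression_distance(original, copy))
print(normalized_compression_distance(original, unrelated))
```

Pairwise distances of this kind can then be fed to a clustering or tree-building routine to suggest which copies descend from which.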

Thermal physics

Pressure versus volume plot of available work from a mole of Argon gas relative to ambient, calculated as T_o times KL divergence.

Best-guess states (e.g. for atoms in a gas) are inferred by maximizing the average surprisal S (entropy) for a given set of control parameters (like pressure P or volume V). This constrained entropy maximization, both classically and quantum mechanically, minimizes Gibbs availability in entropy units A ≡ -k ln Z, where Z is a constrained multiplicity or partition function.

When temperature T is fixed, free energy (T times A) is also minimized. Thus if T, V and number of molecules N are constant, the Helmholtz free energy F ≡ U - TS (where U is energy) is minimized as a system "equilibrates". If T and P are held constant (say during processes in your body), the Gibbs free energy G ≡ U + PV - TS is minimized instead. The change in free energy under these conditions is a measure of available work that might be done in the process. Thus available work for an ideal gas at constant temperature T_o and pressure P_o is W = ΔG = N k T_o Θ[V/V_o], where V_o = N k T_o/P_o and Θ[x] ≡ x - 1 - ln x ≥ 0.
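The ideal-gas expression can be evaluated directly. This sketch assumes SI units and an ambient of 298.15 K and 101325 Pa chosen only for illustration; the function names are hypothetical.

```python
import math

BOLTZMANN_K = 1.380649e-23   # J/K
AVOGADRO = 6.02214076e23     # molecules per mole

def theta(x):
    """Theta[x] = x - 1 - ln x, which is >= 0 and vanishes only at x = 1."""
    return x - 1.0 - math.log(x)

def available_work_isothermal(n_molecules, t_ambient, p_ambient, volume):
    """W = N k T_o Theta[V / V_o] with V_o = N k T_o / P_o, in joules."""
    v_ambient = n_molecules * BOLTZMANN_K * t_ambient / p_ambient
    return n_molecules * BOLTZMANN_K * t_ambient * theta(volume / v_ambient)

# One mole at an assumed ambient of 298.15 K and 101325 Pa.
t_o, p_o = 298.15, 101325.0
v_o = AVOGADRO * BOLTZMANN_K * t_o / p_o
for ratio in (0.5, 1.0, 2.0):
    print(ratio, available_work_isothermal(AVOGADRO, t_o, p_o, ratio * v_o))
```

Both compressed (V < V_o) and expanded (V > V_o) states give positive available work, and the value drops to zero at V = V_o.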

More generally, the work available relative to some ambient is obtained by multiplying ambient temperature T_o by KL divergence or net surprisal ΔI ≥ 0, defined as the average value of k ln[p/p_o], where p_o is the probability of a given state under ambient conditions. For instance, the work available in equilibrating a monatomic ideal gas to ambient values of V_o and T_o is thus W = T_o ΔI, where KL divergence ΔI = N k (Θ[V/V_o] + (3/2) Θ[T/T_o]). The resulting contours of constant KL divergence, e.g. for a mole of Argon at standard temperature and pressure, put limits on the conversion of hot to cold as in flame-powered air-conditioning or in an unpowered device to convert boiling water to ice water. Thus the KL divergence "of ambient from actual" can be used to measure thermodynamic availability in bits.
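As a worked instance of the monatomic-gas formula, the sketch below evaluates ΔI in J/K, converts it to bits by dividing by k ln 2, and multiplies by an ambient temperature to get work. The ambient value of 273.15 K and the chosen state (ambient volume, twice ambient temperature) are assumptions for illustration.

```python
import math

BOLTZMANN_K = 1.380649e-23   # J/K
AVOGADRO = 6.02214076e23     # molecules per mole

def theta(x):
    """Theta[x] = x - 1 - ln x."""
    return x - 1.0 - math.log(x)

def net_surprisal(n_molecules, v_ratio, t_ratio):
    """Delta I = N k (Theta[V/V_o] + (3/2) Theta[T/T_o]), in J/K."""
    return n_molecules * BOLTZMANN_K * (theta(v_ratio) + 1.5 * theta(t_ratio))

def to_bits(entropy_j_per_k):
    """Convert J/K to bits by dividing by k ln 2."""
    return entropy_j_per_k / (BOLTZMANN_K * math.log(2))

t_ambient = 273.15  # K, assumed ambient
delta_i = net_surprisal(AVOGADRO, v_ratio=1.0, t_ratio=2.0)
work = t_ambient * delta_i
print(delta_i, "J/K")
print(to_bits(delta_i), "bits")
print(work, "J")
```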

Evolving complexity

Looking at the world around you, you can see that complexity has evolved in a hierarchical way. New gradients and boundaries typically emerge by the (sometimes unexpected) breaking of old symmetries. In the context of those new boundaries, new subsystems emerge whose activity in turn determines how their population and structure change over time.

This evolution is often driven by the conversion of energy replete with thermodynamic availability into heat at ambient temperature. The resulting loss of availability is partly offset by the creation of new subsystem correlations. For instance, martial arts enthusiasts gain net surprisal in their world by working out, in the process thermalizing available work from solar photons that was stored as chemical energy by plants.

Thus stars form by gravitational collapse of a cloud of gas and dust. Planetary surfaces emerge as solids condensed in a cooling solar nebula accrete in orbit around a new star. Cell membranes form in the face of sunlight-driven and volcano-driven bio-geo-chemical cycles on planet surfaces thanks to the fact that those bilayer membranes help preserve molecule patterns able to selectively replicate and hence adapt.

Given sufficient environmental stability on some planets, multicelled lifeforms might even emerge for a short while and begin to buffer niche-layer correlations with respect to metazoan skin, molecule codepool (e.g. family) and idea codepool (e.g. culture). Their rarity is but one reason that such processes may deserve careful management and respect.

References

  • S. Kullback and R. A. Leibler (1951) On information and sufficiency, Annals of Mathematical Statistics 22:79-86.
  • S. Kullback (1959) Information theory and statistics (John Wiley and Sons, NY).
  • S. Kullback (1987) The Kullback-Leibler distance, The American Statistician 41:340-341.
  • Kenneth P. Burnham and David R. Anderson (2001) Kullback-Leibler information as a basis for strong inference in ecological studies, Wildlife Research 28:111-119.
  • Kenneth P. Burnham and David R. Anderson (2002) Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, Second Edition (Springer Science, New York) ISBN 978-0-387-95364-9.
  • Myron Tribus (1961) Thermodynamics and thermostatics (D. Van Nostrand, New York).
  • E. T. Jaynes (1957) Information theory and statistical mechanics, Physical Review 106:620.
  • E. T. Jaynes (1957) Information theory and statistical mechanics II, Physical Review 108:171.
  • J.W. Gibbs (1873) A method of geometrical representation of thermodynamic properties of substances by means of surfaces, reprinted in The Collected Works of J. W. Gibbs, Volume I Thermodynamics, ed. W. R. Longley and R. G. Van Name (New York: Longmans, Green, 1931) footnote page 52.
  • M. Tribus and E. C. McIrvine (1971) Energy and information, Scientific American 224:179-186.
  • Eric J. Chaisson (2001) Cosmic evolution: The rise of complexity in nature (Harvard University Press, Cambridge MA).
  • Peter D. Ward and Donald Brownlee (2000) Rare earth: Why complex life is uncommon in the universe (Copernicus, New York).