Harry Crane

ResearchersOne

Articles

The most common bets in 19th-century casinos were even-money bets on red or black in Roulette or Trente et Quarante. Many casino gamblers allowed themselves to be persuaded that they could make money for sure in these games by following betting systems such as the d'Alembert. What made these systems so seductive? Part of the answer is that some of the systems, including the d'Alembert, can give bettors a very high probability of winning a small or moderate amount. But there is also a more subtle aspect of the seduction. When the systems do win, their return on investment --- the gain relative to the amount of money the bettor has to take out of their pocket and put on the table to cover their bets --- can be astonishingly high. Systems such as le tiers et le tout, which offer a large gain when they do win rather than a high probability of winning, also typically have a high upside return on investment. In order to understand these high returns on investment, we need to recognize that the denominator --- the amount invested --- is random, as it depends on how successive bets come out.

In this article, we compare some systems on their return on investment and their success in hiding their pitfalls. Systems that provide a moderate gain with a very high probability seem to accomplish this by stopping when they are ahead and more generally by betting less when they are ahead or at least have just won, while betting more when they are behind or have just lost. For historical reasons, we call this martingaling. Among martingales, the d'Alembert seems especially good at making an impressive return on investment quickly, encouraging gamblers' hope that they can use it so gingerly as to avoid the possible large losses, and this may explain why its popularity was so durable.

We also discuss the lessons that this aspect of gambling can have for evaluating success in business and finance and for evaluating the results of statistical testing.
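
To make the return-on-investment behavior of the d'Alembert concrete, here is a minimal simulation sketch (not taken from the article). The stopping rule of quitting once one unit ahead, the 100-round cap, the single-zero win probability of 18/37, and the bookkeeping for the out-of-pocket outlay are all simplifying assumptions made for illustration.

    import random

    def dalembert_session(p_win=18/37, max_rounds=100, seed=None):
        """One d'Alembert session: start with a unit stake, add one unit
        after each loss, drop one unit (never below one) after each win,
        and stop once the session is ahead or the round cap is reached.
        Returns the net gain and the out-of-pocket outlay."""
        rng = random.Random(seed)
        stake, net, outlay = 1, 0, 0
        for _ in range(max_rounds):
            outlay = max(outlay, stake - net)  # cash needed beyond winnings in hand
            if rng.random() < p_win:
                net += stake
                stake = max(1, stake - 1)
            else:
                net -= stake
                stake += 1
            if net >= 1:  # quit as soon as the session shows a profit
                break
        return net, max(outlay, 1)

    results = [dalembert_session(seed=i) for i in range(10_000)]
    winners = [(net, outlay) for net, outlay in results if net > 0]
    print(f"P(session ends ahead) ~ {len(winners) / len(results):.3f}")
    print(f"mean ROI of winning sessions ~ "
          f"{sum(net / outlay for net, outlay in winners) / len(winners):.2f}")

The simulation keeps separate the two quantities emphasized above: the probability of ending a session ahead, and the gain relative to the random amount the bettor actually had to take out of pocket.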

Comment on the proposal to rename the R.A. Fisher Lecture.

The spread of infectious disease in a human community or the proliferation of fake news on social media can be modeled as a randomly growing tree-shaped graph. The history of the random growth process is often unobserved but contains important information such as the source of the infection. We consider the problem of statistical inference on aspects of the latent history using only a single snapshot of the final tree. Our approach is to apply random labels to the observed unlabeled tree and analyze the resulting distribution of the growth process, conditional on the final outcome. We show that this conditional distribution is tractable under a shape-exchangeability condition, which we introduce here, and that this condition is satisfied for many popular models for randomly growing trees such as uniform attachment, linear preferential attachment, and uniform attachment on a D-regular tree. For inference of the root under shape-exchangeability, we propose computationally scalable algorithms for constructing confidence sets with valid frequentist coverage as well as bounds on the expected size of the confidence sets. We also provide efficient sampling algorithms which extend our methods to a wide class of inference problems.
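
For readers who want to experiment, the following is a minimal sketch (not the paper's inference code) of two of the attachment models named above; root inference would then operate on the unlabeled shape of such a tree. The degree-plus-one weighting used here for linear preferential attachment is one common convention and is an assumption of this sketch.

    import random

    def grow_tree(n, model="uniform", seed=None):
        """Grow a random tree on nodes 0..n-1, starting from the root 0.
        model="uniform": each new node attaches to a uniformly chosen node.
        model="preferential": attachment probability proportional to degree + 1.
        Returns a parent map: parent[v] is the node that v attached to."""
        rng = random.Random(seed)
        parent, degree = {0: None}, {0: 0}
        for v in range(1, n):
            nodes = list(degree)
            if model == "uniform":
                u = rng.choice(nodes)
            else:
                u = rng.choices(nodes, weights=[degree[w] + 1 for w in nodes], k=1)[0]
            parent[v] = u
            degree[u] += 1
            degree[v] = 1
        return parent

    tree = grow_tree(200, model="preferential", seed=1)

Discarding the labels of such a tree yields the single snapshot from which the latent history, including the root, must be inferred.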

A response to John Ioannidis's article "A fiasco in the making? As the coronavirus pandemic takes hold, we are making decisions without reliable data" (March 17, 2020), in which he downplays the severe risks posed by the coronavirus pandemic.

When gambling, think probability.
When hedging, think plausibility.
When preparing, think possibility.
When this fails, stop thinking. Just survive.

Naive probabilism is the (naive) view, held by many technocrats and academics, that all rational thought boils down to probability calculations. This viewpoint is behind the obsession with "data-driven methods" that has overtaken the hard sciences, soft sciences, pseudosciences and non-sciences. It has infiltrated politics, society and business. It's the workhorse of formal epistemology, decision theory and behavioral economics. Because it is mostly applied in low- or no-stakes academic investigations and philosophical meandering, few have noticed its many flaws. Real-world applications of naive probabilism, however, pose disproportionate risks which scale exponentially with the stakes, ranging from harmless (and also helpless) in many academic contexts to destructive in the most extreme events (war, pandemic). The 2019-2020 coronavirus outbreak (COVID-19) is a living example of the dire consequences of such probabilistic naiveté. As I write this on March 13, 2020, we are in the midst of a six-continent pandemic, the world economy is collapsing and our future is bound to look very different from the recent past. The major damage caused by the spread of COVID-19 is attributable to a failure to act and a refusal to acknowledge what was in plain sight. This shared negligence stems from a blind reliance on naive probabilism and the denial of basic common sense by global and local leaders, and many in the general public.


This introductory chapter of Probabilistic Foundations of Statistical Network Analysis explains the major shortcomings of prevailing efforts in statistical analysis of networks and other kinds of complex data, and why there is a need for a new way to conceive of and understand data arising from complex systems.

We discuss how sampling design, units, the observation mechanism, and other basic statistical notions figure into modern network data analysis. These considerations pose several new challenges that cannot be adequately addressed by merely extending or generalizing classical methods. Such challenges stem from fundamental differences between the domains in which network data emerge and those for which classical tools were developed. By revisiting these basic statistical considerations, we suggest a framework in which to develop theory and methods for network analysis in a way that accounts for both conceptual and practical challenges of network science. We then discuss how some well-known model classes fit within this framework.

Whether the predictions put forth prior to the 2016 U.S. presidential election were right or wrong is a question that led to much debate. But rather than focusing on right or wrong, we analyze the 2016 predictions with respect to a core set of effectiveness principles, and conclude that they were ineffective in conveying the uncertainty behind their assessments. Along the way, we extract key insights that will help to avoid, in future elections, the systematic errors that led to overly precise and overconfident predictions in 2016. Specifically, we highlight shortcomings of the classical interpretations of probability and its communication in the form of predictions, and present an alternative approach with two important features. First, our recommended predictions are safer in that they come with certain guarantees on the probability of an erroneous prediction; second, our approach easily and naturally reflects the (possibly substantial) uncertainty about the model by outputting plausibilities instead of probabilities.
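
As a toy illustration of the second feature (and only that; this is not the authors' procedure), one can imagine a prediction rule that calls a winner only when the opposing outcome is implausible under every model deemed plausible, and otherwise declines to make a call. The probabilities, the threshold, and the max-over-models notion of plausibility below are all hypothetical choices made for illustration.

    def plausibility_call(model_probs, threshold=0.1):
        """model_probs: win probabilities for candidate A under several
        plausible models.  Treat the plausibility of each outcome as its
        maximum probability over the set; call a winner only when the
        opposing outcome's plausibility falls below the threshold."""
        pl_a = max(model_probs)                 # plausibility that A wins
        pl_b = max(1 - p for p in model_probs)  # plausibility that B wins
        if pl_b < threshold:
            return "call A"
        if pl_a < threshold:
            return "call B"
        return "no call"

    print(plausibility_call([0.71, 0.64, 0.85]))  # "no call": B remains plausible
    print(plausibility_call([0.97, 0.95, 0.99]))  # "call A"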

I compare forecasts of the 2018 U.S. midterm elections based on (i) probabilistic predictions posted on the FiveThirtyEight blog and (ii) prediction market prices on PredictIt.com. Based on empirical forecast and price data collected prior to the election, the analysis assesses their calibration and accuracy according to the Brier and logarithmic scoring rules. I also analyze the performance of a strategy that invests in PredictIt based on the FiveThirtyEight forecasts.
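
The two scoring rules mentioned above are straightforward to compute; the sketch below uses made-up forecast, price, and outcome triples rather than the actual data analyzed in the article.

    import math

    def brier(prob, outcome):
        """Brier score: squared error between the forecast probability and
        the 0/1 outcome (lower is better)."""
        return (prob - outcome) ** 2

    def log_score(prob, outcome):
        """Logarithmic score: negative log probability assigned to the
        realized outcome (lower is better)."""
        return -math.log(prob if outcome == 1 else 1 - prob)

    # Hypothetical (FiveThirtyEight probability, PredictIt price, outcome) triples.
    races = [(0.80, 0.70, 1), (0.55, 0.60, 0), (0.30, 0.40, 0)]
    for label, idx in [("FiveThirtyEight", 0), ("PredictIt", 1)]:
        b = sum(brier(r[idx], r[2]) for r in races) / len(races)
        l = sum(log_score(r[idx], r[2]) for r in races) / len(races)
        print(f"{label}: mean Brier = {b:.3f}, mean log = {l:.3f}")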

This article describes our motivation behind the development of RESEARCHERS.ONE, our mission, and how the new platform will fulfill this mission. We also compare our approach with other recent reform initiatives such as post-publication peer review and open access publications.

I introduce a formalization of probability in intensional Martin-Löf type theory (MLTT) and homotopy type theory (HoTT) which takes the concept of ‘evidence’ as primitive in judgments about probability. In parallel to the intuitionistic conception of truth, in which ‘proof’ is primitive and an assertion A is judged to be true just in case there is a proof witnessing it, here ‘evidence’ is primitive and A is judged to be probable just in case there is evidence supporting it. To formalize this approach, we regard propositions as types in MLTT and define for any proposition A a corresponding probability type Prob(A) whose inhabitants represent pieces of evidence in favor of A. Among several practical motivations for this approach, I focus here on its potential for extending meta-mathematics to include conjecture, in addition to rigorous proof, by regarding a ‘conjecture in A’ as a judgment that ‘A is probable’ on the basis of evidence. I show that the Giry monad provides a formal semantics for this system.
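
For intuition about that semantics, here is a minimal finite-support sketch of the Giry monad in Python (an illustration only; the paper's formalization lives in type theory, and the class and method names here are my own): a distribution is the monadic container, a point mass is the unit, and bind pushes a distribution through a kernel.

    class Dist:
        """A finitely supported probability distribution: values -> weights."""
        def __init__(self, weights):
            total = sum(weights.values())
            self.weights = {x: w / total for x, w in weights.items()}

        @staticmethod
        def unit(x):
            """Point mass at x (the monad's return)."""
            return Dist({x: 1.0})

        def bind(self, kernel):
            """Push the distribution through a kernel mapping values to Dists."""
            out = {}
            for x, wx in self.weights.items():
                for y, wy in kernel(x).weights.items():
                    out[y] = out.get(y, 0.0) + wx * wy
            return Dist(out)

    coin = Dist({"H": 0.5, "T": 0.5})
    # Re-flip only if the first flip came up tails.
    twice = coin.bind(lambda s: Dist.unit("H") if s == "H" else coin)
    print(twice.weights)  # {'H': 0.75, 'T': 0.25}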

This article describes how the filtering role played by peer review may actually be harmful rather than helpful to the quality of the scientific literature. We argue that, instead of trying to filter out the low-quality research, as is done by traditional journals, a better strategy is to let everything through but with an acknowledgment of the uncertain quality of what is published, as is done on the RESEARCHERS.ONE platform.  We refer to this as "scholarly mithridatism."  When researchers approach what they read with doubt rather than blind trust, they are more likely to identify errors, which protects the scientific community from the dangerous effects of error propagation, making the literature stronger rather than more fragile.  

I make the distinction between academic probabilities, which are not rooted in reality and thus have no tangible real-world meaning, and real probabilities, which attain a real-world meaning as the odds that the subject asserting the probabilities is forced to accept for a bet against the stated outcome. With this distinction I discuss how the replication crisis can easily be resolved by requiring that probabilities published in the scientific literature are real, rather than academic. At present, all probabilities and derivative quantities that appear in published work, such as P-values, Bayes factors, confidence intervals, etc., are academic probabilities, which are not useful for making meaningful assertions about the real world.
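
As a toy arithmetic illustration of what "real" would demand (my own example, not from the article): asserting a probability p for an outcome, read as a real probability, commits the asserter to laying odds of (1 - p) : p against that outcome.

    def odds_against(stated_prob):
        """Fair odds against an outcome asserted to have probability p:
        the asserter pays out (1 - p) / p per unit staked on the outcome
        if it occurs, and keeps the unit stake if it does not."""
        return (1 - stated_prob) / stated_prob

    # A published probability of 0.05, if real, means laying 19-to-1 odds.
    print(f"odds against: {odds_against(0.05):.0f} to 1")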

The notion of typicality appears in scientific theories, philosophical arguments, mathematical inquiry, and everyday reasoning. Typicality is invoked in statistical mechanics to explain the behavior of gases. It is also invoked in quantum mechanics to explain the appearance of quantum probabilities. Typicality plays an implicit role in non-rigorous mathematical inquiry, as when a mathematician forms a conjecture based on personal experience of what seems typical in a given situation. Less formally, the language of typicality is a staple of the common parlance: we often claim that certain things are, or are not, typical. But despite the prominence of typicality in science, philosophy, mathematics, and everyday discourse, no formal logics for typicality have been proposed. In this paper, we propose two formal systems for reasoning about typicality. One system is based on propositional logic: it can be understood as formalizing objective facts about what is and is not typical. The other system is based on the logic of intuitionistic type theory: it can be understood as formalizing subjective judgments about typicality.

Publication of scientific research all but requires a supporting statistical analysis, anointing statisticians the de facto gatekeepers of modern scientific discovery. While the potential of statistics for providing scientific insights is undeniable, there is a crisis in the scientific community due to poor statistical practice. Unfortunately, widespread calls to action have not been effective, in part because of statisticians’ tendency to make statistics appear simple. We argue that statistics can meet the needs of science only by empowering scientists to make sound judgments that account for both the nuances of the application and the inherent complexity of fundamental effective statistical practice. In particular, we emphasize a set of statistical principles that scientists can adapt to their ever-expanding scope of problems.

I prove a connection between the logical framework for intuitive probabilistic reasoning (IPR) introduced by Crane (2017) and sets of imprecise probabilities. More specifically, this connection provides a straightforward interpretation of sets of imprecise probabilities as subjective credal states, giving a formal semantics for Crane's formal system of IPR. The main theorem establishes the IPR framework as a potential logical foundation for imprecise probability that is independent of the traditional probability calculus.
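
A small sketch of the target interpretation (the example and numbers are mine, not the paper's): a credal state as a finite set of probability assignments, with lower and upper probabilities given by the envelopes over that set.

    credal_set = [
        {"rain": 0.20, "no rain": 0.80},
        {"rain": 0.35, "no rain": 0.65},
        {"rain": 0.50, "no rain": 0.50},
    ]

    def lower_prob(event, credal):
        """Lower envelope: the smallest probability assigned to the event."""
        return min(p[event] for p in credal)

    def upper_prob(event, credal):
        """Upper envelope: the largest probability assigned to the event."""
        return max(p[event] for p in credal)

    # The imprecise judgment about "rain" is the interval [0.20, 0.50],
    # not a single number.
    print(lower_prob("rain", credal_set), upper_prob("rain", credal_set))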
