Quotes About Data

No single observation can meaningfully affect the aggregate.
~ Nassim Nicholas Taleb
our preference for the anecdotal over the empirical.
~ Nassim Nicholas Taleb
getting a lot of "timely" statistics are capable of overreacting and mistaking noise for information—
~ Nassim Nicholas Taleb
The more frequently you look at data, the more noise you are disproportionally likely to get (rather than the valuable part, called the signal);
~ Nassim Nicholas Taleb
the concept of median used in medical research does not characterize a probability distribution.
~ Nassim Nicholas Taleb
absence of evidence is not evidence of absence, a simple point that has the following implications: for the antifragile, good news tends to be absent from past data, and for
~ Nassim Nicholas Taleb
Experience is devoid of the cherry-picking that we find in studies, particularly those called "observational," ones in which the researcher finds past patterns, and, thanks to the sheer amount of data, can therefore fall into the trap of an invented narrative.
~ Nassim Nicholas Taleb
Modernity provides too many variables (but too little data per variable), and the spurious relationships grow much, much faster than real information, as noise is convex and information is concave. Increasingly, data can only truly deliver via negativa–style knowledge—it can be effectively used to debunk, not confirm.
~ Nassim Nicholas Taleb
When your sample is large, no single instance will significantly change the aggregate or the total.
~ Nassim Nicholas Taleb
Extremistan, you will have trouble figuring out the average from any sample since it can depend so much on one single observation. The idea is not more difficult than that. In Extremistan, one unit can easily affect the total in a disproportionate way. In this world, you should always be suspicious of the knowledge you derive from data. This is a very simple test of uncertainty that allows you to distinguish between the two kinds of randomness. Capish?
~ Nassim Nicholas Taleb
A very rarely discussed property of data: it is toxic in large quantities
~ Nassim Nicholas Taleb
access to data increases intervention, causing us to behave like the neurotic fellow.
~ Nassim Nicholas Taleb
The more remote the event, the less we can get empirical data (assuming generously that the future will resemble the past) and the more we need to rely on theory.
~ Nassim Nicholas Taleb
Since most investment managers will not beat the market, investors should at least consider investing in "index funds" that replicate the market and so never get beaten by the market. Indexing may not be fun or exciting, but it works. The data from the performance measurement firms show that index funds have outperformed most investment managers over long periods of time.
~ Charles D. Ellis
It would take only a few thousand terabytes of hard-drive space to archive a human's entire audiovisual experience from cradle to grave.
~ Charles Seife
if you want to work on data covering more than about one month you're supposed to phone Mr. Jobsworth at BT and whine for help.
~ Charles Stross
Twelve percent of all the photographs ever taken in human history have been taken in the last twelve months. And 40 percent of them are on Facebook.
~ Charles Stross
pivoted the results on our Criminal Records Bureau and National Insurance database mirrors to get the place of work for everyone who's on the books, and the pre-processor is turning that into grid reference data so we can plot them on a map or query for areas where the rate of that's funny . . .
~ Charles Stross
As a rule of thumb, the sample size must be at least 30 for the central limit theorem to hold true.
~ Charles Wheelan
Data are to statistics what a good offensive is to a star quarterback.
~ Charles Wheelan
Statistical inference is really just the marriage of two concepts that we've already discussed: data and probability (with a little help from the central limit theorem).
~ Charles Wheelan
Here is one of the most important things to remember when doing research that involves regression analysis: Try not to kill anyone. You can even put a little Post-it note on your computer monitor: "Do not kill people with your research."
~ Charles Wheelan
The credit card companies are at the forefront of this kind of analysis, both because they are privy to so much data on our spending habits and because their business model depends so heavily on finding customers who are just barely a good credit risk.
~ Charles Wheelan
So we simplify. We perform calculations that reduce a complex array of data into a handful of numbers that describe those data
~ Charles Wheelan