
elemental links

brenda michelson's technology advisory practice

Hans Rosling Joy of Stats Addendum: Making Data Dance

January 21, 2011 By brenda michelson

In December, I posted Hans Rosling’s Joy of Stats video in which Rosling “tells the story of the world in 200 countries over 200 years using 120,000 numbers – in just four minutes.”

The video is a terrific example of combining statistics, data visualization and storytelling to simplify the delivery, and therefore absorption, of an important, data-rich message.

This afternoon, as I was flipping through the latest Economist Technology Quarterly, I ran across the perfect addendum to the Joy of Stats video: an article on Rosling’s work entitled Making Data Dance.

The piece starts by describing Rosling’s primary work:

“The realities that Dr Rosling is trying to highlight have been gleaned from decades of studying statistics. They sound simple enough: that it no longer makes sense to consider the world as divided between developing and industrialised countries; and that people everywhere respond similarly to increasing levels of wealth and health, with higher material aspirations and smaller families. “There is no such thing as a ‘we’ and a ‘they’, with a gap in between,” Dr Rosling says. “The majority of people are living in the middle—although the distance from the very poorest to very richest is wider than ever.” The best measure of political stability of a country, he believes, is whether fertility rates are falling, because that indicates that women are being educated and basic health services are being provided. “The only way to reach sustainable population levels is to improve public health,” he says. “Child survival is the new green.””

It then moves to his embrace of infographics:

“Communicating these realities to students in his international-development classes at Uppsala University proved problematic, however. “I used to make huge photocopied sheets of Unicef statistics for the students on income, life expectancy and fertility rates around the planet. But it didn’t change their world view, it didn’t create another mindset. They still insisted that we were different, that all the Chinese cannot all have a car,” says Dr Rosling. He needed a new way to present his conclusions—a way to turn dusty figures into convincing illustrations.

Innovation in infographics has always been driven by the need to explain difficult things, Dr Rosling points out. “Florence Nightingale is known as a nurse, but she also made a new kind of pie chart showing how many soldiers in the Crimean war died from military action and how many from disease.” Nightingale’s famous “coxcomb” chart from 1858 demonstrated that improving hygiene in British military hospitals slashed mortality rates. She said its design was intended “to affect thro’ the eyes what we fail to convey to the public through their word-proof ears.””

Next, the article uncovers a bit of the how:

“With the help of his son and daughter-in-law, Dr Rosling then developed Trendalyzer software (now called Gapminder) to animate the bubbles.

“It was a conscious intent to make the data look alive,” he explains. “My son invented the trails, like patterns in the snow, so you can see how countries have changed. And we could overlay countries historically so that it’s clear that, for example, China today is like Sweden in 1948 and people in Vietnam now have the same life expectancy as Americans did in 1985. Every country has a graphical path that describes its development.””

And (news to me) the Gapminder software is available via Google, as Google Motion Chart:

“The software was a hit, first with his classes in Sweden, then worldwide after a video of his 2006 TED lecture was posted online. Dr Rosling was soon helping Al Gore polish up his climate-change presentations and talking about Gapminder with the founders of Google, Larry Page and Sergey Brin. “I could see in their eyes how excited they were, how my software fitted with their ideas about making organised information generally available,” he recalls. “We started collaborating and quickly reached the conclusion that it was more rational that Google acquire our technology and the team behind it.” Within a year Google had bought Gapminder, and a version of the bubble-graph software is now available free online under the name Google Motion Chart.”

Lastly, the article describes the very real problem of making data public, and Rosling’s work to “become the Robin Hood for free data”.

Check out the full article.

If you create any interesting visualizations with Google Motion Chart, please do share.

Filed Under: active information, data science, data visualization Tagged With: economist, Hans Rosling, infographics

The Beauty of Data: Hans Rosling’s The Joy of Stats

December 1, 2010 By brenda michelson

Floating around Twitter today is this tremendous clip of Hans Rosling using a unique data visualization technique to tell a story.  The clip is an excerpt from an upcoming BBC special, The Joy of Stats.

“Hans Rosling’s famous lectures combine enormous quantities of public data with a sport’s commentator’s style to reveal the story of the world’s past, present and future development. Now he explores stats in a way he has never done before – using augmented reality animation.

In this spectacular section of ‘The Joy of Stats’ he tells the story of the world in 200 countries over 200 years using 120,000 numbers – in just four minutes.

Plotting life expectancy against income for every country since 1810, Hans shows how the world we live in is radically different from the world most of us imagine.”

Check it out. It’s amazing.

Filed Under: active information, data science, data visualization Tagged With: BBC, Hans Rosling

O’Reilly Radar: What is Data Science?

June 3, 2010 By brenda michelson

Mike Loukides has an excellent piece on O’Reilly Radar entitled “What is data science?” In the article, Loukides covers making data products, the data lifecycle, working with data at scale (Big Data), storytelling and data scientists.

Throughout the article, Loukides introduces the reader to many data science concepts, tools, experts and skills.

A few items are worth calling out. First, I love the term “data exhaust”:

“These recommendations are “data products” that help to drive Amazon’s more traditional retail business. They come about because Amazon understands that a book isn’t just a book, a camera isn’t just a camera, and a customer isn’t just a customer; customers generate a trail of “data exhaust” that can be mined and put to use, and a camera is a cloud of data that can be correlated with the customers’ behavior, the data they leave every time they visit the site.”

I think this “make lemonade” sentiment on data quality is crucial:

“Once you’ve parsed the data, you can start thinking about the quality of your data. Data is frequently missing or incongruous. If data is missing, do you simply ignore the missing points? That isn’t always possible. If data is incongruous, do you decide that something is wrong with badly behaved data (after all, equipment fails), or that the incongruous data is telling its own story, which may be more interesting? It’s reported that the discovery of global warming was delayed because automated data collection tools discarded readings that were too low. In data science, what you have is frequently all you’re going to get. It’s usually impossible to get “better” data, and you have no alternative but to work with the data at hand.”
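To make the "don't just discard it" point concrete, here is a minimal Python sketch of one way to triage readings: missing and out-of-range values are set aside for review rather than silently dropped. The function name, the use of `None` for a missing reading, and the `low`/`high` thresholds are my own assumptions for illustration; nothing here comes from the Loukides article.

```python
def triage_readings(readings, low, high):
    """Sort readings into three buckets without discarding any of them.

    readings: list of floats, with None marking a missing reading (assumption).
    low, high: hypothetical bounds of the "expected" range.
    Returns (in_range, flagged, missing), where in_range and flagged are
    lists of (index, value) pairs and missing is a list of indices.
    """
    in_range, flagged, missing = [], [], []
    for index, value in enumerate(readings):
        if value is None:
            # Remember where the gap is instead of ignoring it.
            missing.append(index)
        elif low <= value <= high:
            in_range.append((index, value))
        else:
            # Keep the incongruous value: it may be telling its own story.
            flagged.append((index, value))
    return in_range, flagged, missing
```

The design choice is simply that every reading ends up in exactly one bucket, so a person (not the collection tool) decides whether a flagged value is an equipment failure or a finding.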

The big data definition is excellent. It’s about the problem, not the (product) solutions:

“The most meaningful definition I’ve heard: “big data” is when the size of the data itself becomes part of the problem. We’re discussing data problems ranging from gigabytes to petabytes of data. At some point, traditional techniques for working with data run out of steam.”

And the information platforms / dataspaces concept ties to my active information tier:

“What are we trying to do with data that’s different? According to Jeff Hammerbacher (@hackingdata), we’re trying to build information platforms or dataspaces. Information platforms are similar to traditional data warehouses, but different. They expose rich APIs, and are designed for exploring and understanding the data rather than for traditional analysis and reporting. They accept all data formats, including the most messy, and their schemas evolve as the understanding of the data changes.”

If you want to learn something today, read the article. Then bookmark it for future reference.

Filed Under: active information, data science, trends

Lessons from the Crisis: Behavior Matters

August 25, 2009 By brenda michelson

The July/August issue of the Harvard Business Review has a feature by McKinsey & Company on 10 Trends You Have to Watch.  The premise is that, after a year of turmoil, business executives are starting to look toward the future.  However, the world has changed, and with it, so have some key trends.

The trend that caught my attention, Management as Science, falls squarely in the datarati realm:

“Data, computing power, and mathematical models have been transforming many realms of management from art to science. But the crisis exposed the limitations of certain tools. In particular, the world saw the folly of the reliance by banks, insurance companies, and others on financial models that assumed economic rationality, linearity, equilibrium, and bell-curve distributions. As the recession unfolded, it became clear that the models had failed badly.

It would be wrong to conclude that managers should go back to making decisions only on the basis of gut instinct. The real lessons are that the tools need to incorporate more-realistic visions of human behavior—most likely by drawing on behavioral economics, becoming more dynamic, and integrating real-world feedback—and that business executives need to get better at using them. Companies will, rightly, continue to seek ways to exploit the increasing amounts of data and computing power. As they do so, decision makers in every industry must take responsibility for looking inside the black boxes that advanced quantitative tools often represent and understanding their functioning, assumptions, and limitations.”

In retrospect, this makes perfect sense.  Human behavior is far from universally predictable.  Recall how the U.S. Government expected citizens to re-invigorate the economy by engaging in non-essential shopping with that first stimulus check.  Instead, what did many do?  They paid bills, bought groceries or tucked the money away for the tough times to come.  Survival instincts won out over an algorithm.

Once you recognize that behavior matters, a natural follow-on is, “where does behavioral data come from?”  No surprise, Google has a veritable treasure trove:

“Wu calls Google "the barometer of the world." Indeed, studying the clicks is like looking through a window with a panoramic view of everything. You can see the change of seasons—clicks gravitating toward skiing and heavy clothes in winter, bikinis and sunscreen in summer—and you can track who’s up and down in pop culture. Most of us remember news events from television or newspapers; Googlers recall them as spikes in their graphs. "One of the big things a few years ago was the SARS epidemic," Tang says. Wu didn’t even have to read the papers to know about the financial meltdown—he saw the jump in people Googling for gold.”

As for the rest of us, we can mine internal and public datasets, set up prediction markets, employ sentiment tools and/or hire behavioral economics consultants.  First though, I’d recommend familiarizing yourself with the field of behavioral economics, and paying special attention to the datarati ties. I plan to ease myself in with Dan Ariely’s Predictably Irrational.

If you have experience applying behavioral economics in your business, or reading/learning suggestions, please share what you can in the comments or via email.

Filed Under: active information, business, business intelligence, data science, trends

Lessons from Baseball Science: A picture is worth 1000 data points

August 5, 2009 By brenda michelson

It’d be easy to chalk up today’s choice to my being in pre-vacation mode, but in truth, I’ve had this New York Times Baseball Science article open in a tab for nearly a month.  When I first read it, I immediately thought of connections to my recent post Lessons from Googlenomics: Data Abundance, Insight Scarcity.

In the referenced Wired Googlenomics article, Hal Varian asks, “What’s ubiquitous and cheap?” His answer: “Data.” He follows up with “And what is scarce? The analytic ability to utilize that data.”

The Baseball Science article highlights an innovative way Major League Baseball is collecting even more player data – defense and base running – via a new system of high-resolution cameras and supporting software:

“A new camera and software system in its final testing phases will record the exact speed and location of the ball and every player on the field, allowing the most digitized of sports to be overrun anew by hundreds of innovative statistics that will rate players more accurately, almost certainly affect their compensation and perhaps alter how the game itself is played.

…In San Francisco, four high-resolution cameras sit on light towers 162 feet up, capturing everything that happens on the field in three dimensions and wiring it to a control room below. Software tools determine which movements are the ball, which are fielders and runners, and which are passing seagulls. More than two million meaningful location points are recorded per game.”

However, the system output is “simple time-stamped x-y-z coordinates”, which require sophisticated algorithms to turn the raw data into insights:

“Software and artificial-intelligence algorithms must still be developed to turn simple time-stamped x-y-z coordinates into batted-ball speeds, throwing distances and comparative tools to make the data come alive.”
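As a rough illustration of the simplest step in that chain, here is a minimal Python sketch that turns a series of time-stamped x-y-z points into average speeds and a total distance traveled. The sample format (a list of `(t, x, y, z)` tuples, sorted by time) and the function names are my own assumptions; the article does not describe the real system's data format or algorithms.

```python
import math


def segment_speeds(samples):
    """Average speed over each consecutive pair of samples.

    samples: list of (t, x, y, z) tuples sorted by time (assumed format).
    The result is in distance units per time unit, e.g. feet per second
    if coordinates are in feet and timestamps are in seconds.
    """
    speeds = []
    for (t0, *p0), (t1, *p1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Straight-line distance between the two points, divided by elapsed time.
        speeds.append(math.dist(p0, p1) / dt)
    return speeds


def path_length(samples):
    """Total straight-line distance covered across all consecutive samples."""
    return sum(math.dist(a[1:], b[1:]) for a, b in zip(samples, samples[1:]))
```

For example, three hypothetical samples half a second apart, `[(0.0, 0, 0, 0), (0.5, 3, 4, 0), (1.0, 3, 4, 12)]`, cover segments of 5 and 12 units, so the segment speeds are 10 and 24 units per second and the path length is 17. The hard parts the article hints at (deciding which points are the ball, smoothing noisy tracks, comparing players) sit on top of arithmetic like this.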

Beyond turning the raw data into meaningful information regarding player actions and game outcomes, the teams, league, and legions of fans and broadcasters still need to figure out how to act on, and manage, this data trove:

“Teams have begun scrambling to develop uses for the new data, which will be unveiled Saturday to a group of baseball executives, statisticians and academics, knowing it will probably become the largest single advance in baseball science since the development of the box score. Several major league executives would not publicly acknowledge their enthusiasm for the new system, to better protect their plans for leveraging it.

“It can be a big deal,” the Cleveland Indians’ general manager, Mark Shapiro, said. “We’ve gotten so much data for offense, but defensive objective analysis has been the most challenging area to get any meaningful handle on. This is information that’s not available anywhere. When you create that much data you almost have to change the structure of the front office to make sense of it.””

The two challenges above, making the data meaningful and developing actionable business insights, are tackled by the individuals Hal Varian refers to as the “datarati”:

“Varian believes that a new era is dawning for what you might call the datarati—and it’s all about harnessing supply and demand. “What’s ubiquitous and cheap?” Varian asks. “Data.” And what is scarce? The analytic ability to utilize that data. As a result, he believes that the kind of technical person who once would have wound up working for a hedge fund on Wall Street will now work at a firm whose business hinges on making smart, daring choices—decisions based on surprising results gleaned from algorithmic spelunking and executed with the confidence that comes from really doing the math.”

In the baseball world, Billy Beane and Theo Epstein are considered ‘datarati’ archetypes.

As a geek by trade and a lifelong baseball fan, I find myself intrigued by this new data collection technology and the resulting analytic and management possibilities.  Of course, it also got me thinking beyond baseball, and sports, to wonder what other fields (no pun intended) might benefit from digital-camera-based data collection and data-point-to-scenario reconciliation.

From my own background, I can envision the technology being applied to analyze and improve efficiencies in retail stores, warehouses and factories.  How about you?

Some questions to consider:

Could this data collection technique benefit your organization?

How about as a data consumer?  Can you think of an external scenario that might provide meaningful “simple time-stamped x-y-z coordinates” to your organization?

Has your organization embraced the rise of the datarati?

Filed Under: active information, business, business intelligence, data science, innovation, trends Tagged With: archive_0
