It’d be easy to chalk up today’s choice to my being in pre-vacation mode, but in truth, I’ve had this New York Times Baseball Science article open in a tab for nearly a month. When I first read it, I immediately thought of connections to my recent post Lessons from Googlenomics: Data Abundance, Insight Scarcity.
In the referenced Wired Googlenomics article, Hal Varian asks, “What’s ubiquitous and cheap?” His answer: “Data.” He follows up with, “And what is scarce? The analytic ability to utilize that data.”
The Baseball Science article highlights an innovative way Major League Baseball is collecting even more player data – defense and base running – via a new system of high-resolution cameras and supporting software:
“A new camera and software system in its final testing phases will record the exact speed and location of the ball and every player on the field, allowing the most digitized of sports to be overrun anew by hundreds of innovative statistics that will rate players more accurately, almost certainly affect their compensation and perhaps alter how the game itself is played.
…In San Francisco, four high-resolution cameras sit on light towers 162 feet up, capturing everything that happens on the field in three dimensions and wiring it to a control room below. Software tools determine which movements are the ball, which are fielders and runners, and which are passing seagulls. More than two million meaningful location points are recorded per game.”
However, the system’s output is just “simple time-stamped x-y-z coordinates,” and sophisticated algorithms are still required to turn that raw data into insights:
“Software and artificial-intelligence algorithms must still be developed to turn simple time-stamped x-y-z coordinates into batted-ball speeds, throwing distances and comparative tools to make the data come alive.”
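To make that gap between raw data and insight a little more concrete, here is a minimal Python sketch of the simplest step: turning a handful of time-stamped x-y-z points into a distance and an average speed. It is only an illustration, not how the MLB system works. The `Sample` type, the function names, and the choice of seconds and feet as units are all my own assumptions, since the article does not describe the actual data format.

```python
import math
from typing import NamedTuple, Sequence


class Sample(NamedTuple):
    """One tracked point. Units are assumed: seconds and feet."""
    t: float  # timestamp in seconds
    x: float  # position in feet
    y: float
    z: float


def path_length(samples: Sequence[Sample]) -> float:
    """Sum the straight-line distances between consecutive points (feet)."""
    return sum(
        math.dist(a[1:], b[1:])  # compare x-y-z only, skipping the timestamp
        for a, b in zip(samples, samples[1:])
    )


def average_speed(samples: Sequence[Sample]) -> float:
    """Average speed over the tracked path, in feet per second."""
    elapsed = samples[-1].t - samples[0].t
    if elapsed <= 0:
        raise ValueError("samples must span a positive amount of time")
    return path_length(samples) / elapsed


def feet_per_second_to_mph(speed: float) -> float:
    """Convert feet per second to miles per hour (5,280 feet per mile)."""
    return speed * 3600 / 5280
```

Feeding in three points that are each 5 feet apart and 0.1 seconds apart gives a path of 10 feet covered at about 50 feet per second, or roughly 34 mph. The genuinely hard parts, such as deciding which points belong to the ball rather than a seagull and recognizing that a given path was a throw, are exactly what the article says still needs to be built.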
Beyond turning the raw data into meaningful information about player actions and game outcomes, the teams, the league, and legions of fans and broadcasters still need to figure out how to manage and act on this data trove:
“Teams have begun scrambling to develop uses for the new data, which will be unveiled Saturday to a group of baseball executives, statisticians and academics, knowing it will probably become the largest single advance in baseball science since the development of the box score. Several major league executives would not publicly acknowledge their enthusiasm for the new system, to better protect their plans for leveraging it.
“It can be a big deal,” the Cleveland Indians’ general manager, Mark Shapiro, said. “We’ve gotten so much data for offense, but defensive objective analysis has been the most challenging area to get any meaningful handle on. This is information that’s not available anywhere. When you create that much data you almost have to change the structure of the front office to make sense of it.””
These two challenges, making the data meaningful and developing actionable business insights, are the work of the individuals Hal Varian refers to as the “datarati”:
“Varian believes that a new era is dawning for what you might call the datarati—and it’s all about harnessing supply and demand. “What’s ubiquitous and cheap?” Varian asks. “Data.” And what is scarce? The analytic ability to utilize that data. As a result, he believes that the kind of technical person who once would have wound up working for a hedge fund on Wall Street will now work at a firm whose business hinges on making smart, daring choices—decisions based on surprising results gleaned from algorithmic spelunking and executed with the confidence that comes from really doing the math.”
In the baseball world, Billy Beane and Theo Epstein are considered ‘datarati’ archetypes.
As a geek by trade and a lifelong baseball fan, I find myself intrigued by this new data collection technology and the resulting analytic and management possibilities. Of course, it also got me thinking beyond baseball, and sports, to wonder what other fields (no pun intended) might benefit from digital-camera-based data collection and the reconciliation of data points to real-world scenarios.
From my own background, I can envision the technology being applied to analyze and improve efficiencies in retail stores, warehouses and factories. How about you?
Some questions to consider:
Could this data collection technique benefit your organization?
How about as a data consumer? Can you think of an external scenario that might provide meaningful “simple time-stamped x-y-z coordinates” to your organization?
Has your organization embraced the rise of the datarati?