
elemental links

brenda michelson's technology advisory practice

Archives for June 2009

Lessons from Googlenomics: Data Abundance, Insight Scarcity

June 29, 2009 By brenda michelson

“"What's ubiquitous and cheap?" [Google’s Hal] Varian asks. "Data." And what is scarce? The analytic ability to utilize that data.”

The June issue of Wired has an excellent article by Steven Levy, entitled Secret of Googlenomics: Data-Fueled Recipe Brews Profitability.  The article delves into the history and algorithms behind Google’s auction based ad system, highlighting the significance of engineering, mathematics, economics, and data mining in Google’s success.

On the economics front, the article explains Hal Varian’s role as Chief Economist at Google, including why Google needs a chief economist:

“The simplest reason is that the company is an economy unto itself. The ad auction, marinated in that special sauce, is a seething laboratory of fiduciary forensics, with customers ranging from giant multinationals to dorm-room entrepreneurs, all billed by the world's largest micropayment system.

Google depends on economic principles to hone what has become the search engine of choice for more than 60 percent of all Internet surfers, and the company uses auction theory to grease the skids of its own operations. All these calculations require an army of math geeks, algorithms of Ramanujanian complexity, and a sales force more comfortable with whiteboard markers than fairway irons.”

After reading the article, Varian’s economic view of data ubiquity and analytic scarcity really stuck with me.  The quote I opened the post with isn’t directed at software availability or processing power.  It refers to the scarcity of people qualified to churn abundant data into economic value.  

What follows are some excerpts “about harnessing supply and demand”.  The sub-headers and emphasis are mine.

Enter Econometricians

"The people working for me are generally econometricians—sort of a cross between statisticians and economists," says Varian, who moved to Google full-time in 2007 (he's on leave from Berkeley) and leads two teams, one of them focused on analysis.

"Google needs mathematical types that have a rich tool set for looking for signals in noise," says statistician Daryl Pregibon, who joined Google in 2003 after 23 years as a top scientist at Bell Labs and AT&T Labs. "The rough rule of thumb is one statistician for every 100 computer scientists."

Ubiquitous Data

“As the amount of data at the company's disposal grows, the opportunities to exploit it multiply, which ends up further extending the range and scope of the Google economy…

Keywords and click rates are their bread and butter. "We are trying to understand the mechanisms behind the metrics," says Qing Wu, one of Varian's minions. His specialty is forecasting, so now he predicts patterns of queries based on the season, the climate, international holidays, even the time of day. "We have temperature data, weather data, and queries data, so we can do correlation and statistical modeling," Wu says. The results all feed into Google's backend system, helping advertisers devise more-efficient campaigns.”
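Wu's mix of query, temperature, and weather data maps to fairly standard time-series practice: correlate the series, then regress query volume on weather and calendar features. The sketch below is purely illustrative (synthetic data, invented column names, nothing to do with Google's internal tooling); it uses pandas and statsmodels only to show the shape of that "correlation and statistical modeling" step.

    # Illustrative only: a toy version of "correlation and statistical modeling"
    # of query volume against weather and calendar signals. All data is synthetic.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    dates = pd.date_range("2009-01-01", periods=365, freq="D")
    temp = 15 + 10 * np.sin(2 * np.pi * dates.dayofyear / 365) + rng.normal(0, 2, 365)
    # Hypothetical daily counts for a query that rises with temperature (e.g. "sunscreen")
    queries = 1_000 + 40 * temp + rng.normal(0, 100, 365)

    df = pd.DataFrame({"date": dates, "avg_temp_c": temp, "query_count": queries})
    df["day_of_week"] = df["date"].dt.day_name()
    df["month"] = df["date"].dt.month

    # Simple correlation between temperature and query volume
    print(df["query_count"].corr(df["avg_temp_c"]))

    # Basic seasonal regression: temperature plus day-of-week and month effects
    model = smf.ols("query_count ~ avg_temp_c + C(day_of_week) + C(month)", data=df).fit()
    print(model.params.head())

    # Naive forecast for a warm Tuesday in July from the fitted model
    tomorrow = pd.DataFrame({"avg_temp_c": [24.0], "day_of_week": ["Tuesday"], "month": [7]})
    print(model.predict(tomorrow))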

Continuous Analysis

“To track and test their predictions, Wu and his colleagues use dozens of onscreen dashboards that continuously stream information, a sort of Bloomberg terminal for the Googlesphere. Wu checks obsessively to see whether reality is matching the forecasts: "With a dashboard, you can monitor the queries, the amount of money you make, how many advertisers you have, how many keywords they're bidding on, what the rate of return is for each advertiser."”

Behavioral Based Insights

“Wu calls Google "the barometer of the world." Indeed, studying the clicks is like looking through a window with a panoramic view of everything. You can see the change of seasons—clicks gravitating toward skiing and heavy clothes in winter, bikinis and sunscreen in summer—and you can track who's up and down in pop culture. Most of us remember news events from television or newspapers; Googlers recall them as spikes in their graphs. "One of the big things a few years ago was the SARS epidemic," Tang says. Wu didn't even have to read the papers to know about the financial meltdown—he saw the jump in people Googling for gold. And since prediction and analysis are so crucial to AdWords, every bit of data, no matter how seemingly trivial, has potential value.”

Rise of the Datarati

“Varian believes that a new era is dawning for what you might call the datarati—and it's all about harnessing supply and demand. "What's ubiquitous and cheap?" Varian asks. "Data." And what is scarce? The analytic ability to utilize that data. As a result, he believes that the kind of technical person who once would have wound up working for a hedge fund on Wall Street will now work at a firm whose business hinges on making smart, daring choices—decisions based on surprising results gleaned from algorithmic spelunking and executed with the confidence that comes from really doing the math.”

Now, a few questions I think folks should consider:

  1. Who does that math in your organization? 
  2. Does your analytics / active information strategy suffer from information processing richness and insight scarcity?
  3. Who are, or should be, your datarati? 

Filed Under: active information, business, business intelligence, data science, information strategies, innovation, trends Tagged With: archive_0

Conversation with Steve Goldman of the CME Group on CEP as Enterprise Platform & StreamBase

June 24, 2009 By brenda michelson

Late in May, Mark Palmer, CEO of StreamBase, piqued the event processing community’s curiosity with this tweet: “Today I signed what I think is the most exciting CEP deal of 2009 – corporate selection by a household name…”.

While many household names use Complex Event Processing products, the products are acquired to solve a particular business problem, or perhaps, a handful of scenarios within a business unit. In his tweet, Mark signaled an adoption pattern shift, from CEP as application enabler, to CEP as enterprise technology platform.

For the event processing community — vendors, researchers, early adopters and advocates — this shift is long overdue. Of course, as a fact-based community, we require a little more information than a 140-character tweet.

That information became public this week, as StreamBase announced that the “household name” is the CME Group:

“StreamBase today announced that CME Group, the world’s largest and most diverse derivatives exchange, has selected StreamBase Complex Event processing solution for enterprise-wide deployment. After a comprehensive evaluation, CME Group chose StreamBase as its internal standard Complex Event Processing (CEP) platform, and will be initially deploying it for their options pricing applications.

“CME Group is one of the most demanding technology environments in the world, processing millions of orders a day in milliseconds, and disseminating market data in a reliable and low latency manner is critical to our customers,” said Steve Goldman, Director, Enterprise Architecture, CME Group. “Their high-performance multi-threaded server and easy to use modeling tools met our requirements and will enable the exchange to quickly react to the ever changing needs of our customers.”

…As an international marketplace, CME Group brings buyers and sellers together on the CME Globex electronic trading platform and on trading floors in Chicago and New York. By acting as the buyer to every seller and the seller to every buyer, CME Clearing virtually eliminates counterparty credit risk. CME Group offers the widest range of benchmark products available across all major asset classes, including futures and options based on interest rates, equity indexes, foreign exchange, energy, agricultural commodities, metals, and alternative investment products such as weather and real estate. More information can be found at www.cmegroup.com or via Twitter @cmegroup.”

Earlier this month, I had the opportunity to speak with Steve Goldman, Director of Enterprise Architecture at the CME Group, about event processing at the CME Group and their selection of StreamBase. We had a great conversation that substantiated Mark’s proclamation “of the most exciting CEP deal of 2009”.

One administrative note before I jump into the highlights from our conversation. What follows is an edited and summarized version of my notes from the call. In other words, these are not direct quotes.

Business Scenarios

The CME Group’s first StreamBase use case is generating options settlement prices. The daily settlement process involves complicated calculations based on a number of market data feeds. For an idea of the complexity and product line variations, here are some details from the CME’s Daily Settlement Procedures (pdf):

“Equity Options: Exchange staff identifies “seed strikes” that include the at-the-money straddle and several out-of-the-money calls/puts. The midpoints of the bid/ask quotes in the seed strikes on Globex are used to create an implied volatility skew. The skew is adjusted based upon the underlying settlement price to automatically generate the out-of-the money settlement prices, and the in-the-money options are settled automatically, using the method referenced on page 4 of this document. For longer dated options for which no Globex data exists, market participants provide bid/ask data for the seed strikes. Adjustments may be made to incorporate relevant pit data.

Non-Treasury Interest Rate Options: Similar to the procedure used in equity options, settlements in the front year of expirations are generated based on the skew derived from taking the midpoint of the bid/ask quotes in Exchange-designated seed strikes from the pit and from Globex. The skew is adjusted based upon the underlying settlement price. The additional guidelines referenced on page 3 of this document are also utilized. All other contract months are settled by Exchange officials based upon input from market participants.

Agricultural Options: Market participants provide quotes in Exchange-designated seed strikes which are used to generate the implied volatility skew and the skew is adjusted to the underlying futures settlement price. Dairy products are settled using a flat volatility determined by the at-the-money straddle.

Weather Options: Option trades are converted to “standard deviations” using a model based on Stephen Jewson’s model for pricing Weather. This standard deviation creates prices in the entire options series which is then applied to the open strikes.

Housing Futures and Options: The futures are settled to the last trade or better bid/offer on Globex. Absent a trade or better bid/offer, the prior day settlement is used. The options are settled using volatility skews derived from the midpoints of the bid/ask in a given strike, tied to a futures level.

Metal Options: Exchange officials, in consultation with market participants, establish the at-the-money volatility and create the volatility surface for the out-of-the money puts and calls for all option series based on traded/quoted outrights and spreads, which is entered into an options pricing model to determine the settlements for all strikes. Settlements may be adjusted to accommodate relevant orders.”
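The recurring recipe in these procedures (take bid/ask midpoints at a handful of exchange-designated seed strikes, derive an implied volatility skew, then generate settlement prices for every strike) can be sketched very roughly in code. The snippet below is not the CME Group's methodology: it assumes the seed-strike implied volatilities are already backed out, fits a simple quadratic skew in log-moneyness, and prices the remaining strikes with the Black-76 formula; every number is hypothetical.

    # Rough sketch of "seed strikes -> implied vol skew -> settlement prices".
    # NOT the CME Group's settlement algorithm; all inputs are hypothetical.
    import math
    import numpy as np

    def black76_call(F, K, sigma, T, disc=1.0):
        """Black-76 call price on a futures price F, with discount factor disc."""
        d1 = (math.log(F / K) + 0.5 * sigma ** 2 * T) / (sigma * math.sqrt(T))
        d2 = d1 - sigma * math.sqrt(T)
        cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # standard normal CDF
        return disc * (F * cdf(d1) - K * cdf(d2))

    # Hypothetical inputs: futures settlement, time to expiry, and implied vols
    # backed out from bid/ask midpoints at the seed strikes.
    F, T = 100.0, 0.25
    seed_strikes = np.array([90.0, 100.0, 110.0])
    seed_vols = np.array([0.28, 0.22, 0.25])

    # Fit a quadratic skew in log-moneyness, then evaluate it at every open strike.
    skew = np.polyfit(np.log(seed_strikes / F), seed_vols, deg=2)
    all_strikes = np.arange(80.0, 121.0, 5.0)
    fitted_vols = np.polyval(skew, np.log(all_strikes / F))

    for K, vol in zip(all_strikes, fitted_vols):
        print(f"K={K:6.1f}  vol={vol:.4f}  call settle={black76_call(F, K, vol, T):.4f}")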

[For more on the CME Group’s business, see Mark Palmer’s Innovation by the Numbers post.]

Event-Driven Organization

Goldman shared that the exchange has been an event-driven organization for a long time, at least since it began electronic trading. He described CEP as the epitome of that approach: an engine that processes thousands and thousands of real-time events, with a simple way to instruct the engine on what to do with those events.

Goldman emphasized the productivity benefits for business users. Business users will be able to build, dynamically change and test models. Once the business scenario is resolved, the business hands off the models to technology personnel who focus on implementation aspects, such as scale, reliability and monitoring.
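StreamBase's own models are built with its visual Studio tooling rather than hand-written code, so the following is only a generic, hypothetical illustration of the idea Goldman describes: an engine consuming a continuous stream of events, with a small amount of logic telling it what to do with each one. Here a plain Python generator computes a rolling volume-weighted average price over a sliding window of invented trade events.

    # Generic stream-processing illustration, NOT StreamBase code.
    # Event shape, field names, and the sample feed are all hypothetical.
    from __future__ import annotations

    from collections import deque
    from typing import Iterable, Iterator, NamedTuple

    class Trade(NamedTuple):
        symbol: str
        price: float
        qty: int

    def rolling_vwap(trades: Iterable[Trade], window: int = 5) -> Iterator[tuple[str, float]]:
        """Emit a volume-weighted average price over the last `window` trades per symbol."""
        buffers: dict[str, deque[Trade]] = {}
        for t in trades:
            buf = buffers.setdefault(t.symbol, deque(maxlen=window))
            buf.append(t)
            notional = sum(x.price * x.qty for x in buf)
            volume = sum(x.qty for x in buf)
            yield t.symbol, notional / volume

    if __name__ == "__main__":
        feed = [
            Trade("ES", 920.25, 10),
            Trade("ES", 920.50, 5),
            Trade("NQ", 1475.00, 2),
            Trade("ES", 919.75, 20),
        ]
        for symbol, vwap in rolling_vwap(feed, window=3):
            print(symbol, round(vwap, 4))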

[Weather Options Settlement Example in StreamBase]

By adding StreamBase, they now have a powerful and flexible tool to work with market data. To maximize this flexibility, the solution is being architected to receive all market data within the exchange, as well as many external data sources.

Future use cases include real-time risk analysis and the margining aspect of the business.

Selection Process

In respect to the selection process, Goldman spoke of mature enterprise architecture practices and deep business participation. They started by developing an enterprise architecture framework that looked into the entire settlement process. This resulted in a design, which ultimately led to Complex Event Processing.

Goldman outlined an evaluation process that continually narrows the field via introductory briefings, RFI responses, follow-on meetings, proofs of concept, gap analysis, and business terms. During the CEP evaluation, the CME Group looked at four vendors, and ended with two finalists.

The team determined that both finalists could do the job, meeting functional, performance, scale and monitoring requirements. Ultimately, the usability of the StreamBase Studio won the day.

The product’s ease of use, Goldman believes, also contributed to the business team’s deep engagement in the proof-of-concept and involvement in the final decision-making.

Return on Investment

Goldman projects the CEP engine investment will pay off in less than a year. The alternative to purchasing a CEP engine was a custom solution. A custom solution would have required more development time and delayed the introduction of business capability, which the CME Group needs now.

In addition, a custom solution would have included manual processing and “taped together” third party tools. Besides cost and time, this path introduces more opportunities for error.

Real-time World

Speaking to opportunities outside of capital markets, Goldman spoke of the importance of real-time business in an increasingly real-time world. The ability to see and process orders, data, risk and regulatory compliance in real-time ultimately results in more business. More business results in more profits, now.

[Disclosure: StreamBase is not a client of my company, Elemental Links. Nor do I have the skill to trade on the CME Group’s exchanges.]

Filed Under: active information, enterprise architecture, event driven architecture, event processing

Next Cloud Watching Stop: Enterprise 2.0 in Boston, June 22, 2009

June 19, 2009 By brenda michelson

Continuing my broad survey of cloud computing, I’m dropping by Enterprise 2.0 in Boston.  The cloud computing program starts with a full day of talks and panel discussions and concludes with an Evening in the Cloud:

“…leading purveyors of cloud computing will explain how best to leverage your existing IT investments while getting the benefits of the cloud. In addition to provoking discussion, this interactive program will allow you to "invest a virtual $1 million" in the cloud-based solution(s) you believe will give your business the most bang for its buck.”

As has become a habit, I’ll share the highlights via live-blogging and tweeting.  I’m looking forward to the evening “speed-geeking”, where the vendors have 6 minutes to demo their solutions in an effort to earn a portion of our (virtual) $1 million portfolios.  Given elasticity is a fundamental tenet of the cloud, I’m wondering if there is a way to “scale-up” my portfolio.  And of course, lose the “virtual” aspect…

If ‘un-virtualizing’ that $1 million portfolio doesn’t work out, I have a plan underway that removes the “un” from my unintentional cloud watching.  More on that another time.

Filed Under: circuit, cloud computing

Grumpy Architect week: There is more to services than re-use

June 3, 2009 By brenda michelson

Perhaps I’m just grumpy this week.  Or, concerned for the future.  Or, most likely, both.  Nevertheless, I find conventional SOA lore more bothersome than usual.  Specifically, the paired notions that the sole reason to implement services (or not) is re-use potential, and that the main architectural aspect of SOA is governing said services for re-use. Now, don’t misinterpret, there is true value in sharing services and governance is critical.  However, SOA, or better said, services-architecture doesn’t begin and end with re-use potential and enforcement.

For those with architectural backgrounds – software, not marketing trend – what follows is nothing new.  You are well acquainted with foundational tenets such as separation of concerns, modularity, loose coupling, cohesion, etc., and the associated benefits.  Unfortunately, based on my interactions over the last several months, I must report that (a) this knowledge is not universal, (b) people can’t articulate the benefits of well-architected software, and/or (c) the dots don’t connect all the way to SOA.

Since the presence of well-defined (and well-built) services is assumed in a bevy of existing and emerging technology strategies — mashups, event-processing, business process automation and cloud computing — we need to correct the record on the total value of services and make the connection to proper architectural discipline.

To aid in this ‘services-architecture’ education, I’d like to call out excerpts of three works.  The first source is Luke Hohmann’s excellent 2003 book, Beyond Software Architecture.  In Chapter 1, Hohmann describes (reminds us of) architectural design principles that have stood the test of time:

“Encapsulation
The architecture is organized around separate and relatively independent pieces that hide internal implementation details from each other.

Interfaces
The ways that subsystems within a larger design interact are clearly defined.  Ideally, these interactions are specified in such a way that they can remain relatively stable over the life of the system.  One way to accomplish this is through abstractions over the concrete implementation.  Programming to the abstraction allows greater variability as implementation needs change.

…Another area in which the principle of interfaces influences system design is the careful isolation of aspects of the system that are likely to experience the greatest amount of change behind stable interfaces.

Loose Coupling
Coupling refers to the degree of interconnectedness among different pieces in a system.  In general, loosely coupled pieces are easier to understand, test, reuse, and maintain, because they can be isolated from other pieces of the system.  Loose coupling also promotes parallelism in the implementation schedule.  Note the application of the first two principles aids loose coupling.

Appropriate Granularity
One of the key challenges associated with loose coupling concerns component granularity.  By granularity I mean the level of work performed by a component.  Loosely coupled components may be easy to understand, test, reuse, and maintain in isolation, but when they are created with too fine of a granularity, creating solutions using them can be harder because you have to stitch together so many to accomplish a meaningful piece of work.  Appropriate granularity is determined by the task(s) associated with the component.

High Cohesion
Cohesion describes how closely related the activities within a single piece (component) or among a group of pieces are.  A highly cohesive component means that its elements strongly relate to each other.

Parameterization
Components can be encapsulated, but this does not mean that they perform their work without some kind of parameterization or instrumentation.  The most effective components perform an appropriate amount of work with the right number and kind of parameters that enable their user to adjust their operation. 

Deferral
Many times the development team is faced with a tough decision that cannot be made with certainty. …By deferring these decisions as long as possible the overall development team gives themselves the best chance to make a good choice.  While you can’t defer a decision forever, you can quarantine its effects by using the principles of good architectural design.” 

To state the obvious, the above principles all apply to service design.  Pointing out the (apparently) less obvious: the benefits of applying these principles – protecting against and planning for change, breaking up work, smart-sizing assets, isolating risk – can be realized from services regardless of re-use potential.
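To make that connection concrete, here is a small, hypothetical sketch of "programming to the abstraction" in a services context: the consumer depends only on a credit-check interface, so a local stub and a remote implementation can be swapped without touching the caller. The benefit, isolating change behind a stable interface, holds whether or not the service is ever re-used. (The names are invented for illustration.)

    # Hypothetical illustration of encapsulation, interfaces, and loose coupling
    # applied to a service; the names are invented for the example.
    from abc import ABC, abstractmethod

    class CreditCheck(ABC):
        """Stable interface: the abstraction that consumers program to."""

        @abstractmethod
        def approve(self, customer_id: str, amount: float) -> bool: ...

    class InProcessCreditCheck(CreditCheck):
        """Simple local rule; internal details stay hidden behind the interface."""

        def approve(self, customer_id: str, amount: float) -> bool:
            return amount <= 5_000

    class RemoteCreditCheck(CreditCheck):
        """Calls an external credit service; can be swapped in without changing callers."""

        def __init__(self, endpoint: str):
            self.endpoint = endpoint  # implementation detail, invisible to consumers

        def approve(self, customer_id: str, amount: float) -> bool:
            raise NotImplementedError("issue the remote call here")

    def process_loan(checker: CreditCheck, customer_id: str, amount: float) -> str:
        """Consumer code depends only on the CreditCheck interface."""
        return "approved" if checker.approve(customer_id, amount) else "declined"

    print(process_loan(InProcessCreditCheck(), "C-42", 3200.0))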

Moving to a real world example, the March 2008 Harvard Business Review featured an article by David M. Upton and Bradley R. Staats, entitled Radically Simple IT.  The article is on Japan’s Shinsei Bank implementing a new enterprise system:

“In our research, we discovered a standout among the companies applying the path-based method: Japan’s Shinsei Bank. It succeeded in developing and deploying an entirely new enterprise system in one year at a cost of $55 million: That’s one-quarter of the time and about 10% of the cost of installing a traditional packaged system. The new system not only served as a low-cost, efficient platform for running the existing business but also was flexible enough to support the company’s growth into new areas, including retail banking, consumer finance, and a joint venture to sell Indian mutual funds in Japan.

The path-based principles that Shinsei applied in designing, building, and rolling out the system—forging together, not just aligning, business and IT strategies; employing the simplest possible technology; making the system truly modular; letting the system sell itself to users; and enabling users to influence future improvements—are a model for other companies. Some of these principles are variations on old themes while others turn the conventional wisdom on its head.”

Although the entire article is excellent, I wanted to call out the section on “Modularity, not just modules”.  The emphasis is mine.

“While the prevailing view that big IT programs and systems should consist of modules is hardly new, the concept of modularity is often misunderstood. Just because a software developer claims that the various parts of its applications are modules does not mean that they are actually modular. Modularity involves clearly specifying interfaces so that development work can take place within any one module without affecting the others. Companies often miss that point when developing enterprise systems. For example, we know of an automobile company that had teams working on multiple modules of a new enterprise system and claimed to have a modular design. However, one team was in charge of interfaces and was constantly changing them. Every alteration by this group forced all the other groups to spend huge amounts of time redoing the work they had already completed. Rather than limiting the impact of changes by embracing modularity, this company had actually amplified problems!

A truly modular architecture allows designers to focus on building solutions to local problems without disturbing the global system. With small, modular pieces, the organization can purchase off-the-shelf solutions or turn to inside or outside developers for a certain piece, accelerating the speed of development. Modular architecture also makes it easier to upgrade the technology within modules once the system is up and running.

Breaking down and solving problems in this way offers a number of advantages beyond speed. It allows the IT team to concentrate on obtaining the lowest-cost solution for each part and (by partitioning work) reduces the impact of a single point of failure. Clearly specifying the functions of modules and the interfaces makes it easier to build a module that can be reused in other applications.

The modular approach was a critical part of achieving the bank’s strategy, as Dvivedi described it, “to scale up and expand into new activities with ease, to be able to service the needs of the organization as it grows from a baby into an adult…and avoid building capacity before we need it.” Take loan-processing capabilities. The project team rolled out the capabilities in small stages for three reasons: to prove to management that the computer system would perform as promised, to avoid overwhelming managers and users with too much automation all at once, and to be able to address any technical issues quickly as they arose. Accordingly, the team initially sought to show that the system could correctly approve credit for a small number of loans (20 to 30 a day). Then the team developed the capacity to fully process 200 to 300 loans a day. As the business grew, Shinsei eliminated manual work to reach a capacity for processing 6,000 loans a day.

Thanks to the modular structure of the automated system, Shinsei can simply replace one part (the loan-application or credit-checking functions, for example) without affecting the rest. What’s more, modularity has allowed Shinsei to change its IT when appropriate or necessary without having to risk upsetting customers. It can keep the customer interfaces (such as web pages or the format of the ATM screen) the same while changing the back-end systems.”

Besides the excellent real-world example in applying and benefiting from the architectural principles cited by Hohmann, this article also calls out the ability to source functionality at a modular level.  With the advent of cloud computing, and subsequent opening of service markets, there is even more motivation to design and implement a services architecture.  As Dave Linthicum advises, “leverage other people’s work”.

Lastly, I want to point out a sidebar, A Guide to Modularity, from a 1997 Harvard Business Review article on Managing in an Age of Modularity.  The premise of that article was to encourage managers outside of technology and manufacturing to embrace modularity practices in product development:

“By breaking up a product into subsystems, or modules, designers, producers, and users have gained enormous flexibility. Different companies can take responsibility for separate modules and be confident that a reliable product will arise from their collective efforts.”

A Guide to Modularity

“Modularity is a strategy for organizing complex products and processes efficiently. A modular system is composed of units (or modules) that are designed independently but still function as an integrated whole. Designers achieve modularity by partitioning information into visible design rules and hidden design parameters. Modularity is beneficial only if the partition is precise, unambiguous, and complete.

The visible design rules (also called visible information) are decisions that affect subsequent design decisions. Ideally, the visible design rules are established early in a design process and communicated broadly to those involved. Visible design rules fall into three categories:

•    An architecture, which specifies what modules will be part of the system and what their functions will be.

•    Interfaces that describe in detail how the modules will interact, including how they will fit together, connect, and communicate.

•    Standards for testing a module’s conformity to the design rules (can module X function in the system?) and for measuring one module’s performance relative to another (how good is module X versus module Y?).

Practitioners sometimes lump all three elements of the visible information together and call them all simply “the architecture,” “the interfaces,” or “the standards.”

The hidden design parameters (also called hidden information) are decisions that do not affect the design beyond the local module. Hidden elements can be chosen late and changed often and do not have to be communicated to anyone beyond the module design team.”

In respect to SOA, the design rule most often broken is identifying “what modules will be part of the system and what their functions will be.”  The most common mistakes:

  1. the service portfolio map’s scope is immediately reduced to “those services that will be re-used”
  2. the service portfolio map mirrors the current software asset repository
  3. the service portfolio map is a derived artifact from individual project plans and deliverables

In creating a service portfolio map, start with business capabilities, business processes and/or business information, and perform business analysis to identify key business concepts that will be represented by, and further partitioned into, services.  Depending on the granularity of your starting point, work two to four levels down.  For instance, if your starting point is Supply Chain, you’ll still be doing business analysis four levels down.  If your starting point is Warehouse Receiving, by the fourth level you are probably in implementation detail.  Fine for a Warehouse Receiving project, but too deep for your service portfolio map.
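As a purely illustrative sketch (the names and groupings are invented), the fragment below shows what "two to four levels down" from Supply Chain might look like in a service portfolio map; the point is that the map stops at business concepts and candidate business services, leaving implementation detail to the projects.

    # Hypothetical service portfolio map fragment; names and levels are invented.
    service_portfolio = {
        "Supply Chain": {                      # level 1: business capability
            "Warehouse Management": {          # level 2: business process area
                "Warehouse Receiving": [       # level 3: business process
                    "Advance Ship Notice",     # level 4: candidate business services
                    "Receipt Verification",
                    "Putaway Assignment",
                ],
                "Inventory Control": [
                    "Stock Level Inquiry",
                    "Cycle Count",
                ],
            },
            "Order Fulfillment": {
                "Order Allocation": [
                    "Inventory Reservation",
                    "Backorder Handling",
                ],
            },
        },
    }

    # A Warehouse Receiving project would go deeper (message formats, bin lookups),
    # but that detail belongs to the project, not the portfolio map.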

With a clear understanding of the architectural aspects and full benefits of services, and a high-level service portfolio map, you can better position your organization to succeed in this new environment where services, and a services mindset, are assumed.

Filed Under: business architecture, cloud computing, enterprise architecture, event driven architecture, services architecture, soa Tagged With: archive_0
