September 06, 2017

Is Digital Marketing having its ‘Deep Blue’ moment?


Garry Kasparov will forever be remembered as perhaps the greatest chess player of all time, dominating the game for almost twenty years until his retirement in 2005. But ironically he may be best remembered for the match he failed to win twenty years ago in 1997 against IBM’s Deep Blue chess computer. That watershed moment – marking the point at which computers effectively surpassed humans in chess-playing ability – prompted much speculation and hand-wringing about the coming obsolescence of the human brain, now that a mere computer had been able to beat the best chess grandmaster in the world.

Since then, computers and chess software have only grown more powerful, to the point that a $50 commercial chess program (or even a mobile app) can beat most grandmasters easily. Faced with this, you might expect Kasparov and other top-flight players to have grown disillusioned with the game, or defensive about the encroachment of computers on their intellectual territory; but in fact the reverse is true.

Today’s chess grandmasters make extensive use of computers to practice, try out new strategies, and prepare for tournaments, in the process becoming a little more like the machines that outpaced them in 1997. Kasparov himself was instrumental in pioneering a new type of chess game, Advanced Chess, in which humans are allowed to consult with chess software as they play. In his new book, “Deep Thinking: Where Machine Intelligence Ends and Human Intelligence Begins”, Kasparov writes about an Advanced Chess match he played in 1998 against Veselin Topalov:

“Having a computer partner also meant never having to worry about making a tactical blunder. The computer could project the consequences of each move we considered, pointing out possible outcomes and countermoves we might otherwise have missed. With that taken care of for us, we could concentrate on strategic planning instead of spending so much time on calculations. Human creativity was even more paramount under these conditions.”

What Kasparov and his successors in the competitive chess-playing world have discovered is that, when it comes to chess, the strongest player is not man or machine, but man and machine. In fact, a new kind of chess tournament has sprung up, Freestyle Chess, in which teams of humans and computers compete against one another, each bringing their respective strengths to the game: creativity, strategy and intuition from the humans, and tactical outcome prediction from the computers.

And your point is?

You may be asking what relevance this has to digital marketing. In fact, there are strong similarities between chess and marketing (particularly digital marketing): they are both highly quantifiable pursuits with clear outcomes which have historically relied solely on human intuition and creativity for success.

As in chess, digital marketing relies upon a continuous reassessment of the ‘board’ (customer behaviors and history) in order to decide upon the next ‘move’ (a particular campaign communication aimed at a particular group of customers). Once the move has been made, the board needs to be reassessed before taking the next move.

Today’s digital marketer is much like the chess grandmaster of the early 1990s – they rely on their intuitive understanding of their audience’s makeup and preferences to decide what offers and messages they want to deliver, to which users, and in which channels. Of course, digital marketers understand that measuring campaign outcomes and audience response (using techniques like control groups and attribution analysis) is very important, but most still operate in a world where the humans make the decisions, and the computers merely provide the numbers to support the decision-making.

Luddites 2.0

When Kasparov was asked in 1990 if a computer could beat a grandmaster before the year 2000, he quipped:

“No way - and if any grandmaster has difficulties playing computers, I would be happy to provide my advice.”

Today’s digital marketers can be forgiven for exhibiting some of the same skepticism. Ask them how they came up with a new idea for an ad, or how they know that a particular product will be just right for a particular audience, and they may not be able to answer – they will just know that their intuition is sound. As a result it can seem incredible that a computer can pick the right audience for a campaign, and match the appropriate offer and creative to that audience. 

But the computers are coming. As I mentioned in my earlier post on bandit experimentation, companies like Amplero, Kahuna and Cerebri AI are pitching intelligent systems that claim to take a lot of this decision-making about creative choice, audience, channel and other campaign variables out of the hands of humans. But where does that leave the digital marketer?

We welcome our robot colleagues

The clue lies in the insights that Kasparov ultimately drew from his defeat. He realized that the strengths he brought were different from, and complementary to, the strengths of the computer. The same holds true for digital marketing. Coming up with product value propositions, campaign messaging and creative are activities which computers are nowhere close to being good at, especially in the context of broader intangible brand attributes. On the other hand, audience selection and targeting, as well as creative optimization, are highly suited to automation, to the extent that computers can be expected to perform significantly better than their human counterparts, much as chess software outperforms human players.

Clearly humans and machines need to work together to create and execute the best performing campaigns, but exactly how this model will work is still being figured out.

Today, most digital marketers build campaign audiences by hand, identifying specific audience attributes (such as demographics or behavioral history) and applying filters to those attributes to build segments. The more sophisticated the marketer attempts to be in selecting audience attributes for campaign segments, the more cost they incur in the setup of those campaigns, making the ROI equation harder to balance.

The emerging alternative approach is to provide an ML/AI system with a set of audience (and campaign) attributes, and let it figure out which combinations of audience and offer/creative deliver the best results by experimenting with different combinations of these attributes in outbound communications. But this raises some important questions:

  • How to choose the attributes in the first place
  • How to understand which attributes make a difference
  • How to fit ML/AI-driven campaigns into a broader communications cadence & strategy
  • How to use learnings from ML/AI-driven campaigns to develop new value propositions and creative executions

In other words, ML/AI-driven marketing systems cannot simply be ‘black boxes’ into which campaign objectives and creative are dumped, and then left to deliver clicks or conversions on the resulting campaign delivery. They need to inform and involve marketers as they do their work, so that the marketers can make their uniquely human contribution to the process of designing effective campaigns. The black box needs some knobs and dials, in other words.

The world of chess offers a further useful parallel here. Chess grandmasters make extensive use of specialized chess software like Fritz 15 or Shredder, which not only provide a comprehensive database of chess moves, but also training and analysis capabilities to help human players improve their chess and plan their games. These programs don’t simply play chess – they explain how they are making their recommendations, to enable their human counterparts to make their own decisions more effectively.

These are the kinds of systems that digital marketers need to transform their marketing with AI. In turn, marketers need to adjust the way they plan and define campaigns in the same way that chess grandmasters have dramatically changed the way they study, plan and play games of chess in the last twenty years, working alongside the computers before, during and after campaigns are run.

In 1997, it was far from clear how chess, and the people who played it, would react to the arrival of computers. Digital Marketing stands on a similar threshold today. Twenty years from now it will seem obvious how marketers’ roles would evolve, and how technology would adapt to support them. We’re in the fortunate position of getting to figure this out as it all unfolds, much as Kasparov did.

May 03, 2017

What’s next for the Digital Analytics Association?

I’ve been a member of the Digital Analytics Association for, it turns out, about twelve years – over half my professional life. In that time I’ve seen the organization grow and blossom into a vibrant community of professionals who are passionate about the work they do and about helping others to develop their own skills and career in digital analytics.

When the DAA started (as the WAA), web analytics was a decidedly niche activity, not considered as rigorous or demanding as ‘proper’ data mining or database development. Many of its early practitioners, like me, did not come from formal data backgrounds; we were to a large extent making things up as we went along, arguing with one another (often in lobby bars) about things like the proper definition of a page view, or the relative merits of JavaScript tags vs log files.

We didn’t know it at the time, but the niche activity we were helping to define would grow to dominate the entire field of data analytics. Today, transactional (i.e. log-like) and unstructured data comprise the vast majority of data being captured and analyzed worldwide and the analytical principles and techniques that the DAA championed have become the norm, not the exception.

The DAA and its members can justly derive a certain amount of satisfaction from knowing we were part of something so early on, but now that the rest of the world has shown up to the party that we started, how do we continue to differentiate the organization and add value to its members and the industry?

It’s to help answer this and other interesting and challenging questions facing the DAA that I’ve put my name forward for a position on the organization’s board. You can read my nomination (and, hopefully, vote for me) here if you’re a DAA member. After twelve years of benefiting from my DAA membership, it’s time to give something back to the organization.

If I’m elected to the board, I’ll devote my energies to helping DAA members adapt to and embrace the next set of transformations that are taking place within the industry. In my role at Microsoft I’m participating in a very rapid shift from traditional descriptive analytics, based around a recognizable cycle of do/measure/analyze/adjust, to machine learning-based optimization of business processes, particularly digital marketing. Predictive analytics and data science skills are therefore becoming more and more important in digital analytics, while the range of data and scenarios is exploding. This raises tricky questions for the DAA: Which skillsets and data scenarios should the association focus its energies on, and how to stay relevant as the industry changes so rapidly?

A big part of the answer, I believe, lies with the DAA members ourselves. At a DAA member event in Seattle last week, I met the excellent Scott Fasser of HackerAgency and had a fascinating conversation with him about a current passion of mine, multi-armed bandit experimentation for digital marketing. There are many experienced members of the DAA like Scott, who have deep expertise in different areas of digital analytics, and who are keen to share their knowledge with others. We need to find ways to connect the Scotts of this world to people who can benefit from their expertise, and more broadly connect the DAA’s more experienced members with those newer to the discipline so that they can pass on their hard-won knowledge.

Finally, given that so many new people have moved into the analytics neighborhood, the DAA needs to get out and meet some of the new neighbors rather than peering out through the curtains muttering about hipsters and gentrification. Many new groups of analytics & data science professionals have sprung up over the years, both formal and informal, and there are likely profitable connections to be made with at least some of these organizations, many of which share some of the same members as the DAA.

So if you’d like to see me put my shoulder to the wheel to address these and other challenges, please vote for me by May 12.

January 25, 2017

Solving the attribution conundrum with optimization-based marketing

Accurate multichannel campaign attribution has stumped the online marketing industry for years. But what if the solution is to stop worrying about attribution, and move to an optimization-driven approach?

You know those photo mosaic images, which suddenly became terribly popular a few years back? They cleverly use lots of individual tiny images to make up one large image. If you look closely you can make out the individual images, but you have to stand back to take in the full picture.

The same is true for measuring the impact of digital marketing. When you step back, techniques like Marketing Mix Modeling can show that, in aggregate, digital marketing works as a part of the overall marketing mix - it complements other elements of the mix such as television and retail to drive sales.

On the other hand, zooming in, it's fairly straightforward to understand the impact of individual digital marketing campaigns at a user level, using various forms of instrumentation and tagging to link user actions to the marketing that they've seen. These techniques have become so common that it’s a brave marketer today who spends money on a digital campaign without providing some kind of performance reporting.

The problem comes in the middle. If you zoom out of a mosaic picture, there is a point where you lose the detail of the individual photos but the bigger picture has not yet emerged. And so it is with digital marketing; understanding the way that multiple campaigns, across multiple digital channels, interact to influence behavior at the user level is a very challenging problem that has stumped the industry for years - the so-called "attribution problem".

To put it another way, we've moved on from deciding whether to do digital marketing; it's which digital marketing to do which is the conundrum today, and especially understanding which mix of digital marketing will drive the best results.

The attribution problem is a really tough one for a few reasons:

  • Digital marketing channels don’t drive user behavior independently, but in combination, and also interfere with each other (for example, an email campaign can drive search activity);
  • User "state" (the history of a user's exposure and response to marketing) is changing all the time, making taking a snapshot of users for analysis purposes very difficult;
  • Attribution models end up including so many assumptions (for example, "decay curves" or "adstock" for influence of certain channels) that they end up being a reflection of the assumptions rather than a reflection of reality.

The trouble is, most organizations understand that they can't just continue to invest in, execute and analyze their digital marketing in a siloed, channel by channel fashion; they want to create a consistent, coherent dialog with their audience that spans channels and devices. But how to do it?


Digital Marketing as an Optimization Problem

The answer to this dilemma lies in thinking differently about digital marketing, and treating it as a user-centric optimization problem instead of a descriptive analytics problem.

To understand how this is different from traditional digital marketing, let's first look at how most digital marketing campaigns are set up today:


In a traditional digital campaign, a specific audience is identified by a marketer (either from first or third-party data, or a combination of the two) and a set of creative (ads, emails, etc.) is then delivered to that audience. After some time (measured in days or weeks) the marketer looks at the results and makes decisions about how to improve the next campaign, or make adjustments to the current campaign, to improve effectiveness.

The performance of the campaign can be improved in-flight by using techniques like dynamic creative optimization to weed out low-performing creatives before the campaign has finished. But overall insights about the campaign are usually left to the end. Most campaigns are analyzed on a channel-by-channel basis, and even when they use control groups to measure lift, the analysis can't take into account the impact of other channels.

With an optimization-driven approach, instead of the marketer creating a series of discrete campaigns for individual products or offers, each with its own target audience and its own outcome measurement, the marketer creates a series of "offers" (essentially, product messages) which can be delivered to users. The offers - together with a set of creative assets - are made available to an optimization engine, which continually tries to predict the combination of offer, creative and channel (email, web etc.) which will deliver the best outcome (click, conversion, revenue, etc.) with users.


A good example of this in a single digital channel is Amazon's product recommendation feature on its website, which combines information that it has about you (your previous purchases, demographic information, what you're currently purchasing) and information about the products to present a series of "suggested products" (in other words, offers) to you.


Multi-armed banditry

There are a number of things you need in place to make the above model work, such as a single creative repository, and a consistent execution model across multiple channels. The magic at the center of the picture, however, is the optimization engine. This is a piece of software that is capable of running multiple concurrent combinatorial tests of your creative, offers, user segments and channels, to find the combinations that deliver the best results. This is a classic multi-armed bandit problem.

This statistical problem is so called because it is based on the idea of an imaginary gambler at a row of slot machines, trying to decide which ones to put money into to generate the best return. The gambler could use one of the following strategies:

  • Pick a slot machine at random and stick with it, which would mean he'd most likely miss the most generous machine, and could pick a terrible one
  • Spread his money equally between all the machines, minimizing his chances of putting all his money into a bad machine, but ensuring he doesn't strike it rich, either

The smarter thing for the gambler to do is to start by putting a little money into each machine, and then, based on the results he gets, divert the remainder of his money to the machine that delivered the best return. The first of these two phases is known as the "explore" phase; the second, the "exploit" phase.

Multi-armed bandit experimentation is good for situations where conditions can change over time. Our gambler can choose to continue to divert a little money to the other machines even once he's identified the "best" machine, since slot machines can vary their payout over time; this minimizes his chance of losing out if conditions change. As a result, multi-armed bandit experimentation is well-suited to campaign optimization because the users’ state is changing all the time - a user who has already received three emails about a product is much less likely to click on a fourth than a user who has never received an email about the same product, for example. Multi-armed bandit experimentation methodologies can be slower to deliver statistically significant results than traditional A/B or multivariate testing, but they are more robust in dynamic environments.
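To make the explore/exploit idea concrete, here is a minimal sketch of one simple bandit strategy, usually called epsilon-greedy. It is only one of several possible strategies and is not necessarily what any particular optimization engine uses. The payout probabilities, the number of pulls and the 10% exploration rate are all made-up values for illustration.

```python
import random


def epsilon_greedy(payout_probs, pulls=10_000, epsilon=0.1, seed=0):
    """Simulate a gambler who mostly plays the best-looking machine
    but keeps spending a fraction `epsilon` of pulls on exploration."""
    rng = random.Random(seed)
    n = len(payout_probs)
    plays = [0] * n  # how many times each machine has been played
    wins = [0] * n   # how many of those plays paid out

    for _ in range(pulls):
        untried = [i for i in range(n) if plays[i] == 0]
        if untried:
            arm = untried[0]                 # explore: try every machine once
        elif rng.random() < epsilon:
            arm = rng.randrange(n)           # explore: a little money elsewhere
        else:
            # exploit: play the machine with the best observed win rate so far
            arm = max(range(n), key=lambda i: wins[i] / plays[i])
        plays[arm] += 1
        if rng.random() < payout_probs[arm]:
            wins[arm] += 1
    return plays, wins


# Three hypothetical slot machines with different (hidden) payout rates.
plays, wins = epsilon_greedy([0.02, 0.05, 0.10])
print(plays, wins)
```

Because the gambler never stops exploring entirely, he has a chance to notice if a previously poor machine starts paying out better, which is the property that matters when user state keeps changing.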


Dimensions of optimization

When we apply multi-armed bandit experimentation to campaign optimization, it's helpful to think of an overall "optimization space" that is made up of all the attributes that we can optimize over. Broadly, these attributes fall into three categories:

  • Audience attributes: Information that we have about the audience for the marketing, at the individual level, such as previous purchases, demographic data, product/website usage, or marketing engagement
  • Offer attributes: Information about the offers themselves, such as product category, price range, or purchase model
  • Tactic attributes: Information (reflecting choices made) about the tactics that we are using in our campaigns, such as channel, creative, format, or timing

The first task of the optimization engine is to carve up this multi-dimensional space into a (quite large) number of virtual "bandits" or treatments, run concurrent marketing tests in each of the treatments, and measure the results. To visualize this with a simple example, let's imagine we're just using two dimensions to carve up the space:

  • User product engagement level (low, medium or high)
  • Marketing channel (email, advertising, mobile)

Because each of these dimensions has just three members, there are 9 treatments in total, as in the diagram below:


For each treatment, the engine calculates the value of a success metric (in this example, conversion rate) based on delivery of messaging in each treatment. So in the example above, emails sent to the "Low" product engagement group of users resulted in a 3.4% conversion rate, while mobile messages to the Medium group generated a 2.9% conversion rate.
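The two-dimensional example can be sketched in a few lines of code. This is only an illustration of how the treatment grid is formed; the variable names are my own, and only the two conversion rates quoted above are filled in, since the rest of the grid is not reproduced here.

```python
from itertools import product

engagement_levels = ["low", "medium", "high"]
channels = ["email", "advertising", "mobile"]

# Every (engagement level, channel) pair is one treatment: 3 x 3 = 9.
treatments = list(product(engagement_levels, channels))
print(len(treatments))  # 9

# Conversion rate per treatment; None means "not quoted in the text".
conversion_rate = {t: None for t in treatments}
conversion_rate[("low", "email")] = 0.034      # email to Low engagement users
conversion_rate[("medium", "mobile")] = 0.029  # mobile to Medium engagement users

# The treatments for which a result is available so far.
measured = {t: r for t, r in conversion_rate.items() if r is not None}
print(measured)
```

Adding a third dimension with, say, four members would multiply the grid to 36 treatments, which is why the number of dimensions needs to be chosen with care.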

Based on these results, the optimization engine then needs to decide which treatment(s) it should focus its delivery on going forward to generate the best outcome overall. In the table above, the winning treatment is Email to High engaged users, generating a conversion rate of 9.8%. But of course the engine can't just put all its eggs in this one basket, for a couple of reasons: firstly, we want our marketing to cover all the addressable audience, not just one part of it; and secondly, it's likely that there is some interaction between the effects of the different treatments - for example, a user who has received an email and a mobile message may be more likely to convert than one who has just received an email.

So what the engine really needs to do is decide which combination of treatments it should go forward with. This is called Combinatorial Multi-Armed Bandit experimentation, or CMAB for short, and is the subject of much academic study at the moment. If you'd like to learn more about this, my colleague Wei Chen of Microsoft Research has published a paper on it, which you can read here.


No humans required?

Advocates of optimization-based marketing are liable to get a bit over-excited and say that this means that humans will no longer be needed to build campaign plans or audiences, and that in the future we'll just be able to toss offers into a giant hopper and watch them all be delivered to the perfect audience with no human intervention (though others disagree).

Fortunately for digital marketers, and especially digital marketing analytics professionals, optimization-driven campaigns don't remove the need for human involvement, though they do change its nature. Instead of creating complex audience segments up front for a campaign, these people will need instead to identify the attributes that campaigns should use for optimization.

Attribute selection (known as feature selection in data science circles) is a crucial step in making optimization work. Select too many attributes, and the engine will slice the audience up into tiny slivers, each of which will take ages to deliver results that are statistically significant, meaning that the optimization will take a long time to converge and deliver lift. Select too few, on the other hand, and the engine will converge quickly (since it will have few choices and plenty of data), but the lift will likely be very modest because the resulting "optimization" will not actually be very targeted to the audience. Select the wrong attributes, and the system will not optimize at all.
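A rough back-of-the-envelope calculation shows why too many attributes slows things down. The sketch below assumes, purely for illustration, an audience of one million users spread evenly across treatments; the attribute counts are hypothetical, and real audiences are rarely this evenly distributed.

```python
from math import prod


def treatment_stats(audience_size, attribute_cardinalities):
    """Return (number of treatments, average users per treatment),
    assuming users are spread evenly across all treatments."""
    treatments = prod(attribute_cardinalities)
    return treatments, audience_size / treatments


AUDIENCE = 1_000_000  # hypothetical audience size

# Two attributes with three values each, as in the earlier example.
few = treatment_stats(AUDIENCE, [3, 3])

# Six attributes with 3, 3, 5, 4, 10 and 6 possible values.
many = treatment_stats(AUDIENCE, [3, 3, 5, 4, 10, 6])

print(few)   # 9 treatments, roughly 111,000 users each
print(many)  # 10,800 treatments, fewer than 100 users each
```

With fewer than a hundred users per treatment, a conversion rate of a few percent amounts to only a handful of conversions, so it takes a long time before the engine can tell treatments apart.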

What this does mean for marketers is that the bar is being raised on the level of data-savviness required to do the job; no longer is it sufficient to say “Well, my product is aimed at younger people, so I’m going to target the 18-25 demo and hope for the best”. Marketers will increasingly need to work with data scientists (or pick up some data science skills themselves) to set up effective optimization-driven campaigns.


Getting started

This new approach to digital marketing optimization is a big change from the way that marketers have worked up until now. Fortunately, you don't have to change everything at once in order to start gaining benefits from this approach.

The best way to get started is to identify which attributes of your offers, audience or tactics you are able to experiment over most easily. If you have a lot of rich data about your audience, for example, you can use that as your experimentation space, carving your users up into many small segments and experimenting with creative variations and other delivery aspects like timing to get the best results. On the other hand, if you have a large and diverse product catalog, you can experiment within that domain, attempting to find the product offers that work best in different circumstances or with different creative.

Most existing targeting/optimization systems are primarily focused on optimizing within these two areas. For example, there are lots of email marketing solutions that can use rich audience data to target and personalize email. On the other hand, Amazon's recommendation system uses a combination of audience attributes (your purchase and browsing history) and the huge library of offers (essentially, Amazon's entire product catalog) to make targeted recommendations on the website.

Once you have built up experience in experimentation in these areas, you can tackle multi-channel experimentation. In addition to rich data on your users and products, this requires you to be able to execute experiments across channels easily, which means that you need an integrated campaign execution system, and an integrated marketing operations function to go with it. Right now, this is the biggest impediment to true cross-channel optimization: Most companies run their digital marketing in separate, channel-focused silos. Building a campaign that can execute seamlessly across multiple channels thus requires lots of cross-organization cooperation, which can be tough to pull off.

Fortunately there are a few companies which are starting to offer solutions for optimization-driven marketing and can start to help you down this path:

  • Amplero: Digital campaign intelligence & optimization platform based on predictive analytics & machine learning
  • Optimove: Multichannel campaign automation solution, combining predictive modeling, hypertargeting and optimization
  • Kahuna: Mobile-focused marketing automation & optimization solution
  • IgnitionOne: Digital marketing platform featuring score-based message optimization; ability to activate across multiple channels
  • BrightFunnel: Marketing analytics platform focusing on attribution modeling
  • ConversionLogic: Cross-channel marketing attribution analytics platform, using a proprietary ML-based approach

If you know of other players in this space, please let me know in the comments.



Multichannel campaign optimization using combinatorial multi-armed bandit experimentation is a powerful, though nascent, alternative to traditional campaign attribution approaches for maximizing marketing ROI. Although performing true multichannel optimization requires significant investment and maturity in data, marketing automation technology and organizational alignment, it’s possible to get started in a more limited fashion by taking an optimization-driven approach in existing channels, and growing from there.

October 22, 2015

6 steps to building your Marketing Data Strategy

Your company has a Marketing Strategy, right? It’s that set of 102 slides presented by the CMO at the offsite last quarter, immediately after lunch on the second day, the session you may have nodded off in (it’s ok, nobody noticed. Probably). It was the one that talked about customer personas and brand positioning and social buzz, and had that video towards the end that made everybody laugh (and made you wake up with a start).

Your company may also have a Data Strategy. At the offsite, it was relegated to the end of the third day, after the diversity session and that presentation about patent law. Unfortunately several people had to leave early to catch their flights, so quite a few people missed it. The guy talked about using Big Data to drive product innovation through continuous improvement, and he may (at the very end, when your bladder was distracting you) have mentioned using data for marketing. But that was something of an afterthought, and was delivered with almost a sneer of disdain, as if using your company’s precious data for the slightly grubby purpose of marketing somehow cheapened it.

Which is a shame, because Marketing is one of the most noble and enlightened ways to use data, delivering a direct kick to the company’s bottom line that is hard to achieve by other means. So when it comes to data, your marketing shouldn’t just grab whatever table scraps it can and be grateful; it should actually drive the data that you produce in the first place. This is why you don’t just need a Marketing Strategy, or a Data Strategy: You need a Marketing Data Strategy.

A Marketing Data What?

What even is a Marketing Data Strategy, anyway? Is it even a thing? It certainly doesn’t get many hits on Bing, and those hits it does get tend to be about building a data-driven Marketing Strategy (i.e. a marketing strategy that focuses on data-driven activities). But that’s not what a Marketing Data Strategy is, or at least, that’s not my definition, which is:

A Marketing Data Strategy is a strategy for acquiring, managing, enriching and using data for marketing.

The four verbs in that definition (acquiring, managing, enriching and using) are the key here. If you want to make the best use of data for your marketing, you need to be thinking about how you can get hold of the data you need, how you can make it as useful as possible, and how you can use your marketing efforts themselves to generate even more useful data – creating a positive feedback loop and even contributing to the pool of Big Data that your Big Data guy is so excited about turning into an asset for the company.

Building your Marketing Data Strategy

So now that you know why it’s important to have a Marketing Data Strategy, how do you put one together? Everyone loves a list, so here are six steps you can take to build and then start executing on your Marketing Data Strategy.

Step 1: Be clear on your marketing goals and approach

This seems obvious, but it’s a frequently missed step. Having a clear understanding of what you’re trying to achieve with your digital marketing will help you to determine what data you need, and what you need to do with/to it to make it work for you. Ideally, you already have a marketing strategy that captures a lot of this, though the connection between the lofty goals of a marketing strategy (sorry, Marketing MBA people) and the practical data needs of executing that strategy is not always clear.

Here are a few questions you should be asking:

Get new customers, or nurture existing ones? If your primary goal is to attract new customers, you’ll need to think differently about data (for example relying on third-party sources) than if you are looking to deepen your relationship with your existing customers (about whom you presumably have some data already).

What are your goals & success criteria? If you are aiming to drive sales, are you more interested in revenue, or margin? If you’re looking to drive engagement or loyalty, are you interested in active users/customers, or engagement depth (such as frequency of usage)?

Which communications strategies & channels? The environments in which you want to engage your audience make a big difference to your data needs – for example, you may have more data at your disposal to target people using your website compared to social or mobile channels.

Who’s your target audience? What attributes identify the people you’d most like to reach with your marketing? Are they primarily demographic (e.g. gender, age, locale) or behavioral (e.g. frequent users, new users)?

What is your conversion funnel? Can you convert customers entirely online, or do you need to hand over to humans (e.g. in store) at some point? If the latter, you’ll need a way to integrate offline transaction data with your online data.

These questions will not only help you identify the data you’ll need, but also some of the data that you can expect to generate with your marketing.

Step 2: Identify the most important data for your marketing efforts

Once you’re clear on your goals and success criteria, you need to consider what data is going to be needed to help you achieve them, and to measure your success.

The best way to break this down is to consider which events (or activities) you need to capture and then which attributes (or dimensions) you need on those events. But how to pick the events and attributes you need?

Let’s start with the events. If your marketing goals include driving revenue, you will need revenue (sales) events in your data, such as actual purchase amounts. If you are looking to drive adoption, then you might need product activation events. If engagement is your goal, then you will need engagement events – this might be usage of your product, or engagement with your company website or via social channels.

Next up are the attributes. Which data points about your customers do you think would be most useful for targeted marketing? For example, does your product particularly appeal to men, or women, or people within a certain geography or demographic group?

For example, say you’re an online gambling business. You will have identified that geo/location information is very important (because online gambling is banned in some countries, such as the US). Therefore, good quality location information will be an important attribute of your data sources.

At this step in the process, try not to trip yourself up by second-guessing how easy or difficult it will be to capture a particular event or attribute. That’s what the next step (the data audit) is for.

Step 3: Audit your data sources

Now to the exciting part – a data audit! I’m sure the very term sends shivers of anticipation down your spine. But if you skip this step, you’ll be flying blind, or worse, making costly investments in acquiring data that you already have.

The principle of the data audit is relatively simple – for every dataset you have which describes your audience/customers and their interaction with you, write down whether (and at what kind of quality) they contain the data you need, as identified in the previous step:

  • Events (e.g. purchases, engagement)
  • Attributes (aka dimensions, e.g. geography, demographics)
  • IDs (e.g. cookies, email addresses, customer IDs)

The key to keeping this process from consuming a ton of time and energy is to make sure you’re focusing on the events, attributes and IDs which are going to be useful for your marketing efforts. Documenting datasets in a structured way is notoriously challenging (some of the datasets we have here at Microsoft have hundreds or even thousands of attributes), so keep it simple, especially the first time around – you can always go back and add to your audit knowledge base later on.

The one type of data you probably do want to be fairly inclusive with is ID data. Unless you already have a good idea which ID (or IDs) you are going to use to stitch together your data, you should capture details of any ID data in your datasets. This will be important for the next step.
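One way to keep the audit structured without it ballooning is to record just three things per dataset (events, attributes, IDs) and let a few lines of code tell you what is missing. The following is a minimal sketch only; the class name, function names and dataset names are all hypothetical, invented here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetAudit:
    """One row of the audit: what a single dataset contains."""
    name: str
    events: set = field(default_factory=set)      # e.g. {"purchase"}
    attributes: set = field(default_factory=set)  # e.g. {"geography"}
    ids: set = field(default_factory=set)         # e.g. {"cookie", "email"}


def find_gaps(audits, needed_events, needed_attributes):
    """Return the needed events and attributes that no audited dataset covers."""
    seen_events = set().union(*(a.events for a in audits))
    seen_attributes = set().union(*(a.attributes for a in audits))
    return needed_events - seen_events, needed_attributes - seen_attributes


def id_coverage(audits):
    """Map each ID type to the names of the datasets that carry it."""
    coverage = {}
    for audit in audits:
        for id_type in audit.ids:
            coverage.setdefault(id_type, []).append(audit.name)
    return coverage


# Hypothetical example: two datasets audited against the needs from Step 2.
audits = [
    DatasetAudit("web_analytics", {"site_visit"}, {"geography"}, {"cookie"}),
    DatasetAudit("order_system", {"purchase"}, set(), {"email", "customer_id"}),
]
missing_events, missing_attributes = find_gaps(
    audits,
    needed_events={"purchase", "product_activation"},
    needed_attributes={"geography", "age_band"},
)
print("Missing events:", missing_events)
print("Missing attributes:", missing_attributes)
print("ID coverage:", id_coverage(audits))
```

The ID coverage map is deliberately inclusive, because it is the raw material for choosing a common ID in the next step.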

To get you started on this process, I’ve created a very simple data audit template which you can download here. You’re welcome.

Step 4: Decide on a common ID (or IDs)

This is a crucial step. In order for you to build a rich profile of your users/customers that will enable you to target them effectively with marketing, you need to be able to stitch the various sources of data about them together, and for this you need a common ID.

Unless you’re spectacularly lucky, you won’t be issuing (or logging) a single ID consistently across all touchpoints with your users, especially if you have things like retail stores, where IDing your customers reliably is pretty difficult (well, for the time being, at least). So you’ll need to pick an ID and use this as the basis for a strategy to stitch together data.

When deciding which ID or IDs to use, take into consideration the following attributes:

  • The persistence of the ID. You might have a cookie that you set when people come visit your website, but cookie churn ensures that that ID (if it isn’t linked to a login) will change fairly regularly for many of your users, and once it’s gone, it won’t come back.
  • The coverage of the ID. You might have a great ID that you capture when people make a purchase, or sign up for online support, but if it only covers a small fraction of your users, it will be of limited use as a foundation for targeted marketing unless you can extend its reach.
  • Where the ID shows up. If your ID is present in the channels that you want to use for marketing (such as your own website), you’re in good shape. More likely, you’ll have an ID which has good representation in some channels, but you want to find those users in another channel, where the ID is not present.
  • Privacy implications. User email address can be a good ID, but if you start transmitting large numbers of email addresses around your organization, you could end up in hot water from a privacy perspective. Likewise other sensitive data like Social Security Numbers or credit card numbers – do not use these as IDs.
  • Uniqueness to your organization. If you issue your own ID (e.g. a customer number) that can have benefits in terms of separating your users from lists or extended audiences coming from other providers; though on the other hand, if you use a common ID (like a Facebook login), that can make joining data externally easier later.

Whichever ID you pick, you will need to figure out how you can extend its reach into the datasets where you don’t currently see it. There are a couple of broad strategies for achieving this:

  • Look for technical strategies to extend the ID’s reach, such as cookie-matching with a third-party provider like a DMP. This can work well if you’re using multiple digital touchpoints like web and mobile (though mobile is still a challenge across multiple platforms).
  • Look for strategies to increase the number of signed-in or persistently identified users across your touchpoints. This requires you to have a good reason to get people to sign up (or sign in with a third-party service like Facebook) in the first place, which is more of a business challenge than a technical one.

As you work through this, make sure you focus on the touchpoints/channels where you most want to be able to deliver targeted messaging – for example, you might decide that you really want to be able to send targeted emails and complement this with messaging on your website. In that case, finding a way to join ID data between those two specific environments should be your first priority.
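To make the stitching idea concrete, here is a rough sketch of joining cookie-keyed website visits to customer-keyed purchases through a cookie-to-customer mapping (the kind of link you get when a cookied visitor signs in). The field names, the mapping table and the function name are assumptions made up for this example, not a reference to any particular product.

```python
from collections import defaultdict


def stitch_profiles(web_visits, purchases, cookie_to_customer):
    """Join cookie-keyed web visits and customer-keyed purchases on customer ID.

    Cookies with no known customer mapping are returned separately, since
    they are the users the chosen ID does not yet reach.
    """
    profiles = defaultdict(lambda: {"visits": 0, "revenue": 0.0})
    unmatched_cookies = set()

    for visit in web_visits:
        customer_id = cookie_to_customer.get(visit["cookie"])
        if customer_id is None:
            unmatched_cookies.add(visit["cookie"])
            continue
        profiles[customer_id]["visits"] += 1

    for purchase in purchases:
        profiles[purchase["customer_id"]]["revenue"] += purchase["amount"]

    return dict(profiles), unmatched_cookies


# Hypothetical data: cookie c1 has been linked to a customer, c2 has not.
web_visits = [{"cookie": "c1"}, {"cookie": "c1"}, {"cookie": "c2"}]
purchases = [{"customer_id": "cust-1", "amount": 40.0}]
profiles, unmatched = stitch_profiles(web_visits, purchases, {"c1": "cust-1"})
print(profiles)
print("Cookies still to be linked:", unmatched)
```

The unmatched set is the interesting output: its size relative to your total traffic is a crude measure of how far the ID's reach still needs to be extended.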

Step 5: Find out what gaps you really need to fill

Your data audit and decisions around IDs will hopefully have given you some fairly good indications of where you’re weak in your data. For example, you may know that you want to target your marketing according to geography, but have very little geographic data for your users. But before you run off to put a bunch of effort into getting hold of this data, you should try to verify whether a particular event or attribute will actually help you deliver more effective marketing.

The best way to do this is to run some test marketing with a subset of your audience who has a particular attribute or behavior, and compare the results with similar messaging to a group that does not have this attribute (but is as similar in other regards as you can make it). I could write another whole post on this topic of A/B testing, because there is a myriad of ways that you can mess up a test like this and invalidate your results, or I could just recommend you read the work of my illustrious Microsoft colleague, Ronny Kohavi.

If you are able to run a reasonably unbiased bit of test marketing, you will discover whether the datapoint(s) you were interested in actually make a difference to marketing outcomes, and are therefore worth pursuing more of. You can end up in a bit of a chicken-and-egg situation in this regard, because of course you need data in the first place to test its impact, and even if you do have some data, you need to test over a sufficiently large population to be able to draw reliable conclusions. To address this, you could try working with a third-party data provider over a limited portion of your user base, or over a population the provider provides.
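As a rough illustration of what "drawing reliable conclusions" can mean here, the sketch below compares the conversion rates of a targeted group and a comparison group using a pooled two-proportion z-test. This is one common textbook approach, chosen by me for the example rather than prescribed by anything above; the function name and the numbers are hypothetical, and it does nothing to protect you from a badly designed test.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of two groups with a pooled two-proportion z-test.

    Returns (rate_a, rate_b, z, two_sided_p_value).
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one user")
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        # All users converted, or none did: there is no difference to test.
        return rate_a, rate_b, 0.0, 1.0
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical test: group A has the attribute of interest, group B does not.
rate_a, rate_b, z, p = two_proportion_z_test(conv_a=120, n_a=2000, conv_b=90, n_b=2000)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  z = {z:.2f}  p = {p:.3f}")
```

A small p-value suggests the difference is unlikely to be chance alone, but only if the two groups really were comparable, which is the hard part.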

Step 6: Fix what you can, patch what you can’t, keep feeding the beast

Once you’ve figured out which data you actually need and the gaps you need to fill, the last part of your Marketing Data Strategy is about tactics to actually get this data. Of course the tactics then represent an ongoing (and never-ending) process to get better and better data about your audience. Here are four approaches you can use to get the data you need:

Measure it. Adding instrumentation to your website, your product, your mobile apps, or other digital touchpoints is (in principle) a straightforward way of getting behavioral events and attributes about your users. In practice, of course, a host of challenges exist, such as actually getting the instrumentation done, getting the signals back to your datacenter, and striking a balance between well-intentioned monitoring of your users and appearing to snoop on them (we know a little bit about the challenges of striking this balance).

Gather it. If you are after explicit user attributes such as age or gender, the best way to get this data is to ask your users for it. But of course, people aren’t just going to give you this information for no reason, and an over-nosy registration or checkout form is a sure-fire way to increase drop-out from your site, which can cost you money (just ask Bryan Eisenberg). So you will need to find clever ways of gathering this data which are linked to concrete benefits for your audience.

Model it. A third way to fill in data gaps is to use data modeling to extrapolate attributes that you have on some of your audience to another part of your audience. You can use predictive or affinity modeling to model an existing attribute (e.g. gender) by using the behavioral attributes of existing users whose gender you know to predict the gender of users you don’t know; or you can use similar techniques to model more abstract attributes, such as affinity for a particular product (based on signals you already have for some of your users who have recently purchased that product). In both cases you need some data to base your models on and a large enough group to make your predictions reasonably accurate. I’ll explore these modeling techniques in another post.
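Pending that post, here is a deliberately crude sketch of the idea: learn, from users whose attribute you know, which attribute value is most common for each behavior, then apply that to users whose attribute you don't know. This majority-vote approach is a toy stand-in I've picked for brevity; real predictive or affinity modeling would use many signals, a proper classifier and held-out validation. All field names and values are hypothetical.

```python
from collections import Counter, defaultdict


def fit_majority_model(known_users, behavior_key, attribute_key):
    """For each behavior value, learn the most common attribute among known users."""
    counts = defaultdict(Counter)
    for user in known_users:
        counts[user[behavior_key]][user[attribute_key]] += 1
    return {behavior: c.most_common(1)[0][0] for behavior, c in counts.items()}


def predict_attribute(model, unknown_users, behavior_key):
    """Predict the attribute for users who lack it; None where the behavior is unseen."""
    return {u["user_id"]: model.get(u[behavior_key]) for u in unknown_users}


# Hypothetical data: age band is known for some users but not others.
known = [
    {"user_id": "k1", "top_category": "widgets", "age_band": "25-34"},
    {"user_id": "k2", "top_category": "widgets", "age_band": "25-34"},
    {"user_id": "k3", "top_category": "widgets", "age_band": "45-54"},
    {"user_id": "k4", "top_category": "gadgets", "age_band": "45-54"},
]
unknown = [
    {"user_id": "u1", "top_category": "gadgets"},
    {"user_id": "u2", "top_category": "gizmos"},  # behavior never seen in training
]
model = fit_majority_model(known, "top_category", "age_band")
print(predict_attribute(model, unknown, "top_category"))
```

Note how thin the evidence is for "gadgets" (a single known user): this is exactly why you need a large enough group before trusting modeled attributes.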

Buy it. If you have money to spend, you can often (not always) buy the data you need. The simplest (and crudest) version of this is old-fashioned list-buying – you buy a standalone list of emails (possibly with some other attributes) and get spamming. The advantage of this method is that you don’t need any data of your own to go down this path; the disadvantages are that it’s a horrible way to do marketing, will deliver very poor response rates, and could even damage your brand if you’re seen as spamming people. The (much) better approach is to look for data brokers that can provide data that you can join to your existing user/customer data (e.g. they have a record for a given user and so do you, so you can join the data together using the email address as a key).

Once you’ve determined which data makes the most difference for your marketing, and have hit upon a strategy (or strategies) to get more of this data, you need to keep feeding the beast. You won’t get all the data you need – whether you’re measuring it, asking for it, or modeling it – right away, so you’ll need to keep going, adjusting your approach as you go and learn about the quality of the data you’re collecting. Hopefully you can reduce your dependency on bought data as you go.

Finally, don’t forget – all this marketing you’re doing (or plan to do) is itself a very valuable source of data about your users. You should make sure you have a means to capture data about the marketing you’re exposing your users to, and how they’re responding to it, because this data is useful not just for refining your marketing as you go along, but can actually be useful to other areas of your business such as product development or support. Perhaps you’ll even get your company’s Big Data people to have a bit more begrudging respect for marketing…

August 26, 2015

Got a DMP coming in? Pick up your underwear

If you’re like me, and have succumbed to the unpardonably bourgeois luxury of hiring a cleaner, then you may also have found yourself running around your house before the cleaner comes, picking up stray items of laundry and frantically doing the dishes. Much of this is motivated by “cleaner guilt”, but there is a more practical purpose – if our house is a mess when the cleaner comes, all she spends her time doing is tidying up (often in ways that turn out to be infuriating, as she piles stuff up in unlikely places) rather than actually cleaning (exhibit one: my daughter’s bedroom floor).

This analogy occurred to me as I was thinking about the experience of working with a Data Management Platform (DMP) provider. DMPs spend a lot of time coming in and “cleaning house” for their customers, tying together messy datasets and connecting them to digital marketing platforms. But if your data systems and processes are covered with the metaphorical equivalent of three layers of discarded underwear, the DMP will have to spend a lot of time picking that up (or working around it) before they can add any serious value.

So what can you do ahead of time to get the best value out of bringing in a DMP? That’s what this post is about.

What is a DMP, anyway?

That is an excellent question. DMPs have evolved and matured considerably since they emerged onto the scene a few years ago. It’s also become harder to clearly identify the boundaries of a DMP’s services because many of the leading solutions have been integrated into broader “marketing cloud” offerings (such as those from Adobe, Oracle or Salesforce). But most DMPs worth their salt provide the following three core services:

Data ingestion & integration: The starting place for DMPs, this is about bringing a marketer’s disparate audience data together in a coherent data warehouse that can then be used for analytics and audience segment building. Central to this warehouse is a master user profile  – a joined set of ID-linked data which provides the backbone of a customer’s profile, together with attributes drawn from first-party sources (such as product telemetry, historical purchase data or website usage data) and third-party sources (such as aggregated behavioral data the DMP has collected or brokered).

Analytics & segment building: DMPs typically offer their own tools for analyzing audience data and building segments, often as part of a broader campaign management workflow. These capabilities can vary in sophistication, and sometimes include lookalike modeling, where the DMP uses the attributes of an existing segment (for example, existing customers) to identify other prospects in the audience pool who have similar attributes, and conversion attribution - identifying which components of a multi-channel campaign actually influenced the desired outcomes (e.g. a sale).

Delivery system integration: The whole point of hiring a DMP to integrate data and enable segment building is to support targeted digital marketing. So DMPs now provide integration points to marketing delivery systems across email, display (via DSP and Exchange integration), in-app and other channels. This integration is typically patchy and influenced by other components of the DMP provider’s portfolio, but is steadily improving.

Making the best of your DMP relationship

The whole reason that DMPs exist in the first place is because achieving the above three things is hard – unless your organization is in a position to build out and manage its own data infrastructure and put some serious investment behind data integration and development, you are unlikely to be able to replicate the services of a DMP (especially when it comes to integration with third-party data and delivery systems). But there are a number of things you can do to make sure you get the best value out of your DMP relationship.


1. Clean up your data

This is the area where you can make the most difference ahead of time. Bringing signals about your audience/customers together will benefit your business across the board, not just in a marketing context. You should set your sights on integrating (or at least cataloging and understanding) all data that represents customer/prospect interaction with your organization, such as:

  • Website visits
  • Purchases
  • Product usage (if you have a product that you can track the usage of)
  • Mobile app usage
  • Social media interaction (e.g. tweets)
  • Marketing campaign response (e.g. email clicks)
  • Customer support interactions
  • Survey/feedback response

You should also integrate any datasets you have that describe what you already know about your customers or users, such as previous purchases or demographic data.

The goal here is, for a given user/customer, to be able to identify all of their interactions with your organization, so that you can cross-reference that data to build interesting and useful segments that you can use to communicate with your audience. So for user XYZ123, for example, you want to know that:

  • They visited your website 3 times in the past month, focusing mainly on information about your Widget3000 product
  • They have downloaded your free WidgetFinder app, and run it 7 times
  • They previously purchased a Widget2000, but haven’t used it for four months
  • They are male, and live in Sioux Falls, South Dakota
  • Last week they tweeted:

Unless you’re some kind of data saint (or delusional), reading the two preceding paragraphs probably filled you with exhaustion. Because all of the above kinds of data have different schemas (if they have schemas at all), and more importantly (or depressingly), they all use different (or at least independent) ways of identifying who the user/customer actually is. How are you supposed to join all this data if you don’t have a common key?

DMPs solve these problems in a couple of ways:

  • They provide a unified ID system (usually via a third-party tag/cookie) for all online interaction points (such as web, display ads, some social)
  • They will map/aggregate key behavioral signals onto a common schema to create a single user profile (or online user profile, at any rate), typically hosted in the DMP’s cloud
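The second of these, mapping signals onto a common schema, can be sketched in a few lines. The example below assumes the first problem is already solved (every record carries the same user ID, whatever the field happens to be called in each source) and simply renames source-specific fields to common ones before grouping by user. The source names, field names and mapping table are hypothetical, and this is an illustration of the idea rather than how any particular DMP does it.

```python
from collections import defaultdict

# Hypothetical per-source field mappings: raw field name -> common field name.
FIELD_MAPS = {
    "web": {"visitor": "user_id", "page": "detail", "ts": "timestamp"},
    "app": {"uid": "user_id", "screen": "detail", "time": "timestamp"},
}


def to_common_event(source, record):
    """Rename one raw record's fields to the common schema and tag its source."""
    mapping = FIELD_MAPS[source]
    event = {common: record[raw] for raw, common in mapping.items()}
    event["source"] = source
    return event


def build_profiles(raw_records):
    """Group common-schema events by user ID, ordered by timestamp."""
    profiles = defaultdict(list)
    for source, record in raw_records:
        event = to_common_event(source, record)
        profiles[event["user_id"]].append(event)
    for events in profiles.values():
        events.sort(key=lambda e: e["timestamp"])
    return dict(profiles)


# Hypothetical raw records for user XYZ123 from two differently shaped sources.
raw_records = [
    ("web", {"visitor": "XYZ123", "page": "/widget3000", "ts": 2}),
    ("app", {"uid": "XYZ123", "screen": "finder", "time": 1}),
]
for event in build_profiles(raw_records)["XYZ123"]:
    print(event)
```

Doing even this much mapping yourself, ahead of time, is the kind of tidying that leaves the DMP free to do the actual cleaning.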

The upside of this approach is that you can achieve some degree of data integration via the (relatively) painless means of inserting another bit of JavaScript into all of your web pages and ad templates, and also that you can access other companies’ audiences who are tagged with the same cookie – so-called audience extension.

However, there are some downsides, also. Key amongst these are:

Yet another ID: If you already have multiple ways of IDing your users, adding another “master ID” to the mix may just increase complexity. And it may be difficult to link key behaviors (such as mobile app purchases) or offline data (such as purchase history) to this ID.

Your data in someone else’s cloud: Most marketing cloud/DMP solutions assume that the master audience profile dataset will be stored in the cloud. That necessarily limits the amount and detail of information you can include in the profile – for example, credit card information.

It doesn’t help your data: Just taking a post-facto approach with a DMP (i.e. fixing all your data issues downstream of the source, in the DMP’s profile store) doesn’t do anything to improve the core quality of the source data.

So what should you do? My recommendation is to catalog, clean up and join your most important datasets before you start working with a DMP, and (if possible) identify an ID that you already own that you can use as a master ID. The more you can achieve here, the less time your DMP will spend picking up your metaphorical underwear, and the more time they’ll spend providing value-added services such as audience extension and building integrations into your online marketing systems.


2. Think about your marketing goals and segments

You should actually think about your marketing goals before you even think about bringing in a DMP or indeed make any other investments in your digital marketing capabilities. But if your DMP is already coming in, make sure you can answer questions about what you want to achieve with your audience (for example, conversions vs engagement) and how you segment them (or would like to segment them).

Once you have an idea of the segments you want to use to target your audience, then you can see whether you have the data already in-house to build these segments. Any work you can do here up-front will save your DMP a lot of digging around to find this data themselves. It will also equip you well for conversations with the DMP about how you can go about acquiring or generating that data, and may save you from accidentally paying the DMP for third-party data that you actually don’t need.


3. Do your own due diligence on delivery systems and DSPs

Your DMP will come with their own set of opinions and partnerships around Demand-side Platforms (DSPs) and delivery systems (e.g. email or display ad platforms). Before you talk with the DMP on this, make sure you understand your own needs well, and ideally, do some due diligence with the solutions in the marketplace (not just the tools you’re already using) as a fit to your needs. Questions to ask here include:

  • Do you need realtime (or near-realtime) targeting capabilities, and under what conditions? For example, if someone activates your product, do you want to be able to send them an email with hints and tips within a few hours?
  • What kinds of customer journeys do you want to enable? If you have complex customer journeys (with several stages of consideration, multiple channels, etc) then you will need a more capable ‘journey builder’ function in your marketing workflow tools, and your DMP will need to integrate with this.
  • Do you have any unusual places you want to serve digital messaging, such as in-product/in-app, via partners, or offline? Places where you can’t serve (or read) a cookie will be harder to reach with your DMP and may require custom integration.

The answers to these questions are important: on the one hand there may be a great third-party system with functionality that you really like, but which will need custom integration with your DMP; on the other hand, the solutions that the DMP can integrate with easily may get you started quickly and painlessly, but may not meet your needs over time.


If you can successfully perform the above housekeeping activities before your DMP arrives and starts gasping at the mountain of dishes piled up in your kitchen sink, you’ll be in pretty good shape.

June 23, 2015

The seven people you need on your data team

Congratulations! You just got the call – you’ve been asked to start a data team to extract valuable customer insights from your product usage, improve your company’s marketing effectiveness, or make your boss look all “data-savvy” (hopefully not just the last one of these). And even better, you’ve been given carte blanche to go hire the best people! But now the panic sets in – who do you hire? Here’s a handy guide to the seven people you absolutely have to have on your data team. Once you have these seven in place, you can decide whether to style yourself more on John Sturges or Akira Kurosawa.

Before we start, what kind of data team are we talking about here? The one I have in mind is a team that takes raw data from various sources (product telemetry, website data, campaign data, external data) and turns it into valuable insights that can be shared broadly across the organization. This team needs to understand both the technologies used to manage data, and the meaning of the data – a pretty challenging remit, and one that needs a pretty well-balanced team to execute.

1. The Handyman
The Handyman can take a couple of battered, three-year-old servers, a copy of MySQL, a bunch of Excel sheets and a roll of duct tape and whip up a basic BI system in a couple of weeks. His work isn’t always the prettiest, and you should expect to replace it as you build out more production-ready systems, but the Handyman is an invaluable help as you explore datasets and look to deliver value quickly (the key to successful data projects). Just make sure you don’t accidentally end up with a thousand people accessing the database he’s hosting under his desk every month for your month-end financial reporting (ahem).

Really good handymen are pretty hard to find, but you may find them lurking in the corporate IT department (look for the person everybody else mentions when you make random requests for stuff), or in unlikely-seeming places like Finance. He’ll be the person with the really messy cubicle with half a dozen servers stuffed under his desk.

The talents of the Handyman will only take you so far, however. If you want to run a quick and dirty analysis of the relationship between website usage, marketing campaign exposure, and product activations over the last couple of months, he’s your guy. But for the big stuff you’ll need the Open Source Guru.

2. The Open Source Guru
I was tempted to call this person “The Hadoop Guru”. Or “The Storm Guru”, or “The Cassandra Guru”, or “The Spark Guru”, or… well, you get the idea. As you build out infrastructure to manage the large-scale datasets you’re going to need to deliver your insights, you need someone to help you navigate the bewildering array of technologies that has sprung up in this space, and integrate them.

Open Source Gurus share many characteristics in common with that most beloved urban stereotype, the Hipster. They profess to be free of corrupting commercial influence and pride themselves on plowing their own furrow, but in fact they are subject to the whims of fashion just as much as anyone else. Exhibit A: The enormous fuss over the world-changing effects of Hadoop, followed by the enormous fuss over the world-changing effects of Spark. Exhibit B: Beards (on the men, anyway).

So be wary of Gurus who ascribe magical properties to a particular technology one day (“Impala’s, like, totally amazing”), only to drop it like ombre hair the next (“Impala? Don’t even talk to me about Impala. Sooooo embarrassing.”) Tell your Guru that she’ll need to live with her recommendations for at least two years. That’s the blink of an eye in traditional IT project timescales, but a lifetime in Internet/Open Source time, so it will focus her mind on whether she really thinks a technology has legs (vs. just wanting to play around with it to burnish her resumé).

3. The Data Modeler
While your Open Source Guru can identify the right technologies for you to use to manage your data, and hopefully manage a group of developers to build out the systems you need, deciding what to put in those shiny distributed databases is another matter. This is where the Data Modeler comes in.

The Data Modeler can take an understanding of the dynamics of a particular business, product, or process (such as marketing execution) and turn that into a set of data structures that can be used effectively to reflect and understand those dynamics.

Data modeling is one of the core skills of a Data Architect, which is a more identifiable job description (searching for “Data Architect” on LinkedIn generates about 20,000 results; “Data Modeler” only generates around 10,000). And indeed your Data Modeler may have other Data Architecture skills, such as database design or systems development (they may even be a bit of an Open Source Guru). But if you do hire a Data Architect, make sure you don’t get one with just those more technical skills, because you need datasets which are genuinely useful and descriptive more than you need datasets which are beautifully designed and have subsecond query response times (ideally, of course, you’d have both). And in my experience, the data modeling skills are the rarer skills; so when you’re interviewing candidates, be sure to give them a couple of real-world tests to see how they would actually structure the data that you’re working with.

4. The Deep Diver
Between the Handyman, the Open Source Guru, and the Data Modeler, you should have the skills on your team to build out some useful, scalable datasets and systems that you can start to interrogate for insights. But who to generate the insights? Enter the Deep Diver.

Deep Divers (often known as Data Scientists) love to spend time wallowing in data to uncover interesting patterns and relationships. A good one has the technical skills to be able to pull data from source systems, the analytical skills to use something like R to manipulate and transform the data, and the statistical skills to ensure that his conclusions are statistically valid (i.e. he doesn’t mix up correlation with causation, or make pronouncements on tiny sample sizes). As your team becomes more sophisticated, you may also look to your Deep Diver to provide Machine Learning (ML) capabilities, to help you build out predictive models and optimization algorithms.

If your Deep Diver is good at these aspects of his job, then he may not turn out to be terribly good at taking direction, or communicating his findings. For the first of these, you need to find someone that your Deep Diver respects (this could be you), and use them to nudge his work in the right direction without being overly directive (because one of the magical properties of a really good Deep Diver is that he may take his analysis in an unexpected but valuable direction that no one had thought of before).

For the second problem – getting the Deep Diver’s insights out of his head – pair him with a Storyteller (see below).

5. The Storyteller
The Storyteller is the yin to the Deep Diver’s yang. Storytellers love explaining stuff to people. You could have built a great set of data systems, and be performing some really cutting-edge analysis, but without a Storyteller, you won’t be able to get these insights out to a broad audience.

Finding a good Storyteller is pretty challenging. You do want someone who understands data quite well, so that she can grasp the complexities and limitations of the material she’s working with; but it’s a rare person indeed who can be really deep in data skills and also have good instincts around communications.

The thing your Storyteller should prize above all else is clarity. It takes significant effort and talent to take a complex set of statistical conclusions and distil them into a simple message that people can take action on. Your Storyteller will need to balance the inherent uncertainty of the data with the ability to make concrete recommendations.

Another good skill for a Storyteller to have is data visualization. Some of the most light bulb-lighting moments I have seen with data have been where just the right visualization has been employed to bring the data to life. If your Storyteller can balance this skill (possibly even with some light visualization development capability, like using D3.js; at the very least, being a dab hand with Excel and PowerPoint or equivalent tools) with her narrative capabilities, you’ll have a really valuable player.

There’s no one place you need to go to find Storytellers – they can be lurking in all sorts of fields. You might find that one of your developers is actually really good at putting together presentations, or one of your marketing people is really into data. You may also find that there are people in places like Finance or Market Research who can spin a good yarn about a set of numbers – poach them.

6. The Snoop
These next two people – The Snoop and The Privacy Wonk – come as a pair. Let’s start with the Snoop. Many analysis projects are hampered by a lack of primary data – the product, or website, or marketing campaign isn’t instrumented, or you aren’t capturing certain information about your customers (such as age, or gender), or you don’t know what other products your customers are using, or what they think about them.

The Snoop hates this. He cannot understand why every last piece of data about your customers, their interests, opinions and behaviors, is not available for analysis, and he will push relentlessly to get this data. He doesn’t care about the privacy implications of all this – that’s the Privacy Wonk’s job.

If the Snoop sounds like an exhausting pain in the ass, then you’re right – this person is the one who has the team rolling their eyes as he outlines his latest plan to remotely activate people’s webcams so you can perform facial recognition and get a better Unique User metric. But he performs an invaluable service by constantly challenging the rest of the team (and other parts of the company that might supply data, such as product engineering) to be thinking about instrumentation and data collection, and getting better data to work with.

The good news is that you may not have to hire a dedicated Snoop – you may already have one hanging around. For example, your manager may be the perfect Snoop (though you should probably not tell him or her that this is how you refer to them). Or one of your major stakeholders can act in this capacity; or perhaps one of your Deep Divers. The important thing is not to shut the Snoop down out of hand, because it takes relentless determination to get better quality data, and the Snoop can quarterback that effort. And so long as you have a good Privacy Wonk for him to work with, things shouldn’t get too out of hand.

7. The Privacy Wonk
The Privacy Wonk is unlikely to be the most popular member of your team, either. It’s her job to constantly get on everyone’s nerves by identifying privacy issues related to the work you’re doing.

You need the Privacy Wonk, of course, to keep you out of trouble – with the authorities, but also with your customers. There’s a large gap between what is technically legal (which itself varies by jurisdiction) and what users will find acceptable, so it pays to have someone whose job it is to figure out what the right balance between these two is. But while you may dread the idea of having such a buzz-killing person around, I’ve actually found that people tend to make more conservative decisions around data use when they don’t have access to high-quality advice about what they can do, because they’re afraid of accidentally breaking some law or other. So the Wonk (much like Sadness in Inside Out) turns out to be a pretty essential member of the team, and even regarded with some affection.

Of course, if you do as I suggest, and make sure you have a Privacy Wonk and a Snoop on your team, then you are condemning both to an eternal feud in the style of the Corleones and Tattaglias (though hopefully without the actual bloodshed). But this is, as they euphemistically say, a “healthy tension” – with these two pulling against one another you will end up with the best compromise between maximizing your data-driven capabilities and respecting your users’ privacy.

Bonus eighth member: The Cat Herder (you!)
The one person we haven’t really covered is the person who needs to keep all of the other seven working effectively together: to stop the Open Source Guru from sneering at the Handyman’s handiwork; to ensure the Data Modeler and Deep Diver work together so that the right measures and dimensionality are exposed in the datasets you publish; and to referee the debates between the Snoop and the Privacy Wonk. This is you, of course – The Cat Herder. If you can assemble a team with at least one of each of the above people, plus probably a few developers for the Open Source Guru to boss about, you’ll be well on the way to unlocking a ton of value from the data in your organization.

Think I’ve missed an essential member of the perfect data team? Tell me in the comments.

May 17, 2015

The rise of the Chief Data Officer

As the final season of Mad Men came to a close this weekend, one of my favorite memories from Season 7 is the appearance of the IBM 360 mainframe in the Sterling Cooper & Partners offices, much to the chagrin of the creative team (whose lounge was removed to make space for the beast), especially poor old Ginsberg, who became convinced the “monolith” was turning him gay (and took radical steps to address the issue).

My affection for the 360 is partly driven by the fact that I started my career at IBM, closer in time to Mad Men Season 7 (set in 1969) than the present day (and now I feel tremendously old having just written that sentence). The other reason I feel an affinity for the Big Blue Box is that my day job consists of thinking of ways to use data to make marketing more effective, and of course that is what the computer at SC&P was for. It was brought in at the urging of the nerdish (and universally unloved) Harry Crane, to enable him to crunch the audience numbers coming from Nielsen’s TV audience measurement service to make TV media buying decisions. This was a major milestone in the evolution of data-driven marketing, because it linked advertising spend to actual advertising delivery, something that we now take for granted.

The whole point of Mad Men introducing the IBM computer into the SC&P offices was to make a point about the changing nature of advertising in the early 1970s – in particular that Don Draper and his “three martini lunch” tribe’s days were numbered. Since then, the rise of the Harry Cranes, and the use of data in marketing and advertising, has been relentless. Today, many agencies have a Chief Data Officer, an individual charged with the task of helping the agency and its clients to get the best out of data.

But what does, or should, a Chief Data Officer (or CDO) do? At an advertising & marketing agency, it involves the following areas:

Enabling clients to maximize the value they get from data. Many agency clients have significant data assets locked up inside their organization, such as sales history, product telemetry, or web data, and need help to join this data together and link it to their marketing efforts, in order to deliver more targeted messaging and drive loyalty and ROI. Additionally, the CDO should advise clients on how they can use their existing data to deliver direct value, for example by licensing it.

Advising clients on how to gather more data, safely. A good CDO offers advice to clients on strategies for collecting more useful data (e.g. through additional telemetry), or working with third-party data and data service providers, while respecting the client’s customers’ privacy needs.

Managing in-house data assets & services. Some agencies maintain their own in-house data assets and services, from proprietary datasets to analytics services. The CDO needs to manage and evolve these services to ensure they meet the needs of clients. In particular, the CDO should nurture leading-edge marketing science techniques, such as predictive modeling, to help clients become even more data-driven in their approach.

Managing data partnerships. Since data is such an important part of a modern agency’s value proposition, most agencies maintain ongoing relationships with key third-party data providers, such as BlueKai or Lotame. The CDO needs to manage these relationships so that they complement the in-house capabilities of the agency, and so the agency (and its clients) don’t end up letting valuable data “walk out of the door”.

Driving standards. As agencies increasingly look to data as a differentiating ingredient across multiple channels, using data and measurement consistently becomes ever more important. The CDO needs to drive consistent standards for campaign measurement and attribution across the agency so that as a client works with different teams, their measurement framework stays the same.

Engaging with the industry & championing privacy. Using data for marketing & advertising is not without controversy, so the CDO needs to be a champion for data privacy and actively engaged with the industry on this and other key topics.

As you can see, that’s plenty for the ambitious CDO to do, and in particular plenty that is not covered by other traditional C-level roles in an ad agency. I think we’ll be seeing plenty more CDOs appointed in the months and years to come.

March 01, 2015

Is MAU an effective audience metric?

There was much hullabaloo in December when Instagram announced it had reached the milestone of 300 million monthly users, surpassing Twitter, and putting the latter under a bit of pressure in its earnings call a couple of weeks ago. But there has also been plenty of debate about whether these measures of the reach of major internet services are reliable, especially when comparing numbers from two different companies. Just what is a “monthly active user”, or MAU, anyway?

Defining MAU and DAU

Monthly Active Users is a pretty simple metric conceptually – it is the number of unique users who were “active” on a service within a given month. It doesn’t matter how many times each user used the service in the month; they’re only counted once (it’s a unique user, or UU, measure, after all). Daily Active Users is just the same measure, but over the period of a single day. So when Instagram says it had 300m active users in the month of November, that means that 300m unique users did something in one of Instagram’s apps during the month.
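
To make the “counted once” point concrete, here is a minimal sketch in Python. The event log, the user names and the `active_users` helper are all made up for illustration; a real service would be counting over a far larger log, and would have its own definition of what counts as an event.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event log: (user_id, day on which the user did something).
events = [
    ("alice", date(2014, 11, 1)),
    ("alice", date(2014, 11, 1)),   # a second action on the same day
    ("alice", date(2014, 11, 15)),
    ("bob",   date(2014, 11, 15)),
    ("carol", date(2014, 12, 2)),
]

def active_users(events, period_of):
    """Count unique users per period; period_of maps a date to a period key."""
    users_by_period = defaultdict(set)
    for user_id, day in events:
        # A set keeps each user once per period, however often they appear.
        users_by_period[period_of(day)].add(user_id)
    return {period: len(users) for period, users in users_by_period.items()}

mau = active_users(events, lambda d: (d.year, d.month))  # keyed by (year, month)
dau = active_users(events, lambda d: d)                  # keyed by calendar day

print(mau[(2014, 11)])           # unique users active in November 2014
print(dau[date(2014, 11, 15)])   # unique users active on 15 November 2014
```

Alice shows up three times in November but only adds one to the November count, which is the whole point of a unique user measure.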

Of course, for a signed-in service like Facebook, Twitter or Instagram, the total number of registered users will always be much higher than active users, since there will always be a significant subset of users who register for a service and then never use it (or have stopped using it). By some estimates, Twitter has almost 900 million registered users, almost four times the number of monthly active users. But registered users doesn’t tell you very much if you’re trying to run one of these services, at least not on its own – if it is massively out of whack with your active user counts, then it might indicate that your service isn’t very compelling or sticky.

Since journalists are also skeptical about registered user numbers, online services have taken to reporting MAU instead. These services have an incentive to report the biggest possible active user numbers, so tend to include almost any measurable interaction with their app or service in the definition of “active”. But from an analytical point of view, this isn’t the most helpful definition. Not every interaction with a website or app really represents “active” or “intentional” use. But how do you define “active” engagement with your app or service? That depends on what you’re trying to achieve with the metric. Let’s break it down.


Let’s look at some of the things you can do with the Instagram app:

  • Launch the app
  • Browse your feed (just look at photos)
  • Look at someone’s profile
  • Follow someone
  • Favorite a photo
  • Comment on a photo
  • Post a photo
  • Post a video

I’ve tried to order this list from “least-engaged” behaviors at the top to “most-engaged” behaviors at the bottom. At one end of the spectrum, it’s almost impossible to use Instagram without browsing your feed (it’s the thing that comes up when you launch the app), so it’s hardly a reliable indication of true engagement (some fraction of that number will even be people who launched the app by mistake when they were stabbing at their phone trying to launch Candy Crush Saga from the icon next door). At the other end, users who are posting lots of photos and video are clearly much more engaged, and a count of these folks would be a reliable indication of the size of the engaged population.

So where to draw the line? That depends on what you consider to be the minimum bar for “engaged” behavior. At Microsoft we’re having some very interesting discussions internally on where and how to draw this line across our diverse range of products – “Active” use means something very different across Bing, Office and Skype, to name just three. The advice I am giving my colleagues is to set the bar fairly high (i.e. not count too many behaviors as active use). Why? Well, consider the diagram below:


The outermost circle in the diagram represents the entire population of users of a service. As we covered earlier, only a subset of these users could be considered “active” (i.e. actually use the service at all), and an even smaller subset “active and engaged” (use the service in a meaningful way). If you’re running the service, it’s this group of users, however, that you’re most interested in cultivating and growing – they’re the ones who become the “fans” that will promote your service to their friends, and (if your service has any sort of social or network quality) will actually contribute to the quality of the service itself (Instagram would be pretty dull if nobody posted any photos).

What this all adds up to is that if you’re looking to track the growth and engagement of your user base, you probably want to track a couple of metrics:

  • Monthly Active Users (MAU) [Active Unengaged + Active Engaged, above]
  • Monthly Engaged Users (MEU) [Active Engaged only]

Of these two, the really important one is the MEU – the number that really represents worthwhile usage of your product or service, and which only includes the behaviors you really want to encourage amongst the user base. If I were working at Instagram, I’d probably include almost all of the actions in the list above (possibly excluding app launch) in my definition of Active Users; but I would only include “Post a photo” and “Post a video” in my definition of Engaged Users (I might be persuaded to include “Comment on a photo”, since it does contribute to the network).

Tracking MEU has another couple of advantages: If the number goes down, you’ll know that engagement with your service is diminishing. You can also track MEU as a fraction of MAU: If MEU/MAU is only 50% you can focus on growing engagement in your active base, whereas if MEU/MAU is 95% (i.e. almost all active users are engaged), you’ll probably want to focus on growing the active base (by recruiting new users).
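
As a rough sketch of how MAU, MEU and the MEU/MAU ratio might be computed from one month of events, assuming the split I suggested above (everything except app launch counts as “active”; only posting counts as “engaged”). The action names and the sample data are hypothetical, not anything Instagram actually uses.

```python
# Hypothetical action names, loosely following the Instagram list above.
ACTIVE_ACTIONS = {
    "browse_feed", "view_profile", "follow", "favorite",
    "comment", "post_photo", "post_video",
}
ENGAGED_ACTIONS = {"post_photo", "post_video"}  # assumed "engaged" bar

# Hypothetical one-month event log: (user_id, action).
month_events = [
    ("alice", "launch_app"),
    ("alice", "post_photo"),
    ("bob",   "browse_feed"),
    ("bob",   "favorite"),
    ("carol", "launch_app"),   # launch only: counts as neither active nor engaged
    ("dave",  "post_video"),
    ("erin",  "view_profile"),
]

def unique_users(events, qualifying_actions):
    """Return the set of users with at least one qualifying action."""
    return {user for user, action in events if action in qualifying_actions}

mau_users = unique_users(month_events, ACTIVE_ACTIONS)
meu_users = unique_users(month_events, ENGAGED_ACTIONS)

mau = len(mau_users)
meu = len(meu_users)
ratio = meu / mau if mau else 0.0

# The "active unengaged" group is the one to nudge towards engagement.
active_unengaged = mau_users - meu_users

print(mau, meu, ratio, sorted(active_unengaged))
```

Because the engaged actions are a subset of the active ones, MEU can never exceed MAU, so the ratio always falls between 0 and 1.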

The tactics for moving MAU and MEU will differ. To grow MEU, you can market to your existing base of “active unengaged” users (the population who falls into MAU but not MEU). These are the lurkers or the casual users who may only need a little nudge to become truly engaged and move into the middle circle. To grow MAU, you’ll need to recruit new users to your service, either from the pool of inactive users, or from the general population. This is usually a harder nut to crack, and one of the best tools in any case is to use your base of engaged “fans” to recruit – which underlines the importance of growing the MEU number.

So a final benefit of using MEU is that it is likely easier to move than MAU; and the next time you’re standing in front of your VP going through your product dashboard, you’ll be glad you picked a KPI you can actually move.

February 17, 2014

Building your own web analytics system using Big Data tools

It’s been a busy couple of years here at Microsoft. For the dwindling few of you who are keeping track, at the beginning of 2012 I took a new job, running our “Big Data” platform for Microsoft’s Online Services Division (OSD) – the division that owns the Bing search engine and MSN, as well as our global advertising business.

As you might expect, Bing and MSN throw off quite a lot of data – around 70 terabytes a day (that’s over 25 petabytes a year, to save you the trouble of calculating it yourself). To process, store and analyze this data, we rely on a distributed data infrastructure spread across tens of thousands of servers. It’s a pretty serious undertaking; but at its heart, the work we do is just a very large-scale version of what I’ve been doing for the past thirteen years: web analytics.

One of the things that makes my job so interesting, however, is that although many of the data problems we have to solve are familiar – defining events, providing a stable ID, sessionization, enabling analysis of non-additive measures, for example – the scale of our data (and the demands of our internal users) has meant that we have had to come up with some creative solutions, and essentially reinvent several parts of the web analytics stack.

What do you mean, the “web analytics stack”?

To users of a commercial web analytics solution, the individual technology components of those solutions are not very explicitly defined, and with good reason – most people simply don’t need to know this information. It’s a bit like demanding to know how the engine, transmission, brakes and suspension work if you’re buying a car – the information is available, but the majority of people are more interested in how fast the car can accelerate, and whether it can stop safely.

However, as data volumes are increasing, and web analytics are needing to be ever more tightly woven into the other data that organizations generate and manage, more people are looking to customize their solutions, and so it’s becoming more important to understand their components.

The diagram below provides a very crude illustration of the major components of a typical web analytics “stack”:


In most commercial solutions, these components are tightly woven together and often not visible (except indirectly through management tools), for a good reason: ease of implementation. At least for a “default” implementation, part of the value proposition of a commercial web analytics solution is “put our tag on your pages, and a few minutes/hours later, you’ll see numbers on the screen”.

A cunning schema

In order to achieve this promise, these tools have to make (and enforce) certain assumptions about the data, and these assumptions are embodied in the schema that they implement. Some examples of these default schema assumptions are:

  • The basic unit of interaction (transaction event) is the page view
  • Page views come with certain metadata such as User Agent, Referrer, and IP address
  • Page views are aggregated into sessions, and sessions into user profiles, based on some kind of identifier (usually a cookie)
  • Sessions contain certain attributes such as session length, page view count and so on.
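
To illustrate how these assumptions hang together, here is a minimal Python sketch that rolls page views up into sessions keyed on a cookie ID. The 30-minute inactivity cutoff is a common convention that I am assuming here, not something every tool enforces, and the field names and sample data are made up for the example.

```python
from datetime import datetime, timedelta
from itertools import groupby

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity cutoff

# Hypothetical page views: (cookie_id, timestamp, url).
page_views = [
    ("cookie-1", datetime(2014, 2, 17, 9, 0),  "/home"),
    ("cookie-1", datetime(2014, 2, 17, 9, 5),  "/products"),
    ("cookie-1", datetime(2014, 2, 17, 11, 0), "/home"),   # long gap: new session
    ("cookie-2", datetime(2014, 2, 17, 9, 2),  "/home"),
]

def sessionize(page_views, timeout=SESSION_TIMEOUT):
    """Group page views into sessions per cookie, splitting on inactivity."""
    sessions = []
    # Sort by cookie, then time, so each cookie's views are contiguous.
    ordered = sorted(page_views, key=lambda pv: (pv[0], pv[1]))
    for cookie_id, views in groupby(ordered, key=lambda pv: pv[0]):
        current = None
        for _, timestamp, _url in views:
            if current is None or timestamp - current["end"] > timeout:
                current = {"cookie_id": cookie_id, "start": timestamp,
                           "end": timestamp, "page_views": 0}
                sessions.append(current)
            current["end"] = timestamp
            current["page_views"] += 1
    for session in sessions:
        # Session length: time between the first and last page view.
        session["length"] = session["end"] - session["start"]
    return sessions

sessions = sessionize(page_views)
for s in sessions:
    print(s["cookie_id"], s["page_views"], s["length"])
```

Note that a single-page session gets a length of zero under this definition, which is one of those schema decisions you inherit without noticing until you go “off schema”.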

Now, none of these schema assumptions is universal, and many tools have the capability to modify and extend the schema (and associated processing rules) quite dramatically. Google Universal Analytics is a big step in this direction, for example. But the reason I’m banging on about the schema is that going significantly “off schema” (that is to say, building your own data model, where some or all of the assumptions above may not apply) is one of the key reasons why people are looking to augment their web analytics solution.

Web Analytics Jenga

The other major reason to build a custom web analytics solution is to swap out one (or more) of the components of the “stack” that I described above to achieve improved performance, flexibility, or integration with another system. Some scenarios in which this might be done are as follows:

  • You want to use your own instrumentation/data collection technologies, and then load the data into a web analytics tool for processing & analysis
  • You want to expose data from your web analytics system in another analysis tool
  • You want to include significant amounts of other data in the processing tier (most web analytics tools allow you to join in external data, but only in relatively simple scenarios)

Like a game of Jenga, you can usually pull out one or two of the blocks from the stack of a commercial web analytics tool without too much difficulty. But if you want to pull out more – and especially if you want to create a significantly customized schema – the tower starts to wobble. And that’s when you might find yourself asking the question, “should we think about building our own web analytics tool?”

“Build your own Web Analytics tool? Are you crazy?”

Back in the dim and distant past (over ten years ago), when I was pitching companies in the UK on the benefits of WebAbacus, occasionally a potential customer would say, “Well, we have been looking at building our own web analytics tool”. At the time, this usually meant that they had someone on staff who could write Perl scripts to process log data. I would politely point out that this was a stupid idea, for all the reasons that you would expect: If you build something yourself, you have to maintain and enhance it yourself, and you don’t get any of the benefits of a commercial product that is funded by licenses to lots of customers, and which therefore will continue to evolve and add features.

But nowadays the technology landscape for managing, processing and analyzing web behavioral data (and other transactional data) has changed out of all recognition. There is a huge ecosystem, mostly based around Hadoop and related technologies, that organizations can leverage to build their own big data infrastructures, or extend commercial web analytics products.

At the lower end of the Web Analytics stack, tools like Apache Flume can be deployed to handle log data collection and management, with other tools such as Sqoop and Oozie managing data flows; Pig can be used for ETL and enrichment in the data processing layer; or Storm can be used for streaming (realtime) data processing. Further up the stack, Hive and HBase can be used to provide data warehousing and querying capabilities, while there is an increasing range of options (Cloudera’s Impala, Apache Drill, Facebook’s Presto, and Hortonworks’ Stinger) to provide the kind of “interactive analysis” capabilities (dynamic filtering across related datasets) which commercial Web Analytics tools are so good at. And finally, at the top of the stack, Tableau is an increasingly popular choice for reporting & data visualization, and of course there is the Microsoft Power BI toolset.

In fact, with the richness of the ecosystem, the biggest challenge for anyone looking to roll their own Web Analytics system is a surfeit of choice. In subsequent blog posts (assuming I am able to increase my rate of posting to more than once every 18 months) I will write more about some of the choices available at various points in the stack, and how we’ve made some of these choices at Microsoft. But after finally bestirring myself to write the above, I think I need a little lie down now.

May 01, 2012

Google launches cloud-based BigQuery service

Some interesting news today: Google has fully launched the cloud-based BigQuery service that it first previewed last November. From the website:

Google BigQuery is a web service that lets you do interactive analysis of massive datasets—up to billions of rows. Scalable and easy to use, BigQuery lets developers and businesses tap into powerful data analytics on demand.

The BigQuery service is built on the back of Google’s enormous investments in data infrastructure and exposes some of the clever tools the company has built for internal use to an external audience. It’s designed to help with ad hoc queries against unstructured data – kind of Hadoop in the cloud with a front-end querying service attached. In this regard it shares some similarities with the Hadoop on Azure service from my illustrious employers.

The interesting question with all these cloud-based Big Data services (a list of some of which you can find here, and here) is the acceptability to customers of loading significant amounts of data to the cloud, and dealing with the privacy and security questions that arise as a result. But it is interesting to contrast the significant complexity that attends any conversation about in-house or on-premise big data with the simplicity offered by a cloud-based approach.

The most intriguing aspect of Google’s foray into this area is the prospect of the company being able to leverage its “secret sauce” in terms of data analysis tools and technologies – few other companies may be able to match the kind of investment that Google can make here.

