Bayesian MMM

Categories: code, analysis

Author: Onur Kerimoglu

Published: February 3, 2023


The Problem Statement and the proposed Bayesian MMM solution

Consider a company that runs an online shop and advertises on seven different paid channels (TV, radio, billboards, Google Ads, etc.). Based on two years of weekly data on advertisement spend and revenue, the company would like to understand how effective the different channels are. One has to take into account that marketing actions usually do not have an immediate effect: ads and campaigns in one week may influence sales in the following weeks. Different channels can therefore be expected to target different audiences at different times and for different durations, and hence to have lagged effects on revenue.

For this problem, I used a Bayesian Media Mix Modelling (MMM) approach. MMM can be performed with a simpler linear regression approach; however, with a Bayesian approach we can find more robust solutions and estimate credible intervals. For this work, I took inspiration and used code from the following sources:

Install / Import Packages

!pip install pymc3
import arviz as az
import matplotlib.pyplot as plt
import math
import numpy as np
import pandas as pd
from prophet import Prophet
import pymc3 as pm3
import seaborn as sns
import theano
import theano.tensor as tt

from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error

Prepare Data

data_wdates = pd.read_csv(
  'https://raw.githubusercontent.com/OnurKerimoglu/bayesian_mmm/master/data/MMM_test_data.csv',
  parse_dates=['start_of_week']
)
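A quick look at the loaded table; the columns used in the rest of the notebook are start_of_week, revenue and spend_channel_1 ... spend_channel_7, with one row per week:

# Quick sanity check of the loaded data
print(data_wdates.shape)               # expected: one row per week over 2 years (104 rows)
print(data_wdates.columns.tolist())
data_wdates.head()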

Detect Trend and Seasonality

Let’s first detect the trend and seasonality in the data, which we will use as control variables in our model.

prophet_data = data_wdates.rename(columns = {'revenue': 'y', 'start_of_week': 'ds'})

prophet = Prophet(yearly_seasonality=True, weekly_seasonality=False)

prophet.fit(prophet_data[["ds", "y"]])
prophet_predict = prophet.predict(prophet_data[["ds", "y"]])
plot = prophet.plot_components(prophet_predict, figsize = (20, 10))

Adding the detected trend and seasonality signals back to the data table:

prophet_columns = [col for col in prophet_predict.columns if (col.endswith("upper") == False) & (col.endswith("lower") == False)]

final_data = data_wdates.copy()
final_data["trend"] = prophet_predict["trend"]
final_data["season"] = prophet_predict["yearly"]

The final feature (X) and target (y) data:

X = final_data.drop(columns=['revenue', 'start_of_week'])
y = final_data['revenue']

The Model

Carryover (adstock)

For modelling carryover, following Jin et al. 2017, we use an adstock function of the form:

\[ x^*_{t,m} = \frac{\sum_l w_m(l)x_{t-l,m}}{\sum_l w_m(l)} \]

Here, \(w_m\) is a nonnegative weight function, which can be described with a geometric decay function, i.e.,

\[ w^g_m(l) = \alpha_m^l,~l = 0, 1, ..., L-1,~0<\alpha_m<1 \]

where \(w_m(l)\) describes the weight of the effect at each lag \(l\), the effect lasting for \(L\) time steps (here prescribed to be 13 weeks, which according to Jin et al. 2017 is a good approximation of infinity), and \(\alpha_m\) is the decay rate for channel \(m\).

In order to account for potential delays in the media effects, following Jin et al. 2017 again, we can add \(\theta_m\), the delay of the peak effect for the media channel \(m\) into the equation above as follows:

\[ w^d_m(l) = \alpha_m^{(l-\theta_m)^2},~l = 0, 1, ..., L-1,~0<\alpha_m<1,~0 \leq \theta_m \leq L-1 \]
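Before turning to the Theano implementation below, here is a small NumPy sketch of the two weight functions for illustrative (hypothetical) parameter values:

# Illustrative (hypothetical) values for a single channel
alpha_demo, theta_demo, L_demo = 0.6, 3.0, 13

l = np.arange(L_demo)
w_geometric = alpha_demo ** l                        # w^g_m(l): decays from lag 0
w_delayed = alpha_demo ** ((l - theta_demo) ** 2)    # w^d_m(l): peaks at l = theta

print(np.round(w_geometric, 3))
print(np.round(w_delayed, 3))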

def adstock_weights(alpha, theta, length=13, delayed=False):
    """Adstock weights over `length` lags: geometric decay, optionally with a delayed peak at theta."""
    if delayed:
        # delayed adstock: w(l) = alpha^((l - theta)^2), with the peak effect at l = theta
        w = [tt.power(alpha, tt.power(i - theta, 2)) for i in range(length)]
    else:
        # geometric adstock: w(l) = alpha^l
        w = [tt.power(alpha, i) for i in range(length)]

    w = tt.as_tensor_variable(w)

    return w

def carryover(x, alpha, theta=0, length=13, delayed=False):
    """Adstock-weighted sum of the current and the previous `length - 1` spends."""
    w = adstock_weights(alpha, theta, length, delayed)

    # x_lags[i, t] = x[t - i], zero-padded for the first i weeks
    x_lags = tt.stack(
        [tt.concatenate([tt.zeros(i), x[:x.shape[0] - i]]) for i in range(length)]
    )

    return tt.dot(w, x_lags)
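As a quick toy check of the transform (hypothetical spend series; .eval() forces Theano to evaluate the symbolic graph), a single spend in week 0 decays geometrically over the following weeks:

# Toy check: a one-off spend of 100 in week 0, geometric adstock with alpha = 0.5
spend_demo = np.array([100., 0., 0., 0., 0., 0.])
print(carryover(spend_demo, alpha=0.5, length=6).eval())
# -> [100.    50.    25.    12.5    6.25    3.125]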

Next, we build a model that consists of delayed media channels and control variables:

\[ \hat{y} = \epsilon + \tau + \sum_c \gamma_c z_{t,c} + \sum_m \beta_m x^*_{t,m} \]

where \(\epsilon\) and \(\tau\) represent the noise and the baseline revenue, \(z_{t,c}\) and \(\gamma_c\) represent the control variables and their coefficients, and \(x^*_{t,m}\) and \(\beta_m\) represent the (adstocked) media spends and their coefficients, respectively.

Note that, for simplicity, we assume no shape effects (i.e., no saturation). We further assume that the marketing contributions can only be positive, which is achieved by drawing the channel coefficients from a half-normal distribution.
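As a quick standalone illustration (not part of the model below), coefficients drawn from a half-normal prior are always non-negative, which is what enforces positive channel contributions:

# Standalone sketch: a HalfNormal prior yields only non-negative draws
with pm3.Model():
    beta_demo = pm3.HalfNormal('beta_demo', sigma=2)
    demo_draws = pm3.sample_prior_predictive(100)
print(demo_draws['beta_demo'].min() >= 0)  # True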

control_variables = ["trend", "season"]
media_channels = [f'spend_channel_{i}' for i in range(1,8)]
transform_variables = control_variables+media_channels

y_transformed=y/10000 #rescale target variable

X_transformed = X.copy() #Min-max scale the features

numerical_encoder_dict = {}
for feature in transform_variables:
    scaler = MinMaxScaler()
    original = final_data[feature].values.reshape(-1, 1)
    transformed = scaler.fit_transform(original)
    X_transformed[feature] = transformed
    numerical_encoder_dict[feature] = scaler
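The fitted scalers are kept in numerical_encoder_dict, so the scaling can be undone later if needed; a minimal sketch:

# Sketch: invert the min-max scaling of one feature using its stored scaler
scaler_ch1 = numerical_encoder_dict['spend_channel_1']
spend_channel_1_original = scaler_ch1.inverse_transform(
    X_transformed[['spend_channel_1']].values
)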

with pm3.Model() as mmm1:

    #baseline (tau) and noise
    base = pm3.Normal("base", np.mean(y_transformed.values), sigma = 2) #tau
    #base = pm3.Exponential('base', lam=0.01)
    noise = pm3.Exponential('noise', lam=0.1) #epsilon

    #media effects
    channel_contributions = []
    for channel in media_channels:
        print(f"Media channels: Adding {channel}")
        x = X_transformed[channel].values
        #channel coefficients (forced positive):
        beta = pm3.HalfNormal(f'beta_{channel}', sigma = 2)
        #adstock decay
        alpha = pm3.Beta(f'alpha_{channel}', alpha=2, beta=2)
        #delay
        theta = pm3.Uniform(f'theta_{channel}',lower=0, upper=12)
        x_star = carryover(
                    x,
                    alpha,
                    theta,
                    delayed=True
                    )
        channel_contribution = pm3.Deterministic(
            f'contribution_{channel}',
            beta * x_star,
            )
        channel_contributions.append(channel_contribution)
    
    #control effects
    control_contributions = []
    for control_var in control_variables:
        print(f"Control Variables: Adding {control_var}")
        z = X_transformed[control_var].values
        #control variables
        gamma = pm3.Normal(f"gamma_{control_var}", sigma = 3)
        control_effect = gamma * z
        control_contributions.append(control_effect)

    #add everything
    revenue = pm3.Normal(
        'revenue',
        mu= base + sum(control_contributions) + sum(channel_contributions),
        sigma=noise,
        observed=y_transformed
    )
Media channels: Adding spend_channel_1
Media channels: Adding spend_channel_2
Media channels: Adding spend_channel_3
Media channels: Adding spend_channel_4
Media channels: Adding spend_channel_5
Media channels: Adding spend_channel_6
Media channels: Adding spend_channel_7
Control Variables: Adding trend
Control Variables: Adding season

Do the prior distributions make sense?

We can check whether the estimates based on the priors alone more or less make sense, judged by whether the prior predictive samples roughly align with the observations.

with mmm1:
    prior_pred = pm3.sample_prior_predictive()
prior_names = [prior_name for prior_name in list(prior_pred.keys()) if (prior_name.endswith("logodds__") == False) & (prior_name.endswith("_log__") == False)]
fig, ax = plt.subplots(figsize = (20, 8))
_ = ax.plot(prior_pred["revenue"].T, color = "0.5", alpha = 0.1)
_ = ax.plot(y_transformed.values, color = "red")

Check the prior distributions:

#plots priors, either directly from the random variables or from a prior-predictive dictionary
def plot_priors(variables, prior_dictionary=None):
    if not isinstance(variables[0], pm3.model.TransformedRV) and prior_dictionary is None:
        raise Exception("prior dictionary should be provided. It can be generated by sample_prior_predictive")
    cols = 7
    rows = int(math.ceil(len(variables) / cols))
    fig, ax = plt.subplots(rows, cols, figsize=(15, 3 * rows))
    ax = np.reshape(ax, (-1, cols))
    for i in range(rows):
        for j in range(cols):
            vi = i * cols + j
            if vi < len(variables):
                var = variables[vi]
                if isinstance(var, pm3.model.TransformedRV):
                    # sample directly from the random variable
                    sns.histplot(var.random(size=10000).flatten(), kde=True, ax=ax[i, j])
                    ax[i, j].set_title(var.name)
                else:
                    # look up the samples in the prior-predictive dictionary
                    prior = prior_dictionary[var]
                    sns.histplot(prior, kde=True, ax=ax[i, j])
                    ax[i, j].set_title(var)
    plt.tight_layout()

media_coef_priors = [p for p in prior_names if p.startswith("beta")]
plot_priors(media_coef_priors, prior_pred)
print(f"beta priors: {len(media_coef_priors)}")

adstock_priors = [p for p in prior_names if p.startswith("alpha")]
plot_priors(adstock_priors, prior_pred)
print(f"alpha priors: {len(adstock_priors)}")

adstock_priors = [p for p in prior_names if p.startswith("theta")]
plot_priors(adstock_priors, prior_pred)
print(f"theta priors: {len(adstock_priors)}")

control_coef_priors = [p for p in prior_names if p.startswith("gamma_")] + ["base", "noise"]
plot_priors(control_coef_priors, prior_pred)
print(f"gamma priors: {len(control_coef_priors)}")
beta priors: 7
alpha priors: 7
theta priors: 14
gamma priors: 4

Fit the model

with mmm1:
  trace = pm3.sample(return_inferencedata=True, tune=3000, target_accept=0.95)
  trace_summary = az.summary(trace)
100.00% [4000/4000 02:44<00:00 Sampling chain 0, 0 divergences]
100.00% [4000/4000 02:57<00:00 Sampling chain 1, 0 divergences]
trace_summary
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
base 7.155 1.224 4.886 9.464 0.033 0.023 1393.0 1355.0 1.00
gamma_trend 2.445 1.990 -1.465 5.856 0.056 0.040 1276.0 1215.0 1.00
gamma_season 4.993 1.951 1.466 8.759 0.055 0.039 1279.0 1303.0 1.00
noise 4.076 0.322 3.455 4.662 0.008 0.006 1713.0 1255.0 1.01
beta_spend_channel_1 1.556 1.210 0.000 3.724 0.049 0.034 514.0 681.0 1.00
... ... ... ... ... ... ... ... ... ...
contribution_spend_channel_7[99] 0.449 0.365 0.000 1.113 0.009 0.006 1412.0 953.0 1.00
contribution_spend_channel_7[100] 0.423 0.339 0.000 1.039 0.008 0.006 1512.0 936.0 1.00
contribution_spend_channel_7[101] 0.459 0.387 0.000 1.155 0.009 0.007 1447.0 992.0 1.00
contribution_spend_channel_7[102] 0.485 0.420 0.000 1.257 0.010 0.007 1356.0 952.0 1.00
contribution_spend_channel_7[103] 0.507 0.444 0.000 1.324 0.011 0.008 1303.0 953.0 1.00

753 rows × 9 columns
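The full summary contains one row per week for every contribution_* deterministic, which makes it long (753 rows). A small convenience sketch to keep only the scalar parameters:

# Sketch: drop the per-week 'contribution_*' rows and keep the scalar parameters
scalar_rows = [idx for idx in trace_summary.index
               if not idx.startswith('contribution_')]
trace_summary.loc[scalar_rows]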

Analysis

Posterior distributions

az.plot_posterior(
    trace,
    var_names=['~contribution'],
    filter_vars='like'
)

Predictions vs Observations

We can now check the model skill by plotting the predictions and the observations together and calculating error metrics such as the MAE.

with mmm1:
    posterior = pm3.sample_posterior_predictive(trace)
y_pred = posterior['revenue'].mean(0)*10000  # undo the 1/10000 rescaling of the target
y_stds = posterior['revenue'].std(0)*10000

MAE = mean_absolute_error(y.values, y_pred)
MAPE = mean_absolute_percentage_error(y.values, y_pred)*100
SkillStr = 'MAE: %5d\nMAPE: %5.2f%%'%(MAE,MAPE)
fig, ax = plt.subplots(figsize=(12, 6))
plt.subplots_adjust(left=0.15,
                        bottom=0.15,
                        right=0.95,
                        top=0.9)
ax.plot(y.values, linewidth=2, c='r', label='Observations')
ax.plot(y_pred, linewidth=1, c='b', label='Mean prediction')
ax.fill_between(np.arange(len(y)), y_pred - 2*y_stds, y_pred + 2*y_stds, alpha=0.33)
ax.text(0.85,0.9,SkillStr, transform=ax.transAxes)
ax.legend(loc='upper center')
ax.set_xlabel('Week')
ax.set_ylabel('Revenue')
plt.show()

Except for two weeks, the observations lie within two standard deviations of the predictions. Those outliers are likely due to special events, such as a promotion or a holiday, which are not accounted for by the model. The mean absolute error corresponds to about 20% of the revenue.

Channel Contributions and ROI

def compute_mean(trace, channel):
    # average over the 2000 posterior samples (2 chains x 1000 draws),
    # keeping one mean contribution per week (104 weeks)
    return (trace
            .posterior[f'{channel}']
            .values
            .reshape(2000, 104)
            .mean(0)
           )

channels = [f'contribution_spend_channel_{i}' for i in range(1,8)]

unadj_contributions = pd.DataFrame(
    {'Base+Trend+Seas': trace.posterior['base'].values.mean()
                 +trace.posterior['gamma_trend'].values.mean()
                 +trace.posterior['gamma_season'].values.mean()},
    index=X.index
)

for channel in channels:
    unadj_contributions[channel] = compute_mean(trace, channel)

adj_contributions = (unadj_contributions
                     .div(unadj_contributions.sum(axis=1), axis=0)
                     .mul(y, axis=0)
                    )

ax = (adj_contributions
      .plot.area(
          figsize=(12, 6),
          linewidth=1,
          title='Predicted Revenue and Breakdown',
          ylabel='Revenue',
          xlabel='Week'
      )
     )
    
handles, labels = ax.get_legend_handles_labels()
ax.legend(
    handles[::-1], labels[::-1],
    title='Channels', loc="upper right",
    #bbox_to_anchor=(1.01, 0.5)
)
plt.show()

According to this plot, the model suggests that a large portion of the revenue is not explained by marketing actions.

For each channel \(m\), percentage \(ROI_m\) can be calculated according to:

\[ ROI_m = \frac{\sum_t C_{t,m} - \sum_t S_{t,m}}{\sum_t S_{t,m}} \times 100 \]

where \(C_{t,m}\) is the revenue contribution of media channel \(m\) at time step \(t\), and \(S_{t,m}\) is the spend on that channel.

#Calculate ROI for each channel
total_contr = adj_contributions.sum(axis=0)
total_spend = X.sum(axis=0)

Cchannels = [f'contribution_spend_channel_{i}' for i in range(1,8)]
            
Schannels = [f'spend_channel_{i}' for i in range(1,8)]

ROI_l= [None] * 7
spend_l = [None] * 7
contr_l = [None] * 7
for i in range(7):
    spend_l[i] = total_spend[Schannels[i]]
    contr_l[i] = total_contr[Cchannels[i]]
    ROI_l[i] = (contr_l[i] - spend_l[i])/spend_l[i] *100

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(8, 10))
ax1.bar(np.arange(1,8) - 0.2, spend_l, color = 'r', width = 0.4, label='Spend')
ax1.bar(np.arange(1,8) + 0.2, contr_l, color = 'b', width = 0.4, label='Contr')
ax1.set_xlabel('Marketing Channel')
ax1.set_ylabel('Contribution and Spends')
ax1.legend(loc='upper left')

ax2.bar(range(1,8),ROI_l)
ax2.set_xlabel('Marketing Channel')
ax2.set_ylabel('ROI (%)')

plt.show()

Our model suggests that only channels 1, 2 and 6 generate positive net gains. Among these, channel 2 seems the most effective in terms of ROI; however, in terms of absolute revenue contribution, channel 6 is the most important source. Among the channels that result in net costs, channel 7 requires the most immediate attention, both in terms of ROI and absolute net cost. Continued investment in channels 3 and 4 also seems questionable.