October 16, 2016

### Methodology

### Results

92%, 99%, 99.9%… The past few years have seen lots of flashy announcements touting accuracy statistics from various cross-device solutions that are confidence inspiring at first glance. These results leave much to be desired, though, when we dig just one level deeper and ask exactly how datasets are matched and what constituent quantities are compared in order to produce those stats. For example, some have gone so far as to include cases where the cross-device solution did *not* make a link in the accuracy computation. Suppose your algorithm makes a link in 5% of possible cases; these links would be evaluated for veracity. However, the 95% of cases where a link was not made would also be counted in the “correct” piece of the calculation if there was in fact no link between the ids. While this is in some theoretical sense appropriate, our position is that it doesn’t answer the basic question of interest as regards accuracy, which is: “When you declare to me that there is a link between cookie A and device B, what’s the probability that your declaration is true?”
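To make the gap concrete, here is a minimal sketch of the two calculations. The counts are hypothetical, chosen only to match the 5% / 95% split above, and the assumption that 60% of declared links are correct is ours for illustration.

```python
def inflated_accuracy(true_pos, false_pos, true_neg, false_neg):
    """Accuracy that also credits pairs where no link was declared."""
    total = true_pos + false_pos + true_neg + false_neg
    return (true_pos + true_neg) / total


def declared_link_accuracy(true_pos, false_pos):
    """Share of declared links that are actually correct."""
    return true_pos / (true_pos + false_pos)


# Hypothetical: 1,000,000 candidate pairs, links declared for 5% of them,
# and only 60% of those declarations correct. The remaining 95% are
# assumed to be genuinely unlinked pairs that were never declared.
true_pos, false_pos = 30_000, 20_000
true_neg, false_neg = 950_000, 0

print(inflated_accuracy(true_pos, false_pos, true_neg, false_neg))  # 0.98
print(declared_link_accuracy(true_pos, false_pos))                  # 0.6
```

The same graph reports 98% under the first definition and 60% under the second, which is the one that answers the question posed above.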

Transparency in reporting our metrics means detailing exactly how we arrive at the numbers we claim. This requires some preliminaries. Firstly, we have a probabilistic device graph that is the result of our internal processing. Its final output is a list of cookie ids (Lotame ids), each of which is matched to a number of mobile device ids (IDFAs on iOS and GAIDs on Android). On the other side, we have deterministic matches that come from a number of providers and are indexed on the same ids as used by our graph. This results in a link-by-link comparison between the two graphs, in which there are four possible cases:

- A link is in both graphs. This is the least complicated case and under any methodology this should be called a true positive (an accurate match) for the probabilistic graph we’re comparing.
- A link is in the probabilistic graph, but not in the deterministic graph. This case requires more analysis, as there are at least a couple of ways a link can end up in this state.
  - This can happen because the deterministic graph is entirely “unaware” of at least one of the two ids in the link. We take the view that the deterministic graph couldn’t possibly have made this link, since it never knew about the id on one side of it. We therefore treat this link in the probabilistic graph as being of unknown veracity and count it as neither a true positive nor a false positive.
  - The other possibility is that the deterministic graph is “aware” of both ids, but they are linked to ids other than those they are linked to in the probabilistic graph (the ids are known, but link elsewhere). We call this an inaccurate match (a false positive).

- A potential link is in neither the probabilistic nor the deterministic graph. Whereas some choose to consider this a true negative, we elect not to count it at all. Firstly, we don’t believe end customers care about the rate at which the graph correctly declines to declare a link. More importantly, the overwhelming majority of id pairs should not be linked, so an accuracy metric that includes this quantity is very easy to force to arbitrarily high levels: never declare any links, and even an empty graph would have an impressively high “accuracy”.
- A link is in the deterministic graph, but not in the probabilistic graph. Again, two cases need to be discussed here:
  - The probabilistic graph is unaware of at least one of the ids seen in the deterministic graph. We call this unknown rather than a missed detection, because the lack of linking could be simply due to the fact that the data input to the graph never included the id in question.
  - The probabilistic graph did see the ids in the link, but didn’t link them to each other. In this case, we declare this a false negative.
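The four cases above can be sketched as a small classification routine. This is an illustration under stated assumptions, not our production code: each graph is represented as a set of `(cookie_id, device_id)` tuples, and the function name, labels, and sample ids are all hypothetical.

```python
from collections import Counter


def classify_links(probabilistic, deterministic):
    """Label every link that appears in at least one of the two graphs.

    Both arguments are sets of (cookie_id, device_id) tuples. Pairs that
    appear in neither graph are never enumerated, so they are not counted.
    """
    prob_cookies = {cookie for cookie, _ in probabilistic}
    prob_devices = {device for _, device in probabilistic}
    det_cookies = {cookie for cookie, _ in deterministic}
    det_devices = {device for _, device in deterministic}

    labels = {}
    for link in probabilistic | deterministic:
        cookie, device = link
        if link in probabilistic and link in deterministic:
            labels[link] = "true_positive"
        elif link in probabilistic:
            # Only a false positive if the deterministic graph knows both
            # ids and links them elsewhere; otherwise veracity is unknown.
            known = cookie in det_cookies and device in det_devices
            labels[link] = "false_positive" if known else "unknown"
        else:
            # Only a false negative if the probabilistic graph saw both
            # ids but did not link them to each other.
            seen = cookie in prob_cookies and device in prob_devices
            labels[link] = "false_negative" if seen else "unknown"
    return labels


# Hypothetical sample data.
probabilistic = {("c1", "d1"), ("c2", "d2"), ("c3", "d3")}
deterministic = {("c1", "d1"), ("c2", "d4"), ("c3", "d2")}

labels = classify_links(probabilistic, deterministic)
print(Counter(labels.values()))
```

In this sample, `("c2", "d2")` is a false positive because the deterministic graph knows both ids but links them elsewhere, while `("c3", "d3")` stays unknown because the deterministic graph never saw `d3`.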

Once all these quantities are defined, accuracy and reach are well defined and straightforward to compute. Accuracy is the ratio of true positives to all declared positives (true positives / (true positives + false positives)), and reach is the ratio of true positives to all true links (true positives / (true positives + false negatives)).
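As a minimal sketch, the two ratios can be computed directly from the counts. The function name is hypothetical, and the sample counts are invented for illustration rather than taken from our data.

```python
def accuracy_and_reach(true_pos, false_pos, false_neg):
    """Return (accuracy, reach); unknown links are excluded before counting.

    A ratio is None when its denominator is zero, for example when the
    graph declares no links that can be checked.
    """
    declared = true_pos + false_pos
    true_links = true_pos + false_neg
    accuracy = true_pos / declared if declared else None
    reach = true_pos / true_links if true_links else None
    return accuracy, reach


# Hypothetical counts.
accuracy, reach = accuracy_and_reach(true_pos=900, false_pos=100, false_neg=300)
print(accuracy, reach)  # 0.9 0.75
```

Note that true negatives and links of unknown veracity never enter either ratio.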

We suspect that our measure of accuracy is a lower bound on the true statistic, as it is impossible to collect a deterministic graph that is in any sense complete or completely accurate. For example, if we use publisher or online service X, user A may happen never to log into site X on their phone. Even worse, a friend, user B, may log in on his or her own phone and have logged in once on user A’s desktop browser. There is no way to ultimately verify truth. However, if both the deterministic and the probabilistic methodologies agree that two ids are linked, it stands to reason that it’s unlikely both are wrong.

Our default settings create a graph with above 90% accuracy and around 75% reach. We believe this to be sufficient for a wide variety of data propagation use cases, including targeting, retargeting, and aggregate-level analytics, where the underlying data is unlikely to exhibit accuracy beyond this limit. We do not recommend expanding reach beyond our standard limits because the marginal accuracy of each extra link we add decreases. Our cutoff, while achieving a net 90%+ accuracy rate for the graph, yields a minimum marginal accuracy rate of around 65% for the lowest-scoring links. We do not think it’s valuable for any customer to include links with a lower marginal accuracy. However, for more specific use cases where accuracy is of utmost importance, the graph can be tuned up to almost 97% accuracy, with reach sacrificed down to around 20%. In between these two extremes, the tradeoff is that from almost any point on the curve, we give up around 50% of reach for every 2-3 points of accuracy.
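The difference between net and marginal accuracy can be illustrated with a short sketch. It assumes links are grouped into score bands, highest-scoring first, and that the number of verifiable true links is fixed. The band counts below are hypothetical and were chosen only to resemble the figures quoted above; they are not our measured curve.

```python
def sweep_cutoffs(bands, total_true_links):
    """Net accuracy, marginal accuracy, and reach as the cutoff is lowered.

    bands: list of (links_declared, links_correct), highest score band first.
    total_true_links: true positives plus false negatives, held constant.
    """
    rows = []
    cum_declared = cum_correct = 0
    for declared, correct in bands:
        cum_declared += declared
        cum_correct += correct
        rows.append({
            "marginal_accuracy": correct / declared,     # this band alone
            "net_accuracy": cum_correct / cum_declared,  # all bands so far
            "reach": cum_correct / total_true_links,
        })
    return rows


# Hypothetical score bands.
bands = [(200, 194), (300, 288), (200, 180), (100, 65)]
for row in sweep_cutoffs(bands, total_true_links=1000):
    print({name: round(value, 3) for name, value in row.items()})
```

With these numbers, including the last band still leaves net accuracy just above 90%, even though only 65% of the links in that band are correct. This is why the cutoff is set on marginal rather than net accuracy.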

*This article was written by Lotame’s Chief Data Scientist, Omar Abdala, who was recently awarded the American Marketing Association’s 4 Under 40 Emerging Leaders Award!*