Using AUC to Optimize N-Gram Weights for Cryptanalysis

In cryptanalysis, evaluating and training fitness functions for classical cipher solvers is a tricky problem. Typically, candidate plaintexts are scored with character-level n-gram log-likelihoods estimated from a large corpus. But what if we could use a more robust criterion to evaluate, and even tune, these fitness functions?
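
To make the setup concrete, here is a minimal sketch of the kind of scorer I mean. The function names, the add-alpha smoothing, and the floor penalty for unseen n-grams are my own illustration, not a specific library:

```python
import math
from collections import Counter

def train_ngram_logprobs(corpus: str, n: int = 3, alpha: float = 0.1) -> dict:
    """Estimate character n-gram log-probabilities with add-alpha smoothing."""
    grams = Counter(corpus[i:i + n] for i in range(len(corpus) - n + 1))
    contexts = Counter(corpus[i:i + n - 1] for i in range(len(corpus) - n + 2))
    vocab = len(set(corpus))
    return {g: math.log((c + alpha) / (contexts[g[:-1]] + alpha * vocab))
            for g, c in grams.items()}

def ngram_score(text: str, logprobs: dict, n: int = 3, floor: float = -12.0) -> float:
    """Fitness of a candidate plaintext: sum of n-gram log-likelihoods,
    with a flat penalty for n-grams never seen in the corpus."""
    return sum(logprobs.get(text[i:i + n], floor)
               for i in range(len(text) - n + 1))
```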

I’ve been experimenting with using ROC/AUC (Receiver Operating Characteristic/Area Under the Curve) as a criterion to evaluate and train fitness functions. The idea is to frame this as a pairwise ranking problem: sample two candidate keys, decrypt both, compute their n-gram scores, and check whether the score difference is consistent with an oracle preference.

In my case, the oracle is Levenshtein distance to the ground-truth plaintext, and the fitness function “wins” on a pair if it ranks the candidate with the smaller edit distance higher. As expected, higher-order n-grams help, and a tuned bigram-trigram mixture outperforms plain trigrams.
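
Roughly, the evaluation looks like the sketch below. Here `decrypt`, `fitness`, `candidate_keys`, and the pair-sampling scheme are placeholders for whatever solver and neighborhood you actually use:

```python
import random

def levenshtein(a: str, b: str) -> int:
    """Standard dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def pairwise_auc(truth, candidate_keys, decrypt, fitness, num_pairs=10_000):
    """Fraction of sampled key pairs where the fitness ranking agrees with the
    edit-distance oracle (ties in fitness count as half a win)."""
    wins = ties = total = 0
    for _ in range(num_pairs):
        k1, k2 = random.sample(candidate_keys, 2)
        p1, p2 = decrypt(k1), decrypt(k2)
        d1, d2 = levenshtein(p1, truth), levenshtein(p2, truth)
        if d1 == d2:
            continue  # oracle expresses no preference for this pair
        total += 1
        score_diff = fitness(p1) - fitness(p2)
        oracle_pref = d2 - d1  # positive if p1 is closer to the truth
        if score_diff == 0:
            ties += 1
        elif score_diff * oracle_pref > 0:
            wins += 1
    return (wins + 0.5 * ties) / total if total else float("nan")
```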

But here’s the interesting part: instead of estimating n-gram weights generatively, why not learn them discriminatively by trying to maximize pairwise AUC on local neighborhoods? Treat the scorer as a linear model over n-gram count features and optimize a pairwise ranking surrogate.
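
In code, I imagine something like the following sketch: a pairwise logistic (RankNet-style) surrogate over n-gram count differences, trained with plain SGD. The helper names, learning rate, and epoch count are illustrative assumptions, not results I have:

```python
import math
import random
from collections import Counter

def count_features(text: str, vocab: list, n: int = 3) -> list:
    """N-gram count vector over a fixed vocabulary of n-grams."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    return [counts[g] for g in vocab]

def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def train_pairwise_weights(pairs, vocab, n=3, lr=0.01, epochs=5):
    """Learn n-gram weights w by minimizing the pairwise logistic surrogate
    log(1 + exp(-w . (f(better) - f(worse)))) over oracle-ordered pairs
    (better_text, worse_text) drawn from the local neighborhood."""
    w = [0.0] * len(vocab)
    for _ in range(epochs):
        random.shuffle(pairs)
        for better, worse in pairs:
            diff = [a - b for a, b in zip(count_features(better, vocab, n),
                                          count_features(worse, vocab, n))]
            margin = sum(wi * di for wi, di in zip(w, diff))
            grad = sigmoid(margin) - 1.0  # d/d(margin) of the logistic loss
            for i, di in enumerate(diff):
                w[i] -= lr * grad * di
    return w
```

The logistic loss is a smooth surrogate for the 0/1 pairwise AUC objective, so maximizing it approximately maximizes the ranking agreement measured above.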

I haven’t trained this yet; so far I’ve only been using AUC to evaluate fitness functions, and it works surprisingly well. So I’m asking: has anyone seen this done explicitly, i.e., training n-gram weights to maximize pairwise ROC/AUC under a task-specific oracle and neighborhood?

This approach feels close to pairwise discriminative language modeling or bipartite ranking. If you have any insights or know of similar work, I’d love to hear about it.

*Further reading: [Pairwise Ranking](https://en.wikipedia.org/wiki/Learning_to_rank)*
