Clustering cheaply under local differential privacy (Laurent Massoulie)

The trade-off between privacy preservation and statistical inference takes
interesting forms in the context of recommendation engines. In we considered the problem of inferring
clusters of similar items from inputs of users under an extreme form of
differential privacy, namely local differential privacy. Local DP mandates
each user to only provide input that is distorted by noise so that the
underlying ratings of items by that user (as well as which items it rated)
remain ambiguous. We assessed performance of clustering schemes under
these constraints by query complexity. This counts how many users are
probed before a cluster structure underlying users’s ratings can be
reliably identified. Assuming users rate a small number of items chosen
uniformly at random, the best query complexity we could achieve is an
impracticably large \tilde{O}(N^2) where N is the number of items.
However the best lower bounds we know are in \tilde{\Omega}(N).

This begs the question of determining whether in this setup quasi-linear
query complexity suffices to successfully cluster, and if so, what scheme
achieves such complexity?

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s