Richard Bergmair's Projects

Mathematical Modelling & Simulation of Search Anonymity

Contract type: consulting
Client: DuckDuckGo, Inc.
Role: Senior Backend Engineer
Time period: Aug 2018 – Mar 2022
Volume: ~ 6200h

data science • Linux • Perl • Python • Amazon EC2 • Amazon S3 • cloud infrastructure • Elasticsearch • Apache Solr • search engines • Kyoto Cabinet • NoSQL • ClickHouse • PostgreSQL • QGIS • analytic cube • Matplotlib • NumPy • SciPy • scikit-learn

I worked for a web search engine company serving privacy-conscious users and was entrusted with a mission-critical project, reporting directly to C-suite executives: formulating a mathematical model of the anonymity properties of the data passed on to upstream service providers.

Since there was no widely published mathematical model for anonymity in data structures of the kind we were dealing with, coming up with the model involved a fair amount of creative mathematical problem solving.

The company’s commitment to privacy meant that I had to develop a special methodology for extracting statistics from the production system in a privacy-conscious way. These statistics were then used to generate data through Monte Carlo simulation, which served as a substitute for “real” session information in our subsequent analysis.
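The general idea of simulating sessions from aggregate statistics can be sketched as follows. This is only an illustration, not the project's actual method: the shape of the statistics (a histogram of session lengths and a frequency table of attribute values) and the names `simulate_sessions`, `length_counts`, and `attribute_probs` are assumptions made for the example.

```python
import numpy as np


def simulate_sessions(length_counts, attribute_probs, n_sessions, seed=0):
    """Draw synthetic sessions from aggregate statistics only.

    length_counts:   hypothetical histogram {session_length: count}
    attribute_probs: hypothetical frequencies {attribute_value: weight}
    """
    rng = np.random.default_rng(seed)

    # Normalize the session-length histogram into a probability distribution.
    lengths = np.array(sorted(length_counts))
    length_p = np.array([length_counts[k] for k in lengths], dtype=float)
    length_p /= length_p.sum()

    # Normalize the attribute frequencies in the same way.
    values = list(attribute_probs)
    value_p = np.array([attribute_probs[v] for v in values], dtype=float)
    value_p /= value_p.sum()

    sessions = []
    for n in rng.choice(lengths, size=n_sessions, p=length_p):
        # Each event in a session is sampled independently (a simplification).
        idx = rng.choice(len(values), size=int(n), p=value_p)
        sessions.append([values[i] for i in idx])
    return sessions


sessions = simulate_sessions({1: 50, 2: 30, 3: 20}, {"a": 0.7, "b": 0.3}, 100)
```

No individual session from the production system is needed here: the simulation consumes only the aggregate distributions, which is the point of such a setup.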

This simulated data was then run through our anonymization process, as well as through an evaluation process that quantified the level of anonymity in the data before and after anonymization.
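A before/after comparison of this kind can be illustrated with a simple, standard measure: the size of the smallest group of indistinguishable records (the anonymity set, as in k-anonymity). This is a stand-in chosen for the example and not the model developed in the project; the record fields (region, hour) and the `coarsen` step are hypothetical.

```python
from collections import Counter


def min_anonymity_set(records):
    """Size of the smallest group of records that look identical."""
    return min(Counter(records).values())


def coarsen(record):
    """Hypothetical anonymization step: keep only the country prefix
    of the region and bucket the hour into 6-hour blocks."""
    region, hour = record
    return (region[:2], hour // 6)


# Made-up records of the form (region, hour of day).
records = [
    ("DE-BY", 9), ("DE-BE", 10), ("DE-BY", 11),
    ("US-CA", 9), ("US-NY", 8), ("US-CA", 10),
]
anonymized = [coarsen(r) for r in records]

k_before = min_anonymity_set(records)     # every record is unique
k_after = min_anonymity_set(anonymized)   # records now fall into larger groups
```

A larger value after anonymization means each record hides among more look-alikes.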

==> DuckDuckGo