The course meets for Monday lectures and Wednesday discussions, both at 12:15-1:15pm PT. In holiday weeks (W3 MLK, W8 Presidents’ Day), lectures will be on Wednesday and discussions on Friday, again both at 12:15-1:15pm PT. The first three weeks will be on Zoom (see Canvas). After that, we will meet in 380-380W on Mondays and Lathrop 299 on Wednesdays. Holiday-week Friday meetings will be in 380-380W.
Instructor: Prof. Johan Ugander (MS&E), jugander@
Office hours (Zoom): Wednesdays 4:30p-5:30p PT, with added office hours for project support later in the course.
Office hours (Zoom): See Canvas, but also posted here for convenience:
- PS 1, due week 3, Fri 1/21, 12:15pm PT
  - Fri Jan 14, 14:00-15:00 PT
  - Weds Jan 19, 14:30-15:30 PT
- PS 2, due week 5, Weds 2/2, 12:00pm PT
  - Fri Jan 28, 14:00-15:00 PT
  - Mon Jan 31, 14:30-15:30 PT
- PS 3, due week 7, Weds 2/16, 12:00pm PT
  - Fri Feb 11, 14:00-15:00 PT
  - Mon Feb 14, 14:30-15:30 PT
- Project OH
  - Wed Feb 23, 14:30-15:30 PT
  - Wed Mar 2, 14:30-15:30 PT
The course evaluation consists of three parts: problem sets (40%), in-class discussion leading and participation (20%), and group project reports and presentations (40%). Students will rotate to lead Wednesday discussions. There will be 3 problem sets that include significant data manipulation and coding. These will be due before the second class meetings of Week 3 (Friday), Week 5 (Wednesday), and Week 7 (Wednesday). Group projects will be developed over the course of the quarter and presented during Week 10.
Lectures will be recorded, but synchronous attendance is expected. Please email Prof. Ugander if you will be missing lecture. Discussion section attendance is mandatory, and discussions are not recorded. Because of the key role of discussions, it is not possible to complete this course asynchronously.
Week 1: Introduction (1/3, 1/5)
- Solon Barocas, danah boyd (2017) Engaging the Ethics of Data Science in Practice, Communications of the ACM.
- Ethan Zuckerman (2016) The Perils of Using Technology to Solve Other People’s Problems, The Atlantic.
- Eric Horvitz, Deirdre Mulligan (2015) Data, privacy, and the greater good, Science.
- Ivan Illich (1968) To Hell With Good Intentions.
Week 2: Digital exhaust and privacy (1/10, 1/12)
Discussion paper: Sweeney (2000)
- 1990 U.S. Census:
Latanya Sweeney (2000) Simple Demographics Often Identify People Uniquely, CMU Data Privacy Working Paper 3.
Philippe Golle (2006) Revisiting the Uniqueness of Simple Demographics in the US Population, WPES.
- AOL search logs (2006):
NY Times (Aug. 9, 2006), A Face Is Exposed for AOL Searcher No. 4417749.
NY Times (Aug 23, 2006), Researchers Yearn to Use AOL Logs, but They Hesitate.
- Netflix prize (2007), Facebook Beacon (2007-2009), and the Video Privacy Protection Act:
Arvind Narayanan, Vitaly Shmatikov (2008) Robust De-anonymization of Large Sparse Datasets, IEEE Security and Privacy. (see also author FAQ)
Bruce Schneier (Dec 12, 2007) Why ‘Anonymous’ Data Sometimes Isn’t, Wired.
NY Times (Sept 19, 2009) Facebook Will Shut Down Beacon to Settle Lawsuit.
Ryan Singel (Dec 17, 2009) Netflix spilled your Brokeback Mountain secret, lawsuit claims, Wired.
- Location data, cell phones, credit cards (2009, 2013):
Philippe Golle & Kurt Partridge (2009) On the anonymity of home/work location pairs, IEEE PerCom.
Yves-Alexandre de Montjoye et al. (2013) Unique in the Crowd: The privacy bounds of human mobility, Scientific Reports.
Yves-Alexandre de Montjoye et al. (2015) Unique in the shopping mall: On the reidentifiability of credit card metadata, Science.
Scott Berinato (Feb 8, 2015) There’s No Such Thing as Anonymous Data, Harvard Business Review.
D.J. Pangburn (Sept 26, 2017) Even This Data Guru Is Creeped Out By What Anonymous Location Data Reveals About Us, Fast Company.
NY Times (Dec 10, 2018) Your Apps Know Where You Were Last Night, and They’re Not Keeping It Secret.
NY Times Opinion (Dec 19, 2019) Twelve Million Phones, One Dataset, Zero Privacy.
- Miscellaneous:
Jonathan Chang et al. (2010) ePluribus: Ethnicity on Social Networks, ICWSM.
Timnit Gebru et al. (2017) Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States, PNAS.
Nikhil Naik et al. (2017) Computer vision uncovers predictors of physical urban change, PNAS.
NY Times (Dec 31, 2017) How Do You Vote? 50 Million Google Images Give a Clue.
The Outline (Feb 27, 2017) J.C. Penney’s troubles are reflected in satellite images of its parking lots.
Kashmir Hill (Feb 7, 2018) The House That Spied on Me, Gizmodo.
Andrew Bortz, Dan Boneh, Palash Nandy (2007) Exposing Private Information by Timing Web Applications, WWW.
Andrew Reed, Michael Kranch (2017) Identifying HTTPS-Protected Netflix Videos in Real-Time, CODASPY.
Jinwoo Kim et al. (2017) Hello, Facebook! Here is the stalkers’ paradise!: Design and analysis of enumeration attack using phone numbers on Facebook, International Conference on Information Security Practice and Experience.
Yaniv Erlich, Tal Shor, Itsik Pe’er, Shai Carmi (2018) Identity inference of genomic data using long-range familial searches, Science.
- Cynthia Dwork & Aaron Roth (2014) Chapter 1: The Promise of Differential Privacy, in The Algorithmic Foundations of Differential Privacy, NOW Publishers.
- Zhanglong Ji, Zachary C. Lipton, Charles Elkan (2014) Differential Privacy and Machine Learning: a Survey and Review, arXiv.
- Logistic regression:
Kamalika Chaudhuri, Claire Monteleoni (2009) Privacy-preserving logistic regression, NeurIPS.
- Federated learning:
H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, Blaise Agüera y Arcas (2017) Communication-Efficient Learning of Deep Networks from Decentralized Data, AISTATS.
Abhishek Bhowmick, John Duchi, Julien Freudiger, Gaurav Kapoor, Ryan Rogers (2018) Protection Against Reconstruction and Its Applications in Private Federated Learning, arXiv.
- Frank McSherry (Feb 25, 2018) Uber’s differential privacy .. probably isn’t.
- Nicholas Carlini et al. (2018) The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets, arXiv.
- Yoav Goldberg (2018) 4gram language models share secrets too…, GitHub.
- Linear Program Reconstruction (Diffix attack):
Irit Dinur and Kobbi Nissim (2003) Revealing information while preserving privacy, PODS.
Aloni Cohen, Kobbi Nissim (2019) Linear Program Reconstruction in Practice, arXiv.
Andrea Gadotti, Florimond Houssiau, Luc Rocher, Benjamin Livshits, Yves-Alexandre de Montjoye (2019) When the Signal is in the Noise: Exploiting Diffix’s Sticky Noise, USENIX.
- Overview:
Lawrence Lessig (Oct 8, 2009) Against Transparency, The New Republic.
Chapter from Evgeny Morozov’s “To Save Everything, Click Here”
Karen EC Levy, David Merritt Johns (2016) When open data is a Trojan Horse: The weaponization of transparency in science and governance, Big Data & Society.
- Huffington Post FundRace (2007-2012, now offline):
Fundrace via archive.org, Oct 31, 2012.
Deborah G. Johnson, Priscilla M. Regan, Kent Wayland (2011) Campaign Disclosure, Privacy and Transparency, William & Mary Bill of Rights Journal.
- California Proposition 8, “eightmaps” (2008), California campaign finance disclosure laws:
Brad Stone (Feb 7, 2009) Prop 8 Donor Web Site Shows Disclosure Law Is 2-Edged Sword, The New York Times.
Cal-Access campaign finance database
California Civic Data Coalition
- Gun owners after Sandy Hook (2012):
Mark Memmott (2012) N.Y. Website Posts Map Of People With Gun Permits, Draws Criticism, NPR.
Snopes.com meta-review (2013) The gunowner next door.
- Mug shots online (2013-Present):
David Segal (Oct 5, 2013) Mugged by a Mug Shot Online, NY Times.
Rebecca Beitsch (Dec 11, 2017) Fight Against Mugshot Sites Brings Little Success, Pew.
David Segal (Feb 11, 2017) Trying to Minimize the Misery of Mug Shots, NY Times.
Eumi Lee (2018) Monetizing Shame: Mugshots, Privacy, and the Right to Access, Rutgers Law Review.
ClassActionAgainstMugShotWebsites.com
- Clinton Emails and Wikileaks (2016):
Cesar Hidalgo (Nov 4, 2016) What I learned from visualizing Hillary Clinton’s emails, Medium.
Cesar Hidalgo (Nov 8, 2016) What I learned the night of the election, and what I would like to see in the future, Medium.
- The Right to be Forgotten (2015-Present):
Theo Bertram et al. (2019) Five years of the right to be forgotten, CCS.
Google Transparency Report (2018) Search removals under European privacy law.
Radiolab podcast (Aug 23, 2019) Right to be forgotten.
- Real estate:
Sharique Hasan, Anuj Kumar (2018) Digitization and Divergence: School Ratings and Segregation in America, Working paper.
- Adam D. I. Kramer, Jamie E. Guillory and Jeffrey T. Hancock (2014) Experimental evidence of massive-scale emotional contagion through social networks, PNAS.
- Michael Luca (July 29, 2014) Were OkCupid’s and Facebook’s Experiments Unethical?, HBR.
- Scott E. Carrell, Bruce I. Sacerdote, James E. West (2013) From natural variation to optimal policy? The importance of endogenous peer group formation, Econometrica.
- So you think you can test? game by Lukas Vermeer
- A/B testing and sample size/budgeting:
Azevedo et al. (2019) A/B testing with fat tails, Working paper.
Elea McDonnell Feit & Ron Berman (2019) Test & Roll: Profit-Maximizing A/B Tests, Working paper.
- A/B testing and p-hacking:
Ramesh Johari, Pete Koomen, Leonid Pekelis, David Walsh (2017) Peeking at A/B Tests: Why it matters, and what to do about it, KDD.
Ron Berman, Leonid Pekelis, Aisling Scott, Christophe Van den Bulte (2018) p-Hacking and False Discovery in A/B Testing, Working Paper.
- Experiment Aversion, the “A/B Effect”:
Michelle Meyer (2015) Two Cheers for Corporate Experimentation: The A/B Illusion and the Virtues of Data-Driven Innovation, J. on Telecomm. & High Tech. L.
Michelle Meyer (2018) Ethical Considerations When Companies Study – And Fail to Study – Their Customers, The Cambridge Handbook of Consumer Privacy.
Michelle N. Meyer, Patrick R. Heck, Geoffrey S. Holtzman, Stephen M. Anderson, William Cai, Duncan J. Watts, Christopher F. Chabris (2019) Objecting to experiments that compare two unobjectionable policies or treatments, PNAS. (letter to editor, reply)
Patrick Heck, Christopher Chabris, Duncan Watts, Michelle Meyer (Oct 21, 2019) Sometimes People Dislike Experiments More than They Dislike Their Worst Conditions: Within-Subjects Evidence for “Experiment Aversion” and the A/B Effect, Working paper.
- Ad experiments:
Randall A Lewis, Justin M Rao (2015) The unfavorable economics of measuring the returns to advertising, QJE.
Garrett A. Johnson, Randall A. Lewis, and Elmar I. Nubbemeyer (2017) Ghost Ads: Improving the Economics of Measuring Online Ad Effectiveness, JMR.
Brett R. Gordon, Florian Zettelmeyer, Neha Bhargava, Dan Chapsky (2018) A Comparison of Approaches to Advertising Measurement: Evidence from Big Field Experiments at Facebook, Working Paper.
Jason Huang, David H. Reiley, Nickolai M. Riabov (2018) Measuring Consumer Sensitivity to Audio Advertising: A Field Experiment on Pandora Internet Radio, Working Paper.
Erin Griffith (April 30, 2018) Pandora Learns the Cost of Ads, and of Subscriptions, Wired.
- Recommendation and persuasion:
Maurits Kaptein & Dean Eckles (2010) Selecting effective means to any end: Futures and ethics of persuasion profiling. International Conference on Persuasive Technology.
- Search Engines:
David Easley & Jon Kleinberg (2011) 18.6, The Effect of Search Tools and Recommendation Systems, Networks, Crowds, and Markets.
Ryen White & Eric Horvitz (2015) Belief Dynamics and Biases in Web Search. ACM Transactions on Information Systems.
Wired (Sept 7, 2016) Google’s Clever Plan to Stop Aspiring ISIS Recruits.
Jigsaw (formerly Google Ideas) The Redirect Method, white paper.
NY Times (Sept 26, 2017) As Google Fights Fake News, Voices on the Margins Raise Alarm.
Wall Street Journal (Nov 16, 2017) Google Has Picked an Answer for You—Too Bad It’s Often Wrong. (examples).
- Collaborative filtering and recommendation systems:
Quartz (Dec 21, 2015) The magic that makes Spotify’s Discover Weekly playlists so damn good.
The Guardian (Feb 2, 2018) How an ex-YouTube insider investigated its secret algorithm.
Wired (March 13, 2018) YouTube will link directly to Wikipedia to fight conspiracy theories.
- Recommendation systems privacy leaks:
Joseph A. Calandrino, Ann Kilzer, Arvind Narayanan, Edward W. Felten, & Vitaly Shmatikov (2011) “You Might Also Like:” Privacy Risks of Collaborative Filtering. IEEE Security & Privacy.
Maciej Ceglowski (Sept 21, 2017) Anatomy of a moral panic.
Gizmodo (March 30, 2017) This Is Almost Certainly James Comey’s Twitter Account.
Jean Yang (March 31, 2017) Five Research Ideas Instagram Could Have Used to Protect Comey’s Secret Twitter.
- Recommendation system and control:
Carol Esmark, Stephanie M. Noble (Dec 28, 2016) Your In-Store Customers Want More Privacy.
Mike Yeomans, Anuj K. Shah, Sendhil Mullainathan, Jon Kleinberg (2018) Making sense of recommendations, Working paper.
- Folk theories of feed, “algorithm aversion”:
Berkeley Dietvorst, Joseph Simmons, Cade Massey (2014) Algorithm Aversion: People Erroneously Avoid Algorithms After Seeing Them Err, J Experimental Psychology.
Berkeley Dietvorst, Joseph Simmons, Cade Massey (2016) Overcoming Algorithm Aversion: People Will Use Imperfect Algorithms If They Can (Even Slightly) Modify Them, Working paper.
Motahhare Eslami et al. (2015) “I always assumed that I wasn’t really that close to [her]”: Reasoning about invisible algorithms in the news feed, CHI.
K. Vaccaro et al. (2017) The Illusion of Control: Placebo Effects of Control Settings, CHI.
Megan French, Jeff Hancock (2017) What’s the Folk Theory? Reasoning About Cyber-Social Systems, Working Paper.
Pew Research (Sept 5, 2018) Many Facebook users don’t understand how the site’s news feed works.
Maurice Jakesch, Megan French, Xiao Ma, Jeffrey T. Hancock, Mor Naaman (2019) AI-Mediated Communication: How the Perception that Profile Text was Written by AI Affects Trustworthiness, CHI.
- Browser/Phone Fingerprinting:
EFF Panopticlick, AmIUnique
Peter Eckersley (2010) How Unique Is Your Web Browser?, PETS.
Arvind Narayanan (Feb 18, 2010) Cookies, Supercookies and Ubercookies: Stealing the Identity of Web Visitors.
Steven Englehardt, Arvind Narayanan (2016) Online Tracking: A 1-million-site Measurement and Analysis, SIGSAC.
FiveThirtyEight (Sept 2, 2016) Internet Tracking Has Moved Beyond Cookies.
The Verge (Apr 21, 2017) Uber tried to fool Apple and got caught.
NY Times (June 23, 2017) Google Will No Longer Scan Gmail for Ad Targeting.
NY Times Opinion (Sept 18, 2019) This Article Is Spying on You.
Washington Post (Oct 31, 2019) Think you’re anonymous online? A third of popular websites are ‘fingerprinting’ you.
- Personalized Paywalls:
Shan Wang (Feb 22, 2018) After years of testing, The Wall Street Journal has built a paywall that bends to the individual reader, Harvard Nieman Lab.
David Skok (Dec 2016) What lies beyond paywalls?, Harvard Nieman Lab.
- Personalized design:
Marketing Land (Oct 28, 2016) Facebook’s racial targeting isn’t new, bad or always illegal despite renewed attention.
NY Times (Oct 23, 2018) Some Viewers Think Netflix Is Targeting Them by Race. Here’s What to Know.
- “Dynamic Pricing” (personalized prices):
CBC Marketplace (Nov 24, 2017) How companies use personal data to charge different people different prices for the same product.
Mark Perry (Sept 18, 2017) Electrical workers in FL are paid 2-3X the normal hourly wage and regarded as heroes. But aren’t they price gougers?
Motherboard (May 19, 2017) Uber Is Using AI to Charge People as Much as Possible for a Ride.
Forbes (Apr 14, 2014) Different Customers, Different Prices, Thanks To Big Data.
Time Magazine (June 26, 2012) Orbitz Shows Higher Prices to Mac Users.
Wall Street Journal (Dec 24, 2012) Websites Vary Prices, Deals Based on Users’ Information.
Mikians et al. (2012) Detecting price and search discrimination on the Internet, Hotnets.
- Ad Personalization, Custom Audiences:
Aleksandra Korolova (2011) Privacy Violations Using Microtargeted Ads: A Case Study, Journal of Privacy and Confidentiality.
Giridhari Venkatadri, Athanasios Andreou, Yabing Liu, Alan Mislove, Krishna P. Gummadi, Patrick Loiseau, Oana Goga (2018) Privacy Risks with Facebook’s PII-Based Targeting: Auditing a Data Broker’s Advertising Interface, IEEE Security and Privacy.
Irfan Faizullabhoy, Aleksandra Korolova (2018) Facebook’s Advertising Platform: New Attack Vectors and the Need for Interventions, ConPro.
Muhammad Ali, Piotr Sapiezynski, Miranda Bogen, Aleksandra Korolova, Alan Mislove, Aaron Rieke (2019) Discrimination through optimization: How Facebook’s ad delivery can lead to skewed outcomes, CSCW.
Piotr Sapiezynski, Avijit Ghosh, Levi Kaplan, Alan Mislove, Aaron Rieke (2019) Algorithms that “Don’t See Color”: Comparing Biases in Lookalike and Special Ad Audiences, Working paper.
- Ad Transparency:
Tami Kim, Kate Barasz, Leslie John (2018) Why Am I Seeing This Ad? The Effect of Ad Transparency on Ad Effectiveness, Journal of Consumer Research.
Discussion paper: Kosinski et al. (2013)
- Friends and likes:
Elena Zheleva, Lise Getoor (2009) To Join or Not to Join: The Illusion of Privacy in Social Networks with Mixed Public and Private User Profiles, WWW.
Michal Kosinski, David Stillwell, Thore Graepel (2013) Private traits and attributes are predictable from digital records of human behavior, PNAS.
Jonathan Mayer, Patrick Mutchler, and John C. Mitchell (2016) Evaluating the privacy properties of telephone metadata, PNAS.
Kristen M. Altenburger, Johan Ugander (2018) Monophily in social networks introduces similarity among friends-of-friends, Nature Human Behaviour.
- Interactions with location data, metadata, browsing data:
Nathan Eagle, Alex Pentland, David Lazer (2009) Inferring friendship network structure by using mobile phone data, PNAS.
David Crandall, Lars Backstrom, Dan Cosley, Siddharth Suri, Daniel Huttenlocher, Jon Kleinberg (2010) Inferring social ties from geographic coincidences, PNAS.
Jessica Su, Ansh Shukla, Sharad Goel, Arvind Narayanan (2017) De-anonymizing Web Browsing Data with Social Networks, WWW.
- Social network APIs:
Iraklis Symeonidis, Filipe Beato, Pagona Tsormpatzoudi, Bart Preneel (2015) Collateral damage of Facebook Apps: an enhanced privacy scoring model, manuscript.
Wall Street Journal (Oct 25, 2010) A Web Pioneer Profiles Users by Name. (About RapLeaf)
The Verge (Feb 14, 2012) iOS apps and the address book: who has your data, and how they’re getting it. (original blog post with technical details)
Jonathan Albright (Mar 20, 2018) The Graph API: Key Points in the Facebook and Cambridge Analytica Debacle, Medium.
FTC’s 2011 privacy complaint against Facebook (note paragraph 37), settlement, and technical context.
- Health Insurance Portability and Accountability Act (HIPAA), 1996.
NY Times (June 26, 2019) Google and the University of Chicago Are Sued Over Data Sharing.
- EU General Data Protection Regulation (GDPR), effective 5/25/18.
Example of GDPR reporting obligation: Recode (Dec 4, 2018) Another Facebook bug may have exposed millions of users’ private photos to app developers.
- Vermont Data Broker regulation (Act 171 of 2018)
Washington Post (June 24, 2019) Data brokers are selling your secrets. How states are trying to stop them.
- California Consumer Privacy Act (CCPA), effective 1/1/2020.
NY Times (Nov 4, 2019) I Got Access to My Secret Consumer Score. Now You Can Get Yours, Too.
- Bryce Goodman & Seth Flaxman (2016) European Union regulations on algorithmic decision-making and a “right to explanation”, 2016 ICML Workshop on Human Interpretability in Machine Learning.
- Rachel Cummings, Deven Desai (2018) The Role of Differential Privacy in GDPR Compliance, FATREC.
- Jian Jia, Ginger Zhe Jin, Liad Wagman (2018) The Short-Run Effects of GDPR on Technology Venture Investment, NBER.
- Midas Nouwens, Ilaria Liccardi, Michael Veale, David Karger, Lalana Kagal (2020) Dark Patterns after the GDPR: Scraping Consent Pop-ups and Demonstrating their Influence, CHI.
- Effects on Online Advertising:
Avi Goldfarb, Catherine Tucker (2011) Privacy Regulation and Online Advertising, Management Science.
Google (August 27, 2019) Effect of disabling third-party cookies on publisher revenue, from this Google Ads blog post.
Garrett Johnson, Scott Shriver, Shaoyin Du (2019) Consumer Privacy Choice in Online Advertising: Who Opts Out and at What Cost to Industry?, Marketing Science.
Garrett Johnson & Scott Shriver (2019) Privacy & Market Concentration: Intended & Unintended Consequences of the GDPR, working paper.
- Presentations by students.
In the event that a student is found to have violated the honor code (including through Early Resolution), the penalty may include a full denial of credit for the course. See the Student Conduct Penalty Code, Section J.
Students with Documented Disabilities
Students who may need an academic accommodation based on the impact of a disability must initiate the request with the Office of Accessible Education (OAE). Professional staff will evaluate the request with required documentation, recommend reasonable accommodations, and prepare an Accommodation Letter for faculty dated in the current quarter in which the request is made. Students should contact the OAE as soon as possible since timely notice is needed to coordinate accommodations. For more information: http://studentaffairs.stanford.edu/oae