Jennifer Golbeck: The curly fry conundrum: Why social media "likes" say more than you might think
The Evolution of the Web and User Data
The Static to Interactive Shift
- In the early days of the web, content was primarily static, created by organizations or tech-savvy individuals.
- The rise of social media in the 2000s transformed the web into an interactive platform where average users contribute vast amounts of content.
- Facebook exemplifies this shift, boasting 1.2 billion monthly users who create online personas with minimal technical skills.
Data Collection and Behavioral Insights
- Users unknowingly share extensive personal data online, leading to unprecedented behavioral and demographic insights.
- Target's use of purchase history to predict a customer's pregnancy illustrates how companies analyze patterns for predictive modeling.
Predictive Modeling Techniques
- Target computes a "pregnancy score" based on subtle purchasing behaviors rather than obvious indicators like baby products.
- Researchers can predict various attributes (political preference, personality traits, etc.) using social media interactions and user behavior.
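To make the idea of a score built from subtle signals concrete, here is a minimal sketch of a logistic score over purchase indicators. Everything in it is an assumption for illustration: the signal names, the weights, the bias, and the function `pattern_score` are hypothetical and are not Target's actual model or data.

```python
import math

# Hypothetical weights, invented for illustration; not Target's real model.
WEIGHTS = {
    "extra_vitamins": 1.2,
    "large_handbag": 0.8,
    "cotton_balls": 1.0,
}
BIAS = -3.0  # assumed baseline: with no signals, the score stays low


def pattern_score(purchases: set[str]) -> float:
    """Return a score in (0, 1) from a logistic model over purchase signals."""
    z = BIAS + sum(w for item, w in WEIGHTS.items() if item in purchases)
    return 1.0 / (1.0 + math.exp(-z))


no_signals = pattern_score({"bread", "milk"})
all_signals = pattern_score({"extra_vitamins", "large_handbag", "cotton_balls"})
print(f"no signals: {no_signals:.3f}, all signals: {all_signals:.3f}")
```

No single item is decisive here; the score rises only as several weak signals accumulate, which is the point the talk makes about subtle purchasing patterns.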
Understanding Predictive Indicators
- A study published in the Proceedings of the National Academy of Sciences showed that even seemingly irrelevant likes (e.g., a page for curly fries) can be indicative of high intelligence.
- This phenomenon is explained through sociological theories such as homophily: people tend to associate with others similar to themselves.
Implications for Users
- Information spreads through social networks much as a disease does; because a like tends to propagate among friends who resemble one another, a page can end up correlated with an attribute that has nothing to do with its content.
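The combination of homophily and network spread can be sketched with a small deterministic toy model. The network shape, the group size, the two bridge friendships, and the hop limit below are all assumptions chosen for illustration; they are not taken from the study or the talk.

```python
from collections import deque

GROUP_SIZE = 20  # assumed toy size; two groups stand in for "has trait" / "lacks trait"


def build_network():
    """Two groups, each a ring where people befriend their 2 nearest neighbors
    on each side (homophily), joined by only two cross-group friendships."""
    friends = {n: set() for n in range(2 * GROUP_SIZE)}

    def connect(a, b):
        friends[a].add(b)
        friends[b].add(a)

    for offset in (0, GROUP_SIZE):
        for i in range(GROUP_SIZE):
            for step in (1, 2):
                connect(offset + i, offset + (i + step) % GROUP_SIZE)
    connect(0, GROUP_SIZE)        # hypothetical cross-group friendship
    connect(10, GROUP_SIZE + 10)  # hypothetical cross-group friendship
    return friends


def spread_like(friends, seed, max_hops):
    """Breadth-first spread: everyone within max_hops friendship steps of the
    seed person ends up liking the page."""
    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue
        for friend in friends[person]:
            if friend not in hops:
                hops[friend] = hops[person] + 1
                queue.append(friend)
    return set(hops)


network = build_network()
likers = spread_like(network, seed=5, max_hops=4)
in_seed_group = sum(1 for person in likers if person < GROUP_SIZE)
share_among_likers = in_seed_group / len(likers)
base_rate = 0.5  # each group is half of the population
print(f"likers: {len(likers)}, from seed group: {in_seed_group}")
```

In this toy network, 17 of the 19 people who end up liking the page belong to the seed's group, although that group is only half the population. The like carries information about group membership purely because of who is friends with whom.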
User Data Control and Its Implications
The Problem of User Data Control
- The speaker stresses that users have little control over how their data is used, which she sees as a serious problem going forward.
- As an example, she notes that she could start a business that predicts personal attributes (e.g., teamwork ability, substance use) from user data without consent, illustrating how such information could be misused.
Challenges in Policy and Law
- The speaker argues that political inertia makes it unlikely that the U.S. will enact meaningful changes to intellectual property law that would give users more control over their data.
- Social media companies' revenue models depend on exploiting user data, which complicates efforts to shift ownership back to users; users are often treated as the product rather than the customer.
Scientific Approaches to Data Privacy
- A proposed alternative path involves scientific research aimed at developing mechanisms that inform users about risks associated with sharing personal information online.
- Suggestions include allowing users to encrypt their uploaded data so that it remains inaccessible and worthless to third parties while still being viewable by selected individuals.
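The encryption suggestion can be illustrated with a toy one-time pad built only on the Python standard library. This is a sketch of the general idea, not the mechanism proposed in the talk: the function names and the sample post are hypothetical, and a real system would rely on a vetted cryptography library and a proper way to distribute keys to the chosen viewers.

```python
import secrets


def encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """One-time pad: XOR the data with a fresh random key of the same length.
    Returns (ciphertext, key). The key must be used once and kept off the platform."""
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return ciphertext, key


def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    """XOR with the same key recovers the original bytes."""
    return bytes(c ^ k for c, k in zip(ciphertext, key))


post = b"Had a great weekend at the lake"  # hypothetical upload
stored_on_platform, shared_key = encrypt(post)

# A friend who was given the key sees the post; the platform, holding
# only the ciphertext, sees random-looking bytes.
viewed_by_friend = decrypt(stored_on_platform, shared_key)
print(viewed_by_friend.decode())
```

The design point is that what the platform stores is useless for prediction without the key, while anyone the user shares the key with can still read the content.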
Ethical Considerations in Data Usage
- The speaker argues that if privacy measures hinder predictive methods developed from user data, this should be viewed as a success; the goal is not merely prediction but enhancing online interactions.