1. Introduction, Threat Models

Name: 1. Introduction, Threat Models
Uploaded: 2015-07-14T16:21:11.000Z
Duration: 2 h 34 min 13 s
Description: MIT 6.858 Computer Systems Security, Fall 2014 View the complete course: http://ocw.mit.edu/6-858F14 Instructor: Nickolai Zeldovich In this lecture, Professor Zeldovich gives a brief overview of the class, summarizing class organization and the concept of threat models. License: Creative Commons BY-NC-SA More information at http://ocw.mit.edu/terms More courses at http://ocw.mit.edu

Introduction

The professor introduces the course and explains what the students can expect to learn.

Course Overview

The course is about building secure systems, understanding why computer systems are sometimes insecure, and how to make them better.

Each lecture will focus on a research paper that students should read ahead of time.

There are lab assignments that cover different security problems using different programming languages.

Students can ask questions anytime during lectures or office hours.

Co-Lecturer and TAs

James Mickens is the co-lecturer from Microsoft Research who will lecture on web security later in the semester.

There are four TAs: Stephen, Webb, [INAUDIBLE], and James.

Textbook

There isn't a great textbook for this topic, so each lecture will be focused around a research paper assigned on the website.

Lecture Structure

The professor explains how each lecture will be structured around a research paper.

Lecture Format

Each lecture (other than this one) will be focused around some research paper that students should read ahead of time.

Students should answer questions about the paper in the submission system by 10:00 PM on the day before the lecture.

During lectures, the class discusses the system the paper describes, the problem it solves, and when it works or doesn't work.

Class Organization

The professor discusses class organization and encourages students to ask questions if something doesn't seem right.

Topics and Papers

There's a preliminary schedule on the website, but the instructors are flexible if there are other topics or papers students want to hear more about.

If you have a question or spot a mistake during a lecture, just interrupt and ask. Security is all about details, so getting everything right is important.

Lab Assignments

There will be a series of lab assignments that cover different security problems using different programming languages.

The first lab assignment is already posted on the website and involves finding ways to exploit buffer overflow vulnerabilities in a web server.

Every lab uses a different language, so it's important to start early and learn all these languages if you haven't seen them before.

Conclusion

The professor concludes by encouraging students to start early on the lab assignments and offering help with understanding binary programs.

Learning Different Languages

Learning all these languages may be useful for your moral character or something like that, but it will take some preparation especially if you haven't seen these languages before.

It might be helpful to start early, especially since lab one relies on a lot of subtle details of C and assembly code that other classes here don't teach in as much detail.

TAs will hold office hours next week where they'll do some sort of tutorial session to help students get started with understanding what a binary program looks like, how to disassemble it, how to figure out what's on the stack, etc.

Introduction to Security

The instructor introduces the course and explains the rules for accessing MIT's network when doing security research. He also provides an overview of what security is and how it relates to achieving goals in the presence of adversaries.

What is Security?

Security involves achieving a goal in the presence of an adversary who wants to prevent you from succeeding.

A secure system can still function regardless of what the bad guy is trying to do.

Security policies define what a system should be able to do, such as maintaining confidentiality, integrity, or availability.

Threat models are assumptions about what the bad guy will do and are necessary for designing secure systems.

Mechanisms are software or hardware that enforce security policies under a given threat model.

Rules for Accessing MIT's Network

Not everything that is technically possible in this class is legal.

Students should consult with instructors or TAs if they are unsure about whether something is allowed on MIT's network.

Conclusion

The instructor concludes by summarizing the key points covered in this lecture and emphasizing the importance of being conservative when choosing a threat model.

Key Points

It's important to err on the side of caution when selecting a threat model because bad guys can always surprise you.

Mechanisms enforce security policies under a given threat model, allowing us to achieve our goals even in the presence of adversaries.

Introduction to Computer Security

In this section, the speaker introduces the concept of computer security and explains why it is a difficult problem to solve.

The Challenge of Computer Security

Computer systems are almost always compromised in some way or another.

Security tends to be a difficult problem because we have to make sure our security policy is followed regardless of what the attacker can do.

Designing a secure system requires us to consider all possible ways an attacker could try to access our data.

This process is iterative, and we must constantly update our threat model and mechanisms as new vulnerabilities are discovered.

Pushing Systems to Their Limits

In this section, the speaker discusses how pushing systems to their limits can help identify weaknesses and improve security.

Finding Weaknesses in Systems

Throughout this class, we will push different systems to their limits and see where they break down.

Every system has a breaking point, but that doesn't mean it's worthless. We just need to know when each system design is appropriate.

By identifying weaknesses in our systems, we can better understand when certain ideas work and when they are not applicable.

The Power of Security Mechanisms

In this section, the speaker explains how security mechanisms enable cool things that were not possible before.

Enabling New Capabilities with Security Mechanisms

Security mechanisms allow us to protect against certain classes of attacks and enable new capabilities that were not possible before.

For example, Native Client from Google allows us to run arbitrary x86 native code in the web browser securely, which was not possible before.

Introduction to Security Mechanisms

In this section, the speaker introduces how good security mechanisms can enable constructing new systems that were not possible before. The speaker also highlights that in the rest of the lecture, he will go through different examples of how security goes wrong.

Examples of How Security Goes Wrong

Security goes wrong when people get the policy wrong, the threat model wrong, or the mechanism wrong.

Account recovery questions can weaken a system's security policy if not implemented correctly.

Sarah Palin's email account was hacked because the answers to her recovery questions were easy to find on Wikipedia.

Mat Honan's Gmail account was hacked because several systems interacted with each other, each allowing password resets based on personal information such as a billing address and digits of a credit card number.

Amazon Security Vulnerabilities

This section discusses how a bad actor was able to compromise an Amazon account and the vulnerabilities that allowed this to happen.

Amazon's Management System

Amazon's account management system allowed purchases with a new credit card without signing in, which made it easy for a bad actor to attach a card to someone else's account.

Having added a new credit card to someone else's account, the bad actor could then reset that account's password by providing one of the account's credit card numbers, namely the one just added.

Even after breaking into someone's Amazon account, a bad actor cannot see the full saved credit card numbers, but can see the last four digits of each.

Threat Models

Threat models should not make strong assumptions about human behavior since people often pick weak passwords and click on random links.

Threat models may need to change over time as technology advances. For example, 56-bit keys were once considered secure but can now be brute-forced.

Kerberos Key Size

This section discusses how the key size for Kerberos changed over time due to advancements in technology.

In the mid-'80s, 56-bit keys were considered secure for Kerberos.

As technology advanced and Kerberos became more popular, larger key sizes became necessary.
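To get a feel for why a 56-bit key stopped being adequate, a rough back-of-the-envelope calculation helps. The sketch below assumes an attacker who simply tries every key; the guess rates are hypothetical round numbers chosen for illustration, not measurements of any real hardware.

```python
# Back-of-the-envelope brute-force cost for a k-bit key.
# The guess rates used below are hypothetical round numbers, not benchmarks.

SECONDS_PER_DAY = 24 * 60 * 60


def days_to_search(key_bits: int, guesses_per_second: float) -> float:
    """Worst-case number of days needed to try every key of the given size."""
    keyspace = 2 ** key_bits
    return keyspace / guesses_per_second / SECONDS_PER_DAY


def rate_for_one_day(key_bits: int) -> float:
    """Guesses per second needed to exhaust the whole keyspace in one day."""
    return 2 ** key_bits / SECONDS_PER_DAY


if __name__ == "__main__":
    for rate in (1e6, 1e9, 1e12):  # assumed attacker speeds
        days = days_to_search(56, rate)
        print(f"56-bit key at {rate:.0e} guesses/s: {days:,.1f} days")
    needed = rate_for_one_day(56)
    print(f"Rate needed to search 2**56 keys in one day: {needed:.2e} guesses/s")
    # Every additional key bit doubles the attacker's work.
    print(f"128-bit key at 1e12 guesses/s: {days_to_search(128, 1e12):.2e} days")
```

The point of the sketch is the scaling: the work doubles with every added key bit, so a key size that was out of reach for the hardware of one decade can fall within reach of the next.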

Threat Models and Assumptions

This section discusses the importance of keeping up with the times when it comes to threat models and assumptions. It highlights examples of bad threat models, such as assuming all certificate authorities are trustworthy, and emphasizes the need to balance your threat model against who you think is out to get you.

Outdated Assumptions

In the mid-1980s, it was assumed that the 56-bit keys protecting Kerberos accounts were secure. However, with modern hardware and web services, anyone's Kerberos account key can be obtained in roughly a day.

The assumption that hardware is trustworthy is no longer valid today due to revelations about what the NSA is capable of doing. They have hardware backdoors that they can insert into computers.

Balancing Your Threat Model

When protecting yourself from government attacks, assume your laptop might be compromised physically regardless of what you install in it.

Protecting yourself from the NSA may be an expensive proposition; however, if you're only protecting yourself from random students snooping around in your Athena home directory or similar activities, you may not have to worry about this stuff as much.

It's essential to balance your threat model against who you think is out to get you.

Bad Threat Models

The SSL protocol used by secure websites assumes that all certificate authorities (CAs) are trustworthy. However, hundreds of CAs exist globally, and any of them can issue a certificate for any hostname or domain name. As a result, compromising a single CA allows bad actors to impersonate any website.

DARPA wanted secure operating systems in the 1980s and had universities build prototypes. A red team playing the role of bad actors broke into the systems in surprising ways, such as compromising the server that held all the source code for an operating system.
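The certificate authority assumption described above can be modeled in a few lines. This is a deliberately simplified sketch, not how real certificate validation works: signatures, certificate chains, and expiry are all omitted, and the CA names, the `Certificate` class, and the `browser_accepts` function are hypothetical. It only captures the trust rule that any trusted CA may vouch for any hostname.

```python
from dataclasses import dataclass

# Hypothetical CA names; real browsers ship with hundreds of trusted CAs.
TRUSTED_CAS = {"CA-One", "CA-Two", "CA-Three"}


@dataclass(frozen=True)
class Certificate:
    hostname: str  # name the certificate claims to speak for
    issuer: str    # CA that issued it


def browser_accepts(cert: Certificate, requested_host: str) -> bool:
    """Accept if the name matches and ANY trusted CA issued the certificate."""
    return cert.hostname == requested_host and cert.issuer in TRUSTED_CAS


# The site's real certificate, issued by the CA it actually uses.
legit = Certificate(hostname="bank.example", issuer="CA-One")
# An attacker who controls an unrelated trusted CA can issue this one.
forged = Certificate(hostname="bank.example", issuer="CA-Three")
# A certificate from a CA outside the trusted set is rejected.
untrusted = Certificate(hostname="bank.example", issuer="Unknown-CA")

for cert in (legit, forged, untrusted):
    print(cert.issuer, browser_accepts(cert, "bank.example"))
```

In this model the forged certificate is accepted exactly like the legitimate one, because nothing ties a hostname to one particular CA.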

Introduction to Mechanisms and Security Policies

This section introduces the concept of mechanisms and how they are used to enforce security policies. The instructor explains that mechanisms are the most complicated part of the story, and as a result, much of this class will focus heavily on mechanisms and how to make them secure.

Mechanism Bugs

Mechanisms can fail in many ways, making it challenging to create secure systems.

It is easier to make crisp statements about mechanisms than policies or threat models.

A problem was discovered in Apple's iCloud service where they did not enforce the same mechanism at all interfaces.

iCloud provides various services for the same set of accounts, including file storage, photo sharing, and the Find My iPhone interface.

In the Find My iPhone API, the developers forgot to call the function that limits login attempts to 10 tries. As a result, bad actors could guess passwords through this interface at a rate of millions of attempts per day.
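The iCloud failure can be sketched as two login interfaces that share one account database but do not both consult the attempt limiter. Everything here is a hypothetical model: the function names, the toy password store, and the limiter class are assumptions for illustration, and only the limit of 10 attempts comes from the lecture.

```python
from collections import defaultdict

MAX_ATTEMPTS = 10  # the limit mentioned in the lecture

# Toy account database (plaintext only to keep the sketch short).
PASSWORDS = {"alice": "correct horse"}


class AttemptLimiter:
    """Counts failed logins per account; meant to be shared by all interfaces."""

    def __init__(self, max_attempts: int = MAX_ATTEMPTS):
        self.max_attempts = max_attempts
        self.failures = defaultdict(int)

    def allowed(self, user: str) -> bool:
        return self.failures[user] < self.max_attempts

    def record_failure(self, user: str) -> None:
        self.failures[user] += 1


limiter = AttemptLimiter()


def check_password(user: str, password: str) -> bool:
    return PASSWORDS.get(user) == password


def login_main(user: str, password: str) -> bool:
    """Interface that remembers to consult the limiter."""
    if not limiter.allowed(user):
        return False
    ok = check_password(user, password)
    if not ok:
        limiter.record_failure(user)
    return ok


def login_find_my_phone_buggy(user: str, password: str) -> bool:
    """Interface that forgets the limiter, so guesses are unlimited."""
    return check_password(user, password)
```

After ten wrong guesses, `login_main` refuses even the correct password, while `login_find_my_phone_buggy` keeps answering every guess. One way to avoid this class of bug is to put the limiter inside the shared `check_password` path, so no interface can skip it.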

Conclusion

This example shows that having the right policy (only authenticated users get access), the right threat model (bad actors may try to guess passwords), and the right mechanism (limiting login attempts) is not enough on its own: the mechanism also has to be enforced at every interface.

Examples of Small Programming Mistakes Leading to Catastrophic Results

In this section, the speaker provides two examples of small programming mistakes that led to catastrophic results. The first example is about Citibank's website, and the second example is related to Bitcoin on Android phones.

Citibank's Website

Citibank had a website that allowed users to view their credit card information.

A guy figured out that by changing the account ID number in the URL, he could access someone else's account.

This happened because they forgot to check if the ID number was valid and belonged to the person who was logged in.

They may also have had a bad threat model, assuming that no one would type in URLs by hand.
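The missing check in the Citibank example can be sketched as follows. The account store, field names, and function names are hypothetical; the sketch only contrasts a handler that trusts the account ID from the URL with one that checks the ID against the logged-in user.

```python
# Toy account store; IDs, owners, and field names are hypothetical.
ACCOUNTS = {
    "1001": {"owner": "alice", "card": "xxxx-1111"},
    "1002": {"owner": "bob", "card": "xxxx-2222"},
}


def view_account_buggy(logged_in_user: str, account_id: str) -> dict:
    """Trusts the account ID taken from the URL; never checks who is asking."""
    return ACCOUNTS[account_id]


def view_account_checked(logged_in_user: str, account_id: str) -> dict:
    """Checks that the ID exists and belongs to the logged-in user."""
    account = ACCOUNTS.get(account_id)
    if account is None or account["owner"] != logged_in_user:
        raise PermissionError("account not found or not owned by this user")
    return account
```

With the buggy handler, a logged-in `alice` who changes the ID to `1002` receives bob's record; the checked handler raises `PermissionError` instead.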

Bitcoin on Android Phones

Bitcoin relies heavily on generating good random keys that no one else can guess.

The Android applications for Bitcoin were getting random values for these keys from a Java API called SecureRandom.

However, this library turned out to have a small bug: it forgot to initialize the PRNG with a seed, so the seed was just all zeros.

As a result, everyone could figure out what your random numbers were, which means they could generate the same private key as you and transfer your Bitcoins.
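The effect of an unseeded generator can be modeled in a few lines. This is not the Android code: it is a Python sketch in which a fixed seed of 0 stands in for "the seed was never initialized", and the function names are hypothetical.

```python
import random
import secrets

KEY_BYTES = 32


def make_key_buggy() -> bytes:
    """Models the bug: every run starts the generator from the same state."""
    prng = random.Random(0)  # fixed seed stands in for a forgotten seed
    return bytes(prng.getrandbits(8) for _ in range(KEY_BYTES))


def make_key_os_entropy() -> bytes:
    """Draws the key from the operating system's entropy source instead."""
    return secrets.token_bytes(KEY_BYTES)


if __name__ == "__main__":
    # Two "different users" end up with the same key.
    print(make_key_buggy() == make_key_buggy())
    # Keys drawn from OS entropy do not repeat in practice.
    print(make_key_os_entropy() == make_key_os_entropy())
```

Anyone who knows the generator and its starting state can reproduce the "random" key, which is exactly what lets an attacker derive the same private key as the victim.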

Bitcoin Signature Scheme

In this section, the professor explains how the signature scheme used by Bitcoin assumes that every time a new signature is generated with a key, a fresh nonce is used. If two signatures are generated with the same nonce, someone can figure out what the key is.

Signature Generation and Nonce

The particular signature scheme used by Bitcoin assumes that every time you generate a new signature with that key, you use a fresh nonce for generating that signature.

If you ever generate two signatures with the same nonce, someone can apply some clever math to the two signatures and extract your private key from them.
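The "clever math" is short enough to sketch. The code below uses textbook DSA in a tiny toy group as a stand-in for the elliptic-curve scheme Bitcoin uses; the algebra of the nonce-reuse attack is the same, but the parameters, key, nonce, and message hashes here are made-up values far too small for real use. It assumes Python 3.8 or later for modular inverses via `pow(x, -1, m)`.

```python
# Toy DSA parameters (far too small for real use): Q divides P - 1,
# and G generates the subgroup of order Q modulo P.
P, Q, G = 607, 101, 64


def sign(private_key: int, nonce: int, msg_hash: int):
    """Textbook DSA signature (r, s) with an explicitly supplied nonce."""
    r = pow(G, nonce, P) % Q
    s = pow(nonce, -1, Q) * (msg_hash + private_key * r) % Q
    return r, s


def recover_private_key(sig1, sig2, hash1: int, hash2: int) -> int:
    """Recover the key from two signatures made with the same nonce."""
    r, s1 = sig1
    _, s2 = sig2
    # s1 - s2 = nonce^-1 * (hash1 - hash2), so the nonce falls out first.
    nonce = (hash1 - hash2) * pow((s1 - s2) % Q, -1, Q) % Q
    # s1 * nonce = hash1 + key * r, so the key follows.
    return (s1 * nonce - hash1) * pow(r, -1, Q) % Q


secret = 33        # hypothetical private key
reused_nonce = 7   # the mistake: one nonce used for two messages
sig_a = sign(secret, reused_nonce, 15)
sig_b = sign(secret, reused_nonce, 42)

recovered = recover_private_key(sig_a, sig_b, 15, 42)
print(recovered == secret)
```

The attacker needs only the two signatures and the two message hashes, all of which are public; a repeated `r` value is the visible sign that a nonce was reused.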

Importance of Details in Computer Security

In this section, the professor emphasizes that almost every detail in computer security has a chance of really mattering. He also stresses being very clear about the specification of your system and all of its corner cases.

Consequences of Seemingly Inconsequential Mistakes

Even something seemingly inconsequential, like forgetting to check something or forgetting to initialize the random seed, can have pretty dramatic consequences for the overall system.

You really have to be very clear about the specification of your system: what exactly is it doing, and what are all the corner cases?

Pushing All Edge Cases

A good way to break a system, or to figure out whether your system is secure, is to push on all the edge cases. What happens if my input is just large enough? What is the biggest or smallest input? What is the strangest set of inputs I could provide to push my program into these corner cases?

Ambiguity in SSL Certificates

In this section, the professor explains how SSL certificates encode names into the certificate itself and how they use a particular encoding scheme that writes down the name of the server you're connecting to. He also highlights an example of ambiguity in SSL certificates.

Encoding Scheme for Names

SSL certificates use a particular encoding scheme that writes down the name of the server you're connecting to by first writing down the number of bytes in the string.

For example, somewhere in the SSL certificate there is a byte with the value 10, followed by 10 bytes giving the host name. The browser that receives the certificate is written in C, and C represents strings by null-terminating them.

Example of Ambiguity

Suppose that I own the domain foo.com, so I can get certificates for anything ending in .foo.com. What I could do is ask for a certificate for the name amazon.com\0x.foo.com, where \0 stands for a single zero byte.

In the certificate's encoding, that is a perfectly valid string of 20 bytes. But when a browser loads this string into memory, it copies all 20 bytes and dutifully adds a terminating zero at the end. When the rest of the browser software interprets the string at that memory location, it reads only up to the first zero byte, decides that is the end of the string, and concludes that the name is amazon.com.

This disconnect between how C software represents strings and how other software represents them leads to ambiguity in SSL certificates.
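The two readings of the same bytes can be reproduced in a short sketch. The encoding here is a simplification, assuming a single length byte followed by the name; real certificates use a more elaborate encoding, and the function names are hypothetical.

```python
def encode_name(name: bytes) -> bytes:
    """Length-prefixed encoding: one length byte, then the raw name bytes."""
    if len(name) > 255:
        raise ValueError("name too long for a one-byte length prefix")
    return bytes([len(name)]) + name


def decode_length_prefixed(blob: bytes) -> bytes:
    """What a length-aware parser sees: exactly `length` bytes."""
    length = blob[0]
    return blob[1:1 + length]


def decode_c_string(blob: bytes) -> bytes:
    """What C string functions see: bytes up to the first zero byte."""
    raw = blob[1:] + b"\x00"  # the copy adds its own terminator at the end
    return raw[:raw.index(b"\x00")]


cert_name = encode_name(b"amazon.com\x00x.foo.com")

print(decode_length_prefixed(cert_name))  # full 20-byte name under foo.com
print(decode_c_string(cert_name))         # b'amazon.com'
```

The length-aware reading ends in `.foo.com`, so issuing the certificate to the owner of foo.com looks legitimate, while the C-string reading stops at the embedded zero byte and yields `amazon.com`.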

Introduction to Mechanism Failure

In this section, the speaker introduces the concept of mechanism failure and how it can be exploited by attackers. The importance of understanding different ways of encoding is highlighted.

Mechanism Failure

Different ways of encoding can lead to disagreement between systems, which can be exploited by attackers.

Buffer overflows are a common example of mechanism failure that will be discussed in more detail.

Lab one involves exploiting vulnerabilities in a web server due to buffer overflows.

Understanding Web Servers and Threat Models

This section provides an overview of web servers and threat models. The policy and threat model for a typical web server are discussed.

Web Server Setting

A web server is a program that accepts connections from the outside world, takes requests (packets), processes them, and sends replies.

The programmer's intended behavior is probably the policy for the web server.

The attacker cannot log in remotely or have physical access but can send any packet they want.

Threat Model

Anything that can be delivered to the web server is fair game for an attacker.

The goal is for the web server not to allow arbitrary actions that violate its policy.

Common Problems with Writing Software in C

This section discusses how memory allocation mismanagement when writing software in C can lead to security problems.

Memory Allocation Mismanagement

Writing software in C can lead to memory allocation mismanagement.

A single byte can make a huge difference in the SSL certificate naming example.

Writing buggy web server software due to memory allocation mismanagement can lead to security problems.

Introduction

In this section, the speaker introduces the program that will be used as an example throughout the video. The program reads a request and stores it in a buffer before parsing it as an integer and returning the integer.

Program Overview

The program reads a request from the network or keyboard input.

It stores the input in a buffer.

The program parses the input as an integer and returns it.

The program prints whatever integer is returned.

Testing Input Limits

In this section, the speaker tests how the program handles large inputs and non-numerical inputs.

Large Inputs

When provided with large inputs, such as a number close to 2^31, the program returns that number without issue.

However, when provided with extremely large inputs consisting of many characters (such as As), the program crashes.

Non-Numerical Inputs

When provided with non-numerical inputs, such as letters or symbols, the program returns 0 without issue.
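The behavior on non-numerical input follows from how C-style integer parsing works: it consumes leading digits and returns 0 when there are none. The function below is a Python model of that behavior, not the lecture's C program; its name is hypothetical, and unlike C it has no fixed integer width, so it says nothing about what happens near 2^31.

```python
def atoi_like(text: str) -> int:
    """Parse a leading, optionally signed decimal integer; return 0 if none."""
    s = text.lstrip(" \t\n\r\v\f")
    sign = 1
    pos = 0
    if s[:1] in ("+", "-"):
        sign = -1 if s[0] == "-" else 1
        pos = 1
    value = 0
    # Consume digits until the first character that is not a digit.
    while pos < len(s) and "0" <= s[pos] <= "9":
        value = value * 10 + (ord(s[pos]) - ord("0"))
        pos += 1
    return sign * value


if __name__ == "__main__":
    for sample in ("42", "  -7", "12abc", "hello", "AAAA", ""):
        print(repr(sample), "->", atoi_like(sample))
```

Parsing stops silently at the first non-digit, so letters and symbols produce 0 rather than an error, which matches what the speaker observes.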

Debugging Under A Debugger

In this section, the speaker uses a debugger to analyze what happens when providing extremely large inputs to the program.

Setting Up Debugger

The speaker sets up a breakpoint in the redirect function of their code.

They run their code under a debugger and stop at that breakpoint.

Analyzing Stack Pointer Value

Using low-level analysis of CPU registers, the speaker determines that the stack pointer value is currently pointing to a specific memory location.

They begin drawing a diagram of the program's stack on a board.

Disassembling Code

The speaker attempts to disassemble the code of their redirect function but runs into some issues with their debugger.

Understanding the Stack

In this section, the instructor explains how the stack works and how variables are stored in memory.

The Stack

Variables are stored on the stack.

The buffer sits at the low-address end of the function's stack frame, and data written into it fills toward higher addresses.

The return address is located above the function's local variables, at a higher address.

Finding Return Address

In this section, the instructor shows how to find the return address on the stack using GDB.

Using GDB

The EBP register points to a location on the stack that contains the saved EBP, followed by the return address.

By examining the memory at EBP plus four, we can find where our program will jump to when it returns from redirect.

We can use GDB to disassemble the code at that address and see which function contains it.

Buffer Overflow

In this section, the instructor demonstrates how buffer overflow can cause a segmentation fault.

Buffer Overflow

Passing too much data to a function can overwrite other variables on the stack.

gets() keeps writing As beyond the memory that was allocated for the buffer.

This overwrites not only the buffer but also other variables such as i, and potentially even the return address.

Printing out buffer shows that there are 180 As instead of 128.

If we try to return now, our program will jump to an invalid address and crash.
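The overflow can be modeled with a byte array standing in for stack memory. This is a simulation under an assumed frame layout (a 128-byte buffer, then a 4-byte `i`, a 4-byte saved EBP, and a 4-byte return address, with nothing in between); a real compiler may lay the frame out differently, and the addresses and function names are hypothetical.

```python
import struct

BUF_SIZE = 128
# Assumed layout, from low to high address:
#   buf (128 bytes) | i (4) | saved EBP (4) | return address (4) | caller's frame
OFF_I = BUF_SIZE
OFF_SAVED_EBP = OFF_I + 4
OFF_RET = OFF_SAVED_EBP + 4
STACK_SIZE = 256  # modeled memory, including part of the caller's frame


def new_stack(return_address: int) -> bytearray:
    """Create zeroed stack memory with a return address already in place."""
    stack = bytearray(STACK_SIZE)
    struct.pack_into("<I", stack, OFF_RET, return_address)
    return stack


def unchecked_copy(stack: bytearray, data: bytes) -> None:
    """Like gets(): stores every input byte plus a terminator, no bounds check."""
    for offset, byte in enumerate(data + b"\x00"):
        stack[offset] = byte  # offset 0 is the start of buf


def read_return_address(stack: bytearray) -> int:
    return struct.unpack_from("<I", stack, OFF_RET)[0]


stack = new_stack(0x08048400)  # hypothetical address inside main
unchecked_copy(stack, b"A" * 100)
print(hex(read_return_address(stack)))  # 0x8048400, input fit in the buffer

unchecked_copy(stack, b"A" * 180)
print(hex(read_return_address(stack)))  # 0x41414141, overwritten with 'AAAA'
```

In the model, 180 bytes of input run past the 128-byte buffer and replace the return address with 0x41414141, the bytes of four As, which is not a valid place to jump.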

Introduction to Buffer Overflow

In this section, the instructor introduces buffer overflow and explains how it can be exploited.

What is Buffer Overflow?

Buffer overflow is a vulnerability that occurs when a program tries to store more data in a buffer than it was designed to hold.

When a buffer overflows, it can overwrite adjacent memory locations, including the return address on the stack.

After overflowing the buffer, there may be other code that runs before reaching the overwritten return address.

Exploiting Buffer Overflow

Even when the input is all As, a zero still gets written to a memory location, because a zero byte is what terminates strings in C.

To exploit buffer overflow, we need to get to the point where we use the value placed on the stack.

We may have to massage our input in some cases so that other code doesn't exit right away or do something silly.

Buffer overflow is exploitable because an attacker can carefully construct input values and get it to jump somewhere else.

Exploiting Buffer Overflow

In this section, the instructor demonstrates how buffer overflow can be exploited by manually changing things on the stack.

Manually Changing Things on Stack

Overflowing the stack with lots of As and then manually changing values on the stack lets us make the program jump to a location of our choosing.

In this program, there are limited interesting things that could be done after jumping.

Introduction to Buffer Overflow

In this section, the professor introduces buffer overflow and explains how it can be used to exploit a program.

How Buffer Overflow Works

The function main initializes, calls redirect, does some more stuff, and then calls printf.

Using the debugger, we can take the address of the code in main that sets up the "x = %d" argument and calls printf, and stick that address into the return address slot on the stack.

When we run the program again, it prints "x =" followed by some garbage, which is just whatever happened to be on the stack where printf expected its argument. This happens because we changed the return address, so returning from redirect now jumps to the code right at the printf call. After printing, the program crashes.

Understanding Program Crash

In this section, the professor explains why our pseudoattack didn't work and what actually happens when a program crashes.

Why Our Pseudoattack Didn't Work

The crash occurs because when main continues running after printf returns, it executes another return but doesn't have a valid return address on the stack. Presumably it picks up whatever value sits further up the stack and jumps there.

If you really wanted to be careful with your attack, you would plant not just this return address on the stack. You would also figure out where the second ret is going to get its return address from, and place a value there that makes the program exit cleanly after it has been exploited, so that no one notices.

Alternative Design for Stack Growth

In this section, the professor discusses an alternative design for stack growth and how it could potentially solve the problem of buffer overflow.

Flipping the Stack Around

Some machines actually have stacks that grow up. An alternative design we could imagine is one where the stack starts at the bottom and keeps going up instead of going down.

However, this alternative architecture still has problems: an overflow can still overrun a return address if the buffer sits below it, as happens when a buffer in the caller is passed to a function that writes into it.

Conclusion

In this section, the professor concludes by summarizing why buffer overflows are bad.

The problem with a buffer overflow is that eventually it runs over the return address.

Once the return address is overwritten, the program jumps wherever the overwritten value points, which is why these buffer overflows are dangerous.

Buffer Overflow Exploits

In this section, the professor discusses how buffer overflow exploits work and how they can be used to take control of a program's execution.

How Buffer Overflow Exploits Work

When a function call is made, the return address is saved on the stack. The called function then saves the caller's EBP on the stack and places its own local variables next to it.

If an attacker can overflow a buffer with their input, they can change the return address and jump to wherever they want in memory.

Attackers can also place code of their own in the input and make the return address point into the buffer, which gets the server to run arbitrary code.

Modern machines try to provide some defenses against these kinds of attacks by associating permissions with various memory regions.
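The shape of an input whose return address points back into the buffer can be sketched as a byte layout. This reuses an assumed frame layout (128-byte buffer, then a 4-byte `i`, a 4-byte saved EBP, and the return address) and a hypothetical buffer address; the "code" is a run of placeholder bytes, not a working payload, and on a machine that marks the stack non-executable the jump would fault rather than run.

```python
import struct

BUF_SIZE = 128
# Assumed layout: the return address sits after buf, i, and the saved EBP.
OFF_RET = BUF_SIZE + 4 + 4


def build_input(code: bytes, buffer_address: int) -> bytes:
    """Lay out: code at the start of buf, padding, then the new return address."""
    if len(code) > OFF_RET:
        raise ValueError("code does not fit before the return address")
    padding = b"A" * (OFF_RET - len(code))
    data = code + padding + struct.pack("<I", buffer_address)
    if b"\n" in data:
        # A line-based read such as gets() stops at the first newline.
        raise ValueError("a newline byte would end the input early")
    return data


placeholder_code = b"\x90" * 16   # stand-in bytes only
assumed_buffer_address = 0xBFFFF510  # hypothetical; found with a debugger in practice

attack_input = build_input(placeholder_code, assumed_buffer_address)
print(len(attack_input))  # 140
```

The last four bytes are the little-endian address of the buffer itself, so when the function returns, execution would continue at the start of the attacker-supplied bytes.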

Bypassing Non-executable Stacks

Even if a stack is non-executable, attackers can still jump in the middle of main or other pieces of code in a program that are doing interesting stuff.

Mechanism Problems and Defense Techniques

In this section, the professor discusses how to fix source code mistakes and make it difficult for hackers to exploit bugs. He also introduces defense techniques that will be covered in the next two lectures.

Fixing Source Code Mistakes

The best way to fix source code mistakes is by changing the source code and avoiding making calls that can cause problems.

Making it Difficult for Hackers

Many people try to devise techniques that make it more difficult for hackers to exploit bugs.

One example of these techniques is making the stack non-executable, which forces attackers to do something slightly more elaborate than injecting shellcode onto the stack.

Defense Techniques

The next two lectures will cover defense techniques that make it much more difficult for hackers to exploit things, although they are not all perfect.

Final Exam and Quizzes

There are two quizzes but no final exam during the final week of class. Instead, there is a quiz right before it.

Minimizing Trusted Computing Base

A better design is one where you structure your whole system so that security doesn't depend on all pieces of software enforcing your security policy. Instead, a small number of components enforce your security policy while other components don't matter for security purposes if they're right or wrong. This technique minimizes your trusted computing base and helps get around mechanism bugs and problems discussed in this lecture.