The Most Important Thing
How Mozilla Does Security and What You Can Steal
Johnathan Nightingale
Human Shield
Mozilla Corporation
johnath@mozilla.com
So you want to steal a security architecture...
Do you actually want to get better?
Do you care about responsiveness?
Can you let go of secrecy?
Do you actually want to get better?
Some of these things will be painful.
The only thing worse, from a security point of view, is not doing them
...but that doesn’t mean they won’t have a cost, so know how much it matters to you to be safe. For us it matters a lot.
Do you care about responsiveness?
A lot of our process is geared toward responding intelligently and quickly to exposure, to keep our users safe.
If you are a hardware manufacturer without an ability to update, parts of this will still help, but obviously rapid responsiveness is not something that keeps you up at night.
Can you let go of secrecy?
Openness brings huge benefits
Incentive to help when people can see results
Lets you massively scale your talent pool
Lets you break the correlation between employee fix rate and security
But it has obvious costs. You can mitigate those:
Scaling your talent pool means opening it *to your talent pool*.
For us, any programmer out there might have really important contributions, so we open (almost) all the way
For a missile manufacturer or an Aibo engineer, maybe the pool to which you need to open is smaller
This is not a talk telling you to open-source your product if you haven’t already
Though I think that would be keen.
Why steal from us?
We have been at it for a while...
in a phenomenally hostile environment...
with 250 million users...
we seem to be doing a lot of things right...
and you can see how we do it
10 years of public source on what was already a mature project
Our job is to take untrusted third party code and execute it locally
Top-Down Security
Response
Design
Implementation
Testing
Metrics
This is the kind of thing that Right-Thinking Security People will advocate.
“Security is not something you can just bolt on to a product in the week before ship,” goes the argument.
“Security needs to be integrated up front - start by defining your success criteria, design with security in mind, and that will flow down through implementation and testing, and then obviously you respond to incidents.”
This Diagram is Stupid
Response
Design
Implementation
Testing
Metrics
Why is the waterfall stupid for security?
- It assumes you can know at the outset where/what your mistakes will be, and design against them
- It assumes that security will trump everything else in the design stage
- It assumes that whatever is agreed upon in the design stage will persist
- It assumes you can get your success criteria right up front
- It is terribly brittle if these assumptions are violated, and is near useless for anything past v1.0
It also fails to capitalize on huge learning opportunities
- Implementing before you have tests guarantees you will break things
- Testing that only accounts for implementation assumptions, not past incident responses, is naive
- Knowing which metrics to use is very hard until you’ve gone a couple rounds through the rest
Maybe it’s not stupid, just arrogant
Good Security is a Feedback Loop
• The idea that security can be wholly top-down, with discrete one-way steps in an orderly flow from start to end is the worst kind of process management fiction
• Your security process should instead ask at every step, “How can we make sure problems like this never happen again?”
The single most important thing you can do is find ways to capture expensive knowledge so that you never pay for the same lesson twice
In the rest of this talk, I’m going to discuss the 5 phases of security planning identified in that waterfall, in reverse order, and talk about the kinds of expensive knowledge you can identify and how we try to extract maximum benefit from it.
Response
A security compromise is the most expensive knowledge of all
Response
Prepare
Triage
Deploy
Fix
Schedule
Mitigate
Post-Mortem
Who should help?
With tests! (More later)
This is not the same as shipping! (More later)
Where is it written down?
- The important part is not that you copy this verbatim (though you probably can) but that you recognize the importance of each step.
- An acceptable outcome from triage is that you do nothing.
- There is inherent risk in any fix: breaking compatibility, regressing unrelated behaviour, opportunity cost due to lost developer time. Is it worth fixing?
- A lot is being glossed over here, like the QA required on the fix.
Learning from Response
• 5 Whys
• It’s okay for post-mortems to be short; it’s not okay to skip them
• Blame-finding poisons the ability to be open
about mistakes and how to fix them
5 Whys is a system that shouldn’t be treated as overly rigorous, but it helps. Basically, when something goes wrong, ask why. When you get an answer, ask why again, and so on. Make investments in fixing each level of problem.
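The mechanics fit in a few lines. A toy sketch of recording a 5 Whys chain so each level of cause gets its own corrective action (the function, names, and sample chain are all invented for illustration, not Mozilla’s process):

```python
# Illustrative only: capture a 5 Whys chain as (answer, fix) pairs so the
# post-mortem record shows an investment at every level of the problem.

def five_whys(problem, chain):
    """chain is a list of (why_answer, fix) pairs, one per level."""
    report = [f"Problem: {problem}"]
    for depth, (answer, fix) in enumerate(chain, start=1):
        report.append(f"Why #{depth}: {answer}")
        report.append(f"  Fix: {fix}")
    return "\n".join(report)

print(five_whys(
    "Users were exploitable for 12 days",
    [
        ("The fix sat unreviewed for a week", "Add a security-review SLA"),
        ("No one owned the review queue", "Assign a rotating triage owner"),
        ("Triage roles were never written down", "Document the response process"),
    ],
))
```

The point of the structure is that a fix is recorded at every depth, not just at the surface symptom.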
Ask Questions
• Who did we have to bring in late?
• Why didn’t we notice that we broke the internet?
• How could we have dealt better with the original reporter?
• What were our bottlenecks?
Write down the answers for next time
(there’s always a next time)
- Notice that a lot of these questions are not technical; they are social/communication/organizational
- These are often the hard parts to get right
- But they are also the most important parts to get right
- The bottleneck to fixing an exploit is almost never the time needed to write the code.
Socialize What You Learn
• Mozilla’s security group is about 80 people
• Includes core security team, development leads, QA, individual developers, management
• When response changes policy, this is explicitly brought out to the larger community
Our security group is probably at least 1/3 non-employees. It is the central hub for any security issue that might require response.
When we discover in a post-mortem that someone should have been involved who wasn’t, they are added to the list for future incidents.
Testing
Testing is your best defense against forgetting, because you will forget
Data Point
We run:
• 90,000 automated tests
• using 8 different test harnesses
• on 4 platforms
• at least 20 times a day
- This slide isn’t about tooting our own horn, it’s about demonstrating that we believe in testing a lot
- Sometimes we got there one step at a time, sometimes we stole other test frameworks wholesale
You Already Know Why
• Tests protect your features from security-based changes
• Tests protect your security from feature-based changes
• Tests capture and transfer expensive knowledge
• Tests reduce Bus Factor
Bus Factor - the risk exposure you have if Critical Person X gets hit by a bus.
Now Make It Happen
• It must be easy to add new tests
• Yes, this is tricky at first
• Money can be exchanged for goods and services!
• Nothing lands without tests
• Nothing. Lands. Without. Tests.
The hardest part about testing is getting started.
You need to make the investment up front, get religion about it, and start requiring it from your developers.
- Buy someone else’s testing kit
- Hire a summer student to build out an infrastructure and write tests, then give a lunch and learn
- If you’re not empowered to hire summer students, pay a bonus for most non-trivial tests per quarter until you hit a good level of coverage
- If you’re not empowered to pay bonuses, buy a bottle of scotch out of pocket - your own peace of mind is worth it.
It’s Hard To Test
• This is terrifying
• Steal another test harness - you are probably not the first person who needed to test this
• http://wikipedia.org/wiki/List_of_unit_testing_frameworks has a few hundred
• Don’t underestimate manual testing
This is terrifying because it leads to swaths of untested code
- and because it represents a developer who is trying to avoid writing tests.
- Which is not to say that it isn’t true, just that it shouldn’t be an out
A test harness is just a simple API that you can build tests against, and includes an ability to batch, automate, and report on them
- There are tests that automate web browsing
- There are tests that will automate UI interactions
- There are tests that will compare screen contents to a known sample
- If all else fails, a checklist that a human has to sign off on each day is a start
- A human can quickly make “looks right” judgements that a computer can’t
- This will tend towards habituation though, and is resource intensive.
Fuzzing
• Benefit/Cost is high, because cost is low
• Anywhere you take input, especially structured input, is a candidate
• SPIKE, Peach, Sulley or DIY
• http://www.fuzzing.org/fuzzing-software
- The noise is free to ignore
- The signal is not something your ad-hoc testing is ever going to discover
- We fuzz DOM, JS, CSS, among others.
- SPIKE, Peach and Sulley are popular frameworks that try to make it easy
- We often build our own on the development team; it’s a specialized, but learnable, skill
- I’ll link to a couple of ours at the end of the presentation
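The DIY option really is cheap. A bare-bones sketch of the idea, assuming nothing beyond the standard library: mutate a known-good structured input at random and watch for failures you did not expect. `parse_color` is an invented stand-in target with a deliberately planted bug, not real Mozilla code.

```python
# A minimal DIY fuzzer: the "noise" (clean rejections) is ignored, and the
# "signal" (unanticipated exceptions) is collected for investigation.
import random

def parse_color(s):
    """Toy target: parse '#rrggbb'. Planted bug: indexes fixed positions
    without checking the length first."""
    if not s.startswith("#"):
        raise ValueError("bad format")  # the expected, clean rejection
    return tuple(int(s[i] + s[i + 1], 16) for i in (1, 3, 5))

def mutate(seed, rng):
    """Flip, insert, or delete one character of a known-good input."""
    s = list(seed)
    op = rng.choice(("flip", "insert", "delete"))
    i = rng.randrange(len(s))
    if op == "flip":
        s[i] = chr(rng.randrange(32, 127))
    elif op == "insert":
        s.insert(i, chr(rng.randrange(32, 127)))
    else:
        del s[i]
    return "".join(s)

def fuzz(target, seed, runs=2000, rng=None):
    """Collect inputs that fail with anything other than the expected error."""
    rng = rng or random.Random(0)
    crashes = []
    for _ in range(runs):
        case = mutate(seed, rng)
        try:
            target(case)
        except ValueError:
            pass                  # noise: rejected cleanly, free to ignore
        except Exception:
            crashes.append(case)  # signal: an unanticipated failure
    return crashes
```

Running `fuzz(parse_color, "#00ff00")` turns up shortened inputs that raise IndexError instead of being rejected cleanly - exactly the class of bug ad-hoc testing never finds.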
Penetration Testing
• Mixed Bag (YMMV)
• Scope Helps
• Can help map your attack surface, expose internal assumptions
• Never an Upcheck, only ever a Downcheck
Our experience has been primarily with 3rd party pen testing
We try to focus on a specific area, rather than just saying “go find bugs” - otherwise no pen-testing engagement is long enough
When it’s good, it exposes new classes of problem, socializes the
attacker mindset
When it’s bad, it’s just a couple of guys with a static analysis package
A pen test can’t tell you you’re safe, only that you’re unsafe.
One More Thing
Tests that don’t run are a waste of everyone’s time
Option: Automatic Gunfire
Buy a box that sits in a corner and runs tests off trunk every hour. Put a gun on it that shoots people who break tests.
Option: Manual Slog
Make check-in approval contingent on running tests, every single time.
We use option 1. The gun is a metaphor, in our case, for intense social ostracism, and near-immediate back-out of offending code, so people are incented to keep their code clean.
Option 2 is fine if you don’t have the resources for an automated test box, but it can be time intensive since it makes every developer do the work. One box running tests can generally narrow the regression to one of, say, 10 checkins, and given the test failure, the culprit is usually easy to identify.
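The core of option 1 is small enough to sketch. This is a hypothetical skeleton, not Tinderbox: poll for new revisions, run the suite on each, and name the suspect range loudly when it breaks. `get_rev` and `run_tests` are placeholders you would wire to your own VCS and harness.

```python
# A sketch of the automated test box: dependency-injected so the polling
# loop itself contains no assumptions about your VCS or test framework.

def watch(get_rev, run_tests, cycles, report=print):
    """Poll `cycles` times; report the suspect range when tests break."""
    last_good = None
    for _ in range(cycles):
        rev = get_rev()
        if rev == last_good:
            continue              # nothing new landed this cycle
        if run_tests(rev):
            last_good = rev       # suite is green at this revision
        else:
            # The "automatic gunfire": name the offending range loudly.
            report(f"tests broken between {last_good} and {rev}")
    return last_good
```

Because the report names a range of checkins rather than a single commit, it matches the "one of, say, 10 checkins" narrowing described above; a human closes the rest of the gap.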
Implementation
“We have tests” is not an excuse to keep breaking things
Where Mistakes Are Made
• Strategic-level mistakes can be made in design, but most security bugs come from mistakes not caught during implementation
• Your ability to profit from expensive knowledge is highest here, but here is where you’re probably doing the worst job
I bet a lot of the audience *does* have an incident-response plan in place
I bet some even have testing
But I bet almost no one has implementation-time measures like mandatory code review.
No-Brainers
• Static analysis tools
• assert()
• “Public” Betas
• Alphas?
• Source?
If your environment doesn’t have asserts, write ’em.
Public doesn’t always mean “everyone”
- it means it should be visible to people outside your organization who can help you find problems early
- maybe that’s other teams in your company, maybe that’s customers
- And yes, maybe it is the whole wide world. You’d be surprised what people will volunteer if they have the access
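“Write ’em” takes about five lines. A sketch of a DIY assert (the names and the build flag are my own, invented for illustration): check an invariant, fail loudly in debug builds, and carry enough context to find the bug.

```python
# A DIY assert for an environment without one. The invariant both documents
# the assumption and enforces it, but costs nothing in release builds.
import os

DEBUG = os.environ.get("RELEASE_BUILD") is None  # assumed build flag

def my_assert(condition, message):
    """Fail loudly in debug builds when an invariant is violated."""
    if DEBUG and not condition:
        raise AssertionError(f"invariant violated: {message}")

def buffer_write(buf, index, value):
    # The message carries the context a developer needs to diagnose it.
    my_assert(0 <= index < len(buf), f"index {index} outside 0..{len(buf) - 1}")
    buf[index] = value
```

The payoff is that every violated assumption announces itself during development instead of becoming a silent memory-safety bug in the field.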
Tougher
• Non-security bugs point to security bugs
• Do you have crash reporting?
• No bug happens once
• Where else are you assuming that a null pointer isn’t exploitable?
• Bad patterns - knowledge that you get to benefit from more than once.
Crash reporting is a huge source of free security research, take advantage of it
The Game Changer
• Socializes security knowledge by sharing it
• Gatekeeper against “This is little, it’ll be fine”
• P(Mistake1) * P(Mistake2) << P(Mistake1)
The most important change you can make at implementation is mandatory review.
Yes, it might slow things down, and require training, and maybe even social adjustment as managers get review from juniors.
Mandatory code review sucks. The only thing that sucks more is not having mandatory code review.
What it does, though, is distribute knowledge, so that mistakes that are “obvious” to your security person start being “obvious” to everyone, because everyone remembers version 2.0.6 where we did that thing...
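The slide’s arithmetic is worth making concrete. A back-of-envelope version, assuming (simplistically) that author and reviewer overlook flaws independently, with rates I have invented purely for illustration:

```python
# Why P(Mistake1) * P(Mistake2) << P(Mistake1): if misses are independent,
# a flaw ships only when BOTH the author and the reviewer overlook it.
p_author_misses = 0.10    # assumed rate: author overlooks a flaw
p_reviewer_misses = 0.20  # assumed rate: reviewer overlooks the same flaw

p_ships_unreviewed = p_author_misses
p_ships_reviewed = p_author_misses * p_reviewer_misses

print(p_ships_unreviewed)  # 0.1
print(p_ships_reviewed)    # 5x fewer flaws slip through
```

In practice misses are correlated (author and reviewer can share the same blind spot), so independence is the optimistic bound - which is why the knowledge-sharing effect above matters as much as the multiplication.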
Design
Every time you eliminate a threat class an angel gets its wings
Making Things Right
• Design for re-use
• Design for testability
• Change is risky, Rip & Replace more so
Make it easier to profit from expensive knowledge
Design for re-use
- It multiplies exposure to problems, but also means that each fix only has to be applied once; it multiplies the value of each expensive lesson
This doesn’t mean a free-for-all to fix everything
- fixes are risky, and Rip & Replace is riskier still.
Sometimes code with a pile of patches is robust, not flaky, because you have ironed out almost all the kinks
- New code will have new kinks
- Making the call requires the kind of socialized knowledge that comes with everything we’ve been talking about
Choose Your Focus
• Find areas that keep needing “temporary” field patches and fix them for good
• Be systematic about identifying areas of attention: threat modeling, attack trees, new research - whatever helps you choose
To be honest, we’re still experimenting with these methodologies
A browser has a fair bit of attack surface, so techniques which allow that to be visualized and managed are powerful tools.
Metrics
Measure what matters, not what’s easy to measure
Now with 12% more bits!
Don’t Know What Matters?
• Ask your users
• Ask sales
• Don’t ask your competitors, they are looking for the easy way out
The #1 Grade-A Stupidest Metric of All
• A focus on bug counting creates perverse incentives for security
• Developers hide bugs from management
• You hide bugs from customers
Bug Count
Counting bugs teaches you to bury all the expensive knowledge you should be sharing
Think Harder
• Days of exposure
• Average time to deploy fix
• Better would be avg. time until > 90% of users are using the fix
• Customer downtime
Our users want to be safe.
They don’t care how many administrative tickets (“bugs”) it took us to get there
So we measure things that have to do with our users being safe
- How many days in the year were they unsafe?
- When we get a fix out, how long before it reaches 90% saturation?
- For us, the number is about 5-6 days.
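Both of those user-centred measurements are trivial to compute once you track the dates. A sketch with invented data (the function names and the sample incidents are mine, not Mozilla’s records):

```python
# Two metrics that measure what matters: total days users were exposed,
# and days until a fix reaches a saturation threshold of the user base.
from datetime import date

def days_of_exposure(incidents):
    """Sum of (fix deployed - flaw disclosed) across all incidents."""
    return sum((deployed - disclosed).days for disclosed, deployed in incidents)

def days_to_saturation(adoption, threshold=0.90):
    """First day the updated-user fraction crosses the threshold, else None.
    adoption[d] is the fraction of users on the fix, d days after release."""
    for day, fraction in enumerate(adoption):
        if fraction >= threshold:
            return day
    return None
```

The inputs come from data you likely already have: disclosure and release dates from your bug tracker, adoption curves from update pings or download telemetry.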
Get Creative
• Number of regressions per update cycle
• Number of all nighters
• Start using similar metrics when judging your own suppliers & platforms
• Tension between metrics can be a good thing, if it pulls people towards awesome
Stupid Criticisms
• This model is totally reactive, not proactive
• This model is steady-state, not innovative
Reactive
- No, this model recognizes that security is hard, and doesn’t always fit into UML diagrams
- By all means, design with security in mind, but don’t lean on that as a complete solution
- Being proactive is great, but not at the expense of also being able to react quickly and intelligently
- This model recognizes the value of expensive information, and seeks to maximize everyone’s exposure to that value so that less time, money, and user agony is spent revisiting it later
- Any product that wants to get past v2 needs to be doing something like this, or they risk becoming a security joke
Steady-State
- Between releases, having a high bar for security and keeping it high is a noble thing, but no one said that these were the only design activities in a company
- Remember how I said that tests keep sec safe from features, and features safe from sec?
- It’s because I implicitly expect the product to have features besides security, and bugs besides security.
- Trust me, getting security right, being brave enough to break away from stupid metrics in your press and sales engagements, is plenty innovative
- And if what you’re doing resonates with customers, it’s a huge differentiator
- Because most people aren’t in this talk, most people aren’t at this conference.
- Most people don’t get it like you do.
- That’s the business case for all this pain. Security is a HUGE differentiator.
Our tools, let me show you them
Tinderbox http://www.mozilla.org/tinderbox.html
Mochitest http://developer.mozilla.org/en/docs/Mochitest
Litmus http://wiki.mozilla.org/Litmus
MXR http://mxr.mozilla.org/
Dehydra http://developer.mozilla.org/en/docs/Dehydra
Bug Policy http://www.mozilla.org/projects/security/security-bugs-policy.html
Bugzilla https://bugzilla.mozilla.org/
Fuzzers http://www.squarefree.com/2007/08/02/introducing-jsfunfuzz/
Fuzzers http://www.squarefree.com/2009/03/16/css-grammar-fuzzer/
Tinderbox - our build and test tracking system, with automated weaponry
Mochitest - The framework we use for most of our web-based testing
Litmus - Our homegrown system for human testing
MXR - The Mozilla Cross Reference - our source browser
Dehydra - Our static-analysis-and-rewrite project
Bug Policy - Our classification system
Bugzilla - Our bug tracking system
Fuzzers - Specifically jsfunfuzz, our javascript engine fuzzer (adaptable to other things, though!)
Remember This Slide
• Capture expensive knowledge everywhere, so that you don’t have to re-learn it
• Apply that knowledge everywhere
• Nothing lands without tests
• Nothing lands without code review
• Counting bugs is stupid, try harder
Credits
• Developer Kit, Sean Martell, http://developer.mozilla.org/en/docs/Promote_MDC
• Waterfall, dave.hau, http://flickr.com/photos/davehauenstein/271469348/
• Alarm, Shannon K, http://flickr.com/photos/shannonmary/96320881/
• Oops, estherase, http://flickr.com/photos/estherase/24513484/
• Card House, Bah Humbug, http://flickr.com/photos/gibbons/2294375187/
• Bulldozer, Atli Harðarson, http://flickr.com/photos/atlih/2223726160/