Dynamic Taint Analysis

Nicolas B. Pierron [:nbp]
JS Team Meetup 2014

All about origin & flow

          var pwd = $('#password').val();
           
          xhr("/msg/" + pwd);

All about origin & flow

          var pwd = $('#password').val();
          var hash = hasher(challenge, pwd);
          xhr("/msg/?hash=" + hash);

Dynamic Analysis

          var url;
          var pwd = $('#password').val();
          if (predicate(pwd))
            url = "/msg/" + pwd;
          else
            url = "/msg/?hash=" + hasher(challenge, pwd);
          xhr(url);
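
The snippet above only leaks the password on one branch, and which branch runs is known only at run time. A minimal sketch of tracking taint along the executed path follows. Every helper here (`taint`, `clean`, `concat`, `run`, and the stand-ins for `hasher` and `xhr`) is hypothetical, and treating the hash output as untainted is a policy choice made for this sketch only.

```javascript
// Hypothetical sketch: none of these helpers are real engine APIs.
// A value is modelled as a wrapper object carrying a taint bit.
function taint(str) { return { value: str, tainted: true }; }
function clean(str) { return { value: str, tainted: false }; }
function isTainted(v) { return v.tainted; }

// Concatenation propagates taint from either operand.
function concat(a, b) {
  return { value: a.value + b.value, tainted: a.tainted || b.tainted };
}

// Stand-in for a one-way hash; its result is treated as untainted (assumption).
function hasher(challenge, pwd) {
  let h = 0;
  for (const ch of challenge + pwd.value)
    h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return clean(h.toString(16));
}

// Stand-in sink: records URLs that carry tainted data.
const leaks = [];
function xhr(url) {
  if (isTainted(url))
    leaks.push(url.value);
}

function run(predicate, password) {
  const pwd = taint(password);
  let url;
  if (predicate(pwd))
    url = concat(clean("/msg/"), pwd);
  else
    url = concat(clean("/msg/?hash="), hasher("challenge", pwd));
  xhr(url);
}

run(() => true, "hunter2");   // leaking branch is taken
run(() => false, "hunter2");  // hashed branch is taken
```

Only the first call should end up in `leaks`, which is the distinction a static look at the code cannot make without knowing `predicate`.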

2 Proposals

DOMinator
no-name-yet?

DOMinator

Ivan Alagenchev, (Mark Goodwin)

Taint String only:

Use lengthAndFlags field to taint Strings.
Instrument string manipulations.
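
As a rough illustration of the idea (not DOMinator's actual code), a taint bit can live next to the length in one packed field, and each instrumented string operation copies it to its result. The bit layout and the helper names (`makeString`, `length`, `isTainted`, `concat`) are invented for this sketch.

```javascript
// Hypothetical model of a string header that packs length and flag bits
// into one integer. The layout is invented for illustration.
const FLAG_BITS = 4;
const TAINT_FLAG = 1 << 0;

function makeString(chars, tainted) {
  const flags = tainted ? TAINT_FLAG : 0;
  return { chars, lengthAndFlags: (chars.length << FLAG_BITS) | flags };
}

function length(s) { return s.lengthAndFlags >>> FLAG_BITS; }
function isTainted(s) { return (s.lengthAndFlags & TAINT_FLAG) !== 0; }

// An instrumented string operation: the result is tainted if any input is.
function concat(a, b) {
  return makeString(a.chars + b.chars, isTainted(a) || isTainted(b));
}

const pwd = makeString("hunter2", true);
const url = concat(makeString("/msg/", false), pwd);
```

Every string operation that builds a new string needs the same treatment, which is where the "invasive instrumentation" concern comes from.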

DOMinator

Pros:

Little performance impact when disabled (?)

Cons:

Correctness issues. (JSON.parse)
Invasive instrumentation.

no-name-yet

(a research team somewhere on Earth)

Taint all Values:

Box all doubles.
Use the ValueTag and object flags for tainting.
Instrument all operations.

no-name-yet

Pros:

??

Cons:

Terrible performance overall.
Invasive instrumentation.

Invasive

One implementation, spread across the code base.
Maintained by all JS developers.
We are not security engineers.

Can we do better?

Other: Jalangi

Koushik Sen, seen on Air Mozilla, as an add-on

Dynamic analysis framework:

Rewrite JavaScript code.
Instrument with function calls.
Not restricted to Taint analysis.

Jalangi Hooks

Dynamic analysis? framework:

Rewrite the code and emulate the operators

        var y = …;
        function f(x) {
          return x + y;
        }

        var y = …;
        function f(x) {
          return Binary('+', x, y);
        }
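
The rewrite step above can be sketched on a toy AST. This is not Jalangi's implementation: the node shapes (`Ident`, `Literal`, `Binary`, `Call`) and the `rewrite` and `print` helpers are assumptions made for illustration.

```javascript
// Toy source-to-source rewrite: every binary expression becomes a call
// to a hook named Binary(op, left, right). Node shapes are hypothetical.
function rewrite(node) {
  switch (node.type) {
    case "Binary":
      return {
        type: "Call",
        callee: "Binary",
        args: [
          { type: "Literal", value: node.op },
          rewrite(node.left),
          rewrite(node.right),
        ],
      };
    case "Call":
      return { ...node, args: node.args.map(rewrite) };
    default:
      return node; // identifiers and literals are left untouched
  }
}

// Minimal printer, just enough to show the result of the rewrite.
function print(node) {
  switch (node.type) {
    case "Ident": return node.name;
    case "Literal": return JSON.stringify(node.value);
    case "Binary": return `${print(node.left)} ${node.op} ${print(node.right)}`;
    case "Call": return `${node.callee}(${node.args.map(print).join(", ")})`;
    default: throw new Error("unknown node type: " + node.type);
  }
}

const expr = {
  type: "Binary",
  op: "+",
  left: { type: "Ident", name: "x" },
  right: { type: "Ident", name: "y" },
};
const before = print(expr);
const after = print(rewrite(expr));
```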

Jalangi Boxing

Dynamic analysis? framework:

Emulate the code and box & unbox values

        function Binary(op, x, y) {
          if (op == "+")
            return box(unbox(x) + unbox(y),
                       tainted(x) | tainted(y));
          …
        }
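
A runnable version of the slide's `Binary` hook, under stated assumptions: the `Box` class and the bodies of `box`, `unbox`, and `tainted` are hypothetical, and only `+` and `-` are emulated here.

```javascript
// Hypothetical boxing helpers; the real representation may differ.
class Box {
  constructor(value, taint) {
    this.value = value;
    this.taint = taint; // 0 = clean, 1 = tainted
  }
}

function box(value, taint) { return new Box(value, taint); }
function unbox(v) { return v instanceof Box ? v.value : v; }
function tainted(v) { return v instanceof Box ? v.taint : 0; }

// Emulates a binary operator on unboxed values and merges the taint bits.
function Binary(op, x, y) {
  const taint = tainted(x) | tainted(y);
  switch (op) {
    case "+": return box(unbox(x) + unbox(y), taint);
    case "-": return box(unbox(x) - unbox(y), taint);
    default: throw new Error("unsupported operator: " + op);
  }
}

const y = box(1, 1); // a tainted global
function f(x) {
  return Binary("+", x, y);
}

const r1 = f(41);               // plain operand plus tainted operand
const r2 = Binary("+", 1, 2);   // two plain operands
```

Each operator has to be re-implemented this way, so any divergence from the language semantics becomes a correctness bug in the analysis.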

Jalangi

Pros:

Externalize analysis logic.

Cons:

Correctness issues. (must emulate JavaScript semantics)
Additional Parser.
Bad performance during analysis.

Can We Do Better?

Proposal: Instrumentation

        function f(x) {
          return x + y;
        }

Bytecode emitted (overview), as if the source were:

        function f(x) {
          let _x = x, _y = y;
          let _ctx = %probe.getContext();
          %probe.Plus(_x, _y, _ctx);
          let _r = _x + _y;
          %probe.PlusResult(_r, _ctx);
          return _r;
        }

Proposal: Compartment

        function Plus(x, y, ctx) {
          ctx.taint = tainted(x) | tainted(y);
        }
        function PlusResult(res, ctx) {
          return setTaint(res, ctx.taint);
        }

Proxies → Shadow object by default, no more plain values.
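
The probes above can be emulated in plain JavaScript to see how the pieces fit. This is a sketch under assumptions: `%probe` is not valid JavaScript, so an ordinary `probe` object stands in for it; the `Shadow` class is a hypothetical shadow object; and, unlike the bytecode overview, the instrumented `f` uses the value returned by `PlusResult`, because a primitive cannot be tagged in place.

```javascript
// Hypothetical shadow object wrapping a value and its taint bit.
class Shadow {
  constructor(value, taint) {
    this.value = value;
    this.taint = taint;
  }
  // Lets the engine's own `+` unwrap the shadow, so the real operator runs.
  valueOf() { return this.value; }
}

function tainted(v) { return v instanceof Shadow ? v.taint : 0; }
function setTaint(v, taint) {
  return new Shadow(v instanceof Shadow ? v.value : v, taint);
}

// Stand-in for the %probe intrinsics shown on the slides.
const probe = {
  getContext() { return { taint: 0 }; },
  Plus(x, y, ctx) { ctx.taint = tainted(x) | tainted(y); },
  PlusResult(res, ctx) { return setTaint(res, ctx.taint); },
};

const y = new Shadow(1, 1); // a tainted global

// Hand-written equivalent of the instrumented function.
function f(x) {
  const _x = x, _y = y;
  const _ctx = probe.getContext();
  probe.Plus(_x, _y, _ctx);
  const _r = _x + _y;                  // the engine performs the addition
  return probe.PlusResult(_r, _ctx);   // the probe only attaches the taint
}

const r = f(41);
```

The difference from the emulation approach is that the addition itself is executed by the engine, and the probes only observe operands and result.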

Proposal: Implementation

        typedef bool (*EmitFuncProto)(ExclusiveContext *cx,
                                       BytecodeEmitter *bce,
                                       ParseNode *pn);

        static bool
        EmitFuncWithProbes(ExclusiveContext *cx,
                           BytecodeEmitter *bce,
                           ParseNode *pn)
        {
          … // Wrap and delegate.

Proposal: Implementation

        class BytecodeEmitter
        {
            EmitFuncProto EmitFunc_;

        // Initialization of the BytecodeEmitter
        if (hasFuncProbes()) {
          EmitFunc_ = EmitFuncWithProbes;

→ Cons: performance impact (?) when not used.

Proposal

Pros:

Incremental support of operations.
Externalize analysis logic.
Potentially run offline.
Boxed by default.

Cons:

?

External analysis example

Pros:

Record & replay.
Tracking origin of null / undefined.
Type-checking of Plus operators.
Boxed by default.

Cons:

?
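
As one illustration of an external analysis, here is a sketch of type-checking Plus operators using the same probe shape as the taint example. The `site` argument, the `warnings` array, and the `plus` wrapper are hypothetical additions for this sketch and are not part of the proposal.

```javascript
// Hypothetical analysis: report additions whose operands differ in type.
const warnings = [];

const probe = {
  // `site` is an invented label for the source location of the operation.
  getContext(site) { return { site }; },
  Plus(x, y, ctx) {
    const tx = typeof x, ty = typeof y;
    if (tx !== ty)
      warnings.push(`${ctx.site}: ${tx} + ${ty}`);
  },
  PlusResult(res, ctx) { /* nothing to record for this analysis */ },
};

// Hand-written stand-in for an instrumented `x + y`.
function plus(x, y, site) {
  const ctx = probe.getContext(site);
  probe.Plus(x, y, ctx);
  const r = x + y;
  probe.PlusResult(r, ctx);
  return r;
}

plus(1, 2, "a.js:1");          // same types, no warning
plus("1", 2, "a.js:2");        // string + number
plus(undefined, 1, "a.js:3");  // undefined + number
```

The analysis logic lives entirely in the probe object, so swapping in a different analysis does not touch the instrumented code.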