Dynamic Taint Analysis

Taint dump

All about origin & flow

          var pwd = $('#password').val();
           
          xhr("/msg/" + pwd);
        

All about origin & flow

          var pwd = $('#password').val();
          var hash = hasher(challenge, pwd);
          xhr("/msg/?hash=" + hash);
        

Dynamic Analysis

          var url;
          var pwd = $('#password').val();
          if (predicate(pwd))
            url = "/msg/" + pwd;
          else
            url = "/msg/?hash=" + hasher(challenge, pwd);
          xhr(url);
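
The branch above is why the analysis has to be dynamic: whether the raw password reaches the sink depends on which path actually runs. A minimal sketch of the idea follows. The names `Tainted`, `unwrap`, `concat`, `sink` and `run` are hypothetical helpers for illustration, and treating the hash as untainted is an assumed policy, not something the slides specify.

```javascript
// Hypothetical wrapper: a value plus the origin it came from.
class Tainted {
  constructor(value, origin) {
    this.value = value;
    this.origin = origin; // e.g. "#password"
  }
}

function unwrap(x) {
  return x instanceof Tainted ? x.value : x;
}

// Concatenation propagates the origin of either operand to the result.
function concat(a, b) {
  const origin = a instanceof Tainted ? a.origin
               : b instanceof Tainted ? b.origin
               : null;
  const s = String(unwrap(a)) + String(unwrap(b));
  return origin === null ? s : new Tainted(s, origin);
}

// Stand-in for xhr(): report the flow instead of sending a request.
function sink(url) {
  return url instanceof Tainted
    ? { leaked: true, origin: url.origin }
    : { leaked: false, origin: null };
}

// Toy hasher; assumed to act as a sanitizer and return a plain string.
function hasher(challenge, pwd) {
  return "h" + String(unwrap(pwd)).length;
}

// Mirrors the branching snippet: only the executed path is observed.
function run(pwd, predicate) {
  const url = predicate(pwd)
    ? concat("/msg/", pwd)
    : concat("/msg/?hash=", hasher("challenge", pwd));
  return sink(url);
}
```

Calling `run(new Tainted("secret", "#password"), () => true)` reports a leak with its origin, while the same call with `() => false` does not.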
        

2 Proposals

  1. DOMinator
  2. no-name-yet?

DOMinator

Taint String only:

  • Use lengthAndFlags field to taint Strings.
  • Instrument string manipulations.

DOMinator

Pros:

  • Little performance impact when disabled (?)

Cons:

  • Correctness issues. (JSON.parse)
  • Invasive instrumentation.

no-name-yet

Taint all Values:

  • Box all doubles.
  • Use the ValueTag and object flags for tainting.
  • Instrument all operations.

no-name-yet

Pros:

  • ??

Cons:

  • Terrible performance overall.
  • Invasive instrumentation.

Invasive

  • One implementation, spread all over the code base.
  • Maintained by all JS developers.
  • We are not security engineers.

Can we do better?

Other: Jalangi

Dynamic analysis framework:

  • Rewrite JavaScript code.
  • Instrument with function calls.
  • Not restricted to Taint analysis.

Jalangi Hooks

Dynamic analysis? framework:

Rewrite the code and emulate the operators (original first, rewritten version second):

        var y = …;
        function f(x) {
          return x + y;
        }
      
        var y = …;
        function f(x) {
          return Binary('+', x, y);
        }
      

Jalangi Boxing

Dynamic analysis? framework:

Emulate the code and box & unbox values

        function Binary(op, x, y) {
          if (op === "+")
            return box(unbox(x) + unbox(y),
                       tainted(x) | tainted(y));
          // … other operators
        }
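
To make the boxing step concrete, here is a self-contained sketch that fills in `box`, `unbox` and `tainted`. The `Box` class and the set of supported operators are assumptions for illustration; this is not Jalangi's actual representation or API.

```javascript
// Hypothetical box: a raw value plus taint bits.
class Box {
  constructor(value, taint) {
    this.value = value;
    this.taint = taint;
  }
}

function box(value, taint) {
  return new Box(value, taint);
}

function unbox(x) {
  return x instanceof Box ? x.value : x;
}

function tainted(x) {
  return x instanceof Box ? x.taint : 0;
}

// Emulates a binary operator on unboxed values, then re-boxes the
// result with the union of the operands' taint bits.
function Binary(op, x, y) {
  switch (op) {
    case "+":
      return box(unbox(x) + unbox(y), tainted(x) | tainted(y));
    case "-":
      return box(unbox(x) - unbox(y), tainted(x) | tainted(y));
    default:
      throw new Error("unsupported operator: " + op);
  }
}

// The rewritten f from the earlier slide, with y holding tainted data.
const y = box("secret", 1);
function f(x) {
  return Binary("+", x, y);
}
```

Every operator has to be re-implemented this way, which is the source of the "emulate JavaScript" correctness concern: any divergence from the engine's own semantics changes program behavior.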
      

Jalangi

Pros:

  • Externalize analysis logic.

Cons:

  • Correctness issues. (emulate JavaScript)
  • Additional Parser.
  • Bad performance during analysis.

Can We Do Better?

Proposal: Instrumentation

        function f(x) {
          return x + y;
        }
      

Emitted bytecode, roughly equivalent to:

        function f(x) {
          let _x = x, _y = y;
          let _ctx = %probe.getContext();
          %probe.Plus(_x, _y, _ctx);
          let _r = _x + _y;
          %probe.PlusResult(_r, _ctx);
          return _r;
        }
      

Proposal: Compartment

        function Plus(x, y, ctx) {
          ctx.taint = tainted(x) | tainted(y);
        }
        function PlusResult(res, ctx) {
          return setTaint(res, ctx.taint);
        }
      

Proxies → Shadow object by default, no more plain values.
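
A runnable approximation of the probe pair, assuming shadow objects are plain wrappers and taint lives in a side table. `Shadow`, `taintOf`, `raw`, the `probe` object and `plusInstrumented` are hypothetical stand-ins; the `%probe` intrinsics themselves are not available in ordinary JavaScript.

```javascript
// Hypothetical shadow object standing in for a proxied value.
class Shadow {
  constructor(value) {
    this.value = value;
  }
}

// Side table kept by the analysis compartment: shadow object -> taint bits.
const taintOf = new WeakMap();

function tainted(x) {
  return x instanceof Shadow ? (taintOf.get(x) || 0) : 0;
}

function setTaint(res, bits) {
  const shadow = res instanceof Shadow ? res : new Shadow(res);
  taintOf.set(shadow, bits);
  return shadow;
}

function raw(x) {
  return x instanceof Shadow ? x.value : x;
}

// The analysis logic, kept outside the engine.
const probe = {
  getContext() { return { taint: 0 }; },
  Plus(x, y, ctx) { ctx.taint = tainted(x) | tainted(y); },
  PlusResult(res, ctx) { return setTaint(res, ctx.taint); },
};

// Hand-written stand-in for what the instrumented `x + y` would do:
// the engine performs the addition, the probes only observe it.
function plusInstrumented(x, y) {
  const ctx = probe.getContext();
  probe.Plus(x, y, ctx);
  const result = raw(x) + raw(y);
  return probe.PlusResult(result, ctx);
}
```

Unlike the operator-emulation approach, the addition itself is still executed by the engine; the probes only record taint before and after it.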

Proposal: Implementation

        typedef bool (*EmitFuncProto)(ExclusiveContext *cx,
                                       BytecodeEmitter *bce,
                                       ParseNode *pn);
      
        static bool
        EmitFuncWithProbes(ExclusiveContext *cx,
                           BytecodeEmitter *bce,
                           ParseNode *pn)
        {
          … // Wrap and delegate.
      

Proposal: Implementation

        class BytecodeEmitter
        {
            EmitFuncProto EmitFunc_;
      
        // Initialization of the BytecodeEmitter
        if (hasFuncProbes()) {
          EmitFunc_ = EmitFuncWithProbes;
      

→ Cons: possible performance impact (?) even when probes are not used.

Proposal

Pros:

  • Incremental support of operations.
  • Externalize analysis logic.
  • Potentially run offline.
  • Boxed by default.

Cons:

  • ?

External analysis example

Pros:

  • Record & replay.
  • Tracking origin of null / undefined.
  • Type-checking of Plus operators.
  • Boxed by default.

Cons:

  • ?

When do we Start?