A Brief Look at JavaScript Sandboxes

Preface:

Speaking of sandboxes, our minds may reflexively jump to the picture above and become instantly interested, but unfortunately this article has nothing to do with "Minecraft" (apologies for the clickbait). Instead, it gradually introduces the sandbox of the "Browser World".

1. What is a sandbox?

In computer security, a sandbox is a security mechanism used to isolate running programs. It is usually used to execute untested or untrusted programs or code. It creates an independent execution environment for the program, so that whatever runs inside it cannot affect the programs running outside it.

For example, the following scenarios involve the abstract concept of sandbox:

The page programs we develop run in the browser. A program can only modify the parts of the interface that the browser allows it to modify, and a script cannot affect state outside the browser. In this scenario, the browser itself is a sandbox.
Each browser tab runs an independent web page, and tabs do not affect one another. Each tab is a sandbox.
......

2. What are the application scenarios of sandbox?

The above introduces some relatively macro sandbox scenarios. In fact, there are many scenarios in daily development that require the application of such a mechanism:

When executing the string returned by a JSONP request or loading an unknown third-party JS library, you may need to create a sandbox to run that code.
Vue template expressions are evaluated in a sandbox. Expressions in the template string can only access a limited set of global objects. This is mentioned in the official documentation; for details, please refer to the source code.
Online code editors, such as CodeSandbox, run scripts inside a sandbox to prevent the program from accessing or affecting the main page.
Many applications provide a plugin mechanism that lets developers write their own plug-ins to implement custom functions. Anyone who has developed plug-ins knows that they come with many restrictions: when running plug-ins, the application enforces operating rules set by the host program. The plug-in's runtime environment and rules form a sandbox. For example, the following figure shows how a Figma plugin works:

In short, whenever we encounter untrusted third-party code, we can use a sandbox to isolate it and keep the external program running stably. If untrusted code is executed without any safeguards, the most obvious harm on the front end is pollution of and tampering with the global window state, which can break the main page and even expose it to XSS attacks.

// Sub-application code
window.location.href = 'www.diaoyu.com'

Object.prototype.toString = () => {
    console.log('You are a fool :)')
}

document.querySelectorAll('div').forEach(node => node.classList.add('hhh'))

sendRequest(document.cookie)

...

3. How to implement a JS sandbox

Implementing a sandbox really means defining a program execution mechanism under which the program running inside the sandbox cannot affect the programs running outside it.

3.1 The simplest sandbox

To achieve this, the most direct idea is to make every variable the program accesses come from a reliable, self-managed context rather than from the global execution environment. To do so, we need to construct a scope for the program to be executed:

// Execution context object
const ctx = {
    func: variable => {
        console.log(variable)
    },
    foo: 'foo'
}

// The simplest sandbox
function poorestSandbox(code, ctx) {
    eval(code) // the function scope serves as the scope in which the program executes
}

// Program to be executed
const code = `
    ctx.foo = 'bar'
    ctx.func(ctx.foo)
`

poorestSandbox(code, ctx) // bar

Such a sandbox requires the source program to prefix every variable access with the execution context object, which is clearly unrealistic because we cannot control how third parties write their code. Is there a way to remove this prefix?

3.2 A very simple sandbox (With)

The with statement can remove this prefix. with adds a new scope to the front of the scope chain, and the variable object of that scope is the object passed to with. As a result, code inside the block looks up variables on this object first, before the external environment.

// Execution context object
const ctx = {
    func: variable => {
        console.log(variable)
    },
    foo: 'foo'
}

// Very poor sandbox
function veryPoorSandbox(code, ctx) {
    with(ctx) { // Add with
        eval(code)
    }
}

// Program to be executed
const code = `
    foo = 'bar'
    func(foo)
`

veryPoorSandbox(code, ctx) // bar

This achieves the effect that variables in the executing program are searched in the context provided by the sandbox before the external execution environment.

The problem is that when a variable is not found in the provided context object, the code will still search up the scope chain layer by layer. Such a sandbox still cannot control the execution of the internal code. We want the code in the sandbox to only look for variables in the manually provided context object, and to report an error or return undefined if the variable does not exist in the context object.
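The fall-through can be observed directly. The following is a minimal sketch and not part of the original sandbox: runWith is a hypothetical helper that mirrors veryPoorSandbox but is built with new Function, so that the with statement is compiled in non-strict mode even when the surrounding file is a module, and leaked is an arbitrary illustrative name.

```javascript
// Hypothetical helper: same idea as veryPoorSandbox, but created with
// new Function so the `with` statement is compiled in non-strict mode.
const runWith = new Function('ctx', 'code', 'with (ctx) { return eval(code) }')

const ctx = { foo: 'foo' }

// `foo` exists on ctx, so the write stays inside the sandbox context.
runWith(ctx, "foo = 'bar'")

// `leaked` does not exist on ctx, so the assignment walks up the scope
// chain and ends up as a property of the real global object.
runWith(ctx, "leaked = 'oops'")

// Built-in globals are reachable too, because ctx does not shadow them.
const reachedGlobal = runWith(ctx, 'typeof Math')

console.log(ctx.foo, globalThis.leaked, reachedGlobal)
```

The write to foo is contained, but the write to leaked lands on the global object, which is exactly what a sandbox is supposed to prevent.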

3.3 Not so simple sandbox (With + Proxy)

To solve this problem, we use a feature introduced in ES2015: Proxy. A Proxy wraps an object and lets us intercept and redefine the basic operations performed on it.

Inside a with block, every identifier lookup first asks the scope object whether it has that name, which triggers the has trap. The get and set traps only run afterwards, and only for names that has reports as present, so on their own they never see identifiers that are missing from the context object. Therefore, we use the has trap (handler.has) to intercept every variable access inside the with block and set up a whitelist. Whitelisted names can be resolved normally through the scope chain. For any other name, we check whether it exists in the context object maintained by the sandbox: if it does, it is accessed normally; if it does not, an error is thrown immediately.
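The order in which with consults the traps can be checked with a small experiment. This is an illustrative sketch only: scope, log, present, missing, and probe are hypothetical names, and the with block is compiled through new Function so that it runs in non-strict mode.

```javascript
// Record every trap call made while resolving identifiers inside `with`.
const log = []

const scope = new Proxy({ present: 1 }, {
  has(target, prop) {
    log.push(`has:${String(prop)}`)
    return prop in target
  },
  get(target, prop, receiver) {
    log.push(`get:${String(prop)}`)
    return Reflect.get(target, prop, receiver)
  }
})

// `present` lives on the scope object, `missing` does not.
const probe = new Function(
  'scope',
  'with (scope) { return [present, typeof missing] }'
)

const result = probe(scope)
console.log(result, log)
```

The has trap is called for both names, while the get trap is never called for missing, because the lookup has already moved on to the outer scope by then. This is why the whitelist check has to live in has.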

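Because has intercepts every variable access inside the with block, and we only want to monitor the sandboxed program itself, we also change how the code is executed. Instead of calling eval inside with (which would make the names eval and code pass through the has trap as well), we wrap the code in a with block and compile it with new Function: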

// Wrap the code to be executed in a with block and return a function that runs it
function withedYourCode(code) {
  code = 'with(globalObj) {' + code + '}'
  return new Function('globalObj', code)
}

// Whitelist of global names that may be accessed
const access_white_list = ['Math', 'Date']

// Program to be executed
const code = `
    Math.random()
    location.href = 'xxx'
    func(foo)
`

// Execution context object
const ctx = {
    func: variable => {
        console.log(variable)
    },
    foo: 'foo'
}

// Proxy of the execution context object
const ctxProxy = new Proxy(ctx, {
    has: (target, prop) => { // has intercepts access to any variable in the with block
      if (access_white_list.includes(prop)) { // Whitelisted: the lookup may continue up the scope chain
          return target.hasOwnProperty(prop)
      }

      if (!target.hasOwnProperty(prop)) {
          throw new Error(`Invalid expression - ${prop}! You can not do that!`)
      }

      return true
    }
})

// Not so poor sandbox
function littlePoorSandbox(code, ctx) {
    withedYourCode(code).call(ctx, ctx) // point this to the manually constructed global proxy object
}

littlePoorSandbox(code, ctxProxy)

// Uncaught Error: Invalid expression - location! You can not do that!

At this point, many relatively simple scenarios are covered (e.g. Vue's template strings). But what if you want to implement a web editor like CodeSandbox? In such an editor, we can use global variables such as document and location freely without affecting the main page.

This leads to another question: how to let the subroutine use all global objects without affecting the external global state?

3.4 Natural high-quality sandbox (iframe)

This question is exactly what iframe is made for. An iframe creates an independent, browser-native runtime environment that the browser isolates from the main environment. The global objects accessed by a script running in an iframe are all provided by that iframe's own execution context, so the script does not affect the main functions of the parent page. Using an iframe is therefore currently the most convenient, simple, and safe way to implement a sandbox.

However, imagine a scenario like this: a page contains multiple sandbox windows. One of them needs to share some global state with the main page (e.g. when you click the browser's back button, the sub-application should also go back one level), and another needs to share different global state with the main page (e.g. the cookie-based login state).

Although the browser provides postMessage and other methods for communication between the main page and an iframe, implementing this scenario with iframes alone is difficult and hard to maintain.
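To give a feel for the bookkeeping involved, here is a rough sketch of syncing a single piece of state over postMessage. The function names pushState and listenForState, the 'state-sync' message type, and the origins are all assumptions made for illustration; only postMessage, the message event, event.origin, and event.data are standard browser APIs.

```javascript
// Parent side: push one piece of shared state into a sandboxed iframe.
// `frameWindow` is expected to be iframe.contentWindow.
function pushState(frameWindow, key, value, targetOrigin) {
  frameWindow.postMessage({ type: 'state-sync', key, value }, targetOrigin)
}

// Sandbox side: accept state updates, but only from the trusted origin.
// `win` is expected to be the iframe's own window object.
function listenForState(win, trustedOrigin, onState) {
  win.addEventListener('message', event => {
    if (event.origin !== trustedOrigin) return // ignore unknown senders
    const message = event.data
    if (message && message.type === 'state-sync') {
      onState(message.key, message.value)
    }
  })
}
```

Every piece of shared state needs its own message, has to be serializable, and must be kept in sync by hand in both directions, which is why this approach becomes hard to maintain as the number of sandboxes and shared states grows.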

3.5 A sandbox that should be usable (With + Proxy + iframe)

In order to achieve the above scenario, we can stitch the above methods together:

Taking advantage of the natural isolation of iframe from global objects, iframe.contentWindow is taken out as the global object executed in the current sandbox.
Use the sandbox global object as the parameter of with to restrict the access of the internal execution program, and use Proxy to monitor the access inside the program.
Maintain a shared state list, list the global states that need to be shared with the outside world, and implement access control within Proxy .

// Sandbox global proxy object class
class SandboxGlobalProxy {
    constructor(sharedState) {
        // Create an iframe and take its native browser global object as the global object of the sandbox
        const iframe = document.createElement('iframe')
        iframe.src = 'about:blank'
        document.body.appendChild(iframe)
        const sandboxGlobal = iframe.contentWindow // Global object of the sandbox runtime

        return new Proxy(sandboxGlobal, {
            has: (target, prop) => { // has intercepts access to any variable in the with block
                if (sharedState.includes(prop)) { // Shared global state: report it as absent so the lookup continues up the scope chain to the outer environment
                    return false
                }

                if (!target.hasOwnProperty(prop)) {
                    throw new Error(`Invalid expression - ${prop}! You can not do that!`)
                }
                return true
            }
        })
    }
}

function maybeAvailableSandbox(code, ctx) {
    withedYourCode(code).call(ctx, ctx)
}

const code_1 = `
    console.log(history == window.history) // false

    window.abc = 'sandbox'

    Object.prototype.toString = () => {
        console.log('Trapped!')
    }

    console.log(window.abc) // sandbox
`

// Global objects that you want to share with the external execution environment
const sharedGlobal_1 = ['history']
const globalProxy_1 = new SandboxGlobalProxy(sharedGlobal_1)

maybeAvailableSandbox(code_1, globalProxy_1)

window.abc // undefined

Object.prototype.toString() // [object Object], 'Trapped!' is not printed

From the results of the example code, we can see that by leveraging the natural environment isolation advantage of iframe and the powerful control of with + Proxy , we have achieved the isolation of global objects in the sandbox and global objects in the outer layer, and realized the sharing of some global properties.

3.6 Sandbox Escape

A sandbox is a security strategy for its authors, but it can be a constraint for its users. Creative developers try to break out of this constraint in various ways, which is known as sandbox escape. The biggest challenge for a sandbox is therefore detecting and blocking the execution of such unexpected programs.

The sandbox implemented above seems to have satisfied our needs. Is it done? In fact, the following operations will affect the environment outside the sandbox and achieve sandbox escape:

When code accesses a nested property of an object in the sandbox execution context, the Proxy cannot capture that access. For example, code can reach the outer global object directly through window.parent in the sandbox's execution context.

// Accessing a property of an object inside the sandbox object (some of the code above is omitted)
const ctx = {
    window: {
        parent: {...},
        ...
    }
}

const code = `
    window.parent.abc = 'xxx'
`

window.abc // xxx
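The same effect can be reproduced outside a browser with stand-in objects. In this sketch, outer and fakeSandboxGlobal are hypothetical plain objects that play the roles of the parent page's window and iframe.contentWindow, and seen records which names reach the has trap.

```javascript
// Stand-ins for the real objects: `outer` plays the parent page's window,
// `fakeSandboxGlobal` plays iframe.contentWindow.
const outer = {}
const fakeSandboxGlobal = { parent: outer }
fakeSandboxGlobal.window = fakeSandboxGlobal

// Names that reach the has trap during execution.
const seen = []

const guarded = new Proxy(fakeSandboxGlobal, {
  has(target, prop) {
    seen.push(prop)
    if (!Object.prototype.hasOwnProperty.call(target, prop)) {
      throw new Error(`Invalid expression - ${String(prop)}!`)
    }
    return true
  }
})

// Only the leading identifier `window` goes through the with scope;
// `.parent` and `.abc` are ordinary property accesses on the result.
new Function('globalObj', "with (globalObj) { window.parent.abc = 'xxx' }")(guarded)

console.log(seen, outer.abc)
```

The trap only ever sees the name window, yet the write still reaches the outer object.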

Escaping through the prototype chain: JS code can declare a literal directly and then walk up that literal's prototype chain to reach objects of the outer global environment. The sandbox cannot detect this behavior either.

const code = `
    ({}).constructor.prototype.toString = () => {
        console.log('Escape!')
    }
`;

({}).toString() // Escape! Expected [object Object]
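A runnable variant of this escape is sketched below. It is deliberately less destructive than overriding toString: it plants a hypothetical escapedFlag property on the prototype and removes it afterwards. The names guardedCtx, payload, and hostSeesEscape are illustrative, and the guard is a simplified version of the has trap used earlier.

```javascript
// A strict guard: any identifier that is not on the context is rejected.
const guardedCtx = new Proxy({}, {
  has(target, prop) {
    if (!Object.prototype.hasOwnProperty.call(target, prop)) {
      throw new Error(`Invalid expression - ${String(prop)}!`)
    }
    return true
  }
})

// The payload contains no free identifiers, so the has trap never fires.
const payload = "({}).constructor.prototype.escapedFlag = 'Escape!'"
new Function('globalObj', `with (globalObj) { ${payload} }`)(guardedCtx)

// Objects created by the host now inherit the injected property.
const hostSeesEscape = ({}).escapedFlag === 'Escape!'
console.log(hostSeesEscape)

// Undo the pollution so the demo leaves the environment clean.
delete Object.prototype.escapedFlag
```

Because the code is compiled by the host's Function constructor, the object literal inside it is created from the host's Object, so walking its prototype chain leads straight back to the outer environment.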

3.7 “Flawless” Sandbox (Custom Interpreter)

There are more or less some defects in implementing a sandbox through the above methods. Is there a sandbox that is close to being complete?

In fact, many open source libraries already do this: they parse the structure of the source program and manually control the execution logic of each statement. In this way, both the runtime context of the program and any operation that attempts to escape the sandbox are under the sandbox's control. Implementing such a sandbox essentially means implementing a custom interpreter.

function almostPerfectSandbox(code, ctx, illegalOperations) {
    return myInterpreter(code, ctx, illegalOperations) // custom interpreter
}
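To make the idea concrete, here is a heavily simplified sketch of what the core of such an interpreter could look like. It is not the implementation of any particular library: evaluate is a hypothetical function, the node shapes are a hand-written, ESTree-like subset (a real interpreter would obtain them from a parser), and illegalProps is an illustrative deny-list.

```javascript
// Evaluate a tiny expression AST. Identifiers resolve only against ctx,
// and property names on the deny-list are rejected before being read.
function evaluate(node, ctx, illegalProps) {
  switch (node.type) {
    case 'Literal':
      return node.value
    case 'Identifier':
      if (!Object.prototype.hasOwnProperty.call(ctx, node.name)) {
        throw new Error(`Unknown identifier: ${node.name}`)
      }
      return ctx[node.name]
    case 'MemberExpression': {
      // Simplification: `property` is a plain string, not a nested node.
      if (illegalProps.includes(node.property)) {
        throw new Error(`Illegal property access: ${node.property}`)
      }
      const object = evaluate(node.object, ctx, illegalProps)
      return object[node.property]
    }
    case 'BinaryExpression': {
      const left = evaluate(node.left, ctx, illegalProps)
      const right = evaluate(node.right, ctx, illegalProps)
      if (node.operator === '+') return left + right
      if (node.operator === '*') return left * right
      throw new Error(`Unsupported operator: ${node.operator}`)
    }
    default:
      throw new Error(`Unsupported node type: ${node.type}`)
  }
}

const ctx = { price: 2, user: { name: 'Ann' } }
const illegalProps = ['constructor', '__proto__', 'prototype']

// AST for: price * 3
const total = evaluate({
  type: 'BinaryExpression',
  operator: '*',
  left: { type: 'Identifier', name: 'price' },
  right: { type: 'Literal', value: 3 }
}, ctx, illegalProps)

// AST for: user.constructor (an attempted prototype-chain escape)
let blocked = null
try {
  evaluate({
    type: 'MemberExpression',
    object: { type: 'Identifier', name: 'user' },
    property: 'constructor'
  }, ctx, illegalProps)
} catch (err) {
  blocked = err.message
}

console.log(total, blocked)
```

Since the interpreter decides how every node is executed, nothing reaches the host environment unless a case in the switch explicitly allows it. The cost is that every language feature the sandboxed code may use has to be implemented by hand.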

4. Summary

This article introduced the basic concepts and application scenarios of sandboxes and walked through several ways to implement a JavaScript sandbox. There is no single fixed way to implement a sandbox; its goals should be analyzed against the specific scenario. In addition, preventing sandbox escapes is a long and arduous task, because it is difficult to cover every execution case in the early stages of construction.

Just as in Minecraft, no sandbox is built overnight.

5. References

Vue source code: https://github.com/vuejs/vue/blob/v2.6.10/src/core/instance/proxy.js
CodeSandbox: https://codesandbox.io/
with: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/with
Proxy: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Proxy
Writing a JavaScript framework - Sandboxed Code Evaluation: https://blog.risingstack.com/writing-a-javascript-framework-sandboxed-code-evaluation/
Talk about the sandbox in JS: https://juejin.cn/post/6844903954074058760#heading-1
