OranLooney - Clone() In JavaScript

My other essay provides source code and documentation for shallow copy, deep copy, and clone functions. If you came here searching for "javascript clone," it's probably what you want. This is more of a background piece on cloning in prototype-based languages.

I recently did a Google search for "javascript clone" and noticed there's a lot of confusion out there about cloning vs. copying. Every function called clone() I found was in fact either a shallow or a deep copy. All well and good, but not actually a "clone." I blame Java for this; the copy method on Java's Object is called clone(). But notice how they describe it in their documentation:

    clone()
        Creates and returns a copy of this object.

When the first word that comes to mind to describe a method ("copy") is different from the method's name, something has gone wrong. I would expect clone() to mean what it means in other prototype-based languages like SELF, NewtonScript and Io: a function that returns a new object which dynamically looks to the original for defaults. Unlike those languages, there is no native clone() function in JavaScript, but because JavaScript is still a prototype-based language it's possible to replicate that behavior exactly, elegantly, and easily. In fact, on some platforms, this feature is exposed directly: read about differential inheritance in JavaScript. Unfortunately, this non-standard extension isn't available on all platforms, so let's roll up our sleeves and get to work.

The source code also includes a quite thorough shallow copy() function that handles native types and objects, and if you accept the Java definition of clone() that's probably what you were looking for. However, the "real" clone() is far more interesting, and it may very well be what you should be using, so read on.

Here is the source code for clone() and copy(). It is also included on this page so you can play around with it in Firebug.

The Inheritance Relationship

You already know that JavaScript objects are collections of key-value pairs; variously called maps or hash tables. But that isn't the whole story, because core primitives like Date and String have an actual value somewhere deep inside them, and Objects have a somewhat magical relationship with their constructors and prototypes. A JavaScript object really is a collection of key-value pairs plus some special internal data.

One of those data is the prototype: an internal reference to the Object it was spawned from. The prototype basically supplies default properties: when you look up a key in a given Object, JavaScript first looks for it in the Object itself, but if it doesn't find it, it then looks for the property in the prototype, and then in the prototype's prototype, and so on until it either finds the key or runs into the root Object. So, an Object and its prototypes form a chain.
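Here's a small sketch of that lookup in action. The Animal constructor and the property names are made up for illustration; the point is only to show where each value is found along the chain.

```javascript
// Hypothetical constructor and property names, for illustration only.
function Animal() { }
Animal.prototype.legs = 4;           // default shared by every Animal

var dog = new Animal();              // dog's prototype is Animal.prototype
dog.name = 'Rex';                    // stored on dog itself

var ownName = dog.name;              // found on dog itself
var inheritedLegs = dog.legs;        // not on dog; found on Animal.prototype
var fromRoot = typeof dog.toString;  // found on Object.prototype, the root
var missing = dog.wings;             // not found anywhere on the chain

// hasOwnProperty looks only at the object itself, not at its prototypes.
var ownsName = dog.hasOwnProperty('name');
var ownsLegs = dog.hasOwnProperty('legs');
```

Here ownsName is true and ownsLegs is false, even though dog.legs reads as 4: the value lives one link up the chain.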

This is very similar to "inheritance" in other object-oriented programming languages, but note the key distinction: we're only talking about objects (instances), not classes. In class-based languages, an "object" is an "instance" of a "class", and a class can "inherit from" (or "extend" or "sub-class") another class. In prototype-based languages, we inherit from other objects, not classes.

In college, I was taught class-based object-oriented programming exclusively, with the suggestion that it alone was Right. However, there are a number of theoretical and practical reasons to consider prototype-based inheritance. Let's start with Hofstadter's discussion of how ideas relate to each other in "Classes and Instances" from Gödel, Escher, Bach: An Eternal Golden Braid:

   It might seem at first sight that a given symbol would inherently
   be either a symbol for a class or a symbol for an instance - but
   that is an oversimplification.  Actually most symbols may play
   either role, depending on the context of their activation.  For
   example, look at the list below:
     (1) a publication
       (2) a newspaper
         (3) The San Francisco Chronicle
           (4) the May 18th edition of the Chronicle
             (5) my copy of the May 18th edition of the Chronicle
               (6) my copy as it was a few days later[...]

He then goes on to talk about "The Prototype Principle" and "The Splitting-off of Instances from Classes," making the point that anything can be considered specific or abstract, from some point of view. This isn't computer science specifically (although Hofstadter is an AI researcher) but a philosophical attempt to understand how concepts really interact.

The way I see it, the reasons class-based languages are more common are ease of implementation and performance in static languages. In a static language like C++, class information is (mostly) used at compile time and is expressed in generated machine code, while an object is just a sequential block of memory. Concepts like scope and class don't really exist at runtime; they are only implicit in the generated machine code. So we're not going to get away with treating an object as a class in a static language. Dynamic languages, like JavaScript, are a different story: here, scope and inheritance are already runtime concepts, so we have more flexibility. Since we don't have to worry about generating machine code, we are free to pursue inheritance models that may more closely resemble the human thought process.

Which isn't to say that prototype-based inheritance necessarily comes with a performance cost. NewtonScript actually used prototype-based inheritance as a feature to reduce memory usage, on a system where RAM was the major limiting factor. You see, with prototype-based inheritance, you only need to store the values that are actually specific to that object; everything else can be held in a shared prototype. Compare this to C++, where every object reserves memory for its own members and those of its parent classes; sub-classes just keep getting bigger. Instead of thinking of it as a cost, we can think of it as a trade-off: in exchange for a more compact memory footprint with more sharing and less duplication, we do more work to access a particular piece of memory, searching up the chain of prototypes.
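To make the sharing concrete, here's a rough sketch. The Button constructor, its defaults, and the countOwn helper are all hypothetical, and how much memory an engine really saves is up to the engine; the sketch only shows which values are stored on each object and which are read through the shared prototype.

```javascript
// Hypothetical "Button" with shared defaults; names are assumptions.
function Button() { }
Button.prototype.width = 80;
Button.prototype.height = 24;
Button.prototype.label = 'OK';

var buttons = [];
for (var i = 0; i < 1000; i++) {
    buttons.push(new Button());   // each starts with no properties of its own
}
buttons[7].label = 'Cancel';      // only this one object stores its own label

// Count the properties stored directly on an object (helper for this sketch).
function countOwn(obj) {
    var n = 0;
    for (var key in obj) {
        if (obj.hasOwnProperty(key)) { n++; }
    }
    return n;
}

var plainCount = countOwn(buttons[0]);   // nothing stored on a default button
var customCount = countOwn(buttons[7]);  // just the overridden label
var sharedLabel = buttons[0].label;      // read through the prototype
```

A thousand buttons, and the only per-object value anywhere is the one label that differs from the default. The price is the extra hop to the prototype every time a default is read.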

Clone and You

All we need to take away from all this ivory-tower stuff is that there may not be a clean line between "classes" and "instances," and in fact examples of wanting to extend or locally modify an existing object pop up quite often. The "scope chain" between global and local variables in most languages is isomorphic to the prototype relationship. When Oracle opens up a transaction for you, it's just like you're writing to a clone (which then gets applied to the original when you commit). SVN's cheap copies are another kind of clone. It's really a very common and powerful idea.

So here's what my clone() function can do for you: it can take any existing object and tear off your own version of it, by creating an empty object whose prototype is the original. If the original gets modified, the change shows through in the clone, whereas values you set on the clone never propagate back to the original. It's a bit like laying a transparency over a piece of paper and drawing on it with a marker.

Clone() in JavaScript

Here's my original version of the clone() function in JavaScript, from owl_util.js:

function clone(obj) {
    // A clone of an object is an empty object
    // with a prototype reference to the original.

    // A private constructor, used only by this one clone.
    function Clone() { }
    Clone.prototype = obj;
    var c = new Clone();
    c.constructor = Clone;
    return c;
}

Since JavaScript always hangs prototypes off of a constructor, I simply use a closure to create a private constructor for each clone; the constructor is thrown away after being used once. In that sense, each clone is its own class. This is correct, but it's not as efficient as it could be. I'll show you a less clear but more efficient version below.

The net result of this trick is that I can clone any Object, and the clone will have the prototype relationship with its original.

var original = { a: 'A', b: 'B' };
var clone = owl.util.clone(original);
// clone.a == 'A'
// clone.b == 'B'
clone.a = 'Apple';
// clone.a == 'Apple'
// original.a == 'A'  // unchanged
original.b = 'Banana';
// clone.b == 'Banana'  // change shows through
clone.c = 'Car';
// original.c is undefined
original.a = 'Abracadabra';
// clone.a == 'Apple'  // clone's new value hides the original's
delete clone.a;
// clone.a == 'Abracadabra'  // original value visible again
// repeating "delete clone.a" won't delete the original's value.

Tightening Our Belts

The above version of clone() uses more memory than necessary by creating a closure for each clone. However, it turns out we can avoid this by reusing the same constructor function for all clones. The basic idea comes from MochiKit's clone() utility, but their implementation is confusing because the function uses itself as the constructor. This implementation is more readable:

function Clone() { }
function clone(obj) {
    Clone.prototype = obj;
    return new Clone();
}

Here, Clone() is the reused constructor, and we switch its prototype to the source object just before spawning each clone.

At first I wasn't sure this would work in all browsers, but the ECMA-262 JavaScript standard clearly states in section 13.2.2 that the new object's internal [[Prototype]] property is set to whatever the constructor's prototype property is at the time of construction. That means this version of clone() should work in any compliant JavaScript engine.
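A quick way to convince yourself is to make two clones in a row and check that the first one still points at its own original, even though Clone.prototype has since been switched. This is just a sanity-check sketch; the variable names are made up, and Object.getPrototypeOf assumes an ES5 or later engine.

```javascript
function Clone() { }
function clone(obj) {
    Clone.prototype = obj;
    return new Clone();
}

var first = { name: 'first' };
var second = { name: 'second' };

var a = clone(first);    // Clone.prototype was `first` when a was constructed
var b = clone(second);   // Clone.prototype has now been switched to `second`

var aName = a.name;      // a still looks to `first`
var bName = b.name;      // b looks to `second`

// Object.getPrototypeOf (ES5 and later) confirms each link directly.
var aLinked = Object.getPrototypeOf(a) === first;
var bLinked = Object.getPrototypeOf(b) === second;

// Side effect of reusing one constructor: instanceof consults the
// constructor's *current* prototype, so only the latest chain passes.
var aIsClone = a instanceof Clone;
var bIsClone = b instanceof Clone;
```

Both aLinked and bLinked come out true. The one wrinkle is instanceof: aIsClone is false and bIsClone is true, because instanceof checks against whatever Clone.prototype happens to be at the moment you ask.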

Clone != Shallow Copy != Deep Copy

I hope I've convinced you that cloning an object is a very different thing than making a copy, deep or shallow. It means giving the clone the prototype or "is-a" relationship. My implementation probably isn't perfect, but I wanted to get it out there to remind people that in dynamic languages, cloning is a distinct and extremely interesting operation.
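To see all three side by side, here's a rough comparison. The shallowCopy and deepCopy helpers below are minimal stand-ins I'm assuming for this sketch (plain objects only, no arrays, dates, or cycles); they are not the copy() functions from the source code.

```javascript
function Clone() { }
function clone(obj) {
    Clone.prototype = obj;
    return new Clone();
}

// Minimal stand-ins for illustration: plain objects only.
function shallowCopy(obj) {
    var result = {};
    for (var key in obj) {
        if (obj.hasOwnProperty(key)) { result[key] = obj[key]; }
    }
    return result;
}

function deepCopy(obj) {
    if (obj === null || typeof obj !== 'object') { return obj; }
    var result = {};
    for (var key in obj) {
        if (obj.hasOwnProperty(key)) { result[key] = deepCopy(obj[key]); }
    }
    return result;
}

var original = { title: 'Draft', meta: { tags: 1 } };
var cloned = clone(original);
var shallow = shallowCopy(original);
var deep = deepCopy(original);

original.title = 'Final';   // change a top-level value on the original
original.meta.tags = 2;     // change a value inside a nested object

var clonedTitle = cloned.title;       // clone looks to the original
var shallowTitle = shallow.title;     // fixed at copy time
var deepTitle = deep.title;           // fixed at copy time
var clonedTags = cloned.meta.tags;    // same nested object, via the prototype
var shallowTags = shallow.meta.tags;  // same nested object, shared reference
var deepTags = deep.meta.tags;        // its own nested object
```

After the two changes, the clone sees 'Final' and 2, the shallow copy sees 'Draft' and 2, and the deep copy sees 'Draft' and 1. Three different answers, so three different operations.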

Also

Mad props to David Flanagan for JavaScript: The Definitive Guide 5th Edition, the best book on JavaScript I've ever seen and the source of much wisdom.

- Oran Looney January 23rd 2008
