Typed arrays and binary data

Introduction

Typed arrays are not competitors to the well-known Array class; rather, they are a necessity in the era of HTML5 and of handling large amounts of binary data. Generally, everything you store in a typed array you can also store in an ordinary array. In some cases, however, typed arrays are much more efficient. Among the newer browser capabilities that take advantage of them are WebGL, the File API, WebSocket and audio/video manipulation. In this article we introduce typed arrays and the data view, specialized data structures for operating on binary data.

Binary data

A raw buffer of binary data is represented in JavaScript by the ArrayBuffer class. You cannot read or modify an array buffer directly; you have to use views. There are two types of views: the DataView, which allows heterogeneous access, and typed arrays, which hold data of a single type. In short, the array buffer stores the data and the views interpret it. Multiple different views can be used with a single ArrayBuffer object. An array buffer is created with a length parameter, which allocates the specified number of bytes, all initialized to zero:

new ArrayBuffer(length)

Where:
length - the size in bytes

The ArrayBuffer object has only a few members:

  • byteLength - the length of the array buffer in bytes, read-only
  • slice(beginOffset [, endOffset]) - returns a copy of the array buffer starting at the byte indicated by beginOffset and ending before endOffset (the end of the buffer if omitted). The beginOffset is inclusive and the endOffset is exclusive

Data view

To access different data types in the same array buffer we use the DataView object which is constructed in the following way:

new DataView(buffer [, byteOffset [, byteLength]])

Where:
buffer - a reference to an existing ArrayBuffer object
byteOffset - the starting position of the DataView within the buffer, in bytes; if omitted, the view starts at the first byte
byteLength - the length of the view in bytes; if omitted, the view extends from byteOffset to the end of the buffer

The DataView also has three read-only properties corresponding to the parameters passed to its constructor: buffer, byteOffset and byteLength. To manage data, the DataView has eight methods for reading and eight methods for writing numerical values.

dataview.getDataType(byteOffset [, littleEndian])
dataview.setDataType(byteOffset, value [, littleEndian])

Where:
DataType - is one of the number types, see the table below
byteOffset - in bytes, starting point for reading or writing data
value - the number value to set
littleEndian - optional and relevant only for multibyte types (Int16 and larger); indicates whether the value is stored in little-endian or big-endian format. The default is false, meaning that the value is written or read as big-endian. Byte order is discussed later in this article.

DataView read methods: getInt8, getUint8, getInt16, getUint16, getInt32, getUint32, getFloat32, getFloat64
DataView write methods: setInt8, setUint8, setInt16, setUint16, setInt32, setUint32, setFloat32, setFloat64

In the following example we check the array buffer properties and perform some simple operations using the data view. We create an 8-byte array buffer ab1 and a data view dv1 on it. Next, we create another array buffer ab2, which is a 4-byte copy of ab1 from byte 4 up to (but not including) byte 8. The byteLength property of these buffers is 8 and 4 respectively. Next, we write some values to both buffers and use Uint8Array to read them back. As we can see, the contents differ, because slice copies the data rather than sharing it.

var ab1 = new ArrayBuffer(8);
var dv1 = new DataView(ab1);
var ab2 = ab1.slice(4,8);
var dv2 = new DataView(ab2);
console.log(ab1.byteLength); // 8
console.log(ab2.byteLength); // 4
for(var i=0;i<4;++i) {
	dv1.setUint8(i,i+1);
	dv2.setUint8(i,i+2);
}
console.log(new Uint8Array(ab1)); // [1, 2, 3, 4, 0, 0, 0, 0]
console.log(new Uint8Array(ab2)); // [2, 3, 4, 5] 

We create a new data view dv3 on the ab1 buffer, starting at byte offset 4. Offsets passed to dv3 are relative to that starting point. We write 0xFF at offsets 0 and 1 and the 16-bit value 0xFFFF at offset 2. Reading the array buffer through a Uint8Array then shows that all bytes from offset 4 onwards are equal to 0xFF.

var dv3 = new DataView(ab1,4);
dv3.setUint8(0,0xFF);
dv3.setUint8(1,0xFF);
dv3.setUint16(2,0xFFFF);
console.log(new Uint8Array(ab1)); // [1, 2, 3, 4, 255, 255, 255, 255] 
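
The examples above write single bytes only. To illustrate the heterogeneous access that the DataView is meant for, here is a minimal sketch that stores values of three different types in one buffer. The record layout (a type byte, a 16-bit count and a 32-bit float) is purely hypothetical and was chosen only for this illustration.

```javascript
// Hypothetical 8-byte record: [type: Uint8][count: Uint16][value: Float32][1 unused byte]
var record = new ArrayBuffer(8);
var view = new DataView(record);

view.setUint8(0, 7);       // offset 0, 1 byte
view.setUint16(1, 513);    // offset 1, 2 bytes, big-endian by default (0x02, 0x01)
view.setFloat32(3, 1.5);   // offset 3, 4 bytes, big-endian by default (0x3FC00000)

console.log(view.getUint8(0));       // 7
console.log(view.getUint16(1));      // 513
console.log(view.getFloat32(3));     // 1.5
console.log(new Uint8Array(record)); // [7, 2, 1, 63, 192, 0, 0, 0]

// A second view limited to the 4 bytes that hold the float.
var tail = new DataView(record, 3, 4);
console.log(tail.byteOffset);    // 3
console.log(tail.byteLength);    // 4
console.log(tail.getFloat32(0)); // 1.5
```

Note that the DataView does not require values to be aligned: the 16-bit value sits at offset 1 and the float at offset 3.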

Typed arrays

Typed arrays are the other kind of view. We can use a typed array instead of a DataView when all the data in the binary buffer is of the same type. Typed arrays differ from ordinary arrays in several respects:

  • they have fixed length defined during creation
  • all elements are of the same numeric value type
  • because of numeric type they cannot be multidimensional
  • all elements are initialized with the zero value
  • they have different properties and methods

There are nine different kinds of typed arrays:

Array type         Size (bytes)  Description                                            Range
Int8Array          1             8-bit two's complement signed integer                  -128 (0x80) to 127 (0x7F)
Uint8Array         1             8-bit unsigned integer                                 0 (0x00) to 255 (0xFF)
Uint8ClampedArray  1             8-bit unsigned integer (clamped for canvas ImageData)  0 (0x00) to 255 (0xFF)
Int16Array         2             16-bit two's complement signed integer                 -32,768 to 32,767
Uint16Array        2             16-bit unsigned integer                                0 to 65,535
Int32Array         4             32-bit two's complement signed integer                 -2,147,483,648 to 2,147,483,647
Uint32Array        4             32-bit unsigned integer                                0 to 4,294,967,295
Float32Array       4             32-bit IEEE 754 floating point number                  approx. +/-3.4028e38
Float64Array       8             64-bit IEEE 754 floating point number                  approx. +/-1.7976e308

The asymmetric range of the signed types comes from two's complement, the binary representation these arrays use for signed integers.

When a typed array is created without an existing buffer, an underlying array buffer big enough to store the array data is created automatically. Each array type can be constructed in one of the following ways:

new TypedArray(length)
new TypedArray(typedArray)
new TypedArray(object)
new TypedArray(buffer [, byteOffset [, length]])

Where:
length - the number of elements to create, all initialized to zero
typedArray - any typed array; it is copied into a new array with each element converted to the new array's type
object - an ordinary array or an array-like object, which is converted to a typed array
buffer, byteOffset, length - when called with a buffer, the typed array becomes a view of that buffer instead of creating its own; byteOffset (in bytes) and length (in elements) are optional
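
The four constructor forms can be sketched as follows. The variable names are only illustrative. The last part also shows one restriction worth knowing: for a view on an existing buffer, the byte offset has to be a multiple of the element size.

```javascript
// 1. By length: three 16-bit elements, all zero.
var byLength = new Int16Array(3);
console.log(byLength);            // [0, 0, 0]

// 2. From another typed array: each element is converted to the new type.
var floats = new Float32Array([1.7, 2.2, 300]);
var fromTyped = new Uint8Array(floats);
console.log(fromTyped);           // [1, 2, 44], fractions are dropped and 300 wraps to 44

// 3. From an ordinary array or an array-like object.
var fromArray = new Float64Array([1, 2, 3]);
var fromArrayLike = new Uint8Array({ length: 2, 0: 10, 1: 20 });
console.log(fromArrayLike);       // [10, 20]

// 4. As a view on an existing buffer: 2 elements starting at byte 2.
var buffer = new ArrayBuffer(8);
var onBuffer = new Uint16Array(buffer, 2, 2);
console.log(onBuffer.buffer === buffer); // true
console.log(onBuffer.byteOffset);        // 2
console.log(onBuffer.byteLength);        // 4

// The byte offset must be a multiple of the element size (2 for Uint16Array).
var misaligned = false;
try {
	new Uint16Array(buffer, 1);
} catch (e) {
	misaligned = e instanceof RangeError;
}
console.log(misaligned);                 // true
```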

Typed arrays have the following properties and methods:
buffer - reference to the underlying array buffer, read-only
byteLength - length of the data in bytes, counted from byteOffset; read-only, set at construction time
byteOffset - starting point of the data in the ArrayBuffer, in bytes; read-only, set at construction time
length - number of array elements, read-only
set(array [, offset]) - copies elements from array (an ordinary or typed array) into this typed array; the optional offset is the index in the target at which writing starts (default 0). If the source does not fit into the target from that offset, an exception (RangeError) is thrown
subarray(beginOffset, endOffset) - returns a new typed array of the same type that shares the same array buffer. The beginOffset is inclusive and the endOffset is exclusive.
BYTES_PER_ELEMENT - size of one array element in bytes

In the following example we create an ordinary array with ten elements, 0 to 9, and pass it to the Uint8Array constructor. We get a typed array with the corresponding elements. Next we create a second typed array from the first using the subarray method and modify its values using the set method. It is important to notice that when we change the values in subArray, the values of typedArray change too. That happens because both arrays share the same array buffer.

// Builds [0, 1, ..., 9]: map() calls Number with each element's index
var standardArray = Array.apply(null,new Array(10)).map(Function.call.bind(Number));
var typedArray = new Uint8Array(standardArray);
console.log(typedArray.BYTES_PER_ELEMENT);	// 1
console.log(typedArray.length);	// 10
console.log(typedArray);	// [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
var subArray = typedArray.subarray(0,5);
console.log(subArray);	// [0, 1, 2, 3, 4]
subArray.set(new Uint8Array(3),1);
console.log(subArray);	// [0, 0, 0, 0, 4]
console.log(typedArray);	// [0, 0, 0, 0, 4, 5, 6, 7, 8, 9]
console.log(subArray.buffer === typedArray.buffer); // true
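
The offset parameter of set and the exception mentioned in its description can be seen in this short sketch. The names target and threw are only illustrative.

```javascript
var target = new Uint8Array(4);
target.set([1, 2], 2);   // write two elements starting at index 2
console.log(target);     // [0, 0, 1, 2]

var threw = false;
try {
	target.set([1, 2, 3], 2);   // three elements do not fit from index 2
} catch (e) {
	threw = e instanceof RangeError;
}
console.log(threw);      // true
console.log(target);     // [0, 0, 1, 2], nothing was written by the failed call
```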

Overflow

When working with typed arrays we should be aware of integer overflow. An overflow occurs when we write into a typed array a number which is outside the range of the array type (see the table of array types and ranges). The value read back after the write differs from the one written: it can, for example, be smaller, or positive instead of negative.

Signed integers in typed arrays are represented in two's complement (2C), and what we get when writing a value outside the range follows from that. In the following example we create a typed array of ten elements in which every element is an 8-bit signed integer, an Int8Array; that means it has 7 bits for the value and 1 bit for the sign. As you can see, the array is initialized with zeros. At the first two indexes we write the minimum and maximum of the Int8Array range (-128 and 127), and at the next four we write values outside the range: -9900, 9900, -140 and 140. Printing the array shows different values instead: 84, -84, 116 and -116.

var a = new Int8Array(10);
console.log(a); // [0, 0, 0, 0, 0, 0, 0, 0, 0, 0] 
a[0] = -128;
a[1] = 127;
a[2] = -9900;
a[3] = 9900;
a[4] = -140;
a[5] = 140;
console.log(a); // [-128, 127, 84, -84, 116, -116, 0, 0, 0, 0]

How does that come about? A negative number is converted to two's complement (2C), and if it needs more than eight bits, the high-order bits are thrown away. A positive number is converted to binary (B2), and if it needs more than eight bits, the high-order bits are likewise thrown away so that only eight remain. Then, if the highest of the remaining bits is equal to one, the number is treated as negative and converted back from 2C.

-9900 -> 101100101010100(2C) -> 01010100(2C) -> 84(B10)
9900 -> 010011010101100(B2) -> 10101100(2C) -> -84(B10)
-140 -> 101110100(2C) -> 01110100(2C) -> 116(B10)
140 -> 010001100(B2) -> 10001100(2C) -> -116(B10)
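
The same conversion can be expressed with modular arithmetic. The helper below, wrapInt8, is a hypothetical function written only for this illustration; it is a sketch that assumes integer input and compares its result with what an Int8Array actually stores.

```javascript
// Hypothetical helper: what an Int8Array stores for an integer n.
function wrapInt8(n) {
	var low = ((n % 256) + 256) % 256;   // keep the low 8 bits as a value 0..255
	return low >= 128 ? low - 256 : low; // high bit set means a negative number
}

var inputs = [-128, 127, -9900, 9900, -140, 140];
var stored = new Int8Array(inputs);
for (var i = 0; i < inputs.length; ++i) {
	console.log(inputs[i], wrapInt8(inputs[i]), stored[i]);
}
// -128 -128 -128
// 127 127 127
// -9900 84 84
// 9900 -84 -84
// -140 116 116
// 140 -116 -116
```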

Let's check the array of type Uint8Array:

var a = new Uint8Array(10);
console.log(a); // [0, 0, 0, 0, 0, 0, 0, 0, 0, 0] 
a[0] = -0x80;
a[1] = 0x7F;
a[2] = -9900;
a[3] = 9900;
a[4] = -140;
a[5] = 280;
console.log(a); // [128, 127, 84, 172, 116, 24, 0, 0, 0, 0]

Overflow occurs here too; the only difference is that a result whose high-order bit is equal to one is still treated as a conventional binary value (B2), never as a negative number.

-128 -> 10000000(2C) = 128(B10) 
-9900 -> 101100101010100(2C) -> 01010100(B2) = 84(B10)
9900 -> 010011010101100(B2) -> 10101100(B2) = 172(B10)
-140 -> 101110100(2C) -> 01110100(B2) = 116(B10)
280 -> 0100011000(B2) -> 00011000(B2) = 24(B10)
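
For integers, keeping only the low eight bits is what the bitwise expression n & 0xFF does, so it can serve as a quick check of the values above. This is a sketch that assumes the inputs are integers within the 32-bit range on which the bitwise operators work.

```javascript
var values = [-0x80, 0x7F, -9900, 9900, -140, 280];

// Mask off everything except the low 8 bits.
var masked = values.map(function (n) { return n & 0xFF; });

console.log(masked);                 // [128, 127, 84, 172, 116, 24]
console.log(new Uint8Array(values)); // [128, 127, 84, 172, 116, 24]
```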

In the following example we can observe the difference between overflow handling in Uint8Array and Uint8ClampedArray. The clamped array does not accept values greater than 255 or smaller than 0: numbers outside that range are set to 255 or 0 respectively. This is a convenience, since clamped arrays are used for image data storage. Knowing that, we can perform some operations without worrying about unexpected results or doing a range check.

var au = new Uint8Array(4);
au[1] = -9900;
au[2] = 0xFF;
au[3] = 9999;
console.log(au); // [0, 84, 255, 15]
var ac = new Uint8ClampedArray(4);
ac[1] = -9900;
ac[2] = 0xFF;
ac[3] = 9999;
console.log(ac); // [0, 0, 255, 255]

In the Uint8Array the values wrap around:

-9900 -> 101100101010100(2C) -> 01010100(B2) = 84(B10)
9999 -> 010011100001111(B2) -> 00001111(B2) = 15(B10)

A Uint8ClampedArray is the type of the data property of the ImageData object returned by the canvas context methods createImageData and getImageData.
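
To see why clamping is convenient for image data, consider brightening a pixel. The sketch below uses a hand-made RGBA pixel instead of a real canvas, so the values are hypothetical and no browser is needed to run it.

```javascript
// Hypothetical pixel data: one RGBA pixel.
var pixel = [200, 100, 50, 255];
var wrapped = new Uint8Array(pixel);
var clamped = new Uint8ClampedArray(pixel);

// Brighten the R, G and B channels by 100, leave alpha alone.
for (var i = 0; i < 3; ++i) {
	wrapped[i] += 100;
	clamped[i] += 100;
}

console.log(wrapped); // [44, 200, 150, 255], red overflowed and wrapped around to 44
console.log(clamped); // [255, 200, 150, 255], red stopped at 255
```

With the plain Uint8Array the brightest channel suddenly becomes dark, while the clamped array gives the expected saturated colour without any range check in our code.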

What about arrays of the float type? When we try to write a value outside the range of a float array type, we get Infinity or -Infinity. The largest and the most negative values that a Float64Array can hold are Number.MAX_VALUE and -Number.MAX_VALUE, because a JavaScript number is itself a 64-bit float.

var a = new Float32Array(4);
a[0] = -3.4028e38;
a[1] = 3.4028e38;
a[2] = -3.4029e38;
a[3] = 3.4029e38;    
console.log(a); // [-3.4027999387901484e+38, 3.4027999387901484e+38, -Infinity, Infinity]
    
var a = new Float64Array(4);
a[0] = -Number.MAX_VALUE;
a[1] = Number.MAX_VALUE;
a[2] = -1.7977e308;
a[3] = 1.7977e308;
console.log(a); // [-1.7976931348623157e+308, 1.7976931348623157e+308, -Infinity, Infinity]

Byte order

Byte order is something we have to deal with when data is read into a program from external sources.  There are two ways of storing multibyte data on different computer systems:

  • big-endian - high order bytes are stored at lower index positions
  • little-endian - low order bytes are stored at lower index positions

The most convenient way to deal with this issue in JavaScript is to use a DataView object, which lets us pass an appropriate byte order parameter to its write and read operations. In the following example we create a two-byte array buffer and construct a DataView on it. Next we write the 16-bit value 0x00FF to the buffer using the setUint16 method, first with the third parameter set to false and then with it set to true. With false, which is the default, the value is written in big-endian order: the buffer holds 0x00 first and 0xFF after it. With true, the value is written in little-endian order: the buffer holds 0xFF first and 0x00 after it. To read the buffer data we use a Uint8Array, in which the byte order is reflected by the array indices.

var ab = new ArrayBuffer(2);
var dv = new DataView(ab);
dv.setUint16(0, 255, false);
console.log(new Uint8Array(ab)); // [0, 255] 
dv.setUint16(0, 255, true);
console.log(new Uint8Array(ab)); // [255, 0] 
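
The same parameter applies when reading. As a small sketch with made-up byte values, the four bytes below are interpreted once in each byte order:

```javascript
var raw = new Uint8Array([0x12, 0x34, 0x56, 0x78]);
var reader = new DataView(raw.buffer);

// Same bytes, two interpretations.
console.log(reader.getUint32(0, false).toString(16)); // "12345678", big-endian
console.log(reader.getUint32(0, true).toString(16));  // "78563412", little-endian

// With the parameter omitted, big-endian is used.
console.log(reader.getUint16(2).toString(16));        // "5678"
```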

Knowing that, we can write a piece of code which checks the endianness of our system. Typed arrays, unlike the DataView, use the platform's native byte order. We create a 16-bit array and put the value 0x00FF into it. Next we create an 8-bit array on the buffer of the previously created array. If the value at index zero of the 8-bit array is equal to 0xFF, the system is little-endian.

var a16 = new Uint16Array(1);
a16[0] = 255;
var a8 = new Uint8Array(a16.buffer);
console.log(a8[0]===255); // true on a little-endian system
console.log(a8); // [255, 0] on a little-endian system
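
The check can be wrapped in a small function. The name isLittleEndian is hypothetical, chosen for this sketch only. Its result can then be passed to the DataView methods whenever data has to be exchanged with a typed array in the platform's native order.

```javascript
// Hypothetical helper: returns true on little-endian platforms.
function isLittleEndian() {
	var probe = new Uint16Array([0x00FF]);
	return new Uint8Array(probe.buffer)[0] === 0xFF;
}

// Write with DataView in the native order, then read through a typed array.
var nativeView = new DataView(new ArrayBuffer(2));
nativeView.setUint16(0, 0x00FF, isLittleEndian());
console.log(new Uint16Array(nativeView.buffer)[0] === 0x00FF); // true on either kind of system
```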

Summary

Typed arrays, array buffers and the data view are part of ECMAScript 6. These data structures were added for performance reasons and to fulfil the requirements of some new HTML5 features. You can see practical examples of some of these features in the article called: Binary data from and to file.