SystemML Class
std/2009/data/numeric
Summary
Container for an array of primitive numeric values
Status
Stable

Overview

The most widely used data class in BRAHMS, the numeric array contains an N-dimensional array of numeric elements of any primitive type listed under Numeric Types. Every numeric array object therefore has a one-to-one equivalent representation in Matlab. Many process classes will use no data classes beyond this one.

Notes

Representation of complex numbers

Complex numbers are generally stored in one of two ways. The first, "adjacent", used by e.g. Matlab, has all real data laid out as if there were no imaginary data, then all imaginary data following directly with the same layout (that is, the real/imag is treated as a trailing dimension of 2). The second, "interleaved", used by e.g. NumPy, has real/imaginary parts of each scalar adjacent in memory, and these double-size objects are then laid out just like an array of real numbers (that is, the real/imag is treated as a leading dimension of 2). Since it must serve multiple clients in multiple languages, std/data/numeric is agnostic about this choice, and can store data either as TYPE_CPXFMT_ADJACENT (adjacent) or TYPE_CPXFMT_INTERLEAVED (interleaved).
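
To make the two layouts concrete, the sketch below reorders a complex array between them using plain std::vector<double>. The helper names are hypothetical and are not part of the std/data/numeric interface; within BRAHMS, the data object performs any such reordering for you.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Adjacent layout:    [re0, re1, ..., reN-1, im0, im1, ..., imN-1]
// Interleaved layout: [re0, im0, re1, im1, ..., reN-1, imN-1]
// Both hold N complex elements in 2*N doubles.

// Hypothetical helper: reorder adjacent data into interleaved form.
std::vector<double> adjacentToInterleaved(const std::vector<double>& adjacent)
{
    assert(adjacent.size() % 2 == 0);
    const std::size_t n = adjacent.size() / 2;
    std::vector<double> interleaved(adjacent.size());
    for (std::size_t i = 0; i < n; i++)
    {
        interleaved[2 * i]     = adjacent[i];      // real part
        interleaved[2 * i + 1] = adjacent[n + i];  // imaginary part
    }
    return interleaved;
}

// Hypothetical helper: the inverse reordering.
std::vector<double> interleavedToAdjacent(const std::vector<double>& interleaved)
{
    assert(interleaved.size() % 2 == 0);
    const std::size_t n = interleaved.size() / 2;
    std::vector<double> adjacent(interleaved.size());
    for (std::size_t i = 0; i < n; i++)
    {
        adjacent[i]     = interleaved[2 * i];      // real part
        adjacent[n + i] = interleaved[2 * i + 1];  // imaginary part
    }
    return adjacent;
}
```

Applying one helper and then the other returns the original buffer, which is why a data object can hand out either form without losing information.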

For maximal performance, and to avoid unnecessary translations back and forth, your client process can handle both formats. However, reordering is rarely a performance bottleneck, so if ease of use matters more to you than performance, you can access your input data in one specified form only, regardless of what your inputs pass (this usually makes for much simpler client code). Many processes may read the same data object, and it is preferable to perform the translation only once for all of them that need it, so the translation is performed (if required) internally to std/data/numeric. For instance, if the data is in adjacent form and one client process reads it in interleaved form, it is then available for direct reading in either format by further readers.
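
The translate-once behaviour described above can be pictured as a lazily filled cache. The following is a minimal sketch under that assumption; the class ComplexBuffer and its members are hypothetical and do not reflect the actual internals of std/data/numeric.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a data object whose native form is adjacent.
class ComplexBuffer
{
public:
    explicit ComplexBuffer(std::vector<double> adjacent)
        : adjacent_(std::move(adjacent)), conversions_(0)
    {
        assert(adjacent_.size() % 2 == 0);
    }

    // Readers that want the native form get it directly.
    const std::vector<double>& readAdjacent() const { return adjacent_; }

    // The first reader that wants interleaved data triggers the
    // translation; later readers reuse the cached result.
    const std::vector<double>& readInterleaved()
    {
        if (interleaved_.empty() && !adjacent_.empty())
        {
            const std::size_t n = adjacent_.size() / 2;
            interleaved_.resize(adjacent_.size());
            for (std::size_t i = 0; i < n; i++)
            {
                interleaved_[2 * i]     = adjacent_[i];
                interleaved_[2 * i + 1] = adjacent_[n + i];
            }
            conversions_++;
        }
        return interleaved_;
    }

    // Number of translations actually performed (for illustration).
    int conversions() const { return conversions_; }

private:
    std::vector<double> adjacent_;
    std::vector<double> interleaved_;
    int conversions_;
};
```

However many readers ask for the interleaved form, the reordering loop runs once; both forms then remain available for direct reading.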

You can insist on a particular form, or indicate that you will accept the data in either form, by calling setReadFormat() before you call getContent(). If you do not specify, the default is to return data in adjacent form. If you do specify one or the other, the conversion will be done for you (efficiently, as described above), and you will never know what format the data came in as. To accept both forms, explicitly specify setReadFormat(TYPE_CPXFMT_UNSPECIFIED | ...).

Column-major versus row-major

std/data/numeric supports both of these formats in principle, but currently rejects any request to store data in row-major order. Clients of std/data/numeric should note that a future release will allow row-major data, so that data in a std/data/numeric object may then be in either format.

You can insist on a particular form, or indicate that you will accept the data in either form, by calling setReadFormat() before you call getContent(). If you do not specify, the default is to return data in column-major form. If you do specify one or the other, the conversion will be done for you (efficiently, as described above), and you will never know what format the data came in as. To accept both forms, explicitly specify setReadFormat(TYPE_ORDER_UNSPECIFIED | ...).
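
For readers less familiar with the two orderings, the sketch below computes the linear offset of an element under each convention. The function names are hypothetical and not part of the BRAHMS interface; they only illustrate what "column-major" and "row-major" mean for the memory block returned by getContent().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major: the FIRST dimension varies fastest (as in Matlab).
std::size_t offsetColumnMajor(const std::vector<std::size_t>& dims,
                              const std::vector<std::size_t>& index)
{
    assert(dims.size() == index.size());
    std::size_t offset = 0, stride = 1;
    for (std::size_t d = 0; d < dims.size(); d++)
    {
        assert(index[d] < dims[d]);
        offset += index[d] * stride;
        stride *= dims[d];
    }
    return offset;
}

// Row-major: the LAST dimension varies fastest (as in C arrays).
std::size_t offsetRowMajor(const std::vector<std::size_t>& dims,
                           const std::vector<std::size_t>& index)
{
    assert(dims.size() == index.size());
    std::size_t offset = 0, stride = 1;
    for (std::size_t d = dims.size(); d-- > 0; )
    {
        assert(index[d] < dims[d]);
        offset += index[d] * stride;
        stride *= dims[d];
    }
    return offset;
}
```

For a 2x3 array, element (1, 0) sits at offset 1 in column-major order but at offset 3 in row-major order.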

Output format

An (N+1)-dimensional array of numeric data, where the (N+1)th dimension is sample number.
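
As an illustration of what the extra dimension implies, the sketch below assumes the logged array is stored column-major, so that the trailing sample dimension varies slowest and each sample occupies one contiguous block. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dimensions of the logged array: the per-sample dims plus a
// trailing entry holding the number of samples.
std::vector<std::size_t> loggedDims(std::vector<std::size_t> dims,
                                    std::size_t numSamples)
{
    dims.push_back(numSamples);
    return dims;
}

// Offset (in elements) of the first element of a given sample,
// assuming column-major storage of the logged array.
std::size_t sampleOffset(const std::vector<std::size_t>& dims,
                         std::size_t sample, std::size_t numSamples)
{
    assert(sample < numSamples);
    std::size_t perSample = 1;
    for (std::size_t d = 0; d < dims.size(); d++)
        perSample *= dims[d];
    return sample * perSample;
}
```

For a 2x4 output logged over 10 samples, the result has dimensions 2x4x10, and under the column-major assumption sample 3 starts at element 24.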

Native Interface

The data object provides the following C++ interface. The Input and Output classes are "Accessors". An Accessor is unattached when it is created, and is attached to a Port either by calling attach() (on an Input) or create() (on an Output).

For more usage examples than are given below, see the source code for any Standard Library process that uses this data type.

Notes
  • The C++ interface overlays a C interface; for details of the C interface, see the header file for the class.
  • As with all Data and Utility Components, this one presents its C++ interface in a namespace named for its SystemML Class, with underscores instead of slashes and the release tagged on the end. It may be easier to import this namespace, provided it does not conflict with other symbols in your source code, but you can always use symbols within it explicitly, as shown here. An alternative is to create a namespace alias with a more convenient name - this approach is used, for example, in the 1199 template.
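
The namespace options mentioned in the last note can be sketched as follows. The namespace name std_2009_data_numeric_0 is an assumption derived from the naming rule above (the release suffix "0" is a guess), and the empty structs are placeholders for the real declarations supplied by the component header.

```cpp
#include <type_traits>

// Hypothetical stand-in for the namespace exported by the header.
namespace std_2009_data_numeric_0
{
    struct Input {};
    struct Output {};
}

// Option 1: a short alias, so that symbols read numeric::Input.
namespace numeric = std_2009_data_numeric_0;

// Option 2: name a symbol explicitly with its full namespace.
typedef std_2009_data_numeric_0::Output NumericOutput;
```

With the alias in place, numeric::Input and std_2009_data_numeric_0::Input name the same type, and nothing else is imported into the enclosing scope.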
C/C++ API (Data)
struct Structure
{
    // physical
    TYPE type;                     // full numeric type (includes complexity, shape, flags etc.)
    struct Dimensions dims;        // real elements only; complexity does not affect this count

    // derived
    TYPE typeElement;              // broken down numeric type (type | TYPE_ELEMENT_MASK)
    UINT32 bytesPerElement;
    UINT64 numberOfElementsReal;
    UINT64 numberOfElementsTotal;
    UINT64 numberOfBytesReal;
    UINT64 numberOfBytesTotal;
    UINT8 complex;                 // boolean (true if data is complex)
    UINT8 scalar;                  // boolean (true if numberOfElementsReal is unity)
    UINT8 realScalar;              // boolean (true if numberOfElementsTotal is unity)
};
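
The relationships between the derived fields can be sketched as below. This is inferred from the field names and comments only; DerivedCounts and derive() are hypothetical, and the sketch assumes that a complex array holds exactly twice as many elements as its real part.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified mirror of the derived fields of Structure.
struct DerivedCounts
{
    std::uint64_t numberOfElementsReal;
    std::uint64_t numberOfElementsTotal;
    std::uint64_t numberOfBytesReal;
    std::uint64_t numberOfBytesTotal;
    bool scalar;       // true if numberOfElementsReal is unity
    bool realScalar;   // true if numberOfElementsTotal is unity
};

DerivedCounts derive(const std::vector<std::uint64_t>& dims,
                     std::uint32_t bytesPerElement, bool complex)
{
    assert(bytesPerElement > 0);
    DerivedCounts d;

    // dims counts real elements only; complexity does not affect it
    d.numberOfElementsReal = 1;
    for (std::size_t i = 0; i < dims.size(); i++)
        d.numberOfElementsReal *= dims[i];

    // assumption: complex data doubles the element count
    d.numberOfElementsTotal = d.numberOfElementsReal * (complex ? 2 : 1);
    d.numberOfBytesReal = d.numberOfElementsReal * bytesPerElement;
    d.numberOfBytesTotal = d.numberOfElementsTotal * bytesPerElement;
    d.scalar = (d.numberOfElementsReal == 1);
    d.realScalar = (d.numberOfElementsTotal == 1);
    return d;
}
```

Under these assumptions a complex 2x4 array of 8-byte elements has 8 real elements, 16 elements in total and 128 bytes in total, and a single complex value is scalar but not realScalar.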

Accessor

C/C++ API (Data)
struct Accessor
{
    bool isAttached() const;
    bool isPresent() const;
    const char* getName() const;
    UINT32 getFlags() const;
    void selectSet(Symbol hSet);

    // read the structure
    const Structure* getStructure();
};
bool isAttached() const
Return true if Port is attached, false otherwise.
bool isPresent() const
Return true if Data is present in the Port, false otherwise (always returns false if Accessor is not yet attached). Note that this is not the same thing as the data being Due (see Design Principles).
const char* getName() const
Return the name of the Data object in the Port, if attached (raises an error if not attached).
UINT32 getFlags() const
Return the flags of the Data object in the Port, if attached (raises an error if not attached).
void selectSet(Symbol hSet)
Select the Set where the target Port will be created or found (otherwise, the default set is assumed). This call raises an error if it is made after attachment.
const Structure* getStructure()
Get the structure of the Data object (raises an error if not attached).

Input

C/C++ API (Data)
struct Input : public Accessor
{
    // attach the input port (in EVENT_INIT_CONNECT)
    const Input& attach(Symbol hComponent, std::string name);
    const Input& attach(Symbol hComponent, UINT32 index);

    // attach to a port that may not be there
    bool tryAttach(Symbol hComponent, std::string name);

    // validate the input port (in EVENT_INIT_CONNECT)
    void validateStructure(TYPE type);
    void validateStructure(TYPE type, const Dims& dims);
    void validateStructure(TYPE type, const Dimensions& dims);

    // specify content format to be returned from getContent()
    void setReadFormat(TYPE type);

    // read the port (in EVENT_RUN_SERVICE)
    const void* getContent();
    UINT64 getContent(const void*& real, const void*& imag);
};
const Input& attach(Symbol hComponent, std::string name)
Attach the Accessor to a Port specified by name (pass a handle to the Process).
const Input& attach(Symbol hComponent, UINT32 index)
Attach the Accessor to a Port specified by index (pass a handle to the Process).
bool tryAttach(Symbol hComponent, std::string name)
Try to attach, but return false (rather than raising an error) if the named Port is not present on the input Set.
Input& now()
Assert that the Port is both attached and Due (i.e. raise an error if not).
void validateStructure(TYPE type)
Assert that the Data object has the specified type. type should include one or both of a numeric format constant (e.g. TYPE_DOUBLE) and a complex constant (TYPE_REAL or TYPE_COMPLEX). If any constant is absent, that category is not validated, allowing you to receive, for instance, DOUBLE-type data of either real or complex type.
void validateStructure(TYPE type, const Dims& dims)
void validateStructure(TYPE type, const Dimensions& dims)
Assert that the Data object has the specified type and dimensions. Type is handled as above. dims can include specific dimension sizes, DIM_ANY (accept anything), DIM_NONZERO (accept anything non-zero), and the final entry can be DIM_ELLIPSIS to indicate that zero or more further dimensions may be present without failing validation. Pass TYPE_UNSPECIFIED to validate only the dimensions.
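
The matching rules for dims can be sketched as a small predicate. This is an illustration of the rules as stated above, not the library's implementation; the sentinel values given to DIM_ANY, DIM_NONZERO and DIM_ELLIPSIS here are stand-ins chosen for the sketch, and dimsMatch() is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in sentinel values; the real constants are defined by BRAHMS.
const std::int64_t DIM_ANY      = -1;
const std::int64_t DIM_NONZERO  = -2;
const std::int64_t DIM_ELLIPSIS = -3;

// Return true if the actual dimensions satisfy the pattern.
bool dimsMatch(const std::vector<std::int64_t>& pattern,
               const std::vector<std::int64_t>& actual)
{
    // a trailing DIM_ELLIPSIS allows zero or more further dimensions
    std::size_t fixed = pattern.size();
    const bool open = fixed > 0 && pattern[fixed - 1] == DIM_ELLIPSIS;
    if (open) fixed--;

    if (open ? actual.size() < fixed : actual.size() != fixed)
        return false;

    for (std::size_t d = 0; d < fixed; d++)
    {
        assert(pattern[d] != DIM_ELLIPSIS);  // only valid as final entry
        if (pattern[d] == DIM_ANY) continue;
        if (pattern[d] == DIM_NONZERO)
        {
            if (actual[d] == 0) return false;
            continue;
        }
        if (pattern[d] != actual[d]) return false;
    }
    return true;
}
```

For example, the pattern {3, DIM_ELLIPSIS} accepts a 3-vector, a 3x4 matrix and a 3x4x5 array, whereas {3, DIM_NONZERO} rejects a 3x0 array.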
void setReadFormat(TYPE type)
Set the required storage format for all subsequent calls to getContent(). Pass type as TYPE_UNSPECIFIED to get the buffer in whatever is its native form (check the return value of getStructure()->type to identify that form). Otherwise, specify one of TYPE_CPXFMT_ADJACENT, TYPE_CPXFMT_INTERLEAVED and/or one of TYPE_ORDER_COLUMN_MAJOR, TYPE_ORDER_ROW_MAJOR to cause automatic conversion, if necessary, to the specified form. Before you call this function, the default form is (TYPE_CPXFMT_ADJACENT | TYPE_ORDER_COLUMN_MAJOR). Therefore, if you make no call, the memory block returned from getContent() will always be a pointer to a block of this type. If you want to accept a wider range of formats for performance reasons, you will have to call this function and specify what you accept.
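
If you do choose to accept the buffer in its native form, your code has to branch on the format reported in the structure's type. The sketch below shows one way to do that for the complex storage format. The bit values assigned to the TYPE_CPXFMT_* constants are stand-ins chosen for this sketch (the real values come from the BRAHMS headers), and sumImaginary() is a hypothetical client function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint32_t TYPE;

// Stand-in values for this sketch only.
const TYPE TYPE_CPXFMT_MASK        = 0x3;
const TYPE TYPE_CPXFMT_ADJACENT    = 0x1;
const TYPE TYPE_CPXFMT_INTERLEAVED = 0x2;

// Sum the imaginary parts of n complex doubles stored in either layout.
double sumImaginary(const double* data, std::size_t n, TYPE type)
{
    const TYPE format = type & TYPE_CPXFMT_MASK;
    assert(format == TYPE_CPXFMT_ADJACENT ||
           format == TYPE_CPXFMT_INTERLEAVED);

    double sum = 0.0;
    for (std::size_t i = 0; i < n; i++)
    {
        if (format == TYPE_CPXFMT_ADJACENT)
            sum += data[n + i];      // imaginary block follows real block
        else
            sum += data[2 * i + 1];  // imaginary part follows each real part
    }
    return sum;
}
```

Handling both branches avoids a reordering pass, at the cost of more client code; requesting a single format through setReadFormat() removes the branch entirely.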
const void* getContent()
Get the content of the Data object (raises an error if not attached) for reading only, or NULL if the object has zero elements. Use in EVENT_RUN_SERVICE. Automatic conversion is performed internally so that the buffer returned has the required complex storage format and array ordering (adjacent/interleaved and column-major/row-major, see Notes and setReadFormat()).
UINT64 getContent(const void*& real, const void*& imag)
Get the content of the Data object as separate real and imaginary data blocks (raises an error if not attached), or NULL pointers if the object has zero elements. imag will also be NULL if no complex data is present, or if the complex storage format is TYPE_CPXFMT_INTERLEAVED. Use in EVENT_RUN_SERVICE. No conversion is performed. This function will be deprecated in a future release of this interface: use the above function instead (real and imaginary data are always packed into one contiguous memory block).
Notes
  • validateStructure() will also validate TYPE_CPXFMT_MASK constants and TYPE_ORDER_MASK constants. However, since automatic conversion is provided, validating on these categories only prevents you from receiving certain classes of data, with no benefit. Instead, specify the format you require to setReadFormat(), and you can receive the data in the format you prefer.

Output

C/C++ API (Data)
struct Output : public Accessor
{
    // prepare for creation (in EVENT_INIT_CONNECT)
    void setName(const char* name);
    void setSampleRate(SampleRate sampleRate);

    // create the output port (in EVENT_INIT_CONNECT)
    void create(Symbol hComponent);
    void create(Symbol hComponent, const Accessor& accessor);

    // structure the data in the created port (in EVENT_INIT_CONNECT)
    void setStructure(TYPE type, const Dims& dims);
    void setStructure(TYPE type, const Dimensions& dims);

    // get pointers to write the port directly (in EVENT_RUN_SERVICE)
    void* getContent();
    UINT64 getContent(void*& real, void*& imag);

    // write the output port (in EVENT_RUN_SERVICE)
    void setContent(const void* real, const void* imag = 0, UINT32 bytes = 0);
};
void setName(const char* name)
Set the name that the target Port will be given on creation (else, default name supplied by framework will be used).
void setSampleRate(SampleRate sampleRate)
Set the sample rate that the target Port will be given on creation (else, process sample rate will be used).
void create(Symbol hComponent)
Create the Port (pass a handle to the Process).
void create(Symbol hComponent, const Accessor& accessor)
Create the Port as a copy of the Data in the Port associated with the passed Accessor (pass a handle to the Process). The structure of the new Data will be already set to be the same as that passed, so it will not be necessary to call setStructure().
void setStructure(TYPE type, const Dims& dims)
void setStructure(TYPE type, const Dimensions& dims)
Set the structure of a newly created Data object. This call must be made within the context of the same event in which the Port was created. Element type (e.g. TYPE_DOUBLE) and complexity (e.g. TYPE_REAL) must be specified. Complex storage format, if unspecified, defaults to TYPE_CPXFMT_ADJACENT. Array ordering, if unspecified, defaults to TYPE_ORDER_COLUMN_MAJOR.
void* getContent()
Get the content of the Data object (raises an error if not attached), or NULL if the object has zero elements. Returned pointer can be used for writing. Use in EVENT_RUN_SERVICE. Since this pointer points to the writeable copy of this data, the buffer must be written in the object's native storage format as specified in an earlier call to setStructure().
UINT64 getContent(void*& real, void*& imag)
Get the content of the Data object as separate real and imaginary data blocks (raises an error if not attached), or NULL pointers if the object has zero elements. imag will be NULL if no complex data is present, or if the storage format is TYPE_CPXFMT_INTERLEAVED. Use in EVENT_RUN_SERVICE. Returned pointers can be used for writing. Since this pointer points to the writeable copy of this data, the buffer must be written in the object's native storage format as specified in an earlier call to setStructure(). This function will be deprecated in a future release of this interface: use the above function instead (real and imaginary data are always packed into one contiguous memory block).
void setContent(const void* real, const void* imag = 0, UINT32 bytes = 0)
Set the content of a Data object. Alternative to writing to the pointer returned by getContent(), where necessary, but requires a copy of the memory block(s). If object is real, or complex data format is TYPE_CPXFMT_INTERLEAVED, imag should be NULL. If imag is non-NULL, real/imaginary data is copied from real/imag; if imag is NULL, and complex data is required, all the data (real and imaginary) is copied from real (i.e. twice as much, if complex). bytes, if supplied, must be equal to the number of real bytes (pass zero to skip this check and simply copy out the number of bytes expected to be passed). Use in EVENT_RUN_SERVICE. This function will be deprecated in a future release of this interface: use the above function instead (real and imaginary data are always packed into one contiguous memory block).
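
The copy rules for real and imag can be sketched as below for an adjacent-format destination. This is an interpretation of the description above, not the library's code: copyIn() is a hypothetical function, the destination is modelled as a byte vector, and a bytes mismatch is reported here with std::invalid_argument purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Copy caller data into an adjacent-format destination buffer.
// bytesReal is the size of the real part of the object in bytes.
void copyIn(std::vector<unsigned char>& dest, std::size_t bytesReal,
            bool complex, const void* real, const void* imag = 0,
            std::size_t bytes = 0)
{
    assert(real != 0);

    // optional sanity check on the caller's idea of the size
    if (bytes != 0 && bytes != bytesReal)
        throw std::invalid_argument("bytes does not match real byte count");

    dest.resize(complex ? 2 * bytesReal : bytesReal);
    if (dest.empty()) return;

    if (imag == 0)
    {
        // everything comes from real (twice as much, if complex)
        std::memcpy(dest.data(), real, dest.size());
    }
    else
    {
        // separate blocks: real first, then imaginary
        assert(complex);
        std::memcpy(dest.data(), real, bytesReal);
        std::memcpy(dest.data() + bytesReal, imag, bytesReal);
    }
}
```

Passing two separate blocks, or one block that already holds the real data followed by the imaginary data, produces the same destination contents.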

Generic Interface

The structure string has the form <type>/<REAL|COMPLEX>/<CPXFMT_ADJACENT|CPXFMT_INTERLEAVED>/<COLUMN_MAJOR|ROW_MAJOR>/<comma-separated-dims>, where <type> is something like "UINT32" or "DOUBLE". <CPXFMT_ADJACENT|CPXFMT_INTERLEAVED> can be absent, in which case complex storage format must be implied (see EventGenericStructure). <ROW_MAJOR|COLUMN_MAJOR> can be absent, in which case storage order must be implied (see EventGenericStructure). For example, DOUBLE/REAL/2,2 is a standard 2x2 numeric matrix. The numeric equivalent form is the native form of this data object.
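
To illustrate the layout of the structure string, the sketch below splits one into its fields. It is a minimal parser written for this page, with no validation of field values; ParsedStructure and parseStructure() are hypothetical names and are not part of the Generic Interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

struct ParsedStructure
{
    std::string type;           // e.g. "DOUBLE"
    std::string complexity;     // "REAL" or "COMPLEX"
    std::string complexFormat;  // empty if absent
    std::string ordering;       // empty if absent
    std::vector<std::uint64_t> dims;
};

ParsedStructure parseStructure(const std::string& s)
{
    // split on '/'
    std::vector<std::string> fields;
    std::stringstream ss(s);
    std::string field;
    while (std::getline(ss, field, '/'))
        fields.push_back(field);
    assert(fields.size() >= 3 && fields.size() <= 5);

    ParsedStructure p;
    p.type = fields[0];
    p.complexity = fields[1];

    // optional fields sit between complexity and dims
    for (std::size_t i = 2; i + 1 < fields.size(); i++)
    {
        if (fields[i].compare(0, 7, "CPXFMT_") == 0)
            p.complexFormat = fields[i];
        else
            p.ordering = fields[i];
    }

    // the final field is the comma-separated list of dimensions
    std::stringstream ds(fields.back());
    while (std::getline(ds, field, ','))
        p.dims.push_back(std::stoull(field));
    return p;
}
```

Parsing DOUBLE/REAL/2,2 yields type "DOUBLE", complexity "REAL", empty format and ordering fields, and dims {2, 2}.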

Example

Creating an output with an Output Accessor, and writing it during EVENT_RUN_SERVICE.

C++ Source Code (against 1199)
Output output;
...
case EVENT_INIT_CONNECT:
{
    // instantiate output, and initialise it
    output.setName("out");
    output.create(hComponent);
    output.setStructure(TYPE_DOUBLE | TYPE_REAL, Dims(2, 4));

    // ok
    return C_OK;
}

case EVENT_RUN_SERVICE:
{
    // alternative 1: prepare my data in a buffer, and copy it into
    // the output. this is less efficient, but is indicated if the
    // buffer must persist across calls to this event.
    MY_DATA buffer;
    buffer.set(...);
    output.setContent((void*) &buffer);

    // alternative 2: obtain a pointer to the output, and write there
    // directly. this is more efficient, and is indicated in most cases.
    MY_DATA* p = (MY_DATA*) output.getContent();
    p->set(...);

    // ok
    return C_OK;
}

Example

Attaching to an input with an Input Accessor. Note that in this example, we attach to the Port in EVENT_INIT_PRECONNECT, but we cannot expect the Port to be Due until EVENT_INIT_CONNECT. We also create an output with the same structure as the input.

C++ Source Code (against 1199)
Input input;
Output output;
...
case EVENT_INIT_PRECONNECT:
{
    // attach
    input.attach(hComponent, "in");

    // ok
    return C_OK;
}

case EVENT_INIT_CONNECT:
{
    // on last call, we know the Port is definitely Due
    if (event->flags & F_LAST_CALL)
    {
        // get structure
        const numeric::Structure* structure = input.getStructure();

        // store information
        type = structure->typeElement;
        numEls = structure->numberOfElementsTotal;
        numBytes = structure->numberOfBytesTotal;

        // create output as copy of input - note that we have
        // to ask explicitly for the name to be taken as well
        output.setName(input.getName());
        output.create(hComponent, input);
    }

    // ok
    return C_OK;
}

Example

Notes
  • The Accessor cannot be copied (or assigned) once it is attached to a port (for performance reasons); this is important if you store an array of Accessors in an STL container, because resizing STL containers generally causes copies to be made. The examples below illustrate.

The code below will work just fine, where we do all the resizing (i.e. copying) before we start attaching Accessors to Ports.

C++ Source Code (against 1199)
case EVENT_INIT_PRECONNECT:
{
    // accessors
    vector<Input> inputs;

    // create all
    inputs.resize(iif.getNumberOfPorts());

    // attach all
    for (UINT32 s=0; s<inputs.size(); s++)
        inputs[s].attach(hComponent, s);

    // ok
    return C_OK;
}

The code below will not (in general) work, and will raise an exception, because we intersperse copying with attaching.

C++ Source Code (against 1199)
case EVENT_INIT_PRECONNECT:
{
    // accessors
    vector<Input> inputs;

    // create/attach one by one
    for (UINT32 s=0; s<iif.getNumberOfPorts(); s++)
    {
        inputs.resize(s + 1);
        inputs[s].attach(hComponent, s);
    }

    // ok
    return C_OK;
}
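
The difference between the two patterns can be reproduced outside BRAHMS with a stand-in class. In the sketch below, MockAccessor is hypothetical: it simply refuses to be copied once attached, which is an assumption about how the real Accessor reports the problem. The vector behaviour shown is standard: growing a vector beyond its capacity copies the existing elements into new storage.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for an Accessor: copying is refused once attached.
class MockAccessor
{
public:
    MockAccessor() : attached_(false) {}
    MockAccessor(const MockAccessor& other) : attached_(false)
    {
        if (other.attached_)
            throw std::logic_error("cannot copy an attached accessor");
    }
    void attach() { attached_ = true; }
    bool isAttached() const { return attached_; }

private:
    bool attached_;
};

// Resize once, then attach: no attached element is ever copied.
// Returns the number of accessors attached.
std::size_t resizeThenAttach(std::size_t count)
{
    std::vector<MockAccessor> inputs;
    inputs.resize(count);
    assert(inputs.size() == count);
    for (std::size_t s = 0; s < inputs.size(); s++)
        inputs[s].attach();
    return inputs.size();
}

// Grow and attach one by one: as soon as the vector reallocates, it
// copies an attached element. Returns true if a copy was refused.
bool growAndAttachFails(std::size_t count)
{
    std::vector<MockAccessor> inputs;
    try
    {
        for (std::size_t s = 0; s < count; s++)
        {
            inputs.resize(s + 1);
            inputs[s].attach();
        }
    }
    catch (const std::logic_error&)
    {
        return true;
    }
    return false;
}
```

With a single port the second pattern happens to succeed, because nothing attached has yet been copied; with several ports it fails at the first reallocation, which is why the failure is described as occurring "in general".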