resulttype funcname ( parameterlist ) := BEGINC++
code
resulttype | The ECL return value type of the C++ function. |
funcname | The ECL definition name of the function. |
parameterlist | A comma separated list of the parameters to pass to the function. |
code | The C++ function source code. |
The BEGINC++ structure makes it possible to add in-line C++ code to your ECL. This is useful where string or bit processing would be complicated in ECL, and would be more easily done in C++, typically for a one-off use. For more commonly used C++ code, writing a plugin would be a better solution (see the External Service Implementation discussion).
The implementation must be written to be thread safe and any calls to external libraries must be made to thread safe versions of those libraries.
You can use EMBED instead of BEGINC++ to embed C++ code and specify additional options (for example, DISTRIBUTED) using this form:
myFunction(string name) := EMBED(C++ [: options])
... text
ENDEMBED
WARNING: This feature could create memory corruption and/or security issues, so great care and forethought are advised--consult with Technical Support before using.
Types are passed as follows:
//The following typedefs are used below:
typedef unsigned size32_t;
typedef wchar_t UChar; [ unsigned short in linux ]
The following list describes the mappings from ECL to C++. For embedded C++ the parameters are always converted to lower case, and capitalized in conjunctions (see below).
ECL C++ [Linux in brackets]
BOOOLEAN xyz bool xyz
INTEGER1 xyz signed char xyz
INTEGER2 xyz int16_t xyz
INTEGER4 xyz int32_t xyz
INTEGER8 xyz signed __int64 xyz [ long long ]
UNSIGNED1 xyz unsigned char xyz
UNSIGNED2 xyz uint16_t xyz
UNSIGNED4 xyz uint32_t xyz
UNSIGNED8 xyz unsigned __int64 xyz [ unsigned long long xyz ]
REAL4 xyz float xyz
REAL/REAL8 xyz double xyz
DATA xyz size32_t lenXyz, void * xyz
STRING xyz size32_t lenXyz, char * xyz
VARSTRING xyz char * xyz;
QSTRING xyz size32_t lenXyz, char * xyz
UNICODE xyz size32_t lenXyz, UChar * xyz
VARUNICODE xyz UChar * xyz
DATA<nn> xyz void * xyz
STRING<nn> xyz char * xyz
QSTRING<nn> xyz char * xyz
UNICODE<nn> xyz UChar * xyz
SET OF ... xyz bool isAllXyz, size32_t lenXyz, void * xyz
Note that strings of unknown length are passed differently from those with a known length. A variable length input string is passed as a number of characters, not the size (i.e. qstring/unicode), followed by a pointer to the data, like this (size32_t is an UNSIGNED4):
STRING ABC -> size32_t lenAbc, const char * abc;
UNICODE ABC -> size32_t lenABC, const UChar * abc;
A dataset is passed as a size/pointer pair. The length gives the size of the following dataset in bytes. The same naming convention is used:
DATASET(r) ABC -> size32_t lenAbc, const void * abc
The rows are accessed as x+0, x + length(row1), x + length(row1) + length(row2)
LINKCOUNTED DATASET(r) ABC -> size32_t countAbc, const byte * * abc
The rows are accessed as x[0], x[1], x[2]
NOTE: variable length strings within a record are stored as a 4 byte number of characters, followed by the string data.
Sets are passed as a set of parameters (all, size, pointer):
SET OF UNSIGNED4 ABC -> bool isAllAbc, size32_t lenAbc, const void * abc
Return types are handled as C++ functions returning the same types with some exceptions. The exceptions have some extra initial parameters to return the results in:
ECL C++ [Linux in brackets]
DATA xyz size32_t & __lenResult, void * & __result
STRING xyz size32_t & __lenResult, char * & __result
CONST STRING xyz size32_t lenXyz, const char * xyz
QSTRING xyz size32_t & __lenResult, char * & __result
UNICODE xyz size32_t & __lenResult, UChar * & __result
CONST UNICODE xyz size32_t & __lenResult, const UChar * & __result
DATA<nn> xyz void * __result
STRING<nn> xyz char * __result
QSTRING<nn> xyz char * __result
UNICODE<nn> xyz UChar * __result
SET OF ... xyz bool __isAllResult, size32_t & __lenResult, void * & __result
DATASET(r) size32_t & __lenResult, void * & __result
LINKCOUNTED DATASET(r)
size32_t & __countResult, byte * * & __result
STREAMED DATASET(r)
returns a pointer to an IRowStream interface
(see the eclhelper.hpp include file for the definition)
For example,
STRING process(STRING value, INTEGER4 len)
has the prototype:
void process(size32_t & __lenResult, char * & __result,
size32_t lenValue, char * value, int len);
A function that takes a string parameter should also have the type prefixed by const in the ECL code so that modern compilers don't report errors when constant strings are passed to the function.
BOOLEAN isUpper(const string mystring) := BEGINC++
size_t i=0;
while (i < lenMystring)
{
if (!isupper((byte)mystring[i]))
return false;
i++;
}
return true;
ENDC++;
isUpper('JIM');
Parameters can also include streamed datasets.
If stream is specified on the dataset then the parameter is passed as an IRowStream. The next row from the dataset is obtained by calling:
dataset->nextRow();
After it has been processed the row must be freed by calling
rtlReleaseRow(next);
For example:
traceDataset(STREAMED DATASET(r) ds, BOOLEAN isLocal = FALSE) := EMBED(C++)
#include <stdio.h>
#body
for(;;)
{
const byte * next = (const byte *)ds->nextRow();
if (!next)
return;
unsigned __int64 id = *(__uint64 *)(next);
size32_t lenName = *(size32_t *)(next + sizeof(__uint64));
const char * name = (char *)(next + sizeof(__uint64) + sizeof(size32_t));
printf("id(%u) name(%.*s)\n", (unsigned)id, lenName, name);
rtlReleaseRow(next);
}
ENDEMBED;
If the result of a c++ function is a streamed dataset, then it needs to return an instance of an IRowStream interface. The function will also be passed an extra implicit parameter:
IEngineRowAllocator * _resultAllocator
which is used to allocate the rows that are returned from the function.
For example:
// This function takes two streamed inputs and outputs the result of two values
// from the left multiplied together and added to a row from the right.
STREAMED DATASET(r) myDataset(STREAMED DATASET(r) ds1, STREAMED DATASET(r) ds2)
:= EMBED(C++ : activity)
#include <stdio.h>
#body
class MyStreamInlineDataset : public RtlCInterface, implements IRowStream
{
public:
MyStreamInlineDataset(IEngineRowAllocator * _resultAllocator, IRowStream * _ds1,
IRowStream * _ds2)
: resultAllocator(_resultAllocator), ds1(_ds1), ds2(_ds2)
{
}
RTLIMPLEMENT_IINTERFACE virtual const void *nextRow() override
{
const byte * next1a = (const byte *)ds1->nextRow();
if (!next1a)
return nullptr;
const byte * next1b = (const byte *)ds1->nextRow();
const byte * next2 = (const byte *)ds2->nextRow();
if (!next1b || !next2)
rtlFailUnexpected();
unsigned __int64 value1a = *(const unsigned __int64 *)next1a;
unsigned __int64 value1b = *(const unsigned __int64 *)next1b;
unsigned __int64 value2 = *(const unsigned __int64 *)next2;
rtlReleaseRow(next1a);
rtlReleaseRow(next1b);
rtlReleaseRow(next2);
unsigned __int64 result = value1a * value1b + value2;
RtlDynamicRowBuilder rowBuilder(resultAllocator);
byte * row = rowBuilder.getSelf();
*(__uint64 *)(row) = result;
return rowBuilder.finalizeRowClear(sizeof(unsigned __int64));
}
virtual void stop() override
{
ds1->stop();
ds2->stop();
}
protected:
Linked<IEngineRowAllocator> resultAllocator;
IRowStream * ds1;
IRowStream * ds2;
}; return new MyStreamInlineDataset(_resultAllocator, ds1, ds2);
ENDEMBED;
Note: If the resulting row does not have a fixed size, you should call:
byte * row = rowBuilder.ensureCapacity(<totalSize>, nullptr);
instead of:
byte * row = rowBuilder.getSelf();
This code uses a RtlDynamicRowBuilder which is a class used by the code generator. Instead of using the RtlDynamicRowBuilder class, you could directly call resultAllocator->createRow().
When a data type is included in an input row, rather than being passed as a parameter, the format is the same as the parameters, except that instead of having a pointer to the string etc., the string follows the 4-byte length. The data in the row is not aligned; that is, it has packing of 1.
#option pure | By default, embedded C++ functions are assumed to have side-effects, which means the generated code won't be as efficient as it might be since the calls can't be shared. Adding #option pure inside the embedded C++ code causes it to be treated as a pure function without side effects. |
#option once | Indicates the function has no side effects and is evaluated at query execution time, even if the parameters are constant, allowing the optimizer to make more efficient calls to the function in some cases. |
#option action | Indicates side effects, requiring the optimizer to keep all calls to the function. |
#body | Delimits the beginning of executable code. All code that precedes #body (such as #include) is generated outside the function definition; all code that follows it is generated inside the function definition. |
Example:
//static int add(int x,int y) {
INTEGER4 add(INTEGER4 x, INTEGER4 y) := BEGINC++
#option pure
return x + y;
ENDC++;
OUTPUT(add(10,20));
//static void reverseString(size32_t & __lenResult,char * & __result,
// size32_t lenValue,char * value) {
STRING reverseString(STRING value) := BEGINC++
size32_t len = lenValue;
char * out = (char *)rtlMalloc(len);
for (unsigned i= 0; i < len; i++)
out[i] = value[len-1-i];
__lenResult = len;
__result = out;
ENDC++;
OUTPUT(reverseString('Kevin'));
// This is a function returning an unknown length string via the
// special reference parameters __lenResult and __result
//this function demonstrates #body, allowing #include to be used
BOOLEAN nocaseInList(STRING search,
SET OF STRING values) := BEGINC++
#include <string.h>
#body
if (isAllValues)
return true;
const byte * cur = (const byte *)values;
const byte * end = cur + lenValues;
while (cur != end)
{
unsigned len = *(unsigned *)cur;
cur += sizeof(unsigned);
if (lenSearch == len && memicmp(search, cur, len) == 0)
return true;
cur += len;
}
return false;
ENDC++;
//and another example, generating a variable number of Xes
STRING buildString(INTEGER4 value) := BEGINC++
char * out = (char *)rtlMalloc(value);
for (unsigned i= 0; i < value; i++)
out[i] = 'X';
__lenResult = value;
__result = out;
ENDC++;
//examples of embedded, LINKCOUNTED, and STREAMED DATASETs
inRec := { unsigned id };
doneRec := { unsigned4 execid };
out1rec := { unsigned id; };
out2rec := { real id; };
DATASET(doneRec) doSomethingNasty(DATASET(inRec) input) := BEGINC++
__lenResult = 4;
__result = rtlMalloc(8);
*(unsigned *)__result = 91823;
ENDC++;
DATASET(out1Rec) extractResult1(doneRec done) := BEGINC++
const unsigned id = *(unsigned *)done;
const unsigned cnt = 10;
__lenResult = cnt * sizeof(unsigned __int64);
__result = rtlMalloc(__lenResult);
for (unsigned i=0; i < cnt; i++)
((unsigned __int64 *)__result)[i] = id + i + 1;
ENDC++;
LINKCOUNTED DATASET(out2Rec) extractResult2(doneRec done) := BEGINC++
const unsigned id = *(unsigned *)done;
const unsigned cnt = 10;
__countResult = cnt;
__result = _resultAllocator->createRowset(cnt);
for (unsigned i=0; i < cnt; i++)
{
size32_t allocSize;
void * row = _resultAllocator->createRow(allocSize);
*(double *)row = id + i + 1;
__result[i] = (byte *)_resultAllocator->finalizeRow(allocSize, row, allocSize);
}
ENDC++;
STREAMED DATASET(out1Rec) extractResult3(doneRec done) := BEGINC++
class myStream : public IRowStream, public RtlCInterface
{
public:
myStream(IEngineRowAllocator * _allocator, unsigned _id) : allocator(_allocator), id(_id), idx(0) {}
RTLIMPLEMENT_IINTERFACE
virtual const void *nextRow()
{
if (idx >= 10)
return NULL;
size32_t allocSize;
void * row = allocator->createRow(allocSize);
*(unsigned __int64 *)row = id + ++idx;
return allocator->finalizeRow(allocSize, row, allocSize);
}
virtual void stop() {}
private:
Linked<IEngineRowAllocator> allocator;
unsigned id;
unsigned idx;
};
#body
const unsigned id = *(unsigned *)done;
return new myStream(_resultAllocator, id);
ENDC++;
ds := DATASET([1,2,3,4], inRec);
processed := doSomethingNasty(ds);
out1 := NORMALIZE(processed, extractResult1(LEFT), TRANSFORM(RIGHT));
out2 := NORMALIZE(processed, extractResult2(LEFT), TRANSFORM(RIGHT));
out3 := NORMALIZE(processed, extractResult3(LEFT), TRANSFORM(RIGHT));
SEQUENTIAL(OUTPUT(out1),OUTPUT(out2),OUTPUT(out3));
See Also: External Service Implementation, EMBED Structure