Articles tagged with 'Custom Functions'

Unicode

If you haven't heard of Unicode, you have certainly seen it. You are seeing it now, since Unicode is the standard for encoding the characters displayed in web browsers and on computers in general. As of this writing, version 10 of the standard includes more than 136,000 characters from multiple writing systems, and Medidata Rave supports the Unicode standard both for study designs and for data collection. So what is the problem?

Actually, there is no problem so long as you know what characters from the Unicode standard are being used in your study, where they are and how they display and appear in outputs.

Unicode in Study Design

If you are building your study in Japanese or localizing it to Russian, Armenian or Greek, then having the full set of Unicode characters to use is vital. For studies in English you may want to stick to the set of 128 characters known as ASCII (A-Z, a-z, 0-9 and common symbols). But sometimes you can be surprised by characters that aren՚t what you think they are…

Did you spot those alternative characters hiding in the last sentence?

characters that aren՚t what you think they are…

vs:

characters that aren't what you think they are...

Still can't see it? Hint: It's the ՚ and the … The differences are (or at least, may be) subtle on the screen but when we render them in a Rave PDF they appear quite different:

[Figure: Apostrophe and Ellipsis]

It is very hard for the human eye to distinguish between these characters the way they are rendered in browsers, but they are different characters. The font that Rave uses to display characters won't have a way to render all 136,000 possible characters, so it is best (in English studies at least) to stick to the limited ASCII set of characters that all fonts cover well.

Be especially wary of text that is cut and pasted from web pages, Word and Excel, or from PDF documents. It is very tempting to copy verbatim from a Protocol document, but word processors use all kinds of character variants to make writing look better on the screen or in print. You can't even trust the spaces in these documents, because Unicode defines at least 20 different "empty" space characters of different widths, including one that has no width at all (i.e. it is invisible!).

Tip: TrialGrid Diagnostic 70 will identify and highlight non-ASCII characters, even invisible ones
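If you want to see exactly which characters are hiding in a piece of text before it goes into a study design, a small script can list them. The following is a minimal sketch in Python (not a Rave Custom Function, and not how TrialGrid's diagnostic is implemented). The function name `find_non_ascii` and the sample string are made up for illustration; the sample uses escape sequences for an Armenian apostrophe, a horizontal ellipsis and a zero-width space.

```python
import unicodedata


def find_non_ascii(text):
    """Return (index, character, Unicode name) for each non-ASCII character."""
    findings = []
    for index, char in enumerate(text):
        if ord(char) > 127:  # ASCII covers code points 0-127
            name = unicodedata.name(char, "UNNAMED")
            findings.append((index, char, name))
    return findings


# Illustrative sample: U+055A, U+2026 and the invisible U+200B.
sample = "aren\u055at what you think they are\u2026 and a\u200bgap"
for index, char, name in find_non_ascii(sample):
    print(f"{index}: U+{ord(char):04X} {name}")
```

Printing the code point and the official Unicode name makes even the invisible characters show up, which is usually enough to decide whether to replace them with a plain ASCII equivalent.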

Unicode in Study Data

If unexpected characters in study design can cause strange PDF outputs, unexpected or unwanted characters in the clinical data can be real poison. A study that collects data in the English language might expect that all the text data in the study is in ASCII. However, Rave will accept any Unicode character entered into a text field, so the same problems of cut-and-pasted content can occur. Rave is 100% Unicode compatible, so it will happily take, store and output any Unicode content, but SAS and other analysis programs may have to be configured to accept non-ASCII content.

In English studies you want to identify non-ASCII content at the point of entry. This can only be done with a Custom Function that looks at the content of a text field and determines whether any of the characters are outside the ASCII range. A quick search of the web will turn up simple code that returns true if it finds a non-ASCII character in the input string:

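    // Custom Function body: returns true if the text contains a non-ASCII character.
    // In practice, take the string from datapoint.Data or datapoint.StandardValue.
    string s = "characters that aren՚t what you think they are…";

    foreach (char c in s)
    {
        // ASCII covers code points 0-127; anything above that is non-ASCII.
        if (c > 127)
        {
            return true;
        }
    }
    return false;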

Tip: TrialGrid contains a CQL extension that makes this as easy as using FieldName.IsNotAscii in an Edit Check.

Summary

Rave handles Unicode really well and web browsers are very good at displaying a wide range of Unicode characters, but not all characters can be displayed by all systems, so be careful what you put into your study design and what you collect in your study data. Being able to cut and paste text between systems is great for productivity but can have unintended consequences.

Save 20-30% on Edit Check Builds

Andrew and I are on a mission to reduce the cost and effort of building Rave studies by 50%. It's an ambitious goal but nothing really worth doing is easy.

One of the most costly areas of study build is the writing and testing of Edit Checks. So let's take a look at Edit Checks and where the costs are.

Three levels of Edit Check logic

In the previous post we looked at the three levels of edit check logic:

  • Field Checks (Range, IsRequired, QueryFutureDate etc)
  • Configured Checks (Rave Edit Checks)
  • Custom Functions

Field Checks can be set up with a few clicks and some data entry for expected high and low ranges. They are extremely fast and easy to set up and require little or no testing since they are features of the validated Rave system. Field edit checks are so easy we're giving these a value of $1 for all the checks set on a field (Is Required, Simple Numeric Ranges, Cannot be a Future date etc.). That doesn't mean they literally cost $1 to include in your study. Depending on how you build, your staffing costs, how luxurious your offices are and so on, your price will vary. $1 is just a good baseline figure to compare other costs against.

Configured Checks are written using Rave's Edit Check editor which uses a postfix notation (1 1 + 2 isequalto). Rave Edit Checks are flexible and very functional but every Edit Check that is written has to be specified, written and tested making it more expensive to create than a simple Field Check. You also need a more skilled study builder to write a Configured check. So let's say, $10, on average, to create a configured edit check. Again, $10 is not a literal cost, it's just a comparison.

Lastly we have Custom Functions. These are written in C#, VB.NET or SQL and require some level of true programming expertise. Custom Functions are the fallback, the special tool in the toolbox for the truly complex situations. Besides the difficulty of hiring (and keeping) good programmers in the current technical market, Custom Functions have to be specified, reviewed for coding standards and performance impact, and tested. We'll say, conservatively, $50 for the development of a Custom Function. Once again, $50 is just a cost relative to the $1 field check, since the average Custom Function is at least 50x more complex than a field check.

Study Averages

There is no such thing as an average study: the size and complexity of a study depend on its Phase, Therapeutic Area and many other variables. But we have seen a lot of trials over the years, so we'll illustrate costs with what we think is fairly typical: a study with around 1,000 data entry fields, 1,000 Configured Edit Checks and 100 Custom Functions.

Given those numbers we can draw a graph that shows how the Edit Checks in our study stack up.

[Figure: typical Edit Checks by type]

A graph of the costs is also enlightening:

[Figure: overall cost by Edit Check type]

The bulk of the cost is in the Configured Edit Checks, but those 100 Custom Functions still account for about 30% of the total.
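The arithmetic behind those figures is simple enough to check yourself. Here is a small Python sketch that multiplies the example counts by the relative unit costs ($1, $10 and $50) used in this post; the function name `cost_breakdown` is just for illustration, and you can swap in your own counts and rates.

```python
# Relative unit costs from this post: $1 per field, $10 per Configured Check,
# $50 per Custom Function. These are comparison values, not literal prices.
UNIT_COST = {"Field Checks": 1, "Configured Checks": 10, "Custom Functions": 50}
COUNTS = {"Field Checks": 1000, "Configured Checks": 1000, "Custom Functions": 100}


def cost_breakdown(counts, unit_cost):
    """Return ({type: (cost, share_of_total)}, total) for the given study."""
    costs = {name: counts[name] * unit_cost[name] for name in counts}
    total = sum(costs.values())
    return {name: (cost, cost / total) for name, cost in costs.items()}, total


breakdown, total = cost_breakdown(COUNTS, UNIT_COST)
for name, (cost, share) in breakdown.items():
    print(f"{name}: ${cost:,} ({share:.1%})")
print(f"Total: ${total:,}")
```

With the example numbers the total is $16,000, of which Custom Functions contribute $5,000, a little over 31%.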

How to reduce the cost?

Field Edits are so easy there is little that could be done to make creating them more efficient but there is scope for improvement in Configured Edits and Custom Functions. How could we reduce the costs of those?

At TrialGrid we're attacking this challenge with CQL, the Clinical Query Language. CQL is an infix format for Rave Configured Edit Checks which is easy and fast to write and which has built-in testing facilities.

An Edit Check with CQL (infix) logic like:

A > B AND (C == D OR C == E)

would be translated into a Rave Edit Check (postfix) logic like:

 A
 B
 ISGREATERTHAN
 C
 D
 ISEQUALTO
 C
 E
 ISEQUALTO
 OR
 AND
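To give a feel for how an infix-to-postfix translation of this kind can work, here is a minimal Python sketch based on the classic shunting-yard algorithm. It is not TrialGrid's implementation. The operator mapping and the precedence order (comparisons bind tighter than AND, which binds tighter than OR) are assumptions made for this example, and only the operators that appear in the expression above are handled.

```python
import re

# Assumed mapping from infix symbols to Rave-style postfix function names.
POSTFIX_NAME = {">": "ISGREATERTHAN", "==": "ISEQUALTO", "AND": "AND", "OR": "OR"}
# Assumed precedence: comparisons bind tighter than AND, which binds tighter than OR.
PRECEDENCE = {">": 3, "==": 3, "AND": 2, "OR": 1}


def infix_to_postfix(expression):
    """Convert a simple infix expression to a list of postfix tokens."""
    tokens = re.findall(r"==|>|\(|\)|\w+", expression)
    output, operators = [], []
    for token in tokens:
        if token in PRECEDENCE:
            # Pop operators of equal or higher precedence (left-associative).
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[token]):
                output.append(POSTFIX_NAME[operators.pop()])
            operators.append(token)
        elif token == "(":
            operators.append(token)
        elif token == ")":
            while operators[-1] != "(":
                output.append(POSTFIX_NAME[operators.pop()])
            operators.pop()  # discard the "("
        else:
            output.append(token)  # operand such as a field name
    while operators:
        output.append(POSTFIX_NAME[operators.pop()])
    return output


print("\n".join(infix_to_postfix("A > B AND (C == D OR C == E)")))
```

Run on the expression above, the sketch prints the same eleven postfix lines shown in the listing.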

CQL also includes a set of built-in functions that automatically generate Custom Functions for you.

For example, we have been asked for an Edit Check that determines if a text field contains non-ASCII characters. Using it in a CQL expression is easy:

AETERM.IsNotAscii

The TrialGrid application takes care of generating the Custom Function. You'll still need some bespoke Custom Functions but fewer and fewer as time goes on and we build more into CQL.

We (conservatively) estimate that CQL can save a Clinical Programmer or Data Manager 50% of the effort of writing Configured Edit Checks, and that the generation of Custom Functions will reduce the number of Custom Functions that have to be hand-written by at least 10%. When we plug these numbers into the costings for our example study, the price drops from $16,000 to $10,500, a saving of 34%.
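The saving can be reproduced with a few lines of Python. This sketch applies the two estimates quoted above (50% less effort on Configured Checks, 10% fewer hand-written Custom Functions) to the relative costs of the example study. The function name `projected_cost` is illustrative, and the percentages are the estimates from this post, not measured results.

```python
def projected_cost(baseline, effort_reduction):
    """Apply a fractional effort reduction to each cost category."""
    return {
        name: cost * (1 - effort_reduction.get(name, 0.0))
        for name, cost in baseline.items()
    }


# Relative costs of the example study: 1,000 x $1, 1,000 x $10 and 100 x $50.
baseline = {"Field Checks": 1000, "Configured Checks": 10000, "Custom Functions": 5000}
# Estimates from the text: 50% on Configured Checks, 10% on Custom Functions.
reduction = {"Configured Checks": 0.50, "Custom Functions": 0.10}

after = projected_cost(baseline, reduction)
before_total = sum(baseline.values())
after_total = sum(after.values())
saving = 1 - after_total / before_total
print(f"${before_total:,.0f} -> ${after_total:,.0f} (saving {saving:.0%})")
```

Field Checks are left untouched, so all of the saving comes from the other two categories: $5,000 from Configured Checks and $500 from Custom Functions.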

[Figure: overall cost by Edit Check type, with CQL]

Who wouldn't want that?


Have you hit the Edit Check Wall?

Anyone who participates in endurance sports such as cycling or running will have heard of The Wall. It is the point at which the athlete exhausts their glycogen stores, resulting in sudden fatigue and the inability to go on.

As a Data Manager in a Study Builder role, the chances are that you have experienced something similar: the point at which the logic for an edit check becomes too complex and you have to fall back on a Custom Function. This is the Edit Check complexity "Wall."

Three levels of Edit Check logic

Essentially Rave has three levels of edit check logic:

  • Field Checks (Range, IsRequired, QueryFutureDate etc)
  • Configured Checks (Rave Edit Checks)
  • Custom Functions

Field Checks can be set up with a few clicks and some data entry for expected high and low ranges. They are extremely fast and easy to set up and require little or no testing since they are features of the validated Rave system.

Configured Checks are written using Rave's Edit Check editor which uses a postfix notation (1 1 + 2 isequalto). Rave Edit Checks are flexible and very functional but every Edit Check that is written has to be specified, written and tested making it an order of magnitude more expensive to create than a Field Check. The learning curve for Configured Checks is quite steep since most of us were taught infix notation in school (1 + 1 == 2).

Lastly we have Custom Functions. These are written in C#, VB.NET or SQL and require some level of true programming expertise. Custom Functions are the fallback, the special tool in the toolbox for the truly complex situations. Besides the difficulty of hiring (and keeping) good programmers in the current technical market, Custom Functions have to be specified, reviewed for coding standards and performance impact as well as tested. Because of the level of skill required we want to write as few Custom Functions as possible.
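If postfix notation is new to you, it helps to see how the small example from the Configured Checks paragraph (1 1 + 2 isequalto) is evaluated: operands are pushed onto a stack, and each operator pops its two arguments and pushes the result. The Python sketch below illustrates the idea only. It supports just the two operators in that example and is not how Rave itself evaluates Edit Checks.

```python
def evaluate_postfix(expression):
    """Evaluate a tiny postfix expression using a stack."""
    operations = {
        "+": lambda left, right: left + right,
        "isequalto": lambda left, right: left == right,
    }
    stack = []
    for token in expression.split():
        operation = operations.get(token.lower())
        if operation is None:
            stack.append(int(token))  # operand: push it
        else:
            right = stack.pop()  # operator: pop two operands...
            left = stack.pop()
            stack.append(operation(left, right))  # ...and push the result
    return stack.pop()


print(evaluate_postfix("1 1 + 2 isequalto"))  # True
```

The infix equivalent, 1 + 1 == 2, says the same thing in the order most of us were taught to read it.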

Costs and learning curves

A graph of the learning curves for the different Edit Check logic types might look like this:

[Figure: The Wall, learning curves for the three Edit Check logic types]

As we can see from the image, Field Checks have a very fast learning curve but they don't get you to a very high level of complexity. The basics of Configured Checks can be learned quite quickly, but mastery takes longer, and eventually you reach the Wall, where the complexity of a specified check means that you will need to use a Custom Function. We are all familiar with the simplest Custom Function:

return true;

But doing anything more complicated takes technical training.

Mitigation

The Wall represents the transition from Configured Checks to those requiring Custom Functions. We know that writing Custom Functions is expensive so we want to reduce reliance on them and so move the Wall further away. Some strategies which can be used to do this are:

  • Have standard/parameterized Custom Functions. For instance, instead of writing a custom function to compare specific date and time values, create a parameterized function which can be used for any date and time comparisons. These types of standard functions don't need the same level of validation as a bespoke Custom Function.

  • Analyse the Edit Checks you have written in the past and the queries that they generated. Research on Edit Check complexity in Medidata Rave studies found that the most complex edit checks were the ones least likely to fire. If an Edit Check requires logic so complex that it requires a bespoke Custom Function you may be better off using a manual listing or running the Edit Check as part of other back-end checks.
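As an illustration of the first strategy, the sketch below shows the general shape of a parameterized comparison in Python: one function, with the two values and the comparison passed in as parameters, instead of a separate function per pair of fields. A real Rave Custom Function would be written in C#, VB.NET or SQL; the function name, the comparison names and the date format here are all hypothetical.

```python
import operator
from datetime import datetime

# Comparison names are hypothetical; a real standard function would define its own.
COMPARISONS = {
    "before": operator.lt,
    "on_or_before": operator.le,
    "after": operator.gt,
    "on_or_after": operator.ge,
}


def compare_datetimes(first, second, comparison, fmt="%Y-%m-%d %H:%M"):
    """Return True if `first <comparison> second`; both are strings in `fmt`."""
    left = datetime.strptime(first, fmt)
    right = datetime.strptime(second, fmt)
    return COMPARISONS[comparison](left, right)


# The same function serves different checks just by changing the parameters.
print(compare_datetimes("2018-03-01 09:00", "2018-03-01 17:30", "before"))
print(compare_datetimes("2018-03-02 08:00", "2018-03-01 17:30", "on_or_before"))
```

Because the logic lives in one place, it only has to be reviewed and tested once, and each new check supplies different parameters.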

What we are doing

At TrialGrid we're attacking this challenge with CQL, our Clinical Query Language. CQL is an infix format for Rave Configured Edit Checks.

An Edit Check with CQL (infix) logic like:

A > B AND (C == D OR C == E)

would be translated into a Rave Edit Check (postfix) logic like:

 A
 B
 ISGREATERTHAN
 C
 D
 ISEQUALTO
 C
 E
 ISEQUALTO
 OR
 AND

In fact the translation works both ways: Rave Edit Checks can be instantly translated into CQL, and CQL can be translated instantly back into Rave Edit Checks. There is no lock-in here, because CQL translates into pure Rave Edit Checks. Since infix notation is what we all learned in school, CQL is much easier to learn.
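Going in the reverse direction, from postfix back to infix, is also a stack exercise. The Python sketch below rebuilds an infix string and adds parentheses only where the assumed precedence requires them. As before, this illustrates the idea and is not TrialGrid's implementation; the operator mapping and the precedence values are assumptions made for this example.

```python
# Assumed mapping from Rave-style postfix function names to infix symbols
# and precedence (higher binds tighter).
INFIX = {
    "ISGREATERTHAN": (">", 3),
    "ISEQUALTO": ("==", 3),
    "AND": ("AND", 2),
    "OR": ("OR", 1),
}
OPERAND = 4  # operands bind tighter than any operator


def postfix_to_infix(tokens):
    """Rebuild an infix string from postfix tokens, adding only needed parentheses."""
    stack = []  # entries are (text, precedence)
    for token in tokens:
        if token not in INFIX:
            stack.append((token, OPERAND))
            continue
        symbol, precedence = INFIX[token]
        right_text, right_prec = stack.pop()
        left_text, left_prec = stack.pop()
        if left_prec < precedence:
            left_text = f"({left_text})"
        if right_prec <= precedence:
            right_text = f"({right_text})"
        stack.append((f"{left_text} {symbol} {right_text}", precedence))
    return stack.pop()[0]


postfix = "A B ISGREATERTHAN C D ISEQUALTO C E ISEQUALTO OR AND".split()
print(postfix_to_infix(postfix))  # A > B AND (C == D OR C == E)
```

Fed the postfix listing above, the sketch returns the original infix expression, parentheses included.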

But we can do more. CQL includes a set of built-in functions that look like Rave Edit Check functions (IsEqualTo, IsPresent etc) but which automatically generate Custom Functions for you.

For example, we have been asked for an Edit Check that determines if a text field contains non-ASCII characters. Providing a standard Custom Function to do that is easy enough, but we go one further and integrate it into CQL:

AETERM.IsNotAscii

To the user this looks no more complex than the standard Rave IsNotEmpty test:

AETERM.IsNotEmpty

The TrialGrid application takes care of generating the Custom Function. You'll still need some bespoke Custom Functions but fewer and fewer as time goes on and we build more into CQL.

Why wait?

But why wait? TrialGrid allows you to create these function templates yourself, extending CQL and your Rave Edit Checks with your own private functions that become part of the CQL language. Want to know if AETERM.IsSigned? or AETERM.HasOpenQuery? Add them to CQL and give your Custom Function programmers more interesting work to do.

Configuration, not Programming

By using TrialGrid, Edit Checks that would previously have required Custom Functions can now be done by configuration. Our graph looks more like:

[Figure: The Wall, learning curves with TrialGrid]

The Wall is moved further away and the learning curve is made much flatter. This is more than just a nice-to-have: it means more productive Study Build staff and reduced costs. It is another step on our journey to reduce the time and effort of Study Build by 50%.

Interested in improving your Rave study build efficiency? Contact us to find out how TrialGrid can help.

Brick wall image by FWStudio