
User Acceptance Testing Part 3

In the last part we introduced the Given..When..Then structure for writing User Acceptance Tests, showed how TrialGrid automates execution of these tests against Medidata Rave, and showed how the tests can be checked for validity before they are even run. In this final part we'll look at automating the generation of tests for Edit Checks. This post is a bit technical, but if you are curious (or skeptical) about our claim of generating tests for 1,000 Edit Checks in 60 seconds, it explains how we do it.

Alternatively, you could sign up for our free webinar https://register.gotowebinar.com/register/2929700804630029324 and watch TrialGrid in action. Seeing is believing!

Solving Constraints

Before we start looking at Edit Checks let's first look at a little puzzle. Maybe you did this kind of puzzle at school.

I have two integer numbers (whole numbers) A and B. 
A is always a positive number 
A is always an even number 
A is always greater than zero
A is always less than B
If B = 10, what are the possible values for A?

The answer is that A can be 2, 4, 6 or 8. You probably solved this puzzle in seconds.

This kind of problem is known as a Constraint Satisfaction Problem and it is well studied in Computer Science circles. My computer solved this puzzle in 0.0007 seconds - though I helped it out a bit by telling it only to look at values for A in the range 1 to 100, since I didn't want to wait while it considered the infinite set of positive even numbers. Computers can be really good at solving these problems if the set of values to search in isn't too large.
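You can check the answer by simple enumeration. Here is a minimal Python sketch (a brute-force search over a bounded range, not TrialGrid's actual solver):

```python
# Brute-force the puzzle: enumerate candidate values for A in a bounded
# range and keep those that satisfy every constraint.
def solve_puzzle(b=10, search_range=range(1, 101)):
    solutions = []
    for a in search_range:
        # A is positive, even, greater than zero and less than B
        if a > 0 and a % 2 == 0 and a < b:
            solutions.append(a)
    return solutions

print(solve_puzzle())  # [2, 4, 6, 8]
```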

Edit Checks are Constraint Problems

Now that we've warmed up on a simple constraint problem, let's look at another. This is an Edit Check that uses Systolic and Diastolic blood pressure to identify pre-hypertension:

 IF:
    field SYSBP in form VS in folder SCREEN is greater than or equal to 120
    AND field SYSBP in form VS in folder SCREEN is less than or equal to 129
    AND field DIABP in form VS in folder SCREEN is less than 80
 THEN:
    Open Query "Subject is pre-hypertensive (SYS 120..129 and DIA < 80). Please confirm." 
    on field SYSBP in form VS in folder SCREEN to Marking group "Site from System"

In TrialGrid CQL the logic test part would look like:

SCREEN.VS.SYSBP[0] >= 120 
AND SCREEN.VS.SYSBP[0] <= 129
AND SCREEN.VS.DIABP[0] < 80

And in Medidata Rave QuickEdit:

||StandardValue|VSORRES|SCREEN|VS|SYSBP|0||||||
|120|3|||||||||||
IsGreaterThanOrEqualTo|||||||||||||
||StandardValue|VSORRES|SCREEN|VS|SYSBP|0||||||
|129|3|||||||||||
IsLessThanOrEqualTo|||||||||||||
And|||||||||||||
||StandardValue|VSORRES|SCREEN|VS|DIABP|0||||||
|80|2|||||||||||
IsLessThan|||||||||||||
And|||||||||||||

Now, if we wanted to work out what values would fire this Edit Check, we can formulate it as a constraint problem using the variables SYSBP and DIABP and quickly come up with a truth table of possible values and their results:


SYSBP | SYSBP 120..129? | DIABP | DIABP < 80? | Expected
120   | Yes             | 79    | Yes         | Fires
122   | Yes             | 80    | No          | Does not Fire
129   | Yes             | 66    | Yes         | Fires
0     | No              | 500   | No          | Does not Fire
125   | Yes             | -90   | Yes         | Fires

Okay, the last two are not likely ones you would come up with, but as this Edit Check is formulated these are valid values. The SYSBP and DIABP Fields are defined in this study with a DataFormat of 3, which means a maximum value of 999 and a minimum value of -999. For the computer trying to work out which values will fire the Edit Check, the constraints are:

From Data Format:
    SYSBP must be >= -999
    SYSBP must be <= 999
    DIABP must be >= -999
    DIABP must be <= 999

From Edit Check:
    SYSBP must be >= 120
    SYSBP must be <= 129
    DIABP must be < 80

That gives us ranges for SYSBP of (120..129) and for DIABP of (-999..79). When values are within those ranges, the Edit Check will fire. The system can also work out what values will cause the Edit Check not to fire.
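For simple numeric comparisons like these, deriving the firing range amounts to intersecting intervals. The sketch below is illustrative only; the function name and constraint format are invented for this example:

```python
# Intersect DataFormat bounds with Edit Check constraints to get the
# range of values that fires the check (integer domain assumed).
def firing_range(format_min, format_max, constraints):
    """constraints: list of (operator, value) pairs, e.g. ('>=', 120)."""
    lo, hi = format_min, format_max
    for op, val in constraints:
        if op == '>=':
            lo = max(lo, val)
        elif op == '<=':
            hi = min(hi, val)
        elif op == '<':
            hi = min(hi, val - 1)   # strict bound on an integer domain
        elif op == '>':
            lo = max(lo, val + 1)
    return (lo, hi)

print(firing_range(-999, 999, [('>=', 120), ('<=', 129)]))  # (120, 129)
print(firing_range(-999, 999, [('<', 80)]))                 # (-999, 79)
```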

Solving the Constraint Problem

So if Edit Checks are just a type of Constraint Problem, how does TrialGrid solve them?

As we saw above, the definition of our example Edit Check in TrialGrid is:

SCREEN.VS.SYSBP[0] >= 120 
AND SCREEN.VS.SYSBP[0] <= 129
AND SCREEN.VS.DIABP[0] < 80

This format is generated automatically on import of Checks from Medidata Rave, but users can also create Edit Checks using this format directly. In order to convert it to QuickEdit or back into the Architect Loader Spreadsheet format we have to parse this syntax, which turns the text into a tree-like structure in memory:

- LogicalOperator("AND")
    - LogicalOperator("AND")
        - Expression
            - DataPoint(folder="SCREEN", form="VS", field="SYSBP", recordposition=0)
            - Comparator(">=")
            - Value("120")
        - Expression
            - DataPoint(folder="SCREEN", form="VS", field="SYSBP", recordposition=0)
            - Comparator("<=")
            - Value("129")
    - Expression
        - DataPoint(folder="SCREEN", form="VS", field="DIABP", recordposition=0)
        - Comparator("<")
        - Value("80")
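A tree like this can be modelled with a few plain node classes. This is an illustrative sketch, not TrialGrid's internal representation:

```python
from dataclasses import dataclass

@dataclass
class DataPoint:
    folder: str
    form: str
    field: str
    recordposition: int

@dataclass
class Expression:
    datapoint: DataPoint
    comparator: str   # e.g. ">=", "<=", "<"
    value: str

@dataclass
class LogicalOperator:
    operator: str     # "AND" / "OR"
    children: list    # Expressions or nested LogicalOperators

# The pre-hypertension check as a tree:
sysbp = DataPoint("SCREEN", "VS", "SYSBP", 0)
diabp = DataPoint("SCREEN", "VS", "DIABP", 0)
tree = LogicalOperator("AND", [
    LogicalOperator("AND", [
        Expression(sysbp, ">=", "120"),
        Expression(sysbp, "<=", "129"),
    ]),
    Expression(diabp, "<", "80"),
])
```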

By examining this tree we can generate a constraint formula, using the DataFormats of the Fields to set the domain of each variable. For numeric Fields this is simple, and other Field types are not much harder: Checkbox Fields can only have a value of 1 or 0, computers already store dates/times as the number of seconds since January 1 1970, and Fields with Data Dictionaries are already constrained to a limited set of values (e.g. "YES", "NO", "UNKNOWN").
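A sketch of how Field types might map to finite solver domains; the function name, type names and date bounds here are invented for illustration:

```python
from datetime import datetime, timezone

def domain_for_field(field_type, **kwargs):
    if field_type == "checkbox":
        return [0, 1]                     # checked / unchecked
    if field_type == "dictionary":
        return kwargs["entries"]          # e.g. ["YES", "NO", "UNKNOWN"]
    if field_type == "numeric":
        digits = kwargs["digits"]         # DataFormat "3" -> -999..999
        limit = 10 ** digits - 1
        return range(-limit, limit + 1)
    if field_type == "datetime":
        # dates/times as seconds since the Unix epoch, within sane bounds
        lo = int(datetime(1900, 1, 1, tzinfo=timezone.utc).timestamp())
        hi = int(datetime(2100, 1, 1, tzinfo=timezone.utc).timestamp())
        return range(lo, hi)
    raise ValueError(f"unknown field type: {field_type}")

print(len(domain_for_field("numeric", digits=3)))  # 1999 values: -999..999
```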

For any Edit Check there are likely hundreds of value combinations that will fire the check and hundreds more that will not. For now TrialGrid generates a single positive and a single negative test for each Edit Check, using constraints that we feel are reasonable. For example, the range of future dates is infinite, but the system won't generate dates out in the year 9800.

Finding invalid Edit Checks

Medidata Rave will allow you to write Edit Checks which are valid but which will never fire. A simple example:

SCREEN.VS.SYSBP[0].IsEmpty AND SCREEN.VS.SYSBP[0] > 100 

The SYSBP value can never be both empty and greater than 100. Nobody sets out to write a check like this, but in a complex Check it can happen. The constraint solver won't be able to find a solution and will fail with a warning.
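A solver can detect this by searching the Field's domain and finding no value that satisfies every condition. A simplified sketch:

```python
# A contradictory check has no solution, so a search over the field's
# domain comes back empty and we can warn the user.
def solve(domain, predicates):
    return [v for v in domain if all(p(v) for p in predicates)]

# SYSBP must be both empty (None) and > 100 -- impossible.
domain = [None] + list(range(-999, 1000))
predicates = [
    lambda v: v is None,                  # IsEmpty
    lambda v: v is not None and v > 100,  # > 100
]
solutions = solve(domain, predicates)
if not solutions:
    print("Warning: Edit Check can never fire")
```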

Generated test case

The TrialGrid constraint solver can solve many Edit Checks and generate a positive (Check fires) and negative (Check does not fire) test case for each in the Given..When..Then syntax.

Here is a test case generated by TrialGrid for our example Edit Check:

(Screenshot: the generated test case)

TrialGrid has a batch-create option which allows you to select as many Edit Checks as you like and generate tests for them. This generally takes less than a minute for 1,000 Edit Checks.

Running tests can also be done in batch so you can take a study with no tests, generate tests and start looking at the results in about 5 minutes.

Summary

In this mini-series of blog posts we've tried to show the basics of our Automated User Acceptance Testing system. We've described what UAT is, how we format UAT tests so that they are meaningful to humans and can be executed by a computer, and finally how the system can generate tests to get you started quickly.

That's the high level view, if you want to see this system in action please sign up for our Free Webinar on January 10 2019.

Registration at https://register.gotowebinar.com/register/2929700804630029324

User Acceptance Testing Part 2

Yesterday I described the cost and effort of creating, executing and maintaining test scripts for User Acceptance Testing. I also made the bold statement that the TrialGrid approach could reduce this effort by a factor of 10. As you might expect, we achieve this through automation. Of the three parts:

1) Create the tests
2) Execute the tests
3) Maintain the tests

The first, Create the tests, is the most technically challenging. Before we can automate creation of tests we first need to understand what tests are going to look like and how they are executed.

If we look at the test script from the first part of this series we can see that it is designed to be read and executed by a human:


Name : Check SCREENING_VISIT_DATE
Version : 1

Step | Instruction | Expected Result
1 | Log into EDC using an Investigator role | Login succeeds, user is at Rave User home page
2 | Navigate to Site 123 | User is at Subject listing for Site 123
3 | Create a new Subject with the following data: Date of Birth = 10 JAN 1978 | Subject is created with Date of Birth 10 JAN 1978
4 | Navigate to Folder Screening, Visit Form and enter the following data: Visit Date = 13 DEC 1968 | Visit Date for Screening Folder is 13 DEC 1968
5 | Confirm that Edit Check has fired on Visit Date with text "Visit Date is before subject Date of Birth. Please correct." | Edit Check has fired

(The Actual Result, Comments, User, Date and Pass / Fail columns are completed during execution.)

Once executed the signed and dated test script will be kept as evidence to be reviewed by the Sponsor.

Clearly, if we want to automate the process of executing these scripts we need to keep that readability. We need a format for test scripts that is structured enough for software to execute but also naturally readable for humans.

Executable Specifications

Fortunately, Software Development has had a solution to this problem for nearly a decade. Behaviour Driven Development (BDD) is an approach to writing specifications and acceptance tests which can be read and understood by humans and executed by software. This is exactly what we are looking for. BDD doesn't specify any particular format for these tests but the most widely adopted standard is called "gherkin"[1].

Gherkin uses a simple syntax. Here is a short example:

Feature: Buying things from the shop

  If a user has money they can buy things

  Scenario: Buying things
    Given Alice has $1.30 
    When she visits the grocery store
    And she buys 1 banana for $0.25
    Then she will have $1.05    
    And she will have 1 banana

This example starts with a "Feature" declaration: documentation telling us about the Scenario tests which follow. At line 3 we have some free-text description of the tests. At line 5 we start a Scenario called "Buying things".

Scenarios follow a format of:

Given some background information that sets up the test conditions
When some action is taken
Then I should see some result

Acceptance tests for Edit Checks

If we convert our original test script to the Gherkin Given..When..Then structure we might get:

# Version 1.0

Feature: Testing Edit check SCREENING_VISIT_DATE

  Visit Date should not be before Date of Birth

  Scenario: The Check Fires
    Given I log into EDC using an Investigator role 
    And I navigate to Site 123
    When I create a new Subject
    And I enter "10 JAN 1978" as the Date Of Birth on the Subject Form
    And I enter "13 DEC 1968" as the Visit Date for the Visit Form in the Screening Folder
    Then I will see query text "Visit Date is before subject Date of Birth. Please correct."   

This format is a little wordy, mostly because of the need to specify Fields, Forms and Folders for data. Gherkin includes a data table structure which can help here, and we can combine it with some simple shortcuts for field selection:

# Version 1.0

Feature: Testing Edit check SCREENING_VISIT_DATE

  Visit Date should not be before Date of Birth

  Scenario: The Check Fires
    Given I log into EDC using an Investigator role 
    And I navigate to Site 123
    When I create a new Subject
    And I enter data:
    | DataPoint                      | Value       |
    | SUBJECT.SUBJECT.DOB            | 10 JAN 1978 |
    | SCREENING.VISIT.VDATE          | 13 DEC 1968 | 
    Then I will see query text 
       """
       Visit Date is before subject Date of Birth. Please correct.
       """   

That makes the test a bit more concise.

Automated Testing

The Given..When..Then style of test is readable by a human but can also be read by software. The format isn't totally free-form: each of the Given / When / Then steps must conform to a pattern that the software understands. Currently TrialGrid understands around 50 patterns which can check a wide range of states in Medidata Rave, not just whether a query exists. This means you can write tests which ensure that Forms and Fields are visible (or not visible) to certain EDC user Roles, check the calculations of Derivations, and verify the results of data integrations such as IxRS feeds which enter data into Medidata Rave forms.
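BDD tools such as behave and cucumber implement this by matching step text against registered patterns. The sketch below shows the general idea; the patterns and handlers are invented for illustration, and TrialGrid's real steps of course drive Rave rather than returning strings:

```python
import re

# Registry of (pattern, handler) pairs, in the style of BDD step definitions.
STEPS = []

def step(pattern):
    def register(fn):
        STEPS.append((re.compile(pattern), fn))
        return fn
    return register

@step(r'I log into EDC using an? (?P<role>.+) role')
def login(role):
    return f"login as {role}"

@step(r'I navigate to Site (?P<site>\w+)')
def goto_site(site):
    return f"navigate to site {site}"

def execute(step_text):
    # Find the first pattern that matches and call its handler with the
    # captured groups as keyword arguments.
    for regex, fn in STEPS:
        match = regex.fullmatch(step_text)
        if match:
            return fn(**match.groupdict())
    raise ValueError(f"no step pattern matches: {step_text!r}")

print(execute("I navigate to Site 123"))  # navigate to site 123
```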

Tests can be read and then executed against a live Rave instance by the TrialGrid UAT module. Provide a Rave URL, study name, environment (e.g. TEST) and credentials to interact with Rave and the TrialGrid system will execute your tests against the Rave instance.

Data is entered via Rave Web Services and results are verified automatically. Screenshots of the Rave page showing the results of actions such as data entry and queries created can be captured for both Classic Rave and the new Rave EDC (formerly called RaveX). Results are updated in real-time as the system works through each step, but you can also leave it to run unattended and view the results when it is done. TrialGrid runs these tasks in the background so you can get on with other work.

The output is a PDF document that shows the actions taken and the results, comparing expected results against actual and providing screenshots as evidence.

Automated Maintenance

One of the challenges of test scripts is keeping them up-to-date with changes to the study. For example, imagine an Edit Check that ensures that when Race is "Other" then Race Other is specified on the demography form. The check has been programmed with the query text:

"Race is Other and Specify Other Race is missing. Please review and correct."

But in the test we are looking for:

"Race is Other and Specify Other Race is missing."

This could happen if the Specification or the programming of the Edit Check changed, but we want them to match. A human tester might be tempted to pass this test as "close enough", but automated test software looks for an exact match and will fail this test.
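A sketch of the kind of comparison an automated checker applies; Python's difflib can also quantify how close the two strings are when reporting the mismatch:

```python
import difflib

# Programmed query text vs the text the test expects -- close, but not equal.
programmed = ("Race is Other and Specify Other Race is missing. "
              "Please review and correct.")
expected = "Race is Other and Specify Other Race is missing."

if programmed != expected:
    # SequenceMatcher gives a 0..1 similarity ratio for the warning message.
    ratio = difflib.SequenceMatcher(None, expected, programmed).ratio()
    print(f"Query text mismatch (similarity {ratio:.0%}): test would fail")
```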

The TrialGrid approach can identify these kinds of problems before the test is ever run. In this example we see a warning in the Test editor which identifies that there is an issue:

(Screenshot: the warning shown in the Test editor)

Here we are giving the system extra hints about what our test relates to through the @EditCheck "tag" (another feature of the Gherkin format), referencing the DM001 Edit Check. This has several benefits:

  1. By setting up a link between the Edit Check and the Test we can say whether an Edit Check has been tested or not and calculate what percentage of Edit Checks and other objects are exercised by tests.

  2. The system has greater contextual knowledge about what is being tested and can help with warnings like the one shown here.

TrialGrid performs similar validation of data dictionary values, unit dictionary selections, Folder, Form and Field references and more. This capability reduces the effort of maintaining tests and supports risk-based approaches where you don't re-run tests because "nothing has changed and this test is still valid": the system can tell you when something has changed and a test may no longer be valid.

Summary

In this second part of the three part series on Automated User Acceptance Testing we briefly covered the formatting of the tests, how they are executed and how the system helps you ensure that tests stay in synchronization with the Edit Checks, Forms and other objects that they are supposed to be testing.

These features make the execution and maintenance of tests much easier and faster but we are still left with the huge challenge of writing these kinds of tests for hundreds of Edit Checks. In the last part we'll cover how the TrialGrid system can automate that part, creating tests in seconds that would take a human hundreds of hours of effort.

Come back tomorrow. But if you want to see this system in action don't forget our Free Webinar on January 10 2019.

Registration at https://register.gotowebinar.com/register/2929700804630029324

Notes:

[1] Why "gherkin"? That is a story in itself, but in summary: "gherkin" is the format used by a software tool called "cucumber", and it is called cucumber because passing tests are shown in green text, so the idea was to get everything to "look as green as a 'cuke'". I know, hilarious.

User Acceptance Testing Part 1

Back in June 2017 Andrew attended the Medidata Next Basel Hackathon and
put together a proof-of-concept for Automated User Acceptance Testing (UAT). 18 months later we're getting ready to release the first production version of this system.

What took so long? Well, to say we've been busy is something of an understatement. In that time we've built up to more than 100 Diagnostics; advanced editors for Matrices, Data Dictionaries, Unit Dictionaries and Forms; standards and library management features and a whole lot more. In all, we've released more than 150 new features of the software to our pre-release Beta site since June 2017. But we held off on UAT features because we really wanted to do it right.

What is User Acceptance Testing anyway?

But first, what is User Acceptance Testing and what are the challenges to doing it?

The term User Acceptance Testing comes from software projects. Imagine that an organization wants to automate some business process. They get their business domain experts together to create a Specification for what the software should do. This Specification is passed to the developers to build the solution. When it comes back from the developers the organization will perform User Acceptance Testing to ensure that the software meets the Specification.

In the world of Rave study building, User Acceptance Testing may be done by the Sponsor team but it may also be done by a CRO with the evidence of testing being provided to the Sponsor team. Regardless of its roots, User Acceptance Testing in our industry means the process of testing to provide evidence that the study as-built matches the Specification.

Test Scripts

The current gold standard for testing evidence is to have a Test Script which can be manually executed by a user. A typical script for the testing of an Edit Check might look something like this:


Name : Check SCREENING_VISIT_DATE
Version : 1

Step | Instruction | Expected Result
1 | Log into EDC using an Investigator role | Login succeeds, user is at Rave User home page
2 | Navigate to Site 123 | User is at Subject listing for Site 123
3 | Create a new Subject with the following data: Date of Birth = 10 JAN 1978 | Subject is created with Date of Birth 10 JAN 1978
4 | Navigate to Folder Screening, Visit Form and enter the following data: Visit Date = 13 DEC 1968 | Visit Date for Screening Folder is 13 DEC 1968
5 | Confirm that Edit Check has fired on Visit Date with text "Visit Date is before subject Date of Birth. Please correct." | Edit Check has fired

(The Actual Result, Comments, User, Date and Pass / Fail columns are completed during execution.)

The script consists of a set of instructions each with expected results. The user performs each step and documents the actual results, adding their initials and the date/time of the execution along with any comments and whether the step passed or failed. The user may also capture screenshots or the test subject may be maintained in a test environment for the study as evidence that the Check was tested.

Risk-based Approach

Since a phase-III trial might contain more than 1,000 Edit Checks, many organizations building studies take a risk-based approach. If an Edit Check comes from a Library it may have been tested once in an example study and then not tested again for each new study where that Edit Check is used. Edit Checks considered low risk may not be tested in this way at all.

A risk-based approach means that we're balancing a negative outcome (say, a migration to fix an Edit Check) against the cost of a more comprehensive set of tests. If we assume 10 minutes to write and 5 minutes to execute a positive test (the check fires) and a negative test (the check doesn't fire), then 1,000 Edit Checks is....counts on fingers...250 hours, more than a month of effort. The work doesn't stop there of course - these test scripts would have to be maintained. If the Query Message of an Edit Check is changed then the test script should be updated to reflect that, and if the View or Entry restrictions for a Field are changed then the script should be checked to ensure that the user type executing the test (e.g. Investigator or Data Manager) can still see the query. Even then, the test scripts are likely to be executed only once, because the cost of re-running them after every change to the study is just too prohibitive.
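The effort arithmetic above, made explicit using the assumptions stated in the text (10 minutes to write plus 5 minutes to execute, per check):

```python
# 1,000 Edit Checks at 15 minutes each (write + execute) in hours.
checks = 1000
minutes_per_check = 10 + 5           # write + execute
total_hours = checks * minutes_per_check / 60
print(f"{total_hours:.0f} hours")    # 250 hours
```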

In summary, we would create a test script for every Edit Check if we could but it is a huge undertaking to:

1) Create the tests
2) Execute the tests
3) Maintain the tests

The TrialGrid Approach

"Doing UAT Right" means taking the work out of each of these steps. It is no good having a solution that executes the tests quickly if it doesn't also reduce the effort of creating the tests. Having fast authoring and execution of tests doesn't help if the resulting tests can't be easily maintained.

We are confident that with the new TrialGrid approach you can reduce the overall effort of creating, running and maintaining scripted test cases by at least a factor of 10. That means the 250 hours for 1,000 Edit Checks would be reduced to 25 hours; at a rate of $100/hr, that's a saving of $22,500 per study.

Inconceivable? Impossible? Unlikely? Why not join our free webinar on Thursday January 10th to find out more:

https://register.gotowebinar.com/register/2929700804630029324

Or if you can't wait until then, come back tomorrow for another post on the TrialGrid approach to UAT.