Regulatory

Common Pitfalls of RECIST 1.1 Application in Clinical Trials

Abstract: The revised Response Evaluation Criteria in Solid Tumours (RECIST 1.1) came into effect in 2009 to address the issues and limitations of RECIST 1.0. While RECIST 1.1 has resolved many of the challenges of the previous version, some drawbacks remain. More than five years of clinical trial monitoring across numerous international sites has revealed that investigators in all countries encounter similar difficulties in interpreting RECIST 1.1 and tend to make the same mistakes. These errors, if not identified and corrected in good time, increase the variability of data across trial sites and may affect efficacy endpoints. In our opinion, proactive training focused on typical questions and mistakes may help to increase the accuracy of RECIST data extraction at the trial sites.

Background

Over the past four decades, randomised controlled clinical trials have become a global requirement for new drug approvals in oncology, and it has been a challenge to apply imaging-based, tumour-specific response criteria uniformly in order to assess treatment response objectively in cancer trials. In 2009 the revised RECIST 1.1 (Response Evaluation Criteria in Solid Tumors) was developed to address the drawbacks of previously applied response systems such as the WHO criteria or the initial version of RECIST (1.0). The major changes from RECIST 1.0 to RECIST 1.1 included the number of lesions to be assessed, which was reduced from a maximum of ten to a maximum of five in total, and from a maximum of five to a maximum of two per organ. In RECIST 1.1, target lesions (TL) must be at least 10 mm in the longest diameter. Furthermore, assessment of pathological lymph nodes (LN) was incorporated: nodes with a short axis of 15 mm or more are considered measurable and assessable as target lesions. The key features of RECIST 1.1 versus RECIST 1.0 are summarised in Table 1.
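The numerical limits described above lend themselves to a simple automated check at baseline. The following is a minimal Python sketch, not part of the RECIST 1.1 guideline or of any existing software: the `TargetLesion` record, the function name and the message wording are hypothetical, and grouping all lymph nodes under a single "organ" is an assumption made for illustration.

```python
from collections import Counter
from dataclasses import dataclass

# Limits as summarised in the text for RECIST 1.1.
MAX_TARGETS_TOTAL = 5
MAX_TARGETS_PER_ORGAN = 2
MIN_LONGEST_DIAMETER_MM = 10.0   # non-nodal target lesions
MIN_NODE_SHORT_AXIS_MM = 15.0    # pathological lymph nodes


@dataclass(frozen=True)
class TargetLesion:
    """Hypothetical record of one lesion proposed as target at baseline."""
    label: str
    organ: str            # assumption: all lymph nodes share one "organ" value
    is_lymph_node: bool
    size_mm: float        # short axis for nodes, longest diameter otherwise


def baseline_selection_problems(targets):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    if len(targets) > MAX_TARGETS_TOTAL:
        problems.append(
            f"{len(targets)} target lesions selected, maximum is {MAX_TARGETS_TOTAL}"
        )
    per_organ = Counter(lesion.organ for lesion in targets)
    for organ, count in sorted(per_organ.items()):
        if count > MAX_TARGETS_PER_ORGAN:
            problems.append(
                f"{count} target lesions in {organ}, maximum is {MAX_TARGETS_PER_ORGAN}"
            )
    for lesion in targets:
        if lesion.is_lymph_node:
            minimum, axis = MIN_NODE_SHORT_AXIS_MM, "short axis"
        else:
            minimum, axis = MIN_LONGEST_DIAMETER_MM, "longest diameter"
        if lesion.size_mm < minimum:
            problems.append(
                f"{lesion.label}: {axis} {lesion.size_mm} mm is below {minimum} mm"
            )
    return problems


if __name__ == "__main__":
    selection = [
        TargetLesion("L1", "liver", False, 32.0),
        TargetLesion("L2", "liver", False, 21.0),
        TargetLesion("L3", "liver", False, 18.0),
        TargetLesion("N1", "lymph node", True, 12.0),
    ]
    for problem in baseline_selection_problems(selection):
        print(problem)
```

A check of this kind only covers the countable rules; whether a lesion is truly malignant or reproducibly measurable remains a matter of radiological judgement.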
As seen in Table 1, RECIST 1.1 also clarified the definition of progressive disease (PD) in several respects. In addition to the RECIST 1.0 definition of PD as a 20% increase in the sum of target lesion diameters, a 5 mm absolute increase was added as a requirement to guard against over-calling PD when the increase in the total sum is very small. Finally, guidance on the interpretation of FDG-PET scans in the detection of new lesions was included. While RECIST 1.1 provides standardised post-treatment monitoring with clearly defined outcomes, it can be applied incorrectly. A false radiological interpretation may then occur, with negative implications not only for clinical trial results but also for patient care.

Findings and Procedure Details

Having maintained a database of frequently asked questions on RECIST 1.1 application received from
clinical trial sites since the revised RECIST criteria were implemented in 2009, we have identified the most frequent mistakes in RECIST 1.1 interpretation and grouped them into five major categories (Table 2). Our database contains questions collected from more than fifty clinical trials in several malignancies, namely lung, colorectal, breast, prostate, pancreatic, gastrointestinal and ovarian cancer. The most common mistakes relate to selection of inappropriate lesions at baseline, inaccurate reassessment of both target and non-target lesions, difficulty in assessing small new lesions, evaluation of lymph nodes both at baseline and at follow-up, and substantial deviation from scanning schedules.

Table 1 Highlights of revised RECIST 1.1: summary of major changes from RECIST 1.0 to RECIST 1.1

Number of target lesions
  RECIST 1.0: 10 lesions, 5 per organ
  RECIST 1.1: 5 lesions, 2 per organ

Assessment of pathological lymph nodes
  RECIST 1.0: Not mentioned
  RECIST 1.1: Nodes with a short axis of ≥ 15 mm are considered measurable and assessable as target lesions. The short axis measurement should be included in the sum of lesions in calculation of tumor response. Nodes that shrink to < 10 mm short axis are considered normal.

Definition of disease progression
  RECIST 1.0: 20% increase in target lesions sum
  RECIST 1.1: 20% increase in target lesions sum plus 5 mm absolute increase

Detection of new lesions
  RECIST 1.0: Not specified
  RECIST 1.1: A section on detection of new lesions, including the interpretation of FDG-PET scan assessment, is included

Definition of unequivocal progression of non-target lesions
  RECIST 1.0: Unequivocal progression considered as PD
  RECIST 1.1: More detailed description of unequivocal progression to indicate that it should not normally trump target disease status. It must be representative of overall disease status change, not a single lesion increase

Imaging guidance
  RECIST 1.0: Not included
  RECIST 1.1: Includes a new imaging appendix with updated recommendations on the optimal anatomical assessment of lesions

Table 2 Five major categories of typical RECIST 1.1 pitfalls

(1) selecting target lesions at baseline
  a) wrong number of target lesions selected at baseline, or change in the number of target lesions at a follow-up timepoint
  b) selecting a target lesion with non-reproducible measurements, most often a lesion in a hollow, movable, distensible organ, e.g. a bowel wall lesion
  c) selecting a target lesion that is not clearly a metastasis
  d) selecting a lymph node lesion smaller than 15 mm in short axis as a target, or using the long axis measurement for lymph nodes

(2) reassessing target lesions
  a) premature declaration of progressive disease based on a 20% increase in the total sum without meeting the 5 mm absolute increase rule, or based on a 20% increase in a single lesion instead of the total sum
  b) assessment of a timepoint response with reference to the previous timepoint rather than to baseline and nadir
  c) trouble assessing target lesion response in situations where one of the lesions could not be visualized or has resolved

(3) reassessing non-target lesions
  a) premature declaration of progressive disease based on a change in size of non-target lesions, by analogy with the target lesion assessment rules
  b) trouble interpreting changes in volume or reappearance of a pleural effusion or ascites that was present at baseline

(4) assessing for new lesions
  a) a new lesion is not acknowledged as progressive disease if its size is smaller than the measurable lesion threshold

(5) substantial deviations from the scanning schedule
  a) critical deviations from the tumor assessment visit window outlined in the study protocol
  b) failure to perform the end of treatment RECIST 1.1 assessment in case of symptomatic deterioration

Despite the clear definition in RECIST 1.1 of target lesion number and size, almost half of all the mistakes fall under the first category, pertaining to an incorrect number or identification of target lesions selected at baseline. The most common mistake was to select more than two target lesions per organ in cases when only one or two organs were involved in the malignant process. For example, in one case we discovered that all five assigned target lesions were liver metastases, even though RECIST 1.1 clearly states: “When more than one measurable lesion is present at baseline all lesions up to a maximum of five lesions total (and a maximum of two lesions per organ) representative of all involved organs should be identified as target lesions and will be recorded and measured at baseline.” Thus RECIST 1.1 is clear that when patients have only one or two organ sites involved, a maximum of two and four lesions respectively can be recorded.

Another common error was that all suitable target lesions, totalling fewer than five, were identified at baseline, and then at a follow-up timepoint new target lesions were added when non-target lesions grew or new lesions appeared.

One further issue concerns lesions in paired organs such as the lungs or adrenals. While RECIST 1.1 is silent on this issue, a commonly accepted approach is to select no more than two target lesions in paired organs, i.e. no more than two in the lungs, no more than two in the adrenals, etc. The same approach applies to lymph node lesions, i.e. no more than two target lesions in lymph nodes [5]. See Table 3 Case 1.

Another mistake was declaring lesions with non-reproducible measurements, most often a lesion in a hollow, movable, distensible organ (e.g. a bowel wall lesion), as target lesions. This error mostly occurred in
gastrointestinal cancer trials. According to the RECIST 1.1 guidelines, “Target lesions should be selected on the basis of their size (lesions with the longest diameter); be representative of all involved organs, but in addition should be those that lend themselves to reproducible repeated measurements.” Measurements of a lesion located in a distensible or hollow organ are not expected to be reliably reproducible. Thus, even though gastrointestinal lesions are not truly non-measurable, many of them should not be selected as target lesions, as their longest diameter can hardly be defined on single-plane images. Lesions in hollow organs may show marked variability in size depending on the filling status of the organ, which makes the lesion non-measurable.

One more issue relates to selecting a pseudolesion as target. These lesions are not part of the malignant clone. Examples of such pseudolesions are incidental adrenal masses, haemangiomas, benign ovarian tumours and cysts. Assigning non-malignant lesions as target at baseline results in incorrect follow-up evaluation of an imaging-based endpoint and an incorrect response assessment. It is, however, not recommended simply to exclude a wrongly included pseudolesion from follow-up sums while it remains in the baseline sum, since this biases the result in favour of therapy response. When a pseudolesion is identified at follow-up, a new baseline must be created after the lesion is proved to be non-malignant.

Assignment of lymph node lesions as target merits particular attention, because RECIST 1.1 introduced, for the first time, rules for considering lymph nodes pathological: “…pathological nodes which are defined as measurable and may be identified as target lesions must meet the criterion of a short axis of ≥ 15 mm by CT scan.
Only the short axis of these nodes will contribute to the baseline sum.” Since lymph nodes are normal anatomical structures that can be seen on CT scans even when they are not involved by cancer, RECIST 1.1 discriminates between normal and pathological nodes by size. Lymph nodes measuring ≥ 10 mm in the short axis are considered pathological (malignant) according to RECIST 1.1, but if they are smaller than 15 mm in the short axis they cannot be considered target lesions. Benign nodes are more likely to be ovoid, whereas malignant infiltration makes them more rounded. Thus, if the ratio of the long axis to the short axis diameter is less than two, there is a higher probability that the lymph node is malignant. The short axis diameter is used both for deeming a node pathological and for determining target lesion size because it has been demonstrated that lymph nodes are more likely to become rounder (i.e. to increase in the short axis) in the case of malignant infiltration. The short axis diameter is measured perpendicular to the longest diameter of the lymph node.
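The size thresholds for lymph nodes described above can be expressed as a small classification routine. This is an illustrative Python sketch only: the function names and category labels are hypothetical, and the long-to-short axis ratio is shown as the morphological hint mentioned in the text, not as a RECIST 1.1 rule.

```python
NORMAL_BELOW_MM = 10.0      # short axis below this: node is treated as normal
MEASURABLE_FROM_MM = 15.0   # short axis at or above this: eligible as target


def classify_lymph_node(short_axis_mm: float) -> str:
    """Classify a lymph node from its short axis diameter alone."""
    if short_axis_mm < 0:
        raise ValueError("short axis cannot be negative")
    if short_axis_mm >= MEASURABLE_FROM_MM:
        return "measurable"               # may be selected as a target lesion
    if short_axis_mm >= NORMAL_BELOW_MM:
        return "pathological non-target"  # pathological, but not a target
    return "normal"


def long_to_short_ratio(long_axis_mm: float, short_axis_mm: float) -> float:
    """Ratio of long to short axis; values below two suggest a rounder node."""
    if short_axis_mm <= 0 or long_axis_mm < short_axis_mm:
        raise ValueError("expected long axis >= short axis > 0")
    return long_axis_mm / short_axis_mm


if __name__ == "__main__":
    for short_axis in (8.0, 12.0, 16.0):
        print(short_axis, classify_lymph_node(short_axis))
```

Note that only the short axis enters the classification; the long axis is never used for the RECIST 1.1 size criterion or for the baseline sum.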
Volume 7 Issue 3
In spite of the RECIST 1.1 guidelines for pathological lymph nodes, when non-nodal lesions suitable as targets are present it is advisable to register malignant lymph nodes as non-target, as the assessment of nodal lesion response involves more challenges and controversies. Unfortunately, and despite the guidelines agreed upon by the experts, 10-20% of normal-sized locoregional nodes contain tumour deposits and up to 30% of enlarged nodes demonstrate only inflammatory hyperplasia. Also, in some tumours the incidence of metastatic disease within normal-sized nodes is greater than in others. For example, in patients with colorectal cancer, 90% of nodal metastases occur in nodes smaller than 10 mm.
An additional issue pertains to bony lesions. RECIST 1.1 suggests that lytic or mixed lytic-blastic bone lesions with identifiable soft tissue components can be considered as measurable target lesions. However, pure bone lesions, which are much more common, usually do not show any change in size under therapy and thus do not qualify as target lesions. See Table 3 Case 2.

The second most common category of errors we found is re-assessment of target lesions at follow-up. This included premature declaration of progressive disease based on a 20% increase in the total sum that did not also meet the required 5 mm absolute increase. Additional mistakes in this category included declaring PD based on a 20% increase in a single lesion instead of the total sum, and assessing a timepoint response with reference to the immediately preceding timepoint rather than, as the guidelines recommend, to baseline or the nadir, whichever is smaller. Also, some clinical sites wrongly understand the stable disease category as no change compared with baseline, rather than as a calculated category in which the numerical value shows neither sufficient increase to constitute progressive disease nor sufficient shrinkage to represent partial response. A genuine issue with RECIST 1.1 also occurs occasionally. Since tumour therapy may induce tumour necrosis, tumour size may at times be maintained or even increased even though the viable tumour mass is smaller. In such situations, the actively perfused part of the tumour may actually be smaller even though necrotic and oedematous areas show an increase. Evaluation criteria such as Choi, which include tumour density following contrast media application, have been proposed as the criteria of choice for determining tumour response in such situations [2, 3]. See Table 3 Case 3.

In addition, investigators had trouble assessing target lesion response in situations where one of the lesions could not be visualised. The approach differs between a situation in which a target lesion is not visible on the scan at a timepoint for a technical reason or because it is obscured by another pathological process such as a collapsed lung, and one in which a target lesion has resolved because of a good response to treatment. The first situation makes the case inevaluable at this timepoint. This is clearly described in the RECIST 1.1 guidelines [1] under paragraph 4.4.2, Missing assessments and inevaluable designation: “When no imaging/ measurement is done at all at a particular time point, the patient is not evaluable (NE) at that time point. If only a subset of lesion measurements are made at an assessment, usually the case is also considered NE at that time point, unless a convincing argument can be made that the contribution of the individual missing lesion(s) would not change the assigned time point response. This would be most likely to happen in the case of PD. For example, if a patient had a baseline sum of 50 mm with three measured lesions and at follow-up only two lesions were assessed, but those gave a sum of 80 mm, the patient will have achieved PD status, regardless of the contribution of the missing lesion.” However, if one of the target lesions has resolved, the case is evaluable at the timepoint, with the default value of 0 mm recorded for the resolved lesion and the response assessed based on the total sum of diameters of the target lesions that are still visible. See Table 3 Cases 4 and 5.

Table 3 Case reports

Case 1: A patient had two lymph node lesions at baseline that met the definition of measurable nodal lesions. The radiologist selected both lesions as target. At the re-assessment timepoint at week 32, two more lymph node lesions appeared. Instead of declaring progressive disease, the radiologist added these new lesions as target, thus violating several RECIST 1.1 rules at once. As a result of the mistake, the new lesions were not considered progressive disease, the number of target lesions was changed at a post-baseline assessment, and more than two target lesions in one organ (lymph node) were selected.

Case 2: A radiologist described groups of slightly enlarged inguinal lymph nodes suspicious for metastatic involvement in a colorectal cancer patient. The lesions were smaller than 10 mm in their short axis, yet their round shape made them most likely malignant. According to RECIST 1.1 these lesions could not be included as pathological and should have been considered normal because of the size criterion, even though the radiologist indicated in the radiological summary that the lymph nodes were suspicious for metastases. At follow-up, at week 16, the lesions started to grow and became more than 10 mm in their short axis. According to RECIST 1.1 the patient had to be discontinued from the study due to the appearance of new lesions, even though the target lesions showed no progression and the lymph nodes in question had been suspected to be malignant by the radiologist.

Case 3: In a patient with colorectal cancer and multiple metastases in the liver and lungs, the radiologist described all lesions present in great detail, including accurate measurements of all the target and non-target lesions at every visit. The investigator performing RECIST 1.1 assessments at each visit tried to calculate the response of both target and non-target lesions based on their measurements. At follow-up, at week 32, the investigator erroneously determined PD because of one initially non-target lesion which increased in size by 20% and showed more than 5 mm absolute increase. She made the decision to discontinue the patient from the study based on an increase in a single non-target lesion, which should have been assessed qualitatively only and never measured against the criterion that RECIST 1.1 reserves for target lesions.

Case 4: A patient with SCLC had two target lesions selected in the right lung. At week 16 he developed a collapsed lung segment that obscured one of the target lesions on CT. In such a situation the case should have been considered non-evaluable and the patient followed for progressive disease only. However, the investigator mistakenly decided to replace the target lesion that could no longer be visualized with a different lesion in a different location and continued the response assessment. In general, it is not recommended to select target lesions in anatomical areas that could potentially become non-evaluable, like this lesion in a collapsed lung.

Case 5: A patient with breast cancer had five target lesions at baseline. In the course of the treatment one of the lesions resolved and the remaining four lesions decreased significantly in size, showing partial response. The investigator stopped documenting the resolved lesion among the target ones and added another lesion suitable as a target instead, thus affecting the response assessment.

Errors in re-assessment of non-target lesions are the third category. These errors, in our database, were most often due to premature declaration of progressive disease based on a change in size of non-target lesions, by analogy with the target lesion assessment rules. This error comes from a misunderstanding of the basic RECIST 1.1 concept of dividing all the lesions detected in the patient's body into target lesions, which are measured and assessed quantitatively throughout the course of treatment, and non-target lesions, which are assessed qualitatively only and are never measured, regardless of whether they are measurable or not.
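Because only target lesions are assessed quantitatively, their timepoint response reduces to arithmetic on the sum of diameters. The following minimal Python sketch illustrates the target lesion rules discussed above: comparison with baseline or nadir (whichever is smaller), the combined 20% and 5 mm rule for PD, 0 mm for a resolved lesion, and NE when a lesion could not be assessed unless the measured lesions alone already establish PD. The function name and signature are hypothetical. The 30% threshold for partial response is the standard RECIST 1.1 value, which the text above does not restate, and the sketch assumes non-nodal target lesions only, since nodal targets need only shrink below 10 mm short axis for complete response.

```python
from typing import Optional, Sequence

PD_RELATIVE_INCREASE = 0.20      # at least 20% above the reference sum
PD_ABSOLUTE_INCREASE_MM = 5.0    # and at least 5 mm in absolute terms
PR_MAX_FRACTION_OF_BASELINE = 0.70   # at least 30% below the baseline sum


def target_lesion_response(
    baseline_sum_mm: float,
    nadir_sum_mm: float,
    current_mm: Sequence[Optional[float]],
) -> str:
    """Timepoint response for target lesions only.

    current_mm holds one entry per target lesion selected at baseline:
    a diameter in mm, 0.0 for a resolved lesion, or None when the lesion
    could not be assessed (e.g. obscured by a collapsed lung).
    """
    if not current_mm:
        raise ValueError("at least one target lesion is required")
    measured = [d for d in current_mm if d is not None]
    measured_sum = sum(measured)
    any_missing = len(measured) < len(current_mm)

    # PD is judged against the smallest sum on study, never against the
    # immediately preceding timepoint.
    reference = min(baseline_sum_mm, nadir_sum_mm)
    increase = measured_sum - reference
    if (increase >= PD_RELATIVE_INCREASE * reference
            and increase >= PD_ABSOLUTE_INCREASE_MM):
        # The measured lesions give a lower bound on the true sum, so PD
        # holds regardless of any missing lesion.
        return "PD"
    if any_missing:
        return "NE"
    if measured_sum == 0:
        return "CR"
    if measured_sum <= PR_MAX_FRACTION_OF_BASELINE * baseline_sum_mm:
        return "PR"
    return "SD"


if __name__ == "__main__":
    # Guideline example: baseline sum 50 mm, two of three lesions sum to 80 mm.
    print(target_lesion_response(50.0, 50.0, [45.0, 35.0, None]))
```

Non-target lesions and new lesions deliberately do not appear in this calculation; they are assessed qualitatively and combined with the target lesion result only when the overall timepoint response is assigned.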
In cases when investigators perform the assessment based on a radiological summary containing non-target lesion measurements, they should disregard those measurements for the purpose of RECIST 1.1 assessment. Another common difficulty in this category pertains to interpretation of changes in volume or reappearance of a pleural effusion or ascites that was present at baseline. Reappearance or an increase in volume of an effusion or ascites present at baseline does not represent progressive disease. A change in the volume of pre-existing exudate may be due to several reasons, including a reaction to the anticancer therapy itself.

The fourth category in our classification is perception of a new lesion by the reader. This, in our experience, is the most common cause of inter-reader variability in RECIST 1.1 assessment. Some readers do not acknowledge a new lesion if its size is smaller than the measurable lesion threshold. This contradicts the RECIST 1.1 definition of a new lesion: “A lesion identified on a follow-up study in an anatomical location that was not
scanned at baseline is considered a new lesion and will indicate disease progression…While there are no specific criteria for the identification of new radiographic lesions, the finding of a new lesion should be unequivocal: i.e. not attributable to differences in scanning technique, change in imaging modality or findings thought to represent something other than tumor.” In other words, any unequivocal new lesion, even a very small one and regardless of its size or location, indicates progressive disease.

The last error category described in this article relates to substantial deviations from the scanning schedule. Strict compliance with the scanning schedule is critical for obtaining accurate study results. Critical deviations from the tumour assessment visit window outlined in the study protocol are considered major protocol deviations because they may exclude the patient from the final statistical analysis and thus compromise the trial results. When a treatment cycle is delayed, for example due to toxicity, and treatment visits are re-scheduled, tumour assessment visits should continue in strict compliance with the initial schedule. Investigators often wrongly ignored this requirement and delayed the scanning visit so that it coincided with the delayed treatment visit, in order to avoid additional patient visits to the clinic. However, a gradual shift in the scanning schedule to align it with the treatment schedule threatens the clinical trial results. In our view, investigators often make this error simply because the study protocol did not place enough emphasis on the importance of adhering to the scanning schedule, which should be taken into account when the protocol is written. Finally, investigators very often fail to perform the end of treatment RECIST 1.1 assessment in case of symptomatic deterioration.
Symptomatic deterioration, even when it is the reason for stopping therapy, is not a descriptor of an objective tumour response, and every effort should be made to document objective progression even after discontinuation of treatment.

Conclusion

In clinical trials with imaging-based primary endpoints, investigators, CRAs and radiologists should be aware that even minor deviations from the RECIST 1.1 guidelines can dramatically influence both the treatment decision and the clinical trial results. We found a substantial number of common errors across numerous international clinical sites. Proper training of all study stakeholders in the correct application of the current RECIST 1.1 criteria can help to avoid these pitfalls and limitations, thereby reducing the negative implications for patient management and clinical trial results.

References

1. Eisenhauer, E. A., Therasse, P., Bogaerts, J., Schwartz, L. H., Sargent, D., Ford, R., Dancey, J., Arbuck, S., Gwyther, S., Mooney, M., Rubinstein, L., Shankar, L., Dodd, L., Kaplan, R., Lacombe, D., Verweij, J. New response evaluation criteria in solid tumours: Revised RECIST guideline (version 1.1). European Journal of Cancer. 45, 228–247 (2009).
2. Choi, H. et al. Correlation of computed tomography and positron emission tomography in patients with metastatic gastrointestinal stromal tumor treated at a single institution with imatinib mesylate: proposal of new computed tomography response criteria. J Clin Oncol. 25(13), 1753–9 (2007).
3. Van der Veldt, A. A., et al. Choi response criteria for early prediction of clinical outcome in patients with metastatic renal cell cancer treated with sunitinib. Br J Cancer. 102(5), 803–9 (2010).
4. Sharma, M. R., Maitland, M. L., Ratain, M. J. RECIST: No Longer the Sharpest Tool in the Oncology Clinical Trials Toolbox. Cancer Research. 72 (20), October 15, 2012.
5. Darkeh, M. H. S. E., Suzuki, C., Torkzad, M. R. The minimum number of target lesions that need to be measured to be representative of the total number of target lesions (according to RECIST). The British Journal of Radiology. 82, 681–686 (2009).
6. Moskowitz, C. S., Jia, X., Schwartz, L. H., Gönen, M. A simulation study to evaluate the impact of the number of lesions measured on response assessment. European Journal of Cancer. 45, 300–310 (2009).
7. Verweij, J., Therasse, P., Eisenhauer, E., on behalf of the RECIST working group. Cancer clinical trial outcomes: Any progress in tumour-size assessment? European Journal of Cancer. 45, 225–227 (2009).
8. Kekelidze, M., Lodise, P., Tozakidou, M., Seitel, M., Bongartz, G. M., Basel, C.H., Rome, I.T., Heidelberg, D.E. 10 most frequently made mistakes with RECIST 1.1: how Radiologist can fail - and how to avoid them. European Society of Radiology. (2014).
9. Schwartz, L. H., Bogaerts, J., Ford, R., Shankar, L., Therasse, P., Gwyther, S., Eisenhauer, E. A. Evaluation of lymph nodes with RECIST 1.1. European Journal of Cancer. 45, 261–267 (2009).
10. Bogaerts, J., Ford, R., Sargent, D., Schwartz, L. H., Rubinstein, L., Lacombe, D., Eisenhauer, E., Verweij, J., Therasse, P., for the RECIST Working Party. Individual patient data analysis to assess modifications to the RECIST criteria. European Journal of Cancer. 45, 248–260 (2009).
11. Ford, R., Schwartz, L., Dancey, J., Dodd, L. E., Eisenhauer, E. A., Gwyther, S., Rubinstein, L., Sargent, D., Shankar, L., Therasse, P., Verweij, J. Lessons learned from independent central review. European Journal of Cancer. 45, 268–274 (2009).
12. Ganeshalingam, S., Koh, D-M. Nodal staging. Cancer Imaging. 9, 104–111 (2009).
13. Ratain, M. J., Sargent, D. J. Optimising the design of phase II oncology trials: The importance of randomization. European Journal of Cancer. 45, 275–280 (2009).
14. Dancey, J. E., Dodd, L. E., Ford, R., Kaplan, R., Mooney, M., Rubinstein, L., Schwartz, L. H., Shankar, L., Therasse, P. Recommendations for the assessment of progression in randomised cancer treatment trials. European Journal of Cancer. 45, 281–289 (2009).
15. Sargent, D. J., Rubinstein, L., Schwartz, L., Dancey, J. E., Gatsonis, C., Dodd, L. E., Shankar, L. K. Recommendations for the assessment of progression in randomised cancer treatment trials. European Journal of Cancer. 45, 290–299 (2009).
16. Forrest, J. V., Friedman, P. J. Radiologic Errors in Patients With Lung Cancer. The Western Journal of Medicine. 13, 485–490 (1981).
Iryna Teslenko, M.D., MSc, MBA is Director of Medical Monitoring & Consulting at PSI CRO AG. She is a board-certified physician and holds a Master of Science degree in radiology diagnostics. She has more than 11 years of experience in clinical research as a clinical research professional. She is also the author/co-author of more than 20 publications. E-mail: iryna.teslenko@psi-cro.com
Maxim Belotserkovskiy, M.D., Ph.D., MBA is Head of Medical Affairs at PSI CRO AG, and has board certifications in internal medicine, rheumatology, anaesthesiology and intensive care, and haemodialysis. He is a Certified Associate Professor of Pathological Physiology. He has more than 25 years of experience in clinical research as an investigator and clinical research professional. He is also the author/co-author of more than 140 publications. E-mail: maxim.belotserkovsky@psi-cro.com
Akhil Kumar, MD (USA), is a board-certified haematologist and oncologist in the USA. He currently works as an independent consultant and has been a medical director at PSI CRO AG, MGI Pharma, and GlaxoSmithKline. Prior to joining the pharmaceutical industry in 2005, PSI, he was an assistant professor of medicine and medical oncology at the Rutgers Cancer Institute of New Jersey (CINJ). He is an author of 11 publications. E-mail: akhil.kumar@psi-cro.com

Journal for Clinical Studies