# Chapter 18: Presentation formats

# Construction of score chart

Below we discuss the construction of the score chart.

### 1. Multiply and round regression coefficients of binary predictors and dummy variables of categorical predictors

The first step is to multiply regression coefficients and round them to scores. A simple approach is to multiply coefficients by 10. But we can also search for smaller rounded scores. For example, the coefficients of the binary predictors Teratoma, Pre.AFP, and Pre.HCG were quite similar (approximately 0.8 for the penalized coefficients, Table 18.4). We multiply by 10/8 to give these 3 predictors each a score of 1. In general, we can often find lower numbers for multiplication that still allow for a refined prediction.

The R command for a search of scaling factor for the penalized coefficients was:

for (i in seq(1, 10, by=0.5)) {

cat("Multiply by:", 10/i, "i=", i, "Coefs:", round(full.pen$coefficients[-1] * 10 / i), "\n") }

### 2. Search for score for continuous predictors

Three continuous predictors are considered in the prediction model: LDH, size, and reduction. We decided for a log transformation (LDHst), square root transformation (Post.size), and linear (Reduction in size during treatment). These continuous predictors need to be rescaled in such a way that a **1 point in score corresponds approximately to a 10/8 increase in log odds**.

We first treat these predictors as continuous variables, and later consider categorization as a further simplification. We go through several steps: a) score 0: convenient? With a predictor such as age, it is often strange to have a score of 0 at 0 years. We may need to change the reference point to a sensible value, which we subtract from the original value.

Score points: We aim to find values of the predictors where the scores are +1, +2, etc points, depending on the distribution of predictor values and the predictor effects.

Intermediate scores: We have to think about intermediate values, which have e.g. a score of +0.5 points. Other values can then be scored by linear interpolation.

We consider these 3 steps for the 3 continuous predictors in the testicular cancer histology prediction model (LDH, postchemotherapy size, and reduction in size).

log(LDHst)

sqrt(size)

I(reduction)

### 3. Estimate the common multiplication factor

After finding a suitable set of weights we need to find the multiplication factor for the scores. In the testicular cancer model we round after multiplication with the factor 10/8 in a logistic model. To compensate for this multiplication, the common multiplication factor was found to be 0.86 in a fit with this rescaled linear predictor. This factor can be multiplied with the shrinkage factor (in this case: 0.95) to obtain shrunk predictions (shrunk.beta=0.81).

The calculation is as follows , where we omit the intercept from the set of coefficients:

score.fit <- lrm(NEC~Teratoma+Pre.AFP+Pre.HCG+LDHr+SQPOSTr+REDUC5, data=n544,x=T,y=T)

rounded.lp <- score.fit$x %*% rep(1,6) # All scores a weight of 1

# multiplier makes the rescaled factors for logistic formula

multiplier <- lrm.fit(y=score.fit$y, x=rounded.lp)$coef[2] # this finds: 0.86

shrunk.beta <- 0.95 * multiplier # shrinkage * 0.86

shrunk.beta # shrunk multiplier for better predictions, 0.81

### 4. Estimate the intercept, check *c* statistic, and present as score chart

*c*statistic, and present as score chart

We estimate the intercept corresponding to the scores, using the rounded coefficients multiplied with the shrunk.beta coefficient as an offset variable:

lrm.fit(y=score.fit$y, offset=shrunk.beta*rounded.lp) # The formula is: lp = –1.94 + 0.81*score.

We check the deterioration in discriminative performance. The *c* statistic of the original model was **0.839**; uniform shrinkage does not affect this value. We find that the *c* statistic with rounded scores and using continuous predictors is only 0.001 lower (**0.838**).

The final score chart can be constructed in several ways. Especially, the presentation of values for continuous predictors is possible with scores horizontally, or vertically. A more general approach is to use the **rms::nomogram()** function

## R script for testicular cancer case study

Several variants to present the testicular cancer model are shown in this R script, starting with the **n544** data set. Enjoy!

### Derive a table with risk estimates (Br J Cancer 1996)

## make a score from 0 - 5 for a simple table

## and leave postsize on one axis; exclude increase in size (low p(nec))

score5 <- n544$Teratoma+n544$Pre.AFP+n544$Pre.HCG+n544$PRELDH+n544$REDUCr

score5 <- ifelse(n544$REDUCr<0,0,score5)

describe(score5)

# Simple coding: 5 categories for postsize (no difference 20-30 and 30-50 mm)

n544$POST5 <- ifelse(n544$SQPOST<=sqrt(10),0,

ifelse(n544$SQPOST<=sqrt(20),1,

ifelse(n544$SQPOST<=sqrt(30),2,

ifelse(n544$SQPOST<=sqrt(50),3,4))))

POST5 <- n544$POST5[n544$REDUCr>-1]

y <- n544$NEC[n544$REDUCr>-1]

x <- ftable(as.data.frame(cbind(score5,POST5,y)))

ftable(x, col.vars = c(1,3))

# Predictions from simplified lrm model

full.simple2 <- lrm(y~as.factor(POST5)+score5,x=T,y=T,se.fit=T)

# Calculate predicted probabilities + 95% CI

x <- cbind(full.simple2$x,

round(plogis(full.simple2$linear.predictors),2),

round(plogis(full.simple2$linear.predictors-1.96*full.simple2$se.fit),2),

round(plogis(full.simple2$linear.predictors+1.96*full.simple2$se.fit),2))

unique(x[order(full.simple2$x[,1],full.simple2$x[,2],full.simple2$x[,3],full.simple2$x[,4]),])

## C stat for this Table; which makes 5 groups (4 cut-offs in ROC curve)

n544$simple.cat <- ifelse(n544$POST5==3 & score5==4,70,

ifelse(n544$POST5==3 & score5==5,80,

ifelse(n544$POST5==2 & score5==3,60,

ifelse(n544$POST5==2 & score5==4,80,

ifelse(n544$POST5==2 & score5==5,90,

ifelse(n544$POST5<2 & score5==2,60,

ifelse(n544$POST5<2 & score5==3,70,

ifelse(n544$POST5<2 & score5==4,80,

ifelse(n544$POST5<2 & score5==5,90, 50)))))))))

describe(n544$simple.cat)

rcorr.cens(n544$simple.cat, n544$NEC)

#########################################

### Graph for impact of mass size (Radiology 2000)

## graph pre - post with lines for score 1 - 4

score4 <- n544$TER+n544$PREAFP+n544$PREHCG+n544$PRELDH

lrm(n544$NEC ~ as.vector(score4) + as.vector(n544$REDUC10))

full.simple3 <- lrm.fit(y=n544$NEC,x=cbind(score4,n544$SQPOST,n544$REDUC10))

full.simple3

# reduc = (pre-post) / pre; reduc = 1 - post/pre; reduc-1 = -post/pre; pre = -post/(reduc-1)

n544$presize <- - n544$SQPOST^2 / ifelse((n544$REDUC10/10 - 1) != 0,(n544$REDUC10/10 - 1),-.04)

describe(n544$presize)

# Now calculate iso probability lines: e.g. prob -70%,

# depending on postsize (horizontal), and presize (vertical)

cbind(n544$presize, n544$SQPOST^2, plogis(full$linear.predictor))

# calculate PRESIZE from SQPOST, for given probabilities

presize = f(sqpost, plogis(full))

pre = -post/(reduc10-10) # for reduc10<10

# range sqpost from 2 to 50; score 4 0 - 4

full.simple3$coef

x <- as.matrix(cbind(rep(0:4,49), rep(sqrt(2:50),5)))

lp.simple3 <- x %*% full.simple3$coef[2:3] + full.simple3$coef[1]

lp.simple3

# now calculate reduc10 with condition p=70% etc

# Solve equation qlogis(.7) = lp.simple3 + full.simple3$coef[4] * n544$REDUC10

reduc10.70 <- (qlogis(.7) - lp.simple3) /full.simple3$coef[4]

# Calculate presizes: pre = -post/(reduc10/10-1)

presize.70 <- - rep(2:50,5) / (reduc10.70/10 - 1)

# in 1 formula for efficiency

presize.90 <- - rep(2:50,5) / (((qlogis(.9) - lp.simple3) /full.simple3$coef[4])/10 - 1)

presize.80 <- - rep(2:50,5) / (((qlogis(.8) - lp.simple3) /full.simple3$coef[4])/10 - 1)

presize.60 <- - rep(2:50,5) / (((qlogis(.6) - lp.simple3) /full.simple3$coef[4])/10 - 1)

presize.50 <- - rep(2:50,5) / (((qlogis(.5) - lp.simple3) /full.simple3$coef[4])/10 - 1)

x <- as.matrix(cbind(rep(0:4,49), rep(sqrt(2:50),5),rep(2:50,5),lp.simple3,reduc10.70,

presize.70,presize.90,presize.80,presize.60,presize.50 ))

colnames(x) <- Cs(score4, sqpost,postsize,lp,reduc1070,

presize70,presize90,presize80,presize60,presize50)

x <- as.data.frame(x)

x[1:10,]

x[x<0] <- NA

# Try to make some plots; isoprobability lines for different scores, x-axis=postsize

par(mfrow=c(2,2))

for (i in 1:4) {

plot(y=x[x$score4==i,"postsize"],x=x[x$score4==i, "presize50"], main=paste("Score=",i,sep=""),

ylab='Postchemo size (mm)', xlab="Prechemo size (mm)",

xlim=c(0,100), ylim=c(0,50),type="p", pch="50%", axes=F,las=1)

axis(side=2,at=c(0,10,20,30,50))

axis(side=1,at=c(0,20,50,100))

points(y=x[x$score4==i,"postsize"],x=x[x$score4==i, "presize60"], pch="60%")

points(y=x[x$score4==i,"postsize"],x=x[x$score4==i, "presize70"], pch="70%")

points(y=x[x$score4==i,"postsize"],x=x[x$score4==i, "presize80"], pch="80%")

points(y=x[x$score4==i,"postsize"],x=x[x$score4==i, "presize90"], pch="90%")

} # end loop over score 1 - 4

#########################################

### Meta-model with tree presentation (Int J Cancer 1999)

## Start with dichotomizing predictions from full as <70% vs >=70%

n544$predfull <- plogis(full$linear.predictor)

n544$predhigh <- ifelse(plogis(full$linear.predictor)<.7,0,1)

rcorr.cens(n544$predhigh, n544$NEC)

table(n544$predhigh, n544$NEC)

# So: low prediction, NEC 129:273; high prediction, NEC 116:26

273 /(273+26) # sensitivity for residual tumor/teratoma

116 /(116+129) # specificity for necrosis

# Make tree model for this outcome

library(rpart)

options(digits=3)

tree.orig <- rpart(NEC ~ Teratoma+Pre.AFP+Pre.HCG+LDHst+Post.size+Reduction,

data=n544)

plot(tree.orig)

text(tree.orig, use.n=F)

tree.meta <- rpart(predhigh ~ Teratoma+Pre.AFP+Pre.HCG+LDHst+Post.size+Reduction, data=n544)

plot(tree.meta)

text(tree.meta, use.n=F)

## Make smooth tree presentation; classification with reduction, teratoma, AFP

score3 <- as.numeric(n544$Reduction>70)+as.numeric(n544$Teratoma==1)+as.numeric(n544$Pre.AFP==1)

n544$tree.cat <- ifelse(n544$Reduction>50 & score3>=2,1,0)

describe(n544$tree.cat)

rcorr.cens(n544$tree.cat, n544$NEC)

table(n544$tree.cat, n544$NEC)

# So: low prediction, NEC 134:264; high prediction, NEC 111:35

264 /(264+35) # sensitivity for residual tumor/teratoma

111 /(111+134) # specificity for necrosis

## End illustrations of presentation formats