Dec 26 2007

Option Value Calculations

The use of options extends traditional NPV to calculate what is termed “Expanded” or “Strategic” NPV. In both approaches the point of reference is the current point in time, or the effective date of appraisal, so both are limited by present perceptions of future risk. The difference is that Strategic NPV assumes that, should a negative situation arise in the future, it will be avoided if possible. This is the “option”: the option not to develop, or not to proceed down a negative path. The “Option Value” is simply the difference between the Strategic NPV and the Passive NPV, that is:

                             Option Value = Strategic NPV – Passive NPV

In simple concrete terms, consider the purchase of land for possible development of a subdivision in one year. There is reason to believe that there is a 30% chance that housing prices will fall far enough within the year that development would result in a negative NPV, and a calculated 70% chance that prices will be such that a positive NPV of $800,000 would result. The Passive or Traditional NPV approach weights the two outcomes and discounts the result to a present NPV. That is, the Passive NPV approach assumes that development will proceed even if negative conditions prevail. Strategic NPV assumes, on the contrary, that the developer has the option in one year to cancel the project and sell the land should house prices be unfavorable. As a result, the Strategic NPV will likely be higher than the Passive NPV: the land residual based on Strategic NPV will be positive and possibly higher than the land residual calculated on the basis of Passive NPV. This is as it should be. Passive NPV fails to take into consideration that the developer can change direction at some point in the future and so avert negative outcomes. The consequence of employing Traditional or Passive NPV in these kinds of situations is that incorrect decisions are made regarding Highest and Best Use or, in more general terms, that significant errors in valuation result.
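
As a rough illustration of the subdivision example, the following R sketch compares the two approaches. The text gives only the two probabilities and the $800,000 upside; the size of the downside NPV (-$500,000), the 10% discount rate, and the simplification that cancelling yields an NPV of $0 are assumptions made here purely for illustration.

```r
# Subdivision example: Passive vs. Strategic NPV (illustrative sketch).
p_good   <- 0.70      # chance of favorable prices (from the text)
p_bad    <- 0.30      # chance of unfavorable prices (from the text)
npv_good <- 800000    # NPV in one year if prices are favorable (from the text)
npv_bad  <- -500000   # ASSUMED downside NPV; the text gives no figure
rate     <- 0.10      # ASSUMED one-year discount rate

# Passive NPV: development proceeds in both states.
passive_npv <- (p_good * npv_good + p_bad * npv_bad) / (1 + rate)

# Strategic NPV: the developer cancels in the bad state, so the
# negative outcome is replaced by 0 (ASSUMED payoff from not developing).
strategic_npv <- (p_good * max(npv_good, 0) + p_bad * max(npv_bad, 0)) / (1 + rate)

# Option Value = Strategic NPV - Passive NPV
option_value <- strategic_npv - passive_npv

print(round(c(passive = passive_npv, strategic = strategic_npv, option = option_value)))
```

Under these assumed inputs the Passive NPV is about $372,700, the Strategic NPV about $509,100, and the Option Value about $136,400. A fuller treatment would also credit the resale value of the land in the unfavorable state, which would raise the Strategic NPV further.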

One characteristic of Strategic NPV formulas is that values are often expressed using terms based on max(V1, V2) functions. For example,

                                                    NPV = max(I,0)/R

or, in words, NPV is the larger of the expected income I and 0, divided by the required rate of return R. In other words, if the expected income I is negative, the developer uses $0 as the income; that is, the developer exercises the option NOT to proceed with the associated activity. One consequence is that, by avoiding risk, the developer can use a smaller rate of return, a “riskless” rate of return. Clearly this has a double impact on NPV: (1) negative income is removed, and (2) the required rate of return is lower, reflecting the lower risk. This typically results in a higher NPV or land residual, which does a better job of supporting accurate Highest and Best Use analysis.
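
A small R sketch of the max(I, 0)/R form may make the double impact concrete. The income figures and both rates below are hypothetical, and npv_with_option() is a name introduced here only for illustration.

```r
# NPV = max(I, 0) / R, applied element-wise to a vector of expected incomes.
# npv_with_option() is a HYPOTHETICAL helper name.
npv_with_option <- function(income, rate) pmax(income, 0) / rate

income        <- c(-20000, 50000)  # HYPOTHETICAL expected incomes
risky_rate    <- 0.12              # HYPOTHETICAL required return without the option
riskless_rate <- 0.06              # HYPOTHETICAL "riskless" return with the option

passive  <- income / risky_rate                      # negative income kept, higher rate
floored  <- npv_with_option(income, risky_rate)      # impact (1): negative income removed
combined <- npv_with_option(income, riskless_rate)   # impacts (1) and (2) together

print(rbind(passive, floored, combined))
```

The first column shows impact (1) alone: the negative income becomes 0. The second column shows impact (2): the same positive income is worth more once it is capitalized at the lower rate.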

An option may also involve choices with different costs, in which case occurrences of the min(cost1, cost2) function can be found.

Calculations for simple options can be fairly straightforward.  However, options can fall into more complex patterns that include any of the following:

  1. Option to defer investment.
  2. Option to expand.
  3. Option to contract.
  4. Option to temporarily shut down.
  5. Option to abandon for salvage value.
  6. Option to switch use.  
  7. Option to default on planned staged costs during construction.

(More to follow)


Dec 26 2007

Earth (An R language package for multivariate adaptive regression splines)

It has been my experience that, with a limited number of sales transactions in a complex real estate environment such as the San Francisco Bay Area, the best appraisal adjustments can be obtained using multivariate adaptive regression splines. My experience is based primarily on MARS® from Salford Systems. However, with a starting price of around $2,500 and yearly license fees of $1,200, it is a relatively expensive package for most appraisers. Recently, I have tried R packages such as earth and mda, which provide similar, although weaker, functionality.

While Earth doesn’t provide all of the features found in the Salford-Systems implementations, it does provide plotting and printing methods.  The biggest downside is the lack of cross-validation, although this could be implemented with some additional programming.  The following information is taken from the documentation:

Limitations

The following aspects of MARS are mentioned in Friedman’s papers but not implemented in earth:

  1. Piecewise cubic models
  2. Specifying which predictors must be entered linearly
  3. Specifying which predictors can interact
  4. Model slicing ("plotmo" goes part way)
  5. Handling missing values
  6. Logistic regression
  7. Special handling of categorical predictors
  8. Fast MARS h parameter
  9. Cross validation to determine penalty
  10. Anova tables with sigma and other information.

Large Models and Execution Time

For a given set of input data, the following can increase the speed of the forward pass:

  1. Decreasing fast.k
  2. Decreasing nk
  3. Decreasing degree
  4. Increasing threshold
  5. Increasing min.span.

The backward pass is normally much faster than the forward pass, unless pmethod="exhaustive". Reducing nprune reduces exhaustive search time. One strategy is to first build a large model and then adjust pruning parameters such as nprune using update.earth.

The Forward Pass

The forward pass adds terms in pairs until the first of the following conditions is met:

  1. Reach maximum number of terms (nterms>=nk)
  2. Reach DeltaRSq threshold (DeltaRSq<thresh), where DeltaRSq is the difference in RSq caused by adding the current term pair
  3. Reach max RSq (RSq>1-thresh)
  4. Reach min GRSq (GRSq < -10).

Set trace>=1 to see the stopping condition.

The result of the forward pass is the set of terms defined by $dirs and $cuts in earth’s return value.

Note that GCVs (via GRSq) are used during the forward pass only as one of the stopping conditions and in trace prints. Changing the penalty argument does not change the knot positions.

The various stopping conditions mean that the actual number of terms created by the forward pass may be less than nk. There are some other reasons why the actual number of terms may be less than nk:

  1. The forward pass discards one side of each term pair if it adds nothing to the model, but it counts terms as if they were actually created in pairs.
  2. As a final step, the forward pass deletes linearly dependent terms, if any, so all terms in $dirs and $cuts are independent.

And remember that the pruning pass will further discard terms.
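
The stopping conditions listed for the forward pass can be sketched as a simple check. This is only an illustration of the logic, not earth's implementation: forward_pass_stop() is a hypothetical helper, the default thresh of 0.001 is an assumption, and the DeltaRSq test is assumed to be DeltaRSq < thresh.

```r
# Illustrative check of the forward-pass stopping conditions.
# forward_pass_stop() is a HYPOTHETICAL helper, not part of earth.
# Conditions are tested in the order they are listed in the text.
forward_pass_stop <- function(nterms, nk, delta_rsq, rsq, grsq, thresh = 0.001) {
  if (nterms >= nk)        return("reached maximum number of terms")
  if (delta_rsq < thresh)  return("reached DeltaRSq threshold")   # ASSUMED form of the test
  if (rsq > 1 - thresh)    return("reached max RSq")
  if (grsq < -10)          return("reached min GRSq")
  NA_character_            # no condition met: keep adding term pairs
}

# MADE-UP state after adding a term pair that barely improved the fit
print(forward_pass_stop(nterms = 7, nk = 21, delta_rsq = 0.0004, rsq = 0.93, grsq = 0.90))
```

With these made-up values the check reports the DeltaRSq threshold, which is why a forward pass can end with far fewer than nk terms.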

The Pruning Pass

The pruning pass is handed the set of terms created by the forward pass. Its job is to find the subset of these terms that gives the lowest GCV. The pruning pass works like this: it determines the subset of terms (using pmethod) with the lowest RSS for each model size in 1:nprune (see the Force.xtx.prune argument in the earth documentation for some details). It saves the RSS and term numbers for each such subset in rss.per.subset and prune.terms. It then applies the Get.crit function with ppenalty to rss.per.subset to yield gcv.per.subset. It chooses the model with the lowest value in gcv.per.subset, and puts its term numbers into selected.terms. Finally, it runs lm to determine the fitted.values, residuals, and coefficients, by regressing the response y on the selected.terms of bx.

Set trace>=3 to trace the pruning pass. By default Get.crit is earth:::get.gcv. Alternative Get.crit functions can be defined. See the source code of get.gcv for an example.
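
The selection step of the pruning pass can be sketched in a few lines of base R. The GCV formula used here, RSS / (N * (1 - enp/N)^2) with an effective number of parameters enp = nterms + penalty * (nterms - 1) / 2, is the form usually given for MARS-style models and is an assumption about what get.gcv computes. The function gcv_sketch() and the RSS values are hypothetical.

```r
# HYPOTHETICAL stand-in for a Get.crit-style function: converts the best RSS
# for each subset size into a GCV. The formula is an assumption, not copied
# from earth's source.
gcv_sketch <- function(rss.per.subset, nobs, penalty = 2) {
  nterms <- seq_along(rss.per.subset)              # model sizes 1, 2, ...
  enp    <- nterms + penalty * (nterms - 1) / 2    # effective number of parameters
  rss.per.subset / (nobs * (1 - enp / nobs)^2)
}

rss.per.subset <- c(1000, 400, 250, 245, 243)      # MADE-UP best RSS per model size
gcv.per.subset <- gcv_sketch(rss.per.subset, nobs = 100)
best.size      <- which.min(gcv.per.subset)        # size of the selected model

print(round(gcv.per.subset, 3))
print(best.size)
```

With these made-up numbers the three-term model is selected even though the four- and five-term models have a lower RSS, because the small RSS gains do not offset the penalty on model size.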

Testing on New Data

This example demonstrates one way to train on 80% of the data and test on the remaining 20%. In practice a dataset larger than the one below should be used for splitting. Also, remember that the test set should not be used for parameter tuning; use GCVs or separate validation sets for that.

    library(earth)

    train.subset <- sample(1:nrow(trees), .8 * nrow(trees))
    test.subset <- (1:nrow(trees))[-train.subset]
    a <- earth(Volume ~ ., data = trees[train.subset, ])
    yhat <- predict(a, newdata = trees[test.subset, ])
    y <- trees$Volume[test.subset]
    print(1 - sum((y - yhat)^2) / sum((y - mean(y))^2)) # print R-Squared
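
Since cross-validation is not built in, one way to add it is a small k-fold loop around the same fit-and-predict steps. The sketch below is an illustration under stated assumptions: kfold_rsq() and its fit argument are hypothetical names, and lm is used as a stand-in fitting function so that the block runs without earth installed. Passing fit = earth (after library(earth)) should give the earth version, since earth accepts the same formula and data arguments.

```r
# HYPOTHETICAL helper: out-of-fold R-Squared from k-fold cross-validation.
# 'fit' is any function taking (formula, data) whose result works with
# predict(); lm is only a stand-in for earth here.
kfold_rsq <- function(formula, data, k = 5, fit = lm) {
  response <- all.vars(formula)[1]                        # name of the response column
  fold     <- sample(rep(1:k, length.out = nrow(data)))   # random fold label per row
  yhat     <- numeric(nrow(data))
  for (i in 1:k) {
    test       <- fold == i
    model      <- fit(formula, data = data[!test, ])      # train on the other folds
    yhat[test] <- predict(model, newdata = data[test, ])  # predict the held-out fold
  }
  y <- data[[response]]
  1 - sum((y - yhat)^2) / sum((y - mean(y))^2)
}

set.seed(1)                                               # reproducible fold assignment
cv.rsq <- kfold_rsq(Volume ~ ., data = trees, k = 5)
print(cv.rsq)
```

Every observation is predicted exactly once by a model that never saw it, so the resulting R-Squared is less optimistic than one computed on the training data. As with the single split above, it should not be reused for tuning and final assessment at the same time.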

Establishing Variable Importance

Establishing predictor importance is in general a tricky and even controversial problem.

Running plotmo with ylim=NULL (the default) gives an idea of which predictors make the largest changes to the predicted value.

You can also use drop1 (assuming you are using the formula interface to earth). Calling drop1(my.earth.model) will delete each predictor in turn from your model, rebuild the model from scratch each time, and calculate the GCV each time. You will get warnings that the earth library function extractAIC.earth is returning GCVs instead of AICs, but that is what you want, so you can ignore the warnings. The column labeled AIC in the printed response from drop1 will actually be a column of GCVs, not AICs. The Df column is not much use in this context.

You will get lots of output from drop1 if you built your original earth model with trace>0. You can set trace=0 by updating your model before calling drop1. Do it like this:

    my.model <- update.earth(my.model, trace=0)

Remember that these techniques only tell you how important a variable is with the other variables already in the model. There are alternative ways of measuring variable importance (using resampling) but they are not yet implemented.

Which Predictors Were Added to the Model First?

You can see the forward pass adding terms with trace=2 or higher. But remember, pruning will remove some of the terms. Another approach is to use

    summary(my.model, decomp="none")

which will list the basis functions remaining after pruning, in the order they were added by the forward pass.

Which Predictors Are Actually Used in the Model?

The following function will give you a list of the predictors in the model:

    get.used.pred.names <- function(obj) # obj is an earth object
        names(which(apply(obj$dirs[obj$selected.terms, , drop=FALSE] != 0, 2, any)))

Why Are There Fewer Terms Than nk, Even With pmethod="none"?

See the section above on the forward pass.

Multiple Response Models

If y has K columns then earth builds K simultaneous models. Each model has the same set of basis functions (i.e. same bx and selected.terms) but different coefficients (the returned coefficients will have K columns). The models are built and pruned as usual but with the GCVs and RSSs averaged across all K responses.

Since earth attempts to optimize for all models simultaneously, the results will not be as “good” as building the models independently. That is, the GCV of the combined model will not be as good as the GCVs for independent models, on the whole. However, the combined model may be a better model in other senses, depending on what you are trying to achieve.

For more details on using GCVs averaged over multiple responses, see section 4.1 of Hastie, Tibshirani, and Buja, Flexible Discriminant Analysis by Optimal Scoring, JASA, December 1994: http://www-stat.stanford.edu/~hastie/Papers/fda.pdf
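
As a hedged illustration of "same basis functions, different coefficients", the base R sketch below regresses two responses on one shared basis matrix. The basis is built by hand from two hinge functions with an arbitrary knot at Girth = 13; it is not the basis earth would choose, and using Volume and Height as the two responses is purely for demonstration.

```r
# HYPOTHETICAL shared basis: two hinge functions of Girth with an ASSUMED
# knot at 13 (not a knot selected by earth). lm adds the intercept.
bx <- cbind(h.up   = pmax(0, trees$Girth - 13),
            h.down = pmax(0, 13 - trees$Girth))

y <- cbind(Volume = trees$Volume, Height = trees$Height)  # K = 2 responses

fit   <- lm(y ~ bx)              # one regression per response, same basis
coefs <- coef(fit)               # 3 rows (intercept + 2 terms), K = 2 columns

rss.per.response <- colSums(residuals(fit)^2)   # one RSS per response
mean.rss         <- mean(rss.per.response)      # averaged across the K responses

print(coefs)
print(mean.rss)
```

The coefficient matrix has one column per response, and the averaged RSS is the kind of pooled quantity the text describes as driving term selection, so a term is kept or dropped for all responses at once.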

Using Earth with fda and mda

Earth can be used with fda and mda in the mda package. Earth will generate a multiple response model, as described above. Use keep.fitted=TRUE if you want to call plot.earth later (actually only necessary for large datasets, see the description of keep.fitted in fda). Use keepxy=TRUE if you want to call update or plotmo later. Use trace>=5 to see the call to earth generated by fda or mda. Example:

    library(mda)
    library(earth)

    (a <- fda(Species ~ ., data=iris, keep.fitted=TRUE, method=earth, keepxy=TRUE))
    plot(a)
    summary(a$fit) # examine earth model embedded in fda model
    plot(a$fit)
    plotmo(a$fit, ycolumn=1, ylim=c(-1.5,1.5), clip=FALSE)
    plotmo(a$fit, ycolumn=2, ylim=c(-1.5,1.5), clip=FALSE)

Warning and Error Messages

Earth prints most error and warning messages without printing the ‘call’. If you are mystified by a warning message, try setting options(warn=2) and using traceback.

Author(s)

Stephen Milborrow, derived from mda::mars by Trevor Hastie and Robert Tibshirani.

References

The primary references are the Friedman papers. Readers may find the MARS section in Hastie, Tibshirani, and Friedman a more accessible introduction. Faraway takes a hands-on approach, using the ozone data to compare mda::mars with other techniques. (If you use Faraway’s examples with earth instead of mars, use $bx instead of $x.) Earth’s pruning pass uses the leaps package, which is based on techniques in Miller.

Faraway, Extending the Linear Model with R. http://www.maths.bath.ac.uk/~jjf23

Friedman (1991) Multivariate Adaptive Regression Splines (with discussion). Annals of Statistics 19/1, 1–141

Friedman (1993) Fast MARS. Stanford University Department of Statistics, Technical Report 110. http://www-stat.stanford.edu/research/index.html

Hastie, Tibshirani, and Friedman (2001) The Elements of Statistical Learning. http://www-stat.stanford.edu/~hastie/pub.htm

Miller, Alan (1990, 2nd ed. 2002) Subset Selection in Regression