Reject Inference in Credit Scoring: Building a Robust Credit Scorecard

The key assumption in statistical model building is that the sample used to develop the model is representative of the overall population. In particular, the sample should be similar to the population to which the model will be applied. This holds for behavioural modeling, but not for application scorecards. For any application scorecard, the known good/bad (KGB) population consists only of applicants who were approved in the past. For the rejected population we have no performance information, so it cannot ordinarily be used for modeling. When the build sample is not similar to the target application population, sampling bias is introduced and the chances of the model performing reasonably well are greatly reduced.

Using Operating System Commands and Functions in SAS®: Zip and unzip in SAS code. gzip SAS, gunzip SAS

One of the big problems faced by analytics professionals, especially those who use SAS, is space, or rather the lack of space, in the file system. We often work with big datasets that trigger "destination directory full" error messages. This can happen in the permanent data directory where you are creating the data or in the work directory. It is annoying: the program terminates, wasting your work and important hours, sometimes days. You then have to rerun the process, either from scratch or from some intermediate milestone. I have faced this problem many times myself and always wished I had more space. Adding space is usually outside our control.

Model Validation Measures

Validation of risk management models in banks, along with disclosure of that validation, is one of the important parameters for the success of the Basel accord. While the Basel committee provides no clear guidelines on validation measures, it does give a clear framework to which a bank or financial institution should adhere. The framework says that independent validation should be done on a timely basis and should be well documented and declared. Given this, it is of paramount importance for banks to have clearly defined validation measures and sufficient disclosure of them.

The perils of the LAG function

A common belief among SAS® programmers is that the function LAG, when passed a variable name as its argument, returns the value of that variable from the previous observation. However, it does not always behave that way, as the following example shows. Suppose we create a dataset containing a single variable “category” as follows:

data test;
 input category $;