Datasets

The transitionMatrix distribution includes a number of datasets to support testing and training. They come in two main types:

  • State Transition Data (used in estimation). There are both dummy (synthetic) examples and some actual data. Transition data are usually in CSV format.

  • Transition Matrices and multi-period sets of matrices (again both dummy and actual examples). Transition matrices are usually in JSON format.

State Transition Data

The example scripts are located in examples/python. For testing purposes, all examples can be run with the run_examples.py script in the root directory. Some scripts have an example flag that selects alternative input data or estimators.

List of Transition Datasets

File                 | Format    | Events | Entities | States | Generator            | Description
---------------------|-----------|--------|----------|--------|----------------------|------------
rating_data_raw.csv  | Compact   | 4000   | 1829     | 9      | Extract              | A typical credit rating dataset
rating_data.csv      | Compact   | 3780   | 1642     | 9      | Data cleaning script | A typical credit rating dataset
scenario_data.csv    | Compact   | 550    | 50       | 5      |                      |
synthetic_data.csv   | Compact   | 100    | 10       | 2      |                      |
synthetic_data1.csv  | Compact   | 100    | 1        | 4      | Generator(=1)        | Duration type datasets (Compact format)
synthetic_data2.csv  | Compact   | 10000  | 1000     | 2      | Generator(=2)        | Duration type datasets (Compact format)
synthetic_data3.csv  | Compact   | 2000   | 100      | 7      | Generator(=3)        | Duration type datasets (Compact format)
synthetic_data4.csv  | Compact   | 10000  | 1000     | 8      | Generator(=4)        | Cohort type dataset (Generic Rating Matrix). Offers a semi-realistic example
synthetic_data5.csv  | Compact   | 50000  | 10000    | 3      | Generator(=5)        | Large cohort type dataset useful for testing convergence
synthetic_data6.csv  | Compact   | 20000  | 1000     | 2      | Generator(=6)        | Cohort type datasets
synthetic_data7.csv  | Canonical | 1295   | 1000     | 8      | Generator(=7)        | Duration type datasets in Long Format
synthetic_data8.csv  | Canonical | 10000  | 10000    | 2      | Generator(=8)        | Duration type datasets in Long Format
synthetic_data9.csv  | Canonical | 1338   | 1000     | 8      | Generator(=9)        | Duration type datasets in Long Format
synthetic_data10.csv | Canonical | 12000  | 2000     | 9      | Generator(=10)       | Credit Rating Migrations in Long Format / Compact Form
test.csv             | Compact   | 14     | 7        | 3      |                      |
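
The Events, Entities and States figures listed for each dataset can be checked with a few lines of standard-library Python. The following is a minimal sketch, not part of the transitionMatrix package: it assumes a compact-format file has one row per observation with columns named ID, Time and State, and that Events counts rows. The column names, the summarize helper and the inline sample data are all hypothetical.

```python
import csv
import io

# Hypothetical compact-format sample: one row per observed (entity, time, state).
# The column names ID, Time and State are assumptions made for illustration.
SAMPLE = """ID,Time,State
0,0.0,A
0,1.5,B
1,0.0,A
1,2.0,C
2,0.0,B
2,0.7,A
"""


def summarize(handle):
    """Count events (rows), distinct entities and distinct states."""
    events = 0
    entities = set()
    states = set()
    for row in csv.DictReader(handle):
        events += 1
        entities.add(row["ID"])
        states.add(row["State"])
    return {"events": events, "entities": len(entities), "states": len(states)}


summary = summarize(io.StringIO(SAMPLE))
print(summary)
```

To apply the same check to one of the CSV files, pass an open file handle instead of the in-memory sample and adjust the column names to match the file header.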

Transition Matrices

  • generic_monthly

  • generic_multiperiod

  • JLT

  • sp 2017
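
Since transition matrices are usually stored as JSON, a quick sanity check after loading one is to confirm that it is square and that each row is a probability distribution. The sketch below uses only the standard library and assumes a single matrix is serialized as a list of rows; that layout, the is_stochastic helper and the sample values are assumptions for illustration and may not match the actual files.

```python
import json
import math

# Hypothetical JSON payload: one matrix stored as a list of rows.
SAMPLE = "[[0.9, 0.1, 0.0], [0.05, 0.9, 0.05], [0.0, 0.0, 1.0]]"


def is_stochastic(matrix, tol=1e-9):
    """Return True if the matrix is square, non-negative and rows sum to one."""
    n = len(matrix)
    for row in matrix:
        if len(row) != n:
            return False  # not square
        if any(p < 0 for p in row):
            return False  # negative probability
        if not math.isclose(sum(row), 1.0, abs_tol=tol):
            return False  # row does not sum to one
    return True


matrix = json.loads(SAMPLE)
print(is_stochastic(matrix))
```

A multi-period set could be checked the same way by looping over its matrices, assuming it is stored as a list of such matrices.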