Datasets

The transitionMatrix distribution includes a number of datasets that support testing and training. The datasets come in two main types:

  • State Transition Data (used in estimation). There are both dummy (synthetic) examples and some actual data. Transition data are usually in CSV format.
  • Transition Matrices and Multi-period Sets of matrices (again both dummy and actual examples). Transition matrices are usually in JSON format.

State Transition Data

The example scripts are located in examples/python. For testing purposes, all examples can be run using the run_examples.py script located in the root directory. Some scripts have an example flag that selects alternative input data or estimators.

List of Transition Datasets
File                 | Format    | Events | Entities | States | Generator            | Description
---------------------+-----------+--------+----------+--------+----------------------+------------------------------------------------------
rating_data_raw.csv  | Compact   |   4000 |     1829 |      9 | Extract              | A typical credit rating dataset
rating_data.csv      | Compact   |   3780 |     1642 |      9 | Data cleaning script | A typical credit rating dataset
scenario_data.csv    | Compact   |    550 |       50 |      5 |                      |
synthetic_data.csv   | Compact   |    100 |       10 |      2 |                      |
synthetic_data1.csv  | Compact   |    100 |        1 |      4 | Generator(=1)        | Duration type datasets (Compact format)
synthetic_data2.csv  | Compact   |  10000 |     1000 |      2 | Generator(=2)        | Duration type datasets (Compact format)
synthetic_data3.csv  | Compact   |   2000 |      100 |      7 | Generator(=3)        | Duration type datasets (Compact format)
synthetic_data4.csv  | Compact   |  10000 |     1000 |      8 | Generator(=4)        | Cohort type dataset (Generic Rating Matrix). Offers a semi-realistic example
synthetic_data5.csv  | Compact   |  50000 |    10000 |      3 | Generator(=5)        | Large cohort type dataset useful for testing convergence
synthetic_data6.csv  | Compact   |  20000 |     1000 |      2 | Generator(=6)        | Cohort type datasets
synthetic_data7.csv  | Canonical |   1295 |     1000 |      8 | Generator(=7)        | Duration type datasets in Long Format
synthetic_data8.csv  | Canonical |  10000 |    10000 |      2 | Generator(=8)        | Duration type datasets in Long Format
synthetic_data9.csv  | Canonical |   1338 |     1000 |      8 | Generator(=9)        | Duration type datasets in Long Format
synthetic_data10.csv | Canonical |  12000 |     2000 |      9 | Generator(=10)       | Credit Rating Migrations in Long Format / Compact Form
test.csv             | Compact   |     14 |        7 |      3 |                      |
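
The Events, Entities and States counts listed above can be recomputed for any CSV dataset. The following is a minimal sketch, not part of the library: it assumes a compact layout with columns named ID, Time and State and one event per row. The column names, the summarize helper and the sample rows are all illustrative assumptions, so adjust them to the header of the file you actually load.

```python
import csv
import io

# Hypothetical sample in an assumed compact layout (ID, Time, State).
# These rows are illustrative and not taken from any shipped dataset.
SAMPLE = """ID,Time,State
0,0.0,0
0,1.5,1
1,0.0,1
1,2.0,0
1,3.5,2
"""


def summarize(handle):
    """Count events (rows), distinct entities and distinct states."""
    reader = csv.DictReader(handle)
    events = 0
    entities = set()
    states = set()
    for row in reader:
        events += 1                 # assumption: one event per row
        entities.add(row["ID"])     # assumed entity identifier column
        states.add(row["State"])    # assumed state label column
    return {"events": events, "entities": len(entities), "states": len(states)}


summary = summarize(io.StringIO(SAMPLE))
print(summary)  # {'events': 5, 'entities': 2, 'states': 3}
```

To apply the same check to a file on disk, pass an open file handle instead of the in-memory sample, for example `with open(path, newline="") as f: summarize(f)`.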

Transition Matrices

  • generic_monthly
  • generic_multiperiod
  • JLT
  • sp 2017
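
Since transition matrices are usually stored as JSON, a quick sanity check after loading one is to confirm that it is a valid stochastic matrix. The sketch below is illustrative only: it assumes the JSON holds a single matrix as a list of rows, which may not match the layout of the shipped files, and the is_row_stochastic helper and the 3-state payload are hypothetical.

```python
import json

# Hypothetical JSON payload: a 3-state matrix stored as a list of rows.
# The actual layout of the shipped JSON files may differ.
PAYLOAD = "[[0.9, 0.08, 0.02], [0.1, 0.8, 0.1], [0.0, 0.0, 1.0]]"


def is_row_stochastic(matrix, tol=1e-9):
    """Return True if the matrix is square, non-negative and each row sums to one."""
    n = len(matrix)
    for row in matrix:
        if len(row) != n:
            return False            # not square
        if any(p < 0 for p in row):
            return False            # negative probability
        if abs(sum(row) - 1.0) > tol:
            return False            # row does not sum to one
    return True


matrix = json.loads(PAYLOAD)
valid = is_row_stochastic(matrix)
print(valid)
```

For a multi-period set, the same check could be applied to each matrix in turn, assuming the set is stored as a list of such matrices.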