Risk Datasets#

In this tutorial we demonstrate the usage of the Bayesline Risk Datasets API, which lets us define new datasets that serve as the foundation for estimating factor risk models.

A risk dataset comprises all underlying data necessary to build a risk model:

  • factor exposures (including style exposures, industry/regional exposures, etc.)

  • asset price data

  • fundamental data (e.g. market caps)

  • security master data

In this first iteration, custom exposures can be ingested into the Bayesline ecosystem while Bayesline data is used for the rest. In subsequent product iterations users will be able to bring custom data for all other items, allowing them to mix and match which data is supplied by the user and which by Bayesline.

In this notebook we will introduce and explore:

  • system datasets and user datasets and how to list them

  • how to create a new risk dataset

  • how to create a new risk dataset with custom exposures

  • how to add exposures to an existing risk dataset

Imports & Setup#

For this tutorial notebook, you will need to import the following packages.

import datetime as dt
import numpy as np
import polars as pl

from bayesline.apiclient import BayeslineApiClient

from bayesline.api.equity import (
    ExposureSettings,
    FactorRiskModelSettings, 
    IndustrySettings,
    RegionSettings,
    RiskDatasetHuberRegressionExposureSettings,
    RiskDatasetSettings,
    RiskDatasetUnitExposureSettings,
    RiskDatasetUploadedExposureSettings,
    UniverseSettings
)

We will also need to have a Bayesline API client configured.

bln = BayeslineApiClient.new_client(
    endpoint="https://[ENDPOINT]",
    api_key="[API-KEY]",
)

The main entrypoint for the Risk Datasets API sits on bln.equity.riskdatasets. All dataset functionality can be reached from there.

See here for relevant docs:

risk_datasets = bln.equity.riskdatasets

Obtaining Available Datasets#

To list existing datasets we use the get_dataset_names method. When new datasets are created, their names appear in this list and are used downstream when creating risk model specifications.

We distinguish system and user datasets. System datasets are available to all users, e.g. the Bayesline-Global dataset. User datasets are created and owned by an individual user.

# default is "All", i.e. both system and user datasets
risk_datasets.get_dataset_names()
['Bayesline-US-500-1y', 'Bayesline-US-All-1y']
risk_datasets.get_dataset_names(mode="User")
[]
risk_datasets.get_dataset_names(mode="System")
['Bayesline-US-500-1y', 'Bayesline-US-All-1y']

A default dataset is a system dataset that is used in the absence of a concretely specified dataset when creating a risk model. By definition it is the first result of risk_datasets.get_dataset_names().

risk_datasets.get_default_dataset_name()
'Bayesline-US-500-1y'
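
For example, a risk model built from default settings without a dataset argument should resolve to this default. A minimal sketch; we assume here that FactorRiskModelSettings.default can also be called without a dataset argument:

# Assumption: omitting the dataset argument falls back to the default dataset.
default_settings = FactorRiskModelSettings.default()

# Equivalent to pinning the default explicitly:
pinned_settings = FactorRiskModelSettings.default(
    dataset=risk_datasets.get_default_dataset_name()
)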

Creating a New Dataset#

When creating a new risk dataset we use the create_dataset method, for which we need to provide a dataset name and a RiskDatasetSettings object.

At the bare minimum we need to specify a reference dataset, i.e. an existing dataset from which all input data will be sourced. Customization is then introduced by selectively specifying which data the user brings.

Note that the minimal configuration below effectively creates a copy of the reference dataset.

See here for relevant docs:

settings = RiskDatasetSettings(
    reference_dataset="Bayesline-US-All-1y"
)
risk_dataset_api = risk_datasets.create_dataset("My-Dataset", settings=settings)

The create_dataset invocation above does two things:

  1. It adds the given settings to the settings registry under the given name.

  2. It produces the physical dataset according to those settings.

Note that we could also have saved the settings to the settings registry directly, which would have skipped step 2. This is perfectly feasible but requires us to invoke the dataset creation separately (explained below).

We can verify that the settings were registered by inspecting the registry directly. Note that system datasets are not included here.

risk_datasets.settings.names()
{'My-Dataset': 0}
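
If we only want step 1, i.e. registering the settings without producing the physical dataset, we can write to the settings registry directly. A minimal sketch, assuming a hypothetical save method on the registry:

# Hypothetical: register the settings only; no physical dataset is produced yet.
# The dataset can later be materialized via load(...) and update() (see below).
risk_datasets.settings.save("My-Dataset-Settings-Only", settings)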

One thing we can do immediately with the returned risk_dataset_api is describe it and inspect the available styles, industries, etc. Note that these flow straight through to the relevant settings menus on bln.equity.universes, bln.equity.exposures, etc.

See here for relevant docs:

risk_dataset_props = risk_dataset_api.describe()

risk_dataset_props.exposure_settings_menu.styles
{'size': ['log_market_cap', 'log_total_assets'],
 'value': ['book_to_price'],
 'growth': ['price_to_earnings'],
 'volatility': ['sigma', 'sigma_eps', 'beta'],
 'momentum': ['mom6', 'mom12'],
 'dividend': ['dividend_yield'],
 'leverage': ['debt_to_assets', 'debt_to_equity']}
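
The group keys under styles are the labels we later pass to ExposureSettings, and the listed substyle names are the valid values. For example, a sketch mirroring the pattern used later in this tutorial:

sketch_exposure_settings = ExposureSettings(
    market=True,
    styles={
        # user-chosen group labels mapping to substyles from the menu above
        "Size": ["log_market_cap"],
        "Momentum": ["mom6", "mom12"],
    },
    regions=None,
    industries=None,
)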

Loading and Updating an Existing Dataset#

Step 2 from above can be invoked manually at any time to trigger a full recreation of the dataset, using the latest versions of all referenced datasets. To do this we simply load back the dataset we previously created, using either its name or its globally unique identifier. Note that the system tracks the versions of all input data, so a dataset won't be updated if it is already at the latest version.

risk_dataset_api = risk_datasets.load("My-Dataset")
update_result = risk_dataset_api.update()

The RiskDatasetUpdateResult gives summary information about the update process.

update_result
RiskDatasetUpdateResult()

Using the Custom Dataset#

We can now use the dataset to produce risk models.

riskmodel_engine = bln.equity.riskmodels.load(
    FactorRiskModelSettings.default(dataset="My-Dataset")
)
risk_model_api = riskmodel_engine.get()
risk_model_api.fret().head()
shape: (5, 23)
columns: date (date), plus 22 f32 factor-return columns over 2024-07-11 through 2024-07-17:
market.Market, industry.Energy, industry.Basic Materials, industry.Industrials,
industry.Consumer Cyclicals, industry.Consumer Non-Cyclicals, industry.Financials,
industry.Healthcare, industry.Technology, industry.Utilities, industry.Real Estate,
industry.Institutions, Associations & Organizations, industry.Government Activity,
industry.Academic & Educational Services, region.United States, style.Size,
style.Value, style.Growth, style.Volatility, style.Momentum, style.Dividend,
style.Leverage
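
The fret() result is a plain Polars DataFrame with a date column plus one f32 column per factor, so standard transformations apply directly. For example, compounding the daily factor returns into cumulative factor returns (assumes a recent Polars version with cum_prod):

fret_df = risk_model_api.fret()
factor_cols = [c for c in fret_df.columns if c != "date"]

# Compound each daily factor return series into a cumulative return.
cum_fret_df = fret_df.sort("date").with_columns(
    [(pl.col(c) + 1.0).cum_prod() - 1.0 for c in factor_cols]
)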

Updating System Datasets#

Users with administrator permissions can update system wide datasets, e.g. Bayesline-Global. In practice this means that the remote source will be checked for new data (e.g. as part of a daily data update) and any changes will be incorporated into the Bayesline ecosystem. Updating a system dataset affects all users.

bayesline_risk_dataset = risk_datasets.load("Bayesline-US-All-1y")
bayesline_risk_dataset.update()
RiskDatasetUpdateResult()

Creating a Custom Risk Dataset#

For the remainder of the tutorial we will:

  • upload sample exposures

  • upload a simulated time series (e.g. an oil price)

  • create a custom risk dataset using the uploaded exposures, and estimate Huber regression exposures for the simulated time series.

Uploading Custom Exposures#

In the example below we upload a set of sample exposures for the top 100 US companies. We first use the Exposures API to create the sample exposures, and then the Uploaders API (see the Data Uploaders Tutorial for a detailed walkthrough) to upload them as a custom exposure dataset.

Creating Sample Exposures#

exposures_api = bln.equity.exposures.load(ExposureSettings(
    market=False,
    styles={
        "Size": ["log_market_cap"],
        "Momentum": ["mom12"],
    },
    regions=None,
    industries=None,
))
universe_settings = UniverseSettings(
    dataset="Bayesline-US-All-1y",
    region=RegionSettings(
        hierarchy="continent",
        include=["USA"],
    )
)
exposures_df = exposures_api.get(universe_settings)
exposures_df.tail()
shape: (5, 4)
date        bayesid  style.Size  style.Momentum
date        str      f32         f32
2025-07-08  "ZSPC"   -0.812988   -0.376465
2025-07-08  "ZTR"    -0.786621   1.366211
2025-07-08  "ZUMZ"   -0.904785   -1.102539
2025-07-08  "ZVIA"   -0.904785   1.3671875
2025-07-08  "ZVVT"   -1.046875   0.839844
top_100_assets = (
    exposures_df
    .group_by("bayesid")
    .agg(pl.col("style.Size").mean())
    .sort("style.Size")
    .tail(100)
    .select("bayesid")
)

top_100_assets.head()
shape: (5, 1)
bayesid
str
"VRTX"
"BSX"
"92290874"
"MDT"
"SBUX"
exposures_df = (
    exposures_df
    .join(top_100_assets, on="bayesid", how="semi")
)

Upload the Sample Exposures#

uploaders = bln.equity.uploaders
uploaders.get_data_types()
['exposures', 'factors', 'hierarchies', 'portfolios']
exposure_uploader = uploaders.get_data_type("exposures")
exposure_dataset = exposure_uploader.get_or_create_dataset("My-US-Top100-Exposures")

For the uploader we need to provide one of the accepted input formats. Below we choose the Long-Format parser and transform our exposures_df to fit this format.

exposure_dataset.get_parser_names()
['Long-Format', 'Wide-Format']
exposure_dataset.get_parser("Long-Format").get_examples()[0]
shape: (8, 6)
date        asset_id  asset_id_type  factor_group  factor        exposure
date        str       str            str           str           f64
2025-01-06  "GOOG"    "cusip9"       "style"       "momentum_6"  -0.3
2025-01-06  "GOOG"    "cusip9"       "market"      "market"      1.0
2025-01-06  "AAPL"    "cusip9"       "style"       "momentum_6"  0.1
2025-01-06  "AAPL"    "cusip9"       "market"      "market"      1.0
2025-01-07  "GOOG"    "cusip9"       "style"       "momentum_6"  -0.28
2025-01-07  "GOOG"    "cusip9"       "market"      "market"      1.0
2025-01-07  "AAPL"    "cusip9"       "style"       "momentum_6"  0.0
2025-01-07  "AAPL"    "cusip9"       "market"      "market"      1.0
upload_df = (
    exposures_df.filter(pl.col("date") <= dt.date(2024, 12, 31))
    .rename({"bayesid": "asset_id", "style.Size": "size", "style.Momentum": "momentum"})
    .unpivot(
        on=["size", "momentum"],
        index=["date", "asset_id"],
        variable_name="factor",
        value_name="exposure",
    )
    .with_columns(
        pl.lit("bayesid").alias("asset_id_type"),
        pl.lit("style").alias("factor_group")
    )
)

upload_df.tail()
shape: (5, 6)
date        asset_id    factor      exposure   asset_id_type  factor_group
date        str         str         f32        str            str
2024-12-31  "VZ"        "momentum"  0.144531   "bayesid"      "style"
2024-12-31  "WFC"       "momentum"  1.463867   "bayesid"      "style"
2024-12-31  "WMT"       "momentum"  2.277344   "bayesid"      "style"
2024-12-31  "XOM"       "momentum"  -0.068665  "bayesid"      "style"
2024-12-31  "Y0486S10"  "momentum"  0.584961   "bayesid"      "style"
exposure_dataset.fast_commit(upload_df, mode="append")
UploadCommitResult(version=1, committed_names=[])

To verify that our data was uploaded correctly we can obtain the data back from the exposure dataset.

exposure_dataset.get_data().collect().head()
shape: (5, 6)
date        asset_id    asset_id_type  factor_group  factor  exposure
date        str         str            str           str     f32
2024-07-10  "00163T10"  "bayesid"      "style"       "size"  2.400391
2024-07-10  "00287Y10"  "bayesid"      "style"       "size"  3.003906
2024-07-10  "04206820"  "bayesid"      "style"       "size"  2.431641
2024-07-10  "09253U10"  "bayesid"      "style"       "size"  2.611328
2024-07-10  "30303M10"  "bayesid"      "style"       "size"  3.517578
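
As an additional consistency check (assuming the dataset was empty before our commit), the number of stored rows should match the number of rows we uploaded:

uploaded_df = exposure_dataset.get_data().collect()

# Every committed row should be present exactly once.
assert uploaded_df.height == upload_df.height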

Uploading Time Series Data#

In the example below we upload two hypothetical time series (e.g. the oil price or the total returns of a technology index). These can then be used to create asset-level exposures using Bayesline's Huber regression framework.

Creating Sample Time Series#

Below we simply create two random time series by sampling a normal distribution with a positive drift.

dates = upload_df["date"].unique().sort()
mu, sigma = 0.0002, 0.01
rng = np.random.default_rng(seed=42)
returns_oil = rng.normal(mu, sigma, size=len(dates))
returns_tech = rng.normal(mu, sigma, size=len(dates))

returns_df = pl.DataFrame({
    "date": dates,
    "returns_oil": returns_oil,
    "returns_tech": returns_tech
})

returns_df.tail()
shape: (5, 3)
date        returns_oil  returns_tech
date        f64          f64
2024-12-27  -0.006348    -0.020432
2024-12-28  0.004661     -0.005711
2024-12-29  -0.00435     0.006109
2024-12-30  -0.012056    -0.015616
2024-12-31  -0.012579    0.014959
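
As a quick sanity check on the simulation, the sample standard deviation should come out close to sigma; with only around 120 daily observations the sample mean remains noisy around mu:

# Sample moments of the simulated draws.
print(returns_oil.mean(), returns_oil.std())
print(returns_tech.mean(), returns_tech.std())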

Upload the Factor Time Series#

This is the same as above, except that we use the factors data type for the uploader.

uploaders.get_data_types()
['exposures', 'factors', 'hierarchies', 'portfolios']
factor_uploader = bln.equity.uploaders.get_data_type("factors")
factor_ts_dataset = factor_uploader.get_or_create_dataset("Oil-and-Tech-Returns")
factor_ts_dataset.fast_commit(returns_df, mode="append")
UploadCommitResult(version=1, committed_names=[])
factor_ts_dataset.get_data().collect().tail()
shape: (5, 3)
date        factor          value
date        str             f32
2024-12-27  "returns_tech"  -0.020432
2024-12-28  "returns_tech"  -0.005711
2024-12-29  "returns_tech"  0.006109
2024-12-30  "returns_tech"  -0.015616
2024-12-31  "returns_tech"  0.014959

Creating the Custom Risk Dataset#

Recall that we named the custom exposure dataset My-US-Top100-Exposures and the factor time series dataset Oil-and-Tech-Returns. We will use these names to specify that the exposure input data for the new risk dataset should be sourced from these uploads.

Also note from above that we specified style as the factor group. Factor groups are used to logically group exposures into styles, regions, industries, etc. This is particularly important if we bring more than one industry or region schema (e.g. TRBC and GICS). Below we specify only the style_factor_group (meaning no other exposure groups will be brought in, even if they existed in our uploaded exposure dataset).

Lastly, note that below we specify exposures as a list. We can reference more than one exposure upload and create a consolidated risk dataset from it. In fact we do just that in this example, where we bring in two different sources of exposures.

Note the following nuance:

  • we always need to specify 1) a market factor, 2) at least one region hierarchy and 3) at least one industry hierarchy. In the absence of bringing our own, we can stub in unit exposure dummies.

Below we use default settings for both the uploaded exposures and the Huber regressions. There is an extensive set of available options; see below for relevant docs:

riskdataset_settings = RiskDatasetSettings(
    reference_dataset="Bayesline-US-All-1y",
    exposures=[
        RiskDatasetUnitExposureSettings.market(),
        RiskDatasetUnitExposureSettings.industry(),
        RiskDatasetUnitExposureSettings.region(),
        RiskDatasetUploadedExposureSettings(
            exposure_source="My-US-Top100-Exposures",
            style_factor_group="style",
        ),
        RiskDatasetHuberRegressionExposureSettings(
            tsfactors_source="Oil-and-Tech-Returns",
        ),
    ],
)
risk_dataset_api = risk_datasets.create_dataset("My-Risk-Dataset", settings=riskdataset_settings)
risk_dataset_props = risk_dataset_api.describe()
risk_dataset_props.exposure_settings_menu.styles
{'momentum-all': ['momentum'],
 'size-all': ['size'],
 'returns_oil-all': ['returns_oil'],
 'returns_tech-all': ['returns_tech']}
risk_dataset_props.exposure_settings_menu.industries
{'industry': {'ALL': ['industry']}}
risk_dataset_props.exposure_settings_menu.regions
{'region': {'ALL': ['world']}}

First, we might be interested in what the Huber-regression-based exposures worked out to be. Everything is linked with the rest of the Bayesline ecosystem, so we can simply pick them up through the Exposures API.

universe_settings = UniverseSettings(
    dataset="My-Risk-Dataset",
    industry=IndustrySettings(
        hierarchy="industry",
        include=["ALL"],
    ),
    region=RegionSettings(
        hierarchy="region",
        include=["ALL"],
    ),
)

exposure_settings = ExposureSettings(
    market=True,
    styles={
        "momentum-all": ["momentum"],
        "size-all": ["size"],
        "returns_oil-all": ["returns_oil"],
        "returns_tech-all": ["returns_tech"],
    },
    regions=None, # no region factors
    industries=None, # no industry factors
)
exposures_api = bln.equity.exposures.load(exposure_settings)
exposures_df_my_model = exposures_api.get(universe_settings)
exposures_df_my_model.filter(pl.col("bayesid") == "AAPL").tail()
shape: (5, 7)
date        bayesid  market.market  style.momentum  style.size  style.returns_oil  style.returns_tech
date        str      f32            f32             f32         f32                f32
2024-12-27  "AAPL"   1.0            1.905273        3.683594    -0.304932          0.029251
2024-12-28  "AAPL"   1.0            1.902344        3.652344    -0.306152          0.031982
2024-12-29  "AAPL"   1.0            1.905273        3.652344    -0.305176          0.029434
2024-12-30  "AAPL"   1.0            1.831055        3.65625     -0.27417           0.042938
2024-12-31  "AAPL"   1.0            1.999023        3.720703    -0.150391          -0.03952
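
Note the market.market column: it stems from the unit exposure stub and is therefore identically 1.0. A quick verification on the frame we just pulled:

# The unit market exposure is constant 1.0 across all assets and dates.
assert (exposures_df_my_model["market.market"] == 1.0).all()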

We can now build a factor risk model with the risk dataset we just created.

riskmodel_engine = bln.equity.riskmodels.load(
    FactorRiskModelSettings(
        universe=universe_settings,
        exposures=exposure_settings
    )
)
risk_model_api = riskmodel_engine.get()
risk_model_api.fret().tail()
shape: (5, 6)
date        market.market  style.momentum  style.size  style.returns_oil  style.returns_tech
date        f32            f32             f32         f32                f32
2024-12-24  0.008057       0.000097        0.001062    -0.00185           -0.001395
2024-12-26  0.007146       -0.000793       -0.002486   -0.000478          -0.00074
2024-12-27  -0.010691      0.000498        0.000093    0.00051            0.00228
2024-12-30  -0.006427      0.000687        -0.002408   0.000911           0.001666
2024-12-31  0.002228       -0.001488       -0.000777   0.000137           0.00099
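
As with any factor model, these factor returns can feed a covariance estimate. A minimal sketch using the plain sample covariance of the daily factor returns (annualization and shrinkage left aside):

fret_df = risk_model_api.fret()
factor_cols = [c for c in fret_df.columns if c != "date"]

# Plain sample covariance of daily factor returns (factors x factors).
factor_cov = np.cov(fret_df.select(factor_cols).to_numpy(), rowvar=False)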

Adding Exposures to an Existing Custom Risk Dataset#

As a last step in this tutorial we will add exposures for 2025 to our existing exposures upload and then update the risk dataset we already created.

Recall the steps from above to obtain sample exposures up to the end of 2024. We follow the same steps here (note that we reuse the same top 100 assets from above).

Also note that:

  • we won't update the factor time series, to demonstrate the behavior when exposures are only partially available.

  • the dataframe we upload also contains 2024 dates. These will be ignored when uploading in append mode (i.e. any existing date/factor combinations are ignored).

exposures_df.tail()
shape: (5, 4)
date        bayesid     style.Size  style.Momentum
date        str         f32         f32
2025-07-08  "VZ"        2.525391    0.363037
2025-07-08  "WFC"       2.775391    0.862305
2025-07-08  "WMT"       3.322266    1.860352
2025-07-08  "XOM"       3.144531    -0.605469
2025-07-08  "Y0486S10"  3.322266    1.59082
exposures_df = exposures_df.join(top_100_assets, on="bayesid", how="semi")
exposures_df
shape: (36_400, 4)
date        bayesid     style.Size  style.Momentum
date        str         f32         f32
2024-07-10  "00163T10"  2.400391    -0.861328
2024-07-10  "00287Y10"  3.003906    1.042969
2024-07-10  "04206820"  2.431641    0.567383
2024-07-10  "09253U10"  2.611328    0.699219
2024-07-10  "30303M10"  3.517578    1.431641
…           …           …           …
2025-07-08  "VZ"        2.525391    0.363037
2025-07-08  "WFC"       2.775391    0.862305
2025-07-08  "WMT"       3.322266    1.860352
2025-07-08  "XOM"       3.144531    -0.605469
2025-07-08  "Y0486S10"  3.322266    1.59082
upload_df = (
    exposures_df
    .rename({"bayesid": "asset_id", "style.Size": "size", "style.Momentum": "momentum"})
    .unpivot(
        on=["size", "momentum"],
        index=["date", "asset_id"],
        variable_name="factor",
        value_name="exposure",
    )
    .with_columns(
        pl.lit("bayesid").alias("asset_id_type"),
        pl.lit("style").alias("factor_group")
    )
)

upload_df.tail()
shape: (5, 6)
date        asset_id    factor      exposure   asset_id_type  factor_group
date        str         str         f32        str            str
2025-07-08  "VZ"        "momentum"  0.363037   "bayesid"      "style"
2025-07-08  "WFC"       "momentum"  0.862305   "bayesid"      "style"
2025-07-08  "WMT"       "momentum"  1.860352   "bayesid"      "style"
2025-07-08  "XOM"       "momentum"  -0.605469  "bayesid"      "style"
2025-07-08  "Y0486S10"  "momentum"  1.59082    "bayesid"      "style"

Note how below we choose append mode, which allows us to add data rather than overwrite previous data.

exposure_dataset.fast_commit(upload_df, mode="append")
UploadCommitResult(version=2, committed_names=[])
exposure_dataset.version_history()
{2: datetime.datetime(2025, 7, 22, 0, 51, 27, 491000, tzinfo=datetime.timezone.utc),
 1: datetime.datetime(2025, 7, 22, 0, 50, 55, 321000, tzinfo=datetime.timezone.utc),
 0: datetime.datetime(2025, 7, 22, 0, 50, 55, 291000, tzinfo=datetime.timezone.utc)}
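
One way to confirm that the new data landed is to check the latest date now present in the dataset:

# After the second commit the dataset should extend through 2025-07-08.
exposure_dataset.get_data().collect()["date"].max()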

New exposures have been uploaded; as a last step we need to update our risk dataset. Note that, as of now, we need to manually update the risk dataset to bring in the changes. A future release will add functionality to automatically trigger the risk dataset update when input data changes.

risk_dataset_api.update()
RiskDatasetUpdateResult()

Fitting the risk model again, we find that the new exposures have been captured. Note that, since we did not extend the factor time series, the time-series factor returns eventually drop to zero (visible in the final row below).

riskmodel_engine = bln.equity.riskmodels.load(
    FactorRiskModelSettings(
        universe=universe_settings,
        exposures=exposure_settings,
    )
)
risk_model_api = riskmodel_engine.get()
risk_model_api.fret().tail()
shape: (5, 6)
date        market.market  style.momentum  style.size  style.returns_oil  style.returns_tech
date        f32            f32             f32         f32                f32
2025-07-01  0.011207       -0.000886       -0.002891   -0.00641           0.004614
2025-07-02  0.010951       -0.003504       0.000299    -0.001785          0.001008
2025-07-03  0.00763        0.000962        -0.000798   0.065474           0.064018
2025-07-07  -0.012822      0.002141        0.000374    -0.000057          -0.000315
2025-07-08  0.007008       -0.005467       0.00151     0.0                0.0
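
To inspect how the time-series factors behave once their source data ends, we can isolate them for 2025:

# Time-series factor returns in 2025; the final row drops to zero once the
# underlying exposures are no longer available.
(
    risk_model_api.fret()
    .filter(pl.col("date") > dt.date(2024, 12, 31))
    .select("date", "style.returns_oil", "style.returns_tech")
    .tail()
)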

Housekeeping#

Below we demonstrate how to delete the risk datasets and the uploaded datasets created in this tutorial.

risk_datasets.delete_dataset("My-Dataset")
RawSettings(model_type='RiskDatasetSettings', name='My-Dataset', identifier=0, exists=True, raw_json={'reference_dataset': 'Bayesline-US-All-1y', 'exposures': [{'exposure_type': 'referenced', 'market_factor_groups': None, 'region_factor_groups': None, 'industry_factor_groups': None, 'style_factor_groups': None, 'other_factor_groups': None}], 'exchange_codes': None, 'trim_assets': 'ccy_union', 'trim_start_date': 'earliest_start', 'trim_end_date': 'latest_end'}, references=[], extra={})
risk_datasets.delete_dataset("My-Risk-Dataset")
RawSettings(model_type='RiskDatasetSettings', name='My-Risk-Dataset', identifier=1, exists=True, raw_json={'reference_dataset': 'Bayesline-US-All-1y', 'exposures': [{'exposure_type': 'unit', 'factor': 'market', 'factor_group': 'market', 'factor_type': 'market'}, {'exposure_type': 'unit', 'factor': 'industry', 'factor_group': 'industry', 'factor_type': 'industry'}, {'exposure_type': 'unit', 'factor': 'world', 'factor_group': 'region', 'factor_type': 'region'}, {'exposure_type': 'uploaded', 'exposure_source': 'My-US-Top100-Exposures', 'market_factor_group': None, 'region_factor_group': None, 'industry_factor_group': None, 'style_factor_group': 'style', 'other_factor_group': None, 'style_factor_fill_miss': True, 'style_factor_huberize': True}, {'exposure_type': 'huber_regression', 'tsfactors_source': 'Oil-and-Tech-Returns', 'factor_group': 'huber_style', 'include': 'All', 'exclude': [], 'fill_miss': True, 'window': 126, 'epsilon': 1.35, 'alpha': 0.0001, 'alpha_start': 10.0, 'student_t_level': None, 'clip': [None, None], 'huberize': True, 'huberize_maintain_zeros': False, 'impute': True, 'currency': 'USD', 'calendar': {'dataset': None, 'filters': [['XNYS']]}}], 'exchange_codes': None, 'trim_assets': 'ccy_union', 'trim_start_date': 'earliest_start', 'trim_end_date': 'latest_end'}, references=[], extra={})
exposure_dataset.destroy()
factor_ts_dataset.destroy()