Idiosyncratic volatility and correlation forecasting

Use this notebook to extract a volatility forecast report for idiosyncratic returns. The notebook also shows how to compute idiosyncratic correlations, which cannot yet be retrieved directly from the system.

import datetime as dt
from itertools import combinations_with_replacement

import polars as pl

from bayesline.api.equity import (
    ReportSettings,
    ExposureSettings,
    FactorRiskModelSettings,
    ModelConstructionSettings,
    UniverseSettings,
    CategoricalExposureGroupSettings,
    ContinuousExposureGroupSettings,
    PortfolioHierarchySettings,
    IdiosyncraticVolatilityReportSettings,
    IdiosyncraticReturnReportSettings,
    IdioReportSettingsV2,
)
from bayesline.apiclient import BayeslineApiClient

bln = BayeslineApiClient.new_client(
    endpoint="https://[ENDPOINT]",
    api_key="[API-KEY]",
)

We begin by specifying a standard factor risk model, relative to which the idiosyncratic returns are computed.

factorriskmodel_settings = FactorRiskModelSettings(
    universe=UniverseSettings(dataset="Bayesline-US-All-1y"),
    exposures=ExposureSettings(
        exposures=[
            ContinuousExposureGroupSettings(hierarchy="market"),
            CategoricalExposureGroupSettings(hierarchy="trbc"),
            ContinuousExposureGroupSettings(hierarchy="style"),
        ]
    ),
    modelconstruction=ModelConstructionSettings(
        estimation_universe=None,
        zero_sum_constraints={"trbc": "mcap_weighted"}
    ),
)

Getting the idiosyncratic volatility forecasts

The idiosyncratic volatility can be queried directly from the report output. Below we extract a dataframe containing the square root of the diagonal of the idiosyncratic risk matrix. We use the default settings here, but IdioReportSettingsV2 exposes many more options.

report_settings = IdioReportSettingsV2(
    factor_model_settings=factorriskmodel_settings
)

report_engine = bln.equity.reports.load(report_settings)
report = report_engine.calculate(start_date="2026-01-02", end_date="2026-01-30")

idio_vol_df = report.accessor.get_data(
    [], expand=("date", "asset_id"), value_cols=("idio_vol",)
).with_columns(pl.col(pl.Float32).fill_nan(None))
idio_vol_df.drop_nulls()

shape: (171_242, 3)
date        asset_id      idio_vol
date        str           f32
2026-01-05  "IC000B1557"  0.007989
2026-01-05  "IC0010CEFE"  0.326402
2026-01-05  "IC0021AFB7"  0.05656
2026-01-05  "IC002CE8B9"  1.21583
2026-01-05  "IC002DC646"  0.114718
…           …             …
2026-01-30  "ICFFE60191"  0.60148
2026-01-30  "ICFFE938FD"  0.188128
2026-01-30  "ICFFE94AED"  0.273419
2026-01-30  "ICFFEBBB38"  0.072164
2026-01-30  "ICFFF2F5AD"  1.396709

Computing the idiosyncratic correlations

Sometimes the idiosyncratic risk matrix needs non-zero off-diagonal entries, because factor models may not explain the co-movement within small clusters of highly similar assets. We are working on integrations, but for now these off-diagonal correlations can only be computed manually from the idiosyncratic return time series as a post-processing step. The rest of this notebook extracts the idiosyncratic returns and then computes the correlation matrix for two groups within a portfolio of six assets.
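To illustrate why the off-diagonal entries matter, here is a minimal pure-Python sketch. The weights, volatilities and the correlation of 0.6 are hypothetical numbers chosen for illustration, and `portfolio_idio_vol` is a helper name introduced here, not part of the Bayesline API. It compares the idiosyncratic volatility of a two-asset cluster under a diagonal risk matrix with the same cluster when a positive idiosyncratic correlation is included.

```python
import math


def portfolio_idio_vol(weights, vols, corr):
    """Portfolio idiosyncratic volatility from weights, vols and a correlation matrix."""
    n = len(weights)
    variance = sum(
        weights[i] * weights[j] * vols[i] * vols[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
    )
    return math.sqrt(variance)


# hypothetical two-asset cluster: equal weights, 20% idiosyncratic vol each
weights = [0.5, 0.5]
vols = [0.20, 0.20]

# diagonal idiosyncratic risk matrix (no co-movement)
diagonal_only = portfolio_idio_vol(weights, vols, [[1.0, 0.0], [0.0, 1.0]])
# same cluster with a hypothetical idiosyncratic correlation of 0.6
with_offdiag = portfolio_idio_vol(weights, vols, [[1.0, 0.6], [0.6, 1.0]])
```

With these illustrative inputs the diagonal-only figure is roughly 14.1%, against roughly 17.9% once the correlation is included, so ignoring the off-diagonal term would understate the cluster's idiosyncratic risk.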

First, we query the report from above again, this time for the idiosyncratic return time series.

idio_ret_df = report.accessor.get_data(
    [], expand=("date", "asset_id"), value_cols=("idio_return",)
).with_columns(pl.col(pl.Float32).fill_nan(None))
idio_ret_df.drop_nulls()

shape: (170_962, 3)
date        asset_id      idio_return
date        str           f32
2026-01-05  "IC000B1557"  0.000503
2026-01-05  "IC0010CEFE"  -0.020561
2026-01-05  "IC0021AFB7"  -0.003563
2026-01-05  "IC002CE8B9"  0.07659
2026-01-05  "IC002DC646"  0.007227
…           …             …
2026-01-30  "ICFFE60191"  0.018608
2026-01-30  "ICFFE938FD"  0.010133
2026-01-30  "ICFFE94AED"  0.00897
2026-01-30  "ICFFEBBB38"  -0.004432
2026-01-30  "ICFFF2F5AD"  -0.047053

Next, we define the groups of similar assets. The groups must be mutually exclusive, i.e. an asset cannot belong to more than one group. Not every asset has to be part of a group.

In the example below, we put Apple, Microsoft and Alphabet in one group, and Mastercard and Visa in a separate group. NVIDIA is not part of any group.

groups = [
    sorted(["IC83A1B819", "ICF982536B", "ICA17F00B9"]),  # Apple, Microsoft, Alphabet
    sorted(["IC63213253", "IC28E9776F"]),  # Mastercard, Visa
]

# create a dataframe with all combinations
df_offdiag = pl.DataFrame(
    [
        (left, right)
        for group in groups
        for left, right in combinations_with_replacement(group, 2)
    ], 
    schema=["asset_id", "asset_id_right"],
    orient="row",
)
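Because the groups must be mutually exclusive, it can help to check this before building the pair list. The following is a minimal pure-Python sketch; `find_overlapping_assets` and `count_pairs` are hypothetical helper names introduced here for illustration, and the asset ids are the ones from the example above.

```python
from collections import Counter
from itertools import combinations_with_replacement


def find_overlapping_assets(groups):
    """Return the asset ids that appear in more than one group, sorted."""
    counts = Counter(asset for group in groups for asset in set(group))
    return sorted(asset for asset, n in counts.items() if n > 1)


def count_pairs(groups):
    """Number of (left, right) pairs per date, diagonal included."""
    return sum(len(list(combinations_with_replacement(g, 2))) for g in groups)


groups = [
    sorted(["IC83A1B819", "ICF982536B", "ICA17F00B9"]),
    sorted(["IC63213253", "IC28E9776F"]),
]

overlaps = find_overlapping_assets(groups)  # empty when the groups are disjoint
n_pairs = count_pairs(groups)  # 6 pairs for the group of three, 3 for the group of two
```

An empty `overlaps` list means the groups are valid. The pair count grows quadratically with the group size, which is why the pair dataframe becomes large for realistic portfolios.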

# for display only; in realistic scenarios this dataframe is very large
(
    df_offdiag.sort("asset_id", "asset_id_right")
    .with_columns(pl.lit(1))
    .pivot("asset_id_right", index="asset_id", maintain_order=True, sort_columns=True)
)

shape: (5, 6)
asset_id      IC28E9776F  IC63213253  IC83A1B819  ICA17F00B9  ICF982536B
str           i32         i32         i32         i32         i32
"IC28E9776F"  1           1           null        null        null
"IC63213253"  null        1           null        null        null
"IC83A1B819"  null        null        1           1           1
"ICA17F00B9"  null        null        null        1           1
"ICF982536B"  null        null        null        null        1

# join the time series such that we have each combination that we need to compute
idio_ret_df_joined = (
    idio_ret_df
    .join(df_offdiag, on="asset_id")
    .join(idio_ret_df, left_on=("date", "asset_id_right"), right_on=("date", "asset_id"))
)
idio_ret_df_joined

shape: (180, 5)
date        asset_id      idio_return  asset_id_right  idio_return_right
date        str           f32          str             f32
2026-01-02  "IC28E9776F"  null         "IC28E9776F"    null
2026-01-02  "IC28E9776F"  null         "IC63213253"    null
2026-01-02  "IC63213253"  null         "IC63213253"    null
2026-01-02  "IC83A1B819"  null         "IC83A1B819"    null
2026-01-02  "IC83A1B819"  null         "ICA17F00B9"    null
…           …             …            …               …
2026-01-30  "IC83A1B819"  0.012707     "ICA17F00B9"    0.021744
2026-01-30  "ICA17F00B9"  0.021744     "ICA17F00B9"    0.021744
2026-01-30  "IC83A1B819"  0.012707     "ICF982536B"    -0.00373
2026-01-30  "ICA17F00B9"  0.021744     "ICF982536B"    -0.00373
2026-01-30  "ICF982536B"  -0.00373     "ICF982536B"    -0.00373

We first compute the covariance matrix and then standardize it into a correlation matrix. The covariance estimate is built in two steps: a rolling mean of the returns over overlapping windows, to account for autocorrelation, followed by an exponentially weighted moving average of the cross-products of the smoothed returns. Dividing by the product of the standard deviations then gives the correlations.
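Written out, the procedure amounts to the following. This is a sketch under stated assumptions rather than a specification: the symbols $w$ (the overlap window), $h$ (the half-life) and $\alpha$ are notation introduced here, the formulas assume no missing observations, and the weighting assumes the exponentially weighted mean is normalized by the sum of its weights. Note that the products are not demeaned, so the estimate is a weighted second moment of the smoothed returns.

```latex
\begin{align*}
\bar r_{i,t} &= \frac{1}{w} \sum_{k=0}^{w-1} r_{i,t-k}
  && \text{(rolling mean; shorter windows at the start of the sample)} \\
\alpha &= 1 - 2^{-1/h}
  && \text{(decay implied by the half-life } h\text{)} \\
\hat\sigma_{ij,t} &= \frac{\sum_{k=0}^{t} (1-\alpha)^{k}\, \bar r_{i,t-k}\, \bar r_{j,t-k}}
                          {\sum_{k=0}^{t} (1-\alpha)^{k}}
  && \text{(exponentially weighted mean of cross-products)} \\
\hat\rho_{ij,t} &= \frac{\hat\sigma_{ij,t}}{\sqrt{\hat\sigma_{ii,t}\, \hat\sigma_{jj,t}}}
  && \text{(standardization into a correlation)}
\end{align*}
```

Because the rolling mean rescales both series by the same factor, the scaling cancels in $\hat\rho_{ij,t}$; only the covariance level depends on it.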

# compute the covariance matrix by first using a rolling mean (for overlap),
# and then an exponential weighted moving average (for smoothing)
overlap_window = 5
half_life = 126

# compute rolling means
idio_ret_smoothed = (
    idio_ret_df.with_columns(pl.col(pl.Float32).fill_nan(None))
    .with_columns(
        pl.col("idio_return").rolling_mean(window_size=overlap_window, min_samples=1).over("asset_id"),
    )
)
# then join
idio_ret_df_joined = (
    idio_ret_smoothed.join(df_offdiag, on="asset_id")
    .join(idio_ret_smoothed, left_on=("date", "asset_id_right"), right_on=("date", "asset_id"))
)
# then EWMA of the product
idio_vcov_df = (
    idio_ret_df_joined
    .with_columns(
        (pl.col("idio_return") * pl.col("idio_return_right"))
        .ewm_mean(half_life=half_life)
        .over(("asset_id", "asset_id_right"))
        .alias("idio_vcov")
    )
)
idio_vcov_df

shape: (180, 6)
date        asset_id      idio_return  asset_id_right  idio_return_right  idio_vcov
date        str           f32          str             f32                f32
2026-01-02  "IC28E9776F"  null         "IC28E9776F"    null               null
2026-01-02  "IC28E9776F"  null         "IC63213253"    null               null
2026-01-02  "IC63213253"  null         "IC63213253"    null               null
2026-01-02  "IC83A1B819"  null         "IC83A1B819"    null               null
2026-01-02  "IC83A1B819"  null         "ICA17F00B9"    null               null
…           …             …            …               …                  …
2026-01-30  "IC83A1B819"  0.007446     "ICA17F00B9"    0.004905           0.000025
2026-01-30  "ICA17F00B9"  0.004905     "ICA17F00B9"    0.004905           0.000036
2026-01-30  "IC83A1B819"  0.007446     "ICF982536B"    -0.014498          -8.7480e-7
2026-01-30  "ICA17F00B9"  0.004905     "ICF982536B"    -0.014498          -0.000007
2026-01-30  "ICF982536B"  -0.014498    "ICF982536B"    -0.014498          0.00004

# to translate the covariance matrix to a correlation matrix, 
# we need to select the variance of the idiosyncratic returns
idio_var_df = (
    idio_vcov_df.filter(pl.col("asset_id") == pl.col("asset_id_right"))
    .select("date", "asset_id", pl.col("idio_vcov").alias("idio_var"))
)
idio_var_df

shape: (100, 3)
date        asset_id      idio_var
date        str           f32
2026-01-02  "IC28E9776F"  null
2026-01-02  "IC63213253"  null
2026-01-02  "IC83A1B819"  null
2026-01-02  "ICA17F00B9"  null
2026-01-02  "ICF982536B"  null
…           …             …
2026-01-30  "IC28E9776F"  0.000079
2026-01-30  "IC63213253"  0.000035
2026-01-30  "IC83A1B819"  0.000101
2026-01-30  "ICA17F00B9"  0.000036
2026-01-30  "ICF982536B"  0.00004

# by joining twice and normalizing, we get the correlation matrix
idio_corr_df = (
    idio_vcov_df.join(idio_var_df, on=("date", "asset_id"))
    .join(idio_var_df, left_on=("date", "asset_id_right"), right_on=("date", "asset_id"))
    .select(
        "date",
        "asset_id",
        "asset_id_right",
        (pl.col("idio_vcov") / (pl.col("idio_var") * pl.col("idio_var_right")).sqrt()).alias("idio_corr"))
)
idio_corr_df

shape: (180, 4)
date        asset_id      asset_id_right  idio_corr
date        str           str             f32
2026-01-02  "IC28E9776F"  "IC28E9776F"    null
2026-01-02  "IC28E9776F"  "IC63213253"    null
2026-01-02  "IC63213253"  "IC63213253"    null
2026-01-02  "IC83A1B819"  "IC83A1B819"    null
2026-01-02  "IC83A1B819"  "ICA17F00B9"    null
…           …             …               …
2026-01-30  "IC83A1B819"  "ICA17F00B9"    0.408816
2026-01-30  "ICA17F00B9"  "ICA17F00B9"    1.0
2026-01-30  "IC83A1B819"  "ICF982536B"    -0.013701
2026-01-30  "ICA17F00B9"  "ICF982536B"    -0.174511
2026-01-30  "ICF982536B"  "ICF982536B"    1.0

# for small portfolios, the dataframe is small enough to pivot and display
(
    idio_corr_df.pivot("asset_id_right", index=("date", "asset_id"), maintain_order=True, sort_columns=True)
    .filter(pl.col("date") > pl.col("date").min())
)

shape: (95, 7)
date        asset_id      IC28E9776F  IC63213253  IC83A1B819  ICA17F00B9  ICF982536B
date        str           f32         f32         f32         f32         f32
2026-01-05  "IC28E9776F"  1.0         1.0         null        null        null
2026-01-05  "IC63213253"  null        1.0         null        null        null
2026-01-05  "IC83A1B819"  null        null        1.0         1.0         1.0
2026-01-05  "ICA17F00B9"  null        null        null        1.0         1.0
2026-01-05  "ICF982536B"  null        null        null        null        1.0
…           …             …           …           …           …           …
2026-01-30  "IC28E9776F"  1.0         0.638806    null        null        null
2026-01-30  "IC63213253"  null        1.0         null        null        null
2026-01-30  "IC83A1B819"  null        null        1.0         0.408816    -0.013701
2026-01-30  "ICA17F00B9"  null        null        null        1.0         -0.174511
2026-01-30  "ICF982536B"  null        null        null        null        1.0
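
Note that every off-diagonal entry on 2026-01-05 equals 1.0. With a single observation, the weighted mean of the product is just the product itself, so the ratio collapses to plus or minus one depending on the sign of that product. The pure-Python sketch below reproduces the same three steps (rolling mean, exponentially weighted mean of products, standardization) on hypothetical toy returns to make this visible. The helper names and the toy data are assumptions introduced for illustration, and the weighting assumes the normalized form of the exponentially weighted mean with no missing values.

```python
import math


def rolling_mean(values, window):
    """Trailing mean over up to `window` observations (shorter at the start)."""
    out = []
    for t in range(len(values)):
        chunk = values[max(0, t - window + 1) : t + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def ewm_mean(values, half_life):
    """Exponentially weighted mean, normalized by the sum of the weights."""
    alpha = 1.0 - math.exp(-math.log(2.0) / half_life)
    out, num, den = [], 0.0, 0.0
    for v in values:
        num = (1.0 - alpha) * num + v
        den = (1.0 - alpha) * den + 1.0
        out.append(num / den)
    return out


def ewm_corr(x, y, window=5, half_life=126):
    """Correlation path from smoothed returns and weighted second moments."""
    xs, ys = rolling_mean(x, window), rolling_mean(y, window)
    cov = ewm_mean([a * b for a, b in zip(xs, ys)], half_life)
    var_x = ewm_mean([a * a for a in xs], half_life)
    var_y = ewm_mean([b * b for b in ys], half_life)
    return [c / math.sqrt(vx * vy) for c, vx, vy in zip(cov, var_x, var_y)]


# hypothetical toy returns, not taken from the report
x = [0.010, -0.020, 0.015, 0.005, -0.010, 0.020]
y = [0.008, -0.010, 0.020, -0.005, -0.015, 0.010]
corr = ewm_corr(x, y)  # corr[0] is degenerate: a single observation
```

The first element of `corr` is degenerate in the same way as the 2026-01-05 rows, so the earliest estimates of a short sample carry little information. One option, left as a judgment call, is to run the report over a longer history and discard an initial warm-up period before using the correlations.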