{
  "cells": [
    {
      "cell_type": "markdown",
      "id": "4b0aca25",
      "metadata": {},
      "source": [
        "# Idiosyncratic volatility and correlation forecasting\n",
        "\n",
        "Use this notebook to extract a volatility forecast report for the idiosyncratic returns. The notebook also shows how to compute idiosyncratic correlations. Getting these correlations out of the system directly is not available yet."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "id": "4b9e63e4",
      "metadata": {},
      "outputs": [],
      "source": [
        "import datetime as dt\n",
        "from itertools import combinations_with_replacement\n",
        "\n",
        "import polars as pl\n",
        "\n",
        "from bayesline.api.equity import (\n",
        "    ReportSettings,\n",
        "    ExposureSettings,\n",
        "    FactorRiskModelSettings,\n",
        "    ModelConstructionSettings,\n",
        "    ReportSettings,\n",
        "    UniverseSettings,\n",
        "    CategoricalExposureGroupSettings,\n",
        "    ContinuousExposureGroupSettings,\n",
        "    PortfolioHierarchySettings,\n",
        "    IdiosyncraticVolatilityReportSettings,\n",
        "    IdiosyncraticReturnReportSettings,\n",
        "    IdioReportSettingsV2,\n",
        ")\n",
        "from bayesline.apiclient import BayeslineApiClient"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "451ed0a5",
      "metadata": {
        "tags": [
          "skip-execution"
        ]
      },
      "outputs": [],
      "source": [
        "bln = BayeslineApiClient.new_client(\n",
        "    endpoint=\"https://[ENDPOINT]\",\n",
        "    api_key=\"[API-KEY]\",\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "9e180008",
      "metadata": {},
      "source": [
        "We begin by specifying a standard factor model that we can compute the idiosyncratic returns in reference to."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "id": "863a6a08",
      "metadata": {},
      "outputs": [],
      "source": [
        "factorriskmodel_settings = FactorRiskModelSettings(\n",
        "    universe=UniverseSettings(dataset=\"Bayesline-US-All-1y\"),\n",
        "    exposures=ExposureSettings(\n",
        "        exposures=[\n",
        "            ContinuousExposureGroupSettings(hierarchy=\"market\"),\n",
        "            CategoricalExposureGroupSettings(hierarchy=\"trbc\"),\n",
        "            ContinuousExposureGroupSettings(hierarchy=\"style\"),\n",
        "        ]\n",
        "    ),\n",
        "    modelconstruction=ModelConstructionSettings(\n",
        "        estimation_universe=None,\n",
        "        zero_sum_constraints={\"trbc\": \"mcap_weighted\"}\n",
        "    ),\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "8b4a4102",
      "metadata": {},
      "source": [
        "## Getting the idiosyncratic volatility forecasts\n",
        "\n",
        "For the idiosyncratic volatility, we can directly query the output. Below we extract a dataframe with the sqrt-diagonal of the idiosyncratic risk matrix. We run with default settings here, but underneath `IdioReportSettingsV2` many different options are available."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "id": "4be054b6",
      "metadata": {},
      "outputs": [],
      "source": [
        "report_settings = IdioReportSettingsV2(\n",
        "    factor_model_settings=factorriskmodel_settings\n",
        ")\n",
        "\n",
        "report_engine = bln.equity.reports.load(report_settings)\n",
        "report = report_engine.calculate(start_date=\"2026-01-02\", end_date=\"2026-01-30\")\n",
        "\n",
        "idio_vol_df = report.accessor.get_data(\n",
        "    [], expand=(\"date\", \"asset_id\"), value_cols=(\"idio_vol\",)\n",
        ").with_columns(pl.col(pl.Float32).fill_nan(None))\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "id": "2a2aff94",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (171_242, 3)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>idio_vol</th></tr><tr><td>date</td><td>str</td><td>f32</td></tr></thead><tbody><tr><td>2026-01-05</td><td>&quot;IC000B1557&quot;</td><td>0.007989</td></tr><tr><td>2026-01-05</td><td>&quot;IC0010CEFE&quot;</td><td>0.326402</td></tr><tr><td>2026-01-05</td><td>&quot;IC0021AFB7&quot;</td><td>0.05656</td></tr><tr><td>2026-01-05</td><td>&quot;IC002CE8B9&quot;</td><td>1.21583</td></tr><tr><td>2026-01-05</td><td>&quot;IC002DC646&quot;</td><td>0.114718</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFE60191&quot;</td><td>0.60148</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFE938FD&quot;</td><td>0.188128</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFE94AED&quot;</td><td>0.273419</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFEBBB38&quot;</td><td>0.072164</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFF2F5AD&quot;</td><td>1.396709</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (171_242, 3)\n",
              "┌────────────┬────────────┬──────────┐\n",
              "│ date       ┆ asset_id   ┆ idio_vol │\n",
              "│ ---        ┆ ---        ┆ ---      │\n",
              "│ date       ┆ str        ┆ f32      │\n",
              "╞════════════╪════════════╪══════════╡\n",
              "│ 2026-01-05 ┆ IC000B1557 ┆ 0.007989 │\n",
              "│ 2026-01-05 ┆ IC0010CEFE ┆ 0.326402 │\n",
              "│ 2026-01-05 ┆ IC0021AFB7 ┆ 0.05656  │\n",
              "│ 2026-01-05 ┆ IC002CE8B9 ┆ 1.21583  │\n",
              "│ 2026-01-05 ┆ IC002DC646 ┆ 0.114718 │\n",
              "│ …          ┆ …          ┆ …        │\n",
              "│ 2026-01-30 ┆ ICFFE60191 ┆ 0.60148  │\n",
              "│ 2026-01-30 ┆ ICFFE938FD ┆ 0.188128 │\n",
              "│ 2026-01-30 ┆ ICFFE94AED ┆ 0.273419 │\n",
              "│ 2026-01-30 ┆ ICFFEBBB38 ┆ 0.072164 │\n",
              "│ 2026-01-30 ┆ ICFFF2F5AD ┆ 1.396709 │\n",
              "└────────────┴────────────┴──────────┘"
            ]
          },
          "execution_count": 5,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "idio_vol_df.drop_nulls()"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "10d19a19",
      "metadata": {},
      "source": [
        "## Computing the idiosyncratic correlations\n",
        "\n",
        "Sometimes it is necessary to allow for density in the idiosyncratic risk matrix. Factor models may not be able to explain co-movement in smaller clusters of highly similar assets. We are working on integrations, but for now it is only possible to manually compute these off-diagonal correlations from the idiosyncratic return time-series as a post-processing step. In the code below, we will extract the idiosyncratic returns, and subsequently compute the correlation matrix for two groups within our portfolio of six assets.\n",
        "\n",
        "First, we run a very similar report as above to extract the idiosyncratic returns time-series."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "id": "a857ca35",
      "metadata": {},
      "outputs": [],
      "source": [
        "idio_ret_df = report.accessor.get_data(\n",
        "    [], expand=(\"date\", \"asset_id\"), value_cols=(\"idio_return\",)\n",
        ").with_columns(pl.col(pl.Float32).fill_nan(None))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 7,
      "id": "b3b08986",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (170_962, 3)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>idio_return</th></tr><tr><td>date</td><td>str</td><td>f32</td></tr></thead><tbody><tr><td>2026-01-05</td><td>&quot;IC000B1557&quot;</td><td>0.000503</td></tr><tr><td>2026-01-05</td><td>&quot;IC0010CEFE&quot;</td><td>-0.020561</td></tr><tr><td>2026-01-05</td><td>&quot;IC0021AFB7&quot;</td><td>-0.003563</td></tr><tr><td>2026-01-05</td><td>&quot;IC002CE8B9&quot;</td><td>0.07659</td></tr><tr><td>2026-01-05</td><td>&quot;IC002DC646&quot;</td><td>0.007227</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFE60191&quot;</td><td>0.018608</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFE938FD&quot;</td><td>0.010133</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFE94AED&quot;</td><td>0.00897</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFEBBB38&quot;</td><td>-0.004432</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFF2F5AD&quot;</td><td>-0.047053</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (170_962, 3)\n",
              "┌────────────┬────────────┬─────────────┐\n",
              "│ date       ┆ asset_id   ┆ idio_return │\n",
              "│ ---        ┆ ---        ┆ ---         │\n",
              "│ date       ┆ str        ┆ f32         │\n",
              "╞════════════╪════════════╪═════════════╡\n",
              "│ 2026-01-05 ┆ IC000B1557 ┆ 0.000503    │\n",
              "│ 2026-01-05 ┆ IC0010CEFE ┆ -0.020561   │\n",
              "│ 2026-01-05 ┆ IC0021AFB7 ┆ -0.003563   │\n",
              "│ 2026-01-05 ┆ IC002CE8B9 ┆ 0.07659     │\n",
              "│ 2026-01-05 ┆ IC002DC646 ┆ 0.007227    │\n",
              "│ …          ┆ …          ┆ …           │\n",
              "│ 2026-01-30 ┆ ICFFE60191 ┆ 0.018608    │\n",
              "│ 2026-01-30 ┆ ICFFE938FD ┆ 0.010133    │\n",
              "│ 2026-01-30 ┆ ICFFE94AED ┆ 0.00897     │\n",
              "│ 2026-01-30 ┆ ICFFEBBB38 ┆ -0.004432   │\n",
              "│ 2026-01-30 ┆ ICFFF2F5AD ┆ -0.047053   │\n",
              "└────────────┴────────────┴─────────────┘"
            ]
          },
          "execution_count": 7,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "idio_ret_df.drop_nulls()"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "005a3a21",
      "metadata": {},
      "source": [
        "Next, we define the groups of similar assets. These need to be mutually exclusive. I.e. we cannot have one asset that is part of multiple groups. Not all assets have to be part of a group.\n",
        "\n",
        "In the example below, we put Apple, Microsoft and Alphabet in a group, and Mastercast and Visa in a separate group. NVIDIA is not part of a group."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 8,
      "id": "c4c6150f",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 6)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>asset_id</th><th>IC28E9776F</th><th>IC63213253</th><th>IC83A1B819</th><th>ICA17F00B9</th><th>ICF982536B</th></tr><tr><td>str</td><td>i32</td><td>i32</td><td>i32</td><td>i32</td><td>i32</td></tr></thead><tbody><tr><td>&quot;IC28E9776F&quot;</td><td>1</td><td>1</td><td>null</td><td>null</td><td>null</td></tr><tr><td>&quot;IC63213253&quot;</td><td>null</td><td>1</td><td>null</td><td>null</td><td>null</td></tr><tr><td>&quot;IC83A1B819&quot;</td><td>null</td><td>null</td><td>1</td><td>1</td><td>1</td></tr><tr><td>&quot;ICA17F00B9&quot;</td><td>null</td><td>null</td><td>null</td><td>1</td><td>1</td></tr><tr><td>&quot;ICF982536B&quot;</td><td>null</td><td>null</td><td>null</td><td>null</td><td>1</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 6)\n",
              "┌────────────┬────────────┬────────────┬────────────┬────────────┬────────────┐\n",
              "│ asset_id   ┆ IC28E9776F ┆ IC63213253 ┆ IC83A1B819 ┆ ICA17F00B9 ┆ ICF982536B │\n",
              "│ ---        ┆ ---        ┆ ---        ┆ ---        ┆ ---        ┆ ---        │\n",
              "│ str        ┆ i32        ┆ i32        ┆ i32        ┆ i32        ┆ i32        │\n",
              "╞════════════╪════════════╪════════════╪════════════╪════════════╪════════════╡\n",
              "│ IC28E9776F ┆ 1          ┆ 1          ┆ null       ┆ null       ┆ null       │\n",
              "│ IC63213253 ┆ null       ┆ 1          ┆ null       ┆ null       ┆ null       │\n",
              "│ IC83A1B819 ┆ null       ┆ null       ┆ 1          ┆ 1          ┆ 1          │\n",
              "│ ICA17F00B9 ┆ null       ┆ null       ┆ null       ┆ 1          ┆ 1          │\n",
              "│ ICF982536B ┆ null       ┆ null       ┆ null       ┆ null       ┆ 1          │\n",
              "└────────────┴────────────┴────────────┴────────────┴────────────┴────────────┘"
            ]
          },
          "execution_count": 8,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "groups = [\n",
        "    sorted([\"IC83A1B819\", \"ICF982536B\", \"ICA17F00B9\"]),  # Apple, Microsoft, Alphabet\n",
        "    sorted([\"IC63213253\", \"IC28E9776F\"]),  # Mastercard, Visa\n",
        "]\n",
        "\n",
        "# create a dataframe with all combinations\n",
        "df_offdiag = pl.DataFrame(\n",
        "    [\n",
        "        (left, right)\n",
        "        for group in groups\n",
        "        for left, right in combinations_with_replacement(group, 2)\n",
        "    ], \n",
        "    schema=[\"asset_id\", \"asset_id_right\"],\n",
        "    orient=\"row\",\n",
        ")\n",
        "\n",
        "# just for display, this is in realistic scenarios a very large dataframe\n",
        "(\n",
        "    df_offdiag.sort(\"asset_id\", \"asset_id_right\")\n",
        "    .with_columns(pl.lit(1))\n",
        "    .pivot(\"asset_id_right\", index=\"asset_id\", maintain_order=True, sort_columns=True)\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 9,
      "id": "e1a36f1c",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (180, 5)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>idio_return</th><th>asset_id_right</th><th>idio_return_right</th></tr><tr><td>date</td><td>str</td><td>f32</td><td>str</td><td>f32</td></tr></thead><tbody><tr><td>2026-01-02</td><td>&quot;IC28E9776F&quot;</td><td>null</td><td>&quot;IC28E9776F&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC28E9776F&quot;</td><td>null</td><td>&quot;IC63213253&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC63213253&quot;</td><td>null</td><td>&quot;IC63213253&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC83A1B819&quot;</td><td>null</td><td>&quot;IC83A1B819&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC83A1B819&quot;</td><td>null</td><td>&quot;ICA17F00B9&quot;</td><td>null</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>2026-01-30</td><td>&quot;IC83A1B819&quot;</td><td>0.012707</td><td>&quot;ICA17F00B9&quot;</td><td>0.021744</td></tr><tr><td>2026-01-30</td><td>&quot;ICA17F00B9&quot;</td><td>0.021744</td><td>&quot;ICA17F00B9&quot;</td><td>0.021744</td></tr><tr><td>2026-01-30</td><td>&quot;IC83A1B819&quot;</td><td>0.012707</td><td>&quot;ICF982536B&quot;</td><td>-0.00373</td></tr><tr><td>2026-01-30</td><td>&quot;ICA17F00B9&quot;</td><td>0.021744</td><td>&quot;ICF982536B&quot;</td><td>-0.00373</td></tr><tr><td>2026-01-30</td><td>&quot;ICF982536B&quot;</td><td>-0.00373</td><td>&quot;ICF982536B&quot;</td><td>-0.00373</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (180, 5)\n",
              "┌────────────┬────────────┬─────────────┬────────────────┬───────────────────┐\n",
              "│ date       ┆ asset_id   ┆ idio_return ┆ asset_id_right ┆ idio_return_right │\n",
              "│ ---        ┆ ---        ┆ ---         ┆ ---            ┆ ---               │\n",
              "│ date       ┆ str        ┆ f32         ┆ str            ┆ f32               │\n",
              "╞════════════╪════════════╪═════════════╪════════════════╪═══════════════════╡\n",
              "│ 2026-01-02 ┆ IC28E9776F ┆ null        ┆ IC28E9776F     ┆ null              │\n",
              "│ 2026-01-02 ┆ IC28E9776F ┆ null        ┆ IC63213253     ┆ null              │\n",
              "│ 2026-01-02 ┆ IC63213253 ┆ null        ┆ IC63213253     ┆ null              │\n",
              "│ 2026-01-02 ┆ IC83A1B819 ┆ null        ┆ IC83A1B819     ┆ null              │\n",
              "│ 2026-01-02 ┆ IC83A1B819 ┆ null        ┆ ICA17F00B9     ┆ null              │\n",
              "│ …          ┆ …          ┆ …           ┆ …              ┆ …                 │\n",
              "│ 2026-01-30 ┆ IC83A1B819 ┆ 0.012707    ┆ ICA17F00B9     ┆ 0.021744          │\n",
              "│ 2026-01-30 ┆ ICA17F00B9 ┆ 0.021744    ┆ ICA17F00B9     ┆ 0.021744          │\n",
              "│ 2026-01-30 ┆ IC83A1B819 ┆ 0.012707    ┆ ICF982536B     ┆ -0.00373          │\n",
              "│ 2026-01-30 ┆ ICA17F00B9 ┆ 0.021744    ┆ ICF982536B     ┆ -0.00373          │\n",
              "│ 2026-01-30 ┆ ICF982536B ┆ -0.00373    ┆ ICF982536B     ┆ -0.00373          │\n",
              "└────────────┴────────────┴─────────────┴────────────────┴───────────────────┘"
            ]
          },
          "execution_count": 9,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# join the time series such that we have each combination that we need to compute\n",
        "idio_ret_df_joined = (\n",
        "    idio_ret_df\n",
        "    .join(df_offdiag, on=\"asset_id\")\n",
        "    .join(idio_ret_df, left_on=(\"date\", \"asset_id_right\"), right_on=(\"date\", \"asset_id\"))\n",
        ")\n",
        "idio_ret_df_joined"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "3ceef638",
      "metadata": {},
      "source": [
        "We compute the covariance matrix first, and then standardize into the correlation matrix. The computation of the covariance matrix relies on computing a rolling mean to correct for autocorrelation, and subsequently an exponentially weighted moving average. We then divide by the standard deviations to obtain the correlations."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 10,
      "id": "04afcfaa",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (180, 6)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>idio_return</th><th>asset_id_right</th><th>idio_return_right</th><th>idio_vcov</th></tr><tr><td>date</td><td>str</td><td>f32</td><td>str</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2026-01-02</td><td>&quot;IC28E9776F&quot;</td><td>null</td><td>&quot;IC28E9776F&quot;</td><td>null</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC28E9776F&quot;</td><td>null</td><td>&quot;IC63213253&quot;</td><td>null</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC63213253&quot;</td><td>null</td><td>&quot;IC63213253&quot;</td><td>null</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC83A1B819&quot;</td><td>null</td><td>&quot;IC83A1B819&quot;</td><td>null</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC83A1B819&quot;</td><td>null</td><td>&quot;ICA17F00B9&quot;</td><td>null</td><td>null</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>2026-01-30</td><td>&quot;IC83A1B819&quot;</td><td>0.007446</td><td>&quot;ICA17F00B9&quot;</td><td>0.004905</td><td>0.000025</td></tr><tr><td>2026-01-30</td><td>&quot;ICA17F00B9&quot;</td><td>0.004905</td><td>&quot;ICA17F00B9&quot;</td><td>0.004905</td><td>0.000036</td></tr><tr><td>2026-01-30</td><td>&quot;IC83A1B819&quot;</td><td>0.007446</td><td>&quot;ICF982536B&quot;</td><td>-0.014498</td><td>-8.7480e-7</td></tr><tr><td>2026-01-30</td><td>&quot;ICA17F00B9&quot;</td><td>0.004905</td><td>&quot;ICF982536B&quot;</td><td>-0.014498</td><td>-0.000007</td></tr><tr><td>2026-01-30</td><td>&quot;ICF982536B&quot;</td><td>-0.014498</td><td>&quot;ICF982536B&quot;</td><td>-0.014498</td><td>0.00004</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (180, 6)\n",
              "┌────────────┬────────────┬─────────────┬────────────────┬───────────────────┬────────────┐\n",
              "│ date       ┆ asset_id   ┆ idio_return ┆ asset_id_right ┆ idio_return_right ┆ idio_vcov  │\n",
              "│ ---        ┆ ---        ┆ ---         ┆ ---            ┆ ---               ┆ ---        │\n",
              "│ date       ┆ str        ┆ f32         ┆ str            ┆ f32               ┆ f32        │\n",
              "╞════════════╪════════════╪═════════════╪════════════════╪═══════════════════╪════════════╡\n",
              "│ 2026-01-02 ┆ IC28E9776F ┆ null        ┆ IC28E9776F     ┆ null              ┆ null       │\n",
              "│ 2026-01-02 ┆ IC28E9776F ┆ null        ┆ IC63213253     ┆ null              ┆ null       │\n",
              "│ 2026-01-02 ┆ IC63213253 ┆ null        ┆ IC63213253     ┆ null              ┆ null       │\n",
              "│ 2026-01-02 ┆ IC83A1B819 ┆ null        ┆ IC83A1B819     ┆ null              ┆ null       │\n",
              "│ 2026-01-02 ┆ IC83A1B819 ┆ null        ┆ ICA17F00B9     ┆ null              ┆ null       │\n",
              "│ …          ┆ …          ┆ …           ┆ …              ┆ …                 ┆ …          │\n",
              "│ 2026-01-30 ┆ IC83A1B819 ┆ 0.007446    ┆ ICA17F00B9     ┆ 0.004905          ┆ 0.000025   │\n",
              "│ 2026-01-30 ┆ ICA17F00B9 ┆ 0.004905    ┆ ICA17F00B9     ┆ 0.004905          ┆ 0.000036   │\n",
              "│ 2026-01-30 ┆ IC83A1B819 ┆ 0.007446    ┆ ICF982536B     ┆ -0.014498         ┆ -8.7480e-7 │\n",
              "│ 2026-01-30 ┆ ICA17F00B9 ┆ 0.004905    ┆ ICF982536B     ┆ -0.014498         ┆ -0.000007  │\n",
              "│ 2026-01-30 ┆ ICF982536B ┆ -0.014498   ┆ ICF982536B     ┆ -0.014498         ┆ 0.00004    │\n",
              "└────────────┴────────────┴─────────────┴────────────────┴───────────────────┴────────────┘"
            ]
          },
          "execution_count": 10,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# compute the covariance matrix by first using a rolling mean (for overlap),\n",
        "# and then an exponential weighted moving average (for smoothing)\n",
        "overlap_window = 5\n",
        "half_life = 126\n",
        "\n",
        "# compute rolling means\n",
        "idio_ret_smoothed = (\n",
        "    idio_ret_df.with_columns(pl.col(pl.Float32).fill_nan(None))\n",
        "    .with_columns(\n",
        "        pl.col(\"idio_return\").rolling_mean(window_size=overlap_window, min_samples=1).over(\"asset_id\"),\n",
        "    )\n",
        ")\n",
        "# then join\n",
        "idio_ret_df_joined = (\n",
        "    idio_ret_smoothed.join(df_offdiag, on=\"asset_id\")\n",
        "    .join(idio_ret_smoothed, left_on=(\"date\", \"asset_id_right\"), right_on=(\"date\", \"asset_id\"))\n",
        ") \n",
        "# then EWMA of the product                                                                                           \n",
        "idio_vcov_df = (                                                     \n",
        "    idio_ret_df_joined                                                                                               \n",
        "    .with_columns(                                                   \n",
        "        (pl.col(\"idio_return\") * pl.col(\"idio_return_right\"))\n",
        "        .ewm_mean(half_life=half_life)\n",
        "        .over((\"asset_id\", \"asset_id_right\"))                                                                        \n",
        "        .alias(\"idio_vcov\")\n",
        "    )                                                                                                                \n",
        ")  \n",
        "idio_vcov_df"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "id": "47e085ab",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (100, 3)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>idio_var</th></tr><tr><td>date</td><td>str</td><td>f32</td></tr></thead><tbody><tr><td>2026-01-02</td><td>&quot;IC28E9776F&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC63213253&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC83A1B819&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;ICA17F00B9&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;ICF982536B&quot;</td><td>null</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>2026-01-30</td><td>&quot;IC28E9776F&quot;</td><td>0.000079</td></tr><tr><td>2026-01-30</td><td>&quot;IC63213253&quot;</td><td>0.000035</td></tr><tr><td>2026-01-30</td><td>&quot;IC83A1B819&quot;</td><td>0.000101</td></tr><tr><td>2026-01-30</td><td>&quot;ICA17F00B9&quot;</td><td>0.000036</td></tr><tr><td>2026-01-30</td><td>&quot;ICF982536B&quot;</td><td>0.00004</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (100, 3)\n",
              "┌────────────┬────────────┬──────────┐\n",
              "│ date       ┆ asset_id   ┆ idio_var │\n",
              "│ ---        ┆ ---        ┆ ---      │\n",
              "│ date       ┆ str        ┆ f32      │\n",
              "╞════════════╪════════════╪══════════╡\n",
              "│ 2026-01-02 ┆ IC28E9776F ┆ null     │\n",
              "│ 2026-01-02 ┆ IC63213253 ┆ null     │\n",
              "│ 2026-01-02 ┆ IC83A1B819 ┆ null     │\n",
              "│ 2026-01-02 ┆ ICA17F00B9 ┆ null     │\n",
              "│ 2026-01-02 ┆ ICF982536B ┆ null     │\n",
              "│ …          ┆ …          ┆ …        │\n",
              "│ 2026-01-30 ┆ IC28E9776F ┆ 0.000079 │\n",
              "│ 2026-01-30 ┆ IC63213253 ┆ 0.000035 │\n",
              "│ 2026-01-30 ┆ IC83A1B819 ┆ 0.000101 │\n",
              "│ 2026-01-30 ┆ ICA17F00B9 ┆ 0.000036 │\n",
              "│ 2026-01-30 ┆ ICF982536B ┆ 0.00004  │\n",
              "└────────────┴────────────┴──────────┘"
            ]
          },
          "execution_count": 11,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# to translate the covariance matrix to a correlation matrix, \n",
        "# we need to select the variance of the idiosyncratic returns\n",
        "idio_var_df = (\n",
        "    idio_vcov_df.filter(pl.col(\"asset_id\") == pl.col(\"asset_id_right\"))\n",
        "    .select(\"date\", \"asset_id\", pl.col(\"idio_vcov\").alias(\"idio_var\"))\n",
        ")\n",
        "idio_var_df"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 12,
      "id": "8ff6ce1d",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (180, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>asset_id_right</th><th>idio_corr</th></tr><tr><td>date</td><td>str</td><td>str</td><td>f32</td></tr></thead><tbody><tr><td>2026-01-02</td><td>&quot;IC28E9776F&quot;</td><td>&quot;IC28E9776F&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC28E9776F&quot;</td><td>&quot;IC63213253&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC63213253&quot;</td><td>&quot;IC63213253&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC83A1B819&quot;</td><td>&quot;IC83A1B819&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC83A1B819&quot;</td><td>&quot;ICA17F00B9&quot;</td><td>null</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>2026-01-30</td><td>&quot;IC83A1B819&quot;</td><td>&quot;ICA17F00B9&quot;</td><td>0.408816</td></tr><tr><td>2026-01-30</td><td>&quot;ICA17F00B9&quot;</td><td>&quot;ICA17F00B9&quot;</td><td>1.0</td></tr><tr><td>2026-01-30</td><td>&quot;IC83A1B819&quot;</td><td>&quot;ICF982536B&quot;</td><td>-0.013701</td></tr><tr><td>2026-01-30</td><td>&quot;ICA17F00B9&quot;</td><td>&quot;ICF982536B&quot;</td><td>-0.174511</td></tr><tr><td>2026-01-30</td><td>&quot;ICF982536B&quot;</td><td>&quot;ICF982536B&quot;</td><td>1.0</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (180, 4)\n",
              "┌────────────┬────────────┬────────────────┬───────────┐\n",
              "│ date       ┆ asset_id   ┆ asset_id_right ┆ idio_corr │\n",
              "│ ---        ┆ ---        ┆ ---            ┆ ---       │\n",
              "│ date       ┆ str        ┆ str            ┆ f32       │\n",
              "╞════════════╪════════════╪════════════════╪═══════════╡\n",
              "│ 2026-01-02 ┆ IC28E9776F ┆ IC28E9776F     ┆ null      │\n",
              "│ 2026-01-02 ┆ IC28E9776F ┆ IC63213253     ┆ null      │\n",
              "│ 2026-01-02 ┆ IC63213253 ┆ IC63213253     ┆ null      │\n",
              "│ 2026-01-02 ┆ IC83A1B819 ┆ IC83A1B819     ┆ null      │\n",
              "│ 2026-01-02 ┆ IC83A1B819 ┆ ICA17F00B9     ┆ null      │\n",
              "│ …          ┆ …          ┆ …              ┆ …         │\n",
              "│ 2026-01-30 ┆ IC83A1B819 ┆ ICA17F00B9     ┆ 0.408816  │\n",
              "│ 2026-01-30 ┆ ICA17F00B9 ┆ ICA17F00B9     ┆ 1.0       │\n",
              "│ 2026-01-30 ┆ IC83A1B819 ┆ ICF982536B     ┆ -0.013701 │\n",
              "│ 2026-01-30 ┆ ICA17F00B9 ┆ ICF982536B     ┆ -0.174511 │\n",
              "│ 2026-01-30 ┆ ICF982536B ┆ ICF982536B     ┆ 1.0       │\n",
              "└────────────┴────────────┴────────────────┴───────────┘"
            ]
          },
          "execution_count": 12,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# by joining twice and normalizing, we get the correlation matrix\n",
        "idio_corr_df = (\n",
        "    idio_vcov_df.join(idio_var_df, on=(\"date\", \"asset_id\"))\n",
        "    .join(idio_var_df, left_on=(\"date\", \"asset_id_right\"), right_on=(\"date\", \"asset_id\"))\n",
        "    .select(\n",
        "        \"date\",\n",
        "        \"asset_id\",\n",
        "        \"asset_id_right\",\n",
        "        (pl.col(\"idio_vcov\") / (pl.col(\"idio_var\") * pl.col(\"idio_var_right\")).sqrt()).alias(\"idio_corr\"))\n",
        ")\n",
        "idio_corr_df\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 13,
      "id": "9fb7a037",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (95, 7)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>IC28E9776F</th><th>IC63213253</th><th>IC83A1B819</th><th>ICA17F00B9</th><th>ICF982536B</th></tr><tr><td>date</td><td>str</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2026-01-05</td><td>&quot;IC28E9776F&quot;</td><td>1.0</td><td>1.0</td><td>null</td><td>null</td><td>null</td></tr><tr><td>2026-01-05</td><td>&quot;IC63213253&quot;</td><td>null</td><td>1.0</td><td>null</td><td>null</td><td>null</td></tr><tr><td>2026-01-05</td><td>&quot;IC83A1B819&quot;</td><td>null</td><td>null</td><td>1.0</td><td>1.0</td><td>1.0</td></tr><tr><td>2026-01-05</td><td>&quot;ICA17F00B9&quot;</td><td>null</td><td>null</td><td>null</td><td>1.0</td><td>1.0</td></tr><tr><td>2026-01-05</td><td>&quot;ICF982536B&quot;</td><td>null</td><td>null</td><td>null</td><td>null</td><td>1.0</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>2026-01-30</td><td>&quot;IC28E9776F&quot;</td><td>1.0</td><td>0.638806</td><td>null</td><td>null</td><td>null</td></tr><tr><td>2026-01-30</td><td>&quot;IC63213253&quot;</td><td>null</td><td>1.0</td><td>null</td><td>null</td><td>null</td></tr><tr><td>2026-01-30</td><td>&quot;IC83A1B819&quot;</td><td>null</td><td>null</td><td>1.0</td><td>0.408816</td><td>-0.013701</td></tr><tr><td>2026-01-30</td><td>&quot;ICA17F00B9&quot;</td><td>null</td><td>null</td><td>null</td><td>1.0</td><td>-0.174511</td></tr><tr><td>2026-01-30</td><td>&quot;ICF982536B&quot;</td><td>null</td><td>null</td><td>null</td><td>null</td><td>1.0</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (95, 7)\n",
              "┌────────────┬────────────┬────────────┬────────────┬────────────┬────────────┬────────────┐\n",
              "│ date       ┆ asset_id   ┆ IC28E9776F ┆ IC63213253 ┆ IC83A1B819 ┆ ICA17F00B9 ┆ ICF982536B │\n",
              "│ ---        ┆ ---        ┆ ---        ┆ ---        ┆ ---        ┆ ---        ┆ ---        │\n",
              "│ date       ┆ str        ┆ f32        ┆ f32        ┆ f32        ┆ f32        ┆ f32        │\n",
              "╞════════════╪════════════╪════════════╪════════════╪════════════╪════════════╪════════════╡\n",
              "│ 2026-01-05 ┆ IC28E9776F ┆ 1.0        ┆ 1.0        ┆ null       ┆ null       ┆ null       │\n",
              "│ 2026-01-05 ┆ IC63213253 ┆ null       ┆ 1.0        ┆ null       ┆ null       ┆ null       │\n",
              "│ 2026-01-05 ┆ IC83A1B819 ┆ null       ┆ null       ┆ 1.0        ┆ 1.0        ┆ 1.0        │\n",
              "│ 2026-01-05 ┆ ICA17F00B9 ┆ null       ┆ null       ┆ null       ┆ 1.0        ┆ 1.0        │\n",
              "│ 2026-01-05 ┆ ICF982536B ┆ null       ┆ null       ┆ null       ┆ null       ┆ 1.0        │\n",
              "│ …          ┆ …          ┆ …          ┆ …          ┆ …          ┆ …          ┆ …          │\n",
              "│ 2026-01-30 ┆ IC28E9776F ┆ 1.0        ┆ 0.638806   ┆ null       ┆ null       ┆ null       │\n",
              "│ 2026-01-30 ┆ IC63213253 ┆ null       ┆ 1.0        ┆ null       ┆ null       ┆ null       │\n",
              "│ 2026-01-30 ┆ IC83A1B819 ┆ null       ┆ null       ┆ 1.0        ┆ 0.408816   ┆ -0.013701  │\n",
              "│ 2026-01-30 ┆ ICA17F00B9 ┆ null       ┆ null       ┆ null       ┆ 1.0        ┆ -0.174511  │\n",
              "│ 2026-01-30 ┆ ICF982536B ┆ null       ┆ null       ┆ null       ┆ null       ┆ 1.0        │\n",
              "└────────────┴────────────┴────────────┴────────────┴────────────┴────────────┴────────────┘"
            ]
          },
          "execution_count": 13,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# for small portfolios, the dataframe is small enough to pivot and display\n",
        "(\n",
        "    idio_corr_df.pivot(\"asset_id_right\", index=(\"date\", \"asset_id\"), maintain_order=True, sort_columns=True)\n",
        "    .filter(pl.col(\"date\") > pl.col(\"date\").min())\n",
        ")"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": ".venv",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.11.15"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 5
}