{
  "cells": [
    {
      "cell_type": "markdown",
      "id": "ddffe0a7",
      "metadata": {},
      "source": [
        "# Idiosyncratic volatility and correlation forecasting\n",
        "\n",
        "Use this notebook to extract a volatility forecast report for the idiosyncratic returns. The notebook also shows how to compute idiosyncratic correlations. Getting these correlations out of the system directly is not available yet."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "id": "3ba95a46",
      "metadata": {},
      "outputs": [],
      "source": [
        "import datetime as dt\n",
        "from itertools import combinations_with_replacement\n",
        "\n",
        "import polars as pl\n",
        "\n",
        "from bayesline.api.equity import (\n",
        "    CategoricalExposureGroupSettings,\n",
        "    ContinuousExposureGroupSettings,\n",
        "    ExposureSettings,\n",
        "    FactorRiskModelSettings,\n",
        "    IdioReportSettings,\n",
        "    ModelConstructionSettings,\n",
        "    UniverseSettings,\n",
        ")\n",
        "from bayesline.apiclient import BayeslineApiClient"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "474781c4",
      "metadata": {
        "tags": [
          "skip-execution"
        ]
      },
      "outputs": [],
      "source": [
        "bln = BayeslineApiClient.new_client(\n",
        "    endpoint=\"https://[ENDPOINT]\",\n",
        "    api_key=\"[API-KEY]\",\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "64845a3f",
      "metadata": {},
      "source": [
        "We begin by specifying a standard factor model that we can compute the idiosyncratic returns in reference to."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "id": "8e57717f",
      "metadata": {},
      "outputs": [],
      "source": [
        "factorriskmodel_settings = FactorRiskModelSettings(\n",
        "    universe=UniverseSettings(),\n",
        "    exposures=ExposureSettings(\n",
        "        exposures=[\n",
        "            ContinuousExposureGroupSettings(hierarchy=\"market\"),\n",
        "            CategoricalExposureGroupSettings(hierarchy=\"trbc\"),\n",
        "            ContinuousExposureGroupSettings(hierarchy=\"style\"),\n",
        "        ]\n",
        "    ),\n",
        "    modelconstruction=ModelConstructionSettings(\n",
        "        estimation_universe=None,\n",
        "        zero_sum_constraints={\"trbc\": \"mcap_weighted\"}\n",
        "    ),\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "fc8fdc43",
      "metadata": {},
      "source": [
        "## Getting the idiosyncratic volatility forecasts\n",
        "\n",
        "For the idiosyncratic volatility, we can directly query the output. Below we extract a dataframe with the sqrt-diagonal of the idiosyncratic risk matrix. We run with default settings here, but underneath `IdioReportSettings` many different options are available."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "id": "fd53eb83",
      "metadata": {
        "lines_to_next_cell": 2
      },
      "outputs": [],
      "source": [
        "report_settings = IdioReportSettings(\n",
        "    factor_model_settings=factorriskmodel_settings\n",
        ")\n",
        "\n",
        "report_engine = bln.equity.reports.load(\n",
        "    report_settings.with_dataset(\"bayesline/Bayesline-US-All-1y\")\n",
        ")\n",
        "report = report_engine.calculate(start_date=\"2026-01-02\", end_date=\"2026-01-30\")\n",
        "\n",
        "idio_vol_df = report.accessor.get_data(\n",
        "    [], expand=(\"date\", \"asset_id\"), value_cols=(\"idio_vol\",)\n",
        ").with_columns(pl.col(pl.Float32).fill_nan(None))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "id": "7046a4af",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (170_899, 3)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>idio_vol</th></tr><tr><td>date</td><td>str</td><td>f32</td></tr></thead><tbody><tr><td>2026-01-05</td><td>&quot;IC000B1557&quot;</td><td>0.012124</td></tr><tr><td>2026-01-05</td><td>&quot;IC0010CEFE&quot;</td><td>0.275756</td></tr><tr><td>2026-01-05</td><td>&quot;IC0021AFB7&quot;</td><td>0.083764</td></tr><tr><td>2026-01-05</td><td>&quot;IC002CE8B9&quot;</td><td>1.182203</td></tr><tr><td>2026-01-05</td><td>&quot;IC002DC646&quot;</td><td>0.108916</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFE60191&quot;</td><td>0.606094</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFE938FD&quot;</td><td>0.187951</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFE94AED&quot;</td><td>0.275471</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFEBBB38&quot;</td><td>0.070684</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFF2F5AD&quot;</td><td>1.38437</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (170_899, 3)\n",
              "┌────────────┬────────────┬──────────┐\n",
              "│ date       ┆ asset_id   ┆ idio_vol │\n",
              "│ ---        ┆ ---        ┆ ---      │\n",
              "│ date       ┆ str        ┆ f32      │\n",
              "╞════════════╪════════════╪══════════╡\n",
              "│ 2026-01-05 ┆ IC000B1557 ┆ 0.012124 │\n",
              "│ 2026-01-05 ┆ IC0010CEFE ┆ 0.275756 │\n",
              "│ 2026-01-05 ┆ IC0021AFB7 ┆ 0.083764 │\n",
              "│ 2026-01-05 ┆ IC002CE8B9 ┆ 1.182203 │\n",
              "│ 2026-01-05 ┆ IC002DC646 ┆ 0.108916 │\n",
              "│ …          ┆ …          ┆ …        │\n",
              "│ 2026-01-30 ┆ ICFFE60191 ┆ 0.606094 │\n",
              "│ 2026-01-30 ┆ ICFFE938FD ┆ 0.187951 │\n",
              "│ 2026-01-30 ┆ ICFFE94AED ┆ 0.275471 │\n",
              "│ 2026-01-30 ┆ ICFFEBBB38 ┆ 0.070684 │\n",
              "│ 2026-01-30 ┆ ICFFF2F5AD ┆ 1.38437  │\n",
              "└────────────┴────────────┴──────────┘"
            ]
          },
          "execution_count": 5,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "idio_vol_df.drop_nulls()"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "23ee7bfd",
      "metadata": {},
      "source": [
        "## Computing the idiosyncratic correlations\n",
        "\n",
        "Sometimes it is necessary to allow for density in the idiosyncratic risk matrix. Factor models may not be able to explain co-movement in smaller clusters of highly similar assets. We are working on integrations, but for now it is only possible to manually compute these off-diagonal correlations from the idiosyncratic return time-series as a post-processing step. In the code below, we will extract the idiosyncratic returns, and subsequently compute the correlation matrix for two groups within our portfolio of six assets.\n",
        "\n",
        "First, we run a very similar report as above to extract the idiosyncratic returns time-series."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "id": "8f4e1a52",
      "metadata": {},
      "outputs": [],
      "source": [
        "idio_ret_df = report.accessor.get_data(\n",
        "    [], expand=(\"date\", \"asset_id\"), value_cols=(\"idio_return\",)\n",
        ").with_columns(pl.col(pl.Float32).fill_nan(None))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 7,
      "id": "fe82363e",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (170_636, 3)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>idio_return</th></tr><tr><td>date</td><td>str</td><td>f32</td></tr></thead><tbody><tr><td>2026-01-05</td><td>&quot;IC000B1557&quot;</td><td>-0.000764</td></tr><tr><td>2026-01-05</td><td>&quot;IC0010CEFE&quot;</td><td>0.017371</td></tr><tr><td>2026-01-05</td><td>&quot;IC0021AFB7&quot;</td><td>-0.005277</td></tr><tr><td>2026-01-05</td><td>&quot;IC002CE8B9&quot;</td><td>0.074472</td></tr><tr><td>2026-01-05</td><td>&quot;IC002DC646&quot;</td><td>0.006861</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFE60191&quot;</td><td>0.019298</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFE938FD&quot;</td><td>0.011209</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFE94AED&quot;</td><td>0.006347</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFEBBB38&quot;</td><td>-0.00221</td></tr><tr><td>2026-01-30</td><td>&quot;ICFFF2F5AD&quot;</td><td>-0.046461</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (170_636, 3)\n",
              "┌────────────┬────────────┬─────────────┐\n",
              "│ date       ┆ asset_id   ┆ idio_return │\n",
              "│ ---        ┆ ---        ┆ ---         │\n",
              "│ date       ┆ str        ┆ f32         │\n",
              "╞════════════╪════════════╪═════════════╡\n",
              "│ 2026-01-05 ┆ IC000B1557 ┆ -0.000764   │\n",
              "│ 2026-01-05 ┆ IC0010CEFE ┆ 0.017371    │\n",
              "│ 2026-01-05 ┆ IC0021AFB7 ┆ -0.005277   │\n",
              "│ 2026-01-05 ┆ IC002CE8B9 ┆ 0.074472    │\n",
              "│ 2026-01-05 ┆ IC002DC646 ┆ 0.006861    │\n",
              "│ …          ┆ …          ┆ …           │\n",
              "│ 2026-01-30 ┆ ICFFE60191 ┆ 0.019298    │\n",
              "│ 2026-01-30 ┆ ICFFE938FD ┆ 0.011209    │\n",
              "│ 2026-01-30 ┆ ICFFE94AED ┆ 0.006347    │\n",
              "│ 2026-01-30 ┆ ICFFEBBB38 ┆ -0.00221    │\n",
              "│ 2026-01-30 ┆ ICFFF2F5AD ┆ -0.046461   │\n",
              "└────────────┴────────────┴─────────────┘"
            ]
          },
          "execution_count": 7,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "idio_ret_df.drop_nulls()"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "849b8797",
      "metadata": {},
      "source": [
        "Next, we define the groups of similar assets. These need to be mutually exclusive. I.e. we cannot have one asset that is part of multiple groups. Not all assets have to be part of a group.\n",
        "\n",
        "In the example below, we put Apple, Microsoft and Alphabet in a group, and Mastercast and Visa in a separate group. NVIDIA is not part of a group."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 8,
      "id": "8fb1fbd9",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 6)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>asset_id</th><th>IC28E9776F</th><th>IC63213253</th><th>IC83A1B819</th><th>ICA17F00B9</th><th>ICF982536B</th></tr><tr><td>str</td><td>i32</td><td>i32</td><td>i32</td><td>i32</td><td>i32</td></tr></thead><tbody><tr><td>&quot;IC28E9776F&quot;</td><td>1</td><td>1</td><td>null</td><td>null</td><td>null</td></tr><tr><td>&quot;IC63213253&quot;</td><td>null</td><td>1</td><td>null</td><td>null</td><td>null</td></tr><tr><td>&quot;IC83A1B819&quot;</td><td>null</td><td>null</td><td>1</td><td>1</td><td>1</td></tr><tr><td>&quot;ICA17F00B9&quot;</td><td>null</td><td>null</td><td>null</td><td>1</td><td>1</td></tr><tr><td>&quot;ICF982536B&quot;</td><td>null</td><td>null</td><td>null</td><td>null</td><td>1</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 6)\n",
              "┌────────────┬────────────┬────────────┬────────────┬────────────┬────────────┐\n",
              "│ asset_id   ┆ IC28E9776F ┆ IC63213253 ┆ IC83A1B819 ┆ ICA17F00B9 ┆ ICF982536B │\n",
              "│ ---        ┆ ---        ┆ ---        ┆ ---        ┆ ---        ┆ ---        │\n",
              "│ str        ┆ i32        ┆ i32        ┆ i32        ┆ i32        ┆ i32        │\n",
              "╞════════════╪════════════╪════════════╪════════════╪════════════╪════════════╡\n",
              "│ IC28E9776F ┆ 1          ┆ 1          ┆ null       ┆ null       ┆ null       │\n",
              "│ IC63213253 ┆ null       ┆ 1          ┆ null       ┆ null       ┆ null       │\n",
              "│ IC83A1B819 ┆ null       ┆ null       ┆ 1          ┆ 1          ┆ 1          │\n",
              "│ ICA17F00B9 ┆ null       ┆ null       ┆ null       ┆ 1          ┆ 1          │\n",
              "│ ICF982536B ┆ null       ┆ null       ┆ null       ┆ null       ┆ 1          │\n",
              "└────────────┴────────────┴────────────┴────────────┴────────────┴────────────┘"
            ]
          },
          "execution_count": 8,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "groups = [\n",
        "    sorted([\"IC83A1B819\", \"ICF982536B\", \"ICA17F00B9\"]),  # Apple, Microsoft, Alphabet\n",
        "    sorted([\"IC63213253\", \"IC28E9776F\"]),  # Mastercard, Visa\n",
        "]\n",
        "\n",
        "# create a dataframe with all combinations\n",
        "df_offdiag = pl.DataFrame(\n",
        "    [\n",
        "        (left, right)\n",
        "        for group in groups\n",
        "        for left, right in combinations_with_replacement(group, 2)\n",
        "    ], \n",
        "    schema=[\"asset_id\", \"asset_id_right\"],\n",
        "    orient=\"row\",\n",
        ")\n",
        "\n",
        "# just for display, this is in realistic scenarios a very large dataframe\n",
        "(\n",
        "    df_offdiag.sort(\"asset_id\", \"asset_id_right\")\n",
        "    .with_columns(pl.lit(1))\n",
        "    .pivot(\"asset_id_right\", index=\"asset_id\", maintain_order=True, sort_columns=True)\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 9,
      "id": "30172cfc",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (180, 5)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>idio_return</th><th>asset_id_right</th><th>idio_return_right</th></tr><tr><td>date</td><td>str</td><td>f32</td><td>str</td><td>f32</td></tr></thead><tbody><tr><td>2026-01-02</td><td>&quot;IC28E9776F&quot;</td><td>null</td><td>&quot;IC28E9776F&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC28E9776F&quot;</td><td>null</td><td>&quot;IC63213253&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC63213253&quot;</td><td>null</td><td>&quot;IC63213253&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC83A1B819&quot;</td><td>null</td><td>&quot;IC83A1B819&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC83A1B819&quot;</td><td>null</td><td>&quot;ICA17F00B9&quot;</td><td>null</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>2026-01-30</td><td>&quot;IC83A1B819&quot;</td><td>0.012339</td><td>&quot;ICA17F00B9&quot;</td><td>0.022377</td></tr><tr><td>2026-01-30</td><td>&quot;ICA17F00B9&quot;</td><td>0.022377</td><td>&quot;ICA17F00B9&quot;</td><td>0.022377</td></tr><tr><td>2026-01-30</td><td>&quot;IC83A1B819&quot;</td><td>0.012339</td><td>&quot;ICF982536B&quot;</td><td>-0.000963</td></tr><tr><td>2026-01-30</td><td>&quot;ICA17F00B9&quot;</td><td>0.022377</td><td>&quot;ICF982536B&quot;</td><td>-0.000963</td></tr><tr><td>2026-01-30</td><td>&quot;ICF982536B&quot;</td><td>-0.000963</td><td>&quot;ICF982536B&quot;</td><td>-0.000963</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (180, 5)\n",
              "┌────────────┬────────────┬─────────────┬────────────────┬───────────────────┐\n",
              "│ date       ┆ asset_id   ┆ idio_return ┆ asset_id_right ┆ idio_return_right │\n",
              "│ ---        ┆ ---        ┆ ---         ┆ ---            ┆ ---               │\n",
              "│ date       ┆ str        ┆ f32         ┆ str            ┆ f32               │\n",
              "╞════════════╪════════════╪═════════════╪════════════════╪═══════════════════╡\n",
              "│ 2026-01-02 ┆ IC28E9776F ┆ null        ┆ IC28E9776F     ┆ null              │\n",
              "│ 2026-01-02 ┆ IC28E9776F ┆ null        ┆ IC63213253     ┆ null              │\n",
              "│ 2026-01-02 ┆ IC63213253 ┆ null        ┆ IC63213253     ┆ null              │\n",
              "│ 2026-01-02 ┆ IC83A1B819 ┆ null        ┆ IC83A1B819     ┆ null              │\n",
              "│ 2026-01-02 ┆ IC83A1B819 ┆ null        ┆ ICA17F00B9     ┆ null              │\n",
              "│ …          ┆ …          ┆ …           ┆ …              ┆ …                 │\n",
              "│ 2026-01-30 ┆ IC83A1B819 ┆ 0.012339    ┆ ICA17F00B9     ┆ 0.022377          │\n",
              "│ 2026-01-30 ┆ ICA17F00B9 ┆ 0.022377    ┆ ICA17F00B9     ┆ 0.022377          │\n",
              "│ 2026-01-30 ┆ IC83A1B819 ┆ 0.012339    ┆ ICF982536B     ┆ -0.000963         │\n",
              "│ 2026-01-30 ┆ ICA17F00B9 ┆ 0.022377    ┆ ICF982536B     ┆ -0.000963         │\n",
              "│ 2026-01-30 ┆ ICF982536B ┆ -0.000963   ┆ ICF982536B     ┆ -0.000963         │\n",
              "└────────────┴────────────┴─────────────┴────────────────┴───────────────────┘"
            ]
          },
          "execution_count": 9,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# join the time series such that we have each combination that we need to compute\n",
        "idio_ret_df_joined = (\n",
        "    idio_ret_df\n",
        "    .join(df_offdiag, on=\"asset_id\")\n",
        "    .join(idio_ret_df, left_on=(\"date\", \"asset_id_right\"), right_on=(\"date\", \"asset_id\"))\n",
        ")\n",
        "idio_ret_df_joined"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "f2f21dcf",
      "metadata": {},
      "source": [
        "We compute the covariance matrix first, and then standardize into the correlation matrix. The computation of the covariance matrix relies on computing a rolling mean to correct for autocorrelation, and subsequently an exponentially weighted moving average. We then divide by the standard deviations to obtain the correlations."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 10,
      "id": "f73e48db",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (180, 6)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>idio_return</th><th>asset_id_right</th><th>idio_return_right</th><th>idio_vcov</th></tr><tr><td>date</td><td>str</td><td>f32</td><td>str</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2026-01-02</td><td>&quot;IC28E9776F&quot;</td><td>null</td><td>&quot;IC28E9776F&quot;</td><td>null</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC28E9776F&quot;</td><td>null</td><td>&quot;IC63213253&quot;</td><td>null</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC63213253&quot;</td><td>null</td><td>&quot;IC63213253&quot;</td><td>null</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC83A1B819&quot;</td><td>null</td><td>&quot;IC83A1B819&quot;</td><td>null</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC83A1B819&quot;</td><td>null</td><td>&quot;ICA17F00B9&quot;</td><td>null</td><td>null</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>2026-01-30</td><td>&quot;IC83A1B819&quot;</td><td>0.007145</td><td>&quot;ICA17F00B9&quot;</td><td>0.005119</td><td>0.000019</td></tr><tr><td>2026-01-30</td><td>&quot;ICA17F00B9&quot;</td><td>0.005119</td><td>&quot;ICA17F00B9&quot;</td><td>0.005119</td><td>0.000032</td></tr><tr><td>2026-01-30</td><td>&quot;IC83A1B819&quot;</td><td>0.007145</td><td>&quot;ICF982536B&quot;</td><td>-0.014195</td><td>-0.000002</td></tr><tr><td>2026-01-30</td><td>&quot;ICA17F00B9&quot;</td><td>0.005119</td><td>&quot;ICF982536B&quot;</td><td>-0.014195</td><td>-0.000007</td></tr><tr><td>2026-01-30</td><td>&quot;ICF982536B&quot;</td><td>-0.014195</td><td>&quot;ICF982536B&quot;</td><td>-0.014195</td><td>0.000038</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (180, 6)\n",
              "┌────────────┬────────────┬─────────────┬────────────────┬───────────────────┬───────────┐\n",
              "│ date       ┆ asset_id   ┆ idio_return ┆ asset_id_right ┆ idio_return_right ┆ idio_vcov │\n",
              "│ ---        ┆ ---        ┆ ---         ┆ ---            ┆ ---               ┆ ---       │\n",
              "│ date       ┆ str        ┆ f32         ┆ str            ┆ f32               ┆ f32       │\n",
              "╞════════════╪════════════╪═════════════╪════════════════╪═══════════════════╪═══════════╡\n",
              "│ 2026-01-02 ┆ IC28E9776F ┆ null        ┆ IC28E9776F     ┆ null              ┆ null      │\n",
              "│ 2026-01-02 ┆ IC28E9776F ┆ null        ┆ IC63213253     ┆ null              ┆ null      │\n",
              "│ 2026-01-02 ┆ IC63213253 ┆ null        ┆ IC63213253     ┆ null              ┆ null      │\n",
              "│ 2026-01-02 ┆ IC83A1B819 ┆ null        ┆ IC83A1B819     ┆ null              ┆ null      │\n",
              "│ 2026-01-02 ┆ IC83A1B819 ┆ null        ┆ ICA17F00B9     ┆ null              ┆ null      │\n",
              "│ …          ┆ …          ┆ …           ┆ …              ┆ …                 ┆ …         │\n",
              "│ 2026-01-30 ┆ IC83A1B819 ┆ 0.007145    ┆ ICA17F00B9     ┆ 0.005119          ┆ 0.000019  │\n",
              "│ 2026-01-30 ┆ ICA17F00B9 ┆ 0.005119    ┆ ICA17F00B9     ┆ 0.005119          ┆ 0.000032  │\n",
              "│ 2026-01-30 ┆ IC83A1B819 ┆ 0.007145    ┆ ICF982536B     ┆ -0.014195         ┆ -0.000002 │\n",
              "│ 2026-01-30 ┆ ICA17F00B9 ┆ 0.005119    ┆ ICF982536B     ┆ -0.014195         ┆ -0.000007 │\n",
              "│ 2026-01-30 ┆ ICF982536B ┆ -0.014195   ┆ ICF982536B     ┆ -0.014195         ┆ 0.000038  │\n",
              "└────────────┴────────────┴─────────────┴────────────────┴───────────────────┴───────────┘"
            ]
          },
          "execution_count": 10,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# compute the covariance matrix by first using a rolling mean (for overlap),\n",
        "# and then an exponential weighted moving average (for smoothing)\n",
        "overlap_window = 5\n",
        "half_life = 126\n",
        "\n",
        "# compute rolling means\n",
        "idio_ret_smoothed = (\n",
        "    idio_ret_df.with_columns(pl.col(pl.Float32).fill_nan(None))\n",
        "    .with_columns(\n",
        "        pl.col(\"idio_return\").rolling_mean(window_size=overlap_window, min_samples=1).over(\"asset_id\"),\n",
        "    )\n",
        ")\n",
        "# then join\n",
        "idio_ret_df_joined = (\n",
        "    idio_ret_smoothed.join(df_offdiag, on=\"asset_id\")\n",
        "    .join(idio_ret_smoothed, left_on=(\"date\", \"asset_id_right\"), right_on=(\"date\", \"asset_id\"))\n",
        ") \n",
        "# then EWMA of the product                                                                                           \n",
        "idio_vcov_df = (                                                     \n",
        "    idio_ret_df_joined                                                                                               \n",
        "    .with_columns(                                                   \n",
        "        (pl.col(\"idio_return\") * pl.col(\"idio_return_right\"))\n",
        "        .ewm_mean(half_life=half_life)\n",
        "        .over((\"asset_id\", \"asset_id_right\"))                                                                        \n",
        "        .alias(\"idio_vcov\")\n",
        "    )                                                                                                                \n",
        ")  \n",
        "idio_vcov_df"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "id": "5ac3d1c7",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (100, 3)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>idio_var</th></tr><tr><td>date</td><td>str</td><td>f32</td></tr></thead><tbody><tr><td>2026-01-02</td><td>&quot;IC28E9776F&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC63213253&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC83A1B819&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;ICA17F00B9&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;ICF982536B&quot;</td><td>null</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>2026-01-30</td><td>&quot;IC28E9776F&quot;</td><td>0.000076</td></tr><tr><td>2026-01-30</td><td>&quot;IC63213253&quot;</td><td>0.000032</td></tr><tr><td>2026-01-30</td><td>&quot;IC83A1B819&quot;</td><td>0.000092</td></tr><tr><td>2026-01-30</td><td>&quot;ICA17F00B9&quot;</td><td>0.000032</td></tr><tr><td>2026-01-30</td><td>&quot;ICF982536B&quot;</td><td>0.000038</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (100, 3)\n",
              "┌────────────┬────────────┬──────────┐\n",
              "│ date       ┆ asset_id   ┆ idio_var │\n",
              "│ ---        ┆ ---        ┆ ---      │\n",
              "│ date       ┆ str        ┆ f32      │\n",
              "╞════════════╪════════════╪══════════╡\n",
              "│ 2026-01-02 ┆ IC28E9776F ┆ null     │\n",
              "│ 2026-01-02 ┆ IC63213253 ┆ null     │\n",
              "│ 2026-01-02 ┆ IC83A1B819 ┆ null     │\n",
              "│ 2026-01-02 ┆ ICA17F00B9 ┆ null     │\n",
              "│ 2026-01-02 ┆ ICF982536B ┆ null     │\n",
              "│ …          ┆ …          ┆ …        │\n",
              "│ 2026-01-30 ┆ IC28E9776F ┆ 0.000076 │\n",
              "│ 2026-01-30 ┆ IC63213253 ┆ 0.000032 │\n",
              "│ 2026-01-30 ┆ IC83A1B819 ┆ 0.000092 │\n",
              "│ 2026-01-30 ┆ ICA17F00B9 ┆ 0.000032 │\n",
              "│ 2026-01-30 ┆ ICF982536B ┆ 0.000038 │\n",
              "└────────────┴────────────┴──────────┘"
            ]
          },
          "execution_count": 11,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# to translate the covariance matrix to a correlation matrix, \n",
        "# we need to select the variance of the idiosyncratic returns\n",
        "idio_var_df = (\n",
        "    idio_vcov_df.filter(pl.col(\"asset_id\") == pl.col(\"asset_id_right\"))\n",
        "    .select(\"date\", \"asset_id\", pl.col(\"idio_vcov\").alias(\"idio_var\"))\n",
        ")\n",
        "idio_var_df"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 12,
      "id": "4d5d74d2",
      "metadata": {
        "lines_to_next_cell": 2
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (180, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>asset_id_right</th><th>idio_corr</th></tr><tr><td>date</td><td>str</td><td>str</td><td>f32</td></tr></thead><tbody><tr><td>2026-01-02</td><td>&quot;IC28E9776F&quot;</td><td>&quot;IC28E9776F&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC28E9776F&quot;</td><td>&quot;IC63213253&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC63213253&quot;</td><td>&quot;IC63213253&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC83A1B819&quot;</td><td>&quot;IC83A1B819&quot;</td><td>null</td></tr><tr><td>2026-01-02</td><td>&quot;IC83A1B819&quot;</td><td>&quot;ICA17F00B9&quot;</td><td>null</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>2026-01-30</td><td>&quot;IC83A1B819&quot;</td><td>&quot;ICA17F00B9&quot;</td><td>0.348917</td></tr><tr><td>2026-01-30</td><td>&quot;ICA17F00B9&quot;</td><td>&quot;ICA17F00B9&quot;</td><td>1.0</td></tr><tr><td>2026-01-30</td><td>&quot;IC83A1B819&quot;</td><td>&quot;ICF982536B&quot;</td><td>-0.033258</td></tr><tr><td>2026-01-30</td><td>&quot;ICA17F00B9&quot;</td><td>&quot;ICF982536B&quot;</td><td>-0.197656</td></tr><tr><td>2026-01-30</td><td>&quot;ICF982536B&quot;</td><td>&quot;ICF982536B&quot;</td><td>1.0</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (180, 4)\n",
              "┌────────────┬────────────┬────────────────┬───────────┐\n",
              "│ date       ┆ asset_id   ┆ asset_id_right ┆ idio_corr │\n",
              "│ ---        ┆ ---        ┆ ---            ┆ ---       │\n",
              "│ date       ┆ str        ┆ str            ┆ f32       │\n",
              "╞════════════╪════════════╪════════════════╪═══════════╡\n",
              "│ 2026-01-02 ┆ IC28E9776F ┆ IC28E9776F     ┆ null      │\n",
              "│ 2026-01-02 ┆ IC28E9776F ┆ IC63213253     ┆ null      │\n",
              "│ 2026-01-02 ┆ IC63213253 ┆ IC63213253     ┆ null      │\n",
              "│ 2026-01-02 ┆ IC83A1B819 ┆ IC83A1B819     ┆ null      │\n",
              "│ 2026-01-02 ┆ IC83A1B819 ┆ ICA17F00B9     ┆ null      │\n",
              "│ …          ┆ …          ┆ …              ┆ …         │\n",
              "│ 2026-01-30 ┆ IC83A1B819 ┆ ICA17F00B9     ┆ 0.348917  │\n",
              "│ 2026-01-30 ┆ ICA17F00B9 ┆ ICA17F00B9     ┆ 1.0       │\n",
              "│ 2026-01-30 ┆ IC83A1B819 ┆ ICF982536B     ┆ -0.033258 │\n",
              "│ 2026-01-30 ┆ ICA17F00B9 ┆ ICF982536B     ┆ -0.197656 │\n",
              "│ 2026-01-30 ┆ ICF982536B ┆ ICF982536B     ┆ 1.0       │\n",
              "└────────────┴────────────┴────────────────┴───────────┘"
            ]
          },
          "execution_count": 12,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# by joining twice and normalizing, we get the correlation matrix\n",
        "idio_corr_df = (\n",
        "    idio_vcov_df.join(idio_var_df, on=(\"date\", \"asset_id\"))\n",
        "    .join(idio_var_df, left_on=(\"date\", \"asset_id_right\"), right_on=(\"date\", \"asset_id\"))\n",
        "    .select(\n",
        "        \"date\",\n",
        "        \"asset_id\",\n",
        "        \"asset_id_right\",\n",
        "        (pl.col(\"idio_vcov\") / (pl.col(\"idio_var\") * pl.col(\"idio_var_right\")).sqrt()).alias(\"idio_corr\"))\n",
        ")\n",
        "idio_corr_df"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 13,
      "id": "658a4c97",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (95, 7)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>IC28E9776F</th><th>IC63213253</th><th>IC83A1B819</th><th>ICA17F00B9</th><th>ICF982536B</th></tr><tr><td>date</td><td>str</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2026-01-05</td><td>&quot;IC28E9776F&quot;</td><td>1.0</td><td>1.0</td><td>null</td><td>null</td><td>null</td></tr><tr><td>2026-01-05</td><td>&quot;IC63213253&quot;</td><td>null</td><td>1.0</td><td>null</td><td>null</td><td>null</td></tr><tr><td>2026-01-05</td><td>&quot;IC83A1B819&quot;</td><td>null</td><td>null</td><td>1.0</td><td>1.0</td><td>1.0</td></tr><tr><td>2026-01-05</td><td>&quot;ICA17F00B9&quot;</td><td>null</td><td>null</td><td>null</td><td>1.0</td><td>1.0</td></tr><tr><td>2026-01-05</td><td>&quot;ICF982536B&quot;</td><td>null</td><td>null</td><td>null</td><td>null</td><td>1.0</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>2026-01-30</td><td>&quot;IC28E9776F&quot;</td><td>1.0</td><td>0.585247</td><td>null</td><td>null</td><td>null</td></tr><tr><td>2026-01-30</td><td>&quot;IC63213253&quot;</td><td>null</td><td>1.0</td><td>null</td><td>null</td><td>null</td></tr><tr><td>2026-01-30</td><td>&quot;IC83A1B819&quot;</td><td>null</td><td>null</td><td>1.0</td><td>0.348917</td><td>-0.033258</td></tr><tr><td>2026-01-30</td><td>&quot;ICA17F00B9&quot;</td><td>null</td><td>null</td><td>null</td><td>1.0</td><td>-0.197656</td></tr><tr><td>2026-01-30</td><td>&quot;ICF982536B&quot;</td><td>null</td><td>null</td><td>null</td><td>null</td><td>1.0</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (95, 7)\n",
              "┌────────────┬────────────┬────────────┬────────────┬────────────┬────────────┬────────────┐\n",
              "│ date       ┆ asset_id   ┆ IC28E9776F ┆ IC63213253 ┆ IC83A1B819 ┆ ICA17F00B9 ┆ ICF982536B │\n",
              "│ ---        ┆ ---        ┆ ---        ┆ ---        ┆ ---        ┆ ---        ┆ ---        │\n",
              "│ date       ┆ str        ┆ f32        ┆ f32        ┆ f32        ┆ f32        ┆ f32        │\n",
              "╞════════════╪════════════╪════════════╪════════════╪════════════╪════════════╪════════════╡\n",
              "│ 2026-01-05 ┆ IC28E9776F ┆ 1.0        ┆ 1.0        ┆ null       ┆ null       ┆ null       │\n",
              "│ 2026-01-05 ┆ IC63213253 ┆ null       ┆ 1.0        ┆ null       ┆ null       ┆ null       │\n",
              "│ 2026-01-05 ┆ IC83A1B819 ┆ null       ┆ null       ┆ 1.0        ┆ 1.0        ┆ 1.0        │\n",
              "│ 2026-01-05 ┆ ICA17F00B9 ┆ null       ┆ null       ┆ null       ┆ 1.0        ┆ 1.0        │\n",
              "│ 2026-01-05 ┆ ICF982536B ┆ null       ┆ null       ┆ null       ┆ null       ┆ 1.0        │\n",
              "│ …          ┆ …          ┆ …          ┆ …          ┆ …          ┆ …          ┆ …          │\n",
              "│ 2026-01-30 ┆ IC28E9776F ┆ 1.0        ┆ 0.585247   ┆ null       ┆ null       ┆ null       │\n",
              "│ 2026-01-30 ┆ IC63213253 ┆ null       ┆ 1.0        ┆ null       ┆ null       ┆ null       │\n",
              "│ 2026-01-30 ┆ IC83A1B819 ┆ null       ┆ null       ┆ 1.0        ┆ 0.348917   ┆ -0.033258  │\n",
              "│ 2026-01-30 ┆ ICA17F00B9 ┆ null       ┆ null       ┆ null       ┆ 1.0        ┆ -0.197656  │\n",
              "│ 2026-01-30 ┆ ICF982536B ┆ null       ┆ null       ┆ null       ┆ null       ┆ 1.0        │\n",
              "└────────────┴────────────┴────────────┴────────────┴────────────┴────────────┴────────────┘"
            ]
          },
          "execution_count": 13,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# for small portfolios, the dataframe is small enough to pivot and display\n",
        "(\n",
        "    idio_corr_df.pivot(\"asset_id_right\", index=(\"date\", \"asset_id\"), maintain_order=True, sort_columns=True)\n",
        "    .filter(pl.col(\"date\") > pl.col(\"date\").min())\n",
        ")"
      ]
    }
  ],
  "metadata": {
    "jupytext": {
      "text_representation": {
        "extension": ".py",
        "format_name": "percent",
        "format_version": "1.3",
        "jupytext_version": "1.19.3"
      }
    },
    "kernelspec": {
      "display_name": ".venv",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.11.15"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 5
}