{
  "cells": [
    {
      "cell_type": "markdown",
      "id": "f8bc26bc",
      "metadata": {},
      "source": [
        "# Risk Datasets\n",
        "\n",
        "In this tutorial we demonstrate the Bayesline Risk Datasets API, which lets you define new datasets that serve as the foundation for estimating factor risk models.\n",
        "\n",
        "A risk dataset comprises all underlying data necessary to build a risk model:\n",
        "* factor exposures (including style exposures, industry/regional exposures, etc.)\n",
        "* asset price data \n",
        "* fundamental data (market caps)\n",
        "* security master data\n",
        "\n",
        "\n",
        "In this first iteration you can ingest custom exposures into the Bayesline ecosystem while relying on Bayesline data for everything else. In subsequent product iterations you will be able to bring custom data for all other items as well, mixing and matching which data you supply and which data Bayesline supplies.\n",
        "\n",
        "In this notebook we will introduce and explore:\n",
        "* *system datasets* and *user datasets* and how to list them\n",
        "* how to create a new risk dataset\n",
        "* how to create a new risk dataset with custom exposures\n",
        "* how to add exposures to an existing risk dataset"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "676c1e6d",
      "metadata": {},
      "source": [
        "## Imports & Setup\n",
        "\n",
        "For this tutorial notebook, you will need to import the following packages."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "id": "0935c4bb",
      "metadata": {},
      "outputs": [],
      "source": [
        "import datetime as dt\n",
        "import numpy as np\n",
        "import polars as pl\n",
        "\n",
        "from bayesline.apiclient import BayeslineApiClient\n",
        "\n",
        "from bayesline.api.equity import (\n",
        "    CategoricalExposureGroupSettings,\n",
        "    CategoricalFilterSettings,\n",
        "    ContinuousExposureGroupSettings,\n",
        "    ExposureSettings,\n",
        "    FactorRiskModelSettings,\n",
        "    ModelConstructionSettings,\n",
        "    RiskDatasetHuberRegressionExposureSettings,\n",
        "    DerivedRiskDatasetSettings,\n",
        "    RiskDatasetUnitExposureSettings,\n",
        "    RiskDatasetUploadedExposureSettings,\n",
        "    UniverseSettings\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "2fcc7702",
      "metadata": {},
      "source": [
        "We will also need to have a Bayesline API client configured."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "f3d284f1",
      "metadata": {
        "tags": [
          "skip-execution"
        ]
      },
      "outputs": [],
      "source": [
        "bln = BayeslineApiClient.new_client(\n",
        "    endpoint=\"https://[ENDPOINT]\",\n",
        "    api_key=\"[API-KEY]\",\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "fb44ed03",
      "metadata": {},
      "source": [
        "The main entry point for the Risk Datasets API is `bln.equity.riskdatasets`. All dataset functionality can be reached from there.\n",
        "\n",
        "See here for relevant docs:\n",
        "* [Risk Datasets API Summary](https://docs.bayesline.com/0.14.0/_autosummary/bayesline.api.equity.RiskDatasetLoaderApi.html)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "id": "f039242c",
      "metadata": {},
      "outputs": [],
      "source": [
        "risk_datasets = bln.equity.riskdatasets"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "a4c141f1",
      "metadata": {},
      "source": [
        "## Obtaining Available Datasets\n",
        "\n",
        "To list existing datasets we use the `get_dataset_names` method. Newly created datasets appear in this list, and their names are used downstream when creating risk model specifications.\n",
        "\n",
        "We distinguish between *system* and *user* datasets. *System* datasets are available to all users, e.g. the *Bayesline-Global* dataset. *User* datasets are created and owned by an individual user."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "id": "e64e4d46",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{'bayesline/Bayesline-US-500-1y': 'ready',\n",
              " 'bayesline/Bayesline-US-All-1y': 'ready'}"
            ]
          },
          "execution_count": 4,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# default is \"All\", i.e. both system and user datasets\n",
        "risk_datasets.get_dataset_names()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "id": "c128485c",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{}"
            ]
          },
          "execution_count": 5,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_datasets.get_dataset_names(mode=\"User\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "id": "9b181aa3",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{'bayesline/Bayesline-US-500-1y': 'ready',\n",
              " 'bayesline/Bayesline-US-All-1y': 'ready'}"
            ]
          },
          "execution_count": 6,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_datasets.get_dataset_names(mode=\"System\")"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "d87f024f",
      "metadata": {},
      "source": [
        "## Creating a New Dataset\n",
        "\n",
        "When creating a new risk dataset we use the `create_dataset` method, for which we need to provide a *dataset name* and a `DerivedRiskDatasetSettings` object.\n",
        "\n",
        "At a bare minimum we need to specify a *reference dataset*, which is an existing dataset that all input data will be sourced from. Customization is then introduced by selectively specifying which data the user brings in.\n",
        "\n",
        "Note that the minimal configuration below effectively creates a copy of the reference dataset.\n",
        "\n",
        "See here for relevant docs:\n",
        "* [Risk Datasets Settings](https://docs.bayesline.com/0.14.0/_autosummary/bayesline.api.equity.DerivedRiskDatasetSettings.html)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 7,
      "id": "b76a3c4c",
      "metadata": {},
      "outputs": [],
      "source": [
        "settings = DerivedRiskDatasetSettings(\n",
        "    reference_dataset=\"bayesline/Bayesline-US-All-1y\"\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 8,
      "id": "0e40d4a6",
      "metadata": {},
      "outputs": [],
      "source": [
        "risk_dataset_api = risk_datasets.create_dataset(\"My-Dataset\", settings=settings)"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "6a73b038",
      "metadata": {},
      "source": [
        "The `create_dataset` invocation above does two things:\n",
        "\n",
        "1. Adds the given settings to the settings registry under the given name.\n",
        "2. Produces the physical dataset according to the given settings.\n",
        "\n",
        "Note that we could have saved the settings in the settings registry directly, which would have skipped step 2. This is perfectly feasible but requires us to invoke the dataset creation separately (explained below).\n",
        "\n",
        "We can verify that the settings were registered by inspecting the registry directly. Note that *system* datasets are **not** included."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 9,
      "id": "ec785d73",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "SettingsIdentifiers(id_to_name={1: 'My-Dataset'}, name_to_id={'My-Dataset': 1}, id_to_type={1: <class 'bayesline.api._src.equity.riskdataset_settings.DerivedRiskDatasetSettings'>}, id_to_user_email={1: 'integration@bayesline.com'}, id_to_created_on={1: datetime.datetime(2026, 6, 15, 2, 19, 7, 53364, tzinfo=TzInfo(0))}, id_to_last_updated_on={1: datetime.datetime(2026, 6, 15, 2, 19, 7, 53364, tzinfo=TzInfo(0))})"
            ]
          },
          "execution_count": 9,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_datasets.settings.get_identifiers()"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "29891c85",
      "metadata": {},
      "source": [
        "One thing we can do right away with the returned `risk_dataset_api` is *describe* it to inspect the available styles, industries, etc. Note that these immediately flow through to the relevant settings menus on `bln.equity.universes`, `bln.equity.exposures`, etc.\n",
        "\n",
        "See here for relevant docs:\n",
        "* [RiskDatasetProperties](https://docs.bayesline.com/0.14.0/_autosummary/bayesline.api.equity.RiskDatasetProperties.html)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 10,
      "id": "6dc229ae",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{'size': ['log_market_cap', 'log_total_assets'],\n",
              " 'value': ['book_to_price'],\n",
              " 'growth': ['price_to_earnings'],\n",
              " 'volatility': ['beta', 'sigma', 'sigma_eps'],\n",
              " 'momentum': ['mom12', 'mom6'],\n",
              " 'dividend': ['dividend_yield'],\n",
              " 'leverage': ['debt_to_assets', 'debt_to_equity']}"
            ]
          },
          "execution_count": 10,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_dataset_props = risk_dataset_api.describe()\n",
        "\n",
        "risk_dataset_props.exposure_settings_menu.continuous_hierarchies[\"style\"]"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "c7912e50",
      "metadata": {},
      "source": [
        "## Loading and Updating an Existing Dataset\n",
        "\n",
        "Step 2 from above can be invoked manually at any time to trigger a full recreation of the dataset, using the latest versions of all referenced datasets. To do this we load the dataset we previously created, using either its name or its globally unique identifier. Note that the system tracks the versions of all input data, so a dataset won't be updated if it is already at the latest version."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "id": "a5d19abb",
      "metadata": {},
      "outputs": [],
      "source": [
        "risk_dataset_api = risk_datasets.load(\"My-Dataset\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 12,
      "id": "05aab806",
      "metadata": {},
      "outputs": [],
      "source": [
        "update_result = risk_dataset_api.update()"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "edadbe1b",
      "metadata": {},
      "source": [
        "The `RiskDatasetUpdateResult` gives summary information about the update process."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 13,
      "id": "4b13472c",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "RiskDatasetUpdateResult()"
            ]
          },
          "execution_count": 13,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "update_result"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "18c06379",
      "metadata": {},
      "source": [
        "### Using the Custom Dataset\n",
        "\n",
        "We can now use the dataset to produce risk models."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 14,
      "id": "23fdc665",
      "metadata": {},
      "outputs": [],
      "source": [
        "riskmodel_engine = bln.equity.riskmodels.load(\n",
        "    FactorRiskModelSettings(\n",
        "        universe=UniverseSettings(),\n",
        "        exposures=ExposureSettings(\n",
        "            exposures=[\n",
        "                ContinuousExposureGroupSettings(hierarchy=\"market\"),\n",
        "                CategoricalExposureGroupSettings(hierarchy=\"trbc\"),\n",
        "                CategoricalExposureGroupSettings(hierarchy=\"continent\"),\n",
        "                ContinuousExposureGroupSettings(\n",
        "                    hierarchy=\"style\", standardize_method=\"equal_weighted\"\n",
        "                ),\n",
        "            ]\n",
        "        ),\n",
        "        modelconstruction=ModelConstructionSettings(\n",
        "            estimation_universe=None,\n",
        "            zero_sum_constraints={\"trbc\": \"mcap_weighted\", \"continent\": \"mcap_weighted\"},\n",
        "        ),\n",
        "    ).with_dataset(\"My-Dataset\")\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 15,
      "id": "41ab1b31",
      "metadata": {},
      "outputs": [],
      "source": [
        "risk_model_api = riskmodel_engine.get_model()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 16,
      "id": "548946dd",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 27)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>continent.Africa</th><th>continent.America</th><th>continent.Asia</th><th>continent.Europe</th><th>continent.Oceania</th><th>market.Market</th><th>style.Dividend</th><th>style.Growth</th><th>style.Leverage</th><th>style.Momentum</th><th>style.Size</th><th>style.Value</th><th>style.Volatility</th><th>trbc.Academic &amp; Educational Services</th><th>trbc.Basic Materials</th><th>trbc.Consumer Cyclicals</th><th>trbc.Consumer Non-Cyclicals</th><th>trbc.Energy</th><th>trbc.Financials</th><th>trbc.Government Activity</th><th>trbc.Healthcare</th><th>trbc.Industrials</th><th>trbc.Institutions, Associations &amp; Organizations</th><th>trbc.Real Estate</th><th>trbc.Technology</th><th>trbc.Utilities</th></tr><tr><td>date</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2025-03-31</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr><tr><td>2025-04-01</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.000784</td><td>-0.000745</td><td>0.001477</td><td>-0.000056</td><td>0.00123</td><td>0.000771</td><td>-0.001448</td><td>0.002358</td><td>0.010905</td><td>-0.000872</td><td>0.002733</td><td>0.001298</td><td>0.004916</td><td>0.001269</td><td>-0.003152</td><td>-0.022817</td><td>0.002555</td><td>0.006851</td><td>-0.001532</td><td>0.003365</td><td>0.002273</td></tr><tr><td>2025-04-02</td><td>0.0</td><td>0.0
</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.009932</td><td>-0.001131</td><td>0.000139</td><td>0.000634</td><td>-0.000503</td><td>0.002288</td><td>0.000159</td><td>0.012323</td><td>0.008189</td><td>-0.000547</td><td>0.004179</td><td>-0.004415</td><td>-0.002994</td><td>0.003798</td><td>-0.019164</td><td>0.003298</td><td>0.002464</td><td>0.01463</td><td>-0.000624</td><td>-0.003934</td><td>0.000078</td></tr><tr><td>2025-04-03</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>-0.038712</td><td>0.000288</td><td>-0.000492</td><td>0.000053</td><td>0.003965</td><td>-0.012713</td><td>-0.005916</td><td>-0.034854</td><td>0.011935</td><td>-0.001103</td><td>-0.007737</td><td>0.028514</td><td>-0.019052</td><td>-0.010367</td><td>-0.045227</td><td>0.02465</td><td>-0.008115</td><td>-0.054794</td><td>0.001435</td><td>-0.001348</td><td>0.026905</td></tr><tr><td>2025-04-04</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>-0.031548</td><td>-0.002094</td><td>-0.000304</td><td>0.000547</td><td>-0.004676</td><td>-0.013628</td><td>0.000887</td><td>-0.011226</td><td>0.014085</td><td>-0.007171</td><td>0.023238</td><td>0.007243</td><td>-0.02979</td><td>-0.005344</td><td>0.105213</td><td>-0.001438</td><td>-0.001301</td><td>-0.020153</td><td>0.006923</td><td>-0.001893</td><td>0.000743</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 27)\n",
              "┌───────────┬───────────┬───────────┬───────────┬───┬───────────┬───────────┬───────────┬──────────┐\n",
              "│ date      ┆ continent ┆ continent ┆ continent ┆ … ┆ trbc.Inst ┆ trbc.Real ┆ trbc.Tech ┆ trbc.Uti │\n",
              "│ ---       ┆ .Africa   ┆ .America  ┆ .Asia     ┆   ┆ itutions, ┆ Estate    ┆ nology    ┆ lities   │\n",
              "│ date      ┆ ---       ┆ ---       ┆ ---       ┆   ┆ Associati ┆ ---       ┆ ---       ┆ ---      │\n",
              "│           ┆ f32       ┆ f32       ┆ f32       ┆   ┆ on…       ┆ f32       ┆ f32       ┆ f32      │\n",
              "│           ┆           ┆           ┆           ┆   ┆ ---       ┆           ┆           ┆          │\n",
              "│           ┆           ┆           ┆           ┆   ┆ f32       ┆           ┆           ┆          │\n",
              "╞═══════════╪═══════════╪═══════════╪═══════════╪═══╪═══════════╪═══════════╪═══════════╪══════════╡\n",
              "│ 2025-03-3 ┆ 0.0       ┆ 0.0       ┆ 0.0       ┆ … ┆ 0.0       ┆ 0.0       ┆ 0.0       ┆ 0.0      │\n",
              "│ 1         ┆           ┆           ┆           ┆   ┆           ┆           ┆           ┆          │\n",
              "│ 2025-04-0 ┆ 0.0       ┆ 0.0       ┆ 0.0       ┆ … ┆ 0.006851  ┆ -0.001532 ┆ 0.003365  ┆ 0.002273 │\n",
              "│ 1         ┆           ┆           ┆           ┆   ┆           ┆           ┆           ┆          │\n",
              "│ 2025-04-0 ┆ 0.0       ┆ 0.0       ┆ 0.0       ┆ … ┆ 0.01463   ┆ -0.000624 ┆ -0.003934 ┆ 0.000078 │\n",
              "│ 2         ┆           ┆           ┆           ┆   ┆           ┆           ┆           ┆          │\n",
              "│ 2025-04-0 ┆ 0.0       ┆ 0.0       ┆ 0.0       ┆ … ┆ -0.054794 ┆ 0.001435  ┆ -0.001348 ┆ 0.026905 │\n",
              "│ 3         ┆           ┆           ┆           ┆   ┆           ┆           ┆           ┆          │\n",
              "│ 2025-04-0 ┆ 0.0       ┆ 0.0       ┆ 0.0       ┆ … ┆ -0.020153 ┆ 0.006923  ┆ -0.001893 ┆ 0.000743 │\n",
              "│ 4         ┆           ┆           ┆           ┆   ┆           ┆           ┆           ┆          │\n",
              "└───────────┴───────────┴───────────┴───────────┴───┴───────────┴───────────┴───────────┴──────────┘"
            ]
          },
          "execution_count": 16,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_model_api.fret().head()"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "691b9c8e",
      "metadata": {},
      "source": [
        "### Updating System Datasets\n",
        "\n",
        "Users with *administrator* permissions can update system-wide datasets, e.g. `Bayesline-Global`. In practice this means that the remote source is checked for new data (e.g. as part of a daily data update) and any changes are incorporated into the Bayesline ecosystem. Updating a system dataset affects all users."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 17,
      "id": "8eb72441",
      "metadata": {},
      "outputs": [],
      "source": [
        "bayesline_risk_dataset = risk_datasets.load(\"bayesline/Bayesline-US-All-1y\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 18,
      "id": "73c961b0",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "RiskDatasetUpdateResult()"
            ]
          },
          "execution_count": 18,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "bayesline_risk_dataset.update()"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "08b99dc1",
      "metadata": {},
      "source": [
        "## Creating a Custom Risk Dataset\n",
        "\n",
        "For the remainder of the tutorial we will:\n",
        "* upload sample exposures\n",
        "* upload a simulated time series (e.g. an oil price)\n",
        "* create a custom risk dataset using the uploaded exposures and estimate Huber regression exposures for the simulated time series\n",
        "\n",
        "### Uploading Custom Exposures\n",
        "In the example below we upload a set of sample exposures for the top 100 US companies. We first use the *Exposures API* to create the sample exposures and then the *Uploaders API* (see the [Data Uploaders Tutorial](https://docs.bayesline.com/0.14.0/notebooks/tutorial_uploaders.html) for a detailed walkthrough) to upload them as a custom exposure dataset.\n",
        "\n",
        "#### Creating Sample Exposures"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 19,
      "id": "6e2f5353",
      "metadata": {},
      "outputs": [],
      "source": [
        "exposures_api = bln.equity.exposures.load(\n",
        "    ExposureSettings(\n",
        "        exposures=[\n",
        "            ContinuousExposureGroupSettings(\n",
        "                hierarchy=\"style\",\n",
        "                include=[\"log_market_cap\", \"mom12\"],\n",
        "            )\n",
        "        ]\n",
        "    ).with_dataset(\"bayesline/Bayesline-US-All-1y\")\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 20,
      "id": "592a0a42",
      "metadata": {},
      "outputs": [],
      "source": [
        "universe_settings = UniverseSettings(\n",
        "    categorical_filters=[\n",
        "        CategoricalFilterSettings(hierarchy=\"continent\", include=[\"USA\"]),\n",
        "    ],\n",
        ")\n",
        "exposures_df = exposures_api.get(universe_settings, standardize_universe=None)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 21,
      "id": "4f6de903",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>bayesid</th><th>style.Size</th><th>style.Momentum</th></tr><tr><td>date</td><td>str</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2026-03-31</td><td>&quot;ICFFE60191&quot;</td><td>-0.109375</td><td>0.811035</td></tr><tr><td>2026-03-31</td><td>&quot;ICFFE938FD&quot;</td><td>1.0625</td><td>0.026306</td></tr><tr><td>2026-03-31</td><td>&quot;ICFFE94AED&quot;</td><td>-0.236206</td><td>0.912598</td></tr><tr><td>2026-03-31</td><td>&quot;ICFFEBBB38&quot;</td><td>0.678711</td><td>0.611328</td></tr><tr><td>2026-03-31</td><td>&quot;ICFFF2F5AD&quot;</td><td>-0.438721</td><td>0.2229</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 4)\n",
              "┌────────────┬────────────┬────────────┬────────────────┐\n",
              "│ date       ┆ bayesid    ┆ style.Size ┆ style.Momentum │\n",
              "│ ---        ┆ ---        ┆ ---        ┆ ---            │\n",
              "│ date       ┆ str        ┆ f32        ┆ f32            │\n",
              "╞════════════╪════════════╪════════════╪════════════════╡\n",
              "│ 2026-03-31 ┆ ICFFE60191 ┆ -0.109375  ┆ 0.811035       │\n",
              "│ 2026-03-31 ┆ ICFFE938FD ┆ 1.0625     ┆ 0.026306       │\n",
              "│ 2026-03-31 ┆ ICFFE94AED ┆ -0.236206  ┆ 0.912598       │\n",
              "│ 2026-03-31 ┆ ICFFEBBB38 ┆ 0.678711   ┆ 0.611328       │\n",
              "│ 2026-03-31 ┆ ICFFF2F5AD ┆ -0.438721  ┆ 0.2229         │\n",
              "└────────────┴────────────┴────────────┴────────────────┘"
            ]
          },
          "execution_count": 21,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposures_df.tail()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 22,
      "id": "4ef1ac23",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 1)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>bayesid</th></tr><tr><td>str</td></tr></thead><tbody><tr><td>&quot;IC5FAAD6EF&quot;</td></tr><tr><td>&quot;IC87439A97&quot;</td></tr><tr><td>&quot;IC7F196659&quot;</td></tr><tr><td>&quot;ICBE3882C7&quot;</td></tr><tr><td>&quot;ICAFF7F1E7&quot;</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 1)\n",
              "┌────────────┐\n",
              "│ bayesid    │\n",
              "│ ---        │\n",
              "│ str        │\n",
              "╞════════════╡\n",
              "│ IC5FAAD6EF │\n",
              "│ IC87439A97 │\n",
              "│ IC7F196659 │\n",
              "│ ICBE3882C7 │\n",
              "│ ICAFF7F1E7 │\n",
              "└────────────┘"
            ]
          },
          "execution_count": 22,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "top_100_assets = (\n",
        "    exposures_df\n",
        "    .group_by(\"bayesid\")\n",
        "    .agg(pl.col(\"style.Size\").mean())\n",
        "    .sort(\"style.Size\")\n",
        "    .tail(100)\n",
        "    .select(\"bayesid\")\n",
        ")\n",
        "\n",
        "top_100_assets.head()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 23,
      "id": "fba8300a",
      "metadata": {},
      "outputs": [],
      "source": [
        "exposures_df = (\n",
        "    exposures_df\n",
        "    .join(top_100_assets, on=\"bayesid\", how=\"semi\")\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "c74dab51",
      "metadata": {},
      "source": [
        "#### Upload the Sample Exposures"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 24,
      "id": "ae116796",
      "metadata": {},
      "outputs": [],
      "source": [
        "uploaders = bln.equity.uploaders"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 25,
      "id": "e4ce1035",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "['exchange_dates',\n",
              " 'exchange_rates',\n",
              " 'exposures',\n",
              " 'factors',\n",
              " 'hierarchies',\n",
              " 'idmap',\n",
              " 'market_cap',\n",
              " 'portfolios',\n",
              " 'price',\n",
              " 'series']"
            ]
          },
          "execution_count": 25,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "uploaders.get_data_types()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 26,
      "id": "539ae9a6",
      "metadata": {},
      "outputs": [],
      "source": [
        "exposure_uploader = uploaders.get_data_type(\"exposures\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 27,
      "id": "c8ba8b59",
      "metadata": {},
      "outputs": [],
      "source": [
        "exposure_dataset = exposure_uploader.get_or_create_dataset(\"My-US-Top100-Exposures\")"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "b3d29b90",
      "metadata": {},
      "source": [
        "For the uploader we need to provide the data in one of the accepted input formats.\n",
        "Below we choose the *Long-Format* parser and transform our `exposures_df` to fit this format."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 28,
      "id": "75231568",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "['Long-Format', 'Wide-Format', 'Windowed-Long-Format']"
            ]
          },
          "execution_count": 28,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposure_dataset.get_parser_names()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 29,
      "id": "3c0be23a",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (8, 6)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>asset_id_type</th><th>factor_group</th><th>factor</th><th>exposure</th></tr><tr><td>date</td><td>str</td><td>str</td><td>str</td><td>str</td><td>f64</td></tr></thead><tbody><tr><td>2025-01-06</td><td>&quot;GOOG&quot;</td><td>&quot;cusip9&quot;</td><td>&quot;style&quot;</td><td>&quot;momentum_6&quot;</td><td>-0.3</td></tr><tr><td>2025-01-06</td><td>&quot;GOOG&quot;</td><td>&quot;cusip9&quot;</td><td>&quot;market&quot;</td><td>&quot;market&quot;</td><td>1.0</td></tr><tr><td>2025-01-06</td><td>&quot;AAPL&quot;</td><td>&quot;cusip9&quot;</td><td>&quot;style&quot;</td><td>&quot;momentum_6&quot;</td><td>0.1</td></tr><tr><td>2025-01-06</td><td>&quot;AAPL&quot;</td><td>&quot;cusip9&quot;</td><td>&quot;market&quot;</td><td>&quot;market&quot;</td><td>1.0</td></tr><tr><td>2025-01-07</td><td>&quot;GOOG&quot;</td><td>&quot;cusip9&quot;</td><td>&quot;style&quot;</td><td>&quot;momentum_6&quot;</td><td>-0.28</td></tr><tr><td>2025-01-07</td><td>&quot;GOOG&quot;</td><td>&quot;cusip9&quot;</td><td>&quot;market&quot;</td><td>&quot;market&quot;</td><td>1.0</td></tr><tr><td>2025-01-07</td><td>&quot;AAPL&quot;</td><td>&quot;cusip9&quot;</td><td>&quot;style&quot;</td><td>&quot;momentum_6&quot;</td><td>0.0</td></tr><tr><td>2025-01-07</td><td>&quot;AAPL&quot;</td><td>&quot;cusip9&quot;</td><td>&quot;market&quot;</td><td>&quot;market&quot;</td><td>1.0</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (8, 6)\n",
              "┌────────────┬──────────┬───────────────┬──────────────┬────────────┬──────────┐\n",
              "│ date       ┆ asset_id ┆ asset_id_type ┆ factor_group ┆ factor     ┆ exposure │\n",
              "│ ---        ┆ ---      ┆ ---           ┆ ---          ┆ ---        ┆ ---      │\n",
              "│ date       ┆ str      ┆ str           ┆ str          ┆ str        ┆ f64      │\n",
              "╞════════════╪══════════╪═══════════════╪══════════════╪════════════╪══════════╡\n",
              "│ 2025-01-06 ┆ GOOG     ┆ cusip9        ┆ style        ┆ momentum_6 ┆ -0.3     │\n",
              "│ 2025-01-06 ┆ GOOG     ┆ cusip9        ┆ market       ┆ market     ┆ 1.0      │\n",
              "│ 2025-01-06 ┆ AAPL     ┆ cusip9        ┆ style        ┆ momentum_6 ┆ 0.1      │\n",
              "│ 2025-01-06 ┆ AAPL     ┆ cusip9        ┆ market       ┆ market     ┆ 1.0      │\n",
              "│ 2025-01-07 ┆ GOOG     ┆ cusip9        ┆ style        ┆ momentum_6 ┆ -0.28    │\n",
              "│ 2025-01-07 ┆ GOOG     ┆ cusip9        ┆ market       ┆ market     ┆ 1.0      │\n",
              "│ 2025-01-07 ┆ AAPL     ┆ cusip9        ┆ style        ┆ momentum_6 ┆ 0.0      │\n",
              "│ 2025-01-07 ┆ AAPL     ┆ cusip9        ┆ market       ┆ market     ┆ 1.0      │\n",
              "└────────────┴──────────┴───────────────┴──────────────┴────────────┴──────────┘"
            ]
          },
          "execution_count": 29,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposure_dataset.get_parser(\"Long-Format\").get_examples()[0]"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 30,
      "id": "5b38c76f",
      "metadata": {
        "lines_to_next_cell": 2
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 6)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>factor</th><th>exposure</th><th>asset_id_type</th><th>factor_group</th></tr><tr><td>date</td><td>str</td><td>str</td><td>f32</td><td>str</td><td>str</td></tr></thead><tbody><tr><td>2025-12-31</td><td>&quot;ICF1D1CC60&quot;</td><td>&quot;momentum&quot;</td><td>1.267578</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr><tr><td>2025-12-31</td><td>&quot;ICF7F6AFDC&quot;</td><td>&quot;momentum&quot;</td><td>-1.34668</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr><tr><td>2025-12-31</td><td>&quot;ICF982536B&quot;</td><td>&quot;momentum&quot;</td><td>0.432129</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr><tr><td>2025-12-31</td><td>&quot;ICFB370DC9&quot;</td><td>&quot;momentum&quot;</td><td>0.538086</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr><tr><td>2025-12-31</td><td>&quot;ICFE6CF9D5&quot;</td><td>&quot;momentum&quot;</td><td>-1.261719</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 6)\n",
              "┌────────────┬────────────┬──────────┬───────────┬───────────────┬──────────────┐\n",
              "│ date       ┆ asset_id   ┆ factor   ┆ exposure  ┆ asset_id_type ┆ factor_group │\n",
              "│ ---        ┆ ---        ┆ ---      ┆ ---       ┆ ---           ┆ ---          │\n",
              "│ date       ┆ str        ┆ str      ┆ f32       ┆ str           ┆ str          │\n",
              "╞════════════╪════════════╪══════════╪═══════════╪═══════════════╪══════════════╡\n",
              "│ 2025-12-31 ┆ ICF1D1CC60 ┆ momentum ┆ 1.267578  ┆ bayesid       ┆ style        │\n",
              "│ 2025-12-31 ┆ ICF7F6AFDC ┆ momentum ┆ -1.34668  ┆ bayesid       ┆ style        │\n",
              "│ 2025-12-31 ┆ ICF982536B ┆ momentum ┆ 0.432129  ┆ bayesid       ┆ style        │\n",
              "│ 2025-12-31 ┆ ICFB370DC9 ┆ momentum ┆ 0.538086  ┆ bayesid       ┆ style        │\n",
              "│ 2025-12-31 ┆ ICFE6CF9D5 ┆ momentum ┆ -1.261719 ┆ bayesid       ┆ style        │\n",
              "└────────────┴────────────┴──────────┴───────────┴───────────────┴──────────────┘"
            ]
          },
          "execution_count": 30,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "upload_df = (\n",
        "    exposures_df.filter(pl.col(\"date\") <= dt.date(2025, 12, 31))\n",
        "    .rename({\"bayesid\": \"asset_id\", \"style.Size\": \"size\", \"style.Momentum\": \"momentum\"})\n",
        "    .unpivot(\n",
        "        on=[\"size\", \"momentum\"],\n",
        "        index=[\"date\", \"asset_id\"],\n",
        "        variable_name=\"factor\",\n",
        "        value_name=\"exposure\",\n",
        "    )\n",
        "    .with_columns(\n",
        "        pl.lit(\"bayesid\").alias(\"asset_id_type\"),\n",
        "        pl.lit(\"style\").alias(\"factor_group\")\n",
        "    )\n",
        ")\n",
        "\n",
        "upload_df.tail()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 31,
      "id": "26903c9e",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "UploadCommitResult(version=1, committed_names=[])"
            ]
          },
          "execution_count": 31,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposure_dataset.fast_commit(upload_df, mode=\"append\")"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "69e40c76",
      "metadata": {},
      "source": [
        "To verify that our data was uploaded correctly, we can read it back from the exposure dataset."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 32,
      "id": "48f4b36e",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 6)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>asset_id_type</th><th>factor_group</th><th>factor</th><th>exposure</th></tr><tr><td>date</td><td>str</td><td>str</td><td>str</td><td>str</td><td>f32</td></tr></thead><tbody><tr><td>2025-03-31</td><td>&quot;IC006CA2E0&quot;</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td><td>&quot;size&quot;</td><td>2.908203</td></tr><tr><td>2025-03-31</td><td>&quot;IC0390CC2A&quot;</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td><td>&quot;size&quot;</td><td>2.859375</td></tr><tr><td>2025-03-31</td><td>&quot;IC03C39235&quot;</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td><td>&quot;size&quot;</td><td>2.892578</td></tr><tr><td>2025-03-31</td><td>&quot;IC06870B83&quot;</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td><td>&quot;size&quot;</td><td>3.0</td></tr><tr><td>2025-03-31</td><td>&quot;IC069B311C&quot;</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td><td>&quot;size&quot;</td><td>3.0</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 6)\n",
              "┌────────────┬────────────┬───────────────┬──────────────┬────────┬──────────┐\n",
              "│ date       ┆ asset_id   ┆ asset_id_type ┆ factor_group ┆ factor ┆ exposure │\n",
              "│ ---        ┆ ---        ┆ ---           ┆ ---          ┆ ---    ┆ ---      │\n",
              "│ date       ┆ str        ┆ str           ┆ str          ┆ str    ┆ f32      │\n",
              "╞════════════╪════════════╪═══════════════╪══════════════╪════════╪══════════╡\n",
              "│ 2025-03-31 ┆ IC006CA2E0 ┆ bayesid       ┆ style        ┆ size   ┆ 2.908203 │\n",
              "│ 2025-03-31 ┆ IC0390CC2A ┆ bayesid       ┆ style        ┆ size   ┆ 2.859375 │\n",
              "│ 2025-03-31 ┆ IC03C39235 ┆ bayesid       ┆ style        ┆ size   ┆ 2.892578 │\n",
              "│ 2025-03-31 ┆ IC06870B83 ┆ bayesid       ┆ style        ┆ size   ┆ 3.0      │\n",
              "│ 2025-03-31 ┆ IC069B311C ┆ bayesid       ┆ style        ┆ size   ┆ 3.0      │\n",
              "└────────────┴────────────┴───────────────┴──────────────┴────────┴──────────┘"
            ]
          },
          "execution_count": 32,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposure_dataset.get_data().collect().head()"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "af13cfa7",
      "metadata": {},
      "source": [
        "### Uploading Time Series Data\n",
        "\n",
        "In the example below we upload two hypothetical time series (e.g. the oil price or the total returns of a technology index). These can then be used to create asset-level exposures using Bayesline's Huber regression framework.\n",
        "\n",
        "#### Creating Sample Time Series\n",
        "Below we simply create two random time series by sampling a normal distribution with a positive drift."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 33,
      "id": "c0bdb9b9",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 3)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>returns_oil</th><th>returns_tech</th></tr><tr><td>date</td><td>f64</td><td>f64</td></tr></thead><tbody><tr><td>2025-12-27</td><td>0.003365</td><td>-0.009885</td></tr><tr><td>2025-12-28</td><td>-0.000773</td><td>0.007268</td></tr><tr><td>2025-12-29</td><td>0.021132</td><td>0.006827</td></tr><tr><td>2025-12-30</td><td>0.015934</td><td>0.00405</td></tr><tr><td>2025-12-31</td><td>0.004058</td><td>0.005765</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 3)\n",
              "┌────────────┬─────────────┬──────────────┐\n",
              "│ date       ┆ returns_oil ┆ returns_tech │\n",
              "│ ---        ┆ ---         ┆ ---          │\n",
              "│ date       ┆ f64         ┆ f64          │\n",
              "╞════════════╪═════════════╪══════════════╡\n",
              "│ 2025-12-27 ┆ 0.003365    ┆ -0.009885    │\n",
              "│ 2025-12-28 ┆ -0.000773   ┆ 0.007268     │\n",
              "│ 2025-12-29 ┆ 0.021132    ┆ 0.006827     │\n",
              "│ 2025-12-30 ┆ 0.015934    ┆ 0.00405      │\n",
              "│ 2025-12-31 ┆ 0.004058    ┆ 0.005765     │\n",
              "└────────────┴─────────────┴──────────────┘"
            ]
          },
          "execution_count": 33,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "dates = upload_df[\"date\"].unique().sort()\n",
        "mu, sigma = 0.0002, 0.01\n",
        "rng = np.random.default_rng(seed=42)\n",
        "returns_oil = rng.normal(mu, sigma, size=len(dates))\n",
        "returns_tech = rng.normal(mu, sigma, size=len(dates))\n",
        "\n",
        "returns_df = pl.DataFrame({\n",
        "    \"date\": dates,\n",
        "    \"returns_oil\": returns_oil,\n",
        "    \"returns_tech\": returns_tech\n",
        "})\n",
        "\n",
        "returns_df.tail()"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "000f9c5f",
      "metadata": {},
      "source": [
        "#### Upload the Factor Time Series\n",
        "\n",
        "This works the same way as above, except that we use the `factors` data type for the uploader."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 34,
      "id": "bdd7ea60",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "['exchange_dates',\n",
              " 'exchange_rates',\n",
              " 'exposures',\n",
              " 'factors',\n",
              " 'hierarchies',\n",
              " 'idmap',\n",
              " 'market_cap',\n",
              " 'portfolios',\n",
              " 'price',\n",
              " 'series']"
            ]
          },
          "execution_count": 34,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "uploaders.get_data_types()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 35,
      "id": "7a6c08be",
      "metadata": {},
      "outputs": [],
      "source": [
        "factor_uploader = bln.equity.uploaders.get_data_type(\"factors\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 36,
      "id": "c487c10d",
      "metadata": {},
      "outputs": [],
      "source": [
        "factor_ts_dataset = factor_uploader.get_or_create_dataset(\"Oil-and-Tech-Returns\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 37,
      "id": "aa8eba71",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "UploadCommitResult(version=1, committed_names=[])"
            ]
          },
          "execution_count": 37,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "factor_ts_dataset.fast_commit(returns_df, mode=\"append\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 38,
      "id": "88b05b6c",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 3)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>factor</th><th>value</th></tr><tr><td>date</td><td>str</td><td>f32</td></tr></thead><tbody><tr><td>2025-12-29</td><td>&quot;returns_tech&quot;</td><td>0.006827</td></tr><tr><td>2025-12-30</td><td>&quot;returns_oil&quot;</td><td>0.015934</td></tr><tr><td>2025-12-30</td><td>&quot;returns_tech&quot;</td><td>0.00405</td></tr><tr><td>2025-12-31</td><td>&quot;returns_oil&quot;</td><td>0.004058</td></tr><tr><td>2025-12-31</td><td>&quot;returns_tech&quot;</td><td>0.005765</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 3)\n",
              "┌────────────┬──────────────┬──────────┐\n",
              "│ date       ┆ factor       ┆ value    │\n",
              "│ ---        ┆ ---          ┆ ---      │\n",
              "│ date       ┆ str          ┆ f32      │\n",
              "╞════════════╪══════════════╪══════════╡\n",
              "│ 2025-12-29 ┆ returns_tech ┆ 0.006827 │\n",
              "│ 2025-12-30 ┆ returns_oil  ┆ 0.015934 │\n",
              "│ 2025-12-30 ┆ returns_tech ┆ 0.00405  │\n",
              "│ 2025-12-31 ┆ returns_oil  ┆ 0.004058 │\n",
              "│ 2025-12-31 ┆ returns_tech ┆ 0.005765 │\n",
              "└────────────┴──────────────┴──────────┘"
            ]
          },
          "execution_count": 38,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "factor_ts_dataset.get_data().collect().tail()"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "6aaf5361",
      "metadata": {},
      "source": [
        "### Creating a Custom Risk Dataset\n",
        "\n",
        "Recall that we named the custom exposure dataset `My-US-Top100-Exposures` and the factor time series dataset `Oil-and-Tech-Returns`. We will use these names to specify that the exposure input data for the new risk dataset should be sourced from these uploads.\n",
        "\n",
        "Also note from above that as a *factor group* we specified `style`. *Factor groups* are used to logically group exposures into *styles*, *regions*, *industries*, etc. This is particularly important if we bring more than one set of industry or region schemas (e.g. TRBC and GICS). \n",
        "Below we specify only the `style_factor_group` (meaning no other exposure groups will be brought in, even if they existed in our uploaded exposure dataset).\n",
        "\n",
        "Lastly, note that below we specify `exposures` as a list. We can reference more than one exposure upload and create a consolidated risk dataset from it. In fact we do just that in this example where we bring in two different sources of exposures.\n",
        "\n",
        "Note the following nuance:\n",
        "* We may want to bring in a market factor (and potentially industry and country factors). If we do not bring our own, we can stub in *unit* exposure dummies, shown below for the market factor.\n",
        "\n",
        "Below we use default settings for both the uploaded exposures and the Huber regressions. An extensive set of options is available; see the relevant docs:\n",
        "* [RiskDatasetUploadedExposureSettings](https://docs.bayesline.com/0.14.0/_autosummary/bayesline.api.equity.RiskDatasetUploadedExposureSettings.html)\n",
        "* [RiskDatasetHuberRegressionExposureSettings](https://docs.bayesline.com/0.14.0/_autosummary/bayesline.api.equity.RiskDatasetHuberRegressionExposureSettings.html)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 39,
      "id": "43be33c8",
      "metadata": {},
      "outputs": [],
      "source": [
        "riskdataset_settings = DerivedRiskDatasetSettings(\n",
        "    reference_dataset=\"bayesline/Bayesline-US-All-1y\",\n",
        "    exposures=[\n",
        "        RiskDatasetUnitExposureSettings(\n",
        "            factor_group=\"market\", factor=\"market\", factor_type=\"continuous\"\n",
        "        ),\n",
        "        RiskDatasetUploadedExposureSettings(\n",
        "            exposure_source=\"My-US-Top100-Exposures\",\n",
        "            continuous_factor_groups=[\"style\"],\n",
        "            factor_groups_gaussianize=[\"style\"],\n",
        "            factor_groups_fill_miss=[\"style\"],\n",
        "        ),\n",
        "        RiskDatasetHuberRegressionExposureSettings(\n",
        "            tsfactors_source=\"Oil-and-Tech-Returns\",\n",
        "        ),\n",
        "    ],\n",
        ")"
      ]
    },
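    {
      "cell_type": "markdown",
      "id": "9b2f4c6a",
      "metadata": {},
      "source": [
        "Conceptually, the unit exposure configured above assigns an exposure of 1.0 to every asset on every date. The sketch below builds such a column locally with polars for a tiny long-format frame. It only illustrates the idea and is not how Bayesline constructs it internally; the sample rows and the names `style_df`, `market_df` and `combined` are assumptions.\n",
        "\n",
        "```python\n",
        "import datetime as dt\n",
        "\n",
        "import polars as pl\n",
        "\n",
        "# Hypothetical long-format style exposures for two assets on two dates.\n",
        "style_df = pl.DataFrame(\n",
        "    {\n",
        "        \"date\": [dt.date(2025, 1, 6)] * 2 + [dt.date(2025, 1, 7)] * 2,\n",
        "        \"asset_id\": [\"GOOG\", \"AAPL\", \"GOOG\", \"AAPL\"],\n",
        "        \"asset_id_type\": [\"cusip9\"] * 4,\n",
        "        \"factor_group\": [\"style\"] * 4,\n",
        "        \"factor\": [\"momentum_6\"] * 4,\n",
        "        \"exposure\": [-0.3, 0.1, -0.28, 0.0],\n",
        "    }\n",
        ")\n",
        "\n",
        "# A unit exposure is a constant 1.0 for every (date, asset) pair.\n",
        "market_df = (\n",
        "    style_df.select(\"date\", \"asset_id\", \"asset_id_type\")\n",
        "    .unique(maintain_order=True)\n",
        "    .with_columns(\n",
        "        pl.lit(\"market\").alias(\"factor_group\"),\n",
        "        pl.lit(\"market\").alias(\"factor\"),\n",
        "        pl.lit(1.0).alias(\"exposure\"),\n",
        "    )\n",
        ")\n",
        "\n",
        "# Stack both sources into one long-format frame.\n",
        "combined = pl.concat([style_df, market_df]).sort(\"date\", \"asset_id\", \"factor\")\n",
        "print(combined)\n",
        "```\n",
        "\n",
        "The combined frame has the same shape as the long-format parser example shown earlier, with one `market` row per asset and date next to the style rows."
      ]
    },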
    {
      "cell_type": "code",
      "execution_count": 40,
      "id": "e7a86ff4",
      "metadata": {},
      "outputs": [],
      "source": [
        "risk_dataset_api = risk_datasets.create_dataset(\"My-Risk-Dataset\", settings=riskdataset_settings)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 41,
      "id": "6e0c91a6",
      "metadata": {},
      "outputs": [],
      "source": [
        "risk_dataset_props = risk_dataset_api.describe()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 42,
      "id": "dc2deaad",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{'market': ['market'],\n",
              " 'style': ['momentum', 'size'],\n",
              " 'huber_style': ['returns_oil', 'returns_tech']}"
            ]
          },
          "execution_count": 42,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_dataset_props.exposure_settings_menu.continuous_hierarchies"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "2d7db54a",
      "metadata": {},
      "source": [
        "First, we might be interested in what the Huber-regression-based exposures turned out to be. Everything is linked with the rest of the Bayesline ecosystem, so we can simply pick them up through the *Exposures API*."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 43,
      "id": "8b4624a8",
      "metadata": {},
      "outputs": [],
      "source": [
        "universe_settings = UniverseSettings()\n",
        "\n",
        "exposure_settings = ExposureSettings(\n",
        "    exposures=[\n",
        "        ContinuousExposureGroupSettings(hierarchy=\"market\"),\n",
        "        ContinuousExposureGroupSettings(hierarchy=\"huber_style\"),\n",
        "    ],\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 44,
      "id": "110bdc8e",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 5)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>bayesid</th><th>market.market</th><th>huber_style.returns_oil</th><th>huber_style.returns_tech</th></tr><tr><td>date</td><td>str</td><td>f32</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2025-12-27</td><td>&quot;IC83A1B819&quot;</td><td>1.0</td><td>-0.054718</td><td>0.014046</td></tr><tr><td>2025-12-28</td><td>&quot;IC83A1B819&quot;</td><td>1.0</td><td>-0.054718</td><td>0.014046</td></tr><tr><td>2025-12-29</td><td>&quot;IC83A1B819&quot;</td><td>1.0</td><td>-0.049713</td><td>0.003635</td></tr><tr><td>2025-12-30</td><td>&quot;IC83A1B819&quot;</td><td>1.0</td><td>-0.045654</td><td>0.009979</td></tr><tr><td>2025-12-31</td><td>&quot;IC83A1B819&quot;</td><td>1.0</td><td>-0.042694</td><td>0.008766</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 5)\n",
              "┌────────────┬────────────┬───────────────┬─────────────────────────┬──────────────────────────┐\n",
              "│ date       ┆ bayesid    ┆ market.market ┆ huber_style.returns_oil ┆ huber_style.returns_tech │\n",
              "│ ---        ┆ ---        ┆ ---           ┆ ---                     ┆ ---                      │\n",
              "│ date       ┆ str        ┆ f32           ┆ f32                     ┆ f32                      │\n",
              "╞════════════╪════════════╪═══════════════╪═════════════════════════╪══════════════════════════╡\n",
              "│ 2025-12-27 ┆ IC83A1B819 ┆ 1.0           ┆ -0.054718               ┆ 0.014046                 │\n",
              "│ 2025-12-28 ┆ IC83A1B819 ┆ 1.0           ┆ -0.054718               ┆ 0.014046                 │\n",
              "│ 2025-12-29 ┆ IC83A1B819 ┆ 1.0           ┆ -0.049713               ┆ 0.003635                 │\n",
              "│ 2025-12-30 ┆ IC83A1B819 ┆ 1.0           ┆ -0.045654               ┆ 0.009979                 │\n",
              "│ 2025-12-31 ┆ IC83A1B819 ┆ 1.0           ┆ -0.042694               ┆ 0.008766                 │\n",
              "└────────────┴────────────┴───────────────┴─────────────────────────┴──────────────────────────┘"
            ]
          },
          "execution_count": 44,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposures_api = bln.equity.exposures.load(exposure_settings.with_dataset(\"My-Risk-Dataset\"))\n",
        "exposures_df_my_model = exposures_api.get(universe_settings, standardize_universe=None)\n",
        "exposures_df_my_model.filter(pl.col(\"bayesid\") == \"IC83A1B819\").tail()  # Apple, Inc."
      ]
    },
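    {
      "cell_type": "markdown",
      "id": "e5a1c7d3",
      "metadata": {},
      "source": [
        "The Huber-regression exposures above can be read as robust time-series betas of an asset's returns on the uploaded factor returns. The sketch below is a generic textbook Huber fit via iteratively reweighted least squares on synthetic data. It is not Bayesline's implementation; the function name `huber_betas`, the threshold `delta`, the iteration count and the simulated inputs are all assumptions made for illustration.\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "\n",
        "\n",
        "def huber_betas(factor_returns, asset_returns, delta=1.345, n_iter=50):\n",
        "    # Robust regression of asset returns on factor returns (with intercept),\n",
        "    # solved by iteratively reweighted least squares (IRLS).\n",
        "    X = np.column_stack([np.ones(len(asset_returns)), factor_returns])\n",
        "    beta = np.linalg.lstsq(X, asset_returns, rcond=None)[0]  # OLS starting point\n",
        "    for _ in range(n_iter):\n",
        "        resid = asset_returns - X @ beta\n",
        "        # Robust residual scale from the median absolute deviation.\n",
        "        scale = max(np.median(np.abs(resid - np.median(resid))) / 0.6745, 1e-12)\n",
        "        u = np.abs(resid) / scale\n",
        "        # Huber weights: 1 inside the threshold, delta / u outside.\n",
        "        w = np.where(u <= delta, 1.0, delta / np.maximum(u, delta))\n",
        "        sw = np.sqrt(w)\n",
        "        new_beta = np.linalg.lstsq(X * sw[:, None], asset_returns * sw, rcond=None)[0]\n",
        "        converged = np.allclose(new_beta, beta, atol=1e-10)\n",
        "        beta = new_beta\n",
        "        if converged:\n",
        "            break\n",
        "    return beta[0], beta[1:]  # intercept, factor betas\n",
        "\n",
        "\n",
        "# Synthetic daily data: two factor series and one asset with known betas.\n",
        "rng = np.random.default_rng(seed=0)\n",
        "factor_returns = rng.normal(0.0002, 0.01, size=(250, 2))\n",
        "true_betas = np.array([0.5, -0.2])\n",
        "asset_returns = factor_returns @ true_betas + rng.normal(0.0, 0.002, size=250)\n",
        "asset_returns[::50] += 0.2  # a few large outliers that plain OLS would chase\n",
        "\n",
        "intercept, betas = huber_betas(factor_returns, asset_returns)\n",
        "print(betas)\n",
        "```\n",
        "\n",
        "Because the Huber weights cap the influence of the outlier days, the estimated betas stay close to the simulated ones, which is the motivation for preferring a robust fit over ordinary least squares for this kind of exposure."
      ]
    },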
    {
      "cell_type": "markdown",
      "id": "aec40787",
      "metadata": {},
      "source": [
        "Below we can now build a factor risk model with the risk dataset we just created."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 45,
      "id": "d0a89d12",
      "metadata": {},
      "outputs": [],
      "source": [
        "riskmodel_engine = bln.equity.riskmodels.load(\n",
        "    FactorRiskModelSettings(\n",
        "        universe=universe_settings,\n",
        "        exposures=exposure_settings,\n",
        "        modelconstruction=ModelConstructionSettings(\n",
        "            estimation_universe=None,\n",
        "        ),\n",
        "    ).with_dataset(\"My-Risk-Dataset\")\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 46,
      "id": "9d5f9474",
      "metadata": {},
      "outputs": [],
      "source": [
        "risk_model_api = riskmodel_engine.get_model()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 47,
      "id": "f88aaed6",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>huber_style.returns_oil</th><th>huber_style.returns_tech</th><th>market.market</th></tr><tr><td>date</td><td>f32</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2025-12-24</td><td>0.000647</td><td>0.000908</td><td>0.003303</td></tr><tr><td>2025-12-26</td><td>-0.000135</td><td>-0.000518</td><td>-0.001475</td></tr><tr><td>2025-12-29</td><td>-0.000565</td><td>0.000023</td><td>-0.00452</td></tr><tr><td>2025-12-30</td><td>-0.000281</td><td>0.000031</td><td>-0.002803</td></tr><tr><td>2025-12-31</td><td>-0.000233</td><td>0.000304</td><td>-0.007165</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 4)\n",
              "┌────────────┬─────────────────────────┬──────────────────────────┬───────────────┐\n",
              "│ date       ┆ huber_style.returns_oil ┆ huber_style.returns_tech ┆ market.market │\n",
              "│ ---        ┆ ---                     ┆ ---                      ┆ ---           │\n",
              "│ date       ┆ f32                     ┆ f32                      ┆ f32           │\n",
              "╞════════════╪═════════════════════════╪══════════════════════════╪═══════════════╡\n",
              "│ 2025-12-24 ┆ 0.000647                ┆ 0.000908                 ┆ 0.003303      │\n",
              "│ 2025-12-26 ┆ -0.000135               ┆ -0.000518                ┆ -0.001475     │\n",
              "│ 2025-12-29 ┆ -0.000565               ┆ 0.000023                 ┆ -0.00452      │\n",
              "│ 2025-12-30 ┆ -0.000281               ┆ 0.000031                 ┆ -0.002803     │\n",
              "│ 2025-12-31 ┆ -0.000233               ┆ 0.000304                 ┆ -0.007165     │\n",
              "└────────────┴─────────────────────────┴──────────────────────────┴───────────────┘"
            ]
          },
          "execution_count": 47,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_model_api.fret().tail()"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "7c6e9bd8",
      "metadata": {},
      "source": [
        "### Adding Exposures to an Existing Custom Risk Dataset\n",
        "\n",
        "As a last step in this tutorial we will add exposures for 2026 to our existing exposures upload and then update the risk dataset we already created.\n",
        "\n",
        "Recall the steps from above to obtain some sample exposures up to the end of 2025. We will follow the same steps here (note that we'll reuse the same top 100 assets from above).\n",
        "\n",
        "Also note that:\n",
        "* we won't update the factor time series, to demonstrate the behavior when exposures are only partially available.\n",
        "* the dataframe we upload also contains 2025 dates. These will be ignored when uploading in append mode (i.e. any existing date/factor combinations are ignored)."
      ]
    },
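    {
      "cell_type": "markdown",
      "id": "c3d8e1f7",
      "metadata": {},
      "source": [
        "The append behavior described above, where rows for date/factor combinations that already exist are skipped, can be pictured as an anti-join. The sketch below mimics it locally with polars. It reflects only the description in this notebook, not the uploader's actual implementation, and `append_new_rows` is a hypothetical helper.\n",
        "\n",
        "```python\n",
        "import datetime as dt\n",
        "\n",
        "import polars as pl\n",
        "\n",
        "\n",
        "def append_new_rows(existing: pl.DataFrame, incoming: pl.DataFrame) -> pl.DataFrame:\n",
        "    # Keep every existing row and add only incoming rows whose\n",
        "    # (date, factor) combination is not present yet.\n",
        "    keys = [\"date\", \"factor\"]\n",
        "    fresh = incoming.join(existing.select(keys).unique(), on=keys, how=\"anti\")\n",
        "    return pl.concat([existing, fresh.select(existing.columns)])\n",
        "\n",
        "\n",
        "existing = pl.DataFrame(\n",
        "    {\n",
        "        \"date\": [dt.date(2025, 12, 30), dt.date(2025, 12, 31)],\n",
        "        \"asset_id\": [\"A\", \"A\"],\n",
        "        \"factor\": [\"size\", \"size\"],\n",
        "        \"exposure\": [1.0, 1.1],\n",
        "    }\n",
        ")\n",
        "incoming = pl.DataFrame(\n",
        "    {\n",
        "        \"date\": [dt.date(2025, 12, 31), dt.date(2026, 1, 2)],\n",
        "        \"asset_id\": [\"A\", \"A\"],\n",
        "        \"factor\": [\"size\", \"size\"],\n",
        "        \"exposure\": [9.9, 1.2],  # 9.9 collides with an existing row and is dropped\n",
        "    }\n",
        ")\n",
        "\n",
        "merged = append_new_rows(existing, incoming)\n",
        "print(merged)\n",
        "```\n",
        "\n",
        "Under this reading, re-uploading an overlapping date range is harmless: the overlapping rows keep their originally uploaded values and only the new dates are added."
      ]
    },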
    {
      "cell_type": "code",
      "execution_count": 48,
      "id": "a4f483e0",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>bayesid</th><th>style.Size</th><th>style.Momentum</th></tr><tr><td>date</td><td>str</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2026-03-31</td><td>&quot;ICF1D1CC60&quot;</td><td>3.0</td><td>1.446289</td></tr><tr><td>2026-03-31</td><td>&quot;ICF7F6AFDC&quot;</td><td>3.0</td><td>-0.549316</td></tr><tr><td>2026-03-31</td><td>&quot;ICF982536B&quot;</td><td>3.0</td><td>-0.19104</td></tr><tr><td>2026-03-31</td><td>&quot;ICFB370DC9&quot;</td><td>3.0</td><td>-0.276367</td></tr><tr><td>2026-03-31</td><td>&quot;ICFE6CF9D5&quot;</td><td>2.841797</td><td>-1.904297</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 4)\n",
              "┌────────────┬────────────┬────────────┬────────────────┐\n",
              "│ date       ┆ bayesid    ┆ style.Size ┆ style.Momentum │\n",
              "│ ---        ┆ ---        ┆ ---        ┆ ---            │\n",
              "│ date       ┆ str        ┆ f32        ┆ f32            │\n",
              "╞════════════╪════════════╪════════════╪════════════════╡\n",
              "│ 2026-03-31 ┆ ICF1D1CC60 ┆ 3.0        ┆ 1.446289       │\n",
              "│ 2026-03-31 ┆ ICF7F6AFDC ┆ 3.0        ┆ -0.549316      │\n",
              "│ 2026-03-31 ┆ ICF982536B ┆ 3.0        ┆ -0.19104       │\n",
              "│ 2026-03-31 ┆ ICFB370DC9 ┆ 3.0        ┆ -0.276367      │\n",
              "│ 2026-03-31 ┆ ICFE6CF9D5 ┆ 2.841797   ┆ -1.904297      │\n",
              "└────────────┴────────────┴────────────┴────────────────┘"
            ]
          },
          "execution_count": 48,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposures_df.tail()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 49,
      "id": "70fcfa8c",
      "metadata": {},
      "outputs": [],
      "source": [
        "exposures_df = exposures_df.join(top_100_assets, on=\"bayesid\", how=\"semi\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 50,
      "id": "4d78ab1e",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (36_600, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>bayesid</th><th>style.Size</th><th>style.Momentum</th></tr><tr><td>date</td><td>str</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2025-03-31</td><td>&quot;IC006CA2E0&quot;</td><td>2.908203</td><td>0.926758</td></tr><tr><td>2025-03-31</td><td>&quot;IC0390CC2A&quot;</td><td>2.859375</td><td>-0.944824</td></tr><tr><td>2025-03-31</td><td>&quot;IC03C39235&quot;</td><td>2.892578</td><td>-1.038086</td></tr><tr><td>2025-03-31</td><td>&quot;IC06870B83&quot;</td><td>3.0</td><td>2.128906</td></tr><tr><td>2025-03-31</td><td>&quot;IC069B311C&quot;</td><td>3.0</td><td>1.087891</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>2026-03-31</td><td>&quot;ICF1D1CC60&quot;</td><td>3.0</td><td>1.446289</td></tr><tr><td>2026-03-31</td><td>&quot;ICF7F6AFDC&quot;</td><td>3.0</td><td>-0.549316</td></tr><tr><td>2026-03-31</td><td>&quot;ICF982536B&quot;</td><td>3.0</td><td>-0.19104</td></tr><tr><td>2026-03-31</td><td>&quot;ICFB370DC9&quot;</td><td>3.0</td><td>-0.276367</td></tr><tr><td>2026-03-31</td><td>&quot;ICFE6CF9D5&quot;</td><td>2.841797</td><td>-1.904297</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (36_600, 4)\n",
              "┌────────────┬────────────┬────────────┬────────────────┐\n",
              "│ date       ┆ bayesid    ┆ style.Size ┆ style.Momentum │\n",
              "│ ---        ┆ ---        ┆ ---        ┆ ---            │\n",
              "│ date       ┆ str        ┆ f32        ┆ f32            │\n",
              "╞════════════╪════════════╪════════════╪════════════════╡\n",
              "│ 2025-03-31 ┆ IC006CA2E0 ┆ 2.908203   ┆ 0.926758       │\n",
              "│ 2025-03-31 ┆ IC0390CC2A ┆ 2.859375   ┆ -0.944824      │\n",
              "│ 2025-03-31 ┆ IC03C39235 ┆ 2.892578   ┆ -1.038086      │\n",
              "│ 2025-03-31 ┆ IC06870B83 ┆ 3.0        ┆ 2.128906       │\n",
              "│ 2025-03-31 ┆ IC069B311C ┆ 3.0        ┆ 1.087891       │\n",
              "│ …          ┆ …          ┆ …          ┆ …              │\n",
              "│ 2026-03-31 ┆ ICF1D1CC60 ┆ 3.0        ┆ 1.446289       │\n",
              "│ 2026-03-31 ┆ ICF7F6AFDC ┆ 3.0        ┆ -0.549316      │\n",
              "│ 2026-03-31 ┆ ICF982536B ┆ 3.0        ┆ -0.19104       │\n",
              "│ 2026-03-31 ┆ ICFB370DC9 ┆ 3.0        ┆ -0.276367      │\n",
              "│ 2026-03-31 ┆ ICFE6CF9D5 ┆ 2.841797   ┆ -1.904297      │\n",
              "└────────────┴────────────┴────────────┴────────────────┘"
            ]
          },
          "execution_count": 50,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposures_df"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 51,
      "id": "48ac0d2d",
      "metadata": {
        "lines_to_next_cell": 2
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 6)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>factor</th><th>exposure</th><th>asset_id_type</th><th>factor_group</th></tr><tr><td>date</td><td>str</td><td>str</td><td>f32</td><td>str</td><td>str</td></tr></thead><tbody><tr><td>2026-03-31</td><td>&quot;ICF1D1CC60&quot;</td><td>&quot;momentum&quot;</td><td>1.446289</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr><tr><td>2026-03-31</td><td>&quot;ICF7F6AFDC&quot;</td><td>&quot;momentum&quot;</td><td>-0.549316</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr><tr><td>2026-03-31</td><td>&quot;ICF982536B&quot;</td><td>&quot;momentum&quot;</td><td>-0.19104</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr><tr><td>2026-03-31</td><td>&quot;ICFB370DC9&quot;</td><td>&quot;momentum&quot;</td><td>-0.276367</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr><tr><td>2026-03-31</td><td>&quot;ICFE6CF9D5&quot;</td><td>&quot;momentum&quot;</td><td>-1.904297</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 6)\n",
              "┌────────────┬────────────┬──────────┬───────────┬───────────────┬──────────────┐\n",
              "│ date       ┆ asset_id   ┆ factor   ┆ exposure  ┆ asset_id_type ┆ factor_group │\n",
              "│ ---        ┆ ---        ┆ ---      ┆ ---       ┆ ---           ┆ ---          │\n",
              "│ date       ┆ str        ┆ str      ┆ f32       ┆ str           ┆ str          │\n",
              "╞════════════╪════════════╪══════════╪═══════════╪═══════════════╪══════════════╡\n",
              "│ 2026-03-31 ┆ ICF1D1CC60 ┆ momentum ┆ 1.446289  ┆ bayesid       ┆ style        │\n",
              "│ 2026-03-31 ┆ ICF7F6AFDC ┆ momentum ┆ -0.549316 ┆ bayesid       ┆ style        │\n",
              "│ 2026-03-31 ┆ ICF982536B ┆ momentum ┆ -0.19104  ┆ bayesid       ┆ style        │\n",
              "│ 2026-03-31 ┆ ICFB370DC9 ┆ momentum ┆ -0.276367 ┆ bayesid       ┆ style        │\n",
              "│ 2026-03-31 ┆ ICFE6CF9D5 ┆ momentum ┆ -1.904297 ┆ bayesid       ┆ style        │\n",
              "└────────────┴────────────┴──────────┴───────────┴───────────────┴──────────────┘"
            ]
          },
          "execution_count": 51,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "upload_df = (\n",
        "    exposures_df\n",
        "    .rename({\"bayesid\": \"asset_id\", \"style.Size\": \"size\", \"style.Momentum\": \"momentum\"})\n",
        "    .unpivot(\n",
        "        on=[\"size\", \"momentum\"],\n",
        "        index=[\"date\", \"asset_id\"],\n",
        "        variable_name=\"factor\",\n",
        "        value_name=\"exposure\",\n",
        "    )\n",
        "    .with_columns(\n",
        "        pl.lit(\"bayesid\").alias(\"asset_id_type\"),\n",
        "        pl.lit(\"style\").alias(\"factor_group\")\n",
        "    )\n",
        ")\n",
        "\n",
        "upload_df.tail()"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "c61c6966",
      "metadata": {},
      "source": [
        "Note that below we use the `append` mode, which adds the new data to the existing dataset rather than overwriting previously uploaded data."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 52,
      "id": "5dd4a94e",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "UploadCommitResult(version=2, committed_names=[])"
            ]
          },
          "execution_count": 52,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposure_dataset.fast_commit(upload_df, mode=\"append\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 53,
      "id": "10b670dd",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{2: datetime.datetime(2026, 6, 15, 2, 21, 11, 867661, tzinfo=datetime.timezone.utc),\n",
              " 1: datetime.datetime(2026, 6, 15, 2, 20, 3, 619433, tzinfo=datetime.timezone.utc),\n",
              " 0: datetime.datetime(2026, 6, 15, 2, 20, 3, 579627, tzinfo=datetime.timezone.utc)}"
            ]
          },
          "execution_count": 53,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposure_dataset.version_history()"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "dc5f7ad8",
      "metadata": {},
      "source": [
        "The new exposures have been uploaded. As a last step, we need to update our risk dataset. Note that, as of now, the risk dataset must be updated manually to bring in the changes. A future release will add functionality to trigger the risk dataset update automatically when input data changes."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 54,
      "id": "965dd687",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "RiskDatasetUpdateResult()"
            ]
          },
          "execution_count": 54,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_dataset_api.update()"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "3eae28dd",
      "metadata": {},
      "source": [
        "Fitting the risk model again, we find that the new exposures have been captured."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 55,
      "id": "2f041345",
      "metadata": {},
      "outputs": [],
      "source": [
        "riskmodel_engine = bln.equity.riskmodels.load(\n",
        "    FactorRiskModelSettings(\n",
        "        universe=universe_settings,\n",
        "        exposures=exposure_settings,\n",
        "    ).with_dataset(\"My-Risk-Dataset\")\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 56,
      "id": "d4cb0f02",
      "metadata": {},
      "outputs": [],
      "source": [
        "risk_model_api = riskmodel_engine.get_model()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 57,
      "id": "9b1ae002",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>huber_style.returns_oil</th><th>huber_style.returns_tech</th><th>market.market</th></tr><tr><td>date</td><td>f32</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2026-03-25</td><td>0.000988</td><td>-0.000059</td><td>0.007369</td></tr><tr><td>2026-03-26</td><td>-0.00115</td><td>0.000217</td><td>-0.012504</td></tr><tr><td>2026-03-27</td><td>-0.000372</td><td>-0.001118</td><td>-0.015879</td></tr><tr><td>2026-03-30</td><td>-0.000919</td><td>0.000172</td><td>-0.005209</td></tr><tr><td>2026-03-31</td><td>0.000747</td><td>0.000495</td><td>0.02608</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 4)\n",
              "┌────────────┬─────────────────────────┬──────────────────────────┬───────────────┐\n",
              "│ date       ┆ huber_style.returns_oil ┆ huber_style.returns_tech ┆ market.market │\n",
              "│ ---        ┆ ---                     ┆ ---                      ┆ ---           │\n",
              "│ date       ┆ f32                     ┆ f32                      ┆ f32           │\n",
              "╞════════════╪═════════════════════════╪══════════════════════════╪═══════════════╡\n",
              "│ 2026-03-25 ┆ 0.000988                ┆ -0.000059                ┆ 0.007369      │\n",
              "│ 2026-03-26 ┆ -0.00115                ┆ 0.000217                 ┆ -0.012504     │\n",
              "│ 2026-03-27 ┆ -0.000372               ┆ -0.001118                ┆ -0.015879     │\n",
              "│ 2026-03-30 ┆ -0.000919               ┆ 0.000172                 ┆ -0.005209     │\n",
              "│ 2026-03-31 ┆ 0.000747                ┆ 0.000495                 ┆ 0.02608       │\n",
              "└────────────┴─────────────────────────┴──────────────────────────┴───────────────┘"
            ]
          },
          "execution_count": 57,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_model_api.fret().tail()"
      ]
    },
    {
      "cell_type": "markdown",
      "id": "ecbb9b17",
      "metadata": {},
      "source": [
        "## Housekeeping\n",
        "\n",
        "The cells below demonstrate how to delete risk datasets and how to destroy the uploaded datasets they reference."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 58,
      "id": "e7d068e7",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "RawSettings(model_type='DerivedRiskDatasetSettings', name='My-Dataset', identifier=1, exists=False, raw_json={'kind': 'derived', 'is_system_wide': False, 'reference_dataset': 'bayesline/Bayesline-US-All-1y', 'exposures': [{'exposure_type': 'referenced', 'continuous_factor_groups': None, 'categorical_factor_groups': None}], 'exchange_codes': None, 'trim_assets': 'none', 'trim_assets_sources': None, 'trim_start_date': 'earliest_start', 'trim_end_date': 'latest_end'}, references=[RawSettings(model_type='', name='bayesline/Bayesline-US-All-1y', identifier=None, exists=False, raw_json={}, references=[], extra={})], extra={})"
            ]
          },
          "execution_count": 58,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_datasets.delete_dataset(\"My-Dataset\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 59,
      "id": "5567b295",
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "RawSettings(model_type='DerivedRiskDatasetSettings', name='My-Risk-Dataset', identifier=2, exists=False, raw_json={'kind': 'derived', 'is_system_wide': False, 'reference_dataset': 'bayesline/Bayesline-US-All-1y', 'exposures': [{'exposure_type': 'unit', 'factor_group': 'market', 'factor': 'market', 'factor_type': 'continuous'}, {'exposure_type': 'uploaded', 'exposure_source': 'My-US-Top100-Exposures', 'continuous_factor_groups': ['style'], 'categorical_factor_groups': [], 'factor_groups_forward_fill': [], 'factor_groups_gaussianize': ['style'], 'factor_groups_gaussianize_maintain_zeros': [], 'factor_groups_fill_miss': ['style'], 'hierarchy_sources': {}}, {'exposure_type': 'huber_regression', 'tsfactors_source': 'Oil-and-Tech-Returns', 'factor_group': 'huber_style', 'include': 'All', 'exclude': [], 'fill_miss': True, 'window': 126, 'epsilon': 1.35, 'alpha': 0.0001, 'alpha_start': 10.0, 'student_t_level': None, 'clip': [None, None], 'gaussianize': True, 'gaussianize_maintain_zeros': False, 'impute': True, 'currency': 'USD', 'calendar': {'filters': [['XNYS']]}}], 'exchange_codes': None, 'trim_assets': 'none', 'trim_assets_sources': None, 'trim_start_date': 'earliest_start', 'trim_end_date': 'latest_end'}, references=[RawSettings(model_type='', name='bayesline/Bayesline-US-All-1y', identifier=None, exists=False, raw_json={}, references=[], extra={}), RawSettings(model_type='', name='My-US-Top100-Exposures', identifier=None, exists=False, raw_json={}, references=[], extra={}), RawSettings(model_type='', name='Oil-and-Tech-Returns', identifier=None, exists=False, raw_json={}, references=[], extra={})], extra={})"
            ]
          },
          "execution_count": 59,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_datasets.delete_dataset(\"My-Risk-Dataset\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 60,
      "id": "a3960dcd",
      "metadata": {},
      "outputs": [],
      "source": [
        "exposure_dataset.destroy()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 61,
      "id": "4d3b1d51",
      "metadata": {},
      "outputs": [],
      "source": [
        "factor_ts_dataset.destroy()"
      ]
    }
  ],
  "metadata": {
    "jupytext": {
      "text_representation": {
        "extension": ".py",
        "format_name": "percent",
        "format_version": "1.3",
        "jupytext_version": "1.19.3"
      }
    },
    "kernelspec": {
      "display_name": ".venv",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.11.15"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 5
}