{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Risk Datasets\n",
        "\n",
        "This tutorial demonstrates the Bayesline Risk Datasets API, which lets you define new datasets that serve as the foundation for estimating factor risk models.\n",
        "\n",
        "A risk dataset comprises all underlying data necessary to build a risk model:\n",
        "* factor exposures (including style exposures, industry/regional exposures, etc.)\n",
        "* asset price data\n",
        "* fundamental data (market caps)\n",
        "* security master data\n",
        "\n",
        "In this first iteration you can ingest custom exposures into the Bayesline ecosystem while relying on Bayesline data for everything else. In subsequent product iterations you will be able to bring custom data for all other items as well, so that you can mix and match which data you supply and which data Bayesline supplies.\n",
        "\n",
        "In this notebook we will introduce and explore:\n",
        "* *system datasets* and *user datasets* and how to list them\n",
        "* how to create a new risk dataset\n",
        "* how to create a new risk dataset with custom exposures\n",
        "* how to add exposures to an existing risk dataset"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Imports & Setup\n",
        "\n",
        "For this tutorial notebook, you will need to import the following packages."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "metadata": {},
      "outputs": [],
      "source": [
        "import datetime as dt\n",
        "import numpy as np\n",
        "import polars as pl\n",
        "\n",
        "from bayesline.apiclient import BayeslineApiClient\n",
        "\n",
        "from bayesline.api.equity import (\n",
        "    CategoricalExposureGroupSettings,\n",
        "    CategoricalFilterSettings,\n",
        "    ContinuousExposureGroupSettings,\n",
        "    ExposureSettings,\n",
        "    FactorRiskModelSettings, \n",
        "    ModelConstructionSettings,\n",
        "    RiskDatasetHuberRegressionExposureSettings,\n",
        "    RiskDatasetSettings,\n",
        "    RiskDatasetUnitExposureSettings,\n",
        "    RiskDatasetUploadedExposureSettings,\n",
        "    UniverseSettings\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "We will also need to have a Bayesline API client configured."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "skip-execution"
        ]
      },
      "outputs": [],
      "source": [
        "bln = BayeslineApiClient.new_client(\n",
        "    endpoint=\"https://[ENDPOINT]\",\n",
        "    api_key=\"[API-KEY]\",\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The main entry point for the Risk Datasets API is `bln.equity.riskdatasets`. All dataset functionality can be reached from there.\n",
        "\n",
        "See here for relevant docs:\n",
        "* [Risk Datasets API Summary](https://docs.bayesline.com/0.12.1/_autosummary/bayesline.api.equity.RiskDatasetLoaderApi.html)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "metadata": {},
      "outputs": [],
      "source": [
        "risk_datasets = bln.equity.riskdatasets"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Obtaining Available Datasets\n",
        "\n",
        "To list existing datasets we use the `get_dataset_names` method. Newly created datasets appear in this list, and their names are used downstream when creating risk model specifications.\n",
        "\n",
        "We distinguish *system* and *user* datasets. *System* datasets are available to all users, e.g. the *Bayesline-Global* dataset. *User* datasets are created and owned by an individual user."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{'Bayesline-US-All-1y': 'ready', 'Bayesline-US-500-1y': 'ready'}"
            ]
          },
          "execution_count": 4,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# default is \"All\", i.e. both system and user datasets\n",
        "risk_datasets.get_dataset_names()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{}"
            ]
          },
          "execution_count": 5,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_datasets.get_dataset_names(mode=\"User\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{'Bayesline-US-All-1y': 'ready', 'Bayesline-US-500-1y': 'ready'}"
            ]
          },
          "execution_count": 6,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_datasets.get_dataset_names(mode=\"System\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The default dataset is the system dataset that is used when no concrete dataset is specified while creating a risk model. It is by definition the first result of `risk_datasets.get_dataset_names()`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 7,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "'Bayesline-US-All-1y'"
            ]
          },
          "execution_count": 7,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_datasets.get_default_dataset_name()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Creating a New Dataset\n",
        "\n",
        "To create a new risk dataset we use the `create_dataset` method, which takes a *dataset name* and a `RiskDatasetSettings` object.\n",
        "\n",
        "At a bare minimum we need to specify a *reference dataset*, an existing dataset from which all input data is sourced by default. The dataset becomes custom by selectively specifying which data the user brings in instead.\n",
        "\n",
        "Note that the minimal configuration below effectively creates a copy of the reference dataset.\n",
        "\n",
        "See here for relevant docs:\n",
        "* [Risk Datasets Settings](https://docs.bayesline.com/0.12.1/_autosummary/bayesline.api.equity.RiskDatasetSettings.html)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 8,
      "metadata": {},
      "outputs": [],
      "source": [
        "settings = RiskDatasetSettings(\n",
        "    reference_dataset=\"Bayesline-US-All-1y\"\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 9,
      "metadata": {},
      "outputs": [],
      "source": [
        "risk_dataset_api = risk_datasets.create_dataset(\"My-Dataset\", settings=settings)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The `create_dataset` invocation above does two things:\n",
        "\n",
        "1. Adds the given settings to the settings registry under the given name.\n",
        "2. Produces the physical dataset according to the given settings.\n",
        "\n",
        "Note that we could have saved the settings in the settings registry directly, which would have skipped step 2. This is perfectly feasible but requires us to invoke the dataset creation separately (explained below).\n",
        "\n",
        "We can verify the settings registry creation by inspecting the registry directly. What we will notice is that *system* datasets are **not** included. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 10,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{'My-Dataset': 1}"
            ]
          },
          "execution_count": 10,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_datasets.settings.names()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "With the returned `risk_dataset_api` we can immediately *describe* the dataset and inspect the available styles, industries, etc. Note that these flow through immediately to the relevant settings menus on `bln.equity.universes`, `bln.equity.exposures`, etc.\n",
        "\n",
        "See here for relevant docs:\n",
        "* [RiskDatasetProperties](https://docs.bayesline.com/0.12.1/_autosummary/bayesline.api.equity.RiskDatasetProperties.html)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{'size': ['log_market_cap', 'log_total_assets'],\n",
              " 'value': ['book_to_price'],\n",
              " 'growth': ['price_to_earnings'],\n",
              " 'volatility': ['sigma', 'sigma_eps', 'beta'],\n",
              " 'momentum': ['mom6', 'mom12'],\n",
              " 'dividend': ['dividend_yield'],\n",
              " 'leverage': ['debt_to_assets', 'debt_to_equity']}"
            ]
          },
          "execution_count": 11,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_dataset_props = risk_dataset_api.describe()\n",
        "\n",
        "risk_dataset_props.exposure_settings_menu.continuous_hierarchies[\"style\"]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Loading and Updating an Existing Dataset\n",
        "\n",
        "Step 2 from above can be invoked manually at any time to trigger a full recreation of the dataset using the latest versions of all referenced datasets. To do this we simply load the dataset we previously created, using either its name or its globally unique identifier. Note that the system tracks the versions of all input data, so a dataset will not be updated if it is already at the latest version."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 12,
      "metadata": {},
      "outputs": [],
      "source": [
        "risk_dataset_api = risk_datasets.load(\"My-Dataset\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 13,
      "metadata": {},
      "outputs": [],
      "source": [
        "update_result = risk_dataset_api.update()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The `RiskDatasetUpdateResult` gives summary information about the update process."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 14,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "RiskDatasetUpdateResult()"
            ]
          },
          "execution_count": 14,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "update_result"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Using the Custom Dataset\n",
        "\n",
        "We can now use the dataset to produce risk models."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 15,
      "metadata": {},
      "outputs": [],
      "source": [
        "riskmodel_engine = bln.equity.riskmodels.load(\n",
        "    FactorRiskModelSettings(\n",
        "        universe=UniverseSettings(dataset=\"My-Dataset\"),\n",
        "        exposures=ExposureSettings(\n",
        "            exposures=[\n",
        "                ContinuousExposureGroupSettings(hierarchy=\"market\"),\n",
        "                CategoricalExposureGroupSettings(hierarchy=\"trbc\"),\n",
        "                CategoricalExposureGroupSettings(hierarchy=\"continent\"),\n",
        "                ContinuousExposureGroupSettings(\n",
        "                    hierarchy=\"style\", standardize_method=\"equal_weighted\"\n",
        "                ),\n",
        "            ]\n",
        "        ),\n",
        "        modelconstruction=ModelConstructionSettings(\n",
        "            estimation_universe=None,\n",
        "            zero_sum_constraints={\"trbc\": \"mcap_weighted\", \"continent\": \"mcap_weighted\"},\n",
        "        ),\n",
        "    )\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 16,
      "metadata": {},
      "outputs": [],
      "source": [
        "risk_model_api = riskmodel_engine.get_model() "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 17,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 27)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>continent.Africa</th><th>continent.America</th><th>continent.Asia</th><th>continent.Europe</th><th>continent.Oceania</th><th>market.Market</th><th>style.Dividend</th><th>style.Growth</th><th>style.Leverage</th><th>style.Momentum</th><th>style.Size</th><th>style.Value</th><th>style.Volatility</th><th>trbc.Academic &amp; Educational Services</th><th>trbc.Basic Materials</th><th>trbc.Consumer Cyclicals</th><th>trbc.Consumer Non-Cyclicals</th><th>trbc.Energy</th><th>trbc.Financials</th><th>trbc.Government Activity</th><th>trbc.Healthcare</th><th>trbc.Industrials</th><th>trbc.Institutions, Associations &amp; Organizations</th><th>trbc.Real Estate</th><th>trbc.Technology</th><th>trbc.Utilities</th></tr><tr><td>date</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2025-03-31</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr><tr><td>2025-04-01</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.001275</td><td>-0.000716</td><td>0.00137</td><td>0.000223</td><td>0.001203</td><td>0.000303</td><td>-0.001377</td><td>0.001726</td><td>0.010046</td><td>-0.000729</td><td>0.002665</td><td>0.000949</td><td>0.005436</td><td>0.00114</td><td>-0.005397</td><td>-0.022813</td><td>0.00243</td><td>0.000714</td><td>-0.001636</td><td>0.003515</td><td>0.001968</td></tr><tr><td>2025-04-02</td><td>0.0</td><td>0.0</t
d><td>0.0</td><td>0.0</td><td>0.0</td><td>0.012838</td><td>-0.001102</td><td>-0.000044</td><td>0.000923</td><td>-0.000746</td><td>0.00156</td><td>-0.000587</td><td>0.009693</td><td>0.007572</td><td>-0.001018</td><td>0.003938</td><td>-0.005407</td><td>-0.003476</td><td>0.004102</td><td>-0.028334</td><td>0.003053</td><td>0.002225</td><td>-0.01354</td><td>-0.001109</td><td>-0.00352</td><td>-0.000387</td></tr><tr><td>2025-04-03</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>-0.047399</td><td>0.000446</td><td>-0.000387</td><td>-0.000237</td><td>0.005562</td><td>-0.011777</td><td>-0.003213</td><td>-0.027989</td><td>0.011917</td><td>0.000906</td><td>-0.005801</td><td>0.031092</td><td>-0.015572</td><td>-0.014826</td><td>-0.029906</td><td>0.025471</td><td>-0.006625</td><td>0.044973</td><td>0.002773</td><td>-0.000955</td><td>0.026971</td></tr><tr><td>2025-04-04</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>-0.035351</td><td>-0.002098</td><td>-0.000844</td><td>0.000002</td><td>-0.003629</td><td>-0.012374</td><td>0.002067</td><td>-0.011087</td><td>0.016362</td><td>-0.005323</td><td>0.024181</td><td>0.008486</td><td>-0.028371</td><td>-0.010039</td><td>0.105782</td><td>0.000179</td><td>0.000353</td><td>0.018136</td><td>0.007696</td><td>-0.000676</td><td>0.000588</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 27)\n",
              "┌───────────┬───────────┬───────────┬───────────┬───┬───────────┬───────────┬───────────┬──────────┐\n",
              "│ date      ┆ continent ┆ continent ┆ continent ┆ … ┆ trbc.Inst ┆ trbc.Real ┆ trbc.Tech ┆ trbc.Uti │\n",
              "│ ---       ┆ .Africa   ┆ .America  ┆ .Asia     ┆   ┆ itutions, ┆ Estate    ┆ nology    ┆ lities   │\n",
              "│ date      ┆ ---       ┆ ---       ┆ ---       ┆   ┆ Associati ┆ ---       ┆ ---       ┆ ---      │\n",
              "│           ┆ f32       ┆ f32       ┆ f32       ┆   ┆ on…       ┆ f32       ┆ f32       ┆ f32      │\n",
              "│           ┆           ┆           ┆           ┆   ┆ ---       ┆           ┆           ┆          │\n",
              "│           ┆           ┆           ┆           ┆   ┆ f32       ┆           ┆           ┆          │\n",
              "╞═══════════╪═══════════╪═══════════╪═══════════╪═══╪═══════════╪═══════════╪═══════════╪══════════╡\n",
              "│ 2025-03-3 ┆ 0.0       ┆ 0.0       ┆ 0.0       ┆ … ┆ 0.0       ┆ 0.0       ┆ 0.0       ┆ 0.0      │\n",
              "│ 1         ┆           ┆           ┆           ┆   ┆           ┆           ┆           ┆          │\n",
              "│ 2025-04-0 ┆ 0.0       ┆ 0.0       ┆ 0.0       ┆ … ┆ 0.000714  ┆ -0.001636 ┆ 0.003515  ┆ 0.001968 │\n",
              "│ 1         ┆           ┆           ┆           ┆   ┆           ┆           ┆           ┆          │\n",
              "│ 2025-04-0 ┆ 0.0       ┆ 0.0       ┆ 0.0       ┆ … ┆ -0.01354  ┆ -0.001109 ┆ -0.00352  ┆ -0.00038 │\n",
              "│ 2         ┆           ┆           ┆           ┆   ┆           ┆           ┆           ┆ 7        │\n",
              "│ 2025-04-0 ┆ 0.0       ┆ 0.0       ┆ 0.0       ┆ … ┆ 0.044973  ┆ 0.002773  ┆ -0.000955 ┆ 0.026971 │\n",
              "│ 3         ┆           ┆           ┆           ┆   ┆           ┆           ┆           ┆          │\n",
              "│ 2025-04-0 ┆ 0.0       ┆ 0.0       ┆ 0.0       ┆ … ┆ 0.018136  ┆ 0.007696  ┆ -0.000676 ┆ 0.000588 │\n",
              "│ 4         ┆           ┆           ┆           ┆   ┆           ┆           ┆           ┆          │\n",
              "└───────────┴───────────┴───────────┴───────────┴───┴───────────┴───────────┴───────────┴──────────┘"
            ]
          },
          "execution_count": 17,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_model_api.fret().head()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Updating System Datasets\n",
        "\n",
        "Users with *administrator* permissions can update system-wide datasets, e.g. `Bayesline-Global`. In practice this means that the remote source is checked for new data (e.g. as part of a daily data update) and any changes are incorporated into the Bayesline ecosystem. Updating a system dataset affects all users."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 18,
      "metadata": {},
      "outputs": [],
      "source": [
        "bayesline_risk_dataset = risk_datasets.load(\"Bayesline-US-All-1y\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 19,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "RiskDatasetUpdateResult()"
            ]
          },
          "execution_count": 19,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "bayesline_risk_dataset.update()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Creating a Custom Risk Dataset\n",
        "\n",
        "For the remainder of the tutorial we will:\n",
        "* upload sample exposures\n",
        "* upload a simulated time series (e.g. an oil price)\n",
        "* create a custom risk dataset using the uploaded exposures and estimate Huber regression exposures for the simulated time series\n",
        "\n",
        "### Uploading Custom Exposures\n",
        "In the example below we upload a set of sample exposures for the top 100 US companies. We first use the *Exposures API* to create the sample exposures and then the *Uploaders API* (see the [Data Uploaders Tutorial](https://docs.bayesline.com/0.12.1/notebooks/tutorial_uploaders.html) for a detailed walkthrough) to upload them as a custom exposure dataset.\n",
        "\n",
        "#### Creating Sample Exposures"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 20,
      "metadata": {},
      "outputs": [],
      "source": [
        "exposures_api = bln.equity.exposures.load(\n",
        "    ExposureSettings(\n",
        "        exposures=[\n",
        "            ContinuousExposureGroupSettings(\n",
        "                hierarchy=\"style\",\n",
        "                include=[\"log_market_cap\", \"mom12\"],\n",
        "            )\n",
        "        ]\n",
        "    )\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 21,
      "metadata": {},
      "outputs": [],
      "source": [
        "universe_settings = UniverseSettings(\n",
        "    dataset=\"Bayesline-US-All-1y\",\n",
        "    categorical_filters=[\n",
        "        CategoricalFilterSettings(hierarchy=\"continent\", include=[\"USA\"]),\n",
        "    ],\n",
        ")\n",
        "exposures_df = exposures_api.get(universe_settings, standardize_universe=None)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 22,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>bayesid</th><th>style.Size</th><th>style.Momentum</th></tr><tr><td>date</td><td>str</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2026-03-31</td><td>&quot;ICFFE60191&quot;</td><td>-0.161011</td><td>1.235352</td></tr><tr><td>2026-03-31</td><td>&quot;ICFFE938FD&quot;</td><td>1.047852</td><td>0.00238</td></tr><tr><td>2026-03-31</td><td>&quot;ICFFE94AED&quot;</td><td>-0.292236</td><td>1.063477</td></tr><tr><td>2026-03-31</td><td>&quot;ICFFEBBB38&quot;</td><td>0.655273</td><td>0.463867</td></tr><tr><td>2026-03-31</td><td>&quot;ICFFF2F5AD&quot;</td><td>-0.720215</td><td>0.131836</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 4)\n",
              "┌────────────┬────────────┬────────────┬────────────────┐\n",
              "│ date       ┆ bayesid    ┆ style.Size ┆ style.Momentum │\n",
              "│ ---        ┆ ---        ┆ ---        ┆ ---            │\n",
              "│ date       ┆ str        ┆ f32        ┆ f32            │\n",
              "╞════════════╪════════════╪════════════╪════════════════╡\n",
              "│ 2026-03-31 ┆ ICFFE60191 ┆ -0.161011  ┆ 1.235352       │\n",
              "│ 2026-03-31 ┆ ICFFE938FD ┆ 1.047852   ┆ 0.00238        │\n",
              "│ 2026-03-31 ┆ ICFFE94AED ┆ -0.292236  ┆ 1.063477       │\n",
              "│ 2026-03-31 ┆ ICFFEBBB38 ┆ 0.655273   ┆ 0.463867       │\n",
              "│ 2026-03-31 ┆ ICFFF2F5AD ┆ -0.720215  ┆ 0.131836       │\n",
              "└────────────┴────────────┴────────────┴────────────────┘"
            ]
          },
          "execution_count": 22,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposures_df.tail()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 23,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 1)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>bayesid</th></tr><tr><td>str</td></tr></thead><tbody><tr><td>&quot;ICF765A5A4&quot;</td></tr><tr><td>&quot;IC87439A97&quot;</td></tr><tr><td>&quot;IC5FAAD6EF&quot;</td></tr><tr><td>&quot;IC7F196659&quot;</td></tr><tr><td>&quot;ICBE3882C7&quot;</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 1)\n",
              "┌────────────┐\n",
              "│ bayesid    │\n",
              "│ ---        │\n",
              "│ str        │\n",
              "╞════════════╡\n",
              "│ ICF765A5A4 │\n",
              "│ IC87439A97 │\n",
              "│ IC5FAAD6EF │\n",
              "│ IC7F196659 │\n",
              "│ ICBE3882C7 │\n",
              "└────────────┘"
            ]
          },
          "execution_count": 23,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "top_100_assets = (\n",
        "    exposures_df\n",
        "    .group_by(\"bayesid\")\n",
        "    .agg(pl.col(\"style.Size\").mean())\n",
        "    .sort(\"style.Size\")\n",
        "    .tail(100)\n",
        "    .select(\"bayesid\")\n",
        ")\n",
        "\n",
        "top_100_assets.head()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 24,
      "metadata": {},
      "outputs": [],
      "source": [
        "exposures_df = (\n",
        "    exposures_df\n",
        "    .join(top_100_assets, on=\"bayesid\", how=\"semi\")\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### Upload the Sample Exposures"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 25,
      "metadata": {},
      "outputs": [],
      "source": [
        "uploaders = bln.equity.uploaders"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 26,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "['exposures', 'factors', 'hierarchies', 'portfolios']"
            ]
          },
          "execution_count": 26,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "uploaders.get_data_types()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 27,
      "metadata": {},
      "outputs": [],
      "source": [
        "exposure_uploader = uploaders.get_data_type(\"exposures\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 28,
      "metadata": {},
      "outputs": [],
      "source": [
        "exposure_dataset = exposure_uploader.get_or_create_dataset(\"My-US-Top100-Exposures\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "For the uploader we need to provide one of the accepted input formats. \n",
        "Below we choose the *Long-Format* parser and transform our `exposures_df` to fit this format."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 29,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "['Long-Format', 'Wide-Format']"
            ]
          },
          "execution_count": 29,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposure_dataset.get_parser_names()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 30,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (8, 6)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>asset_id_type</th><th>factor_group</th><th>factor</th><th>exposure</th></tr><tr><td>date</td><td>str</td><td>str</td><td>str</td><td>str</td><td>f64</td></tr></thead><tbody><tr><td>2025-01-06</td><td>&quot;GOOG&quot;</td><td>&quot;cusip9&quot;</td><td>&quot;style&quot;</td><td>&quot;momentum_6&quot;</td><td>-0.3</td></tr><tr><td>2025-01-06</td><td>&quot;GOOG&quot;</td><td>&quot;cusip9&quot;</td><td>&quot;market&quot;</td><td>&quot;market&quot;</td><td>1.0</td></tr><tr><td>2025-01-06</td><td>&quot;AAPL&quot;</td><td>&quot;cusip9&quot;</td><td>&quot;style&quot;</td><td>&quot;momentum_6&quot;</td><td>0.1</td></tr><tr><td>2025-01-06</td><td>&quot;AAPL&quot;</td><td>&quot;cusip9&quot;</td><td>&quot;market&quot;</td><td>&quot;market&quot;</td><td>1.0</td></tr><tr><td>2025-01-07</td><td>&quot;GOOG&quot;</td><td>&quot;cusip9&quot;</td><td>&quot;style&quot;</td><td>&quot;momentum_6&quot;</td><td>-0.28</td></tr><tr><td>2025-01-07</td><td>&quot;GOOG&quot;</td><td>&quot;cusip9&quot;</td><td>&quot;market&quot;</td><td>&quot;market&quot;</td><td>1.0</td></tr><tr><td>2025-01-07</td><td>&quot;AAPL&quot;</td><td>&quot;cusip9&quot;</td><td>&quot;style&quot;</td><td>&quot;momentum_6&quot;</td><td>0.0</td></tr><tr><td>2025-01-07</td><td>&quot;AAPL&quot;</td><td>&quot;cusip9&quot;</td><td>&quot;market&quot;</td><td>&quot;market&quot;</td><td>1.0</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (8, 6)\n",
              "┌────────────┬──────────┬───────────────┬──────────────┬────────────┬──────────┐\n",
              "│ date       ┆ asset_id ┆ asset_id_type ┆ factor_group ┆ factor     ┆ exposure │\n",
              "│ ---        ┆ ---      ┆ ---           ┆ ---          ┆ ---        ┆ ---      │\n",
              "│ date       ┆ str      ┆ str           ┆ str          ┆ str        ┆ f64      │\n",
              "╞════════════╪══════════╪═══════════════╪══════════════╪════════════╪══════════╡\n",
              "│ 2025-01-06 ┆ GOOG     ┆ cusip9        ┆ style        ┆ momentum_6 ┆ -0.3     │\n",
              "│ 2025-01-06 ┆ GOOG     ┆ cusip9        ┆ market       ┆ market     ┆ 1.0      │\n",
              "│ 2025-01-06 ┆ AAPL     ┆ cusip9        ┆ style        ┆ momentum_6 ┆ 0.1      │\n",
              "│ 2025-01-06 ┆ AAPL     ┆ cusip9        ┆ market       ┆ market     ┆ 1.0      │\n",
              "│ 2025-01-07 ┆ GOOG     ┆ cusip9        ┆ style        ┆ momentum_6 ┆ -0.28    │\n",
              "│ 2025-01-07 ┆ GOOG     ┆ cusip9        ┆ market       ┆ market     ┆ 1.0      │\n",
              "│ 2025-01-07 ┆ AAPL     ┆ cusip9        ┆ style        ┆ momentum_6 ┆ 0.0      │\n",
              "│ 2025-01-07 ┆ AAPL     ┆ cusip9        ┆ market       ┆ market     ┆ 1.0      │\n",
              "└────────────┴──────────┴───────────────┴──────────────┴────────────┴──────────┘"
            ]
          },
          "execution_count": 30,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposure_dataset.get_parser(\"Long-Format\").get_examples()[0]"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 31,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 6)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>factor</th><th>exposure</th><th>asset_id_type</th><th>factor_group</th></tr><tr><td>date</td><td>str</td><td>str</td><td>f32</td><td>str</td><td>str</td></tr></thead><tbody><tr><td>2025-12-31</td><td>&quot;ICF765A5A4&quot;</td><td>&quot;momentum&quot;</td><td>0.231567</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr><tr><td>2025-12-31</td><td>&quot;ICF7F6AFDC&quot;</td><td>&quot;momentum&quot;</td><td>-0.82959</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr><tr><td>2025-12-31</td><td>&quot;ICF982536B&quot;</td><td>&quot;momentum&quot;</td><td>0.380859</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr><tr><td>2025-12-31</td><td>&quot;ICFB370DC9&quot;</td><td>&quot;momentum&quot;</td><td>0.572754</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr><tr><td>2025-12-31</td><td>&quot;ICFE6CF9D5&quot;</td><td>&quot;momentum&quot;</td><td>-1.203125</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 6)\n",
              "┌────────────┬────────────┬──────────┬───────────┬───────────────┬──────────────┐\n",
              "│ date       ┆ asset_id   ┆ factor   ┆ exposure  ┆ asset_id_type ┆ factor_group │\n",
              "│ ---        ┆ ---        ┆ ---      ┆ ---       ┆ ---           ┆ ---          │\n",
              "│ date       ┆ str        ┆ str      ┆ f32       ┆ str           ┆ str          │\n",
              "╞════════════╪════════════╪══════════╪═══════════╪═══════════════╪══════════════╡\n",
              "│ 2025-12-31 ┆ ICF765A5A4 ┆ momentum ┆ 0.231567  ┆ bayesid       ┆ style        │\n",
              "│ 2025-12-31 ┆ ICF7F6AFDC ┆ momentum ┆ -0.82959  ┆ bayesid       ┆ style        │\n",
              "│ 2025-12-31 ┆ ICF982536B ┆ momentum ┆ 0.380859  ┆ bayesid       ┆ style        │\n",
              "│ 2025-12-31 ┆ ICFB370DC9 ┆ momentum ┆ 0.572754  ┆ bayesid       ┆ style        │\n",
              "│ 2025-12-31 ┆ ICFE6CF9D5 ┆ momentum ┆ -1.203125 ┆ bayesid       ┆ style        │\n",
              "└────────────┴────────────┴──────────┴───────────┴───────────────┴──────────────┘"
            ]
          },
          "execution_count": 31,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "upload_df = (\n",
        "    exposures_df.filter(pl.col(\"date\") <= dt.date(2025, 12, 31))\n",
        "    .rename({\"bayesid\": \"asset_id\", \"style.Size\": \"size\", \"style.Momentum\": \"momentum\"})\n",
        "    .unpivot(\n",
        "        on=[\"size\", \"momentum\"],\n",
        "        index=[\"date\", \"asset_id\"],\n",
        "        variable_name=\"factor\",\n",
        "        value_name=\"exposure\",\n",
        "    )\n",
        "    .with_columns(\n",
        "        pl.lit(\"bayesid\").alias(\"asset_id_type\"),\n",
        "        pl.lit(\"style\").alias(\"factor_group\")\n",
        "    )\n",
        ")\n",
        "\n",
        "upload_df.tail()\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 32,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "UploadCommitResult(version=1, committed_names=[])"
            ]
          },
          "execution_count": 32,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposure_dataset.fast_commit(upload_df, mode=\"append\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To verify that our data was uploaded correctly we can obtain the data back from the exposure dataset."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 33,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 6)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>asset_id_type</th><th>factor_group</th><th>factor</th><th>exposure</th></tr><tr><td>date</td><td>str</td><td>str</td><td>str</td><td>str</td><td>f32</td></tr></thead><tbody><tr><td>2025-03-31</td><td>&quot;IC006CA2E0&quot;</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td><td>&quot;size&quot;</td><td>2.892578</td></tr><tr><td>2025-03-31</td><td>&quot;IC0390CC2A&quot;</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td><td>&quot;size&quot;</td><td>2.84375</td></tr><tr><td>2025-03-31</td><td>&quot;IC03C39235&quot;</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td><td>&quot;size&quot;</td><td>2.876953</td></tr><tr><td>2025-03-31</td><td>&quot;IC06870B83&quot;</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td><td>&quot;size&quot;</td><td>3.0</td></tr><tr><td>2025-03-31</td><td>&quot;IC069B311C&quot;</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td><td>&quot;size&quot;</td><td>3.0</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 6)\n",
              "┌────────────┬────────────┬───────────────┬──────────────┬────────┬──────────┐\n",
              "│ date       ┆ asset_id   ┆ asset_id_type ┆ factor_group ┆ factor ┆ exposure │\n",
              "│ ---        ┆ ---        ┆ ---           ┆ ---          ┆ ---    ┆ ---      │\n",
              "│ date       ┆ str        ┆ str           ┆ str          ┆ str    ┆ f32      │\n",
              "╞════════════╪════════════╪═══════════════╪══════════════╪════════╪══════════╡\n",
              "│ 2025-03-31 ┆ IC006CA2E0 ┆ bayesid       ┆ style        ┆ size   ┆ 2.892578 │\n",
              "│ 2025-03-31 ┆ IC0390CC2A ┆ bayesid       ┆ style        ┆ size   ┆ 2.84375  │\n",
              "│ 2025-03-31 ┆ IC03C39235 ┆ bayesid       ┆ style        ┆ size   ┆ 2.876953 │\n",
              "│ 2025-03-31 ┆ IC06870B83 ┆ bayesid       ┆ style        ┆ size   ┆ 3.0      │\n",
              "│ 2025-03-31 ┆ IC069B311C ┆ bayesid       ┆ style        ┆ size   ┆ 3.0      │\n",
              "└────────────┴────────────┴───────────────┴──────────────┴────────┴──────────┘"
            ]
          },
          "execution_count": 33,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposure_dataset.get_data().collect().head()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Uploading Time Series Data\n",
        "\n",
        "In below example we will be uploading two hypothetical time series (e.g. the oil price or the total returns of a technology index). Those can then be used to run create asset level exposures using Bayesline's huber regression framework. \n",
        "\n",
        "#### Creating Sample Time Series\n",
        "Below we simply create two random time series by sampling a normal distribution with a positive drift."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 34,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 3)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>returns_oil</th><th>returns_tech</th></tr><tr><td>date</td><td>f64</td><td>f64</td></tr></thead><tbody><tr><td>2025-12-27</td><td>0.003365</td><td>-0.009885</td></tr><tr><td>2025-12-28</td><td>-0.000773</td><td>0.007268</td></tr><tr><td>2025-12-29</td><td>0.021132</td><td>0.006827</td></tr><tr><td>2025-12-30</td><td>0.015934</td><td>0.00405</td></tr><tr><td>2025-12-31</td><td>0.004058</td><td>0.005765</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 3)\n",
              "┌────────────┬─────────────┬──────────────┐\n",
              "│ date       ┆ returns_oil ┆ returns_tech │\n",
              "│ ---        ┆ ---         ┆ ---          │\n",
              "│ date       ┆ f64         ┆ f64          │\n",
              "╞════════════╪═════════════╪══════════════╡\n",
              "│ 2025-12-27 ┆ 0.003365    ┆ -0.009885    │\n",
              "│ 2025-12-28 ┆ -0.000773   ┆ 0.007268     │\n",
              "│ 2025-12-29 ┆ 0.021132    ┆ 0.006827     │\n",
              "│ 2025-12-30 ┆ 0.015934    ┆ 0.00405      │\n",
              "│ 2025-12-31 ┆ 0.004058    ┆ 0.005765     │\n",
              "└────────────┴─────────────┴──────────────┘"
            ]
          },
          "execution_count": 34,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "dates = upload_df[\"date\"].unique().sort()\n",
        "mu, sigma = 0.0002, 0.01\n",
        "rng = np.random.default_rng(seed=42)\n",
        "returns_oil = rng.normal(mu, sigma, size=len(dates))\n",
        "returns_tech = rng.normal(mu, sigma, size=len(dates))\n",
        "\n",
        "returns_df = pl.DataFrame({\n",
        "    \"date\": dates,\n",
        "    \"returns_oil\": returns_oil,\n",
        "    \"returns_tech\": returns_tech\n",
        "})\n",
        "\n",
        "returns_df.tail()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### Upload the Factor Time Series\n",
        "\n",
        "This is the same as above, only that we use the `factors` data type for the uploader."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 35,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "['exposures', 'factors', 'hierarchies', 'portfolios']"
            ]
          },
          "execution_count": 35,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "uploaders.get_data_types()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 36,
      "metadata": {},
      "outputs": [],
      "source": [
        "factor_uploader = bln.equity.uploaders.get_data_type(\"factors\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 37,
      "metadata": {},
      "outputs": [],
      "source": [
        "factor_ts_dataset = factor_uploader.get_or_create_dataset(\"Oil-and-Tech-Returns\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 38,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "UploadCommitResult(version=1, committed_names=[])"
            ]
          },
          "execution_count": 38,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "factor_ts_dataset.fast_commit(returns_df, mode=\"append\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 39,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 3)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>factor</th><th>value</th></tr><tr><td>date</td><td>str</td><td>f32</td></tr></thead><tbody><tr><td>2025-12-27</td><td>&quot;returns_tech&quot;</td><td>-0.009885</td></tr><tr><td>2025-12-28</td><td>&quot;returns_tech&quot;</td><td>0.007268</td></tr><tr><td>2025-12-29</td><td>&quot;returns_tech&quot;</td><td>0.006827</td></tr><tr><td>2025-12-30</td><td>&quot;returns_tech&quot;</td><td>0.00405</td></tr><tr><td>2025-12-31</td><td>&quot;returns_tech&quot;</td><td>0.005765</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 3)\n",
              "┌────────────┬──────────────┬───────────┐\n",
              "│ date       ┆ factor       ┆ value     │\n",
              "│ ---        ┆ ---          ┆ ---       │\n",
              "│ date       ┆ str          ┆ f32       │\n",
              "╞════════════╪══════════════╪═══════════╡\n",
              "│ 2025-12-27 ┆ returns_tech ┆ -0.009885 │\n",
              "│ 2025-12-28 ┆ returns_tech ┆ 0.007268  │\n",
              "│ 2025-12-29 ┆ returns_tech ┆ 0.006827  │\n",
              "│ 2025-12-30 ┆ returns_tech ┆ 0.00405   │\n",
              "│ 2025-12-31 ┆ returns_tech ┆ 0.005765  │\n",
              "└────────────┴──────────────┴───────────┘"
            ]
          },
          "execution_count": 39,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "factor_ts_dataset.get_data().collect().tail()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Creating a Custom Risk Dataset\n",
        "\n",
        "Recall that we named the custom exposure dataset `My-US-Top100-Exposures` and the factor time series dataset `Oil-and-Tech-Returns`. We will use these names to specify that the exposure input data for the new risk dataset should be sourced from these uploads.\n",
        "\n",
        "Also note from above that we specified `style` as the *factor group*. *Factor groups* are used to logically group exposures into *styles*, *regions*, *industries*, etc. This is particularly important if we bring more than one set of industry or region schemas (e.g. TRBC and GICS). \n",
        "Below we reference only the `style` factor group through `continuous_factor_groups` (meaning no other exposure groups will be brought in, even if they existed in our uploaded exposure dataset).\n",
        "\n",
        "Lastly, note that below we specify `exposures` as a list. We can reference more than one exposure source and create a consolidated risk dataset from them. In fact, we do just that in this example, where we combine the uploaded exposures with Huber regression exposures derived from the uploaded time series.\n",
        "\n",
        "Note the following nuance:\n",
        "* We may want to bring in a market factor (and potentially industry and country factors). If we do not bring our own, we can stub in *unit* exposure dummies, as shown below for the market factor.\n",
        "\n",
        "Below we use largely default settings for both the uploaded exposures and the Huber regressions. There is an extensive set of available options; see the relevant docs:\n",
        "* [RiskDatasetUploadedExposureSettings](https://docs.bayesline.com/0.12.1/_autosummary/bayesline.api.equity.RiskDatasetUploadedExposureSettings.html)\n",
        "* [RiskDatasetHuberRegressionExposureSettings](https://docs.bayesline.com/0.12.1/_autosummary/bayesline.api.equity.RiskDatasetHuberRegressionExposureSettings.html)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 40,
      "metadata": {},
      "outputs": [],
      "source": [
        "riskdataset_settings = RiskDatasetSettings(\n",
        "    reference_dataset=\"Bayesline-US-All-1y\",\n",
        "    exposures=[\n",
        "        RiskDatasetUnitExposureSettings(\n",
        "            factor_group=\"market\", factor=\"market\", factor_type=\"continuous\"\n",
        "        ),\n",
        "        RiskDatasetUploadedExposureSettings(\n",
        "            exposure_source=\"My-US-Top100-Exposures\",\n",
        "            continuous_factor_groups=[\"style\"],\n",
        "            factor_groups_gaussianize=[\"style\"],\n",
        "            factor_groups_fill_miss=[\"style\"],\n",
        "        ),\n",
        "        RiskDatasetHuberRegressionExposureSettings(\n",
        "            tsfactors_source=\"Oil-and-Tech-Returns\",\n",
        "        ),\n",
        "    ],\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 41,
      "metadata": {},
      "outputs": [],
      "source": [
        "risk_dataset_api = risk_datasets.create_dataset(\"My-Risk-Dataset\", settings=riskdataset_settings)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 42,
      "metadata": {},
      "outputs": [],
      "source": [
        "risk_dataset_props = risk_dataset_api.describe()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 43,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{'market': ['market'],\n",
              " 'style': ['momentum', 'size'],\n",
              " 'huber_style': ['returns_oil', 'returns_tech']}"
            ]
          },
          "execution_count": 43,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_dataset_props.exposure_settings_menu.continuous_hierarchies"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "First we might be interested in what the huber regression based exposures worked out to be. Everything is linked with the rest of the Bayesline ecosystem so we can simply pick them up through the *Exposures API*."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 44,
      "metadata": {},
      "outputs": [],
      "source": [
        "universe_settings = UniverseSettings(dataset=\"My-Risk-Dataset\")\n",
        "\n",
        "exposure_settings = ExposureSettings(\n",
        "    exposures=[\n",
        "        ContinuousExposureGroupSettings(hierarchy=\"market\"),\n",
        "        ContinuousExposureGroupSettings(hierarchy=\"huber_style\"),\n",
        "    ],\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 45,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 5)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>bayesid</th><th>market.market</th><th>huber_style.returns_oil</th><th>huber_style.returns_tech</th></tr><tr><td>date</td><td>str</td><td>f32</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2025-12-27</td><td>&quot;IC83A1B819&quot;</td><td>0.0</td><td>0.136719</td><td>0.090698</td></tr><tr><td>2025-12-28</td><td>&quot;IC83A1B819&quot;</td><td>0.0</td><td>0.136719</td><td>0.090698</td></tr><tr><td>2025-12-29</td><td>&quot;IC83A1B819&quot;</td><td>0.0</td><td>0.137207</td><td>0.08667</td></tr><tr><td>2025-12-30</td><td>&quot;IC83A1B819&quot;</td><td>0.0</td><td>0.169189</td><td>0.086609</td></tr><tr><td>2025-12-31</td><td>&quot;IC83A1B819&quot;</td><td>0.0</td><td>0.1640625</td><td>0.08374</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 5)\n",
              "┌────────────┬────────────┬───────────────┬─────────────────────────┬──────────────────────────┐\n",
              "│ date       ┆ bayesid    ┆ market.market ┆ huber_style.returns_oil ┆ huber_style.returns_tech │\n",
              "│ ---        ┆ ---        ┆ ---           ┆ ---                     ┆ ---                      │\n",
              "│ date       ┆ str        ┆ f32           ┆ f32                     ┆ f32                      │\n",
              "╞════════════╪════════════╪═══════════════╪═════════════════════════╪══════════════════════════╡\n",
              "│ 2025-12-27 ┆ IC83A1B819 ┆ 0.0           ┆ 0.136719                ┆ 0.090698                 │\n",
              "│ 2025-12-28 ┆ IC83A1B819 ┆ 0.0           ┆ 0.136719                ┆ 0.090698                 │\n",
              "│ 2025-12-29 ┆ IC83A1B819 ┆ 0.0           ┆ 0.137207                ┆ 0.08667                  │\n",
              "│ 2025-12-30 ┆ IC83A1B819 ┆ 0.0           ┆ 0.169189                ┆ 0.086609                 │\n",
              "│ 2025-12-31 ┆ IC83A1B819 ┆ 0.0           ┆ 0.1640625               ┆ 0.08374                  │\n",
              "└────────────┴────────────┴───────────────┴─────────────────────────┴──────────────────────────┘"
            ]
          },
          "execution_count": 45,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposures_api = bln.equity.exposures.load(exposure_settings)\n",
        "exposures_df_my_model = exposures_api.get(universe_settings, standardize_universe=None)\n",
        "exposures_df_my_model.filter(pl.col(\"bayesid\") == \"IC83A1B819\").tail()  # Apple, Inc."
      ]
    },
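    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "For intuition about what a Huber regression based exposure represents, the sketch below regresses a synthetic asset return series on two synthetic factor return series using iteratively reweighted least squares. It is not Bayesline's implementation: the helper `huber_regression`, the threshold `delta=1.345`, the MAD scale estimate, and all data are assumptions made purely for illustration.\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "\n",
        "\n",
        "def huber_regression(X, y, delta=1.345, n_iter=50, tol=1e-8):\n",
        "    # Hypothetical helper: fit y ~ a + X @ b under Huber loss via IRLS.\n",
        "    A = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column\n",
        "    coef = np.linalg.lstsq(A, y, rcond=None)[0]  # ordinary least squares start\n",
        "    for _ in range(n_iter):\n",
        "        resid = y - A @ coef\n",
        "        # Robust residual scale from the median absolute deviation.\n",
        "        mad = np.median(np.abs(resid - np.median(resid)))\n",
        "        scale = max(mad / 0.6745, 1e-12)\n",
        "        u = np.abs(resid) / scale\n",
        "        # Huber weights: 1 inside the threshold, delta / |u| outside.\n",
        "        w = np.where(u <= delta, 1.0, delta / np.maximum(u, 1e-12))\n",
        "        sw = np.sqrt(w)\n",
        "        new_coef = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)[0]\n",
        "        converged = np.max(np.abs(new_coef - coef)) < tol\n",
        "        coef = new_coef\n",
        "        if converged:\n",
        "            break\n",
        "    return coef\n",
        "\n",
        "\n",
        "rng = np.random.default_rng(0)\n",
        "factors = rng.normal(0.0, 0.01, size=(250, 2))  # synthetic factor returns\n",
        "true_betas = np.array([0.15, 0.09])  # assumed sensitivities\n",
        "asset = factors @ true_betas + rng.normal(0.0, 0.001, size=250)\n",
        "asset[::25] += 0.05  # inject a handful of outliers\n",
        "\n",
        "coef = huber_regression(factors, asset)\n",
        "betas = coef[1:]  # drop the intercept\n",
        "print(betas)\n",
        "```\n",
        "\n",
        "The slope coefficients in `betas` play the role of the asset's exposures to the two time series; the injected outliers are down-weighted rather than dropped, which is the main reason to prefer a Huber loss over plain least squares here."
      ]
    },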
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Below we can now build a factor risk model with the risk dataset we just created."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 46,
      "metadata": {},
      "outputs": [],
      "source": [
        "riskmodel_engine = bln.equity.riskmodels.load(\n",
        "    FactorRiskModelSettings(\n",
        "        universe=universe_settings,\n",
        "        exposures=exposure_settings,\n",
        "        modelconstruction=ModelConstructionSettings(\n",
        "            estimation_universe=None,\n",
        "        ),\n",
        "    )\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 47,
      "metadata": {},
      "outputs": [],
      "source": [
        "risk_model_api = riskmodel_engine.get_model() "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 48,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>huber_style.returns_oil</th><th>huber_style.returns_tech</th><th>market.market</th></tr><tr><td>date</td><td>f32</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2025-12-24</td><td>-0.000119</td><td>9.7059e-9</td><td>0.00312</td></tr><tr><td>2025-12-26</td><td>0.000164</td><td>-0.000899</td><td>-0.001331</td></tr><tr><td>2025-12-29</td><td>-0.000064</td><td>0.000105</td><td>-0.004423</td></tr><tr><td>2025-12-30</td><td>0.000667</td><td>-0.000139</td><td>-0.00278</td></tr><tr><td>2025-12-31</td><td>-0.00011</td><td>-0.000572</td><td>-0.006793</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 4)\n",
              "┌────────────┬─────────────────────────┬──────────────────────────┬───────────────┐\n",
              "│ date       ┆ huber_style.returns_oil ┆ huber_style.returns_tech ┆ market.market │\n",
              "│ ---        ┆ ---                     ┆ ---                      ┆ ---           │\n",
              "│ date       ┆ f32                     ┆ f32                      ┆ f32           │\n",
              "╞════════════╪═════════════════════════╪══════════════════════════╪═══════════════╡\n",
              "│ 2025-12-24 ┆ -0.000119               ┆ 9.7059e-9                ┆ 0.00312       │\n",
              "│ 2025-12-26 ┆ 0.000164                ┆ -0.000899                ┆ -0.001331     │\n",
              "│ 2025-12-29 ┆ -0.000064               ┆ 0.000105                 ┆ -0.004423     │\n",
              "│ 2025-12-30 ┆ 0.000667                ┆ -0.000139                ┆ -0.00278      │\n",
              "│ 2025-12-31 ┆ -0.00011                ┆ -0.000572                ┆ -0.006793     │\n",
              "└────────────┴─────────────────────────┴──────────────────────────┴───────────────┘"
            ]
          },
          "execution_count": 48,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_model_api.fret().tail()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Adding Exposures to an Existing Custom Risk Dataset\n",
        "\n",
        "As a last step in this tutorial we will add exposures for 2025 to our existing exposures upload and then update the risk dataset we already created. \n",
        "\n",
        "Recall the steps from above to obtain some sample exposures up to the end of 2024. We will follow the same steps here (note that we'll reuse the same top 100 assets from above).\n",
        "\n",
        "Also note that:\n",
        "* we won't update the factor time series to demonstrate the behavior in case of only partially available exposures.\n",
        "* the dataframe we upload also contains 2024 dates. These will be ignored when uploading in append mode (i.e. any existing date/factor combindations are ignored)."
      ]
    },
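    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The append behavior described above can be pictured locally with a small polars sketch. It only mimics the stated rule (rows whose date/factor combination already exists are ignored) on toy data; the frames `existing` and `incoming` are hypothetical, and the actual server-side logic may differ.\n",
        "\n",
        "```python\n",
        "import datetime as dt\n",
        "\n",
        "import polars as pl\n",
        "\n",
        "# Toy stand-in for rows that are already stored in the uploaded dataset.\n",
        "existing = pl.DataFrame(\n",
        "    {\n",
        "        \"date\": [dt.date(2025, 12, 30), dt.date(2025, 12, 31)],\n",
        "        \"asset_id\": [\"A\", \"A\"],\n",
        "        \"factor\": [\"size\", \"size\"],\n",
        "        \"exposure\": [1.0, 1.1],\n",
        "    }\n",
        ")\n",
        "\n",
        "# Toy upload that overlaps on 2025-12-31 and adds one new date.\n",
        "incoming = pl.DataFrame(\n",
        "    {\n",
        "        \"date\": [dt.date(2025, 12, 31), dt.date(2026, 1, 2)],\n",
        "        \"asset_id\": [\"A\", \"A\"],\n",
        "        \"factor\": [\"size\", \"size\"],\n",
        "        \"exposure\": [9.9, 1.2],\n",
        "    }\n",
        ")\n",
        "\n",
        "# Assumed rule: keep only incoming rows whose (date, factor) is not stored yet.\n",
        "keys = [\"date\", \"factor\"]\n",
        "new_rows = incoming.join(existing.select(keys).unique(), on=keys, how=\"anti\")\n",
        "\n",
        "appended = pl.concat([existing, new_rows]).sort(\"date\")\n",
        "print(appended)\n",
        "```\n",
        "\n",
        "In this sketch the overlapping 2025-12-31 row keeps its stored value of `1.1`, and only the 2026-01-02 row is added."
      ]
    },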
    {
      "cell_type": "code",
      "execution_count": 49,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>bayesid</th><th>style.Size</th><th>style.Momentum</th></tr><tr><td>date</td><td>str</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2026-03-31</td><td>&quot;ICF765A5A4&quot;</td><td>2.697266</td><td>0.396729</td></tr><tr><td>2026-03-31</td><td>&quot;ICF7F6AFDC&quot;</td><td>3.0</td><td>-0.344971</td></tr><tr><td>2026-03-31</td><td>&quot;ICF982536B&quot;</td><td>3.0</td><td>-0.417236</td></tr><tr><td>2026-03-31</td><td>&quot;ICFB370DC9&quot;</td><td>3.0</td><td>-1.3125</td></tr><tr><td>2026-03-31</td><td>&quot;ICFE6CF9D5&quot;</td><td>2.832031</td><td>-2.326172</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 4)\n",
              "┌────────────┬────────────┬────────────┬────────────────┐\n",
              "│ date       ┆ bayesid    ┆ style.Size ┆ style.Momentum │\n",
              "│ ---        ┆ ---        ┆ ---        ┆ ---            │\n",
              "│ date       ┆ str        ┆ f32        ┆ f32            │\n",
              "╞════════════╪════════════╪════════════╪════════════════╡\n",
              "│ 2026-03-31 ┆ ICF765A5A4 ┆ 2.697266   ┆ 0.396729       │\n",
              "│ 2026-03-31 ┆ ICF7F6AFDC ┆ 3.0        ┆ -0.344971      │\n",
              "│ 2026-03-31 ┆ ICF982536B ┆ 3.0        ┆ -0.417236      │\n",
              "│ 2026-03-31 ┆ ICFB370DC9 ┆ 3.0        ┆ -1.3125        │\n",
              "│ 2026-03-31 ┆ ICFE6CF9D5 ┆ 2.832031   ┆ -2.326172      │\n",
              "└────────────┴────────────┴────────────┴────────────────┘"
            ]
          },
          "execution_count": 49,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposures_df.tail()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 50,
      "metadata": {},
      "outputs": [],
      "source": [
        "exposures_df = exposures_df.join(top_100_assets, on=\"bayesid\", how=\"semi\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 51,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (36_600, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>bayesid</th><th>style.Size</th><th>style.Momentum</th></tr><tr><td>date</td><td>str</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2025-03-31</td><td>&quot;IC006CA2E0&quot;</td><td>2.892578</td><td>0.955078</td></tr><tr><td>2025-03-31</td><td>&quot;IC0390CC2A&quot;</td><td>2.84375</td><td>-1.339844</td></tr><tr><td>2025-03-31</td><td>&quot;IC03C39235&quot;</td><td>2.876953</td><td>-0.96582</td></tr><tr><td>2025-03-31</td><td>&quot;IC06870B83&quot;</td><td>3.0</td><td>2.113281</td></tr><tr><td>2025-03-31</td><td>&quot;IC069B311C&quot;</td><td>3.0</td><td>1.069336</td></tr><tr><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td><td>&hellip;</td></tr><tr><td>2026-03-31</td><td>&quot;ICF765A5A4&quot;</td><td>2.697266</td><td>0.396729</td></tr><tr><td>2026-03-31</td><td>&quot;ICF7F6AFDC&quot;</td><td>3.0</td><td>-0.344971</td></tr><tr><td>2026-03-31</td><td>&quot;ICF982536B&quot;</td><td>3.0</td><td>-0.417236</td></tr><tr><td>2026-03-31</td><td>&quot;ICFB370DC9&quot;</td><td>3.0</td><td>-1.3125</td></tr><tr><td>2026-03-31</td><td>&quot;ICFE6CF9D5&quot;</td><td>2.832031</td><td>-2.326172</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (36_600, 4)\n",
              "┌────────────┬────────────┬────────────┬────────────────┐\n",
              "│ date       ┆ bayesid    ┆ style.Size ┆ style.Momentum │\n",
              "│ ---        ┆ ---        ┆ ---        ┆ ---            │\n",
              "│ date       ┆ str        ┆ f32        ┆ f32            │\n",
              "╞════════════╪════════════╪════════════╪════════════════╡\n",
              "│ 2025-03-31 ┆ IC006CA2E0 ┆ 2.892578   ┆ 0.955078       │\n",
              "│ 2025-03-31 ┆ IC0390CC2A ┆ 2.84375    ┆ -1.339844      │\n",
              "│ 2025-03-31 ┆ IC03C39235 ┆ 2.876953   ┆ -0.96582       │\n",
              "│ 2025-03-31 ┆ IC06870B83 ┆ 3.0        ┆ 2.113281       │\n",
              "│ 2025-03-31 ┆ IC069B311C ┆ 3.0        ┆ 1.069336       │\n",
              "│ …          ┆ …          ┆ …          ┆ …              │\n",
              "│ 2026-03-31 ┆ ICF765A5A4 ┆ 2.697266   ┆ 0.396729       │\n",
              "│ 2026-03-31 ┆ ICF7F6AFDC ┆ 3.0        ┆ -0.344971      │\n",
              "│ 2026-03-31 ┆ ICF982536B ┆ 3.0        ┆ -0.417236      │\n",
              "│ 2026-03-31 ┆ ICFB370DC9 ┆ 3.0        ┆ -1.3125        │\n",
              "│ 2026-03-31 ┆ ICFE6CF9D5 ┆ 2.832031   ┆ -2.326172      │\n",
              "└────────────┴────────────┴────────────┴────────────────┘"
            ]
          },
          "execution_count": 51,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposures_df"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 52,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 6)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>asset_id</th><th>factor</th><th>exposure</th><th>asset_id_type</th><th>factor_group</th></tr><tr><td>date</td><td>str</td><td>str</td><td>f32</td><td>str</td><td>str</td></tr></thead><tbody><tr><td>2026-03-31</td><td>&quot;ICF765A5A4&quot;</td><td>&quot;momentum&quot;</td><td>0.396729</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr><tr><td>2026-03-31</td><td>&quot;ICF7F6AFDC&quot;</td><td>&quot;momentum&quot;</td><td>-0.344971</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr><tr><td>2026-03-31</td><td>&quot;ICF982536B&quot;</td><td>&quot;momentum&quot;</td><td>-0.417236</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr><tr><td>2026-03-31</td><td>&quot;ICFB370DC9&quot;</td><td>&quot;momentum&quot;</td><td>-1.3125</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr><tr><td>2026-03-31</td><td>&quot;ICFE6CF9D5&quot;</td><td>&quot;momentum&quot;</td><td>-2.326172</td><td>&quot;bayesid&quot;</td><td>&quot;style&quot;</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 6)\n",
              "┌────────────┬────────────┬──────────┬───────────┬───────────────┬──────────────┐\n",
              "│ date       ┆ asset_id   ┆ factor   ┆ exposure  ┆ asset_id_type ┆ factor_group │\n",
              "│ ---        ┆ ---        ┆ ---      ┆ ---       ┆ ---           ┆ ---          │\n",
              "│ date       ┆ str        ┆ str      ┆ f32       ┆ str           ┆ str          │\n",
              "╞════════════╪════════════╪══════════╪═══════════╪═══════════════╪══════════════╡\n",
              "│ 2026-03-31 ┆ ICF765A5A4 ┆ momentum ┆ 0.396729  ┆ bayesid       ┆ style        │\n",
              "│ 2026-03-31 ┆ ICF7F6AFDC ┆ momentum ┆ -0.344971 ┆ bayesid       ┆ style        │\n",
              "│ 2026-03-31 ┆ ICF982536B ┆ momentum ┆ -0.417236 ┆ bayesid       ┆ style        │\n",
              "│ 2026-03-31 ┆ ICFB370DC9 ┆ momentum ┆ -1.3125   ┆ bayesid       ┆ style        │\n",
              "│ 2026-03-31 ┆ ICFE6CF9D5 ┆ momentum ┆ -2.326172 ┆ bayesid       ┆ style        │\n",
              "└────────────┴────────────┴──────────┴───────────┴───────────────┴──────────────┘"
            ]
          },
          "execution_count": 52,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "upload_df = (\n",
        "    exposures_df\n",
        "    .rename({\"bayesid\": \"asset_id\", \"style.Size\": \"size\", \"style.Momentum\": \"momentum\"})\n",
        "    .unpivot(\n",
        "        on=[\"size\", \"momentum\"],\n",
        "        index=[\"date\", \"asset_id\"],\n",
        "        variable_name=\"factor\",\n",
        "        value_name=\"exposure\",\n",
        "    )\n",
        "    .with_columns(\n",
        "        pl.lit(\"bayesid\").alias(\"asset_id_type\"),\n",
        "        pl.lit(\"style\").alias(\"factor_group\")\n",
        "    )\n",
        ")\n",
        "\n",
        "upload_df.tail()\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Note that we use the `append` mode below, which adds the new rows to the dataset rather than overwriting previously committed data."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 53,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "UploadCommitResult(version=2, committed_names=[])"
            ]
          },
          "execution_count": 53,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposure_dataset.fast_commit(upload_df, mode=\"append\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 54,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{2: datetime.datetime(2026, 4, 29, 20, 17, 29, 292000, tzinfo=datetime.timezone.utc),\n",
              " 1: datetime.datetime(2026, 4, 29, 20, 16, 50, 126000, tzinfo=datetime.timezone.utc),\n",
              " 0: datetime.datetime(2026, 4, 29, 20, 16, 50, 70000, tzinfo=datetime.timezone.utc)}"
            ]
          },
          "execution_count": 54,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "exposure_dataset.version_history()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The new exposures have been uploaded. As a last step, we need to update our risk dataset. For now, the risk dataset must be updated manually to bring in the changes; a future release will add functionality to trigger the update automatically when input data changes."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 55,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "RiskDatasetUpdateResult()"
            ]
          },
          "execution_count": 55,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_dataset_api.update()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Fitting the risk model again, we find that the new exposures have been captured."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 56,
      "metadata": {},
      "outputs": [],
      "source": [
        "riskmodel_engine = bln.equity.riskmodels.load(\n",
        "    FactorRiskModelSettings(\n",
        "        universe=universe_settings,\n",
        "        exposures=exposure_settings,\n",
        "    )\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 57,
      "metadata": {},
      "outputs": [],
      "source": [
        "risk_model_api = riskmodel_engine.get_model()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 58,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div><style>\n",
              ".dataframe > thead > tr,\n",
              ".dataframe > tbody > tr {\n",
              "  text-align: right;\n",
              "  white-space: pre-wrap;\n",
              "}\n",
              "</style>\n",
              "<small>shape: (5, 4)</small><table border=\"1\" class=\"dataframe\"><thead><tr><th>date</th><th>huber_style.returns_oil</th><th>huber_style.returns_tech</th><th>market.market</th></tr><tr><td>date</td><td>f32</td><td>f32</td><td>f32</td></tr></thead><tbody><tr><td>2026-03-25</td><td>0.00027</td><td>0.000519</td><td>0.009012</td></tr><tr><td>2026-03-26</td><td>-0.001483</td><td>-0.001564</td><td>-0.012237</td></tr><tr><td>2026-03-27</td><td>0.001317</td><td>-0.001228</td><td>-0.015247</td></tr><tr><td>2026-03-30</td><td>-0.000715</td><td>-0.000302</td><td>-0.005544</td></tr><tr><td>2026-03-31</td><td>0.002486</td><td>0.002097</td><td>0.025684</td></tr></tbody></table></div>"
            ],
            "text/plain": [
              "shape: (5, 4)\n",
              "┌────────────┬─────────────────────────┬──────────────────────────┬───────────────┐\n",
              "│ date       ┆ huber_style.returns_oil ┆ huber_style.returns_tech ┆ market.market │\n",
              "│ ---        ┆ ---                     ┆ ---                      ┆ ---           │\n",
              "│ date       ┆ f32                     ┆ f32                      ┆ f32           │\n",
              "╞════════════╪═════════════════════════╪══════════════════════════╪═══════════════╡\n",
              "│ 2026-03-25 ┆ 0.00027                 ┆ 0.000519                 ┆ 0.009012      │\n",
              "│ 2026-03-26 ┆ -0.001483               ┆ -0.001564                ┆ -0.012237     │\n",
              "│ 2026-03-27 ┆ 0.001317                ┆ -0.001228                ┆ -0.015247     │\n",
              "│ 2026-03-30 ┆ -0.000715               ┆ -0.000302                ┆ -0.005544     │\n",
              "│ 2026-03-31 ┆ 0.002486                ┆ 0.002097                 ┆ 0.025684      │\n",
              "└────────────┴─────────────────────────┴──────────────────────────┴───────────────┘"
            ]
          },
          "execution_count": 58,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_model_api.fret().tail()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Housekeeping\n",
        "\n",
        "The cells below demonstrate how to delete the risk datasets and destroy the uploaded datasets created in this tutorial."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 59,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "RawSettings(model_type='RiskDatasetSettings', name='My-Dataset', identifier=1, exists=True, raw_json={'is_system_wide': False, 'reference_dataset': 'Bayesline-US-All-1y', 'exposures': [{'exposure_type': 'referenced', 'continuous_factor_groups': None, 'categorical_factor_groups': None}], 'exchange_codes': None, 'trim_assets': 'none', 'trim_start_date': 'earliest_start', 'trim_end_date': 'latest_end'}, references=[], extra={})"
            ]
          },
          "execution_count": 59,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_datasets.delete_dataset(\"My-Dataset\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 60,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "RawSettings(model_type='RiskDatasetSettings', name='My-Risk-Dataset', identifier=2, exists=True, raw_json={'is_system_wide': False, 'reference_dataset': 'Bayesline-US-All-1y', 'exposures': [{'exposure_type': 'unit', 'factor_group': 'market', 'factor': 'market', 'factor_type': 'continuous'}, {'exposure_type': 'uploaded', 'exposure_source': 'My-US-Top100-Exposures', 'continuous_factor_groups': ['style'], 'categorical_factor_groups': [], 'factor_groups_gaussianize': ['style'], 'factor_groups_gaussianize_maintain_zeros': [], 'factor_groups_fill_miss': ['style'], 'hierarchy_sources': {}}, {'exposure_type': 'huber_regression', 'tsfactors_source': 'Oil-and-Tech-Returns', 'factor_group': 'huber_style', 'include': 'All', 'exclude': [], 'fill_miss': True, 'window': 126, 'epsilon': 1.35, 'alpha': 0.0001, 'alpha_start': 10.0, 'student_t_level': None, 'clip': [None, None], 'gaussianize': True, 'gaussianize_maintain_zeros': False, 'impute': True, 'currency': 'USD', 'calendar': {'dataset': None, 'filters': [['XNYS']]}}], 'exchange_codes': None, 'trim_assets': 'none', 'trim_start_date': 'earliest_start', 'trim_end_date': 'latest_end'}, references=[], extra={})"
            ]
          },
          "execution_count": 60,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "risk_datasets.delete_dataset(\"My-Risk-Dataset\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 61,
      "metadata": {},
      "outputs": [],
      "source": [
        "exposure_dataset.destroy()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 62,
      "metadata": {},
      "outputs": [],
      "source": [
        "factor_ts_dataset.destroy()"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": ".venv",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.11.15"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
}