Portfolios#
Prerequisites
Uploaders API Tutorial
This tutorial demonstrates the Bayesline Portfolios API. The Portfolios API provides a unified access mechanism for bringing portfolio holdings data into the system so that it can be used downstream (e.g. for risk analytics).
Specifically, we will introduce and explore:
Portfolio Sources
Listing existing portfolio sources
Uploading new portfolio data (adding to a source, creating a new source)
Reading portfolio data
Forward filling holdings data with drift correction
Funds of funds structures
Portfolio Schemas (advanced)
Imports & Setup#
For this tutorial notebook, you will need to import the following packages.
import datetime as dt
import polars as pl
from bayesline.apiclient import BayeslineApiClient
from bayesline.api.equity import (
PortfolioSettings,
PortfolioOrganizerSettings,
)
We will also need to have a Bayesline API client configured.
bln = BayeslineApiClient.new_client(
endpoint="https://[ENDPOINT]",
api_key="[API-KEY]",
)
The main entrypoint for the Portfolios API sits on bln.equity.portfolios. All portfolios functionality can be reached from there.
portfolios_loader = bln.equity.portfolios
Portfolio Sources#
A portfolio source is an isolated dataset that contains holdings information for different portfolios. This could be a system source (e.g. from a database) or user uploaded data.
For each source it is guaranteed that the data is free of duplications and otherwise consistent.
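To make the duplication guarantee concrete, here is a minimal sketch of a duplicate check. It assumes that a holding is identified by portfolio, asset ID, asset ID type and date; the tutorial does not spell out the actual key, and the helper name find_duplicate_keys is made up for illustration.

```python
from collections import Counter
import datetime as dt

# Hypothetical holdings rows: (portfolio_id, asset_id, asset_id_type, date, value).
rows = [
    ("Test-Portfolio", "02079K107", "cusip9", dt.date(2025, 1, 1), 100),
    ("Test-Portfolio", "02079K107", "cusip9", dt.date(2025, 1, 31), 110),
    ("Test-Portfolio", "02079K107", "cusip9", dt.date(2025, 1, 31), 115),  # same key as the row above
]


def find_duplicate_keys(rows):
    """Return the keys (all fields except the value) that occur more than once."""
    counts = Counter(row[:4] for row in rows)
    return [key for key, n in counts.items() if n > 1]


duplicates = find_duplicate_keys(rows)
print(duplicates)
```

A source that honours the guarantee would yield an empty list for such a check.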
Listing Available Portfolio Sources#
The following shows how to obtain the list of available sources through the settings menu.
portfolios_loader.settings.available_settings()
PortfolioSettingsMenu(sources=[], schemas=[])
Uploading a Portfolio Source#
A new portfolio source can be added using the portfolios uploader, which can be obtained through the uploader property (note that this is a shortcut for bln.equity.uploaders.get_data_type("portfolios"), which yields the same uploader).
Since the settings menu above showed no existing portfolio sources, we should find that the uploader has no datasets.
uploader = portfolios_loader.uploader
uploader.get_datasets()
[]
Next let’s create a new portfolios dataset and upload some sample data. For a more detailed walkthrough of the uploader infrastructure (including parsers, example inputs, versioning, etc.) see the Bayesline Uploaders Tutorial.
demo_portfolio_dataset = uploader.get_or_create_dataset("demo-portfolios")
df = pl.DataFrame({
"portfolio_id": [
"Test-Portfolio", "Test-Portfolio", "Test-Portfolio",
"Test-Portfolio-2", "Test-Portfolio-2", "Test-Portfolio-2",
],
"asset_id": [
"02079K107", "02079K107", "2592345",
"85371710", "85371710", "85371710"
],
"asset_id_type": [
"cusip9", "cusip9", "sedol7",
"cusip8", "cusip8", "cusip8"
],
"date": [
dt.date(2025, 1, 1), dt.date(2025, 1, 31), dt.date(2025, 1, 15),
dt.date(2025, 1, 1), dt.date(2025, 1, 13), dt.date(2025, 2, 15),
],
"value": [
100, 110, 200,
50, 55, 54
]
})
df
portfolio_id | asset_id | asset_id_type | date | value |
---|---|---|---|---|
str | str | str | date | i64 |
"Test-Portfolio" | "02079K107" | "cusip9" | 2025-01-01 | 100 |
"Test-Portfolio" | "02079K107" | "cusip9" | 2025-01-31 | 110 |
"Test-Portfolio" | "2592345" | "sedol7" | 2025-01-15 | 200 |
"Test-Portfolio-2" | "85371710" | "cusip8" | 2025-01-01 | 50 |
"Test-Portfolio-2" | "85371710" | "cusip8" | 2025-01-13 | 55 |
"Test-Portfolio-2" | "85371710" | "cusip8" | 2025-02-15 | 54 |
demo_portfolio_dataset.fast_commit(df, mode="append")
UploadCommitResult(version=1, committed_names=[])
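Before committing, it can help to sanity check the input table locally. The sketch below is a plain Python check and not part of the Bayesline API; it assumes the five columns used in the example frame are the ones expected, which the real parsers may define differently. The name check_holdings is hypothetical.

```python
import datetime as dt

# Assumption: the columns of the example frame above; the real parser may accept others.
EXPECTED_COLUMNS = {"portfolio_id", "asset_id", "asset_id_type", "date", "value"}


def check_holdings(columns):
    """Return a list of problems found in a column-oriented holdings table."""
    problems = []
    missing = EXPECTED_COLUMNS - columns.keys()
    if missing:
        problems.append(f"missing columns: {sorted(missing)}")
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        problems.append(f"columns differ in length: {lengths}")
    if any(not isinstance(d, dt.date) for d in columns.get("date", [])):
        problems.append("date column contains non-date values")
    return problems


good = {
    "portfolio_id": ["Test-Portfolio"],
    "asset_id": ["02079K107"],
    "asset_id_type": ["cusip9"],
    "date": [dt.date(2025, 1, 1)],
    "value": [100],
}
print(check_holdings(good))
```

With polars, the same function could be fed df.to_dict(as_series=False).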
Reading Portfolio Data#
Having uploaded a portfolio source, we can now use the portfolio loader to read the portfolio data back.
portfolios_loader.settings.available_settings()
PortfolioSettingsMenu(sources=['demo-portfolios'], schemas=[])
portfolios_api = portfolios_loader.load(PortfolioSettings.from_source("demo-portfolios"))
Why use a separate Portfolios API to obtain this data, as opposed to just using the uploader infrastructure to read the data back? The Portfolios API adds plenty of functionality that is specific to the domain of portfolios, e.g. coverage statistics, forward filling and drifting, etc.
The calls below demonstrate this functionality.
portfolios_api.get_portfolio_names()
['Test-Portfolio', 'Test-Portfolio-2']
portfolios_api.get_dates()
{'Test-Portfolio': [datetime.date(2025, 1, 1),
datetime.date(2025, 1, 15),
datetime.date(2025, 1, 31)],
'Test-Portfolio-2': [datetime.date(2025, 1, 1),
datetime.date(2025, 1, 13),
datetime.date(2025, 2, 15)]}
portfolios_api.get_coverage()
portfolio_group | portfolio_id | date | asset_id_type | input | bayesid |
---|---|---|---|---|---|
str | str | date | str | u32 | u32 |
"demo-portfolios" | "Test-Portfolio-2" | 2025-01-01 | "cusip8" | 1 | 0 |
"demo-portfolios" | "Test-Portfolio" | 2025-01-15 | "sedol7" | 1 | 1 |
"demo-portfolios" | "Test-Portfolio-2" | 2025-02-15 | "cusip8" | 1 | 0 |
"demo-portfolios" | "Test-Portfolio" | 2025-01-01 | "cusip9" | 1 | 1 |
"demo-portfolios" | "Test-Portfolio" | 2025-01-31 | "cusip9" | 1 | 1 |
"demo-portfolios" | "Test-Portfolio-2" | 2025-01-13 | "cusip8" | 1 | 0 |
portfolios_api.get_portfolio(names=["Test-Portfolio"])
date | portfolio_group | portfolio_id | input_asset_id | input_asset_id_type | value |
---|---|---|---|---|---|
date | str | str | str | str | f32 |
2025-01-01 | "demo-portfolios" | "Test-Portfolio" | "02079K107" | "cusip9" | 100.0 |
2025-01-15 | "demo-portfolios" | "Test-Portfolio" | "2592345" | "sedol7" | 200.0 |
2025-01-31 | "demo-portfolios" | "Test-Portfolio" | "02079K107" | "cusip9" | 110.0 |
Note that the ID space above is the input ID space, i.e. the IDs that were uploaded. These IDs can be mapped to one of the supported target ID spaces. If an ID cannot be mapped, its output value will be None.
portfolios_api.get_id_types()
{'Test-Portfolio': ['bayesid'], 'Test-Portfolio-2': ['bayesid']}
portfolios_api.get_portfolio(names=["Test-Portfolio"], id_type="bayesid")
date | portfolio_group | portfolio_id | input_asset_id | input_asset_id_type | asset_id | asset_id_type | value |
---|---|---|---|---|---|---|---|
date | str | str | str | str | str | str | f32 |
2025-01-01 | "demo-portfolios" | "Test-Portfolio" | "02079K107" | "cusip9" | "GOOG" | "bayesid" | 100.0 |
2025-01-15 | "demo-portfolios" | "Test-Portfolio" | "2592345" | "sedol7" | "MSFT" | "bayesid" | 200.0 |
2025-01-31 | "demo-portfolios" | "Test-Portfolio" | "02079K107" | "cusip9" | "GOOG" | "bayesid" | 110.0 |
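Conceptually, the ID mapping behaves like a lookup that returns None for unknown inputs. The sketch below is a toy stand-in built from the example output above, not the actual mapping service; TO_BAYESID and map_id are hypothetical names.

```python
from typing import Optional

# Toy lookup table taken from the example output above (assumption: illustration only).
TO_BAYESID = {
    ("cusip9", "02079K107"): "GOOG",
    ("sedol7", "2592345"): "MSFT",
}


def map_id(asset_id: str, asset_id_type: str) -> Optional[str]:
    """Return the target ID, or None when the input ID cannot be mapped."""
    return TO_BAYESID.get((asset_id_type, asset_id))


print(map_id("02079K107", "cusip9"))  # GOOG
print(map_id("85371710", "cusip8"))   # None
```

The second call mirrors the coverage table, where the cusip8 holdings of Test-Portfolio-2 show zero mapped IDs.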
Forward Filling and Drift Correction#
The demo portfolio data was uploaded at a low, roughly monthly frequency, and reading it back yields the data unaltered. We can pass settings to automatically forward fill and drift correct the data. We do this at the top level by passing the relevant settings to the PortfolioSettings object.
portfolios_api = portfolios_loader.load(
PortfolioSettings.from_source("demo-portfolios", ffill="ffill-with-drift")
)
portfolios_api.get_portfolio(names=["Test-Portfolio"])
date | portfolio_group | portfolio_id | input_asset_id | input_asset_id_type | value |
---|---|---|---|---|---|
date | str | str | str | str | f32 |
2025-01-01 | "demo-portfolios" | "Test-Portfolio" | "02079K107" | "cusip9" | 100.0 |
2025-01-02 | "demo-portfolios" | "Test-Portfolio" | "02079K107" | "cusip9" | 100.068657 |
2025-01-03 | "demo-portfolios" | "Test-Portfolio" | "02079K107" | "cusip9" | 101.315361 |
2025-01-04 | "demo-portfolios" | "Test-Portfolio" | "02079K107" | "cusip9" | 101.315361 |
2025-01-05 | "demo-portfolios" | "Test-Portfolio" | "02079K107" | "cusip9" | 101.315361 |
… | … | … | … | … | … |
2025-07-04 | "demo-portfolios" | "Test-Portfolio" | "02079K107" | "cusip9" | 97.028252 |
2025-07-05 | "demo-portfolios" | "Test-Portfolio" | "02079K107" | "cusip9" | 97.028252 |
2025-07-06 | "demo-portfolios" | "Test-Portfolio" | "02079K107" | "cusip9" | 97.028252 |
2025-07-07 | "demo-portfolios" | "Test-Portfolio" | "02079K107" | "cusip9" | 95.547371 |
2025-07-08 | "demo-portfolios" | "Test-Portfolio" | "02079K107" | "cusip9" | 94.234055 |
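The drift logic itself is not spelled out in this tutorial. As a minimal sketch, assume each filled value is the previous value compounded by the asset's daily return, and that days without a return (such as weekends) carry the value forward unchanged. The function name ffill_with_drift and the returns below are made up for illustration.

```python
import datetime as dt


def ffill_with_drift(observed, returns, start, end):
    """Fill daily values between start and end (inclusive).

    observed: {date: value} for dates with uploaded holdings
    returns:  {date: simple asset return on that date}; missing dates count as 0.0
    """
    filled = {}
    value = None
    day = start
    while day <= end:
        if day in observed:
            value = observed[day]  # an uploaded value takes precedence
        elif value is not None:
            value *= 1.0 + returns.get(day, 0.0)  # drift the last value by the asset return
        if value is not None:
            filled[day] = value
        day += dt.timedelta(days=1)
    return filled


observed = {dt.date(2025, 1, 1): 100.0}
returns = {dt.date(2025, 1, 2): 0.01, dt.date(2025, 1, 3): -0.02}  # made-up returns
filled = ffill_with_drift(observed, returns, dt.date(2025, 1, 1), dt.date(2025, 1, 5))
for day, value in filled.items():
    print(day, round(value, 4))
```

As in the output above, the value stays flat over the weekend of 2025-01-04 and 2025-01-05 because no return is applied on those days.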
Funds of Funds#
We can represent fund of funds structures easily by using the IDs of the held portfolios as the asset IDs and portfolio_id as the asset ID type.
Below we upload a fund of funds portfolio to our existing dataset by appending it to the existing data.
fof_portfolio_df = pl.DataFrame({
"portfolio_id": [
"My-FOF", "My-FOF", "My-FOF", "My-FOF"
],
"asset_id": [
"Test-Portfolio", "Test-Portfolio", "Test-Portfolio-2", "Test-Portfolio-2"
],
"asset_id_type": [
"portfolio_id", "portfolio_id", "portfolio_id", "portfolio_id"
],
"date": [
dt.date(2025, 1, 1), dt.date(2025, 1, 31), dt.date(2025, 1, 1), dt.date(2025, 1, 31)
],
"value": [
.5, .55, .5, .45
]
})
fof_portfolio_df
portfolio_id | asset_id | asset_id_type | date | value |
---|---|---|---|---|
str | str | str | date | f64 |
"My-FOF" | "Test-Portfolio" | "portfolio_id" | 2025-01-01 | 0.5 |
"My-FOF" | "Test-Portfolio" | "portfolio_id" | 2025-01-31 | 0.55 |
"My-FOF" | "Test-Portfolio-2" | "portfolio_id" | 2025-01-01 | 0.5 |
"My-FOF" | "Test-Portfolio-2" | "portfolio_id" | 2025-01-31 | 0.45 |
demo_portfolio_dataset.fast_commit(fof_portfolio_df, mode="append")
UploadCommitResult(version=2, committed_names=[])
Below we obtain the fund of funds portfolio without any alterations.
portfolios_api = portfolios_loader.load(
PortfolioSettings.from_source("demo-portfolios")
)
portfolios_api.get_portfolio(names=["My-FOF"])
date | portfolio_group | portfolio_id | input_asset_id | input_asset_id_type | value |
---|---|---|---|---|---|
date | str | str | str | str | f32 |
2025-01-31 | "demo-portfolios" | "My-FOF" | "Test-Portfolio" | "portfolio_id" | 0.55 |
2025-01-31 | "demo-portfolios" | "My-FOF" | "Test-Portfolio-2" | "portfolio_id" | 0.45 |
2025-01-01 | "demo-portfolios" | "My-FOF" | "Test-Portfolio" | "portfolio_id" | 0.5 |
2025-01-01 | "demo-portfolios" | "My-FOF" | "Test-Portfolio-2" | "portfolio_id" | 0.5 |
We can also configure the loader to unpack the fund of funds structure and forward fill with drift. This is an essential piece of functionality for providing fund of funds level analytics.
portfolios_api = portfolios_loader.load(
PortfolioSettings.from_source(
"demo-portfolios", ffill="ffill-with-drift", unpack="unpack"
)
)
portfolios_api.get_portfolio(names=["My-FOF"])
date | portfolio_group | portfolio_id | input_asset_id | input_asset_id_type | value |
---|---|---|---|---|---|
date | str | str | str | str | f32 |
2025-01-01 | "demo-portfolios" | "My-FOF" | "02079K107" | "cusip9" | 50.0 |
2025-01-02 | "demo-portfolios" | "My-FOF" | "02079K107" | "cusip9" | 50.034328 |
2025-01-03 | "demo-portfolios" | "My-FOF" | "02079K107" | "cusip9" | 50.657681 |
2025-01-04 | "demo-portfolios" | "My-FOF" | "02079K107" | "cusip9" | 50.657681 |
2025-01-05 | "demo-portfolios" | "My-FOF" | "02079K107" | "cusip9" | 50.657681 |
… | … | … | … | … | … |
2025-07-04 | "demo-portfolios" | "My-FOF" | "02079K107" | "cusip9" | 53.36554 |
2025-07-05 | "demo-portfolios" | "My-FOF" | "02079K107" | "cusip9" | 53.36554 |
2025-07-06 | "demo-portfolios" | "My-FOF" | "02079K107" | "cusip9" | 53.36554 |
2025-07-07 | "demo-portfolios" | "My-FOF" | "02079K107" | "cusip9" | 52.551056 |
2025-07-08 | "demo-portfolios" | "My-FOF" | "02079K107" | "cusip9" | 51.828732 |
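Unpacking can be pictured as a look-through: each child portfolio's holdings are scaled by the fund's weight in that child. The sketch below assumes a single level of nesting and simple multiplication, which matches the first row above (0.5 × 100 = 50.0); the function name unpack is hypothetical, and the second asset's value does not appear in the truncated output.

```python
def unpack(fund_weights, child_holdings):
    """Look through a fund of funds by scaling each child's holdings by the fund's weight."""
    unpacked = {}
    for child, weight in fund_weights.items():
        for asset, value in child_holdings[child].items():
            unpacked[asset] = unpacked.get(asset, 0.0) + weight * value
    return unpacked


# Values as of 2025-01-01, taken from the example data uploaded earlier.
fund_weights = {"Test-Portfolio": 0.5, "Test-Portfolio-2": 0.5}
child_holdings = {
    "Test-Portfolio": {"02079K107": 100.0},
    "Test-Portfolio-2": {"85371710": 50.0},
}
unpacked = unpack(fund_weights, child_holdings)
print(unpacked)  # {'02079K107': 50.0, '85371710': 25.0}
```

Summing per asset means that an asset held by several children ends up as one combined position in the fund of funds.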
Portfolio Schemas#
Portfolio Schemas are an advanced topic that sits on top of the Portfolio Sources we previously explored.
Using Portfolio Schemas we can cherry-pick which portfolios should be sourced from which underlying portfolio source. We do this by providing a mapping from portfolio id -> portfolio source. This flexibility provides a powerful means to arbitrarily override portfolio holdings while keeping everything else the same, e.g. to perform what-if analyses.
Below we create a new what-if portfolio dataset where we override holdings for our Test-Portfolio-2. We then create a Portfolio Schema where we source Test-Portfolio-2 from our new dataset, keeping the rest the same.
what_if_dataset = uploader.create_dataset("what-if")
what_if_portfolio_df = pl.DataFrame({
"portfolio_id": ["Test-Portfolio-2", "Test-Portfolio-2"],
"asset_id": ["67066G10", "67066G10"],
"asset_id_type": ["cusip8", "cusip8"],
"date": [dt.date(2025, 1, 1), dt.date(2025, 1, 31)],
"value": [100, 125]
})
what_if_portfolio_df
portfolio_id | asset_id | asset_id_type | date | value |
---|---|---|---|---|
str | str | str | date | i64 |
"Test-Portfolio-2" | "67066G10" | "cusip8" | 2025-01-01 | 100 |
"Test-Portfolio-2" | "67066G10" | "cusip8" | 2025-01-31 | 125 |
what_if_dataset.fast_commit(what_if_portfolio_df, mode="append")
UploadCommitResult(version=1, committed_names=[])
portfolios_loader.settings.available_settings()
PortfolioSettingsMenu(sources=['demo-portfolios', 'what-if'], schemas=[])
schema = PortfolioOrganizerSettings(
enabled_portfolios={
"Test-Portfolio": "demo-portfolios",
"Test-Portfolio-2": "what-if"
}
)
portfolios_api = portfolios_loader.load(
PortfolioSettings(portfolio_schema=schema, ffill="ffill-with-drift")
)
portfolios_api.get_portfolio_names()
['Test-Portfolio', 'Test-Portfolio-2']
Note below how Test-Portfolio-2 has the updated values.
portfolios_api.get_portfolio(names=['Test-Portfolio', "Test-Portfolio-2"])
date | portfolio_group | portfolio_id | input_asset_id | input_asset_id_type | value |
---|---|---|---|---|---|
date | str | str | str | str | f32 |
2025-01-01 | "what-if" | "Test-Portfolio" | "02079K107" | "cusip9" | 100.0 |
2025-01-01 | "what-if" | "Test-Portfolio-2" | "67066G10" | "cusip8" | 100.0 |
2025-01-02 | "what-if" | "Test-Portfolio" | "02079K107" | "cusip9" | 100.068657 |
2025-01-02 | "what-if" | "Test-Portfolio-2" | "67066G10" | "cusip8" | 102.99353 |
2025-01-03 | "what-if" | "Test-Portfolio" | "02079K107" | "cusip9" | 101.315361 |
… | … | … | … | … | … |
2025-07-06 | "what-if" | "Test-Portfolio-2" | "67066G10" | "cusip8" | 165.908478 |
2025-07-07 | "what-if" | "Test-Portfolio" | "02079K107" | "cusip9" | 95.547371 |
2025-07-07 | "what-if" | "Test-Portfolio-2" | "67066G10" | "cusip8" | 164.763123 |
2025-07-08 | "what-if" | "Test-Portfolio" | "02079K107" | "cusip9" | 94.234055 |
2025-07-08 | "what-if" | "Test-Portfolio-2" | "67066G10" | "cusip8" | 166.595688 |
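The schema resolution can be pictured as a per-portfolio lookup into the named source. The sketch below uses in-memory dictionaries as stand-ins for the two sources and is not how the service is implemented; resolve is a hypothetical helper.

```python
# Hypothetical in-memory stand-ins for the two portfolio sources (2025-01-01 values).
sources = {
    "demo-portfolios": {
        "Test-Portfolio": [("02079K107", 100)],
        "Test-Portfolio-2": [("85371710", 50)],
    },
    "what-if": {
        "Test-Portfolio-2": [("67066G10", 100)],
    },
}

# Mapping from portfolio id -> portfolio source, as in the schema above.
schema = {"Test-Portfolio": "demo-portfolios", "Test-Portfolio-2": "what-if"}


def resolve(schema, sources):
    """Pick each portfolio's holdings from the source named in the schema."""
    return {pid: sources[src][pid] for pid, src in schema.items()}


resolved = resolve(schema, sources)
print(resolved)
```

Only Test-Portfolio-2 changes; Test-Portfolio still comes from demo-portfolios, which is what makes this pattern convenient for what-if analyses.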
Saving the Schema#
We can also save a portfolio schema for later use.
Below we demonstrate how to save the schema we previously created and then use it downstream.
portfolios_loader.organizer_settings.save("my-schema", schema)
0
portfolios_api = portfolios_loader.load(
PortfolioSettings(portfolio_schema="my-schema")
)
portfolios_api.get_portfolio_names()
['Test-Portfolio', 'Test-Portfolio-2']
Housekeeping#
Lastly, we clean up by deleting the saved schema and our portfolio datasets.
portfolios_loader.organizer_settings.delete("my-schema")
uploader.get_dataset("demo-portfolios").destroy()
uploader.get_dataset("what-if").destroy()