Portfolio Hierarchies#

Prerequisites

  • Portfolios API Tutorial

  • Uploaders API Tutorial

In this tutorial we demonstrate the Bayesline Portfolio Hierarchies API. It builds on the Portfolios API (see that tutorial for an in-depth walkthrough) and lets you combine portfolios with benchmarks and other groupings. By default it applies forward filling and drift correction. This forms the foundation for portfolio analytics: we can link a Portfolio Hierarchy and obtain analytics (e.g. a return attribution report) for different portfolios with optional benchmarks and other groupings.

Specifically, we will introduce and explore:

  • Portfolio Hierarchies

  • Creating a new Portfolio Hierarchy

    • Only portfolios

    • With benchmarks

    • Additional Groupings

  • Obtaining data

  • Housekeeping

Imports & Setup#

For this tutorial notebook, you will need to import the following packages.

import datetime as dt

import polars as pl

from bayesline.apiclient import BayeslineApiClient

from bayesline.api.equity import (
    PortfolioHierarchySettings,
)

We will also need to have a Bayesline API client configured.

bln = BayeslineApiClient.new_client(
    endpoint="https://[ENDPOINT]",
    api_key="[API-KEY]",
)

The main entry point for the Portfolio Hierarchies API is bln.equity.portfoliohierarchies. All portfolio hierarchy functionality can be reached from there.

See here for relevant docs:

ph_loader = bln.equity.portfoliohierarchies

Portfolio Hierarchies#

In its simplest form, a PortfolioHierarchy consists solely of a list of portfolio IDs and a source or schema that describes where to obtain the data.

Below we create some demo portfolio data and upload it so that it can be used in this tutorial. See the Uploaders API tutorial for an in-depth walkthrough of how to bring data into the Bayesline ecosystem.

In this toy example we create four portfolios, each with two holdings prints dated 2025-01-01 and 2025-01-31:

  1. AGTHX with two holdings

  2. FCNTX with one holding

  3. VADGX with three holdings

  4. SPX with four holdings

uploader = (
    bln.equity.uploaders
    .get_data_type("portfolios")
    .get_or_create_dataset("portfolio-hierarchies-demo")
)
portfolio_df = pl.DataFrame({
    "portfolio_id": [
        "AGTHX", "AGTHX", "AGTHX", "AGTHX",
        "FCNTX", "FCNTX",
        "VADGX", "VADGX", "VADGX", "VADGX", "VADGX", "VADGX",
        "SPX", "SPX", "SPX", "SPX", "SPX", "SPX", "SPX", "SPX",
    ],
    "asset_id": [
        # AGTHX
        "02079K107", "02079K107", 
        "2592345", "2592345",

        # FCNTX
        "67066G10", "67066G10",

        # VADGX
        "02079K107", "02079K107", 
        "2592345", "2592345", 
        "67066G10", "67066G10",

        # SPX
        "02079K107", "02079K107", 
        "2592345", "2592345",
        "67066G10", "67066G10",
        "85371710", "85371710",

    ],  
    "asset_id_type": [
        # AGTHX
        "cusip9", "cusip9", "sedol7", "sedol7",

        # FCNTX
        "cusip8", "cusip8",

        # VADGX
        "cusip9", "cusip9", "sedol7", "sedol7", "cusip8", "cusip8",

        # SPX
        "cusip9", "cusip9", "sedol7", "sedol7", "cusip8", "cusip8", "cusip8", "cusip8",
        
    ],
    "date": [
        # AGTHX
        dt.date(2025, 1, 1), dt.date(2025, 1, 31), 
        dt.date(2025, 1, 1), dt.date(2025, 1, 31),

        # FCNTX
        dt.date(2025, 1, 1), dt.date(2025, 1, 31),

        # VADGX
        dt.date(2025, 1, 1), dt.date(2025, 1, 31), 
        dt.date(2025, 1, 1), dt.date(2025, 1, 31), 
        dt.date(2025, 1, 1), dt.date(2025, 1, 31), 
        
        # SPX
        dt.date(2025, 1, 1), dt.date(2025, 1, 31), 
        dt.date(2025, 1, 1), dt.date(2025, 1, 31), 
        dt.date(2025, 1, 1), dt.date(2025, 1, 31), 
        dt.date(2025, 1, 1), dt.date(2025, 1, 31), 
    ],
    "value": [
        # AGTHX
        0.5, 0.55, 
        0.5, 0.45,

        # FCNTX
        1.0, 1.0,

        # VADGX
        0.3, 0.5, 
        0.4, 0.2,
        0.3, 0.25,

        # SPX
        0.3, 0.3, 
        0.4, 0.2,
        0.3, 0.25,
        0.3, 0.25,
    ],
})

portfolio_df
shape: (20, 5)
portfolio_id  asset_id     asset_id_type  date        value
str           str          str            date        f64
"AGTHX"       "02079K107"  "cusip9"       2025-01-01  0.5
"AGTHX"       "02079K107"  "cusip9"       2025-01-31  0.55
"AGTHX"       "2592345"    "sedol7"       2025-01-01  0.5
"AGTHX"       "2592345"    "sedol7"       2025-01-31  0.45
"FCNTX"       "67066G10"   "cusip8"       2025-01-01  1.0
…             …            …              …           …
"SPX"         "2592345"    "sedol7"       2025-01-31  0.2
"SPX"         "67066G10"   "cusip8"       2025-01-01  0.3
"SPX"         "67066G10"   "cusip8"       2025-01-31  0.25
"SPX"         "85371710"   "cusip8"       2025-01-01  0.3
"SPX"         "85371710"   "cusip8"       2025-01-31  0.25
uploader.fast_commit(portfolio_df, mode="append")
UploadCommitResult(version=1, committed_names=[])
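The parallel lists above are easy to get out of step. As a hedged alternative, the same 20 rows can be generated from a nested mapping and checked before upload. The names HOLDINGS, to_long_rows and totals_by_print are made up for this sketch and are not part of the Bayesline API. Whether the value column has to sum to one per portfolio and date is not stated in this tutorial, so the check only reports the totals.

```python
import datetime as dt
from collections import defaultdict

D1, D2 = dt.date(2025, 1, 1), dt.date(2025, 1, 31)

# Hypothetical layout: portfolio -> (asset_id, asset_id_type, value on D1, value on D2)
HOLDINGS = {
    "AGTHX": [("02079K107", "cusip9", 0.5, 0.55), ("2592345", "sedol7", 0.5, 0.45)],
    "FCNTX": [("67066G10", "cusip8", 1.0, 1.0)],
    "VADGX": [
        ("02079K107", "cusip9", 0.3, 0.5),
        ("2592345", "sedol7", 0.4, 0.2),
        ("67066G10", "cusip8", 0.3, 0.25),
    ],
    "SPX": [
        ("02079K107", "cusip9", 0.3, 0.3),
        ("2592345", "sedol7", 0.4, 0.2),
        ("67066G10", "cusip8", 0.3, 0.25),
        ("85371710", "cusip8", 0.3, 0.25),
    ],
}


def to_long_rows(holdings):
    """Expand the nested mapping into one dict per (portfolio, asset, date)."""
    rows = []
    for portfolio_id, positions in holdings.items():
        for asset_id, asset_id_type, value_d1, value_d2 in positions:
            for date, value in ((D1, value_d1), (D2, value_d2)):
                rows.append({
                    "portfolio_id": portfolio_id,
                    "asset_id": asset_id,
                    "asset_id_type": asset_id_type,
                    "date": date,
                    "value": value,
                })
    return rows


def totals_by_print(rows):
    """Sum the value column per (portfolio_id, date)."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["portfolio_id"], row["date"])] += row["value"]
    return dict(totals)


rows = to_long_rows(HOLDINGS)
totals = totals_by_print(rows)
for (portfolio_id, date), total in sorted(totals.items()):
    print(portfolio_id, date, round(total, 2))
```

Passing rows to pl.DataFrame should yield the same frame as portfolio_df. For the demo data the totals are not all equal to one: SPX sums to 1.3 on 2025-01-01 and VADGX to 0.95 on 2025-01-31.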

Hierarchy with only Portfolios#

Next, we create a portfolio hierarchy with only two of the portfolios we uploaded. This is the first use case of a hierarchy: subsetting a larger source of portfolios to only those relevant for a specific analysis.

Note that the portfolio_ids must be unique for a hierarchy to be valid.
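A small pre-flight check can catch repeated ids before the settings object is built. This is a minimal sketch; find_duplicates is a hypothetical helper, not a Bayesline function, and how the library itself reports duplicates is not covered here.

```python
from collections import Counter


def find_duplicates(portfolio_ids):
    """Return the ids that occur more than once, in sorted order."""
    counts = Counter(portfolio_ids)
    return sorted(pid for pid, n in counts.items() if n > 1)


duplicates = find_duplicates(["AGTHX", "SPX", "AGTHX"])
if duplicates:
    print(f"duplicate portfolio ids: {duplicates}")
```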

ph_settings = PortfolioHierarchySettings.from_source(
    source="portfolio-hierarchies-demo",
    portfolio_ids=["AGTHX", "SPX"],
)

ph_api = ph_loader.load(ph_settings)

Below we obtain holdings data for our setup and we find that:

  1. Only our two specified portfolios are present

  2. The values have been forward filled and drift corrected

  3. The benchmark is unset (as specified)

ph_api.get(start_date=dt.date(2025, 1, 1), end_date=dt.date(2025, 1, 31))
shape: (155, 6)
date        portfolio_id  input_asset_id  input_asset_id_type  value  value_bench
date        str           str             str                  f32    f32
2025-01-01  "AGTHX"       "02079K107"     "cusip9"             0.5    null
2025-01-01  "AGTHX"       "2592345"       "sedol7"             0.5    null
2025-01-01  "SPX"         "02079K107"     "cusip9"             0.3    null
2025-01-01  "SPX"         "2592345"       "sedol7"             0.4    null
2025-01-01  "SPX"         "67066G10"      "cusip8"             0.3    null
…           …             …               …                    …      …
2025-01-31  "AGTHX"       "02079K107"     "cusip9"             0.55   null
2025-01-31  "AGTHX"       "2592345"       "sedol7"             0.45   null
2025-01-31  "SPX"         "02079K107"     "cusip9"             0.3    null
2025-01-31  "SPX"         "2592345"       "sedol7"             0.2    null
2025-01-31  "SPX"         "67066G10"      "cusip8"             0.25   null
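The row count reflects the forward fill: five holdings on each of the 31 calendar days in the requested range gives 155 rows, although only two prints were uploaded. The tutorial does not spell out the drift correction itself. A common convention, assumed here, is to grow each weight by its asset return between prints and rescale so the total is unchanged. The functions and the returns below are invented for illustration and are not Bayesline code.

```python
def drift_one_period(weights, returns):
    """Grow each weight by its asset return, then rescale to the original total."""
    total_before = sum(weights.values())
    grown = {asset: w * (1.0 + returns.get(asset, 0.0)) for asset, w in weights.items()}
    total_after = sum(grown.values())
    return {asset: g * total_before / total_after for asset, g in grown.items()}


def forward_fill_with_drift(weights, daily_returns):
    """Return one weight dict per day, starting from the last known print."""
    path = [dict(weights)]
    for returns in daily_returns:
        path.append(drift_one_period(path[-1], returns))
    return path


# AGTHX weights from the 2025-01-01 print; the returns are made up.
start = {"02079K107": 0.5, "2592345": 0.5}
made_up_returns = [
    {"02079K107": 0.10, "2592345": 0.0},  # first asset rallies 10%
    {"02079K107": 0.0, "2592345": 0.0},   # flat day, weights stay put
]
path = forward_fill_with_drift(start, made_up_returns)
for day, weights in enumerate(path):
    print(day, {asset: round(w, 4) for asset, w in weights.items()})
```

After the first day the rallying asset holds 0.55 / 1.05 of the portfolio, and the flat second day leaves the weights unchanged. On a date with its own print, such as 2025-01-31, the uploaded values are shown rather than drifted ones.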

Adding Benchmarks#

We can add a benchmark to our hierarchy, which populates the benchmark column. Benchmarks can be specified selectively: for instance, we could benchmark AGTHX against SPX but leave SPX itself unbenchmarked.

ph_settings = PortfolioHierarchySettings.from_source(
    source="portfolio-hierarchies-demo",
    portfolio_ids=["AGTHX", "SPX"],
    benchmark_ids=["SPX", None],
)
ph_api = ph_loader.load(ph_settings)

Note that the null observations are cases where either the portfolio or the benchmark has a holding but the other does not (in its respective id space).

ph_api.get(start_date=dt.date(2025, 1, 1), end_date=dt.date(2025, 1, 31))
shape: (186, 6)
date        portfolio_id  input_asset_id  input_asset_id_type  value  value_bench
date        str           str             str                  f32    f32
2025-01-01  "AGTHX"       "02079K107"     "cusip9"             0.5    0.3
2025-01-01  "AGTHX"       "2592345"       "sedol7"             0.5    0.4
2025-01-01  "AGTHX"       "67066G10"      "cusip8"             null   0.3
2025-01-01  "SPX"         "02079K107"     "cusip9"             0.3    null
2025-01-01  "SPX"         "2592345"       "sedol7"             0.4    null
…           …             …               …                    …      …
2025-01-31  "AGTHX"       "2592345"       "sedol7"             0.45   0.2
2025-01-31  "AGTHX"       "67066G10"      "cusip8"             null   0.25
2025-01-31  "SPX"         "02079K107"     "cusip9"             0.3    null
2025-01-31  "SPX"         "2592345"       "sedol7"             0.2    null
2025-01-31  "SPX"         "67066G10"      "cusip8"             0.25   null
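The null pattern in this output is what a full outer join of portfolio and benchmark holdings on the asset id would produce. The sketch below reproduces the three AGTHX rows for 2025-01-01 with plain dictionaries, using the SPX values visible in the output. The align function is a hypothetical helper, and the sketch says nothing about how Bayesline implements the alignment internally.

```python
def align(portfolio, benchmark):
    """Full outer join of two {asset_id: value} mappings on asset_id."""
    rows = []
    for asset_id in sorted(set(portfolio) | set(benchmark)):
        # dict.get returns None when one side has no holding in this asset
        rows.append((asset_id, portfolio.get(asset_id), benchmark.get(asset_id)))
    return rows


agthx = {"02079K107": 0.5, "2592345": 0.5}
spx = {"02079K107": 0.3, "2592345": 0.4, "67066G10": 0.3}

aligned = align(agthx, spx)
for asset_id, value, value_bench in aligned:
    print(asset_id, value, value_bench)
```

The third row has None on the portfolio side because AGTHX does not hold 67066G10 while its benchmark does.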

We can pass one of the supported output id types to map ids on the fly.

ph_api.get_id_types()
{'AGTHX': ['bayesid'], 'SPX': ['bayesid']}
ph_api.get(start_date=dt.date(2025, 1, 1), end_date=dt.date(2025, 1, 31), id_type="bayesid")
shape: (186, 8)
date        portfolio_id  input_asset_id  input_asset_id_type  asset_id  asset_id_type  value  value_bench
date        str           str             str                  str       str            f32    f32
2025-01-01  "AGTHX"       "02079K107"     "cusip9"             "GOOG"    "bayesid"      0.5    0.3
2025-01-01  "AGTHX"       "2592345"       "sedol7"             "MSFT"    "bayesid"      0.5    0.4
2025-01-01  "AGTHX"       "67066G10"      "cusip8"             "NVDA"    "bayesid"      null   0.3
2025-01-01  "SPX"         "02079K107"     "cusip9"             "GOOG"    "bayesid"      0.3    null
2025-01-01  "SPX"         "2592345"       "sedol7"             "MSFT"    "bayesid"      0.4    null
…           …             …               …                    …         …              …      …
2025-01-31  "AGTHX"       "2592345"       "sedol7"             "MSFT"    "bayesid"      0.45   0.2
2025-01-31  "AGTHX"       "67066G10"      "cusip8"             "NVDA"    "bayesid"      null   0.25
2025-01-31  "SPX"         "02079K107"     "cusip9"             "GOOG"    "bayesid"      0.3    null
2025-01-31  "SPX"         "2592345"       "sedol7"             "MSFT"    "bayesid"      0.2    null
2025-01-31  "SPX"         "67066G10"      "cusip8"             "NVDA"    "bayesid"      0.25   null

Adding Additional Groupings#

We can add further groupings, which downstream consumers can use to aggregate results. For instance, we might assign portfolios to managers or to a specific investment style.

Below we assign the hypothetical managers Alice, Bob and Charlie. Note that these groupings have no effect on the output of the ph_api.get method, but they are picked up downstream by the Reports API.

ph_settings = PortfolioHierarchySettings.from_source(
    source="portfolio-hierarchies-demo",
    portfolio_ids=["AGTHX", "FCNTX", "VADGX"],
    benchmark_ids=["SPX", "SPX", "VADGX"],
    groupings={"Manager": ["Alice", "Bob", "Charlie"]},
)
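The three lists passed above are read position by position: the first portfolio goes with the first benchmark and the first label of each grouping. The sketch below mimics that pairing in plain Python and rejects lists of unequal length. hierarchy_rows and portfolios_by_group are hypothetical helpers, and whether the library raises on mismatched lengths is an assumption.

```python
def hierarchy_rows(portfolio_ids, benchmark_ids, groupings):
    """Zip parallel lists into one row per portfolio; lengths must match."""
    n = len(portfolio_ids)
    if len(benchmark_ids) != n:
        raise ValueError("benchmark_ids must have one entry per portfolio")
    for name, labels in groupings.items():
        if len(labels) != n:
            raise ValueError(f"grouping {name!r} must have one label per portfolio")
    rows = []
    for i, portfolio_id in enumerate(portfolio_ids):
        row = {name: labels[i] for name, labels in groupings.items()}
        row["portfolio_id"] = portfolio_id
        row["benchmark_id"] = benchmark_ids[i]
        rows.append(row)
    return rows


def portfolios_by_group(rows, grouping):
    """Collect portfolio ids under each label of one grouping."""
    result = {}
    for row in rows:
        result.setdefault(row[grouping], []).append(row["portfolio_id"])
    return result


demo_rows = hierarchy_rows(
    portfolio_ids=["AGTHX", "FCNTX", "VADGX"],
    benchmark_ids=["SPX", "SPX", "VADGX"],
    groupings={"Manager": ["Alice", "Bob", "Charlie"]},
)
by_manager = portfolios_by_group(demo_rows, "Manager")
print(by_manager)
```

A downstream report could use a mapping like by_manager to roll portfolio results up to the manager level.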


ph_settings.to_polars()
shape: (3, 3)
Manager    portfolio_id  benchmark_id
str        str           str
"Alice"    "AGTHX"       "SPX"
"Bob"      "FCNTX"       "SPX"
"Charlie"  "VADGX"       "VADGX"

Creating from Polars#

As a shorthand, it can be more convenient to create a hierarchy from a data frame instead of specifying the Pydantic object manually.

ph_settings_df = ph_settings.to_polars()

ph_settings_df
shape: (3, 3)
Manager    portfolio_id  benchmark_id
str        str           str
"Alice"    "AGTHX"       "SPX"
"Bob"      "FCNTX"       "SPX"
"Charlie"  "VADGX"       "VADGX"
ph_settings = PortfolioHierarchySettings.from_polars(
    ph_settings_df, 
    portfolio_source="portfolio-hierarchies-demo"
)
ph_api = ph_loader.load(ph_settings)

ph_api.get(start_date=dt.date(2025, 1, 1), end_date=dt.date(2025, 1, 31))
shape: (279, 6)
date        portfolio_id  input_asset_id  input_asset_id_type  value  value_bench
date        str           str             str                  f32    f32
2025-01-01  "AGTHX"       "02079K107"     "cusip9"             0.5    0.3
2025-01-01  "AGTHX"       "2592345"       "sedol7"             0.5    0.4
2025-01-01  "AGTHX"       "67066G10"      "cusip8"             null   0.3
2025-01-01  "FCNTX"       "02079K107"     "cusip9"             null   0.3
2025-01-01  "FCNTX"       "2592345"       "sedol7"             null   0.4
…           …             …               …                    …      …
2025-01-31  "FCNTX"       "2592345"       "sedol7"             null   0.2
2025-01-31  "FCNTX"       "67066G10"      "cusip8"             1.0    0.25
2025-01-31  "VADGX"       "02079K107"     "cusip9"             0.5    0.5
2025-01-31  "VADGX"       "2592345"       "sedol7"             0.2    0.2
2025-01-31  "VADGX"       "67066G10"      "cusip8"             0.25   0.25