[BUG] .ultimate_ does not match the final period of .full_triangle_ #532

Open
tobycook97 opened this issue Jun 25, 2024 · 2 comments

@tobycook97

tobycook97 commented Jun 25, 2024

Describe the bug
Calling .ultimate_ on a fitted Chainladder model, I'd expect the figures to match the final column of .full_triangle_. However, they do not. This seems to occur when parts of the triangle have missing values.

To Reproduce

import chainladder as cl

tri = cl.load_sample('clrd')

dev = cl.Development().fit(tri)

dev = dev.transform(tri)

model = cl.Chainladder().fit(dev) 

ult = model.ultimate_
full_tri = model.full_triangle_

print(ult[(ult.index['GRNAME'] == 'Adriatic Ins Co') & (ult.index['LOB'] == 'othliab') ]['IncurLoss']) # as an example 

print(full_tri[(full_tri.index['GRNAME'] == 'Adriatic Ins Co') & (full_tri.index['LOB'] == 'othliab') ]['IncurLoss'])

Expected behavior
Latest diagonals of both full_triangle and ultimate match

assert full_tri.latest_diagonal == ult.latest_diagonal

Desktop (please complete the following information):

  • Numpy == 1.26.4
  • Pandas == 1.4.2
  • Chainladder == 0.8.23
@jbogaardt
Collaborator

Hi @tobycook97, thanks for the report. It's very easy to follow, and we will resolve it in the next release. I think we need to investigate why the full_triangle_ property is bugging out on this one; it is possibly related to the very sparsely populated LDFs.

@tobycook97
Author

tobycook97 commented Jul 2, 2024

Thanks @jbogaardt, appreciate it.

In case it's useful to anyone, I found the problem arises where the first origin periods of the triangle are null. My workaround is to remove these (and to remove the corresponding development periods).

def remove_na_from_triangle(tri: cl.Triangle, column: str = 'IncurLoss') -> cl.Triangle:
    """
    Remove the leading origin periods that are entirely NA (those not preceded
    by any non-null origin period), along with the development periods whose
    sum is NA.
    """
    # Sum across development periods (axis=3) to get one value per origin, as a DataFrame.
    # The forward fill means only the leading null origins stay null; null origins in
    # the middle years are kept. (I'm sure there is a better way.)
    origins_dataframe = tri[column].sum(axis=3).to_frame().ffill()

    non_na_origins = origins_dataframe[origins_dataframe.iloc[:, 0].notna()].index.to_period(freq='Y')

    # Sum across origin periods (axis=2) to get one value per development period.
    devs_dataframe = tri[column].sum(axis=2).to_frame()
    na_devs = devs_dataframe.columns[devs_dataframe.iloc[0].isna()]

    tri = tri[tri.origin.isin(non_na_origins)]  # keep only the non-null origins
    tri = tri[~tri.development.isin(na_devs)]  # drop the empty development periods

    return tri

The drawback is that if you want to fit each triangle in the index individually, you will need to iterate over the index and remove the NA values from each one separately. (However, this is unlikely to be needed, as you will probably want to build development patterns at a higher level than the one shown in the example above.)
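
To make the trimming logic easier to follow without chainladder installed, here is a minimal plain-Python sketch of the same idea. It assumes a triangle stored as a list of origin rows, with None for missing cells. The function name trim_leading_empty_origins and the sample figures are hypothetical, and this is not chainladder code.

```python
from typing import List, Optional

Row = List[Optional[float]]


def trim_leading_empty_origins(rows: List[Row]) -> List[Row]:
    """Drop leading all-empty origin rows, then any all-empty development columns.

    Hypothetical helper for illustration only; it mirrors the idea of the
    workaround above on plain lists rather than on a cl.Triangle.
    """
    def is_empty(cells) -> bool:
        return all(cell is None for cell in cells)

    # Skip only the leading empty origins; empty origins in the middle are kept.
    first = 0
    while first < len(rows) and is_empty(rows[first]):
        first += 1
    kept = rows[first:]
    if not kept:
        return []

    # Keep a development column only if at least one remaining origin has data in it.
    n_dev = len(kept[0])
    keep_cols = [j for j in range(n_dev) if not is_empty(row[j] for row in kept)]
    return [[row[j] for j in keep_cols] for row in kept]


# Made-up figures: the first origin is entirely missing, so the last
# development column ends up with no observations either.
triangle = [
    [None, None, None, None],
    [10.0, 15.0, 18.0, None],
    [12.0, 17.0, None, None],
    [11.0, None, None, None],
]
trimmed = trim_leading_empty_origins(triangle)
```

Here the empty first origin is dropped, and the fourth development period goes with it because no remaining origin has reached that age. An empty origin in the middle of the triangle would be left in place, matching the intent of the forward fill in the workaround.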
