[API] Redesign towards pytorch-forecasting 2.0 #1736

fkiraly · 2024-12-20T21:14:58Z

Discussion thread for API re-design for pytorch.forecasting next 1.X and towards 2.0. Comments appreciated from everyone!

Link to enhancemeng proposal: sktime/enhancement-proposals#39

Context and goals

High-level directions:

the DSIPTS project (https://github.com/DSIP-FBK/DSIPTS) be used as an experimental branch for pytorch-forecasting 2.0. We will need to homogenize interfaces, consolidate design ideas, and ensure downwards compatibility.
further inspiration can be taken from the thuml project, also see [ENH] neural network libraries in thuml time-series-library sktime#7243.
@agobbifbk and a team at FBK with substantial time on this will drive this together with @fkiraly and developers at sktime.

High-level features for 2.0 with MoSCoW analysis:

M: unified model API which is easily extensible and composable, similar to sktime and DSIPTS, but as closely to the pytorch level as possible. The API need not cover forecasters in general, only torch based forecasters.
- M: unified monitoring and logging API, also see [API] redesign of logging and monitoring for 2.0 #1700
- M: extension templates need to be created
- S: skbase can be used to curate the forecasters as records, with tags, etc
- S: model persistence
- C: third party extension patterns, so new models can "live" in other repositories or packages, for instance thuml
M: reworked and unified data input API
- M: support static variables and categoricals
- S: support for multiple data input locations and formats - pandas, polars, distributed solutions etc
M: MLops and benchmarking features as in DSIPTS
S: support for pre-training, model hubs, foundation models, but this could be post-2.0

Meeting notes

Summary of discussion on Dec 20, 2024 and prior

FYI @agobbifbk, @thawn, @sktime/core-developers.

High-level directions:

the DSIPTS project (https://github.com/DSIP-FBK/DSIPTS) be used as an experimental branch for pytorch-forecasting 2.0. We will need to homogenize interfaces, consolidate design ideas, and ensure downwards compatibility.
further inspiration can be taken from the thuml project, also see [ENH] neural network libraries in thuml time-series-library sktime#7243.
@agobbifbk and a team at FBK with substantial time on this will drive this together with @fkiraly and developers at sktime.

High-level features for 2.0 with MoSCoW analysis:

M: unified model API which is easily extensible and composable, similar to sktime and DSIPTS, but as closely to the pytorch level as possible. The API need not cover forecasters in general, only torch based forecasters.
- M: unified monitoring and logging API, also see [API] redesign of logging and monitoring for 2.0 #1700
- M: extension templates need to be created
- S: skbase can be used to curate the forecasters as records, with tags, etc
- S: model persistence
- C: third party extension patterns, so new models can "live" in other repositories or packages, for instance thuml
M: reworked and unified data input API
- M: support static variables and categoricals
- S: support for multiple data input locations and formats - pandas, polars, distributed solutions etc
M: MLops and benchmarking features as in DSIPTS
S: support for pre-training, model hubs, foundation models, but this could be post-2.0

Todos:
0. update documentation on dsipts to signpost the above. README etc.

highest priority - consolidated API design for model and data layer.
- Depending on distance to current ptf and dsipts, use one or the other location for improvements (separate 2.0 -> dsipts, 1.X -> ptf as current).
- ptf = stable and downwards compatible; dsipts = "playground"
- first step for that: side-by-side comparisons of code, defined core workflows
planning sessions & sprints from Jan 2025

Roadmap planning Jan 15, 2025

👋 Attendees

[name=fill this in]
[name=Andrea Gobbi]
[name=Aryan Saini]
[name=Franz Kiraly]
[name=Benedikt Heidrich]
[name=Sandeep Kumar]
[name=Felix Hirwa Nshuti]

Prioritization

👍👍👍👍👍👍👍👍👍👍👍👍👍
✔️✔️✔️
💬 💬 💬

data layer - dataset, dataloader 👍👍👍👍👍👍👍 💬 ✔️✔️✔️

dataset and dataloader API consolidation

model layer - base classes, configs, unified API 👍👍👍👍 ✔️

A more refined base classes maybe and proper documentation of them
tests: input-output shapes of the bathches
operationalization of the models (inference must be a clear process/output)
think about how model config are stored. e.g. versioning so that we know how to load model weights.
easy interface for adding new architectures

foundation models, model hubs 👍 👍 👍

To think how to handle pre-trained models in ptf and how to interface them?
integration with model hubs (hf/kaggle/...)
[IDEA] looking to Chronos-Bolt and integrate into the repo.

documentation👍✔️

adding tutorials and examples for the new users
proper documentation of base classes

benchmarking 👍 👍💬

easy way to use benchmark datasets
reproducibility, scalability, concept of experiment (same dataset, different models/parameters to be compared )

mlops and scaling (distributed, cluster etc)👍 👍

operationalization of the models (inference)
hooks for: slurm cluster, multiprocess, OPTUNA
think about how model config are stored. e.g. versioning so that we know how to load model weights.

more learning tasks supported

[IDEA] continuous learning / active learning
[IDEA] easy to convert DL architectures from regression to classification

Tech meeting Jan 20, 2025

Attendees:

Andrea
Felix
Franz
Pranav B
Sandeep
Till

Agenda

discussing agenda
number of classes, dataset, dataloader, "bottleneck" idea
__getitem__ output convention
__init__ input convention(s)
handling large data sets, pandas vs polars

References

Umbrella issue design
#1736

Notes

number of classes, dataset, dataloader, "bottleneck" idea

AG: should be making it as modular as possible

avoid "god class" anti-pattern (like current ptf TimeSeriesDataSet that does everything)
so, more than one?
thinks there are two different ways to implement dataloader, with respect of where we compute slices when re-sampling
- option 1: pre-compute in __init__ - more memory intensive, clear distinction between train and inference; naive implementation needs to load everything in memory
- option 2: during sample generation, e.g., compute at __getitem__ time (dataset or dataloader) - feels this might be compute intensive, if we are recomputing and not caching etc
- need a decision?
thinks output of __getitem__ should be as general as possible
- but should keep the number of dictionary keys as small as possible, not overdesign or proliferate
- most architectures so far are more or less the same
- important input to this discussion: is input to the foundation models similar to classical models? Or do we need different dataloaders, even different datasets perhaps?

FHN:

prefers lazy computation at __getitem__ time, option 2.
- need to ensure ptf can be used by people with large data!
agrees with most points of AG otherwise

PB:

concerned about difference in datasets, dataloaders for foundation models
- this question needs to be resolved before adopting an input/output API

S:

need to discuss support for data files on the hard drive
- list of datasets seen:
in sktime we are using timesfm library
- timesfm is by google
- maybe we can use or interface it
we could have two kind of output, point based vs interval or range prediction
- in inferce mode
- we should ensure we can cover both

T:

mostly agree with AG on important points
additional idea on in-memory vs lazy loading
- two data classes, one in-memory, one on-disk, with same __getitem__ protocol
- can use on-disk for trianing, in-memory for inference
- ideally one class inherits from the other (or have a common ancestor)

FK:

idea of "bottleneck" or "least common denominator" did not come up, surprised (came up before)
think we need at least one class, likely a DataSet for "raw time series" (collection of, with all metadata)
- this can be used as a component of another dataset or data loader
- conceptual model would be "minimal information that is required to specify abstract data model", of collection of time series
  - alternatively, minimal for all deep learning forecasters ever seen somewhere on GitHub (which could be less minimal)
Benedikt (not here today) also suggested this idea, and that DataSet-s could depend on each other
current best guess for a structure:
- "minimal" layer DataSet-s, these inherit from a common base and handle pandas as well as hard drive data
- "specialized" DataSet-s, these could add re-sampling on top, normalization etc
- "specialized" DataLoader-s, these are specific to data sets and classes of neural networks
alternative structure
- DataSet-s only have minimal representation of "time series"
  - common base class and children for pandas/in-memory vs hard drive
- everything else is done by "specialized" DataLoader-s that adapt data sets to neural networks
T: one of the "final layer" classes - or middle layer classes - could be adapter to 1.X API of pytorch-forecasting (current), ensuring downwards compatibility.
FK: big question for me is how many "layers" to have, e.g., two dataset layers and one data loader layer, or single dataset layer and one data loader layer (where data loaders do more).
- T: this will depend on the model, and whether one wants 1.X compatibility, here one would need two data sets
- FK: true for 1.X compatibility, but what about long term - do we need the second dataset?
- T: thinking - 1.X API will not understand dataset directly, so will need to write adapter code.
  - FK: does not follow that we need two layers of dataset - because we could have dataloader that adapts the "lower layer" dataset directly to the loader consumed by the current ptf 1.X API
    - so there could be a dataloader that consumes the "minimal" dataset, and produces the current dataloader interface
T: had assumed we will use standard pytorch dataloader - if that is the case, we will need two datasets for downwards compatibility.
- FK: should we or not rewrite data loaders?
- T: as long as we adhere to torch API, it is fine to write custom loaders; there is no a-priori reason to not write our own
  - we also have the "option 1 vs option 2" problem (precompute vs lazy) that AG mentioned in this case

FHN: if we keep using vanilla torch dataloader, we need two data set layers
* is this a contradiciton to the dataset to be "minimal"?
* FK: thinks not a contradiction, since there are two layers of datasets
* lower layer is "minimal" as discussed
* 2nd layer is specialized and specific to neural network(s)

T: would make 2nd layer optional, not all models will need it
- FK: agree, have the feeling that foundation models do not need it (or need another, different, possibly simpler layer)

FK: feels there is convergence but with two open questions:

"precompute or lazy" for resampling ("option 1 and 2")
custom data loaders, or not (with implications on lower layers)
- if custom, then only one dataset layer needed
- if vanilla, then two dataset layers needed

(__getitem__ format to be handled in next agenda point)

AG: there is one more complication - "stacked models", which are composites that use other models and their outputs to generate improved outputs
- possible in dsipts but not well-architected (yet)
- impacts dataset/dataloader because this may require holding data in memory - option 1 vs 2 discussion
- this is more of a general point, perhaps about models - implications on dataset/dataloaders are not clear, but perhaps relevant
FK - we could have both options with a flag or two classes, this is really about internals of the class and does not impact
- number of classes
- interfaces
- it is an orthogonal question
- T: prefers having different classes if we have "option 1 classes" vs "option 2 classes"
  - in-memory class should be optional adapter, does ont change getitem output
T: commenting about "stacked models"
- sounds very useful if model output can easily act as input!
  - FK: like in sktime, this makes composition easy (and possible at all!)
- specifically, it should be possible to plug inference outputs into dataloader input locations!

strong opinions on using vanilla dataloader vs two dataset layers, vs custom dataloader and one dataset

T: community standard seems to be vanilla dataloader, would have slight preference due to that
FHN: good to have custom dataloader, but need to sort out in-memory vs lazy issue, and because there is less dataset layers.
- prefers vanilla dataloader, because familiar with how toe use by default
AG: weak preference for vanilla dataloader, same in dsipts (status quo)
- FHN: if we need caching during training, can do via custom dataloader
S: thinks vanilla is much better, already fairly optimized by community
- extra features can use customization, e.g., dynamic batches
FK: Benedikt prefers (moderately stronlgy) the "two dataset layer" design, I infer this from GitHub converstaion on the linked issue

`getitem` output convention

FHN: unsure
T: "as simple as possible"
- suggest to use dict and arrays (tensors etc) inside
  - should not be much more complicated
S: do we have a clear picture of what should be there?
- FK: we have decided on two dataset layers, so need to answer this for two layers
  - lower "minimal" layer is probably key
  - intermediate layer implied by NN architecture
  - "minimal" layer ADT is? already unclear
    - single time series?
    - chunk of time series?
    - out of a collection of?
    - metadata?
T: would prefer pure tensors
- just a single tensor!
- FK: what would this encode?
  - batch/time/channels
  - that's it?
- AG: thinks this is too restrictive for a set of models
  - length between input and output can be different
  - context of 20 in past, 10 in future
  - FK: this is an argument for the middle layer, not the "minimal layer"
  - AG: question, what does you mean with layers
    - "minimal layer" means autoregressive?
    - FK: "minimal layer" was thinking the most simple possible dataset that is a dataest for all DL models
      - was thinking more conretely, corresponds to abstract data type that actual datasets have, e.g., collection of time series of potentially different length
        
        possibly with metadata
- main reasoning, eventually it has to be tensors! That is the minimal encoding
  - FK: what about categoricals?
    - T: categoricals would be channels
    - T: which is which is metadata!
      - FK: would imply metadata tracking "already encoded", and "not yet encoded"

Tech meeting Jan 24, 2025

Attendees:

Andrea
Aryan
Franz
Jigyasu
Pranav B
Satvik
Sandeep
whatdoes12

Notes

discussion of input
- need adaptation for distirbuted inputs or hard drive
- FK: can be dealt with by varying inputs for same output

Recap

need to define dataset/dataloader layers
- last time, strong opinion for dataloader = vanilla, two dataset layers
- two "sides" need to be discussed - "input API", __init__, and "output API", __getitem__
FK: suggest to focus on on output first
- input can vary - so we can adapt pandas and zarr input etc

`getitem` designs based on last times

AG design suggestion

* two DataSet (__getitem__):
    * simple: x_num: [length x channels]
    *

Current DSIPTS str

class MyDataset(Dataset):

    def __init__(self, data:dict,t:np.array,groups:np.array,idx_target:Union[np.array,None],idx_target_future:Union[np.array,None])->torch.utils.data.Dataset:
        """
            Extension of Dataset class. While training the returned item is a batch containing the standard keys

        Args:
            data (dict): a dictionary. Each key is a np.array containing the data. The keys are:
                y : the target variable(s)
                x_num_past: the numerical past variables
                x_num_future: the numerical future variables
                x_cat_past: the categorical past variables
                x_cat_future: the categorical future variables
                idx_target: index of target features in the past array
            t (np.array): the time array related to the target variables
            idx_target (Union[np.array,None]): you can specify the index in the past data that represent the input features (for differntial analysis or detrending strategies)
            idx_target_future (Union[np.array,None]): you can specify the index in the future data that represent the input features (for differntial analysis or detrending strategies)

        Returns:
            torch.utils.data.Dataset: a torch Dataset to be used in a Dataloader
        """
        
    def __getitem__(self, idxs):
        
        sample = {}
        for k in self.data:
            sample[k] = self.data[k][idxs]
        if self.idx_target is not None:
            sample['idx_target'] = self.idx_target
        if self.idx_target_future is not None:
            sample['idx_target_future'] = self.idx_target_future
        return sample

    """
    Sampling via ``__getitem__`` returns a dictionary,
    which always has following str-keyed entries:

    y : (n_timepoints_future, n_targets)
    x_num_past : (n_timepoints_past, n_targets + n_past_covariates_numerical)
    x_num_future : (n_timepoints_future, n_future_numerical)
    x_cat_past : (n_timepoints_past, n_past_covariates_categorical)
    x_cat_future: (n_timepoints_future, n_future_covariates_categorical)
    idx_target : list containing the column indexes of x_num_past corresponding to y

    dsipts neural networks currently do not use t, so it is not passed!
    """

    """ The input to `__init__` expects a dictionary:
    
    y : (n_samples, n_timepoints_future, n_targets)
    x_num_past : (n_samples, n_timepoints_past, n_targets + n_past_covariates_numerical)
    x_num_future : (n_samples, n_timepoints_future, n_future_numerical)
    x_cat_past : (n_samples, n_timepoints_past, n_past_covariates_categorical)
    x_cat_future: (n_samples, n_timepoints_future, n_future_covariates_categorical)
    t : 
    idx_target : list containing the column indexes of x_num_past corresponding to y

FK comments:
this looks like the top layer. It is closer to the "raw" or "bottleneck" layer, but it already has the data resampled.

The "sample" index is the first index in the input to __init__.

FK opinion: the resampling should be part of a pipeline to prepare a data loader.

So we have different artefacts

A raw data - y and x have not been split, resampled, etc
B resampled data - input to dsipts DataSet. Obtained from raw data via resampling/normalization utility
C DataLoader using the output of __getitem__

observation:

pytorch-forecasting covers A-C in single DataSet

DSIPTS covers B-C in single DataSet, and A-B in utilities

FK: last time, agreed we should have two layers DataSet
but none of current solutions has the "bottleneck" layer

ptf should take DataSet instead of DataFrame
- this would just convert df into the "bottleneck" format
DSIPTS has A-B outside torch idiomatic structures
- FK not sure how to address this best
- option: putting preproc into DataSet as current
  - perhaps not a good idea as it adds features to input
- option: one more DataSet between point A and B?
  - if we start at "bottleneck", we enable harddrive, zarr, etc, because that will have a conversion layer
alternatively, we could have a custom class handle conversions up to the dataloader format, or the input required for the dataset closest to the model

FK design suggestion

class TimeSeries(Dataset):
    """PyTorch Dataset for storing raw time series from a pandas DataFrame.

    This dataset follows the base raw time series dataset API in pytorch-forecasting.
    A single sample corresponds to the i-th time series instance in the dataset.

    Sampling via ``__getitem__`` returns a dictionary,
    which always has following str-keyed entries:

    * t: tensor of shape (n_timepoints)
      Time index for each time point in the past or present. Aligned with ``y``,
      and ``x`` not ending in ``f``.
    * y: tensor of shape (n_timepoints, n_targets)
      Target values for each time point. Rows are time points, aligned with ``t``.
      Columns are targets, aligned with ``col_t``.
    * x: tensor of shape (n_timepoints, n_features)
      Features for each time point. Rows are time points, aligned with ``t``.
    * group: tensor of shape (n_groups)
      Group ids for time series instance.
    * st: tensor of shape (n_static_features)
      Static features.

    * y_cols: list of str of length (n_targets)
      Names of columns of ``y``, in same order as columns in ``y``.
    * x_cols: list of str of length (n_features)
      Names of columns of ``x``, in same order as columns in ``x``.
    * st_cols: list of str of length (n_static_features)
      Names of entries of ``st``, in same order as entries in ``st``.
    * y_types: list of str of length (n_targets)
      Types of columns of ``y``, in same order as columns in ``y``.
      Types can be "c" for categorical, "n" for numerical.
    * x_types: list of str of length (n_features)
      Types of columns of ``x``, in same order as columns in ``x``.
      Types can be "c" for categorical, "n" for numerical.
    * st_types: list of str of length (n_static_features)
      Types of entries of ``st``, in same order as entries in ``st``.
    * x_k: list of int of length (n_features)
      Whether the feature is known in the future, encoded by 0 or 1,
      in same order as columns in ``x``.
      0 means the feature is not known in the future, 1 means it is known.

    Optionally, the following str-keyed entries can be included:
    * t_f: tensor of shape (n_timepoints_future)
      Time index for each time point in the future.
      Aligned with ``x_f``.
    * x_f: tensor of shape (n_timepoints_future, n_features)
      Known features for each time point in the future.
      Rows are time points, aligned with ``t_f``.
    * weight: tensor of shape (n_timepoints), only if weight is not None
    * weight_f: tensor of shape (n_timepoints_future), only if weight is not None

    Parameters
    ----------
    data : pd.DataFrame
        data frame with sequence data.
        Column names must all be str, and contain str as referred to below.
    data_future : pd.DataFrame, optional, default=None
        data rame with future data.
        Column names must all be str, and contain str as referred to below.
        May contain only columns that are in time, group, weight, known, or static.
    time : str, optional, default = first col not in group_ids, weight, target, static.
        integer typed column denoting the time index within ``data``.
        This columns is used to determine the sequence of samples.
        If there are no missings observations,
        the time index should increase by ``+1`` for each subsequent sample.
        The first time_idx for each series does not necessarily
        have to be ``0`` but any value is allowed.
    target : str or List[str], optional, default = last column (at iloc -1)
        column(s) in ``data`` denoting the forecasting target.
        Can be categorical or numerical dtype.
    group : List[str], optional, default = None
        list of column names identifying a time series instance within ``data``.
        This means that the ``group`` together uniquely identify an instance,
        and ``group`` together with ``time`` uniquely identify a single observation
        within a time series instance.
        If ``None``, the dataset is assumed to be a single time series.
    weight : str, optional, default=None
        column name for weights.
        If ``None``, it is assumed that there is no weight column.
    num : list of str, optional, default = all columns with dtype in "fi"
        list of numerical variables in ``data``,
        list may also contain list of str, which are then grouped together.
    cat : list of str, optional, default = all columns with dtype in "Obc"
        list of categorical variables in ``data``,
        list may also contain list of str, which are then grouped together
        (e.g. useful for product categories).
    known : list of str, optional, default = all variables
        list of variables that change over time and are known in the future,
        list may also contain list of str, which are then grouped together
        (e.g. useful for special days or promotion categories).
    unknown : list of str, optional, default = no variables
        list of variables that are not known in the future,
        list may also contain list of str, which are then grouped together
        (e.g. useful for weather categories).
    static : list of str, optional, default = all variables not in known, unknown
        list of variables that do not change over time,
        list may also contain list of str, which are then grouped together.
    """

The text was updated successfully, but these errors were encountered:

fkiraly · 2025-01-01T00:20:30Z

Having reviewed multiple code bases - pytorch-forecasting, DSIPTS, neuralforecast, thuml, I have come to understand that the DataLoader and DataSet conventions are key, in particular the input convention for forward. Interestingly, all the above-mentioned packages have different conventions here, and none seems satisfactory. What is probably most promising is a "merge" of pytorch-forecasting and DSIPTS.

The model layer will mostly follow the data layer, given that torch has an idiosyncratic forward interface.

My suggestions for high-level requirements on data loaders:

easy to use in pure torch, detour via pandas can be avoided (this is currently possible but not easy)
support for future-known and unknown, endo- and exogenous, group ID and static variables
if possible, downwards compatibility to pytorch-forecasting

Observations of the current API:

neither package spells out the forward API properly, or has checking utilities for the containers.
pytorch-forecasting seems to do a resampling for a decoder/encoder structure already in the data loader - this may not be necessary for all models
DSIPTS is closer to the abstract data type, but lacks support for static variables

The explicit specifications can be reconstructed from usage and docstrings, for convenience listed below:

`pytorch-forecasting`

From the docstring of TimeSeriesDataset.to_dataloader:

DataLoader: dataloader that returns Tuple.
    First entry is ``x``, a dictionary of tensors with the entries (and shapes in brackets)

    * encoder_cat (batch_size x n_encoder_time_steps x n_features): long tensor of encoded
        categoricals for encoder
    * encoder_cont (batch_size x n_encoder_time_steps x n_features): float tensor of scaled continuous
        variables for encoder
    * encoder_target (batch_size x n_encoder_time_steps or list thereof with each entry for a different
        target):
        float tensor with unscaled continous target or encoded categorical target,
        list of tensors for multiple targets
    * encoder_lengths (batch_size): long tensor with lengths of the encoder time series. No entry will
        be greater than n_encoder_time_steps
    * decoder_cat (batch_size x n_decoder_time_steps x n_features): long tensor of encoded
        categoricals for decoder
    * decoder_cont (batch_size x n_decoder_time_steps x n_features): float tensor of scaled continuous
        variables for decoder
    * decoder_target (batch_size x n_decoder_time_steps or list thereof with each entry for a different
        target):
        float tensor with unscaled continous target or encoded categorical target for decoder
        - this corresponds to first entry of ``y``, list of tensors for multiple targets
    * decoder_lengths (batch_size): long tensor with lengths of the decoder time series. No entry will
        be greater than n_decoder_time_steps
    * group_ids (batch_size x number_of_ids): encoded group ids that identify a time series in the dataset
    * target_scale (batch_size x scale_size or list thereof with each entry for a different target):
        parameters used to normalize the target.
        Typically these are mean and standard deviation. Is list of tensors for multiple targets.


    Second entry is ``y``, a tuple of the form (``target``, `weight`)

    * target (batch_size x n_decoder_time_steps or list thereof with each entry for a different target):
        unscaled (continuous) or encoded (categories) targets, list of tensors for multiple targets
    * weight (None or batch_size x n_decoder_time_steps): weight

There is a custom collate_fn, it is (oddly?) stored in the TimeSeriesDataSet._collate_fm as a static method, which is then passed to the data loader.

DSIPTS

Specifies a simpler structure that is closer ot the abstract data type of the time series data - and therefore imo better.

The data loader needs to return batches as follows, from the docstring of Base.forward:

            batch (dict): the batch structure. The keys are:
                y : the target variable(s). This is always present
                x_num_past: the numerical past variables. This is always present
                x_num_future: the numerical future variables
                x_cat_past: the categorical past variables
                x_cat_future: the categorical future variables
                idx_target: index of target features in the past array

This is missing group ID or static variables, but imo is closer to the end state where we want to go.

Ensuring downwards compatibility

Downwards compatibility can be ensured by:

providing converter functions between the two types of batches. This can be achieved with aadditional decoder/encoder layers, or a DataLoader depending on another DataLoader.
neural networks being tagged with input assumptions on forward. This is probably a good idea in general as well.

Also, currently none of the libraries seems to have stringent tests for the API - we should probably introduce these. scikit-base can be used to draw these up quickly.

fkiraly · 2025-01-01T00:26:42Z

question for @jdb78 - why did you design the forward API with encoder/decoder specific variables? Personally, I consider this a modelling choice, since not every deep learning forecaster is encoder/decoder based.

Side note: one possible design is to have data loaders that are specific to neural networks, facing a more general API

geetu040 · 2025-01-01T08:22:53Z

question for @jdb78 - why did you design the forward API with encoder/decoder specific variables? Personally, I consider this a modelling choice, since not every deep learning forecaster is encoder/decoder based.

forward method is in the API design of torch.nn.Module, which is the base class for every layer and model in pytorch. So I would say its not encoder/decoder based or a modelling choice in just this specific context.

fkiraly · 2025-01-01T10:12:38Z

@geetu040, what I mean is the format of the x in forward, not the choice of forward itself (which indeed is fixed by torch). This can be an arbitrarily nested structure of dict and tuple, with leaf entries being tensors. The convention on the exact structure of x is up to the user, and this is where a core part of the API definition is "hidden" - all listed packages differ in their choices for the type of x that needs to be passed.

So, for pytorch-forecasting, the choice of having decoder/encoder related fields is indeed a choice.

Sohaib-Ahmed21 · 2025-01-02T10:39:00Z

@fkiraly what are your reviews on model initialization in pytorch_forecasting from from_dataset class method in model as other packages initialize model from the init method.

jdb78 · 2025-01-02T12:50:59Z

The idea is that basically all models can be represented as encoder/decoder. In some cases they are the same.

fkiraly · 2025-01-02T14:56:22Z

The idea is that basically all models can be represented as encoder/decoder. In some cases they are the same.

Is that really true though for all models out there? And do we need this as forward args at the top level - as opposed to inside a layer?

See for instance Amazon Chronos:
https://github.com/amazon-science/chronos-forecasting/blob/main/src/chronos/chronos.py

or Google TimesFM:
https://github.com/google-research/timesfm/blob/master/src/timesfm/pytorch_patched_decoder.py

What I think we need for 2.0 is an API design that can cover all torch-based forecasting models.

fkiraly · 2025-01-02T15:10:23Z

@fkiraly what are your reviews on model initialization in pytorch_forecasting from from_dataset class method in model as other packages initialize model from the init method.

I think there is no serious problem with that as ultimately it calls __init__ which in turn calls the hooks. I have three main feelings here:

positive: I think it is a smart idea, since many parameters will be the same for multiple models, given one dataset
negative: it complicates the interface, since we are passing information about the model and the dataset in many places.
question: I would really like @jdb78's thoughts on why/how you decided to put which parameters where - in TimeSeriesDataSet args, the model __init__, or the forward args, e.g., allowed_encoder_known_variable_names

Sohaib-Ahmed21 · 2025-01-02T15:38:14Z

I think there is no serious problem with that as ultimately it calls __init__ which in turn calls the hooks. I have three main feelings here:

positive: I think it is a smart idea, since many parameters will be the same for multiple models, given one dataset

negative: it complicates the interface, since we are passing information about the model and the dataset in many places.

Yup, the interface complication is the main concerning thing considering use-cases like those involving single model. But yes, the positive and negative sides need cost-benefit analysis to decide final.

fkiraly · 2025-01-03T21:29:15Z

Some further thoughts about the design:

I think there should be a DataSet that provides the time series raw, without any transformation or resampling.
- optimally, this will be decoupled from pandas, using pandas only as one possible source.
- this should be more similar to DSIPTS
I think we should the idiomatic follow DataSet vs DataLoader separation clearly - the usual separation is DataSet = sample-level operations, loading; DataLoader = shuffling, batching, etc.
- the current pytorch-forecasting does not follow this separation! The data loader just copies mostly what the DataSet does, introducing high coupling between the layers that should be separated.

I would hence suggest, on the pytorch-forecasting side, a refactor that introduces a clear layer separation, but leaves the current interfaces intact until 2.0:

introduction of a DataSet subclass C similar to DSIPTS, close to the data. This can be subclassed for non-memory data sources
- optionally, there can be subclasses that take sktime data types as lazy arguments. This would greatly facilitate interfacing.
introduction of a SlidingDataLoader that unifies the current logic in TimeSeriesDataSet.__getitem__, TimeSeriesDataSet._construct_index and the dataloader returned by TimeSeriesDataSet.to_dataloader. This DataLoader would take C as argument, and the parameters used in the above, and return the same batches as the current TimeSeriesDataSet.
- for downwards compatibility, it can also take a TimeSeriesDataSet - this is polymorphism, just to ensure downwards compatibility.
on the model side, we design the API that each model comes with its own loader - or loaders. There is a default loader for each model - current pytorch-forecasting moels all point to the loader implied by TimeSeriesDataSet.
- optionally, we could introduce a composite class closer to sktime, which consists of a loader and a model - one for each model.

fkiraly · 2025-01-03T21:39:22Z

@geetu040, @benHeid, @jdb78, I would appreciate your thoughts and opinions!

benHeid · 2025-01-05T13:31:49Z

I would like to the following topics to the discussion of redesigning the API:

Currently, the Trainer from lightning is used. I.e., we cannot modify the Trainer if necessary. Thus, it might be worth to add a Trainer that just wraps the Trainer from lightning to provide as the flexibility. E.g., also transformers from huggingface or gluonts has their own Trainer implementations.
There exist different architectures with different inheritances. Do we want to touch this or leave it as it is. E.g. BaseModelWithCovariates, BaseModel, AutoRegressiveBaseModelWithCovariates.
Should, the datasets be capable of applying preprocessing. Like transformations? Or should we introduce such datasets? They might apply sktime transformer pipelines. Or apply bootstrapping etc. (Probably for each of these ideas a separate dataset should be implemented).

Specific replies to thoughts from above.

introduction of a SlidingDataLoader that unifies the current logic in TimeSeriesDataSet.__getitem__, TimeSeriesDataSet._construct_index and the dataloader returned by TimeSeriesDataSet.to_dataloader.

Should the DataLoader do the sliding? For me this is more about how the samples are created. Thus, this should be part of the dataset (At least if I got this correctly: https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset)

on the model side, we design the API that each model comes with its own loader - or loaders. There is a default loader for each model - current pytorch-forecasting moels all point to the loader implied by TimeSeriesDataSet.

Why should be the loader model specific. The loader's task is to provide an iterable by combining a Dataset and a Sampler(https://pytorch.org/docs/stable/data.html#module-torch.utils.data). The structure of each sample is determined by the Dataset. Thus, I would argue that if something of the data related code should be model specific, it should be the Dataset. Furthermore, I the bullet point above, you proposed to introduce also SlidingDataloader. This would mean, we need to implement a DataLoader, SlidingDataLoader, and potential further DataLoader for each model separately.

Unfortunately, I haven't attended any of the planning sessions, thus the following question might be already answered. Why are we currently aiming for having one DataLoader/Dataset per model? E.g., if we have two models that only support endogenous features, I would suppose that their datasets would be identical. Thus, I do not see the need of having multiple ones. Furthermore, models that support exogenous features does not always require exogenous features, they might also be applicable to only endogenous features. Would this then require multiple datasets/dataloader per model?

fkiraly · 2025-01-05T14:42:50Z

@benHeid, excellent points!

Replies to your raised design angles:

Currently, the Trainer from lightning is used. I.e., we cannot modify the Trainer if necessary. Thus, it might be worth to add a Trainer that just wraps the Trainer from lightning to provide as the flexibility. E.g., also transformers from huggingface or gluonts has their own Trainer implementations.

Good idea to introduce a separation here. Question, why do you think we cannot modify the trainer in the current state of pytorch-forecasting? It is extraneous to the model. Or do you think the dependency on lightning is too strong to allow other methods of training? Would we have to rearchitect the models one level lower, removing the lightning inheritance, to change this?

More generally, what are architectures you can think of that allow to treat all the abovementioned trainers under one interface?

There exist different architectures with different inheritances. Do we want to touch this or leave it as it is. E.g. BaseModelWithCovariates, BaseModel, AutoRegressiveBaseModelWithCovariates.

I think this is more complex than needs be and needs a redesign, but it is not affecting the base API as such, as far as I can see - therefore I would leave it to later, after the base API rework. My approach would be to replace inheritance with tags and/or polymorphism in a lower number of base classes. The model specific ones perhaps need to stay, but the type specific ones could be merged.

However, I think it makes sense only after we have reworked the base API, as it will imply how the base classes will look like.

Should, the datasets be capable of applying preprocessing. Like transformations? Or should we introduce such datasets? They might apply sktime transformer pipelines. Or apply bootstrapping etc. (Probably for each of these ideas a separate dataset should be implemented).

Good question. The idiomatic architecture for torch extensions has DataSet deal with instance level transformations (in our case "series-to-series"), and DataLoader with batch level ones, that would include augmentation and/or more general panel level transformations.

As mentioned, the current architecture violates this idiom, since the DataSet does decoder/encoder level window subsampling which is a panel level transformation.

fkiraly · 2025-01-05T14:47:35Z

Specific replies to thoughts from above.

introduction of a SlidingDataLoader that unifies the current logic in TimeSeriesDataSet.__getitem__, TimeSeriesDataSet._construct_index and the dataloader returned by TimeSeriesDataSet.to_dataloader.

Should the DataLoader do the sliding? For me this is more about how the samples are created. Thus, this should be part of the dataset (At least if I got this correctly: https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset)

I would disagree with the opinion that slicing should be part of the DataSet.

I interpret the architectural intention (which is not very clear in the torch docs) that DataSet should be as close to the raw data as possible, possibly including instance (__getitem__ level) pre-processing, while DataLoader should collect all concerns related to batching, slicing, sampling, and batch level transformations such as data augmentation.

If we accept this architectural intention as correct, this implies:

the current architecture does not align with it, as DataSet carries out a significant part of batching, slicing, sampling, batch level transformations
the optimal end state has these concerns transferred to a DataLoader subclass, and removed from DataSet subclasses. The DataLoader can be composite.
one possible version of this end state has DataSet closer to the DSIPTS design, which I consider for that reason superior (though also not perfect=, when it comes to the DataSet / DataLoader separation.

on the model side, we design the API that each model comes with its own loader - or loaders. There is a default loader for each model - current pytorch-forecasting moels all point to the loader implied by TimeSeriesDataSet.

Why should be the loader model specific. The loader's task is to provide an iterable by combining a Dataset and a Sampler(https://pytorch.org/docs/stable/data.html#module-torch.utils.data). The structure of each sample is determined by the Dataset.

My reasoning is, that some but not all models require a loader that has encoder/decoder specifics. Examples are most current models, counterexamples are some models in DSIPTS and the above linked foundation models.

If we accept that some models require encoder/decoder batches, while some others do not, and we think that this should be done in a DataLoader (and not in, say, extra layers), then this necessarily implies that loaders will have to be model specific.

Thus, I would argue that if something of the data related code should be model specific, it should be the Dataset.

I think this is a corollary of how you envision the separation of DataSet and DataLoader, which is a point where we differ. If you start from my proposal of the separation, you get the same consequence but for the DataLoader.

Why are we currently aiming for having one DataLoader/Dataset per model?

To clarify, that is not the aim in my design. In my design, there would probably be a small number of DataLoader-s, it will be an n:1 relationship of models:loaders, while all will use the same DataSet interface and possibly an additional middle unified DataLoader layer.

We should of course not need to define one DataSet per model, as in sktime or sklearn it should be easy to loop over models, so only one DataSet overall.

The key problem is actually the state you are pointing out as problematic - currently, we have to define one DataSet per package (or even mode), for using the various foundation models etc. In sktime, this now happens under the hood in sktime classes, but the unification should happen one layer deeper, in torch or lightning.

Unfortunately, I haven't attended any of the planning sessions, thus the following question might be already answered.

No problem, we will have a big one with new mentees, @agobbifbk plus team, in the new year (the FBK team will return next week from holiday). @thawn, you are very welcome to attend and participate in planning too.

We will align on a date for this in the pytorch-forcasting channel on the sktime discord.

benHeid · 2025-01-05T20:59:27Z

Good idea to introduce a separation here. Question, why do you think we cannot modify the trainer in the current state of pytorch-forecasting? It is extraneous to the model. Or do you think the dependency on lightning is too strong to allow other methods of training? Would we have to rearchitect the models one level lower, removing the lightning inheritance, to change this?

Technical it is possible. However, the user has to change the Trainer in that case. So I think introducing a Trainer with a major version release is more intuitive. And than the user does not have to change it.

More generally, what are architectures you can think of that allow to treat all the abovementioned trainers under one interface?

I think all architectures can be trained with that trainer. However, there might be features that we would like to implement that are time series specific. E.g., Truncated back propagation through time, which was removed from lightning trainer: #1581

benHeid · 2025-01-05T21:50:58Z

With regards to the discussion, where the slicing is located. I am still not convinced that a custom DataLoader should be a solution, and I still think that tasks like slicing are intended to be part of datasets and not of Dataloaders. The reasons why I think this are:

I haven't found any good example of custom Dataloader on the PyTorch site. But plenty of tutorials and documentation showing different datasets.
Regarding slicing, I would say that from the level where they are applied, they are comparable to operations such as creating subset datasets or concat datasets. For those operations, there are extra dataset implementations available in PyTorch. Thus, I could imagine that a possible solution would be to have preprocessing datasets or augmentation datasets that are afterwards passed to a slicing dataset that will slice the transformed and augmented time series. However, might be complicated for users that need to chain various Datasets.

fkiraly · 2025-01-08T20:05:32Z

Technical it is possible. However, the user has to change the Trainer in that case. So I think introducing a Trainer with a major version release is more intuitive. And than the user does not have to change it.

I do not completely grasp the context - could you perhaps add two code snippets, current and speculative, for modifying trainer?

With regards to the discussion, where the slicing is located. I am still not convinced that a custom DataLoader should be a solution, and I still think that tasks like slicing are intended to be part of datasets and not of Dataloaders.

Hm, I see how one could make an argument either way.

Then, would the implied design be (a) raw time series dataset without slicing e.g., as a DataSet, and a WindowSlicedTimeSeriesDataSet(base_dataset, params)?`

To avoid chaining, we can always have an all-in-one delegator.

benHeid · 2025-01-09T19:32:17Z

I do not completely grasp the context - could you perhaps add two code snippets, current and speculative, for modifying trainer?

Currently, her are importing directly from lightning: from lightning.pytorch import Trainer. If we change at some point to an own implementation, the user need to adapt their imports. I think this adaption, is better to happen with a major release. So that with 2.0 the user have to do from pytorch_forecasting.trainer import Trainer. This trainer can at the beginning be an empty wrapper around the lightning Trainer:

from lightning.pytorch import Trainer as PL_trainer

class Trainer(PL_trainer):
   ...

Then, would the implied design be (a) raw time series dataset without slicing e.g., as a DataSet, and a WindowSlicedTimeSeriesDataSet(base_dataset, params)?`

Yes

This PR carries out a clean-up refactor of `TimeSeriesDataSet`. No changes are made to the logic. This is in preparation for major work items impacting the logic, e.g., removal of the `pandas` coupling (see #1685), or a 2.0 rework (see #1736). In general, a clean state would make these easier. Work carried out: * clear, and complete docstrings, in numpydoc format * separating logic, e.g., for parameter checking, data formatting, default handling * reducing cognitive complexity and max indentations, addressing "code smells" * linting

thawn · 2025-01-13T09:00:08Z

@fkiraly I am back from vacation. I hope you had a relaxing time during the Holidays. Before I get started with figuring out how I can help with the integration ofTime-Series-Library: Have you tried contacting the original developers of that library? I think this is important for two reasons:

They know their code much better than me
I would feel bad using their code without asking them first (even though the license technically allows this)
Even though my pull request to time-series-library may look like a lot of changes, it is a relatively simple refactor that took me inside of two hours (the major part being running tests if TimesNet still works, because that is what I am using downstream)

fkiraly · 2025-01-13T09:13:57Z

Have you tried contacting the original developers of that library?

I think somewhere I pinged them, but you are right - we should also let them know and ask very explicitly, with the current state of discussions. I see multiple options that make technical sense:

developing a common framework layer for models, like sktime, and people can have their own packages or estimators, see here: https://www.sktime.net/en/latest/estimator_overview.html (estimators have "author", "maintainer" ownership tags, or can even be in other packages entirely, like prophetverse or skchange)
merge into a common code base, e.g., pytorch-forecasting which already has a well-maintained package and release layer, and an active community

Both options will require agreeing on a common framework level API - I still think DSIPTS is closest to what we want, of course comments are much appreciated.

FYI @wuhaixu2016, @ZDandsomSP, @GuoKaku, @DigitalLifeYZQiu, @Musongwhk (please feel free to add further pings if I forgot someone)

fkiraly · 2025-01-13T09:22:05Z

I would feel bad using their code without asking them first (even though the license technically allows this)

Agree, at least credit needs to be given clearly to the authors of time-series-library!

A complication is that there is much copy-pasting going on in the neural network area historically, some (but not all) code in time-series-library is even copied from elsewhere afaik.

I think we should also come up with a way to fairly assign credit while backtracking the entire copy-merge-modify tree, although that might be a bit of work. A complication is that the people actively maintaining the code may be different from those who wrote parts of it (but no longer maintain it), and all deserve credit!
A solution could look like the tagging system in sktime which separates maintainer-type owners from authorship credits, some example pull requests on this topic are here:

sktime/sktime#6850
sktime/sktime#7518

The first PR includes handling of cure-lab algorithms, which is historically at a central position of the copy-modify-tree for deep learning forecasters, but was never turned into a package. In that latter respect, it is similar to time-series-library.

thawn · 2025-01-13T10:40:42Z

Regarding the API of Time-Series-Library

from my (limited) grasp of the code, this consists of the following:

data loaders are quite specific to the data they use for their benchmarks
an extensive config object based on argparse.ArgumentParser.parse_args()
a model wrapper which they call experiment

My feeling is that adapting this to any more general API will require major work. Using TimesNet in my code required me to write my own dataset and dataloaders as well as model wrapper.

Edit: this should not be taken as criticism of Time-Series-Library. I am very grateful to the authors for their code and their papers. Their work helped me a lot for my project. It just shows that the library was written for a specific purpose (benchmarking) and not as a general purpose API like what is planned here.

agobbifbk · 2025-01-13T11:07:20Z

Regarding the API of Time-Series-Library

from my (limited) grasp of the code, this consists of the following:

data loaders are quite specific to the data they use for their benchmarks

an extensive config object based on argparse.ArgumentParser.parse_args()

a model wrapper which they call experiment

My feeling is that adapting this to any more general API will require major work. Using TimesNet in my code required me to write my own dataset and dataloaders as well as model wrapper.

Edit: this should not be taken as criticism of Time-Series-Library. I am very grateful to the authors for their code and their papers. Their work helped me a lot for my project. It just shows that the library was written for a specific purpose (benchmarking) and not as a general purpose API like what is planned here.

A lot of models in DSIPTS are an adaptation of online repositories with the same logic of Time-Series-Library. What I've done so far is to align and standardize these API calls in the DSIPTS data preparation. I also found hard to understand the input parameters sometimes, this is why I leverage on Hydra.

fkiraly · 2025-01-16T00:00:25Z

For discussion on Fri, here is a speculative design for a raw data container.
#1755

fkiraly · 2025-01-17T19:20:26Z

outcome from prioritization meeting on Jan 15:

data layer - dataset, dataloader 👍👍👍👍👍👍👍 💬 ✔️✔️✔️

dataset and dataloader API consolidation

model layer - base classes, configs, unified API 👍👍👍👍 ✔️

A more refined base classes maybe and proper documentation of them
tests: input-output shapes of the bathches
operationalization of the models (inference must be a clear process/output)
think about how model config are stored. e.g. versioning so that we know how to load model weights.
easy interface for adding new architectures

foundation models, model hubs 👍 👍 👍

To think how to handle pre-trained models in ptf and how to interface them?
integration with model hubs (hf/kaggle/...)
[IDEA] looking to Chronos-Bolt and integrate into the repo.

documentation👍✔️

adding tutorials and examples for the new users
proper documentation of base classes

benchmarking 👍 👍💬

easy way to use benchmark datasets
reproducibility, scalability, concept of experiment (same dataset, different models/parameters to be compared )

mlops and scaling (distributed, cluster etc)👍 👍

operationalization of the models (inference)
hooks for: slurm cluster, multiprocess, OPTUNA
think about how model config are stored. e.g. versioning so that we know how to load model weights.

more learning tasks supported

[IDEA] continuous learning / active learning
[IDEA] easy to convert DL architectures from regression to classification

agobbifbk · 2025-01-21T08:14:35Z

I came up with this https://zarr.readthedocs.io/en/stable/. It is widely used when you have a large dataset that can not fit on RAM/VRAM. The idea is to create the zarr before creating the dataset. It generates chunks of data on a give dimension(s), in hour case temporal dimension, the datasets open the zarr and in the get_item function you retrieve the window you ask for.
Google has created a bucket with ALL ERA5 data (all the European hourly meteorological variables!).
While creating the zarr we can:

update the scaler(s) on the training set
updating the label encode(s)
keep track of the valid samples (not nan)

PROs:

our models can run on a enormous dataset
it should not be difficult to fit it in the API we are thinking
linux can cash automatically the most used files so maybe we gain some speed
most used technology for this kind of application (similar to xarrray, nc, and other compressing/storaging options for BIG data)

CONs:

need to create a zarr file on the disk (redundancy, space)
slower respect having all the numpy-tensor dataset in RAM

thawn · 2025-01-21T09:25:45Z

I came up with this https://zarr.readthedocs.io/en/stable/.

+1 from me. We are also using zarr as file backend in our time series project.

thawn · 2025-01-21T09:26:17Z

As default data format in our minimal dataset class, I realized that xarray may be a great choice: Xarray natively supports column names (i.e. metadata) as well as n-dimentional columns, which makes it an ideal replacement for both numpy and pandas. Furthermore, it has the advantage of dask support (which handles memory issues very well). By using Dask.array, xarray enables native chunking (ideal for non-overlapping time windows) as well as overlapping chunks (for sliding time windows).

fkiraly · 2025-01-22T18:47:35Z

@agobbifbk, @thawn, I think there are multiple solutions that have a similar feature set - dask and polars as well.

As long as we have consistent __getitem__ return format, nothing stops us from adding dataset classes that take either of these as input.

fkiraly · 2025-01-22T22:00:45Z

FYI, a simplified version of the previous design sketch here: #1757 - for discussion

agobbifbk · 2025-01-24T14:51:55Z

Desiderata from the DataSet component:

possibility to cut data based on list of ranges (for identifying he training dataset). ranges can be temporal values or percentage
scalers (minmax) trained on the train dataset --> maybe is better to train them outside the DataSet so there will be only the transform function here and the dataset can work both for train and infererece!
label encoder (trained on the train dataset) --> if there are strings in columns we need to convert them into 0 N-1 integers
groups: suppose to have M identities sharing the same data (think about M meteorological station collecting temperatures) in this case the time is not an index but the tuple (time-group) is. Maybe sometimes you want to normalize by groups
since interpolating is not always a good choice, and AFAIU the slicing is performed there, we need to ensure that the samples is valid --> precompute all the possible good starting points?

fkiraly · 2025-01-31T12:56:16Z

Completed a design proposal draft here: sktime/enhancement-proposals#39

Input appreciated!

fkiraly · 2025-02-05T22:22:35Z

moved the various notes to the top.

Are there any missing notes?

fkiraly added enhancement New feature or request API design API design & software architecture labels Dec 20, 2024

fkiraly pinned this issue Dec 20, 2024

fkiraly mentioned this issue Jan 3, 2025

[ENH] clean-up refactor of TimeSeriesDataSet #1746

Merged

This was referenced Jan 4, 2025

[ENH] enable large data use cases - decouple data input from pandas, allow polars, dask, and/or spark #1685

Open

[ENH] Improving docs, examples, and tutorials of pytorch-forecasting #1745

Open

This was referenced Jan 5, 2025

[API] redesign of logging and monitoring for 2.0 #1700

Open

[MNT] Is PyTorch-forecasting still under maintenance? #1575

Closed

[ENH] using TFT without past target values #1585

Open

fkiraly mentioned this issue Jan 13, 2025

[ENH] neural network libraries in thuml time-series-library sktime/sktime#7243

Open

fkiraly mentioned this issue Jan 16, 2025

[ENH] speculative design for raw data container DataSet #1755

Draft

PranavBhatP mentioned this issue Jan 22, 2025

[ENH] Interfacing chronos-bolt in sktime sktime/sktime#7680

Open

fkiraly mentioned this issue Jan 22, 2025

[ENH] speculative design for raw data container DataSet v2 #1757

Draft

fkiraly mentioned this issue Jan 31, 2025

pytorch-forecasting and dsipts v2 API design sktime/enhancement-proposals#39

Open

[API] Redesign towards pytorch-forecasting 2.0 #1736

[API] Redesign towards pytorch-forecasting 2.0 #1736

Comments

fkiraly commented Dec 20, 2024 • edited Loading

Context and goals

Meeting notes

👋 Attendees

Prioritization

Agenda

References

Notes

number of classes, dataset, dataloader, "bottleneck" idea

__getitem__ output convention

Notes

__getitem__ designs based on last times

AG design suggestion

Current DSIPTS str

FK design suggestion

fkiraly commented Jan 1, 2025 • edited Loading

pytorch-forecasting

DSIPTS

Ensuring downwards compatibility

fkiraly commented Jan 1, 2025

geetu040 commented Jan 1, 2025

fkiraly commented Jan 1, 2025 • edited Loading

Sohaib-Ahmed21 commented Jan 2, 2025

jdb78 commented Jan 2, 2025

fkiraly commented Jan 2, 2025

fkiraly commented Jan 2, 2025

Sohaib-Ahmed21 commented Jan 2, 2025 • edited Loading

fkiraly commented Jan 3, 2025 • edited Loading

fkiraly commented Jan 3, 2025

benHeid commented Jan 5, 2025

Specific replies to thoughts from above.

fkiraly commented Jan 5, 2025 • edited Loading

fkiraly commented Jan 5, 2025 • edited Loading

Specific replies to thoughts from above.

benHeid commented Jan 5, 2025

benHeid commented Jan 5, 2025

fkiraly commented Jan 8, 2025

benHeid commented Jan 9, 2025

thawn commented Jan 13, 2025

fkiraly commented Jan 13, 2025 • edited Loading

fkiraly commented Jan 13, 2025 • edited Loading

thawn commented Jan 13, 2025 • edited Loading

Regarding the API of Time-Series-Library

agobbifbk commented Jan 13, 2025

Regarding the API of Time-Series-Library

fkiraly commented Jan 16, 2025 • edited Loading

fkiraly commented Jan 17, 2025

agobbifbk commented Jan 21, 2025

thawn commented Jan 21, 2025

thawn commented Jan 21, 2025

fkiraly commented Jan 22, 2025

fkiraly commented Jan 22, 2025

agobbifbk commented Jan 24, 2025

fkiraly commented Jan 31, 2025

fkiraly commented Feb 5, 2025

fkiraly commented Dec 20, 2024 •

edited

Loading

`getitem` output convention

`getitem` designs based on last times

fkiraly commented Jan 1, 2025 •

edited

Loading

`pytorch-forecasting`

fkiraly commented Jan 1, 2025 •

edited

Loading

Sohaib-Ahmed21 commented Jan 2, 2025 •

edited

Loading

fkiraly commented Jan 3, 2025 •

edited

Loading

fkiraly commented Jan 5, 2025 •

edited

Loading

fkiraly commented Jan 5, 2025 •

edited

Loading

fkiraly commented Jan 13, 2025 •

edited

Loading

fkiraly commented Jan 13, 2025 •

edited

Loading

thawn commented Jan 13, 2025 •

edited

Loading

fkiraly commented Jan 16, 2025 •

edited

Loading