remove underperforming variant
lucidrains committed Jan 6, 2024
1 parent 0c41146 commit 544ec67
Showing 3 changed files with 3 additions and 291 deletions.
31 changes: 2 additions & 29 deletions README.md
@@ -14,6 +14,8 @@ The official implementation has been released <a href="https://github.com/thuml/

- <a href="https://stability.ai/">StabilityAI</a> and <a href="https://huggingface.co/">🤗 Huggingface</a> for the generous sponsorship, as well as my other sponsors, for affording me the independence to open source current artificial intelligence techniques.

- <a href="https://github.com/gdevos010">Greg DeVos</a> for sharing <a href="https://github.com/lucidrains/iTransformer/issues/20">experiments</a> he ran on `iTransformer` and some of the improvised variants

## Install

```bash
@@ -112,35 +114,6 @@ preds = model(time_series)
# -> (12: (2, 12, 137), 24: (2, 24, 137), 36: (2, 36, 137), 48: (2, 48, 137))
```

### iTransformer with Normalization Statistics Conditioning

Reversible instance normalization, but all statistics across variates are concatted and projected into a conditioning vector for FiLM conditioning after each layernorm in the transformer.

```python
import torch
from iTransformer import iTransformerNormConditioned

# using solar energy settings

model = iTransformerNormConditioned(
num_variates = 137,
lookback_len = 96, # or the lookback length in the paper
dim = 256, # model dimensions
depth = 6, # depth
heads = 8, # attention heads
dim_head = 64, # head dimension
pred_length = (12, 24, 36, 48), # can be one prediction, or many
num_tokens_per_variate = 1, # experimental setting that projects each variate to more than one token. the idea is that the network can learn to divide up into time tokens for more granular attention across time. thanks to flash attention, you should be able to accommodate long sequence lengths just fine
)

time_series = torch.randn(2, 96, 137) # (batch, lookback len, variates)

preds = model(time_series)

# preds -> Dict[int, Tensor[batch, pred_length, variate]]
# -> (12: (2, 12, 137), 24: (2, 24, 137), 36: (2, 36, 137), 48: (2, 48, 137))
```

## Todo

- [x] beef up the transformer with latest findings
261 changes: 0 additions & 261 deletions iTransformer/iTransformerNormConditioned.py

This file was deleted.

2 changes: 1 addition & 1 deletion setup.py
@@ -3,7 +3,7 @@
setup(
name = 'iTransformer',
packages = find_packages(exclude=[]),
- version = '0.5.2',
+ version = '0.5.3',
license='MIT',
description = 'iTransformer - Inverted Transformer Are Effective for Time Series Forecasting',
author = 'Phil Wang',