Process no longer completes #98

Open
pkudva-leavened opened this issue May 14, 2024 · 13 comments

@pkudva-leavened

I installed this project a few weeks ago and was able to run it and get results using the default VI and HMC methods on some basic data. Now it no longer works. There are a couple of deprecation warnings, but I don't remember whether they were present from the beginning.

Expected Behavior

Process takes less than 5 minutes to return the summary and plot.

Current Behavior

Process does not complete: after about 30 seconds of CPU time, ps reports status code S+ and it stays that way
Summary and report are not printed in the terminal even after 30 minutes

$ python3 tfcausalimpact.py
2024-05-14 11:26:59.928855: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/causalimpact/data.py:263: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.
  if not data.applymap(np.isreal).values.all():
WARNING:tensorflow:From /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/tensorflow/python/autograph/impl/api.py:459: StructuralTimeSeries.joint_log_prob (from tensorflow_probability.python.sts.structural_time_series) is deprecated and will be removed after 2022-03-01.
Instructions for updating:
Please use `StructuralTimeSeries.joint_distribution(observed_time_series).log_prob`

Context (Environment)

python version 3.10.11
First encountered May 7 on version 0.0.14. Version 0.0.15 did not fix the issue

@WillianFuks
Owner

Hi @pkudva-leavened ,

Just to confirm, are you getting this from version 0.0.14 as well?

Also, are you using the default model from the package? How big is the dataset in terms of data points? Which OS are you running?

@pkudva-leavened
Author

pkudva-leavened commented May 15, 2024

Hi @WillianFuks

Yes, 0.0.14 gives the same warnings and does not give any response after 30 minutes.

Yes, I am using the default Variational Inference method via the function ci = CausalImpact(data, pre_period, post_period)
My dataset has 290 rows with 1 test column and 1 control column, but I just tested with the comparison_data.csv in the fixtures and I am still not getting a result.
I am on macOS 13.2, with an Intel i9 and 16GB RAM.

UPDATE:
I attempted a fresh installation of python 3.10.11 and this package on a Windows machine that I don't normally program on, and it works just fine. So I'm assuming there is something on my Mac that is preventing this from running properly. I will try to figure out what it is and update this when I do. Thank you for responding.

@pkudva-leavened
Author

Hi @WillianFuks,

I'm still not completely sure what the problem was, but I got it working.
I had to completely uninstall Python 3.10 and all of its symlinks, then reinstall it and the package. There are a couple of extra log messages as a result, but as long as it's working, I'm happy.

Here is the output it printed before eventually producing the report and plot, for your reference.

/localprojects/causal-inference/tfci.py:1: DeprecationWarning: 
Pyarrow will become a required dependency of pandas in the next major release of pandas (pandas 3.0),
(to allow more performant data types, such as the Arrow string type, and better interoperability with other libraries)
but was not found to be installed on your system.
If this would cause problems for you,
please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466
        
  import pandas as pd
2024-05-15 12:53:16.442005: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/causalimpact/data.py:263: FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.
  if not data.applymap(np.isreal).values.all():
WARNING:tensorflow:From /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/tensorflow/python/autograph/impl/api.py:459: StructuralTimeSeries.joint_log_prob (from tensorflow_probability.python.sts.structural_time_series) is deprecated and will be removed after 2022-03-01.
Instructions for updating:
Please use `StructuralTimeSeries.joint_distribution(observed_time_series).log_prob`
WARNING:tensorflow:From /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/causalimpact/model.py:414: calling one_step_predictive (from tensorflow_probability.python.sts.forecast) with timesteps_are_event_shape=True is deprecated and will be removed after 2021-12-31.
Instructions for updating:
`Predictive distributions returned by`tfp.sts.one_step_predictive` will soon compute per-timestep probabilities (treating timesteps as part of the batch shape) instead of a single probability for an entire series (the current approach, in which timesteps are treated as event shape). Please update your code to pass `timesteps_are_event_shape=False` (this will soon be the default) and to explicitly sum over the per-timestep log probabilities if this is required.

@braydentang1

Running into the same issues as @pkudva-leavened.

The default example doesn't even terminate after 5+ minutes. I get the exact same warning:

StructuralTimeSeries.joint_log_prob (from tensorflow_probability.python.sts.structural_time_series) is deprecated and will be removed after 2022-03-01.
Instructions for updating:
Please use `StructuralTimeSeries.joint_distribution(observed_time_series).log_prob`

from tensorflow.

import pandas as pd
from causalimpact import CausalImpact
data = pd.read_csv('https://raw.githubusercontent.com/WillianFuks/tfcausalimpact/master/tests/fixtures/arma_data.csv')[['y', 'X']]
data.iloc[70:, 0] += 5
pre_period = [0, 69]
post_period = [70, 99]
ci = CausalImpact(data, pre_period, post_period)
print(ci.summary())
ci.plot()

On macOS 14.4.1, Apple M3 Pro, 36 GB RAM. Python 3.11.0. Package version 0.0.15.

@WillianFuks
Owner

Hi @braydentang1 ,

Could you please confirm whether version 0.0.14 has the same issue? We're not certain if this is happening due to the latest version updates.

@braydentang1

Version 0.0.14 is having the same issues @WillianFuks

@dtran-im

dtran-im commented Jun 3, 2024

Experiencing the same issue with both 0.0.14 and 0.0.15. @Wopple figured out that downgrading pyarrow to 10.0.1 fixes the problem for us. It's not a good long-term solution, but it works as a short-term one.

@Wopple

Wopple commented Jun 3, 2024

I have been having this same issue where it hangs indefinitely. This was brought in through a transitive dependency elsewhere in my project:

[[package]]
name = "pyarrow"
version = "16.1.0"

I managed to "fix" it with:

pyarrow = "=10.0.1"

You can likely reproduce with:

pyarrow = "=16.1.0"

and then running the sample code. I have not investigated much further than that since this "works for me." But maybe it helps someone here track down the underlying cause.

Edit: Works with 15.0.2, breaks with 16.0.0.
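
To check other pyarrow versions the same way, one option is to run the sample in a child process with a time limit, so a hang shows up as a timeout instead of a stuck terminal. This is only a sketch using the standard library; the helper name `exit_code_or_timeout`, the file name `sample.py`, and the 300-second limit are made up for illustration and are not part of tfcausalimpact.

```python
import subprocess
import sys
from typing import List, Optional


def exit_code_or_timeout(args: List[str], timeout_s: float) -> Optional[int]:
    """Run `python <args>` in a child process.

    Returns the child's exit code, or None if it was still running after
    timeout_s seconds (subprocess.run kills the child in that case).
    """
    try:
        result = subprocess.run(
            [sys.executable, *args],
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,  # a non-zero exit is reported, not raised
        )
    except subprocess.TimeoutExpired:
        return None
    return result.returncode


# Hypothetical usage: save the sample code from this thread as sample.py,
# install one pyarrow version, then run:
#   print(exit_code_or_timeout(["sample.py"], timeout_s=300))
# None means the run did not finish in time; 0 means it completed.
```

A `None` result only says the run exceeded the limit; it does not say where it got stuck.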

@braydentang1

braydentang1 commented Jun 3, 2024

@dtran-im @Wopple Switching to pyarrow 10.0.1 solved it. Thanks!

@WillianFuks
Owner

I've been trying to replicate this issue with no success so far. @Wopple, could you please tell us which pandas version you are running when the issue occurs?

The weird thing is that tfci uses pandas<=2.2, which doesn't use pyarrow, so the lib is not a dependency. Even when I install it separately here the code runs fine, so maybe it's some other package related to pandas and pyarrow that is causing the issue.
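
For anyone reporting the hang, a short script like the following could collect the versions and platform details asked about in this thread in one go. It is a sketch that relies only on the standard library; the package list is an assumption about what might be relevant (the names are PyPI distribution names), and `environment_report` is just an illustrative helper name.

```python
import platform
import sys
from importlib import metadata

# Assumed list of distributions worth reporting; adjust as needed.
PACKAGES = (
    "tfcausalimpact",
    "pandas",
    "pyarrow",
    "numpy",
    "tensorflow",
    "tensorflow-probability",
)


def environment_report(packages=PACKAGES):
    """Return a dict with interpreter, platform, and installed package versions.

    A package that is not installed maps to None.
    """
    report = {
        "python": platform.python_version(),
        "executable": sys.executable,  # shows which interpreter is really in use
        "platform": platform.platform(),
        "machine": platform.machine(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value if value is not None else 'not installed'}")
```

Run it with the same interpreter that runs the failing script, since the symlink problem mentioned earlier suggests more than one Python may be present on a machine.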

@Wopple

Wopple commented Jun 11, 2024

@WillianFuks https://github.com/Wopple/tfcausalimpact-bug

This bug was observed on both Apple silicon and x86.

@WillianFuks
Owner

Hi @Wopple ,

Thanks for this repo. I installed it locally and could still run it with no problems. It seems to be something related to the environment, but so far I'm out of ideas on how to debug this one.

I'm not sure whether downgrading pyarrow was a coincidence; the lib is not a dependency of this project, so it shouldn't change the outcome.

My suspicion was the Mac binaries, but we also have it tested in our CI pipeline, which runs just fine.

Just out of curiosity, if possible, does the code run for you inside a docker container?
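
One way to see where a hung run is stuck, without attaching a debugger, might be the standard-library `faulthandler` module, which can dump every thread's Python stack after a delay. Below is a sketch under the assumption that the hang happens inside the `CausalImpact(...)` call; the wrapper name `run_with_hang_dump` and the 120-second default are illustrative choices.

```python
import faulthandler
import sys


def run_with_hang_dump(func, timeout_s=120.0, file=None):
    """Call func() and return its result.

    If func() is still running after timeout_s seconds, the Python stack of
    every thread is written to `file` (default: sys.stderr). The call itself
    is not interrupted; the dump is only diagnostic output.
    """
    target = sys.stderr if file is None else file
    # The watchdog writes the traceback once the delay has elapsed.
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=target)
    try:
        return func()
    finally:
        # Cancel the pending dump if func() finished (or raised) in time.
        faulthandler.cancel_dump_traceback_later()


# Usage with the sample from this thread (requires tfcausalimpact installed):
#   ci = run_with_hang_dump(lambda: CausalImpact(data, pre_period, post_period))
```

The dump shows Python frames only, so if the process is stuck in native code the trace ends at the Python call that entered it, which may still be enough to tell which library is involved.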

@Wopple

Wopple commented Jun 16, 2024

I just checked and yes, the code runs in a container. It might be a Mac-only issue if Linux environments are not having a problem.

Edit: It is 100% reproducible on my Mac and on others' Macs though.
