TypeError while ctgan.fit() #326
Comments
I was facing the same problem. The cause may be your column names: they should be strings.
Hi @AT9991 and @aarishmaqsood, would either of you be able to share some CSV data that we can use to replicate this? BTW instead of using the CTGAN library directly, I would highly recommend using the SDV library. You can access the CTGAN Synthesizer via SDV. Doing so will allow you to make use of additional features -- such as better data pre-processing, customizations such as constraints, and conditional sampling. I actually wonder whether you would still encounter this bug in SDV, since there is a lot more data validation and checking we do there. Here is a tutorial that uses CTGAN via the SDV library.
@npatki Thank you for your response. I have fixed my problem. In the future I will use your suggested solution.
Great to hear @aarishmaqsood. Could you describe what fixed your problem? In case others have the same issue, I can refer them here. Thanks.
@npatki Here is the Colab link, where I have replicated the error and provided the solution as well. This problem occurs in version 1.5.0. Below are the code snippets that illustrate both the problem and the solution.
Reproducing the Error
Solution
Hi @aarishmaqsood, very much appreciate the detailed code and notebook. Note that I have replicated this issue on the latest SDV (1.12.0) also. Here are a few things I discovered:
Since we now have the above two issues filed in our main SDV library, I will mark this one as a duplicate. In the meantime, for anyone else running into the issue, I suggest using @aarishmaqsood's simple workaround that converts the column names from integers to strings. Thanks all for helping uncover this. For any related discussion, please feel free to comment on either of the SDV issues linked above.
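For readers who cannot open the notebook, the workaround described above (converting integer column names to strings before fitting) might look roughly like the sketch below. The tiny DataFrame is made-up illustration data, not the original poster's dataset; integer labels like these typically appear when a CSV is loaded without a header row.

```python
import pandas as pd

# Hypothetical stand-in data: building a DataFrame from a list of rows gives
# integer column labels (0, 1), similar to reading a CSV with header=None.
data = pd.DataFrame([[1, "a"], [2, "b"]])

# Workaround from this thread: make every column label a string.
data.columns = data.columns.astype(str)

# After this, fitting should proceed as in the original report, e.g.:
# ctgan = CTGAN(epochs=100)
# ctgan.fit(data)
```

The conversion changes only the labels, not the values, so it should be safe to apply right after loading the data.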
Environment Details
Google Colab
Error Description
TypeError Traceback (most recent call last)
in <cell line: 1>()
----> 1 ctgan.fit(trial)
6 frames
/usr/local/lib/python3.10/dist-packages/rdt/transformers/base.py in _set_seed(self, data)
365 hash_value = self.columns[0]
366 for value in data.head(5):
--> 367 hash_value += str(value)
368
369 hash_value = int(hashlib.sha256(hash_value.encode('utf-8')).hexdigest(), 16)
TypeError: unsupported operand type(s) for +=: 'int' and 'str'
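The traceback suggests that the first column name is used as the start of a string that later gets hashed, so an integer column name fails as soon as a string is appended to it. The snippet below is only a simplified stand-in for the three lines shown in the traceback, not rdt's actual method; the function name `seed_from_column` and the sample values are hypothetical.

```python
import hashlib

def seed_from_column(first_column, sample_values):
    """Simplified imitation of the logic visible in the traceback (hypothetical helper)."""
    hash_value = first_column            # an int if the column labels are integers
    for value in sample_values:
        hash_value += str(value)         # int += str raises TypeError
    return int(hashlib.sha256(hash_value.encode('utf-8')).hexdigest(), 16)

# A string label works: the values are concatenated and hashed into an integer.
seed = seed_from_column("0", [1, 2, 3])

# An integer label fails in the same way as the reported error.
try:
    seed_from_column(0, [1, 2, 3])
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

This is consistent with the observation in the comments that string column names avoid the error.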
Steps to reproduce
!pip install ctgan
import pandas as pd
from ctgan import CTGAN
data = pd.read_csv(...)
ctgan = CTGAN(epochs=100)
ctgan.fit(data)