How can I change the mini-batch size when using the tfio.experimental.streaming.KafkaBatchIODataset API? #1458
Comments
@TTian-vivo as of now it is restricted to 1024. Let me change it to take the value of
OK, thanks. I think this value (the mini-batch size) needs to be adjustable.
@TTian-vivo I have raised a PR (#1460) to address this. This enables you to pass:
to the configuration options and your max size for
OK, thanks. Which configuration option sets the minimum size of a mini-batch? I think increasing the batch size can speed up model training and make the model converge more easily. Is my understanding correct?
@TTian-vivo you can set the

import tensorflow as tf
import tensorflow_io as tfio

# Prepare the dataset
dataset = tfio.experimental.streaming.KafkaBatchIODataset(
    topics=["mini-batch-test"],
    group_id="cgminibatchtrain",
    servers=None,
    stream_timeout=5000,
    configuration=[
        "session.timeout.ms=7000",
        "max.poll.interval.ms=8000",
        "auto.offset.reset=earliest",
        "batch.num.messages=2048",
    ],
)

# Prepare the model
NUM_COLUMNS = 1
model = tf.keras.Sequential(
    [
        tf.keras.layers.Input(shape=(NUM_COLUMNS,)),
        tf.keras.layers.Dense(4, activation="relu"),
        tf.keras.layers.Dropout(0.1),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ]
)
model.compile(
    optimizer="adam",
    # The last layer already applies a sigmoid, so the loss receives
    # probabilities rather than logits.
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
    metrics=["accuracy"],
)

# Train the model
TRAINING_BATCH_SIZE = 64
for mini_d in dataset:
    # Each mini_d is itself a dataset of (message, key) string pairs;
    # parse both as floats and split them into training batches.
    mini_d = mini_d.map(
        lambda m, k: (
            tf.strings.to_number(m, out_type=tf.float32),
            tf.strings.to_number(k, out_type=tf.float32),
        )
    ).batch(TRAINING_BATCH_SIZE)
    # Fits the model as long as the data keeps on streaming
    model.fit(mini_d, epochs=5)

In the training loop, every
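To make the relationship between the two sizes in the snippet above concrete, here is a small sketch in plain Python. It assumes that each streamed mini-batch holds up to the number of messages given by "batch.num.messages" (as the 2048 setting in the snippet suggests) and that `.batch(TRAINING_BATCH_SIZE)` keeps the final partial batch, which is the default for `tf.data.Dataset.batch`. The helper name `training_steps` is hypothetical and not part of tensorflow-io.

```python
import math


def training_steps(mini_batch_messages: int, training_batch_size: int) -> int:
    """Gradient steps per epoch for one streamed mini-batch.

    Hypothetical helper: mirrors Dataset.batch() with drop_remainder=False,
    where a trailing partial batch still counts as one step.
    """
    if mini_batch_messages < 0 or training_batch_size <= 0:
        raise ValueError("sizes must be non-negative / positive")
    return math.ceil(mini_batch_messages / training_batch_size)


# With the previous hard limit of 1024 messages per mini-batch
steps_at_1024 = training_steps(1024, 64)

# With 2048 messages per mini-batch, as configured in the snippet
steps_at_2048 = training_steps(2048, 64)

print(steps_at_1024, steps_at_2048)
```

Under these assumptions, raising the mini-batch size changes how many messages each `model.fit` call sees, while `TRAINING_BATCH_SIZE` still controls the size of each gradient step.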
Sorry, you didn't understand what I meant.
@TTian-vivo the PR has not been merged yet. Will try to get it merged soon.
OK, looking forward to this.
@TTian-vivo the PR has been merged. You can install the
What is the difference between the tensorflow-io-nightly and tensorflow-io packages? I tried the tensorflow-io package, and the problem is solved.
@TTian-vivo the So in your case, when this issue was created, the stable release was Hope this information helps. Closing the issue for now. Thanks!
OK, thanks! It worked. Closing this issue now.
How can I change the mini-batch size when using the tfio.experimental.streaming.KafkaBatchIODataset API? I increased the configuration parameters batch.size, batch.num.messages, and message.max.bytes tenfold simultaneously, but the mini-batch size is still 1024.
PS: TensorFlow version 2.5.0; tensorflow-io version 0.16.0
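For reference, the `configuration` argument in the snippet from the comments is a list of "key=value" strings. A minimal sketch of building such a list from a dictionary follows; the helper name `kafka_configuration` and the numeric values are illustrative assumptions, not defaults taken from tensorflow-io or Kafka.

```python
def kafka_configuration(options: dict) -> list:
    """Format options as the "key=value" strings used in `configuration`.

    Hypothetical helper for illustration only.
    """
    return [f"{key}={value}" for key, value in options.items()]


# Illustrative values only; pick sizes that suit your own topic and broker.
configuration = kafka_configuration(
    {
        "auto.offset.reset": "earliest",
        "batch.num.messages": 10240,
    }
)
print(configuration)
```

The resulting list can then be passed as `configuration=configuration` when constructing the dataset.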