
Error: Trying to access flag --preserve_unused_tokens before flags were parsed #1133

Open
rv-ltran opened this issue Aug 7, 2020 · 21 comments

Comments

@rv-ltran

rv-ltran commented Aug 7, 2020

I had been using the following code without problems until this morning, when I got an error from bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer).

Please let me know how to fix it

import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
import bert
from bert import run_classifier
from bert import optimization
from bert import tokenization
from tensorflow.contrib import predictor
import pkg_resources
pkg_resources.get_distribution("bert-tensorflow").version


input_words = "Hello"

DATA_COLUMN = "message"
LABEL_COLUMN = "category_label"


test = pd.DataFrame({DATA_COLUMN: [input_words], LABEL_COLUMN : [0]})

BERT_MODEL_HUB = "https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1"

def create_tokenizer_from_hub_module():
  """Get the vocab file and casing info from the Hub module."""
  with tf.Graph().as_default():
    bert_module = hub.Module(BERT_MODEL_HUB)
    tokenization_info = bert_module(signature="tokenization_info", as_dict=True)
    with tf.Session() as sess:
      vocab_file, do_lower_case = sess.run([tokenization_info["vocab_file"],
                                            tokenization_info["do_lower_case"]])

  return bert.tokenization.FullTokenizer(
      vocab_file=vocab_file, do_lower_case=do_lower_case)

tokenizer = create_tokenizer_from_hub_module()

test_InputExamples = test.apply(lambda x: bert.run_classifier.InputExample(guid=None, 
                                                               text_a = x[DATA_COLUMN], 
                                                               text_b = None, 
                                                               label = x[LABEL_COLUMN]), axis = 1)

# We'll set sequences to be at most 128 tokens long.
MAX_SEQ_LENGTH = 128
label_list = [6,1,2,4,3,5,0]
# Convert our test features to InputFeatures that BERT understands.
test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)


Error:

INFO:tensorflow:Writing example 0 of 1
INFO:tensorflow:Writing example 0 of 1
UnparsedFlagAccessError: Trying to access flag --preserve_unused_tokens before flags were parsed.
---------------------------------------------------------------------------
UnparsedFlagAccessError                   Traceback (most recent call last)
<command-35675914> in <module>
     16 label_list = [6,1,2,4,3,5,0]
     17 # Convert our test features to InputFeatures that BERT understands.
---> 18 test_features = bert.run_classifier.convert_examples_to_features(test_InputExamples, label_list, MAX_SEQ_LENGTH, tokenizer)
     19 
     20 input_ids_list = [x.input_ids for x in test_features]

/databricks/python/lib/python3.7/site-packages/bert/run_classifier.py in convert_examples_to_features(examples, label_list, max_seq_length, tokenizer)
    778 
    779     feature = convert_single_example(ex_index, example, label_list,
--> 780                                      max_seq_length, tokenizer)
    781 
    782     features.append(feature)

/databricks/python/lib/python3.7/site-packages/bert/run_classifier.py in convert_single_example(ex_index, example, label_list, max_seq_length, tokenizer)
    394     label_map[label] = i
    395 
--> 396   tokens_a = tokenizer.tokenize(example.text_a)
    397   tokens_b = None
@liuyibox

liuyibox commented Aug 7, 2020

I got the same error, and it just started happening today.

@AMZzee

AMZzee commented Aug 8, 2020

Hey, this is an error caused by a recent version update of bert-tensorflow.
Change pip install bert-tensorflow to pip install bert-tensorflow==1.0.1
This will solve the error by installing the previous version.
You can use the previous version until the developers fix this issue.

@rv-ltran
Author

Do you know when it will be fixed by the developers?

@deeptimittal97

I am getting the same error. Any progress on this?

@GauravSahani1417

Hey, this is an error caused by a recent version update of bert-tensorflow.
Change pip install bert-tensorflow to pip install bert-tensorflow==1.0.1
This will solve the error by installing the previous version.
You can use the previous version until the developers fix this issue.

Downgrading the bert-tensorflow version resolved it!

@jasser94

I got the same error today and downgrading bert-tensorflow to 1.0.1 didn't fix it. Any other ideas?
tf version 1.15.2
python version 3.6.10
Anaconda environment on Ubuntu 18.04

@dpinol

dpinol commented Sep 23, 2020

For me, downgrading bert-tensorflow from 1.0.4 to 1.0.1 solved the issue.
I'm retraining a model from Colab.

@unre4l

unre4l commented Sep 29, 2020

Any news here? Downgrading can't be the final solution 🤔

@ekeleshian

For me, downgrading bert-tensorflow to 1.0.1 and tensorflow to 2.0.0 worked, but with one workaround.
Context: after downgrading the libraries and running my script, this error was thrown: with tf.gfile.GFile(vocab_file, "r") as reader: AttributeError: module 'tensorflow' has no attribute 'gfile'
To fix it, I edited the code in site-packages/bert/tokenization.py where gfile was being called and replaced it with with tf.io.gfile.GFile(vocab_file, "r") as reader.
It's not the cleanest, but it worked for me.
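Rather than editing site-packages by hand, the same workaround can be applied at runtime. A minimal sketch (assuming TF 2.x, where tf.gfile was removed in favor of tf.io.gfile; the helper name here is hypothetical):

```python
def restore_legacy_gfile(tf_module):
    """Alias tf.io.gfile as tf.gfile if the legacy name is missing,
    so older libraries that still call tf.gfile.GFile keep working."""
    if not hasattr(tf_module, "gfile"):
        tf_module.gfile = tf_module.io.gfile
    return tf_module

# Hypothetical usage, before importing bert:
# import tensorflow as tf
# restore_legacy_gfile(tf)
# from bert import tokenization  # now finds tf.gfile.GFile
```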

@DerekGrant

Hey, this is an error caused by a recent version update of bert-tensorflow.
Change pip install bert-tensorflow to pip install bert-tensorflow==1.0.1
This will solve the error by installing the previous version.
You can use the previous version until the developers fix this issue.

Thanks, it helps a lot!

@ZetiMente

Is Google really using BERT in its search engine while not updating this repo to TensorFlow 2.4? I have the error AttributeError: module 'tensorflow_estimator.python.estimator.api._v1.estimator' has no attribute 'TPUEstimator', and a Google search led me here.

@yuvrajeyes

same issue

liuwh0107 added a commit to liuwh0107/NCTU_AI_final_project that referenced this issue Jun 18, 2021
# train - Embedding
# split dataset - train + validation
# BERT
# for unknown reasons, training gets stuck after 1 epoch

# main ref: https://www.kaggle.com/gunesevitan/nlp-with-disaster-tweets-eda-cleaning-and-bert#3.-Target-and-N-grams
# ref: google-research/bert#1133
@ahmedhassen7

I got the same error today and downgrading bert-tensorflow to 1.0.1 didn't fix it. Any other ideas?
tf version 1.15.2
python version 3.6.10
Anaconda environment on Ubuntu 18.04

Actually, you should restart the kernel after downgrading.

@kawthar-eltarr

I'm having the exact same issue. When I try downgrading to version 1.0.1 I get this error:

with tf.gfile.GFile(vocab_file, "r") as reader: AttributeError: module 'tensorflow' has no attribute 'gfile'

Does anyone know a solution for this, please?

@paramdutta

paramdutta commented Apr 25, 2022

Hello Friends,
I had this issue with tf 2.8 and bert 1.0.4 as well. I just stuck these lines before the offending call:
import sys
sys.argv=['preserve_unused_tokens=False'] #Or true, if you like
flags.FLAGS(sys.argv)

Cheers!

@AnandVamsi1993

Hello Friends, I had this issue with tf 2.8 and bert 1.0.4 as well. I just stick these lines before the offending call:

import sys
sys.argv=['preserve_unused_tokens=False'] #Or true, if you like
flags.FLAGS(sys.argv)

Cheers!

What is the 'flags' variable?

@ahmadharimukti

Trying to access flag --preserve_unused_tokens before flags were parsed.
I'm still stuck on this.

Downgrading didn't really make a difference for me.

@sa5r

sa5r commented May 25, 2022

I set the flag manually. Not sure this is right, but it made my code work.

import sys
from absl import flags
sys.argv=['preserve_unused_tokens=False']
flags.FLAGS(sys.argv)
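For anyone wondering why this works: absl treats the first element of the list passed to FLAGS() as the program name, so parsing a one-element list simply marks every flag as parsed with its default value; the string 'preserve_unused_tokens=False' itself sets nothing. A slightly more defensive version of the same idea (a sketch; assumes absl-py is installed, which bert-tensorflow already depends on):

```python
from absl import flags

FLAGS = flags.FLAGS

def ensure_flags_parsed():
    """Mark absl flags as parsed (with their default values) if nothing
    has parsed them yet, e.g. in a notebook that never calls app.run()."""
    if not FLAGS.is_parsed():
        # The first argv element is treated as the program name,
        # so no actual flag values are changed here.
        FLAGS(["notebook"])

ensure_flags_parsed()
```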

@honzikv

honzikv commented Jul 22, 2022

I set the flag manually. Not sure this is right but made my code work.

import sys
from absl import flags
sys.argv=['preserve_unused_tokens=False']
flags.FLAGS(sys.argv)

This fixes the issue

@Rask133

Rask133 commented Feb 13, 2023

I set the flag manually. Not sure this is right but made my code work.

import sys
from absl import flags
sys.argv=['preserve_unused_tokens=False']
flags.FLAGS(sys.argv)

Hey, thanks, it worked for me!

@xlsdust

xlsdust commented Apr 12, 2023

Hello Friends, I had this issue with tf 2.8 and bert 1.0.4 as well. I just stick these lines before the offending call:

import sys
sys.argv=['preserve_unused_tokens=False'] #Or true, if you like
flags.FLAGS(sys.argv)

Cheers!

Thank you, it fixes the issue!
