What steps will reproduce the problem?
- Type a "w" in the string to tokenize on the Tokens tab.
- Click the Parse button.
What is the expected output? What do you see instead?
There should be a single token "w". Instead, you get a single token "ww".
What version of the product are you using? On what operating system?
I used the latest code from git repository.
Please provide any additional information below.
I tracked this down to two methods in PKURLState.m:
A: - (BOOL)parseWWWFromReader:(PKReader *)r
B: - (PKToken *)nextTokenFromReader:(PKReader *)r startingWith:(PKUniChar)cin tokenizer:(PKTokenizer *)t
1. B calls A to try to match a URL starting with "www.".
2. A does a read to look for more 'w' characters.
3. Control returns to B; matched is NO, so B does an unread to undo whatever A may have read.
4. Parsing continues.
The problem occurs when the 'w' is at the end of the input string. Step 2 reads nothing, so the unread in step 3 has nothing to undo and duplicates the 'w' instead. If you type "ww" and press Parse, you get "wwww" out. This happens regardless of what precedes the token: "i am going to go talk to my w" comes out as the tokens "i", "am", "going", "to", "go", "talk", "to", "my", "ww".
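For illustration, here is a minimal Python model of that sequence. It is not ParseKit code: `Reader`, `parse_www`, `next_token` and `tokenize` are hypothetical stand-ins for PKReader, A, B and the tokenizer loop. It assumes that a read at end of input consumes nothing while an unread always steps back one character. The stand-in for A never matches, which is all the single trailing "w" case needs.

```python
EOF = None  # sentinel returned when the input is exhausted


class Reader:
    """Hypothetical stand-in for PKReader (assumed behavior, not the real class)."""

    def __init__(self, text):
        self.text = text
        self.offset = 0

    def read(self):
        if self.offset >= len(self.text):
            return EOF  # assumption: nothing is consumed at end of input
        ch = self.text[self.offset]
        self.offset += 1
        return ch

    def unread(self):
        if self.offset > 0:
            self.offset -= 1  # assumption: always steps back one character


def parse_www(reader):
    """Stand-in for A: looks one character ahead, never matches in this sketch."""
    reader.read()
    return False


def next_token(reader, cin):
    """Stand-in for B: tries the www match, then falls back to a plain word."""
    if cin == "w":
        matched = parse_www(reader)
        if not matched:
            reader.unread()  # the bug: assumes A consumed exactly one character
    chars = [cin]
    while True:
        c = reader.read()
        if c is EOF or c.isspace():
            break
        chars.append(c)
    return "".join(chars)


def tokenize(text):
    reader = Reader(text)
    tokens = []
    while True:
        cin = reader.read()
        if cin is EOF:
            break
        if cin.isspace():
            continue
        tokens.append(next_token(reader, cin))
    return tokens


print(tokenize("w"))    # ['ww']
print(tokenize("w x"))  # ['w', 'x']
```

When the "w" is followed by anything, the unread undoes a real read and the output is correct. Only at end of input does the unread step back over the "w" itself.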
Solution
I think - (BOOL)parseWWWFromReader:(PKReader *)r should do its own cleanup (unread if necessary) before returning, instead of leaving the cleanup to the caller.
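A rough sketch of that idea as a small Python model (hypothetical names throughout, not the actual PKURLState.m code, and assuming a read at end of input consumes nothing): the stand-in for A counts only the characters it really consumed and unreads exactly those, and the stand-in for B no longer unreads anything.

```python
EOF = None  # sentinel returned when the input is exhausted


class Reader:
    """Hypothetical stand-in for PKReader (assumed behavior, not the real class)."""

    def __init__(self, text):
        self.text = text
        self.offset = 0

    def read(self):
        if self.offset >= len(self.text):
            return EOF  # assumption: nothing is consumed at end of input
        ch = self.text[self.offset]
        self.offset += 1
        return ch

    def unread(self):
        if self.offset > 0:
            self.offset -= 1


def parse_www(reader):
    """Stand-in for A: cleans up after itself before reporting no match."""
    consumed = 0
    c = reader.read()
    if c is not EOF:
        consumed += 1  # count only characters that were actually read
    matched = False  # this sketch never matches a www prefix
    if not matched:
        for _ in range(consumed):
            reader.unread()  # undo exactly what was consumed, no more
    return matched


def next_token(reader, cin):
    """Stand-in for B: no unread here, A has already restored the reader."""
    if cin == "w":
        parse_www(reader)
    chars = [cin]
    while True:
        c = reader.read()
        if c is EOF or c.isspace():
            break
        chars.append(c)
    return "".join(chars)


def tokenize(text):
    reader = Reader(text)
    tokens = []
    while True:
        cin = reader.read()
        if cin is EOF:
            break
        if cin.isspace():
            continue
        tokens.append(next_token(reader, cin))
    return tokens


print(tokenize("w"))      # ['w']
print(tokenize("was w"))  # ['was', 'w']
```

Keeping the bookkeeping inside the method that does the reading means the caller never has to guess how many characters were consumed.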
Work-around
You can avoid this bug by setting the tokenizer's URLState.allowsWWWPrefix to NO.