Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BERT implementation #43

Merged
merged 6 commits into from
Dec 9, 2023
Merged

BERT implementation #43

merged 6 commits into from
Dec 9, 2023

Conversation

jbarrow
Copy link
Contributor

@jbarrow jbarrow commented Dec 8, 2023

Added and documented the BERT implementation. It currently uses the HF tokenizer, because that handles a lot of nice things (like padding, masking, and token_type_ids).

It's tested against the HuggingFace implementation at: batch sizes >= 1, different models (bert-base-uncased, bert-base-cased, bert-large-uncased, bert-large-cased), etc.

@awni awni self-requested a review December 8, 2023 15:24
Copy link
Member

@awni awni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really nicely done!! I love it. I left some mostly minor comments.

Please address and then we can merge it into the examples.

Also could you add a requirements.txt for dependenies?

bert/README.md Outdated Show resolved Hide resolved
bert/model.py Outdated Show resolved Hide resolved
bert/model.py Show resolved Hide resolved
bert/model.py Outdated Show resolved Hide resolved
bert/model.py Outdated Show resolved Hide resolved
bert/model.py Show resolved Hide resolved
bert/model.py Outdated Show resolved Hide resolved
bert/model.py Show resolved Hide resolved
bert/convert.py Outdated Show resolved Hide resolved
@jbarrow
Copy link
Contributor Author

jbarrow commented Dec 9, 2023

Outside of anything that requires a change to MLX core, I think I've made the changes.

I'll put off removing the MultiHeadAttention class until I've made that change, it's been pulled into MLX, and there's been a released version. 😄

@jbarrow
Copy link
Contributor Author

jbarrow commented Dec 9, 2023

Submitted and tested a PR for the bias change in mlx. I can submit a new PR here once it's merged/released.

Copy link
Member

@awni awni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, just two more comments, then we can merge this and follow up after the change to core multiheadedattention.

bert/README.md Outdated Show resolved Hide resolved
bert/model.py Outdated Show resolved Hide resolved
Copy link
Member

@awni awni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Brilliant, thank you for putting this together!

@awni awni merged commit 46c6bbe into ml-explore:main Dec 9, 2023
Blaizzy pushed a commit to Blaizzy/mlx-examples that referenced this pull request Mar 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants