chore(main): release 0.2.0 #476


@github-actions github-actions bot commented Aug 27, 2024

🤖 I have created a release beep boop

0.2.0 (2024-12-01)

Features

  • add rotary_dim argument to rope APIs for partial apply rope (#599) (eb9bc71)
  • add a use_softmax field in variant class (#533) (d81af97)
  • add an option non_blocking to plan function (#622) (560af6f)
  • add gemma_rmsnorm and gemma_fused_add_rmsnorm (#477) (1a6b17e)
  • add group size 3 to GQA decode dispatch (#558) (6227562)
  • allow the cascade kernels to be executed using varying sequence lengths (#627) (92ac440)
  • CUDAGraph compatibility of multi-level cascade inference APIs (#586) (2332e8a)
  • fix the maximal grid dimension in prefill planning with CUDA graphs (#639) (86ca89a)
  • improve the precision of the FusedAddRMSNormKernel function (#587) (c7dc921)
  • JIT compilation (#507) (3613a5b)
  • modify group-gemm stage number (#497) (52dab1d)
  • non-contiguous query with paged kv cache (#553) (89f2c4a)
  • pass a dynamic token count to the cascade kernels (#635) (5fe9f7d)
  • simplify prefill JIT compilation (#605) (fe4f898)
  • support cached cos/sin in rope APIs (#585) (83e541d)
  • support huggingface transformer style rope interface (#568) (4f40420)
  • support sm90 cutlass group gemm (#509) (794bdda)
  • torch custom_op fix for rope (#569) (3e104bc)
  • torch custom_op support: norm (#552) (f6e0010)
  • torch.compile and custom_op support (#554) (9bf916f)
  • warmup for jit kernel tests (#629) (8f5f349)
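For readers unfamiliar with the `rotary_dim` entry above, the following is a minimal pure-Python sketch of what "partial apply rope" usually means: only the first `rotary_dim` entries of each head vector are rotated and the remaining entries pass through unchanged. The function name `apply_partial_rope`, the `theta` default, and the half-split (non-interleaved) pairing are assumptions made for illustration; this is not the library's kernel or its API.

```python
import math


def apply_partial_rope(x, pos, rotary_dim, theta=10000.0):
    """Illustrative helper (hypothetical name, not a library function).

    Rotates the first `rotary_dim` entries of `x` for position `pos`
    and leaves the tail x[rotary_dim:] untouched. Entry i is paired
    with entry i + rotary_dim // 2 (half-split layout, an assumption).
    """
    if rotary_dim % 2 != 0 or rotary_dim > len(x):
        raise ValueError("rotary_dim must be even and at most len(x)")
    half = rotary_dim // 2
    out = list(x)  # copy, so the un-rotated tail is carried over as-is
    for i in range(half):
        # Per-pair frequency: theta^(-2i / rotary_dim)
        angle = pos * theta ** (-2.0 * i / rotary_dim)
        c, s = math.cos(angle), math.sin(angle)
        a, b = x[i], x[i + half]
        out[i] = a * c - b * s
        out[i + half] = a * s + b * c
    return out


head = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
rotated = apply_partial_rope(head, pos=3, rotary_dim=4)
```

Because each pair undergoes a plain 2D rotation, the norm of the rotated slice is preserved, and position 0 returns the input unchanged.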

Bug Fixes

Performance Improvements

  • accelerate JIT compilation speed (#618) (eaf73fd)
  • fix prefill kernel performance degradation (step 1) (#602) (595cf60)
  • fix the performance issue of append_paged_kv_cache (#588) (e15f7c9)
  • improve parallelism in RoPE with pos_ids (#609) (ff05155)
  • improve plan performance by using non-blocking memcpy (#547) (41ebe6d)
  • reduce the read and write of shared memory in the FusedAddRMSNormKernel (#592) (2043ca2)
  • remove unnecessary contiguous operation in block sparse attention (#561) (7a7ad46)
  • speedup jit compilation of prefill attention kernels (#632) (a059586)
  • use cuda-core implementation for io-bound block-sparse attention (#560) (3fbf028)
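Several entries in this release touch RMSNorm (`gemma_rmsnorm`, `gemma_fused_add_rmsnorm`, and the `FusedAddRMSNormKernel` changes). As a reference for the math only, here is a small pure-Python sketch under stated assumptions: the Gemma variant scales by `1 + weight` rather than `weight`, and the fused form first adds the input into the residual and then normalizes the sum. The helper names `rmsnorm_ref` and `fused_add_rmsnorm_ref` are hypothetical; the actual kernels operate on GPU tensors and are not reproduced here.

```python
import math


def rmsnorm_ref(x, weight, eps=1e-6, gemma=False):
    """Reference RMSNorm over one vector (hypothetical helper).

    With gemma=True the scale is (1 + weight), which is the convention
    assumed here for the Gemma variant.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * ((1.0 + w) if gemma else w) for v, w in zip(x, weight)]


def fused_add_rmsnorm_ref(x, residual, weight, eps=1e-6, gemma=False):
    """Add x into the residual, then normalize the sum.

    Returns (normalized_output, updated_residual) instead of mutating
    in place, to keep the sketch easy to follow.
    """
    new_residual = [a + b for a, b in zip(x, residual)]
    return rmsnorm_ref(new_residual, weight, eps, gemma), new_residual


out, res = fused_add_rmsnorm_ref([1.0, 2.0], [2.0, 2.0], [1.0, 1.0], eps=0.0)
```

A reference like this is mainly useful for checking a fused kernel's output against an unfused computation within a floating-point tolerance.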

This PR was generated with Release Please. See documentation.

@github-actions github-actions bot force-pushed the release-please--branches--main branch 6 times, most recently from 952ff56 to 2f4844c Compare September 1, 2024 15:33
@github-actions github-actions bot force-pushed the release-please--branches--main branch 3 times, most recently from 46381dc to 236fac3 Compare September 11, 2024 06:01
@github-actions github-actions bot force-pushed the release-please--branches--main branch 2 times, most recently from 7bea067 to b22fe36 Compare September 19, 2024 10:48
@github-actions github-actions bot force-pushed the release-please--branches--main branch 2 times, most recently from 7c4096a to 7b5eabc Compare September 26, 2024 18:27
@github-actions github-actions bot force-pushed the release-please--branches--main branch 5 times, most recently from 7ca768f to 51f1fa2 Compare October 11, 2024 01:27
@github-actions github-actions bot force-pushed the release-please--branches--main branch 6 times, most recently from 72cf12c to 2c807c2 Compare October 20, 2024 08:27
@github-actions github-actions bot force-pushed the release-please--branches--main branch 5 times, most recently from 39eaf65 to 6848941 Compare October 25, 2024 02:51
@github-actions github-actions bot force-pushed the release-please--branches--main branch 8 times, most recently from 3e93234 to f524db3 Compare November 6, 2024 23:03
@github-actions github-actions bot force-pushed the release-please--branches--main branch 7 times, most recently from a6e6ada to 6c73b6f Compare November 14, 2024 07:46
@github-actions github-actions bot force-pushed the release-please--branches--main branch 5 times, most recently from 04a3374 to 36efd86 Compare November 21, 2024 01:10
@github-actions github-actions bot force-pushed the release-please--branches--main branch 8 times, most recently from baf5381 to 752e912 Compare November 26, 2024 08:38
@github-actions github-actions bot force-pushed the release-please--branches--main branch from 752e912 to 4ac2de5 Compare December 1, 2024 04:48