AI & ML interests
I like to fine-tune the small models of the Doge series.
Article: Trainable Dynamic Mask Sparse Attention: Bridging Efficiency and Effectiveness in Long-Context Language Models