  1. Commits · SqueezeAILab/TinyAgent · GitHub

    Commit history. Commits on Sep 4, 2024: Add pointer to ArXiv (#3), authored by sidjha1 (SHA cc45c0e).

  2. GitHub - SqueezeAILab/SqueezedAttention: [ACL 2025] Squeezed …

    Squeezed Attention is a method to accelerate attention for long input prompts where a large portion of the input prompt is fixed across successive user queries. Many LLM applications require processing …

  3. GitHub - SqueezeAILab/MultipoleAttention: [NeurIPS 2025] Multipole ...

    [NeurIPS 2025] Multipole Attention for Efficient Long Context Reasoning

  4. SqueezeLLM: Dense-and-Sparse Quantization - GitHub

    SqueezeLLM is a post-training quantization framework that incorporates a new method called Dense-and-Sparse Quantization to enable efficient LLM serving. TLDR: Deploying LLMs is difficult due to …

  5. SqueezeAILab/plan-and-act - GitHub

    We introduce Plan-and-Act, a framework that enables accurate and reliable long-horizon task solving with explicit planning. We additionally introduce a synthetic data generation method for training the …

  6. ETS/rebase.py at main · SqueezeAILab/ETS · GitHub

    ETS: Efficient Tree Search for Inference-Time Scaling - ETS/rebase.py at main · SqueezeAILab/ETS

  7. GitHub - SqueezeAILab/QuantSpec

    Feb 22, 2025 · SqueezeAILab/QuantSpec repository on GitHub.

  8. LLM2LLM/GSM8K/config.yaml at main - GitHub

    [ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement - SqueezeAILab/LLM2LLM

  9. SqueezedAttention/offline_clustering.py at main - GitHub

    SqueezedAttention / offline_clustering.py