huggingface


Looks like it can be used to host an online demo (projectGAN).

It seems training is possible too, but uploading the data looks too painful.

https://golden.com/wiki/Hugging_Face-39P6RJJ

safetensors

The project uses .safetensors together with hf_hub_download() for two practical reasons: safe, portable weight files and easy, cached distribution.

  • Why .safetensors for model weights

    • Security: unlike PyTorch’s common .pt/.pth (often produced via torch.save / pickle), .safetensors is a simple tensor container format and doesn’t execute code during load. That reduces “arbitrary code execution” risk when loading third‑party checkpoints.
    • Faster + memory-friendly: the file is a small JSON header followed by raw tensor bytes, so it can be memory-mapped and individual tensors can be read lazily instead of loading the whole file; how much this speeds up loading depends on the stack.
    • Ecosystem standard: Hugging Face tooling and a lot of modern model releases standardize on .safetensors, so it’s a convenient default.
  • Why load via hf_hub_download()

    • Works with both local and Hugging Face Hub: in __init__.py, from_pretrained() checks whether "{path}.json" and "{path}.safetensors" exist locally; if not, it treats path as a Hub identifier and downloads.
    • Caching + resumable downloads: hf_hub_download() stores files in the local HF cache, avoids re-downloading, and handles partial downloads better than ad-hoc requests.
    • Versioning & reproducibility: the Hub supports revisions/commits/tags; hf_hub_download() can pin a revision (not shown here, but supported), making runs more reproducible.
    • Distribution convenience: avoids shipping large binaries inside the repo and plays nicely with model hosting (often without needing you to manage Git LFS yourself).
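
To make the security point concrete, here is a minimal standard-library sketch of the .safetensors layout: an 8-byte little-endian header length, a JSON header describing each tensor, then raw bytes. The helper names read_safetensors_header and write_tiny_safetensors are hypothetical and exist only for illustration; real code would normally use the safetensors library, which also validates the file.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without touching tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: header length as an unsigned little-endian 64-bit integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next header_len bytes: UTF-8 JSON mapping tensor names to dtype, shape
        # and byte offsets. Parsing it is plain JSON decoding, nothing is unpickled.
        return json.loads(f.read(header_len).decode("utf-8"))


def write_tiny_safetensors(path):
    """Hand-write a one-tensor file, purely to illustrate the layout (hypothetical helper)."""
    data = struct.pack("<2f", 1.0, 2.0)  # two float32 values = 8 bytes
    header = {"w": {"dtype": "F32", "shape": [2], "data_offsets": [0, len(data)]}}
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # header length
        f.write(header_bytes)                          # JSON header
        f.write(data)                                  # raw tensor bytes
```

Because the header is only JSON and the payload is only bytes, inspecting or loading a third-party checkpoint never runs code embedded in the file, unlike unpickling a .pt/.pth file.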

In this project specifically, from_pretrained() expects two side-by-side artifacts: a config "{path}.json" (to know which class + args to instantiate) and weights "{path}.safetensors" (the actual tensors), then builds the model and calls load_state_dict().
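
The flow described above could look roughly like the following sketch. It is not the project's actual code: the "<org>/<repo>/<model_name>" identifier shape, the {"class": ..., "args": ...} config layout, the registry argument, and the helper name resolve_artifacts are all assumptions made for illustration. The hf_hub_download() and safetensors.torch.load_file() calls are real library functions, imported lazily so the local-only path needs no extra packages.

```python
import json
import os


def resolve_artifacts(path, revision=None):
    """Return (config_path, weights_path) for a local prefix or a Hub identifier."""
    config_path, weights_path = f"{path}.json", f"{path}.safetensors"
    if os.path.exists(config_path) and os.path.exists(weights_path):
        return config_path, weights_path  # both artifacts found side by side locally

    # Not local: treat `path` as a Hub identifier.
    # ASSUMPTION: shape "<org>/<repo>/<model_name>"; the project's real parsing may differ.
    repo_id, _, model_name = path.rpartition("/")
    if repo_id.count("/") != 1 or not model_name:
        raise ValueError(f"expected '<org>/<repo>/<model_name>', got {path!r}")

    from huggingface_hub import hf_hub_download  # lazy import: only needed for Hub paths

    # Files land in the local HF cache; `revision` can pin a branch, tag or commit.
    config_path = hf_hub_download(repo_id, f"{model_name}.json", revision=revision)
    weights_path = hf_hub_download(repo_id, f"{model_name}.safetensors", revision=revision)
    return config_path, weights_path


def from_pretrained(path, registry, revision=None):
    """Instantiate the class named in the config, then load the weights into it."""
    from safetensors.torch import load_file  # lazy import: requires torch + safetensors

    config_path, weights_path = resolve_artifacts(path, revision)
    with open(config_path) as f:
        config = json.load(f)
    # ASSUMPTION: config is {"class": "<name>", "args": {...}}; `registry` maps
    # class names to torch.nn.Module subclasses.
    model = registry[config["class"]](**config["args"])
    model.load_state_dict(load_file(weights_path))
    return model
```

Passing an explicit revision (for example a commit hash) is what makes a run reproducible, since the default branch of a Hub repo can change over time.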

Note that the path string is parsed into repo_id + model_name, and the shape it expects is a little strict.