Common LLM setup for the KUACC cluster in Koc University, Istanbul.

KUIS-AI/kuacc-llm

kuacc-llm

tl;dr: run $ source /datasets/NLP/setenv.sh to set up your environment and access models/datasets without downloading them.

This currently sets the TORCH_HOME and HF_HOME environment variables, which direct the following commands to use the read-only cache under /datasets/NLP:

import transformers, datasets, torchvision
transformers.AutoModelForCausalLM.from_pretrained("gpt2")
datasets.load_dataset("tiny_shakespeare")
torchvision.models.get_model("resnet50", weights="DEFAULT")
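
Before loading anything, it can help to confirm that the environment script actually took effect in the current shell. The following is a minimal sketch, not part of the repository: the helper name cache_report is hypothetical, and it only assumes (from the text above) that both variables should point somewhere under /datasets/NLP. The exact values written by setenv.sh are not checked.

```python
import os

# Assumption: setenv.sh points both variables somewhere under this root.
EXPECTED_ROOT = "/datasets/NLP"

def cache_report(env=None, root=EXPECTED_ROOT):
    """Map each cache variable to True if it is set to a path under `root`."""
    env = os.environ if env is None else env
    report = {}
    for name in ("HF_HOME", "TORCH_HOME"):
        value = env.get(name, "")
        # Accept the root itself or any path below it; reject look-alike prefixes.
        report[name] = value == root or value.startswith(root + "/")
    return report

# A False entry usually means setenv.sh was not sourced in this shell.
print(cache_report())
```

If either entry is False, the libraries fall back to their default per-user caches and will try to download models instead of reading the shared copy.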

To build the Python environment needed to use these models and datasets, run one of:

$ conda env create -f environment.yml
# OR
$ conda create --name llm --file spec-file.txt

To generate text:

from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
inputs = tokenizer("Hello, I am", return_tensors="pt")
tokens = model.generate(**inputs)
tokenizer.decode(tokens[0])

Alternative method to generate text:

from transformers import pipeline
generator = pipeline("text-generation", model="gpt2")
generator("Hello, I'm a language model,")

To investigate weights:

from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("gpt2")
for n,p in model.named_parameters():
  print((n,p.shape))

('transformer.wte.weight', torch.Size([50257, 768]))
('transformer.wpe.weight', torch.Size([1024, 768]))
...

To use a model with lower precision (32, 16, 8, or 4 bit):

import torch
from transformers import AutoModelForCausalLM
m = "facebook/opt-350m"  # gpt2 is not supported with 4/8 bit
AutoModelForCausalLM.from_pretrained(m)  # fp32, defaults to cpu
AutoModelForCausalLM.from_pretrained(m, device_map="auto")     # fp32, gpu if available
AutoModelForCausalLM.from_pretrained(m, device_map="auto", dtype=torch.float16)   # fp16
AutoModelForCausalLM.from_pretrained(m, device_map="auto", dtype=torch.bfloat16)  # bf16, better with overflows
AutoModelForCausalLM.from_pretrained(m, device_map="auto", load_in_8bit=True)  # 8 bit, requires bitsandbytes
AutoModelForCausalLM.from_pretrained(m, device_map="auto", load_in_4bit=True)  # 4 bit, requires bitsandbytes
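
To get a feel for what each precision buys, the weight storage can be estimated as parameters times bits per parameter. This is only a rough sketch: the helper weight_bytes is hypothetical, the parameter count is the chopt-350m figure from the "Some model sizes" table used as a stand-in, and real 8/4 bit loading keeps some layers in higher precision and stores extra quantization constants, so actual usage will be somewhat higher.

```python
def weight_bytes(n_params, bits):
    """Approximate bytes needed to store n_params weights at the given bit width."""
    return n_params * bits // 8

# Illustrative count only: chopt-350m from the model size table.
n_params = 331_194_368

for bits in (32, 16, 8, 4):
    gib = weight_bytes(n_params, bits) / 2**30
    print(f"{bits:>2} bit: {gib:.2f} GiB")
```

The estimate covers weights only; activations, the KV cache, and optimizer state (when training) come on top.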

To use an LLM with an adapter:

from peft import PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer
from huggingface_hub import snapshot_download

snapshot_download(repo_id="esokullu/llama-13B-ft-sokullu-lora")
base_model_name_or_path = "huggyllama/llama-13b"
lora_model_name_or_path = "/datasets/NLP/huggingface/hub/models--esokullu--llama-13B-ft-sokullu-lora/snapshots/542c2f91183ac5bc5ed13d5130161b11b7bcc9b8" # "sokullu/sokullu-lora-13b"
tokenizer = LlamaTokenizer.from_pretrained(base_model_name_or_path)
model = LlamaForCausalLM.from_pretrained(base_model_name_or_path, load_in_8bit=True, device_map="auto")
model = PeftModel.from_pretrained(model, lora_model_name_or_path)

To investigate activations:

TODO

Downloaded resources:

huggingface/transformers

  • Aeala/GPT4-x-AlpacaDente2-30b (61G) (from open_llm_leaderboard)
  • ai-forever/mGPT (3.3G)
  • aisquared/chopt-* (125m (240M), 350m (635M), 1_3b (2.5G), 2_7b (5.0G))
  • aisquared/chopt-research-* (125m (240M), 350m (635M), 1_3b (2.5G), 2_7b (5.0G))
  • aisquared/dlite-v2-* (124m (253M), 355m (704M), 774m (1.5G), 1_5b (3.0G)) (from open-llms, open_llm_leaderboard, lightweight gpt-2 based, finetuned)
  • ausboss/llama-30b-supercot (61G) (needs 128G, out-of-memory error with 64G, high on open_llm_leaderboard)
  • bigcode/starcoder (60G)
  • bigscience/bloom-560m (1.1G), bigscience/bloomz-3b (5.7G)
  • CarperAI/stable-vicuna-13b-delta (25G) (from open_llm_leaderboard)
  • cerebras/Cerebras-GPT-* (111M (467M), 256M (1.1G), 590M (2.3G), 1.3B (5.1G), 2.7B (11G), 6.7B (26G), 13B (49G)) (from open-llms, open_llm_leaderboard)
  • chainyo/alpaca-lora-7b (13G) (from open_llm_leaderboard)
  • chavinlo/gpt4-x-alpaca (49G) (from open_llm_leaderboard)
  • databricks/dolly-* (v1-6b (12G), v2-3b (5.4G), v2-7b (13G), v2-12b (23G)) (from open-llms, open_llm_leaderboard)
  • decapoda-research/llama-* (7b-hf (13G), 13b-hf (37G), 30b-hf (77G)) AutoConfig:ok, AutoTokenizer:wrong-name-error, AutoModel:ok
  • digitous/Alpacino30b (61G) (from open_llm_leaderboard)
  • eachadea/vicuna-* (7b-1.1 (13G), 13b (25G)) (from open_llm_leaderboard)
  • ehartford/Wizard-Vicuna-* (7B-Uncensored (26G), 13B-Uncensored (49G)) (see https://t.co/9vrPyktaIz)
  • ehartford/WizardLM-* (7B-Uncensored (13G), 13B-Uncensored (25G), 30B-Uncensored (61G)) (see https://t.co/9vrPyktaIz)
  • EleutherAI/gpt-* (j-6b (23G), neo-125m (505M), neox-20b (39G)) (from open-llms, open_llm_leaderboard)
  • EleutherAI/pythia-* (70m (160M), 160m (360M), 410m (873M), 1b (2.0G), 1.4b (2.8G), 2.8b (5.4G), 6.9b (13G), 12b (23G)) (from open-llms)
  • facebook/llama-* (7B (13G), 13B (25G)) (the originals, not an hf repo, to load use e.g. AutoModelForCausalLM.from_pretrained("/datasets/NLP/huggingface/hub/models--facebook--llama-7B"))
  • facebook/opt-* (125m (242M), 350m (636M), 1.3b (2.5G), 13b (25G)) (from open_llm_leaderboard)
  • facebook/xglm-* (564M (1.1G), 1.7B (3.3G), 2.9B (5.6G), 4.5B (9.6G), 7.5B (15G))
  • garage-bAInd/Platypus-30B (61G)
  • garage-bAInd/Platypus2-* (7B (13G), 13B (25G), 70B (129G), 70B-instruct (129G), Camel--13B (25G), Camel--70B (129G), Stable--13B (25G), GPlatty-30B (61G), SuperPlatty-30B (61G))
  • google/flan-t5-* (small (298M), base (949M), large (3.0G), xl (11G), xxl (43G))
  • google/flan-ul2 (37G), google/ul2 (37G)
  • h2oai/h2ogpt-oig-oasst1-512-6.9b (13G) (from open-llms)
  • hakurei/instruct-12b (45G)
  • HuggingFaceH4/starchat-alpha (30G) (from open_llm_leaderboard)
  • huggyllama/llama-* (7b (13G), 13b (25G), 30b (61G), 65b (123G))
  • KoboldAI/OPT-13B-Nerybus-Mix (25G) (from open_llm_leaderboard)
  • lamini/instruct-tuned-3b (5.7G)
  • lmsys/fastchat-t5-3b-v1.0 (6.3G)
  • lmsys/vicuna-* (7b-delta-v1.1 (13G), 13b-delta-v1.1 (25G), 7b (13G), 13b (25G), 33b-v1.3 (61G)) (for 7b/13b use e.g. AutoModelForCausalLM.from_pretrained("/datasets/NLP/huggingface/hub/models--lmsys--vicuna-7b"))
  • meta-llama/Llama-2-* (7b-hf (13G), 7b-chat-hf (13G), 13b-hf (25G), 13b-chat-hf (25G), 70b-hf (129G), 70b-chat-hf (129G))
  • MetaIX/GPT4-X-Alpasta-30b (61G) (from open_llm_leaderboard)
  • mosaicml/mpt-* (1b-redpajama-200b (5.0G), 1b-redpajama-200b-dolly (5.0G), 7b (13G), 7b-chat (13G), 7b-instruct (13G), 7b-storywriter (13G)) (from open-llms, requires einops, trust_remote_code=True, see hf page for details)
  • nomic-ai/gpt4all-* (13b-snoozy (49G), j (23G)) (gururise refers to it but don't know how to download -lora, seems worse than tloen/alpaca-lora-7b) (from open_llm_leaderboard)
  • openaccess-ai-collective/manticore-13b (25G)
  • openai-gpt (461M), gpt2 (528M), gpt2-medium (1.5G), gpt2-large (3.1G), gpt2-xl (6.1G), distilgpt2 (341M) (from open_llm_leaderboard)
  • OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 (23G) (from open-llms, open_llm_leaderboard)
  • openlm-research/open_llama_7b_preview_300bt (13G) (from open-llms, open_llm_leaderboard, use AutoModelForCausalLM.from_pretrained("openlm-research/open_llama_7b_preview_300bt", subfolder="open_llama_7b_preview_300bt_transformers_weights"))
  • Pirr/pythia-13b-deduped-green_devil (23G) (from open_llm_leaderboard)
  • pythainlp/wangchanglm-7.5B-sft-en-sharded (32G) (from open_llm_leaderboard)
  • Salesforce/codegen-16B-multi (31G) (from open_llm_leaderboard)
  • stabilityai/stablelm-* (base-alpha-3b (14G), tuned-alpha-3b (14G), tuned-alpha-7b (30G)) (from open-llms, open_llm_leaderboard)
  • TheBloke/dromedary-65b-lora-HF (123G) (from open_llm_leaderboard)
  • TheBloke/vicuna-13B-1.1-HF (25G) (from open_llm_leaderboard)
  • TheBloke/wizardLM-7B-HF (13G)
  • tiiuae/falcon-* (rw-1b (2.5G), rw-7b (15G), 7b (14G), 7b-instruct (14G), 40b (79G), 40b-instruct (79G))
  • togethercomputer/GPT-* (JT-6B-v0 (12G), JT-Moderation-6B (12G), NeoXT-Chat-Base-20B (39G))
  • togethercomputer/Pythia-Chat-Base-7B (13G)
  • togethercomputer/RedPajama-INCITE-* (Base-3B-v1 (5.4G), Base-7B-v0.1 (13G), Chat-3B-v1 (5.4G), Chat-7B-v0.1 (13G), Instruct-3B-v1 (5.4G), Instruct-7B-v0.1 (13G)) (from open-llms, open_llm_leaderboard)
  • vicgalle/gpt2-alpaca-gpt4 (492M) (from open_llm_leaderboard)
  • wordcab/llama-natural-instructions-13b (37G) (from open_llm_leaderboard)
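
The local paths quoted in the list above follow the Hugging Face hub cache naming scheme, in which a repo id "org/name" becomes a folder "models--org--name". The sketch below shows that mapping; hub_folder is a hypothetical helper, and the root is an assumption taken from the paths quoted in this README.

```python
from pathlib import PurePosixPath

# Assumption: hub cache root as it appears in the paths quoted in this README.
HUB_ROOT = PurePosixPath("/datasets/NLP/huggingface/hub")

def hub_folder(repo_id, root=HUB_ROOT):
    """Return the top-level cache folder for a model repo id such as 'org/name'."""
    return str(root / ("models--" + repo_id.replace("/", "--")))

print(hub_folder("lmsys/vicuna-7b"))
# → /datasets/NLP/huggingface/hub/models--lmsys--vicuna-7b
```

For regular hub repositories the model files sit one level deeper, under snapshots/<revision> inside this folder, so passing the repo id to from_pretrained is normally simpler than passing a path.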

huggingface/datasets

  • ai2_arc (2.3M) (from hf_leaderboard)
  • allenai/prosocial-dialog (92M) (from emirhan)
  • amazon_reviews_multi (368M) (from emirhan)
  • big_patent (40G) (from emirhan)
  • billsum (261M) (from emirhan)
  • bookcorpus (4.6G)
  • cais/mmlu (8.6G) (from hf_leaderboard)
  • ccdv/cnn_dailymail (1.3G) (from emirhan)
  • checkai/instruction-poems (27M) (from emirhan)
  • databricks/databricks-dolly-15k (12M) (from emirhan)
  • dctanner/oa_recipes (7.4M) (from emirhan)
  • donfu/oa-stackexchange (6.2G) (from emirhan)
  • ehartford/oa_leet10k (46M) (from emirhan)
  • EleutherAI/truthful_qa_mc (216K) (from hf_leaderboard)
  • emozilla/soda_synthetic_dialogue (1.8G) (from emirhan)
  • enwik8 (99M)
  • garage-bAInd/open-platypus (462M)
  • glue (232M)
  • hellaswag (63M) (from hf_leaderboard)
  • imdb (128M)
  • MBZUAI/LaMini-instruction (1.1G) (2M chatGPT outputs for different prompts, from emirhan)
  • mikegarts/oa_tell_a_joke_10000 (5.9G) (from emirhan)
  • mosaicml/dolly_hhrlhf (46M)
  • multi_news (668M) (from emirhan)
  • nomic-ai/gpt4all-j-prompt-generations (1.7G) (from emirhan)
  • OllieStanley/humaneval-mbpp-codegen-qa (244K) (from emirhan)
  • OllieStanley/humaneval-mbpp-testgen-qa (320K) (from emirhan)
  • OllieStanley/oa_camel (227M) (from emirhan)
  • openwebtext (38G)
  • piqa (5.2M)
  • ptb_text_only (5.8M)
  • samsum (11M) (from emirhan)
  • sciq (7.4M)
  • squad (87M)
  • super_glue (285M)
  • tatsu-lab/alpaca (45M) (the original)
  • tiiuae/falcon-refinedweb (2.6T)
  • tiny_shakespeare (1.2M)
  • totuta/youtube_subs_howto100M (1.2G) (from emirhan)
  • victor123/evol_instruct_70k (126M) (from emirhan)
  • wikitext (1.1G)
  • xsum (510M) (from emirhan)
  • yahma/alpaca-cleaned (39M) (https://github.com/gururise/AlpacaDataCleaned as of 2023-04-10)

torchvision.models

  • all 121 models listed in torchvision.models.list_models() (25G):

['alexnet', 'convnext_base', 'convnext_large', 'convnext_small', 'convnext_tiny', 'deeplabv3_mobilenet_v3_large', 'deeplabv3_resnet101', 'deeplabv3_resnet50', 'densenet121', 'densenet161', 'densenet169', 'densenet201', 'efficientnet_b0', 'efficientnet_b1', 'efficientnet_b2', 'efficientnet_b3', 'efficientnet_b4', 'efficientnet_b5', 'efficientnet_b6', 'efficientnet_b7', 'efficientnet_v2_l', 'efficientnet_v2_m', 'efficientnet_v2_s', 'fasterrcnn_mobilenet_v3_large_320_fpn', 'fasterrcnn_mobilenet_v3_large_fpn', 'fasterrcnn_resnet50_fpn', 'fasterrcnn_resnet50_fpn_v2', 'fcn_resnet101', 'fcn_resnet50', 'fcos_resnet50_fpn', 'googlenet', 'inception_v3', 'keypointrcnn_resnet50_fpn', 'lraspp_mobilenet_v3_large', 'maskrcnn_resnet50_fpn', 'maskrcnn_resnet50_fpn_v2', 'maxvit_t', 'mc3_18', 'mnasnet0_5', 'mnasnet0_75', 'mnasnet1_0', 'mnasnet1_3', 'mobilenet_v2', 'mobilenet_v3_large', 'mobilenet_v3_small', 'mvit_v1_b', 'mvit_v2_s', 'quantized_googlenet', 'quantized_inception_v3', 'quantized_mobilenet_v2', 'quantized_mobilenet_v3_large', 'quantized_resnet18', 'quantized_resnet50', 'quantized_resnext101_32x8d', 'quantized_resnext101_64x4d', 'quantized_shufflenet_v2_x0_5', 'quantized_shufflenet_v2_x1_0', 'quantized_shufflenet_v2_x1_5', 'quantized_shufflenet_v2_x2_0', 'r2plus1d_18', 'r3d_18', 'raft_large', 'raft_small', 'regnet_x_16gf', 'regnet_x_1_6gf', 'regnet_x_32gf', 'regnet_x_3_2gf', 'regnet_x_400mf', 'regnet_x_800mf', 'regnet_x_8gf', 'regnet_y_128gf', 'regnet_y_16gf', 'regnet_y_1_6gf', 'regnet_y_32gf', 'regnet_y_3_2gf', 'regnet_y_400mf', 'regnet_y_800mf', 'regnet_y_8gf', 'resnet101', 'resnet152', 'resnet18', 'resnet34', 'resnet50', 'resnext101_32x8d', 'resnext101_64x4d', 'resnext50_32x4d', 'retinanet_resnet50_fpn', 'retinanet_resnet50_fpn_v2', 's3d', 'shufflenet_v2_x0_5', 'shufflenet_v2_x1_0', 'shufflenet_v2_x1_5', 'shufflenet_v2_x2_0', 'squeezenet1_0', 'squeezenet1_1', 'ssd300_vgg16', 'ssdlite320_mobilenet_v3_large', 'swin3d_b', 'swin3d_s', 'swin3d_t', 'swin_b', 'swin_s', 
'swin_t', 'swin_v2_b', 'swin_v2_s', 'swin_v2_t', 'vgg11', 'vgg11_bn', 'vgg13', 'vgg13_bn', 'vgg16', 'vgg16_bn', 'vgg19', 'vgg19_bn', 'vit_b_16', 'vit_b_32', 'vit_h_14', 'vit_l_16', 'vit_l_32', 'wide_resnet101_2', 'wide_resnet50_2']

downloading

  • https://huggingface.co/datasets/tiiuae/falcon-refinedweb
  • https://huggingface.co/conceptofmind/Flan-Open-Llama-7b
  • nomic-ai/gpt4all-lora -- error
  • nomic-ai/gpt4all-j-lora -- error
  • ehartford/alpaca1337-13b-4bit -- error: OSError: ehartford/alpaca1337-13b-4bit does not appear to have a file named config.json. Checkout 'https://huggingface.co/ehartford/alpaca1337-13b-4bit/main' for available files.
  • ehartford/alpaca1337-7b-4bit -- error: OSError: ehartford/alpaca1337-7b-4bit does not appear to have a file named config.json. Checkout 'https://huggingface.co/ehartford/alpaca1337-7b-4bit/main' for available files.
  • laion/OIG () (from emirhan) -- error: Generating train split: 14113288 examples [36:34, 4918.58 examples/s]Failed to read file '/datasets/NLP/huggingface/datasets/downloads/extracted/13d1404eac66ab41c857612e073018ab83a1dcd1293cc32464dead7b4ce933ba' with error <class 'pyarrow.lib.ArrowInvalid'>: JSON parse error: Missing a comma or '}' after an object member. in row 10
  • EleutherAI/pile () -- error: json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 10 (char 9)
  • gsm8k (4.6M) (from emirhan) -- downloaded 'main', 'socratic', but gives error with load_dataset: FileNotFoundError: Unable to resolve any data file that matches '['train[-._ 0-9/]**', ...
  • Hello-SimpleAI/HC3 (from emirhan) -- AttributeError: 'NoneType' object has no attribute 'name'

List of companies/users with SOTA models/datasets we should follow:

Model Lists and Evaluation

Some model sizes

model          layer  embd  nhead  vocab  nctx          params
openai-gpt        12   768     12  40478   512     116_534_784
gpt2              12   768     12  50257  1024     124_439_808
gpt2-medium       24  1024     16  50257  1024     354_823_168
gpt2-large        36  1280     20  50257  1024     774_030_080
gpt2-xl           48  1600     25  50257  1024   1_557_611_200
pythia-70m         6   512      8  50304  2048      70_426_624
pythia-160m       12   768     12  50304  2048     162_322_944
pythia-410m       24  1024     16  50304  2048     405_334_016
pythia-1b         16  2048      8  50304  2048   1_011_781_632
pythia-1.4b       24  2048     16  50304  2048   1_414_647_808
pythia-2.8b       32  2560     32  50304  2048   2_775_208_960
pythia-6.9b       32  4096     32  50432  2048   6_857_302_016
pythia-12b        36  5120     40  50688  2048  11_846_072_320
chopt-125m        12   768     12  50268  2048     125_236_224
chopt-350m        24  1024     16  50268  2048     331_194_368
chopt-1_3b        24  2048     32  50268  2048   1_315_749_888
chopt-2_7b        32  2560     32  50268  2048   2_651_586_560
dlite-v2-124m     12   768     12  50260  1024     124_442_112
dlite-v2-355m     24  1024     16  50260  1024     354_826_240
dlite-v2-774m     36  1280     20  50260  1024     774_033_920
dlite-v2-1_5b     48  1600     25  50260  1024   1_557_616_000
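
For the GPT-2 family, the params column can be reproduced from the other columns. The sketch below assumes the standard GPT-2 block layout (fused QKV projection, 4x MLP, two layer norms per block, a final layer norm, and an output head tied to the token embedding); gpt2_param_count is a hypothetical helper, and the formula is not claimed to hold for the Pythia or OPT-based rows, whose layouts differ.

```python
def gpt2_param_count(n_layer, n_embd, vocab, n_ctx):
    """Parameter count of a GPT-2 style decoder with a tied output head."""
    d = n_embd
    token_emb = vocab * d                  # transformer.wte
    pos_emb = n_ctx * d                    # transformer.wpe
    attn = (d * 3 * d + 3 * d) + (d * d + d)           # fused QKV + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)        # up and down projections
    layer_norms = 2 * (2 * d)              # two layer norms per block (weight + bias)
    final_ln = 2 * d                       # transformer.ln_f
    return token_emb + pos_emb + n_layer * (attn + mlp + layer_norms) + final_ln

print(gpt2_param_count(n_layer=12, n_embd=768, vocab=50257, n_ctx=1024))
# → 124439808
```

The head count does not enter the formula because it only changes how the attention matrices are split, not their size.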
