
High PPL when groupsize != -1 for OPT model after replacing linear layers with QuantLinear #275

Open
hyx1999 opened this issue Jul 6, 2023 · 1 comment

Comments

@hyx1999

hyx1999 commented Jul 6, 2023

I tried to test GPTQ's PPL on the OPT model via opt.py. With fake quantization, the PPL of the OPT model is normal. However, when I call opt_pack before opt_eval and set groupsize to a value other than -1 (e.g. 128), the PPL of the packed model is much larger than that of the fake-quantized model. When groupsize is set to -1, everything is fine.

wbits=4, groupsize=128, without opt_pack
wikitext2
Evaluating ... (samples 0-11)
28.715469360351562

wbits=4, groupsize=128, with opt_pack
wikitext2
Evaluating ... (samples 0-11)
778.898193359375

    # pack the quantized model (opt_pack) before opt_eval
    if not args.load and args.wbits < 16 and not args.nearest:
        model = opt_pack(model, quantizers, args.wbits, args.groupsize)

    print("model:", "\n", model)

    if args.eval:
        datasets = ['wikitext2']
        if args.new_eval:
            datasets = ['wikitext2']
        for dataset in datasets:
            dataloader, testloader = get_loaders(dataset, seed=args.seed, model=args.model, seqlen=model.seqlen, cache_dir=args.cache_dir)
            print(dataset)
            opt_eval(model, testloader, DEV)
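
For reference, a minimal NumPy sketch (independent of this repo's code; function names are my own) of how group-wise min/max quantization is supposed to index its scales: each contiguous block of groupsize input features shares one scale/zero pair, and groupsize=-1 collapses to a single pair per output row. If the packed QuantLinear kernel picked the wrong scale per group, reconstruction error would blow up exactly as in the logs above:

```python
import numpy as np

def quantize_groupwise(w, wbits=4, groupsize=128):
    """Fake-quantize a weight matrix with per-group min/max affine scales.

    Each contiguous block of `groupsize` columns (input features) gets its
    own scale/zero pair; groupsize=-1 means one pair per row.
    """
    rows, cols = w.shape
    gs = cols if groupsize == -1 else groupsize
    maxq = 2 ** wbits - 1
    out = np.empty_like(w)
    for g in range(0, cols, gs):
        block = w[:, g:g + gs]
        wmin = block.min(axis=1, keepdims=True)
        wmax = block.max(axis=1, keepdims=True)
        scale = (wmax - wmin) / maxq          # one scale per row per group
        zero = np.round(-wmin / scale)        # asymmetric zero point
        q = np.clip(np.round(block / scale) + zero, 0, maxq)
        out[:, g:g + gs] = scale * (q - zero) # dequantize with the SAME pair
    return out

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 256)).astype(np.float32)
err_per_row = np.abs(w - quantize_groupwise(w, groupsize=-1)).mean()
err_grouped = np.abs(w - quantize_groupwise(w, groupsize=128)).mean()
# Finer groups should reconstruct at least as well as one scale per row,
# so a packed model that gets WORSE with groupsize=128 suggests the kernel
# is mismatching groups and scales rather than a quantization-quality issue.
assert err_grouped <= err_per_row
```
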
@hyx1999
Author

hyx1999 commented Jul 6, 2023

I ran the above test with facebook/opt-125m.

Labels: none
1 participant