
High PPL when groupsize != -1 for OPT model after replacing linear layers with QuantLinear #275

Open
hyx1999 opened this issue Jul 6, 2023 · 1 comment

Comments

@hyx1999

hyx1999 commented Jul 6, 2023

I tried to test GPTQ's PPL on the OPT model via opt.py. With fake quantization, the PPL of the OPT model is normal. However, when I call opt_pack before opt_eval and set groupsize to a value other than -1 (e.g. 128), the PPL of the packed model is much larger than that of the fake-quantized model. When groupsize is set to -1, everything is fine.

wbits=4, groupsize=128, without opt_pack
wikitext2
Evaluating ... (samples 0-11)
28.715469360351562

wbits=4, groupsize=128, with opt_pack
wikitext2
Evaluating ... (samples 0-11)
778.898193359375

    # pack the quantized model (opt_pack) before opt_eval
    if not args.load and args.wbits < 16 and not args.nearest:
        model = opt_pack(model, quantizers, args.wbits, args.groupsize)

    print("model:", "\n", model)

    if args.eval:
        datasets = ['wikitext2']
        if args.new_eval:
            datasets = ['wikitext2']
        for dataset in datasets:
            dataloader, testloader = get_loaders(dataset, seed=args.seed, model=args.model, seqlen=model.seqlen, cache_dir=args.cache_dir)
            print(dataset)
            opt_eval(model, testloader, DEV)
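
For reference, a minimal NumPy sketch (independent of this repo's code; function names are my own) of how group-wise min/max quantization is supposed to index its scales: each contiguous block of groupsize input features shares one scale/zero pair, and groupsize=-1 collapses to a single pair per output row. If the packed QuantLinear kernel picked the wrong scale per group, reconstruction error would blow up exactly as in the logs above:

```python
import numpy as np

def quantize_groupwise(w, wbits=4, groupsize=128):
    """Fake-quantize a weight matrix with per-group min/max affine scales.

    Each contiguous block of `groupsize` columns (input features) gets its
    own scale/zero pair; groupsize=-1 means one pair per row.
    """
    rows, cols = w.shape
    gs = cols if groupsize == -1 else groupsize
    maxq = 2 ** wbits - 1
    out = np.empty_like(w)
    for g in range(0, cols, gs):
        block = w[:, g:g + gs]
        wmin = block.min(axis=1, keepdims=True)
        wmax = block.max(axis=1, keepdims=True)
        scale = (wmax - wmin) / maxq          # one scale per row per group
        zero = np.round(-wmin / scale)        # asymmetric zero point
        q = np.clip(np.round(block / scale) + zero, 0, maxq)
        out[:, g:g + gs] = scale * (q - zero) # dequantize with the SAME pair
    return out

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 256)).astype(np.float32)
err_per_row = np.abs(w - quantize_groupwise(w, groupsize=-1)).mean()
err_grouped = np.abs(w - quantize_groupwise(w, groupsize=128)).mean()
# Finer groups should reconstruct at least as well as one scale per row,
# so a packed model that gets WORSE with groupsize=128 suggests the kernel
# is mismatching groups and scales rather than a quantization-quality issue.
assert err_grouped <= err_per_row
```
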
@hyx1999
Author

hyx1999 commented Jul 6, 2023

I ran the above test with facebook/opt-125m.

Labels: none
1 participant