
ttnn.mean op - Tensor Mismatch #869

Open
chandrasekaranpradeep opened this issue Oct 8, 2024 · 3 comments
@chandrasekaranpradeep
Summary:
The ttnn.mean op fails with a tensor mismatch (PCC: 0.7203957195745748).
Details:
ttnn.mean produces a tensor mismatch: the PCC drops to 0.72 when an input tensor of shape (1, 12, 3200) with dim = -1 is passed to the reduce_mean (i.e. ttnn.mean) op in Forge. The mismatch is observed when comparing the PyTorch and Forge (i.e. ttnn) outputs.

For more context, here is the exact error message:

  >       assert compare_with_golden_pcc(golden=fw_out, calculated=co_out[0], pcc=0.99)
E       assert False
E        +  where False = compare_with_golden_pcc(golden=tensor([[[0.4979],\n         [0.4969],\n         [0.5080],\n         [0.5029],\n         [0.5012],\n         [0.5046],\n         [0.4993],\n         [0.5034],\n         [0.5109],\n         [0.4984],\n         [0.4972],\n         [0.4963]]]), calculated=tensor([[[0.4648],\n         [0.4707],\n         [0.4844],\n         [0.4727],\n         [0.4570],\n         [0.4766],\n         [0.4727],\n         [0.4785],\n         [0.4883],\n         [0.4707],\n         [0.4766],\n         [0.4648]]]), pcc=0.99)

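For reference, a PCC check like the `compare_with_golden_pcc` helper above is a Pearson correlation over the flattened tensors. The helper's exact implementation lives in the Forge repo, so the sketch below is a reimplementation under that assumption:

```python
import torch

def pcc(golden: torch.Tensor, calculated: torch.Tensor) -> float:
    # Pearson correlation coefficient over the flattened tensors,
    # computed in float64 to avoid the metric itself losing precision.
    g = golden.flatten().to(torch.float64)
    c = calculated.flatten().to(torch.float64)
    g = g - g.mean()
    c = c - c.mean()
    return float((g @ c) / (g.norm() * c.norm()))

torch.manual_seed(0)
golden = torch.rand(1, 12, 3200).mean(dim=-1, keepdim=True)
# Matching kernels typically score > 0.99; a PCC of 0.72, as in this
# failure, indicates a large systematic deviation.
print(pcc(golden, golden + 0.03 * torch.randn_like(golden)))
```
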
Repro:
TTIR:

 module @ReduceMean attributes {tt.system_desc = #tt.system_desc<[{arch = <wormhole_b0>, grid = 8x8, l1_size = 1499136, num_dram_channels = 12, dram_channel_size = 1073741824, noc_l1_address_align_bytes = 16, pcie_address_align_bytes = 32, noc_dram_address_align_bytes = 32, l1_unreserved_base = 1024, erisc_l1_unreserved_base = 1024, dram_unreserved_base = 1024, dram_unreserved_end = 1073741824, physical_cores = {worker = [ 0x0,  0x1,  0x2,  0x3,  0x4,  0x5,  0x6,  0x7,  1x0,  1x1,  1x2,  1x3,  1x4,  1x5,  1x6,  1x7,  2x0,  2x1,  2x2,  2x3,  2x4,  2x5,  2x6,  2x7,  3x0,  3x1,  3x2,  3x3,  3x4,  3x5,  3x6,  3x7,  4x0,  4x1,  4x2,  4x3,  4x4,  4x5,  4x6,  4x7,  5x0,  5x1,  5x2,  5x3,  5x4,  5x5,  5x6,  5x7,  6x0,  6x1,  6x2,  6x3,  6x4,  6x5,  6x6,  6x7,  7x0,  7x1,  7x2,  7x3,  7x4,  7x5,  7x6,  7x7] dram = [ 8x0,  9x0,  10x0,  8x1,  9x1,  10x1,  8x2,  9x2,  10x2,  8x3,  9x3,  10x3]}, supported_data_types = [<f32>, <f16>, <bf16>, <bfp_f8>, <bfp_bf8>, <bfp_f4>, <bfp_bf4>, <bfp_f2>, <bfp_bf2>, <u32>, <u16>, <u8>], supported_tile_sizes = [ 4x16,  16x16,  32x16,  4x32,  16x32,  32x32]}], [0], [3 : i32], [ 0x0x0x0]>} {
  func.func @forward(%arg0: tensor<1x12x3200xf32> {ttir.name = "a"}) -> (tensor<1x12x1xf32> {ttir.name = "ReduceMean.output_reduce_avg_0"}) {
    %0 = tensor.empty() : tensor<1x12x1xf32>
    %1 = "ttir.mean"(%arg0, %0) <{dim_arg = [-1 : i32], keep_dim = true, operand_constraints = [#tt.operand_constraint<dram|l1|scalar|tile|none|interleaved|single_bank|height_sharded|width_sharded|block_sharded|any_layout|any_device|any_device_tile|l1_block_sharded>, #tt.operand_constraint<dram|l1|scalar|tile|none|interleaved|single_bank|height_sharded|width_sharded|block_sharded|any_layout|any_device|any_device_tile|l1_block_sharded>]}> : (tensor<1x12x3200xf32>, tensor<1x12x1xf32>) -> tensor<1x12x1xf32>
    return %1 : tensor<1x12x1xf32>
  }
}

TTNN test cases:

import torch
import ttnn
from tests.ttnn.utils_for_testing import assert_with_pcc


def test_mean_pcc_issue(device):
    torch.manual_seed(0)

    input_shape = (1, 12, 3200)
    reduce_dim = -1

    torch_input_tensor = torch.rand(input_shape, dtype=torch.float32)
    torch_output_tensor = torch.mean(torch_input_tensor, dim=reduce_dim, keepdim=True, dtype=torch.float32)

    input_tensor = ttnn.from_torch(torch_input_tensor, dtype=ttnn.float32, layout=ttnn.TILE_LAYOUT, device=device)

    output_tensor = ttnn.mean(input_tensor, dim=reduce_dim)
    output_tensor = ttnn.to_torch(output_tensor)

    assert_with_pcc(torch_output_tensor, output_tensor)

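One possible explanation for the mismatch is accumulator precision; this is an assumption on our side, not confirmed behavior of ttnn.mean. Reducing 3200 elements with a bfloat16 accumulator loses the small per-element increments once the running sum grows large. The hypothetical sketch below shows how far a naive bf16-accumulated mean drifts from the fp32 reference:

```python
import torch

def bf16_accumulate_mean(t: torch.Tensor) -> torch.Tensor:
    # Naive sequential reduction with the accumulator kept in bfloat16.
    # Once the running sum grows large, bf16's 8-bit mantissa can no
    # longer represent the small per-element additions.
    acc = torch.zeros(t.shape[:-1], dtype=torch.bfloat16)
    for i in range(t.shape[-1]):
        acc = acc + t[..., i].to(torch.bfloat16)
    return (acc / t.shape[-1]).to(torch.float32)

torch.manual_seed(0)
x = torch.rand(1, 12, 3200, dtype=torch.float32)
ref = x.mean(dim=-1)
naive = bf16_accumulate_mean(x)
print((ref - naive).abs().max())  # large error from the saturated accumulator
```
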
Forge test cases:

git checkout pchandrasekaran/rms_norm_and_mean

# Before running the test, comment the xfail for data mismatch in this test
pytest forge/test/mlir/test_ops.py::test_reduce_mean[-1-input_shape2] -vss
@chandrasekaranpradeep (Author)

Created an issue in tt-metal for the ttnn.mean op tensor mismatch: tenstorrent/tt-metal#13621

@nvukobratTT (Contributor) commented Oct 9, 2024

@sdjordjevicTT we also confirmed that there is an issue on the ttnn side; here are the blocker issues at hand:

Note: This one is also marked as P0, as it exists in the Llama 3B model we're referencing.

@sdjordjevicTT (Contributor)

Great! Let's check with the TTNN folks what this is about.
