dwc_eqos - segmentation offload and vlan fixes (#39)
- Add a README.md.
- Add a REG file for enabling a diagnostic ETW autologger.
- Implement TCP segmentation offload (LSOv2).
- Implement UDP fragmentation offload (USO).
- Instead of making a half-baked effort to adapt to disabled hardware
  features (i.e. lack of Checksum or Segment offload support), check for
  the feature and don't load if it's missing. At present, I have nothing
  to test this on and it just makes the code messy. We can add
  adaptability back in the future if anybody actually wants it and has
  hardware to test it.
- Fix problems with error logging in Tx path.
- Fix problems with VLAN tag insertion.
- Fix problems with checksum offload configuration. NetAdapterCx only
  recognizes the "granular" options.
idigdoug authored Jan 17, 2024
1 parent 2736054 commit 87e3e62
Showing 14 changed files with 510 additions and 179 deletions.
12 changes: 12 additions & 0 deletions drivers/net/dwc_eqos/AutoLogger.reg
@@ -0,0 +1,12 @@
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Control\WMI\Autologger\dwc_eqos]
"Guid"="{3fde989c-5470-4452-8f3c-91b0584f5a75}"
"BufferSize"=dword:00000010
"LogFileMode"=dword:08080400
"Start"=dword:00000001

[HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Control\WMI\Autologger\dwc_eqos\{5d8331d3-70b3-5620-5664-db28f48a4b79}]
"Enabled"=dword:00000001
"EnableLevel"=dword:00000004
"EnableFlags"=dword:00000003
48 changes: 48 additions & 0 deletions drivers/net/dwc_eqos/README.md
@@ -0,0 +1,48 @@
# Synopsys DesignWare Ethernet Quality of Service (GMAC) Driver

This is a driver for the Synopsys DesignWare Ethernet Quality of Service (EQoS)
controller found in the RK35xx SoCs, supporting 1Gbps ethernet connections.

## Compatibility

EQoS is a configurable IP block that can be customized and added to a SoC. This
driver has been tested only on the RK3588(s) and assumes the presence of
optional features that may be missing on other SoCs. With minor fixes, it would
probably work on other EQoS-based SoCs. The driver specifically checks for the
following:

- `GMAC_MAC_Version.RKVER` must be 0x51 or 0x52 (other values untested).
- `GMAC_MAC_HW_Feature0.SAVLANINS` must be enabled (require VLAN insertion support).
- `GMAC_MAC_HW_Feature0.RXCOESEL` and `TXCOESEL` must be enabled (require checksum offload support).
- `GMAC_MAC_HW_Feature1.TSOEN` must be enabled (require TCP/UDP segmentation offload support).

There may be other requirements that are assumed but not checked.
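
As a rough illustration of the checks listed above, the following C++ sketch rejects hardware that lacks any required feature. `EqosFeatures`, `LoadResult`, and `CheckRequiredFeatures` are hypothetical names invented for this example and are not taken from the driver source; the driver itself reads the hardware registers and fails `DevicePrepareHardware` with `STATUS_DEVICE_CONFIGURATION_ERROR`.

```cpp
#include <cstdint>

// Hypothetical snapshot of the fields named in the list above.
// The real driver reads these from the GMAC registers.
struct EqosFeatures
{
    std::uint8_t rkVer; // GMAC_MAC_Version.RKVER
    bool saVlanIns;     // GMAC_MAC_HW_Feature0.SAVLANINS
    bool rxCoeSel;      // GMAC_MAC_HW_Feature0.RXCOESEL
    bool txCoeSel;      // GMAC_MAC_HW_Feature0.TXCOESEL
    bool tsoEn;         // GMAC_MAC_HW_Feature1.TSOEN
};

// Hypothetical result codes; the driver reports a single error status instead.
enum class LoadResult
{
    Ok,
    UnsupportedVersion,
    MissingVlanInsertion,
    MissingChecksumOffload,
    MissingSegmentationOffload,
};

// Returns Ok only when every required feature is present.
inline LoadResult CheckRequiredFeatures(EqosFeatures const& f)
{
    if (f.rkVer != 0x51 && f.rkVer != 0x52)
    {
        return LoadResult::UnsupportedVersion;
    }
    if (!f.saVlanIns)
    {
        return LoadResult::MissingVlanInsertion;
    }
    if (!f.rxCoeSel || !f.txCoeSel)
    {
        return LoadResult::MissingChecksumOffload;
    }
    if (!f.tsoEn)
    {
        return LoadResult::MissingSegmentationOffload;
    }
    return LoadResult::Ok;
}
```

Failing early in this way keeps the data path free of per-feature branches, which is the trade-off the commit message describes.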

## ACPI Configuration

This driver uses ACPI properties to configure the EQoS controller's DMA behavior:

- `_DSD\snps,pblx8` (default = 1): Controls the value of `GMAC_DMA_CHx_Control.PBLx8`, i.e. controls whether PBL values are treated as 1-beat units (0) or 8-beat units (1, default).
- `_DSD\snps,pbl` (default = 8): Default value for `txpbl` and `rxpbl`.
- `_DSD\snps,txpbl` (default = `pbl`): Controls the value of `GMAC_DMA_CHx_Tx_Control.TxPBL`, i.e. transmit programmable burst length.
- `_DSD\snps,rxpbl` (default = `pbl`): Controls the value of `GMAC_DMA_CHx_Rx_Control.RxPBL`, i.e. receive programmable burst length.
- `_DSD\snps,fixed-burst` (default = 0): Controls the value of `GMAC_DMA_SysBus_Mode.FB`.
- `_DSD\snps,mixed-burst` (default = 1): Controls the value of `GMAC_DMA_SysBus_Mode.Bit14`.
- `_DSD\snps,axi-config` (default = none): Controls the `$(AXIC)` method name to use for the remaining properties. If not present, the driver will use default values for the remaining properties. Should generally be set to string `"AXIC"`.
- `$(AXIC)\snps,wr_osr_lmt` (default = 4): Controls the value of `GMAC_DMA_SysBus_Mode.WR_OSR_LMT`, i.e. AXI maximum write outstanding request limit.
- `$(AXIC)\snps,rd_osr_lmt` (default = 8): Controls the value of `GMAC_DMA_SysBus_Mode.RD_OSR_LMT`, i.e. AXI maximum read outstanding request limit.
- `$(AXIC)\snps,blen` (default = `{ 16, 8, 4 }`): Controls the values of `GMAC_DMA_SysBus_Mode.BLENx` (x = 4, 8, 16, 32, 64, 128, 256), i.e. AXI burst length. Should be a list of 7 integers, e.g. `Package () { 0, 0, 0, 0, 16, 8, 4 }`.
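
To make the `pblx8` and `blen` semantics concrete, here is a minimal C++ sketch. `EffectiveBurstBeats` and `BlenMask` are hypothetical helpers written for this README, not functions from the driver. The bit order used for the mask (bit 0 = BLEN4 up to bit 6 = BLEN256) is an assumption made for illustration; check the driver's register definitions for the real layout.

```cpp
#include <cstdint>
#include <initializer_list>

// Effective DMA burst length in beats. PBL counts 1-beat units when
// pblX8 is false and 8-beat units when pblX8 is true.
inline unsigned EffectiveBurstBeats(std::uint8_t pbl, bool pblX8)
{
    return pblX8 ? pbl * 8u : pbl;
}

// Packs a list of AXI burst lengths into a 7-bit mask.
// ASSUMPTION: bit 0 = BLEN4, bit 1 = BLEN8, ..., bit 6 = BLEN256.
// Entries that are not one of 4, 8, ..., 256 (including 0) are ignored.
inline std::uint8_t BlenMask(std::initializer_list<unsigned> lengths)
{
    std::uint8_t mask = 0;
    for (unsigned len : lengths)
    {
        for (unsigned bit = 0; bit < 7; ++bit)
        {
            if (len == (4u << bit))
            {
                mask |= static_cast<std::uint8_t>(1u << bit);
            }
        }
    }
    return mask;
}
```

With the defaults above (`pbl` = 8, `pblx8` = 1), `EffectiveBurstBeats(8, true)` yields 64 beats, and under the assumed bit order `BlenMask({ 0, 0, 0, 0, 16, 8, 4 })` sets only the three lowest bits.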

## Areas for improvement:

- Run against network test suites and fix any issues.
- Memory optimizations? Current implementation uses system-managed buffers.
System-managed buffer size is tied to MTU. When jumbo frames are enabled,
this is wasteful since most packets are still 1522 bytes or less. If we
used driver-managed buffers and updated the Rx queue to handle multi-buffer
packets, we could use 1536-byte or 2048-byte buffers for the Rx queue, saving
about 2MB per device when JumboPacket = 9014.
- Configure speed, duplex via Ndi\params?
- Power control, wake-on-LAN, ARP offload?
- Multi-queue, RSS support?
- Make it more generic (test with other EQoS-based SoCs)?
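
The memory estimate in the list above can be checked with a short C++ sketch. `RxBufferSavings` is a hypothetical helper, and the buffer count of 256 used in the note below is an assumed figure chosen only for illustration; this README does not state how many Rx buffers the driver allocates.

```cpp
#include <cstddef>

// Bytes saved by shrinking each Rx buffer from jumboBytes to smallBytes.
// Returns 0 when the "small" buffer is not actually smaller.
inline std::size_t RxBufferSavings(
    std::size_t jumboBytes,
    std::size_t smallBytes,
    std::size_t bufferCount)
{
    return jumboBytes > smallBytes
        ? (jumboBytes - smallBytes) * bufferCount
        : 0;
}
```

Under those assumptions, `RxBufferSavings(9014, 2048, 256)` gives 1,783,296 bytes (about 1.7 MiB), which is in the same range as the "about 2MB" figure. The real saving depends on the actual buffer count and on any alignment padding applied to system-managed buffers.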
16 changes: 12 additions & 4 deletions drivers/net/dwc_eqos/descriptors.h
@@ -21,6 +21,14 @@ enum TxChecksumInsertion : UINT16
     TxChecksumInsertionEnabledIncludingPseudo = 3,
 };

+enum TxVlanTagControl : UINT16
+{
+    TxVlanTagControlNone = 0,
+    TxVlanTagControlRemove = 1,
+    TxVlanTagControlInsert = 2,
+    TxVlanTagControlReplace = 3,
+};
+
 struct TxDescriptorRead
 {
     // TDES0, TDES1
@@ -31,7 +39,7 @@ struct TxDescriptorRead
     // TDES2

     UINT16 Buf1Length : 14; // B1L
-    UINT16 VlanTagControl : 2; // VTIR
+    TxVlanTagControl VlanTagControl : 2; // VTIR

     UINT16 Buf2Length : 14; // B2L
     UINT16 TransmitTimestampEnable : 1; // TTSE
@@ -68,7 +76,7 @@ struct TxDescriptorReadTso
     // TDES2

     UINT16 Buf1Length : 14; // B1L (10-bit header length if FD = 1)
-    UINT16 VlanTagControl : 2; // VTIR
+    TxVlanTagControl VlanTagControl : 2; // VTIR

     UINT16 Buf2Length : 14; // B2L
     UINT16 TsoMemoryWriteDisable : 1; // TMWD
@@ -77,7 +85,7 @@ struct TxDescriptorReadTso
     // TDES3

     UINT32 TcpPayloadLength : 18; // TPL
-    UINT32 TcpSegmentationEnable : 1; // TSE = 0
+    UINT32 TcpSegmentationEnable : 1; // TSE = 1
     UINT32 TcpHeaderLength : 4; // TCP/UDP header length (must be 2 for UDP)
     UINT32 SourceAddressInsertionControl : 3; // SAIC
     UINT32 Reserved26 : 2; // CPC, ignored when TSE = 1
@@ -159,7 +167,7 @@ struct TxDescriptorContext

     UINT8 VlanTagValid : 1; // VLTV
     UINT8 InnerVlanTagValid : 1; // IVLTV
-    UINT8 InnverVlanTagControl : 2; // IVTIR
+    UINT8 InnerVlanTagControl : 2; // IVTIR
     UINT8 Reserved20 : 3;
     UINT8 DescriptorError : 1; // DE

91 changes: 72 additions & 19 deletions drivers/net/dwc_eqos/device.cpp
@@ -574,6 +574,25 @@ AdapterOffloadSetRxChecksum(
         TraceLoggingBoolean(Udp));
 }

+static EVT_NET_ADAPTER_OFFLOAD_SET_GSO AdapterOffloadSetGso;
+static void
+AdapterOffloadSetGso(
+    _In_ NETADAPTER adapter,
+    _In_ NETOFFLOAD offload)
+{
+    // PASSIVE_LEVEL, nonpaged (resume path)
+    UNREFERENCED_PARAMETER(adapter);
+    auto const LsoIPv4 = NetOffloadIsLsoIPv4Enabled(offload);
+    auto const LsoIPv6 = NetOffloadIsLsoIPv6Enabled(offload);
+    auto const UsoIPv4 = NetOffloadIsUsoIPv4Enabled(offload);
+    auto const UsoIPv6 = NetOffloadIsUsoIPv6Enabled(offload);
+    TraceEntryExit(AdapterOffloadSetGso, LEVEL_INFO,
+        TraceLoggingBoolean(LsoIPv4),
+        TraceLoggingBoolean(LsoIPv6),
+        TraceLoggingBoolean(UsoIPv4),
+        TraceLoggingBoolean(UsoIPv6));
+}
+
 static EVT_WDF_DEVICE_D0_ENTRY DeviceD0Entry;
 static NTSTATUS
 DeviceD0Entry(
@@ -644,7 +663,7 @@ DeviceD0Entry(
     //macConfig.PadOrCrcStripEnable = true; // Why doesn't this work?
     //macConfig.CrcStripEnableForType = true; // Why doesn't this work?
     macConfig.GiantPacketSizeLimitControlEnable = context->config.jumboFrame > JumboPacketMin;
-    macConfig.ChecksumOffloadEnable = context->config.txCoeSel || context->config.rxCoeSel;
+    macConfig.ChecksumOffloadEnable = true;
     Write32(&context->regs->Mac_Configuration, macConfig);

     MacExtConfiguration_t macExtConfig = {};
@@ -960,13 +979,35 @@ DevicePrepareHardware(
             status = STATUS_DEVICE_CONFIGURATION_ERROR;
             goto Done;
         }
+
+        if (!context->feature0.TxChecksumOffload)
+        {
+            // Could adapt at runtime if needed, but assume it's present for now.
+            TraceWrite("DevicePrepareHardware-TxChecksumOffload-required", LEVEL_ERROR);
+            status = STATUS_DEVICE_CONFIGURATION_ERROR;
+            goto Done;
+        }
+
+        if (!context->feature0.RxChecksumOffload)
+        {
+            // Could adapt at runtime if needed, but assume it's present for now.
+            TraceWrite("DevicePrepareHardware-RxChecksumOffload-required", LEVEL_ERROR);
+            status = STATUS_DEVICE_CONFIGURATION_ERROR;
+            goto Done;
+        }
+
+        if (!context->feature1.TsoEn)
+        {
+            // Could adapt at runtime if needed, but assume it's present for now.
+            TraceWrite("DevicePrepareHardware-TsoEn-required", LEVEL_ERROR);
+            status = STATUS_DEVICE_CONFIGURATION_ERROR;
+            goto Done;
+        }
     }

     // Device Config

     {
-        context->config.txCoeSel = context->feature0.TxChecksumOffload;
-        context->config.rxCoeSel = context->feature0.RxChecksumOffload;
         context->config.pblX8 = true;
         context->config.pbl = 8;
         context->config.txPbl = context->config.pbl;
@@ -1222,7 +1263,7 @@ DevicePrepareHardware(

         NET_ADAPTER_TX_CAPABILITIES txCaps;
         NET_ADAPTER_TX_CAPABILITIES_INIT_FOR_DMA(&txCaps, &dmaCaps, QueuesSupported);
-        txCaps.MaximumNumberOfFragments = QueueDescriptorMinCount - 2; // = 1 hole in the ring + 1 context descriptor.
+        txCaps.MaximumNumberOfFragments = TxMaximumNumberOfFragments;

         // TODO: Driver-managed buffering + multi-descriptor receive would
         // reduce memory overhead of Jumbo Packets.
@@ -1246,21 +1287,18 @@ DevicePrepareHardware(
             NetPacketFilterFlagPromiscuous;
         NetAdapterSetReceiveFilterCapabilities(context->adapter, &rxFilterCaps);

-        if (context->config.txCoeSel)
-        {
-            NET_ADAPTER_OFFLOAD_TX_CHECKSUM_CAPABILITIES txChecksumCaps;
-            NET_ADAPTER_OFFLOAD_TX_CHECKSUM_CAPABILITIES_INIT(&txChecksumCaps, {}, AdapterOffloadSetTxChecksum);
-            txChecksumCaps.Layer3Flags =
-                NetAdapterOffloadLayer3FlagIPv4NoOptions |
-                NetAdapterOffloadLayer3FlagIPv4WithOptions |
-                NetAdapterOffloadLayer3FlagIPv6NoExtensions |
-                NetAdapterOffloadLayer3FlagIPv6WithExtensions;
-            txChecksumCaps.Layer4Flags =
-                NetAdapterOffloadLayer4FlagTcpNoOptions |
-                NetAdapterOffloadLayer4FlagTcpWithOptions |
-                NetAdapterOffloadLayer4FlagUdp;
-            NetAdapterOffloadSetTxChecksumCapabilities(context->adapter, &txChecksumCaps);
-        }
+        NET_ADAPTER_OFFLOAD_TX_CHECKSUM_CAPABILITIES txChecksumCaps;
+        NET_ADAPTER_OFFLOAD_TX_CHECKSUM_CAPABILITIES_INIT(&txChecksumCaps,
+            NetAdapterOffloadLayer3FlagIPv4NoOptions |
+            NetAdapterOffloadLayer3FlagIPv4WithOptions |
+            NetAdapterOffloadLayer3FlagIPv6NoExtensions |
+            NetAdapterOffloadLayer3FlagIPv6WithExtensions,
+            AdapterOffloadSetTxChecksum);
+        txChecksumCaps.Layer4Flags =
+            NetAdapterOffloadLayer4FlagTcpNoOptions |
+            NetAdapterOffloadLayer4FlagTcpWithOptions |
+            NetAdapterOffloadLayer4FlagUdp;
+        NetAdapterOffloadSetTxChecksumCapabilities(context->adapter, &txChecksumCaps);

         NET_ADAPTER_OFFLOAD_RX_CHECKSUM_CAPABILITIES rxChecksumCaps;
         NET_ADAPTER_OFFLOAD_RX_CHECKSUM_CAPABILITIES_INIT(&rxChecksumCaps,
@@ -1272,6 +1310,21 @@ DevicePrepareHardware(
             NetAdapterOffloadIeee8021PriorityTaggingFlag |
             NetAdapterOffloadIeee8021VlanTaggingFlag);
         NetAdapterOffloadSetIeee8021qTagCapabilities(context->adapter, &ieee8021qCaps);
+
+        NET_ADAPTER_OFFLOAD_GSO_CAPABILITIES gsoCaps;
+        NET_ADAPTER_OFFLOAD_GSO_CAPABILITIES_INIT(&gsoCaps,
+            NetAdapterOffloadLayer3FlagIPv4NoOptions |
+            NetAdapterOffloadLayer3FlagIPv4WithOptions |
+            NetAdapterOffloadLayer3FlagIPv6NoExtensions |
+            NetAdapterOffloadLayer3FlagIPv6WithExtensions,
+            NetAdapterOffloadLayer4FlagUdp |
+            NetAdapterOffloadLayer4FlagTcpNoOptions |
+            NetAdapterOffloadLayer4FlagTcpWithOptions,
+            TxMaximumOffloadSize,
+            2, // MinimumSegmentCount
+            AdapterOffloadSetGso);
+        gsoCaps.Layer4HeaderOffsetLimit = TxLayer4HeaderOffsetLimit;
+        NetAdapterOffloadSetGsoCapabilities(context->adapter, &gsoCaps);
     }

     // Initialize adapter.
4 changes: 1 addition & 3 deletions drivers/net/dwc_eqos/device.h
@@ -8,8 +8,6 @@ struct DeviceContext; // TODO: if we do multi-queue, make a DeviceQueueContext s
 // Information about the device provided to the queues.
 struct DeviceConfig
 {
-    bool txCoeSel; // MAC_HW_Feature0\TXCOESEL (hardware support for tx checksum offload).
-    bool rxCoeSel; // MAC_HW_Feature0\RXCOESEL (hardware support for rx checksum offload).
     bool pblX8; // _DSD\snps,pblx8 (default = 1).
     UINT8 pbl; // _DSD\snps,pbl (default = 8; effect depends on pblX8).
     UINT8 txPbl; // _DSD\snps,txpbl (default = pbl; effect depends on pblX8).
@@ -21,7 +19,7 @@ struct DeviceConfig
     UINT8 blen : 7; // AXIC\snps,blen bitmask of 7 booleans 4..256 (default = 4, 8, 16).
     bool txFlowControl; // Adapter configuration (Ndi\params\*FlowControl).
     bool rxFlowControl; // Adapter configuration (Ndi\params\*FlowControl).
-    UINT16 jumboFrame; // Adapter configuration (Ndi\params\*JumboFrame). 1514..4088
+    UINT16 jumboFrame; // Adapter configuration (Ndi\params\*JumboFrame). 1514..9014.

     UINT16 RxBufferSize() const
     {
9 changes: 0 additions & 9 deletions drivers/net/dwc_eqos/driver.cpp
@@ -2,15 +2,6 @@
 #include "device.h"
 #include "trace.h"

-/*
-Possible areas for improvement:
-- Tx segmentation offload.
-- Run against network test suites and fix any issues.
-- Power control, wake-on-LAN, ARP offload.
-- Configure speed, duplex in Ndi\params.
-- Multi-queue support?
-*/
-
 TRACELOGGING_DEFINE_PROVIDER(
     TraceProvider,
     "dwc_eqos",