When writing message length is more than 1024B(mtu), it failed in softroce mode - rdma

When I am writing message length is more than 1024B(mtu), it failed in softroce mode, pls help check why.
Using the standard tool ib_write_lat to test:
when ib_write_lat -s 1024 -n 5
When ib_write_lat -s 1025 -n 5, it fails.
My softroce version in in Red Hat Enterprise Linux Server release 7.4 (Maipo)
Is it a bug in softroce?

No it isn't a bug. I had similar problems.
What did you configure at your interface configuration?
I expect that you have a MTU of 1500 Bytes configured (or leaved the default value), this will result in RoCE using 1024. If you configure your interface MTU to 4200 you can use the ib_write_lat command with up to 4096 bytes.
InfiniBand protocol Maximum Transmission Unit (MTU) defines several fix size MTU: 256, 512, 1024, 2048 or 4096 bytes.
RoCE based application that uses RDMA that runs over Ethernet should take into account that the RoCE MTU is smaller than the Ethernet MTU. (normally 1500 is the default).
https://community.mellanox.com/docs/DOC-1447

Related

Successfully initialized wpa_supplicant but does’t work

I'm going mad with a stupid issue I can't solve.
During the testing of my Yocto project I always used connmactl in order to connect my board to the internet.
Now I am going to release the product but before releasing I am working on an “internet connection manager”
I guess I can’t use connmanctl anymore since it consist in an interactive command (isn’t it?) so I’m going to use directly wpa_supplicant.
In my script I edit wpa_supplicant.conf as follow:
root#localhost:~# cat /etc/wpa_supplicant/wpa_supplicant.conf
ctrl_interface=/var/run/wpa_supplicant
ctrl_interface_group=0
update_config=1
bgscan=""
network={
ssid="Obi_Lan_Kenobi"
psk="TheForceIsStrongWithThisOne"
}
After that I try to start wpa_supplicant with this command:
wpa_supplicant -B -i mlan0 -c /etc/wpa_supplicant/wpa_supplicant.conf wext
as result of this command I get:
Successfully initialized wpa_supplicant
But if I try to ping google.com (or any other website) i seet that the network doesn’t work, In particular I get this message: ping: sendto: Network is unreachable
Everything is working under connmanctl, but not under wpa_supplicant.
The strange thing is that running iw command everything seems to be configurated in the right way:
root#localhost:~# iw dev mlan0 link
Connected to 56:0c:ff:37:1a:69 (on mlan0)
SSID: Obi_Lan_Kenobi
freq: 2412
RX: 32154 bytes (310 packets)
TX: 19436 bytes (128 packets)
signal: -38 dBm
rx bitrate: 1.0 MBit/s
tx bitrate: 72.2 MBit/s MCS 7 short GI
bss flags: short-preamble short-slot-time
dtim period: 2
beacon int: 100
I honestly can’t understand why.
Does anybody have a suggestion about that?

how can I make large number of connections without error at client side

I have written a program in golang to make request about 2000qps to different remote ip with local port randomly selected by linux, and close request immediately after connection established, but still encounter bind: address already in use error periodically
what I have done:
net.ipv4.ip_local_port_range is 15000-65535
net.ipv4.tcp_tw_recycle=1 net.ipv4.tcp_tw_reuse=1 net.ipv4.tcp_fin_timeout=30
above is sockstat:
sockets: used 1200 TCP: inuse 2302 orphan 1603 tw 40940 alloc 2325 mem 201
I don't figure it out why this error still there with kernel selecting available local port,will kernel return a port in use ?
This is a good answer from 2012:
https://serverfault.com/questions/342741/what-are-the-ramifications-of-setting-tcp-tw-recycle-reuse-to-1#434669
As of 2018, tcp_tw_recycle exists only in the sysctl binary, is otherwise gone from the kernel:
https://github.com/torvalds/linux/search?utf8=%E2%9C%93&q=tcp_tw_recycle&type=
tcp_tw_reuse is still in use as described in the above answer:
https://github.com/torvalds/linux/blob/master/net/ipv4/tcp_ipv4.c#L128
However, while a TCP_TIMEWAIT_LEN is in use:
https://github.com/torvalds/linux/search?utf8=%E2%9C%93&q=TCP_TIMEWAIT_LEN&type=
the value is hardcoded:
https://github.com/torvalds/linux/blob/master/include/net/tcp.h#L120
and tcp_fin_timeout refers to a different state:
https://github.com/torvalds/linux/blob/master/Documentation/networking/ip-sysctl.txt#L294
One can relatively safely change the local port range to 1025-65535.
For kicks, if there were a situation where this client was talking to servers and network under my control, I would build a new kernel with a not-to-spec TCP_TIMEWAIT_LEN, and perhaps also fiddle with tcp_max_tw_buckets:
https://github.com/torvalds/linux/blob/master/Documentation/networking/ip-sysctl.txt#L379
But doing so in other circumstances- if this client is behind a NAT and talking to common public servers- will likely be disruptive.

Can't use CAIDA Dataset with tcpreplay to simulate network traffic in network

I have a server and I need to simulate real network traffic. I've been asked to do this using a CAIDA Dataset. I have downloaded the public Dataset which can be found here:
CAIDA Public Dataset
I also need to rewrite the source ip address in the .pcap file to be the one of the server. I tried doing it the way it's described at the end of this page: tcprewrite wiki
I run:
tcprewrite --infile=oc48-mfn.dirA.20020814-160000.UTC.anon.pcap --outfile=oc48-mfn.dirA.20020814-160000.UTC.anon_rewrite.pcap --dstipmap=0.0.0.0/0:10.101.30.60 --enet-dmac=00:0c:29:00:b1:bd
And I get:
Warning: oc48-mfn.dirA.20020814-160000.UTC.anon.pcap was captured using a snaplen of 48 bytes. This may mean you have truncated packets.
Fatal Error: From ./plugins/dlt_hdlc/hdlc.c:dlt_hdlc_encode() line 255:
Non-HDLC packet requires --hdlc-address
So after some tries like this I finally run these to get an error free tcprewrite:
tcpprep --auto=bridge --pcap=oc48-mfn.dirA.20020814-160000.UTC.anon.pcap --cachefile=cache1.cache
Which gives:
Warning: oc48-mfn.dirA.20020814-160000.UTC.anon.pcap was captured using a snaplen of 48 bytes. This may mean you have truncated packets.
Warning: oc48-mfn.dirA.20020814-160000.UTC.anon.pcap was captured using a snaplen of 48 bytes. This may mean you have truncated packets.
And then I run:
tcprewrite --infile=oc48-mfn.dirA.20020814-160000.UTC.anon.pcap --outfile=oc48-mfn.dirA.20020814-160000.UTC.anon_rewrite.pcap --dstipmap=0.0.0.0/0:10.101.30.60 --enet-dmac=00:0c:29:00:b1:bd --cachefile=cache1.cache --hdlc-control=0 --hdlc-address=0xBF
And I get:
Warning: oc48-mfn.dirA.20020814-160000.UTC.anon.pcap was captured using a snaplen of 48 bytes. This may mean you have truncated packets.
So it seems like a success, except the warning that shows up in every command.
I open the new .pcap file with tcpdump to check that the destination IP addresses have changed to the one of the server and it has been done.
So then I run tcpreplay:
tcpreplay -i ens160 --loop 5 --unique-ip oc48-mfn.dirA.20020814-160000.UTC.anon.pcap
And I run tcpdump on the server to see the traffic from the .pcap file, but the traffic looks like this:
13:30:50.194780 05:8c:55:6f:40:00 (oui Unknown) > 0f:00:08:00:45:00 (oui
Unknown), ethertype Unknown (0x3406), length 60:
0x0000: ed11 f484 7785 f477 0d79 0050 0487 007c ....w..w.y.P...|
0x0010: e7d5 d203 c32b 5010 27f7 aa51 0000 4854 .....+P.'..Q..HT
0x0020: 5450 0000 0000 0000 0000 0000 0000 TP............
I have tried the smallFlow.pcap from the sample captures of tcpreplay: Sample Captures
and it worked just fine.
So any suggestions on how to properly use the CAIDA .pcap files?
Your stated goal is "I need to simulate real network traffic", but you're using pcaps where "the payload has been removed from all packets" (per the CAIDA web page you linked to).
These two statements are in conflict with each other. All your packets are literally no larger then 48bytes which is merely enough for the TCP/IP header (and then even so, may not be sufficient in all cases). This is what the warning is telling you. You can't put the data back.
You'll need to find a new source of pcaps.

Accessing real frame buffer of PCI card

I am trying to access the framebuffer on my systems VGA controller card.
lscpi -vn gives:
00:02.0 0300: 8086:2a02 (rev 0c) (prog-if 00 [VGA controller])
Subsystem: 1028:022f
Flags: bus master, fast devsel, latency 0, IRQ 45
Memory at fea00000 (64-bit, non-prefetchable) [size=1M]
Memory at e0000000 (64-bit, prefetchable) [size=256M]
I/O ports at eff8 [size=8]
Expansion ROM at <unassigned> [disabled]
Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
Capabilities: [d0] Power Management version 3
Kernel driver in use: i915
Now, I access the device and I get:
fb_base = pci_resource_start( devp, 0 ); **output: FEA00000**
fb_size = pci_resource_len( devp, 0 ); **output: 1MB**
So the range of framebuffer is FEA00000 - FEB00000
But from the lspci -vn output This region is non prefetchable.
Does that mean I am not pointing to the frame buffer at all.
Is my framebuffer at address E0000000:
The driver currently using the resource is the Intel i915
So maybe when I request region or IRQ it can clash if not shared by that driver.
If I remove the i915 rmmod it to insmod my driver, will my screen go blank.
Please help.
Thanks.

Ping 100% packet loss under Openwrt,driver relevant issue

Is there anyone who's familiar with Atheros solution and OpenWrt system ?
My testbed runs well under Atheros-SDK image while it was found that my ethernet interface(eth0) arose "ping 100% packet loss" when running on OpenWrt image. I even continue to use the registers' setting value,e.g.,ETH_CONF,XMII_CONF,but it doesn't work yet.
Any suggestions will be appreciated.Thanks!!
my ethernet setting at arch/mips/ath79/mach-db120.c is:
ath79_register_mdio(0, ~(BIT(5)));
ath79_eth0_data.phy_if_mode = PHY_INTERFACE_MODE_RGMII;
ath79_eth0_data.phy_mask = BIT(5);default is BIT(0)
ath79_eth0_data.mii_bus_dev = &ath79_mdio0_device.dev;
ath79_eth0_pll_data.pll_1000 = 0x06000000;
ath79_eth0_data.duplex = DUPLEX_FULL;
ath79_register_eth(0);
If I modified the th79_eth0_pll_data.pll_1000 to 0x46000000
(set the 1805002c GIGE_QUAD bit),then it can ping but still has 3%-5% or even more ping loss.strange! I realy want to know is there any issue with ag71xx relevant code ?
sectional bootlog is:
Starting kernel ...
......
......
[ 0.650000] libphy: ag71xx_mdio: probed
[ 0.650000] eth0: Atheros AG71xx at 0xb9000000, irq 4, mode:RGMII
[ 1.470000] ag71xx ag71xx.0 eth0: connected to PHY at ag71xx-mdio.0:05 [uid=004dd072, driver=Generic PHY]
root#OpenWrt:/# ping 192.168.1.99
PING 192.168.1.99 (192.168.1.99): 56 data bytes
C
--- 192.168.1.99 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss
root#OpenWrt:/#
You need to specify which OpenWRT version you are using to get some answer !