How to pass user space data to dmaengine client usage call? - linux-device-driver

[EDITED]
I have an arm64 board with an FPGA (SoC).
The task is simple:
transfer data between a user-space application and device memory (FPGA registers, i.e. physical memory accessed from kernel space), both with and without DMA (streaming type). The DMA controller is on the board (ZynqMP / GDMA).
I will have several devices, both on the FPGA and outside it, that should use this communication, but for now I am working only with the FPGA DDR4 memory area.
This is the logic flow I see:
do some initialization (DMA parameters and so on);
ioremap() the FPGA device area;
allocate a buffer (with kzalloc() or similar), which I should expose to user space through the mmap fop;
build a scatterlist from the buffer (pseudo-code below);
use the scatterlist with dmaengine to transfer the data.
// scatterlist init pseudo-code
struct scatterlist sgl[2];
struct scatterlist *sge;
int i, nents;
__u8 *buffer; // allocated earlier, at least ARRAY_SIZE(sgl) pages long

sg_init_table(sgl, ARRAY_SIZE(sgl));
for_each_sg(sgl, sge, ARRAY_SIZE(sgl), i) {
    // dma_map_sg() below maps every entry, so the pages are not
    // mapped a second time with dma_map_page() here
    sg_set_page(sge, virt_to_page(buffer + i * PAGE_SIZE), PAGE_SIZE, 0);
}
// dma_map_sg() returns the number of mapped entries, 0 on failure
nents = dma_map_sg(&pdev->dev, sgl, ARRAY_SIZE(sgl), direction /* DMA_TO_DEVICE */);
if (!nents)
    dev_err(&pdev->dev, "dma scatterlist mapping failed!\n");
What I don't understand is how, or where, the destination is controlled. I allocated the buffer in RAM, built a scatterlist from it, and passed that list to the dmaengine functions for the transfer. But I never set or use the ioremapped device memory area as the place to store the buffer data! Does this DMA work only with a suitable RAM area, so that I have to copy the buffer to the device area myself? Or should I use the ioremapped area as my buffer?
Is this the right flow? Could you please explain my mistakes?

Related

NetworkingDriverKit - How can I access packet data?

I've been creating a virtual ethernet interface. I've opened asynchronous communication with a controlling application and every time there are new packets, the controlling app is notified and then asks for the packet data. The packet data is stored in a simple struct, with uint8_t[1600] for the bytes, and uint32_t for the length. The dext is able to populate this struct with dummy data every time a packet is available, with the dummy data visible on the controlling application. However, I'm struggling to fill it with the real packet data.
The IOUserNetworkPacket provides metadata about a packet. It contains a packet's timestamp, size, etc., but it doesn't seem to contain the packet's data. There are the GetDataOffset() and GetMemorySegmentOffset() methods, which seem to return byte offsets for where the packet data is located in their memory buffer. My instinct tells me to add this offset to the pointer of wherever the packet data is stored. The problem is I have no idea where the packets are actually stored.
I know they are managed by the IOUserNetworkPacketBufferPool, but I don't think that's where their memory is. There is the CopyMemoryDescriptor() method which gives an IOMemoryDescriptor of its contents. I tried using the descriptor to create an IOMemoryMap, using it to call GetAddress(). The pointers to all the mentioned objects lead to junk data.
I must be approaching this entirely wrong. If anyone knows how to access the packet data, or has any ideas, I would appreciate any help. Thanks.
Code snippet within IOUserClient::ExternalMethod:
case GetPacket:
{
    IOUserNetworkPacket *packet =
        ivars->m_provider->getPacket();
    GetPacket_Output output;
    output.packet_size = packet->getDataLength();
    IOUserNetworkPacketBufferPool *pool;
    packet->GetPacketBufferPool(&pool);
    IOMemoryDescriptor *memory = nullptr;
    pool->CopyMemoryDescriptor(&memory);
    IOMemoryMap *map = nullptr;
    memory->CreateMapping(0, 0, 0, 0, 0, &map);
    uint64_t address = map->GetAddress()
        + packet->getMemorySegmentOffset();
    memcpy(output.packet_data,
        (void*)address, packet->getDataLength());
    in_arguments->structureOutput = OSData::withBytes(
        &output, sizeof(GetPacket_Output));
    // free stuff
} break;
The problem was caused by an IOUserNetworkPacketBufferPool bug. My bufferSize was set to 1600, but this value was ignored and replaced with 2048. The IOUserNetworkPackets acted as though the bufferSize was 1600, and so they gave an invalid offset.
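To see why a mismatched buffer size produces junk data, here is a minimal sketch in plain C. It assumes the pool lays its buffers out back to back at a fixed stride, which the answer implies but does not state; `buffer_offset` and `offset_error` are hypothetical helpers, not DriverKit calls.

```c
#include <stdint.h>

/* Hypothetical model: buffer number `index` starts at index * stride
 * bytes from the beginning of the pool's memory mapping. */
static uint64_t buffer_offset(uint32_t index, uint32_t stride)
{
    return (uint64_t)index * stride;
}

/* Distance in bytes between where the data really is (actual stride)
 * and where an offset computed with the assumed stride points. */
static uint64_t offset_error(uint32_t index, uint32_t assumed, uint32_t actual)
{
    return buffer_offset(index, actual) - buffer_offset(index, assumed);
}
```

Under that assumption only buffer 0 lines up; for buffer 3 the offset computed with a 1600-byte stride is already 1344 bytes short of the real 2048-byte layout, and the gap grows with the index.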
Creating the buffer pool and mapping it:
kern_return_t
IMPL(FooDriver, Start)
{
    // ...
    IOUserNetworkPacketBufferPool::Create(this, "FooBuffer",
        32, 32, 2048, &ivars->packet_buffer);
    ivars->packet_buffer->CopyMemoryDescriptor(&ivars->packet_buffer_md);
    ivars->packet_buffer_md->Map(0, 0, 0, IOVMPageSize,
        &ivars->packet_buffer_addr, &ivars->packet_buffer_length);
    // ...
}
Getting the packet data:
void FooDriver::getPacketData(
    IOUserNetworkPacket *packet,
    uint8_t *packet_data,
    uint32_t *packet_size
) {
    uint8_t packet_head;
    uint64_t packet_offset;
    packet->GetHeadroom(&packet_head);
    packet->GetMemorySegmentOffset(&packet_offset);
    uint8_t *buffer = (uint8_t*)(ivars->packet_buffer_addr
        + packet_offset + packet_head);
    *packet_size = packet->getDataLength();
    memcpy(packet_data, buffer, *packet_size);
}

How to read address of reserved memory in device tree

I'm writing a device driver in Linux for a small device. The device has some particular memory constraints that force me to carve out a piece of memory, and in my driver I need to know the address (and the size) of the reserved memory.
/ {
    reserved-memory {
        my_reserve: my_reserve@a0000000 {
            compatible = "shared-dma-pool";
            reg = <0 0xa0000000 0 0x20000>;
            no-map;
        };
    };
    my_device {
        compatible = "my_device";
        memory-region = <&my_reserve>;
    };
};
How do I read the physical address of the reserved memory (i.e. how do I read the value 0xa000'0000) in my device driver? There seems to be an API for reserved memory, but nothing that I can see that returns a struct reserved_mem *.
If struct device *hwdev points to your hardware struct device (for example, if hwdev points to the dev member of a struct platform_device), then this snippet illustrates how to access the device tree node of the reserved memory region and convert that to a struct resource.
struct device_node *memnp;
struct resource mem_res;
int rc;

/* Get pointer to memory region device node from "memory-region" phandle. */
memnp = of_parse_phandle(hwdev->of_node, "memory-region", 0);
if (!memnp) {
    dev_err(hwdev, "no memory-region node\n");
    rc = -ENXIO;
    goto err1;
}
/* Convert memory region to a struct resource */
rc = of_address_to_resource(memnp, 0, &mem_res);
/* finished with memnp */
of_node_put(memnp);
if (rc) {
    dev_err(hwdev, "failed to translate memory-region to a resource\n");
    goto err1;
}
The start address ends up in mem_res.start and the length is given by resource_size(&mem_res).
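As an illustration of what that translation yields for the node in the question, the sketch below decodes the four `reg` cells by hand in plain C. It assumes `#address-cells = <2>` and `#size-cells = <2>`, which the four-cell `reg` suggests but the snippet does not show; `struct region`, `cells_to_u64`, `decode_reg` and `region_end` are hypothetical names, not kernel APIs.

```c
#include <stdint.h>

/* Plain-C stand-in for the start/size pair a struct resource describes. */
struct region {
    uint64_t start;
    uint64_t size;
};

/* Combine two 32-bit device tree cells (high, low) into one 64-bit value. */
static uint64_t cells_to_u64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Decode reg = <addr_hi addr_lo size_hi size_lo>.
 * Assumes two address cells and two size cells. */
static struct region decode_reg(const uint32_t cells[4])
{
    struct region r;

    r.start = cells_to_u64(cells[0], cells[1]);
    r.size = cells_to_u64(cells[2], cells[3]);
    return r;
}

/* Inclusive end address, i.e. start + size - 1. */
static uint64_t region_end(struct region r)
{
    return r.start + r.size - 1;
}
```

For `reg = <0 0xa0000000 0 0x20000>` this gives a start of 0xa0000000, a size of 0x20000 and an inclusive end of 0xa001ffff, which is what one would expect to find in mem_res.start, resource_size(&mem_res) and mem_res.end.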

Detecting CAN bus errors under socketCAN linux driver

Our products are using a well known CANopen stack, which uses socketCAN, on an embedded Beaglebone Black based system running under Ubuntu 14.04 LTS. But for some reason, even though the stack we're using will detect when the CAN bus goes into a PASSIVE state or even a BUS OFF state, it never indicates when the CAN bus recovers from errors and goes out of a PASSIVE or warning state, and enters a non error state.
If I were to query the socketCAN driver directly (via ioctl calls), would I be able to detect when the CAN bus goes in and out of a warning state (which is less than 127 errors), in and out of a PASSIVE state (greater than 127 errors) or goes BUS OFF (greater than 255 errors)?
I'd like to know if I'd be wasting my time doing this or is there a better way to detect, accurately and in real-time, all conditions of a CAN bus?
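For reference, the state boundaries mentioned above can be modeled in a few lines of C. This is a sketch assuming the usual CAN fault-confinement limits (warning from 96, error passive above 127, bus off when the transmit counter exceeds 255); some controllers allow the warning limit to be configured, and `classify_bus_state` is a hypothetical helper rather than a SocketCAN function.

```c
enum can_bus_state {
    BUS_ERROR_ACTIVE,   /* normal operation */
    BUS_ERROR_WARNING,  /* a counter has reached the warning limit */
    BUS_ERROR_PASSIVE,  /* a counter is above 127 */
    BUS_OFF             /* transmit error counter is above 255 */
};

/* tec: transmit error counter, rec: receive error counter. */
static enum can_bus_state classify_bus_state(unsigned int tec, unsigned int rec)
{
    unsigned int worst = tec > rec ? tec : rec;

    if (tec > 255)
        return BUS_OFF;            /* only the transmit counter causes bus off */
    if (worst > 127)
        return BUS_ERROR_PASSIVE;
    if (worst >= 96)
        return BUS_ERROR_WARNING;  /* assumed default warning limit */
    return BUS_ERROR_ACTIVE;
}
```

A node therefore recovers from PASSIVE once both counters drop back to 127 or below, and leaves the warning state once both are below 96.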
I have only a partial solution to that problem.
As you are using socketCAN, the interface is seen as a standard network interface, on which we can query the status.
Based on How to check Ethernet in Linux? (replace "eth0" by "can0"), you can check the link status.
This is not real-time, but can be executed in a periodic thread to check the bus state.
So while this is an old question, I just happened to stumble upon it (while searching for something only mildly related).
SocketCAN provides all the means for detecting error frames out of the box.
Assuming your code looks similar to this:
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <linux/can.h>

int readFromCan(int socketFd, unsigned char* data, unsigned int* rxId) {
    int32_t bytesRead = -1;
    struct can_frame canFrame = {0};
    bytesRead = (int32_t)read(socketFd, &canFrame, sizeof(canFrame));
    if (bytesRead >= 0) {
        bytesRead = canFrame.can_dlc;
        if (data) {
            memcpy(data, canFrame.data, bytesRead);
        }
        if (rxId) {
            *rxId = canFrame.can_id; // This will come in handy
        }
    }
    return bytesRead;
}
void doStuffWithMessage() {
    int32_t mySocketFd = fooGetSocketFd();
    unsigned int receiveId = 0;
    unsigned char myData[8] = {0};
    int32_t dataLength = 0;
    if ((dataLength = readFromCan(mySocketFd, myData, &receiveId)) == -1) {
        // Handle error
        return;
    }
    // CAN_ERR_FLAG marks an error frame; CAN_ERR_MASK only selects the error class bits
    if (receiveId & CAN_ERR_FLAG) {
        // Handle error frame
        return;
    }
    // Do stuff with your data
}
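Two details are worth sketching here. A raw CAN socket only delivers error frames after they have been requested with the CAN_RAW_ERR_FILTER socket option, and the error class bits in can_id (plus data[1] for controller problems) say what happened. The following is a minimal sketch using constants from the Linux CAN headers; `enable_error_frames`, `classify_frame` and `enum can_event` are hypothetical names, and which events a given driver actually reports is hardware-dependent.

```c
#include <linux/can.h>
#include <linux/can/error.h>
#include <linux/can/raw.h>
#include <sys/socket.h>

enum can_event {
    EV_DATA, EV_BUS_OFF, EV_RESTARTED, EV_PASSIVE, EV_WARNING, EV_OTHER_ERROR
};

/* Ask the raw socket to pass every error class up as an error frame. */
static int enable_error_frames(int fd)
{
    can_err_mask_t mask = CAN_ERR_MASK;

    return setsockopt(fd, SOL_CAN_RAW, CAN_RAW_ERR_FILTER, &mask, sizeof(mask));
}

/* Map a received frame to a coarse event based on its error class bits. */
static enum can_event classify_frame(const struct can_frame *f)
{
    if (!(f->can_id & CAN_ERR_FLAG))
        return EV_DATA;
    if (f->can_id & CAN_ERR_BUSOFF)
        return EV_BUS_OFF;
    if (f->can_id & CAN_ERR_RESTARTED)
        return EV_RESTARTED;
    if (f->can_id & CAN_ERR_CRTL) {
        /* data[1] carries the controller status bits */
        if (f->data[1] & (CAN_ERR_CRTL_RX_PASSIVE | CAN_ERR_CRTL_TX_PASSIVE))
            return EV_PASSIVE;
        if (f->data[1] & (CAN_ERR_CRTL_RX_WARNING | CAN_ERR_CRTL_TX_WARNING))
            return EV_WARNING;
    }
    return EV_OTHER_ERROR;
}
```

In the read loop above, one would call `enable_error_frames()` once after opening the socket and then pass each received frame to `classify_frame()` before treating it as data.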

Writing PCI driver for DMA transfer on Qemu

I am writing a PCI device in QEMU and a driver (LKM) for it in the guest OS. While QEMU provides an example PCI device, edu (edu.txt and edu.c), with its distribution, I am having trouble writing the kernel module to do a DMA transfer. A basic driver has been covered here, but it does not support DMA.
I am following the implementation of the link and this. I tried to transmit a buffer to the PCI device from the IRQ handler. The device can read the data (pci_dma_read), but I am not getting the data that I am supposed to receive. Here is the code segment that does the DMA transfer:
static int write_to_HyPerf(void *dev, void* addr, uint32_t size)
{
    /* ----------------------------------------------------------------------------
     * PCI address 'addr':
     * addr -> DMA source address
     * 0x40000 -> DMA destination address
     * 100 -> DMA transfer count
     * 1 -> DMA command register
     * while (DMA command register & 1)
     *--------------------------------------------------------------------------------
     */
    /* iowrite32() takes a value, not a pointer; only the low 32 bits
     * of the DMA handle are written here */
    iowrite32((u32)dma_handle_to_device, mmio + IO_DMA_SRC);
    iowrite32(DMA_START, mmio + IO_DMA_DST);
    iowrite32(size, mmio + IO_DMA_XCNT);
    iowrite32(DMA_CMD | DMA_IRQ, mmio + IO_DMA_CMD);
    return 0;
}
I have also set up a coherent mapping using dma_alloc_coherent.
vaddr_to_device = dma_alloc_coherent(&(dev->dev), DMA_SIZE, &dma_handle_to_device, GFP_ATOMIC);
The complete code is available here. What am I doing wrong?
It could be that the problem is in your driver.
In that case you can use this one:
https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/kernel_modules/qemu_edu.c
and you can use the dd command, as in this script, to write to and read from your device:
https://github.com/cirosantilli/linux-kernel-module-cheat/blob/master/rootfs_overlay/lkmc/qemu_edu.sh
Then all you need to do is write the desired DMA values to the correct register addresses,
as handled in the edu.c code:
case 0x80:
    dma_rw(edu, false, &val, &edu->dma.src, false);
    break;
case 0x88:
    dma_rw(edu, false, &val, &edu->dma.dst, false);
    break;
case 0x90:
    dma_rw(edu, false, &val, &edu->dma.cnt, false);
    break;
case 0x98:
    dma_rw(edu, false, &val, &edu->dma.cmd, false);
    break;
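To make the four register values concrete, here is a minimal sketch in plain C of what a driver would write to 0x80, 0x88, 0x90 and 0x98 for a RAM-to-device transfer. It assumes the layout described in QEMU's edu.txt (a 4096-byte DMA buffer at device address 0x40000, command bit 0x1 to start, 0x2 for the device-to-RAM direction, 0x4 to raise an interrupt when done); `struct edu_dma_regs`, `edu_dma_to_device` and the `EDU_*` macros are hypothetical names, not part of edu.c.

```c
#include <stdbool.h>
#include <stdint.h>

#define EDU_DMA_BUF_START 0x40000u /* device-side DMA buffer (assumed from edu.txt) */
#define EDU_DMA_BUF_SIZE  4096u
#define EDU_DMA_RUN       0x1u     /* start the transfer */
#define EDU_DMA_FROM_EDU  0x2u     /* direction: 0 = RAM to EDU, 1 = EDU to RAM */
#define EDU_DMA_IRQ       0x4u     /* raise an interrupt when finished */

/* Values destined for registers 0x80 (src), 0x88 (dst), 0x90 (cnt), 0x98 (cmd). */
struct edu_dma_regs {
    uint64_t src, dst, cnt, cmd;
};

/* Fill in the register values for copying len bytes from guest RAM at
 * ram_addr into the device buffer at offset buf_off. Returns false if
 * the transfer would run past the end of the device buffer. */
static bool edu_dma_to_device(uint64_t ram_addr, uint32_t buf_off, uint32_t len,
                              bool irq, struct edu_dma_regs *out)
{
    if (buf_off > EDU_DMA_BUF_SIZE || len > EDU_DMA_BUF_SIZE - buf_off)
        return false;

    out->src = ram_addr;                     /* bus address from the DMA API */
    out->dst = EDU_DMA_BUF_START + buf_off;  /* address inside the device */
    out->cnt = len;
    out->cmd = EDU_DMA_RUN | (irq ? EDU_DMA_IRQ : 0u);
    return true;
}
```

The source must be the dma_addr_t returned by dma_alloc_coherent(), not the kernel virtual address, and the count must keep the transfer inside the device's buffer window.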

Access to internal Xilinx FPGA block RAM

I'm writing a device driver for the Xilinx Virtex-6 X8 PCI Express Gen 2 Evaluation/Development Kit SX315T FPGA. My OS is openSUSE 11.3 64-bit.
The documentation for this device (Virtex-6 FPGA Integrated Block for PCI Express User Guide, UG517 (v5.0), April 19, 2010, page 219) says:
The PIO design is a simple target-only application that interfaces with the Endpoint for PCIe core's Transaction (TRN) interface and is provided as a starting point for customers to build their own designs. The following features are included:
• Four transaction-specific 2 KB target regions using the internal Xilinx FPGA block RAMs, providing a total target space of 8192 bytes
• Supports single DWORD payload Read and Write PCI Express transactions to 32-/64-bit address memory spaces and I/O space with support for completion TLPs
• Utilizes the core's trn_rbar_hit_n[6:0] signals to differentiate between TLP destination Base Address Registers
• Provides separate implementations optimized for 32-bit, 64-bit, and 128-bit TRN interfaces
The device exposes BAR0 and BAR2, with a length of 128 bytes.
I'm trying to access the internal Xilinx FPGA block RAM, so I map BAR0 into the kernel's virtual address space.
struct pcie_dev {
    struct pci_dev *dev;
    struct cdev chr_dev;
    atomic_t dev_available;
    u32 IOBaseAddress;
    u32 IOLastAddress;
    void __iomem *bar;
    void *virt_addr;
    u32 length;
    unsigned long sirqNum;
    void *private_data;
};
struct pcie_dev cur_pcie_dev;
cur_pcie_dev.IOBaseAddress = pci_resource_start(dev, 0);
cur_pcie_dev.IOLastAddress = pci_resource_end(dev, 0);
cur_pcie_dev.length = pci_resource_len(dev, 0);
cur_pcie_dev.bar = pci_iomap(dev, 0, cur_pcie_dev.length);
IOBaseAddress is 0xfbbfe000
IOLastAddress is 0xfbbfe07f
length=128;
Using an ioctl, I try to write and read data.
case IOCTL_INFO_DEVICE:
{
    u32 *rcslave_mem = (u32 *)pCur_dev->bar;
    u32 result = 0;
    u32 value = 0;
    int i;
    for (i = 0; i < 2048; i++) {
        printk(KERN_DEBUG "Writing 0x%08x to 0x%p.\n",
               (u32)value, (void *)rcslave_mem + i);
        iowrite32(value, rcslave_mem + i);
        value++;
    }
    /* read-back loop */
    value = 0;
    for (i = 0; i < 2048; i++) {
        result = ioread32(rcslave_mem + i);
        printk(KERN_DEBUG "Wrote 0x%08x to 0x%p, but read back 0x%08x.\n",
               (u32)value, (void *)rcslave_mem + i, (u32)result);
        value++;
    }
But it turns out that only 32 values can be written and read. As I understand it, the writes land in BAR0 (4 bytes * 32 values = 128 bytes), but not in the internal Xilinx memory. I tried to go the other way.
cur_pcie_dev.IOBaseAddress = pci_resource_start(dev, 0);
cur_pcie_dev.IOLastAddress = pci_resource_end(dev, 0);
cur_pcie_dev.length = pci_resource_len(dev, 0);
flags = pci_resource_flags(dev, 0);
if (flags & IORESOURCE_MEM) {
    if (request_mem_region(cur_pcie_dev.IOBaseAddress, cur_pcie_dev.length, DEVICE_NAME) == NULL) {
        return -EBUSY;
    }
    cur_pcie_dev.virt_addr = ioremap_nocache(cur_pcie_dev.IOBaseAddress, cur_pcie_dev.length);
    if (cur_pcie_dev.virt_addr == NULL) {
        printk(KERN_ERR "ERROR: BAR%u remapping FAILED\n", 0);
        return -ENOMEM;
    }
    printk(KERN_INFO " Allocated I/O memory range %#lx-%#lx\n", cur_pcie_dev.IOBaseAddress, (cur_pcie_dev.IOBaseAddress + cur_pcie_dev.length - 1));
} else {
    printk(KERN_ERR "ERROR: Invalid PCI region flags\n");
    return -EIO;
}
Then
address = ((unsigned int)pCur_dev->virt_addr+pd.Address);
iowrite32(pd.Value,(unsigned int*) address);
address = ((unsigned int)pCur_dev->virt_addr+pd.Address);
pd.Value = ioread32((unsigned int *)address);
I add the address specified by the user to the virtual address. But the result of the read/write operations is also incorrect. Please tell me what I'm doing wrong.
P.S. Sorry for my bad English.
Why are you trying to access the internal block RAM of your board directly? I think the normal behavior of a device driver (your device here being a PCI Express interface) would suffice if you are using Programmed I/O (PIO) on your FPGA. When you write through your device driver, the data is transferred to block RAM by the IP core loaded on the FPGA side (and the reverse for reads).
Take a look at the Linux driver in the xapp1022 (Memory Endpoint Test) package from Xilinx.
P.S.: I know it's an old question and you may have found your answer long ago :)
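The numbers in the question also explain why only 32 values round-trip: a 128-byte BAR holds 32 dwords, so indices 32 to 2047 in the loops fall outside the mapped window. Below is a minimal sketch in plain C of the bounds check a driver might apply before calling ioread32() or iowrite32() on a user-supplied offset; `bar_word_count` and `bar_access_ok` are hypothetical helpers, not kernel functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Number of whole 32-bit words that fit in a BAR of bar_len bytes. */
static size_t bar_word_count(size_t bar_len)
{
    return bar_len / sizeof(uint32_t);
}

/* True if a 32-bit access at byte offset `offset` is dword-aligned and
 * lies entirely inside a BAR of bar_len bytes. */
static bool bar_access_ok(size_t bar_len, size_t offset)
{
    if (bar_len < sizeof(uint32_t))
        return false;
    if (offset % sizeof(uint32_t) != 0)
        return false;
    return offset <= bar_len - sizeof(uint32_t);
}
```

With the length reported by pci_resource_len() as the loop limit (32 words for BAR0 here), every access stays inside the mapping.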