What is the difference between pci_enable_device_mem and pci_enable_device?
In the ixgbe driver, the PF driver uses pci_enable_device_mem() while the VF driver uses pci_enable_device().

Well, both functions internally call pci_enable_device_flags(). The difference is that the pci_enable_device_mem() variant initializes only memory-mapped BARs, whereas pci_enable_device() initializes both memory-mapped and I/O BARs.
If your PCI device does not have I/O spaces (which is most likely the case), you can safely use pci_enable_device_mem().
This is the code from drivers/pci/pci.c:
int pci_enable_device_mem(struct pci_dev *dev)
{
	return pci_enable_device_flags(dev, IORESOURCE_MEM);
}

int pci_enable_device(struct pci_dev *dev)
{
	return pci_enable_device_flags(dev, IORESOURCE_MEM | IORESOURCE_IO);
}

pci_enable_device_mem — Initialize a device for use with Memory space
https://www.kernel.org/doc/htmldocs/kernel-api/API-pci-enable-device-mem.html
pci_enable_device — Initialize device before it's used by a driver.
https://www.kernel.org/doc/htmldocs/kernel-api/API-pci-enable-device.html
The first one initializes the device so it can be used with Memory space; the second one initializes a device before it is used by a driver.

Microchip dsPIC33 C30 function pointer size?

The C30 user manual states that near and far pointers are 16 bits wide.
How then do they address the full code memory space, which is 24 bits wide?
I am so confused, as I have an assembler function (called from C) that returns the program counter (taken from the stack) at which a trap error occurred. I am pretty sure it sets w1 and w0 before returning.
In C, the return value is defined as a function pointer:
void (*errLoc)(void);
and the call is:
errLoc = getErrLoc();
When I now look at errLoc, it is a 16-bit value, and I just do not think that is right. Or is it? Can function pointers (or any pointers) not access the full code address space?
All this has to do with a TRAP address error I have been trying to figure out for the past 48 hours.
I see you are trying to use the dsPIC33Fxxxx/PIC24Hxxxx fault interrupt trap example code.
The problem is that the pointer size for dsPIC33 (via the MPLAB X C30 compiler) is 16 bits, but the program counter is 24 bits. Fortunately, the getErrLoc() assembly function does return the full-width value.
However, the function signature provided in the example C source, void (*getErrLoc(void))(void), is incorrect, as it treats the return value as if it were a 16-bit pointer. You want to change the return type of the function to something large enough to hold the 24-bit program counter. If you choose unsigned long as the return type of getErrLoc(), the 24-bit program counter will fit into the 32-bit unsigned long it returns, as shown below.
unsigned long getErrLoc(void); // Get Address Error Loc
unsigned long errLoc __attribute__((persistent));
(FYI: __attribute__((persistent)) is used so the recorded trap location survives a reset and can be inspected after the next reboot.)
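To make the flow concrete, here is a rough sketch (not the original Microchip example) of how the address error trap handler could record the faulting location, assuming the C30/XC16 _AddressError trap vector name and the INTCON1 ADDRERR flag:
#include <xc.h>   /* or the device header, e.g. <p33Fxxxx.h>, under classic C30 */

extern unsigned long getErrLoc(void);              /* assembly helper returning the 24-bit PC in w1:w0 */
unsigned long errLoc __attribute__((persistent));  /* survives a device reset */

void __attribute__((interrupt, no_auto_psv)) _AddressError(void)
{
	errLoc = getErrLoc();        /* capture the program counter where the trap occurred */
	INTCON1bits.ADDRERR = 0;     /* clear the address error trap flag */
	while (1)
		;                    /* halt (or reset) so errLoc can be examined after reboot */
}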

Scodec: Using vectorOfN with a vlong field

I am playing around with the Bitcoin blockchain to learn Scala and some useful libraries.
Currently I am trying to decode and encode Blocks with SCodec, and my problem is that the vectorOfN function takes its size as an Int. How can I use a long field for the size while still preserving the full value range?
In other words is there a vectorOfLongN function?
This is my code which would compile fine if I were using vintL instead of vlongL:
object Block {
  implicit val codec: Codec[Block] = {
    ("header" | Codec[BlockHeader]) ::
      (("numTx" | vlongL) >>:~ { numTx =>
        ("transactions" | vectorOfN(provide(numTx), Codec[Transaction])).hlist
      })
  }.as[Block]
}
You may assume that appropriate Codecs for the Blockheader and the Transactions are implemented. Actually, vlong is used as a simplification for this question, because Bitcoin uses its own codec for variable sized ints.
I'm not a scodec specialist, but my common sense suggests that this is not possible, because Scala's Vector, being a subtype of GenSeqLike, is limited to a length of type Int and an apply that accepts an Int index as its argument. As far as I understand, this limitation comes from the underlying JVM platform, where you can't have an array with more than Integer.MAX_VALUE elements, i.e. around 2^31 (see also the "Criticism of Java" wiki). And although Vector could theoretically have worked around this limitation, it was not done. So it makes no sense for vectorOfN to support a Long size either.
In other words, if you want something like this, you probably should start from creating your own Vector-like class that does support Long indices working around JVM limitations.
You may want to take a look at scodec-stream, which comes in handy when all of your data is not available immediately or does not fit into memory.
Basically, you would use your usual codecs.X and turn it into a StreamDecoder via scodec.stream.decode.many(normal_codec). This way you can work with the data through scodec without the need to load it into memory entirely.
A StreamDecoder then offers methods like decodeInputStream alongside scodec's usual decode.
(I used it a while ago in a slightly different context – parsing data sent by a client to a server – but it looks like it would apply here as well).

Does Swift's UnsafeMutablePointer<Float>.allocate(...) actually allocate memory?

I'm trying to understand Swift's unsafe pointer API for the purpose of manipulating audio samples.
The non-mutable pointer variants (UnsafePointer, UnsafeRawPointer, UnsafeBufferPointer) make sense to me; they are all used to reference previously allocated regions of memory on a read-only basis. There is no "allocate" type method for these variants.
The mutable variants (UnsafeMutablePointer, UnsafeMutableRawPointer), however, are documented as actually allocating the underlying memory. Example from the documentation for UnsafeMutablePointer (here):
static func allocate(capacity: Int)
Allocates uninitialized memory for the specified number of instances of type Pointee
However, there is no mention that UnsafeMutablePointer.allocate(capacity:) can fail, so it cannot actually be allocating memory. Conversely, if it does allocate real memory, how can you tell whether the allocation failed?
Any insights would be appreciated.
I decided to test this. I ran this program in CodeRunner:
import Foundation
sleep(10)
While the sleep function was executing, CodeRunner reported that this was taking 5.6 MB of RAM on my machine, giving us our baseline.
I then tried this program:
import Foundation
for _ in 0..<1000000 {
    let ptr = UnsafeMutablePointer<Float>.allocate(capacity: 1)
}
sleep(10)
Now, CodeRunner reports 5.8 MB of RAM usage. A little more than before, but certainly not the extra 4 MB that this should have taken up.
Finally, I assigned something to the pointer:
import Foundation
for _ in 0..<1000000 {
    let ptr = UnsafeMutablePointer<Float>.allocate(capacity: 1)
    ptr.pointee = 0
}
sleep(10)
Suddenly, the program is taking up 21.5 MB of RAM, finally showing the expected increase in RAM usage, although by a larger amount than I was expecting.
Making a profile in CodeRunner to compile with the optimizations turned on did not seem to make a difference in the behavior I was seeing.
So, surprisingly enough, it does appear that the call to UnsafeMutablePointer.allocate actually does not immediately allocate memory.
Operating systems can cheat a lot when it comes to memory allocations. If you request a block of memory of size N and don't actually put anything in it, the operating system can very well go "sure you can have a block of memory, here you go" and not really do anything with it. It's really more a promise that the memory will be available when used by the program.
Even with a very simple C program like the one below, macOS's Activity Monitor will report 945 kB at first, then 961 kB after calling malloc (which allocates the memory), and finally 257.1 MB after filling the allocated memory with zeroes.
From the point of view of the program, all 256 MB needed for the array of integers is available immediately after calling malloc, but that's actually a lie.
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char const *argv[])
{
	int count = 64*1024*1024;

	printf("Step 1: No memory allocated yet. Check memory usage for baseline, then press enter to continue (1/3)");
	getchar();

	/* Allocate big block of memory */
	int *p = malloc(count*sizeof(int));
	if (p == NULL) return 1; // failed to allocate

	printf("Step 2: Memory allocated. Check memory usage, then press any key to continue (2/3)");
	getchar();

	/* Fill with zeroes */
	for (int i=0; i < count; i++) {
		p[i] = 0;
	}

	printf("Step 3: Memory filled with zeroes. Check memory usage, then press any key to continue (3/3)");
	getchar();

	return 0;
}

async_work_group_copy from __constant

Code like this:
__constant char a[1] = "x";
...
__local char b[1];
async_work_group_copy(b, a, 1, 0);
throws a compile error:
no instance of overloaded function "async_work_group_copy" matches the argument list
So it seems that this function cannot be used to copy from the __constant address space. Am I right? If yes, what is the preferred method to make a copy of __constant data in __local memory for faster access? Currently I use a simple for loop in which each work-item copies several elements.
async_work_group_copy() is defined to copy between local and global memory only (see here: http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/).
As far as I know, there is no method to perform bulk copy from constant to local memory. Maybe the reason is that constant memory is actually cached on all GPUs that I know of, which essentially means that it works at the same speed as local memory.
The vloadn() family of functions can load whole vectors for all types of memory, including constant, so that may partially match what you need. However, it is not a bulk copy.
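For what it's worth, the manual copy the question describes (each work-item copying a strided subset of the elements, followed by a barrier) could look roughly like this sketch; the kernel name, TABLE_SIZE and the final use of the data are made up for illustration:
#define TABLE_SIZE 256

__kernel void use_table(__global float *out, __constant float *table)
{
	__local float cached[TABLE_SIZE];

	/* Each work-item copies a strided subset of the constant data into local memory */
	for (size_t i = get_local_id(0); i < TABLE_SIZE; i += get_local_size(0))
		cached[i] = table[i];

	/* Make the local copy visible to the whole work-group before anyone reads it */
	barrier(CLK_LOCAL_MEM_FENCE);

	out[get_global_id(0)] = cached[get_global_id(0) % TABLE_SIZE];
}
Whether this is actually faster than reading from constant memory directly depends on the hardware's constant cache, as noted above.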

How does the auto-free()ing work when I use functions like mktemp()?

Greetings,
I'm using mktemp() (iPhone SDK), and this function returns a char * to the new file name in which all the "X"s have been replaced by random letters.
What confuses me is the fact that the returned string is automatically free()d. How (and when) does that happen? I doubt it has anything to do with the Cocoa event loop. Is it automatically freed by the kernel?
Thanks in advance!
mktemp just modifies the buffer you pass in and returns the same pointer you pass in; there is no extra buffer to be free'd.
That's at least how the OS X man page describes it (I couldn't find documentation for iPhone), and the POSIX man page agrees (although the example in the POSIX man page looks wrong, as it passes in a pointer to a string literal, possibly an old remnant; the function is also marked as legacy, and you should use mkstemp instead. The OS X man page specifically mentions passing a string literal as being an error).
So, this is what will happen:
char template[] = "/tmp/fooXXXXXX";
char *ptr;

if ((ptr = mktemp(template)) != NULL) {
	/* will be true: mktemp just returns the same pointer you pass in */
	assert(ptr == template);
}
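Since mkstemp is the recommended replacement, here is a minimal sketch of that pattern (the path and the cleanup are illustrative):
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	char template[] = "/tmp/fooXXXXXX";

	/* mkstemp fills in the X's in place and atomically creates and opens the file */
	int fd = mkstemp(template);
	if (fd == -1) {
		perror("mkstemp");
		return 1;
	}

	printf("created %s\n", template);  /* template now holds the generated name */
	close(fd);
	unlink(template);                  /* remove the file when done */
	return 0;
}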
If it's like the cygwin function of the same name, then it's returning a pointer to an internal static character buffer that will be overwritten by the next call to mktemp(). On cygwin, the mktemp man page specifically mentions _mktemp_r() and similar functions that are guaranteed reentrant and use a caller-provided buffer.