Determine filename at an LBA (FAT) - UEFI

My goal is to translate logged block-level accesses (LBAs) to file names. I am logging at the UEFI level, so programs and boot loaders mostly read from the ESP (EFI System Partition), which uses a FAT-family filesystem.
I know that fsutil volume querycluster can do this for NTFS; is there a solution for FAT?
It is important that this works on a mounted volume, not on an image.

A straightforward way to map LBAs (sectors) to filenames (via inodes) is to chain the TSK (The Sleuth Kit) utilities mmls, fls and istat:
mmls to identify the ESP partition offset
fls to retrieve [filename, inode] tuples for all files in the ESP
istat to retrieve the inode -> sectors mapping
I created a script that produces an index from inodes (filenames) to EFI partition sectors; everybody is welcome to use it as a reference.
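A rough, untested sketch of that pipeline in Python is below. The device path, the way the ESP line is picked out of mmls, and the exact fls/istat output parsing are assumptions and may need adjusting for your TSK version.

#!/usr/bin/env python3
# Sketch only: build an LBA -> filename index for the ESP using TSK tools.
import re
import subprocess

DEVICE = r"\\.\PhysicalDrive0"  # hypothetical; point this at the mounted disk

def run(args):
    return subprocess.run(args, capture_output=True, text=True, check=True).stdout

# 1) mmls: find the start sector (offset) of the EFI System Partition
esp_offset = None
for line in run(["mmls", DEVICE]).splitlines():
    if "EFI system partition" in line or "FAT" in line.upper():
        esp_offset = int(line.split()[2])  # "Start" column
        break

# 2) fls: recursively list all files as (inode, path) tuples
entries = []
for line in run(["fls", "-r", "-p", "-o", str(esp_offset), DEVICE]).splitlines():
    m = re.match(r".+?\s(\d+):\s+(.*)", line)  # e.g. "r/r 12:  EFI/BOOT/BOOTX64.EFI"
    if m:
        entries.append((m.group(1), m.group(2)))

# 3) istat: map every inode to the sectors it occupies
lba_to_file = {}
for inode, path in entries:
    in_sectors = False
    for line in run(["istat", "-o", str(esp_offset), DEVICE, inode]).splitlines():
        if line.startswith("Sectors:"):
            in_sectors = True
            continue
        if in_sectors:
            for tok in line.split():
                if tok.isdigit():
                    # istat lists sectors relative to the partition; add the
                    # partition offset to get absolute disk LBAs
                    lba_to_file[esp_offset + int(tok)] = path

print(lba_to_file)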


TextIO.Read().From() vs TextIO.ReadFiles() over withHintMatchesManyFiles()

In my use case I am getting a set of file patterns to match from Kafka:
PCollection<String> filepatterns = p.apply(KafkaIO.read()...);
Here, each pattern could match up to 300+ files.
Q1. How can I use TextIO.Read() to match data from a PCollection, since withHintMatchesManyFiles() is available only for TextIO.Read() and not for TextIO.ReadFiles()?
Q2. If the path FileIO.match() -> FileIO.readMatches() -> TextIO.readFiles() is used, where withHintMatchesManyFiles() isn't available, how will that impact read performance?
Q3. Is there any other, more optimized path for the above use case?
Yes, you can't have withHintMatchesManyFiles() with TextIO.ReadFiles() out of the box. Actually, TextIO.Read().withHintMatchesManyFiles() is implemented via FileIO transforms + TextIO.ReadFiles() (see the implementation for details). In this way, FileIO.readMatches() should distribute the file reading over the workers.
So, I think you can use the same approach while reading the file patterns from the Kafka topic.
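The question is about the Java SDK, but just to sketch the shape of that composition (match the patterns, then read the matches), here is the equivalent chain in the Python SDK, with a Create used as a stand-in for the Kafka source:

import apache_beam as beam
from apache_beam.io import fileio

with beam.Pipeline() as p:
    # Stand-in for the Kafka source: a PCollection of file patterns.
    patterns = p | "Patterns" >> beam.Create(["/data/input-*.txt"])
    lines = (
        patterns
        | "MatchAll" >> fileio.MatchAll()        # expand each pattern to matched files
        | "ReadMatches" >> fileio.ReadMatches()  # open each match as a ReadableFile
        | "ReadLines" >> beam.FlatMap(lambda f: f.read_utf8().splitlines())
    )

In Java, FileIO.matchAll() plays the role of MatchAll here, since the patterns arrive as a PCollection rather than a single string.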
How can I use TextIO.Read() to match data from PCollection, as withHintMatchesManyFiles() available only for TextIO.Read() not for TextIO.ReadFiles().
My very limited understanding of Apache Beam in general and PTransforms in particular is that TextIO.read() creates a root PTransform that can only be used at the very beginning of the pipeline. In other words, TextIO.Read cannot be used after a PTransform of any kind.

How to access PCI Express configuration space via MMIO?

I am new to PCI Express. I want to read/write the PCI Express configuration space via MMIO addresses. I know how port-mapped I/O reads/writes the PCI config space via the 0xCF8 and 0xCFC port addresses (on x86), and I wrote a sample Linux kernel module that reads the PCI config space via port-mapped I/O, which worked fine. Now I want to do the same via MMIO/MMCFG access.
I searched around but could not find a convincing answer. I am looking for the details and also some sample code to understand it better.
Any help is appreciated.
Hardware
The base address of the MMIO area for the configuration space of each PCIe device in a PCI segment group is given in the ACPI MCFG table.
The MCFG table lists, for each PCI segment group, the first and last (inclusive) bus numbers of the segment group and the base address of its extended configuration space.
The MCFG table is set up by the BIOS/UEFI based upon the value of PCIEXBAR (which for my processor is at offset 60h) in the Host Bridge/DRAM Registers device located at 00:00.0.
This is the usual address; the device has been integrated into the processor since the Nehalem architecture and its address has never changed.
You can google your processor generation's datasheet to get the correct device address and register offset.
Also note that not all of the 256 MiB area may be mapped; my processor allows a 256/128/64 MiB mapping, with 128 MiB being the one selected by the BIOS/UEFI.
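As a quick illustration of the ECAM layout this table describes - each function gets a 4 KiB window, addressed by bus/device/function - the address arithmetic looks like this (the base address below is a made-up example; the real one comes from MCFG/PCIEXBAR):

# Sketch of the ECAM (MMCONFIG) address arithmetic; ECAM_BASE is an example value,
# read the real base from the MCFG table (or PCIEXBAR) on your machine.
ECAM_BASE = 0xE0000000

def ecam_address(bus, device, function, offset):
    assert device < 32 and function < 8 and offset < 4096
    return ECAM_BASE + (bus << 20) + (device << 15) + (function << 12) + offset

# e.g. the Vendor ID register (offset 0) of the host bridge at 00:00.0:
print(hex(ecam_address(0, 0, 0, 0x00)))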
Linux
I don't know how to correctly handle this in Linux; there are the pci_{read|write}_config_XXX functions, which seem to use the PCIe extended config space.
So accessing the config space should be very easy.
Alternatively, pci_mcfg_lookup() will give the physical address of the extended configuration space for a PCI segment group and a bus range (you should be able to make it work by defining a resource structure with only the start and end fields set to the bus numbers), in case you want a lower-level approach.
Finally, you could get the address of the MCFG table and (re)parse it yourself - I don't know exactly how to get that address in Linux.
There is an acpi_tb_find_table function where you can pass the signature of the table and null OEM and table IDs to get a table index.
At line 114 of the same file there is a piece of code that accesses a table by index; you can use it as documentation.
You will probably have to import one or more symbols from the ACPI module.
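Not the kernel-module path you asked about, but useful for sanity-checking whatever your module reads via MMCONFIG: Linux also exposes each function's configuration space through sysfs, which is easy to poke at, for example:

# Userspace cross-check via sysfs (not the MMIO kernel-module approach itself):
# the "config" node exposes the config space; the first 64 bytes are readable
# unprivileged, the full 4096 bytes (when ECAM is available) need root.
with open("/sys/bus/pci/devices/0000:00:00.0/config", "rb") as f:
    cfg = f.read(64)
vendor_id = int.from_bytes(cfg[0:2], "little")
device_id = int.from_bytes(cfg[2:4], "little")
print(f"00:00.0 vendor={vendor_id:#06x} device={device_id:#06x}")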

Two questions about OpenStack Swift ring function

I'm new to Swift and I'm trying to learn how it works. I have two questions for you regarding the ring and the consistent hashing algorithm.
When we want to store an object, we take its path (for example ".../v1/account_name/container_name/object_name.ext"), feed this path to the MD5 hash function, and obtain a hash value. From this hash value we take the first n bits, where n is the part power, and use those bits as the partition number. Now, if we look that partition number up in the ring, we can discover which node holds that partition and store the object there.
First question: what if that partition is full?
Now suppose that Swift stores the object on the correct node; the second question is: how does Swift decide where to store the replicas?
Thank you all!
how does Swift decide where to store the replicas?
When you create a ring, telling it about all the nodes and all the disks you have in your cluster, it automatically defines where each copy should go and also which handoff nodes to use in case of a failure. So, when you ask the ring where to find/store an object with the hash ABC123DEF..., it will answer something like:
Look here:
SERVER1/DISK2/PATH/TO/FILE
SERVER2/DISK4/PATH/TO/FILE
SERVER4/DISK1/PATH/TO/FILE
And if you don't find it there, look here:
Handoff: SERVER2/DISK2/PATH/TO/FILE
Handoff: SERVER8/DISK7/PATH/TO/FILE
Handoff: SERVER3/DISK1/PATH/TO/FILE
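In code, "asking the ring" looks roughly like the following sketch (the ring file path and the account/container/object names are examples, and the per-cluster hash prefix/suffix that Swift mixes into the MD5 is omitted):

from hashlib import md5
import struct
from swift.common.ring import Ring

# The partition is just the top part_power bits of the MD5 of the object path.
part_power = 18
path = "/account_name/container_name/object_name.ext"
partition = struct.unpack_from(">I", md5(path.encode()).digest())[0] >> (32 - part_power)

# The ring then answers with the primary locations (one per replica) plus handoffs.
ring = Ring("/etc/swift/object.ring.gz")
part, primaries = ring.get_nodes("account_name", "container_name", "object_name.ext")
for node in primaries:
    print("primary:", node["ip"], node["device"])
for node in ring.get_more_nodes(part):   # handoffs, used only when a primary is unavailable
    print("handoff:", node["ip"], node["device"])
    break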

Read the size of a member of a PDS

How do I read the size of a member of a PDS file? A PDS file has many members. If we browse a PDS we can see the member names, their creation date, time, user, size, etc.
So how can I get these attributes separately into variables?
As Bruce mentioned, LMMFIND will return the ISPF statistics for a PDS member. Make sure to use STATS(YES) on the service. Also, you may need to check for extended stats (variable ZLEXT). A site may choose to use extended stats if the member size exceeds 64K. Please refer to the ISPF Services Guide. Below is the link for LMMFIND in the 2.1 manual.
https://www.ibm.com/support/knowledgecenter/SSLTBW_2.1.0/com.ibm.zos.v2r1.f54sg00/lmmfind.htm%23lmmfind
If running under TSO/ISPF (you can do this in batch as well), you can use the LMMFIND service with the STATS(YES) option. You will have to do
lminit
lmopen
before you do the LMMFIND. Also, as zarchasmpgmr mentioned, you will need to do LMCLOSE and LMFREE afterwards.
If you want to display a member list, have a look at LMMDISP
Be aware that the statistics you write about are maintained by ISPF. There are many ways to create a member outside of ISPF, and those members will not have those statistics. The ISPF service LMMSTATS will set those stats (the 3.5 R panel is a front end to that service). The order of calls is
LMINIT
LMMSTATS
LMFREE
For some reason, LMMSTATS doesn't require the LMOPEN and LMCLOSE services.
It's been ages, but if the PDS does not have ISPF stats (or the stats are dubious - e.g. they can be changed outside of ISPF), then you could look at processing the directory yourself.
However, by default a PDS directory doesn't contain that much info, so unless ISPF stats exist you can't get much out of it. What a PDS directory does have is a user-definable area, and it's that area that ISPF utilises. This area is preceded by a length descriptor (see the link for more info).
Another reason why ISPF stats may not exist or be accurate is that, if I recall correctly, you can remove the stats, which can free some directory space (potentially getting around an E37 abend, or preventing it from happening).
I can't recall having tried with Rexx (I did write an Assembler PDS unload utility), but you might be able to open and read the directory using EXECIO on the BASE name of the PDS (i.e. not including the member name).
The directory is blocked in 256-byte records. More info regarding the structure can be found here: PDS Directory. I don't believe this would work properly for PDSEs, though.

How should a class that wraps and provides access to a single file be designed?

MyClass is all about providing access to a single file. It must CheckHeader(), ReadSomeData(), UpdateHeader(WithInfo), etc.
But since the file that this class represents is very complex, it requires special design considerations.
That file contains a potentially huge folder-like tree structure with various node types and is block/cell based to handle fragmentation better. Size is usually smaller than 20 MB. It is not of my design.
How would you design such a class?
Read a ~20 MB stream into memory?
Put a copy in a temp dir and keep its path as a property?
Keep a copy of the big things in memory and expose them as read-only properties?
GetThings() from the file with exception-throwing code?
This class (or classes) will be used only by me at first, but if it ends up good enough I might open-source it.
(This is a question about design, but the platform is .NET and the class is about offline registry access for XP.)
It depends on what you need to do with this data. If you only need to process it linearly one time, then it might be faster to just take the performance hit of a large file in memory.
If however you need to do various things with the file beyond a single, linear parsing, I would parse the data into a lightweight database such as SQLite and then operate on that. This way all of your file's structure is preserved and all subsequent operations on the file will be faster.
Registry access is quite complex; you are basically reading a large binary tree. The class design should rely heavily on the stored data structures - only then can you choose an appropriate design. To stay flexible you should model the primitives such as REG_SZ, REG_EXPAND_SZ, DWORD, SubKey, .... Don Syme has, in his book Expert F#, a nice section about binary parsing with binary combinators. The basic idea is that your objects know by themselves how to deserialize from a binary representation. When you have a stream of bytes structured like this
<Header>
<Node1/>
<Node2>
<Directory1>
</Node2>
</Header>
you start with a BinaryReader to read the binary objects byte by byte. Since you know that the first thing must be the header, you can pass the reader to the Header object:
public class Header
{
    public static Header Deserialize(BinaryReader reader)
    {
        Header header = new Header();
        int magic = reader.ReadByte();
        if (magic == 0xf4)       // we have a node entry
            header.Insert(Node.Read(reader));
        else if (magic == 0xf3)  // directory entry
            header.Insert(DirectoryEntry.Read(reader));
        else
            throw new NotSupportedException("Invalid data");
        return header;
    }
}
To stay performant you can, for example, delay parsing the data until specific properties of an instance are actually accessed.
Since the registry in Windows can get quite big, it is not possible to read it completely into memory at once; you will need to chunk it. One solution that Windows applies is that the whole file is allocated in paged pool memory, which can span several gigabytes, but only the actually accessed parts are paged in from disk into memory. That allows Windows to deal with a very large registry file in an efficient manner. You will need something similar for your reader as well. Lazy parsing is one aspect, and the ability to jump around in the file without having to read the data in between is crucial to stay performant.
More infos about paged pool and the registry can be found there:
http://blogs.technet.com/b/markrussinovich/archive/2009/03/26/3211216.aspx
Your API design will depend on how you read the data to stay efficient (e.g. use a memory-mapped file and read from different mapped regions). With .NET 4 a memory-mapped file implementation has arrived that is quite good, but wrappers around the OS APIs exist as well.
Yours,
Alois Kraus
To support delayed loading from a memory-mapped file, it would make sense not to read the byte array into the object and parse it later, but to go one step further and store only the offset and length of the chunk within the memory-mapped file. Later, when the object is actually accessed, you can read and deserialize the data. This way you can traverse the whole file and build a tree of objects which contain only the offsets and a reference to the memory-mapped file. That should save huge amounts of memory.
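A small, language-agnostic sketch of that "offset + length only" idea, shown here with Python's mmap purely for illustration (in .NET, MemoryMappedFile/MemoryMappedViewAccessor play the same role; the file name and parser below are made up):

import mmap

class LazyNode:
    # Holds only the location of its payload; nothing is parsed until first access.
    def __init__(self, mm, offset, length):
        self._mm, self._offset, self._length = mm, offset, length
        self._payload = None

    @property
    def payload(self):
        if self._payload is None:                       # first access: read the slice now
            raw = self._mm[self._offset:self._offset + self._length]
            self._payload = parse_payload(raw)          # hypothetical per-node-type parser
        return self._payload

def parse_payload(raw):
    return raw                                          # stand-in for real deserialization

with open("hive.dat", "rb") as f:                       # hypothetical registry hive file
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    node = LazyNode(mm, offset=4096, length=128)        # offsets come from walking the tree
    print(len(node.payload))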