Suitable NXP card for MUSCLE applet - applet

I'm checking the JCOP product range for a version suitable for the MUSCLE applet.
The CAP file size of this applet is 14 KB.
Which versions can be used for this applet?
http://www.nxp.com/documents/line_card/75016728.pdf
Which parameter should I check: EEPROM or ROM?

This is slightly off topic, and it's probably not a good idea to refer to a specific product, as products are removed and added over time.
The .cap file size is a starting point, but note that the installed applet will take a bit less. However, the .cap file size does not take into account the memory required for installation and personalization.
You obviously need the asymmetric co-processor but you may not need contactless, so check for that.
Unless you want to pay for creating a ROM mask you should be looking at EEPROM or Flash (where available).
I'll quote Wikipedia:
However, the one-time masking cost is high and there is a long turn-around time from design to product phase. Design errors are costly: if an error in the data or code is found, the mask ROM is useless and must be replaced in order to change the code or data.
Start thinking about ROM after you've sold a few hundred thousand :)

Can Someone Give me a high-level overview of the VSWS Algorithm used in Operating Systems?

I am trying to find videos/resources that can give me a simple, clear, concise description of the VSWS algorithm but I cannot seem to find any. Any help would be appreciated!
Can Someone Give me a high-level overview of the VSWS Algorithm...
The basic idea of the Variable-Interval Sampled Working Set algorithm is:
each virtual page has a "was used" flag
while the program is running, if/when the program uses a virtual page (including when the page's data had to be fetched from elsewhere/disk before it could be used) the CPU or OS sets the page's "was used" flag.
after a variable amount of time, the OS checks all the "was used" flags and decides that any page that wasn't used is not part of the working set (and may evict it to free up physical memory); it then clears all the "was used" flags (ready for the next variable amount of time).
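The three steps above can be sketched as a toy model. This is only an illustration under stated assumptions: the class and method names are made up, pages are plain integers, and the policy that decides when an interval ends (the "variable" part) is left to the caller.

```python
class SampledWorkingSet:
    """Toy model of the "was used" flag scheme (hypothetical names)."""

    def __init__(self):
        # Resident virtual page number -> "was used" flag.
        self.was_used = {}

    def touch(self, page):
        # Any use of a page, including the fault that first brings it in,
        # sets its flag.
        self.was_used[page] = True

    def end_of_interval(self):
        # Pages whose flag is still clear are treated as outside the
        # working set and evicted.
        evicted = sorted(p for p, used in self.was_used.items() if not used)
        for page in evicted:
            del self.was_used[page]
        # Clear the surviving flags, ready for the next interval.
        for page in self.was_used:
            self.was_used[page] = False
        return evicted


ws = SampledWorkingSet()
for page in (1, 2, 3):
    ws.touch(page)
print(ws.end_of_interval())  # [] (everything was used)
ws.touch(1)
print(ws.end_of_interval())  # [2, 3]
```

A real implementation would read the flags from hardware "accessed" bits where available rather than tracking them in software.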
... used in Operating Systems?
I wouldn't assume it's actually used in modern operating systems.
Most operating systems use something loosely based on "least recently used"; where a similar "variable sampling" approach is used to build up an estimate of "time when page was used last" (and not merely a single "was used" flag), which is then used to estimate "probability of future use"; which might then be combined with "cost of eviction" and "priority of program" to come up with a combined score; where the pages with the worst score are deemed "best to evict to free up physical memory".
Note 1: If a page was modified and needs to be written to swap space (and then possibly loaded back from swap space later) then it has a higher "cost of eviction"; and if a page hasn't been modified since it was fetched from a file or swap space last then it has a lower "cost of eviction". To improve performance (reduce the cost of eviction, not forgetting that estimates are crude and often poorly predict future use) it'd make sense to prefer the eviction of "cheaper to evict" pages.
Note 2: When there are multiple tasks running, it's good to give some tasks preferential treatment. For an extreme example, imagine the OS is under "low memory" conditions and constantly thrashing (transferring data to/from) disks, and an admin/user is trying to terminate the buggy program that is causing all the disk thrashing but can't, because the tools they need to fix the problem are unresponsive (those tools were not given preferential treatment and have to be fetched from the "already being thrashed" disk).
Note 3: In some cases (e.g. a task called sleep() and it's trivial to determine that it will wake up soon) it's possible to use other information to get a better estimate of "probability of future use" than a simple "least recently used" algorithm could provide.
Note 4: Typically when an OS needs to free up some physical memory there's other things (e.g. file data caches) that could also be considered (and could also participate in that "calculate a score and evict whatever has the worst score" system).
Note 5: Modern systems also pre-fetch data (e.g. from files, etc) before the data is actually requested. It's entirely possible for pre-fetched "not requested by any program, not used at all yet" data to be more important than "explicitly requested and previously used" data.
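As a rough illustration of the "combined score" idea (not how any particular OS does it), here is a minimal sketch. The field names, the weights and the linear formula are all assumptions made up for the example; the only point is that idle time, cost of eviction and task priority feed one number, and the page with the worst number is evicted.

```python
from dataclasses import dataclass


@dataclass
class Page:
    name: str
    last_used: float  # estimated time of last use, in seconds
    dirty: bool       # modified since it was last fetched or written back
    priority: int     # higher means the owning task matters more


def keep_score(page, now, dirty_bonus=5.0, priority_weight=2.0):
    # Higher score = more worth keeping. The weights are arbitrary.
    score = -(now - page.last_used)            # long idle time lowers the score
    if page.dirty:
        score += dirty_bonus                   # dirty pages cost more to evict
    score += priority_weight * page.priority   # preferential treatment
    return score


def pick_victim(pages, now):
    # The page with the worst (lowest) score is the best one to evict.
    return min(pages, key=lambda p: keep_score(p, now))


pages = [
    Page("a", last_used=90.0, dirty=False, priority=0),
    Page("b", last_used=88.0, dirty=True, priority=0),
    Page("c", last_used=89.0, dirty=False, priority=5),
]
print(pick_victim(pages, now=100.0).name)  # a
```

Page "b" has been idle longer than "a", but it is dirty, so the clean page "a" is cheaper to evict and is chosen first.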

Why is Page Size specified as part of Instruction Set Architecture?

I am trying to understand why Page Size is specified as part of an ISA.
More specifically, I am looking for details where any of the hardware modules (MMU, TLB) (apart from the Operating System) use the Page Size information to provide a certain functionality.
Please let me know the reasons Page Size has to be part of the ISA instead of just being decided by the OS.
Thanks.
The TLB hardware has to know the page size to figure out whether a translation applies to an address or not. e.g. given a translation, does an address 2500 bytes above it use that translation or not?
Or to put it another way, the TLB has to know which address bits are part of the page offset (within a page), and which bits need translating from virtual to physical.
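To make the "which bits are the page offset" point concrete, here is a small sketch. The function names are made up for illustration, and the page sizes are just examples; the point is that the answer to the "2500 bytes above" question depends entirely on the page size.

```python
def split_address(vaddr, page_size):
    """Split a virtual address into (virtual page number, offset in page)."""
    if page_size <= 0 or page_size & (page_size - 1):
        raise ValueError("page size must be a power of two")
    offset_bits = page_size.bit_length() - 1
    return vaddr >> offset_bits, vaddr & (page_size - 1)


def same_translation(a, b, page_size):
    # Two addresses share a translation only if their page numbers match.
    return split_address(a, page_size)[0] == split_address(b, page_size)[0]


base = 0x1000
print(same_translation(base, base + 2500, 4096))  # True
print(same_translation(base, base + 2500, 1024))  # False
```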
Also, on architectures with HW page walk, the whole page table format is part of the ISA, and the typical design uses the virtual page number as an index to find the right entry (e.g. x86-64's 4-level page tables). Not a linear or binary search through the page table to find an entry that contains the virtual address being searched for. Normally this same design is used for page tables walked by software, AFAIK.
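For x86-64 with 4-level paging and 4k pages, using the virtual page number as an index looks roughly like this. It is a sketch of the address split only (9 index bits per level, 12 offset bits); the function name is made up, and reading the actual table entries from memory is not shown.

```python
def walk_indices(vaddr):
    """Return ([PML4, PDPT, PD, PT] indices, page offset) for a 48-bit address."""
    offset = vaddr & 0xFFF  # low 12 bits: offset within a 4k page
    # Each level is indexed by the next 9 bits, top level first.
    indices = [(vaddr >> shift) & 0x1FF for shift in (39, 30, 21, 12)]
    return indices, offset


vaddr = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x5
print(walk_indices(vaddr))  # ([1, 2, 3, 4], 5)
```

Each index selects one entry directly, so the walk costs one memory access per level and needs no searching.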
It is possible to build a TLB where each entry has a mask to control how many address bits it matches, i.e. where a single TLB can have entries for pages of multiple sizes. This only works if pages have power-of-2 sizes and are naturally aligned (i.e. the start address of a page is always some multiple of its size, so zeroing the low bits of an address inside a page gives you the page-start address).
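A software model of such a masked TLB might look like the following. This is only a sketch with invented names; real hardware compares all entries in parallel rather than looping, and the addresses are arbitrary examples.

```python
class TlbEntry:
    """One translation covering a power-of-2 sized, naturally aligned page."""

    def __init__(self, vbase, pbase, size):
        if size <= 0 or size & (size - 1):
            raise ValueError("page size must be a power of two")
        if vbase % size or pbase % size:
            raise ValueError("page must be naturally aligned")
        self.vbase = vbase
        self.pbase = pbase
        self.offset_mask = size - 1  # low bits that pass through untranslated

    def translate(self, vaddr):
        # Zeroing the offset bits gives the page-start address to compare.
        if (vaddr & ~self.offset_mask) != self.vbase:
            return None
        return self.pbase | (vaddr & self.offset_mask)


def lookup(entries, vaddr):
    for entry in entries:
        paddr = entry.translate(vaddr)
        if paddr is not None:
            return paddr
    return None  # TLB miss


tlb = [
    TlbEntry(0x1000, 0x8000, 4096),              # a 4k page
    TlbEntry(0x200000, 0x40000000, 2 * 1024**2), # a 2M page
]
print(hex(lookup(tlb, 0x1234)))    # 0x8234
print(hex(lookup(tlb, 0x212345)))  # 0x40012345
print(lookup(tlb, 0x3000))         # None
```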
You could potentially use this with an extent-based page-table format, where you have one entry for each contiguous mapping instead of one entry for each page.
Page-walks would probably be more costly, having to check more entries for more mappings, but the same number of TLB entries could cover more address space.
In many cases OSes want to be able to easily unmap or even page out unused pages, and this conflicts with using huge pages that cover a mix of hot and cold data or especially code. (But normal fixed-size hugepages have this problem, too, so x86-64's 2M and 1G hugepages aren't always a win vs. standard 4k pages.)
Page size isn't a part of the ISA (what a compiler would normally emit) for x86_64. The instruction set architecture for x86_64 is formally known as Intel® 64 Architecture, and it is briefly described in section 2.2.10 (volume 1) of the Intel® 64 and IA-32 Architectures Software Developer’s Manual. It describes what an application program can see and do. There is something similar for ARMv8.
Instead, page size is left to the OS, and it isn't a part of the ISA. This is because page sizes can vary amongst implementations and can vary according to mode settings (4K/2M/4M/1G). x86_64 implementations present something like an ISA to the OS which Intel refers to as the system programming level (what an OS would use). That's described in Chapter 13 of volume 2 of Intel's Software Developer's Manual.
That level describes page sizes and modes. But a 'correct' application program should run with different page sizes on different systems in different page size modes.

How can I limit the number of blocks written in a Write_10 command?

I have a product that is basically a USB flash drive based on an NXP LPC18xx microcontroller. I'm using a library provided from the manufacturer (LPCOpen) that handles the USB MSC and the SD card media (which is where I store data).
Here is the problem: internally, the LPC18xx has a 64kB buffer (limited by hardware) used to cache reads/writes, which means it can only cache up to 128 blocks (512B each). The SCSI Write-10 command has a total-blocks field that can be up to 256 blocks (128kB). When I originally tested the product on Windows 7, the host never wrote more than 128 blocks at a time, but on Linux it sometimes writes more than 128 blocks, which causes the microcontroller to crash.
Is there a way to tell the host OS not to request more than 128 blocks? I see references[1] to a Read-Block-Limit command(05h) but it doesn't seem to be widely supported. Also, what sense key would I return on the Write-10 command to tell Linux the write is too large? I also see references to a block limit VPD page in some device spec sheets but cannot find a lot of documentation about how it is implemented.
[1]https://en.wikipedia.org/wiki/SCSI_command
Let me offer a disclaimer up front that this is what you SHOULD do, but none of this may work. A cursory search of the Linux SCSI driver didn't show me what I wanted to see. So, I'm not at all sure that "doing the right thing" will get you the results you want.
Going by the book, you've got to do two things: implement the Block Limits VPD page and handle too-large transfer sizes in WRITE and READ.
First, implement the Block Limits VPD page, which you can find in late revisions of SBC-3 floating around on the Internet (like this one: http://www.13thmonkey.org/documentation/SCSI/sbc3r25.pdf). It's probably worth going to the t10.org site, registering, and then downloading the last revision (http://www.t10.org/cgi-bin/ac.pl?t=f&f=sbc3r36.pdf).
The Block Limits VPD page has a maximum transfer length field that specifies the maximum number of blocks that can be transferred by all the READ and WRITE commands, and basically anything else that reads or writes data. Of course the downside of implementing this page is that you have to make sure that all the other fields you return are correct!
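As a rough sketch of what the page data could look like, here is a builder for a 64-byte Block Limits page. The byte offsets (page code B0h, page length 003Ch, MAXIMUM TRANSFER LENGTH at bytes 8 to 11, OPTIMAL TRANSFER LENGTH at bytes 12 to 15) are my reading of SBC-3 and should be checked against the revision you implement; the function name is made up, and every other field is left at zero, which I am assuming means "no limit reported".

```python
import struct


def block_limits_vpd(max_transfer_blocks, optimal_transfer_blocks=0):
    """Build a Block Limits VPD page (offsets assumed from SBC-3)."""
    page = bytearray(64)
    page[0] = 0x00                           # direct-access block device
    page[1] = 0xB0                           # page code: Block Limits
    struct.pack_into(">H", page, 2, 0x003C)  # page length: 60 bytes follow
    struct.pack_into(">I", page, 8, max_transfer_blocks)
    struct.pack_into(">I", page, 12, optimal_transfer_blocks)
    return bytes(page)


page = block_limits_vpd(128)
print(len(page), page[8:12].hex())  # 64 00000080
```

The device would return this in response to an INQUIRY with the EVPD bit set and page code B0h, and B0h would also need to appear in the Supported VPD Pages list (page 00h).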
Second, when handling READ and WRITE, if the command's transfer length exceeds your maximum, respond with an ILLEGAL REQUEST key, and set the additional sense code to INVALID FIELD IN CDB. This behavior is indicated by a table in the section that describes the Block Limits VPD, but only in late revisions of SBC-3 (I'm looking at 35h).
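A minimal sketch of that check, written in Python for readability rather than as firmware code. The function names and the 128-block limit are assumptions for this example; the WRITE(10) transfer length is the big-endian field in CDB bytes 7 and 8, and the sense data uses the fixed format (response code 70h) with sense key 5h and additional sense code 24h.

```python
import struct

MAX_BLOCKS = 128                 # assumed device limit (64kB / 512B)
ILLEGAL_REQUEST = 0x05           # sense key
ASC_INVALID_FIELD_IN_CDB = 0x24  # additional sense code, ASCQ 00h


def transfer_length_10(cdb):
    # READ(10)/WRITE(10): TRANSFER LENGTH is big-endian in bytes 7-8.
    return struct.unpack_from(">H", cdb, 7)[0]


def fixed_sense(key, asc, ascq=0):
    sense = bytearray(18)
    sense[0] = 0x70  # current error, fixed format
    sense[2] = key
    sense[7] = 10    # additional sense length (bytes after byte 7)
    sense[12] = asc
    sense[13] = ascq
    return bytes(sense)


def check_write_10(cdb):
    """Return None if the transfer fits, else sense data for the failure."""
    if transfer_length_10(cdb) > MAX_BLOCKS:
        return fixed_sense(ILLEGAL_REQUEST, ASC_INVALID_FIELD_IN_CDB)
    return None


too_big = bytes([0x2A, 0, 0, 0, 0, 0, 0, 0x00, 0xF0, 0])  # 240 blocks
print(check_write_10(too_big).hex())
```

How the failure reaches the host depends on the transport and is not shown here; with USB mass storage the command would typically be failed first and the sense data returned on a later REQUEST SENSE.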
You might just start with returning INVALID FIELD IN CDB, since it's the easiest course of action, and see if that's enough.

About Operating System, about page-table entries status bits

In the movie The Social Network, when Mark Zuckerberg was in class, the teacher asked this question:
Suppose we're given a computer with a 16-bit virtual address and a page size of 256 bytes. The system uses one-level page tables that start at address hex 400. Maybe you want DMA (Direct Memory Access) on your 16-bit system. Who knows? The first pages are reserved for hardware flags, etc. Assume page-table entries have eight status bits. The eight status bits would then be ...
Mark Zuckerberg answered:
One valid bit, one modified bit, one reference bit and five permission bits.
How did he get this?
http://chomaloma.blogspot.com.au/2011/02/social-network-inaccuracies-regarding.html
That does explain it a little
Intel nomenclature in parentheses. The 'valid' (present), 'modified' (dirty) and 'reference' (accessed) bits are the minimum set of bits you need for a demand paging manager and MMU.
The 'valid' (present) bit is used by the MMU to know whether the page is mapped to a valid physical address.
The 'modified' (dirty) bit is used by the demand paging manager to determine if the page being evicted needs to be written to backing media. As accessing backing media can be considered an expensive operation, you really want to keep this to a minimum--especially when writing to it as that is generally slower than reading from it.
The 'reference' (accessed) bit is useful to the demand paging manager to figure out how to age the pages it controls. You don't want to evict the most frequently used pages as that would require saving and/or loading them repeatedly from backing store (which has already been stated as SLOW).
The remaining five bits are gravy. They are free to use as permission and/or option bits. For example, can the page be accessed by supervisor and/or user threads? Is the page available for write, or is it read-only? What is the caching strategy to be used on the page?
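One way to picture the eight status bits is as a flag set. The bit positions and the names of the five permission/option bits below are made up for illustration; the movie question does not define them, and real MMUs each choose their own layout.

```python
from enum import IntFlag


class PteFlags(IntFlag):
    VALID = 0x80          # present: page maps to a valid physical address
    MODIFIED = 0x40       # dirty: must be written back before eviction
    REFERENCED = 0x20     # accessed: used for aging pages
    # The remaining five bits: a hypothetical choice of permissions/options.
    USER = 0x10
    WRITE = 0x08
    EXECUTE = 0x04
    CACHE_DISABLE = 0x02
    GLOBAL = 0x01


def needs_writeback(flags):
    # Only a valid page that has been modified must go to backing media.
    wanted = PteFlags.VALID | PteFlags.MODIFIED
    return (flags & wanted) == wanted


entry = PteFlags.VALID | PteFlags.REFERENCED | PteFlags.USER
print(needs_writeback(entry))                      # False
print(needs_writeback(entry | PteFlags.MODIFIED))  # True
```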
Hope this helps.
Sparky
How did he get the answer? That is just movie BS.
If you take the number of bits in the address and subtract the number of bits used to represent a page, you get the number of bits available for the processor to use as system status bits.
With that information, he could identify the number of system status bits.
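The arithmetic for the movie's numbers can be spelled out as below. This sketch assumes a page-table entry is as wide as a virtual address (16 bits) and that physical page numbers are as wide as virtual ones; neither is stated in the question, and the function name is made up.

```python
def page_table_layout(addr_bits, page_size, entry_bits):
    """Return (page number bits, status bits left in one entry)."""
    offset_bits = page_size.bit_length() - 1    # 256-byte pages -> 8 bits
    page_number_bits = addr_bits - offset_bits  # 16 - 8 = 8
    status_bits = entry_bits - page_number_bits # 16 - 8 = 8
    return page_number_bits, status_bits


print(page_table_layout(16, 256, 16))  # (8, 8)
```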
The usage of those bits is another story. The allocation of system status bits is system dependent. Maybe they exist, but I don't know of any 16-bit virtual addressing system. So he's not referring to any specific type of system.
A reference bit is not used by all systems (e.g. VMS). That's not even mandatory.
Hollywood magic.

Can 'mov' instructions, which do not require any offset/displacement added to it, be executed without any assistance of ALU?

I've recently started exploring the field of computer architecture. While studying instruction set architecture, I came across the 'mov' instruction, which copies data from one location to another. I understand that some types of 'mov' instructions are conditional, while others need an offset or displacement added to find a particular address, and hence need ALU assistance. For example: Base-plus-index, Register relative, Base relative-plus-index, Scaled index etc.
I was wondering if it is possible to bypass the ALU for those 'mov' instructions (e.g. register-to-register data transfer) that do not require any ALU assistance.
Yes. Obviously, an instruction that doesn't require any arithmetic to be performed doesn't require the assistance of the ALU.
Obviously, though, it still requires the "intervention of microprocessor"; the registers, program counter, instruction fetch/decode/execute pipeline are all part of the CPU.