Is there a POSIX-way to extend fd_set? - select

Recent POSIX tells me that an fd_set is a structure capable of holding up to FD_SETSIZE fd's.
On Linux and glibc I find that FD_SETSIZE is 1024, the default (soft) RLIMIT_NOFILE is also 1024, and sysconf(_SC_OPEN_MAX) is also 1024.
I can increase the RLIMIT_NOFILE, but I cannot find anything in POSIX which tells me how to create an "extended fd_set" to match the new maximum number of fd's.
It looks like older POSIX used to specify that the fd_set contained an array fds_bits, and it would then be safe to create an "extended fd_set" as some multiple of sizeof(fds_bits[0]) (suitably aligned).
I find that various BSDs allow me to set FD_SETSIZE, provided that's done early enough. That doesn't seem to work for glibc and Linux :-(
It seems that NFDBITS is perhaps what I need -- which is a BSD thing, which glibc will give me under __USE_MISC. There is also an fd_mask type, where the bit vector is fd_mask xxx[xx].
What I cannot find is a POSIX way to do this. Have I missed something ?
I know that for thousands of fd's I should probably be using epoll or kqueue... but I feel there should be a standard way of extending fd_set beyond FD_SETSIZE.
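For anyone wanting to check the limits quoted above on their own system, here is a minimal sketch in Python (chosen only for brevity; the question itself is about C). It reads RLIMIT_NOFILE and sysconf(_SC_OPEN_MAX) and tries to raise the soft limit to the hard limit. The helper name raise_soft_nofile is made up for this illustration, and FD_SETSIZE is a compile-time C macro, so it cannot be queried this way.

```python
import os
import resource


def raise_soft_nofile():
    """Try to raise the soft RLIMIT_NOFILE to the hard limit; return (soft, hard)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    finite = resource.RLIM_INFINITY not in (soft, hard)
    if finite and soft < hard:
        try:
            # An unprivileged process may raise its soft limit up to the hard limit.
            resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
        except (ValueError, OSError):
            pass  # some systems cap the value below the reported hard limit
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print("before:", resource.getrlimit(resource.RLIMIT_NOFILE))
    print("sysconf(SC_OPEN_MAX):", os.sysconf("SC_OPEN_MAX"))
    print("after: ", raise_soft_nofile())
```

Raising the limit this way changes how many descriptors the process may open, but it does nothing to the size of an fd_set, which is the gap the question is about.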

Related

Programming in QuickBasic with repl.it?

I'm trying to get a "retro-computing" class open and would like to give people the opportunity to finish projects at home (without carrying a 3kb monstrosity out of 1980 with them). I've heard that repl.it has every programming language. Does it have QuickBasic, and how do I use it online? Thanks for the help in advance!
You can do it (hint: search for QBasic; it shares syntax with QuickBASIC), but you should be aware that it has some limitations as it's running on an incomplete JavaScript implementation. For completeness, I'll reproduce the info from the original blog post:
What works
Only text mode is supported. The most common commands (enough to run
nibbles) are implemented. These include:
Subs and functions
Arrays
User types
Shared variables
Loops
Input from screen
What doesn't work
Graphics modes are not supported
No statements are allowed on the same line as IF/THEN
Line numbers are not supported
Only the built-in functions used by NIBBLES.BAS are implemented
All subroutines and functions must be declared using DECLARE
This is far from being done. In the comments, AC0KG points out that
P=1-1 doesn't work.
In short, it would need another 50 or 100 hours of work and there is
no reason to do this.
One caveat that I haven't been able to resolve is input statements like INPUT or LINE INPUT: they just don't seem to work for me on repl.it, and I don't know where else one might find qb.js hosted.
My recommendation: FreeBASIC
I would recommend FreeBASIC instead, if possible. It's essentially a modern reimplementation with additional functionality (as far as I know, the compiler is now written in FreeBASIC itself, with a runtime library in C).
Old DOS stuff like the DEF SEG statement and VARSEG function are no longer applicable since it is a modern BASIC implementation operating on a 32-bit flat address space rather than 16-bit segmented memory. I'm not sure what the difference between the old SADD function and the new StrPtr function is, if there is any, but the idea is the same: return the address of the bytes that make up a string.
You could also disable some stuff and maintain QB compatibility using #lang "qb" as the first line of a program as there will be noticeable differences when using the default "fb" dialect, or you could embrace the new features and avoid the "qb" dialect, focusing primarily on the programming concepts instead; the choice is yours. Regardless of the dialect you choose, the basic stuff should work just fine:
DECLARE SUB collatz ()
DIM SHARED n AS INTEGER

INPUT "Enter a value for n: ", n
PRINT n
DO WHILE n <> 4
    collatz
    PRINT n
LOOP
PRINT 2
PRINT 1

SUB collatz
    IF n MOD 2 = 1 THEN
        n = 3 * n + 1
    ELSE
        n = n \ 2
    END IF
END SUB
A word about QB64
One might argue that there is a much more compatible transpiler known as QB64 (except for some things like DEF FN...), but I cannot recommend it if you want a tool for students to use. It's a large download for Windows users, and its syntax checking can be a bit poor at times, to the point that you might see the QB code compile only to see a cryptic message like "C++ compilation failed! See internals\temp\compile.txt for details". Simply put, it's usable and highly compatible, but it needs some work, like the qb.js script that repl.it uses.
An alternative: DOSBox and autorun
You could also find a way to run an actual copy of QB 4.5 in something like DOSBox and simply modify the [autoexec] section of the DOSBox configuration file (dosbox.conf, or whatever it's called in your version) to automatically launch QB. Then just repackage it with the modified configuration file in a nice installer for easy distribution (NSIS, Inno Setup, etc.). This will provide the most retro experience beyond something like a FreeDOS virtual machine, as you'll be dealing with the 16-bit segmented memory, VGA, etc., all emulated of course.

NetLogo: primitives or extension primitives to determine operating system?

I was curious as to whether or not anybody was aware of a built-in NetLogo or NetLogo extension primitive that allows you to determine the operating system that the user is currently running? I wish to alter directory separators according to the user's operating system and being able to determine this information would be incredibly useful.
If there isn't any such thing, I'll get to building it!
No built-in primitive exists.
It would probably be possible to hack something kludgy together in pure NetLogo by using file-exists? to test for the presence or absence of certain OS-specific files, for example /etc/passwd on Unix-like systems (including Mac OS X). As for the best files to use for this, I don't know, but it wouldn't surprise me if there was already an SO answer on this (since it isn't a NetLogo-specific question).
I thought maybe https://github.com/NetLogo/Shell-Extension/ had it. But I see now that although it has primitives for reading and setting environment variables, it doesn't have similar primitives for Java system properties, which is what you need here (System.getProperty("os.name")). It would make a nice addition to the extension, I think.
re: "alter directory separators according to the user's operating system" specifically:
If you need to deal with paths that are coming from the operating system, then yeah, you need to be prepared to deal with platform-specific separators.
If you're only sending paths to the operating system, you may not need to worry about it. I haven't used Windows in a long time, but iirc it might just work to use forward slash.
If you're doing pathname manipulation, you'll probably want to check out Charles Staelin's pathdir extension, https://github.com/cstaelin/Pathdir-Extension. It includes a pathdir:get-separator primitive, as well as lots of other useful-looking stuff.

What's the difference between open and sysopen in Perl?

It seems both do the same thing, huh?
Can someone show me an example where they do different jobs?
sysopen is a thin wrapper around the open(2) kernel system call (the arguments correspond directly), whereas open is a higher-level wrapper which enables you to do redirections, piping, etc.
Unless you are working with a specific device that requires some special flags to be passed at open(2) time, for ordinary files on disk you should be fine with open.
To quote perlopentut:
If you want the convenience of the shell, then Perl's open is
definitely the way to go. On the other hand, if you want finer
precision than C's simplistic fopen(3S) provides you should look to
Perl's sysopen, which is a direct hook into the open(2) system call.
That does mean it's a bit more involved, but that's the price of
precision.
Since Perl is written in C, both methods likely end up making the open(2) system call. The difference is that open() in Perl has some niceties built in that make opening, piping and redirection very easy. At the same time, though, open() takes away some flexibility. It has none of the Fcntl functionality available in sysopen(), nor does it have the masking functionality.
Most situations just need open().
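To make the difference concrete, here is a rough sketch of the same split in Python rather than Perl (an analogy only, not Perl code). Python's built-in open() is the convenient high-level call, while os.open() takes the open(2) flags and permission mask directly, much as sysopen does with the O_* constants from Fcntl. The helper name create_exclusive is invented for this example.

```python
import os
import tempfile


def create_exclusive(path):
    """Create path only if it does not already exist, with mode 0600 (before umask)."""
    # O_CREAT | O_EXCL makes open(2) fail atomically if the file is already there.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    return os.open(path, flags, 0o600)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "lockfile")

        fd = create_exclusive(path)
        os.write(fd, b"hello\n")
        os.close(fd)

        try:
            create_exclusive(path)
        except FileExistsError:
            print("second create refused: file already exists")

        # Reading it back needs no special flags, so the high-level open() is enough.
        with open(path) as fh:
            print(fh.read(), end="")
```

The exclusive-create flag and the explicit permission mask are the kind of thing the low-level call exists for; for plain reading and writing the high-level call is simpler.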

Which scripting languages support long (64 bit) integers well?

Perl has long been my choice scripting language but I've run into a horrible problem. By default there is no support for long (64 bit) integers. Most of the time an integer is just a string and they work for seeking in huge files but there are plenty of places they don't work, such as binary &, printf, pack, unpack, <<, >>.
Now these do work in newer versions of Perl but only if it is built with 64-bit integer support, which does not help if I want to make portable code to run on Perls built without this option. And you don't always get control over the Perl on a system your code runs on.
My question is do Python, PHP, and Ruby suffer from such a problem, or do they also depend on version and build options?
The size of high speed hardware integers (assuming the language has them) will always be dependent on whatever size integers are available to the compiler that compiled the language interpreter (usually C).
If you need cross-platform / cross-version big integer support, the Perl pragma use bigint; will do the trick. If you need more control, bigint is a wrapper around the module Math::BigInt.
In the scope where use bigint; is loaded, all of the integers in that scope will be transparently upgraded to Math::BigInt numbers. Lastly, when using any sort of big number library, be sure to not use tricks like 9**9**9 to get infinity, because you might be waiting a while :)
In Python, you never get integer overflows. Python switches the representation it uses automatically: Python 2 starts with the platform's native int and promotes to an arbitrary-precision long when a result no longer fits, and Python 3 has a single arbitrary-precision int type. As a result, you never have to worry about your integers becoming too large; Python just handles it naturally.
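A small sketch of what that looks like in practice, assuming Python 3. The operations the question lists as problematic (bitwise &, shifts, packing) are shown on values wider than 64 bits, and the struct module is used for the fixed-width case, where the width has to be stated explicitly.

```python
import struct

big = 2**64 + 1                      # does not fit in a 64-bit machine integer
print(big)                           # 18446744073709551617
print(big << 8)                      # shifting simply produces a longer integer
print(big & 0xFFFFFFFFFFFFFFFF)      # masking back down to 64 bits leaves 1

# Packing into a fixed-width binary field is where the width matters again.
packed = struct.pack("<Q", 2**64 - 1)     # largest unsigned 64-bit value
print(packed.hex())                       # ffffffffffffffff
print(struct.unpack("<Q", packed)[0] == 2**64 - 1)

try:
    struct.pack("<Q", 2**64)              # one more than fits in 64 bits
except struct.error as exc:
    print("does not fit in 64 bits:", exc)
```

None of this depends on how the interpreter was built, which is the portability concern raised in the question.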
Tcl 8.5's long integer support is pretty good from a user perspective. Internally, it represents integers as whatever type is necessary to hold them (up to and including bigints) and things that consume integers will take any of them (though might impose their own limits; you don't really want to use a number that will only fit in a bigint as a Unix file mode...)
The only time you really need to think about it at all is when you're going to/from some fixed-width binary format. That's reasonably obvious though (it's fixed width after all).
Excuse me sir, bigint and Math::BigInt are part of core modules. Just friggin' use one of them, it will work on any platform.

Is there some kind of tool to look at the encoding of Intel x86 instructions?

Forgive me if this might be a dumb question but, I'm in an assembly class that was mostly taught using an emulated CPU that was supposed to teach the concepts of assembly code. We haven't even written an Intel program, so I'm trying to adjust. In our emulated CPU, we were able to generate a symbol table file that gave the bytes equivalent for instructions:
http://imgur.com/tw5S8.png
Would I be able to do such a thing with Intel x86 instructions?
Try IDA. It has an option to show binary values of opcodes.
EDIT: Well.. it's a disassembler. Try opening a binary file, and set the number of opcode bytes to show (in Options/General/) to something that is not zero.
If you are looking for an IDE that shows you in real time the opcodes for the instruction you've used, then I don't think you'll find one, because of lack of "market". Can you explain why you need it? Do you want to know just their length, or want to learn them? There is a simple pattern for lengths, so by disassembling many binaries you'll catch it. If it's the opcodes you want.. well, there are lots of them, almost no rules, and practically no use in doing it.
I see.. then you have to generate the list file. Your assembler should have an option for that (for NASM it's -l listfile). Just put any instruction(s) in your .asm file, and generate a listing for it. It should contain the binary encoding for each instruction.
First, get the Intel Instruction Set Reference, or, better, this link: http://siyobik.info/index.php?module=x86 . There you'll find that most opcodes have several encodings. In your particular case, bit 1 of the opcode specifies direction, and since both operands are registers, you can toggle the direction and swap the register codes, and the result will be the same. Usually you have this freedom on most register-to-register arithmetic operations. To check this, try disassembling this source file with IDA:
db 02h, 0E0h
db 00h, 0C4h
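The direction-bit point can also be checked by hand. Below is a minimal sketch in Python that decodes just these two-byte forms, assuming only opcodes 00h and 02h (the 8-bit ADD pair) with register-direct operands; it is not a general x86 decoder, and the function name decode_add_r8 is made up for illustration.

```python
# 8-bit register names, indexed by their 3-bit encoding.
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]


def decode_add_r8(code):
    """Decode a two-byte register-to-register 8-bit ADD (opcode 00h or 02h)."""
    opcode, modrm = code[0], code[1]
    if opcode not in (0x00, 0x02):
        raise ValueError("only opcodes 00h and 02h are handled in this sketch")
    mod = modrm >> 6            # top two bits: addressing mode
    reg = (modrm >> 3) & 0b111  # middle three bits: register operand
    rm = modrm & 0b111          # low three bits: register/memory operand
    if mod != 0b11:
        raise ValueError("only register-direct operands (mod = 11b) are handled")
    # Bit 1 of the opcode is the direction bit:
    #   00h = ADD r/m8, r8  (destination is the r/m field)
    #   02h = ADD r8, r/m8  (destination is the reg field)
    if opcode & 0b10:
        dst, src = reg, rm
    else:
        dst, src = rm, reg
    return "add {}, {}".format(REG8[dst], REG8[src])


if __name__ == "__main__":
    for raw in (bytes([0x02, 0xE0]), bytes([0x00, 0xC4])):
        print(raw.hex(), "->", decode_add_r8(raw))
```

Both byte pairs decode to add ah, al: flipping the direction bit and swapping the reg and r/m fields cancel each other out.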
There is a demo program shipped with fasm.dll which has an editor and hex-viewer: