Doxygen documentation of Fortran77 code (no line breaks, parameters not documented) - doxygen

I want to write a comment before a Fortran77 routine (main or another subroutine), where the Doxygen (version 1.9.0) comment lines should appear in the Doxygen HTML documentation as in the source code (here four lines, with line breaks).
The documentation of the parameter TOTAL should also be displayed.
Can you reproduce that? What is the exact way to use Doxygen for this example?
Example (test.f):
!> header of doxygen documentation
!! first line of doxygen documentation
!! third line of doxygen documentation
!! @param AVRAGE information about the average
PROGRAM CH0502
C> THIS PROGRAM READS IN THREE NUMBERS AND SUMS
C> AND AVERAGES THEM.
C
IMPLICIT LOGICAL (A-Z)
REAL NUMBR1,NUMBR2,NUMBR3,AVRAGE,TOTAL
C
INTEGER N
N = 3
TOTAL = 0.0
PRINT *,'TYPE IN THREE NUMBERS'
PRINT *,'SEPARATED BY SPACES OR COMMAS'
READ *,NUMBR1,NUMBR2,NUMBR3
TOTAL= NUMBR1+NUMBR2+NUMBR3 !> @param TOTAL is the total
AVRAGE=TOTAL/N
PRINT *,'TOTAL OF NUMBERS IS',TOTAL
PRINT *,'AVERAGE OF THE NUMBERS IS',AVRAGE
END
doxygen -x Doxyfile gives as output:
# Difference with default Doxyfile 1.9.0 (71777ff3973331bd9453870593a762e184ba9f78)
PROJECT_NAME = "Fortran project"
OUTPUT_DIRECTORY = C:\Users\me\Desktop\doxygen_fortran_documentation
OPTIMIZE_FOR_FORTRAN = YES
EXTRACT_ALL = YES
INPUT = C:\Users\me\Desktop\doxygen_fortran
RECURSIVE = YES
GENERATE_TREEVIEW = YES
GENERATE_LATEX = NO
CLASS_DIAGRAMS = NO
The resulting documentation looks different than expected.

A comment cannot include images or formatted text, so I am posting this as an answer (which it partly is).
If I understand correctly, you would like each comment line to appear on its own line in the HTML output, something like:
[image not preserved]
Is this correct?
This can be accomplished in a number of ways:
by adding \n at the end of each line
by adding <br> at the end of each line
So the source code will look like:
!> header of doxygen documentation\n
!! first line of doxygen documentation\n
!! third line of doxygen documentation\n
!! @param AVRAGE information about the average
PROGRAM CH0502


How to read a number from text file via Matlab

I have 1000 text files and want to read a number from each file.
The format of each text file is:
af;laskjdf;lkasjda123241234123
$sakdfja;lskfj12352135qadsfasfa
falskdfjqwr1351
##alskgja;lksjgklajs23523,
asdfa#####1217653asl123654fjaksj
asdkjf23s#q23asjfklj
asko3
I need to read the number ("1217653") that follows "#####" in each text file.
The number directly follows the "#####" in every text file.
"#####" and the number directly following it appear only once in each file.
clc
clear
MyFolderInfo = dir('yourpath/folder');
fidin = fopen(file_name,'r','n','utf-8');
while ~feof(fidin)
tline=fgetl(fidin);
disp(tline)
end
fclose(fidin);
It is not finished yet. I am stuck on the problem that it cannot read past the blank line.
This is another approach, using the function regexp. It provides a more flexible way of reading files and does not require reading the full file in one go. The difference from the already given example is basically that I read the file line by line, but since the example in the question uses this approach, I believe it is worth answering. This will return all occurrences of "#####NUMBER".
function test()
    h = fopen('myfile.txt');
    str = fgetl(h);
    k = 1;
    res = {};
    % fgetl returns an empty char array for an empty line and the number -1 at EOF,
    % so ischar(str) stays true for blank lines and becomes false only at EOF.
    while ischar(str)
        res{k} = regexp(str,'#####\d+','match');
        k = k+1;
        str = fgetl(h);
    end
    fclose(h);
    for k=1:length(res)
        disp(res{k});
    end
end
EDIT
Using the expression '#####(\d+)' and the argument 'tokens' instead of 'match' will return only the digits after the "#####", as a string. Apart from showing another way to read the file, the intent of this post was also to show how to use regexp with a simple example. Both alternatives can be used with suitable conversion.
Assuming the following:
All files are ASCII files.
The number you are looking to extract is directly following #####.
The number you are looking for is a natural number.
##### followed by a number only occurs once per file.
You can use this code snippet inside a for loop to extract each number:
regx='#####(\d+)';
str=fileread(fileName);
num=str2double(regexp(str,regx,'tokens','once'));
Example of for loop
This code will iterate through ALL files in yourpath/folder and save the numbers into num.
regx='#####(\d+)'; % Create regex
folderDir='yourpath/folder';
files=cellstr(ls(folderDir)); % Find all files in folderDir
files=files(3:end); % remove . and ..
num=zeros(1,length(files)); % Pre allocate
for i=1:length(files) % Iterate through files
str=fileread(fullfile(folderDir,files{i})); % Extract str from file
num(i)=str2double(regexp(str,regx,'tokens','once')); % extract number using regex
end
If you want to extract more ''advanced'' numbers e.g. Integers or Real numbers, or handle several occurrences of #####NUMBER in a file you will need to update your question with a better representation of your text files.
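For comparison, the same idea can be sketched outside MATLAB. The following Python version is only an illustration under the assumptions listed above (one natural number directly after "#####", at most once per file); the names extract_number and extract_from_folder are hypothetical helpers, not part of any answer here.

```python
import re
from pathlib import Path

# Same pattern as the MATLAB snippets: capture the digits right after "#####".
PATTERN = re.compile(r"#####(\d+)")


def extract_number(text):
    """Return the first number following '#####' as an int, or None if absent."""
    match = PATTERN.search(text)
    return int(match.group(1)) if match else None


def extract_from_folder(folder):
    """Map each regular file name in `folder` to its extracted number."""
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():  # skips subdirectories, so no need to drop "." and ".."
            results[path.name] = extract_number(path.read_text(errors="replace"))
    return results


# A few lines from the sample file in the question, including the marker line.
sample = "falskdfjqwr1351\nasdfa#####1217653asl123654fjaksj\nasko3\n"
print(extract_number(sample))  # 1217653
```

Returning None for files without the marker keeps a missing match distinguishable from a real number, which the preallocated zeros array in the MATLAB loop does not do.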

Comment out some part of a line in matlab function

As the title suggests, I want to comment out part of a line in MATLAB.
Specifically, I want to comment out a part in the middle of a line, not everything up to the end of the line.
The reason is that I have to try two different versions of a line and I don't want to duplicate the line. I know it is easy to comment/uncomment if I duplicate the line, but I want it this way.
Within one line is not possible (afaik), but you can split up your term into multiple lines:
x=1+2+3 ... optional comments for each line
... * factorA ... can be inserted here
* factorB ...
+4;
Here * factorA is commented out and * factorB is used, resulting in the term x=1+2+3*factorB+4.
The documentation contains a similar example, commenting out one part of an array:
header = ['Last Name, ', ...
'First Name, ', ...
... 'Middle Initial, ', ...
'Title']
Nope, this is not possible. From help '%':
% Percent. The percent symbol is used to begin comments.
Logically, it serves as an end-of-line character. Any
following text on the line is ignored or printed by the
HELP system.
So just copy-paste the line, or write a tiny function so that it's easier to switch between versions.

Using a .fasta file to compute relative content of sequences

So me being the 'noob' that I am, being introduced to programming via Perl just recently, I'm still getting used to all of this. I have a .fasta file which I have to use, although I'm unsure if I'm able to open it, or if I have to work with it 'blindly', so to speak.
Anyway, the file that I have contains DNA sequences for three genes, written in this .fasta format.
Apparently it's something like this:
>label
sequence
>label
sequence
>label
sequence
My goal is to write a script to open and read the file, which I have gotten the hang of now, but I have to read each sequence, compute relative amounts of 'G' and 'C' within each sequence, and then I'm to write it to a TAB-delimited file the names of the genes, and their respective 'G' and 'C' content.
Would anyone be able to provide some guidance? I'm unsure what a TAB-delimited file is, and I'm still trying to figure out how to open a .fasta file to actually see the content. So far I've worked with .txt files which I can easily open, but not .fasta.
I apologise for sounding completely bewildered. I'd appreciate your patience. I'm not like you pros out there!!
I get that it's confusing, but you really should try to limit your question to one concrete problem, see https://stackoverflow.com/faq#questions
I have no idea what a ".fasta" file or 'G' and 'C' is.. but it probably doesn't matter.
Generally:
Open input file
Read and parse data. If it's in some strange format that you can't parse, go hunting on http://metacpan.org for a module to read it. If you're lucky someone has already done the hard part for you.
Compute whatever you're trying to compute
Print to screen (standard out) or another file.
A "TAB-delimited" file is a file with columns (think Excel) where each column is separated by the tab ("\t") character, as a quick Google or Stack Overflow search would tell you.
Here is an approach using the awk utility, which can be used from the command line. Save the following program to a file and run it with awk -f <path> <sequence file>
#NR>1 means only look at lines after line 1, because you said the sequence starts on line 2
NR>1{
#this for-loop goes through all bases in the line and then performs operations below:
for (i=1;i<=length;i++)
#for each position encountered, the variable "total" is increased by 1 for total bases
total++
}
#NR>1 is needed here too, otherwise C and G characters in the label line are counted as bases
NR>1{
for (i=1;i<=length;i++)
#if the "substring" i.e. position in a line == c or g upper or lower (some bases are
#lowercase in some fasta files), it will carry out the following instructions:
if(substr($0,i,1)=="c" || substr($0,i,1)=="C")
#this increments the c count by one for every c or C encountered, the next if statement does
#the same thing for g and G:
c++; else
if(substr($0,i,1)=="g" || substr($0,i,1)=="G")
g++
}
END{
#this "END-block" prints the gene name and C, G content in percentage, separated by tabs
print "Gene name\tG content:\t"(100*g/total)"%\tC content:\t"(100*c/total)"%"
}
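The awk program above treats the whole file as a single sequence. Since the question mentions three genes and a TAB-delimited result, here is a minimal Python sketch that handles one record per ">label" line. It assumes the plain format shown in the question; read_fasta, base_fraction and gc_table are hypothetical helper names, and the fractions are relative to the sequence length.

```python
def read_fasta(lines):
    """Yield (label, sequence) pairs; a sequence may span several lines."""
    label, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if label is not None:
                yield label, "".join(chunks)
            label, chunks = line[1:], []
        elif label is not None:
            chunks.append(line)
    if label is not None:
        yield label, "".join(chunks)


def base_fraction(sequence, base):
    """Fraction of `sequence` made up of `base`, ignoring case."""
    if not sequence:
        return 0.0
    return sequence.upper().count(base) / len(sequence)


def gc_table(lines):
    """Build tab-delimited rows: gene name, G fraction, C fraction."""
    rows = ["name\tG\tC"]
    for label, seq in read_fasta(lines):
        g = base_fraction(seq, "G")
        c = base_fraction(seq, "C")
        rows.append(f"{label}\t{g:.3f}\t{c:.3f}")
    return rows


# Made-up example records; a real file would be passed as an open file handle.
example = [">geneA", "GGCC", "AATT", ">geneB", "acgt"]
print("\n".join(gc_table(example)))
```

Because read_fasta only needs an iterable of lines, an open file object works in place of the example list, and the rows can be written to a .tsv file with one write per row.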

output format of cvs diff

I modified line 494 of a certain file and used cvs diff -u4 to see what I had changed. cvs outputs something like:
@@ -490,9 +490,9 @@
if (!(hPtr->hStatus & (HOST_STAT_UNAVAIL | HOST_STAT_UNLICENSED |
HOST_STAT_UNREACH))){
printf(" %s:\n",
_i18n_msg_get(ls_catd,NL_SETN,1612, "CURRENT LOAD USED FOR SCHEDULING")); /* catgets 1612 */
- prtLoad(hPtr, lsInfo);
+ prtLoad(hPtr, lsInfo,bhostParams);
if (lsbSharedResConfigured_) {
/* there are share resources */
retVal = makeShareFields(hPtr->host, lsInfo, &nameTable,
I don't understand what the first line, "@@ -490,9 +490,9 @@", means. I did modify line 494, so why does CVS write 490 instead? Could anyone tell me what "@@ -490,9 +490,9 @@" means?
The "u" gives you a unified diff and the "4" gives you 4 lines of context on either side. From the WP entry I just linked:
The format of the range information line is as follows:
@@ -l,s +l,s @@
The hunk range information contains two hunk ranges. The range for the
hunk of the original file is preceded by a minus symbol, and the range
for the new file is preceded by a plus symbol. Each hunk range is of
the format l,s where l is the starting line number and s is the number
of lines the change hunk applies to for each respective file.
So basically the number isn't the line that was changed. It's the start of the range being displayed in that hunk. Using your example, the hunk starts at line 490 and 9 lines were in the range. The reason the range covers 9 lines is because of the one line you changed and the four lines of context on either side.
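The arithmetic can be checked with a short sketch. This Python snippet is only an illustration (parse_hunk_header is a hypothetical helper name); it parses the range line and shows that a hunk starting at 490 with 9 lines covers lines 490 to 498, with the changed line 4 context lines in.

```python
import re

# "@@ -l,s +l,s @@": start line and length for the old and the new file.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Return ((old_start, old_len), (new_start, new_len)) for a hunk header."""
    m = HUNK_RE.match(line)
    if m is None:
        raise ValueError(f"not a hunk header: {line!r}")
    old_start, old_len, new_start, new_len = m.groups()
    # When the length is omitted, the hunk covers exactly one line.
    return ((int(old_start), int(old_len or 1)),
            (int(new_start), int(new_len or 1)))


(old_start, old_len), (new_start, new_len) = parse_hunk_header("@@ -490,9 +490,9 @@")
context = 4  # from "cvs diff -u4"
print(old_start, old_start + old_len - 1)  # 490 498
print(old_start + context)                 # 494
```

The last line only equals the changed line number when the hunk has the full 4 lines of leading context, which is the case here; a change near the top of a file would have fewer.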
Note that your example seems to have some newlines stripped. I would recommend you fix it so it is clear for other people.

perlre length limit

From man perlre:
The "*" quantifier is equivalent to "{0,}", the "+" quantifier to "{1,}", and the "?" quantifier to "{0,1}". n and m are limited to integral values less than a preset limit defined when perl is built. This is usually 32766 on the most common platforms. The actual limit can be seen in the error message generated by code such as this:
$_ **= $_ , / {$_} / for 2 .. 42;
Ay that's ugly - Isn't there some constant I can get instead?
Edit: As daxim pointed out (and perlretut hints towards) it might be that 32767 is a magical hardcoded number. A little searching in the Perl code goes a long way, but I'm not sure how to get to the next step and actually find out where the default reg_infty or REG_INFTY is actually set:
~/dev/perl-5.12.2
$ grep -ri 'reg_infty.*=' *
regexec.c: if (max != REG_INFTY && ST.count == max)
t/re/pat.t: $::reg_infty = $Config {reg_infty} // 32767;
t/re/pat.t: $::reg_infty_m = $::reg_infty - 1;
t/re/pat.t: $::reg_infty_p = $::reg_infty + 1;
t/re/pat.t: $::reg_infty_m = $::reg_infty_m; # Surpress warning.
Edit 2: DVK is of course right: It's defined at compile time, and can probably be overridden only with REG_INFTY.
Summary: there are 3 ways I can think of to find the limit: empirical, "matching Perl tests" and "theoretical".
Empirical:
eval {$_ **= $_ , / {$_} / for 2 .. 129};
# To be truly portable, the above should ideally loop forever till $@ is true.
$@ =~ /bigger than (-?\d+) /;
print "LIMIT: $1\n";
This seems obvious enough that it doesn't require explanation.
Matches Perl tests:
Perl has a series of tests for regex, some of which (in pat.t) deal with testing this max value. So, you can approximate that the max value computed in those tests is "good enough" and follow the test's logic:
use Config;
$reg_infty = $Config {reg_infty} // 2 ** 15 - 1; # 32767
print "Test-based reg_infinity limit: $reg_infty\n";
The details below explain where in the tests this comes from.
Theoretical: This is attempting to replicate the EXACT logic used by C code to generate this value.
This is harder than it sounds, because it is affected by two things: the Perl build configuration and a series of C #define statements with branching logic. I was able to delve fairly deeply into that logic, but stalled on two problems. First, the #ifdefs reference a number of tokens that are NOT defined anywhere in the Perl code that I can find, and I don't know how to find out from within Perl what the values of those defines were. Second, I don't know how to obtain the ultimate default value (assuming I'm right and those #ifdefs always end up with the default) of #define PERL_USHORT_MAX ((unsigned short)~(unsigned)0). (The actual limit is obtained by removing one bit from that all-ones number; details below.)
I'm also not sure how to access the number of bytes in a short from Perl for whichever implementation was used to build the perl executable.
So, even if the answer to both those questions can be found (which I'm not sure of), the resulting logic would most certainly be "uglier" and more complex than the straightforward "empirical eval-based" one I offered as the first option.
Below I will provide the details of where the various bits and pieces of logic related to this limit live in the Perl code, as well as my attempts to arrive at a "theoretically correct" solution matching the C logic.
OK, here is the investigation part of the way; you can complete it yourself, as I have to run, or I will complete it later:
From regcomp.c: vFAIL2("Quantifier in {,} bigger than %d", REG_INFTY - 1);
So, the limit is obviously taken from REG_INFTY define. Which is declared in:
regcomp.h:
/* XXX fix this description.
Impose a limit of REG_INFTY on various pattern matching operations
to limit stack growth and to avoid "infinite" recursions.
*/
/* The default size for REG_INFTY is I16_MAX, which is the same as
SHORT_MAX (see perl.h). Unfortunately I16 isn't necessarily 16 bits
(see handy.h). On the Cray C90, sizeof(short)==4 and hence I16_MAX is
((1<<31)-1), while on the Cray T90, sizeof(short)==8 and I16_MAX is
((1<<63)-1). To limit stack growth to reasonable sizes, supply a
smaller default.
--Andy Dougherty 11 June 1998
*/
#if SHORTSIZE > 2
# ifndef REG_INFTY
# define REG_INFTY ((1<<15)-1)
# endif
#endif
#ifndef REG_INFTY
# define REG_INFTY I16_MAX
#endif
Please note that SHORTSIZE is overridable via Config - I will leave details of that out but the logic will need to include $Config{shortsize} :)
From handy.h (this doesn't seem to be part of Perl source at first glance so it looks like an iffy step):
#if defined(UINT8_MAX) && defined(INT16_MAX) && defined(INT32_MAX)
#define I16_MAX INT16_MAX
#else
#define I16_MAX PERL_SHORT_MAX
#endif
I could not find ANY place which defined INT16_MAX at all :(
Someone help please!!!
PERL_SHORT_MAX is defined in perl.h:
#ifdef SHORT_MAX
# define PERL_SHORT_MAX ((short)SHORT_MAX)
#else
# ifdef MAXSHORT /* Often used in <values.h> */
# define PERL_SHORT_MAX ((short)MAXSHORT)
# else
# ifdef SHRT_MAX
# define PERL_SHORT_MAX ((short)SHRT_MAX)
# else
# define PERL_SHORT_MAX ((short) (PERL_USHORT_MAX >> 1))
# endif
# endif
#endif
I wasn't able to find any place which defines SHORT_MAX, MAXSHORT or SHRT_MAX so far. So for now I assume the default of ((short) (PERL_USHORT_MAX >> 1)) is what gets used :)
PERL_USHORT_MAX is defined very similarly in perl.h, and again I couldn't find a trace of definition of USHORT_MAX/MAXUSHORT/USHRT_MAX.
This seems to imply that it is set by default to: #define PERL_USHORT_MAX ((unsigned short)~(unsigned)0). How to extract that value from the Perl side, I have no clue. It is basically the number you get by bitwise negating 0 and truncating the result to an unsigned short, so if unsigned short is 16 bits, then PERL_USHORT_MAX will be 16 ones, and PERL_SHORT_MAX will be 15 ones, i.e. 2^15-1, i.e. 32767.
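The bit arithmetic is easy to verify. This small Python sketch mimics the two C expressions for a given width in bits; ushort_max and short_max are hypothetical helper names, and the 16-bit width is an assumption about the platform, not something read from a perl build.

```python
def ushort_max(bits):
    """All-ones value of an unsigned integer `bits` wide,
    mimicking (unsigned short)~(unsigned)0."""
    return ~0 & ((1 << bits) - 1)


def short_max(bits):
    """Drop one bit, mirroring PERL_USHORT_MAX >> 1."""
    return ushort_max(bits) >> 1


print(ushort_max(16))     # 65535
print(short_max(16))      # 32767
print(short_max(16) - 1)  # 32766
```

The last value matches the "Quantifier in {,} bigger than %d" message, which reports REG_INFTY - 1, and the 32766 quoted from perlre.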
Also, from t/re/pat.t (regex tests): $::reg_infty = $Config {reg_infty} // 32767; (to illustrate where the non-default compiled in value is stored).
So, to get your constant, you do:
use Config;
use feature 'say';
my $shortsize = $Config{shortsize} // 2;
# regcomp.h falls back to ((1<<15)-1) when SHORTSIZE > 2
my $c_reg_infty = (defined $Config{reg_infty}) ? $Config{reg_infty}
                : ($shortsize > 2)             ? 2**15-1
                :                                get_PERL_SHORT_MAX();
# get_PERL_SHORT_MAX() depends on the logic for PERL_SHORT_MAX in perl.h,
# which I'm not sure how to extract into Perl with any precision
# due to a bunch of never-seen "#define"s and the unknown size of "short".
# As an approximation only, this returns 2**7-1 if shortsize==1
# and 2**15-1 otherwise.
sub get_PERL_SHORT_MAX { return $shortsize == 1 ? 2**7-1 : 2**15-1 }
say "REAL reg_infinity based on C headers: $c_reg_infty";