threads error in running stanford LLDA tmt scala models - scala

when I run stanford LLDA tmt scala models, I encounter a few problems. One of them is a thread error when I try to do inference with LLDA tmt model. The code I am running is exactly the one provided by Shreyas Karnik in the sourceforge link,
#Skarab: Here are the links to the code which I used to bit.ly/ocK2T9 (learning) and bit.ly/qIWb6C (inference) please let me know if you still encounter any errors.
The error message is,
Command started: Fri Jun 21 21:34:48 CDT 2013
java -Dscalanlp.distributed.hub=socket://crick7.mayo.edu:41080/hub -Dscalanlp.distributed.id=/tmt/0 -Xmx100000m edu.stanford.nlp.tmt.TMTMain "/data4/bsi/nlp/s110067.sharp/bioask/tmtModels/example-7-llda-infer.scala"
Loading model ...
TSVFile("test.csv") ~> IDColumn(1) ~> Column(2) ~> TokenizeWith(SimpleEnglishTokenizer.V1() ~> CaseFolder() ~> WordsAndNumbersOnlyFilter() ~> MinimumLengthFilter(3))
Generating output ...
[Concurrent] 128 permits
[Concurrent] 128 permits
Exception in thread "Thread-3" java.lang.IndexOutOfBoundsException: 1
at scala.collection.LinearSeqOptimized$class.apply(LinearSeqOptimized.scala:51)
at scala.collection.immutable.List.apply(List.scala:45)
at scalanlp.stage.Column.map(ColumnSelectors.scala:51)
at scalanlp.stage.Column.map(ColumnSelectors.scala:46)
at scalanlp.stage.generic.Mapper$$anonfun$apply$1$$anonfun$apply$2.apply(Mapper.scala:36)
at scalanlp.stage.Item.map(Item.scala:32)
at scalanlp.stage.generic.Mapper$$anonfun$apply$1.apply(Mapper.scala:36)
at scalanlp.stage.generic.Mapper$$anonfun$apply$1.apply(Mapper.scala:36)
at scala.collection.Iterator$$anon$19.next(Iterator.scala:335)
at scala.collection.Iterator$$anon$19.next(Iterator.scala:335)
at edu.stanford.nlp.tmt.data.concurrent.Concurrent$$anonfun$map$2.apply(Concurrent.scala:96)
at edu.stanford.nlp.tmt.data.concurrent.Concurrent$$anonfun$map$2.apply(Concurrent.scala:88)
at edu.stanford.nlp.tmt.data.concurrent.Concurrent$$anon$4.run(Concurrent.scala:45)
Would you give help? highly appreciate it!

This is a known issue of tmt-0.4.0 for many users. tmt-0.2.1 works just fine.
But the support for tmt-0.4.0 has been discontinued and for tmt-0.2.1 even before that.
You can check out http://mallet.cs.umass.edu/, which also has easy to start GUIs available such as http://www.themacroscope.org/?page_id=391.

Check any .csv files for DOS/Unix incompatibility.
My experience may be relevant:
I was also getting the error java.lang.IndexOutOfBoundsException, when running tmt-0.4.0 from a command line on Windows. My workflow had a python program splitting my data set into training and test .csv files. But they were written in DOS mode. See for example,
http://www.cs.toronto.edu/~krueger/csc209h/tut/line-endings.html
A confirmation of this is that when opening them in Excel, there was an extra line, and when opening them in Emacs, there's a telltale ^M.
I ran dos2unix on the .csv files, and then the TMT Scala programs worked.

Related

Relative Path (pathlib) name working on MAC OS but on Windows gives me a error

Currently I am working a project that has have been using the pathlib library so I can work on my Windows desktop when I need too and on my MacBook Pro. Essentially be able to work between both operating systems. I have not have any issues at all until right now. Here is the set up:
I have a pipeline set up to automatically save a .joblib and a whole lot of .png files that will go to a directory called
output_dir = Path('../Trained_Models/Differential_gene_analysis/A Kidney Cancer Transcriptome Molecular Signature Identifies Tumors with Tumor Thrombus/Models train on TCGA data and test on Rodriguez data/Oct-XX-20XX')
For example, if I want to save a .joblib file under the name RandomForest_TumorThrombus_104.joblib,I would use the command
joblib.dump(model ,output_dir / 'RandomForest_TumorThrombus_104.joblib')
On my MacBook Pro, I have no issues when this is ran, but on Windows it gives me the following error
FileNotFoundError: [Errno 2] No such file or directory: '..\\Trained_Models\\Differential_gene_analysis\\A Kidney Cancer Transcriptome Molecular Signature Identifies Tumors with Tumor Thrombus\\Models train on TCGA data and test on Rodriguez data\\Oct-17-2022\\RandomForest_TumorThrombus_104.joblib'
I have tried to use the .resolve() method to get the absolute path but still gives me the same error. I have tried to experiment to try to see what is goin on such as using os.path.exists(). When using the os.path.exists() method I get True for the follwoing command:
os.path.exists(output_dir)
So it does indeed recognize that the directory exists. The next thing I tried was to rename the file to something like dddddd.joblib and that worked. But I find that only a few names for the file would allow me to save the files. During debug I found that the most recent Traceback occurs here:
with open(filename, 'wb') as f:```
I was wondering if anyone here had any idea what was going on here and how I can fix this issue? Please and Thank you.
The solution was to enable long paths on Windows.

How to get Resharpers InspectCode to recognize Plugins?

I am trying to run ReSharpers command line tool InspectCode.exe. It's running fine doing it's job with the expected output.
However after my earlier attempt to get plugins to work, this time with the new version it is supposed to be supported. There is a switch in the command line interface that allows to specify the extension you want to use.
/extensions (/x) – allows using ReSharper extensions that affect code analysis. To use an extension, specify its ID, which you can find by opening the extension package page in the ReSharper Gallery, and then the Package Statistics page. Multiple values are separated with the semicolon.
But I cannot get it to work properly. I cannot even provoke any reaction to the /x switch at all. No matter how or what I pass, I get no feedback from the executable and the output is identical. I don't even get an error message when passing obvious garbage.
I tried the following commandlines for the exact same result:
inspectcode.exe /o="rcli.xml" /swea /x="ReSharper.StyleCop" "my.sln"
inspectcode.exe /o="rcli.xml" /swea /x=ReSharper.StyleCop "my.sln"
inspectcode.exe /o="rcli.xml" /swea "my.sln"
inspectcode.exe /o="rcli.xml" /swea /x=ABCDEFG "my.sln"
Result
JetBrains Inspect Code 9.1.1
Running in 64-bit mode, .NET runtime 4.0.30319.18444 under Microsoft Windows NT
6.1.7601 Service Pack 1
Enabled solution-wide analysis according to Inspect Code command line Setting.
Analyzing files
[files]
Inspection report was written to rcli.xml
What am I doing wrong? How to get extensions to work?
I already tried the R# forums, but it took them more then 24h to approve my post and so far I'm not sure someone else even read it.
Unfortunately, the support for extensions was dropped in 9.0 due to the refactorings in the "ReSharper platform". I hope that JetBrains will bring it back soon.
See RSRP-436208.
This is a late answer that might help future readers (like myself). Currently inspectcode.exe will automatically look for and use any NuGet packages that are in the same folder as the executable (source).
Example for CleanCode extension:
if you have a R# instance on some machine and install the extension, it will be placed in C:\Users\{user}\AppData\Local\JetBrains\plugins\MO.CleanCode.5.6.15
copy MO.CleanCode.5.6.15.nupkg and paste it next to inspectcode.exe
when running inspectcode with verbosity = VERBOSE, the extension should appear in the Zones list:
$cmd = "..\JetBrains.ReSharper.CommandLineTools.2019.3.4\inspectcode.exe"
$outputFile = "..\Output\$($outputName).xml"
& $cmd -o="$outputFile" $sln --verbosity=VERBOSE
Zones: (52pcs)[CodeInspectionPageImplZone, DaemonEngineZone,
DaemonZone, IAmd64CpuArchitectureHostZone, IAspMvcZone,
IBatchToolEnvironmentZone, IClrImplementationHost Zone,
IClrPsiLanguageZone, ICodeEditingOptionsPageImplZone,
IConsoleEnvironmentZone, ICppProductZone, ICpuArchitectureHostZone,
IDocumentModelZone, IEnvironmentZone, IHostSolutionZone,
IInspectCodeConsoleEnvironmentZone, IInspectCodeEnvironmentZone,
IInspectCodeZone, ILanguageAspZone, ILanguageBuildScriptsZone,
ILanguageCppZone, I LanguageCSharpZone, ILanguageCssZone,
ILanguageHtmlZone, ILanguageIlZone, ILanguageJavaScriptZone,
ILanguageMsBuildZone, ILanguageNAntZone, ILanguageProtobufZone, ILa
nguageRazorZone, ILanguageRegExpZone, ILanguageResxZone,
ILanguageVBZone, ILanguageXamlZone, INetFrameworkHostZone, INuGetZone,
IOperatingSystemHostZone, IProjectMode lZone,
IPsiAssemblyFileLoaderImplZone, IPsiLanguageZone,
IPublicVisibilityZone, IRdFrameworkZone, IRiderModelZone,
ISinceClr2HostZone, ISinceClr4HostZone, ITextContro lsZone,
IToolsOptionsPageImplZone, IWebPsiLanguageZone, IWindowsNtHostZone,
PsiFeaturesImplZone, ReplaceableByIntelliJPlatformZone, SweaZone]
Packages: (23pcs)[JetBrains.ExternalAnnotations,
JetBrains.Platform.Core.Ide, JetBrains.Platform.Core.IisExpress,
JetBrains.Platform.Core.MsBuild, JetBrains.Platform. Core.Shell,
JetBrains.Platform.Core.Text, JetBrains.Platform.Interop.CommandLine,
JetBrains.Platform.Interop.dotMemoryUnit.Framework,
JetBrains.Platform.Interop.dotMe moryUnit.Interop.Console,
JetBrains.Platform.Interop.dotMemoryUnit.Interop.Ide,
JetBrains.Platform.RdProtocol, JetBrains.Psi.Features.Core,
JetBrains.Psi.Features.Cpp .Src.Core, JetBrains.Psi.Features.src,
JetBrains.Psi.Features.Tasks, JetBrains.Psi.Features.UnitTesting,
JetBrains.Psi.Features.Web.Core, JetBrains.ReSharperAutomatio
nTools.src.CleanupCode,
JetBrains.ReSharperAutomationTools.src.CommandLineCore,
JetBrains.ReSharperAutomationTools.src.CommandLineProducts,
JetBrains.ReSharperAutomat ionTools.src.DuplicatesFinder,
JetBrains.ReSharperAutomationTools.src.InspectCode, MO.CleanCode]

Why can't I debug my puppet code?

I'm learning Puppet and the biggest frustration I have with the entire paradigm is the try/run/fix development process I'm using to build functional Puppet code. My background is in Java and I'm naturally use to debugging my code to find errors instead of just running the program to see where it bombs making development much faster but I can't seem to find a way to do this using Puppet and Eclipse. I know writing a debugger for Puppet would require some creativity given its nature but I think this is something the community could really benefit from.
I've written debuggers and know the Eclipse SDK but unfortunately it does not map cleanly to the Puppet architecture which is a bit awkward in the sense its runtime stack and execution flow does not happen in natural order as well as the fact the runtime requires a target machine to apply changes on.
I'm curious if the community has done any development work on trying to create some kind of debugger where code can be stepped. To write this I think it would make sense to extend Eclipse with a new Puppet debug configuration type where you specify a target sandbox host to test your code as well as a puppet project in your workspace you want to debug (leveraging existing Gepetto tooling). Then when you start a new Puppet debugging session Eclipse could connect to the remote host, execute puppet apply with some additional debug arguments and somehow provide feedback from the runtime about what line of code is currently being executed.
This still might be awkward but would allow puppet developers to quickly see things like oh duh.. I can't create this directory because the parent path does not exist, wait... why is this if statement not going here like I planned, oh I see here that Puppet is not very clear on single or double quotes or now I see why this fails because this class was not executed first etc. etc.
Instead all we get is a big ugly output on the agent console that yes can give us insight on errors but does not cleanly map exceptions to our code that in my view shows an underlying pain and weakness of Puppet, can you at least give me a stack trace and line number so I know where to look? Nope sorry.
Don't get me wrong, I love how Puppet can make me look very productive throughout the work week when all I'm doing is running Puppet apply on new machines which my manager has not yet figured out but I think for Puppet to really be useful this lack of debugging support is something that needs to be addressed.
Does anyone else feel this pain? - Duncan Krebs
It would be impossible to "step through" puppet code, unless you want to debug against the ruby codebase itself. It's not just that the order of "execution" is unclear, its that the manifest themselves are never executed at a single time. They are actually evaluated in multiple phases throughout execution.
There are ways to simplify finding problems though. The biggest one is writing unit tests using rspec-puppet. It lets you essentially test the compilation phase of puppet, helping you catch errors like circular dependencies, incorrect conditional logic, etc.
There is a new tool called the puppet-debugger which allows you to set breakpoints in your puppet code in order to step through it. So this is no longer "impossible" as it has been available for about 8 months.
You will first need to install the puppet-debugger gem
https://github.com/nwops/puppet-debugger
Then install the debug module, include it in your fixtures or just ensure it is in your module path.
https://forge.puppet.com/nwops/debug
Then just set a breakpoint in your code using debug::break() function.
Ruby Version: 2.0.0
Puppet Version: 4.9.4
Puppet Debugger Version: 0.6.0
Created by: NWOps
Type "exit", "functions", "vars", "krt", "whereami", "facts", "resources", "classes",
"play", "classification", "types", "datatypes", "reset", or "help" for more information.
>> $vars = ['one', 'two', 'three']
=> [
[0] "one",
[1] "two",
[2] "three"
]
>> $vars.each | String $var | {
debug::break()
notify{$var:}
}
From file: puppet_debugger_input20170417-97123-qjwbaj.pp
1: $vars.each | String $var | {
=> 2: debug::break()
3: notify{$var:}
4: }
1:>> $var
=> "one"
2:>> exit
From file: puppet_debugger_input20170417-97123-qjwbaj.pp
1: $vars.each | String $var | {
=> 2: debug::break()
3: notify{$var:}
4: }
1:>> $var
=> "two"
2:>> exit
From file: puppet_debugger_input20170417-97123-qjwbaj.pp
1: $vars.each | String $var | {
=> 2: debug::break()
3: notify{$var:}
4: }
1:>> $var
=> "three"

Pyinstaller --onefile warning pyconfig.h when importing scipy or scipy.signal

This is very simple to recreate.
If my script foo.py is:
import scipy
Then run:
python pyinstaller.py --onefile foo.py
When I launch foo.exe I get:
WARNING: file already exists but should not: C:\Users\username\AppData\Local\Temp\_MEI86402\Include\pyconfig.h
I've tested a few versions but the latest I've confirmed is 2.1dev-e958e02 running on Win7, Python 2.7.5 (32 bit), Scipy version 0.12.0
I've submitted a ticket with the Pyinstaller folks but haven't heard anything yet. Any clues how to debug this further?
You can hack the spec file to remove the second instance by adding these lines after a=Analysis:
for d in a.datas:
if 'pyconfig' in d[0]:
a.datas.remove(d)
break
The answer by wtobia# worked for me. See https://github.com/pyinstaller/pyinstaller/issues/783
Go to C:\Python27\Lib\site-packages\PyInstaller\build.py
Find the def append(self, tpl): function.
Change if tpl[2] == "BINARY": to if tpl[2] in ["BINARY", "DATA"]:
Expanding upon Ilya's solution, I think this is a little bit more robust solution to modifying the spec file (again place after the a=Analysis... statement).
a.datas = list({tuple(map(str.upper, t)) for t in a.datas})
I only tested this on a small test program (one with a single import and print statement), but it seems to work. a.datas is a list of tuples of strings which contain the pyconfig.h paths. I convert them all to lowercase and then dedup. I actually found that converting all of them all to lowercase was sufficient to get it to work, which suggests to me that pyinstaller does case-sensitive deduping when it should be case-insensitive on Windows. However, I did the deduping myself for good measure.
I realized that the problem is that Windows is case-insensitive and these 2 statements are source directories are "duplicates:
include\pyconfig.h
Include\pyconfig.h
My solution is to manually tweak the .spec file with after the a=Analysis() call:
import platform
if platform.system().find("Windows")>= 0:
a.datas = [i for i in a.datas if i[0].find('Include') < 0]
This worked in my 2 tests.
A more flexible solution would be to check ALL items for case-insensitive collisions.
I ran the archive_viewer.py utility (from PyInstaller) on one of my own --onefile executables that has the same error and found that pyconfig.h is included twice:
(31374007, 6521, 21529, 1, 'x', 'include\\pyconfig.h'),
(31380528, 6521, 21529, 1, 'x', 'Include\\pyconfig.h'),
(31387049, 984, 2102, 1, 'x', 'pytz\\zoneinfo\\CET'),
Sadly though, I don't know how to fix it.
PyInstaller Manual link:
http://www.pyinstaller.org/export/d3398dd79b68901ae1edd761f3fe0f4ff19cfb1a/project/doc/Manual.html#archiveviewer

Algebra filter error in moodle

I installed moodle 1.9.12 and now I want to use Algebra notation in content. I enable "TeX Notation" and "Algebra Notation" in administrator panel and also install mimetext and dvips and Imagemagic on the server. fortunately Tex Notation works fine but I got the following error for Algebra:
sh: /var/www/html/moodle/filter/tex/mimetex.linux: not found
The shell command
"/var/www/html/moodle/filter/tex/mimetex.linux" -e "/var/www/moodledata/filter/algebra/de06d6c44d98ba4e42dffca988bf530b.gif" -- '\Large \frac{\sin\left(z\right)}{x^{2}+y^{2}}'
returned status = 127
File size of mimetex executable /var/www/html/moodle/filter/tex/mimetex.linux is 830675
The file permissions are: 100775
The md5 checksum of the file is 56bcc40de905ce92ebd7b083c76e019e
Image not found!
Note: /var/www/html/moodle/filter/tex/mimetex.linux exists on the server and is executable!!!
What is the problem?? Any idea?????
From what you have described, calling the general tex filter debug page works and does not show up the same error.
/filter/tex/texdebug.php works, but /filter/algebra/algebradebug.php does not.
If this is the case, perhaps you could check for an open_basedir, or safe_mode_exec_dir being set to include the current working directory, or otherwise restricting the execution of /var/www/html/moodle/filter/tex/mimetex.linux, while the current working directory is /var/www/html/moodle/filter/algebra.
You could look at this by visiting /admin/phpinfo.php at your site, and look carefully at the effective values of open_basedir, safe_mode and safe_mode_exec_dir.
You could also check the apache error log or add the following lines to the top of the algebra debug php file, and you might see some extra error messages:
$CFG->debug = 6143 ;
$CFG->debugdisplay= 1 ;
Hope that helps