Trying to open a Python file using PowerShell but it brings up a list 'index out of range' error... but the items are not out of range? - powershell

PS C:\OIDv4_ToolKit> python convert_annotations.py
Currently in subdirectory: train
Converting annotations for class: Vehicle registration plate
0%| | 0/400 [00:00<?, ?it/s]0317.44 497.91974400000004 413.44 526.08
0%| | 0/400 [00:00<?, ?it/s]
Traceback (most recent call last):
File "C:\OIDv4_ToolKit\convert_annotations.py", line 66, in <module>
coords = np.asarray([float(labels[1]), float(labels[2]), float(labels[3]), float(labels[4])])
IndexError: list index out of range
Python file: this is the line the error refers to as line 66 (line 7 here):
with open(filename) as f:
    for line in f:
        for class_type in classes:
            line = line.replace(class_type, str(classes.get(class_type)))
        print(line)
        labels = line.split()
        coords = np.asarray([float(labels[1]), float(labels[2]), float(labels[3]), float(labels[4])])
        coords = convert(filename_str, coords)

This doesn't look like a PowerShell issue; the Python interpreter appears to be running correctly. I suggest adding the python tag to your question to get the right people involved.
Having located the source, it seems as if some of the text files in the following directory aren't in the format expected by convert_annotations.py:
C:\OIDv4_ToolKit\OID\Dataset\train\Vehicle registration plate\Label\
You can verify this with:
print("labels length =", len(labels))
after the line.split() call. If you get a length of 1, it is likely that the items on some line aren't separated by whitespace but by something else, for example commas. You can also inspect the files manually to determine the format. To find them, you can use:
print(os.path.join(os.getcwd(), filename))
inside the for loop, which is on Line 54 in the source I linked above. Note also that the string split() method supports a custom separator as the first argument, should the files be in a different format.
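The length check can also be folded into the parsing step so that a malformed line produces a readable message instead of a bare IndexError. This is a minimal sketch, not code from convert_annotations.py: the helper name parse_label_line and its sep parameter are hypothetical, and it assumes each line holds a class id followed by four numeric coordinates.

```python
def parse_label_line(line, sep=None):
    """Split one annotation line into (class_id, [four floats]).

    sep=None splits on any whitespace; pass sep="," for comma-separated files.
    """
    labels = line.split(sep)
    if len(labels) < 5:
        # Report the offending line instead of failing later on labels[4].
        raise ValueError(f"expected at least 5 fields, got {len(labels)}: {line!r}")
    return labels[0], [float(value) for value in labels[1:5]]
```

A line such as the one printed in the traceback above, where the class id runs into the first coordinate, has only four fields and would raise the ValueError with the line contents attached.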

This issue occurs when the class name is missing from classes.txt.
The class name in classes.txt should be the same as the name of the downloaded class.

Related

Safely pass input line by line (from generator) to subprocess' stdin on Python

I want to manage a subprocess with the subprocess module, and I need to pipe a (really) large number of lines to the child stdin. I'm creating the input with a generator, and passing it on to the subprocess like this:
def my_gen (end): # simplified example
    for i in range(0, end):
        yield f"line {i}"
with subprocess.Popen(["command", "-o", "option_value"], # simplified example
                      stdin = subprocess.PIPE, stdout = sys.stdout, stderr = sys.stderr) as process:
    for line in my_gen(1e7):
        process.stdin.write(line.encode()) # This is apparently not safe
    out, err = process.communicate() # out and err will be None,
    # but this closes the process gracefully, which "with" does too
This results in a Broken Pipe Error, although it doesn't happen every time on every machine I've tried:
Traceback (most recent call last):
File "my_script", line 170, in <module>
process.stdin.write(line.encode())
BrokenPipeError: [Errno 32] Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "path/tolib/python3.8/subprocess.py", line 171, in <module>
File "path/tolib/python3.8/subprocess.py", line 914, in __exit__
self.stdin.close()
BrokenPipeError: [Errno 32] Broken pipe
So, what's the safe way to pass input line by line from a generator to a subprocess?
Edit: I've been getting suggestions about using communicate, which is of course in the docs. That answers how to communicate safely, but it doesn't accept a generator as input.
Edit2: as Booboo pointed out, the example will throw a runtime error (not the one I was finding in my code); the call to range should be range(0, int(end)) so that my_gen can accept numbers in 1e7 notation.
First of all, if you want stdout and stderr to not be piped, then either do not specify these arguments to the Popen call at all or specify their values as None, the default value if not specified (but do not specify these as sys.stdout and sys.stderr).
Why not? Looking at the source for the Popen.communicate method, I can see that there is special optimized code for the case where only one of the three stream arguments is non-None. When that argument is stdin, Popen.communicate simply writes the passed input string to the pipe and ignores any BrokenPipeError that might occur. But by passing the stdout and stderr arguments as you are, I suspect that communicate is confused and is now starting threads to handle the processing, and that this is ultimately what intermittently leads to your exception.
Now I believe that you can execute your writes without using communicate and also ignore the BrokenPipeError. When I tried the following code (substituting my own command being executed by Popen that writes what is being piped in to a file and using text mode), I, in fact, did not encounter any BrokenPipeError exceptions (nor do I expect to with the proper setting of stdout and stderr). So I can't swear to whether the output will still be correct if such an exception should occur.
As an aside, the range built-in function does not take a float object (at least not for me), so I don't know how you are able to specify 1e7.
I have also modified the code to add terminating newline characters at the end of each line and to process in text mode, but you should not feel constrained to do so.
import subprocess
import sys
def my_gen (end): # simplified example
    for i in range(0, end):
        yield f"line {i}\n"
with subprocess.Popen(["command", "-o", "option_value"], stdin=subprocess.PIPE, text=True) as process: # simplified example
    for line in my_gen(10_000_000):
        try:
            process.stdin.write(line)
        except BrokenPipeError as e:
            pass
    out, err = process.communicate()
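A variation on the same idea, as a rough sketch rather than a definitive answer: once a write fails with BrokenPipeError the child has stopped reading, so the loop can stop instead of attempting the remaining lines. The helper name feed_lines is hypothetical, and the sketch assumes the child's stdout and stderr are inherited rather than piped. On Windows a closed pipe may surface as a different OSError, which this does not handle.

```python
import subprocess
import sys

def my_gen(end):  # simplified example, as in the question
    for i in range(int(end)):
        yield f"line {i}\n"

def feed_lines(cmd, lines):
    """Write lines to the child's stdin; return (returncode, lines_written)."""
    written = 0
    process = subprocess.Popen(cmd, stdin=subprocess.PIPE, text=True)
    try:
        for line in lines:
            try:
                process.stdin.write(line)
            except BrokenPipeError:
                break  # the child closed its end; further writes would also fail
            written += 1
    finally:
        try:
            process.stdin.close()  # flushes buffered text and signals end of input
        except BrokenPipeError:
            pass
    return process.wait(), written

# Example call (the command is a placeholder, as in the question):
# returncode, count = feed_lines(["command", "-o", "option_value"], my_gen(1e7))
```

Because writes in text mode are buffered, the count is only an upper bound on what the child actually received when the pipe breaks.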
Docs say to use .communicate:
Warning: Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.
https://docs.python.org/3/library/subprocess.html#subprocess.Popen.communicate

How to read .yml file in MATLAB

I have a sequence of .yml files generated by OpenCV that I was trying to read into MATLAB using yamlmatlab, but I am getting the following error:
y_data = ReadYaml(yaml_file);
Error using ReadYamlRaw>load_yaml (line 78)
while scanning a directive
in "<string>", line 1, column 1:
%YAML:1.0
^
expected alphabetic or numeric character, but found :(58)
in "<string>", line 1, column 6:
%YAML:1.0
^
My YAML Files look like the following:
%YAML:1.0
Vocabulary: !!opencv-matrix
   rows: 100
   cols: 78
   dt: f
   data: [ 1.00037329e-001, 8.75103176e-002, 1.09445646e-001,
       1.05232671e-001, 6.78173527e-002, 9.65989158e-002,
       1.62132218e-001, 1.56320035e-001, 1.12932988e-001,
       1.27447948e-001, 1.88054979e-001, 1.88775390e-001,.....
And
%YAML:1.0
---
vocabulary: !!opencv-matrix
   rows: 100
   cols: 1
   dt: f
   data: [ 3.54101445e-04, 1.23916077e+02, 9.93522644e+01,
       2.42377838e+02, 3.53855858e+01, 1.69853516e+02, 5.81151466e+01,
       8.07454453e+01, 1.83035984e+01, 2.13557846e+02, 1.52394699e+02,
       1.10933914e+02, ......
I have tried it with YAMLMatlab but am still getting the same error. Please help me read these files and convert them into .mat files.
You can use the parser I wrote and published recently on matlabcentral and github, cvyamlParser. It can handle the header in yaml file properly.
https://zenodo.org/record/2703498#.XNg20NMzafU
https://github.com/tmkhoyan/cvyamlParser
https://in.mathworks.com/matlabcentral/fileexchange/71508-cvyamlparser
It is a MEX-file compiled for Linux and OS X. You can use the source file and the instructions to compile a Windows version.
It takes a YAML file written by OpenCV and converts it to a structure with the same variable names as in the YAML file. The variable data type is inferred at runtime. Optionally, you can use sorting for variables that have a numerical index, such as A1, A2, A4, A5, etc.
Use it like so:
s = readcvYaml('../data/test_data.yaml')
s =
struct with fields:
matA0: [1000×3 double]
matA1: [1000×3 double]
matA2: [1000×3 double]
Or with sorting:
s = readcvYaml('../data/test_data.yaml','sorted')
s =
struct with fields:
matA: [1×3 struct]
It appears that the linked library (which appears to use SnakeYAML under the hood) is not able to parse the YAML 1.0 %YAML directive, which uses a colon (:) where later versions of the specification use a space.
%YAML:1.0
Became:
%YAML 1.2
It appears that the contents of the YAML file are compatible with newer YAML formats, so you could try removing the directive from the file prior to parsing (delete the first line).
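For a whole sequence of files, deleting the first line by hand gets tedious. One way to automate it is a small preprocessing step, sketched here in Python rather than MATLAB and run before the files are read. The function name strip_yaml_directive is hypothetical and not part of yamlmatlab or OpenCV, and whether the remaining content (including the !!opencv-matrix tag) then parses still depends on the YAML library.

```python
def strip_yaml_directive(text):
    """Return text without a leading %YAML directive line, if one is present."""
    lines = text.splitlines(keepends=True)
    if lines and lines[0].startswith("%YAML"):
        lines = lines[1:]  # drop only the directive; keep everything else as-is
    return "".join(lines)

# Example use on one file (the file names are placeholders):
# with open("vocabulary.yml") as src:
#     cleaned = strip_yaml_directive(src.read())
# with open("vocabulary_clean.yml", "w") as dst:
#     dst.write(cleaned)
```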
As far as converting once you have the data loaded into MATLAB, you should be able to do something like:
% Read the yaml file
yaml = yaml.ReadYaml(yaml_file);
% Load in the matrix and reshape into the desired size
mat = reshape(yaml.data, yaml.cols, yaml.rows).';
% Save to .mat file
save('output.mat', 'mat')

How to read line with comma-separated fields from file?

I have a task to read a positional file. I am able to read the positional file with the data lengths hard-coded, but my task is to read the data lengths from an external file.
val lengths = Seq(3,10,5,4) // <-- I'd like to read it from an external file
Say, you have a file with the following content (that corresponds to the positions):
$ cat positions.csv
3,10,5,4
In Scala, you could read the file as follows:
val lengths = scala.io.Source.
  fromFile("positions.csv").
  getLines.
  take(1).
  toArray.
  head.
  split(",").
  map(_.toInt).
  toSeq
scala> lengths.foreach(println)
3
10
5
4

Opening a batch file that opens a text file in Python

I am writing a script that can execute a batch file, which needs to open a file in the same folder first. My current code is:
from subprocess import Popen
p = Popen("Mad8dl.bat <RUNTHISTO.txt>", cwd=r"C:\...\test")
stdout, stderr = p.communicate()
where the ... is just the path to the folder. However, every time I run it I get the syntax error:
The syntax of the command is incorrect
Any help regarding the syntax would be greatly appreciated.
First, you should probably remove the < and > angle brackets from your code; just pass the filename, without any brackets, to your batch file. (Unless your filename really does contain < and > characters, in which case I really want to know how you managed it since those characters are forbidden in filenames in Windows).
Second, your code should look like:
from subprocess import Popen, PIPE
p = Popen(["Mad8dl.bat", "RUNTHISTO.txt"], cwd=r"C:\...\test", stdout=PIPE, stderr=PIPE)
stdout, stderr = p.communicate()
Note the list containing the components of the call, rather than a single string. Also note that you need to specify stdout=PIPE and stderr=PIPE in your Popen() call if you want communicate() to return the captured output; without them, communicate() returns (None, None).
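The effect of the PIPE arguments can be seen with a small sketch. It uses the running Python interpreter as a stand-in for Mad8dl.bat, which is an assumption made only so the example runs anywhere; the helper names run_captured and run_uncaptured are hypothetical.

```python
import subprocess
import sys

# Stand-in child process: echoes its first argument, as a batch file might use %1.
cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])", "RUNTHISTO.txt"]

def run_captured(args):
    """Pipe both streams so communicate() returns their contents as bytes."""
    process = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return process.communicate()

def run_uncaptured(args):
    """No pipes: the child writes to the parent's console, nothing is returned."""
    process = subprocess.Popen(args)
    return process.communicate()
```

Calling run_captured(cmd) returns the child's output as the first element of the tuple, while run_uncaptured(cmd) lets the output go straight to the console and returns a pair of None values.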

Spotify Tech Puzzle - stdin in Python

I'm trying to solve the bilateral problem on Spotify's Tech Puzzles. http://www.spotify.com/us/jobs/tech/bilateral-projects/ I have something that works on my computer: it reads input from a file input.txt and writes to output.txt. My problem is that I cannot figure out how to make my code work when I submit it, where it must read from stdin. I have looked at several other posts and I don't see anything that makes sense to me. I see some people just use raw_input, but doesn't this produce a user prompt? I'm not sure what to do. Here is the portion of my code that is supposed to read the input and write the output. Any suggestions on how this might need to be changed? Also, how would I test the code once it is changed to read from stdin? How can I put test data in stdin? The error I get back from Spotify says Run Time Error - NameError.
import sys
# Read input
Input = []
for line in sys.stdin.readlines():
    if len(line) <9:
        teamCount = int(line)
    if len(line) > 8:
        subList = []
        a = line[0:4]
        b = line[5:9]
        subList.append(a)
        subList.append(b)
        Input.append(subList)
##### algorithm here
#write output
print listLength
for empWin in win:
    print empWin
You are actually doing ok.
for line in sys.stdin.readlines():
will read lines from stdin. It can however be shortened to:
for line in sys.stdin:
I don't use Windows, but to test your solution from a command line, you should run it like this:
python bilateral.py < input.txt > output.txt
If I run your code above like that, I see the error message
Traceback (most recent call last):
File "bilateral.py", line 20, in <module>
print listLength
NameError: name 'listLength' is not defined
which happens to be the same error the Spotify puzzle checker reported (though I guess this snippet is not exactly what you submitted). You have probably just misspelled a variable name somewhere.
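Another way to put test data in stdin without a shell redirect is to move the parsing into a function that accepts any text stream, then hand it an in-memory stream during testing. This is a minimal sketch in Python 3 (the code in the question is Python 2). The function name read_input is hypothetical, and the input format is assumed from the question's slicing: one line with a count, then lines holding two IDs separated by a space.

```python
import io
import sys

def read_input(stream):
    """Parse a count line followed by ID pairs; return (team_count, pairs)."""
    team_count = None
    pairs = []
    for line in stream:
        fields = line.split()
        if len(fields) == 1:
            team_count = int(fields[0])
        elif len(fields) == 2:
            pairs.append(fields)
    return team_count, pairs

# In a test, an in-memory stream stands in for stdin:
sample = io.StringIO("2\n1009 2011\n1017 2011\n")
team_count, pairs = read_input(sample)

# In the submitted script, the same function would read the real stdin:
# team_count, pairs = read_input(sys.stdin)
```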