Save a string to an h5py dataset using type "array of 8-bit integers (80)"

I wish to create an h5py "string" dataset (for example "A") using the data type "array of 8-bit integers (80)", as HDFView displays it. Each integer in this length-80 array is ord(x) of the corresponding character of the string. For instance, Top is stored as 84 111 112 0 0 0 ..., for a total of 80 int8 values.
The desired dataset should look like this
DATASET "NOM" {
DATATYPE H5T_ARRAY { [80] H5T_STD_I8LE }
DATASPACE SIMPLE { ( 1 ) / ( 1 ) }
DATA {
(0): [ 84, 111, 112, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]
}
}
However, I'm unable to create this dataset using h5py. Using a standard numpy array gives this:
DATASET "NOM" {
DATATYPE H5T_STD_I8LE
DATASPACE SIMPLE { ( 1, 80 ) / ( 1, 80 ) }
DATA {
(0,0): 84, 111, 112, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
(0,15): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
(0,31): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
(0,47): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
(0,63): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
(0,79): 0
}
}
So what data and dtype are needed if my string is, say, "Top"?
.create_dataset("NOM", data=data, dtype=dtype)
According to https://github.com/h5py/h5py/issues/955, maybe I need to use a lower level interface...?
Thanks!
Solution
The problem is that if we build the numpy array data first and then write it with .create_dataset("NOM", data=data), numpy has already folded the 80int8 data type into a plain array of int8:
import numpy as np
dtype = np.dtype("80int8")
x = np.array(2, dtype=dtype)
# x.dtype is dtype('int8'), not the 80int8 subarray type
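This behaviour can be checked directly in NumPy. The sketch below assumes only that numpy is installed, and the variable names are illustrative. It shows that a subarray dtype such as "80int8" is folded into the array's shape when used as the top-level dtype, but survives as a single 80-element item when it is a field of a structured dtype.

```python
import numpy as np

sub = np.dtype("80int8")          # subarray dtype: 80 x int8
plain = np.zeros(1, dtype=sub)    # the subarray dimension is absorbed into the shape
# plain.shape is (1, 80) and plain.dtype is plain int8

# Wrapping the subarray in a structured field keeps it as one 80-element item.
wrapped = np.zeros(1, dtype=[("f0", "80int8")])
# wrapped.shape is (1,) and wrapped["f0"].shape is (1, 80)
```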
The solution is thus to declare the data set with the desired dtype first, then fill in the data.
# gro is an open h5py group (or file); nom is a list of strings
dataset = gro.create_dataset("NOM", (len(nom),), dtype="80int8")
for i in range(len(nom)):
    nom_80 = nom[i] + "\x00" * (80 - len(nom[i]))  # pad nom[i] to 80 characters
    dataset[i] = [ord(x) for x in nom_80]
# dataset.dtype is dtype(('i1', (80,)))
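The padding step can be factored into a small helper. This is a minimal sketch: the name to_fixed_codes and its length and range checks are illustrative additions, not part of the original answer.

```python
def to_fixed_codes(text, width=80):
    """Return `width` integers: ord() of each character, then zero padding."""
    codes = [ord(ch) for ch in text]
    if len(codes) > width:
        raise ValueError(f"string longer than {width} characters")
    if any(code > 127 for code in codes):
        # int8 holds -128..127, so only ASCII characters fit unchanged
        raise ValueError("non-ASCII character does not fit in int8")
    return codes + [0] * (width - len(codes))


row = to_fixed_codes("Top")
print(row[:5])   # [84, 111, 112, 0, 0]
print(len(row))  # 80
```

With a dataset declared as dtype="80int8", a row could then be written as dataset[i] = to_fixed_codes(nom[i]).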

Make a uint8 array of the right size and content:
In [417]: x = np.zeros(80, dtype='uint8')
In [419]: x[:3]=[ord(i) for i in 'Top']
In [421]: ds1=hf.create_dataset('other4', data=x)
A structured array approach:
In [486]: dt = np.dtype([('f0','80int8')])
In [487]: dt
Out[487]: dtype([('f0', 'i1', (80,))])
In [488]: x = np.zeros(1, dt)
In [489]: x['f0'][0][:3]=[ord(i) for i in 'Top']
In [490]: x
Out[490]:
array([([ 84, 111, 112, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],)],
dtype=[('f0', 'i1', (80,))])
In [491]: ds1=hf.create_dataset('st1', data=x)
In [492]: ds1
Out[492]: <HDF5 dataset "st1": shape (1,), type "|V80">
produces
DATASET "st1" {
DATATYPE H5T_COMPOUND {
H5T_ARRAY { [80] H5T_STD_I8LE } "f0";
}
DATASPACE SIMPLE { ( 1 ) / ( 1 ) }
DATA {
(0): {
[ 84, 111, 112, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]
}
}
}

Related

About nodelist in triadic_census for networkx

I want to get the triadic census for a single node. This is the code example:
import networkx as nx
G = nx.DiGraph([("b", "a"), ("b","c")])
print(nx.triadic_census(G))
print(nx.triadic_census(G,nodelist=["c"]))
But the result is different from what I expected. This is the output of the code:
{'003': 0, '012': 0, '102': 0, '021D': 1, '021U': 0, '021C': 0, '111D': 0, '111U': 0, '030T': 0, '030C': 0, '201': 0, '120D': 0, '120U': 0, '120C': 0, '210': 0, '300': 0}
{'003': 0, '012': 0, '102': 0, '021D': 0, '021U': 0, '021C': 0, '111D': 0, '111U': 0, '030T': 0, '030C': 0, '201': 0, '120D': 0, '120U': 0, '120C': 0, '210': 0, '300': 0}
I think node c should also be part of 021D. Are the parameters I passed wrong?
The problem has been fixed in a networkx PR.

Remove zero rows from a list of list in Scala

I have a list of list in Scala such as:
val lst = List(List(60, 0, 1, 2, 3, 28, 0, 0, 0, 0), List(0, 0, 0, 0, 0, 0, 0, 0, 0, 0), List(47, 0, 1, 1, 2, 28, 0, 0, 0, 0))
and I want to remove all zero rows and the result should be like:
List(List(60, 0, 1, 2, 3, 28, 0, 0, 0, 0), List(47, 0, 1, 1, 2, 28, 0, 0, 0, 0))
Does Scala list have any built-in method to remove these rows?
You can use filter to keep only the items (lists) that match a predicate. The predicate can use exists to check for non-zero elements:
lst.filter(_.exists(_ != 0))
@Tzach Zohar's answer is perfectly fine, but here is another way to approach it:
scala> lst.filterNot(xs => xs.forall(_ == 0))
res0: List[List[Int]] = List(
List(60, 0, 1, 2, 3, 28, 0, 0, 0, 0),
List(47, 0, 1, 1, 2, 28, 0, 0, 0, 0)
)

Wrong layout of tiles

I create a level from an array of ints. This is the code:
using UnityEngine;
using System.Collections;
public class Level1 : MonoBehaviour
{
int[][] level = new int[][]
{
new int[] { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 83, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
new int[] { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 83, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
new int[] { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 83, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
new int[] { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 83, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
new int[] { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 83, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
new int[] { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 83, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
new int[] { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 83, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
new int[] { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 83, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
new int[] { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 83, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
new int[] { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 83, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
new int[] { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 83, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0},
new int[] { 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16}
};
public Transform tile00;
public Transform tile16;
public Transform tile38;
int rows = 12;
int cols = 32;
void Start ()
{
BuildLevel ();
}
void BuildLevel(){
int i, j;
GameObject dynamicParent = GameObject.Find ("DynamicObjects");
for(i=0; i<rows; i++)
{
for(j=0; j<cols; j++)
{
Transform toCreate = null;
Debug.Log (i + " , " + j + " " + level[i][j]);
if (level[i][j] == 0)
toCreate = tile00;
if (level[i][j] == 83)
toCreate = tile38;
if (level[i][j] == 16)
toCreate = tile16;
Vector3 v3 = new Vector3(16-j, 6-i, 0);
Transform newObject = Instantiate(toCreate, v3, Quaternion.identity) as Transform;
newObject.parent = dynamicParent.transform;
}
}
}
}
The output screen looks like this:
The tiles are 50 x 50 px. I changed the dimensions of the tiles and their positions on X and Y. I tried everything but found no solution. Could you give me an idea, please?
For the horizontal tiles, the layout I want to obtain is (the image was edited in Paint):
The most likely cause is this line:
Vector3 v3 = new Vector3(16-j, 6-i, 0);
You say that your images are 50 x 50 px each. Assuming that you haven't changed the pixels to units property of your sprite, each of these images occupies 0.5 Unity units on both the X and Y axes.
Now, in your calculation, here's what is happening.
Iteration 1 - (i = 0, j = 0). Position = Vector3(16, 6, 0)
Iteration 2 - (i = 0, j = 1). Position = Vector3(15, 6, 0)...
Iteration 33 -(i = 1, j = 0). Position = Vector3(16, 5, 0)
Now, the difference in the X values between Iteration 1 and Iteration 2 is 1 Unity unit. We've already established that these sprites occupy only 0.5 Unity units due to their size, so half a unit of empty space is left between neighbouring tiles.
The same thing happens along the Y axis between Iteration 1 and Iteration 33: a difference of 1 unit, with each image occupying only 0.5 units.
So, either change the image to be 100 x 100 px, or change the pixels to units value (for example to 50, so that a 50 px tile spans 1 Unity unit).
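The spacing arithmetic can be sketched in a few lines. This is an illustrative Python calculation, not Unity code. The 100 pixels-per-unit figure is the assumption made in the answer, and scaling the grid step instead of resizing the sprite is an alternative the original answer does not mention.

```python
def tile_size_in_units(tile_px, pixels_per_unit):
    """World-space width of a square sprite that is tile_px pixels wide."""
    return tile_px / pixels_per_unit


def grid_position(i, j, step):
    """Position of cell (row i, column j): Vector3(16-j, 6-i, 0) scaled by step."""
    return ((16 - j) * step, (6 - i) * step, 0.0)


size = tile_size_in_units(50, 100)     # 0.5 units per tile
gap = 1.0 - size                       # 0.5 units of empty space with a step of 1

# Stepping by the tile size places neighbouring tiles edge to edge.
first = grid_position(0, 0, size)      # (8.0, 3.0, 0.0)
second = grid_position(0, 1, size)     # (7.5, 3.0, 0.0)
```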

Avoid the exception java.lang.OutOfMemoryError without growing the heap space?

I ran into a Java heap space problem while trying to group the consecutive elements of an array into a matrix so that I can compute its transpose. The array holds a lot of values (26726400) and I want buckets of size 29. When I tested the following code, I got the exception java.lang.OutOfMemoryError: Java heap space
val arr = new Array[Int](256 * 3600 * 29)
arr: Array[Int] = Array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
scala> arr.grouped(29).toArray
java.lang.OutOfMemoryError: Java heap space
My goal is to transpose the matrix. If I run sbt -mem 2048, this code works, but is there another way to do this without growing the heap space?
This may not save much memory, though it is surely more efficient than grouped, which does a couple of copies between buffers internally.
scala> val arr = new Array[Int](256 * 3600 * 29)
arr: Array[Int] = Array(0, 0, 0,...
scala> Array.tabulate(256 * 3600, 29)((i,j) => arr(i * 29 + j))
res0: Array[Array[Int]] = Array(Array(0, 0, 0,...
It's noticeably faster in my scientific trial.
You could also use 1-dim tabulate, allocate Array.ofDim(29) and Array.copy.
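Since the stated goal is the transpose, one option is to skip the grouping step and compute the transposed layout directly from index arithmetic. The sketch below is in Python rather than Scala and only illustrates the indexing. It assumes the flat array is stored row by row (29 values per row in the question), and the function name is hypothetical.

```python
def transpose_flat(flat, rows, cols):
    """Transpose a row-major rows x cols matrix stored in a flat list.

    Returns a new flat list holding the cols x rows transpose, also row-major.
    Only one output buffer is allocated; no intermediate row objects are built.
    """
    if len(flat) != rows * cols:
        raise ValueError("flat length does not match rows * cols")
    out = [0] * (rows * cols)
    for i in range(rows):
        for j in range(cols):
            # element (i, j) of the input becomes element (j, i) of the output
            out[j * rows + i] = flat[i * cols + j]
    return out


small = transpose_flat([0, 1, 2, 3, 4, 5], rows=2, cols=3)
print(small)  # [0, 3, 1, 4, 2, 5]
```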
Well, the default maximum heap for a JVM instance on machines with more than 1 GB of RAM is RAM/4. So, add more memory to your computer, and you won't have to pass that parameter to sbt.
Joking aside, you have at least 3 copies of the data here. First is the original arr instance, then the result of grouped operation, then the result of toArray call. And it could even be more, I'm not sure about the implicit conversion to ArrayOps, which is required by calling the grouped method (it's not defined on the Array class, actually).
Given your data size and type, one copy takes ~102 MiB of memory (26,726,400 Ints at 4 bytes each), excluding any overhead associated with storage. To solve the problem, reduce the number of copies you make. For example, I don't really understand why you need the last toArray call.
As a side note, if this isn't homework, consider using an existing library for matrix operations, such as jBLAS.
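The per-copy figure follows from simple arithmetic, shown here as a short Python calculation. It assumes 4 bytes per Int and ignores object headers and other JVM overhead.

```python
INT_BYTES = 4                      # size of a JVM Int
elements = 256 * 3600 * 29         # 26_726_400 values
bytes_per_copy = elements * INT_BYTES
mib_per_copy = bytes_per_copy / 2**20

print(elements)                    # 26726400
print(bytes_per_copy)              # 106905600
print(round(mib_per_copy, 1))      # 102.0
print(round(3 * mib_per_copy, 1))  # 305.9 for three simultaneous copies
```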

Print whole result in interactive Scala console

When I type something into the Scala interactive console, the console prints the result of the statement. If the result is too long, the console crops it (scroll right to see it):
scala> Array.fill[Byte](5)(0)
res1: Array[Byte] = Array(0, 0, 0, 0, 0)
scala> Array.fill[Byte](500)(0)
res2: Array[Byte] = Array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
scala> "a"*5000
res3: String = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...
How can I print the same or equivalent output, for any given object (not just a collection or array) without the cropping occurring?
The result is not "cropped"; println is simply invoking java.lang.Arrays.toString() (since scala.Array is a Java array).
Specifically, Arrays defines a toString overload that works with Object, which calls the toString implementation of java.lang.Object on every element. That implementation prints the class name and hash code of the object, so you end up with
[Lscala.Tuple2;@4de71ca9
which is an Array of scala.Tuple2 printed by its reference (hash code 4de71ca9) rather than by its contents.
That has been discussed in this ticket years ago.
In the specific case of arrays, you can simply do
println(x.mkString("\n"))
or
x foreach println
or
println(x.deep)
Update
To answer your last edit, you can set the maximum length of the strings printed by the REPL:
scala> :power
** Power User mode enabled - BEEP WHIR GYVE **
** :phase has been set to 'typer'. **
** scala.tools.nsc._ has been imported **
** global._, definitions._ also imported **
** Try :help, :vals, power.<tab> **
scala> vals.isettings.maxPrintString = Int.MaxValue
vals.isettings.maxPrintString: Int = 2147483647
try this
scala> :power
Power mode enabled. :phase is at typer.
import scala.tools.nsc._, intp.global._, definitions._
Try :help or completions for vals._ and power._
scala> vals.isettings.maxPrintString
res9: Int = 800
scala> vals.isettings.maxPrintString = 10000
vals.isettings.maxPrintString: Int = 10000
try
x map println
or
x foreach println