Support for basic datatypes in H5Attributes? - ilnumerics

I am trying out the beta hdf5 toolkit of ilnumerics.
Currently I see H5Attributes support only ilnumerics arrays. Is there any plan to extend it for basic datatypes (such as string) as part of the final release?
Does ilnumerics H5 wrappers provide provision for extending any functionality to a particular
datatype?

ILNumerics internally uses the official HDF5 libraries from the HDF Group, of course. H5Attributes in HDF5 correspond to datasets with the limitation of being not capable of partial I/O. Besides that, H5Attributes are plain arrays! Support for basic (scalar) element types is given by assuming the array stored to be scalar.
Strings are a complete different story: strings in general are variable length datatypes. In terms of HDF5 strings are arrays of element type Char. The number of characters in the string determines the length of the array. In order to store a string into a dataset or attribute, you will have to store its individual characters as elements of the array. In ILNumerics, you can convert your string into ILArrray or ILArray (for ASCII data) and store that into the dataset/ attribute.
Please consult the following test case which stores a string as value into an attribute and reads the content back into a string.
Disclaimer: This is part of our internal test suite. You will not be able to compile the example directly, since it depends on the existence of several functions which may are not available. However, you will be able to understand how to store strings into datasets and attributes:
public void StringASCIAttribute() {
string file = "deleteA0001.h5";
string val = "This is a long string to be stored into an attribute.\r\n";
// transfer string into ILArray<Char>
ILArray<Char> A = ILMath.array<Char>(' ', 1, val.Length);
for (int i = 0; i < val.Length; i++) {
A.SetValue(val[i], 0, i);
}
// store the string as attribute of a group
using (var f = new H5File(file)) {
f.Add(new H5Group("grp1") {
Attributes = {
{ "title", A }
}
});
}
// check by reading back
// read back
using (var f = new H5File(file)) {
// must exist in the file
Assert.IsTrue(f.Get<H5Group>("grp1").Attributes.ContainsKey("title"));
// check size
var attr = f.Get<H5Group>("grp1").Attributes["title"];
Assert.IsTrue(attr.Size == ILMath.size(1, val.Length));
// read back
ILArray<Char> titleChar = attr.Get<Char>();
ILArray<byte> titleByte = attr.Get<byte>();
// compare byte values (sum)
int origsum = 0;
foreach (var c in val) origsum += (Byte)c;
Assert.IsTrue(ILMath.sumall(ILMath.toint32(titleByte)) == origsum);
StringBuilder title = new StringBuilder(attr.Size[1]);
for (int i = 0; i < titleChar.Length; i++) {
title.Append(titleChar.GetValue(i));
}
Assert.IsTrue(title.ToString() == val);
}
}
This stores arbitrary strings as 'Char-array' into HDF5 attributes and would work just the same for H5Dataset.

As an alternative solution you may use HDF5DotNet (http://hdf5.net/default.aspx) wrapper to write attributes as strings:
H5.open()
Uri destination = new Uri(#"C:\yourFileLocation\FileName.h5");
//Create an HDF5 file
H5FileId fileId = H5F.create(destination.LocalPath, H5F.CreateMode.ACC_TRUNC);
//Add a group to the file
H5GroupId groupId = H5G.create(fileId, "groupName");
string myString = "String attribute";
byte[] attrData = Encoding.ASCII.GetBytes(myString);
//Create an attribute of type STRING attached to the group
H5AttributeId attrId = H5A.create(groupId, "attributeName", H5T.create(H5T.CreateClass.STRING, attrData.Length),
H5S.create(H5S.H5SClass.SCALAR));
//Write the string into the attribute
H5A.write(attributeId, H5T.create(H5T.CreateClass.STRING, attrData.Length), new H5Array<byte>(attrData));
H5A.close(attributeId);
H5G.close(groupId);
H5F.close(fileId);
H5.close();

Related

Check if values are those same objects

I'm pretty new to Dart and I got some problems with understanding what is going on under the hood. I've simple code like this:
void main() {
String s1 = "test1";
StringBuffer sb = StringBuffer();
sb.write("test");
sb.write("1");
String s2 = sb.toString();
print(identical(s1, s2));
print(s1 == s2);
int a = 1;
int b = 1;
print(identical(a, b));
}
result of this code is 3x true
Which is little tricky for me. As I understands all variables refers to some values, and since == operator is expected to return true, I'm little confused why identical also return true. If I would use code like:
String s1 = "test1";
String s2 = "test1";
Then I would expect that there is compiler optimization for immutable strings. But since I combine it via StringBuffer I would expect that I got two different values in memory. I suspect that identical uses == operator internally... Why I'm curious: I'm working on app which will use sets of tags. And I'm not sure If should I create some pool of strings and checks if some strings are already created and reuse it or just spawn strings and Dart has it's own string pool?

how can this Repeated string concatenation function

using UnityEngine;
using System.Collections;
public class NewMonoBehaviour1 : MonoBehaviour
{
void ConcatExample(int[] intArray)
{
string line = intArray[0].ToString(); // the line is the var of the first in array
for(i =1;i <intArray.Length; i++) // the length is unknown ?
{
line += ", " + intArray[i].ToString(); //
}
return line;
//each time allocate new in original place
}
}
How can this function work ? the length of array is unknown , so how the for loop works ?Besides, this is void function but shouldn't return anythings right ,or is there any exceptional case ,finally,according to the unity manual, it is said that the function will keep producing a string but with new contents in the same place , resulting in consuming large memory space .Why ?thx
What makes you think that the Length should be unknown? It is a property that any array simply has
Gets the total number of elements in all the dimensions of the Array.
Of course it is not unknown the moment you call your method with an according parameter!
The return line; will not even compile since as you say the method is of type void so it can not return anything. It should probably be private string ConcatExample
Then what the unity manual (don't know where exactly you read this) means lies in
line += ", " + intArray[i].ToString();
under the hood every string in c# is an immutable char[]. So everytime you do a string concatenation via stringC = stringA + stringB what happens under the hood is basically something similar to
char[] stringC = new char[stringA.Length + stringB.Length];
for(var iA = 0; iA < stringA.Length; iA++)
{
stringC[i] = stringA[i];
}
for(var iB = 0; iB < stringB.Length; iB++)
{
stringC[iB + stringA.Length] = stringB[iB];
}
so whenever dealing with loops especially with large data it is strongly recommended to rather use a StringBuilder like
private string ConcatExample(int[] intArray)
{
var builder = new StringBuilder(intArray[0]);
for(i =1; i < intArray.Length; i++)
{
builder.Append(", ").Append(intArray[i].ToString());
}
return builder.ToString();
}
The length of the array will be the length of the array of ints you pass into the function as an argument.
say you pass it
Int[] ints = {1,2,3}
ConcatExample(ints); //the length of the array is now 3
add a debug.log() function to the ConcatExample method
void ConcatExample(int[] intArray)
{
string line = intArray[0].ToString();
for (int i = 1; i < intArray.Length; i++)
{
line += ", " + intArray[i].ToString(); //
Debug.Log(line);
}
}
debug.log would produce the following in the console
1, 2
1, 2, 3
and finally the return line; at the end would just result in an error because yes you are correct void returns nothing
This function CANNOT work, unless it gets the data it expects. A NULL passed to this function, for example, would generate a runtime null-reference exception. Passing a valid integer array, of length zero would generate an invalid index error on the first line.
You are correct, the function returns nothing, and appears pointless. In fact, I would have expected return line; to generate a complier error.
The string type appears "dynamic" meaning, it will indeed allocate more and more memory as needed. Technically, it is actually the string "+" operator, (a function that takes two strings as parameters) that is allocating this space. This function returns a new string, of the appropriate size. The garbage collector will DEallocate "old" strings when they are no longer referenced by any variables.

Concatenate multiple PDF/A with different conformance levels

Is it possible to concatenate a number of pdf/a (with possibly different conformance levels: some pdf/a-1b, some pdf/a-3b ecc) into a single pdfa ?
I was thinking that using the latest level (3-a or 3b) would be ok but I get errors when validating with VeraPDF:
Here is my code (where :
public static byte[] CreateConformantCopy(List<byte[]> sourcePdfs)
{
var version = PdfVersion.PDF_1_7;
var type = PdfAType.PDF_A_3B;
WriterProperties wp = new WriterProperties();
wp.UseSmartMode();
wp.SetPdfVersion(version.ToPdfVersion());
PdfOutputIntent oi = new PdfOutputIntent("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", Assembly.GetExecutingAssembly().GetManifestResourceStream("xxx.Resources.sRGB_CS_profile.icm"));
using (var mergedPdf = new MemoryStream())
{
var writer = new PdfWriter(mergedPdf, wp);
using (PdfADocument newDoc = new PdfADocument(writer, type.ToPdfAConformanceLevel(), oi, new DocumentProperties() { }))
{
Document document = new Document(newDoc, PageSize.A4.Rotate());
newDoc.SetTagged();
newDoc.GetCatalog().SetLang(new PdfString(Thread.CurrentThread.CurrentUICulture.Name));
newDoc.GetCatalog().SetViewerPreferences(
new PdfViewerPreferences()
.SetDisplayDocTitle(true)
.SetCenterWindow(true)
);
PdfMerger merger = new PdfMerger(newDoc);
for (int k = 0; k < sourcePdfs.Count; k++)
{
using (var inDoc = PdfHelper.GetDocument(sourcePdfs[k]))
{
var numberOfPages = inDoc.GetNumberOfPages();
merger.Merge(inDoc, 1, numberOfPages);
}
}
newDoc.Close();
}
return mergedPdf.ToArray();
}
}
PDF/A-1 and PDF/A-2 have several differences in the requirements. So, merging them together might not be possible. Looking on your validation errors, I think this is exactly the case. For example, the very first one is about XMP metadata. The PDF/A-2 is more strict here, and you get this error because your first file (which is probably a valid PDF/A-1) does not actually satisfy the PDF/A-2 rules.
What is possible however is to attach a PDF/A-1 document to PDF/A-2 one. This does not even require the use of PDF/A-3, which allows arbitrary attachments. The PDF/A-2 standard does allow attaching valid PDF/A-1 (as well as PDF/A-2 documents).

Sort/Order an Undetermined Number of Columns (LINQ\Entity Framework)

Need to sort/order a list of data based on an undetermined number of columns (1 or more).
What i'm trying to do is loop through the desired columns and add an OrderBy or ThenBy based on their number to the query'd list, but i'm unsuccessful...
Done this, but it doesn't compile:
var query = GetAllItems(); //returns a IQueriable list of items
//for each selected column
for (int i = 0; i < param.Columns.Length; i++)
{
if (i == 0)
{
query = query.OrderBy(x => x.GetType().GetProperty(param.Columns[i].Name));
}
else
{
//ERROR: IQueriable does not contain a definition for "ThenBy" and no extension method "ThenBy"...
query = query.ThenBy(x => x.GetType().GetProperty(param.Columns[i].Data));
}
}
How can i resolve this issue? Or any alternative to accomplish this requirement?
SOLUTION: #Dave-Kidder's solution is well thought and resolves the compile errors i had. Just one problem, OrderBy only executes (actually sorts the results) after a ToList() cast. This is an issue because i can't convert a ToList back to an IOrderedQueryable.
So, after some research i came across a solution that resolve all my issues.
Microsoft assembly for the .Net 4.0 Dynamic language functionality: https://github.com/kahanu/System.Linq.Dynamic
using System.Linq.Dynamic; //need to install this package
Updated Code:
var query = GetAllItems(); //returns a IQueriable list of items
List<string> orderByColumnList = new List<string>(); //list of columns to sort
for (int i = 0; i < param.Columns.Length; i++)
{
string column = param.Columns[i].Name;
string direction = param.Columns[i].Dir;
//ex.: "columnA ASC"
string orderByColumn = column + " " + direction;
//add column to list
orderByColumnList.Add(orderBy);
}
//convert list to comma delimited string
string orderBy = String.Join(",", orderByColumnList.ToArray());
//sort by all columns, yay! :-D
query.OrderBy(orderBy).ToList();
The problem is that ThenBy is not defined on IQueryable, but on the IOrderedQueryable interface (which is what IQueryable.OrderBy returns). So you need to define a new variable for the IOrderedQueryable in order to do subsequent ThenBy calls. I changed the original code a bit to use System.Data.DataTable (to get a similar structure to your "param" object). The code also assumes that there is at least one column in the DataTable.
// using System.Data.DataTable to provide similar object structure as OP
DataTable param = new DataTable();
IQueryable<DataTable> query = new List<DataTable>().AsQueryable();
// OrderBy returns IOrderedQueryable<TSource>, which is the interface that defines
// "ThenBy" so we need to assign it to a different variable if we wish to make subsequent
// calls to ThenBy
var orderedQuery = query.OrderBy(x => x.GetType().GetProperty(param.Columns[0].ColumnName));
//for each other selected column
for (int i = 1; i < param.Columns.Count; i++)
{
orderedQuery = orderedQuery.ThenBy(x => x.GetType().GetProperty(param.Columns[i].ColumnName));
}
you should write ThenBy after OrderBy like this:
query = query
.OrderBy(t=> // your condition)
.ThenBy(t=> // next condition);

Build dynamic LINQ queries from a string - Use Reflection?

I have some word templates(maybe thousands). Each template has merge fields which will be filled from database. I don`t like writing separate code for every template and then build the application and deploy it whenever a template is changed or a field on the template is added!
Instead, I'm trying to define all merge fields in a separate xml file and for each field I want to write the "query" which will be called when needed. EX:
mergefield1 will call query "Case.Parties.FirstOrDefault.NameEn"
mergefield2 will call query "Case.CaseNumber"
mergefield3 will call query "Case.Documents.FirstOrDefault.DocumentContent.DocumentType"
Etc,
So, for a particular template I scan its merge fields, and for each merge field I take it`s "query definition" and make that request to database using EntityFramework and LINQ. Ex. it works for these queries: "TimeSlots.FirstOrDefault.StartDateTime" or
"Case.CaseNumber"
This will be an engine which will generate word documents and fill it with merge fields from xml. In addition, it will work for any new template or new merge field.
Now, I have worked a version using reflection.
public string GetColumnValueByObjectByName(Expression<Func<TEntity, bool>> filter = null, string objectName = "", string dllName = "", string objectID = "", string propertyName = "")
{
string objectDllName = objectName + ", " + dllName;
Type type = Type.GetType(objectDllName);
Guid oID = new Guid(objectID);
dynamic Entity = context.Set(type).Find(oID); // get Object by Type and ObjectID
string value = ""; //the value which will be filled with data from database
IEnumerable<string> linqMethods = typeof(System.Linq.Enumerable).GetMethods(BindingFlags.Static | BindingFlags.Public).Select(s => s.Name).ToList(); //get all linq methods and save them as list of strings
if (propertyName.Contains('.'))
{
string[] properies = propertyName.Split('.');
dynamic object1 = Entity;
IEnumerable<dynamic> Child = new List<dynamic>();
for (int i = 0; i < properies.Length; i++)
{
if (i < properies.Length - 1 && linqMethods.Contains(properies[i + 1]))
{
Child = type.GetProperty(properies[i]).GetValue(object1, null);
}
else if (linqMethods.Contains(properies[i]))
{
object1 = Child.Cast<object>().FirstOrDefault(); //for now works only with FirstOrDefault - Later it will be changed to work with ToList or other linq methods
type = object1.GetType();
}
else
{
if (linqMethods.Contains(properies[i]))
{
object1 = type.GetProperty(properies[i + 1]).GetValue(object1, null);
}
else
{
object1 = type.GetProperty(properies[i]).GetValue(object1, null);
}
type = object1.GetType();
}
}
value = object1.ToString(); //.StartDateTime.ToString();
}
return value;
}
I`m not sure if this is the best approach. Does anyone have a better suggestion, or maybe someone has already done something like this?
To shorten it: The idea is to make generic linq queries to database from a string like: "Case.Parties.FirstOrDefault.NameEn".
Your approach is very good. I have no doubt that it already works.
Another approach is using Expression Tree like #Egorikas have suggested.
Disclaimer: I'm the owner of the project Eval-Expression.NET
In short, this library allows you to evaluate almost any C# code at runtime (What you exactly want to do).
I would suggest you use my library instead. To keep the code:
More readable
Easier to support
Add some flexibility
Example
public string GetColumnValueByObjectByName(Expression<Func<TEntity, bool>> filter = null, string objectName = "", string dllName = "", string objectID = "", string propertyName = "")
{
string objectDllName = objectName + ", " + dllName;
Type type = Type.GetType(objectDllName);
Guid oID = new Guid(objectID);
object Entity = context.Set(type).Find(oID); // get Object by Type and ObjectID
var value = Eval.Execute("x." + propertyName, new { x = entity });
return value.ToString();
}
The library also allow you to use dynamic string with IQueryable
Wiki: LINQ-Dynamic