Jackrabbit / JCR organisation of text content data

I was thinking about how to organize "normal" text content (i.e. a String, HTML code, ...) in Jackrabbit.
Are there any recommended structures for plain text content (like there are for files)?
Should I store each text content as a binary (like I do with files):
Node(nt:folder)--> Node(nt:file) --> Node(jcr:content with a jcr:data property which holds the binary)
Or is it better to have something like
Node(nt:folder)--> Node(nt:unstructured with a jcr:message property which holds the string)
My third idea was to create a separate namespace for text content:
Node(nt:folder)--> Node(my:text with a jcr:message property which holds the string)
Node(nt:folder)--> Node(my:html with a jcr:message property which holds the string)
...
What do you think is the best solution?
It would be great to discuss this.

Storing text and html content as nt:file structures makes it visible via WebDAV and other tools that understand those structures. That can be useful depending on your application.
If you don't need this, you can just store your textual content as properties. In this case, using standard property names: jcr:title, jcr:description etc. as defined in the Standard Application Node Types section of the JSR-283 spec helps make things consistent.
See also http://wiki.apache.org/jackrabbit/DavidsModel which has some related recommendations.

I would store regular text in a string property, unless it's a large (multi-kilobyte) text. This is similar to VARCHAR in a relational database.
For really large texts that are not 'files', I would use a binary property (a stream). Such properties are stored in the DataStore, which is slower to write and access than a string property, but will not load the whole item in memory, and will only store the same data once. This is similar to BLOB / CLOB in a relational database.
For files, I would use nt:folder / nt:file. This is similar to a file in a file system.

Related

Decoding/parsing CSV and CSV-like files in Swift

I'll have to write a very customised CSV-like parser/decoder. I have looked for open source ones on GitHub, but have not found any that fit my needs. I can solve this, but my question is whether it would be a total violation of key/value decoding to implement this as a TopLevelDecoder in Swift.
I have keys, but not exactly key/value pairs. In CSV files, there is rather a key for each column of data.
There are a number of problems with the files I need to parse:
Commas are not only for separation of fields, but there are also commas within some fields. Example:
// If I convert to an array
struct Family {
    let name: String?
    let parents: [String?]
    let siblings: [String?]
}
In this example, both parents' names are within the same field and need to be converted into an array, and the same goes for the siblings field.
"Name", "Parents","Siblings"
"Danny", "Margaret, John","Mike, Jim, Jane"
In the case of the parents, I could have split that into two fields in a struct like
struct Family {
    let name: String?
    let mother: String?
    let father: String?
}
but that doesn't work for the Siblings field, since there can be anywhere from zero to many siblings. Therefore I will have to use an array.
There are cases when I will split into two fields though.
Not all the files I need to parse are strictly CSV. All of the files have tabular data (comma- or tab-separated), but some of the files have a few rows of comments (sometimes containing metadata) that I need to consider. Those files have a .txt extension instead of .csv.
## File generated 2020-05-02
"Name", "Parents","Siblings"
"Danny", "Margaret, John","Mike, Jim, Jane"
Therefore I need to peek at the first line(s) to determine if there are such comments, and after that has been parsed I can continue to treat the rest of the file as CSV.
I plan to make it look like any Decoder from the application's point of view, but internally in my decoder I can handle things as if they were key/value pairs, because there is just one set of keys, and that is the first line in the file (if there are no comments at the beginning). I still want to use CodingKeys though.
What are your thoughts? Should I implement it as a decoder (actually a TopLevelDecoder in Swift), or would that be an abuse of the idea of key/value decoding? The alternative is to implement this as a parser, but I have to handle several types of files (JSON, GraphQL, CSV and CSV-like files), and I think my application code would be a lot simpler if I could use Decoders for all the types of files.
For JSON there's no problem, since there is already a JSON decoder in Swift. For GraphQL it's not a problem either, because I can write a decoder with an unkeyed container. The problem files are those CSV and CSV-like files.
Some of them have everything in double quotes, both the "keys" in the CSV header and the values. Some only have double quotes for the keys, but not for the values. Some have comma-separated fields, and some tab-separated. Some have commas within fields, which need special handling. Some have comments at the beginning of the file, which need to be skipped before parsing the rest of the file as CSV.
Some files have two fields in the first column. I have no influence whatsoever over the format of these files, so I just have to deal with it.
If you wonder what files they are, I can tell you that they are files of raw DNA, files with DNA matches, files with common DNA segments with people I have matching DNA with. It's quite a few slightly different files, from several DNA testing companies. I wish they all had used JSON in a standard format, where all keys also were standard for all the companies. But they all have different CSV headers, and other differences.
I also have to decode Gedcom files, which sort of also has key/value coded pairs, but that format too doesn't conform to a pure key/value coding in the files.
Also: I have searched for others with similar problems, but found none exactly the same, and I didn't want to hijack their threads.
See this thread Advice for going from CSV > JSON > Swift objects
That was more of a question of how to convert from CSV to JSON and then to internal data structs in Swift. I know I can write a parser to solve this, but I think it would be more elegant to handle all these files with decoders, but I want your thoughts about it.
I was also thinking of making a new protocol
protocol ColumnCodingKey: CodingKey {
}
I haven't decided yet what to have in the protocol, if anything.
It might work by just having it empty like in the example, and then let my decoder conform to it, then it maybe wouldn't be a very big violation of the key/value decoding.
Thanks in advance!
CSV files can be parsed using a regular expression. To get you started, this might save some time. It's hard to know what you really need, because it looks like there are many different scenarios, and it might grow to even more situations.
A regex to parse one line in a CSV file might look something like this:
(?:(?:"(?:[^"]|"")*"|(?<=,)[^,]*(?=,))|^[^,]+|^(?=,)|[^,]+$|(?<=,)$)
Here is a detailed description on how it works with a javascript sample
Build a CSV parser
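As an alternative to a hand-written regex, a CSV library that already understands quoting can cope with the leading comment lines and the embedded commas described in the question. The following is only a minimal sketch, written in Python rather than Swift purely for illustration. The function name parse_csv_like, the "##" comment prefix and the choice of which columns are split into lists are assumptions taken from the sample data, not part of any existing decoder.

```python
import csv

def parse_csv_like(text, delimiter=",", comment_prefix="##",
                   list_columns=("Parents", "Siblings")):
    """Return (comments, rows) for a CSV-like file.

    Hypothetical helper: the comment prefix and the list-valued
    columns are assumptions based on the sample in the question.
    """
    lines = text.splitlines()

    # Peek at the first line(s): collect leading comment rows as metadata.
    comments = []
    while lines and lines[0].startswith(comment_prefix):
        comments.append(lines.pop(0)[len(comment_prefix):].strip())

    # The rest is ordinary CSV. skipinitialspace lets a quoted field
    # follow a separator plus a space, as in: "Danny", "Margaret, John"
    reader = csv.DictReader(lines, delimiter=delimiter, skipinitialspace=True)

    rows = []
    for record in reader:
        for column in list_columns:
            if column in record:
                raw = record[column] or ""
                # Split the embedded, comma-separated names into a list.
                record[column] = [p.strip() for p in raw.split(",") if p.strip()]
        rows.append(record)
    return comments, rows


SAMPLE = (
    '## File generated 2020-05-02\n'
    '"Name", "Parents","Siblings"\n'
    '"Danny", "Margaret, John","Mike, Jim, Jane"\n'
)

comments, rows = parse_csv_like(SAMPLE)
print(comments)
print(rows[0]["Siblings"])
```

The header row plays the role of the single set of keys, which is roughly what a column-oriented decoder would map onto CodingKeys. For tab-separated files the same function can be called with delimiter="\t".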

Most efficient way to change the value of a specific tag in a DICOM file using GDCM

I need to go through a set of DICOM files and modify certain tags so that they are current with the data maintained in the database of an external system. I am looking to use GDCM, and I am new to it. A search through Stack Overflow posts shows that the Anonymizer class can be used to change tag values.
Generating a simple CT DICOM image using GDCM
My question is whether this is the best use of the GDCM API or whether there is a better approach for changing the values of individual tags such as patient name or accession number. I am unfamiliar with all of the API options but have a link to the API documentation. It looks like the DataElement SetValue member could be used, but it doesn't appear that there is a valid constructor for doing this in the Value class. Any assistance would be appreciated. This is my current approach:
Anonymizer anon = new Anonymizer();
anon.SetFile(myFile);
anon.Replace(new Tag(0x0010, 0x0010), "BUGS^BUNNY");
Quite late, but maybe it will still be useful. You have not mentioned whether you write in C++ or C#, but I assume the latter, as you do not use pointers. Generally, your approach is correct (unless you use System.IO.File instead of gdcm.File). The value (the second parameter of the Replace function) has to be a plain string, so no special constructor is needed. You should probably start with the doxygen documentation of GDCM, which includes one complete example in particular. It is in C++, but translating it should not be a problem.
There are two different ways to set DICOM tags:
1. Anonymizer
gdcm::Anonymizer anon;
anon.SetFile(file);
// (0002,0013) is Implementation Version Name
anon.Replace(gdcm::Tag(0x0002, 0x0013), "Implementation Version Name");
2. DataElement
// ds is assumed to be the file's gdcm::DataSet
gdcm::Attribute<0x0018, 0x0088> ss;
ss.SetValue(10.0);
ds.Insert(ss.GetAsDataElement());

What is the HTTP content type for binary plist?

I am modifying a Rails server to handle binary plists from an iPhone client via POST and PUT requests. The content type for a text plist is text/plist, as far as I can tell. I would like the server to handle both text and binary plists, so I need to distinguish between the two forms. What is the content type for a binary plist?
I believe that most binary formats use the application top-level type, so maybe application/plist.
See the bottom of RFC1341.
Update
As Pumbaa80 mentioned, since application/plist is not a standard MIME type, it should be application/x-plist.
RFC2045 explains this:
In the future, more top-level types
may be defined only by a standards-track extension to this standard.
If another top-level type is to be used for any reason, it must be
given a name starting with "X-" to indicate its non-standard status
and to avoid a potential conflict with a future official name.
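Because neither content type is standardized, a server may not want to rely on the header alone to tell the two forms apart. Binary property lists begin with the bytes "bplist", so the request body itself can be inspected. The following is a minimal sketch in Python (not Rails) using the standard plistlib module; the function names is_binary_plist and parse_plist_body are hypothetical.

```python
import plistlib

# Binary property lists start with this magic prefix (e.g. b"bplist00").
BINARY_PLIST_MAGIC = b"bplist"

def is_binary_plist(body: bytes) -> bool:
    """Return True if the raw request body looks like a binary plist."""
    return body.startswith(BINARY_PLIST_MAGIC)

def parse_plist_body(body: bytes):
    """Hypothetical helper: report which form was sent and decode it.

    plistlib.loads detects the XML or binary format on its own, so the
    sniffing is only needed when the caller must know which form arrived.
    """
    kind = "binary" if is_binary_plist(body) else "xml"
    return kind, plistlib.loads(body)


payload = {"name": "example", "count": 3}
binary_body = plistlib.dumps(payload, fmt=plistlib.FMT_BINARY)
xml_body = plistlib.dumps(payload, fmt=plistlib.FMT_XML)

print(parse_plist_body(binary_body)[0])
print(parse_plist_body(xml_body)[0])
```

Sniffing the body this way works regardless of whether the client sends text/plist, application/x-plist or something else.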

Laying out a table in a GWT UiBinder (with Grid?)

I want to make a table of data in a UiBinder. I need programmatic access so I can add data at runtime, but I'd like my designer to have access to header names, column styles, etc, in the ui.xml file.
Is there a solution that meets these needs? A Grid perfectly satisfies my programmatic access, but I don't see a way to specify rows or cells in a Grid from the ui.xml.
I'd let the designers change the style via CSS files: Either include those in your host page, or use CssResource in a ClientBundle.
The header names etc. can be provided, e.g., by properties files via GWT's internationalization Constants (even if you only want to support one language).
If you want to go one step further and let the designer specify which columns to show and in which order, then it might be a good idea to create your own widget. Maybe the CricketScores example serves as a good starting point on how to use an XML attribute to specify the columns from your ui.xml.

What is the use of plist?

In my application, I am using a plist. Please, can anyone explain what are the uses of plist with an example or a sample code?
In the context of iPhone development, Property Lists are a key-value store that your application can use to save and retrieve persistent data.
All iPhone applications have at least one of these by default, the Information Property List:
The information property list is a
file named Info.plist that is included
with every iPhone application project
created by Xcode. It is a property
list whose key-value pairs specify
essential runtime-configuration
information for the application. The
elements of the information property
list are organized in a hierarchy in
which each node is an entity such as
an array, dictionary, string, or other
scalar type.
Plists are XML files in a specific format. Prior to XML, they had a custom format now called 'old plist'. (You almost never see that anymore, save in legacy code.)
Foundation's collection classes automatically generate XML files in the plist format when you use their serialization methods to write them to disk. They also automatically read them back. You can also write your own serializers for your own custom objects. This allows you to persistently store complex objects in a robust, human-readable format.
One use of plists for programmers is that it is easier to use the plist editor to input and manage a lot of data than it is to hard-code it. For example, if you have a class that requires setting a large number of ivars, you can create a plist, read it into an NSArray or NSDictionary, and then initialize the instance by passing it the dictionary.
I use this technique when I have to use a large number of paths to draw complex objects. You define the path in the plist file instead of the code and edit the path in the plist editor.
It's also a handy way to create a large amount of detailed test data.
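The idea of keeping data in a plist and initializing an object from the resulting dictionary can be sketched as follows. This is only an illustration in Python using the standard plistlib module, not Foundation code; the ShapeStyle class and its keys (line_width, color, points) are hypothetical.

```python
import plistlib
from dataclasses import dataclass

# A small XML property list, as the plist editor might produce it.
# The keys are made up for this example.
STYLE_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>line_width</key>
    <real>2.5</real>
    <key>color</key>
    <string>red</string>
    <key>points</key>
    <array>
        <array><integer>0</integer><integer>0</integer></array>
        <array><integer>10</integer><integer>5</integer></array>
    </array>
</dict>
</plist>
"""

@dataclass
class ShapeStyle:
    """Hypothetical object whose fields are configured from a plist."""
    line_width: float
    color: str
    points: list

# Read the plist into a dictionary, then pass it to the initializer.
settings = plistlib.loads(STYLE_PLIST)
style = ShapeStyle(**settings)
print(style)
```

The path data lives in the plist rather than in code, so it can be edited without touching the class.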
PList means Property List.
It is an XML file format.
It is mainly used to store and retrieve data.
It can store key-value pairs.
It's been a long time since I've looked at them, but plist is a short-form of "properties list" and can be used to store application configuration settings that need to persist between instances of an application's execution. Could be similar to a .properties file (I see those a lot on Java projects).
A plist is essentially just a data file, it stores information in a documented format.
From Wikipedia:
In the Mac OS X Cocoa, NeXTSTEP, and
GNUstep programming frameworks,
property list files are files that
store serialized objects. Property
list files use the filename extension
.plist, and thus are often referred to
as plist files. Property list files
are often used to store a user's
settings. They are also used to store
information about bundles and
applications, a task served by the
resource fork in the old Mac OS.
.plist
Info.plist is a key/value persistent store (property list) that is used by both the system and the user. It contains human-readable text in XML format. Info.plist is a mandatory file for any bundle. For example, it contains the bundle ID, which is mostly used by the system, but as a programmer or user you are not restricted from reading or changing it. Likewise, you can add key/value pairs for your own purposes and read them at runtime. You may have noticed that some frameworks require you to add key/value pairs to your application's Info.plist, for example to identify you.
.entitlements is a property list with enabled capabilities (e.g. Apple Pay).