Origin of FIX data dictionaries - quickfix

The QuickFIX website provides data dictionaries for various versions of FIX (note: I am talking about FIX rather than FIXML). I have not seen any mention of such data dictionaries in the FIX specification, so my assumption is that an independent person (perhaps a QuickFIX developer) invented them and they became a de facto standard used across multiple FIX implementations. Does anyone know who invented them? I ask because I want to briefly discuss FIX data dictionaries in a book I am writing, and I would like to credit the inventor.
In addition, I have not been able to find a schema (e.g., DTD or XML Schema) for a FIX data dictionary (again, I am talking about FIX rather than FIXML). Do such things exist?

Confirming, it was me.
When I originally created it, there was no structured version of the FIX specification. The spec was distributed as a Microsoft Word document.
Word would allow you to export a document as HTML. So I did that manually, then created a parser for the resulting HTML documents for each version.
Since the Word documents seemed to be edited by hand, they were quite inconsistent, and the parser needed to handle all sorts of edge cases. But eventually it was able to produce usable structured data.
Since then the FIX repository has been released, which is how the QuickFIX XML document is currently generated (though, surprisingly, there is still data cleaning that needs to be done before a good QuickFIX document comes out).
I think there are a few reasons that the QuickFIX version of the spec became popular over the FIX repository.
It is a single document for each version of the spec.
It is more human readable. Tags are referenced only in field definitions; messages are constructed with the relatively more human-readable field names (see the abridged example after this list).
There are no barriers to download. Originally, if I recall correctly, the FIX repository was only available to members. I don't think that's still the case, but you do still need to have an account and be logged in to download the repository.
It serves a function for a large user base. Since the dictionary can be plugged into QuickFIX and used to generate QuickFIX messages, there was a large installed base that could take it and connect to your platform more easily. I believe this led other FIX engines to adopt the format and piggyback on that advantage, which I think is great.
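For reference, this is roughly what the format looks like; an abridged fragment in the shape of the FIX 4.2 dictionary that ships with QuickFIX (only a few entries shown):

```xml
<fix major="4" minor="2">
  <header>
    <field name="BeginString" required="Y"/>
    <!-- ... -->
  </header>
  <messages>
    <!-- messages reference fields by name, not by tag number -->
    <message name="Heartbeat" msgtype="0" msgcat="admin">
      <field name="TestReqID" required="N"/>
    </message>
    <!-- ... -->
  </messages>
  <trailer>
    <field name="CheckSum" required="Y"/>
  </trailer>
  <fields>
    <!-- tag numbers appear only here, in the field definitions -->
    <field number="8" name="BeginString" type="STRING"/>
    <field number="10" name="CheckSum" type="STRING"/>
    <field number="112" name="TestReqID" type="STRING"/>
    <!-- ... -->
  </fields>
</fix>
```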

There is no schema for the dictionaries.
Oren Miller created QuickFIX. I assume he or someone from his team invented the format: https://en.wikipedia.org/wiki/QuickFIX
Initial commit from Oren Miller which also contains initial version of DataDictionary:
https://github.com/quickfix/quickfix/commit/3b4df170aa518dd92cb05dc7c3bdbc83779516de#diff-bd791d8e47e80c1bbefe35e7a16453eb6e918c2d76bd26c38a139fc7c5ccc3ca

Related

Sails.js REST server based on jsonapi.org specification

I need to develop a REST server strictly according to the jsonapi.org specification, and I'm not sure whether a complete solution exists or even whether it's easy to develop such a thing.
I've found sails-hook-jsonapi, but it looks unmaintained for some time.
I'm new to Sails and not aware of all its features, so I would appreciate any help; I may have missed something obvious.
I have needed this too. There is not anything that works yet with Sails. sails-hook-jsonapi does not work correctly. I forked that code and am maintaining my own version of it, but there are still significant attribute serialization issues with multiple records. However, it does work at a basic level. I am also working on a new project, sails-generate-jsonapi-blueprints, but it is not nearly ready yet.
Sails is great but can be a royal pain. The guys maintaining Sails have had many requests for jsonapi.org support, but I do not believe that will happen anytime in the near future. If you REALLY must have the jsonapi.org format, I would suggest Loopback or some other framework that already supports it out of the box.
Actually, I take part of that back. sails-hook-jsonapi is working. I made a little change in the fork I maintain. https://github.com/NikkiDreams/sails-hook-jsonapi. Ian is maintaining the original project fork too I believe. https://github.com/IanVS/sails-hook-jsonapi
So the catch about the hook is that it hijacks every single request sent to responses/ok.js. If you need something like an Authorizer that does not need jsonapi, create a variant of ok.js that simply does a res.json(data) without the jsonapi-serializer being called when serializing the response.
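A minimal sketch of such a variant, assuming Sails-style custom responses in api/responses (the file name okRaw.js is illustrative):

```javascript
// api/responses/okRaw.js -- like ok.js, but skips the jsonapi-serializer
// and returns plain JSON. Useful for endpoints (e.g. an Authorizer) that
// don't need JSON API formatting.
module.exports = function okRaw(data) {
  var res = this.res;      // Sails binds the response to `this`
  res.status(200);
  return res.json(data);   // plain JSON, no JSON API serialization
};
```

A controller can then call res.okRaw(data) on those routes, while the rest of the app keeps the hijacked ok.js.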
sails-hook-jsonapi will serialize most of your data to your needs. But it still has a few limitations. Depending on the complexity of your queries these may not be an issue.
TODOs:
- Included request parameter handling (400 response if present)
- Links
  - Top-level "self" links
  - Top-level "related" links
  - Resource-level "self" links
  - Related resource relationship links
  - Metadata links
  - Pagination
- Formatting
  - Non-dasherized attributes
  - Sparse fieldsets
Long story short: there is no way to do it out of the box with little time investment, at least for now.
But sails-hook-jsonapi looks like a good starting point, and the repository seems to be active now.
I built my project prototype on the loopback.io framework, because I was in a hurry and Loopback had better jsonapi support.

per-paragraph commenting system

I'm very interested in the emerging trend of comments-per-paragraph systems (also called "annotation systems"), such as the ones implemented by medium.com and qz.com, and I'm looking into developing one of my own.
Question: it seems they are mainly implemented via JavaScript that runs through the text's HTML paragraphs, uniquely identified by an id attribute (or, in the case of Medium, a name attribute). Does this mean their CMS actually stores each paragraph as a separate entry in the database? That seems overly complex to me, but otherwise, how do they handle a paragraph being deleted, edited, or moved around within the overall text? How would the unique id be preserved if the author changes the paragraph?
How is that unique id logically structured (post_id + position_in_post)?
Thank you for your insights...
I can't speak to the medium side, but as one of the developers for Quartz, I can give insight into how qz.com annotations work.
The annotations code is custom PHP code and is independent of the CMS used for publishing articles (WordPress VIP). We do indeed store a reference to each paragraph as a row in the database, in order to track any updates to the article content. We call this an annotation thread, and when a user saves an annotation, the threadId gets stored along with the annotation.
We do not have a unique id stored on the WordPress side for each paragraph; instead we store the paragraph's relative position in the article (nodeIndex "3" and nodeSelector "p" == the third p-tag in the content body for a given article), and the JavaScript determines where exactly to place the annotation block. We went this route to avoid heavier customizations on the WordPress side, though depending on your CMS it may be easier to address this directly in the CMS code and add unique ids to the html before sending it to the client.
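For illustration, a minimal client-side sketch of that lookup (the container selector and the 1-based indexing are assumptions based on the description above):

```javascript
// Given a stored thread like { nodeSelector: "p", nodeIndex: 3 }, find the
// paragraph element the annotation block should be attached to.
// ".article-content" is an illustrative container class.
function findAnnotationTarget(thread) {
  var nodes = document.querySelectorAll('.article-content ' + thread.nodeSelector);
  return nodes[thread.nodeIndex - 1]; // nodeIndex "3" == the third p-tag
}
```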
Every time an update to an article is published, each paragraph in the updated article is compared against what was previously stored with the annotation threads for that article. If the position and paragraph text do not match up, it attempts to find the paragraph that is the closest match and updates the row for that thread; new threads are created and deleted where appropriate. All of this is handled server side whenever changes are published to an article.
A couple of alternative implementations that are also worth looking at are Gawker's Kinja text annotations (currently in use on Jalopnik) and the word-for-word annotations of rapgenius.com.
(disclaimer: I'm a factlink dev.)
I work for a company trying to allow per-paragraph (or per-phrase) commenting on arbitrary sites. Essentially, you've got two choices to identify the anchor of a comment.
Remember the structure of the page (e.g. some path from a root to a paragraph), and place comments at the same position next time.
Identify the content of the paragraph and place comments near identical or similar content next time.
Both systems have their downsides, but you pretty much need to go with option 2 if you want a robust system. Structural identification is fragile in the face of changing structure; even irrelevant changes such as theming or the precise HTML tags used can significantly alter the "path". When that happens, you really can't fix it, unless you inspect the content, i.e. option 2.
Sam describes what comes down to a server-side content-based approach in his answer. Purely client-side content-based matching is what Factlink and (IIRC) Hypothesis use. Most browsers support non-standard but fast substring search in page content using either window.find or TextRange.findText. Alternatively, you could walk the DOM, which is slower but gives you the flexibility to implement (e.g.) fuzzy matching.
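As a minimal sketch of the client-side idea (window.find is non-standard, so treat this as illustrative rather than as what Factlink actually ships):

```javascript
// Locate the stored quote text in the current page and return a Range for it,
// so the comment can be anchored next to identical content.
function findQuoteRange(quote) {
  var sel = window.getSelection();
  sel.removeAllRanges();                    // search from the top of the document
  if (window.find && window.find(quote)) {
    return sel.getRangeAt(0).cloneRange();  // Range covering the matched text
  }
  return null; // fall back to a DOM walk / fuzzy matching here
}
```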
It may seem like client-side matching is overkill or complex, but really, it's simpler: it's a very robust way to decouple your content-management from your commenting. Neither is really simple, so decoupling those concerns can be a win.
I created a fiddle along the same lines to demonstrate the power of jQuery during a training session.
http://fiddle.jshell.net/fotuzlab/Lwhu5/
It might help as a starting point, along with Sam's detailed and useful insights. You get the value of the text field in the jQuery function, from where you can send it across to your CMS using Ajax/APIs.
PS: The function is not production ready; it's only meant as a starting point. A little tweaking will make it usable.
I've recently published a post on how to do this with WordPress building on an existing plugin.
Like qz.com, I assign paragraph ids on the client and then provide that info to WordPress to store as comment meta when a new comment is created. I used hashing of the paragraph text to create the id, which means that the order of paragraphs is unimportant, but it does mean that if a paragraph is edited, any associated comments become orphaned.
At first I thought this was an issue, but thinking about it: if a reader comments on a paragraph, then editing that text afterwards seems a little sneaky.
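A minimal sketch of the hashing idea (the actual plugin code differs; the toy 32-bit hash and the container class are illustrative):

```javascript
// Derive each paragraph's id from its text: reordering paragraphs keeps
// comments attached, but editing a paragraph orphans its comments.
function hashText(text) {
  var h = 0;
  for (var i = 0; i < text.length; i++) {
    h = (h * 31 + text.charCodeAt(i)) | 0; // stay in 32-bit integer range
  }
  return 'para-' + (h >>> 0).toString(16);
}

var paragraphs = document.querySelectorAll('.entry-content p');
for (var i = 0; i < paragraphs.length; i++) {
  paragraphs[i].id = hashText(paragraphs[i].textContent);
}
```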
The code is freely available on GitHub if you feel like forking it and enhancing it.
There is one other WordPress plugin, called "commentpress", which has existed for a long time.
I use an old version of this plugin for my blog and it works very well.
You can choose to comment per line or per paragraph, and the ergonomics are really well thought out!
A demo here:
http://futureofthebook.org/
and all the code is on github:
https://github.com/IFBook/commentpress-core
After a quick look at the code, it seems they use the second approach, as @Eamon Nerbonne explains in his answer.
They parse each paragraph to make a signature based on the first character of each word. Here is the function that does that.
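In the spirit of that function, a minimal sketch of the signature scheme (the actual CommentPress code may differ in detail):

```javascript
// Build a paragraph signature from the first character of each word, so a
// paragraph can be re-identified even if its position in the post changes.
function paragraphSignature(text) {
  return text.split(/\s+/)
             .filter(function (w) { return w.length > 0; })
             .map(function (w) { return w.charAt(0); })
             .join('');
}
// paragraphSignature("The quick brown fox") === "Tqbf"
```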
In case someone comes looking in here, I've implemented Medium-like functionality as a Django app.
It is open source and can be found as a package on PyPI, and on GitHub.
I used one of my other apps, blogging, to allocate unique paragraph IDs to each content object (currently we're only looking at <p> tags) and to attach some extra internal metadata in the backend while storing it in the DB (MySQL currently, but we've JSONed the blob, a method more natively suited to document-oriented DBs). The frontend is mainly jQuery driven, with a REST API plugging the backend into the frontend.
I took cues from this post, but then rejected the creation of some kind of digest value from the paragraph, because content can change. What I wanted was to preserve the annotations as long as the paragraph was not completely overwritten. For the complete-overwrite case, I provided for collecting the annotations in an orphaned bucket.
More in these tutorials
A legacy version of the same is running on those tutorial pages; that was the first revision. (You won't be able to post without logging in, but you can always log in using social accounts to check it out :-) )

How can I code or directly use a version control system such as Subversion with MongoDB?

I am setting up a simple online CMS/editing system with a few editors and would like a simple audit trail with diff, history, comparison, and rollback functionality for small bits of text.
Our editors have gotten used to the benefits of XML/SVN, and I would really like to create a simple version of this in my system.
I realise I could probably create my own using, say, a versions/history DB with linked ids like this, but I wondered whether this is the best way, or whether there is an equivalent of an SVN-style API available?
BTW, I am totally new to MongoDB, so go easy on me :-)
Cheers
Putting the data files that make up the database under version control is not a good idea, since they consist only of binary data. They are also rather large from the beginning, since MongoDB preallocates some disk space for them. So you gain no benefit from putting the data folders under version control.
If you want to track changes, you could export the data in its serialized form and store that in your VCS. As the data grows, though, the advantage of the VCS drops, since it will become very slow.
I assume you need to track the changes from within the data, but since you are dealing with binary data, you are out of luck there.
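If you go the roll-your-own route the question mentions, a minimal sketch with the Node.js MongoDB driver could look like this (collection and field names are illustrative; diff, comparison, and rollback would then be built on top of the archived revisions):

```javascript
// Archive the outgoing revision into a history collection on every save.
async function saveWithHistory(db, id, newText, editor) {
  var docs = db.collection('documents');
  var history = db.collection('documents_history');

  var current = await docs.findOne({ _id: id });
  if (current) {
    // keep the previous revision before overwriting it
    await history.insertOne({
      docId: id,
      text: current.text,
      editor: current.editor,
      archivedAt: new Date()
    });
  }
  await docs.updateOne(
    { _id: id },
    { $set: { text: newText, editor: editor, updatedAt: new Date() } },
    { upsert: true }
  );
}
```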

Two-way syncing with iPhone and Web service

Yeah, I know there are a couple of questions related to syncing between an iPhone and a web DB, but none of them helped me.
I also did a lot of googling, but I found little information about two-way syncing. Maybe I just used the wrong keywords.
I'm building an app right now, and I came up with the idea of adding two-way sync between my app and my web service.
My first thought was that it would be ridiculously easy, but it turns out not to be easy at all.
I found a couple of problems and some solutions to them, but I would like to hear from you whether these solutions would create other problems, or whether they are good or bad.
The idea of my app is to help me sync the notes I take on the go with my iPhone and at work or at home with a web app.
Those two ends should always be in sync, because I never know which device (iPhone or computer) I will use to take, edit, or just read my notes.
What I have on both sides:
For my web service (and web app) I will use Rails and, I think, MySQL on the DB side.
On the iPhone I will use a SQLite DB with an Objective-C wrapper (FMDB).
Both will exchange data via JSON (using a JSON framework on the iPhone side).
My ideas so far:
Primary key has to be unique on both sides
As a primary key I will use a UUID. It will be unique on both sides and won't create any duplicates (at least I hope).
Revisions for changes of data
Each change will be saved as a revision with a SHA-1 key, which I will create from the date + the note data.
The revision object also includes information like (see the sketch after this list):
date
which note object belongs to this revision
on which device were the changes made?
what changed? (actually, I'm not sure about including this information)
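A minimal sketch of such a revision record (Node's crypto module is used for illustration; the iPhone side would compute the same SHA-1 with its own APIs):

```javascript
var crypto = require('crypto');

// Build a revision record for a changed note: a SHA-1 key over date + note
// data, plus the bookkeeping fields listed above.
function makeRevision(note, device) {
  var date = new Date().toISOString();
  var rev = crypto.createHash('sha1')
                  .update(date + JSON.stringify(note))
                  .digest('hex');
  return {
    rev: rev,        // SHA-1 revision key
    noteId: note.id, // UUID shared by the iPhone and web DBs
    date: date,
    device: device   // which device made the change
  };
}
```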
My "solution" so far is I will track every modification (create, update, delete) on a histroy-table with revisions on both sides.
On the iPhone side I will first update my history-table from Web DB and then commit my changes to the Web DB.
This should work, right?
That doesn't sound too bad to me, but my question here is how I can handle conflicts. I don't want to bother the user with messages about how to handle them.
Roundup of my questions:
Is my "solution" good or bad? What should I change to make it better?
How can I handle change conflicts so that the user doesn't notice them?
Do you have any resources I could read about two-way syncing?
EDIT:
Thank you all for your answers. I now know that I'm not alone with this "problem" and that there is no simple, one-size-fits-all solution. I gather that my ideas and solutions so far are reasonable, and I will try to come up with syncing rules.
My plan so far: I will build it as simply as possible and use it for my own needs, solving the problems I discover while using and syncing. After that I will invite my friends to test it and solve the problems they run into.
I think this way I can come up with real-world rules for syncing my data with the web, because I will see what people are actually doing and where the problems are.
What do you think?
"It depends."
Everyone loves that line in their answers.
Two way sync boils down fundamentally to conflict resolution. And only you as the application designer can come up with the rules for conflict resolution.
Without conflict, syncing is easy.
One-way syncing is "easy" because it's just like two-way sync, save that the rules for conflict always favor one party. "Make this look like that." Simple rule.
Fine-grained two-way syncing isn't that hard: you just need to record the specific changes that are made and when they are made; then, when you sync, you take the log of changes from each party, combine them into a single log, and apply that log to each party, starting from the last time they were in sync.
By specific changes I don't mean "record changed"; that's too coarse. Rather, you want to know that the "lastName" of a record changed, and that it changed at 01/01/2011 12:23:45.
When party A says lastName changed to "Johnson" at 01/01/2011 12:22:45 and party B says lastName changed to "Smith" at 01/01/2011 12:22:46, then "Smith" is the right answer, since it's the latest.
But wait, did you see what happened there? I just pulled a rule out of thin air. "Latest wins". Maybe that doesn't work for you, maybe you have different rules. "It Depends".
So, really, it all comes down to the rules. You can make it as fine-grained as you want. There will ALWAYS be conflicts. That's what the rules are for.
So you need to decide what those rules are for your application.
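To make the log-combining step concrete, here is a minimal sketch of a merge under the "latest wins" rule (the change-entry shape is an assumption):

```javascript
// Merge two change logs, keeping only the latest change per record field.
// Each entry: { recordId, field, value, changedAt } with comparable timestamps.
function mergeLogs(logA, logB) {
  var winners = {};
  logA.concat(logB).forEach(function (change) {
    var key = change.recordId + ':' + change.field;
    var current = winners[key];
    if (!current || change.changedAt > current.changedAt) {
      winners[key] = change; // the later change wins
    }
  });
  return Object.keys(winners).map(function (k) { return winners[k]; });
}
```

Applying the merged log to both parties brings them to the same state; a different rule would just replace the comparison in the middle.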
Actually, I consider that the only problem in any kind of two-way syncing arises when there are conflicts. Really. Take, for example, any version control system (SVN, CVS, Git, etc.). They resolve conflicts more granularly because they split the file itself and check for line conflicts, so changes in two different parts of a file are not treated as conflicts.
However, I suppose this solution would not really be feasible here, because it's a pain to implement :) ...
If you decide to handle conflicts at the level of notes, and not their lines, then probably at the end of the day you need to come up with some business rule that defines what happens when there are changes that conflict.
Possibilities:
Use the last change and override the older one. This is easy.
A solution Dropbox uses (I've seen it a couple of times when we were changing the same document on multiple machines) is to create multiple files, appending a suffix to let users know about the changes from the different machines. You could easily do something like this with notes as well, as sketched below.
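A minimal sketch of that Dropbox-style resolution for notes (field names are illustrative; dates are assumed to be ISO timestamps so string comparison orders them):

```javascript
// On conflict, keep both versions and rename the older one so the user can
// reconcile later, instead of silently discarding either change.
function resolveConflict(local, remote) {
  if (local.rev === remote.rev) return [local]; // same revision, no conflict
  var newer = local.date > remote.date ? local : remote;
  var older = local.date > remote.date ? remote : local;
  older.title = older.title + ' (conflicted copy ' + older.date + ')';
  return [newer, older]; // keep both versions of the note
}
```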
I'm not sure I've helped, though...
Moszi

How do you CM an application with managed content

We have a web application which contains a bunch of content that the system operator can change (e.g. news and events). Occasionally we publish new versions of the software. The software is tagged and stored in Subversion. However, I'm a bit torn on how best to version-control the content that may be changed independently. What are some mechanisms that people use to make sure that content is stored and versioned in a way that the site can be recreated, or at the very least version controlled?
When you identify two sets of files which have their own life cycles (software files on one side, "news and events" on the other), you know that:
you cannot version them together at the same time
you should not put the same label on them
You need to save the "news and events" files separately (either in the VCS, or in a DB as Ian Jacobs suggests, or in a CMS - a content management system), and find a way to link the two together (an id, a timestamp, a meta-label, ...).
Do not forget that you are talking about two sets of files that differ not only in life cycle but also in their very natures:
Consider the terminology introduced in the SO question "Is asset management a superset of source control" by S.Lott:
software files: infrastructure information, that is, "representing the processing of the enterprise information asset". Your code is part of that asset and is managed by a VCS (version control system), as part of the configuration management discipline.
"news and events": enterprise information, that is, data (not processing); this is often split between content managers and relational databases.
So not everything should end up in Subversion.
Keep everything in the DB, and give every transaction in the DB a timestamp. That way you can keep standard DB backups and load the site content as of whatever date you want if the worst happens.
I suppose part of the answer depends on what CMS you're using, and how your web app is designed, but in general, I'd regard data such as news items or events as "content". In other words, it's not part of your application - it's the data which your application processes.
Of course, there will be versioning issues between your CMS code and your application code. You could manage this by defining the interface between the two. Personally, I'd publish the data to the web app as XML, which gives you the possibility of using XML schema to define exactly what the CMS is required to produce, and what the web app should expect to process.
This ought to mean that most changes in the web app can be made without a corresponding alteration in the rendering of the data. When functionality changes require this, you can create a new version of the schema and continue to make progress. In this scenario, I'd check the schema in with the web app code, but YMMV.
It isn't easy, and it gets more complicated again if you need additional data fields in your CMS. Expect to plan for a fairly complex release process (also depending on how complex your Dev-Test-Acceptance-Production scenario is).
If you aren't using a CMS, then you should consider it. (Of course, if the operation is very small, it may still fall into the category where doing it by hand is acceptable.) Simply putting raw data into a versioning system doesn't solve the problem - you need to be able to control the format in which your data is published to the web app. Almost certainly this format should be something intended for consumption by software, and therefore not usually suitable for hand-editing by the kind of people who write news items or events.