Big data visualization using "search, show context, and expand on demand" concept [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I'm trying to visualize a really huge network (3M nodes and 13M edges) stored in a database. For real-time interactivity, I plan to show only a portion of the graph based on user queries and expand it on demand. For instance, when a user clicks a node, I expand its neighborhood. (This is called "Search, Show Context, Expand on Demand" in this paper.)
I have looked into several visualization tools, including Gephi, D3, etc. They take a text file as input, but I have no idea how they can connect to a database and update the graph based on user interaction.
The linked paper implemented a system like that, but they didn't describe the tools they were using.
How can I visualize such data with the above criteria?

There are several solutions out there, but basically every one of them uses the same approach:
Create a layer on top of your data source that lets you query it at a high level.
Create a front-end layer that talks to the layer described above.
Use the visualization tool you want.
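To make the "expand on demand" part of this approach more concrete, here is a minimal Python sketch of what the query layer could return when a user clicks a node. Everything in it is an assumption for illustration: the in-memory `graph` dictionary stands in for the real database, and `expand_node`, `visible` and `limit` are hypothetical names, not part of any of the tools mentioned here.

```python
from typing import Dict, List, Set

# Hypothetical stand-in for the database: node id -> neighbor ids.
Adjacency = Dict[str, List[str]]


def expand_node(graph: Adjacency, node: str, visible: Set[str], limit: int = 50) -> dict:
    """Return the part of `node`'s neighborhood that is not on screen yet."""
    neighbors = graph.get(node, [])
    # Cap the number of new nodes so one click cannot flood the browser.
    new_nodes = [n for n in neighbors if n not in visible][:limit]
    shown = visible | set(new_nodes) | {node}
    # Only send edges whose endpoints will both be displayed.
    edges = [(node, n) for n in neighbors if n in shown]
    return {"nodes": new_nodes, "edges": edges}


if __name__ == "__main__":
    demo = {"a": ["b", "c", "d"], "b": ["a"]}
    print(expand_node(demo, "a", visible={"a", "b"}, limit=1))
```

In a real setup the dictionary lookup would be replaced by a database query, and the returned dictionary would be serialized as JSON for the front end to merge into the graph it already shows.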
As miro marchi pointed out, there are several solutions to achieve this goal: some are locked to particular data sources, while others give you much more freedom but require some coding skills.
Datasource
I would start with the choice of the source type: given the type of data, I would probably choose either Neo4J, Titan or OrientDB (if you fancy something more exotic with some sort of flexibility).
All of them offer a JSON REST API, the first with a proprietary system and language (Cypher) and the other two using the Blueprints / Rexster system.
Neo4J supports the Blueprints stack as well, if you prefer Gremlin over Cypher.
For other solutions, such as other NoSQL or SQL databases, you would probably have to code a layer on top with its own REST API. It will work as well, but I wouldn't recommend that for the kind of data you have.
Now, only the third point is left and here you have several choices.
Generic Viz tools
Sigma.js is a free and open source graph visualization tool, and quite a nice one. As far as I know, Linkurious uses a forked version of it in their product.
Keylines is a commercial graph visualization tool with advanced styling, analytics and layouts, and they provide copy/paste demos if you are using Neo4J or Titan. It is not free, but it does support even older browsers, from IE7 onwards.
VivaGraph is another free and open source graph visualization tool, but it has a smaller community than Sigma.js.
D3.js is the general-purpose toolkit for data visualization: you can build basically any kind of visualization with it, but the learning curve is quite steep.
Gephi is another free and open source solution, this time for the desktop. You will probably have to use an external plugin with it, but it supports most of the formats out there: GraphML, CSV, Neo4J, etc.
Vendor specific
Linkurious is a commercial, Neo4J-specific, complete tool to search and investigate data.
Neo4J web-admin console: even if it's basic, they've improved it a lot in the newer 2.x.x versions; it is based on D3.js.
There are also other solutions that I probably forgot to mention, but the ones above should offer a good variety.
Other notes
The JS tools above will visualize well up to 1500/2000 nodes at once, due to JS limits.
If you want to visualize bigger graphs while expanding, I would recommend desktop solutions such as Gephi.
Disclaimer
I'm part of the Keylines dev team.

Sequence Diagram Out Of Existing Java Code [duplicate]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 7 years ago.
I tried (though not very comprehensively) numerous solutions, including ModelGoon (only class and interaction diagrams available), ObjectAid (class diagram only), eUML free edition (quits with an ominous "license not found" error on first use), and MoDisco (where the only option on the menu is "browse corresponding model element"). I also tried some standalone tools: ArgoUML and BOUML either don't provide this feature or at least I was not able to find it. Jsonde started only after I fixed a msvcr71.dll error and was then unable to connect to the VM for reasons unknown. Java Call Tracer is just a bunch of files with pages of options to apply to the JVM directly, and there is no executable.
I also read the following posts on the topic: featuring commercial options, too general (not seq diagrams), also too general, featuring standalone commercial solutions
By "working out of the box" I mean that the default installation is not broken and there is an option like "generate sequence diagram" or similar, resulting in a sequence diagram (a modifiable one would be great).
I am getting the impression that there is simply no such thing (yet?) as a free UML sequence diagram reverse engineering Eclipse plug-in that works out of the box.
Please prove me wrong. Thank you

The other day, I discovered a tool from the University of Victoria called Diver: Dynamic Interactive Views For Reverse Engineering. You can either find a method and create a static sequence diagram starting with that method or you can run an application in a trace mode to capture the sequence diagram for a particular execution of an application.

I'm the initiator of the ModelGoon project, and I'm currently working on building sequence diagrams from a method; I plan a release in a few weeks. However, I don't really know which features users expect. I mean, it is possible to build a very detailed sequence diagram from a method body, but is it really useful? I usually use sequence diagrams "to think something through, either to verify the logic in a use case or to design a method or service" as advised in Agile Modeling.
Can you tell me more about how you use the generated sequence diagram? As you said, it would be better if it were modifiable: what kind of modifications do you expect, and what about code synchronization? What level of detail are you expecting from it?
Have you tried the Netbeans UML Modeling module?
Feel free to contact me via my website.

TPTP seems to be the only real option so far. That crystallized over the last few days after trying a number of different solutions. After installing TPTP from the regular Eclipse update site, follow these steps :
select Profile As from the context menu of a runnable element (e.g. a method, a test, a test case)
select Profile Configurations
select the Profile Setting tab
select Execution Time Analysis as the data collector
in the Profiling and Logging view, select Open with from the context menu of the profiling data (the clock with glasses) and pick UML2 Class Interactions
After that you will have to hide many, many lifelines via right-click to make the diagram remotely readable, while getting annoyed by TPTP's choice of color (light blue on white, which is also very unreadable).
You can then print the diagram, effectively exporting it to pdf, tiff, eps and other formats via your favorite file printer.
There is one big bitter pill to swallow though: what you get is an execution trace in the guise of a sequence diagram. This means no loops, no conditions, no notes and such. Even the diagram title sucks, being a cryptic 50-odd character monstrosity you can't change.
On the other hand, TPTP offers you much more than a sequence diagram. For example, you get a color-coded execution hot spot analysis on the side of the diagram as a bonus.
But it seems that even the expensive tools boasting round-trip code engineering like Enterprise Architect offer nothing more than tracing (and admittedly much nicer graphics). Reverse engineering a real sequence diagram seems to be quite non-trivial.

ObjectAid has a sequence diagram now. It's not free, but not expensive either. It does reverse engineering from source code, stack traces and call stacks in the debugger.

BPMN visualization

We need to visualize a BP (business process) as BPMN, but NOT by hand using a modeler. We need to do it automatically in a web-based CRM system written in PHP. I have input data (e.g. an array or XML, the format does not matter, as long as it is not BPEL), and I need to turn it into a nice BPMN graph (using SVG).
We have a first, nice-looking implementation of it. We use a matrix to draw: we go through the matrix several times and optimize the graph on each pass. It does run fast, but it is not flexible: it is hard to rebuild, upgrade, or extend with new features. We made this algorithm ourselves (I mean we didn't find it on Google or in books). The problem is that we couldn't find any algorithms on the internet. I suppose we don't know the correct keywords. Every attempt led us back to BPEL visualization from BPMN, and "data flow visualization" returned modelers.
Please help us find some algorithms, or give us the correct keywords to search for.

Think you're probably looking for "graph layout algorithms". The only library I'm aware of that can (I think) generate BPMN directly is the yFiles library from yWorks. It's not free. They do however offer a free application using the library that does auto-layout. Perhaps you could do some prototyping with that.
If that's not applicable, there are several other options. I'm not aware that any of these can generate BPMN symbols directly; you'd have to construct the symbols yourself. However, all of them will auto-layout graphs according to various algorithms, and all are open source/free.
graphviz. Written in C. Quite old now but well used, stable and scalable.
tulip. Newer than graphviz. Haven't used it but heard good things about flexibility and scalability.
See also this post for JavaScript-based options.
There are many more, just google for graph layout algorithms / libraries.
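To give a feel for what such layout algorithms do, here is a small Python sketch of one common building block of layered layouts: longest-path layer assignment, which puts every step one column after its latest predecessor. It is not taken from yFiles, graphviz or tulip; the function name `assign_layers` and the sample process are made up for illustration, and the sketch assumes the flow has no cycles (loops in a process would have to be broken first).

```python
from typing import Dict, List, Tuple


def assign_layers(edges: List[Tuple[str, str]]) -> Dict[str, int]:
    """Longest-path layering for an acyclic flow given as (source, target) pairs."""
    nodes = {n for edge in edges for n in edge}
    successors: Dict[str, List[str]] = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1

    layer = {n: 0 for n in nodes}
    # Start with the steps that have no incoming flow.
    ready = sorted(n for n in nodes if indegree[n] == 0)
    while ready:
        current = ready.pop()
        for nxt in successors[current]:
            # A step must sit to the right of every predecessor.
            layer[nxt] = max(layer[nxt], layer[current] + 1)
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return layer


if __name__ == "__main__":
    process = [("start", "check"), ("check", "approve"), ("check", "reject"),
               ("approve", "end"), ("reject", "end")]
    print(assign_layers(process))
```

The layer number can be used as the x coordinate of each BPMN symbol; ordering the steps within a layer and routing the connectors are separate steps that the libraries above handle for you.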
hth.

How do you collaboratively write specs? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I am working with a small team (2 others) of developers that are geographically dispersed, and I'm looking for good ways for us to collaborate on specs... We're thinking we might use Google Docs to write the spec in so we can all have access to modify it in a central location.
What have you done? What good ideas do you have?

If you have an intranet or VPN, I would actually consider installing and using a small Wiki for these specs.
Compared to Google docs you get:
Much better versioning and change tracking (IMHO)
Much easier to start new documents for subsections
An actual markup rather than WYSIWYG (a matter of taste, I prefer LaTeX to Word).
Possible to attach variety of other file types
Very easy to back up
Very easy to create an offline version
You don't have to worry about storing sensitive materials elsewhere.
The disadvantage is that it is not WYSIWYG, which may or may not be an issue to you.
Of course, you can pick a Wiki implementation that supports a better editor, and possibly even a synchronous collaboration one.

Google Wave - exactly what it's meant for - collaboration

IMHO, a word processor is the wrong tool for a programmer. A spec should be written in a plain text editor, and utilize lightweight markup such as reStructuredText, AsciiDoc etc.
The benefits of such an approach are:
There are excellent tools to manage plain text, that are already in the hands of programmers (VCS, automated build systems, diff, patch, programming editors, grep, etc.)
A markup language allows for expressing intent rather than formatting.
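As a small illustration of the first benefit, here is a Python sketch that diffs two revisions of a plain-text spec with the standard library's `difflib`, the same kind of output a VCS would show in a review. The spec text and the helper name `spec_diff` are invented for this example.

```python
import difflib
from typing import List


def spec_diff(old: str, new: str, name: str = "spec.rst") -> List[str]:
    """Return a unified diff between two versions of a plain-text spec."""
    return list(
        difflib.unified_diff(
            old.splitlines(keepends=True),
            new.splitlines(keepends=True),
            fromfile=name + " (old)",
            tofile=name + " (new)",
        )
    )


OLD = "Login\n=====\nUsers sign in with email and password.\n"
NEW = "Login\n=====\nUsers sign in with email and a one-time code.\n"

if __name__ == "__main__":
    print("".join(spec_diff(OLD, NEW)))
```

A word processor file would need a dedicated compare feature for this; with lightweight markup the change shows up as one removed line and one added line.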
With that in mind, a wiki seems to be the obvious choice.
Personally my tool chain of choice is:
reStructuredText as the markup language.
Trac as a Wiki
Firefox + the It's All Text! extension
Emacs + rst-mode

The choice of technology is one issue and Google docs is a good choice IMHO. But the real challenge is how to manage the process e.g. divide the tasks.
My suggestion is to first make sure that the platform and all related technologies are decided upon as well as is feasible. Then, compose a thorough table of contents. A well-designed TOC will allow you to divide tasks properly and not "step" on each other's work. From then on, each of you fleshes out your assigned sections and reviews the others' work.
In effect, each TOC subsection becomes an atomic unit of work that can be assigned and maintained by an individual who is also accountable for said section(s).
Good luck!

I think it depends on
How heavily into writing the specs you all are
If you're likely writing at the same time
Whether you intend to publish the specs.
Google Docs is nice and easy to get started with. It's also great that you can now export folders all at once. Still, for something that's going to be published to the web, a wiki or general CMS is a better presentation vehicle. A wiki will also integrate with your existing site.
If you've got small specs, primarily written by one person then use whatever tool is available where you're hosting the project code or website. If you're not likely to be editing at the same time then a wiki is good.
I've done the wiki thing, the passed document thing and the Google Docs thing.
The wiki thing has a low starting effort and lasts a pretty long time. At a certain size it does get to be a pain.
The passed document thing (write, email, edit, email, etc.) only works while one person is starting everything up. As soon as there are even minor edits, it sucks.
The Google Docs thing is fine until you have several docs and several editors or want to publish it online.
hth

This isn't programming related, but I've personally used Google Docs to write shared documents and found it easy to use.
I would suggest enabling Google Gears however, in the event that the Google servers go down momentarily or an internet connection isn't available.

For writing specs collaboratively, you could try Gingko.
It's a card-tree editor, which means it's a mix between index cards and an outliner, with real-time collaboration and full Markdown support (as well as basic LaTeX).
We are still missing several features (version history, comments, etc), but for some the benefits of having everything in a tree structure outweigh these drawbacks.
Writing specs with it is great, because you can create a card for each user story, and drill into it as much as you like (and organize them into categories if you'd like).
http://gingkoapp.com

Developing an asset/node based CMS [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I'd like to develop a CMS for fun/personal use, using an asset-based architecture rather than a page-based one (why is the purpose of this question), but I can't find much information on the subject. All I've found barely scratches the surface (there's a good chance I'm searching with the wrong terms).
An asset-based CMS stores information as blocks of text called assets. These individual assets are then related to each other to automatically build pages.
What are the (dis/)advantages of such a system?
What are the primary principles of asset-based architecture?
What should and shouldn't be an 'asset'? Where can I read more?

Decided to try to answer this after leaving my comment :)
If your definition of "asset" is along the lines of a "node" (such as in Drupal), or a document (such as the JSON-style documents in MongoDB or CouchDB), then here is some info:
I'll use the term "node" for this post. I think it's closest to "asset" and more popularly used. This also might be a very abstract answer, but hopefully it will at least get you thinking and pointed in the right direction.
Node-based architecture could be described as a cross between neural networking patterns and object-oriented programming. The key is that "nodes" are points of data, and nodes can be connected to each other in some way.
Some architectures will treat nodes much like object-oriented classes, where you have different classes of nodes that can inherit various characteristics of parent nodes - every type of node inherits the basic properties of its parent - an "Essay" node might inherit the properties of a "Text-Document" node, which in turn inherits the properties of the base node. Drupal implements this inheritance model well, although it does not emphasize the connections between nodes in the way that something like Facebook's GraphAPI/Open Graph Protocol does.
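The inheritance idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not how Drupal or any other system implements it; the class names mirror the "Essay" and "Text-Document" example, and the fields (`title`, `body`, `author`, `links`) are assumptions made up for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Base node: every node has a title and can point at other nodes."""
    title: str
    links: List["Node"] = field(default_factory=list)

    def link_to(self, other: "Node") -> None:
        self.links.append(other)


@dataclass
class TextDocument(Node):
    """Inherits title and links from Node, adds a body."""
    body: str = ""


@dataclass
class Essay(TextDocument):
    """Inherits everything from TextDocument, adds an author."""
    author: str = ""


if __name__ == "__main__":
    essay = Essay(title="On Nodes", body="Nodes are points of data.", author="Ann")
    footnote = Node(title="Footnote 1")
    essay.link_to(footnote)
    print(essay.title, "->", [n.title for n in essay.links])
```

The `links` list is where the "connectedness" lives: a page can then be built by starting at one node and following its links.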
This pattern of node-based architecture can be implemented at any level too, and exists in nature - think of social circles within society or ecosystems ;) On a software engineering level, it can take the form of a database, such as how MongoDB simply has nodes of data (which are called documents in that case). These documents can reference other documents, although, like Drupal, Mongo does not emphasize connectedness. Ironically, relational databases like MySQL that are the opposite of document-based databases actually emphasize connectedness more, but that's a discussion for another day. Facebook's GraphAPI that I mentioned above is implemented on a Web-API level. The Open Graph Protocol shapes it. And again, something like Drupal is implemented at the front-end level (although its back-end implements the node pattern on a lower level, of course).
Lastly, node-based architecture is much more flexible than traditional document/page-based CMS architecture, but that also means there is a lot more programming and configuring to be done on the side of the developers. A node-based system will end up being far more interconnected and its components will be integrated with one another at a deeper level, but it can also be more susceptible to breaking because of this deep level of connection: it is less cleanly separated into individual modules. Personally, I see a huge trend where people are moving to become more "node-based" and less "content-based" as people begin to interact with websites more like applications than like the electronic magazines of the '90s. Plus, the node pattern fits well with the increasing emphasis on user contribution and social browsing, because adding people and their accounts/profiles to a web site dramatically increases the complexity.
I know you said "asset," so I'll also say that asset emphasizes the data side of the node pattern more, whereas "node" emphasizes the connections between the pieces of data more.
But for further reading, I'd recommend reading up on the architecture of the software I mentioned. You could also check out node.js, JSON, document-based databases, and graph APIs, as they seem to fit well with this idea of asset/node-based architecture. I'm sure Wikipedia has some good stuff on these patterns as well.

You could very quickly scale this up using the CakePHP framework. It uses an MVC pattern and it provides classes called elements that may be inserted into layouts and can load whatever content you want based on the page, user, moon phase, etc.
<page>
<element calls methodX>
<element calls methodY>
<Default Content relies on Controller Action(view/edit/add/custom)>
<element calls methodZ>
</page>

I think you might be describing a CMS backed up by a content repository.
The repository itself is implemented by Apache Jackrabbit based on JSR 170:
The API should be a standard, implementation independent, way to access content bi-directionally on a granular level within a content repository. A Content Repository is a high-level information management system that is a superset of traditional data repositories. A content repository implements "content services" such as: author based versioning, full textual searching, fine grained access control, content categorization and content event monitoring. It is these "content services" that differentiate a Content Repository from a Data Repository.
For a CMS working on top of a content repository, look at Nuxeo.

Are there any version control systems for 3d models / 3d data? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 7 years ago.
Well, the subject is the question, basically: are there any version control systems out there for 3D models? An open source approach would be preferred, of course.
I am looking for functionality along the lines of subversion however more basic systems would be of interest as well. Basic operations like branching / merging / commit should be available in one form or another.
UPDATE: By an open source approach I don't mean free of charge, but having the option of heavily expanding and customizing the system when needed.
UPDATE2: I don't know how to describe this in the best way but the format of the 3d model is not that important. We will be using IFC models and mostly CAD programs. The approach Adam Davis describes is probably what I am looking for.

This is going to be difficult since most 3D CAD programs do not take into account the possibility of revision, so when you load something and then save it again it may completely re-order the points (there are reasons for this, usually done for performance).
Further, large models represented in a text format are huge files, and will take forever to copy/merge/etc.
There is no current system that will manage this, but there's a really big need in the industry for it.
I would expect such a system would have a model normalizer that converts to and from the desired CAD format and the revision format. It could then handle merges and track changes more easily.
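A very reduced Python sketch of what such a normalizer could do for a bare mesh: put vertices and faces into a canonical order, so that a save which merely re-orders the points yields identical output. This is an illustration only; the name `normalize_mesh` and the plain-tuple data layout are assumptions, it presumes there are no duplicate vertices, and it ignores everything a real CAD format carries beyond points and faces.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]
Face = Tuple[int, ...]


def normalize_mesh(vertices: List[Vertex], faces: List[Face]) -> Tuple[List[Vertex], List[Face]]:
    """Reorder vertices and faces canonically; assumes no duplicate vertices."""
    # Sort vertex indices by coordinate and remember where each one moved.
    order = sorted(range(len(vertices)), key=lambda i: vertices[i])
    remap = {old: new for new, old in enumerate(order)}
    sorted_vertices = [vertices[i] for i in order]

    new_faces = []
    for face in faces:
        indices = [remap[i] for i in face]
        # Rotate so the smallest index comes first; winding order is kept.
        start = indices.index(min(indices))
        new_faces.append(tuple(indices[start:] + indices[:start]))
    return sorted_vertices, sorted(new_faces)


if __name__ == "__main__":
    first_save = ([(0, 0, 0), (1, 0, 0), (0, 1, 0)], [(0, 1, 2)])
    second_save = ([(0, 1, 0), (0, 0, 0), (1, 0, 0)], [(1, 2, 0)])
    print(normalize_mesh(*first_save) == normalize_mesh(*second_save))
```

Once both revisions are in this form, an ordinary text or list diff only reports geometry that really changed, which is what would make merges and change tracking tractable.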
It would also need to output diffs in a form that you could open a "diffed" model in a cad program and the changes are shown in a different color or otherwise highlighted. No one is going to be able to look at a text diff and understand what they're looking at. This diffing program would ultimately need to support understanding that two models are the same even though the 0,0,0 location and rotation are not the same (difficult matching problem) and give the user some interface to allow them to help it when it gets stuck.
You'd probably have to deal with the parts of the model separately (bones, mesh, textures, etc) and have a third file that synchronizes them when converting them to an inclusive model file for use and modification.
It's not a trivial problem... But if you started on something that just handled meshes and open sourced it, you'd probably get a lot of people interested.

Although this question is old, it's still in Google's results for 3D version control. Luckily, in the years that have passed since the question was asked, GitHub has started supporting 3D STL files with visual diffs!

Have a look at http://3drepo.org
It's an open source revision control framework for 3D assets, and highly extensible.

Although I realize it's a slightly different topic, you might be interested in the answers to the question Version Control for Graphics...

Similar to what GingaNinja said, if all you care about is management of binary files at different revisions, most revision control systems will work for you. However, if you are looking for a tool that will display the changes in the actual images, you might be hard pressed to find a recommendation here. I would start by asking on a graphic artists' forum.

There is a diff tool for common 3D formats being released in about a week. It supports dxf/dwg, obj, stl, igs. It may not be perfect since it is still in version 1 but hopefully it can help with your problem. The tool is called Differ3D and it can be found at http://www.blackspiralsoftware.com.
Disclaimer: I work for the company that released this product. We are looking to improve it so any feedback would be welcome.

I was under the impression that SVN is perfect for any kind of project that uses text files. So if your model is made up of text files, then it would be fine.
I don't see how binary data would work, as all version control that I know of makes use of diff management, which uses text comparisons.

3d models and data are just data files whether their format is text or binary. Version control systems can handle both since often you check in libraries etc, which are binary files.
I'm not quite sure what you mean by "open source approach". Do you mean a free solution? There are open source projects which you have to pay for, depending on your usage, e.g. Qt.
Subversion or CVS would store text or binary models and are both free. Subversion is preferable to CVS since it can commit multiple files in change sets. On Windows you can use TortoiseSVN, which is an excellent, free tool set.

If you use Subversion you must remember to lock (assuming the files are binary, which nearly all 3D model formats are). Other than Subversion and other OSS like it, you might look at Gridiron Flow, the new content/workflow management software from Gridiron Software. John Nack of Adobe gave it a rave review.

DXF is a text-file standard (similarish to XML) but I don't think merging these types of files is a particularly good idea.
If you wanted to perform a Diff operation on 2 AutoCAD files, you can programmatically address individual objects by their "Handle" - a unique hex identifier. Location, rotation, scaling, colour etc are properties of the object. CAD drawings are basically an object database. I don't know of any product that does this. Change tracking is a viable proposition but merging would be a lot more complicated.
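To show what change tracking by handle could look like, here is a minimal Python sketch. It does not read DXF or DWG files: it assumes the objects of each drawing have already been extracted into a dictionary keyed by handle, and the function name `diff_by_handle`, the handles and the property names are all invented for the example.

```python
from typing import Any, Dict

# Assumed layout: handle (hex string) -> properties of that object.
Drawing = Dict[str, Dict[str, Any]]


def diff_by_handle(old: Drawing, new: Drawing) -> dict:
    """Report added, removed and changed objects between two drawings."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = {}
    for handle in sorted(old.keys() & new.keys()):
        before, after = old[handle], new[handle]
        # Keep only the properties whose value differs, as (old, new) pairs.
        delta = {
            key: (before.get(key), after.get(key))
            for key in sorted(before.keys() | after.keys())
            if before.get(key) != after.get(key)
        }
        if delta:
            changed[handle] = delta
    return {"added": added, "removed": removed, "changed": changed}


if __name__ == "__main__":
    v1 = {"1A": {"type": "LINE", "color": 1}, "1B": {"type": "CIRCLE", "radius": 2.0}}
    v2 = {"1A": {"type": "LINE", "color": 3}, "1C": {"type": "ARC"}}
    print(diff_by_handle(v1, v2))
```

This covers the change-tracking half only; merging two sets of edits to the same object is the part that gets a lot more complicated.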
