I know, the use case might be specific but more and more stuff in all industry sectors is digitalized—and so is the communication between different departments which sometimes talk in very different languages. I searched the internet, but I wasn't able to find a clear answer (either I didn't find the right search phrases or the internet itself just doesn't know).
Here's my scenario: I'm working with several departments which work with diagrams (for example a lighting setup). This diagram solves different purposes:
which devices are used?
where are they placed?
where are they pointing?
how are they configured (e.g. exposure)?
They tend to export their finalized diagram as either an image or a PDF— which is fine if you want to print it out but considerably less helpful if another department (mine) has to work with the raw information. That's where I wondered if there's some kind of industry standard (SVG, XML, JSON, etc.) which is both supported by the programs these departments used and can be interpreted by some sort of programming language. Do you know anything like that?
Thanks in advance!
Related
My project needs some natural language processing. I'm completely new to the field.
what I'm trying to achieve is that when the User enter the description of the product I look for in my database which description is nearest and suggest that the category, product group and sub-group (the tree of the product).
For this titles 250 extracts products for each subgroup.
What is the specific term in NLP for doing this? I tried googling for a while, but had no luck since I don't know the term. Any good tutorials to start with? Are there any good libraries in doing this specific task?
Thank you.
From what I can tell autocomplete or text prediction/predictive search isn't really a big research area in NLP. It wasn't even covered in any of my graduate level classes and I do research in this area. I think the reason is that there are solutions that exist which are good enough for the vast majority of real world problems.
I'm not sure which language you work in, but the library you want to work with is probably Lucene if you are dealing with java, perhaps setting up a Solr instance if this is a general problem for you and you are dealing with a large number of ontologies.
You can find some reason tutorials/examples here on stack overflow, such as:
How to implements auto suggest using Lucene's new AnalyzingInfixSuggester API?
I've noticed in pretty much every company I've worked that they have a common library that is generally shared across a number of projects. More often than not this has been a single companyx-commons project that ends up as a dumping ground for common programs including:
Command Line Parsers
File Utilities
Framework Helpers
etc...
Some of these are well thought out and some duplicate functionality found in Apache commons-lang, commons-io etc..
What are the things you have in your common library and more importantly how do you structure the common libraries to make them easy to improve and incorporate across other projects?
In my experience, the single biggest factor in the success of a common library is user buy-in; users in this case being other developers; and culture of your workplace/team(s) will be a big factor.
Separate libraries (projects/assemblies if you're in .Net) for different application tiers is essential (e.g: there's obviously no point putting UI and data access code together).
Keep things as simple as possible; what you don't put in a common library is often at least as important as what you do. Users of the library won't want to have to think, so usage needs to be super easy.
The golden rule we stuck to was keeping individual functions focused on a single task - do one thing and do it well (or very very well); don't try and provide something that tries to take every possibility into account, the more reusable you think you're making it - the less likely it is to be used. Code Complete (the book) has some excellent content on common libraries.
A good approach to setting/improving a library up is to do regular code reviews and retrospectives; find good candidates that you've already come up with and consider re-factoring them into a library for future projects; a good candidate will be something that more than one developer has had to do on more that one project (for example).
Set-up some sort of simple and clear governance of the libraries - someone who can 'own' a specific library and ensure it's overal quality (such as a senior dev or team lead).
I have so far written most of the common libraries we use at our office.
We have certain button classes that are just slightly more useful to us than the standard buttons
A database management class that does some internal caching and can connect to ODBC, OLEDB, SQL, and Access databases without even the flip of a parameter
Some grid and list controls that are multi threaded so we can add large amounts of data to them without the program slowing and without having to write all the multithreading code every time there is a performance issue with a list box/combo box.
These classes make it easier for all of us to work on each other's code and know how exactly they work since we all use the exact same interfaces throughout our products.
As far as organization goes, all of the DLL's are stored along with their source code on a shared development drive in the office that we all have access to. (We're a pretty small shop)
We split our libraries by function.
Commmon.Ui.dll has base classes for ui elements.
Common.Data.Dll is sort of a wrapper around Enterprise library Data access classes.
Common.Business is a dumping ground for other common classes that don't fit into one of those.
We create other specialized dlls as needs arise.
I'd like to play around with building a recommendations system, and by that I mean an algorithm that looks at preferences and/or reviews posted by a user and then makes recommendations for them, similar to what netflix or amazon use.
What are some good resources for learning how to write something like this? Where should I start?
Check out the Wikipedia page on the Netflix Prize and its discussion forum. Also, the somewhat related 2009 GitHub Contest is a good source for full source code on a number of different recommendation engines. And obviously there's also the Wikipedia page on the topic itself, which has some decent links.
If you start writing your own, you'll want to use a corpus. I'd actually recommend using the Netflix Prize's data set. Just carve the data set into two pieces. Train on the first piece and score your algorithm on the second piece.
Addenda: A somewhat related and scary application of this sort of thing is predicting demographic information: a user's gender, age, household income, IQ, sexual orientation, etc. You could probably do most of these attributes with the Netflix Prize dataset with a fairly high degree of accuracy. Fortunately everyone in that dataset is just a number.
Take a look at pysuggest a Python library that implements a variety of recommendation algorithms for collaborative filtering (which is used by Amazon.com).
What qualities made it so great, and what made it stand out compared to the not-so-great specs that you've had to deal with? Or, if you've never worked with a good functional spec before, what sort of things would you expect in a great spec?
Sorry this is obviously subjective but I'm creating a functional spec (not my first) and it just occurred to me that I may get some good ideas from the bright folks on SO!
The Project Aardvark specs from Joel on Software are the best I've come across so far. Each screen is defined very well, with pictures. The main features of the software are described, as well as some technical details.
Sadly the specs I've received personally aren't that brilliant. Usually they are just a bulleted list of features they expect from each section of the system, and they expect you to work out all the details. Which is fine, I guess. However, I'm writing a game design document for an RPG game I'm working on as a personal project, and I think the specs I'm writing are very well written. I've divided the game into Sections such as
Characters
Weapons & Armor
Levels
Map
Physics
and so on, and described each section in terms of gameplay as well as some technical details. Its very easy to work through.
I also highly recommend reading the Painless Functional Specs Series from Joel on Software for anyone interested in writing better specs.
IMHO, a key quality should be that the functional spec specifies the "what" in great detail but not the "how". That way, the requestor (marketing?) gets the look & feel and feature set that they want, but the implementation is left to those who know it best -- the developers.
Obviously, the specification should be complete, consistent and comprehensible. IMO it should also be well-organized, in that it keeps all requirements for a specific part of the product together. I've more than once read specifications where requirements for some module were scattered throughout the whole document, e.g. the general description is in chapter 4, but additional requirements can found in clauses in chapters 2, 5, 7 and appendix B. To work with such a specification, I first have to create a cross-reference map of requirements to modules.
A good spec should state what the application is supposed to do, in a clear manner.
This seems obvious, but the stuff I usually get is often very vague. Apparently it's not very easy for people to express what they want on paper, IF they even know what they want.
I was wondering if there exists any open source frameworks that will help me include the following type of functionality to my website:
1) If I am viewing a particular product, I would like to see what other products may be interesting to me. This information may be deduced by calculating for example what other people in my region (or any other characteristic of my profile) bought in addition to the product that I am viewing. Kind of like what Amazon.com does.
2) Deduce relationships between people based on their profile, interaction with one another on the website (via commenting on one another´s posts for example), use of the website in terms of areas most navigated, products bought in common etc.
I am not looking for a open source website with this functionality, but something like an object model into which I can feed information about users and their use of the site including rules about relationships and then at a later point ask it questions described in (1) and (2) above.
Any pointers to white papers / general information about best approaches to do this, or any related links will really help too.
(I am the developer of Taste, which is now part of Apache Mahout)
1) You're really asking for two things here:
a) Recommend items I might like
b) Favor items that are similar to the thing I am currently looking at.
Indeed, Mahout Taste is all about answering a). Everything it does supports systems like this. Take a look at the documentation to get started, and ask any questions to mahout-user#apache.org.
For 1b) in particular, Mahout has two answers:
If you are only interested in what items are similar to the current item, you would be interested in the ItemSimilarity abstraction in Mahout (org.apache.mahout.cf.taste.similarity.ItemSimilarity) and its implementations, like PearsonCorrelationSimilarity. Based on a set of user-item ratings, this could tell you an estimated similarity between any two items. You'd then just pick the most similar items. In fact, look at the TopItems class in Mahout which can just figure this for you quickly.
But also, you can combine a) and b) by computing recommendations, then applying a Rescorer implementation which then favors items that are similar to the currently-viewed item.
2) Yes likewise, you would be interesting the UserSimilarity abstraction, implementations, etc. This would deduce similarities based on item ratings. Mahout however does not help you deduce these ratings by, say, looking at user behavior. This is domain-specific and up to you.
Sound confusing -- read the docs and feel free to follow up on mahout-user#apache.org where I can tell you more.
I am researching the same topic, as I'm working on a project to help people decide how to vote on California's complicated ballot measures. Here are some open-source collaborative filtering engines that I've found:
Vogoo (PHP)
acts_as_recommendable (Ruby on Rails)
Mahout (formerly Taste) (Java)
There's also a good overview of these engines here.
There are also the Duine framework and OpenSlopeOne.
But in my opinion, Mahout is still the best.
You can find a survey about Open Source Recommender Systems here:
http://girlincomputerscience.blogspot.com.br/2012/11/open-source-recommendation-systems.html
Hope it helps!
You can find a List of Recommender Systems here