Is there any open source tool that automatically 'detects' email threading like Gmail? [closed] - email

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
For instance, if the original message (message 1) is...
Hey Jon,
Want to go get some pizza?
-Bill
And the reply (message 2) is...
Bill,
Sorry, I can't make lunch today.
Jonathon Parks, CTO Acme Systems
On Wed, Feb 24, 2010 at 4:43 PM, Bill Waters wrote:
> Hey Jon,
> Want to go get some pizza?
> -Bill
In Gmail, the system (a) detects that message 2 is a reply to message 1 and turns this into a 'thread' of sorts and (b) detects where the replied portion of the message actually is and hides it from the user. (In this case the hidden portion would start at "On Wed, Feb..." and continue to the end of the message.)
Obviously, in this simple example it would be easy to detect the "On <Date>, <Name> wrote:" line or the ">" character prefixes. But different email systems have many different styles of marking replies (not to mention HTML emails). I get the feeling that you would need some damn smart string parsing algorithms to get anywhere near as good as Gmail's.
Does this technology already exist in an open source project somewhere? Either in some library devoted to this exclusively or perhaps in some open source email client that does similar message threading?
Thanks.

There's a good article written by Zawinski here:
http://www.jwz.org/doc/threading.html
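As far as I understand the article, its core idea is to link messages through the Message-ID, In-Reply-To and References headers rather than through the body text. Here is a much simplified sketch of that idea in Python. It is not the article's full algorithm, and `build_threads` is a name made up for illustration.

```python
from email import message_from_string


def build_threads(raw_messages):
    """Group raw RFC 822 messages into threads keyed by the root Message-ID."""
    parent_of = {}  # message id -> id of the message it replies to
    known = []      # message ids in the order they were seen
    for raw in raw_messages:
        msg = message_from_string(raw)
        msg_id = (msg.get("Message-ID") or "").strip()
        if not msg_id:
            continue  # no id: a fuller implementation needs other heuristics
        known.append(msg_id)
        # References lists ancestors oldest first, so the last entry is the
        # direct parent; In-Reply-To is used as a fallback.
        refs = (msg.get("References") or "").split()
        reply_to = (msg.get("In-Reply-To") or "").split()
        parent = refs[-1] if refs else (reply_to[0] if reply_to else None)
        if parent:
            parent_of[msg_id] = parent

    def root(msg_id):
        seen = set()  # guards against reference loops
        while msg_id in parent_of and msg_id not in seen:
            seen.add(msg_id)
            msg_id = parent_of[msg_id]
        return msg_id

    threads = {}
    for msg_id in known:
        threads.setdefault(root(msg_id), []).append(msg_id)
    return threads
```

This only works when the sender's mail client sets those headers; messages without them end up as their own single-message threads.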

I believe Gmail works by subject title. I can't check it at the moment, but a quick change to the title might break the threading.
The following is difficult to predict, as you mention:
On Wed, Feb 24, 2010 at 4:43 PM, Bill Waters wrote:
but grabbing the email title Pizza tomorrow and assuming a prefix of Re: Pizza tomorrow is considerably more predictable. You could also assume the cases of FW: and RE: (in caps).
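A rough sketch of that subject-based grouping, assuming the only prefixes to strip are Re:, Fw: and Fwd: in any capitalisation (real mail clients use more variants, including localized ones). The function names are made up for illustration.

```python
import re

# Reply/forward markers assumed for this sketch: Re:, Fw:, Fwd: (any case),
# possibly repeated, e.g. "FW: RE: Pizza tomorrow".
_PREFIXES = re.compile(r"^\s*(?:(?:re|fwd?)\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject):
    """Strip leading reply/forward markers and normalize case and whitespace."""
    return _PREFIXES.sub("", subject).strip().lower()


def group_by_subject(subjects):
    """Map each normalized subject to the original subjects that share it."""
    groups = {}
    for subject in subjects:
        groups.setdefault(normalize_subject(subject), []).append(subject)
    return groups
```

Two unrelated conversations that happen to share a subject would be merged by this, so it is best treated as a fallback when the headers are missing.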

Do you mean to solve problems where the correspondent doesn't set In-Reply-To: or References: header fields?
Otherwise, you might use mutt and configure it to not show quotes by default.
(Any other mail tool on earth should be able to do this too. Well, I never got a tree thread view in Outlook.)
[edited below in reaction to comment]
If you are trying to build your own software, then this question is obviously well suited. But then I can only give you my 2c on this. If you cannot rely on the explicit headers, then the only thing to do is take a bunch of mails and learn the most common phrases used to indicate quotes. (Luckily there are some conventions, and date formats and names/emails are not completely arbitrary.)
If you do this for analysis of communication threads, you probably want to indicate the likelihood of the relation. If you only do it for the convenience of the user... well... my personal opinion? Don't sweat over people who are not able to use a decent mail tool.
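To give an idea of the phrase-based approach, here is a minimal sketch that only knows the two conventions from the question: an "On <Date>, <Name> wrote:" attribution line and ">" quote prefixes. The pattern and the `split_reply` name are assumptions for illustration; a real tool would need a much longer list of patterns learned from actual mail.

```python
import re

# Only the Gmail-style attribution line from the question is covered here.
ATTRIBUTION = re.compile(r"^On .+ wrote:\s*$")


def split_reply(body):
    """Return (new_text, quoted_text) for a plain-text reply body."""
    lines = body.splitlines()
    for i, line in enumerate(lines):
        # The quoted part starts at the attribution line or the first ">" line.
        if ATTRIBUTION.match(line) or line.lstrip().startswith(">"):
            return "\n".join(lines[:i]).rstrip(), "\n".join(lines[i:])
    return body.rstrip(), ""
```

Everything from the first matching line to the end is treated as quoted, so replies written inline between quoted lines are not handled by this sketch.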

What kind of Mail Delivery Agent are you using?
Are you developing your own? In that case, are you planning to implement the IMAP protocol?
If you're using Cyrus (or any other product that handles IMAP) with SORT and THREAD extensions, then it's already built in.
In both cases, you should take a look at RFC 5256.

You could have a look at sup (http://freshmeat.net/articles/sup-gmail-meets-the-console), as it does almost what you want.

Related

GitHub vs Google Code for a hobby project [closed]

Closed 9 years ago.
Note: I have seen this and tried to take as much from it as possible; but I believe my context is different.
I am working on a small-ish project. Call it Foobar. I want to be more organised about this one. I've tried a few projects, mostly as an unorganised programming-as-a-light-hobby student. I'm trying to get more organised; 90% of those projects went away either because I failed to document them at all or because I lost them.
As such, I've been thinking about getting version control/hosting going. Not only will it organise me more, but (a big if here) if it gets anywhere into a usable state, it will be easier for people to get.
The two places I'm considering are Google Code and GitHub. From the question I linked:
Google Code:
- As with any Google page, the complexity is almost non-existent
- Everyone (or almost everyone) has a Google account, which is nice if people want to report problems using the issues system
GitHub:
- May (or may not) be a little more complex (not a problem for me though) than Google's pages but...
- ...has a much prettier interface than Google's service
- It needs people to be registered on GitHub to post about issues
- I like the fact that with Git, you have your own revisions locally
From this I'm leaning towards GitHub, as Google Code doesn't look appealing to me.
For a small hobby project - basically making community features irrelevant - are there features that should take me over to one side or the other?
I prefer Google Code since it's just easier for my small personal projects. At the end of the day, for free projects, it's hard to steal time from family, friends or other commitments and the key to making small free projects a success is being realistic with your time. (Elsewise, you get the "80% done" problem.)
Google Code now has Git support.
Biggest advantage of Google Code is that you don't need a website.
- The frontpage of the project is enough.
- You can add simple binary downloads in the Downloads section.
- In comparison, GitHub's interface is REALLY confusing to non-programmers. Your frontpage is full of technobabble, so unless it's a coder's tool, you'll need a separate website.
- Marketing's really good: you get a good rank on Google, and often you'll be picked up and sometimes reviewed by other download sites. There's no sense donating your time if no one can find your project.
If it is entirely a coder's tool (not just a handy IT tool), then perhaps GitHub is better.
You say "I believe my context is different", but don't give any reasons why it is. As such, I can't offer you any specific suggestions other than the generic pros and cons, which are outlined in various documents and tutorials online.
My suggestion: pick a program first (git, Mercurial, or SVN) and use it. Find a hosting site that supports the software (at the time of this answer, GitHub for git, BitBucket or Google Code for Mercurial, Google Code for SVN) and use it. If you run into problems, switch to another one.
I've used all three, and typically the problem isn't the hosting, but the fact that you need to learn the program itself. All of the hosting providers listed here will suit you fine until you have a specific reason why it doesn't.
I would go with GitHub. The single reason for this is that Google Code shows your email and your full name (name only if you have Google+, I think), and you cannot disable this at the moment.
Let's split the problem into two parts: for developers and for users.
In fact, if only end users are considered, both Google Code and GitHub have friendly interfaces, and as we all know, Google is better known among those who do not program.
But when we turn to programmers, Git is more fashionable and arguably more comfortable.
So, personally, I would choose Google Code if I were planning an end-user-oriented product, and GitHub of course if I wanted to involve lots of potential collaborators or if I were developing a product purely for programmers, such as an API.

Suggestions on project planning? [closed]

Closed 9 years ago.
This is not a programming question per se, but here it goes. I am a senior CS undergrad, and I started an internship this summer for a mid-sized software company. I've done a few freelancing jobs before, but it's the first time I've been officially (more or less) employed as a software developer.
I've been asked to code an internal website from scratch to be used by separate teams in the company, and I've been given a lot of flexibility in designing it. And therein lies the problem: we've had several meetings and design reviews and everyone seems to have an idea on a new feature, and even conflicting ideas on how things should work.
So far my initial prototype has survived all of this, which is something I was told not to expect - but I knew I had a solid design. While I am not behind schedule, the work is progressing significantly slower than I had predicted. A lot of this has to do with loose specs and constant features requests and changes.
I am to deploy an alpha in a couple of weeks, which I thought wouldn't be a problem, but the way things are going I am not sure how that's going to work out.
Does anyone have any ideas?
Thanks in advance
You are asking a timeless question about (software) project management. There are careers made writing books on the subject.
I agree generally with rockinthesixstring on this.
If you don't have an effective project manager who can filter the customers' requests, manage their expectations, and say "no", then that will have to be part of your job.
Sometimes there is an art to not saying "no". Sometimes you can say it more like "As you see in the schedule, version 1.1 is going alpha next week. The feature list for version 1.2 is already set. I'll add your new feature to the top of the list for 1.3. But if you like, I can call a meeting with the other teams to see if we can reprioritize the 1.2 features."
As to conflicting ideas, if there is no other "decider", then that becomes part of your job as well.
Understand that not everyone will get their way.
Without an approach that addresses these sorts of issues you simply won't succeed by any measure.
I'd start by locking in the features that have been agreed upon and placing all feature requests into some sort of project planning software (OnTime, perhaps). Then roll out the alpha release with the agreed-upon specs before moving on to the "we'd like" and the "bells and whistles".
You need to prioritize and triage feature requests, possibly even some of the ones you have already agreed to.
It sounds like product ownership is not clear (as can be expected with internal projects with multiple teams). You should probably run some form of planning game. If you have multiple stakeholders, you might give them each 50 points to vote on all the features in an iteration.
You, as developer, decide how large each feature is. The features with most points/size get into the iteration. If some teams are more important, give them more points. You should also spend some points yourself.
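The arithmetic of such a planning game can be sketched as follows. This is only one possible reading: the `capacity` limit, the feature names and the numbers are assumptions for illustration, and nothing here is tied to a particular planning tool.

```python
def plan_iteration(votes, sizes, capacity):
    """Pick features by points per unit of size until capacity is used up.

    votes:    {stakeholder: {feature: points}}, e.g. 50 points per stakeholder
    sizes:    {feature: size}, estimated by the developer
    capacity: total size that fits in one iteration (assumed for this sketch)
    """
    totals = {}
    for ballot in votes.values():
        for feature, points in ballot.items():
            totals[feature] = totals.get(feature, 0) + points

    # Highest points/size ratio first.
    ranked = sorted(totals, key=lambda f: totals[f] / sizes[f], reverse=True)

    chosen, used = [], 0
    for feature in ranked:
        if used + sizes[feature] <= capacity:
            chosen.append(feature)
            used += sizes[feature]
    return chosen


votes = {
    "sales": {"search": 35, "export": 15},
    "support": {"search": 10, "login": 40},
    "developer": {"login": 20, "export": 30},
}
sizes = {"search": 4, "export": 5, "login": 3}
print(plan_iteration(votes, sizes, capacity=8))
```

Giving a more important team a larger budget than 50 points weights its ballot without changing the code.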
I would like to express my approval of James McLeod's post. The only justification anybody needs for wanting a feature is "user x might try to...". The difficulty is resolving contradictions between their opinions and somebody else's. The feature with a higher 'priority' as assigned by your project management process is the one that gets implemented, at the expense of its competitors if necessary. Ask the people suggesting features to go away and put something down on paper explaining the reasoning behind the feature and the circumstances they think might preclude its inclusion. Letting everybody else see what limitations they think their approach has could help break a decision deadlock. The feature whose case is stated more thoroughly 'wins'.

Which browsers/plugins block HttpReferer from being sent? [closed]

Closed 10 years ago.
I am trying to interpret HttpReferer strings in our server logs. It seems like there is quite a high number of empty values.
I am wondering how many of these empty values are due to direct hits from people entering our URL directly into a browser and how many might be due to some kind of blocking utility that prevents the Referer from being sent.
I really have no idea how many people are using tools or browsers or 'anonymizers' that might block the Referer. Any input?
I personally disable it using "Web Developer" extension of Firefox, only because of some "helpful" sites that highlight the search terms that I used to get to that page.
Thanks, I am fully capable of installing a highlighter plugin, or search for the words inside your page.
I think a large proportion may actually be caused by ISPs' restrictions. I know my ISP (BT, in the UK) filters it out (probably at the router) which is bloody annoying at times.
As it turns out, the block is actually put in place by Zone Alarm, a software firewall, which is often supplied by ISPs.
Opera has a quick toggle in the F12 menu that lets you switch "Send Referrer Information" on or off for the site(s) you're surfing.
I used to log all this stuff in my blogging app - pretty much all bots never send referrer info.
You should be able to make an educated guess as to whether it's down to it being filtered out or just people entering the URL.
If the first hit has no referrer but the loading of images/CSS etc has referrer info then they just entered the URL directly.
If they only ever pull down HTML with no images or CSS they are most likely a bot (or using Lynx perhaps).
If they pull down HTML, images and CSS with no referrer then it's being filtered out.
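The three rules above could be sketched roughly like this. The `(path, referer)` tuple layout, the list of asset extensions and the label names are assumptions for illustration; adapt them to whatever your log format gives you.

```python
# File extensions treated as page assets in this sketch.
ASSET_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".gif")


def classify_visitor(hits):
    """Classify one visitor from their hits, given in request order.

    hits: list of (path, referer) tuples; referer is "" or None when absent.
    """
    if not hits:
        return "unknown"
    if hits[0][1]:
        return "referred"  # the first hit already carried a referrer
    asset_referers = [
        referer for path, referer in hits
        if path.lower().endswith(ASSET_EXTENSIONS)
    ]
    if not asset_referers:
        return "bot-or-text-browser"  # HTML only, no images or CSS
    if any(asset_referers):
        return "direct"  # assets carry a referrer, so it is not being stripped
    return "filtered"    # nothing carries a referrer at all
```

Grouping log lines into "one visitor" (by IP and user agent, for example) is left out here and is a guess in itself.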
Some badly behaved antivirus software has also started doing this for "security" reasons.
We had an email form that used referrer tracking to eliminate the bulk of the random bot spam, and some people moaned that it didn't work.
Not entirely wonderful, but there are far more good uses of the referrer header than just 'let's be evil and watch where people came from', and those legitimise it.
(Some antivirus packages have been known to stop email working altogether, for instance, and the clients will ring you and tell you it's your fault until you tell them to get rid of their rubbish 'I've never heard of that company before' antivirus for the 40th time, and they listen and their problem magically resolves.)
Addendum on security
Referrer tracking is very useful for keeping state within a site (without needing cookies).
Referrer tracking is very useful for confirming that a user's origin was the site itself (without needing cookies).
Though I do see a legitimate privacy concern with 3rd-party sites leaking data via the referrer, and the recipient seeing that.
So:
3rd-party => site # referrer preferred blank
local => local # referrer preferred kept
At least here you can easily distinguish between a "hotlink" from an external source and an internal link.
Also, because of this, cross-domain referrals from SSL websites are blocked by default by some browsers.
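Telling an internal link from an external "hotlink" comes down to comparing the host in the Referer header with your own host. A small sketch, assuming an exact host match is good enough (`referrer_kind` and the host names are made up for illustration):

```python
from urllib.parse import urlparse


def referrer_kind(referer, site_host):
    """Return "none", "internal" or "external" for a Referer header value."""
    if not referer:
        return "none"
    # urlparse(...).hostname is already lowercased and has no port.
    host = urlparse(referer).hostname or ""
    return "internal" if host == site_host.lower() else "external"
```

Subdomains such as www.example.com versus example.com count as different hosts here; whether they should is a policy decision for your site.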

Family Website CMS [closed]

Closed 11 years ago.
I am looking for a CMS that would be incredibly user-friendly and would have the following features:
really simple message board (no login required)
family tree
story telling area
photo section
news section
Is there anything out there like this that is really easily configurable? I've already messed around with Mambo and Family Connects, but I didn't like either of those. In the past I've just programmed my own websites, for lack of easily implementable features. However, I'm assuming there's something out there that does just what I need and that I can't find. Thanks.
I don't want anyone to have to log in, for one. This is for a family website, and much of my family really doesn't know what a website is, let alone how to use one. I want a super simple website with huge buttons and not a whole lot of distractions. Family Connects is a good example of what I want, except the photo album is horrible. I want people to post messages without logging in or signing up, and I haven't seen that ability in the Mambo sites I've looked at.
I can understand your stipulation that your users (family) shouldn't have to sign up - but without a sign-in, your site will be a free-for-all for spammers, hackers and other bored Internet denizens.
That said, my suggestion is to use WordPress for a front end - register your family members yourself, and use a very basic template - or better yet, create one.
I have created a CMS for exactly what you are looking for. My family uses it all the time, and the majority of them are not computer savvy. The only downside is that it requires a login, but as other people have said, there really isn't a way around that if you want your information to be private.
Anyway, if you are still looking, try http://www.familycms.com/
I've been using http://www.myfamily.com/ and it fits all my needs. It includes:
Pictures (with option to order prints)
Discussion
Family Trees (free from ancestry.com)
Videos
Files
Events
I've set up CMS Made Simple a couple of times now. It's all PHP and you can edit it to your heart's content. Give it a try.
CMS Made Simple seems to be dying, according to a study about content management systems found on MytestBox.com.
But if it's just for a family website...
maybe you can try one of the other CMSs that most web hosting companies provide (like Joomla or WordPress).
These can be installed in a few clicks (especially WordPress: you can build a good site in WordPress and it's very easy to maintain).
For a family website I think WordPress is the best choice and is enough (lots of plugins and skins can be found for it on the web).
If you're going for a family website, you do have the option of removing the usernames/passwords/accounts by setting it up as an intranet site. Then you can browse it at home or from selected addresses.
I recommend geni.com. It's much better than Myfamily.com

Telligent's Community Server [closed]

Closed 10 years ago.
The company I work for wants to add blog functionality to our website, and they were looking to spend an awful amount of money to have some crap built on top of a CMS they purchased (Sitecore). I pointed them to Telligent's Community Server, and we had a sales-like meeting today to get the Marketing folks on board.
My question is if anyone has had issues working with Community Server, skinning it and extending it?
I wanted to explain a bit why I am thinking of Community Server: the company wants multiple blogs with multiple authors. I want to be out of the admin part of this as much as possible, and I didn't think there were many engines where having multiple blogs doesn't mean db work. I also like the other functionality that Community Server provides and think the company will find it useful, particularly the media section, as right now we have a really shoddy way of dealing with whitepapers and stuff.
edit: We are actually using the Sitecore blog module for a single blog on our intranet (which is actually what the CMS is serving). Some reasoning for why I don't like it for our public site are they are on different servers, it doesn't support multiple authors, there is no built in syndication, it is a little flimsy feeling to me from looking at the source and I personally think the other features of Community Server make its price tag worth it.
another edit: I need to stick to .NET software that runs on SQL Server in my company's case, but I don't mind seeing recommendations for others. ExpressionEngine looks promising; I will try it out on my personal box.
I've done quite a few projects using Community Server. If you're okay with the out-of-the-box functionality, or you don't mind sticking to the version you start with, I think you'll be very happy.
The times I've run into headaches using CS is when the client wants functionality CS does not provide, but also insists on keeping the ability to upgrade to the latest version whenever Telligent releases an update. You can mostly support that by making all of your changes either in a separate project or by only modifying aspx/ascx files (no codebehinds). Some kind of merge is going to be required though no matter how well you plan it out.
Community Server itself has been very solid for me, but if all you need is a blogging engine then it may be overkill. Skinning it, for example, is quite a bit of work (despite their quite powerful Chameleon theme engine).
I'd probably look closer at one of the dedicated blog engines out there, like BlogEngine.NET, dasBlog or SubText, if that's all you need. Go with Community Server if you think you'll want more "community-focused" features like forums etc.
You can also take a look at Telligent Graffiti CMS.
http://graffiticms.com/
It supports multiple blogs and authors.
Update: It's now open source and available at http://graffiticms.codeplex.com/
Community Server 2008.5 lets you add several members that can post articles. Also, with Community Server 2008.5 you now have wikis along with forums and the blogs. This probably has one of the better web-based admin control panels I've seen in a while. This lets you easily change several things, including the site's theme (or skin). To me it is one of the most scalable applications I have seen in a while. We are using it for our site http://knowledgemgmtsolutions.com.
Skinning is pretty straightforward, and the sidebar widgets aren't very difficult to create (if you don't mind building controls in code). The widgets also allow options for the users to customize them in the control panel very easily. I doubt you'll find a strong community of widget builders for Community Server, however. Nothing compared to the dev community for blogs like WordPress.
I recommend starting templates from scratch and adding in CS controls as needed, to get the markup you prefer for styling and to use only what you need.
Setting up different roles for users to post to different blogs is also very easy and requires no coding. You can have blog groups, and allow only certain users to post to certain blogs.
Sitecore's Forum module is powered by Community Server and integrated with Sitecore CMS.
Expression Engine with the Multi-Site Manager works great for that kind of situation.
Have you had a look at the Shared Source blog module for Sitecore?