I've been tasked with looking into moving an email function we have in place, which sends emails using uuencoding, to something more widely accepted. Apparently there have been issues where recipients are not receiving attachments (.csv files) because they are uuencoded.
I'm guessing that we'd want to switch this to MIME encoding?
I wanted to get some suggestions, and perhaps some good starting places to look for something like this.
Yes, you'll want to switch to MIME. You should know, though, that MIME is not an encoding in the same way that UUEncode is an encoding. MIME is essentially an extension to the RFC 822 message format.
You don't specify which language you plan to use, but I'd recommend looking at one of the two MIME libraries I've written, as they are among the (if not the) fastest and most RFC-compliant libraries out there.
If you plan to use C or C++, take a look at GMime.
If you plan to use C#, take a look at MimeKit.
The only other decent MIME libraries I can recommend are libetpan (a very low-level C API) and vmime (an all-in-one C++ library which does MIME, IMAP, SMTP, POP3, etc).
The only "advantage" that libetpan has over GMime is that it implements its own data structures library that it uses internally instead of doing what I did with GMime, which is to re-use a widely available library called GLib. GLib is available on every platform, though, so it seemed pointless for me to reinvent the wheel - plus GLib offers a ref-counted object system which I made heavy use of. For some reason I can't figure out, people get hung up on depending on GLib, complaining "omg, a dependency!" as if they weren't already adding a dependency on a MIME library...
Oh... I guess if you are using Java, you should probably look at using JavaMail.
Beyond that, there are no other acceptable MIME libraries that I have ever seen. Seriously, 99% of them suffer from the same design and implementation flaws that I ranted about in a recent blog post. While the blog post is specifically about C# MIME parsers, the same holds true for all of the JavaScript, C, C++, Go, Python, Eiffel, etc. implementations I've seen (and I've seen a lot of them).
For example, I was asked to look at a popular JavaScript MIME parser recently. The very first thing it did was to use strsplit() on the entire MIME message input string to split it by "\r\n". It then iterated through each of the lines strsplit()'ing again by ':', then it strsplit() address headers by ',', and so on... it literally screamed amateur hour. It was so bad that I could have cried (but I didn't, because I'm manly like that).
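To give the original question a concrete starting point, here's a rough sketch of sending a .csv attachment as a proper MIME message, using Perl's MIME::Lite (just one illustration, since the question doesn't name a language; the addresses and filename are made up):

    use MIME::Lite;

    # A multipart message: plain-text body plus a CSV attachment.
    my $msg = MIME::Lite->new(
        From    => 'reports@example.com',
        To      => 'recipient@example.com',
        Subject => 'Weekly report',
        Type    => 'multipart/mixed',
    );

    $msg->attach(
        Type => 'TEXT',
        Data => "This week's report is attached.\n",
    );

    $msg->attach(
        Type        => 'text/csv',
        Path        => 'report.csv',
        Filename    => 'report.csv',
        Disposition => 'attachment',
        Encoding    => 'base64',    # widely supported, unlike uuencode
    );

    $msg->send;    # defaults to sendmail; $msg->send('smtp', 'mail.example.com') for SMTP

The point is that the library, not your code, takes care of the multipart structure, headers, and transfer encoding.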
I am looking to start Web Programming in Perl (Perl is the only language I know). The problem is, I have no prior knowledge of anything to do with the web, except surfing it. I have no idea where to start.
So my questions are...
Where do I start learning Web Programming? What should I know? What should I use?
I thank everybody in advance for answering and helping.
The key things to understand are:
What you can send to browsers
… or rather, the things you intend to send to browsers, but having an awareness of what else is out there is useful (since, in complex web applications in particular, you will need to select appropriate data formats).
e.g.
HTML
CSS
JavaScript
Images
JSON
XML
PDFs
When you are generating data dynamically, you should also understand the available tools (e.g. the Perl community has a strong preference for TT for generating HTML, but there are other options such as Mason, while JSON::Any tends to be my go-to for JSON).
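As a taste of TT, here's a minimal sketch (the inline template is just for illustration; real applications keep templates in separate files):

    use Template;

    my $tt = Template->new or die Template->error;

    # [% ... %] is TT's directive syntax; the html filter escapes output.
    my $template = '<h1>Hello, [% name | html %]!</h1>';

    $tt->process(\$template, { name => 'World <3' }, \my $output)
        or die $tt->error;

    print $output;    # <h1>Hello, World &lt;3!</h1>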
Transport mechanisms
HTTP (including what status codes to use and when, how to do redirects, what methods (POST, GET, PUT, etc) to use and when).
HTTPS (HTTP with SSL encryption)
How to get a webserver to talk to your Perl
PSGI/Plack if you want modern and efficient (see the sketch after this list)
CGI for very simple
mod_perl if you want crazy levels of power (I've seen someone turn the Apache HTTPD into an SMTP spam filter using it).
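To give a flavour of the PSGI option: the entire interface is a code ref that takes an environment hash and returns a three-element array of status, headers, and body. A minimal app (save as app.psgi, run with plackup app.psgi):

    # app.psgi
    my $app = sub {
        my $env = shift;    # request info: REQUEST_METHOD, PATH_INFO, etc.

        return [
            200,                                   # HTTP status code
            [ 'Content-Type' => 'text/plain' ],    # response headers
            [ "Hello from $env->{PATH_INFO}\n" ],  # body, as chunks
        ];
    };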
Security
How to guard against malicious input (which basically comes down to knowing how to take data in one format (such as submitted form data) and convert it to another (such as HTML or SQL)).
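The two conversions you'll do most often in Perl are HTML-escaping and SQL placeholders; a quick sketch (the database and table names are invented):

    use HTML::Entities qw(encode_entities);
    use DBI;

    my $form_input = q{<script>alert("pwned")</script>};

    # HTML context: escape before the value touches markup.
    print '<p>You said: ', encode_entities($form_input), "</p>\n";

    # SQL context: placeholders let the driver do the quoting.
    my $dbh = DBI->connect('dbi:SQLite:dbname=app.db', '', '',
                           { RaiseError => 1 });
    my $sth = $dbh->prepare('SELECT id FROM comments WHERE body = ?');
    $sth->execute($form_input);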
Web Frameworks
You can push a lot of work off to frameworks, which provide structured ways to organise a web application.
Web::Simple is simple
Dancer seems to be holding the middle ground (although I have to confess that I haven't had a chance to use it yet)
Catalyst probably has the steepest learning curve but comes with a lot of power and plugins.
Depending on the complexity of your project, you could have a look at Catalyst MVC. It is a good framework that takes care of most of the request-handling drudgery while still giving you a good in-depth view of what's going on.
There is a good tutorial on CPAN.
If you want to start with mod_perl or CGI, there are also some tutorials:
mod_perl
CGI Doc
If you're looking to try some web programming in Perl, you could try hosting a Dancer app for free on OpenShift Express.
There's even a "Dancer on OpenShift Express" repo to get you started: https://github.com/openshift/dancer-example
Hey all, this is more of a philosophy question than anything. I have written my first web app, a Boston event calendar at http://bloozit.com. I wrote it in Python, and I'm pretty happy with the simplicity of it; however, my Python contains blobs of HTML, CSS, and Javascript, like a stew with fish heads and eyeballs floating in it.
Then I saw this tutorial on web apps using Lisp: http://www.adampetersen.se/articles/lispweb.htm I've read On Lisp and written some software in Lisp, so I'm no Paul Graham, but I am familiar with it. One thing that appealed to me tremendously was how the author generated both the HTML and the Javascript from Lisp, making the whole thing nice and homogeneous.
The question is, how valuable is that homogeneity in the real world? Whenever anything goes wrong, you have to load the page up in Firebug, and then you'll be looking at the generated HTML, CSS, and Javascript, not the Lisp source, so you have to hold the mapping in your head. Does the homogeneity provided by Lisp really solve anything, or just wallpaper over the problem, which eventually pops up again downstream?
If there's anyone out there who's actually tried both approaches, I'd REALLY like to hear from you!
Well, I spent a year coding with parenscript and ht-ajax and eventually gave up and just generated the javascript by hand (still using hunchentoot on the server). I found that the result was much more predictable and, as you imply in your question, this made it a lot easier to figure out what was going on when using firebug. (I also switched to using jquery, which was much better than ht-ajax, but that's not really what you're asking).
That said, I massively recommend cl-who (http://weitz.de/cl-who/), which makes the dynamic generation of HTML much neater.
The question is, how valuable is that homogeneity in the real world?
Probably fairly significant: look at all the people doing server-side Javascript these days. Javascript isn't superlative at anything, and its library support for server-side code isn't that great at all, but there are big gains to be had by using it.
Whenever anything goes wrong, you have to load the page up in Firebug,
Depends on what the "anything" is. I can't actually remember the last time I had to open up Firebug to see what's going wrong -- I've certainly been through phases where it was a regular thing, but there are also plenty of times when it's not.
For example, if you generate your CSS from s-exps, then trouble with your CSS might only make you need to look at the "compiled" CSS for weird syntax issues (like IE6 tricks). If you just look at the page and decide you need an extra border, then you can add (:border 1) and be done with it. (Of course, if you process that to generate a whole set of CSS rules to serve to the client, then it's an even bigger win.)
Another way to think about it: on very rare occasions I've needed to pull out a packet sniffer and a disassembler when working on a modern web app. Yeah, it sucks, but with good libraries, it's also very uncommon. I wouldn't rather write low-level code all day just to avoid the impedance mismatch of switching to a packet sniffer on the rare occasion when I do need that level of information.
This assumes that you want to and can get to a level where you're writing (V)HLL code. Common Lisp can't beat C at being C, and if you're just trying to spit out a simple blog in HTML then you're not in the sweet spot there, either: Rails is really good at that kind of thing already. But there's plenty of experimental programming where being able to change one flag and run code on the client rather than the server is useful.
and then you'll be looking at the generated HTML, CSS, and Javascript, not the Lisp source, so you have to hold the mapping in your head. Does the homogeneity provided by Lisp really solve anything, or just wallpaper over the problem, which eventually pops up again downstream?
I've written all-Lisp and all-Javascript web apps, and I think the best answer I can give right now is: it could. I've used Parenscript, and the major problem is that Parenscript is a Lisp-y language but it's not Common Lisp, nor is it any other complete language you can use on the server side. (If there was a Common Lisp to Javascript compiler, like GWT is for Java, then it'd be great. I don't see anyone seriously trying to make one, though.) So you've still got, as you observe, two languages.
Javascript is a bit better today, in this regard, because you can run exactly the same code in both places. It's not quite ideal because your server-side Javascript probably has features that you can't guarantee will exist on the client side (unless you limit your users to, say, recent versions of Firefox). If you're like me, you don't want to limit your server code to JS that happens to run in every browser, so your server-side JS and client-side JS will be subtly different. It's not a dealbreaker -- it's still pretty nice -- but it's still two slightly different languages.
I think it would be pretty cool if there was a program that could take code written in the latest Javascript (1.8.5), and generated old-school Javascript that ran in any browser. I don't think that such a program exists, but I don't know how difficult it'd be.
There are Scheme implementations for Javascript, and so maybe the situation with Scheme is better. I should probably look into that one of these days.
I'm often frustrated when having to use a server-side language that's completely different from my client-side language (Javascript). But then I'm also frustrated when I have to use a language that is lower-level than Lisp (which is most of them). Is it a bigger win to be more Lisp-like, or more Javascript-like? I don't know. I wish I didn't have to choose.
This isn't so much an answer as a strong opinion, but the fundamental problem is that HTML and CSS are just terrible(1). Neither does well what it is supposedly intended to do. Javascript is better, and is often pressed into service to make up for the shortcomings of those two(2), but it isn't an ideal solution(3). And as a result, server-side languages are needed to generate HTML and CSS, which just further complicates the mess. It is laughable that the simplest web application requires programming in no less than four different languages.
So, yes, your desire to have one good reliable language with which you can interface instead of those others is understandable, but so long as you are writing code that generates HTML/CSS such that you have to be concerned with the details of HTML and CSS, you are just wearing mittens that might (read "probably") interfere when you go to play the piano. If your Lisp code is looking like this: (:body (:div (:# (:style (:border "1"))) (:p "hello"))), then you aren't really free from the concerns that plague you.
Personally, I think we need something else to take the place of the soup we've got now and it should compile to HTML/CSS/JS but keep the user free from their concerns. C compiles to assembly but the C programmer never sees the STA, MOV, LDX opcodes that it compiles to in their own written code. And, were it to be popular, then the browsers could support it directly. Anyway, it's just an idea. A glimmer.
Good Luck,
Chris Perkins
medialab.com
(1) HTML documents are compound documents with images, scripts, stylesheets, etc all being stored in other files. But the one thing that an HTML document cannot do is fluidly embed another HTML document - the one thing it most needs. iframes/object tags are fixed size and both adversely impact SEO. This one trivial task is often the sole reason a server side language like PHP is used on many websites.
You don't need me to tell you how bad CSS is.
(2) Examples abound: LESS (lesscss.org), document.write, AJAX, and more.
(3) The impedance mismatch between the Javascript DOM and CSS rules is nearly unbelievable. How many heights does a div have in the DOM (scrollHeight, offsetHeight, clientHeight, and more)? Four or more, maybe? How many of those are addressable via CSS? Zero or one.
Also, while Javascript can plug a lot of holes, it often does so at the expense of SEO.
I am looking at linking a few applications together (all written in different languages like C#, C++, Python) and I am not sure how to go about it.
What do I mean by linking? The system I am working on consists of small programs, each responsible for a particular processing task. I need to be able to transfer a data set from one application to another easily (the data set in question is not huge, probably a few megabytes), and I also need some form of way to control the current state of the operation (this is where a client-server model rings a bell).
It seems like sockets or maybe SOAP would be a universal solution but just wanted to get some opinions as to what people think about this subject.
Comments/suggestions will be appreciated, thanks!
I personally take a liking towards ØMQ. It's a library that has a familiar BSD-sockets-like interface for passing messages, but you'll find it implements interesting patterns for distributing tasks.
It sounds like you want to arrange several processes in a pipeline. ØMQ allows you to do that using push and pull sockets. (And afterwards, you'll find it's even possible to scale up across multiple processes and machines with little effort.) Take a look at the guide to get started, and the zmq_socket(3) manpage specifically for how push and pull work.
Bindings are available for all the languages you mention.
As for the contents of the message, ØMQ doesn't concern itself with that, they are just blocks of raw data. You can use any format that suits you, such as JSON, or perhaps Protocol Buffers.
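A rough sketch of one pipeline hop in Perl, using the ZMQ::FFI binding with JSON payloads (both ends are shown in one script for brevity; normally each stage is its own process, and the endpoint is made up):

    use ZMQ::FFI;
    use ZMQ::FFI::Constants qw(ZMQ_PUSH ZMQ_PULL);
    use JSON::PP qw(encode_json decode_json);

    my $ctx = ZMQ::FFI->new;

    # Downstream stage: pull work items off the pipeline.
    my $pull = $ctx->socket(ZMQ_PULL);
    $pull->connect('tcp://127.0.0.1:5557');

    # Upstream stage: push work items into the pipeline.
    my $push = $ctx->socket(ZMQ_PUSH);
    $push->bind('tcp://127.0.0.1:5557');
    $push->send( encode_json({ id => 1, payload => 'a chunk of the data set' }) );

    my $job = decode_json( $pull->recv );
    print "got job $job->{id}\n";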
What I'm not sure about is the ‘controlling state’ you mention. Are you interested in, for example, cancelling a job halfway through?
For C# to C# you can use Windows Communication Foundation. You may be able to use it with Python and C++ as well.
You may also want to check out named pipes.
I would think about moving to a model where you eliminate the issue by having centralized data that all of the applications look at. Keep "one source of the truth" so to speak.
Most outside software has trouble linking against C++ code, due to the name-mangling algorithm it uses for its symbols. For that reason, when interfacing with programs written in other languages, it is often best to declare wrappers as extern "C", or to put them inside an extern "C" { ... } block.
I need to be able to transfer a data set from one application to another easily (the data set in question is not huge, probably a few megabytes)
Use the file system.
and I also need some form of way to control the current state of the operation
Again, use the file system. A "current_state.json" file with a JSON serialized object is perfect for multiple languages to work with.
It seems like sockets or maybe SOAP would be a universal solution.
Perhaps. But it's overkill for this kind of thing. Your OS already has all the facilities you need. Just use the file system. It's very simple and very reliable.
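In Perl, for instance, that state file takes a few lines with the core JSON::PP module (the filename and fields here are only illustrations):

    use JSON::PP qw(encode_json decode_json);

    # Writer: one application records where the operation is up to.
    open my $out, '>', 'current_state.json' or die "can't write state: $!";
    print {$out} encode_json({ stage => 'transform', records_done => 1500 });
    close $out;

    # Reader: another application (in any language) picks the state up.
    open my $in, '<', 'current_state.json' or die "can't read state: $!";
    my $state = decode_json( do { local $/; <$in> } );
    close $in;
    print "currently in stage $state->{stage}\n";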
There are many ways to do interprocess communication. As you said, sockets may be a universal solution; SOAP, I think, is somewhat overkill. You may also use mailslots; I wrote a C++ application using them a couple of years ago. Named pipes could also be a solution, but if you are coding on Windows, it may be difficult.
In my opinion, the best candidates are:
Sockets
Mailslots
I need to move some data from one machine to another. Is it a good idea to write a client server app using sockets in Perl to do the transfer? Will I have problems if one side is written in Java?
I mean, should I be aware of any issues I might face when I try to attempt the above?
Short answer: Using a Perl program as the client or server is just fine. Your only problem might be your personal skill and experience level, but after you do it you know how to do it. :) Most of the problem is choosing how you need to do it, not the technology involved. Perl isn't going to be the problem, but it doesn't have an advantage over other languages either.
As some have already noted, the socket portion of the problem is going to be the same in most languages since almost everything uses the BSD stuff. Perl doesn't have any roadblocks or special gotchas for that. To move data around you create one side to listen on a socket and the other to open a connection and send the data. Easy peasy. You might want to check out Lincoln Stein's Network Programming with Perl for that bit. That can get you the low-level bits.
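For a taste of what "listen on a socket" and "open a connection and send" look like in Perl, here's a sketch using the core IO::Socket::INET module (host and port are invented; the two halves run in separate processes):

    use IO::Socket::INET;

    # Receiving side: listen, accept, read line by line.
    my $server = IO::Socket::INET->new(
        LocalPort => 7777,
        Listen    => 5,
        ReuseAddr => 1,
    ) or die "listen failed: $!";

    while (my $client = $server->accept) {
        while (my $line = <$client>) {
            print "got: $line";
        }
    }

And the sending side:

    use IO::Socket::INET;

    # Sending side: connect and write.
    my $sock = IO::Socket::INET->new(
        PeerAddr => 'the.other.machine',
        PeerPort => 7777,
    ) or die "connect failed: $!";

    print {$sock} "hello over the wire\n";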
For higher-level networking, POE is very useful and easy to work with once you get started. It's a framework for dealing with event-driven programming and has many plugins to easily communicate between processes. You might spend a little time learning it, but it gives a lot back too.
If you aren't inventing your own protocol, there's most likely already a Perl module that can format and parse the messages.
If you just want to transfer data, there are several things you can do. The easiest in concept might be just to write lines to the socket and read them as lines from the other end. A bit more complicated than that is using something like Data::Dumper, YAML, or JSON to serialize data to text and send that. For more complex things, such as sharing Perl objects, you might want to use Storable. You freeze your objects, send them as data over the network, then thaw them on the other side.
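The Storable round trip is short (nfreeze is the byte-order-portable variant, which is the safer choice between different machines):

    use Storable qw(nfreeze thaw);

    my %job = ( id => 42, files => [ 'a.csv', 'b.csv' ] );

    # Sender: serialize to a byte string, then write the bytes to the socket.
    my $bytes = nfreeze(\%job);

    # Receiver: rebuild an identical structure from the bytes it read.
    my $copy = thaw($bytes);
    print "job $copy->{id}: @{ $copy->{files} }\n";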
If you want to implement your client and server in different languages you have a bit more work to figure out how they'll talk to each other. The socket stuff is mostly the same, but a Java server won't understand the output of Perl's Storable (it's possible, but you'll have to parse it yourself and that's no good :). If you do everything right, neither side should care what you used on the other side.
I can only think of one gotcha off the top of my head: most text-based network protocols use CRLF for line endings, but Perl on UNIX-type machines assumes LF endings by default. This means you will need to change the input and output record separators if you want to use readline (aka <>) and print (also beware of printf, since it doesn't use the output record separator). Of course, if you are going to use a pre-existing protocol, there is probably already a Net::<PROTOCOL NAME> module on CPAN, so you won't need to worry about that. If you are designing your own protocol, I would keep the CRLF convention, because it makes it easy to debug the server with telnet (which is really the last valid use for that program).
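Concretely, that's a change to $/ and $\ around your socket I/O; a sketch (assuming $sock is an already-connected socket handle):

    {
        local $/ = "\r\n";    # input record separator: readline stops at CRLF
        local $\ = "\r\n";    # output record separator: print appends CRLF

        while (my $line = <$sock>) {
            chomp $line;                   # chomp strips the current $/, i.e. CRLF
            print {$sock} "got: $line";    # $\ adds the trailing CRLF for us
        }
    }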
You don't say whether you need to implement your application to support any particular protocol or whether you need to implement a home grown protocol. The networking support in Perl is flexible enough to support either (or most places in between).
At the low level socket end, your code is going to be fairly similar whatever language your are using - BSD socket APIs are pretty well the same everywhere they are supported. The support you need for this is built into Perl but low level socket programming can be frustrating - it's very low level.
However, Perl's standard library contains the Socket module which is rather easier to use (and well documented).
If you need to implement an existing protocol you may well find that it has already been implemented. For example Net::Telnet implements command/response protocols (like Telnet) making a client app trivial.
Searching CPAN may save you a lot of pain. Look at modules in the Net::* hierarchy.
I don't think you're gonna have any major issues that you wouldn't also have with any other language. Even performance will be comparable to other solutions, due to network latencies.
You might want to look at POE framework. It makes writing such components a breeze.
It probably depends on a few factors. Does speed or responsiveness matter? Are you moving data between the same types of machines (Unix to Unix, Windows to Windows)? What type of data are you trying to move (text or binary)? What do you know about sockets, and which languages do you have experience with?
I have sent and received binary data over Perl sockets from differing applications, but I don't have much experience with text processing over sockets between differing machines. If you are moving data between machines, you need to keep in mind the way the data is marshalled and whether it is packed or aligned on some byte boundary. I have not exchanged data with Java programs, but it should be similar.
It would probably help to have some experience with Perl, and I would recommend looking at the examples in the "camel" book. I have used the ones in the book as a starting point and made modifications for what I needed to achieve. You may have to consult some other areas of the book if you are dealing with binary data, or for help with translating the data you send.
Writing socket communication in Perl is relatively easy. Doing it right and reliably is a big pain; even CPAN modules are examples of error-prone code. It depends on your expectations.
You are basically asking two questions:
Is Perl a proper language for socket communication?
Is Perl a proper language for UI?
Referring to e5's answer, Perl is indeed a string-centric language with a focus on readable strings, less well equipped to handle binary data. Thus the answer probably lies in the questions: Is your communication string based? Is your UI string based?
If you're doing binary interaction through a socket, well, you could probably do better than Perl (not talking about C, but maybe C-ish languages). If you want to do graphical user interaction, you will probably reach results faster by choosing one of the higher-level languages that focus more on GUI interaction. (Java-ish might be the thing here.)
I'm looking for ActiveX components that can easily:
get and send emails via SMTP and POP3
strip out and save attachments.
Convert RTF (Outlook emails) to HTML
Sanitize HTML.
What components would you recommend? What components do you use?
Sending and receiving email is simple with CDOSYS. And RTF isn't really that complex a format to handle.
But I think the Chilkat SMTP/POP3 ActiveX component is something you might want to look into.
seanyboy, I can help you out here, but before you look at commercial solutions, there are a couple of things you need to understand.
First, there are hundreds, if not thousands, of controls out there to do what you want.
But you have to consider HOW you are going to use them. I used to work for an anti-virus company, and when we decided to hook our product into Exchange, it became obvious that the solution we chose was NOT going to work. The issue was, the commercial apps follow the RFCs (usually) to a T. (Or is it TEE? I dunno..) But viruses NEVER follow the RFC standards. So I ended up writing my own MIME parser for our scanner, and my detection rate was MUCH better than anything else we tried. Why? Because each time I spotted an email that broke the RFCs, I tweaked the code to deal with it. The one example that comes to mind was "Content-Type: maintype/subtype; param =". Notice the space between the parameter name and the equals sign. This breaks the RFC rules, but most mail readers deal with it, allowing the virus to do its thing.
But this is also a double-edged sword... In MY code, I was not able to decode an attachment formatted as follows:
....
Content-Type: application/octet-stream
Content-Transfer-Encoding: base64
http://virus.virussite.com
JVBERi0xLjMgCiXi48/TIAo3IDAgb2JqCjw8Ci9Db250ZW50cyBbIDggMCBSIF0gCi9QYXJlbnQg
NSAwIFIKL1Jlc291cmNlcyA2IDAgUgovVHlwZSAvUGFnZQo+PgplbmRvYmoKNiAwIG9iago8PAov
...
But the commercial apps had no problem parsing it... most likely because they followed the RFCs again, and did not accept Base64 data unless the lines were the standard length (76 characters per encoded line is the RFC 2045 limit, if I remember right).
But I had bigger problems with broken B64, B64 that ran all on a single line, and so on, and it had to be decoded, so I treated everything in the data block that was in fact a valid base64 character as Base64 data. Everything else was simply skipped over...
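In Perl, that tolerant strategy is a couple of lines with the core MIME::Base64 module (a sketch of the idea only; the original code wasn't Perl):

    use MIME::Base64 qw(decode_base64);

    # A blob with a broken line length and stray whitespace in it.
    my $raw_block = "SGVsbG8sIH\ndvcmxkIQ= =\n";

    # Keep only valid base64 characters; everything else is skipped over.
    (my $clean = $raw_block) =~ tr{A-Za-z0-9+/=}{}cd;

    print decode_base64($clean);    # prints "Hello, world!"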
Anyway, the key is: decide what you NEED out of this control, and then decide if you want to consider writing your own, buying a commercial one, or even paying someone (like myself) to write one for you.
(I'm not exactly sure my last sentence is acceptable under Stack Overflow rules, so I'm not soliciting you, just letting you know your options. I mention this option because you'd have access to the source code, and would be able to maintain it yourself, or find someone else to maintain it, if you decided to break relations with your developer. This is not an option for 99.99% of the commercial solutions...) If they make a change that screws you or your application, you are, well, screwed.. :)
Hope this helps, or at least gives you something to read. heh..
Let me know if I can be of any more help.