I am trying to read and process Japanese emails. In the XP Control Panel, I have set my regional and language options to East Asian, including the language for non-Unicode programs. I have to process .pst files and preserve the true metadata, and I am having trouble with the subject line and sometimes the To: and Cc: fields. The message body shows Japanese fine, but I get gibberish in the subject, as shown below:
CC FIELD: cc. │ᄄネヤᄏ ̄タタ₩ンノ¥ᄆᄆ₩ルᄎ₩チメ
SUBJECT FIELD: Re: 三è±ï¼¬ï¼£ï¼¤æ’¤é€€ã«é–¢ã™ã‚‹æƒ…å ±åŠã³åŒ—米液晶状æ³
MESSAGE BODY: 佐藤さんへ:情報ありがとうございます。この機に是非とも三菱パークをリプレースしたいものです。ところでこのシニアマネージャーはどうされたのですか?内も苦しいですが。
中村マネージャー:ADIはCPTへ売却打診中とのこと。うーん。
I am not a programmer so please simplify any recommendations you have as to how I can fix the subject line. FYI, I am using Outlook 07 Pro, Windows XP Pro and the .pst files are pre-existing so they are being opened via: File-->open outlook data file.
Most likely, the header lines contain Japanese characters encoded in ISO-2022-JP without the encoding being declared, i.e. the emails contained in the PST files violate the specification. You may be able to get around this by specifying the encoding manually in the Outlook settings; I don't have Outlook, so I can't tell you exactly where to look. If Outlook does not have that option, then you're pretty much hosed: you'd have to find a Japanese version of Outlook, or a third-party application that can read PST files and lets you set the encoding manually.
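If you can get someone to run a small script, here is a rough way to test which encoding mix-up is involved. It is only a sketch: the function name `undo_mojibake` is made up for illustration, and the default encodings are a guess based on the pasted sample. The fragment `ï¼¬ï¼£ï¼¤` from the subject line decodes cleanly when treated as UTF-8 bytes that were displayed as Windows-1252, which hints that the subject may be UTF-8 rather than ISO-2022-JP.

```python
def undo_mojibake(text, shown_as="cp1252", really="utf-8"):
    """Reverse a wrong decoding step.

    Assumption: the bytes were really `really` (UTF-8 here) but were
    displayed as `shown_as` (Windows-1252 here). Both defaults are guesses.
    """
    return text.encode(shown_as).decode(really)


# Fragment copied from the garbled subject line in the question.
garbled = "ï¼¬ï¼£ï¼¤"
print(undo_mojibake(garbled))  # prints the fullwidth letters ＬＣＤ
```

The whole pasted subject cannot be repaired this way, because some bytes were already lost when it was displayed. The fix has to happen where the header is first decoded.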
My university deletes students' Outlook email accounts after they graduate, so I am exporting my inbox as an .olm file.
I figured this would be sufficient to save my meaningful emails that I want to save, but I wonder how I will ever open the .olm file if the account itself will be deleted...
Any ideas/info?
Cheers
OLM files are database files used only by Microsoft Outlook for Mac. They can't be opened by the Windows version of Outlook, because the Windows version uses .PST files rather than the OLM format.
This assumes you have a Mac. If you do not, you can open an OLM file on Windows by first converting it to PST.
But there are other ways to save Outlook emails
Text only format
Outlook Message Format .msg – the older version of .msg, which does not support the full range of Unicode characters.
Outlook Message Format – Unicode – the newer version of .msg, which includes Unicode characters.
I will use this .msg format. These days ‘plain’ can have Unicode for emoji etc.
Save to Word
Outlook Template .oft – used to make a template for new emails.
HTML – a web page version of the message
MHT – also a web page, but with images etc. embedded into a single file.
making the subject line of the message the file name.
Remember that all the above formats are indexed by the OS, so you will be able to find a saved message by searching for words in the message.
Save to PDF
PDF is another way to store ‘permanent’ or archival documents.
Look into examples of Python or VBA code that can help you save emails in the format you need.
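As a starting point for the Python route, here is a minimal sketch. It assumes Windows with desktop Outlook and the pywin32 package installed, so it does not apply to Outlook for Mac. The helper names `safe_filename` and `export_inbox` are made up for illustration. The numeric constants are the Outlook object model values for `olMSGUnicode` and `olFolderInbox`.

```python
import re
from pathlib import Path

OL_MSG_UNICODE = 9   # OlSaveAsType.olMSGUnicode (Unicode .msg)
OL_FOLDER_INBOX = 6  # OlDefaultFolders.olFolderInbox


def safe_filename(subject, max_len=80):
    """Turn a subject line into something usable as a Windows file name."""
    cleaned = re.sub(r'[\\/:*?"<>|\r\n\t]', "_", subject or "").strip(" .")
    return (cleaned or "no_subject")[:max_len]


def export_inbox(target_dir):
    """Save every Inbox item as a Unicode .msg file (requires Outlook)."""
    import win32com.client  # pywin32; imported here so the helper above works anywhere

    target = Path(target_dir).resolve()
    target.mkdir(parents=True, exist_ok=True)
    namespace = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
    inbox = namespace.GetDefaultFolder(OL_FOLDER_INBOX)
    for index, item in enumerate(inbox.Items, start=1):
        # The index prefix keeps names unique when subjects repeat.
        name = f"{index:05d}_{safe_filename(item.Subject)}.msg"
        item.SaveAs(str(target / name), OL_MSG_UNICODE)
```

Calling `export_inbox(r"C:\mail_backup")` (a hypothetical folder) would write one file per message. Try it on a small folder first, since large mailboxes can take a while.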
The automated Outlook emails built with pywin32 and plain HTML worked great until people started forwarding and replying to them. Once a message is forwarded, the HTML formatting gets stripped and the borders of the table suddenly disappear. One way around this is to go to your Outlook settings and disable the option "Reduce message size by removing format information not necessary for the message".
The question is how to format the email so that the formatting is not lost when it is forwarded, i.e. how to make the format information count as necessary for the message.
I have found a workaround, though. Outlook appears to strip styles that are defined in a style block, while styles defined inline on the tags escape the stripping. For now I have taken this approach.
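A minimal sketch of the inline-style approach described above. The function name `html_table` and the particular style values are illustrative choices, not taken from the original script. The point is only that every tag carries its own `style` attribute and no style block is used.

```python
from html import escape

# Illustrative style values; adjust to taste.
TABLE_STYLE = "border-collapse: collapse;"
CELL_STYLE = "border: 1px solid #000000; padding: 4px;"


def html_table(rows):
    """Build an HTML table whose styling lives entirely on the tags."""
    rendered_rows = []
    for row in rows:
        cells = "".join(
            f'<td style="{CELL_STYLE}">{escape(str(cell))}</td>' for cell in row
        )
        rendered_rows.append(f"<tr>{cells}</tr>")
    body = "".join(rendered_rows)
    return f'<table style="{TABLE_STYLE}">{body}</table>'


print(html_table([["Name", "Status"], ["Job 1", "OK"]]))
```

With pywin32 the resulting string would typically be assigned to the mail item's `HTMLBody` property before sending.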
I occasionally get emails with strange encoding issues. The quotation marks show up as ³example², and apostrophes show up as that¹s. I can't imagine that the other person actually meant to use those symbols, even though the email headers specify an encoding of Windows-1252. I'm using Thunderbird for Mac OSX, and I'm not sure what email client is being used to send these messages.
These are the characters ` and angled double-quotes. In my experience, these typically come from OSX because it uses a specialized version of ISO-8859. That is what I recall reading when researching this issue a few months ago; if I find the reference I will add the link.
If the sender specifies UTF-8, this goes away.
When I transfer files with Russian names over FTP, the names show up as "junk" characters on the Linux machine. But when I copy the Russian names, they show up correctly.
Are there any settings that need to be changed in FileZilla for FTP? I tried both ASCII and binary mode.
The Linux machine has its locale set to ru_RU.cp1251.
FTP was invented with US-ASCII in mind as the character set, so it has no concept of different character sets at all. The server sends filenames as-is and the client has to interpret them properly.
FileZilla can do that as well: Add your site to the Site Manager (File then Site Manager…). For your site, go to the Charset tab and select Use custom charset. Since I do not know which spelling of the character set name is accepted, you will have to experiment a bit: cp-1251, windows-1251, cp1251, etc.
If possible, make sure the FTP server supports UTF-8 and then always use UTF-8 (Unicode). This way you do not have such problems anymore.
ASCII and binary modes by the way are completely unrelated to character sets - see FileZilla Wiki regarding data type for more information.
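To illustrate the point about the client having to interpret raw filename bytes, here is a small Python sketch. The sample filename and the function `list_names` are made up for illustration. The `encoding` argument of `ftplib.FTP` requires Python 3.9 or later.

```python
from ftplib import FTP

# A Russian filename as a cp1251 server would send it: just bytes.
name = "отчёт.txt"
raw = name.encode("cp1251")

wrong = raw.decode("latin-1")  # client guesses the wrong charset: junk
right = raw.decode("cp1251")   # client uses the server's charset: correct

print(wrong)
print(right)


def list_names(host, user, password, charset="cp1251"):
    """List remote filenames, decoding them with the given charset.

    Hypothetical helper; host and credentials are placeholders you supply.
    """
    with FTP(host, encoding=charset) as ftp:
        ftp.login(user, password)
        return ftp.nlst()
```

The same bytes give two different strings, which is all the "junk" is: a correct filename read with the wrong character set.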
I want to send booking information through mail in an attachment to add in MS Outlook.
Which format is better? Especially for MS Outlook 2003?
iCalendar is based on vCalendar, and Outlook 2007 handles both formats well, so it doesn't really matter which one you choose.
I'm not sure if this holds for Outlook 2003. I guess you should give it a try.
Outlook's default calendar format is iCalendar (*.ics)
Both .ics and .vcs files are plain ASCII text. If you use the "Save As" option to save a calendar
entry (Appt, Meeting Request/Response/Postpone/Cancel, etc.) in both .ics and .vcs format and
compare them with vimdiff, you can easily see the difference.
Both .vcs (vCal) and .ics (iCal) belong to the same VCALENDAR camp, but a .vcs file shows
"VERSION:1.0" whereas an .ics file uses "VERSION:2.0".
The spec for vCalendar v1.0 can be found at http://www.imc.org/pdi/pdiproddev.html. The spec for iCalendar (vCalendar v2.0) is in RFC5545. In general, the newer is better, and
that is true for Outlook 2007 and onward, but not for Outlook 2003.
For Outlook 2003, the behavior is peculiar. It can save the same calendar entry in both
.ics and .vcs format, but it only reads and displays the .vcs file correctly. It can read
the .ics file, but it omits some fields and does not display it in calendar mode. My guess is
that back then Microsoft wanted to provide .ics to be compatible with Mac's iCal but
was not quite committed to v2.0 yet.
So I would say for Outlook 2003, .vcs is the native format.
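For the original question of sending booking information as an attachment, here is a rough sketch of generating the same event in either format. The function name `build_calendar`, the `PRODID` text, and the default UID are placeholders made up for illustration. It assumes timezone-aware datetimes and a plain summary with no commas, semicolons, or line breaks, since escaping is not handled.

```python
from datetime import datetime, timezone


def _stamp(moment):
    """Format an aware datetime as a UTC timestamp, e.g. 20240102T090000Z."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def build_calendar(summary, start, end, version="2.0",
                   uid="booking-1@example.invalid"):
    """Return the text of a one-event .ics (2.0) or .vcs (1.0) file."""
    lines = ["BEGIN:VCALENDAR", f"VERSION:{version}"]
    if version == "2.0":
        lines.append("PRODID:-//Example//Booking Sketch//EN")  # placeholder
    lines.append("BEGIN:VEVENT")
    if version == "2.0":
        # iCalendar expects a UID and DTSTAMP on every event.
        lines.append(f"UID:{uid}")
        lines.append(f"DTSTAMP:{_stamp(datetime.now(timezone.utc))}")
    lines += [
        f"DTSTART:{_stamp(start)}",
        f"DTEND:{_stamp(end)}",
        f"SUMMARY:{summary}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    return "\r\n".join(lines) + "\r\n"


start = datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)
end = datetime(2024, 1, 2, 10, 0, tzinfo=timezone.utc)
print(build_calendar("Room booking", start, end, version="1.0"))
```

Saving the version 1.0 output with a .vcs extension and the 2.0 output with .ics gives two files you can try against Outlook 2003 to see which one it displays properly.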
You can try VCS to ICS file converter (Java, works with Windows, Mac, Linux etc.). It has the feature of parsing events and todos.
You can convert the VCS generated by your Nokia phone, with bluetooth export or via nbuexplorer.
Complete support for UTF-8
Quoted-printable encoded strings
Completely open source code (GPLv3 and Apache 2.0)
Standard iCalendar v2.0 output
Encodes multiple files at once (only one event per file)
Compatible with Android, iOS, Mozilla Lightning/Sunbird, Google Calendar and others
Multiplatform
VCS files can have their information encoded as quoted-printable, which is a nightmare. The "VCS to ICS Calendar Converter" recommended above is the way to go.
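To show what quoted-printable looks like in a VCS file, here is a small sketch using Python's standard `quopri` module. The function name `decode_vcs_line` and the sample line are made up for illustration. It handles a single unfolded property line only, not a full file.

```python
import quopri


def decode_vcs_line(line):
    """Split one vCalendar property line and decode a quoted-printable value."""
    head, _, value = line.partition(":")
    name, *params = head.split(";")
    options = dict(p.split("=", 1) for p in params if "=" in p)
    if options.get("ENCODING", "").upper() == "QUOTED-PRINTABLE":
        raw = quopri.decodestring(value.encode("ascii"))
        # Fall back to UTF-8 when no CHARSET parameter is given (an assumption).
        value = raw.decode(options.get("CHARSET", "utf-8"))
    return name, value


sample = "SUMMARY;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=E4=B8=89=E8=8F=B1"
print(decode_vcs_line(sample))  # ('SUMMARY', '三菱')
```

Real files also fold long values across several lines, which is part of why a dedicated converter is less painful than handling it by hand.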
The newer iCalendar format, with more data attached, includes information about the person who created the event, so that when it is imported into Outlook (for example), changes to that event are communicated via email to the creator. This can be helpful when you need to inform others of any changes.
However, when I am just exporting an event from one of my calendars to another, I prefer to use vCalendar, since this does not require sending an email message to the creator (usually myself) if I make a change or delete something.