In MS Word, I would like to do a word-to-word translation which involves a list (MS Excel or anything else) of professional vocab with a translated vocab, while grammar is ignored.
Let say, I have a million-word DOC document. And I have created a list of vocabs (or strings) that Google Translation and others softwares cannot translate with respect to a certain profession. And I have > 1000 words in the list (XLS).
A (Vocab) B (Translation)
1 qwertyuiop ABC
2 asdfghjkl DEF
3 zxcvbnm XYZ
What I want is, if I check through the DOC document, where-ever I find "asdfghjkl" (which is A3), it would be replaced into "DEF" (which is B3).
In MS Word, I tried "Replace All" in Ctrl + F, but I have to go through 1000 times if I have 1000 vocabs to be translated. How can I, within MS Word (VBA??) or from a software, replace those vocabs with ease??
Thanks a million!! =)
Related
I'm compiling an acronyms/abbreviations table for a document. Beyond a simple acronym finder, I would like to find special acronyms that aren't entirely conventional.
Generally I can find acronyms by using <[A-Z]{2,}> in an advanced search. This captures any whole word that is comprised solely of uppercase letters. But I have acronyms that take on other forms too. Beyond an acronym being in the form ABC I have acronyms in this document of other forms.
ABC Generic form, 2 or more uppercase letters
AB&C 1 or more letters preceding and following &
ABC(D) 1 letter in parentheses following 2 or more letters (this only appears twice, so I'm not too worried about it.)
A/C 1 or more letters both preceding and following /
ABC-12 2 or more letters followed by a hyphen and 1 or 2 numbers. This only appears once, so I'm not really worried about it.
In my efforts to create an acronym finder, I've developed this specialized search.
<[A-Z]{1,}[\&\/]*[A-Z]{1,}>
Trying to translate this, I see that this is searching for 1 or more uppercase letters preceding 0 or more of & or / followed by 1 or more uppercase letters. In theory this should find forms 1,2, and 4, but in reality it only finds forms 2 and 4, and not 1. (I'm not as much worried about form 3 as I am form 1, 2, and 4.) I'm stumped at what I need to change. I've tried doing an OR | statement to find one or more form, but Microsoft Word's 'regex' options are different (or appear to be different) from what I generally use.
In summary, my question is what form should my special acronym finder be to find forms 1, 2, and 4 in the table above?
You can use a wildcard Find, where:
Find = <[A-Z][A-Z0-9&()/-]{1,}
Beyond that, for identifying acronyms in parentheses and the text to which they refer, see: https://www.msofficeforums.com/word-vba/42313-acronym-definiton-list-generator.html
See also: https://www.msofficeforums.com/word-vba/19395-acronym-finder-macro-microsoft-word.html
OCR programs often mistakenly recognize the capital letter O as a zero or vice versa. For example, they might recognize Over as 0ver or well as we11.
I tried to add
REP 0 O
REP 1 l
to the affix file, but it didn't work because numbers are apparently considered word boundaries.
(I had a look at the hunspell man page, but I can't figure out which of the numerous settings needs to be changed to allow numbers in words.)
From the manpages:
REP what replacement
This table specifies modifications to try first. First REP is
the header of this table and one or more REP data line are
following it. With this table, Hunspell can suggest the right
forms for the typical spelling mistakes when the incorrect form
differs by more than 1 letter from the right form. The search
string supports the regex boundary signs (^ and $). For example
a possible English replacement table definition to handle
misspelled consonants:
REP 5
REP f ph
REP ph f
REP tion$ shun
REP ^cooccurr co-occurr
REP ^alot$ a_lot
Did you add the first line, REP + number of replacements?
A web form collects data on students in a band organization at school. The form data is fed into a google sheet that then populates a merge template and the merged forms are emailed to the recipient. A parent needs to print, sign and turn in the forms. There are hundreds of kids in this band and at registration time when the forms are turned in it is easier to sort all the papers in the stack if you have a short sort number in the corner... Volunteer kids don't apply alphabetization well. I'm trying to create a formula that will give me that sorting number to merge onto the header of each page of the PDF they receive after submitting the form. I want it based on last name and then first name and be able to create that number (in the google sheet) on the fly because the merging happens almost instantly when the user submits the form. Hence, an excel type formula is desired that will result in a number representing the kids name. I'd like for each number to be unique but some names are the same for the first few letters, also some names are only 2 characters long. I tried making A=10, B=11, z=35 etc. (so all are 2 digits) So, using only the first 3 characters, Bob Jones would = 192423112411 - hardly easy to sort the paper at a glance and it doesn't really differentiate between Bob Janes either. 4 digits is preferable. I also looked at =code() formula and it came out with long numbers too. Any advice is appreciated. Thanks!
Side note: What method do spreadsheets use to sort text? Do they weight the characters or what? Before I got the automerge thing to work I assigned each kid in the list a number higher than the one below and lower than above (on the sheet), then did the merge.
One option is to:
sort the name list alphabetically
add a sort number column, and put a =TEXT(row(),"0000") formula to generate a unique ID
on the merge spreadsheet, use a VLOOKUP function to retrieve the unique ID for that specific name.
First off, that wall of text was kind of hard to read through. Please try and do a little formatting so the people trying to help you can easily follow what you're trying to convey.
Personally I would suggest a hyphenated system. First initial of last name converted to a number, followed by a hyphen, followed by the first two letters of their first name converted to numbers.
Bob Jones becomes 11-1956 assuming you differentiate between upper and lower case, or 11-1924 if you convert everything to upper case, which I guess makes more sense.
You could use this VBA function to convert names to a system like that:
Function ConvertToIndex(strInput As String) As String
Dim strLast As String
Dim arrName() As String
Dim strFirst1 As String
Dim strFirst2 As String
arrName = Split(strInput, " ")
strLast = Mid(arrName(1), 1, 1)
strFirst1 = Mid(arrName(0), 1, 1)
strFirst2 = Mid(arrName(0), 2, 1)
ConvertToIndex = Asc(UCase(strLast)) - 55 & "-" & Asc(UCase(strFirst1)) - 55 & Asc(UCase(strFirst2)) - 55
'MsgBox ConvertToIndex
End Function
Thank you Tim, Nutsch and Mad Tech for your responses. I appreciate your input. Sorry the paragraph was so long, I get wordy. Because the members get their merged PDF sheet immediately after submitting I need the number to be based on the name as soon as it's entered, not after the fact; so I was looking for a formula that would reside in the sheet. Interesting VBA function too though. I'll settle for numbering them afterwards, maybe when the sheets are turned in. By then I'll know all who are in the band and can assign numbers like before. Thanks again!
Is there a generic (Macro, XML Parser, ...) way to extract all footnotes in a MS Word file and also keep the corresponding number from the original text?
You might use Elod Pal Czirmaz's macro as a starting point. I suspect that footnotes don't carry their numbers with them. When the document is rendered, the numbers are just assigned in order.
I'm new to programming (just started!) and have hit a wall recently. I am making a fansite for World of Warcraft, and I want to link to a popular site (wowhead.com). The following page shows what I'm trying to figure out: http://www.wowhead.com/?talent#ozxZ0xfcRMhuVurhstVhc0c
From what I understand, the "ozxZ0xfcRMhuVurhstVhc0c" part of the link is a hash. It contains all the information about that particular talent spec on the page, and changes whenever I add or remove points into a talent. I want to be able to recreate this part, so that I can then link my users directly to wowhead to view their talent trees, but I havn't the foggiest idea how to do that. Can anyone provide some guidance?
The first character indicates the class:
0 Druid
c Hunter
o Mage
s Paladin
b Priest
f Rogue
h Shaman
I Warlock
L Warrior
j Death Knight
The remaining characters indicate where in each tree points have been allocated. Each tree is separate, delimited by 'Z'. So if e.g. all the points are in the third tree, then the 2nd and 3rd characters will be "ZZ" indicating "end of first tree" and "end of second tree".
To generate the code for a given tree, split the talents up into pairs, going left-to-right and top-to-bottom. Each pair of talents is represented by a single character. So for example, in the DK's Blood tree segment, the first character will indicate the number of points allocated to Butchery and Subversion, and the second character will stand for Blade Barrier and Bladed Armor.
What character represents each allocation among the pair? I'm sure there's an algorithm, probably based on the ASCII character set, but all I've worked out so far is this lookup table. Find the number of points in the first talent along the top, and the number of points in the second talent along the left side. The encoded character is at the intersection.
0 1 2 3 4 5
0 0 o b h L x
1 z k d u p t
2 M R r G T g
3 c s f I j e
4 m a w N n v
5 V q i A y E
So if our Death Knight has one point in Butchery and two points in Subversion, the first character is 'R'. If instead we put no points in those two and five in Blade Barrier, the first two characters will be "0x". Trailing '0's (all the other pairs in the tree with no points allocated) can be omitted, as can trailing 'Z' delimiters (when there are no points in the subsequent trees). For one final example, the entire code for a DK with just a single point in Toughness would be "jZ0o": "Death Knight", "End of the first tree", "No points in the first pair of talents", "one point in the first talent of the second pair".
Can anyone work out what function generates the lookup table above? There's probably a clue in the codes for the classes: in alphabetical order (except for the DK which was added to the game after the others), they correspond to a series in the lookup table of (0,0), (0,3), (1,0), (1,3), (2,0), etc.
If you go to http://www.wowhead.com/?talent and start using the talent tree you can see the mysterious code being built up in the address bar as you click on the various boxes. So it's definitely not a hash but some kind of encoded structure data.
As the code is built up as you click the logic for building the code will be in the JavaScript on that page.
So my advice is do a view source on the page, download the JavaScript files and have a look at them.
I think it isn't a hash value, because hash values are normally one-ways values. This means you cannot (easily) restore the original information from which the hash code was generated.
Best thing would be to contact someone from wowhead.com and ask them how to interpret this information. I am sure they will help you out with some information about what type of encoding they use for the parameters. But without any help of the developers from wowhead.com it is almost impossible to figure out what information is encoded into this parameter.
I am not even sure the parameter you mentioned contains the talents of your character. Maybe it's just a session id or something like that. Take a look into the post data your browser sends to the server, it may contain a hidden field with the value you are searching for (you can use Tamper Data Firefox Addon).
I don't think ozxZ0xfcRMhuVurhstVhc0c is a hash value. I think it is a key (probably encrypted/encoded in some way). The server uses this key to retrieve information from it database. Since you don't have access to the database you don't know which key is needed, let alone how to encode it.
You need the original function that generates the hash.
I don't think that's public though :(
Check this out: hash wikipedia
Good luck learning how to program!
These hashes are hard to 'reverse engineer' unless you know how it was generated.
For example, it could be:
s1 = "random_string-" + score;
hash = encrypt(s1)
...etc
so it is hard to get the original data back from the hash (that is the whole point anyway).
your best bet would be link to the profile that would have the latest score ..etc