thai-language.com
Internet resource for the Thai language

DB ThaiText conversion from old Word doc

Posted: Mon May 13, 2013 2:49 pm
by FL_Farang
I have some MS Word documents I created in Thai and English pre-1995 (boy am I getting old).

They open fine in Word 2007. I can even save them as docx files with all Thai and English readable.

The problem is that the Thai encoded in DB ThaiText can't be changed to any other font. Worse, it can't be cut and pasted into a browser.

I can cut and paste it into another Word doc...but it's only readable in DB ThaiText. It turns to this with all other Thai fonts and in my browser:

ÊØÃÔÃѵ¹ì ˧ÉìÅÍÂÅÁ

I saw one post about font conversions, but nothing about converting complex (and long) Word docs.

Thank you for your help!

Re: DB ThaiText conversion from old Word doc

Posted: Mon May 13, 2013 3:27 pm
by Eric67

Re: DB ThaiText conversion from old Word doc

Posted: Mon May 13, 2013 3:47 pm
by FL_Farang
Wow, Eric, you're quick, thank you!

Your link is a terrific resource (like so many others on T-L.com).

This does allow me to convert words and sentences into text that works in my browser. Also, when I paste it back into the original docs, it goes in as BrowalliaUPC.

A great step one, but I have hundreds of pages of docs (mixed Thai and English) so if you have any ideas for Word conversion I'm still looking. (-:

Re: DB ThaiText conversion from old Word doc

Posted: Mon May 13, 2013 7:04 pm
by Richard Wordingham
Can you write computer programs?

If you can, and the documents only contain ASCII and Thai, the solution is to save the files as .rtf or .docx and convert from there.

For RTF, you are likely to find that Thai in DB ThaiText looks like a whole lot of RTF escape codes such as \'f1. Simply use a global edit to convert them to the escape codes used for Thai. (There might be some issues that you need to consult the RTF specification about; I haven't done such mass conversions.)
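The global edit on the RTF text could be sketched roughly as below. This is only an illustration under assumptions not confirmed in the thread: that DB ThaiText places Thai at the TIS-620 byte positions (0xA1 to 0xFB), that the document uses no genuine accented Latin letters in that range, and that the RTF default of one fallback character after each \uN escape applies. The function name rtf_hex_to_thai_unicode is made up for the example.

```python
import re

# Matches RTF hex escapes such as \'f1 (backslash, apostrophe, two hex digits).
_HEX_ESCAPE = re.compile(r"\\'([0-9A-Fa-f]{2})")


def _is_thai_byte(b):
    # Byte positions that TIS-620 assigns to Thai characters (assumed layout).
    return 0xA1 <= b <= 0xDA or 0xDF <= b <= 0xFB


def rtf_hex_to_thai_unicode(rtf):
    """Rewrite \\'xx escapes in the Thai byte range as RTF \\uN? escapes."""

    def replace(match):
        b = int(match.group(1), 16)
        if not _is_thai_byte(b):
            return match.group(0)  # leave other escapes (e.g. \'92) alone
        code_point = 0x0E00 + (b - 0xA0)  # TIS-620 byte -> Unicode Thai block
        # "?" is the fallback character for readers that ignore \u escapes.
        return "\\u%d?" % code_point

    return _HEX_ESCAPE.sub(replace, rtf)
```

The file itself could be read and written as text with encoding="latin-1", since the escapes are plain ASCII. The font assigned to the Thai runs would still need to be changed to a Unicode Thai font, and a doubled backslash in front of an apostrophe is not handled here.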

A .docx format file boils down to (possibly nested) zip files. If you can, then extract the file and do global edits from 'gobbledegook' (accented Roman characters and a number of symbols) to Thai. In the right environment, you can use iconv to do the conversions. Change the font if you can, and then put the converted file back. It should work, though you may get ignorable complaints about check details not matching. This is what I was reduced to doing with OpenOffice files when I needed to change the character encoding of characters previously not supported by Unicode.
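As a rough sketch of the .docx route, the following copies every member of the zip archive unchanged except the main document part, where the 'gobbledegook' characters are shifted into the Thai block. It rests on several assumptions: that the text lives in word/document.xml as UTF-8, that the legacy characters were stored as U+00A1 to U+00FB (Word sometimes stores legacy-font text in other ranges, in which case the table would need adjusting), and that the document has no genuine accented Latin letters. It does not change the font; convert_docx and legacy_to_thai are illustrative names.

```python
import zipfile

# Distance from a legacy code point (same value as the TIS-620 byte) to Thai.
_THAI_SHIFT = 0x0E01 - 0xA1

# Only positions that TIS-620 defines as Thai are remapped (assumed layout).
_TABLE = {
    c: c + _THAI_SHIFT
    for c in list(range(0xA1, 0xDB)) + list(range(0xDF, 0xFC))
}


def legacy_to_thai(text):
    """Map accented-Latin 'gobbledegook' characters to Thai characters."""
    return text.translate(_TABLE)


def convert_docx(src, dst, part="word/document.xml"):
    """Copy src to dst, converting the text of one XML part on the way."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == part:
                data = legacy_to_thai(data.decode("utf-8")).encode("utf-8")
            # Reusing the ZipInfo keeps each member's name and compression.
            zout.writestr(item, data)
```

Working on a copy of the document is advisable, for example convert_docx("old.docx", "old-unicode.docx"), so the original stays untouched if Word objects to the result.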

I have a similar page to Glenn's at http://homepage.ntlworld.com/richard.wo ... tis620.htm , but mine does the conversion on the user's computer, and the source code for the conversion is included in the file - it can be run off-line.
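For anyone curious what such a paste-in converter does, here is a minimal offline sketch. It is not the code from either page; it simply assumes the garbled text is Thai in Windows code page 874 (a superset of TIS-620) that was misread as Windows-1252, and undoes that misreading. The name fix_thai_mojibake is hypothetical.

```python
def fix_thai_mojibake(text):
    """Reinterpret text misread as Windows-1252 as Thai (code page 874)."""
    # Step 1: recover the original bytes by encoding with the wrong code page.
    raw = text.encode("cp1252")
    # Step 2: decode those bytes with the Thai code page.
    return raw.decode("cp874")


if __name__ == "__main__":
    # The sample from the first post in this thread.
    print(fix_thai_mojibake("ÊØÃÔÃѵ¹ì ˧ÉìÅÍÂÅÁ"))
```

Plain English text passes through unchanged because both code pages agree on ASCII. Characters that have no counterpart in code page 874 raise an error rather than being silently altered.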

If you aren't a programmer, the Abiword word processor might be able to do the conversion for you, at the risk of losing some Word formatting. The wvWare package seems to provide the basic utilities, but it may be easier to use on Linux than on Windows.

Re: DB ThaiText conversion from old Word doc

Posted: Tue May 14, 2013 4:00 pm
by FL_Farang
Richard, I thank you for your expert advice.

I'm not a programmer and won't be able to fix the entire documents, but your online conversion pages are a great help for sections, and they work perfectly. I did try saving the docs in rtf mode but nothing changed (for better or worse).

I do notice that Word displays the notice (compatibility mode) above each of these documents. I also see glitches with Thai characters mixing with English (apostrophes become ไม้ทัณฑฆาต); simple Thai letters like substituting for , etc. Ultimately I'm happy I can read the docs at all!

Fonts remain one of the biggest software mysteries to me, and I've had my issues with them over the years. My toughest experiences were in trying to get Khmer to migrate from Word to Adobe applications, compounded by moving from PC to Mac. Whew. (-:

Thanks again for your help!

With best regards,

Kent

Re: DB ThaiText conversion from old Word doc

Posted: Wed May 15, 2013 12:36 am
by Richard Wordingham
FL_Farang wrote: I did try saving the docs in rtf mode but nothing changed (for better or worse).

The purpose of saving the documents as RTF is that RTF is actually text, albeit with an enormous quantity of mark-up. It is easier to write a program in a general purpose language to convert a text file than to convert an unformatted file, at least, without Word-specific libraries. There's probably a Word Visual Basic macro somewhere that will do exactly what you want, but I'm not fluent in Visual Basic.

Re: DB ThaiText conversion from old Word doc

Posted: Wed May 15, 2013 2:36 am
by FL_Farang
That's beyond my skill too. But the simple paste text decoders you gave get me to the next level.

Thanks!

Kent

Copyright © 2024 thai-language.com. Portions copyright © by original authors, rights reserved, used by permission; Portions 17 USC §107.