keetper wrote: Keep us in the loop, when your master degree is defended.
Well, it was defended last fall, but that thesis ended up getting bogged down in the grammar-supporting
toolset, rather than the development of the Thai grammar itself. If you're still interested, the paper can be found here.
Meanwhile, I'm hoping that the actual further development of the Thai grammar proper (which is still in "toy" status at present) will now be my Ph.D. work! (As it turns out, developing a competent analytical grammar of a natural language is hard. Who knew.)
Also meanwhile, a graduate seminar I'm currently taking is advancing my fledgling work on Thai-English
semantic transfer. If it were not for the seminar, this work--though long planned--would probably have been delayed until after the Thai grammar was upgraded somewhat. So for the moment I'm sketching out a new formalism within which one will be able to author a set of declarative correspondences between English and Thai semantic structures (yes, there does appear to be some additional tool work involved!).
Putting the pieces together, given:
(1.) an upgraded computational HPSG grammar of Thai (ongoing work; the grammar is currently still in "toy" status);
(2.) the English Resource Grammar (Flickinger et al. 2000), which is a highly-developed, open-source HPSG grammar of English (emphatically not a toy);
(3.) an efficient HPSG parser (done!);
(4.) an efficient HPSG generator (done!);
(5.) a processing engine for the declarative transfer rule formalism (demo project for current seminar, but a real version will eventually be needed); and
(6.) a set of Thai-English declarative "transfer rules" written to this formalism (which I'm also mocking up at present);
...one theoretically has all the ingredients for a fairly competent rule-based Thai-English and English-Thai machine translation system.
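To make the shape of that pipeline concrete, here is a deliberately tiny sketch of the parse, transfer, generate flow. Everything in it is hypothetical and invented for illustration: the function names, the three-word lexicon, and the flat (verb, subject, object) tuple standing in for a semantic structure. It is not the real formalism, grammar, parser, or generator, all of which are far richer than this.

```python
from typing import Optional

# Hypothetical stand-in for a semantic structure: (verb, subject, object) predicates.
Semantics = tuple[str, str, str]

# Toy English "grammar": word -> (category, predicate). Invented for illustration.
EN_LEXICON = {
    "cats": ("N", "cat"),
    "fish": ("N", "fish"),
    "eat": ("V", "eat"),
}

# Toy declarative transfer rules: English predicate -> Thai predicate.
TRANSFER_RULES = {"cat": "แมว", "fish": "ปลา", "eat": "กิน"}


def parse_english(sentence: str) -> Optional[Semantics]:
    """Accept only noun-verb-noun sentences built from known words."""
    words = sentence.lower().split()
    if len(words) != 3 or any(w not in EN_LEXICON for w in words):
        return None
    (c1, subj), (c2, verb), (c3, obj) = (EN_LEXICON[w] for w in words)
    if (c1, c2, c3) != ("N", "V", "N"):
        return None
    return (verb, subj, obj)


def transfer(sem: Semantics) -> Optional[Semantics]:
    """Map each English predicate to a Thai one, or fail if any rule is missing."""
    if any(pred not in TRANSFER_RULES for pred in sem):
        return None
    verb, subj, obj = (TRANSFER_RULES[pred] for pred in sem)
    return (verb, subj, obj)


def generate_thai(sem: Semantics) -> str:
    """Realize subject-verb-object order; Thai is written without spaces between words."""
    verb, subj, obj = sem
    return subj + verb + obj


def translate(sentence: str) -> Optional[str]:
    """Run parse -> transfer -> generate, returning None if any stage fails."""
    sem = parse_english(sentence)
    if sem is None:
        return None
    thai_sem = transfer(sem)
    if thai_sem is None:
        return None
    return generate_thai(thai_sem)


print(translate("Cats eat fish"))   # แมวกินปลา
print(translate("Dogs eat fish"))   # None: "dogs" is outside the toy grammar
```

Note that the second call returns nothing at all rather than a guess. That all-or-nothing behavior is the point of the design: each stage either succeeds on its own terms or the whole translation is refused.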
Note that, unlike the Bing and Google translation functions, which rely on purely statistical techniques for machine translation (SMT), the planned system is entirely based on hand-written analytical grammars. These "precision grammars" are so called because they offer precision at the expense of recall. SMT systems are an example of systems with great recall: we've all had the experience of getting gibberish translations from SMT systems, but their high recall means that at least you got something (regardless of whether it's usable or not).
Contrast this with a system characterized by high precision (such as the one I'm working on): some of the translation inputs you submit may be more likely to return nothing, but at least when you do get a result, you'll know that it's guaranteed to be grammatically correct. In 2013, there are very few, if any, widely-known systems that offer rule-based MT to the public for free general use. It will be interesting to see if this is still the case when I'm finally able to launch reasonably competent Thai-English and English-Thai translation on this website.
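For anyone who wants the precision-versus-recall trade-off in numbers, here is a small sketch. The judgments are made up for illustration (they are not measurements of any real system), and the function name is hypothetical. I'm also assuming that "recall" in the loose sense used above, "at least you got something", corresponds to what the code calls coverage: the share of inputs that get any output at all.

```python
from typing import Optional


def precision_and_coverage(results: list[Optional[bool]]) -> tuple[float, float]:
    """Summarize one system's behavior over a set of inputs.

    results[i] is None if the system returned nothing for input i,
    otherwise True/False for whether the returned translation was acceptable.
    """
    returned = [r for r in results if r is not None]
    # Coverage: fraction of inputs for which anything at all came back.
    coverage = len(returned) / len(results) if results else 0.0
    # Precision: fraction of returned outputs that were acceptable.
    precision = sum(returned) / len(returned) if returned else 0.0
    return precision, coverage


# Made-up judgments for ten inputs (assumption, for illustration only).
statistical = [True, False, True, False, False, True, True, False, True, False]
rule_based = [True, None, True, None, None, True, True, None, True, None]

print(precision_and_coverage(statistical))  # (0.5, 1.0)
print(precision_and_coverage(rule_based))   # (1.0, 0.5)
```

In this invented example both systems produce the same five good translations; the difference is whether the other five inputs come back as gibberish or as nothing.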
It will also be interesting to see how the advantages and disadvantages of each approach (statistical versus rule-based MT) become weighted vis-a-vis particular language pairs. For example, consider the English-Thai pair versus, say, English-Spanish. Though each pairing will surely exhibit its own distinct linguistic challenges, I don't have an intuition about which MT approach might fare better for which pairing. This is ironic, since I've essentially bet the past 6 years of my life on the idea that rule-based MT will be more successful for the former pairing.