thai-language.com: Internet resource for the Thai language
thai-language.com Site News



Project History

The parchment at left, scrawled in a Shibuya apartment in 1997, documents the genesis of this web site. A database object model is laid out with the notation, "June 9: Wrote computer program for [above]."

Originally, the web site was a series of simple HTML files hosted on a commercial ISP's Unix server. The dictionary, for example, consisted of two large, static HTML files (one for Thai-to-English and one for English-to-Thai), which were generated offline by a custom Win32 console application. The input to this generation process was a text file in a proprietary format consisting of simple tags describing the dictionary entries.

A crude graphical Win32 application (written in C++) allowed the webmaster to append new entries to this source file. Existing entries could only be changed by manually editing the proprietary text file itself, a tedious process. Meanwhile, I took over hosting of the site, running Xitami for Windows over a residential DSL line.

2001 Reimplementation

When thai-language.com was redesigned in 2001, a major thrust was the reimplementation of the online dictionary system and a significant addition to the number of entries. The goals of the redesign were to:
  • modernize the look-and-feel of the entire site,
  • add searching and other functionality enhancements,
  • generally expand the site's content and increase the number of dictionary entries,
  • provide a vehicle for the author to learn new programming/web technologies,
  • allow the webmaster and others to more easily edit dictionary entries, and
  • expand the dictionary's data model (schema).
To preserve the existing dictionary data, a new C++ program was envisioned which could read in the old-format file, convert it to the new data model, provide non-linear editing of entries, and save the new model in proprietary binary-format files. The first attempt at writing such a program was abandoned after only the portion which converts the old-format file had been completed; that portion was then used for the one-time conversion.

The second version of the program, DBEdit, uses a library of C++ objects which is shared with a server-side COM/ActiveX object (discussed below). These C++ objects provide basic functionality for the new database model, and provide a single point of access to the proprietary format of the binary disk files.

With these new binary-format files, the "compilation" step was eliminated and server-side COM/ActiveX provided on-the-fly access to the dictionary data.

Server-side COM/ActiveX objects were created to read the data, and these objects were integrated into ASP pages using server-side VBScript. In this way, each dictionary entry could now present a dynamically generated page of its own.

These changes required switching the backend software from Xitami to Microsoft Internet Information Server (IIS) running on Windows 2000 Server, at that time on a multiprocessor Compaq ProLiant 1850R, so that ASP, VBScript and server-side COM/ActiveX could be supported. Several steps were taken to ensure adequate performance of this design, such as engineering the dictionary COM/ActiveX objects as "neutral-threaded," and including the dictionary "application object" (which performs time-consuming initialization) in the IIS global.asa/application object. In November 2002, further optimization was achieved by recoding critical database routines in assembly language.

In addition, several site monitoring tools were developed which allow activity to be displayed on a remote system. This system eschews bulk HTTP logging in favor of an "invasive" COM/ActiveX probe object in the ASP code to provide specific, concise data.

2005 Rewrite

In January 2005, I did a fresh-install upgrade to Windows Server 2003 and the latest versions of ancillary software. Over four grueling days in March of that year, all of the "glue" code, over 3000 lines of VBScript, was rewritten in C#, formally converting the server side to the Microsoft .NET Framework 1.1. The custom COM/ActiveX objects were wrapped for early binding using the "tlbimp" tool so that they could be used from the "managed" C# code. The advantages of this switchover were:
  • large speedup associated with the "just-in-time" compilation-to-native model of .NET
  • much richer and deeper access to operating system functionality than VBScript
  • strong typing, allowing for speedups and more reliable programming
  • compile-time error-checking (versus interpreted VBScript)
  • complete banishment of VBScript
  • detailed error reporting
  • object-oriented programming model
  • preserving gigantic investment in existing custom-developed COM/ActiveX objects
The database schema has evolved quite a bit since 1997. Click here to see how it stood in late 2003, after another round of elaboration.

2007

In January 2007, the site's server hardware was again replaced, its C++ code was migrated to the Visual Studio 2005 development environment, and its C# code was migrated to ASP.NET 2.0. In May, another major overhaul was completed, again involving the rewrite of nearly all the server-side C# code. This time the goal was to organize the content into compartmentalized modules in order to enhance maintainability and performance. Also at this point, 90% of the app moved into a pre-compiled .NET HTTP module, as opposed to being JIT compiled by ASP.NET. By June 2007, all of the COM/ActiveX code had been excised in favor of much more elegant C++/CLI interop code.

2008

In the year that followed, the benefits of migrating to these modern new platforms (now .NET 3.5 with LINQ) were manifesting in new features and increased reliability and performance. By June of 2008 the dictionary had 35000 entries and there were new reference articles, lessons, audio clips, features (such as upgrades to 'reverse phonemic transcription') and layout enhancements.

Up until July 2008, the site had been hosted residentially over the slow direction (608 kb/s) of an ADSL connection. In that month the site marked a huge milestone as it was relocated to a professional data center with generous bandwidth. To facilitate this transition and reduce recurring fees, a replacement 1U server was deployed, now running Windows Server 2008.

2009

In the final weeks of 2008, the aging database code was wholly replaced with managed C# code that uses LINQ extensively. The new code was uploaded on January 1, 2009. At this point, the database became fully Unicode, and very little unmanaged code remained. The original core access routines (binary search coded in assembly language) were junked in favor of CLR managed generic Dictionary<TKey,TValue> objects.
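The change in lookup strategy described above, from binary search over sorted keys to a hashed dictionary, can be illustrated with a minimal sketch. It is written in Python purely for brevity and is not the site's assembly or C# code; the sample entries and the names `ENTRIES`, `lookup_binary` and `lookup_hashed` are hypothetical.

```python
from bisect import bisect_left

# Hypothetical sample data: (headword, gloss) pairs, sorted by headword
# so that binary search is valid.
ENTRIES = sorted([
    ("กิน", "to eat"),
    ("น้ำ", "water"),
    ("บ้าน", "house"),
    ("ไป", "to go"),
])
KEYS = [headword for headword, _ in ENTRIES]


def lookup_binary(word):
    """Binary search over the sorted key list: O(log n) comparisons."""
    i = bisect_left(KEYS, word)
    if i < len(KEYS) and KEYS[i] == word:
        return ENTRIES[i][1]
    return None


# Hash-based index, playing the role of a generic dictionary object.
INDEX = dict(ENTRIES)


def lookup_hashed(word):
    """Hash lookup: average O(1), with no need to keep the keys sorted."""
    return INDEX.get(word)
```

Both functions return the same gloss for any headword and `None` for a missing one; the hashed version simply leaves the bookkeeping to the runtime's dictionary type.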

The summer of 2009 saw the replacement of the aging UltimateBB message board software, which had been abandoned by its manufacturer, with phpBB. Seamlessly integrating a full-blown PHP application into the .NET environment required an unexpected amount of work, but after a false start or two, this was accomplished thanks to the design of the IIS7 "integrated pipeline," which allowed me to install managed input and output filters on PHP/FastCGI requests.

I had also been planning to change the server operating system to the 64-bit version of Windows Server 2008, and for logistical reasons I deployed this change on the same day as the new message board, September 9th. With full access to 32 gigabytes of memory and a new SQL server installation (required for phpBB), the server was now poised to support advanced new features such as computational grammar and search keystroke completion.

2014

The code running the site continues to be maintained with the latest .NET updates (now 4.5.1). New features include an AJAX-based prefix search facility which dynamically returns results in order of Thai word frequency, as estimated from a newswire corpus.
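As a rough illustration of a frequency-ordered prefix search like the one described above, here is a minimal Python sketch. It is an assumption-laden example rather than the site's actual implementation: the word list, the counts in `FREQ`, and the function name `prefix_search` are all invented for illustration.

```python
from bisect import bisect_left

# Hypothetical (word -> corpus count) table; the counts are invented.
FREQ = {
    "การ": 9500,
    "การเมือง": 1200,
    "การบ้าน": 300,
    "กิน": 2100,
    "ไป": 8000,
}

# Sorted word list: all words sharing a prefix form one contiguous run.
WORDS = sorted(FREQ)


def prefix_search(prefix, limit=10):
    """Return up to `limit` words starting with `prefix`, most frequent first."""
    if not prefix:
        return []
    start = bisect_left(WORDS, prefix)  # first word >= prefix
    matches = []
    for word in WORDS[start:]:
        if not word.startswith(prefix):
            break  # past the end of the contiguous run of matches
        matches.append(word)
    # Order by descending count, breaking ties alphabetically.
    matches.sort(key=lambda w: (-FREQ[w], w))
    return matches[:limit]
```

With this sample data, `prefix_search("การ")` returns the three matching words with the most frequent one first. A production version would presumably run server-side and be called once per keystroke by the AJAX front end.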
Copyright © 2024 thai-language.com. Portions copyright © by original authors, rights reserved, used by permission; Portions 17 USC §107.