The parchment at left, scrawled in a Shibuya apartment in 1997, documents the genesis of this web site. A database object model is laid out with the notation, "June 9: Wrote computer program for [above]."
Originally, the web site was a series of simple HTML files hosted on a commercial ISP's Unix server. The dictionary, for example, consisted of two large, static HTML files (one for Thai-to-English, and one for English-to-Thai) which were generated offline by a custom Win32 console application. The input to this generation process was a text file in a proprietary format consisting of simple tags describing the dictionary entries.
A crude graphical Win32 application (written in C++) allowed the webmaster to append new entries to this source file. Existing entries could only be changed by manually editing the proprietary text file itself, a tedious process. Meanwhile, running Xitami for Windows, I took over hosting for the server over a residential DSL line.
When thai-language.com was redesigned in 2001, a major thrust was the reimplementation of the online dictionary system and a significant addition to the number of entries. The goals of the redesign were to:
The second version of the program, DBEdit, uses a library of C++ objects which is shared with a server-side COM/ActiveX object (discussed below). These C++ objects provide basic functionality for the new database model, and provide a single point of access to the proprietary format of the binary disk files.
With these new binary-format files, the "compilation" step was eliminated and server-side COM/ActiveX provided on-the-fly access to the dictionary data.
Server-side COM/ActiveX objects were created to read the data, and these objects were integrated into ASP pages using server-side VBScript. In this way, each dictionary entry could now present a dynamically generated page of its own.
These changes required switching the backend software from Xitami to Microsoft Internet Information Server (IIS) running on Windows 2000 Server, which at the time ran on a multiprocessor Compaq ProLiant 1850R, so that ASP, VBScript and server-side COM/ActiveX could be supported. Several steps were taken to ensure adequate performance of this design, such as engineering the dictionary COM/ActiveX objects as "neutral-threaded," and including the dictionary "application object" (which performs time-consuming initialization) in the IIS global.asa/application object. In November 2002, further optimization was achieved by recoding critical database routines in assembly language.
In addition, several site monitoring tools were developed which allow activity to be displayed on a remote system. This system eschews bulk HTTP logging in favor of an "invasive" COM/ActiveX probe object in the ASP code to provide specific, concise data.
In January 2005, I did a fresh-install upgrade to Windows Server 2003 and the latest versions of ancillary software. Over four grueling days in March of that year, all of the "glue" code, over 3000 lines of VBScript, was rewritten in C#, formally converting the server side to the Microsoft .NET Framework 1.1. The custom COM/ActiveX objects were wrapped for early binding using the "tlbimp" tool so that they could be used from the "managed" C# code. The advantages of this switchover were:
In January 2007, the site's server hardware was again replaced, its C++ code was migrated to the Visual Studio 2005 development environment, and its C# code was migrated to ASP.NET version 2.0. In May, another major overhaul was completed, again involving the rewrite of nearly all the server-side C# code. This time the goal was to organize the content into compartmentalized modules in order to enhance maintainability and performance. Also at this point, 90% of the app moved into a pre-compiled .NET HTTP module, as opposed to being JIT compiled by ASP.NET. By June 2007, all of the COM/ActiveX code had been excised in favor of much more elegant C++/CLI interop code.
In the year that followed, the benefits of migrating to these modern platforms (now .NET 3.5 with LINQ) showed up as new features and increased reliability and performance. By June 2008 the dictionary had 35,000 entries and there were new reference articles, lessons, audio clips, features (such as upgrades to 'reverse phonemic transcription') and layout enhancements.
Up until July 2008, the site had been hosted residentially over the slow direction (608 kb/s) of an ADSL connection. In that month the site marked a huge milestone as it was relocated to a professional data center with generous bandwidth. To facilitate this transition and reduce recurring fees, a replacement 1U server was deployed, now running Windows Server 2008.
In the final weeks of 2008, the aging database code was wholly replaced with managed C# code which uses LINQ extensively. On January 1, 2009, the new code was uploaded. At this point, the database became fully Unicode, and very little unmanaged code remained. The original core access routines (binary search coded in assembly language) were junked in favor of CLR managed generic Dictionary<TKey,TValue> objects.
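To illustrate the kind of change described above, here is a minimal sketch in Python (not the site's actual C# or assembly code) contrasting a binary search over a sorted array of headwords with a hash-table lookup. The sample entries and the function names are hypothetical and exist only for illustration.

```python
from bisect import bisect_left

# Hypothetical sample data: (headword, gloss) pairs, sorted by headword.
ENTRIES = sorted([
    ("กิน", "to eat"),
    ("น้ำ", "water"),
    ("บ้าน", "house"),
    ("รัก", "to love"),
])
KEYS = [headword for headword, _ in ENTRIES]


def lookup_binary_search(word):
    """Find a gloss in the sorted array using O(log n) comparisons."""
    i = bisect_left(KEYS, word)
    if i < len(KEYS) and KEYS[i] == word:
        return ENTRIES[i][1]
    return None


# Hash-table index, similar in spirit to a generic dictionary object.
INDEX = dict(ENTRIES)


def lookup_hashed(word):
    """Find a gloss through the hash table, O(1) on average."""
    return INDEX.get(word)
```

Both functions return the same gloss for a known headword and None for an unknown one. The hashed version needs no hand-written search routine, although it gives up the sorted order that a binary search relies on.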
The summer of 2009 saw the replacement of the aging UltimateBB message board software, which had been abandoned by its manufacturer, with phpBB. Seamlessly integrating a full-blown PHP application into the .NET environment required an unexpected amount of work, but after a false start or two, this was accomplished thanks to the design of the IIS7 "integrated pipeline," which allowed me to install managed input and output filters on PHP/FastCGI requests.
I had also been planning to change the server operating system to the 64-bit version of Windows Server 2008, and for logistical reasons I deployed this change on the same day as the new message board, September 9th. With full access to 32 gigabytes of memory and a new SQL server installation (required for phpBB), the server was now poised to support advanced new features such as computational grammar and search keystroke completion.
Code running the site continues to be maintained with the latest .NET updates (now 4.5.1). New features include an AJAX-based prefix search facility which dynamically returns results in order of Thai word frequency, as estimated from a newswire corpus.
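A prefix search ranked by word frequency could work roughly as in the following Python sketch. This is not the site's implementation: the word list, the frequency counts and the function name `prefix_search` are all hypothetical placeholders, and Latin-script words stand in for Thai ones (the same logic applies to any Unicode strings).

```python
from bisect import bisect_left

# Hypothetical corpus frequencies; real counts would come from a corpus.
WORD_FREQ = {"car": 120, "card": 45, "care": 300, "cat": 80, "dog": 500}
SORTED_WORDS = sorted(WORD_FREQ)


def prefix_search(prefix, limit=10):
    """Return up to `limit` words starting with `prefix`, most frequent first."""
    if not prefix:
        return []
    # In a sorted list, all words sharing a prefix form one contiguous run
    # that begins at the insertion point of the prefix itself.
    start = bisect_left(SORTED_WORDS, prefix)
    matches = []
    for word in SORTED_WORDS[start:]:
        if not word.startswith(prefix):
            break
        matches.append(word)
    # Rank by descending frequency; break ties alphabetically.
    matches.sort(key=lambda w: (-WORD_FREQ[w], w))
    return matches[:limit]
```

For example, `prefix_search("car")` returns `["care", "car", "card"]` with the counts above. In a web setting, a function like this would sit behind the endpoint that the AJAX request calls on each keystroke.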