The Language Continuum
The term higher level language (hereafter HLL) seems to cover a narrower landscape than it once did. While C might once have been considered an HLL, most computer scientists nowadays would likely consider it a middle level language (MLL): significantly more expressive than assembly and offering basic platform independence, but quite a bit closer to the bare metal than, say, Java. Java, C++, and C# would all be considered modern HLLs (some might quibble with the inclusion of C++ because some C++ programs are basically C with a few extra features, but I consider the modern features of the language, such as the Standard Template Library and full OOP support, to be high level components). The exact continuum cannot be specified precisely, but in general, the higher the level, the fewer the lines of code needed to accomplish a task and the greater the set of standard libraries available to eliminate the tedium of reinventing well solved wheels.
Any complete system generally requires software written at many levels. Device drivers and operating systems require some assembly (a low level language, or LLL) and tend to be written mainly in MLLs like C, while modern applications tend to use C as a minimum, with a preference for HLLs. The trend at all levels is upward, as improvements in development tools, such as compilers with fairly automatic optimizations, have reduced or eliminated the need for human developers to dig into the nitty gritty details of a piece of code to ensure decent performance. In addition, hardware architectures have begun to grow past the levels of complexity at which humans can reliably outperform compilers at determining optimal implementations at the assembly code level.
Moving to higher levels typically trades performance for reduced development and maintenance costs. Today's PCs are so fast, with the processor idle 99% of the time under the not-so-strenuous loads imposed by web browsing, email, and word processing, that the best choice for many new applications would be one of the highest level languages available. Can we not do better than Java and C#, even today? What about the so called scripting languages, such as Python and Ruby? These languages are rarely considered for full-fledged applications, but is it really because they aren't up to the task?
The Level of Optimal Optimization
For high level application development it seems obvious that high level languages are desirable: they reduce total project complexity, especially in terms of total lines of code written specifically for the given project. But many projects require some code to be written at lower levels as well. A 3D game engine that renders a million triangles every few seconds needs to squeeze every bit of performance out of its inner rendering loops, and this might require a bit of hand tuned assembly inserted into what is already highly optimized C. The key is that modern compilers can optimize C or C++ code better than most programmers can, and these tools should only continue to improve. Meanwhile, the skills of a typical programmer will stay fairly constant from generation to generation unless huge breakthroughs are made in educational techniques.
What does this all mean? I think it means that the Level of Optimal Optimization will rise near monotonically over time. That is to say, in maybe 15 years hand coded assembly routines may become obsolete (accessing architecture specific features notwithstanding) because the compiler will produce better code than a human could in 99.99% of cases. In maybe 30 years hand optimizations at the middle levels may similarly be obsoleted by advanced development tools. How far can this be extended? Is there any reason why a future compiler could not turn a decently written Perl script into a chunk of object code that runs faster than hand tuned C? Possibly optimistic projections aside, if these trends continue, eventually even modern HLLs may be obsoleted by the one-two punch of future languages and their super-compiler/optimizers.
Upping the Ante: Dynamic Typing
This question was posed above: "Are scripting languages up to the task of full scale application development?" The big knock on Python et al. in this domain is not so much that they are typically used for scripting (which I believe is a bit of a red herring), but that most are dynamically typed. While dynamic typing facilitates rapid development (making prototyping a solid niche for these languages), the thought is that scalability and long term maintainability will suffer. Statically typed languages ensure that a basic error such as an assignment between incompatible variable types is caught at compile time, but with dynamic typing such bugs are not discovered until run time (when generally an exception is thrown). I believe that this run time versus compile time trade-off underlies the primary objection to dynamically typed languages for large application development.
My personal opinion is that this is flawed thinking. The errors caught by the compiler are generally the simplest errors, because the compiler can only judge the form of the code, not its intent. Logic errors (legal expressions that result in flawed program behavior) are usually much more work to find and debug, and they are language agnostic. Such errors are best caught using solid software engineering principles, including modular design and unit testing, and these principles apply regardless of the chosen language. Even if bugs in software written in dynamic languages truly are harder to debug, that drawback must be weighed against the faster development and smaller total code size that dynamically typed languages offer before truly determining the best approach for any project.
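To make the trade-off concrete, here is a minimal Python sketch (the function and test names are hypothetical): the kind of type mismatch a static compiler would reject at compile time surfaces in Python only when the offending expression is evaluated, but a simple unit test catches it just as early in the development cycle.

```python
import unittest

def total_price(quantity, unit_price):
    # A static compiler would reject a non-numeric argument at compile
    # time; Python defers the check until this expression is evaluated.
    return quantity * unit_price

class TotalPriceTest(unittest.TestCase):
    def test_numeric_arguments(self):
        self.assertEqual(total_price(3, 2.5), 7.5)

    def test_type_error_surfaces_at_run_time(self):
        # Passing a dict where a number is expected raises TypeError
        # only when the multiplication is actually attempted.
        with self.assertRaises(TypeError):
            total_price({"qty": 3}, 2.5)
```

Run under a test runner such as `python -m unittest`; the point is that the discipline of unit testing, not the type system, is what catches the bug before it ships.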
Virtual Roadblocks
Dynamically typed languages have the potential for full scale application development today. Just as assembly can be used to optimize MLLs, MLLs can be used to write modules called from what could be called a general class of very high level languages (VHLLs). Even without the advanced compilers of the future, using profiling to identify performance bottlenecks and replacing only those bottlenecks with more efficient modules coded in C, C++, or Fortran can capture some of the best of all worlds between MLLs and VHLLs.
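As an illustration of the call mechanics involved, here is a minimal sketch, assuming a POSIX system where ctypes can locate the C math library: once a profiler such as cProfile has identified a hot spot, the Python-level call can be routed directly to a compiled C routine. A real project would compile its own C module for the bottleneck, but the dispatch from Python looks the same.

```python
import ctypes
import ctypes.util

# Load the platform's compiled C math library (libm); the fallback
# name is a common glibc soname in case find_library comes up empty.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

def c_sqrt(x):
    # Dispatches straight to the compiled C implementation,
    # bypassing any pure-Python arithmetic.
    return libm.sqrt(x)

print(c_sqrt(2.0))  # 1.4142135623730951
```

In practice one would first run the application under `python -m cProfile` and replace only the functions that dominate the profile.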
There's a problem with this picture though: where do the VM-based languages fit? This class of language, with Java and C# the most prominent examples, has no problem making use of lower level components through invocation of native methods, but coming from the other direction, e.g. calling a Java method from Python, is a problem. VHLLs provide higher levels of abstraction and reduced code complexity, but the cost of invoking a VM to run specific modules from a VHLL interpreter may be too high to make it worthwhile in any sort of optimization strategy. It would simply be easier to cut the VM-based HLL out of the loop and create interpreter callable modules in MLLs, or at least in HLLs such as C++ that are typically compiled to native code.
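The startup cost in question is easy to feel. The sketch below uses a freshly launched Python interpreter as a stand-in for starting a VM per invocation (a real measurement would launch the JVM, which may not be installed); the exact numbers will vary by machine, but the rough shape holds: an in-process call costs microseconds, while spinning up a separate runtime for each call costs tens of milliseconds.

```python
import subprocess
import sys
import time

def in_process():
    # The work itself is trivial; we are measuring call overhead.
    return sum(range(1000))

# In-process call: on the order of microseconds.
start = time.perf_counter()
result_direct = in_process()
direct_time = time.perf_counter() - start

# Launching a fresh runtime per call (a stand-in for starting a VM
# just to run one module): on the order of tens of milliseconds.
start = time.perf_counter()
out = subprocess.run(
    [sys.executable, "-c", "print(sum(range(1000)))"],
    capture_output=True, text=True,
)
spawn_time = time.perf_counter() - start

print(f"in-process: {direct_time:.6f}s, spawned runtime: {spawn_time:.6f}s")
```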
The VM roadblock is primarily historical/cultural and has little to do with the languages themselves; rather, it is an issue with their standard implementations. There is no reason why Java cannot be compiled to native code: simply replace the VM provided for each platform with standard libraries providing the necessary support. From the viewpoint of an application distributor or an end user performing installation the difference is significant, because the "write once, run anywhere" condition is violated, but from pure coding and application usage standpoints the differences should be nil. What would be gained is the ability to seamlessly integrate Java or C# object code into applications written primarily in VHLLs, just as is already possible with C, C++, Fortran, etc. object code. Whether the VHLLs use a VM, an interpreter, or native compilation is of no consequence as long as they themselves are at the top of the application hierarchy.
As a consequence, I think the VM strategy is a potential dead end, or at least a roadblock on the path of programming language progress. Developers looking to use the best application development tools of the near future may find themselves locked into a suboptimal level of abstraction today if a VM-based language is chosen as the foundation of a new project. Even if a decision is made to move up to a VHLL later on, existing projects may be stuck with no easy way to use their VM entrenched code base from the VHLL unless tools that produce native binaries from VM-based HLL code become as common and mature as their bytecode producing counterparts. My suggestion? Leapfrog the VM roadblock completely. The combination of the VHLLs, with their rich libraries often rivaling those of the VM-based HLLs, and the performance benefits of the natively compiled MLLs and HLLs leaves few gaps which need filling.
The Ultimate Goal: Programming the Holodeck
By this point you might be wondering where this can all lead. What is the logical conclusion of the chain of ever ascending levels of abstraction, processing power, and optimization capabilities? Star Trek: The Next Generation introduced us to the holodeck, in essence the ultimate virtual reality producing environment. What really fascinated me about the idea of the holodeck, though, was its incredible programmability. The user was in effect writing a new program each time he used the holodeck, though no esoteric programming code or extensive training was required. The language used was natural language, which, while not as concise as Lisp, more than makes up for it in sheer expressivity.
A typical usage scenario goes like this: the user first describes the general setting (application) in just a few sentences. The computer produces a prototype setting, presumably by calling upon an incredibly vast knowledge base containing huge sets of templates. This knowledge base is the logical (though extreme) extension of the increasingly featureful core libraries and APIs that are becoming standard parts of modern programming languages. The prototype will obviously lack details the user might wish implemented, but starts with a reasonable set of default values in cases of uncertainty. After initial inspection of the prototype, the user specifies progressively finer details until the application meets the desired specifications. While such tuning could no doubt go on for hours in some cases, the program was typically usable in mere seconds and well customized within minutes, and it was all done interactively. Perhaps most importantly, the tuning was done by the actual user, rather than by some programmer who could only hope to anticipate that user's needs.
While the holodeck is sci-fi, its characteristic of ultimate programmability should be the grand vision of programming language, development library, and API designers. If a tool doesn't help someone finish a job more quickly, effectively, or cheaply than what's already available, then it isn't a very worthwhile tool. The gap between modern programming languages and natural language is a great one, but it will only be closed by pushing for higher and higher levels of abstraction. Lower level programmers will still be necessary to tackle problems of increasing complexity, but their work should be increasingly shielded from end users and even from the developers of end user applications.
Final Thoughts
For now I'm doing most of my own development in C/C++, with some supplemental Python. Why not more Python and less C/C++? Partly it's portability issues: in the case of my current project (a fluid flow simulation visualizer), it's simply easier to output binaries with only a few common dependencies for the target machines than to ship Python code that incurs additional dependencies simply for the appropriate Python-to-C bindings. Performance is also an issue, as fluid flow visualization can be fairly memory and CPU hungry.
My hope is that in the near future dependency issues at least will become nearly insignificant. If I cannot rely on the package "foo" being available on most machines (much less the additional Python bindings for package "foo"), it would be almost as good if I could be sure that an end user could easily or automatically acquire such dependencies as needed. My preferred OS makes such actions relatively easy, and a few other projects may provide their own solutions. Better optimizers for VHLLs will help my cause as well.
In any case, the VM-based languages are one step on the ladder of progress I'll probably be skipping over.
Copyright (C) 2005 Matt Heinzen. Redistribution allowed under the terms of the GNU Free Documentation License.