Definition of a Scripting Language
There are many definitions of the term scripting language, and
no single definition fully fits every language commonly
regarded as a scripting language.
Some people categorize languages by their purpose and others
by their features and the concepts they introduce. In this chapter,
we discuss all the characteristics defining a scripting language.
In Chapter 2, we categorize scripting languages based on
their role in the development process.
Compilers Versus Interpreters
Strictly speaking, an interpreter is a computer program that
executes other high-level programs line by line. Languages executed
only by interpreters are called interpreted languages.
To better understand the differences between compilers and
interpreters, let’s take a brief look at compiler architecture (see
Figure 1.1).
As you can see in Figure 1.1, translating source code to
machine code involves several steps:
- First, the source code (which is in textual form) is read
character by character. The scanner groups individual
characters into valid language constructs (such as variables,
reserved words, and so on), called tokens.
- The tokens are passed to the parser, which checks that
the correct language syntax is being used in the program.
In this step, the program is converted to its parse
tree representation.
- Semantic analysis performs type checking. Type checking
validates that all variables, functions, and so on, in
the source program have been used consistently with
their definitions. The result of this phase is intermediate
representation (IR) code.
- Next, the optimizer (optionally) tries to make equivalent
but improved IR code.
- In the final step, the code generator creates target
machine code from the optimized IR code. The generated
machine code is written as an object file.
Figure 1.1: Compiler architecture
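To make the first step more concrete, here is a minimal sketch of a scanner, assuming a toy language with only identifiers, integer literals, and single-character symbols. The class name TinyScanner and the token labels are hypothetical and chosen for illustration; a real compiler's scanner also handles reserved words, comments, string literals, and error reporting.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy scanner; not taken from any real compiler.
class TinyScanner {
    // Reads the source character by character and groups characters into tokens.
    static List<String> scan(String source) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace only separates tokens
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < source.length() && Character.isLetterOrDigit(source.charAt(i))) {
                    i++;
                }
                tokens.add("IDENT(" + source.substring(start, i) + ")");
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < source.length() && Character.isDigit(source.charAt(i))) {
                    i++;
                }
                tokens.add("NUMBER(" + source.substring(start, i) + ")");
            } else {
                tokens.add("SYMBOL(" + c + ")"); // any other character is its own token
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [IDENT(total), SYMBOL(=), IDENT(price), SYMBOL(+), NUMBER(42), SYMBOL(;)]
        System.out.println(scan("total = price + 42;"));
    }
}
```

The parser would then consume this token list rather than the raw characters, which is why the two steps are kept separate.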
To create one executable file, a linking phase is necessary.
The linker takes several object files and libraries, resolves all
external references, and creates one executable object file.
When such a compiled program is executed, it has complete
control of its execution.
Unlike compilers, interpreters handle programs as data that
can be manipulated in any suitable way (see Figure 1.2).
As you can see in Figure 1.2, the interpreter, not the user
program, controls program execution. Thus, we can say the user
program is passive in this case. So, to run an interpreted program
on a host, both the source code and a suitable interpreter
must be available. The presence of the program source (script) is
the reason why some developers associate interpreted languages
with scripting languages. In the same manner, compiled languages
are usually associated with system-programming languages.
Interpreters usually support two modes of operation. In the
first mode, the script file (with the source code) is passed to the
interpreter. This is the most common way of distributing
scripted programs. In the second, the interpreter is run in interactive
mode. This mode enables the developer to enter program
statements line by line, seeing the result of the execution after
every statement. The source code is not saved to a file. This mode
is important for initial system debugging, as we see later in the
book.
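The interactive mode can be sketched as a simple read-evaluate-print loop. The example below is only an illustration under strong assumptions: the "language" consists of statements of the form <int> <op> <int>, and the names TinyRepl, eval, and session are hypothetical. The point is the shape of the loop, in which each statement is executed and its result reported before the next one is read.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical toy interpreter used only to show the interactive loop.
class TinyRepl {
    // Evaluates one statement of the assumed form "<int> <op> <int>".
    static String eval(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 3) {
            return "error: expected <int> <op> <int>";
        }
        try {
            long left = Long.parseLong(parts[0]);
            long right = Long.parseLong(parts[2]);
            switch (parts[1]) {
                case "+": return String.valueOf(left + right);
                case "-": return String.valueOf(left - right);
                case "*": return String.valueOf(left * right);
                default:  return "error: unknown operator " + parts[1];
            }
        } catch (NumberFormatException e) {
            return "error: not a number";
        }
    }

    // Reads statements one at a time and reports each result immediately.
    // Stops at end of input or when the user types "quit".
    static void session(Reader in, Writer out) {
        BufferedReader reader = new BufferedReader(in);
        PrintWriter writer = new PrintWriter(out, true);
        try {
            String line;
            while ((line = reader.readLine()) != null && !line.equals("quit")) {
                writer.println(eval(line));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        session(new InputStreamReader(System.in), new PrintWriter(System.out));
    }
}
```

Passing a file reader instead of the console reader to session would correspond to the first mode, in which a whole script file is handed to the interpreter.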
In the following sections, I provide more details on the
strengths and weaknesses of using compilers and interpreters.
For now, here are some clear trade-offs between the two approaches
that are important for our further discussion:
- Compiled programs usually run faster than interpreted
ones because no high-level code analysis is done at
runtime.
- An interpreter enables the modification of a user program
as it runs, which makes interactive debugging possible.
In general, interpreted programs are much
easier to debug because most interpreters point directly
to errors in the source code.
- Interpreters introduce a certain level of machine independence
because no specific machine code is generated.
- Most important from a scripting point of view, as
we see in a moment, interpreters allow the type of a
variable to change dynamically. Because the user program
is reexamined constantly during execution, variables do
not need to have fixed types. This is much harder to
accomplish with compilers because semantic analysis is
done at compile time.
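The last point can be illustrated with a small sketch of how an interpreter might store variables. It assumes a variable table that maps names to values of any type, so the type belongs to the current value and not to the variable. The class DynamicVariables and its methods are hypothetical and exist only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of an interpreter's variable table.
class DynamicVariables {
    // Names map to values of any type; no type is fixed per variable.
    private final Map<String, Object> variables = new HashMap<>();

    void assign(String name, Object value) {
        variables.put(name, value);
    }

    // The type is read from the current value at the moment of use.
    String typeOf(String name) {
        Object value = variables.get(name);
        return value == null ? "undefined" : value.getClass().getSimpleName();
    }

    // The meaning of "+" is decided at run time from the operand types:
    // integer addition for two integers, concatenation otherwise.
    Object add(String leftName, String rightName) {
        Object left = variables.get(leftName);
        Object right = variables.get(rightName);
        if (left instanceof Integer && right instanceof Integer) {
            return (Integer) left + (Integer) right;
        }
        return String.valueOf(left) + String.valueOf(right);
    }

    public static void main(String[] args) {
        DynamicVariables env = new DynamicVariables();
        env.assign("x", 10);
        env.assign("y", 5);
        System.out.println(env.typeOf("x") + ": " + env.add("x", "y")); // Integer: 15
        env.assign("x", "ten"); // same variable, new type
        System.out.println(env.typeOf("x") + ": " + env.add("x", "y")); // String: ten5
    }
}
```

A compiler that performs type checking ahead of time would have to reject or specially handle the second assignment, whereas here the check simply happens each time the variable is used.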
From this list, we can conclude that interpreters are better suited
for the development process, and compiled programs are better
suited for production use. Because of this, for some languages,
you can find both an interpreter and a compiler. This means
you can reap all the benefits of interpreters in the development
phase and then compile a final version of the program for a
specific platform to gain better performance.
Many of today’s interpreted languages are not purely
interpreted. Rather, they use a hybrid compiler-interpreter approach,
as shown in Figure 1.3.
In this model, the source code is first compiled to some
intermediate code (such as Java bytecode), which is then interpreted.
This intermediate code is usually designed to be very
compact (it has been compressed and optimized). Also, this language
is not tied to any specific machine. It is designed for
some kind of virtual machine, which could be implemented in
software. Basically, the virtual machine represents some kind of
processor, whereas this intermediate code (bytecode) could be
seen as a machine language for this processor.
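The idea of a virtual machine as a processor implemented in software can be sketched in a few lines. The following is a toy stack machine, not the Java virtual machine: the opcodes PUSH, ADD, MUL, and HALT and the class TinyVm are hypothetical, and real bytecode formats are far richer. It shows only how intermediate code can serve as the "machine language" of a software processor.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy virtual machine; the opcodes are not real Java bytecode.
class TinyVm {
    static final int PUSH = 0, ADD = 1, MUL = 2, HALT = 3;

    // Executes the instructions and returns the value on top of the stack at HALT.
    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // program counter of the virtual processor
        while (true) {
            int opcode = code[pc++];
            switch (opcode) {
                case PUSH: stack.push(code[pc++]); break; // operand follows the opcode
                case ADD:  stack.push(stack.pop() + stack.pop()); break;
                case MUL:  stack.push(stack.pop() * stack.pop()); break;
                case HALT: return stack.pop();
                default: throw new IllegalStateException("unknown opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // Intermediate code a compiler might emit for the expression (2 + 3) * 4.
        int[] code = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(code)); // prints 20
    }
}
```

Because the instruction array contains nothing specific to the host processor, the same code runs wherever this small machine has been implemented.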
This hybrid approach is a compromise between purely
interpreted and compiled languages, due to the following
characteristics:
- Because the bytecode is optimized and compact, interpreting
overhead is minimized compared with purely
interpreted languages.
- The platform independence of purely interpreted languages
is retained because the intermediate code can be executed
on any host with a suitable virtual machine.
Lately, just-in-time compiler technology has been introduced,
which compiles bytecode to machine-specific code at runtime
to gain performance similar to that of compiled
languages. I mention this technology throughout the book,
where applicable.