Yinpu
Computer College
Wuhan University
Wuhan, China
e-mail: yinpu@whu.edu.cn
Abstract-This paper introduces a new programming language, Quasi-Natural Language, and an implementation of it, the Kaimeng language processing platform. The language represents knowledge in the way natural language does, and its elements are an unlimited set of words rather than the few dozen commands of a traditional programming language. It is intended for complicated and abstract programming requirements. It is an open language: both its extension and its intension can be expanded without limit, that is, neither the size of the vocabulary nor the meaning of a word is bounded. Any letter or character accepted by the computer may serve as the alphabet of the language, e.g., Chinese, English, or even a mixture of several languages. All the features presented in this paper are implemented in Kaimeng, which was developed by the author.
Keywords- Quasi-Natural Language Programming; Natural language understanding; Knowledge representation
In natural language you can ask your wife to “take a pound of spinach home”; in a computer language you can merely ask a computer to perform an addition. In natural language you can claim that “the two men look alike”; in a computer language you can merely let the computer judge whether one integer is larger than another. In natural language even a complicated task can be requested in a simple form, such as “count the females working as computer scientists”. Comparing the two, natural languages express complicated and abstract concepts with ease, while computer programming languages deal only with simple, detailed trivia. A task that is expressed simply in natural language may become an intolerable amount of work in a computer language. Making computers accept instructions in natural language would undoubtedly raise programming capability to a higher level. However, two major barriers prevent the emergence of programming languages that approach natural language. The first is that a new kind of parser is required, because natural language employs an unlimited number of words instead of the small number of commands in a traditional computer language; compilers and interpreters based on formal languages are no longer effective. The second is that a new method of semantic representation is necessary. Traditional languages define only a syntax and leave the interpretation of data to the application program, whereas the words of a natural language are tightly bound to physical meaning. Proper representation of semantics is therefore important in a natural language processor.
The most significant differences between natural languages and traditional programming languages appear in two aspects: knowledge representation and openness of the language.
Natural languages evolve continuously. New words join the dictionary and old words take on new meanings over time. Words such as “WWW” and “CPU” entered the vocabulary with the computer, and “surf” acquired a new connotation. These expansions do not require the architecture of the language to be reconstructed. Natural languages are therefore open languages.
Computer languages are closed languages composed of a few dozen commands that are predefined in compilers or interpreters. Once a compiler or interpreter is built, the number and meaning of the commands are fixed. If a new command, or a new function for an existing command, is needed, the language processor has to be redesigned.
Researchers made feasibility studies of natural language programming at an early stage [1][2][3][4]. Some tried to implement natural language programming, as in the work of Hugo Liu and Henry Lieberman [1]. Their paper proposes some elementary ideas, such as “Action, or non-copular verbs map to functions, while noun phrases map to classes. Adjectival modifiers map to properties of a class, and adverbial modifiers map to auxiliary arguments to functions”. Other important topics, such as object inheritance and object inference, are also discussed.
Some progress has been made recently, for example sEnglish [5] and PENG [6]. Most of these systems are derived from formal logic theories, which are especially suited to reasoning. The author, however, believes that logical calculus is not the core of natural language and that new viewpoints should be considered.
Quasi-Natural Language adopts a different approach from other programmable natural language systems. Massive knowledge stored in revisable dictionaries serves as its foundation. Building on existing knowledge, complicated cases can be expressed briefly in short sentences.
Quasi-Natural Language is proposed here in an attempt to program the way humans think. Until totally free natural language can be accepted and understood by computers, a controlled language with natural language features is a logical solution. The language processor of Quasi-Natural Language and the language itself are two independent systems. They are connected by the syntactic and semantic regulations of Quasi-Natural Language, and under these regulations each may develop independently.
As in natural languages, the elements of Quasi-Natural Language are words, and the vocabulary is unlimited. A word is composed of a word name and multiple properties, as shown in Figure 1.
A property is composed of a property name and a property value. Every word has at least one property, named “PartofSpeech”, which defines the syntactic role of the word and is mandatory. Other properties are defined as needed, and the number of properties is unlimited. TABLE I shows the definition of the noun “wav”, which defines the properties required to handle sound in .wav format.
TABLE I. Definition of the noun “wav”
PropertyName | PropertyValue |
PartofSpeech | noun |
DataType | complex |
wavHeader | |
wavFormat | |
wavData | |
DevieceHandle | |
FileSpecification | |
LeftChannelSignal | |
RightChannelSignal | |
Each property in TABLE I is in turn defined as a word of its own. TABLE II shows the definition of the property “wavHeader”.
TABLE II. Definition of the property “wavHeader”
PropertyName | PropertyValue |
PartofSpeech | noun |
DataType | complex |
ChunckID | RIFF |
FileSize | |
WavID | WAVE |
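The word structure described above can be pictured as a name plus an open-ended mapping of properties. The following Python sketch is only an illustrative model, not Kaimeng's actual storage format: the `Word` class, the `define` helper and the in-memory `dictionary` are hypothetical names, and empty table cells are assumed to mean "no value yet". The property names are copied as they appear in TABLE I and TABLE II.

```python
from dataclasses import dataclass, field

REQUIRED_PROPERTY = "PartofSpeech"  # every word must carry this property


@dataclass
class Word:
    """A word: a name plus an open-ended mapping of property name to value."""
    name: str
    properties: dict = field(default_factory=dict)

    def __post_init__(self):
        # Reject words that lack the mandatory part-of-speech property.
        if REQUIRED_PROPERTY not in self.properties:
            raise ValueError(f"word {self.name!r} has no {REQUIRED_PROPERTY}")


dictionary = {}  # word name -> Word (hypothetical in-memory dictionary)


def define(name, **properties):
    """Add a word to the dictionary and return it."""
    word = Word(name, properties)
    dictionary[name] = word
    return word


# TABLE II: the property "wavHeader" is itself a word with its own properties.
define("wavHeader", PartofSpeech="noun", DataType="complex",
       ChunckID="RIFF", FileSize=None, WavID="WAVE")

# TABLE I: empty table cells are modelled as None (an assumption).
define("wav", PartofSpeech="noun", DataType="complex",
       wavHeader=None, wavFormat=None, wavData=None, DevieceHandle=None,
       FileSpecification=None, LeftChannelSignal=None, RightChannelSignal=None)

# A property of "wav" can be looked up as a word in its own right.
header = dictionary["wavHeader"]
print(header.properties["ChunckID"])  # RIFF
```

Because the properties live in an ordinary mapping, adding a property or a whole new word changes only the data, which is the openness the language aims for.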
After all data properties are defined, the actions upon a word can be defined as verbs. TABLE III shows the definition of the verb “play”. The property “Reference” indicates that this verb is proper only to the noun “wav”. Other verbs may be defined similarly.
TABLE III. Definition of the verb “play”
PropertyName | PropertyValue |
PartofSpeech | verb |
VerbPath | c:\Kaimeng\dsp |
VerbFile | wavproc.dll |
VerbFunction | _PlayWav |
Reference | wav |
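TABLE III binds a verb to a function in a library file and restricts it to one noun. A minimal Python sketch of that binding follows. It does not load the real wavproc.dll: the `NATIVE_FUNCTIONS` registry, the `_play_wav` stand-in and the `invoke` helper are all hypothetical, and objects are assumed to be plain mappings that record their noun.

```python
# Stand-in for the functions exported by wavproc.dll. Assumption: the real
# library is not loaded here; a plain Python function plays its role.
def _play_wav(obj):
    return f"playing {obj['name']}"


# (VerbFile, VerbFunction) -> callable; a hypothetical lookup table.
NATIVE_FUNCTIONS = {("wavproc.dll", "_PlayWav"): _play_wav}

# The verb definition, with property names and values taken from TABLE III.
play = {
    "PartofSpeech": "verb",
    "VerbPath": r"c:\Kaimeng\dsp",
    "VerbFile": "wavproc.dll",
    "VerbFunction": "_PlayWav",
    "Reference": "wav",
}


def invoke(verb, obj):
    """Run a verb on an object, refusing nouns the verb is not proper to."""
    if verb["PartofSpeech"] != "verb":
        raise TypeError("only verbs can be invoked")
    if obj["noun"] != verb["Reference"]:
        raise TypeError(f"verb is not proper to noun {obj['noun']!r}")
    func = NATIVE_FUNCTIONS[(verb["VerbFile"], verb["VerbFunction"])]
    return func(obj)


print(invoke(play, {"name": "MyVoice", "noun": "wav"}))  # playing MyVoice
```

The check on “Reference” compares nouns exactly; accepting offspring nouns as well would require walking the ancestor chain, which is left out of this sketch.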
The detailed definition of words is tedious. Fortunately, it is not a burden on the users of the language; it is left to the designers of the language.
All words are defined in dictionaries. Although the structure of a dictionary abides by the language regulations, the content of a dictionary is independent of the language processor. The language can therefore be amended by modifying dictionaries, without rebuilding the language processor.
Knowledge inheritance is performed simply in Quasi-Natural Language by defining a “WordClass” property in a noun and setting its value to another noun. The former is called the offspring and the latter the ancestor. The offspring inherits all properties and proper verbs from the ancestor. In TABLE IV, the noun “WavArchive” inherits all properties and proper verbs from the noun “wav” by setting the value of the property “WordClass” to “wav”. The offspring may also define its own properties: if the ancestor has a property of the same name, it is overridden by the offspring's property; otherwise a new property is added. Any noun may be used as a class noun.
TABLE IV. Definition of the noun “WavArchive”
PropertyName | PropertyValue |
PartofSpeech | noun |
ClassWord | wav |
FileSpecification | WavArchiveSpecification |
wavFormat | StandardFormat |
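The inheritance rule can be sketched as a recursive merge in which offspring values override ancestor values of the same name. This is an illustrative Python model under stated assumptions, not Kaimeng's implementation: `resolve` and `words` are hypothetical names, only a few properties of “wav” are listed, and the name of the class property is passed as a parameter because the text calls it “WordClass” while TABLE IV prints it as “ClassWord”.

```python
def resolve(name, words, class_property="WordClass"):
    """Return the effective properties of a noun.

    Ancestor properties are collected first, then the noun's own properties
    are laid over them, so same-named properties are overridden and new
    ones are added.
    """
    word = words[name]
    ancestor = word.get(class_property)
    inherited = resolve(ancestor, words, class_property) if ancestor else {}
    return {**inherited, **word}


# Abridged definitions; None models an empty table cell (an assumption).
words = {
    "wav": {
        "PartofSpeech": "noun",
        "DataType": "complex",
        "wavFormat": None,
        "FileSpecification": None,
        "LeftChannelSignal": None,
    },
    "WavArchive": {
        "PartofSpeech": "noun",
        "WordClass": "wav",
        "FileSpecification": "WavArchiveSpecification",
        "wavFormat": "StandardFormat",
    },
}

archive = resolve("WavArchive", words)
print(archive["DataType"])   # complex (inherited from "wav")
print(archive["wavFormat"])  # StandardFormat (overridden by the offspring)
```

The sketch does not guard against a noun that names itself, directly or indirectly, as its own ancestor; a real dictionary manager would need such a check.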
After all nouns and verbs are defined, programs can be written:
ThereIs WavArchive, naming MyVoice. record MyVoice. play MyVoice.
There are three sentences in this program. The first declares an object of “WavArchive” called “MyVoice”. The second records sound into the object “MyVoice” through the microphone. The third simply replays the sound of “MyVoice”.
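To make the reading of these three sentences concrete, the following toy Python interpreter handles just the two sentence shapes used in the program: a declaration and a verb applied to one object. It is a sketch under stated assumptions and says nothing about how Kaimeng actually parses: `run`, `DECLARATION` and the `verbs` table are hypothetical, and the verbs merely report what they would do.

```python
import re

# "ThereIs [a|an] <noun>, naming <name>" (the article is optional).
DECLARATION = re.compile(r"ThereIs\s+(?:an?\s+)?(\w+)\s*,\s*naming\s+(\w+)$")


def run(program, verbs):
    """Interpret a program sentence by sentence.

    Returns the scene (object name -> object) and a log of verb results.
    Splitting on '.' is a simplification that only suits this example.
    """
    scene = {}
    log = []
    for sentence in program.split("."):
        sentence = sentence.strip()
        if not sentence:
            continue
        match = DECLARATION.match(sentence)
        if match:
            noun, name = match.groups()
            scene[name] = {"noun": noun}  # create the object in the scene
            continue
        verb, name = sentence.split()     # "<verb> <object>"
        log.append(verbs[verb](name, scene[name]))
    return scene, log


# Placeholder verbs: each just describes the action it stands for.
verbs = {
    "record": lambda name, obj: f"record {obj['noun']} {name}",
    "play": lambda name, obj: f"play {obj['noun']} {name}",
}

scene, log = run(
    "ThereIs WavArchive, naming MyVoice. record MyVoice. play MyVoice.", verbs
)
print(scene)  # {'MyVoice': {'noun': 'WavArchive'}}
print(log)    # ['record WavArchive MyVoice', 'play WavArchive MyVoice']
```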
The Kaimeng language processing platform consists of four parts: dictionary manager, program editor, scene manager and language interpreter.
Dictionary manager: provides interactive interfaces for creating, deleting, modifying and browsing dictionaries and words.
Program editor: provides interactive interfaces for creating, opening and editing Quasi-Natural Language programs. Since the source program of Kaimeng is structured, an ordinary text editor cannot edit a Kaimeng program.
Scene manager: provides access to the scene. The scene manager is not an independent application; other applications access the scene through the interface commands of the scene manager. These commands include creation, modification, deletion and retrieval of objects in the scene.
Language interpreter: provides interpretation and execution of Kaimeng programs.
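The four interface commands of the scene manager (creation, modification, deletion and retrieval of objects) can be sketched as a small class. This is a hypothetical Python model for illustration only; the class name `Scene`, its method names and the representation of an object as a mapping are assumptions, not Kaimeng's real interface.

```python
class Scene:
    """Holds the objects created by running programs, keyed by object name."""

    def __init__(self):
        self._objects = {}

    def create(self, name, noun, **properties):
        """Create a new object of the given noun; names must be unique."""
        if name in self._objects:
            raise KeyError(f"object {name!r} already exists")
        self._objects[name] = {"noun": noun, **properties}

    def modify(self, name, **properties):
        """Set or overwrite properties of an existing object."""
        self._objects[name].update(properties)

    def delete(self, name):
        """Remove an object from the scene."""
        del self._objects[name]

    def retrieve(self, name):
        """Return a copy so callers cannot change the scene by accident."""
        return dict(self._objects[name])


scene = Scene()
scene.create("MyVoice", "WavArchive")
scene.modify("MyVoice", wavFormat="StandardFormat")
print(scene.retrieve("MyVoice"))
# {'noun': 'WavArchive', 'wavFormat': 'StandardFormat'}
```

Routing every access through these four methods mirrors the statement that other applications reach the scene only through the scene manager's commands.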
In the following examples, the text printed in bold is executable program text. All of the programs have been run successfully.
The following script computes the factorial of 9, that is, the product of the integers from 1 to 9.
Script: ThereIs an integer, naming p. p be 1. ThereIs an integer, naming i. i be 1. if i<10, repeat {Link1}.
Link1: p be p*i. i be i+1.
The sentences have the following meanings:
ThereIs an integer, naming p: create an integer object and name it p.
p be 1: assign 1 to p.
ThereIs an integer, naming i: create an integer object and name it i.
i be 1: assign 1 to i.
if i<10, repeat {Link1}: execute {Link1} repeatedly while the expression i<10 is true.
{Link1}
p be p*i: assign p*i to p.
i be i+1: assign i+1 to i.
{Link1} is like a compound statement in the C language.
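For comparison, the same computation in a traditional language takes about the same number of statements. The Python transcription below follows the script sentence by sentence; it is a reading of the script's described meaning, not output from Kaimeng.

```python
p = 1              # ThereIs an integer, naming p. p be 1.
i = 1              # ThereIs an integer, naming i. i be 1.
while i < 10:      # if i<10, repeat {Link1}
    p = p * i      # p be p*i
    i = i + 1      # i be i+1
print(p)           # 362880, the factorial of 9
```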
Evidently, Quasi-Natural Language shows no advantage over traditional languages in low-level programming.
The following script implements voice recording, filtering and playback. Every operation in this example requires considerable programming effort in traditional languages.
Script: ThereIs a wav, naming original. original descend wavFormat be StandardFormat. record original. FFT original descend LeftChannelSignal, produce spectra. filter spectra 3000 and 20. reFFT spectra, produce FilteredSignal. LeftChannelSignal be FilteredSignal. play wav.
The sentences have the following meanings:
ThereIs a wav, naming original: create a wav object named “original”.
original descend wavFormat be StandardFormat: assign the noun “StandardFormat” to the object “wavFormat” under “original”. That is, set the “wavFormat” property of “original” as defined in the word “StandardFormat”.
record original: execute the verb “record” upon the object “original”. The verb “record” performs the recording and writes the voice data into the LeftChannelSignal and RightChannelSignal properties of “original”.
FFT original descend LeftChannelSignal, produce spectra: execute a Fast Fourier Transform with the verb “FFT” upon the object “LeftChannelSignal” under the object “original”, and name the spectrum produced by the verb “FFT” as “spectra”.
filter spectra and 3000 and 20: execute the verb “filter” upon the object “spectra” with the constants 3000 and 20. The result is written back to “spectra” itself.
reFFT spectra, produce FilteredSignal: execute an inverse Fast Fourier Transform with the verb “reFFT” upon the object “spectra”, and name the new object produced by the verb “reFFT” as “FilteredSignal”.
LeftChannelSignal be FilteredSignal: assign the object “FilteredSignal” to the object “LeftChannelSignal”.
play FilteredSignal: execute the verb “play” upon the object “FilteredSignal” to play back the filtered sound.
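The transform, filter and inverse-transform steps can be sketched in a traditional language to show what the three verbs stand for. The Python code below is an illustration under several assumptions, not the code behind Kaimeng's verbs: the paper does not say what the constants 3000 and 20 mean, so they are assumed here to be the upper and lower cut-off frequencies in Hz of a band-pass filter; a plain discrete Fourier transform replaces a true FFT to keep the example free of third-party libraries; and the recording is replaced by a synthetic signal sampled at an assumed 8000 Hz. All function names are hypothetical.

```python
import cmath
import math


def dft(signal):
    """Discrete Fourier transform (O(n^2); stands in for the verb "FFT")."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Inverse transform, returning real samples (stands in for "reFFT")."""
    n = len(spectrum)
    return [
        (sum(c * cmath.exp(2j * cmath.pi * k * t / n) for k, c in enumerate(spectrum)) / n).real
        for t in range(n)
    ]


def band_pass(spectrum, sample_rate, high_hz, low_hz):
    """Zero every bin whose frequency lies outside [low_hz, high_hz]."""
    n = len(spectrum)
    kept = []
    for k, c in enumerate(spectrum):
        # Bins above n/2 mirror the negative frequencies.
        freq = min(k, n - k) * sample_rate / n
        kept.append(c if low_hz <= freq <= high_hz else 0)
    return kept


SAMPLE_RATE = 8000  # Hz, assumed
N = 400             # number of samples, giving 20 Hz per bin


def tone(freq):
    return [math.sin(2 * math.pi * freq * t / SAMPLE_RATE) for t in range(N)]


low_tone, high_tone = tone(1000), tone(3500)
recorded = [a + b for a, b in zip(low_tone, high_tone)]  # stand-in for "record"

spectra = band_pass(dft(recorded), SAMPLE_RATE, 3000, 20)  # FFT, then filter
filtered_signal = inverse_dft(spectra)                     # reFFT

# The 3500 Hz tone is removed, so the result should match the 1000 Hz tone.
error = max(abs(a - b) for a, b in zip(filtered_signal, low_tone))
print(error < 1e-6)
```

Even this stripped-down version needs three helper functions before the pipeline itself can be written, whereas the script names each step with a single sentence.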
In the first example Kaimeng is similar to a traditional language and shows no advantage at all. The second example, however, indicates that Kaimeng can describe in a few sentences a task that may require thousands of statements in a traditional language.
Although totally free natural language programming is not yet in sight, programming with words in natural form under a restricted syntax is practical. Its impact on information science may be comparable to the impact of natural language on human civilization.
[1] Hugo Liu and Henry Lieberman, "Toward a Programmatic Semantics of Natural Language," in 2004 IEEE Symposium on Visual Languages - Human Centric Computing (VLHCC'04), 2004, pp. 281-282.
[2] Henry Lieberman and Hugo Liu, "Feasibility Studies for Programming in Natural Language," in Human-Computer Interaction Series, vol. 9, Springer Netherlands, 2006, pp. 459-473.
[3] Hugo Liu and Henry Lieberman, "Programmatic Semantics for Natural Language Interfaces," in CHI '05 Extended Abstracts on Human Factors in Computing Systems, Late breaking results: short papers, Portland, OR, USA, 2005, pp. 1597-1600.
[4] Manolis Maragoudakis, Nikolaos Cosmas and Aristogiannis Garbis, "Mining Natural Language Programming Directives with Class-Oriented Bayesian Networks," in Lecture Notes in Computer Science, vol. 5139, 2008, pp. 15-26.
[5] S. M. Veres and L. Molnar, "Publishing papers and books for autonomous vehicle agents," http://system-english.com/files/IEEE_CM_agv.pdf.
[6] Kerry Trentelman, "Processable English: The Theory Behind the PENG System," DSTO Formal Reports, Report number DSTO-TR-2301, AR number AR-014-554, File number 2009/1016220, June 2009.