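Quasi-Natural Language Programming Sharing http://blog.sciencenet.cn/u/yinpu The blogger develops the Kaimeng language, which has the main features of natural language: the language expresses knowledge, and its vocabulary is unlimited.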


Quasi-Natural Language Programming

Read 2045 times. Posted 2019-11-3 11:03 | Personal category: Quasi-natural language programming | System category: Paper exchange


 


Pu Yin

Computer Science College, Wuhan University

Wuhan, China

yinpu@whu.edu.cn

 

 


Abstract: This paper introduces a new kind of programming language, quasi-natural language, and an implementation of it, the Kaimeng language processing platform. Much effort went into natural language programming in the early days of computing, but it could not succeed without an effective knowledge processing mechanism. Kaimeng represents knowledge the way natural language does and is designed to work at a higher level than traditional programming languages. The elements of the language are words, unlimited in number, in a form close to natural language. Both the extension and the intension of the vocabulary are expandable. The alphabet or characters of any natural language may be used, e.g., Chinese, English, or even a mixture of several languages.

Keywords: quasi-natural language programming; natural language understanding; knowledge representation

I. INTRODUCTION

Traditional computer programming languages (referred to below as traditional languages) are awkward at expressing abstract concepts or complicated requirements, whereas natural languages describe abstract and concrete concepts, or intricate and simple cases, with equal ease. In natural language you can ask your wife to “take a pound of spinach home”; in a computer language you can only ask the computer to perform an addition. In natural language you can claim that “the two brothers look alike”; in a computer language you can only have the computer judge whether an integer variable A is larger than a variable B. Information that is expressed simply in natural language may be painful to express in a traditional language. The disparity between the two kinds of language comes from knowledge representation. In natural language we set off from existing knowledge; in traditional languages we start from detailed statements with almost no knowledge behind them. Programming in natural language would therefore be a liberation for programmers.

Researchers have explored the feasibility of programming in natural language since the early stages of computer science [1] [2] [3] [4]. Attempts to implement natural language programming [1] produced some important ideas, such as “Action, or non-copular verbs (everything except verbs like to be, and to seem) map to functions, while noun phrases map to classes. Adjectival modifiers map to properties of a class, and adverbial modifiers map to auxiliary arguments to functions”. Other important topics, such as object inheritance and object inference, are also discussed.

Although completely free-style natural language programming is still not practical, programming with natural words under certain lexical and syntactic rules is possible. This paper puts forward a language architecture called quasi-natural language and implements it on a language processing platform named Kaimeng.

II. DIFFERENCES BETWEEN NATURAL LANGUAGE AND TRADITIONAL LANGUAGE

The most fundamental differences between natural languages and traditional languages lie in two aspects. The first is knowledge representation: natural languages are tools for representing knowledge, while traditional languages are merely tools for describing procedures and contain almost no knowledge. The second is openness: natural languages are open, expandable both in intension and in extension, while traditional languages are closed, with a limited number of predefined statements or commands whose format and meaning are fixed once the language processor is built.

These two differences give rise to the following features of quasi-natural language:

A. Knowledge representation

Knowledge is always expressed, implicitly or explicitly, in natural language. When we say “send my regards to John by e-mail”, some implied knowledge is employed: what regards and e-mail are, who John is, what his e-mail address is, and how to send a mail. Words in natural language therefore act not only literally but also as carriers of meaning.

B. Openness of language

Natural languages evolve ceaselessly. New words join the dictionary, and old words take on new meanings over time. The words “WWW” and “CPU” entered the vocabulary along with “computer”, and “surf” was given a new connotation. These expansions do not require the architecture of the language to be rebuilt. Natural languages are therefore open languages.

Computer languages are closed languages composed of dozens of commands that are predefined in compilers or interpreters. Once a compiler or interpreter is built, the number and meaning of the commands are fixed. If a new command, or a new function of an existing command, is needed, the language processor has to be redesigned. Traditional computer languages are therefore closed.

C. Independence of language

In traditional languages the language and the language processor are integrated. The openness of a language, by contrast, requires the language to be independent of its processor: while the language processor stays unchanged, the language can evolve independently.

D. Knowledge accumulation

When new knowledge is added to the language, previous knowledge should stay untouched and effective unless it is changed deliberately. New knowledge may be built upon existing knowledge.

E. Knowledge inheritance

The best way to make use of existing knowledge is inheritance, as in object-oriented programming (OOP). Quasi-natural language adopts it as well.

F. Close to natural language

Although it is not strictly necessary, quasi-natural language takes a form similar to natural language, to ease the burden of learning a new language and to stay close to the way humans think.

III. KNOWLEDGE REPRESENTATION

Knowledge representation in quasi-natural language mainly describes the data structure of objects and the operations suitable for them. All knowledge appears in the form of words, as in natural language. A word consists of a word name and multiple properties. Every word has at least the property “partofspeech”, which defines its syntactic role. Other properties are defined as needed.
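A word as described here can be pictured as a small record. The following is a minimal C++ sketch, assuming properties are plain name/value strings; the names `Word` and `make_word` and the use of `std::map` are illustrative assumptions, not Kaimeng's actual internals. The property is spelled "PartofSpeech" as in the tables of this paper.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical in-memory form of a word: a name plus named properties.
struct Word {
    std::string name;
    std::map<std::string, std::string> properties;  // property name -> value

    // Returns true when the property is defined on this word.
    bool has(const std::string& property) const {
        return properties.count(property) != 0;
    }
};

// Builds a word. Every word must carry at least the "PartofSpeech" property,
// so it is required here; all other properties are added as needed.
inline Word make_word(std::string name, std::string part_of_speech) {
    assert(!name.empty());
    Word w;
    w.name = std::move(name);
    w.properties["PartofSpeech"] = std::move(part_of_speech);
    return w;
}
```

For instance, `make_word("FileSize", "noun")` followed by setting `properties["DataType"] = "int"` would mirror the noun FileSize.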

The formal expression of a word:

<word> ::= <word name> <property> {<property>}

A. Data and Structure Definition

Data and their structure are represented by nouns. TABLE I shows the structure of the noun wav, which represents sound in .wav format.

TABLE I      NOUN wav

PropertyName          PropertyValue
PartofSpeech          noun
DataType              complex
wavHeader
wavFormat
wavData
DeviceHandle
FileSpecification
LeftChannelSignal
RightChannelSignal

The properties PartofSpeech and DataType are syntax properties, which define the syntactic role of the word and the type of its data. They are predefined in the language interpreter. The other properties in TABLE I are entity properties, which need to be defined further; for example, the property wavHeader is defined in TABLE II.

TABLE II      NOUN wavHeader

PropertyName          PropertyValue
PartofSpeech          noun
DataType              complex
ChunckID              RIFF
FileSize
wavID                 WAVE

The properties in TABLE I and TABLE II are in turn defined by more nouns, until every property name corresponds to a noun of the same name. TABLE III shows the definition of the noun FileSize from TABLE II. Its type is int, an elementary type predefined in Kaimeng. At this point all properties in TABLE III are predefined properties, and no further definition is needed.

TABLE III      NOUN FileSize

PropertyName          PropertyValue
PartofSpeech          noun
DataType              int
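The rule that definitions continue until every property name corresponds to a noun of the same name can be checked mechanically. The sketch below is one hedged way to do so in C++; `Noun`, `NounDictionary`, and `fully_defined` are hypothetical names, and treating every type other than "complex" as needing no further expansion is an assumption made for illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical noun definition: its DataType plus the names of its entity
// properties (each of which should itself be the name of a noun).
struct Noun {
    std::string data_type;                // "complex", "int", "customer", ...
    std::vector<std::string> properties;  // entity property names
};

using NounDictionary = std::map<std::string, Noun>;

// Helper: `in_progress` holds the nouns currently being expanded, so a
// definition that refers back to itself is rejected instead of looping.
inline bool fully_defined(const NounDictionary& dict, const std::string& noun,
                          std::set<std::string>& in_progress) {
    auto it = dict.find(noun);
    if (it == dict.end()) return false;                  // no noun of that name
    if (it->second.data_type != "complex") return true;  // nothing to expand
    if (!in_progress.insert(noun).second) return false;  // circular definition
    bool ok = true;
    for (const std::string& property : it->second.properties) {
        if (!fully_defined(dict, property, in_progress)) {
            ok = false;
            break;
        }
    }
    in_progress.erase(noun);
    return ok;
}

// Returns true when `noun` and every property reachable from it are defined.
inline bool fully_defined(const NounDictionary& dict, const std::string& noun) {
    assert(!noun.empty());
    std::set<std::string> in_progress;
    return fully_defined(dict, noun, in_progress);
}
```

With only wav, wavHeader, and FileSize in the dictionary, the check succeeds for wav as long as wav lists just wavHeader; it fails as soon as wav names a property that has no noun of its own.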

Besides elementary types and the complex type, Kaimeng accepts a customer (user-defined) type. TABLE IV shows the noun DeviceHandle, whose type is customer. Kaimeng does not handle objects of customer type directly; their suitable verbs do the job.

TABLE IV      NOUN DeviceHandle

PropertyName          PropertyValue
PartofSpeech          noun
DataType              customer
CreateVerb            Create
DeleteVerb            Delete
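One way to picture a customer-type object is as an opaque pointer whose lifetime is managed entirely by its verbs. The C++ sketch below is an assumption-laden illustration, not Kaimeng's implementation: the class `CustomerObject` and its callback types are hypothetical, and the two callbacks stand in for the verbs named by the CreateVerb and DeleteVerb properties.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical wrapper for an object of "customer" type: the interpreter
// keeps only an opaque pointer and leaves creation and deletion to the
// verbs named by the CreateVerb and DeleteVerb properties.
class CustomerObject {
public:
    using CreateVerb = std::function<void*()>;
    using DeleteVerb = std::function<void(void*)>;

    CustomerObject(const CreateVerb& create, DeleteVerb destroy)
        : data_(create()), destroy_(std::move(destroy)) {
        assert(destroy_);  // a delete verb must be supplied
    }

    // The delete verb runs exactly once, when the object goes away.
    ~CustomerObject() {
        if (data_) destroy_(data_);
    }

    CustomerObject(const CustomerObject&) = delete;
    CustomerObject& operator=(const CustomerObject&) = delete;

    // The interpreter never looks inside; it only passes the pointer on.
    void* data() const { return data_; }

private:
    void* data_;
    DeleteVerb destroy_;
};
```

The design choice here is that the wrapper never interprets the pointer, which matches the statement that objects of customer type are handled only by their suitable verbs.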

From the definitions above, a tree-shaped hierarchy is created at run time for the noun wav, as shown in Figure 1.

 

Figure 1      hierarchy of wav

Every node in the tree is called an object. A path is used to reference an object; for example, wav descend wavData descend dataID addresses the object dataID. The path follows the Chinese convention (Kaimeng was originally designed for programming in Chinese), with the higher-level object on the left and the lower-level object on the right.
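Path addressing of this kind amounts to walking the object tree one name at a time. The following C++ sketch shows the idea under stated assumptions: `Object` and `resolve` are hypothetical names, the path is taken relative to an already chosen root object (so the root's own name is omitted), and names are separated by the word "descend" with surrounding spaces.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical run-time object: a node with named child objects.
struct Object {
    std::string value;  // used only by leaf objects in this sketch
    std::map<std::string, std::unique_ptr<Object>> children;

    // Returns the named child, creating it if it does not exist yet.
    Object& child(const std::string& name) {
        assert(!name.empty());
        std::unique_ptr<Object>& slot = children[name];
        if (!slot) slot.reset(new Object());
        return *slot;
    }
};

// Resolves a path such as "wavHeader descend FileSize" relative to `root`,
// reading from the higher-level object on the left to the lower-level one
// on the right. Returns nullptr if a step names a missing child or the
// path is malformed.
inline Object* resolve(Object& root, const std::string& path) {
    std::istringstream tokens(path);
    std::string token;
    Object* current = &root;
    bool expect_name = true;
    while (tokens >> token) {
        if (expect_name) {
            auto it = current->children.find(token);
            if (it == current->children.end()) return nullptr;
            current = it->second.get();
        } else if (token != "descend") {
            return nullptr;  // names must be joined by "descend"
        }
        expect_name = !expect_name;
    }
    return expect_name ? nullptr : current;  // the path must end on a name
}
```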

B. Suitable Operation Definition

A suitable operation appears in the form of a verb or an operator. Verbs in quasi-natural language are implemented by functions in dynamic link libraries or by verb scripts written in quasi-natural language. TABLE V shows the definition of the verb “play”, which is implemented by the DLL function _PlayWav(). A verb may have several “Reference” properties, which define the parameters passed to the verb function or script and are also used by the language interpreter as a validity check during semantic matching. Duplicate noun names are forbidden in quasi-natural language, but verbs with the same name are permitted. In a specific sentence the interpreter chooses the suitable verb among several same-named verbs according to the object noun and the “Reference” properties of the verb. Other information needed to load or execute the function is also required, such as “VerbPath” and “VerbFile”.

TABLE V        VERB play

PropertyName          PropertyValue
PartofSpeech          verb
VerbPath              c:\Kaimeng\dsp
VerbFile              wavproc.dll
VerbFunction          _PlayWav
Reference             wav
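Choosing among same-named verbs by their “Reference” properties resembles overload resolution. The C++ sketch below illustrates that idea only; `Verb` and `select_verb` are hypothetical names, the match is an exact comparison of noun names, and inheritance between nouns is deliberately ignored, so this is not a description of how the Kaimeng interpreter actually matches.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical verb entry mirroring the properties in the verb tables.
struct Verb {
    std::string name;                     // e.g. "play"
    std::string function;                 // VerbFunction, e.g. "_PlayWav"
    std::vector<std::string> references;  // nouns accepted as arguments, in order
};

// Picks the first verb whose name matches and whose Reference list equals
// the nouns of the objects used in the sentence. Returns nullptr when no
// verb fits, which corresponds to a failed semantic check.
inline const Verb* select_verb(const std::vector<Verb>& dictionary,
                               const std::string& name,
                               const std::vector<std::string>& argument_nouns) {
    assert(!name.empty());
    for (const Verb& verb : dictionary) {
        if (verb.name == name && verb.references == argument_nouns) {
            return &verb;
        }
    }
    return nullptr;
}
```

Given two verbs both named play, one with Reference wav and another with a different Reference, a sentence whose object is a wav would select the entry whose VerbFunction is _PlayWav.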

Verbs that produce a result have the property “CreatedObject”. In TABLE VI, the verb “ShowSignal” produces an object “SignalWindow”.

TABLE VI      VERB ShowSignal

PropertyName          PropertyValue
PartofSpeech          verb
VerbPath              c:\Kaimeng\dsp
VerbFile              wavproc.dll
VerbFunction          _ShowSignal
Reference             Signal
CreatedObject         SignalWindow

 

C. Knowledge Inheritance

A methodology similar to object-oriented programming is adopted for knowledge inheritance. Any noun may be inherited by another noun; the former is called the parent noun and the latter the offspring noun. By defining a “WordClass” property whose value is the name of the parent noun, the offspring inherits all properties and suitable verbs from the parent. When an object is created from an offspring noun, it is first created from the parent noun, and then the properties of the offspring noun are applied: a property that is not found in the parent object is added as a new property, and a property that already exists overlays the parent property of the same name. TABLE VII shows the noun “PureTone”. Through its “WordClass” property, “PureTone” inherits all properties and suitable verbs from the parent noun “wav”, and its two other properties, “FileSpecification” and “wavFormat”, overlay the same-named properties in “wav”.

TABLE VII      NOUN PureTone

PropertyName          PropertyValue
PartofSpeech          noun
WordClass             wav
FileSpecification     PureToneSpecification
wavFormat             StandardFormat
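The create-from-parent-then-overlay rule can be sketched as a merge of property maps. The C++ below is a minimal illustration under assumptions: `Properties`, `Dictionary`, and `instantiate` are hypothetical names, property values are plain strings, the parent is named by the “WordClass” property as in the text, and circular inheritance is not guarded against.

```cpp
#include <cassert>
#include <map>
#include <string>

using Properties = std::map<std::string, std::string>;

// Hypothetical noun dictionary: noun name -> the properties it defines itself.
using Dictionary = std::map<std::string, Properties>;

// Collects the properties of an object created from `noun`: the parent's
// properties come first (recursively), then the noun's own properties are
// applied, adding new ones and overlaying inherited ones of the same name.
inline Properties instantiate(const Dictionary& dict, const std::string& noun) {
    auto it = dict.find(noun);
    assert(it != dict.end());  // the noun must be defined
    const Properties& own = it->second;

    Properties result;
    auto parent = own.find("WordClass");
    if (parent != own.end()) {
        result = instantiate(dict, parent->second);
    }
    for (const auto& entry : own) {
        result[entry.first] = entry.second;  // add new or overlay inherited
    }
    return result;
}
```

Applied to PureTone, the result keeps DataType from wav while wavFormat and FileSpecification take the values given by PureTone.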

 

IV. PROGRAMMING INSTANCES

Two instances show the features of Kaimeng. In each, the executable script is given first and is followed by an explanation of every sentence. In Kaimeng the semicolon serves as the sentence separator, to avoid ambiguity with the decimal point in numbers.

A. Simple routine control

This example computes the factorial of 10.

Script: ThereIs an integer , naming p ; p be 1 ; ThereIs an integer , naming i ; i be 1 ; if i<11 , repeat {Link1} .

Link1: p be p*i ; i be i+1 .

The sentences have the following meanings:

ThereIs an integer , naming p: create an integer object and name it p.

p be 1: assign 1 to p.

ThereIs an integer , naming i: create an integer object and name it i.

i be 1: assign 1 to i.

if i<11 , repeat {Link1}: execute {Link1} repeatedly while the expression i<11 is true, so that i runs from 1 to 10. Kaimeng has only one control sentence, which serves as both the selection and the repetition structure.

{Link1}: a <link> is a piece of executable text written in Kaimeng and embedded in a script, like a compound statement or block in the C language. The text in {} is the name of the <link>. In this case it is “Link1”, which denotes the following sentences.

p be p*i: assign p*i to p.

i be i+1: assign i+1 to i.

Evidently, quasi-natural language shows no advantage over traditional languages in low-level programming.
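For comparison, the control sentence of the form "if condition, repeat {link}" behaves like a while loop in a traditional language. The C++ sketch below mirrors the factorial script with the same two integer objects p and i; the helper names `if_repeat` and `factorial` are hypothetical, and the loop bound is written as a parameter n so the function is not tied to one value.

```cpp
#include <cassert>
#include <functional>

// Sketch of the control sentence: the link is executed again and again
// while the condition holds. If the condition is false at the start,
// the link never runs.
inline void if_repeat(const std::function<bool()>& condition,
                      const std::function<void()>& link) {
    while (condition()) {
        link();
    }
}

// Factorial of n written in the same shape as the script: p and i are the
// two integer objects, and the second lambda plays the role of the link.
inline long long factorial(int n) {
    assert(n >= 0 && n <= 20);  // 20! is the largest factorial that fits in a signed 64-bit integer
    long long p = 1;
    int i = 1;
    if_repeat([&] { return i < n + 1; },
              [&] { p = p * i; i = i + 1; });
    return p;
}
```

Calling `factorial(10)` multiplies p by every i from 1 to 10.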

B. High level programming

The following script records a voice signal, filters it, and replays it.

Script: ThereIs a wav, naming original; original descend wavFormat be StandardFormat; record original; FFT original descend LeftChannelSignal, produce spectra; bandfilter spectra 3000 and 20; reFFT spectra, produce FilteredSignal; LeftChannelSignal  be FilteredSignal; play wav;

The sentences have the following meanings:

ThereIs a wav, naming original: create a wav object named original. wav is defined in TABLE I.

original descend wavFormat be StandardFormat: assign the noun StandardFormat to the object wavFormat, which is a property under the object original. That is, set the value of the wavFormat property under original to the noun StandardFormat.

record original: apply the verb record to the object original. The verb record performs the recording and writes the voice data into the LeftChannelSignal and RightChannelSignal properties under original.

FFT original descend LeftChannelSignal, produce spectra: apply the verb “FFT” to the object “LeftChannelSignal” under the object “original”, and name the new object produced by “FFT” “spectra”.

bandfilter spectra 3000 and 20: apply the verb “bandfilter” to the object “spectra” and the constants 3000 and 20. It performs band filtering on the object “spectra”.

reFFT spectra, produce FilteredSignal: apply the verb “reFFT” to the object “spectra”, and name the new object produced by “reFFT” “FilteredSignal”.

LeftChannelSignal be FilteredSignal: assign the object “FilteredSignal” to the object “LeftChannelSignal”.

play wav: apply the verb “play” to the object wav. The number of channels is assumed to be 1 here, so the “RightChannelSignal” property is not processed.
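The paper does not show how the verb bandfilter is implemented. As a rough illustration of what such a verb might do, the C++ sketch below zeroes the spectrum bins outside a frequency band. Everything specific here is an assumption: the function name `band_filter`, the reading of 3000 and 20 as upper and lower cutoff frequencies in Hz, the explicit sample rate parameter, and the use of `std::complex<double>` for the spectrum.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Zeroes every bin of a length-N DFT spectrum whose frequency lies outside
// [low_hz, high_hz]. Bin k and its mirror bin N-k both represent the
// frequency k * sample_rate / N, so they are kept or cleared together.
inline void band_filter(std::vector<std::complex<double>>& spectrum,
                        double sample_rate_hz, double high_hz, double low_hz) {
    assert(low_hz <= high_hz && sample_rate_hz > 0.0);
    const std::size_t n = spectrum.size();
    for (std::size_t k = 0; k < n; ++k) {
        const std::size_t folded = (k <= n / 2) ? k : n - k;
        const double freq = folded * sample_rate_hz / n;
        if (freq < low_hz || freq > high_hz) {
            spectrum[k] = 0.0;
        }
    }
}
```

Under these assumptions, a call such as `band_filter(spectrum, 8000.0, 3000.0, 20.0)` would keep the components between 20 Hz and 3000 Hz and remove the rest, including the constant (0 Hz) component.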

 

This instance shows that Kaimeng can describe a task simply on the basis of existing knowledge, as natural languages do. Such a task may require thousands of statements in a traditional language.

One sentence in this script may correspond to a large chunk of code in a traditional language. For example, the verb FFT consists of a kernel written in the C language and the necessary shell (the code was originally written for Chinese, and some of it remains in Chinese).

Kernel of FFT in C Language:

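#include <math.h>

/* 2*pi. Defined here only if the surrounding sources do not already define it. */
#ifndef TWO_PI
#define TWO_PI 6.283185307179586476925L
#endif

/* Recursive radix-2 FFT. signal, output and media each hold two arrays:
   index 0 is the real part and index 1 the imaginary part. The samples of
   this call are at offset, offset+interval, ...; media is scratch space.
   N must be a power of two and at least 2. */
bool FFT_Recursive(int N,int offset,int interval,long double *signal[2],
                   long double *output[2],long double *media[2])
{
  int half=N/2;
  long double cs,sn;
  int k00,k01,k10,k11;
  long double tmp0,tmp1;
  if (N>2)
  {
    /* Transform the even and the odd samples into media (roles swapped). */
    FFT_Recursive(half,offset,2*interval,signal,media,output);
    FFT_Recursive(half,offset+interval,2*interval,signal,media,output);
    for (int k=0;k<half;k++)
    {
      k00=offset+k*interval;       /* output bin k */
      k01=k00+half*interval;       /* output bin k+N/2 */
      k10=offset+2*k*interval;     /* bin k of the even half */
      k11=k10+interval;            /* bin k of the odd half */
      cs=cos(TWO_PI*k/(long double)N);
      sn=sin(TWO_PI*k/(long double)N);
      /* Multiply the odd-half bin by the twiddle factor cs - i*sn. */
      tmp0=cs*media[0][k11]+sn*media[1][k11];
      tmp1=cs*media[1][k11]-sn*media[0][k11];
      output[0][k00]=media[0][k10]+tmp0;
      output[1][k00]=media[1][k10]+tmp1;
      output[0][k01]=media[0][k10]-tmp0;
      output[1][k01]=media[1][k10]-tmp1;
    }
  }
  else
  {
    /* Two-point transform: sum and difference of the two samples. */
    k00=offset;
    k01=k00+interval;
    output[0][k00]=signal[0][k00]+signal[0][k01];
    output[1][k00]=signal[1][k00]+signal[1][k01];
    output[0][k01]=signal[0][k00]-signal[0][k01];
    output[1][k01]=signal[1][k00]-signal[1][k01];
  }
  return true;
}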
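The listing does not show how the kernel is called. The following C++ sketch shows one plausible way to drive it, with offset 0, interval 1, and separate output and scratch buffers. The wrapper name `fft` and the use of `std::vector` and `std::complex` are assumptions made for illustration; the kernel is repeated in condensed form only so that the block is self-contained, and `TWO_PI` is defined locally for the same reason.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

static const long double TWO_PI = 6.283185307179586476925L;

// The kernel from the text, condensed. Index 0 of each array pair is the
// real part and index 1 the imaginary part; media is scratch space.
bool FFT_Recursive(int N, int offset, int interval, long double* signal[2],
                   long double* output[2], long double* media[2]) {
    int half = N / 2;
    if (N > 2) {
        FFT_Recursive(half, offset, 2 * interval, signal, media, output);
        FFT_Recursive(half, offset + interval, 2 * interval, signal, media, output);
        for (int k = 0; k < half; k++) {
            int k00 = offset + k * interval, k01 = k00 + half * interval;
            int k10 = offset + 2 * k * interval, k11 = k10 + interval;
            long double cs = std::cos(TWO_PI * k / (long double)N);
            long double sn = std::sin(TWO_PI * k / (long double)N);
            long double tmp0 = cs * media[0][k11] + sn * media[1][k11];
            long double tmp1 = cs * media[1][k11] - sn * media[0][k11];
            output[0][k00] = media[0][k10] + tmp0;
            output[1][k00] = media[1][k10] + tmp1;
            output[0][k01] = media[0][k10] - tmp0;
            output[1][k01] = media[1][k10] - tmp1;
        }
    } else {
        int k00 = offset, k01 = k00 + interval;
        output[0][k00] = signal[0][k00] + signal[0][k01];
        output[1][k00] = signal[1][k00] + signal[1][k01];
        output[0][k01] = signal[0][k00] - signal[0][k01];
        output[1][k01] = signal[1][k00] - signal[1][k01];
    }
    return true;
}

// Hypothetical convenience wrapper: splits the input into real and
// imaginary arrays, runs the kernel over the whole signal, and joins the
// result again. The length must be a power of two and at least 2.
inline std::vector<std::complex<long double>> fft(
        const std::vector<std::complex<long double>>& x) {
    const int n = static_cast<int>(x.size());
    assert(n >= 2 && (n & (n - 1)) == 0);
    std::vector<long double> re(n), im(n), out_re(n), out_im(n), tmp_re(n), tmp_im(n);
    for (int i = 0; i < n; ++i) {
        re[i] = x[i].real();
        im[i] = x[i].imag();
    }
    long double* signal[2] = {re.data(), im.data()};
    long double* output[2] = {out_re.data(), out_im.data()};
    long double* media[2] = {tmp_re.data(), tmp_im.data()};
    FFT_Recursive(n, 0, 1, signal, output, media);
    std::vector<std::complex<long double>> result(n);
    for (int i = 0; i < n; ++i) {
        result[i] = std::complex<long double>(out_re[i], out_im[i]);
    }
    return result;
}
```

A quick sanity check for such a wrapper is a unit impulse, whose transform should have the same value in every bin.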

Shell of FFT in C Language:

// Kaimeng word names used below (kept in Chinese, as in the original sources):
//   数值类型 = numeric type, 信号长度 = signal length, 信号周期 = signal period,
//   振幅 = amplitude, 相位 = phase, 信号指针 = signal pointer,
//   复变信号 = complex signal, 实部 = real part, 虚部 = imaginary part
extern "C" __declspec(dllexport) bool FFTShell(Server cmd,CSemanticStack* rf,void** rslt)
{
  bool result=false;
  *rslt=NULL;
  // The verb expects exactly one referenced object.
  if (rf->GetCount()==1)
  {
    SIngredient ingredient;
    rf->Goto(0);
    rf->GetElement(ingredient);
    // Get the signal pointer
    void* start=NULL;
    start=ingredient.Pointer;
    if (start)
    {
      Reference rfr;
      rfr.Start=start;
      strcpy(rfr.Route,"");
      DataType dt;
      Signal signal;
      // Read the properties of the input signal object.
      signal.Type=*(DataType*)cmd(GET_OBJECT,"数值类型",&rfr,NULL,dt);
      signal.Length=*(int*)cmd(GET_OBJECT,"信号长度",&rfr,NULL,dt);
      signal.Period=*(int*)cmd(GET_OBJECT,"信号周期",&rfr,NULL,dt);
      signal.Amplitude=*(int*)cmd(GET_OBJECT,"振幅",&rfr,NULL,dt);
      signal.Phase=*(int*)cmd(GET_OBJECT,"相位",&rfr,NULL,dt);
      signal.Start=cmd(GET_OBJECT,"信号指针",&rfr,NULL,dt);
      Spectrum spectra;
      result=FFT(signal,&spectra);
      if (result)
      {
        // Create the complex-signal result object and fill in its real part.
        *rslt=cmd(CREATE_OBJECT,"复变信号",NULL,NULL,dt);
        rfr.Start=*rslt;
        strcpy(rfr.Route,"实部");
        cmd(GET_OBJECT,"信号指针",&rfr,NULL,dt);
        cmd(MODIFY_OBJECT,"信号指针",&rfr,spectra.Re.Start,dt);
        cmd(GET_OBJECT,"信号长度",&rfr,NULL,dt);
        cmd(MODIFY_OBJECT,"信号长度",&rfr,&spectra.Re.Length,dt);
        int t=spectra.Re.Type;
        cmd(GET_OBJECT,"数值类型",&rfr,NULL,dt);
        cmd(MODIFY_OBJECT,"数值类型",&rfr,&t,dt);
        // Fill in the imaginary part.
        strcpy(rfr.Route,"虚部");
        cmd(GET_OBJECT,"信号指针",&rfr,NULL,dt);
        cmd(MODIFY_OBJECT,"信号指针",&rfr,spectra.Im.Start,dt);
        cmd(GET_OBJECT,"信号长度",&rfr,NULL,dt);
        cmd(MODIFY_OBJECT,"信号长度",&rfr,&spectra.Im.Length,dt);
        cmd(GET_OBJECT,"数值类型",&rfr,NULL,dt);
        cmd(MODIFY_OBJECT,"数值类型",&rfr,&t,dt);
      }
    }
  }
  return result;
}

Other verbs are implemented likewise. Once some knowledge about data structures or suitable operations has been acquired, it can be abstracted into simple nouns or verbs and used in simple sentences to replace long stretches of code in traditional languages.

V. CONCLUSION

Although free natural language programming is still hard, programming with an unlimited number of words in natural form under restricted lexical and syntactic rules is practical, and it may lift computer programming to a higher level.

A new concept and a language system are put forward. All the ideas and methods are implemented in Kaimeng, which was developed by the author as a prototype for quasi-natural language. It is expected to be a step toward free natural language programming.

References

 

[1] Hugo Liu and Henry Lieberman, “Toward a Programmatic Semantics of Natural Language,” in 2004 IEEE Symposium on Visual Languages - Human Centric Computing (VLHCC'04), 2004, pp. 281-282.

[2] Henry Lieberman and Hugo Liu, “Feasibility Studies for Programming in Natural Language,” in Human-Computer Interaction Series, vol. 9, Springer Netherlands, 2006, pp. 459-473.

[3] Hugo Liu and Henry Lieberman, “Programmatic Semantics for Natural Language Interfaces,” in CHI '05 Extended Abstracts on Human Factors in Computing Systems, Late Breaking Results: Short Papers, Portland, OR, USA, 2005, pp. 1597-1600.

[4] Manolis Maragoudakis, Nikolaos Cosmas, and Aristogiannis Garbis, “Mining Natural Language Programming Directives with Class-Oriented Bayesian Networks,” in Lecture Notes in Computer Science, vol. 5139, 2008, pp. 15-26.

[5] Pu Yin, “Implementation of Quasi-Natural Language,” unpublished.




https://wap.sciencenet.cn/blog-271176-1204632.html
