ScienceNet (科学网)


Tag: 编译原理 (Compiler Principles)


Related Blog Entries

Fall 2009 "Compiler Principles" Course Slides (2009秋季《编译原理》课件)
light 2010-1-14 17:55
Lecture slides:
1. Introduction
2. FormalLang (formal languages)
3. LexScanning (lexical scanning)
4. TopdownParsing (top-down parsing)
5. BottomupParsing (bottom-up parsing)
6. AttributeGrammar (attribute grammars)
7. IRgen (intermediate-representation generation)
8. Symtab (symbol tables)
9. RuntimeSys (runtime systems)
10. CodeOpt (code optimization)
Category: Course materials | 5201 reads | 0 comments
What About a Compilers Course Like This?
huangfuqiang 2009-4-8 20:39
How should such a course be improved in China? What conditions does improvement require? How should it be taught, and how should it be learned? If systems software (foundational software) is to develop, the teaching of some courses must be reformed! Academic + applied + project-management oriented + ...

The lecturer: Alfred V. Aho is Lawrence Gussman Professor of Computer Science at Columbia University.

Teaching the Compilers Course
Alfred V. Aho
Department of Computer Science, Columbia University

I have always enjoyed teaching the compilers course. Compiler design is a beautiful marriage of theory and practice: it is one of the first major areas of systems programming for which a strong theoretical foundation has developed that is now routinely used in practice. I am a strong believer in Donald Knuth's credo: "Theory and practice are not mutually exclusive; they are intimately connected. They live together and support each other."

In the past five decades a number of forces have evolved the traditional compilers course in a computer science curriculum. This article looks at how compilers courses have responded to these forces since their inception. I've organized the evolution based on material in the four books on compilers that I've coauthored and used in compilers courses over the last three decades.

Many of the earliest compilers courses put a heavy emphasis on the theory of parsing and syntax-directed translation. However, it was quickly realized that just focusing on the theory does not necessarily teach students how to build a useful compiler. This observation was positively reinforced by my interactions with the exceptionally talented researchers in the Computing Sciences Research Center at Bell Labs, where I worked right after graduation. I began my career there as a computer science theorist after creating indexed grammars and nested stack automata for my PhD thesis at Princeton. With the excitement surrounding the invention of Unix, C, and C++ around me, however, I quickly became involved in the wonderful opportunities for the fruitful symbiosis of theory and practice in systems and compilers.

Many of the early tools that appeared on Unix, such as diff, grep, lex, and yacc, were software incarnations of algorithms that had come from computer science theory. I tried my hand at implementing some string-pattern-matching algorithms that I had devised by writing the initial versions of egrep and fgrep on Unix, and then by working with Brian Kernighan and Peter Weinberger in developing the AWK programming language. Through these experiences I learned that it often takes as much talent to implement an algorithm efficiently and correctly in practice as it does to develop it in theory. A compelling testimonial to the craftsmanship of the Unix pioneers is that many of their tools that appeared in the first versions of Unix are still in use today.

The contrast between early and modern compilers is dramatic. Early compilers were implemented with just thousands of lines of code, usually written in low-level languages like assembler or C; modern compilers come in collections often comprising millions of lines of code written in a variety of programming languages (C, C++, C#, F#, Java, and OCaml are popular current choices). Early compilers were often crafted by single individuals; modern compilers are typically large software engineering projects involving teams of programmers using prebuilt components and existing compiler construction frameworks. In the early days of computer science there were relatively few programming languages; today there are thousands.
In the early days of computer science there were also relatively few target machine architectures; today we still have the CISC and RISC architectures of the past, but now there are vector, VLIW, multicore, and many other eclectic architectures. The upshot of this evolution of languages and machines is that it is impossible to cover in one compilers course the diversity of algorithms and techniques used in a modern commercial compiler.

In spite of this bewildering diversity of source languages and target machines, I have found that it is still possible to offer a compilers course that gives lots of educational benefit and enormous satisfaction to students. This is what my current compilers course has evolved into. It still covers the basics of lexical analysis, parsing, semantic analysis, intermediate code generation, runtime environments, resource management, and target code generation, tasks present in virtually every compiler. The fundamental concepts underlying these tasks are important to know because they are useful in many areas outside of compiling, such as natural language processing and program verification. However, there is no time in the first compilers course to cover in any depth the algorithms used in machine-independent and machine-dependent code optimization.

Implementing a compiler for a small programming language has always been a standard component of a compilers course. In the past I used to give students a specification of the source and target languages for the compiler they were to implement. Although this was instructive, I found students were not that excited about implementing someone else's toy language. What students found infinitely more satisfying was to work in a small team to define their own little language and then build a compiler for it. Their eyes light up when they see programs written in their language being compiled and executed. From my industrial experience I found that the right-weight software engineering process surrounding this kind of project made it possible for every student team to design and implement a working compiler without fail in the course of a 15-week semester.

Here's the process I used. The teams were asked to do the following tasks on the following schedule during the 15-week course, concurrently with learning the theory and practice behind compilers:

Week 2: Form a team of five.
Week 4: Write a whitepaper on the proposed language, modeled after the Java whitepaper.
Week 8: Write a tutorial patterned after Chapter 1, and a language reference manual patterned after Appendix A, of Kernighan and Ritchie's book The C Programming Language.
Week 14: Give a ten-minute presentation of the language to the class.
Week 15: Give a 30-minute working demo of the compiler to the teaching staff.
Week 15: Hand in the final project report.

Once the teams are formed, each team is asked to elect a project manager, a system architect, a system integrator, a tester, and a language guru. Each of these positions carries specific team responsibilities:

The project manager sets the project schedule, holds weekly meetings with the entire team, maintains a project log, and makes sure the project deliverables get done on time.
The system architect defines the compiler architecture, modules, and interfaces.
The system integrator defines the system development platform and tools, the integration environment, and a makefile to ensure the compiler components work together.
The tester defines the test plan and test suites from the language reference manual.
Each team member is expected to execute the test suites as the compiler is being developed, to make sure the compiler meets the language specification.
The language guru maintains the intellectual integrity of the language and defines a baseline system and process for managing changes to the language definition.

The TAs and I give advice as needed to the teams on how to perform these tasks.

There are many benefits to this kind of project. Students get a chance to exercise their creativity in designing their own new language. They learn that writing a value proposition for a language and a detailed specification for a software system are challenging tasks. They also learn valuable skills such as project management, teamwork, and communication (both oral and written). These skills are applicable to any kind of project, not just writing a compiler.

The final project report has the following chapters:

1. Introduction (written by the team)
2. Language tutorial (written by the team)
3. Language reference manual (written by the team)
4. Project plan (written by the project manager)
5. Language evolution (written by the language guru)
6. Compiler architecture (written by the system architect)
7. Development environment (written by the system integrator)
8. Test plan and test suites (developed by the system tester)
9. Conclusions containing lessons learned (written by the team)
10. Appendix containing the complete source listing of the compiler

Students get individual grades for the role they play on the team and for the components of the compiler they individually write. They get team grades for the language whitepaper, tutorial, reference manual, class presentation, and project demo. I assign a TA to mentor each team, and I usually mentor three or four teams myself.

Professor Stephen A. Edwards and I have taught the programming languages and translators course with this kind of project each semester for more than five years at Columbia. There are typically 50-100 students in the class. We are always astounded by the ingenuity and novelty of the languages that the students create. They have included languages for simulating quantum computers, creating computer games, composing music, creating shared-session telecommunications services, and hosts of other innovative applications. Some of the language projects have resulted in publishable papers.

There are several factors that make this kind of compilers course possible. The course has as prerequisites courses in data structures and algorithms, computer science theory, and computer systems. Students are expected to be fluent in at least Java and C. Many of the students serving as project managers have industrial experience in writing software and are familiar with design patterns such as the visitor pattern. Compiler construction tools such as lex and yacc (or their modern equivalents), integrated software development environments, and wikis facilitate distributed compiler software development, so students can work independently and collaboratively on their compiler components. But perhaps most important, the theory and practice of compiler design have matured to the point where interesting working translators can now be routinely created by beginners during a 15-week course.

The importance of the software engineering process used in this course should not be underestimated. I personally review the language whitepaper, tutorial, and reference manual carefully to make sure the compiler can be implemented with just a few thousand lines of original code.
Each team member has to write at least four or five hundred lines of this code, so every team member gets a taste of what it takes to write code that has to work as part of a system.

The fact that every team delivers a working compiler on schedule is actually not surprising. I tell the students to grow their language and compiler so that they need deliver only what is working at the end of the semester. On the other hand, I ask the students to state in their language reference manual how much of their language they are committing to implement, and what they will implement if they have extra time. In the final project demo, I ask each team to execute two programs: one program chosen at random from their language reference manual, and an unknown new program the TA mentor has created to test his or her team's language.

Students uniformly give glowing evaluations of the course and cite the benefits of learning enough compiler theory and tools to create their very own language in a semester. They frequently mention project management, teamwork, and how to give effective oral and written presentations as the most important skills learned. When the students interview for jobs, interviewers are universally impressed by the software engineering skills the students have learned from the course.

In summary, a modern compilers course can give students a satisfying opportunity to exercise their creativity, a first-hand understanding of how theory and practice can be beneficially interwoven, a small-scale scenario in which to experience good software engineering practices that can greatly improve the robustness of an implementation project, and a realistic opportunity to develop project management, teamwork, and communications skills that will subsequently be useful in many aspects of life.

References
1. A. V. Aho and J. D. Ullman, The Theory of Parsing, Translation, and Compiling, Volume I: Parsing; Volume II: Translation. Prentice-Hall, 1972 and 1973.
2. A. V. Aho and J. D. Ullman, Principles of Compiler Design. Addison-Wesley, 1977.
3. A. V. Aho, R. Sethi, and J. D. Ullman, Compilers: Principles, Techniques, and Tools. Addison-Wesley, 1986.
4. A. V. Aho, M. S. Lam, R. Sethi, and J. D. Ullman, Compilers: Principles, Techniques, and Tools, Second Edition. Pearson Addison-Wesley, 2007.
5. D. E. Knuth, "Theory and Practice," keynote address for the 11th World Computer Congress (Information Processing 89), San Francisco, 28 August 1989.
6. W. M. Waite, "The Compiler Course in Today's Curriculum: Three Strategies." SIGCSE, Houston, Texas, March 1-5, 2006, pp. 87-91.

Alfred V. Aho's homepage
Category: Computer software theory and engineering | 4992 reads | 1 comment
The Wisdom of Sand: From Silicon to Artificial Intelligence (沙的智慧—从硅到人工智能)
biotrader 2008-11-16 23:17
This is a slide deck I used to introduce the basic concepts of the IT industry to PhD students in biology. IT is a bustling industry: new technologies and new buzzwords appear almost every month. What are the deep roots of this industry and this discipline? Only by starting from history, "retracing the Long March" to recover the earliest and simplest ideas, can one develop and innovate; only then can people entering from other fields (or trying to integrate disciplines) see more than the surface excitement.

China has not yet mastered the purification and fabrication processes for silicon; even with a finished CPU layout in hand, we still have to go to Taiwan to tape out. In fact, there are many links in the equipment-manufacturing chain where we cannot produce things ourselves, and this greatly hinders research and development. I am nostalgic for the 1960s and 70s, when almost anyone could buy resistors, capacitors, and the like and assemble a radio by themselves; the radio was the high-tech product of its day. Indeed, much of the technical foundation of the "Two Bombs, One Satellite" program and of Shenzhou 7 was laid in that era.

A field-effect transistor is like a water valve: push the electrons out of the region between the two green blocks (in the slide's figure) and the circuit breaks; attract electrons from below to bridge the two green blocks and the circuit conducts. This "valve" is, in short, a gate. P (positive) and N (negative) name the two doping polarities; MOS stands for Metal, Oxide, Semiconductor, the red, blue, and white layers in the figure.

Digital logic can be built from NMOS alone or from PMOS alone, but then the static power consumption is not zero: the circuit burns power even when idle. In large-scale integrated circuits this is intolerable; a Dawning cluster's daily electricity bill runs to tens of thousands of yuan. So NMOS and PMOS are combined into CMOS, where C means "complementary": the PMOS side always connects to the positive rail and the NMOS side to ground, and the static power is zero.

Digital logic: the operations of Boolean algebra over binary values can realize a great many combinations.

Around the 1930s the mathematician Turing proposed a model of sequential-state computation and storage (originally a mathematical question of computability). The direction for computers was set; physics only had to work out how integrated circuits compute and how they store. Addition is the most basic operation: however complex a computation, it can be reduced to addition; the integral of a real function is a sum along the y-axis and likewise reduces to addition. Values that can be approximated by series or continued fractions are even more naturally within a computer's grasp.

ENIAC was the world's first computer, but someone had to give the final push. Computation and storage were solved, yet program instructions were still fed in by hand on external paper tape, far too slowly; had this not been fixed, the computer might have died in infancy. At that point von Neumann joined the ENIAC team. He stored instructions just like data: fetch an instruction, execute it, fetch the next... all automatically. The computer at last became practical. Of course, the stored-program idea was not first proposed by von Neumann; it existed before Turing, and Turing himself had also raised it.

We know electricity is hard to store; implementing storage through self-feedback was truly a stroke of genius!

In fact, addition and jumps are almost all you need. To gain speed, CPU design has spawned many ingenious architectural ideas: multi-stage pipelines, out-of-order issue, dynamic scheduling, branch prediction, and so on. The processor squeezes the most out of its arithmetic capacity: as long as instructions enter and finally retire in order, the execution order in between may be shuffled whenever the result is unaffected; sometimes the shuffle turns out to be wrong, but there is a mechanism to roll back to the original point. General-purpose CPUs are rated by SPEC, a benchmark suite (compilers, graphics, floating-point) whose running results score the CPU. So general-purpose CPU design must balance many factors; sometimes the cache occupies as much die area as the functional units. It is like deploying troops for battle: "the subtlety of application lies in one's own mind."

FPGA design requires learning Verilog, SystemC, and the like. Note that Verilog is a description language, unlike ordinary C: do not write "for" loops in it as if it were C. Verilog is parallel, everything executes at once, whereas C is sequential; these descriptions map closely onto the hardware layout. With the development of SystemC and similar tools, however, the gulf between software engineers and hardware engineers is being smoothed over. If you want to learn this, it is best to have the relevant equipment: everyone knows you buy a computer to learn computing, and for hardware, books alone do not work well.

Verification and design are two sides of one coin: because chip verification fell short, Europe lost a rocket and Intel paid out billions of dollars.

Layer upon layer of abstraction often makes people forget the road they came by. Under society's grand division of labor, must each of us see only twigs and branches? "The sheep lost at the fork in the road" is lamentable indeed. By the way, with VMware Server you can now use GDB to debug the Linux kernel remotely, which makes learning operating systems much more convenient.

The assembly a compiler emits is unreadable to most people, but notice all the add and mov instructions: it still comes down to the adder we met above.

Understanding human language? Would that still be a machine? That would be a person! The eternal paradox of artificial intelligence. In simplified computer languages, however, we can still manage it, for example with context-free grammars. Consider two sentences:

1. Xiao Ming came home from school; 他已经不在了 ("he is no longer here").
2. Xiao Ming died ten years ago; 他已经不在了 ("he is no longer with us").

The second clause means different things in the two sentences because its meaning depends on the first clause. Machine languages do not permit this situation, hence "context-free grammar"; in a context-sensitive grammar the clause could stand for a stretch of text or for the empty string, and when the empty string is always the choice, it degenerates into a context-free grammar. Yet compiler textbooks tell you right at the start that Chomsky Type 2 (context-free) grammars are contained in Type 1 (context-sensitive) grammars, without introducing Chomsky's original, simple line of thought; it is easy to get confused: I am free of constraints (context-free), and yet I belong to you, hemmed in with restrictions (context-sensitive)?

A compiler uses one enormously complex data structure, while an operating system uses a large batch of simpler (and some complex) data structures; in the end it is all data structures. Data structures and algorithms are two sides of one coin: complex structures such as trees and graphs pair with complex algorithms. There are also two implementation routes. Today's mainstream languages define and use queues, trees, and graphs separately, whereas a language like Lisp has only the generalized list: because the elements of a list can themselves be lists, nesting alone suffices to realize trees, graphs, and every other structure.

Half in jest: clever people have trouble learning data structures. A clever person always thinks like a human, indeed in the most demanding human way, and cannot tolerate the computer taking something naturally parallel, twisting it onto sequential hardware, and then faking the parallelism back through stacks and other deferrals. Learning data structures would be agony for an artist.

This post has touched, with countless omissions, on every area of the discipline of computing. The language throughout is crude and academically far from rigorous, but the aim is to help beginners build in their minds the whole edifice of computing technology: from sand to artificial intelligence. We want textbooks that are systematic and rigorous in language, but they must never be hollow and dull, still less sever knowledge from the intuition of its original creators.

Email: zyubin AT gmail DOT com
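Two of the post's technical claims can be made concrete in a few lines of code. The sketches below are the editor's illustrations, not code from the original slides; all names and the chosen grammar are assumptions. The first shows the claim that Boolean logic realizes addition, and that more complex arithmetic ripples up from it: a full adder built only from AND, OR, and XOR, chained bit by bit into a ripple-carry adder.

```python
# Editor's sketch (not from the original post): gate-level addition.
# A full adder from AND / OR / XOR, chained into a ripple-carry adder.
def full_adder(a, b, cin):
    s = a ^ b ^ cin                         # sum bit (XOR of the three inputs)
    cout = (a & b) | (a & cin) | (b & cin)  # carry-out (majority of the inputs)
    return s, cout

def add(x, y, bits=8):
    carry, out = 0, 0
    for i in range(bits):                   # ripple the carry from bit 0 upward
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        out |= s << i
    return out

print(add(19, 23))  # prints 42; multiplication, series sums, etc. reduce to this
```

The second makes "context-free" concrete with a classic toy: a recursive-descent parser (and evaluator) for arithmetic expressions. The grammar here is a standard textbook example, not the post's. Context-freeness is visible in the code itself: each nonterminal is one function that expands the same way wherever it is called, with no knowledge of its surrounding context.

```python
# Editor's sketch: a recursive-descent parser for the context-free grammar
#   expr   -> term (('+'|'-') term)*
#   term   -> factor (('*'|'/') factor)*
#   factor -> NUMBER | '(' expr ')'
import re

TOKEN = re.compile(r"\s*(?:(\d+)|(.))")  # integers, or any single-char operator

def tokenize(src):
    for num, op in TOKEN.findall(src):
        yield ("NUM", int(num)) if num else ("OP", op)

class Parser:
    def __init__(self, src):
        self.toks = list(tokenize(src)) + [("EOF", None)]
        self.pos = 0

    def peek(self):
        return self.toks[self.pos]

    def take(self):
        tok = self.toks[self.pos]
        self.pos += 1
        return tok

    def expr(self):                          # expr -> term (('+'|'-') term)*
        val = self.term()
        while self.peek() in (("OP", "+"), ("OP", "-")):
            _, op = self.take()
            val = val + self.term() if op == "+" else val - self.term()
        return val

    def term(self):                          # term -> factor (('*'|'/') factor)*
        val = self.factor()
        while self.peek() in (("OP", "*"), ("OP", "/")):
            _, op = self.take()
            val = val * self.factor() if op == "*" else val / self.factor()
        return val

    def factor(self):                        # factor -> NUMBER | '(' expr ')'
        kind, val = self.take()
        if kind == "NUM":
            return val
        assert (kind, val) == ("OP", "("), "expected a number or '('"
        inner = self.expr()                  # same function, any context: context-free
        assert self.take() == ("OP", ")"), "expected ')'"
        return inner

print(Parser("2 * (3 + 4) - 5").expr())  # prints 9
```

This hand-rolled style is exactly what grammar tools in the lex/yacc family automate from a grammar specification, which is why the context-free restriction matters so much in compiler front ends.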
Category: Uncategorized | 6269 reads | 1 comment
