From: Alexander Bokovoy
To: mandrake-russian@altlinux.ru
Date: Fri, 18 Jan 2002 11:32:07 +0200
Subject: [mdk-re] Virtual machines (Was: Re: CD eject in Nau is a huge conflict with automount.)

On Fri, Jan 18, 2002 at 06:22:22AM +0300, Alexandre Prokoudine wrote:
> C# is quite an interesting language in its own right. I am in no way
> trying to start a holy war over languages. On my part that would be
> unprofessional in the highest degree :-)

I cannot resist forwarding this old message from David Simmons after all.
David is known as one of the best experts in the field of designing virtual
machines for implementations of various programming languages; in fact, he
stood at the origin of all modern just-in-time compiler implementations. I
have cut out the parts of the discussion that are not relevant to the point
I want to demonstrate. The topic is the invocation of virtual methods and
the speed/correctness of its implementation in C++ and Java compared to
dynamic languages such as Smalltalk or Ruby. Pay particular attention to the
fragment of his reply to the previous message of the discussion, marked
below between (***) symbols; essentially, the whole description below is an
elaboration of that thesis. David is currently working under contract with
Microsoft on improving IL, the internal language of .NET into which all the
other languages, including C#, are translated. His main task is to teach IL
to work with dynamic languages, in which classes/objects can modify
themselves at runtime.

----- Forwarded message from David Simmons -----

Date: Mon, 26 Nov 2001 08:05:26 +0900
From: "David Simmons"
To: ruby-talk@ruby-lang.org (ruby-talk ML)
Subject: [ruby-talk:26473] Re: Table: Ruby versus Smalltalk, Objective-C, C++, Java

> same in static typing: http://www.eptacom.net/pubblicazioni/pub_eng/mdisp.html

I was not aware that someone was "bothering" to use RTTI (or templates
[macros]) to address this issue in C++. Given the importance of C++ I should
hardly be surprised. However (in broad terms), this is the problem with
implementing genericity in languages which only support static typing [as I
alluded to in my previous post regarding the languages Java and C#].
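For illustration, the RTTI approach usually ends up looking something like
the following minimal C++ sketch (the names here are invented and purely
illustrative -- this is not the code from the page cited above):

    #include <iostream>

    struct Shape  { virtual ~Shape() {} };
    struct Circle : Shape {};
    struct Square : Shape {};

    // Simulated multi-method: pick the implementation from the *runtime*
    // types of both arguments by probing them with dynamic_cast (RTTI).
    void collide(const Shape& a, const Shape& b) {
        if (dynamic_cast<const Circle*>(&a) && dynamic_cast<const Circle*>(&b))
            std::cout << "circle/circle\n";
        else if (dynamic_cast<const Circle*>(&a) && dynamic_cast<const Square*>(&b))
            std::cout << "circle/square\n";
        else
            std::cout << "generic collision\n"; // every new type needs another branch here
    }

    int main() {
        Circle c;
        Square s;
        const Shape& x = c;   // static type Shape, dynamic type Circle
        const Shape& y = s;
        collide(x, y);        // prints "circle/square"
    }

It works, but the dispatch table is open-coded by hand and has to be edited
every time a type is added; nothing in the language keeps it complete or
consistent.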
As I said in my previous post, the ideal is a language which has both static
and dynamic typing with overloading/multi-methods -- lacking that, numerous
issues crop up. This is just a "facet" of that generics issue.

[s/.*//skip]

> > > http://people.mandrakesoft.com/~prigaux/overloading2.java
> >
> > Sure, the example is clearly illustrating the problem. Static type
> > binding of dynamic type information does not work. I.e., languages which
> > only have static binding cannot provide proper semantics for
> > method-implementation selection based on argument types.
>
> hum. C++ and Java do not have "only static binding".

I think we both understand the issue. The problem here is that the
"semantics" of the term dynamic-binding have been polluted/overloaded. So,
as with many topics in comparing languages with regard to "static" and
"dynamic" subjects, we quickly get bogged down in terminology as we try to
use that terminology to explain/discuss concepts. If all parties don't have
a clear consensus on the definitions of the terminology, the discussions
rapidly digress into areas that are the result of that lack of consensus.

So, ignoring the terminology: C++ and Java "typically" (but not always)
implement virtual functions using vtables, where a vtable is an array of
function pointers to the virtual methods associated with a given type. These
languages are not only statically typed, they are based on the concept of
static binding of type information as well. In an effort to be "object
oriented" [whatever that means] they attempt to provide [OO] polymorphism
using vtables, which are a weak/limited solution providing polymorphism over
one type only (the message receiver).

For static binding based on type information, this means that at compile
time the decision is made as to what method to invoke based on the types of
the arguments. The partial-polymorphic-solution exception is built by
assuming that all types are known at compile time, and that, therefore,
tables can be constructed containing common methods for type-trees based on
the type of the receiver. So one gets a binary-deployment system that cannot
break those compile-time rules of all types being known in advance [it is
not dynamic/extensible at runtime and so is brittle], and which also cannot
address the problem of polymorphism over the arguments to methods [which
leads to problems in implementing/supporting genericity, etc., because the
model is based entirely on static knowledge]. Some languages, like ML, have
taken great strides toward addressing this problem and ensuring correct
behavior, but I (currently) think it requires supporting both a static and a
dynamic type and binding model.

So, back to exploring issues in the vtable approach. As mentioned before, it
is based/built upon static knowledge of the available set of types. Its
indirection mechanism is therefore based on static knowledge, which makes it
a static (lookup table) binding mechanism. I.e., the vtable-index [a
first-degree binding abstraction approach -- as opposed to a selector-object
in a dynamic language, which is a second-degree binding abstraction] is
statically bound into source and thus [among other things] precludes
handling dynamic [binding impact] changes to the classes and methods.

I.e., vtables are limited in two semantic ways and additionally in one
technical way.
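Concretely (another purely illustrative C++ sketch with invented names; this
is not the overloading2.java example quoted above): a call through the
receiver goes through the vtable and is resolved from the object's runtime
type, while the overload chosen for an argument is frozen at compile time
from the argument's static type.

    #include <iostream>

    struct Event {
        virtual ~Event() {}
        virtual const char* kind() const { return "Event"; }
    };

    struct KeyEvent : Event {
        const char* kind() const { return "KeyEvent"; }
    };

    // Overloads: the compiler picks one from the *static* type of the argument.
    void dispatch(const Event&)    { std::cout << "dispatch(Event)\n"; }
    void dispatch(const KeyEvent&) { std::cout << "dispatch(KeyEvent)\n"; }

    int main() {
        KeyEvent k;
        Event& e = k;                   // static type Event, dynamic type KeyEvent

        std::cout << e.kind() << "\n";  // "KeyEvent" -- receiver (vtable) dispatch is dynamic
        dispatch(e);                    // "dispatch(Event)" -- argument binding was fixed at compile time
        dispatch(k);                    // "dispatch(KeyEvent)" -- only because the static type says so
    }

A language with dynamic binding over arguments would pick the implementation
from the runtime types of all the arguments; here the call through the base
reference silently falls back to the base overload.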
1. Semantically, it only works for whole-cloth programs where no schema
changes [in the binary/compiled form] will occur to the classes, namespaces,
or methods available within the program [that's what makes them static
languages]. Why? Because if such changes occur, the vtable layouts may need
to be changed, and the call sites that use indexed lookup will have to have
their indices changed [not even considering the issues of the methods
themselves, which have optimized-away/unfolded encapsulation of object
structure, etc.]. None of which is possible, because all the necessary
meta-information was thrown away at compile time [many things are not
objects, most things have no self-describing information (i.e., nominal to
no reflection facilities) -- RTTI being C++'s only facility].

2. In a unified OO view, there are no "pure functions", because classes [or
prototypes], namespaces, etc. are all objects, and functions are just
methods on those classes [or prototypes] and namespaces. All operations
semantically become about the messages that objects can understand
(perform). The concerns about the physical layout/structure of a type
vanish, leaving us [humans] only concerned [in general] with the message
vocabulary [i.e., behavior/interfaces]. The compilers are where the
optimization and knowledge about physical representation become paramount
for performance and external inter-op [and humans only care when they are
meshing with the boundary points -- usually to create optimized small
algorithms; an act which requires greater attention and focus than one would
normally want to have to put into describing a problem to the computer via a
programming language -- generally we want our programming language to be
capable of being as transparent as possible, so we can focus on the problem
domain itself rather than the problem of describing the problem to the
computer].

Parameterized types in object definitions are relevant to humans in terms of
contractual (behavioral) adherence. To the compiler(s) they allow
optimizations to be achieved for better performance and resource utilization
[potentially as guided by a human author]. Which gets us back to genericity
and best efforts at being informed "statically" about design-time
behavior-binding (type) errors in our use of contracts, as opposed to
runtime contractual behavior-binding (type) errors that can always be
detected in a well-designed language execution architecture.

Which leads us into the discussion of genericity in static-only binding-type
systems [which have the potential to yield optimum performance at a price in
terms of correct behavior and/or expressiveness], in dynamic-only
binding-type systems [which have the potential to yield correct behavior at
a price in terms of performance], and in systems that offer both forms of
binding [which have the potential to yield both correct behavior and optimum
performance].

Proponents of static typing and correctness argue the virtues of design-time
detection. Proponents of dynamic typing argue the virtues of unit testing
and the dangers of relying on static checking of type bindings as a
substitute for verifying the interop-relation semantics inherent in any
system of basic complexity. Both parties have valid and good points, and
neither is wrong in my view. Both are addressing the same problem space with
different techniques; and both have been shown to eliminate a large
proportion of defects [albeit leaving different types of defects
undetected].
But, in my view, the significance is in the human "level of effort" factor
in expressing an original problem, coming later to understand such a design,
and later still working with an existing design to maintain or extend it.

** sigh, I seriously digressed [and now I will be in a deep rathole as
people beat me up for errors and/or disagreements with my assertions] **

#2 is principally that, given that it is desirable to discriminate a
function's implementation based on the types of the arguments, vtables as a
polymorphic solution offer no mechanism of support for dispatching to a
type-specific implementation based on polymorphism amongst the arguments to
a function/method. The general approach is to use hand-crafted secondary
dispatch [which for arity-1/1-arg methods is called double-dispatch].
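A minimal, purely illustrative sketch of what that hand-crafted
double-dispatch typically looks like in C++ (the Shape/Circle/Square names
are invented for the example):

    #include <iostream>

    struct Circle;
    struct Square;

    struct Shape {
        virtual ~Shape() {}
        // First dispatch: resolved through the vtable on the receiver.
        virtual void collideWith(const Shape& other) const = 0;
        // Second dispatch: 'other' sends back to the original receiver,
        // which now appears with a concrete static type.
        virtual void collideWithCircle(const Circle&) const = 0;
        virtual void collideWithSquare(const Square&) const = 0;
    };

    struct Circle : Shape {
        void collideWith(const Shape& other) const { other.collideWithCircle(*this); }
        void collideWithCircle(const Circle&) const { std::cout << "Circle::collideWithCircle\n"; }
        void collideWithSquare(const Square&) const { std::cout << "Circle::collideWithSquare\n"; }
    };

    struct Square : Shape {
        void collideWith(const Shape& other) const { other.collideWithSquare(*this); }
        void collideWithCircle(const Circle&) const { std::cout << "Square::collideWithCircle\n"; }
        void collideWithSquare(const Square&) const { std::cout << "Square::collideWithSquare\n"; }
    };

    int main() {
        Circle c;
        Square s;
        const Shape& a = c;
        const Shape& b = s;
        a.collideWith(b);   // prints "Square::collideWithCircle": both runtime types took part
    }

Every pair of types needs its own method, and adding a new Shape subclass
means touching the base class and every existing subclass -- which is
exactly the maintenance burden that real multi-methods remove.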
3. The technical issue is that vtables are slower (on current hardware
technology) than using adaptive jitting techniques with self-modifying code.
This is primarily due to the cost of indirectly accessing memory to obtain
an address from a vtable, which precludes processors from using
eager/optimistic prediction; where accessing the memory also results in
bouncing from L1 cache to L2 cache to primary memory [and that is
increasingly expensive as the timing gap between those forms of memory
increases -- similar in nature to the impact on design that one observed in
algorithms designed around the 50's-70's problem of tape/disk/RAM gaps].

*. In a just-in-time binding approach pioneered in Smalltalk during the mid
80's, one uses self-modifying code and assumes that most call sites are not
polymorphic [which is generally true]. If they are polymorphic, then that
breaks down into the case of having a low degree of polymorphism [say 2-10
types] and the case of having a large degree of polymorphism, which
literature in this area popularly terms mega-morphism. See OOPSLA papers
from the mid-80's to early 90's.

This technique was the basis for the design of the Self language, which in
turn led to the design of the Animorphic HotSpot VM/Execution-Engine/Runtime
technology for Smalltalk; which in turn was acquired by Sun to try and speed
Java [the JVM] up.

What has not, to my knowledge, been applied is a more generalized solution
of the same technique to multiple dispatch (overloading/multi-methods). I
have done it for both my own general-purpose dynamic-language virtual
machine (the AOS Platform), and the Microsoft .NET platform (less optimally,
through IL). I must presume that someone has implemented my technique
before; almost certainly someone in the Lisp family community. However, it
is unlikely that it has also been optimized for hi-performance execution
with important newer binding predicates, including sandboxing and selector
namespaces.

Providing hi-performance dynamically-dispatched multi-methods has a
significant impact on scripting languages and their ability to evolve into
full-fledged languages that compete in features and performance with
mainstream languages today such as C, C++, Java, [C# will soon be in this
group if it is not already], Pascal derivatives, etc. This is especially
important because there are, by most accounts, 10-20 times more people and
programs using scripting languages and techniques, and that number is likely
increasing as cost factors and training/experience come into play.

(***)
> > It is worse than just describing a C++-like mechanism as "the
> > vtable-trick". VTables are actually (demonstrably) slower than "true"
> > (receiver-only) dynamic-binding-dispatch mechanisms. This fact is
> > (reasonably well) understood today. It is a reality that will
> > increasingly be the case as long as the gap between processor core
> > speeds and L2 cache and memory speeds continues to widen.
> >
> > It is fairly easy to illustrate on the Intel processor family. I've
> > posted (on comp.lang.smalltalk within the last 12 months) at least two
> > detailed explanations showing the machine instructions, cycle times, and
> > benchmarks. The originally published technique was developed by David
> > Ungar and published in OOPSLA papers in the mid-80's. It has been a
> > standard part of most jit-based Smalltalk implementations for the last
> > ten years or so.
(***)

> are you talking about
> http://www.sun.com/research/self/papers/type-feedback.html ?
>
> the idea is quite simple: specialize for a given type of object to allow
> inlining. In that case, to know which specialization to use, run-time
> feedback is used. This applies well to JITs of course.
>
> I don't see why it proves that vtable is bad/slower. vtable can also
> benefit from specialization. This is also why the default in C++ is
> non-"virtual" methods, so that performance is the best.
>
> This is well known, no?

I think, based on your comments, that you are not aware of what I have been
doing for the last ten years or so. One of my professional areas of
specialty is the design of virtual machines. Your comments sound like the
kind of things I would have written as responses to you.

As to the bad/slower point: in the last 12 months I wrote two different
threads of discussion on this topic [in comp.lang.smalltalk] describing the
instructions, cycles, and benchmark information. I also made reference to
that fact somewhere in this current thread of discussion, where I also
mentioned that I had not published information on the much more general
techniques (which I suspect I have pioneered) for hi-performance dynamic
dispatch of common/important predicates, with extensibility for general
predicate dispatch through Dynamic-AOP/Managed-Object facilities in the
object model of the AOS Platform [the VM architecture I designed and have
been evolving for the last ten years], as well as its related peer work, an
enabler for the Microsoft .NET platform.

> In the ML family, the same happens when going from polymorphic functions
> (needing boxing) to monomorphic functions (with unboxed data).
> See for example:
> - the SPECIALIZE pragma in GHC
>   (http://www.haskell.org/ghc/docs/4.04/users_guide/users_guide-5.html)
> - type-based unboxing (Leroy) http://citeseer.nj.nec.com/88305.html

> > However, I intentionally have not published information on techniques I
> > have developed (on the AOS Platform VM) for hi-performance
> > predicate-based (incl. multi-method) dispatching. Especially with regard
> > to implementation on the .NET platform, where I am still exploring with
> > Microsoft [who needs this technology as much as Sun/Java does].
>
> "Where Do You Want To Go Today?" ;p
>
> Microsoft has a *lot* of people working on languages. Hopefully many are
> allowed to publish their work (and even release GPL apps)

I know. As a 3rd party, I and others have been fortunate to have the
opportunity to interact with [and influence] quite a few of them [it would
be nice to be able to observe the same attitude and opportunity to have an
influence on Sun folks].
Of course, Microsoft has a clear business objective, and I both hope and
greatly fear that they will be very successful in achieving it with their
approach [the word trepidation has constantly been in my thoughts since I
first got involved with Microsoft on the .NET project in 1999].

The real challenge for scripting and dynamic languages lies in the
predominance and momentum of ideas and beliefs regarding type theory and
statically-typed languages, and the relative/disproportionate lack thereof
for dynamic languages (especially OO languages like Smalltalk, as opposed to
functional dynamic languages like Scheme).

-- Dave S. [www.smallscript.com]

----- End forwarded message -----

-- 
/ Alexander Bokovoy
$ cat /proc/identity >~/.signature
`Senior software developer and analyst for SaM-Solutions Ltd.`
---
Nov 21 20:58:58 alconost kernel: VFS: Busy inodes after unmount.
Self-destruct in 5 seconds.  Have a nice day...