Suzanne Cook - Developing the CLR, Part II

Posted by The Channel 9 Team // Sun, Aug 21, 2005 5:15 PM

Suzanne talks about why she got into programming, what her favorite language is, and gives more insights into what it's like working on the CLR team.

If you missed it, here's part I.

Video Length: 00:20:54 Replies: 75 // Views: 76,529
  dotnetjunkie
 
 
  Mon, Feb 14 2005 8:56 PM

Does the tour continue in a Part III ? :-)



  Minh
  Does this make my head look fat?
 
  Mon, Feb 14 2005 10:14 PM
Do you foresee a day when JIT'ed code runs as fast as unmanaged C++ code? I mean, that's theoretically possible, right? Just a matter of how efficient the JIT compiler is compared to the C++ compiler? And you guys are working on an MSIL CPU, right?


  dotnetjunkie
 
 
  Mon, Feb 14 2005 10:32 PM
I don't think that's an issue.  99% of managed code runs without any performance problems today.  And for the remaining few applications, you can always switch to faster hardware :).  Which doesn't mean that the .NET developers at Microsoft don't have to care about performance while coding of course!!! ;)

  dotnetjunkie
 
 
  Mon, Feb 14 2005 11:17 PM
LOL, is this an old homepage of Suzanne Cook?
http://www.cs.utah.edu/~scook/

:-)

Must be with all the geek stuff on it! :-)

It was funny to see that video clip on the front page! I've seen it on TV many times, it was made here in Belgium (where I live).


  Minh
  Does this make my head look fat?
 
  Mon, Feb 14 2005 11:43 PM
dotnetjunkie wrote:
I don't think that's an issue.  99% of managed code runs without any performance problems today.  And for the remaining few applications, you can always switch to faster hardware :).  Which doesn't mean that the .NET developers at Microsoft don't have to care about performance while coding of course!!! ;)
I think there's a real noticeable difference in GUI apps.


  Beer28
  I contend Channel9 is a covert research project
 
  Tue, Feb 15 2005 1:39 AM
Minh wrote:
Do you foresee a day when JIT'ed code runs as fast as unmanaged C++ code? I mean, that's theoretically possible, right? Just a matter of how efficient the JIT compiler is compared to the C++ compiler? And you guys are working on an MSIL CPU, right?


I don't think it's possible, unless you have a homemade C++2ASM generator, and then wrote your own homebrew ASM2Opcode generator to complement it.

If your two instruction converters were really bad, you could easily end up emitting more instructions than the managed version.

A C (or, to a lesser extent, C++) asm generator is supposed to write assembly language like a master assembly programmer, though, and they've been fine-tuning it for years, so fat chance, in my opinion. Plus you have fine-tuning keywords like fastcall for passing args in registers instead of on the stack, and inlining, etc.
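
As a rough illustration of the two tuning knobs mentioned above, here is a minimal C++ sketch. The `SKETCH_FASTCALL` macro and all function names are made up for this example. The macro only expands to a register-based calling convention on 32-bit x86 compilers that offer one and expands to nothing elsewhere, and `inline` is a hint the optimizer is free to ignore.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper macro (not from the thread): ask for a register-based
// calling convention where the compiler has one, otherwise expand to nothing.
#if defined(_MSC_VER) && defined(_M_IX86)
#define SKETCH_FASTCALL __fastcall
#elif defined(__GNUC__) && defined(__i386__)
#define SKETCH_FASTCALL __attribute__((fastcall))
#else
#define SKETCH_FASTCALL
#endif

// Small helper the optimizer may expand at each call site instead of
// emitting a call; 'inline' is a request, not a guarantee.
inline std::int32_t scale(std::int32_t x) {
    return x * 3;
}

// With fastcall on 32-bit x86, the first two integer arguments travel in
// registers (ECX and EDX) rather than on the stack.
std::int32_t SKETCH_FASTCALL add_scaled(std::int32_t a, std::int32_t b) {
    return scale(a) + scale(b);
}

// Sanity check for the sketch: (1 * 3) + (2 * 3) == 9.
inline void check_sketch() {
    assert(add_scaled(1, 2) == 9);
}
```

Whether either hint actually makes the code faster depends on the compiler and target, so treat this as a picture of the knobs rather than a benchmark.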

Have you tried the GNU gcj compiler?

http://gcc.gnu.org/java/

If you compile managed code to native such as this compiler does, then through some optimizations you'd have a shot at writing in managed code and spitting out opcode as optimized as C. The whole "managing of the code" is going to make that impossible if you don't though.

As far as a "managed" CPU goes, your opcodes are actually managed inside the CPU you have now. There are several tiers of steps inside the Pentium that break down your code and throw exceptions and faults, etc.

It is managed, but on a level that is too obscure for normal users.

If they made a CPU that took managed instructions, it wouldn't be managed or portable code anymore. On top of that, a CLR inside the CPU that handled the virtual stack for objects larger than 32/64 bits, etc., and recognized object references would be horribly complex for hardware-embedded software, and it could never be updated. You could possibly make it flashable, but can you imagine grandma flashing her CPU? **


The whole point behind managed code is that it's virtual. It's made for application level code, and coders who don't need the fine grain of tuning provided by direct instruction conversions to cpu readable code.

EDIT: Even there, it would be unmanaged at some point inside the CPU. It's been years since the Altair 8800 was released, but ironically, inside the CPU, as I've been reading, instructions are broken down into microops, and in the end that big bad Pentium is just a really complex Altair doing masking of 8-32 bit codes.

EDIT2: ...with branch prediction, out of order microop processing, and instruction caching.


  rhm
  Love the OS, hate the advocates.
 
  Tue, Feb 15 2005 7:03 AM
dotnetjunkie wrote:
LOL, is this an old homepage of Suzanne Cook?
http://www.cs.utah.edu/~scook/


Web stalker alert!


  Minh
  Does this make my head look fat?
 
  Tue, Feb 15 2005 11:56 AM
Beer28 wrote:

Have you tried the GNU gcj compiler?

http://gcc.gnu.org/java/


.NET has the equivalent in NGEN -- the Native Image Generator -- that compiles the entire .NET assembly right away to x86 code. If they can stick an optimizing .NET compiler into NGEN, then it'd be darn close. Maybe that new Unified Compiler that Research is working on can be it.

Beer28 wrote:

If they made a CPU that took managed instructions, it wouldn't be managed or portable code anymore. On top of that, a CLR inside the CPU that handled the virtual stack for objects larger than 32/64 bits, etc., and recognized object references would be horribly complex for hardware-embedded software, and it could never be updated. You could possibly make it flashable, but can you imagine grandma flashing her CPU? **


OK, I'll let the Managed CPU idea go. How about a Managed Extension card. Maybe kind of like a Graphics Co-processor. Kinda like back in the days you'd buy Expanded Memory cards to run Lotus 1-2-3.



  PDoug
 
 
  Tue, Feb 15 2005 3:42 PM
dotnetjunkie wrote:
LOL, is this an old homepage of Suzanne Cook?
http://www.cs.utah.edu/~scook/

:-)

Must be with all the geek stuff on it! :-)

It was funny to see that video clip on the front page! I've seen it on TV many times, it was made here in Belgium (where I live).


Hey! That kid is me! At least that is the way I was when I was small. (This is what my mother alleges. Of course I have no recollection of it.) :-)

  Beer28
  I contend Channel9 is a covert research project
 
  Tue, Feb 15 2005 3:13 PM
Minh wrote:
Beer28 wrote:

Have you tried the GNU gcj compiler?

http://gcc.gnu.org/java/


.NET has the equivalent in NGEN -- the Native Image Generator -- that compiles the entire .NET assembly right away to x86 code. If they can stick an optimizing .NET compiler into NGEN, then it'd be darn close. Maybe that new Unified Compiler that Research is working on can be it.



You don't understand: gcj actually compiles to a native binary that you can zip/tar and give to a friend, who can run it on his or her machine with the runtime "dll", just like you could with a Visual Basic 5 or 6 app by including msvbvm50.dll or msvbvm60.dll.

If you ngen a program and copy it out of the assembly cache, which you must do in command.com incidentally because explorer shell will not let you navigate to it, it will not work.

NGEN does not produce redistributable binary files. The stuff in the NGEN file is made for the CLR on that machine to reload when it starts back up; it's not intelligible to the Windows module loader, so if you zip it up and send it off to somebody with the same version of the CLR, and they double-click it, it will not work.

gcj actually makes something tangibly reusable that you can distribute as a standalone with the support "dll", which in unix speak is a shared object file, .so.


  Minh
  Does this make my head look fat?
 
  Tue, Feb 15 2005 4:26 PM
Beer28 wrote:

You don't understand: gcj actually compiles to a native binary that you can zip/tar and give to a friend, who can run it on his or her machine with the runtime "dll", just like you could with a Visual Basic 5 or 6 app by including msvbvm50.dll or msvbvm60.dll.

Hmm... it seems I've misunderstood the function of NGEN.

Beer28 wrote:

gcj actually makes something tangibly reusable that you can distribute as a standalone with the support "dll", which in unix speak is a shared object file, .so.
Unfortunately, I don't have the time to commit to an entire new platform. For now, .NET is my koolaid.


  Beer28
  I contend Channel9 is a covert research project
 
  Tue, Feb 15 2005 6:24 PM
Minh wrote:
Beer28 wrote:

You don't understand: gcj actually compiles to a native binary that you can zip/tar and give to a friend, who can run it on his or her machine with the runtime "dll", just like you could with a Visual Basic 5 or 6 app by including msvbvm50.dll or msvbvm60.dll.

Hmm... it seems I've misunderstood the function of NGEN.


A lot of people who misunderstand it think NGEN is actually like gcj, that it makes a native module you can just copy out and put in an installer.

It doesn't do that. I'm not sure what the format is, whether it's a mapped file dumped to disk with VAs for call fixups or what, but it's definitely not runnable outside the cache or redistributable. That is one area where GNU has pulled ahead of MS.


  rhm
  Love the OS, hate the advocates.
 
  Tue, Feb 15 2005 6:38 PM
It's actually not as big a deal as you think. See this.

There's a good reason MS doesn't support this kind of deployment. Applications deployed in this way won't get any hotfixes to the runtime or base class libraries. In fact people redistributing parts of the .NET framework using that tool are actually breaking the EULA.


  rhm
  Love the OS, hate the advocates.
 
  Tue, Feb 15 2005 6:44 PM
Now if I can mention something that's actually to do with the video.....

It was interesting to hear Suzanne talk about how/why she got into programming. I've wondered if I have a girl whether there was any chance she would be as geeky as her dad or whether she would rebel and refuse to use computers for anything other than communications/art/lifestyle etc. Note that I'm assuming any sons would automatically be geeks :)  They're all going to get Leapfrog stuff as soon as they can see and make coordinated hand movements.

Oh, and another thing from the video:  You mentioned Mono and it wasn't cut out by PR (do PR really review stuff posted by Microsofties?). That's as close to a public endorsement of the Mono project as I've seen yet :)


  Minh
  Does this make my head look fat?
 
  Tue, Feb 15 2005 6:50 PM
rhm wrote:
do PR really review stuff posted by Microsofties?
At my last job, every job posting must be made public first before a hire can be made (to prevent the Project Manager's cousin being the only applicant, I guess).


  Beer28
  I contend Channel9 is a covert research project
 
  Tue, Feb 15 2005 7:27 PM
rhm wrote:
It's actually not as big a deal as you think. See this.

There's a good reason MS doesn't support this kind of deployment. Applications deployed in this way won't get any hotfixes to the runtime or base class libraries. In fact people redistributing parts of the .NET framework using that tool are actually breaking the EULA.


You're actually mistaken: gcj isn't redistributing parts of the Java runtime from Sun, as is the case with Salamander.

It's actually compiling Java to asm, as you would compile C++ to asm, then turning the asm into opcodes and making an ELF file, which is the Unix/Linux version of .exe.

Salamander is a cheap trick next to gcj. To do the same on Windows, somebody would have to make a C# compiler from scratch that compiles to real CPU asm, not virtual asm codes, or an MSIL-to-asm compiler in the worst-case scenario.

There is a huge gaping difference in your comparisons. You may have deployment situations where you want certain functionality that has been deprecated or removed by a later version.

With the CLR, you never know which version a person will have on their machine. Say that by the time CLR 10 rolls around, 25% of CLR 1's functions have been deprecated or are gone. That's when your statically linked gcj binary shines. The OS at that time will come with the latest virtual machine, and each additional one will be a 20+ meg download. That's a lot of versions to load up by then for one program that has it as a dependency.

Some functions in msvbvm5 were no longer present in msvbvm6, but if you included the msvbvm5.dll in your distribution, you could keep your code going for years in distro with no mods.

That is the power of binary images, loadable, usable binary images.
If you distribute the source, then that's another story.

Oh, that reminds me, with Java compiled to native instructions, you can NO LONGER USE djdecompiler, or .NET reflector or anything else.

Your code is now as hard to look at as compiled C++ or C code. Obfuscation won't fool good coders who use debuggers. When you get down to disassembled x86 or 64-bit code, that's a nightmare, unless you're looking at easily crackable programs that have a bool-type pass/fail, where a cracker could NOP out a check call or change a verification result.

So native code is also good because it's an added layer of protection against reverse engineering.
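
To make the "bool type pass/fail" point concrete, here is a minimal C++ sketch of that kind of gate. The function names and the key string are invented for illustration and stand in for any serial or licence check.

```cpp
#include <cassert>
#include <string>

// Hypothetical check standing in for any serial or licence verification.
bool verify_key(const std::string& key) {
    return key == "hypothetical-key";
}

// The whole protection hinges on one boolean result and one branch. Once
// compiled, that is a call followed by a conditional jump, which is the
// spot the post describes a cracker patching.
std::string run_protected(const std::string& key) {
    if (!verify_key(key)) {
        return "denied";
    }
    return "granted";
}

// Sanity check for the sketch.
inline void demo_gate() {
    assert(run_protected("hypothetical-key") == "granted");
    assert(run_protected("wrong") == "denied");
}
```

The weakness here is in the design (a single decision point), so it applies whether the program ships as native code or as bytecode.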

  rhm
  Love the OS, hate the advocates.
 
  Tue, Feb 15 2005 8:11 PM
Beer28 wrote:
rhm wrote:
It's actually not as big a deal as you think. See this.

There's a good reason MS doesn't support this kind of deployment. Applications deployed in this way won't get any hotfixes to the runtime or base class libraries. In fact people redistributing parts of the .NET framework using that tool are actually breaking the EULA.


You're actually mistaken: gcj isn't redistributing parts of the Java runtime from Sun, as is the case with Salamander.


You're the one who's mistaken, and who doesn't read other people's posts properly before spouting complete crap. I'd like to be nice about this but you are proving to be a complete arse. You're assuming that because people don't agree with you that they don't understand. Take your head out of your arse for a while and actually read this reply, done in the finest Usenet line-by-line nit-pick fashion, and then if you feel the need to continue your misinformation campaign, do it in the knowledge that I won't be reading any of it.

Right, your first mistake is that I didn't say gcj was re-distributing anything from Sun. I said that Salamander users were likely re-distributing parts of the .NET framework which is something MS doesn't want them to do and not just for copyright reasons.

Beer28 wrote:

It's actually compiling Java to asm, as you would compile C++ to asm, then turning the asm into opcodes and making an ELF file, which is the Unix/Linux version of .exe.

Salamander is a cheap trick next to gcj. To do the same on Windows, somebody would have to make a C# compiler from scratch that compiles to real CPU asm, not virtual asm codes, or an MSIL-to-asm compiler in the worst-case scenario.


What do you think ngen does? It compiles MSIL to actual platform-dependent machine code and then puts it on disk, just like gcj does. Now I know what you're thinking: "but gcj really compiles it because it no longer needs a runtime present, unlike ngen-created executables". Well, there's really nothing fundamentally different happening at all. Gcj's output requires a shared library to run. What do you think is in that library? All the same stuff that the ngen'd executables require from the .NET framework is the answer. Code compiled by gcj still needs a garbage collector as well as other basic services mandated by the Java object model. It doesn't require a big framework to be present, though, because all the parts of the Java framework library that the program uses will have been compiled into the executable. Which is what salamander does for you. Between salamander and ngen you get exactly the same thing as you do with gcj.

Beer28 wrote:

There is a huge gaping difference in your comparisons. You may have deployment situations where you want certain functionality that has been deprecated or removed by a later version.


No huge gaping difference, as explained above. Functionality doesn't just disappear from the .NET runtime. New versions are installed in parallel with old versions. If your program requires a particular version of the framework, it can be shipped with it or otherwise insist on its presence. The only time the framework gets changed is for hotfixes - changes that do not affect functionality, generally only issued for security reasons.

Beer28 wrote:

With the CLR, you never know which version a person will have on their machine. Say that by the time CLR 10 rolls around, 25% of CLR 1's functions have been deprecated or are gone. That's when your statically linked gcj binary shines. The OS at that time will come with the latest virtual machine, and each additional one will be a 20+ meg download. That's a lot of versions to load up by then for one program that has it as a dependency.


Let's say that in 10 CLR versions' time you're still running an app that hasn't been updated for the latest framework or any version of the framework in between. You still only need that one version of the framework present to run that app. To get in a situation where you need 10 different frameworks installed at once you'd need to have obsolescent software written using every single one of those versions. I feel pretty confident that by the time MS have released 10 complete versions of the CLR, nobody will be that bothered by the disk space used. The nice thing is that the CLR is shared between apps (that target the same version), so you're not dealing with hulking great executables that have had everything statically linked into them. But if you want a hulking great exe with everything it needs either linked in or contained in the same folder, salamander does that for you.

Beer28 wrote:

Some functions in msvbvm5 were no longer present in msvbvm6, but if you included the msvbvm5.dll in your distribution, you could keep your code going for years in distro with no mods.


.NET doesn't have this problem as already explained. The right version of the CLR will load based on metadata in the executable.
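
A toy model of the side-by-side idea described above, written as a C++ sketch. It is not the CLR's actual binding logic: the `AppMetadata` struct, the `pick_runtime` function, and the exact-match rule are all assumptions made for illustration, and the version strings are just example values.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-in for the version information stored in an executable's
// metadata (hypothetical type, for illustration only).
struct AppMetadata {
    std::string required_runtime;
};

// Returns the runtime to load, or an empty string if the required version
// is not installed. Exact match only; real binding policy is more involved.
std::string pick_runtime(const std::set<std::string>& installed,
                         const AppMetadata& app) {
    if (installed.count(app.required_runtime) != 0) {
        return app.required_runtime;
    }
    return std::string();
}

// Two runtimes installed in parallel; each app gets the one it asks for.
inline void demo_side_by_side() {
    const std::set<std::string> installed = {"v1.0.3705", "v1.1.4322"};
    assert(pick_runtime(installed, AppMetadata{"v1.0.3705"}) == "v1.0.3705");
    assert(pick_runtime(installed, AppMetadata{"v2.0.50727"}).empty());
}
```

The point of the model is only that an old app keeps asking for its own version, so installing a newer runtime next to it changes nothing for that app.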

Beer28 wrote:

That is the power of binary images, loadable, usable binary images.
If you distribute the source, then that's another story.


A lecture on "The power of binary images" from an open source advocate. You've got to love the irony.

Beer28 wrote:

Oh, that reminds me, with Java compiled to native instructions, you can NO LONGER USE djdecompiler, or .NET reflector or anything else.

Your code is now as hard to look at as compiled C++ or C code. Obfuscation won't fool good coders who use debuggers. When you get down to disassembled x86 or 64-bit code, that's a nightmare, unless you're looking at easily crackable programs that have a bool-type pass/fail, where a cracker could NOP out a check call or change a verification result.

So native code is also good because it's an added layer of protection against reverse engineering.


Don't kid yourself. The machine code generated by gcj is, I'm willing to bet, very predictable. One thing that makes it much easier to decompile than C++ code is the presence of the class metadata needed to support garbage collection and late binding. Sure, it's an added layer of obscurity, but given that the sources of the compiler are available, I don't think it would take a decent programmer too long to write a decompiler for it. I was researching the decompilation of DOS binaries at university and there's plenty of stuff you can do even if you don't know the language the source was written in, never mind the compiler and the source for it.


  Beer28
  I contend Channel9 is a covert research project
 
  Tue, Feb 15 2005 9:38 PM
"Which is what salamander does for you. Between salamander and ngen you get exactly the same thing as you do with gcj."

I disagree. Distributing a gcj-compiled app is no different from distributing an app compiled against glibc. There's the difference.

"To get in a situation where you need 10 different frameworks installed at once you'd need to have obsolescent software written using every single one of those versions"

With thousands of windows apps that can and probably will easily happen. Anybody know where I can get the VB3 runtime for this old app....

Neither do I.

"lecture on "The power of binary images" from an open source advocate. You've got to love the irony."

I'm just saying...

"Don't kid yourself. The machine code generated by gcj is I'm willing to bet, very predictable. One thing that makes it much easier to decompile than C++ code"

Hey, good luck, it's been 30 years since C has been compiled to asm then to opcodes and guess what?

No decompiler.

Sure, you can debug in DASM, but that's for crackers, not reverse engineers or copy/pasters. Even there, it's not something your average programmer can do with .NET Reflector and a debugger. Mixing up symbols and using short names is a pale comparison to having real CPU binary instructions as far as obfuscation is concerned. As a matter of fact, I wouldn't even call the latter obfuscation, because nobody has succeeded at decompiling it to any degree. Only disassembly.




  Beer28
  I contend Channel9 is a covert research project
 
  Tue, Feb 15 2005 9:50 PM
and as if i had to say this

http://www.remotesoft.com/linker/


$500 min vs gcj - <monster garage voice>freebie</monster garage voice>

  Ovidiu.Platon
 
 
  Wed, Feb 16 2005 5:23 AM
Beer, I have a quick tip for you. Whenever you want to trollpost a reply, press Alt+F4 to get to the "Reply" page faster. It's optimized for you.

Everyone else, just use the regular UI.

  Beer28
  I contend Channel9 is a covert research project
 
  Wed, Feb 16 2005 12:50 PM
Ovidiu.Platon wrote:
Beer, I have a quick tip for you. Whenever you want to trollpost a reply, press Alt+F4 to get to the "Reply" page faster. It's optimized for you.

Everyone else, just use the regular UI.


oh snap!

Linux managed compilers are more advanced than their Windows counterparts and some people can't stand it, so let's spread FUD about GNU/free software


  Ovidiu.Platon
 
 
  Wed, Feb 16 2005 2:29 PM

Maybe we should take this offline, at least to let this thread go back on topic.



  Beer28
  I contend Channel9 is a covert research project
 
  Wed, Feb 16 2005 3:08 PM
back behind the jungle gym at 4:45, you're on buddy!


  Ovidiu.Platon
 
 
  Wed, Feb 16 2005 3:40 PM

You have my weblog URL. Bring a few friends to carry you back home, man!


