Designing a DCPU-16 emulator in Haskell: on determinism and I/O (jaspervdj.be)
74 points by jaspervdj on April 12, 2012 | hide | past | favorite | 19 comments


I suspect this is why I don't "understand" Haskell programmers (I do understand a reasonable amount of Haskell).

To me, it looks like you haven't "done" anything at all. You haven't emulated a single instruction. This part of the emulator in C++ would be a couple of structs and helper functions thrown at the top of the file.

As far as I can see, you've just written a bunch of very complicated code and monads to get to the place you would have started at in a non-functional language. In that case, what exactly have you gained by using Haskell at all?

P.S. Sorry if this comment comes across as insulting / argumentative. I do not have the time to write it in a more pleasing way.


> To me, it looks like you haven't "done" anything at all. You haven't emulated a single instruction. This part of the emulator in C++ would be a couple of structs and helper functions thrown at the top of the file.

It seems there's a bit of a misunderstanding. The code in the blogpost consists of incomplete snippets, taken from the actual implementation, found in this repo [1], e.g. the emulator [2]. The reason why I have not included the entire codebase in the blogpost is conciseness; I only wanted to include what's necessary to illustrate the points I wanted to make. I realise this is perhaps a bit unclear and I'll update the post to state this more clearly.

> As far as I can see, you've just written a bunch of very complicated code and monads to get to the place you would have started at in a non-functional language. In that case, what exactly have you gained by using Haskell at all?

As much as you would gain by implementing a basic skeleton in any language. In this case, what the blogpost focuses on is that I have written the backend in two ways: the ST one and the IO one.

The advantage of the ST backend is that it is guaranteed to be deterministic (which is possible in Haskell): the code cannot perform any side effects visible to the external world.

The IO backend does not offer this guarantee, and needs to be able to communicate with the outside world in order to grab keyboard events and display video.

The advantage of using the monadic abstraction is that the actual emulator implementation is the same (completely shared code) for both these backends. This allows you to, for example, get deterministic results for tests, while still being able to support all features in the actual emulator binary.

The deterministic property is an advantage of using Haskell. I am sure an equivalent of the monadic abstraction could also be implemented using some OOP system.
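To make the shared-backend idea concrete, here is a minimal sketch of one way to structure it. All names here are invented for illustration (the actual repo uses a `MonadEmulator` type class; this sketch uses a record of operations, which expresses the same idea): the emulator core is written once against `load`/`store`, and each backend only supplies its own pair.

```haskell
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, newArray, readArray, writeArray)
import Data.Word (Word16)

-- Hypothetical sketch, not the repo's actual API: a backend is just
-- a pair of memory operations in some monad m.
data Backend m = Backend
    { load  :: Int -> m Word16
    , store :: Int -> Word16 -> m ()
    }

-- One shared "instruction", usable with any backend (ST or IO alike).
addCells :: Monad m => Backend m -> Int -> Int -> Int -> m ()
addCells b dst x y = do
    a <- load b x
    c <- load b y
    store b dst (a + c)

-- The ST backend: memory is a mutable unboxed array, and runST
-- guarantees the whole run is deterministic and side-effect free.
stBackend :: STUArray s Int Word16 -> Backend (ST s)
stBackend mem = Backend (readArray mem) (writeArray mem)

runDeterministic :: Word16
runDeterministic = runST $ do
    mem <- newArray (0, 0xffff) 0
    let b = stBackend mem
    store b 0 20
    store b 1 22
    addCells b 2 0 1
    load b 2   -- evaluates to 42
```

An `IO` backend would be the same record built over, say, an `IOUArray`, and `addCells` would run against it unchanged; that is the code sharing the post describes.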

> P.S. Sorry if this comment comes across as insulting / argumentative. I do not have the time to write it in a more pleasing way.

No insult taken.

[1]: https://github.com/jaspervdj/dcpu16-hs [2]: https://github.com/jaspervdj/dcpu16-hs/blob/master/src/Emula...


Hi Jasper,

well, I'm also guilty of a Haskell dcpu16 emulator ;).

The memory access using Memory should be pretty fast, but what is the overhead of the MonadEmulator type class and its load/store functions?

I used a Vector of unboxed Word16 for the RAM and the ST monad to modify them.

   import qualified Data.Vector.Unboxed as V
   import qualified Data.Vector.Unboxed.Mutable as VM
   import Control.Monad.ST (runST)
   import Data.Word (Word16)

   set :: Integral a => V.Vector Word16 -> a -> Word16 -> V.Vector Word16
   set vector index value = runST (setM vector (fromIntegral index) value)
      where
         setM vec i val = do
            mVec <- V.unsafeThaw vec
            VM.write mVec i val
            V.unsafeFreeze mVec

Do you have a feeling for, or can you estimate, the performance difference between using a Vector like this and your Memory?

Greetings, Daniel


I'm pretty sure there will be no overhead from the type class when the appropriate SPECIALIZE and INLINE pragmas are added, and compiled with -O2. I haven't taken the time to optimize it yet because I haven't had performance problems so far.

Your Vector implementation should offer roughly the same performance. It might be imperceptibly slower because Vector does bounds checking, and my Memory module doesn't (out-of-bounds access is impossible because of the ADT I added).
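For reference, the pragmas in question look roughly like this on a small overloaded function. This is a generic illustration, not the emulator's code: at -O2, each SPECIALIZE pragma makes GHC emit a dictionary-free copy of the function for that type, so callers at those types pay no class-method indirection.

```haskell
import Data.Word (Word16)

-- A tiny overloaded function standing in for an overloaded emulator step:
-- without specialisation, every (+) goes through the Num dictionary.
count :: Num a => Int -> a
count n = go 0 n
  where
    go acc 0 = acc
    go acc k = go (acc + 1) (k - 1)
{-# SPECIALIZE count :: Int -> Int #-}
{-# SPECIALIZE count :: Int -> Word16 #-}
```

In the emulator's case the pragmas would target the concrete monads, e.g. specialising a `MonadEmulator m => m ()` function to its `ST s ()` and `IO ()` instantiations.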

One thing I'd like to comment on is that I think you should revise your use of unsafeThaw/unsafeFreeze: you don't easily get guarantees of determinism when using these unsafe functions.


"One thing I'd like to comment on is that I think you should revise your use of unsafeThaw/unsafeFreeze: you don't easily get guarantees of determinism when using these unsafe functions."

Unfortunately that seems to be the only way to get a mutable Vector from a immutable one without copying it. As long as the mutable version of the vector isn't used anymore after calling set, it should be safe. But yes, guarantees are always nicer.
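For what it's worth, `Data.Vector.Unboxed` also exports `modify`, which wraps exactly that pattern behind a sound interface: per its documentation it performs the update in place when that is provably safe and works on a copy otherwise. A version of the `set` above using it might look like this:

```haskell
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Unboxed.Mutable as VM
import Data.Word (Word16)

-- Same interface as the `set` above, but without unsafeThaw/unsafeFreeze:
-- V.modify keeps the thaw/freeze pair hidden and guaranteed safe.
set :: Integral a => V.Vector Word16 -> a -> Word16 -> V.Vector Word16
set vector index value =
    V.modify (\mVec -> VM.write mVec (fromIntegral index) value) vector
```
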

Nice to learn about the SPECIALIZE pragma. :)


Here's a simpler approach that also supports either ST or IO: https://github.com/qpliu/dcpu16/blob/master/DCPU16.hs


Background: the ST monad is a way to construct a sandbox of "almost functional" code where you can have mutable variables that do not escape that sandbox.

This directly addresses the difficulties that mutability introduces to the interpretation and compilation of computer programs. You can find a nice discussion of these issues in Structure and Interpretation of Computer programs chapter "The Costs of Introducing Assignment" here: http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-20.html...
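A minimal example of the sandbox: the function below mutates an STRef internally, yet its type is that of an ordinary pure function, because the rank-2 type of runST prevents the reference from escaping.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Imperative inside, pure outside: the STRef cannot leak past runST,
-- so callers can treat sumST like any other pure function.
sumST :: [Int] -> Int
sumST xs = runST $ do
    acc <- newSTRef 0
    mapM_ (\x -> modifySTRef' acc (+ x)) xs
    readSTRef acc
```
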

PS. If I were to write a DCPU emulator in Haskell, it would probably be very different. I'd simply allocate the entire 16 bit address space worth of memory with a simple alloca and write a simple I/O loop with peeks and pokes. Data.Binary or Attoparsec for parsing instructions. Very similar to what I'd do if I wrote it in C. Pretty much the opposite of what's in the original article. And a lot more dangerous :)
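A rough sketch of what that "dangerous" style could look like, purely illustrative (the helper names and the word written are invented for the example): raw memory for the full 16-bit address space, with peeks and pokes and no bounds checks anywhere, very much like the C version would be.

```haskell
import Data.Word (Word16)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff, pokeElemOff)

-- 2^16 cells of 2 bytes each; the memory is uninitialised and unchecked.
withRam :: (Ptr Word16 -> IO a) -> IO a
withRam = allocaBytes (0x10000 * 2)

-- Poke an arbitrary word into "RAM", then peek it back.
demo :: IO Word16
demo = withRam $ \ram -> do
    pokeElemOff ram 0x0000 0x7c01
    peekElemOff ram 0x0000
```
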


Simple things can be done simply in Haskell too. It is straightforward to write an emulator without any monads, typeclasses and so forth.

The reason the code here looks so complex is that Jasper's implementation adds additional abstractions so he can use the type system to distinguish between DCPU computations that are pure (except insofar as they access DCPU memory) and those that are impure (in that they communicate with the outside world by e.g. writing to the console).


I would very much like an answer to these questions.

I have been struggling to understand the appeal of all this Haskell-mania (I say this as a long-time Lisp coder) and only come up short for an explanation.


Haskell only becomes interesting once you start using it for real code, and the characteristics people are talking about start manifesting in your designs and code reuse. All the short snippets you've seen are trivially surface-translatable to Lisp, after all. For that matter, a lot of the "Oh! Haskell! Wow!" blog posts have code trivially translatable to Perl.

You may also find it interesting to start with this: http://learnyouahaskell.com/functors-applicative-functors-an... That's Learn You A Haskell, jumping you past the part of the book that is teaching simple first-order functional programming and cutting straight to the stuff that makes Haskell Haskell.

By my reckoning, despite superficial similarities and similar heritages, Lisp and Haskell are actually polar opposites at a philosophical level, so you will certainly encounter a lot of difficulty trying to understand Haskell through a Lisp lens. I go more into this at http://www.jerf.org/iri/post/2908 .


>By my reckoning, despite superficial similarities and similar heritages, Lisp and Haskell are actually polar opposites at a philosophical level, so you will certainly encounter a lot of difficulty trying to understand Haskell through a Lisp lens.

That has been my suspicion and my experience.


Never understood the Lisp-mania: how could you like a language where you won't get a type error until after you run the program?


You'd be amazed how little that matters in Lisp.

It bites me in Ruby/Python, but not in Lisp.


Could you explain why is that?


Arbitrary interfaces in Ruby and Python: method_missing, TypeError, ValueError, and so on.

Most of the time in Lisp, you're just using map, car, cdr, mapcar, reduce, fold, etc. Functional iteration/processing semantics are the default in Lisp. Common Lisp makes it VERY obvious when you're not using a standard data structure, so it's hard to mess up.

Ruby and Python proliferate arbitrary object-oriented interfaces; combined with dynamic typing, it's easy to get the wrong instance of something-or-other and fuck it up.

Contrary to popular belief, Common Lisp code tends to be more consistent in terms of how you interact with your data, even if they go on flights of fancy re: Reader Macros.


That's dodging the point. If you forget whether your list A is an association list ((key value)) or a reverse association list ((value key)), your cars and cdrs won't crash the program, but your program will still fail to produce a correct result, and the source of the bug is a type error.


The universe of common errors you can have in Common Lisp due to dynamic typing is constrained compared to Python and Ruby.

I say this as someone who works with Python daily as my working language and grew up with a copy of this:

http://www.cs.cmu.edu/afs/cs/Web/Groups/AI/lang/lisp/impl/cl...

I'm fully aware of the errors that can arise from dynamic typing in Common Lisp. It's just not my experience that it's that big of a deal.

The reverse association list is contrived compared to the issues one can have in Python or Ruby. Most people just use association lists the normal way and use rassoc as needed.

Let's not kid ourselves and pretend that a reversed assoc list is used in place of a normal rassoc or find :key #'cdr.


Marketing is amazing. Why is everyone doing DCPU emulators and not universal machines from Cult of the Bound Variable?


Writing emulators is fun. Remember all those Chip8 emulators, Brainfck machines, and P-code machines? I think the DCPU is a really nice RISC CPU, and without the marketing, how many people would start writing ANY emulators or diving into assembler?



