Studying Metal Slug's Engine

There's a Debug Menu in many titles from the Metal Slug series. Today I played around with that debug menu in Metal Slug X, especially I turned on 'body rect' and 'attack rect', which revealed how the game works. Now not only did I have tons of hours of fun, I also studied a lot from this amazing game!

Of course the following are only my observations, or my best guess on how they have been implemented the game. Although I think my interpretation should work, but it's definitely not the only way, and it's quite possible that the developers have achieved the same effect with another (maybe better) method.

First Impression

At the first glance, the debug drawing is far cleaner than I had expected, it seems that it's not tiled based physics at all. But note the thin shadowed area below the line, which may imply that it's still using tiled based calculation somehow.


The ground is described as line segments, from the way the shadow is drawn I think that those are directed edges, so we know that on which side shall we keep things. Actually I think that all the edges are half-platforms, will talk about that below.

 In this picture, an obstacle is mark with 3 line segments, clever & cute! 


Besides the edges, there are two types of rectangles, 'body rects' (marked with a pair of right angles) and 'attack rects' (marked with a pair of arrows).

All the rectangles are axis aligned, even if they disagree with the visual. The missiles in the above pictures are heading toward different directions, but their body rects are still AA.  Sometimes we need to approximate the shape with multiple rectangles, for example in the pictures below.

Two body rects for the truck

Note the attack rects of the missles
Note the attack rects on the boss, marking the effective area of the spikes.

My interpretation of the rectangles is that they are using for hitting detection. Attack rects mark the effective area of attacking, and body rects mark the effective area of receiving damages.
About the attack rect of the player, I think it marks the melee range.


Objects in Metal Slug can be roughly categorized into the following groups:

- Terrain: ground, slopes, platforms
- Mobs: infantry, vehicles, bullet, projectiles
- Attack areas: logical areas that may cause damage

So let's try to talk about all possible ways of collidings.

Terrain vs Mobs

Terrain collide with mobs only.  Since terrain is mostly marked as line segments, for each mob it is easy to find out which point on the ground is supporting it. One way of doing this is by casting a gravity direction ray right below the mob and find the first intersection point with the terrain.

How to find the intersection then? I guess the game is using the line sweep algorithm from left to right. A clue is that there are never vertical edges, which do not have upper side and lower side. Whenever the game wants to setup vertical obstacles, unit squires are used, as in below: 

The stacked squares are used to block the player. I think that it may be easier to implement this way than vertical line segments. Also I think the squares are not static body rects, why? see below.

Sometimes multiple squares are used, not sure why, maybe to avoid bugs? Also note that the body rect of the player is colliding with the them, so this proves that body rects are not used for terrain colliding detection. Instead I think that the foot point (don't know the actual term, I just mean the mid-point between the feet) is used.


The entrance of a cave, the line segments with squares around them are reversed platforms, they prevent user going up but not going down. This picture also proves that the foot point is used.

More examples:


A door.


There are still some questions left:
 - Are the line segments represented by two endpoints or tiles? Both might be possible, but tiles may be more consistent with the squares
 - Why using all those reversed platforms inside (the door and the cave) entrace? I think it's enough to use the lowest one only. One possible answer is for robust.

Finally, the joints:

When a slope intersects with horizontal edges, special care must be taken, as marked in the screenshot. Interesting.

Attack Areas vs Mobs

Attack areas only collide with mobs. It's pretty straightforward,  if a mob collides with an attack rect, it may receive damage - need more logic checking, e.g. friendly fire. There are also other cases like saving hostages. It's too intuitive to put more words here.

Mobs vs Mobs

In Metal Slug, usually the player doesn't collide with enemy infanty. But there are exceptions:

This is the rightmost position I can reach from the left side of mummies. Note that I can jump into them from above, but will get pushed away gently.

This is the leftmost position I can reach from the right side of zombie dogs. Note that in this case if I try to jump onto them, I will bounce up instead of fall through.

With tanks it seems to be more complicated, it behaves differently when I try to approach from left and right. I didn't figure out if it's using the foot point of the body rect. There could be another type of rectangle, or there's an offset on the x axis.


So these are what I've got so far. I don't know why I haven't though of this idea before.

The using of directed edges and foot point is quite effective, since it's point-line segment intersections, which is much faster than polygon intersectons, or even AABB. Tunneling effects can be detected this way. And jump platforms are natural to implement.

On the other hand, this design justifies for Metal Slug, but may not for other games. Because the body rects may collide with the terrain, which may look weird in pure tiled based games like Super Mario. In Metal Slug, however, it's ok, since the background is not tiled, and we can see 'side faces' of '3d objects'.

Metal Slug has always been one of my top favorite games. Now there's a new way to play with more fun! I'd like to play through the series again with all the rectangles turned on!

Rocket lawn-cher!
Enemeee chaser!
Mission Complete!



surface pro stylus calibration

Search for 'UserLinearityData' in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet and remove 'devicekind=pen'

merge tool: meld

linux虚拟机的硬盘大小不够,扩容以后又调整了分区,于是grub就不能进入了。启动说unknown filesystem然后进入grub rescue。

在grub rescue输入

set prefix=(hd0,5)/boot/grub
insmod linux
linux (hd0,5)/boot/vmlinuz-3.11.0-15-generic
initrd (hd0,5)/boot/initrd.img-3.11.0-15-generic




Converting synced C/C++ into asynced JavaScript

Emscripten is a C/C++ -> LLVM -> JavaScript compiler. It's useful and interesting, but one of its biggest limitations is about the sync/async model: JavaScript is single-threaded event-driven languages, therefore you cannot do a sleep() in the hope of receiving and processing extra events.


First of all, please enjoy this demo, on which I really spent some time.


  • Motivation: to make `sleep` and other similar functions actually work in our Runtime, without lots of labor work porting the C/C++ code
  • I actually made it worked
  •  `sync->async` is doable and I think it's not too hard - there are already implementations for JS
  • I think this is a feature that cannot be supported by merely an external library
  • Does it fit emscripten? Or is it possible to write a plugin for emscripten?

Demo explained(?)

The demo is basically a stop-time animation:

//draw something
//wait for some while
//draw something else

You can verify this in the source code.

We all know that it is hard to convert it into the async model of JavaScript. Now would you please take a look at my ported code, it's almost identical to the original code, except for:
  • A few macros are introduced, which are defined in asyn2.h — actually the macros are pretty much just placeholders. 
  • A js function js_napms defined  — which is a wrapper of setTimeout

I fed emscripten with the code and blablabla — and the demo works. But wait! The code should be identical to the original code, which is synced!
Well, please let me explain a few more things before I reveal the secrets.

Another Demo

Here's another demo, which is... the same as above. So what's the deal?

We may imagine that, to really 'go to sleep', we need to store all the context and restore it when we come back again. Indeed, I did so in the source code, whenever you see a `ASYNC_` macro, it involves pushing and poping to maintain the async stack.

The actual functions behind those macros are defined in async.h.

Well, I'm NOT going to propose a set of API or a library, instead I'm proposing a way of pre-processing the code, and I did that myself manually. It's doable and there're patterns, you may see how a for-loop is broken down according to the comments. I'll put technical details in the end.

The porting experience may not be as smooth as it looks like, actually `xmas` is rather straightforward, where there are rarely recursive for-loops or branches. But if you take a look at other demos, it is a nightmare to define callbacks and maintain the stack manually, just imagine that there's no `call` and `ret ASM macros, and you have to do `push`, `pop` and `jump` manually.

My point is that: the sync->async process can, and should be done by the pre-processor/compiler

The Secrets of the 1st Demo

You didn't skip the previous section did you?

Actually I made the second demo at first, before I knew the the secret weapon — streamlinejs, and here is an intuitive demo.

It's not a library, but a parser/compiler instead. I didn't go too deep into its mechanism, but from the results it generated, the mechanism should be similar as what I'll mentioned below. You may read  this article for more details.

To build the first demo, all the placeholders are replace with underscores, which will be recognized by streamlinejs (as placeholders for calback), fortunately un-optimized JS generated by emscripten can be parsed without any problem — at lesat my demo.

Technical stuffs

Imagine that there a stack dedicated for async function calls, it is different from traditional stacks in that this stack is not cleared when a function exits.

Async function calls are different from (normal) sync funtion calls, an async call pushes the context into the async stack, including the callback (similar as the return address in the synced case) and returns. The central event dispatcher (the JS engine in our case) will call the callback eventually.

So the central idea is to identify all the async function calls, which are usually casuse by two reasons:

  • Calling an async function
  • `jump` over an async call

The first one should be easy: some functions are async natively, e.g. `SDL_Delay`. And if a function calls any other async funtions inside, it is async.

The second one is usually originated from loops and branches, which will be explained later.

I think that these can be identified by the compiler, in one of following stages:

- Pre-processing C/C++ — I did that manually myself
- LLVM bitcode — which I'm not so sure
- JavaScript — streamline itself is an example

There are advantages and disadvantages in different stages, for example it might be easier to optimize the code when parsing the C code; while it may be more light-weighted to store the local variables using JavaScript closures.

Identify and transfrom async functions

Here's an example:

// sync version
void work()
    int j = 99;
    printf("result %d\n", j);

Since SDL_Delay is natively async, we have to transform `work` into its async counterpart, as follows:

// async version
// context: stack for async calls
int work(context_t * context)
    int j = 99;

    push_callback(context, work__cb1);  // set up callback
    put_variable(context, j); // save local variables

    SDL_Delay(1000, context); // async version of SDL_Delay
    return 0; // get out and wait for SDL_Delay
int work__cb1(context_t * context)
    get_variable(context, j);
    pop(context);  // prepare to return the previous chained callback
    printf("result %d\n", j);

For-loops make the situation more complicated, which causes another type of async calls:

int f()
    for(int i = 0; i < 10; ++i)
        printf("hi ");
        printf("%d\n", i);

f() can be flattened as

int f()
   int i = 0;
   if(i >= 10) goto end;
   printf("%d\n", i);
   ++ i;
   goto start;
   // nothing

Now it is clear that we can split the function and make async calls

int f(context)
  int i = 0;
  // save i to the stack
  // async call f_start();
int f_start(context)
  // restore i
  //pop stack

  if(i >= 10) // async call f_end();

  printf("hi ");

  // save i
  // push f_start2 into the stack
  SDL_Delay(10, context);
  return 0;
int f_start2(context)
   // restore i
   //pop stack

   printf("%d\n", i);

  // push i
  // async call f_start() to continue the loop
   return 0;
int f_end(context)
   // pop stack
   // async call callback of f()

Braches (if, switch etc)  are similar, as long as we consider them as `goto`'s.

local variables and return values

local variables may be stored and retrieved when we push/pop the async stack,
and so are return values.

Compiler/Preprocessor Integration: Step 1

It should be clear now that this feature is kind of transformation, which cannot be supported by linking to an external library. Of course the pre-condition is that the transformation should be (almost) transparent, it should not be necessary for developers to maintain the stack manually.

The first step, I'd imagine, is that the async functions are explicitly marked through some mechanism. In my example, a placeholder is used.

Developers may still write programs in the sync fashion, for two reasons: one for the convenience writing new program, and the other for porting existing ones.

The compiler should detect, split and setup async functions automatically, the async stack should be managed by standard library while some API might be exposed.

There are two ways  of managing the local variables, let me call them the C style and the JavaScript style:

The C style: Local variables of async functions are stored in dedicated area in the memory (HEAP or a special stack for async functions), instead of the normal stack. To avoid lots of memcpy's, the variables may be directly allocate there. Some push/pop operations may be optimized if the caller/callee is known (e.g. loops/branches)

The JavaScript style: streamlinejs is a good example. Async functions are broken into a series of resursive functions, and local variables are stored into the closures.

The JavaScript style is easy and intuitive, but the hidden overhead might not be negligible. It may be too late to optimize when the LLVM bitcode have been transformed into JavaScript.

Compiler/Preprocessor Integration: Step 2

It might be possible to further reduce the work of writing/porting, as even marking async functions and define the placeholders for every async function declaration and every async function call is boring and error-prone.

My (naive & wild) imagination is that by defining a few essential async functions (such as SDL_Delay), the compiler would automatically recognize async functions, and set up the hidden parameter. It's not perfect, especially when we need to link a number libraries, but at least I think a C/C++ transformer would be possible and nice, perhaps based on LLVM?


  • It might not work for muti-threading. Indeed I've been only thinking about single-threaded programs, especially most ones for terminal — But this should not affect the importance of this issue I think.
  • Lots of overhead might be introduced in this way — But I guess the performance should not be affected much if well optimized
  • C++: ctr/copy/dectr  of objects might be a problem, or maybe not since they can be `flattened` into C-style?
  • C++: try-catch might not work, the control flow is already diverted
  • There a few limitations of streamlinejs, but I think many of them can be addressed if we process in the C phase.


Please follow this thread on GitHub.


杂记 2013.10.23

png优化工具:pngnq(有损)和 pngcrush(无损)

ubuntu gnome 13.10的input source里找不到输入法:手动修改dconf键值
,在数组里添加('ibus','sunpinyin')('ibus', 'anthy')但是mozc似乎不行,原因不明。观望。mozc在dconf里要打('ibus', 'mozc-jp')


Ubuntu 下 mount windows 共享目录


sudo mount -t cifs //hostname//folder /localfolder -o user=username

不过总是报 Input/ouput error

google了一阵,有的说加上sec=ntlm有的说加iocharset=utf8,不过都没用。其中加了后者还会出 Can not access a needed shared library 的错误。根据 http://wiki.openwrt.org/doc/howto/cifs.client   和 https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1067681, 安装了linux-image-extra-***的包以后才可以,但是mount还是不行。(我的linux是在虚拟器,装的最小内核linux-virtual)

于是尝试了一下smbclient,也是报错:tree connect failed: NT_STATUS_INSUFF_SERVER_RESOURCES 根据 http://klebom.net/melazy/tricks/ 查到是IRPStackSize过小的问题, 我在windows也确实查到了事件:
Log Name:      System
Source:        srv
Date:          2013-10-20 14:08:29
Event ID:      2011
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      ***
The server's configuration parameter "irpstacksize" is too small for the server to use a local device.  Please increase the value of this parameter.





Install Ubuntu 13.10 via Grub Legacy

I need to install Ubuntu 13.10 into an old Linux machine. The BIOS has been locked such that I cannot boot from CD or USB drives.

The machine has grub 0.97 and a pretty old CentOS installed on it, which I've got the root access, so I decided to try to load the ubuntu iso file with it.

I tried to enter the grub commands there, but then it failed with error like this:

stdin: error 0
/init: line 7: can't open /dev/sr0: No medium found
After spending hours in searching and trying, I found this url: http://askubuntu.com/questions/47076/usb-boot-problems
which solved the problem for me.

The full commands I used are:

root (hd1,0)
kernel /casper/vmlinuz.efi boot=casper live-media-path=/casper/ ignore_uuid
initrd /casper/initrd.lz

This link http://manpages.ubuntu.com/manpages/karmic/man7/casper.7.html
describes the parameters for casper

Screen Color Calibration

configure softwares
gimp: edit->preferences->color management->try to use system monitor profile
firefox: about:config -> gfx.color_management.mode -> 1
test pages

I bought a spyder 4 express and would like to try dispcalGUI on my surface pro, so I have to install Argyll CMS's driver, which is not signed.

For windows 8, one option to disable driver signature enforcement is through "advanced startup settings", which lists a few options after a restart, and you can press f7 or 7 to disable it.
However for my surface pro, it refused to recognize my keyboard, no matter it is a bluebooth keyboard, a usb wireless keyboard or a normal usb keyboard.
I guess a touch/type cover might work, but I don't have one.

Another option is to enable test mode (bcdedit -set testsigning on), before which you should also disable secure boot in the bios setting 
(shut down the device, hold the volumn up key before pressing the power button and release it when you see the surface logo)
However in this case, any signed driver could be installed, but unsigned drivers are still rejected.
For windows 7, there are words that you may use dseo13b to hack, but no luck for me on windows 8.

I had thought about using my Linux VM to do the calibration, which is running under Hyper-V, but unfortunately Hyper-V does not support USB forwarding.
I'd even thought about other VM softwares, but according to the Internet, it might not work. (because VM may not access the graphics card)

Another solution came into my mind is to boot up the device with a Linux on a USB stick, then I'll be able to do whatever I want. 
But there is only one USB port on Surface Pro, and I'm not sure if USB boot or Spyder would work behind a USB hub.

Finally I gave up.

Other possible solutions:
- Sign the driver with WDK, with your own certificates
- Install Linux and free the power of the device